PALACKÝ UNIVERSITY Faculty of Medicine and Dentistry

Doctoral Dissertation

Olomouc, 2019 Veronika Grešáková

PALACKÝ UNIVERSITY

Functional analyses of dual role of Fam208a protein

Veronika Grešáková, M.Sc.

Supervising department: Institute of Molecular Genetics, Academy of Sciences of Czech Republic, Prague

Supervisor: Doc. RNDr. Radislav Sedláček, Ph.D.

Olomouc 2019

Statement of authorship:

I hereby declare that I am the sole author of this dissertation thesis entitled ‘Functional analyses of dual role of Fam208a protein’ and that I have not used any sources other than those listed in the bibliography and identified as references. I further declare that I have not submitted this thesis at any other institution in order to obtain a degree. The research was carried out at the Institute of Molecular Genetics of the Academy of Sciences of the Czech Republic, Prague.

Olomouc, 2019 Mgr. Veronika Grešáková

Acknowledgments:

First, I want to thank my supervisor, Doc. RNDr. Radislav Sedláček, Ph.D., for his support, positive attitude and belief in my capability throughout my whole PhD study. Second, I thank Jan Procházka for taking over my case in the fifth year of my PhD study and for the hours spent on my manuscript, thesis and project planning. You both were there when I did not believe I would achieve the degree, and you did not let me give up. I will always be grateful for the strength you gave me. I am also very grateful to all my lab-mates from the Laboratory of Transgenic Models of Diseases for the motivating atmosphere and constructive criticism.

My special thanks go to my whole family and to my friends, without whose support none of this would have happened. And one giant thank-you goes to Blanche, who sacrificed hours to proofreading and improving my texts. I wouldn’t have made it without you.

Research on these projects was supported by grants: GACR 15-23165S, RVO 68378050 by AS CR, and by LM2015040, CZ.1.05/2.1.00/19., CZ.1.05/1.1.00/02.0109 and LQ1604

March 2019 Olomouc Veronika Grešáková, M.Sc. …………………………….

Bibliografická identifikace Jméno a příjmění autora: Veronika Grešáková Název práce: Funkční analýza duální role Fam208a proteínu Typ práce: Dizertační Pracoviště: Ústav molekulární a translační medicíny Lékařská fakulta Univerzita Palackého Hněvotínská 5, 779 00 Olomouc Vedoucí práce: Doc. RNDr. Radislav Sedláček, Ph.D. Rok obhajoby práce: 2019 Klíčová slova: TASOR, Fam208a, Mphosph8, epigenetické regulace, buněčné dělení, mutageneze, CRISPR/Cas9 Jazyk: Anglický

Bibliographical identification: Author’s name and surname: Veronika Grešáková Title: Functional analyses of dual role of Fam208a protein Type of thesis: Dissertation Department: The Institute of Molecular and Translational Medicine Faculty of Medicine and Dentistry Palacký University Hněvotínská 5, 779 00 Olomouc Supervisor: Doc. RNDr. Radislav Sedláček, Ph.D. The year of defense: 2019 Keywords: TASOR, Fam208a, Mphosph8, epigenetic regulations, cell division, mutagenesis, CRISPR/Cas9 Language: English

Abstract

Maintenance of genome stability is essential for every living cell, as genetic information is repeatedly challenged during DNA replication in each cell division. Errors and delays that arise during mitosis or meiosis activate DNA repair processes. If these fail, programmed cell death, i.e. apoptosis, can be initiated. The cell cycle underlies developmental processes involving cell division, growth and differentiation. As development proceeds, gastrulation is initiated. The primitive streak forms when epiblast cells migrate and give rise to mesoderm. This phase includes very rapid proliferation and extensive epigenetic reprogramming.

The goal of this work was to evaluate the function of Fam208a, a protein whose importance in heterochromatin maintenance has been described recently, during cell division and embryonic development. We showed that the ENU-induced substitution mutation L130P in Fam208a is responsible for an embryonic lethal phenotype. Moreover, mutant embryos displayed impaired primitive streak formation and increased p53 signalling. A CRISPR/Cas9-induced knock-out mouse line also exhibited embryonic lethality, but with a slightly delayed onset of the phenotype. This model also points to differences between heterozygotes and homozygotes in early stages. We also used an siRNA approach to downregulate Fam208a in zygotes in order to avoid the influence of maternal RNA in early stages of development. This early downregulation increased the incidence of embryonic arrest at the two-cell stage and of multipolar spindle formation. To study the reasons leading to these effects, we used the yeast two-hybrid (Y2H) system, by which we identified new putative interaction partners of Fam208a, namely Gpsm2, Amn1, Eml1, Svil, and Itgb3bp. Their co-expression with Fam208a was assessed by qRT-PCR profiling and in situ hybridization (ISH) in multiple murine tissues.

Based on the results, we propose that Fam208a functions within the HUSH complex through interaction with Mphosph8, as these proteins are able not only to interact physically but also to co-localise. The HUSH complex consists of FAM208A, MPHOSPH8 and PERIPHILIN and is responsible for epigenetic silencing mediated by spreading of H3K9me3 promoted by SETDB1. We bring new evidence that Fam208a is a multi-interacting protein affecting genome stability at the level of cell division at the earliest stages of development and also through interaction with the methylation complex in adult tissues. These data demonstrate new putative functions in maintaining epiblast fitness and establishing epigenetic programming. Besides these functions, Fam208a appears to have an additional role in zygotic division, likely via interaction with the newly identified putative partners Gpsm2, Amn1, Eml1, Svil, and Itgb3bp.

Keywords: Genome stability, Fam208a, multipolar spindle apparatus, HUSH, Mphosph8, epigenetic regulations, cell division, mutagenesis, CRISPR/Cas9

Abstrakt Udržování stability genomu je klíčovým úkolem pro všechny živé buňky, jelikož samotná genetická informace je opakovaně zatěžována v průběhu každé DNA replikace při buněčném dělení. DNA reparační mechanismy jsou aktivovány poruchami, které vznikají během mitózy nebo meiózy. Pokud selžou, dochází ke spuštění programované buněčné smrti, apoptózy. Buněčný cyklus je základem pro vývoj, který zahrnuje buněčné dělení, růst a diferenciaci. S postupem času dochází ke gastrulaci, během níž buňky z epiblastu z primitivního proužku migrují, aby vytvořily mezoderm. V této fázi dochází k velice rychlé proliferaci a ke kompletnímu a úplnému epigenetickému přeprogramování.

Nedávno byl popsán protein Fam208a, který je právě pro tyto procesy důležitý. V naši práci popisujeme roli Fam208a proteinu při udržování genomické stability v průběhu buněčného dělení a v průběhu embryonálního vývoje. ENU mutageneze vyprodukovala linii s mutací v proteinu Fam208, L130P, která způsobuje embryonálně letální fenotyp. Embrya trpí narušením tvorby primitivního proužku a zvýšenou aktivitou dráhy p53. Pomocí metody CRISPR/Cas9 jsme vytvořili myší linii s kompletně knouckoutovaným Fam208a, která se také projevila jako embryonálně letální, ale k úmrtí dochází o něco později. Zajímavé je, že tento model poukázal na projevové rozdíly mezi homozygoty a heterozygoty. Abychom se vyhli efektu působení maternální RNA, použili jsme siRNA metodu na odbourání Fam208a v zygotách. Výsledkem byl zvýšený výskyt tripolárních dělicích vřetének. Abychom mohli vytvořit novou hypotézu, použili jsme kvasinkový dvouhybridní systém a identifikovali jsme řadu nových interakčních partnerů Fam208a včetně Gpsm2, Amn1, Eml1, Svil, a Itgb3bp. Exprese všech byla následně popsaná pomocí metody qRT-PCR a in situ hybridizací na několika myších tkáních.

Naše výsledky ukázaly, že Fam208a pracuje jako součást HUSH komplexu společně s Mphosph8, jelikož spolu nejenom kolokalizují, ale v rámci buňky také interagují. HUSH komplex se skládá z FAM208A, MPHOSPH8 a PERIPHILINU a je odpovědný za epigenetickou regulaci prostřednictvím šíření H3K9me3 markeru, jehož produkce je řízená aktivitou SETDB1. Přinášíme dále nové poznatky, z kterých lze soudit, že Fam208a vzájemně působí na různé bílkoviny a podílí se na udržování genomové stability jak během buněčného cyklu tak během embryonálního vývoje. Prokázali jsme, že kromě udržování epiblastu a nastolení epigenetického programu se Fam208a účastní i zygotického dělení, a to prostřednictvím spolupráce s nově popsanými partnery Gpsm2, Amn1, Eml1, Svil a Itgb3bp.

Klíčová slova: Genomová stabilita, Fam208a, multipolární dělicí vřeténko, HUSH, Mphosph8, epigenetická regulace, buněčné dělení, mutageneze, CRISPR/Cas9

Table of Contents

List of Abbreviations
Introduction
  Mutagenesis – generation of research models
  Introduction to the cell cycle and division
  Epigenetic mechanisms
  Epigenetic regulation during the cell cycle
  Epigenetic regulation during embryonic development
  Fam208a – an important component of the HUSH complex
Aims of the dissertation
Materials and Methods
  Whole-mount in situ hybridization and histology
  Whole mount Immunofluorescence
  RNA expression analyses and qPCR
  Microarray analysis
  Generation of Fam208a KO mouse
  Mutant Hek293t cell lines
  E9.5 LC-MS analysis
  Hek293t mutant cell LC-MS analysis
  nLC-MS 2 Analysis
  Data analysis
  Oocyte RNA interference
  Oocyte immunofluorescence
  Yeast two-hybrid system
  BIOMARK and qRT-PCR
  Expression vectors and staining procedure
  PI staining and cell cycle measuring
Results
  L130P mutation of Fam208a leads to developmental delay and gastrulation failure
  Fam208a is important for epithelial-to-mesenchymal transition at the onset of gastrulation
  Fam208a mutant embryos exhibit an alteration in anterior-posterior patterning
  Fam208a mutants exhibit a decreased number of total cells and increased incidence of the cell cycle arrest
  The expression profile of L130Pa embryos is altered already at E6.25
  Full ablation of Fam208a leads to embryonic lethality at early somite stage
  Fam208a mutation massively impacts the protein expression profile homozygous mice
  Downregulation of Fam208a in zygotes leads to immediate cell division phenotype
  Y2H screen revealed a novel Fam208a interaction network in the spindle assembly machinery
  Fam208a has tissue-specific subsets of interacting partners
  Ablation of FAM208a in somatic cells did not impair the cell division processes
  Overexpression of L130P Fam208a might cause an increase in G2/M phase arrest
  L130P mutation impairs intracellular distribution of Fam208a
  Fam208a is not involved in DNA reparation processes but has nucleic acid binding capability
  Fam208a localization and function is Mphosph8-level dependent
Discussion
Summary
Souhrn
References
Supplementary tables
  Table S1
  Table S2
  Table S3
  Table S4
  Table S5
List of appendices

List of Abbreviations

aa: Amino acid
ac: Acetylation
AME: Anterior mesoderm
AP: Anterior-posterior
AVE: Anterior visceral endoderm
C' terminus: Carboxyl terminus
Cas9: CRISPR-associated 9
CD: Chromodomain
CDK: Cyclin-dependent kinase
cDNA: Complementary DNA
CENP-A-S7: Centromere protein A serine 7
CKI: CDK inhibitors
CpG islands: Cytosine and guanine dinucleotide repetitive islands
CPT: Camptothecin
CRISPR: Clustered regularly interspaced short palindromic repeats
crRNA: CRISPR RNA
DAPI: 4′,6-Diamidine-2′-phenylindole dihydrochloride
DE: Definitive endoderm
DNA: Deoxyribonucleic acid
Dnmt: DNA methyltransferase
DOT: Disruptor of telomeric silencing
DSB: Double-strand breaks
dsRNA: Double-stranded RNA
DVE: Distal visceral endoderm
EMS: Ethyl methanesulfonate
EMT: Epithelial-to-mesenchymal transition
ENU: Ethyl nitrosourea
ERV: Endogenous retroviruses
ESC: Embryonic stem cells
ExE: Extraembryonic ectoderm
FAM208a: Family with sequence similarity 208 member A
FH: Facultative heterochromatin
GFP: Green fluorescent protein
gRNA: Guide RNA
GV: Germinal vesicle
H3K14: Histone 3 lysine 14
H3K27: Histone 3 lysine 27
H3K36: Histone 3 lysine 36
H3K79: Histone 3 lysine 79
H3K9: Histone 3 lysine 9
H3S10: Histone 3 serine 10

H4K12: Histone 4 lysine 12
H4K20: Histone 4 lysine 20
H4K5: Histone 4 lysine 5
H4R3: Histone 4 arginine 3
HDR: Homology-directed repair
HIV: Human immunodeficiency virus
HR: Homologous recombination
HUSH: Human silencing Hub
CH: Constitutive heterochromatin
ICRs: Imprinting control regions
IMPC: International Mouse Phenotyping Consortium
Indels: Insertions or deletions
kb: Kilobase
KDMs: Lysine demethylases
KMTs: Lysine methyltransferases
KO: Knock out
L130P: Proline instead of leucine at amino acid position 130
LC-MS: Liquid chromatography mass spectrometry
LINE1 / L1: Long interspersed nuclear elements
lncRNA: Long non-coding RNA
m6A: N6-methyladenosine
m7G: 7-methylguanylate
MB: Mitotic bookmarks
mCARs: Mitotic chromatin-associated RNAs
me: Mono-methylation
me2: Di-methylation
me3: Tri-methylation
MII: Meiosis 2
miRNA: Micro RNA
Momme: Modifier of murine metastable epialleles
Mpp8: M-phase phosphoprotein 8
mRFP: Monomeric red fluorescent protein
mRNA: Messenger RNA
MT: Microtubules
MTOC: Microtubule organizing center
MZT: Maternal to zygotic transition
N' terminus: Amino terminus
N/C: Nucleus to cytoplasm
ncRNA: Non-coding RNA
NE: Nuclear envelope
NHEJ: Non-homologous end joining
nt: Nucleotide
ORF: Open reading frame
PAM: Protospacer adjacent motif

PcG: Polycomb group
PGCs: Primordial germ cells
PCNA: Proliferating cell nuclear antigen
PEV: Position effect variegation
ph: Phosphorylation
PI: Propidium iodide
piRNA: PIWI-interacting RNA
Platr: Pluripotency associated transcript
poly(A): Polyadenylate
PPI: Protein-protein interaction
PRC: Polycomb repressive complex
PS: Primitive streak
PTGS: Post-transcriptional gene silencing
PTM: Post-translational modifications
qRT-PCR: Quantitative reverse transcriptase polymerase chain reaction
Rb: Retinoblastoma protein
RBC: Red blood cells
RdRP: RNA-dependent RNA polymerase
RGEN: RNA-guided engineered nuclease
RNA: Ribonucleic acid
RNAi: RNA interference
RVD: Repeat variable diresidues
S phase: Synthesis phase
SA: Spindle apparatus
SAM: S-Adenosyl methionine
SET: Su(var)3-9, enhancer-of-zeste and trithorax
siRNA: Small interfering RNA
snoRNA: Small nucleolar RNA
ssRNA: Single-stranded RNA
T: Brachyury
TALEN: Transcription activator-like effector nuclease
TF: Transcription factors
tracrRNA: Trans-activating crRNA
tRNA: Transfer RNA
VE: Visceral endoderm
Vpr: Viral protein R (HIV-1 accessory protein)
Vpx: Viral protein X
wt: Wild type
XCI: X-inactivation
Y2H: Yeast two-hybrid
ZFN: Zinc-finger nuclease
ZGA: Zygotic genome activation

Introduction

The main condition for the survival of all living organisms is their capability to change and update their own genetic information. Through natural selection, they are able to adapt to a changing environment in ways that benefit the species. Random changes, mutations, are a driving force of evolution and can be advantageous for further development; in the majority of cases, however, mutations are either negative or neutral, without any direct advantage. The outcome of a given mutation can also be case-dependent, with different effects (positive vs. negative) in different environments. This final effect determines whether natural selection preserves and spreads the mutation. Natural evolution is thus based on natural selection and is tightly linked with naturally occurring mutations. From a different point of view, targeted mutagenesis is the major pillar of modern genetic research. In the last two decades, there has been a huge expansion in the development of tools for targeted mutagenesis. This development has changed biological research and has become one of its main driving forces. Physical, chemical and biological mutagens are used to understand the basic molecular mechanisms and principles of the most complex pathways and phenomena. The biggest hope is that this understanding will allow us to prepare effective and strictly specific treatments for severe genetic diseases (cystic fibrosis, anemia, Huntington disease, muscular dystrophies, etc.).

We live in a post-genomic era, which provides us with detailed data on the whole genomes of thousands of organisms. Even though genes represent only a minor part of the whole genome, we still do not know complete details about their functions and roles. To make it even more complex, over the past 20 years, the discovery of chromatin-modifying enzymes and associated mechanisms that alter chromatin has transformed our knowledge of epigenetics from a collection of phenomena into a novel research field [1].

From the historical point of view, it all started almost a hundred years ago, when Emil Heitz and his colleagues defined nucleic acids, chromatin, and histone proteins, whose role was to orchestrate the active and inactive states of chromatin, euchromatin and heterochromatin [2]. This was followed by studies of chromosomal translocation in Drosophila melanogaster, which revealed that a piece of one chromosome can break off and attach to another one, a process known as non-homologous recombination [3]. The next crucial milestone was the discovery of position effect variegation (PEV) in maize. A comparison of the types of mutations that appeared during mutagenesis showed diversity not only in the changes in phenotypic expression, but also in the manner in which mutagenesis is controlled [4]. A crucial breakthrough came from the study of Mary F. Lyon, who was the first to claim that one X chromosome is inactivated early in embryonic development, that the inactivated chromosome can be of either maternal or paternal origin, and that chromosomes of different origin can be inactivated in different cell types [5]. This answered the dosage compensation question, which had been preoccupying scientists for almost a century, and helped to explain the basic mechanisms responsible for X-linked diseases. Subsequently, a general concept of imprinting was described independently by two groups, which both concluded that maternal and paternal contributions to the embryonic genome in mammals are not equivalent and that paternal imprinting of the genome appears to be necessary for normal development [6, 7].

DNA methylation became known as the key epigenetic mark responsible for inactivation of genes. The initial idea proposed that DNA methylation sites are palindromic and that enzymes methylate unmodified DNA using DNA already methylated on one strand as a template. It was assumed that the first methylation event would be much more difficult and that, once it had occurred, the complementary strand would be quickly modified in the same way. Detailed studies of methylation patterns revealed other interesting details. It was found that the principal target of methylation in mammals is the CpG sequence, and that these sites are either completely methylated or unmethylated. This model establishes the basic epigenetic mechanism, in which the methylation mark is transmitted through semiconservative propagation of the methylation pattern [8].
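The semiconservative propagation of the methylation pattern described above can be illustrated with a minimal Python sketch. It is a toy model and not a method used in this thesis: the sequence, the starting pattern and the function names (`find_cpg_sites`, `replicate`, `maintenance_methylation`) are hypothetical, and each palindromic CpG site is represented simply by the position of its cytosine on the top strand.

```python
def find_cpg_sites(seq):
    """Return 0-based positions of the C in each CpG dinucleotide of the top strand."""
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def replicate(methylated_sites):
    """Semiconservative replication of one duplex.

    The daughter duplex inherits one parental strand (marks kept) and one
    newly synthesized strand (no marks), so every methylated CpG becomes
    hemimethylated.
    """
    parental_strand = set(methylated_sites)
    new_strand = set()
    return parental_strand, new_strand


def maintenance_methylation(parental_strand, new_strand):
    """Copy the mark onto the new strand wherever the parental strand carries it."""
    return set(new_strand) | set(parental_strand)


seq = "ATCGGACGTTCGA"             # hypothetical sequence with three CpG sites
sites = find_cpg_sites(seq)       # [2, 6, 10]
methylated = {2, 10}              # hypothetical pattern; site 6 is unmethylated

parental, nascent = replicate(methylated)
restored = maintenance_methylation(parental, nascent)
print(sites, sorted(restored))
```

In this simplified model, unmethylated sites stay unmethylated and methylated sites are restored on the new strand, which is why the pattern survives cell division.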

In the last two decades, it was discovered that DNA is not the only target of methylation: nucleosome proteins, histones, also carry methylation marks, and these marks are transmitted through cell division. An important step was the understanding that modified histones could recruit proteins in a modification-specific way that could affect the local structural and functional states of chromatin. This particular mechanism assumes that the H3K9 methylation mark recruits the HP1 protein, which subsequently recruits the methylation enzyme responsible for spreading the heterochromatin mark [9].

Recently, a novel mechanism for spreading the heterochromatin mark and for epigenetic silencing has been described. The protein complex named HUSH (human silencing Hub), containing MPP8 (Mphosph8), PPHLN1 (periphilin 1) and our gene of interest, FAM208a (TASOR), recruits the SETDB1 methyltransferase to tri-methylate lysine 9 of histone 3 (H3K9me3). This silencing process mediates position effect variegation, meaning that repositioning of a normally active gene into a heterochromatin region results in its epigenetic silencing [10]. The publication describing this process partially verifies the data we have obtained in our study. Additionally, as that group focused on the details of the role of FAM208A in ERV silencing and PEV-mediated silencing, we turned our attention to other possible activities that might involve FAM208a.

Epigenetics is responsible not only for non-genetic regulation of gene expression in differentiated cells; it is also crucial for cell division and differentiation as such. Cell cycle progression depends on correct timing and coordination of events driven by the epigenetic machinery governing genome accessibility. Chromatin modifiers regulate cell cycle progression locally by controlling expression of individual genes and globally by controlling chromatin condensation and chromosome segregation [11]. The delicate order of gene expression during development is critical to ensuring proper lineage differentiation. Epigenetic regulation of chromatin structure is fundamental for gene activation or repression. DNA methylation, posttranslational histone tail modifications and non-coding ribonucleic acid (RNA) control of chromatin structure precisely orchestrate cellular potency and differentiation [12]. Genetic mechanisms provide the primary control of cellular differentiation and development, whereas epigenetics provides the molecular mechanisms through which the environment influences development and tunes the genetic processes. Establishment of the epigenome during development plays a crucial role, and any alteration in it can be detrimental. The complementary and integrated functions of genetic and epigenetic mechanisms during the whole lifespan of a cell are crucial, and epigenetic regulators can orchestrate cell fate at multiple levels [13].

Mutagenesis – generation of research models

The overall survival of animal species is fully dependent on their ability to update and change their genetic information. Natural selection creates the necessary pressure, and new gene variants help species to adapt to the constantly changing living environment. Carriers can either benefit from these spontaneous mutations or be handicapped by them. A further complication is that a gene mutation can have a positive effect in one particular environment while being disadvantageous in different conditions. This is where natural selection acts, spreading only the gene variants that are beneficial under current conditions. We can conclude that natural evolution is tightly linked with natural selection and is based purely on random natural mutagenesis. On the other hand, precisely targeted artificial genome editing is the basic pillar of genetic research. This mainstream orientation has caused massive development of novel tools for targeted mutagenesis. Physical, chemical, and biological gene-modifying tools are used to help us understand the molecular basis of distinct phenomena, which should lead to treatments for genetically based human diseases [14].

There are two main approaches to studying gene functions in vivo (Fig. 1). The first one, forward genetics, uses naturally or artificially occurring mutants that can be distinguished based on changes in their observable characteristics, i.e. their phenotype. The main aim of forward genetics is to answer the basic questions of how the mutation is inherited, whether the phenotype results from one or more mutations, and what the offspring of two mutants would look like. Forward genetics was developed as the first approach, and in the past, it did not even necessarily identify the affected gene. The most important condition was a strong heritable phenotype, and the mutagenesis was random and unspecific. Forward genetic approaches, however, provide the unique ability to assign functions to genes in an unbiased, global manner that is independent of previous assumptions about gene function, and they may also reveal relevant non-coding sequences [15].

Figure 1 Schematic diagram of workflow of forward genetics in comparison with reverse genetic screenings. Forward genetics starts with a certain phenotype and proceeds with discovering the underlying gene. Reverse genetic screening targets the gene of interest and describes the phenotype caused by the alteration in the gene (https://igtrcn.org/aphid-genetics/; 20.03.2019).

Nowadays, the whole genomes of model organisms have been sequenced, and gene sequences are mostly known. This has opened a new field called reverse genetics or functional genomics, which specifically targets a gene or other regulatory sequences and describes the phenotype afterwards. This approach either deletes or inserts the gene of interest in the model organism to identify its molecular function. The starting point of reverse genetics is thus a specific sequence to be targeted. The main goal is to modify the gene or its expression and then characterize the phenotypic effect caused by these changes. Generally, reverse genetics aims not to have a severe impact on breeding capabilities; this rule cannot be met for embryonic lethal phenotypes [16].

Both strategies are currently broadly used to identify the function of genes. Forward genetics brings an unbiased approach that does not require a putative role of the gene and might lead to identification not only of important coding loci, but also of regulatory sequences or even non-coding RNAs. When applied to mice, it is one of the most powerful methods to facilitate understanding of the genetic basis of human biology and diseases. The speed at which disease-causing mutations can be identified in mutagenized mice has been markedly increased by recent advances in NGS DNA sequencing technology [17]. Functional genomics preferably, though not necessarily, uses null mutants, in which gene expression is completely eliminated.

Reverse genetic methods can be divided into two main groups. The first one uses direct chemical mutagenesis or DNA insertion via transposons. The second strategy downregulates, upregulates, or completely restricts gene expression via programmable endonucleases. Technological advances in sequencing have greatly accelerated accumulation of genetic sequence data to the point where whole genome sequences for a large number of model organisms are publicly available. The toolbox for genetic research is expanding rapidly, and together with reverse genetics they can be used to investigate the whole genome function [18].

There are two basic groups of chemicals used for mutagenesis. The first group comprises alkylating agents, which are used for random mutagenesis of nucleotide bases. The second group comprises analogs of nucleotide bases, such as 5-bromouracil. The most commonly used chemicals are ethyl methanesulfonate (EMS) and ethyl nitrosourea (ENU). EMS alkylates guanine bases; the alkylated guanine pairs with thymine instead of cytosine, which changes the genetic information. Chemical mutagenesis can be used for both forward and reverse genetics, but as it cannot directly target a sequence, it is preferably used in the forward approach [19]. Nevertheless, as some regions of DNA are more sensitive to mutagenesis, chemically induced mutations can also be used in reverse genetics. This approach has disadvantages as well. First of all, the targeted gene has to be located in a mutagenesis hot spot. Usually, a very high dosage has to be used, which introduces many additional mutations. These have to be subsequently removed by backcrossing and selection for the targeted mutation. A good example is a recently prepared library of Arabidopsis thaliana mutants, which includes 3,712 different lines. Later on, it was discovered that the incidence of mutation is 1 in 89,000 nucleotides, which means that every mutant line carries several hundred unspecific mutations that cannot be reduced to the one desired mutation by selective breeding [20].
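The base-pairing change caused by EMS can be sketched in a few lines of Python. This is a deliberately simplified model and not a protocol from this work: it assumes that every alkylated guanine is read as adenine after replication (G to A on the displayed strand, and C to T where the alkylated guanine lies on the complementary strand), and the function name `ems_mutagenize` and the `rate` parameter are hypothetical.

```python
import random


def ems_mutagenize(seq, rate, rng):
    """Simulate EMS-type transitions on the displayed DNA strand.

    Each G:C pair is hit with probability `rate`. A hit on a G of this
    strand appears as G->A; a hit on the G of the complementary strand
    appears as C->T on this strand.
    Returns the mutated sequence and the list of mutated positions.
    """
    transition = {"G": "A", "C": "T"}
    mutated = []
    positions = []
    for i, base in enumerate(seq):
        if base in transition and rng.random() < rate:
            mutated.append(transition[base])
            positions.append(i)
        else:
            mutated.append(base)
    return "".join(mutated), positions


rng = random.Random(0)                      # fixed seed for reproducibility
wild_type = "ATGCGT"                        # hypothetical sequence
fully_mutated, hits = ems_mutagenize(wild_type, 1.0, rng)
untouched, no_hits = ems_mutagenize(wild_type, 0.0, rng)
print(fully_mutated, hits)
print(untouched, no_hits)
```

With a rate of 1.0 every G:C pair is converted, and with a rate of 0.0 the sequence is unchanged; realistic rates lie far below either extreme, and A:T pairs are never affected in this model.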

A careful comparison of the phenotypic variations generated by different alleles at a given locus is often an important source of information for understanding gene functions. In fact, it is always possible to match a specific alteration observed at the genomic level with a particular pathology. Additionally, it is also possible to establish a relationship between the gene and its function. Insertion mutagenesis is based on insertion of external DNA (a transposon or retrovirus) into the genome of a model organism. If this insertion happens in the coding sequence, it leads to a mutagenic effect and to a gain or loss of function of the gene. The biggest disadvantage of this approach is the high chance of insertion into the promoter region, or of an insertion that still allows expression of a modified protein, a truncated or substitution variant, so that the loss-of-function effect is only partial. This type of mutagenesis also tends to create multiple insertions, but their numbers are not so high, and it is possible to eliminate them in favour of the selected mutation within a few generations [21].

Transposons, also called “jumping DNA”, are discrete segments of DNA capable of moving through the genome of their host. They use either an RNA intermediate, in the case of class I retrotransposons, or a "cut-and-paste" mechanism, in the case of class II DNA transposons. Since transposons take advantage of their host's cellular machinery to proliferate in the genome and enter new hosts, transposable elements can be viewed as parasitic or "selfish DNA". When transposons are recruited for mutagenesis, the process can be regulated by selecting the correct transposase, a crucial enzyme responsible for transposon insertion and excision. In the past, transposons were broadly used for preparation of extensive libraries of mutants of model organisms. Many mutant lines of Caenorhabditis elegans were produced with the Tc1 transposon, which is present in hundreds of copies within the genome [22]. Later on, the Mos1 transposon was used, as it has only one copy. Transposons may have been beneficial for their hosts as drivers of genome evolution, thus providing an example of molecular mutualism [23]. Another favorite model organism was also used for insertion mutagenesis in the past. Mobile elements, P-elements, were used to create a library of more than 6,000 mutated Drosophila melanogaster lines. These elements are also called selfish genetic elements, and they can be transferred horizontally (from one organism to another) [24]. The largest transposon-induced library of mutants was produced in the mouse, Mus musculus. It includes over 100 thousand independently mutated embryonic stem cells. These pluripotent stem cells can be inserted into the host embryo during any developmental stage, and subsequently, they get involved in the development of any organ of the new organism. The most efficient and widely used method is microinjection into the blastocyst. The final outcome of this mutagenesis is called a chimera, as this organism is a mixture of original and inserted genetic material [25].

The next aim was to produce a genetically modified organism with a targeted change in the complete genetic material, and programmable endonucleases provided the proper tools (Fig. 2).


Figure 2 Programmable endonucleases and the comparison of their mechanisms and establishment of gene targeting methodology. ZFN and TALEN use Fok1 enzyme to induce DSB into the genome of a model organism and their targeting via a DNA sequence is either by specific pre-designed triplets (ZFN), or every domain carries only one nucleotide (TALEN). RGENs use several categories of RNA to target the mutagenesis, and their tool for producing DSB is the Cas9 enzyme. Mutagenesis itself is performed via either the NHEJ or HR procedure. The figure is original, produced by the author.

Programmable endonucleases are enzymes that are capable of identifying specific genomic sequences and subsequently introducing double-strand breaks (DSBs). These DNA breaks increase the incidence of homology-directed repair (HDR) or can lead to non-homologous end joining (NHEJ). The final outcome of these processes is targeted mutagenesis. To produce a targeted knock-in or any other specific mutation, the DSB needs to be repaired by the HDR pathway. In contrast, NHEJ can lead to insertions or deletions (indels) of various lengths and can result in loss of function due to a shift of the open reading frame (ORF). Furthermore, HDR has lower efficiency than the NHEJ pathway. Therefore, many new methods focus on enhancing HDR and inhibiting NHEJ (chemical modulation, synchronized expression, overlapping homology arms) [26]. The success of gene targeting in mice mainly results from the use of embryonic stem cells (ESCs). ESCs are known to have a relatively high homologous recombination (HR) ability compared with NHEJ. Numerous experiments with embryonic stem cells indicate that sequence-specific integration of a transgene by HR occurs at frequencies of around 10−20 or higher among the total integration events [27]. Broken chromosomal ends in somatic cells of plants are preferably healed by ligation of the DNA ends to unspecific sequences or to sequences with micro-homologies. Recently, reproducible gene targeting by homologous recombination in plants has been reported. This model uses specific Agrobacterium-mediated gene targeting with strong positive-negative selection [28]. The presence of an intact DNA sequence identical to the broken region is essential for HR to proceed. If this DNA template is not supplied, NHEJ takes place as the backup pathway. HR occurs preferentially during the S or G2 phase of the cell cycle, as the cell is preparing for division. NHEJ, however, is fully cell cycle-independent [29].
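The frameshift effect of NHEJ-induced indels described above can be illustrated with a minimal sketch. It assumes only that codons are read in triplets, so that a net length change that is not a multiple of three shifts the ORF. The function name `causes_frameshift` is hypothetical and introduced here for illustration only.

```python
def causes_frameshift(inserted: int, deleted: int) -> bool:
    """Return True if the net indel length is not a multiple of three.

    inserted / deleted are the numbers of nucleotides gained and lost
    at the repaired double-strand break (illustrative parameters).
    """
    net_change = inserted - deleted
    return net_change % 3 != 0


# A few example repair outcomes: (inserted nt, deleted nt)
for ins, dele in [(1, 0), (0, 2), (0, 3), (4, 1)]:
    effect = "frameshift" if causes_frameshift(ins, dele) else "in-frame"
    print(f"+{ins}/-{dele} nt: {effect}")
```

Note that an in-frame indel can still impair protein function by removing or adding amino acids; the sketch only captures the reading-frame criterion.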

The first rounds of gene-specific manipulation in model organisms were driven by programmable zinc-finger nucleases (ZFNs). ZFNs induce DSBs in the cellular DNA and thereby trigger repair processes leading to both targeted mutagenesis and targeted gene replacement at remarkably high frequencies. These nucleases have two separate domains, a DNA-binding domain and a DNA-cleavage domain. The cleavage domain contains the Fok1 enzymatic part with natural cleavage activity. This domain must dimerize to be capable of cutting the DNA. As the dimer interaction is weak, the best solution relies on targeting neighboring sequences, which brings both parts of the cleavage dimer together. The inversely oriented binding sites need to be separated by a spacer of 5-7 bp, within which cleavage occurs. This requirement is advantageous, as the cleavage reagent is assembled only at the target if the fingers have adequate specificity, and the combined requirement for binding of two proteins increases the overall specificity. Specificity is one of the biggest challenges for ZFNs. Some fingers bind equally well to triplets other than their intended target and have unspecific affinity for related sequences. Addition of fingers can improve specificity. On the other hand, this can also increase the possibility that a subset of fingers in a polydactyl domain will mediate off-target binding [30].

Specificity-driven research proceeded with the development of transcription activator-like effector nucleases (TALENs). TALENs have a structure comparable to ZFNs, but they use a different DNA-binding domain that is easier to guide towards the desired sequence. The simple and modular TALE domain consists of numerous tandem repeats, each of which recognizes a single nucleotide in the major groove of DNA. Single base pair recognition by each repeat is determined by alteration of only two hypervariable amino acids, termed repeat variable diresidues (RVDs), and each repeat appears to recognize DNA in a modular manner. The biggest limitation of TALENs is the sequence requirement, as every targeted region must be immediately preceded by a 5’-thymine for efficient DNA recognition [31].

In the meantime, a new approach using RNA instead of DNA targeting was identified. RNA-guided engineered nucleases (RGENs) were first described in bacteria and archaea, where they provide adaptive immunity against invading phages or plasmids. Gene expression can also be regulated by RNA via RNA interference (RNAi). RNA silencing is a conserved phenomenon of regulation of gene expression by small RNAs derived from cleavage of double-stranded RNA (dsRNA). There are three overlapping modes of small RNA-mediated silencing, particularly in plants. In the case of post-transcriptional gene silencing (PTGS), the endonuclease Dicer cleaves dsRNA to produce approximately 21nt-long small interfering RNAs (siRNAs), which guide RISC, another nuclease complex, to destroy specific target mRNAs based on sequence complementarity with the siRNA. Another class of 25nt-long siRNAs is also produced from dsRNA, by a Dicer different from the one that generates the 21nt-long siRNAs. These longer siRNAs are probably involved in systemic silencing during PTGS. They guide methylation of both DNA and histones, and induce heterochromatin formation and consequent transcriptional repression of the targeted gene. Both siRNA-mediated PTGS and epigenetic modification of the genome are considered defense mechanisms to protect against invading viruses, transposons, or aberrantly expressing transgenes. Regulation of expression of endogenous genes is mediated by another class of 21nt-long small RNAs called microRNAs (miRNAs). Genes encoding the miRNAs are present in the intergenic regions, introns, or coding regions of the plant genome. Cleavage of a stem-loop precursor transcript called pre-miRNA by another class of Dicer generates miRNAs. Subsequently, miRNAs in association with a nuclease complex similar to RISC either degrade the target mRNA or cause translational repression [32].

The most successful and currently used RGENs belong to the gene editing technology of CRISPR/Cas9. This method has significantly increased the efficiency of transgenesis and it is being used not only in model organisms, but also in humans. The idea behind this groundbreaking method was very simple. The HDR pathway has higher efficiency in repair of DSBs, and so the delivery of a longer repair template would result in higher efficiency for generation of mutant alleles [33]. In the native bacterial system, a small fragment of invading DNA (up to 20 bp) is captured and inserted into the host's own genome to form CRISPR (clustered regularly interspaced short palindromic repeats). These CRISPR regions are transcribed as pre-crRNA and processed to give rise to target-specific CRISPR RNA (crRNA). The target-independent trans-activating crRNA (tracrRNA) is also transcribed from the locus and contributes to processing of the pre-crRNA. Both crRNA and tracrRNA are in complex with CRISPR-associated protein 9 (Cas9) to form an active DNA endonuclease. This final endonuclease is capable of cleaving the 23nt-long target DNA consisting of the 20 bp guide sequence in the crRNA and the 5’-NGG-3’ sequence known as the protospacer adjacent motif (PAM), which is directly identified by Cas9 itself [34]. In the last decade, the CRISPR/Cas9 gene editing technology has considerably changed the world of transgenesis methods. New updated variants are regularly produced, and their benefit to generation of targeted animal models is constantly increasing. Its therapeutic potential has only begun to emerge, and it has already yielded valuable results, for example in the treatment of Duchenne muscular dystrophy.
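The 23-nt target structure described above (a 20-nt guide-matching sequence followed by a 5’-NGG-3’ PAM) can be illustrated with a minimal sketch that scans a DNA string for candidate Cas9 sites. This is an illustration under stated assumptions rather than a design tool: the function name `find_cas9_targets` is hypothetical, only the given strand is scanned (the reverse complement is ignored), the input is assumed to contain only A, C, G and T, and off-target effects are not considered.

```python
def find_cas9_targets(seq: str, guide_len: int = 20):
    """List (position, protospacer, PAM) for every NGG PAM on the given strand.

    A candidate site is a guide_len-nt protospacer immediately followed
    by a 3-nt PAM whose last two bases are 'GG'.
    """
    seq = seq.upper()
    hits = []
    # The last valid start index leaves room for the protospacer plus 3-nt PAM.
    for start in range(len(seq) - guide_len - 2):
        pam = seq[start + guide_len : start + guide_len + 3]
        if pam[1:] == "GG":
            hits.append((start, seq[start : start + guide_len], pam))
    return hits


# Made-up example sequence: 20 nt of filler followed by a TGG PAM.
example = "ACGTACGTACGTACGTACGT" + "TGG" + "AAAA"
for position, protospacer, pam in find_cas9_targets(example):
    print(position, protospacer, pam)
```

Restricting the scan to one strand keeps the sketch short; a real search would also examine the reverse complement.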


Introduction to the cell cycle and division

The cell cycle is a series of events during which all cellular components double and accurately segregate into two daughter cells. The whole cycle can be divided into four phases: Synthesis (S phase), Mitosis (M phase), and two Gap phases, G1 and G2. Numerous regulatory proteins are involved in directing a cell through the cell cycle. This multilevel machinery needs control points, called checkpoints, to ensure equal distribution of all newly synthesized proteins and DNA into daughter cells (Fig. 3). There are three main checkpoints, the M, G1, and G2 checkpoints, which are regulatory events enabling cells to make sure that the previous phase was completed correctly. Progression through these checkpoints depends on favorable conditions, proper chromosome segregation and correct attachment of sister chromatids. The central drivers of cell cycle progression are cyclins and cyclin-dependent kinases (CDKs) with selective specificity for the individual cell cycle phases. CDKs phosphorylate key substrates to promote DNA synthesis and mitotic progression. Their activity can be regulated by binding of small inhibitory proteins (CKIs), which block phosphate transfer to the substrate [35].

An important target of one of the CDKs, CDK1, during G1 phase is the retinoblastoma protein (Rb), which is phosphorylated on multiple residues. Hypo-phosphorylated Rb binds the E2F transcription factor and thus blocks it from promoting transcription. Once the CDK phosphorylates Rb, E2F is released to participate in the transcription machinery. During the rest of the cell cycle, the Rb protein remains hyper-phosphorylated [36].

Each daughter cell produced by cell division must obtain an appropriate amount of genetic and biosynthetic material. To ensure this, the cell must double its content prior to division. Cell size control is essential for multicellular organisms, and it is critical for regulation of nutrient distribution. Cell size checkpoints happen during the G1 or G2 phase. Large daughter cells are allowed to speed up progression through G1 or G2 phase, while smaller daughter cells are provided with extra time during these gap phases to sufficiently grow [37].

S phase is the stage in which DNA replication takes place, and it is also the part of the cell cycle most sensitive to DNA damage. Replication starts at specific sites, the replication origins, which are epigenetically defined. The beginning of replication is controlled by phosphorylation of two proteins, Cdt1 and Cdc6. This process initiates replication and at the same time causes degradation of these two proteins, which prevents re-initiation of replication [38].

Figure 3 The cell cycle consists of G1, S, G2, and M phases, which are driven by various cyclin/CDK complexes. Each progression from one phase to the next one is monitored by different checkpoints. S phase is regulated by the replication checkpoint that monitors the initiation of replication, replication fork stability, fork progression, and DNA lesions. G2/M checkpoint monitors the completion of DNA replication with high fidelity. The spindle checkpoint makes sure that chromosomes are aligned and segregated for even distribution into two daughter cells [39].

The actual phase responsible for division of genetic and cellular material is M phase, mitosis. It consists of four basic phases: prophase, metaphase, anaphase and telophase. The first one can be divided into two stages, prophase and prometaphase. These phases always follow in strict sequential order. During the first stage, prophase, the spindle apparatus is assembled and the condensation of mitotic chromatids occurs. During prometaphase, the nuclear envelope breaks down and the spindle poles migrate to the cellular poles. In metaphase, the chromosomes are arranged in the characteristic metaphase plate, and in anaphase the sister chromatids segregate. Mitosis is completed by telophase and cytokinesis with the formation of two identical daughter cells [40].

Appropriate spindle apparatus (SA) assembly is dependent on correct interaction between microtubules (MTs) and chromosomes, which ensures equal partitioning of chromosomes into daughter cells. The material that surrounds and permeates the spindle MTs is defined as the spindle matrix, and it retains its integrity upon MT disassembly. Spindles are not surrounded by any membranes, and therefore they need to concentrate many components in order to support spatially diverse reactions [41]. The central role in organizing and assembling the SA is played by the centrosome, which is a major component of the microtubule organizing center (MTOC). The MTOC is responsible for orchestrating a wide variety of cellular processes such as acquisition of polarity, signaling, adhesion, and protein trafficking. The centrosome provides crucial links to the nucleus, cytoskeleton and Golgi apparatus, and thus it positions and shapes the MTs [42].

Assembly of the spindle apparatus is also partially regulated epigenetically. Chromosome-induced polymerization is a crucial mechanism in cells without centrosomes (female meiosis in Drosophila) and it is a supportive mechanism in centrosome-containing cells. Typical cells with this type of epigenetic regulation of SA are oocytes, where the whole procedure happens in the absence of transcription. Histones in nucleosomes, not the DNA alone, mediate the spindle apparatus assembly and nuclear envelope (NE) re-formation (Fig. 4) [43].

The duration of cell cycle phases varies between different cell types. The whole cell cycle of typical proliferating human cells takes approximately 24 hours, where the G1 phase usually lasts approximately 11 hours, S phase might take up to 8 hours, G2 phase needs around 4 hours and the mitotic phase itself takes approximately 60 minutes. Cleavage is a very specific type of cell cycle. It is a typical cell division process happening in early embryonic cells shortly after fertilization. In this particular case, there is no time used for cell growth and both gap phases are omitted. This cell cycle consists only of very short S phases alternated with M phase. On the other hand, there are also specific cell types that divide only occasionally upon stimulation due to the cell loss caused by injury or cell death, for example skin fibroblasts or kidney cells [44].
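As a quick check of the approximate durations quoted above, the phase lengths can be summed and expressed as fractions of the whole cycle. The sketch below uses only the figures from the text (with the upper estimate of 8 hours for S phase); the names `phase_hours` and `phase_fraction` are illustrative.

```python
# Approximate phase durations in hours for a typical proliferating human cell,
# as quoted in the text (S phase uses the upper estimate of 8 hours).
phase_hours = {"G1": 11, "S": 8, "G2": 4, "M": 1}

total_hours = sum(phase_hours.values())


def phase_fraction(phase: str) -> float:
    """Fraction of the whole cycle spent in the given phase."""
    return phase_hours[phase] / total_hours


for phase, hours in phase_hours.items():
    print(f"{phase}: {hours} h ({phase_fraction(phase):.0%} of the cycle)")
print(f"Total: {total_hours} h")
```

With these values, the phases add up to the 24 hours stated for the whole cycle, and mitosis itself occupies only about 4% of it.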


Figure 4 Nucleosome functions in nuclear envelope formation and spindle assembly. A: During spindle assembly, nucleosomes (top) generate signals (yellow) that induce polymerization of microtubules (red). In cells without centrosomes (left), these signals are responsible for spindle assembly. In cells with centrosomes [45], these signals support spindle assembly through microtubule formation at kinetochores. B: After mitosis, nucleosomes (top) generate signals (green gradient in the middle picture) around decondensing chromosomes (blue) to specify nuclear envelope formation (bottom) [43].


Epigenetic mechanisms

Epigenetics is characterized by changes in gene expression that are not caused by DNA alterations. It is one of the most recent fields of genetics and is linked with diseases and cancer. In addition, epigenetic drugs preventing the onset of these diseases including cancer have been proven to bring promising results [46]. Moreover, epigenetic orchestration is essential for normal embryonic development and it is responsible for maintenance of tissue-specific gene expression. There are four main components of epigenetic machinery: DNA methylation, histone modifications (methylation, acetylation, and phosphorylation), non-coding RNA regulations (RNA interference) and nucleosome positioning (promoter-enhancer interactions). All procedures are variable, reversible and flexible; therefore, they have become a main target of epigenetic therapy. The place of action of an epigenetic modification is a eukaryotic cell nucleus, which is formed by DNA strands compacted by histones into nucleosomes. Histones form an octamer made of two of each histone proteins (H2A, H2B, H3, and H4). Epigenetic modification modulates the expression from these genes in every cell type not only differently, but also specifically for each developmental stage [47].

The most natural epigenetic modification is DNA methylation. It is widely spread across all the species including bacteria, plants, or mammals. Methylation of deoxyribonucleic acid (DNA) occurs during the cell division process and it is considered as stable epigenetic regulation. Its activity often results in eliminated expression of the targeted gene or genes, the process called gene silencing. This mechanism is based on transferring a methyl group from S-adenosyl methionine [48] onto the cytosine ring at the 5’end of the region with cytosine and guanine dinucleotide repeats (CpG islands). The whole procedure is carried out by de novo methyltransferases (Dnmt). These enzymes are crucial for embryonic development. Dnmt3A and Dnmt3B are responsible for the process of embryo formation and are required for new methylation patterns, from scratch [49]. Dnmt1 is also considered to act like the maintenance methyltransferase responsible for epigenetic inheritance. Due to the Dnmt1 methylation activity, the hemi-methylated CpG islands are functional only if one of the complementary DNA strands carries the methylation marks [50].


Figure 5 Three basic components of epigenetic machinery, histone modifications, RNA interference, and DNA methylation [51].

Methylation silences the gene expression either by preventing the recognition and binding of a transcription factor to its binding site, or through the recruitment of methylated binding domain proteins mediating the gene silencing. Alternatively, gene silencing can also be introduced by blocking the binding enhancer and DNA binding proteins, which results in the regulation of chromatin boundary and arrest of gene expression [52].

In normal cells, CpG islands are strategically distributed. They can be found in promoter regions, where their methylation leads to the long-term silencing of transcription of all genes that are under the control of particular promoters. The DNA methylation pattern is tissue specific and is continuously formed during cell differentiation. On the other hand, methylation might be the reason for the cell division arrest. Unmethylated chromosomal regions are transcriptionally active and expressed. These regions are called euchromatin. Opposite structures, with methylation marks and condensed DNA, which cannot be approached by the transcription machinery, are called heterochromatin [53].


Heterochromatin is a tightly compacted part of DNA forming dark areas in nuclear staining, as the intercalating agents cannot approach the minor and major DNA grooves. DNA compaction is mainly caused by histones and their methylation. The N-terminal tails of histones protrude from the nucleosome core. Therefore, their amino acids can easily undergo covalent modifications, including methylation, acetylation, phosphorylation, ubiquitination, or sumoylation [54]. These modifications can act either individually or in combination with others. Multiple active histone modifications combine redundantly to ensure robust chromatin regulation. They are believed to encode heritable epigenetic programs that regulate gene expression, X-chromosome inactivation, heterochromatin formation, mitosis, and DNA repair [55]. Chromatin acts as a template for DNA-mediated processes; therefore, histone modifications represent an essential component of the expression machinery. The methylation of different histones has distinct effects on the expression of the region. Site-specific histone modifications correlate with particular regulation of biological functions such as DNA transcription [56]. Euchromatin or transcriptionally active marks responsible for an accessible/opened conformation are represented by acetylation of lysine 9 and 14 of histone 3 (H3K9ac, H3K14ac) and of lysine 5 of histone 4 (H4K5ac); methylation of lysine 4, 36 and 79 of histone 3 (H3K4me, H3K36me and H3K79me) and arginine 3 of histone 4 (H4R3me); and phosphorylation of serine 10 of histone 3 (H3S10ph). Heterochromatin marks leading to inactivation of gene expression and to nucleosome fiber condensation are represented by acetylation of lysine 12 of histone 4 (H4K12ac); methylation of lysine 9 and 27 of histone 3 (H3K9me, H3K27me) and lysine 20 of histone 4 (H4K20me); and phosphorylation of serine 10 of histone 3 (H3S10ph) and serine 7 of centromere protein A (CENP-A-S7ph) [57, 58].

Histone lysine methylation is a reversible process: methyl marks are deposited by specialized lysine methyltransferases (KMTs) and removed by lysine demethylases (KDMs). The most abundant and stable methylation is located at lysine 9 of histone 3 (H3K9) and is involved in gene repression and heterochromatin formation. The first KMT identified, SUV39H1, is responsible for methylation of H3K9 [59]. At present, many other enzymes involved in these processes have been identified and divided into two classes based on their catalytic domains. The first group contains a highly conserved SET (Su(var)3-9, Enhancer-of-zeste and Trithorax) domain, which is approximately 130-140 amino acids long. For its function, it needs the presence of two cysteine-rich domains: one prior and one posterior to the SET domain. These domains are crucial for substrate recognition [60]. The second group consists of proteins without a SET domain. It is also highly conserved and consists of DOT (Disruptor Of Telomeric silencing) and its homologs. Generally, the first group of KMTs is responsible for methylation of N-terminal lysines, while the second group prefers lysines inside the histone globular core. H3K9 methylation is one of the best studied histone modifications and is mediated by SUV39H, G9a, and SETDB [61]. SUV39H is the major H3K9-specific methyltransferase and targets the pericentric regions. The trimethylation of lysine 9 of histone 3 (H3K9me3) is a crucial hallmark of constitutive heterochromatin (CH), which is invariable and critical for functional chromosomal domains such as the pericentromere. SUV39H (1 and 2) targets CH domains and transposons via its chromodomain [62]. It specifically recognizes only the trimethylated version of H3K9 and catalyzes silencing of pericentric regions [63]. Hence, for correct functioning of SUV39H, three rounds of methylation of H3K9 must be accomplished. Mono- and di-methylation are driven by two other KMTs, G9a and GLP. Both belong to the first group of KMTs with the SET domain. These two enzymes usually occur as a heteromeric complex and are crucial for euchromatin structure. In comparison with other histone marks, H3K9me2 exhibits longer and continuous distribution in genomic DNA. G9a/GLP-mediated H3K9me2 marks are important for the establishment of facultative heterochromatin (FH) domains that are located within silent euchromatin. The G9a/GLP complex also interacts with DNMTs and controls DNA methylation [64]. Another important KMT, SETDB (1 and 2), is crucial for early embryonic development [65]. It can either directly di-methylate H3K9, or together with its cofactor MCAF1 tri-methylate the residue.
SETDB has a dual role and is capable of both gene silencing and pericentric heterochromatin formation [66]. In particular, SETDB1 was recently linked with the newly identified HUSH complex, which consists of MPHOSPH8, PERIPHILIN and FAM208A. MPHOSPH8 recognizes H3K9me3 loci via its chromodomain [62] and together with other components of the HUSH complex recruits SETDB1 and drives deposition of the tri-methylation mark to maintain transcriptional silencing [10]. Interestingly, MPHOSPH8 also mediates the link between DNA methylation by DNMT3a and heterochromatin methylation marks by G9a [67].

RNA silencing is a new field of research based on small RNAs derived from the cleavage of double-stranded RNA (dsRNA). These RNA molecules can trigger epigenetic silencing both in the cytoplasm and at the genomic level. Messenger RNA (mRNA) complementary to this small RNA is targeted for post-transcriptional degradation, or, in plants, methylation is directed on the basis of sequence homology. RNA interference (RNAi) is basically described as a post-transcriptional gene-silencing (PTGS) phenomenon caused by dsRNA molecules. This dsRNA is processed into shorter fragments that guide the recognition of homologous target RNA, leading to its cleavage. The original dsRNA sequence can be generated from various sources, such as transcription through inverted DNA repeats, simultaneous synthesis of sense and antisense RNAs, viral replication, and activity of RNA-dependent RNA polymerase (RdRP) on a single-stranded RNA (ssRNA) template [68]. Epigenetic silencing might also be targeted by RNA with little or no protein-coding potential, called non-coding RNA (ncRNA). The majority of the mammalian genome is comprised of ncRNAs. The maximum length of small ncRNA is set to 200 nucleotides and epigenetics distinguishes several subtypes including microRNA (miRNA), PIWI-interacting RNA (piRNA), small interfering RNA (siRNA), small nucleolar RNA (snoRNA), and transfer RNA [69, 70].

The majority of the genome is represented by non-coding regions, which are still capable of transcription, resulting in the production of long non-coding RNAs (lncRNAs). These regulatory sequences are above 200 nt long and they directly interact with the proteins involved in heterochromatin formation and with epigenetic modifiers. Most lncRNAs can directly interact with histone methyltransferases or members of the Polycomb Repressive Complex 2 (PRC2). These interactions orchestrate gene transcription, chromatin remodeling, X-inactivation and pre-mRNA splicing [71].

Recent advances in epigenetic profiling revealed the importance of regulatory elements, enhancers, and promoters in the regulation of expression at the non-genomic level. Enhancers are the most utilized parts of the whole genome, and the number of putative enhancers in humans reaches one million. This expansion of enhancer elements suggests an enormous combinatorial complexity of expression patterns. Enhancer DNA is commonly 200–500 bp in length and contains clustered sites that are recognized by various transcription factors (TFs). This genomic region exhibits nucleosomal depletion and the presence of the specific histone variants H3.3 and H2A.Z. These nucleosomes are highly mobile, their DNA is hypomethylated, and in case they border TF-binding regions, they possess specific modifications such as H3K4me1 (nucleosomal depletion) and H3K27ac (enhancer activation). The presence of H3K4me1 creates an opportunity for enhancer activation and can facilitate nucleosomal mobility and binding of TFs. The H3K27ac mark is associated with active enhancers, but it is formed exclusively in the presence of H3K4me1. Relative enrichment of both of these epigenetic marks is currently considered the major epigenetic feature that can differentiate enhancers from promoters, and active from poised states. Understanding that enhancer regions share common chromatin features brought a new perspective to their identification and possible regulation [72].


Epigenetic regulation during the cell cycle

Transmission of epigenetic marks from mother to daughter cell is a crucial process ensured by epigenetic inheritance. Both genetic and epigenetic information are challenged mostly at the replication fork. Despite this disruptive event, inheritance of epigenetic marks can be ensured in a replication- or time-coupled manner. DNA methylation is inherited at the replication fork in a semiconservative way. Transmission of information is highly affected by redistribution of parental modifications and histones at the replication fork. The centromeric histone is newly deposited during late telophase/G1 phase, which highlights a new window for inheritance during the cell cycle. Pericentric heterochromatin is maintained by the complex activity of histone modifiers, DNA methylation, and chromatin assembly factors during the whole cell cycle. Reversibility of the epigenetic state is underlined by reprogramming during embryonic development [73].

True epigenetic information is fully qualified only if the marks are heritable. Furthermore, in contrast to genetic information, which is known to be very stable, epigenetic information reveals a certain level of plasticity and is inherently reversible. During DNA replication, chromatin undergoes disruption followed by restoration. Subsequently, the replication fork passage opens a window for epigenetic reprogramming, usually broadly used during cell differentiation and development [74]. DNA replication proceeds in two manners - continuous synthesis of the leading strand and discontinuous synthesis of the lagging strand. Both strands use different DNA polymerases, but share loading of DNA factor proliferating cell nuclear antigen (PCNA). In addition to the important role in DNA duplication, PCNA is also crucial for epigenetic inheritance due to its capability of interacting with chromatin-modifying factors (DNMT, HDAC, SETDB) [75]. The DNA methylation profile at the replication fork is maintained by DNMT1. Although this methyltransferase is usually responsible for de novo activity, during the cell cycle, it interacts with SET-and-RING-associated (SRA)-domain-containing proteins (VIM1, NP95), which preferentially bind to hemi-methylated DNA and perform replication of heterochromatin regions. The newly replicated DNA methylation machinery is boosted by the presence of ATP-dependent chromatin-remodeling factor decreased DNA methylation 1 (HELLS), responsible for providing access to DNA [76].
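The semiconservative inheritance of DNA methylation described above can be sketched as a toy model. It assumes a simplified picture in which each CpG site on each strand is either methylated or not, replication pairs every parental strand with an unmethylated new strand, and a maintenance step then copies marks onto hemi-methylated sites. The function names `replicate` and `maintain` and the list-of-booleans representation are hypothetical and serve only as an illustration.

```python
def replicate(duplex):
    """Semiconservative replication: each parental strand pairs with a new,
    unmethylated strand, producing two hemi-methylated daughter duplexes."""
    top, bottom = duplex
    unmethylated = [False] * len(top)
    return (list(top), list(unmethylated)), (list(unmethylated), list(bottom))


def maintain(duplex):
    """Maintenance step: a site methylated on one strand (hemi-methylated)
    becomes methylated on both strands; unmethylated sites stay unmethylated."""
    top, bottom = duplex
    restored = [t or b for t, b in zip(top, bottom)]
    return list(restored), list(restored)


# Parental duplex with three CpG sites: methylated, unmethylated, methylated.
parent = ([True, False, True], [True, False, True])

for daughter in replicate(parent):
    print("after replication:", daughter)
    print("after maintenance:", maintain(daughter))
```

In this simplified model both daughter duplexes regain the parental pattern, while a site that was unmethylated in the parent stays unmethylated, which is the sense in which the mark is heritable.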


Figure 6 Histone inheritance mechanism during DNA replication. First, nucleosomes are disrupted by the replication fork [77], original parental histones with original epigenetic marks are used for recycling [77] or as a template for a new histone; the last possibility is a brand new histone with de novo deposition of epigenetic marks [78].

Histone modifications are also considered an important part of the heritable epigenome. Unlike DNA methylation patterns, histone marks are disrupted by replication forks, and the template for correct reassembly seems to be missing. However, to avoid the loss of information, a specific replicative histone H3 variant (H3.1) is used. Proper coordination between the recycling of parental histones and incorporation of newly synthesized histones is required. This assembly includes deposition of the (H3-H4)2 tetramer onto DNA, followed by deposition of two H2A-H2B dimers. Correct disruption and reassembly is assisted by histone chaperones. During DNA replication, the parental (H3-H4)2 tetramer can either divide into two dimers (split), or remain as a complex (unsplit). The unsplit variant can be directly used for formation of a new nucleosome while keeping its epigenetic marks the whole time. In case the split parental dimers re-associate, the marks can also be directly transferred onto daughter strands. Newly synthesized histone dimers carry specific marks and either form nucleosomes by combination with one of the split parental dimers, or form a novel tetramer made of only new histones (Fig. 6). In both cases these histones undergo maturation to restore epigenetic marks [79].

During mitosis, chromatin undergoes massive changes: the transcription machinery arrests and chromatin-binding factors are removed, followed by changes in histone modifications and chromatin structure. Mitotic chromosomes lose the cell-specific and locus-dependent organization characteristic of interphase chromatin and open into randomly positioned loop arrays. Once mitosis is finished, cells quickly restore the original cell type-specific organization [80].

Mitotic chromosomes have a specific, locus-independent chromosomal organization, which is common across cell types, while interphase chromosome organization is highly cell type-specific. Cells restore this cell type-specific organization during the early G1 phase of the cell cycle [81]. Mitotic bookmarks are crucial elements for this restoration. Mitotic bookmarks (MB) are specific gene regulatory elements that can exist in the form of histone modifications and variants, DNA methylation marks, non-coding RNA, or specific transcription factors and histone readers. There are two classes of MBs: those that act directly during mitosis (histone phosphorylation) and those that are responsible for epigenetic memory by controlling the opened/closed chromatin structure (phospho/methyl switch, H3K9me3/H3S10ph) during the early G1 phase [82].

In the past, it was generally assumed that RNA does not play a regulatory role during cell division, since the transcription machinery is paused due to replication procedures and the retained RNA molecules have to be produced before the actual initiation of replication [83]. The identification of so-called mitotic chromatin-associated RNAs (mCARs) revealed the presence of a large group of highly conserved non-coding RNAs (ncRNAs) during mitosis. So far, only some hints are known about these special mitosis-related RNAs. All of them are located on condensed chromatin, but based on their localization they can be divided into two groups, external and internal. These mCARs are highly conserved across species, which suggests a conserved function in mitosis and in maintenance of the heritable epigenome [84].


DNA replication creates a challenge for preserving epigenetic memory. It requires restoration of chromatin-associated histone modifications after their disruption. The histone methyltransferase SUV39H plays a crucial role in de novo nucleation based on recruitment by a DNA or RNA strand. Once the methylation marks are established, the heterochromatin itself provides an additional epigenetic template that propagates heterochromatin structure in a self-templating manner [85]. Another well-known histone methyltransferase, SETDB1, seems to play a role mostly in spreading the H3K9me3 mark to form new heterochromatin regions. The whole mechanism involves the HUSH complex (TASOR, MPP8 and PPHLN1). The silencing process starts with recognition of the H3K9me3 mark by the chromodomain of MPP8, and this is followed by methylation of consecutive histones by SETDB1. This heterochromatin formation is mostly used in epigenetic silencing of newly integrated viral and non-viral DNA and leads to position effect variegation (PEV) [86]. On the other hand, it is quite possible that the cooperation of SETDB1 and HUSH plays an important role in the maintenance of epigenome inheritance.


Epigenetic regulation during embryonic development

Gastrulation is a critical developmental process of differentiation into the three germ layers (ectoderm, mesoderm, and definitive endoderm, DE). The most evident progress in embryonic shape and size happens between the end of implantation and the onset of gastrulation (E5.5 to E6.5). In this time window, the embryonic epiblast proliferates at its highest rate to reach a critical cell number. Formation of the primitive streak (PS) at the posterior embryonic site at E6.5 is a typical hallmark of gastrulation. Roughly at the same time, the anterior visceral endoderm (AVE) forms by migration of the distal visceral endoderm (DVE). Finally, the epiblast cells undergo epithelial-to-mesenchymal transition (EMT) in the PS and create the mesoderm and DE. Epiblast cells that do not enter the PS remain in the epiblast and give rise to ectodermal lineages. Gastrulation is also a period of dynamic epigenetic changes, involving many different known epigenetic regulations [69].

Epigenetics evidently also controls development. Hundreds of cell types differentiate from a single fertilized oocyte. Epigenetic mechanisms are essential for the establishment and maintenance of a stable cell identity. At the molecular level, transcription factors initiate the specific gene expression programming, and epigenetic regulations contribute to stabilization of expression patterns. These regulations include small RNAs and chromatin- and DNA-modifying complexes. Methylation of cytosine and histone modifiers of the Polycomb group (PcG) proteins (PRC1, PRC2) are prominent examples of regulators that repress genes inappropriate for a certain cell type [87].

PcG maintains the spatial patterning of Hox genes, which are responsible for the regulation of anterior-posterior (AP) patterning. Their main tool is deposition of epigenetic marks, mainly H3K27me3 and H3K9me2, in the regions where they normally do not belong. PcG-induced tri-methylation of H3K27 is responsible for X-chromosome inactivation (XCI) and genomic imprinting. Imprinting is the process in which only a single allele is expressed due to epigenetic regulation. Many imprinted genes are involved in the regulation of placental and fetal growth. Both imprinting and XCI are regulated by ncRNAs, which bind to the respective imprinting control regions (ICRs) that control expression of all genes located in the clusters. An alteration of chromatin states induces silencing mediated by DNA methylation or PcG-mediated histone modifications [88].


Zygotic genome expression starts at the two-cell stage, but before that, it all starts within gametes during the process of gametogenesis. Male and female gametes differ in their epigenetic profiles. Sperm DNA is highly methylated and tightly condensed with protamines, specific proteins that have replaced histones. Oocyte DNA is specifically methylated in a dual pattern and shows a non-canonical distribution of chromatin modifications. Both patterns are established during gametogenesis, which begins already during embryogenesis. Precursors of gametes, primordial germ cells (PGCs), migrate from the epiblast to the genital ridge (E9.5 – E11.5), undergoing almost complete de-methylation of the genomic DNA. Histone repressive marks are also widely reorganized, mostly with the loss of H3K9me2 and an increase of H3K27me3 [89].

Male progenitors, prospermatogonia, re-establish the DNA methylation status before birth and complete it at the termination of meiosis after the birth. Sperm DNA is initially wrapped around histones, and later on, those are replaced first with histone variants and transition proteins and subsequently with protamines [90]. Female precursors first undergo massive mitotic expansion followed by meiotic arrest in prophase I. During folliculogenesis, oocytes experience complete de novo DNA methylation mediated by DNMT3a and its cofactor DNMT3L. This methylation is strictly located in a unique pattern restricted to transcribed gene bodies. Histone modifications, mostly acquisition of H3K36me3 and exclusion of H3K4me3, are tightly associated with the regulation of DNA methylation as well. Obtaining non-canonical domains of H3K4me3 is required for genome-wide transcriptional silencing, oocyte maturation, and meiotic restoration [91]. The oocyte chromatin architecture is very special. The major chromatin changes happen during the maturation into the germinal vesicle (GV) oocyte. The non-surrounded nucleolar-like form transforms into the surrounded state with central position and highest meiotic competence [92].

Immediately after fertilization, paternal protamines are replaced by maternal histones, together with almost complete erasure of paternal DNA methylation marks. Maternal DNA methylation is highly conserved at this stage. Later on, the situation changes and non-canonical maternal H3K4me3 marks are removed by demethylases (KDM5B and KDM1A); this subsequently initiates zygotic genome activation (ZGA). Zygotic chromosome compartmentalization is associated with DNA methylation, H3K27me3 and chromatin accessibility, but not with maternal H3K4me3 [93].


Figure 7 An overview of the maternal-to-zygotic transition (MZT) in the mouse model organism. Key embryonic stages are depicted schematically above the corresponding cleavage cycle and time after fertilization. The red curves represent the degradation profiles of destabilized maternal transcripts in each species. The light and dark blue curves illustrate the minor and major waves, respectively, of zygotic genome activation. The last embryonic stage is the developmental point at which there is a major requirement for zygotic transcripts [94].

The onset of zygotic transcription is dependent on three main features: activation of the gene expression necessary for installation of the new cellular state, modification of chromatin status, and removal of the remaining products of the previous cellular program. The maternal-to-zygotic transition (MZT) plays an important role in all the above-mentioned features. MZT controls gene expression remodeling at all levels and is responsible for active clearance of maternally deposited mRNAs (Fig. 7). Maternal RNAs and proteins, which are loaded into the oocyte during oogenesis, drive basic biosynthesis processes, complete the first mitotic divisions, and specify the fate and patterning of cells in the early embryo [94]. Transcriptional competence and gene expression are achieved by regulation of chromatin accessibility from both parental epigenomes. An open nucleosome arrangement, DNA methylation, and histone modifications provide the conditions necessary for starting transcription at the MZT. The transcription machinery and accessory factors engage this chromatin and gene expression. General and specific transcription factors help to regulate gene expression at ZGA across all species. The stem cell markers Nanog, Sox2, and Oct4 are associated with chromatin remodeling activities; they bind to regions of repressed chromatin to induce nucleosome repositioning and allow other transcription factors to bind [95].

Fusion of the male and female pronuclei completes meiosis and induces a synchronous and rapid set of cell cycles. This type of cell division, cleavage, differs from the standard one, as it lacks the gap phases, G1 and G2. During cleavage, cells quickly shuttle between the M and S phases of the cell cycle. The outcome of this rapid proliferation is a cluster of cells with the same cytoplasmic volume as the original zygote. This stage massively alters the nucleus/cytoplasm (N/C) volume ratio, meaning an increasing quantity of nuclear material relative to the constant cytoplasm volume. The increasing N/C ratio controls the MZT and activates the transcription factors. Short cell cycles do not allow transcription and processing of large genes and favor the expression of shorter genes with fewer introns. The cell cycle dynamics appear to define the hierarchy of ZGA, where transcription of shorter genes is compatible with rapid cleavage, while expression of longer genes is delayed until the cell cycle lengthens into a standard cell division process [96].

Two distinct modes of maternal mRNA clearance have been described across animals. The maternal mode is based on maternally deposited factors, while the zygotic mode relies on de novo zygotic transcription. The maternal mode is strongly active shortly after fertilization. The second round of clearance overlaps with ZGA and is stopped by transcription inhibition. Zygotic clearance uses microRNAs (miRNAs), which are ~22 nt small RNAs incorporated into the silencing complex miRISC. This complex targets mRNAs and induces RNA decay via translation repression and deadenylation. The stability of an mRNA is influenced by its sequence, the 7-methylguanylate (m7G) cap at the 5’ end, and the poly(A) tail at the 3’ end. All three stability elements are targeted during the clearance of maternally deposited mRNA. The zygotic degradation pathway uses miRNAs as mediators that promote deadenylation of the 3’ poly(A) tail. Maternal transcripts are typically uniformly distributed, while zygotic transcripts are expressed only in restricted patterns. The maternal mode of mRNA decay occurs before or independently of zygotic transcription, while the zygotic mode occurs only after zygotic transcription and is blocked when zygotic transcription is inhibited. This indicates that some pathways of mRNA clearance are inherited in the oocyte cytoplasm and others are synthesized de novo. Therefore, both the maternal cytoplasm and the zygotic nucleus contribute to MZT [97].

Zygotic gene activation and maternal clearance are conserved activities during the maternal-to-zygotic transition that cooperate to reprogram terminally differentiated gametes to embryonic pluripotency. Cell cycle dynamics and chromatin structure permit gene expression during the MZT. Maternally deposited transcription factors may initiate ZGA by inducing open chromatin and recruiting other transcriptional machinery. Soon after initiation of ZGA, zygotic miRNAs are transcribed and act as the key players of maternal clearance [98].


Fam208a – an important component of the HUSH complex

Genome stability can be impaired by variable processes including erroneous DNA replication or unequal sister chromatid segregation. Homologous recombination is a universal mechanism responsible for DNA repair and at the same time, it provides a support for DNA replication. Moreover, it is a major pathway that suppresses the genome instability [99]. In contrast, natural decay or exogenous genotoxic agents such as ultraviolet light, oxidative stress, chemical mutagens, and radiation affect the stability of DNA constantly. To counteract this instability, multiple mechanisms of DNA repair have been developed in the cells to prevent accumulation of these changes. The DNA repair system is a complex machinery involving the system of checkpoints, homologous recombination or non-homologous end-joining process, posttranscriptional RNA modifications (m6A, alternative splicing), and posttranslational modifications of proteins (histone methylation, phosphorylation, etc.) [100, 101].

Histone methylation was discovered and described as the first of all modifications and there are several complementary mechanisms that orchestrate it; one of these pathways also includes the protein of our interest, Fam208a. Fam208a is a large protein (1660aa), originally designated as RAP140a and described as an interaction partner for human partial retinoblastoma protein, Rb. Two transcription variants (7 kb and 9 kb long cDNA fragments) of Fam208a were identified and both exhibited identical distribution among various tissues. The smaller construct was named RAP140 and predicted to encode a 1,233 amino acid hydrophilic protein. The first suggestions about its role were linked to intracellular translocation of Rb and to general involvement in cell-cycle control, gene expression, and tumorigenesis [102].

Afterwards, Fam208a was identified as a direct transcriptional target of Oct4, which is critical for cell pluripotency, differentiation activation, and gene repression. This regulation for the first time links Fam208a with stem cell identity and control of chromatin structures. Oct4 itself orchestrates cellular pathways to regulate the stem cell state and via transcription regulations, it provides a crucial tool to establish the future cell function [103].

Later, Fam208a was described as a potential suppressor of variegation in an ENU mutagenesis screen with an integrated multi-copy green fluorescent protein (GFP) transgene under the control of a hemoglobin promoter. Random N-ethyl-N-nitrosourea (ENU) mutagenesis was used in a sensitized screen to identify genes involved in silencing of the variegating GFP transgene. The dominant screen included both suppressors and enhancers of variegation. This screen has identified genes that are involved in epigenetic reprogramming of the genome. In addition, the behavior of the mutant lines suggests a common mechanism between gene inactivation and transgene or retrotransposon silencing [104].

In the screen describing Fam208a, mouse lines that affected the fluorescence (intensity or percentage of positive cells) of red blood cells (RBC) were called modifiers of murine metastable epialleles (Mommes). Based on their inheritance, Mommes were further described as either recessive (Momme R) or dominant (Momme D) [105]. Two independent lines with induced mutations in Fam208a, MommeD6 (L130P) and MommeD20 (Stop codon in intron 1 region), were identified and described as embryonically lethal. Both mutations led to suppression of variegation and resulted in a remarkable increase in both GFP signal strength and the number of positive RBC. Haploinsufficiency for Fam208a affected the transgene expression at the RNA level, which suggested Fam208a-dependent regulation at the expression level [106]. Studies on Fam208a-L130P suggested its essential involvement in early embryonic development. The homozygous MommeD6 line showed post-implantation defects. This mutation impaired primitive streak elongation and delayed the epithelial-to-mesenchymal transition (EMT). Epiblasts of L130P mutants showed increased levels of p53 pathway genes as well as of several pluripotency-associated long non-coding RNAs (lncRNAs). This establishes Fam208a as an important factor for the onset of gastrulation, when cells exit the pluripotent state and start differentiating [69].

The usage of near-haploid KBM7 cells in subsequent genetic screen allowed identification of a complex of four proteins, FAM208a, MPHOSPH8, SETDB1, and PPHLN1, designated as HUSH (human silencing Hub) complex [10]. This complex plays an important role in maintaining the genome stability and epigenetic silencing via spreading the H3K9me3 mark. The HUSH complex recruits MORC2 to target sites in heterochromatin (Fig. 8).


Figure 8 Epigenetic silencing mediated by the HUSH complex. MORC2 provides ATPase activity crucial for spreading the heterochromatin mark and recruitment of SETDB1, resulting in silencing of cellular genes (black) and integrated transgenes (red) [107].

MORC2 has specific ATPase activity, which is critical for HUSH-mediated silencing. Upon loss of MORC2, chromatin completely loses its compaction. Therefore, its presence is important for appropriate functioning of HUSH [107].

Position effect variegation (PEV) has been widely studied in flies, and even though the HUSH complex is involved and conserved in all vertebrates, the Drosophila genome does not possess clear orthologs of any of the HUSH elements. Therefore, this whole system represents a relatively novel route to heterochromatin formation via H3K9me3. The main HUSH function is to mediate spreading of pre-existing heterochromatin through the reading and writing of H3K9me3, and hence it is responsible for the silencing of active transgenes only when they are integrated into heterochromatic genomic loci [86].


Methylation processes are also involved in retaining the genome stability driven by several signaling pathways. One of them includes the HUSH complex with FAM208a in its core (together with Periphilin and Mphosph8). Modifiers of position-effect variegation have been proved to be involved in heterochromatin assembly and in methylation patterning (Fig. 9). Loss of the HUSH complex resulted in a decrease of H3K9me3 mark in endogenous loci as well as in retroviral integrated regions. The HUSH complex as such is able to distinguish loci rich in H3K9 tri-methylated marks and to recruit methyltransferase Setdb1 to promote spreading of the transcriptional silencing mark down the chromatin [10].

Figure 9 Mechanism of PEV with integration of virus in the proximity of heterochromatin. Integration of viral DNA into the host genome leads to repression by the HUSH complex with a high methylation level of H3K9. The HUSH complex recruits SETDB1 and proceeds with spreading of heterochromatin [108].

The HUSH complex was shown to be involved in the regulation of silencing integrated retroviruses as well as of endogenous regions by recruiting methyltransferase SETDB1 to H3K9 methylation sites [109]. Recently, it was proved that FAM208a in the HUSH complex binds to sequences of endogenous retroviruses (ERVs) and to long interspersed nuclear elements (LINE-1s/L1s) [110]. TRIM28 is necessary to repress ERV and evolutionarily young L1a, and together with Fam208a it cooperates in the HUSH complex within human naïve cells. Moreover, it was shown that the HUSH complex together with TRIM28 are responsible for co-repression of young retrotransposons and new genes promoted by noncoding DNA silencing [111].


Transposons are considered to be major players in genome evolution and regulation. These transposable elements used to be denounced as parasitic DNA, which must be controlled by the host. Current knowledge shows that these sequences are crucial and evolutionarily very stable. One of the best described autonomously mobile elements, L1 (long interspersed element-1, LINE-1), occupies seventeen percent of the human genome. It is responsible for variation amongst individuals and its transposition can result in disease (hemophilia A and B, cystic fibrosis, and others). The explicit mechanism of its regulation is not completely described, but several functionally diverse genes can either promote or restrict L1 insertion. Examples of these regulatory genes are MORC2 and Fam208a (together with the HUSH complex). Both genes selectively bind evolutionarily young L1 elements located within introns of euchromatic genes. The silencing event is promoted via tri-methylation of histone 3 (H3K9me3) and results in downregulation of the host gene expression [110].

Proviral gene and endogenous retroelement expression is epigenetically repressed in embryonic stem cells. Even though this phenomenon is not fully described mechanistically, the Moloney murine leukemia virus-based retroviral vector is known to be repressed by the Setdb1/Trim28 pathway. However, many other genes and complexes are involved in the whole mechanism. Fam208a also plays an important role in provirus silencing in mouse pluripotent stem cells. In addition, the HUSH complex is capable of repressing L1 and some ERVK such as IAPEY [109].

Components of the HUSH, SETDB1, and BAF complexes were also identified as provirus silencing factors in an siRNA screen [112]. Recently, the HUSH complex was also linked with controlling viral expression via restriction at the epigenetic level [113]. Human immunodeficiency viruses 1 and 2 (HIV) have evolved mechanisms to evade the host immune system. The main target is to eliminate cell restriction factors, and one of the first tools used by the viruses is viral protein X (Vpx). Vpx binds and downregulates HUSH and induces its proteasomal degradation via a ubiquitin adaptor. Therefore, Vpx is able to reactivate latent HIV proviruses. The HUSH complex can work as a restriction factor in primary CD4+ T cells and is deactivated by Vpx, which links the intrinsic immune response and epigenetic control [114]. It has been well described that, to escape elimination by the immune system, HIV uses silencing of provirus transcription in CD4+ T cells. However, later on, these mechanisms need to be activated again. The HUSH complex is capable of repressing primate immunodeficiency virus transcription, and therefore it is the target of Vpx- or Vpr-induced degradation [115].


Aims of the dissertation

• Detailed study of the ENU-induced Fam208a MommeD6 line with the L130P mutation and description of the impaired mechanisms in prenatally lethal embryos;

• Production of a complete murine KO strain for Fam208a and characterization of its phenotype;

• Ablation of the Fam208a protein during the earliest developmental stages of the embryo;

• Identification of putative interaction partners of the Fam208a protein;

• Expression profiling of potential interaction partners at the RNA and proteome level in murine embryos and cell lines;

• Depicting the interaction between L130P Fam208a and the HUSH complex member Mphosph8.


Materials and Methods

Whole-mount in situ hybridization and histology

Embryos were dissected from time-mated females into cold PBS containing 10% FBS and were fixed overnight at 4 °C in 4% paraformaldehyde in PBS containing 0.1% Tween-20 (PBT). Single-color whole-mount in situ hybridization was carried out as described [116]. RNA probes were labelled either with digoxigenin (DIG; Roche Diagnostics, Germany) or with FITC (Roche Diagnostics, Germany). The riboprobe template for Fam208a was prepared using the primers Fwd-ACCACTGGAGAAGCCTGAGA and Rev-GGAATCTTCCTGCTGCACTC; templates for T, Nodal, Cer1, Foxa2, Shh, Noto, Wnt3, Eomes, Gbx2, Lim1, and Otx2 were obtained from Prof. Janet Rossant and were used previously [117]. After post-fixing overnight in 4% paraformaldehyde, embryos were imaged using an inverted microscope (SteREO Discovery V12, Zeiss). Selected embryos were then washed 5–6 times in PBT, embedded in agarose and then in paraffin, and sectioned at 3 µm for haematoxylin and eosin (H&E) staining. Sections were imaged using a Zeiss Imager.Z2 equipped with an N-Achroplan 40x/0.65 M27 objective and ZEN software for image acquisition.

Whole-mount immunofluorescence

Dissected embryos were fixed in 2% paraformaldehyde in PBT at room temperature for 20 min and washed twice in PBT. Embryos were permeabilized in 0.1 M glycine/0.1% Triton X-100 for 12 min (E6.5) or 15 min (E7.5) at room temperature and washed twice in PBT. The embryos were blocked in 10% FBS/1% BSA in PBT (blocking buffer) at room temperature for 3 h. For primary mouse antibodies, the embryos were further blocked using the mouse MOM IgG kit (Vector Laboratories) according to the manufacturer’s instructions. Embryos were incubated overnight with primary antibodies (Table S4) diluted in blocking buffer and, the following day, were incubated further in primary antibodies for 2 h at room temperature, washed three times in PBT for 10 min each, and incubated in secondary antibodies (Table S4) diluted in blocking buffer for 3 h at room temperature. The embryos were washed three times with PBT, stained with DAPI (nuclei), mounted, and imaged on a Leica TCS SP5 AOBS Tandem confocal microscope using Leica Application Suite Advanced Fluorescence (LAS AF version 2.7.3.9723) software. Objectives LP/-/C HC PL APO 40x/1,30 OIL CS2 and LP/0,14–0,20/D HC PL APO 63x/1,40 OIL were used for imaging. In all cases, a single confocal z-stack is shown from one representative embryo of each genotype. For each marker, the number of positive cells and the total number of DAPI-positive (nuclei marker, blue) cells were enumerated using the Cell Counter plugin in FIJI software. The apoptotic index (cleaved caspase-3), pro-apoptotic index (p53), proliferative index (Ki67), and mitotic index (phospho-H3-Ser28) were calculated as the percentage of cells positive for each marker relative to the total number of DAPI-positive (nuclei marker, blue) cells in each of Epi, ExE, and VE in a single confocal plane per embryo (at least 3 per group).
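The index calculation described above can be illustrated with a minimal sketch. The function name `marker_index` and the cell counts are hypothetical and serve only as an illustration; they are not data from this study.

```python
def marker_index(positive_cells: int, dapi_cells: int) -> float:
    """Percentage of marker-positive cells among all DAPI-positive nuclei."""
    if dapi_cells <= 0:
        raise ValueError("dapi_cells must be a positive count")
    if not 0 <= positive_cells <= dapi_cells:
        raise ValueError("positive_cells must lie between 0 and dapi_cells")
    return 100.0 * positive_cells / dapi_cells


# Hypothetical counts (marker-positive, DAPI-positive) for one confocal
# plane of one embryo, split by tissue compartment. Illustration only.
counts = {"Epi": (12, 150), "ExE": (5, 80), "VE": (3, 60)}

# One index per compartment, as in the per-embryo calculation in the text.
indices = {region: marker_index(pos, total) for region, (pos, total) in counts.items()}
print(indices)
```

The same function applies to each marker (cleaved caspase-3, p53, Ki67, phospho-H3-Ser28), since every index is defined against the same DAPI-positive denominator.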

RNA expression analyses and qPCR

L130P embryos were dissected at E6.25 and their Reichert’s membrane was removed. Afterwards, the epiblasts were separated from the rest of the embryo (ExE/EPC). Each sample was genotyped by Sanger sequencing. All epiblasts were lysed and frozen at -80 °C, and RNA was isolated with the RNA micro kit (Qiagen). RNA quality and concentration were verified with the Agilent RNA 6000 Pico Kit. The minimal RNA integrity score was set to 8, and only samples meeting this threshold were processed further for microarray analysis (Affymetrix GeneChip Mouse Gene 2.0 ST Array). RNA from single epiblasts intended for qPCR was isolated with the PicoPure RNA isolation kit (Life Technologies). The reaction itself was performed on a Roche LC480 LightCycler. The primers used are listed in Table S1.

Microarray analysis

Microarray data were processed from .CEL files using the oligo library in R [118]. Data were normalized using the RMA method and batch corrected using the removeBatchEffect function in limma. Data were assessed before and after normalization and batch correction using the prcomp function and plotted to visualize the samples in principal component space. Functional set enrichment was performed using SPIA60 and a modified version of sigPathways [119] as described by Maciejewski [120]. All graphs were generated using GraphPad Prism version 7 and data are shown as mean and SEM. The Mann-Whitney U test was used for analysing cell number and the apoptotic, pro-apoptotic, proliferative, and mitotic indices; p < 0.05 was considered significant.
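For readers unfamiliar with the Mann-Whitney U test named above, the following is a minimal pure-Python sketch of the U statistic with an exact two-sided p-value obtained by enumerating all group assignments. It is an illustration under stated assumptions, not the GraphPad Prism implementation used in this work; the function names and the index values are hypothetical, and the enumeration is only practical for small groups.

```python
from itertools import combinations


def rank_with_ties(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U, exact two-sided p) for two small independent samples."""
    n1, n2 = len(x), len(y)
    ranks = rank_with_ties(list(x) + list(y))

    def u_from(group1_positions):
        rank_sum = sum(ranks[i] for i in group1_positions)
        u1 = rank_sum - n1 * (n1 + 1) / 2
        return min(u1, n1 * n2 - u1)  # smaller of U1 and U2

    observed = u_from(range(n1))
    # Exact null distribution: every way of labelling n1 of the pooled values.
    null = [u_from(c) for c in combinations(range(n1 + n2), n1)]
    p_value = sum(u <= observed for u in null) / len(null)
    return observed, p_value


# Hypothetical index values (percent) for two genotypes. Illustration only.
wild_type = [4.1, 5.0, 3.8, 4.6]
mutant = [9.2, 8.7, 10.1, 7.9]
u_stat, p = mann_whitney_u(wild_type, mutant)
print(u_stat, p)
```

Because the test uses ranks only, it makes no assumption about the distribution of the indices, which suits small groups of embryos.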

Generation of Fam208a KO mouse

To produce a mouse model with complete Fam208a ablation, the CRISPR/Cas9 method was used. Guide RNAs targeted the first exon (Table S2). C57Bl/6NCrl fertilized oocytes were electroporated with a NEPA21 electroporator (Nepagene) with impedance set to the range 0.18-0.22Ω and the transfer pulse set to 5 V for 50 ms with 5 pulses. Zygotes were transferred to foster mothers 48 hours later. Tail tips from the pups born were used for DNA extraction with Quick extract solution (Lucigen, QE09050), and this DNA was further analysed by genotyping PCR (primers are listed in Table S2). DreamTaq PCR (ThermoFisher, K1081) was prepared according to the standard protocol, with annealing at 65 °C for 40 seconds. Animals used for the analysis were generated from G3 from backcrosses of heterozygotes with wild type C57Bl/6NCrl mice.

Mutant Hek293t cell lines

CRISPR/Cas9 targeting sequences for both genes, FAM208a and MPHOSPH8, were cloned into the pX330 Venus vector (Table S2). All constructs were verified by sequencing. Hek293t cells were incubated with a transfection mixture consisting of 30 µg of vector DNA, 90 µl of X-treme GENE HP DNA Transfection Reagent (Roche, 06 366 236 001), and DMEM (D6429). After 24 hours of incubation, cells were sorted by FACS for GFP-positive cells and 10000 cells were plated for further cultivation. 48 hours later, sorting for GFP-negative cells took place, followed by single-cell plating. The sorted cells were further incubated until single-cell colonies formed, which were analysed by PCR followed by sequencing. Finally, 6 stable lines were selected for the next experiments: HekMT (no mutation); Fam-a1 (-39bp, exon 4/11); Fam-a2 (-34bp, exon 4/11); Mpp8-a (-19bp, exon 8) and Mpp8-b (-46bp, exon 8). Both transcription variants of FAM208a should be affected, FAM208a-202 in exon 4 and FAM208a-209 in exon 11. In MPHOSPH8, the CRISPR guides were targeted to the ankyrin-rich domain, as this was identified by the Y2H system as the interaction domain with Fam208a, but the mutation led to complete protein ablation.

E9.5 LC-MS analysis

We collected 4 littermate embryos at the stage of E9.5 and separated them from the yolk sac and maternal residues. Two wild-type embryos, one heterozygote, and one homozygote were lysed in 100 mM TEAB containing 2% SDC and boiled at 95 °C for 5 minutes. Protein concentration was determined using the BCA protein assay kit (Thermo), and 20 µg of protein per sample was used for MS sample preparation in technical triplicates. Cysteines were reduced with TCEP at a final concentration of 5 mM (60 °C for 60 minutes) and blocked with MMTS at a final concentration of 10 mM (10 minutes at room temperature). Samples were digested with trypsin (trypsin/protein ratio 1/20) at 37 °C overnight. After digestion, samples were acidified with TFA to a final concentration of 1%. SDC was removed by extraction to ethyl acetate [77, 121] and peptides were desalted on a Michrom C18 column (Michrom BioResources Inc.).
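The reagent amounts implied by the sample preparation above follow from simple proportions, sketched here. The helper names are hypothetical, and the stock concentration and reaction volume in the second example are assumptions chosen for illustration, since the text gives final concentrations only.

```python
def trypsin_mass_ug(protein_ug: float, protein_per_enzyme: float = 20.0) -> float:
    """Trypsin mass for a given protein mass at a 1/protein_per_enzyme ratio."""
    return protein_ug / protein_per_enzyme


def stock_volume_ul(final_conc: float, stock_conc: float, final_volume_ul: float) -> float:
    """Volume of stock needed so that the final mixture reaches final_conc.

    Uses C_stock * V_stock = C_final * V_final; both concentrations must be
    given in the same unit.
    """
    if stock_conc <= final_conc:
        raise ValueError("stock must be more concentrated than the final mixture")
    return final_conc * final_volume_ul / stock_conc


# 20 µg of protein at a trypsin/protein ratio of 1/20 (values from the text).
print(trypsin_mass_ug(20))

# 5 mM final TCEP from an ASSUMED 500 mM stock in an ASSUMED 100 µl reaction.
print(stock_volume_ul(final_conc=5, stock_conc=500, final_volume_ul=100))
```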


Hek293t mutant cell LC-MS analysis

Mutant Hek293t cells were grown in 6-well plates until confluence. Cell pellets were lysed in 100 mM TEAB containing 2% SDC and boiled at 95°C for 5 minutes. All subsequent steps were identical to the method described above (E9.5 LC-MS analysis).

nLC-MS2 analysis

Nano reversed-phase columns (EASY-Spray column, 50 cm × 75 µm ID, PepMap C18, 2 µm particles, 100 Å pore size) were used for LC/MS analysis. Mobile phase buffer A was composed of water, 2% acetonitrile and 0.1% formic acid. Mobile phase B was composed of 80% acetonitrile and 0.1% formic acid. Samples were loaded onto the trap column (Acclaim PepMap300, C18, 5 µm, 300 Å Wide Pore, 300 µm × 5 mm, 5 Cartridges) for 4 minutes at 15 μl/min; the loading buffer was composed of water, 2% acetonitrile and 0.1% trifluoroacetic acid. Peptides were eluted with a mobile phase B gradient from 2% to 40% B in 120 minutes. Eluting peptide cations were converted to gas-phase ions by electrospray ionization and analysed on a Thermo Orbitrap Fusion (Q-OT-qIT, Thermo). Survey scans of peptide precursors from 350 to 1400 m/z were performed at 120K resolution (at 200 m/z) with a 5×10⁵ ion count target. Tandem MS was performed by isolation at 1.5 Th with the quadrupole, HCD fragmentation with a normalized collision energy of 30, and rapid scan MS analysis in the ion trap. The MS2 ion count target was set to 10⁴ and the maximum injection time was 35 ms. Only precursors with charge states 2–6 were sampled for MS2. The dynamic exclusion duration was set to 45 s with a 10 ppm tolerance around the selected precursor and its isotopes. Monoisotopic precursor selection was turned on. The instrument was run in top speed mode with 2 s cycles [121].

Data analysis

All data were analysed and quantified with the MaxQuant software (version 1.5.3.8) (Cox, Hein et al. 2014). The false discovery rate (FDR) was set to 1% for both proteins and peptides, and we specified a minimum peptide length of seven amino acids.
The Andromeda search engine was used for the MS/MS spectra search against the Human database (downloaded from Uniprot in September 2015, containing 147,934 entries). Enzyme specificity was set as C-terminal to Arg and Lys, also allowing cleavage at proline bonds and a maximum of two missed cleavages. Dithiomethylation of cysteine was selected as a fixed modification, and N-terminal protein acetylation and methionine oxidation as variable modifications. The ‘match between runs’ feature of MaxQuant was used to transfer identifications to other LC-MS/MS runs based on their masses and retention time (maximum deviation 0.7 minutes), and this was also used in quantification

experiments. Quantifications were performed with the label-free algorithms described recently. Data analysis was performed using Perseus 1.5.2.4 software [122-124].

Oocyte RNA interference

C57Bl/6NCrl females at the age of 4-6 weeks were stimulated for superovulation by injection of PMSG at 2:00 pm on day one. The next day, HCG was applied and males were added to the cages. On the morning of the third day, females were screened for vaginal plugs, and fertilized zygotes were isolated and incubated in M2 medium (Sigma, M7167). Downregulation of the desired proteins was mediated by ON-target smart pool siRNA from Dharmacon (Fam208a- L-047440-00-0005; GAPDH- D-001830-20-05; Non Targeting pool- D-001810-10-05). RNA was dissolved in PCR Ultra H2O from Top-Bio (P340) to a final concentration of 5.8 µM. Electroporation (EP) was performed with a NEPA21 electroporator (Nepagene) with impedance set to the range 0.18-0.22Ω and the transfer pulse set to 5 V for 50 ms, with 5 pulses and 5 zygotes per run.

Oocyte immunofluorescence

After EP, cells were incubated at 37°C with 5% CO2 in M2 medium. After 24, 48 and 72 hours, zygotes were fixed with 4% PFA for 45 minutes. Three sets of washes in PBS/FBS 5% were performed, followed by permeabilization in 0.5% PBST for 60 minutes. The blocking step was performed with a solution containing 5% NGS, 0.3 M glycine and 0.1% Triton X in PBS for at least two hours. Primary antibody incubation took place overnight at 4°C with 1% NGS and β-tubulin antibody (Cell Signaling, #2146) at a dilution of 1:50. Another set of washes took place the next morning, followed by incubation with the secondary antibody donkey anti-rabbit Alexa 488 (Invitrogen, A21206) at a dilution of 1:1000 for 90 minutes. The last set of washes was followed by nuclear staining with DAPI and a glycerol series, with mounting in 90% glycerol with 5% NPG. Cells were visualized with a Dragonfly spinning disc microscope with a 40x objective.

Yeast two-hybrid system

The Fam208a sequence was divided into two overlapping parts, the N’-terminal (3-741 aa) and the C’-terminal (536-1640 aa) part. Special primers were designed to create cloning sites for the Y2H vector (pGBKT7, Clontech, 630489) (Table S2) so that the PCR product kept its reading frame after subcloning into the vector sequence. Restriction sites XmaI/SalI for the N’-terminal and EcoRI/SalI for the C’-terminal part were used. The standard protocol for the Phusion PCR reaction (NEB, M0532S) was used. Template cDNA was prepared by M-MLV reverse transcription (Promega, M170A) following the supplier's original protocol, and mRNA

from the murine D3 cell line was used (2 µg/µl). The prepared cDNA was diluted to a final concentration of 100 ng/µl. Both products were cloned into the pGBKT7 vector (Clontech, 630489). A complete set of constructs was prepared for the Y2H screen: the C’-terminal construct (536-1640 aa) was used as construct C; construct A (536-947 aa) was prepared from construct C digested with SmaI and blunted; construct B (536-1160 aa) was construct C digested with PstI, with the cassette exon 17 (60 bp) spliced out (natural variant from cDNA); α (536-1498 aa) was construct C without exon 17 and with a 274 bp deletion in exon 24; and β (536-1640 aa) was construct C without exon 17 and with a 179 bp deletion in exon 22.

Further steps followed the manufacturer's manual for the Matchmaker Gold Yeast Two-Hybrid System (Cat. No. 630489) with an 11-day murine embryonic library (Clontech, 630478). Finally, library plasmids were isolated with lysis buffer (50 mM Tris-HCl pH 8; 10 mM EDTA + 20 mg/ml RNaseA) and purified by incubation with 200 mM NaOH and 1% SDS, followed after 5 minutes by the addition of 3 M sodium acetate, pH 4.8. Overnight precipitation with isopropanol was followed by standard ethanol washes. Identified vectors were transformed into DH5α E. coli cells to increase the yield, and were later re-purified and sent for sequencing to analyse putative partners.

BIOMARK and qRT-PCR

The BioMark high-throughput microfluidic qPCR platform (Fluidigm, San Francisco, CA) was used to perform gene expression analysis. Prior to the qPCR, the samples were pre-amplified as follows: 2 µl of cDNA (10 ng RNA/µl) was mixed with 1.25 µl of 200 nM primer mix (all primers together, each at a final concentration of 25 nM), 5 µl of iQ Supermix (BioRad, Prague, Czech Republic) and 1.75 µl of RNase/DNase-free water (ThermoFisher Scientific). The mixture was first incubated for 3 minutes at 95°C, then 18 cycles of 15 s at 95°C, and finally 4 minutes at 59°C. Pre-amplified cDNA was diluted 40×. qPCR was carried out in the GE Dynamic Array 48.48 in a BioMark HD System (Fluidigm, San Francisco, CA). The 5 µl Fluidigm sample premix consisted of 1 µl of 40× diluted pre-amplified cDNA, 0.25 µl of 20× DNA Binding Dye Sample Loading Reagent (Fluidigm), 2.5 µl of SsoFast EvaGreen Supermix (Bio-Rad, Czech Republic), 0.1 µl of 4× diluted ROX (Invitrogen, USA) and 1.15 µl of RNase/DNase-free water. Each 5 µl assay premix consisted of 2 µl of 10 µM primers (forward and reverse primer, each at a final concentration of 400 nM) (Table S3), 2.5 µl of DA Assay Loading Reagent (Fluidigm, USA) and 0.5 µl of RNase/DNase-free water. The thermal qPCR protocol was: 50°C for 5 s and 98°C for 3 minutes, followed by 40 cycles of 98°C for 5 s and 60°C for 5 s. The data

were collected with the BioMark 4.2.2 Data Collection software and analysed with the BioMark Real-Time PCR Analysis Software 4.1.3 (Fluidigm, USA).
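The volumes and concentrations quoted for the pre-amplification and premix recipes can be checked with simple dilution arithmetic (C1·V1 = C2·V2). The sketch below is only an illustrative consistency check and not part of the original protocol; the dictionary labels and function names are chosen here for readability.

```python
# Volumes in µl, copied from the pre-amplification and premix recipes in the text.
# The labels and helper names are illustrative, not taken from the protocol.
PREAMP_MIX = {
    "cDNA": 2.0,
    "primer mix (200 nM per primer)": 1.25,
    "iQ Supermix": 5.0,
    "water": 1.75,
}
SAMPLE_PREMIX = {
    "pre-amplified cDNA (40x diluted)": 1.0,
    "20x DNA Binding Dye Sample Loading Reagent": 0.25,
    "SsoFast EvaGreen Supermix": 2.5,
    "ROX (4x diluted)": 0.1,
    "water": 1.15,
}
ASSAY_PREMIX = {
    "primers (10 µM)": 2.0,
    "DA Assay Loading Reagent": 2.5,
    "water": 0.5,
}


def total_volume(mix):
    """Sum of all component volumes in µl."""
    return sum(mix.values())


def diluted_concentration(stock_conc, stock_volume, final_volume):
    """C1*V1 = C2*V2 solved for C2; the result has the units of stock_conc."""
    return stock_conc * stock_volume / final_volume


preamp_total = total_volume(PREAMP_MIX)
# Each primer is at 200 nM in the primer mix; 1.25 µl goes into the reaction.
primer_nm_in_preamp = diluted_concentration(200.0, 1.25, preamp_total)
# Primers are at 10 µM in the stock; 2 µl goes into the 5 µl assay premix.
primer_um_in_assay_premix = diluted_concentration(
    10.0, 2.0, total_volume(ASSAY_PREMIX)
)

print(f"pre-amplification volume: {preamp_total:.2f} µl")
print(f"primer in pre-amplification: {primer_nm_in_preamp:.1f} nM")
print(f"sample premix volume: {total_volume(SAMPLE_PREMIX):.2f} µl")
print(f"assay premix volume: {total_volume(ASSAY_PREMIX):.2f} µl")
print(f"primer in assay premix: {primer_um_in_assay_premix:.1f} µM")
```

The pre-amplification reaction adds up to 10 µl, which reproduces the stated 25 nM per primer, and both premixes add up to the stated 5 µl. The primers are at 4 µM in the assay premix, so the quoted final concentration of 400 nM would correspond to a further tenfold dilution in the reaction; that factor is inferred from the two numbers and is not stated in the text.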

Expression vectors and staining procedure

Full reading frames of both Fam208a and Mphosph8 were cloned in-frame into pCMV6 expression vectors. Fam208a sequences (both the wt and the L130P mutated variant) were fused with a C-terminal tGFP fluorophore (pCMV6-AC-tGFP). Mphosph8 was cloned into the pCMV6-AN-mRFP vector. Hek293t cells were transfected with either one of the plasmids or co-transfected with Mpp8 and one of the Fam208a vectors. X-tremeGENE™ HP DNA reagent was used in a 1:3 ratio (DNA:X-treme), and the complete transfection reaction was incubated at RT for 15 minutes prior to addition to the cell culture plates. Hek293t cells were grown to 50-60% confluence and incubated with the transfection reaction for 24 hours, after which fresh medium was added. 48 hours after transfection, cells were gently washed twice with pre-warmed PBS, fixed with 4% PFA and stained with DAPI or Draq5 to visualize the cell nuclei. Afterwards, cells were mounted with ProLong™ Gold Antifade Mountant (ThermoFisher) and left O/N to dry and solidify.

PI staining and cell cycle measuring

Hek293t cells were transfected as described above with the pCMV6-AN-tGFP vector carrying the Fam208a sequence, both the wt and the L130P mutated variant. As controls, samples with the empty tGFP vector and samples without exogenous DNA were prepared simultaneously. 48 hours after transfection, cells were harvested and fixed with ice-cold 70% ethanol. Cells were subsequently washed with PBS, treated with RNaseA (final concentration 100 µg/µl) and stained with PI solution (0.2% Triton-X, PI 40 µg/µl). Samples were sorted based on the side scatter fold and GFP emission intensity.


Results

L130P mutation of Fam208a leads to developmental delay and gastrulation failure

The Fam208a mRNA profile in embryos (E5.5, E6.5, E7.5, and E8.5) was analyzed by whole-mount in situ hybridization to describe the role of Fam208a during post-implantation development. The first stages exhibit very specific expression of Fam208a, which was identified only in the primitive ectoderm, the epiblast. As development proceeds, the signal extends into the extraembryonic ectoderm (ExE), and 24 hours later it can also be detected in the ectoderm, allantois, amnion, and chorion. Embryos from E8.5 show ubiquitous expression of Fam208a (Fig. 10). The L130P mutation leads to defective primitive streak (PS) elongation [106]. During development, extraembryonic regions develop normally, while the embryo shows increasing delay and growth retardation from E6.5. Analyses of key markers for ExE development [125] (Cdx2, Elf5, Spc4 and Bmp4) revealed no effect on their expression in homozygous L130P mutants (Fig. 11). PS formation marks the successful beginning of gastrulation. As this process is impaired in mutant embryos, Brachyury (T) [126] was studied as a marker of the PS and axial mesoderm. In homozygous L130P embryos, T expression was restricted to the posterior part, whereas in wt littermates it extended to the distal tip. Similar results were obtained by analyzing Cripto [127] expression. Correct specification of anterior mesoderm [66] was studied by looking at the markers Noto and Nodal [128]. The data showed that in mutant embryos Noto expression is delayed (Fig. 12), as is Nodal expression; the Nodal signal is in fact missing from the node, as this embryonic structure does not seem to develop in L130P embryos (Fig. 13). The absence of the node was further analyzed by examining the expression patterns of Foxa2 and Shh [129]. The first to be expressed is Foxa2, and this particular marker was not detected (or only at very faint levels) in E6.5 mutants (Fig. 14).
As the previous markers suggest, the onset of Foxa2 expression is delayed by one day, and it never reaches the normal distribution. Embryos with the Fam208a mutation can develop with no overall morphological changes up to the E6.5 stage and can initiate gastrulation, after which development becomes increasingly delayed and fails to progress beyond the E7.5 stage [69].


Figure 10 Fam208a is widely expressed during early development. Whole-mount in situ hybridization indicates widespread Fam208a expression at E5.5-7.5 (A - C) strongly seen in the epiblast and lesser in the ExE. Later, at E8.5-9.5, it becomes strongly expressed ubiquitously (D - E). Scale bar: 100 µm. EC, Egg Cylinder; Pr-S, Pre-streak; al, Allantois; EHF, Early Head Fold; 8s, 8 somites [69], prepared by Bhargava S.

Figure 11 Fam208aD6/D6 mutants exhibit minimal changes in extra-embryonic ectoderm (ExE) marker gene expression. Whole-mount in situ hybridization at E6.5 of Fam208aD6/D6 mutants (A’-D’) and their wild-type littermates (A-D) shows minimal expression changes of the ExE markers Elf5 (A’), Spc4 (B’), Bmp4 (C’) and Cdx2 (D’). Note that A-A’ show images from single-colored in situ hybridization for the simultaneous detection of Elf5 and Brachyury (T). Scale bar: 100 µm. ES, Early streak; Pr-S, Pre-streak. * indicates BM purple precipitate [69], prepared by Bhargava S.


Figure 12 Fam208a mutation leads to significant delay in the formation of the node. (A) Bright-field images of Fam208aD6/D6 mutant embryos at E8.5-9.5 (B & D) and their wild-type littermate controls (A & C). Whole mount in situ hybridization at E8.5-9.5 shows significantly delayed, slight, but correct distal expression of the node marker, Noto. Note that the mutants also possess the allantois. * Asterisk indicates the node-forming region. Scale bar: 100 µm. al, Allantois [69], prepared by Bhargava S.


Figure 13 Fam208aD6/D6 mutants exhibit gastrulation failure defects. Whole-mount in situ hybridization at E7.5-E7.75 of Fam208aD6/D6 mutants (A’–F’) and their wild-type littermate controls (A–F). The Fam208aD6/D6 mutant embryos at E7.5 are phenotypically distinguishable, with a severely retarded epiblast. (A’) In mutants, the PS initiates but reaches barely one third of its length, with no distal and anterior expression, as seen by Brachyury expression (pan-mesodermal marker). (B’) Cripto, a PS and nascent mesoderm marker, is expressed with a slight delay in mutants. Together, they show arrested PS elongation. (C’, D’ and F’) The expression of Noto, Nodal and Shh is undetectable in the node of the Fam208aD6/D6 mutant embryos, with (G’) reduced anterior expression of the AME marker Foxa2. The line indicates the length of the PS. The dashed line in black demarcates the length of the PS and the blue dashed line indicates the node and head process. Scale bar: 30 µm. PS, Primitive streak; LPHF, Late pre-head fold; LSEB, Late streak, early allantoic bud; LS, Late-streak; EPHF, Early pre-head fold; LHF, Late Head fold; EHF, Early head fold; al, Allantois [69], prepared by Bhargava S.


Fam208a is important for epithelial-to-mesenchymal transition at the onset of gastrulation

During gastrulation, epiblast cells undergo epithelial-to-mesenchymal transition (EMT), migrate and ingress through the PS, and later emerge as differentiated cells to form the mesoderm, a new layer between the epiblast and the overlying visceral endoderm (VE) [130]. Two main EMT markers were examined, E-cadherin and Snail. E-cadherin is expressed in the epiblast and endoderm and is downregulated before the onset of EMT [131]. In contrast, Snail is detected within the PS and mesoderm [132]. Immunostaining of L130P homozygotes revealed no significant changes in E-cadherin but a marked loss of Snail-positive cells at E6.5. Later, at E7.5, the number of Snail-positive cells increases and they localize to the nascent PS but never extend to the distal tip (Fig. 14). Fam208a mutants can initiate EMT but are unable to sustain its progression, leading to a shortened PS [69].


Figure 14 Fam208aD6/D6 mutants exhibit significantly delayed epithelial-to-mesenchymal transition during gastrulation. Whole-mount immunofluorescence of mutant Fam208aD6/D6 embryos (E6.5: A’–D’ and E7.5: E’–H’) and their wild-type littermates (E6.5: A–D and E7.5: E–H). (D’) Confocal images show only very few Snail-expressing (mesodermal marker, green) cells within the PS, with failure to downregulate E-cadherin (epiblast and endodermal marker, red) at E6.5. Snail expression increases along the elongated PS by E7.5, but is arrested halfway. The boxed region at the bottom left is shown at 4-fold magnification. Scale bar: 30 µm. ES, Early streak; EPHF, Early pre-head fold; MS, Mid-streak; PS, Primitive streak; Al, Allantois [69], prepared by Bhargava S.


Fam208a mutant embryos exhibit an alteration in anterior-posterior patterning

Because of the delay in gastrulation progression in Fam208a mutants, several regulatory genes expressed within mutant epiblasts from E6.5–7.5 were analyzed. L130P mutants display Nodal expression in the anterior epiblast but lack expression in the visceral endoderm at E6.5-E7.5 [133]. The gastrulation process should downregulate the signal in the epiblast, and because of this failure, Wnt3a and Eomes expression was investigated. Both markers are downstream targets of Nodal signaling. The Wnt3a signal in homozygous L130P embryos is restricted to only one third of the epiblast and anterior expansion (E6.5-E7.5) (Fig. 15). Eomes is crucial for mesoderm formation, and it is usually expressed in the PS, nascent mesoderm, and extraembryonic tissues [134]. Embryos at E6.5 have a significantly downregulated Eomes signal in both embryonic and extraembryonic tissues. Later, some increase of the signal can be detected, but it is crucial to consider that L130P mutants are developmentally delayed and usually have the body structure of wt embryos that are 24 hours younger. To evaluate whether the failure to downregulate Nodal expression in Fam208a L130P mutants is simply due to a developmental delay or whether the lack of Fam208a alters the regulatory network of Nodal signaling, the expression profiles of several known genes involved in anterior-posterior (AP) patterning were examined. Cer1 is a Nodal antagonist and at E6.5 extends from the anterior visceral endoderm (AVE) towards the distal tip [135, 136]. As development proceeds, the signal disappears and remains only in the midline underlying the future head. L130P mutants at E6.5 exhibit a normal distribution of Cer1. E7.5 embryos either completely lack the signal, or it remains within the endoderm at the distal tip. As previous data showed a lack of Foxa2 (AME marker), these Cer1-positive cells were more probably endodermal cells and not AME cells.
To confirm the lack of AME, another expression marker, Lim1, was studied [69]. Lim1 is expressed in the mesoderm and lateral plate mesoderm [45]. At the E7.5 stage, all embryos, wt as well as mutants, displayed correct localization of Lim1 expression. Otx2, an anterior forebrain/midbrain marker, is widely expressed in the epiblast (Fig. 16). During development, Otx2 expression is progressively reduced in the posterior epiblast and then becomes limited to the anterior half of the embryo [137]. At E6.5, Otx2 is robustly expressed within the entire epiblast of Fam208a L130P embryos, while in littermate controls its expression domain has already shifted anteriorly. Later, at E7.5, Otx2 still remains located throughout the epiblast and is reminiscent of that of normal embryos. The posterior neuroectodermal marker Gbx2 [138] is completely absent in E7.5 mutant embryos, confirming severe alterations in AP patterning.


Figure 15 Gene marker expression in Fam208aD6/D6 mutant embryos. Whole-mount in situ hybridization at E6.5- E7.5 of Fam208aD6/D6 mutants (A’–L’) and their wild-type littermates (A–L). (A’–D’) Posterior epiblast markers Wnt3 and Nodal fail to be completely downregulated anteriorly in mutant embryos at E6.5–6.75. (F’) Eomes (key-regulator of EMT and inducer of mesoderm) is downregulated in E6.5 mutant embryos. (I’) Complete absence of Gbx2 (posterior neuroectoderm; hindbrain marker) (J’) with expanded (both anterior and posterior) Otx2 (anterior forebrain marker) expression domain in E7.5 mutants. (H’) Slight Lim1 (anterior PS marker) expression is seen at E7.5 in mutants when compared to wild type. (K’,L’) Note that AVE migrates correctly in mutants at E6.5. At E7.5, Cer1 is expressed in the ADE overlaying future head formation, while in mutants Cer1 expression is reduced and remains in the distal epiblast. The dashed line in black indicates the length of the primitive streak and in blue indicates the length of the AVE. Scale bar: 30 µm. ES, Early streak; MS, Midstreak; Pr-S, Pre-streak; EPHF, Early pre-head fold; LSEB, Late streak, early allantoic bud; LPHF, Late pre-head fold; al, Allantois, [69], prepared by Bhargava S.


Figure 16 Altered expression of epiblast specification markers in L130P Fam208a embryos at E6.5. Whole-mount in situ hybridization at E6.5 of L130P Fam208a mutants (A’-C’) and their wild-type littermates (A-C). The Fam208aD6/D6 mutant embryos have no to faint Foxa2 expression (A’) with robust bilateral expression of Otx2 (B’). There is no discernible change in Cripto expression in the mutant embryo (C’). Scale bar: 100 µm, [69], prepared by Bhargava S.


Fam208a mutants exhibit a decreased number of total cells and increased incidence of cell cycle arrest

To investigate whether the obvious developmental phenotype of Fam208a L130P mutant embryos was due to an overall decrease in cell number, the total number of cells was first quantified in three distinct regions, namely the epiblast, ExE, and VE, at E6.5. A significant reduction in cell number was found for all three regions in mutant embryos when compared to littermate controls (epiblast, n=17; ExE, n=15; VE, n=15) (Fig. 17). Next, immunofluorescence staining was performed using the pan-proliferation marker Ki67 to determine the proliferative index in the epiblast, ExE, and VE. There was a significant gene-dosage-dependent increase in the proliferative index compared to the littermate controls (n=4) (Fig. 18). To further examine this defect, the M-phase marker phospho-H3 was checked, but no significant changes in the mitotic index were found in L130P mutant embryos. This suggests that an increased percentage of the cells that have entered the cell cycle (Ki67) are arrested or delayed in a phase of the cycle before M-phase (phospho-H3) [69].

During development, there is an extremely high rate of proliferation, mainly in the epiblast. The resulting replicative and genotoxic stress constantly challenges the control mechanisms. In response to these factors, epiblast cells normally undergo rapid apoptosis. This might be the mechanism responsible for the decreased overall number of cells in L130P mutants. Immunofluorescent staining for cleaved caspase-3 (Asp175) was performed to measure the apoptotic index (Fig. 18). Fam208a mutants had a significantly increased number of Asp175-positive cells at E6.5, specifically in the epiblast. A p53-dependent apoptosis-mediated mechanism increases embryo fitness by removing mutated or damaged epiblast cells during early post-implantation development, allowing selective clonal expansion of healthy cells. Immunofluorescence staining for p53 was used to investigate the association with this pathway. The epiblast, as well as the ExE-epiblast junction, exhibited a significant increase of p53-positive cells in L130P mutants. To investigate whether the gastrulation block seen in Fam208a mutants could be rescued in a p53−/− background, double heterozygous Fam208a;+/L130P p53+/− mice were produced and intercrossed, and the embryos were dissected at E9.5, a time point when homozygous Fam208a L130P mutant embryos are severely retarded and morphologically similar to the E6.75–7.0 stage (Fig. 19). Double mutated embryos (homozygous for both mutations) showed a partial rescue phenotype with a beating heart at E9.5. They reached

developmental milestones associated with E8.5–9.0 with several developmental abnormalities, including neural tube closure defects (open mid- and hindbrain), abnormal and enlarged pericardium, and irregular/smaller somites (Fig. 20). Half of p53 heterozygous embryos also exhibited detectable rescue. These results suggest that the developmental phenotype seen in Fam208a mutant embryos might be due to a p53 dosage-mediated increase in the rate of apoptosis [69].


Figure 17 Fam208aD6/D6 mutants have reduced cell numbers. Whole-mount immunofluorescence of mutant Fam208aD6/D6 embryos and their wild-type littermate controls at E6.5. (A) Confocal images show that mutant embryos have a smaller epiblast, as seen by the smaller expression domain of Oct4 (epiblast marker, green). Quantification of (B) epiblast, (C) ExE, and (D) VE cell numbers. All results are mean ± SEM from 17 (Fam208a+/+), 15 (Fam208aD6/+), and 15 (Fam208aD6/D6) embryos. *p < 0.05, ****p < 0.0001 Scale bar: 50 μm. ES, Early streak; MS, Mid-streak; ExE, Extra-embryonic ectoderm; Epi, Epiblast; VE, Visceral endoderm, [69], prepared by Bhargava S.


Figure 18 Fam208aD6/D6 mutants exhibit altered proliferation and increased p53-mediated apoptosis. (A,B) Confocal images of mutant embryos with significantly increased Ki67-positive cells [119] in Epi and ExE; *p < 0.05. (C,D) The mutant embryos show no significant change in pH3 (red) expression, a measure of mitotic index. (E-F) Fam208aD6/D6 mutant embryos have significantly increased apoptosis, as shown by cleaved caspase-3-positive cells (Cl. Casp3; red), particularly in the epiblast, *p < 0.05 (G,H), also with a pronounced increase in the p53 level primarily in the epiblast and in part of the ExE region adjacent to the epiblast, *p < 0.05. All results are calculated as mean ± SEM from at least two different litters. The number of embryos analyzed for each marker are indicated, [69], prepared by Bhargava S.


Figure 19 Partial rescue of Fam208aD6/D6 mutant gastrulation block upon p53 removal. Gross morphology of E9.5 embryos obtained from Fam208a;+/D6 p53+/− intercrosses. (A) Representative bright-field image of normally developing Fam208a;+/+ p53−/− control embryos, (B) no rescue of the Fam208aD6/D6 phenotype is observed as a result of the introduced mixed background (FVB/N and C57BL/6 J); however, variable rescue of the Fam208aD6/D6 phenotype is observed in embryos with p53+/− (C,D) or p53−/− (E,F) genotypes. In all cases, representative embryos of each genotype are shown. Scale bar: 100 μm. [69], prepared by Bhargava S.


Figure 20 Overt developmental defects highlighted in partially rescued Fam208aD6/D6 embryos in a p53-/- background at E9.5. (A-B) Bright-field images of a mutant Fam208aD6/D6 embryo and its corresponding rescued littermate at E9.5 (same embryo as represented in Fig. 8E). Partial rescue of all homozygous Fam208a mutants in the p53-/- background up to E8.5-9.0, with abnormalities in the mid-hindbrain closure, a kinky neural tube (arrowheads) and an enlarged pericardium, which can be seen in semi-thin transverse sections stained with hematoxylin and eosin (B’-B’’). The boxed regions at the bottom left and right are shown at 3-fold magnification. The dotted line indicates the region of the transverse section. Arrowheads highlight developmental defects. Scale bar: 100 µm [69], prepared by Bhargava S.


The expression profile of L130P embryos is altered already at E6.25

An expression microarray was performed using total RNA isolated from single dissected epiblasts at E6.25. Using signaling pathway impact analysis (SPIA) [139] against the KEGG database, a significant enrichment of the p53 signaling pathway was observed (mmu04115, p=0.0023) (Fig. 21). This enrichment is readily apparent in a related 24-member gene set defined recently [140], comprising the p53-bound (50 kb from TSS) subset of genes significantly downregulated in response to combined p53/p73 depletion in mouse embryonic bodies. For this p53/p73 dataset, there was a significant enrichment not only in homozygotes but also in heterozygotes (Table S5). Many embryogenesis markers displayed increased levels in homozygotes (Dkk1, Gsc, Cfc1 and Nodal). This indicates that altered expression of patterning genes is detectable already at E6.25, even though the embryos look morphologically identical to their wt littermates. For Dkk1, this increase was confirmed by qRT-PCR. A functional set enrichment was also found for a previously defined Oct4 co-expression-based pluripotency gene module, suggesting that mutant epiblast cells may have a delayed transition from a naïve to a primed pluripotency state, or alternatively a delayed exit from pluripotency. The long non-coding RNA fraction from this gene module, termed Platr (pluripotency associated transcript), was particularly enriched, with a similar distribution profile between males and females. For Platr3, Platr20 and Platr27, this upregulation was confirmed by qRT-PCR. Notably, the Platr gene set was also enriched in heterozygotes [69].


Figure 21 Increased p53 signaling and deregulation of pluripotency-associated transcripts in Fam208aD6/D6 epiblasts. (A) Volcano plot of the Fam208aD6/D6 contrast with Fam208a+/+, indicating the positions of genes selected for validation by qPCR. (B) Density plots showing the positions of Platr genes among all genes differentially expressed between Fam208aD6/D6 and Fam208a+/+ epiblasts, with data segregated according to sex (n = 2 each). (C) Heat map showing differential expression of a p53-bound, p53/p73-regulated gene set defined by [7]. (D) Statistical overrepresentation of the p53 signaling pathway using SPIA analysis. (E) Single epiblast qPCR verification at E6.25 of gene expression normalized to Gapdh, with mutant values represented as a fold change relative to wild-type littermates (average expression set to 1). Expression measurements were carried out in duplicate per embryo (number of embryos: +/+, n = 6; +/D6, n = 4; D6/D6, n = 6). All results are calculated as mean ± SEM from at least two different litters, [69], prepared by Bhargava S.
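Panel E reports expression normalized to Gapdh and expressed as a fold change relative to wild type. The thesis does not state which formula was used; the sketch below assumes the widely used 2^-ΔΔCt calculation with 100% amplification efficiency, and the Ct values in the example are hypothetical numbers chosen only to illustrate the arithmetic.

```python
def fold_change(ct_target_mut, ct_ref_mut, ct_target_wt, ct_ref_wt):
    """Relative expression by the 2^-ddCt method (assumed here, not confirmed
    by the source). Assumes perfect doubling per cycle for both genes.

    ct_target_*: Ct of the gene of interest (for example Dkk1)
    ct_ref_*:    Ct of the reference gene (Gapdh in Figure 21E)
    """
    delta_ct_mut = ct_target_mut - ct_ref_mut    # normalize mutant to reference
    delta_ct_wt = ct_target_wt - ct_ref_wt       # normalize wild type to reference
    delta_delta_ct = delta_ct_mut - delta_ct_wt  # mutant relative to wild type
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values, for illustration only (not data from the thesis):
# the target crosses threshold one cycle earlier in the mutant.
example = fold_change(ct_target_mut=24.0, ct_ref_mut=18.0,
                      ct_target_wt=25.0, ct_ref_wt=18.0)
# Comparing wild type with itself gives the baseline that the caption sets to 1.
baseline = fold_change(25.0, 18.0, 25.0, 18.0)

print(f"example fold change: {example}")
print(f"wild-type baseline: {baseline}")
```

With these illustrative inputs, a one-cycle earlier threshold crossing corresponds to a twofold increase, and the wild-type baseline equals 1 by construction.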


Full ablation of Fam208a leads to embryonic lethality at early somite stage

Since previously published data were generated using mutant lines produced by ENU mutagenesis, where a random mutation does not necessarily result in a complete loss of functional alleles [106], we prepared a full Fam208a knock-out (KO) mouse line, in which the whole critical exon 1 was deleted by the CRISPR/Cas9 technique according to the IMPC standard [141]. Guide RNAs (gRNAs) targeted exon 1, and after the first round of electroporation of C57Bl/6NCrl fertilized eggs, we obtained a founder female with the desired deletion of the whole exon 1 and approximately 160 nucleotides upstream and downstream of it, a deletion of 866 base pairs (bp) in total. Three rounds of breeding with wt animals took place with selection for this particular mutation. Afterwards, in crosses between heterozygous mice, we observed a fully penetrant pre-weaning lethal phenotype consistent with previously published data. No viable homozygotes were born: five litters with a total of 32 pups yielded 23 heterozygotes and 9 wild-type animals (Fig. 22). In order to investigate the cause of this lethal phenotype, we analyzed the embryonic development of Fam208a KO mutants with the aim of identifying the critical developmental period of Fam208a malfunction. We followed the IMPC embryonic lethal screen guidelines [48] and started at embryonic stage E12.5. As expected, no viable homozygous embryos could be observed at this stage. The heterozygous littermates were indistinguishable from wt: they were of similar size and weight, and the somite number was the same as in the wild-type embryos (Fig. 23).

Next, we examined the embryos at E8.5 and E9.5 and, surprisingly, found Fam208a null embryos alive before placentation; these null embryos harvested at E8.5 and E9.5 were developmentally delayed but comparable to the wild-type littermates. Fam208a null embryos went through the gastrulation process and formed the head; however, they reached a maximum of four somites at E9.5, compared to an average of 27 somites in wild-type embryos. Interestingly, retarded development was also observed in heterozygous embryos, with a reproducible difference of up to six fewer somites than in wild-type littermate embryos at E9.5 (Fig. 22). Among E8.5 embryos, we observed the same genotype distribution as in the E9.5 group: similar numbers of wt and homozygous embryos and more than twice as many heterozygotes.
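The genotype counts reported for heterozygous intercrosses (32 pups: 9 wild type, 23 heterozygous, 0 homozygous; 43 embryos at E8.5 and E9.5 in Fig. 22E: 13 wild type, 18 heterozygous, 12 homozygous) can be compared with Mendelian expectations. The following is a minimal sketch of a chi-square goodness-of-fit calculation, added here for illustration; the thesis does not report such a test, the function name is hypothetical, and with counts this small the chi-square approximation is only rough.

```python
def chi_square_statistic(observed, ratio):
    """Pearson chi-square statistic for observed counts against an expected ratio."""
    total = sum(observed)
    ratio_sum = sum(ratio)
    expected = [total * part / ratio_sum for part in ratio]
    return sum((obs - exp) ** 2 / exp for obs, exp in zip(observed, expected))


# Standard 5% critical values of the chi-square distribution.
CRITICAL_DF1 = 3.841
CRITICAL_DF2 = 5.991

# Counts in the order wild type, heterozygous, homozygous.
embryos = [13, 18, 12]   # E8.5 + E9.5 embryos, Fig. 22E
pups = [9, 23, 0]        # pups from five litters

# Embryos against the Mendelian 1:2:1 ratio (2 degrees of freedom).
embryo_stat = chi_square_statistic(embryos, [1, 2, 1])
# Pups against 1:2:1 (2 degrees of freedom).
pup_stat = chi_square_statistic(pups, [1, 2, 1])
# Surviving pups only, against the 1:2 ratio expected if every
# homozygote dies before birth (1 degree of freedom).
survivor_stat = chi_square_statistic(pups[:2], [1, 2])

print(f"embryos vs 1:2:1   chi2 = {embryo_stat:.3f} (critical {CRITICAL_DF2})")
print(f"pups vs 1:2:1      chi2 = {pup_stat:.3f} (critical {CRITICAL_DF2})")
print(f"survivors vs 1:2   chi2 = {survivor_stat:.3f} (critical {CRITICAL_DF1})")
```

Under these assumptions, the embryo counts do not deviate significantly from 1:2:1, the pup counts do, and the surviving pups fit the 1:2 ratio expected when homozygotes are lost before birth. This is consistent with homozygous embryos being present at E8.5 and E9.5 and lost later in development.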

Based on these findings, we conclude that Fam208a ablation causes embryonic lethality with a robust developmental delay that is already observable at E8.5 and progresses through E9.5, with full lethality at E12.5. Remarkably, at earlier stages of development, a dose-dependent effect of the mutated allele is visible as delayed developmental progress of heterozygous embryos; however, this effect is fully compensated at later developmental stages and results in fully viable and fertile mice.

Figure 22 Downregulation of Fam208a in zygotes leads to embryonic lethality. A) Mutagenesis design showing the location of genotyping primers (F’; R1’ and R2’) and guide RNAs (G1 and G2), with the PAM sequence labeled in yellow. B) Quantitative analysis of mRNA levels shows a slight increase in Fam208a transcript levels in heterozygous mutants and a reduction in homozygous mutants. C) Within five litters, no viable homozygote was observed. D) Bright-field microscopy shows morphological differences between Fam208a +/+, Fam208a -/+ and Fam208a -/- embryos at embryonic day 9.5. E) As no mutants were born, we examined embryos at stages E8.5 and E9.5. Of 43 dissected embryos (E8.5+E9.5), 12 were full mutants with the Fam208a -/- genotype, 13 were wild types and 18 were heterozygous for the deletion in Fam208a. F) Graph of the number of somites at embryonic stage E9.5, showing a milder difference between wild types and heterozygotes and a drastic reduction in the number of somites in Fam208a -/- embryos.

Figure 23 Fam208a KO embryos are not viable, but heterozygotes are almost indistinguishable from wt. A) Wild-type and Fam208a heterozygous E12.5 embryos do not exhibit any obvious morphological differences. Homozygotes are not viable, but resorptions are always present at this stage. B) Diagram of the measured weight of E12.5 embryos; no data are available for resorbed embryos with the homozygous mutation. The difference between wild types and heterozygotes at this stage of embryonic development is negligible.

Fam208a mutation massively impacts the protein expression profile in homozygous mice

To reveal the impact of Fam208a on the molecular landscape during early embryonic development of Fam208a null embryos, we performed unbiased differential proteomic analysis of E9.5 embryos by LC-MS. The proteomic data showed massive divergence between homozygous and wild-type embryos: while the wild-type and heterozygote results were almost identical, homozygous embryos exhibited completely altered protein expression profiles. We identified over 4,800 proteins, of which 800 showed highly significant differences in expression in KO samples (Fig. 24). The most represented functional groups were proteins with nucleic acid-binding activity (170) and proteins involved in transcription regulation (165). The third most affected group comprised proteins playing different roles in epigenetic processes (122), which are crucial for embryonic development (Fig. 24). Proteins involved in cell cycle control and cell division were also highly affected (83 and 103 proteins, respectively). The putative nucleic acid-binding activity of Fam208a is discussed later, but Fam208a KO may directly create a window for alternative DNA/RNA binders to partially take over its function and drive other silencing mechanisms to heterochromatin regions. Fam208a was originally identified as a protein interaction partner of Rb, which is capable of binding transcription factors and inhibiting their function [103]. Therefore, impairment of this interaction might distort the expression profiles of proteins involved in transcription regulation. The link between epigenetic regulation and Fam208a has already been published [10], and the effect on proteins involved in epigenetic processes is therefore not surprising. Our newly suggested role of Fam208a is tightly linked with the orchestration of the cell cycle and cell division, and downregulation of Fam208a should therefore influence these processes, as reported here.
Regarding the expression of previously published interaction partners [77, 142], Pphln1 showed only a mild alteration of expression in Fam208a null embryos; however, the level of Setdb1 was markedly decreased, and Mphosph8 (Mpp8) in homozygotes was almost below the detection limit of the method (Fig. 25).

Figure 24 Fam208a mutation massively impacts the protein expression profile in homozygous mice. A) Heat map of all proteins identified in the LC-MS data obtained from E9.5 embryos. Wild types and a heterozygote show high similarity in their expression profiles, while a homozygous embryo shows a different pattern. B) Detailed visualization of the affected proteins and the distinct expression profiles identified in a homozygous Fam208a KO embryo. C) The affected proteins were grouped according to their ontologies, and the chart represents their distribution among these groups; numbers on the lines represent proteins shared between the linked groups.

Figure 25 Chart representing relative amounts of detected proteins in murine E9.5 embryos. Fam208a showed generally low expression and was below the detection limit in homozygotes. Pphln1 and Setdb1 do not display any alterations in their expression, while Mphosph8 shows decreased values in homozygotes.

Downregulation of Fam208a in zygotes leads to immediate cell division phenotype

To further investigate the role of Fam208a in embryonic development preceding the post-gastrulation and pre-placentation lethal period (E8.5-E12.5), we systematically focused on the earliest events in embryonic development, employing microarray profiling of the whole zygotic transcriptome at several stages. To explore the effects of the waves of zygotic transcription activation, datasets from fully grown GV oocytes, metaphase II-arrested eggs, one-cell zygotes, two-cell zygotes, four-cell zygotes, morulae, and blastocysts were analyzed and compared [143, 144]. Fam208a RNA levels showed that it belongs to the genes with lower expression in the first three stages, i.e., GV oocytes, MII oocytes, and one-cell zygotes. However, its expression increased markedly after the two-cell stage transcription activation and stayed relatively high during all the following stages, i.e., the four-cell stage, morula, and blastocyst (Fig. 26). Based on these transcriptomic data, we investigated the putative function of Fam208a in the first events of zygotic division. To avoid the effects of maternally deposited mRNA in early zygotes, which can significantly diminish the null phenotype [145], we used siRNA to knock down Fam208a directly in the zygote. We introduced the siRNA by electroporation into one-cell stage zygotes and observed the ability of the C57Bl/6NCrl fertilized eggs to develop further. The zygotes with downregulated Fam208a that overcame the two-cell stage block had difficulty proceeding with typical cell division and formed multipolar spindles. A tripolar spindle apparatus was the most frequent form (n=9) in Fam208a siRNA-knockdown zygotes. These abnormal spindles were observed at all selected time points. The most dramatic differences were observed 48 and 72 hours after siRNA delivery.
All controls (siGAPDH, siNon-Coding, negative control) developed spindles without any observable disturbances (n=11) (Fig. 26).

Our findings suggest that Fam208a also plays a role in cell physiology processes other than epigenetic regulation. Moreover, it might be a direct part of the spindle apparatus regulatory pathways, with a potentially novel spectrum of interacting proteins distinct from the epigenetic modifiers discovered before. This is also supported by the finding that the methylation state in the early zygote is stable until massive demethylation occurs in the four-cell stage embryo [146]. As heterozygous embryos develop normally, with slight retardation observed between E8.5 and E12.5, two possibilities remain open: either monoallelic expression is sufficient for Fam208a functions, or decreased levels of Fam208a introduce a mechanistic obstruction that delays development but is eventually overcome by other pathways.

Figure 26 Downregulation of Fam208a in zygotes leads to formation of multipolar spindles. A) RNA-seq results show increased expression of Fam208a after two-cell stage activation. The first line of the expression profile, at the one-cell stage, shows a low amount of identified RNA, mainly of maternal origin. In the third line, presenting expression after two-cell stage activation (four cells), we see increased levels of both introns and exons, which is evidence of embryonic transcription. In the morula, the fourth line, the trend is still detectable, but the highest expression peak for Fam208a clearly occurs during the first rounds of zygotic division. B) Immunofluorescence staining of zygotes either 42 hours or 72 hours after siRNA electroporation shows an increased incidence of multipolar spindle formation in the absence of Fam208a (n=9/11). Red arrows point to spindle poles of dividing cells. C) Detailed view of the spindle apparatus with a multipolar spindle in a Fam208a-downregulated cell at different planes (z) to show all formed spindle centers (left panel), and two different planes of a normal spindle apparatus with two spindle poles in a control cell (right panel).

Y2H screen revealed a novel Fam208a interaction network in the spindle assembly machinery

To study the Fam208a interactome, we used Y2H for unbiased identification of all potential binding partners. The advantage of the Y2H system is the possibility of searching for putative interacting partners that could not have been identified by proteomic approaches in differentiated cell cultures. The already known and verified interaction partners of Fam208a from the HUSH complex (MPP8, PPHLN1, and SETDB1) served as an internal control. However, none of them has been linked with the establishment of spindle poles or with direct DNA binding to actin and tubulin fibers of the spindle apparatus. Because of its large size, we split Fam208a (1,610 aa) into two overlapping parts for the purpose of the Y2H system. The N’ terminal part covered the first 740 aa. The C’ terminal part, with 1,100 aa, was further studied by generating four deletion constructs based on natural variants (alternative reading frame, exon skipping, and partial deletions). All constructs were used to describe potential protein-binding interaction domains more closely (Fig. 27). Our results were consistent with this in silico prediction, as there were no verified interaction partners for the N’ part (the possible DNA/RNA-binding part), while the Y2H screen identified 20 putative interaction partners for the C’ part. The size and color intensity of the yeast colonies were used as markers of interaction strength for the different partners. One of the strongest color signals was observed in colonies with the ankyrin domain of the Mphosph8 protein, which validated our approach, as the direct interaction between Fam208a and Mphosph8 had already been described [77]. Reciprocal verification mating ruled out four of the twenty identified constructs. As an example, periphilin 1 was first identified in 39 out of 65 diploid yeast colonies, and all possible transcript variants were pulled down. Nevertheless, control mating did not confirm this protein as a direct binder of Fam208a.
In the screening assay, the Fam208a-Mphosph8 interaction was confirmed: the Mphosph8 sequences pulled down by the Y2H system contained the ankyrin repeat domain, and the interacting region of Fam208a lay between amino acids 600 and 904. These regions should be responsible for the interaction of the two proteins. In addition, novel putative interaction partners were identified from different functional groups, such as calcium-regulating proteins (S100a10, Inpp5a), DNA/RNA-binding proteins (Ncbp1, Hcfc2, Parpbp), and proteins involved in the regulation of cell division (Eml1, Svil, Gpsm2, Itgb3bp, and Amn1) (Table 1). These proteins play an important role in sister chromatid segregation, spindle assembly, and cytokinesis. Fam208a might therefore be involved in orchestrating the formation of spindle poles, and in the case of its downregulation, multipolar spindles may occur. Proteomic data support this hypothesis, as there is selective elimination of putative interaction partners in KO embryos analyzed by LC-MS. Svil, Eml1, Gpsm2, and Itgb3bp were all below the detection limit of the LC-MS method in E9.5 homozygous embryos. However, all these putative interaction partners exhibited stable and strong expression in wild-type and heterozygous mice. Taken together, we found an increased incidence of impaired spindle apparatuses upon Fam208a downregulation and identified several presumed interaction partners linked to spindle pole establishment and function. These findings suggest that Fam208a plays an important role in spindle pole assembly during zygotic division.

Figure 27 Y2H screen revealed a novel Fam208a interacting network in the spindle assembly machinery. A) The yeast two-hybrid screen used several constructs as bait to pull down interaction partners; a scheme of the prepared constructs is shown with an outline of possible interaction domains with competitive binding partners. Constructs α, β and B lack cassette exon 17 (60 bp); the α construct also had an alternative reading frame caused by splicing differences, and the β variant had a deletion in exon 22 that changes the reading frame in the rest of the protein. B) All identified preys are listed in a table with symbols indicating the observed strength of their interaction, based on the size and color of the diploid yeast colonies after mating. C) Proposed scheme with the already identified functions of proteins involved in the cell division process. Gpsm2, mitotic spindle pole organization; Amn1, nuclear orientation checkpoints; Eml1, assembles and organizes microtubules and regulates orientation of the spindle apparatus; Itgb3bp, member of a centromere-specific complex, recruits histone H3 to the centromere region; Svil, coordinates actin filaments and myosin II during cell spreading.

Table 1

List of identified and verified putative interaction partners with their ontologies

Gene       Name                                             Function
Eml1       Echinoderm microtubule-associated protein-like   cell division, spindle
Svil       Supervillin                                      cell division, adhesion
Gpsm2      G-Protein Signaling Modulator 2                  cell division, spindle
Itgb3bp    Integrin Subunit Beta 3 Binding Protein          cell division, kinetochore
Amn1       Antagonist Of Mitotic Exit Network 1 Homolog     cell division, spindle
Cntn1      Contactin 1                                      adhesion
Etfa       Electron Transfer Flavoprotein Alpha Subunit     energy
Psmd8      Proteasome 26S Subunit, Non-ATPase 8             proteasome, degradation
Inpp5a     Inositol Polyphosphate-5-Phosphatase A           Ca regulation
S100a10    Calpactin                                        Ca binding
Mphosph8   M-Phase Phosphoprotein 8                         HUSH
Pphln1     Periphilin                                       HUSH
Tmem100    Transmembrane protein 100                        differentiation
Alb        Albumin                                          carrier protein
Parpbp     PARP1 Binding Protein                            DNA/RNA mechanism
Hcfc2      Host Cell Factor C2                              DNA/RNA mechanism
Ncbp1      Nuclear Cap Binding Protein Subunit 1            DNA/RNA mechanism

Fam208a has tissue-specific subsets of interacting partners

The Y2H system provided an unbiased view of the pleiotropy of putative binding partners of Fam208a. However, identifying the biological processes in which these possible interactions play a role is challenging. To study the functional relationships between Fam208a and its potential binding partners, we analyzed the expression pattern of all identified interacting proteins in adult murine tissues. Twenty murine organs were used to prepare an RNA library, which was subsequently used for a BIOMARK qRT-PCR screen with 18 gene-specific primers, of which 16 were designed based on identified Y2H preys. Primers for Pphln1 and Setdb1 from the HUSH complex were also included in the screen. Different tissues showed various expression levels of Fam208a as well as of its different binding partners. Fam208a is ubiquitously expressed at lower levels in the majority of organs; however, its expression was almost six times higher in male tissues than in female tissues (Fig. 28). Generally, higher levels were detected in the kidneys, spleen, thymus, seminal vesicles, uterus, and ovaries. Three other HUSH proteins (Mphosph8, Pphln1, and Setdb1) also exhibited higher expression in the kidneys, uterus, seminal vesicles, and testes. In addition, their expression was higher in the brain and lungs. Genes involved in spindle apparatus assembly and function and in cell division regulation (Amn1, Eml1, Gpsm2, Itgb3bp, and Svil) are commonly highly expressed in the seminal vesicles, lungs, duodenum, and brain. Amn1 and Itgb3bp had the highest expression in the testes. The stomach and lungs exhibited higher levels of Eml1, whereas Gpsm2 appeared to be predominantly expressed in the proximal colon and ileum. Svil was strongly expressed in the heart and tongue. Interestingly, Hcfc2 and Ncbp1, together with the HUSH proteins, Psmd8, and Parpbp, exhibited very high expression in the testes.
To study the expression of Fam208a and the other preys at single-cell resolution, we selected three adult tissues expressing Fam208a (testes, ovaries, and brain) for ISH. The diversity of interaction partners co-expressed with Fam208a in the observed organ samples supported the contextual nature of Fam208a protein interactions at the cellular level as well. A detailed look at testicular tissue showed possible co-localization of Fam208a and several other partners (Fig. 28). Fam208a itself was highly expressed in seminiferous tubules, with a strong signal in Sertoli cells, spermatogonia, and spermatocytes. Eml1, Gpsm2, Psmd8, Inpp5a, Amn1, Cntn1 and Parpbp were also specifically expressed in Sertoli cells, which lie at the base of the epithelium and visually form ring-like staining around the edge of the seminiferous tubules. Itgb3bp, Etfa, Hcfc2, Mphosph8, and Ncbp1 were more abundant towards the lumen, and their signal was seen in spermatogonia and spermatocytes. Svil and Alb showed no expression in the testes. In the analyzed ovaries, Fam208a had a strong signal in the granulosa cells surrounding the oocyte. The majority of our probes exhibited this pattern within the ovary. Parpbp, Gpsm2, Etfa, and Cntn1 had the strongest ovarian signal. Expression profiling in the brain revealed that Fam208a is strongly expressed mostly in granular cells within the cerebellum. A similar signal was detected for Svil, Alb, Gpsm2, Etfa, Parpbp, Mphosph8, and Ncbp1. To conclude, the general expression pattern of Fam208a and its interacting partners suggests that the role of each interaction depends on the tissue type and the physiological process involved. Moreover, tissues with higher proliferation levels have higher expression of Fam208a and its HUSH partners, while gametogenic tissues are richer in partners important for spindle apparatus control.

Figure 28 Fam208a is differentially co-expressed with putative partner proteins in adult murine tissues. A) Complex heat map based on qRT-PCR data showing expression patterns of 18 genes within 20 different murine tissues, with dark blue representing relatively low expression and bright red representing relatively high expression of the pre-selected genes. The X axis is divided into columns representing genes, and the Y axis forms rows representing murine tissues. ‘F’ stands for female and ‘M’ for male samples. B) In situ hybridization staining of paraffin sections of testes and C) ISH of ovaries. NC = negative control.

Ablation of FAM208a in somatic cells did not impair the cell division processes

To observe the possible effects of FAM208a ablation in fully differentiated somatic cells, we used CRISPR/Cas9 to delete exon 4 (exon 10 in an alternative splicing variant) in Hek293t cells. To study the assumed overlapping roles of FAM208a and the HUSH complex, we also prepared an MPHOSPH8 deletion mutant by introducing a mutation in exon 7, in which the ankyrin repeat region was identified, so that only the Fam208a interaction within the HUSH complex was targeted. However, deletion of the ankyrin domain resulted in complete ablation of the MPHOSPH8 protein. We established three lines of FAM208a KO cells (Fam-a1, Fam-a2 and Fam-a3) and two lines with mutated MPHOSPH8 (Mpp8-a and Mpp8-b). To test whether Fam208a deletion in somatic cells also affects cell division, we performed trypan blue viability measurements. The evaluated parameters included the concentration of viable cells, their diameter, and total viability. We did not detect any significant differences between the FAM208a mutant variants and control samples (Fig. 29). On the other hand, the MPHOSPH8 KO lines showed diminished cell viability after 48 hours. In addition, we noticed increased cellular diameters in the mutant lines Fam-a2, Fam-a3, and Mpp8-a. These data suggest that FAM208a depletion has little effect on cell proliferation or on the functionality of the HUSH complex.

To further analyze FAM208a, we performed proteomic analysis of all lines using the LC-MS approach, in which more than 4,000 proteins were detected and used for differential proteomics. Original Hek293t cells (HekWT) and cells that underwent the entire mutagenesis procedure but did not have an edited genome (HekMT) were used as controls. LC-MS analyses verified the absence of FAM208a in lines Fam-a1, Fam-a2, and Fam-a3 and of MPHOSPH8 in lines Mpp8-a and Mpp8-b (Fig. 29).

When we evaluated the presence of the other members of the HUSH complex, no difference in the levels of PERIPHILIN1 was observed. Moreover, SETDB1 could not be detected in any of our mutant lines. Approximately 25% of the LC-MS-detected proteins were differentially expressed among the mutated lines (Fig. 30). To show the most significantly up- and downregulated FAM208a-dependent proteins, we selected 127 of them that met the most stringent analytical criteria. The selection filter was applied at several levels. First, the protein had to be detected either in both control samples or in neither of them. Second, the level of the protein had to be below the detection limit in at least two of the three FAM208a KO lines, or vice versa. As a result, 104 proteins were detected in both controls, HekWT and HekMT, but were not measured in at least two of the FAM208a KO lines. Moreover, 23 proteins were upregulated only in mutant Hek293t lines. The same method was used for the MPHOSPH8 cell lines, which gave a final number of 67 downregulated and 16 upregulated proteins compared to the non-mutated HekWT and HekMT. Based on the gene ontologies and identified functions, we could cluster FAM208a-dependent proteins into six groups: a DNA/RNA binding group with 24 downregulated and seven upregulated proteins, proteins involved in transcription regulation with 22 downregulated and two upregulated proteins, proteins involved in the cell cycle with 15 downregulated and two upregulated proteins, cell division-linked proteins with nine downregulated and three upregulated proteins, and proteins connected with cellular apoptosis with four downregulated and two upregulated proteins. The majority of the proteins had more than one role and can therefore be included in multiple categories. Proteomic data from somatic cell lines showed largely affected protein levels, allowing us to assume an impact of FAM208a depletion on transcription regulation, nucleic acid binding, and epigenetic regulation without real impairment of cell viability (based on trypan blue staining, Fig. 30).
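The two-step selection filter described above can be written down as a small rule. The sketch below is only one possible reading of those criteria: the function name `classify`, the boolean "detected" encoding, and the example proteins are hypothetical and are not taken from the actual analysis pipeline.

```python
def classify(detected,
             controls=("HekWT", "HekMT"),
             ko_lines=("Fam-a1", "Fam-a2", "Fam-a3")):
    """Return 'down', 'up' or None for a single protein.

    `detected` maps a sample name to True when the protein was measured
    above the detection limit in that sample (a hypothetical encoding).
    """
    in_controls = sum(bool(detected[s]) for s in controls)
    in_ko = sum(bool(detected[s]) for s in ko_lines)

    # Rule 1: the two controls must agree (detected in both or in neither).
    if in_controls not in (0, len(controls)):
        return None

    # Rule 2: present in controls but missing from at least two KO lines ...
    if in_controls == len(controls) and len(ko_lines) - in_ko >= 2:
        return "down"
    # ... or absent from controls but present in at least two KO lines.
    if in_controls == 0 and in_ko >= 2:
        return "up"
    return None


# Hypothetical detection patterns used only to exercise the rule.
example_down = {"HekWT": True, "HekMT": True,
                "Fam-a1": False, "Fam-a2": False, "Fam-a3": True}
example_up = {"HekWT": False, "HekMT": False,
              "Fam-a1": True, "Fam-a2": True, "Fam-a3": False}
example_rejected = {"HekWT": True, "HekMT": False,
                    "Fam-a1": False, "Fam-a2": False, "Fam-a3": False}

results = [classify(p) for p in (example_down, example_up, example_rejected)]
print(results)

# The reported group sizes add up to the 127 selected proteins.
selected_total = 104 + 23
```

Applied to every detected protein, a rule of this kind would yield the two groups reported in the text (104 downregulated and 23 upregulated proteins for the FAM208a lines); the same function could be reused for the MPHOSPH8 lines by passing a different `ko_lines` tuple and adjusting the threshold to the number of lines.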

Thus, our data suggest that FAM208a in somatic cell lines should be responsible for epigenetic silencing via a complex together with MPHOSPH8. However, the absence of a clear phenotype in FAM208a KO cell lines points to a compensatory mechanism that overcomes the ablation of the protein.

In conclusion, FAM208a is involved in the regulation of cleavage during early zygote development, although its removal in stable somatic lines does not impair the cell cycle or cell division.

Figure 29 Ablation of FAM208a and MPHOSPH8 in somatic cells does not impair the cell division processes. A) Relative fold ratio of expression levels of HUSH complex proteins in mutated Hek293t cell lines. The levels of FAM208a in Fam208a mutants and of MPHOSPH8 in Mpp8 mutants are below the detection limit. There is almost no effect on PERIPHILIN (PPHLN1) levels, but SETDB1 could not be detected in any of the affected cell lines. B) To study the effect of downregulation of FAM208a and MPHOSPH8 in somatic cell lines, analyses of live cell concentration, total viability, and cell size were performed.

Figure 30 Ablation of FAM208a and MPHOSPH8 in somatic cells did not impair the cell division processes. A) Hek293t cells were mutated with the CRISPR/Cas9 system targeting either the Fam208a or the Mphosph8 gene. LC-MS was used to identify changes caused by these mutations. Normal wt Hek293t cells (HekWT) were used as a control and standard sample; the HekMT line does not carry any mutations and was used as a control for the mutagenesis process, which could also have introduced changes; lines Fam-a1, Fam-a2 and Fam-a3 carry different deletions in FAM208a, and lines Mpp8-a and Mpp8-b have the MPHOSPH8 protein knocked out. B) Expression profiles of the cell lines differ in approximately one fourth of the proteins identified by LC-MS. C) Ontology graph representing the 127 affected proteins grouped according to their ontologies. Proteins might belong to several categories simultaneously, and numbers above the lines represent proteins common to the linked groups. The first number in brackets represents upregulated proteins, while the second one represents downregulated proteins.

Overexpression of L130P Fam208a might cause an increase in G2/M phase arrest

The function of Fam208a in the regulation of the cell cycle has not yet been established, and to investigate the effect of its increased expression in Hek293t cells, we studied the cell cycle upon transfection. Cells were transfected with pCMV6 constructs carrying either the wt or the L130P variant of Fam208a and stained with propidium iodide (PI) to analyze the cell cycle. PI staining is a well-established method for analyzing the distribution of cells among the major phases of the cell cycle based on cellular DNA content [147]. Expression vectors with turbo GFP were used for this assay. The analyzed samples were divided into three groups: GFP negative (GFP-), GFP positive (GFP+) and GFP super-positive (GFP++) cells (Fig. 31). This division arose naturally from the measured fluorescence, where GFP+ cells were a minority and either their expression was probably weaker or the localization of the protein was altered. Cells transfected with wtFam208a formed massive nuclear foci in the majority of cases, but cells with a more even distribution were still observed (up to 18%). For L130P the situation was the opposite: the majority of cells displayed strong but evenly distributed localization, while a small portion of cells formed small nuclear clusters (up to 20%). Therefore, we conclude that for wtFam208a, the population labeled as GFP+ contains cells with an evenly distributed signal, and GFP++ cells display strong nuclear localization. L130P Fam208a also produced both signal intensities: GFP+ cells showed nuclear expression with small foci, and GFP++ cells showed global distribution of the mutated Fam208a. Generally, the cell cycle could not have been severely affected, as the transfected cells performed well and did not display increased apoptosis. Our data show only a very mild effect of wtFam208a or L130P expression on the G0+G1 and S phases of the cell cycle.
There was a small decrease in the number of GFP++ cells in G0+G1 upon transfection with the L130P construct. However, we also observed a slight effect in cells with the control empty GFP vector. Overexpression of wtFam208a from the pCMV6 vector did not affect the distribution of the cell population among the cell cycle phases. Interestingly, GFP++ cells with mutated Fam208a displayed a significantly increased percentage of cells blocked in the G2/M phase of the cycle (Fig. 32). This is a crucial link between cell cycle progression and Fam208a, and it suggests that dysfunctional Fam208a might cause an increased incidence of G2 phase arrest or a direct mitotic block.
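Because PI fluorescence scales with DNA content, cells can be assigned to G0+G1 (2N), S (between 2N and 4N) and G2/M (4N) from the PI signal alone. The sketch below illustrates this principle with fixed cutoffs; the cutoff values, the normalization of the G0+G1 peak to 1.0, and the toy measurements are all assumptions made for illustration, and dedicated flow cytometry software usually fits peak models rather than applying hard thresholds.

```python
from collections import Counter

PHASES = ("G0+G1", "S", "G2/M")


def assign_phase(dna_content, g1_upper=1.15, g2_lower=1.85):
    """Assign a cell-cycle phase from PI signal normalized to the 2N peak.

    The cutoffs are hypothetical; 1.0 corresponds to 2N and 2.0 to 4N.
    """
    if dna_content < g1_upper:
        return "G0+G1"
    if dna_content < g2_lower:
        return "S"
    return "G2/M"


def phase_percentages(dna_values):
    """Percentage of cells in each phase for one gated population."""
    counts = Counter(assign_phase(v) for v in dna_values)
    total = len(dna_values)
    return {phase: 100 * counts[phase] / total for phase in PHASES}


# Toy PI measurements for one population (for example, one GFP gate).
toy_cells = [0.95, 1.0, 1.02, 1.05, 1.3, 1.6, 1.9, 2.0, 2.05, 1.0]
percentages = phase_percentages(toy_cells)
print(percentages)
```

Running the same calculation separately for the GFP-, GFP+ and GFP++ gates of each construct would give the kind of per-population phase distribution that is compared in Fig. 32.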

Figure 31 Flow cytometry using LSRII distinguished three distinct populations of cells based on the ratio between side scatter and GFP intensity. Hek293t cells with wtFam208a were successfully transfected with 66.6% efficiency and the majority of GFP-positive cells displayed extra strong signal (GFP++). Small populations of cells were grouped together and based on lower intensity of the fluorophore, they were labeled as GFP+.

Figure 32 Cell cycle analyses based on PI staining and flow cytometry. Based on tGFP intensity, samples were divided into negative (GFP-), positive (GFP+) and super positive (GFP++). Three constructs were used, wtFam208a (wt), L130P Fam208a (L) and empty GFP vector (E). Cells negative for any GFP signal were analyzed as negative controls (N). No severe alteration of the cell cycle was observed. The only significant effect was detected in GFP++ samples with L130P mutated construct.

L130P mutation impairs intracellular distribution of Fam208a

To investigate the relative localization of fluorescently labeled Fam208a protein, we prepared pCMV6 fusion vectors with either a C’ terminal tGFP fluorophore or an N’ terminal mRFP label. The original open reading frame (ORF) sequence was obtained from a commercially ordered vector and cloned into the pCMV6-ENTRY plasmid with a FLAG tag. Visualization upon transfection is easier with fluorescent tags, and the Fam208a ORF was therefore completely re-cloned into expression vectors with fluorophores. This localization study was performed using two different variants of Fam208a: one was the wt allele and the second carried the L130P mutation identified by ENU mutagenesis [105], named MommeD6. Both ORFs were cloned into both color expression variants. The third construct used for this study was pCMV6-AN-mRFP-Mphosph8. The gene sequence was obtained from the D3 cell line by reverse transcriptase PCR. All constructs were verified by sequencing.

First, single transfections were performed. The plasmid with the ORF of wt Fam208a was transfected into Hek293t cells and, interestingly, we observed that the majority of the protein exhibited nuclear localization (>75% of the protein) (Fig. 33). The signal within the cytoplasm seemed to be evenly distributed and did not form any patterns, suggesting no compartmentalization. The situation within the nucleus was different. Fam208a with both labeling systems (N’ terminal and C’ terminal) formed large clusters within cellular nuclei. Interestingly, these large clusters were located within regions free of nuclear staining (for both DAPI and Draq5 dyes) (Fig. 34). Intercalating agents bind to the minor or major groove of the DNA double helix. This interaction can be prevented either by a condensed conformation of chromatin or by the presence of other binding proteins recognizing the same regions [148]. Therefore, our data suggest that Fam208a directly binds DNA, and it is possible that it specifically binds to heterochromatin regions of the epigenome. This theory is supported by recent studies of Fam208a function, which suggest that together with Mphosph8 and Pphln1, Fam208a forms the HUSH silencing complex and performs heterochromatin spreading via recruitment of the Setdb1 methyltransferase and deposition of H3K9me3 marks [86].
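The nuclear share of the signal quoted above (>75% for wt Fam208a) is, in principle, the fluorescence inside the nuclear region divided by the total fluorescence of the cell. According to the caption of Fig. 33, the ImageJ Colocalization finder tool was used for the actual measurement; the sketch below is not that tool but a minimal illustration of the underlying ratio, with a hypothetical function name and toy pixel values.

```python
def nuclear_fraction(signal, nuclear_mask):
    """Fraction of the total fluorescence that lies inside the nuclear mask.

    `signal` and `nuclear_mask` are equally sized 2D lists; the mask is 1
    for pixels covered by the nuclear stain and 0 elsewhere.
    """
    total = 0.0
    nuclear = 0.0
    for signal_row, mask_row in zip(signal, nuclear_mask):
        for value, in_nucleus in zip(signal_row, mask_row):
            total += value
            if in_nucleus:
                nuclear += value
    return nuclear / total if total else 0.0


# Toy 4x4 "cell": bright tagged protein in the center, dim signal around it.
toy_signal = [
    [1, 1, 1, 1],
    [1, 9, 9, 1],
    [1, 9, 9, 1],
    [1, 1, 1, 1],
]
toy_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

fraction = nuclear_fraction(toy_signal, toy_mask)
print(f"nuclear fraction = {fraction:.2f}")
```

In a real analysis the mask would come from thresholding the DAPI or Draq5 channel, and the fraction would be averaged over many cells.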

The expression study of the L130P variant of the Fam208a protein brings new insight into its possible functions. Unlike the wt protein, L130P has an increased cytoplasmic signal (>50%) (Fig. 33), and its nuclear portion lacks clustering. Generally stated, the mutated variant is evenly distributed within the cell. It does not exhibit clustering or shape formation resulting from organelle deposition. Both versions of the protein indicate that the cytoplasmic localization as such might be just an effect of overexpression from the mammalian vector (Fig. 35). However, at high levels, a protein that is preferentially nuclear might be present in the cytoplasm as well. The nuclear localization of L130P provides additional information. This substitution mutation is located within the N’ terminal region of the protein. Our Y2H results identified the C’ part of Fam208a as responsible for protein-protein interaction (PPI) and excluded all [77] identified preys for the N’ part. In silico analyses suggested that the N’ terminal part of the protein can form a nucleic acid binding clamp, very similar to the RNA polymerase II structure. The L130P mutation is located within this putative DNA/RNA binding clamp. Evidently, this mutation is responsible for the change from nuclear cluster formation to an evenly distributed signal. This signal overlaps with the nuclear staining emission. Therefore, we suggest that the Fam208a protein is capable of recognizing and binding DNA/RNA, preferentially in heterochromatin regions, by using its N’ terminus. This nuclear protein uses its C’ terminus for PPI, where the binding domain for Mphosph8 is also located.

Figure 33 Graphical representation of the intracellular distribution of overexpressed Fam208a wt and L130P proteins. More than 200 cells were analyzed, and the ImageJ Colocalization finder tool was used to establish the distribution of the protein. The majority of wt Fam208a was localized within the nuclear regions, while the mutated L130P form was spread evenly within the cells.
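The kind of per-cell quantification summarized in Figure 33 can be illustrated with a minimal sketch. It assumes that each cell is represented by a background-subtracted intensity grid for the Fam208a channel together with binary masks for the nucleus and the whole cell. The function name `nuclear_fraction`, the masks, and the toy values are hypothetical and serve only as an illustration; the sketch does not reproduce the ImageJ Colocalization finder workflow used in this study.

```python
def nuclear_fraction(signal, nucleus_mask, cell_mask):
    """Return the fraction of the total cellular signal located in the nucleus.

    All three arguments are 2D lists of equal shape. This is an illustrative
    helper, not the ImageJ procedure used for Figure 33.
    """
    total = 0.0
    nuclear = 0.0
    for sig_row, nuc_row, cell_row in zip(signal, nucleus_mask, cell_mask):
        for value, in_nucleus, in_cell in zip(sig_row, nuc_row, cell_row):
            if not in_cell:
                continue  # pixels outside the cell are ignored
            total += value
            if in_nucleus:
                nuclear += value
    if total == 0:
        raise ValueError("no signal inside the cell mask")
    return nuclear / total


# Toy 4x4 "cell" with a bright 2x2 nucleus (hypothetical intensities).
signal = [
    [1, 1, 1, 1],
    [1, 9, 9, 1],
    [1, 9, 9, 1],
    [1, 1, 1, 1],
]
cell_mask = [[True] * 4 for _ in range(4)]
nucleus_mask = [[1 <= r <= 2 and 1 <= c <= 2 for c in range(4)] for r in range(4)]

fraction = nuclear_fraction(signal, nucleus_mask, cell_mask)
print(f"nuclear fraction: {fraction:.2f}")
```

In this toy example the nucleus holds 36 of 48 intensity units. Averaging such per-cell fractions over all analyzed cells would give a population-level estimate comparable in spirit to the percentages reported above.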

Figure 34 Microscopy images of Hek293t cells transfected with wtFam208a in either the mRFP-pCMV6 or the pCMV6-tGFP expression vector. Both variants were used to exclude localization changes due to fluorophore fusion. Nuclear staining in row A) is DAPI; in row B) it is far-red Draq5. We can see preferential nuclear localization regardless of the labeled terminus. RGB profile graphs C) and D) show that the Fam208a signal is confined to nuclear clusters from which the nuclear staining signal is absent.

Figure 35 Microscopy images of Hek293t cells transfected with mutated L130P Fam208a in either the mRFP-pCMV6 or the pCMV6-tGFP expression vector. Both variants were used to exclude localization changes due to fluorophore fusion. Nuclear staining in row A) is DAPI; in row B) it is far-red Draq5. We can see protein distribution throughout the whole cell without formation of nuclear clusters.

Fam208a is not involved in DNA repair processes but has nucleic acid-binding capability

Cells transfected with both Fam208a construct variants were irradiated or chemically stimulated (camptothecin, CPT) to induce DNA damage and activate the repair mechanism. An antibody against γH2AX was used to visualize the increased incidence of DNA double-strand breaks (DSB). In cells with a higher DSB signal, the nuclear localization of Fam208a was affected: the large and highly intense clusters were partially dispersed (in > 53% of cells). In some cells, the staining signals for Fam208a and γH2AX did not overlap at all; in others, the co-localization seemed random (Fig. 36). Therefore, we conclude that Fam208a is not recruited to DSB sites but may protect unaffected nucleic acids and remove them from active repair regions. There was almost no effect of irradiation on the localization of the mutated L130P Fam208a protein (Fig. 36).

The putative nucleic acid-binding capability of Fam208a was challenged by treatment with DNase I or RNase A [149]. Both treatments should partially or completely affect the nuclear localization and cluster formation of the overexpressed protein. The impact of the treatment was controlled with the RNA-binding protein Srsf5 [150]. Nuclear cluster formation of wtFam208a, as well as the nuclear spots of Srsf5, was severely impaired, and the signal was partitioned into smaller droplets with a more uniform distribution. Nuclear localization of both signals was preserved (Fig. 37). Dissociation from chromatin clearly influenced the global expression pattern within the transfected cells. These experimental data support our theory that Fam208a is capable of nucleic acid binding, probably with a higher inclination towards DNA and heterochromatin.

Figure 36 DSB induction does not display recruitment of Fam208a. Hek293t cells were transfected with both Fam208a constructs and irradiated 24 hours after transfection. ICC was performed the next day. The increased γH2AX signal marking DSB displaced the Fam208a signal, and large cluster formation was eliminated.

Figure 37 Treatment with endonucleases impairs intracellular localization of Fam208a. Vector-driven expression of Srsf5 and wtFam208a proteins within Hek293t cells shows standard nuclear localization of both proteins. Both treatments, DNase I and RNase A, changed the localization of the control RNA-binding protein Srsf5 and of wtFam208a, suggesting nucleic acid-binding activity.

Fam208a localization and function is Mphosph8-level dependent

To investigate the relationship between Fam208a, Mphosph8, and DNA/RNA-binding activity in more depth, we co-transfected both proteins and observed the effects of their overexpressed levels. We had already inspected the interaction between endogenous Mphosph8 and fluorescently labeled overexpressed Fam208a. The endogenous phosphoprotein clearly co-localized within the large and intense Fam208a nuclear foci (Fig. 38). Therefore, we believe that in the presence of large amounts of Fam208a, it is Fam208a that controls the localization of Mphosph8. This situation changes upon co-transfection, in which both proteins, carrying different labels, are exogenously expressed from the delivered vectors. Once the levels of Fam208a and Mphosph8 are both driven by the pCMV6 promoter, the localization of the proteins is distinct. Both proteins highly co-localize within the nucleus, but this time they do not form large and intense clusters. The Mphosph8 protein drives Fam208a into DAPI staining-free regions in smaller protein dots, which are arranged in a pattern reminiscent of mRFP-Mphosph8 itself (Fig. 39). On the other hand, single transfection experiments with Mphosph8 revealed that the protein forms small and regularly distributed dots within DAPI-positive regions (Fig. 40). In the case of co-transfection with wtFam208a, the position pattern is reminiscent of the Mphosph8 distribution, but these smaller and evenly localized foci maintain their DAPI-free character, and so the intercalating stain cannot access the chromatin.

L130P Fam208a has a different localization compared to the wt variant. It does not form massive nuclear clusters but is evenly distributed between the nucleus and the cytoplasm (52 %). It tends to create smaller foci within the nucleus (21 %), and these are DAPI positive. Our data suggest that the N’ terminally located substitution mutation, L130P, impairs the capability of Fam208a to bind nucleic acids, but it should not affect PPI with putative partners. This theory was verified by a co-transfection experiment. Hek293t cells were transfected with the fluorescently labeled L130P-Fam208a-AC-tGFP and AN-mRFP-Mphosph8 constructs. As expected, both signals overlapped, and so the interaction between Fam208a and Mphosph8 was not impaired by the presence of the L130P mutation (Fig. 41). Interestingly, the localization of the co-transfected signal of L130P Fam208a with Mphosph8 was very similar to the co-localization signal of wtFam208a and Mphosph8. Therefore, we conclude that the L130P mutation does not impair the interaction with Mphosph8, but that overexpression of both proteins, driven by the pCMV6 promoter, changes the localization of L130P Fam208a through recruitment by Mphosph8 to a pattern similar to that of wtFam208a (Fig. 39).

From what we know about Fam208a and its mechanism in the HUSH complex, taken together with our studies, it seems that its localization is flexible and that one of its regulators is the level of Mphosph8. Different tissues express different levels of these proteins, and so in cells with lower amounts of Mpp8, Fam208a is organized in a different way and could interact with other proteins that are not involved in the HUSH complex and its functions.

Taken together, we identified 16 putative interaction partners which might cooperate in different complexes with Fam208a. Besides its verified HUSH complex function, there is a novel complex and a role suggested, linking Fam208a with the involvement in maintaining the genome stability via spindle apparatus control. This complex seems to function mainly during early embryonic divisions when the HUSH function is paused.

Figure 38 Endogenous MPHOSPH8 localization in the presence of expressed Fam208a. The MPP8 signal retains its strict nuclear localization, but in the presence of wtFam208a, it is lured into DAPI-free regions. This mechanism is impaired by the presence of L130P mutation.

Figure 39 Co-transfection of Mphosph8 and Fam208a (both variants) leads to intermediate localization. The presence of the same level of expression of all observed proteins changes the intracellular localization, which is more similar to the pattern of expressed Mphosph8. However, in the case of wtFam208a, its small foci still remain without nuclear staining, or only at the very boundary of the cluster. The L130P protein shares very similar localization when co-transfected with Mpp8, but the foci stay within the DAPI-positive regions.

Figure 40 Localization of the mRFP-Mphosph8 protein expressed from the pCMV6 vector. Nuclear localization is typical of this phosphoprotein and is consistent with its cellular function. We can see that Mpp8 creates small foci within the nucleus, which are evenly distributed and positive for nuclear staining, in this case Draq5.

Discussion

Identification of gene functions is essential for understanding genome data and for linking genomic data to human health. Two main approaches are used for these purposes, forward and reverse genetics, and we used both in our study. The forward genetics approach was represented by ENU mutagenesis, which produced the substitutional mutant L130P Fam208a, first studied for suppression of variegation [106]. Our results provide a detailed phenotype description and characterization of the crucial mechanisms and their impairment caused by the L130P mutation. A reverse genetics tool, the CRISPR/Cas9 method, was used to produce a targeted knock-out mouse strain with complete ablation of the Fam208a protein. The two models were compared and displayed interesting differences in the timing of the phenotype, but with an identical outcome, namely embryonic lethality in homozygotes. Reverse genetics was also used to prepare mutant human cell lines (Hek293t), which were mostly used for cell cycle studies and proteomic analyses. In addition, Fam208a has been described as an epigenetic regulator of retro-element silencing [111] and a keeper of heterochromatin marks [151].

Extensive epigenetic changes occur at the onset of gastrulation as cells leave the pluripotent state and differentiate into embryonic germ lineages. The epigenetic repressor Fam208a is essential at this stage, as L130P Fam208a mutant embryos exhibit a profound developmental delay beginning by E6.5. In all bilateria, the establishment of the anterior-posterior (AP) axis is crucial and occurs during gastrulation. In L130P Fam208a embryos, we observed significantly impaired growth of the epiblast at E6.5, a time corresponding to the onset of gastrulation and the establishment of AP patterning. The interpretation as a developmental delay is supported by the eventual appearance of node activity at E8.5. However, we identified several discrepancies that cannot be explained by developmental delay alone. First, growth of the extraembryonic tissue was increasingly desynchronized from that of the embryonic tissue. Second, mutant embryos failed to downregulate anterior Nodal expression. Under normal conditions, expression of Lefty1 and Cer1 in the DVE inhibits Nodal expression, resulting in a proximal-distal Nodal gradient. This gradient rotates to become the AP axis, whereby the DVE moves anteriorly to form the AVE and the proximal epiblast moves posteriorly [152]. This rotation also occurred in E6.5 L130P Fam208a embryos, with the AVE clearly visible as an anterior Cer1-expression domain, but the expected downregulation of anterior Nodal expression did not occur [153]. Surprisingly, Wnt3 expression, a source of inhibitory signaling to the AVE, rotated normally to the posterior epiblast. Similarly to homozygous L130P Fam208a mutant embryos, Drap1 and Lefty2 knockouts also displayed gastrulation failure with excessive Nodal signaling, although they were able to correctly specify the AVE [154]. L130P Fam208a mutant epiblasts exhibited highly increased expression of Dkk1 at E6.5, which could explain the rescue of the AVE. Dkk1 is an activation signal for AVE migration, and exogenous administration of Dkk1 can correct the migration defects caused by inhibition of proliferation, suggesting some flexibility in the coordination of AP axis formation [155].

The gastrulation-defective L130P Fam208a mutants displayed upregulation of the p53 pathway and increased p53 protein stability. Critically, mutant embryos after crossing with a p53 null background showed partial rescue of the gastrulation phenotype. The rescued embryos, with abnormal neural tube, failure of anterior neural tube closure and cardiac defects, resembled p53/p63/p73 triple knockout chimeras in which the phenotypes were attributed to impairment in mesendodermal specification with a corresponding proclivity to assume a neurectoderm fate [140].

Expression profiling has shown that the epiblast has an amplified p53 signaling response and high cytoplasmic priming towards apoptosis. This increased precaution protects against the accumulation of mutations during a period when rapid cell cycling and a relatively open chromatin conformation can make cells more accessible to mutagenesis. It has already been described that the period between E5.5 and E7.5 comprises a sensitive developmental window during which the deletion of many genes important for genome integrity is lethal. A possible reason for the increase in p53-dependent apoptosis in L130P Fam208a embryos is increased genomic instability. This could arise, for example, from impaired repression of endogenous retroviruses and satellite repeats, or it could result from stabilization of the p53 protein itself. However, L130P Fam208a embryos showed a gene dosage-dependent increase in the percentage of cells positive for the proliferation marker Ki67. This might be a sign of a compensatory response to cellular elimination through p53-mediated apoptosis and cell cycle arrest. Many similar models of compensatory epiblast growth have been described in the past [156, 157].

Fam208a was originally proposed as a pluripotency-related gene, as it was identified as a target of the Oct4 regulation module and Oct4 occupancy of its promoter was demonstrated [103]. Our expression profiles of L130P Fam208a mutant epiblasts show gene set enrichment for the Oct4 signaling pathway, in particular for the nuclear long non-coding subset of this co-expression module, whose members have been termed pluripotency-associated transcripts (Platr) [158]. Platr members were affected differentially: Platr3, -4, -20 and -27 were upregulated, while the expression of Platr22 was decreased. None of the lncRNAs dysregulated in Fam208a mutants has previously been assigned a functional role in maintaining embryonic stem cell pluripotency. It is also possible that these lncRNAs are markers for more global, repressive epigenetic changes associated with pluripotency exit. If so, then silencing of Platr3, -4, -20 and -27 may be especially sensitive to Fam208a-dependent expansion of H3K9me3 domains, whereas other Platr genes may rely more on other mechanisms of epigenetic silencing.

The role of Fam208a in the maintenance of genome stability via interaction with MPHOSPH8 in the HUSH complex has been widely described, but our findings suggest a new putative role for Fam208a in the organization of the spindle apparatus. In fact, heterochromatin instability can be tightly linked with spindle apparatus establishment and with cell division as such. Hence, it is possible that these functions are related and form a complex orchestration strategy involving Fam208a. Recently, an error-prone chromosome-mediated spindle assembly mechanism was described in human oocytes [159], which draws our attention to the mouse oocyte and the possibility of a similar effect caused by Fam208a downregulation. First, we created a new mouse model with complete loss of the critical first exon of Fam208a, which caused functional ablation of the protein. In the homozygous state, this mutation is embryonic lethal. In this model, embryos die between E9.5 and E12.5, which is remarkably later than in ENU mutagenesis-induced L130P mutants (MommeD6, a point mutation of Fam208a leading to an amino acid substitution), which are fully absorbed by E9.5 [160]. There might be several reasons for the milder effect of the complete knock-out. L130P mutants have not been fully characterized previously, and it has not been reported whether the Fam208a mutation fully eliminates the endogenous protein. Thus, the presence of remaining and possibly not fully active Fam208a might cause a dominant negative effect. In addition, murine zygotes deficient for Fam208a may develop compensatory mechanisms (HUSH2, Kap1) and use pathways that provide partial rescue for embryos. Moreover, the presence of maternal RNA leads to the production of Fam208a protein in oocytes and thus postpones the onset of effects caused by Fam208a zygotic mRNA decay. In this way, maternal Fam208a helps to overcome the first rounds of zygotic division [161]. We also observed a delayed dosage effect in E9.5 heterozygous embryos, which had 22 somites compared to 28 somites in wild-type littermates. The fact that born heterozygotes are fully developed and viable suggests that the role of Fam208a is crucial mostly in the very early stages of embryogenesis. Interestingly, compensatory mechanisms are present during further development, as E12.5 heterozygous embryos are indistinguishable from wild types. Even homozygous embryos contain a subset of wild-type mRNA from the heterozygous female (maternal products), and the use of wild-type RNA in the first zygotic events results in a delayed effect of Fam208a ablation. RNA-seq data from zygotes indeed show the presence of maternal Fam208a RNA from the maternal-to-zygotic transition of embryonic transcription at the four-cell stage. There is also a verified high-level presence of Fam208a in metaphase I stage oocytes, showing clear involvement of the deposition of maternally derived molecules [162]. To overcome this problem, we downregulated maternal RNA with a pool of siRNA directly in fertilized eggs. This manipulation resulted in an increased incidence of multiple spindle formation and a higher risk of cell division arrest. None of the control groups developed a spindle apparatus with multiple poles. In contrast, zygotes with downregulated Fam208a had impaired poles in nine out of 11 detected spindles. Based on this and on the RNA-seq data, we suggest that Fam208a is critical for early zygotic division processes, spindle dynamics, and the establishment of a bipolar apparatus.

Using the Y2H system, we identified and verified 16 putative Fam208a interaction partners. Five of these proteins (Gpsm2, Eml1, Svil, Amn1 and Itgb3bp) are directly linked with the establishment and correct functioning of the spindle apparatus and might be more important for interaction with Fam208a during early rounds of zygotic division. One of the identified proteins is Gpsm2, a member of the cortical complex (consisting of NUMA and dynactin/dynein) [163] that plays a key role in establishing proper spindle orientation [164]. Another interesting protein is Supervillin (Svil), which co-localizes with endogenous myosin II and EPLIN in the cleavage furrow during early cytokinesis [165]. Eml1 is critical for correct formation of the cleavage plane [166]. The function of Amn1 is linked with both spindle assembly and nuclear orientation checkpoints [167]. Itgb3bp (CENPR) is a core centromere protein, which prevents premature separation [168]. Considering that almost one third of the identified interaction partners are directly involved in the cell division mechanism, we propose that Fam208a is most likely involved in this process as well.
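The proportion behind the statement that nearly one third of the partners are linked to cell division can be checked with a short worked calculation. The sketch below uses only the counts and protein names given in the text; the variable names are illustrative.

```python
from fractions import Fraction

# Counts taken from the text: 16 verified putative partners, of which
# five are linked to the spindle apparatus.
partners_total = 16
spindle_linked = ["Gpsm2", "Eml1", "Svil", "Amn1", "Itgb3bp"]

share = Fraction(len(spindle_linked), partners_total)
gap_to_one_third = Fraction(1, 3) - share

print(f"spindle-linked share: {share} = {float(share):.4f}")
print(f"difference from one third: {float(gap_to_one_third):.4f}")
```

The exact share is 5/16, which lies slightly below one third and is consistent with the wording "almost one third".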

To further investigate the interaction partners of Fam208a in tissues, we performed expression profiling in murine organ samples, which showed variable expression patterns in different tissues. Therefore, it is possible that the cell-specific role of Fam208a is governed by different interactions. To map the possible interactions between Fam208a and its partners in specific tissues, we performed in situ hybridization with RNA probes. The hybridization revealed that co-localization and co-expression of Fam208a and its putative interaction partners are strongly tissue- and cell type-specific. Therefore, in hyper-proliferative cells, Fam208a might play a different role compared to slowly proliferating tissues.

To further describe the role of Fam208a in spindle apparatus assembly, we used CRISPR-Cas9 to prepare stable somatic cell lines with mutations in Fam208a and in Mphosph8, which was identified as one of the interaction partners using the Y2H system. This interaction has already been well described and verified by others [77]. FAM208A, MPHOSPH8, PERIPHILIN, and SETDB1 were designated as the HUSH complex, whose function was linked with gene silencing. To investigate whether these partners are also involved in mitotic cell division, we prepared knock-down cell lines. No viability defects were observed in Fam208a mutant cells, indicating that, in contrast to the effect described during early zygotic division, the machinery in somatic cells might not be affected at all. The crucial difference between cleavage of zygotic cells and normal cell division is that during cleavage the cytoplasmic mass does not increase, whereas the nuclear mass and the overall cell number do [169]. The cleavage cycle completely omits the G1 and G2 phases and consists only of rapid successive S and M phases [170]. Due to these differences, various proteins orchestrate this zygote-specific division. While Dnmt3a (a methyltransferase cooperating with Fam208a), which recruits the HUSH complex to its active sites, is not necessary during zygotic divisions [171], LGN (GPSM2) is important for nuclear positioning and the establishment of cellular polarity, particularly during embryonic cleavage [169]. Thus, Fam208a seems to act, besides its role in the HUSH complex, during early zygotic division and cleavage. Data from expression profiling were supported by LC-MS proteomics, showing a similar variety of interactions. However, it appears that both experimental setups, embryonic and cellular, point to the involvement of Fam208a in DNA/RNA binding and transcription regulation, as the majority of affected proteins are functionally linked to these ontologies.

While the analysis of Fam208a interactions in cell lines revealed that its protein partners act more as cell cycle regulators, the analyses performed in embryos identified a protein group participating in epigenetic processes. Therefore, it seems that the epigenetic machinery in cell lines is more stable even if Fam208a is downregulated, while in embryos Fam208a removal causes accumulation of errors. Another aspect of the whole machinery was revealed by studying the effect of overexpression in cell lines. Neither increased nor decreased levels of wtFam208a protein in Hek293t cells seriously impaired the cell cycle. However, overexpression of the mutated protein caused a remarkable shift in the percentage of cells arrested in the G2 or M phase. Considering that the methylation status in differentiated cells is relatively stable, it is not surprising that fluctuation of Fam208a levels did not disturb the cell cycle. On the other hand, plasmid-driven overexpression of the L130P mutated variant of Fam208a clearly impairs cell cycle progression and blocks it within the G2 or M phase. The G2-M checkpoint is crucial for verifying that DNA replication is complete. This partially also includes the establishment of epigenetic marks and heterochromatin formation. L130P overexpression might drive mutated Fam208a to the site of activity, but upon delivery, these proteins fail to finish the marking procedure and so the cells need a backup plan. We do not suggest that without functional Fam208a the cells could not proceed with the cycle. It is more likely that cells spend much more time identifying the L130P Fam208a malfunction, while another machinery is recruited with a remarkable delay. In mutated cell lines with deleted exons of Fam208a, it is possible that cells already possess the information about the missing protein; hence, they recruit the secondary mechanism.
Fam208a is a member of a large interactome, and it is evident that its function in differentiated cells is not unique and can be replaced by an alternative pathway. On the other hand, during development and transitions, the role of Fam208a seems to be crucial, and sooner or later its malfunction leads to the collapse of the whole machinery and causes embryonic lethality. In summary, the analysis of the proteomic data suggests that Fam208a has a crucial role during zygotic cleavage. Its epigenetic regulation role becomes a key function once the methylation processes are needed; that is, the epigenetic machinery in cell lines does not seem to be influenced by the ablation of Fam208a, while Fam208a removal in embryos causes gastrulation arrest and failure of primitive streak formation. Thus, the depletion of Fam208a does not seem to affect standard mitotic division. In case of impairment of HUSH complex function (Fam208a KO), cells can operate through other mechanisms, e.g., through the Kap1 complex [172]. However, when MPHOSPH8 is downregulated and thus excluded from HUSH and other methylation complexes, e.g. with Dnmt3a [67], cells are barely able to cope with this loss. By contrast, ablation of MPHOSPH8 predominantly causes an increase in proliferation followed by lower viability and higher sensitivity to cell death.

The relationship between Fam208a and Mphosph8 was examined in more depth. Our data suggest that the intracellular localization of wtFam208a and its mutant variant L130P depends on the expression levels of Mphosph8. The intracellular distribution itself is influenced by the presence of the mutation. While wtFam208a is predominantly localized within the nucleus, where it forms large, intense clusters free of nuclear staining, the L130P variant is evenly distributed throughout the cytoplasm as well as inside the nucleus. Based on our co-localization studies, we conclude that the mutation itself does not affect the protein-binding capability of Fam208a, but it might restrict the DNA/RNA-binding activity, as we could not observe large nuclear foci lacking the chromatin staining signal. In addition, the L130P mutation may change the tertiary structure of the protein, making it less suitable for nuclear import. In a small percentage of cells expressing mutated Fam208a, we could identify smaller clusters within the nucleus. This localization might be caused by recruitment of the L130P protein by Mphosph8, as these small clusters are DAPI positive.

Co-expression of Fam208a and Mphosph8 from pCMV6 vectors revealed dose-dependent dynamics in the localization of these two proteins. Expression of Fam208a alone creates large nuclear DAPI-free regions. Expression of Mphosph8 creates smaller nuclear foci that are H3K9me3 positive. When both constructs are co-expressed, the situation changes and Fam208a phenocopies the localization of expressed Mphosph8. Notably, Mphosph8 overexpression has exactly the same effect on the localization of L130P Fam208a. This verifies the expectation that the mutation does not impair the protein-protein interaction between Fam208a and Mpp8, and it is interesting to see that Mphosph8 recruits dysfunctional Fam208a to the methylation sites. On the other hand, lower endogenous levels of Mphosph8 are not capable of this ‘translocation’ of Fam208a. In this case, endogenous Mphosph8 is trapped with Fam208a within the large nuclear foci. The question of whether this complex plays two distinct roles when differentially localized remains open.

From the very beginning, the proteomic information about Fam208a suggested the possibility of nucleic acid-binding activity. Results from Y2H identified the C’ terminal part as the protein interaction region with several competitive binding domains. The function of the N’ terminal part was evidently not linked with PPI, and according to the 3D modeling program, it could form a nucleic acid-binding fork. Massive nuclear clusters without nuclear staining signal suggested that Fam208a itself binds to the major groove and therefore prevents other staining. Indeed, treatment with endonucleases impaired the formation of nuclear foci and supported our theory. Neither of these treatments had any effect on the localization of L130P Fam208a, which is mis-localized compared to wtFam208a and does not form nuclear foci; therefore, it seems that the mutation alters the DNA/RNA-binding activity. Another possible use of this binding clamp was proposed to be during the DNA damage response. Chemical and physical induction of DSB was visualized by an increased signal of the DNA damage marker γH2AX, which always excluded the Fam208a signal within the nucleus. Evidently, Fam208a does not co-localize with DSB, but it might protect the rest of the undamaged DNA from the repair machinery. A rapid decrease of H3K9 methylation marks can be observed after induction of DSB. This process is driven by p53-mediated suppression of Suv39H1. Chromatin relaxation due to the displacement of HP1 and the decrease of H3K9 methylation increases chromatin accessibility to facilitate end resection for HR [62]. Methylation marks on different histones do not act individually but are connected; they have been indicated to act in concert with each other in a context-dependent manner in DSB repair. Therefore, it is possible that Fam208a maintains the condensed conformation of chromatin in DSB-free regions, as it is actually removed by the repair machinery from its point of action.

Altogether, we identified 16 putative Fam208a-interacting partners, which besides the HUSH complex create a novel protein network (Eml1, Svil, Gpsm2, Amn1, and Itgb3bp), linking Fam208a to the maintenance of genome stability via control of the function of the spindle apparatus. This new role of Fam208a within a unique complex appears to affect the processes of early embryonic division, when the HUSH function is paused. The epigenetic role of Fam208a seems to be more crucial during and after the differentiation process. The biological functions of this dual-acting protein could be predicted not only for the developmental process; they also seem to be cell type- and tissue-specific. The molecular mechanism is yet to be discovered; nevertheless, based on other putative interaction proteins, we conclude that Fam208a is a crucial protein involved in several mechanisms maintaining genome stability.


Summary

Many mechanisms of epigenetic maintenance of genome stability are still not fully understood or described. New epigenetic players are identified constantly, and only then can they be assigned to functional complexes. The multi-functionality of these players is a further obstacle. Our study focuses on exactly this type of epigenetic modifier, one with a dual function and the common aim of maintaining genome stability. Fam208a has been shown to be part of a heterochromatin-forming complex. It cooperates within a complex responsible for silencing retroviral DNA and transposons while promoting the spreading of the H3K9me3 mark across chromatin. Our study brings new insight into the developmental function of Fam208a, which is linked with the maintenance of epiblast fitness. The absence of Fam208a in embryos leads to a decreased number of cells via p53-mediated apoptosis. Rescuing the phenotype by mutating p53 raises the questions of whether Fam208a mutation or misexpression can similarly affect the fitness of other cell populations, and whether such fitness defects are similarly ‘rescued’ by the loss of the tumor suppressor p53 that leads to cancer. The general involvement of Fam208a in early developmental processes was also verified by studying KO embryos, with very similar, yet slightly different results. The novelty of this study is the discovery of a new protein complex that links spindle assembly during zygotic division with Fam208a as an epigenetic regulator of the correct formation of the spindle apparatus (SA). To perform this role, distinct interaction partners have to be present besides those responsible for epigenetic silencing (HUSH complex). We identified 16 proteins as putative interaction partners, five of which are linked with cell division processes. Analysis of their expression profiles showed high tissue specificity. Thus, the roles of Fam208a may differ in a cell-specific manner, as they likely depend on the availability of its interaction partners. Altogether, our work provides important insight into the molecular landscape of Fam208a as a multi-interacting protein that affects genome stability and plays an essential role in the initial steps of embryonic development by orchestrating spindle pole assembly and chromosome segregation.


Souhrn (English translation)

There are many mechanisms for maintaining epigenetic stability that are still not fully understood or described. We keep discovering new epigenetic factors and gradually assign them to functional complexes. The multi-functionality of these proteins is one of the many complications. This work deals with exactly this kind of epigenetic regulator, which has a dual function with the common goal of maintaining genome stability. Fam208a has been described as a component of a heterochromatin-forming complex. This set of proteins is responsible for the epigenetic silencing of DNA from retroviruses and transposons by spreading the H3K9me3 mark along chromatin. Our study brings a new view of how epiblast fitness is maintained through Fam208a regulation during development. The absence of Fam208a in embryos reduces embryo size by decreasing the number of cells through cell death regulated by the p53 pathway. The lethal phenotype is partially rescued by mutation of the p53 protein. This observation raises the question of whether mutation or misexpression of Fam208a can similarly affect the fitness of other cell populations and, if so, whether such a phenotype is also affected by the loss of function of p53, which leads to cancer. The involvement of Fam208a in early embryonic development was also confirmed by studying KO embryos. The novelty of this study lies in highlighting a new protein complex that links spindle assembly during zygotic division with Fam208a itself, an epigenetic regulator of the correct formation of the SA. In addition to the interaction partners responsible for epigenetic silencing (HUSH complex), other proteins must also be present. We identified 16 proteins as putative interaction partners, 5 of which play a role during cell division. We analyzed their expression profiles and found high tissue specificity. The roles of Fam208a may therefore differ between tissues, depending on the presence of specific interaction partners.
Our work thus provides insight into Fam208a as a multi-interacting protein that plays a role in genome stability and in the initial stages of embryonic development through organization of the spindle and chromosome segregation.


References

1. Allis, C.D. and T. Jenuwein, The molecular hallmarks of epigenetic control. Nat Rev Genet, 2016. 17(8): p. 487-500.
2. Passarge, E., Emil Heitz and the concept of heterochromatin: longitudinal chromosome differentiation was recognized fifty years ago. Am J Hum Genet, 1979. 31(2): p. 106-15.
3. Muller, H.J. and E. Altenburg, The Frequency of Translocations Produced by X-Rays in Drosophila. Genetics, 1930. 15(4): p. 283-311.
4. McClintock, B., Chromosome organization and genic expression. Cold Spring Harb Symp Quant Biol, 1951. 16: p. 13-47.
5. Lyon, M.F., Gene action in the X-chromosome of the mouse (Mus musculus L.). Nature, 1961. 190: p. 372-3.
6. Surani, M.A., S.C. Barton, and M.L. Norris, Development of reconstituted mouse eggs suggests imprinting of the genome during gametogenesis. Nature, 1984. 308(5959): p. 548-50.
7. McGrath, J. and D. Solter, Completion of mouse embryogenesis requires both the maternal and paternal genomes. Cell, 1984. 37(1): p. 179-83.
8. Bird, A.P. and E.M. Southern, Use of restriction enzymes to study eukaryotic DNA methylation: I. The methylation pattern in ribosomal DNA from Xenopus laevis. J Mol Biol, 1978. 118(1): p. 27-47.
9. Bannister, A.J., et al., Selective recognition of methylated lysine 9 on histone H3 by the HP1 chromo domain. Nature, 2001. 410(6824): p. 120-4.
10. Tchasovnikarova, I.A., et al., GENE SILENCING. Epigenetic silencing by the HUSH complex mediates position-effect variegation in human cells. Science, 2015. 348(6242): p. 1481-1485.
11. Bou Kheir, T. and A.H. Lund, Epigenetic dynamics across the cell cycle. Essays Biochem, 2010. 48(1): p. 107-20.
12. Boland, M.J., K.L. Nazor, and J.F. Loring, Epigenetic regulation of pluripotency and differentiation. Circ Res, 2014. 115(2): p. 311-24.
13. Skinner, M.K., Role of epigenetics in developmental biology and transgenerational inheritance. Birth Defects Res C Embryo Today, 2011. 93(1): p. 51-5.
14. Gresakova, V., Cesty mutageneze. ŽIVA, 2017. 2/2017: p. XLIV-XLVI.
15. Appleby, M.W. and F. Ramsdell, A forward-genetic approach for analysis of the immune system. Nat Rev Immunol, 2003. 3(6): p. 463-71.
16. Hardy, S., et al., Reverse genetics in eukaryotes. Biol Cell, 2010. 102(10): p. 561-80.
17. Moresco, E.M., X. Li, and B. Beutler, Going forward with genetics: recent technological advances and forward genetics in mice. Am J Pathol, 2013. 182(5): p. 1462-73.


18. Tierney, M.B. and K.H. Lamour, An introduction to reverse genetic tools for investigating gene function. The Plant Health Instructor, 2005(PHI-A-2005-1025-01).
19. Guenet, J.L., Chemical mutagenesis of the mouse genome: an overview. Genetica, 2004. 122(1): p. 9-24.
20. Martin, B., et al., A high-density collection of EMS-induced mutations for TILLING in Landsberg erecta genetic background of Arabidopsis. BMC Plant Biol, 2009. 9: p. 147.
21. Alonso, J.M., et al., Genome-wide insertional mutagenesis of Arabidopsis thaliana. Science, 2003. 301(5633): p. 653-7.
22. Mori, I., et al., Transposable element Tc1 of Caenorhabditis elegans recognizes specific target sequences for integration. Proc Natl Acad Sci U S A, 1988. 85(3): p. 861-4.
23. Bessereau, J.L., Transposons in C. elegans. WormBook, 2006: p. 1-13.
24. Hill, T., C. Schlotterer, and A.J. Betancourt, Hybrid Dysgenesis in Drosophila simulans Associated with a Rapid Invasion of the P-Element. PLoS Genet, 2016. 12(3): p. e1005920.
25. Carlson, C.M. and D.A. Largaespada, Insertional mutagenesis in mice: new perspectives and tools. Nat Rev Genet, 2005. 6(7): p. 568-80.
26. Liu, M., et al., Methodologies for Improving HDR Efficiency. Front Genet, 2018. 9: p. 691.
27. Jasin, M., M.E. Moynahan, and C. Richardson, Targeted transgenesis. Proc Natl Acad Sci U S A, 1996. 93(17): p. 8804-8.
28. Iida, S. and R. Terada, A tale of two integrations, transgene and T-DNA: gene targeting by homologous recombination in rice. Curr Opin Biotechnol, 2004. 15(2): p. 132-8.
29. Ray, A. and M. Langer, Homologous recombination: ends as the means. Trends Plant Sci, 2002. 7(10): p. 435-40.
30. Carroll, D., Genome engineering with zinc-finger nucleases. Genetics, 2011. 188(4): p. 773-82.
31. Ousterout, D.G. and C.A. Gersbach, The Development of TALE Nucleases for Biotechnology. Methods Mol Biol, 2016. 1338: p. 27-42.
32. Pattanayak, D., et al., Small but mighty RNA-mediated interference in plants. Indian J Exp Biol, 2005. 43(1): p. 7-24.
33. Quadros, R.M., et al., Easi-CRISPR: a robust method for one-step generation of mice carrying conditional and insertion alleles using long ssDNA donors and CRISPR ribonucleoproteins. Genome Biol, 2017. 18(1): p. 92.
34. Kim, H. and J.S. Kim, A guide to genome engineering with programmable nucleases. Nat Rev Genet, 2014. 15(5): p. 321-34.
35. Barnum, K.J. and M.J. O'Connell, Cell cycle regulation by checkpoints. Methods Mol Biol, 2014. 1170: p. 29-40.
36. Arroyo, M. and P. Raychaudhuri, Retinoblastoma-repression of E2F-dependent transcription depends on the ability of the retinoblastoma protein to interact with E2F and is abrogated by the adenovirus E1A oncoprotein. Nucleic Acids Res, 1992. 20(22): p. 5947-54.
37. Killander, D. and A. Zetterberg, Quantitative Cytochemical Studies on Interphase Growth. I. Determination of DNA, RNA and Mass Content of Age Determined Mouse Fibroblasts in Vitro and of Intercellular Variation in Generation Time. Exp Cell Res, 1965. 38: p. 272-84.
38. Nishitani, H. and Z. Lygerou, DNA replication licensing. Front Biosci, 2004. 9: p. 2115-32.
39. Patil, M., N. Pabla, and Z. Dong, Checkpoint kinase 1 in DNA damage response and cell cycle regulation. Cell Mol Life Sci, 2013. 70(21): p. 4009-21.
40. McIntosh, J.R. and T. Hays, A Brief History of Research on Mitotic Mechanisms. Biology (Basel), 2016. 5(4).
41. Jiang, H., et al., Phase transition of spindle-associated protein regulate spindle apparatus assembly. Cell, 2015. 163(1): p. 108-22.
42. Bettencourt-Dias, M., Q&A: Who needs a centrosome? BMC Biol, 2013. 11: p. 28.
43. Zierhut, C. and H. Funabiki, Nucleosome functions in spindle assembly and nuclear envelope formation. Bioessays, 2015. 37(10): p. 1074-85.
44. Cooper, G., The Cell: A Molecular Approach. 2nd edition. 2000.
45. Barnes, J.D., et al., Embryonic expression of Lim-1, the mouse homolog of Xenopus Xlim-1, suggests a role in lateral mesoderm differentiation and neurogenesis. Dev Biol, 1994. 161(1): p. 168-78.
46. Zheng, Y.G., et al., Chemical regulation of epigenetic modifications: opportunities for new cancer therapy. Med Res Rev, 2008. 28(5): p. 645-87.
47. Sharma, S., T.K. Kelly, and P.A. Jones, Epigenetics in cancer. Carcinogenesis, 2010. 31(1): p. 27-36.
48. Dickinson, M.E., et al., High-throughput discovery of novel developmental phenotypes. Nature, 2016. 537(7621): p. 508-514.
49. Bird, A., DNA methylation patterns and epigenetic memory. Genes Dev, 2002. 16(1): p. 6-21.
50. Goll, M.G. and T.H. Bestor, Eukaryotic cytosine methyltransferases. Annu Rev Biochem, 2005. 74: p. 481-514.
51. Hagood, J.S., Beyond the genome: epigenetic mechanisms in lung remodeling. Physiology (Bethesda), 2014. 29(3): p. 177-85.
52. Hark, A.T., et al., CTCF mediates methylation-sensitive enhancer-blocking activity at the H19/Igf2 locus. Nature, 2000. 405(6785): p. 486-9.
53. Chen, Z., et al., Epigenetic Regulation: A New Frontier for Biomedical Engineers. Annu Rev Biomed Eng, 2017. 19: p. 195-219.
54. Kouzarides, T., Chromatin modifications and their function. Cell, 2007. 128(4): p. 693-705.
55. Schreiber, S.L. and B.E. Bernstein, Signaling network model of chromatin. Cell, 2002. 111(6): p. 771-8.
56. Li, B., M. Carey, and J.L. Workman, The role of chromatin during transcription. Cell, 2007. 128(4): p. 707-19.


57. Jenuwein, T. and C.D. Allis, Translating the histone code. Science, 2001. 293(5532): p. 1074-80.
58. Kaelin, W.G., Jr. and S.L. McKnight, Influence of metabolism on epigenetics and disease. Cell, 2013. 153(1): p. 56-69.
59. Rea, S., et al., Regulation of chromatin structure by site-specific histone H3 methyltransferases. Nature, 2000. 406(6796): p. 593-9.
60. Jenuwein, T., et al., SET domain proteins modulate chromatin domains in eu- and heterochromatin. Cell Mol Life Sci, 1998. 54(1): p. 80-93.
61. Alam, H., B. Gu, and M.G. Lee, Histone methylation modifiers in cellular signaling pathways. Cell Mol Life Sci, 2015. 72(23): p. 4577-92.
62. Young, L.C., D.W. McDonald, and M.J. Hendzel, Kdm4b histone demethylase is a DNA damage response protein and confers a survival advantage following gamma-irradiation. J Biol Chem, 2013. 288(29): p. 21376-88.
63. Shirai, A., et al., Correction: Impact of nucleic acid and methylated H3K9 binding activities of Suv39h1 on its heterochromatin assembly. Elife, 2017. 6.
64. Shinkai, Y. and M. Tachibana, H3K9 methyltransferase G9a and the related molecule GLP. Genes Dev, 2011. 25(8): p. 781-8.
65. Dodge, J.E., et al., Histone H3-K9 methyltransferase ESET is essential for early development. Mol Cell Biol, 2004. 24(6): p. 2478-86.
66. Fritsch, L., et al., A subset of the histone H3 lysine 9 methyltransferases Suv39h1, G9a, GLP, and SETDB1 participate in a multimeric complex. Mol Cell, 2010. 37(1): p. 46-56.
67. Chang, Y., et al., MPP8 mediates the interactions between DNA methyltransferase Dnmt3a and H3K9 methyltransferase GLP/G9a. Nat Commun, 2011. 2: p. 533.
68. Matzke, M., A.J. Matzke, and J.M. Kooter, RNA: guiding gene silencing. Science, 2001. 293(5532): p. 1080-3.
69. Bhargava, S., et al., The epigenetic modifier Fam208a is required to maintain epiblast cell fitness. Sci Rep, 2017. 7(1): p. 9322.
70. Butler, A.A., W.M. Webb, and F.D. Lubin, Regulatory RNAs and control of epigenetic mechanisms: expectations for cognition and cognitive dysfunction. Epigenomics, 2016. 8(1): p. 135-51.
71. Hanly, D.J., M. Esteller, and M. Berdasco, Interplay between long non-coding RNAs and epigenetic machinery: emerging targets in cancer? Philos Trans R Soc Lond B Biol Sci, 2018. 373(1748).
72. Calo, E. and J. Wysocka, Modification of enhancer chromatin: what, how, and why? Mol Cell, 2013. 49(5): p. 825-37.
73. Probst, A.V., E. Dunleavy, and G. Almouzni, Epigenetic inheritance during the cell cycle. Nat Rev Mol Cell Biol, 2009. 10(3): p. 192-206.
74. Turner, B.M., Cellular memory and the histone code. Cell, 2002. 111(3): p. 285-91.
75. Corpet, A. and G. Almouzni, Making copies of chromatin: the challenge of nucleosomal organization and epigenetic information. Trends Cell Biol, 2009. 19(1): p. 29-41.


76. Jeddeloh, J.A., T.L. Stokes, and E.J. Richards, Maintenance of genomic methylation requires a SWI2/SNF2-like protein. Nat Genet, 1999. 22(1): p. 94-7.
77. Tchasovnikarova, I.A., et al., Epigenetic silencing by the HUSH complex mediates position-effect variegation in human cells. Sciencexpress, 2015.
78. Clement, C. and G. Almouzni, MCM2 binding to histones H3-H4 and ASF1 supports a tetramer-to-dimer model for histone inheritance at the replication fork. Nat Struct Mol Biol, 2015. 22(8): p. 587-9.
79. Groth, A., et al., Chromatin challenges during DNA replication and repair. Cell, 2007. 128(4): p. 721-33.
80. Oomen, M.E. and J. Dekker, Epigenetic characteristics of the mitotic chromosome in 1D and 3D. Crit Rev Biochem Mol Biol, 2017. 52(2): p. 185-204.
81. Naumova, N., et al., Organization of the mitotic chromosome. Science, 2013. 342(6161): p. 948-53.
82. Wang, F. and J.M. Higgins, Histone modifications and mitosis: countermarks, landmarks, and bookmarks. Trends Cell Biol, 2013. 23(4): p. 175-84.
83. Bender, M.A. and D.M. Prescott, DNA synthesis and mitosis in cultures of human peripheral leukocytes. Exp Cell Res, 1962. 27: p. 221-9.
84. Meng, Y., et al., The non-coding RNA composition of the mitotic chromosome by 5'-tag sequencing. Nucleic Acids Res, 2016. 44(10): p. 4934-46.
85. Taneja, N. and S.I.S. Grewal, Shushing histone turnover: It's FUN protecting epigenome-genome. Cell Cycle, 2017. 16(19): p. 1731-1732.
86. Timms, R.T., I.A. Tchasovnikarova, and P.J. Lehner, Position-effect variegation revisited: HUSHing up heterochromatin in human cells. Bioessays, 2016. 38(4): p. 333-43.
87. Leeb, M. and A. Wutz, Establishment of epigenetic patterns in development. Chromosoma, 2012. 121(3): p. 251-62.
88. Kiefer, J.C., Epigenetics in development. Dev Dyn, 2007. 236(4): p. 1144-56.
89. Seki, Y., et al., Extensive and orderly reprogramming of genome-wide chromatin modifications associated with specification and early development of germ cells in mice. Dev Biol, 2005. 278(2): p. 440-58.
90. Bao, J. and M.T. Bedford, Epigenetic regulation of the histone-to-protamine transition during spermiogenesis. Reproduction, 2016. 151(5): p. R55-70.
91. Zhang, B., et al., Allelic reprogramming of the histone modification H3K4me3 in early mammalian development. Nature, 2016. 537(7621): p. 553-557.
92. Hanna, C.W., H. Demond, and G. Kelsey, Epigenetic regulation in development: is the mouse a good model for the human? Hum Reprod Update, 2018. 24(5): p. 556-576.
93. Dahl, J.A., et al., Broad histone H3K4me3 domains in mouse oocytes modulate maternal-to-zygotic transition. Nature, 2016. 537(7621): p. 548-552.
94. Tadros, W. and H.D. Lipshitz, The maternal-to-zygotic transition: a play in two acts. Development, 2009. 136(18): p. 3033-42.


95. Lee, M.T., A.R. Bonneau, and A.J. Giraldez, Zygotic genome activation during the maternal-to-zygotic transition. Annu Rev Cell Dev Biol, 2014. 30: p. 581-613.
96. Heyn, P., et al., The earliest transcribed zygotic genes are short, newly evolved, and different across species. Cell Rep, 2014. 6(2): p. 285-92.
97. Huntzinger, E. and E. Izaurralde, Gene silencing by microRNAs: contributions of translational repression and mRNA decay. Nat Rev Genet, 2011. 12(2): p. 99-110.
98. Yartseva, V. and A.J. Giraldez, The Maternal-to-Zygotic Transition During Vertebrate Development: A Model for Reprogramming. Curr Top Dev Biol, 2015. 113: p. 191-232.
99. Iraqui, I., et al., Recovery of arrested replication forks by homologous recombination is error-prone. PLoS Genet, 2012. 8(10): p. e1002976.
100. Peters, A.C., et al., Mammalian DNA mismatch repair protects cells from UVB-induced DNA damage by facilitating apoptosis and p53 activation. DNA Repair (Amst), 2003. 2(4): p. 427-35.
101. Ferguson, D.O., et al., The nonhomologous end-joining pathway of DNA repair is required for genomic stability and the suppression of translocations. Proc Natl Acad Sci U S A, 2000. 97(12): p. 6630-3.
102. Li, Q., H. Wen, and S. Ao, Identification and cloning of the cDNA of a Rb-associated protein RAP140a. Sci China C Life Sci, 2000. 43(6): p. 637-47.
103. Campbell, P.A., et al., Oct4 targets regulatory nodes to modulate stem cell function. PLoS One, 2007. 2(6): p. e553.
104. Blewitt, M.E., et al., An N-ethyl-N-nitrosourea screen for genes involved in variegation in the mouse. Proc Natl Acad Sci U S A, 2005. 102(21): p. 7629-34.
105. Daxinger, L., et al., An ENU mutagenesis screen identifies novel and known genes involved in epigenetic processes in the mouse. Genome Biol, 2013. 14(9): p. R96.
106. Harten, S.K., et al., The first mouse mutants of D14Abb1e (Fam208a) show that it is critical for early development. Mamm Genome, 2014. 25(7-8): p. 293-303.
107. Tchasovnikarova, I.A., et al., Hyperactivation of HUSH complex function by Charcot-Marie-Tooth disease mutation in MORC2. Nat Genet, 2017. 49(7): p. 1035-1044.
108. Brummelkamp, T.R. and B. van Steensel, GENE REGULATION. A HUSH for transgene expression. Science, 2015. 348(6242): p. 1433-4.
109. Fukuda, K., et al., A CRISPR knockout screen identifies SETDB1-target retroelement silencing factors in embryonic stem cells. Genome Res, 2018. 28(6): p. 846-858.
110. Liu, N., et al., Selective silencing of euchromatic L1s revealed by genome-wide screens for L1 regulators. Nature, 2018. 553(7687): p. 228-232.
111. Robbez-Masson, L., et al., The HUSH complex cooperates with TRIM28 to repress young retrotransposons and new genes. Genome Res, 2018. 28(6): p. 836-845.


112. Yang, B.X., et al., Systematic identification of factors for provirus silencing in embryonic stem cells. Cell, 2015. 163(1): p. 230-45.
113. Chougui, G. and F. Margottin-Goguet, HUSH, a Link Between Intrinsic Immunity and HIV Latency. Front Microbiol, 2019. 10: p. 224.
114. Chougui, G., et al., HIV-2/SIV viral protein X counteracts HUSH repressor complex. Nat Microbiol, 2018. 3(8): p. 891-897.
115. Yurkovetskiy, L., et al., Primate immunodeficiency virus proteins Vpx and Vpr counteract transcriptional repression of proviruses by the HUSH complex. Nat Microbiol, 2018. 3(12): p. 1354-1361.
116. Georgiades, P. and J. Rossant, Ets2 is necessary in trophoblast for normal embryonic anteroposterior axis development. Development, 2006. 133(6): p. 1059-68.
117. Polydorou, C. and P. Georgiades, Ets2-dependent trophoblast signalling is required for gastrulation progression after primitive streak initiation. Nat Commun, 2013. 4: p. 1658.
118. Carvalho, B.S. and R.A. Irizarry, A framework for oligonucleotide microarray preprocessing. Bioinformatics, 2010. 26(19): p. 2363-7.
119. Tian, L., et al., Discovering statistically significant pathways in expression profiling studies. Proc Natl Acad Sci U S A, 2005. 102(38): p. 13544-9.
120. Maciejewski, H., Gene set analysis methods: statistical models and methodological differences. Brief Bioinform, 2014. 15(4): p. 504-18.
121. Masuda, T., M. Tomita, and Y. Ishihama, Phase transfer surfactant-aided trypsin digestion for membrane proteome analysis. J Proteome Res, 2008. 7(2): p. 731-40.
122. Hebert, A.S., et al., The one hour yeast proteome. Mol Cell Proteomics, 2014. 13(1): p. 339-47.
123. Richards, A.L., et al., One-hour proteome analysis in yeast. Nat Protoc, 2015. 10(5): p. 701-14.
124. Tyanova, S., et al., The Perseus computational platform for comprehensive analysis of (prote)omics data. Nat Methods, 2016. 13(9): p. 731-40.
125. Beck, S., et al., Extraembryonic proteases regulate Nodal signalling during gastrulation. Nat Cell Biol, 2002. 4(12): p. 981-5.
126. Herrmann, B.G., Expression pattern of the Brachyury gene in whole-mount TWis/TWis mutant embryos. Development, 1991. 113(3): p. 913-7.
127. Ding, J., et al., Cripto is required for correct orientation of the anterior-posterior axis in the mouse embryo. Nature, 1998. 395(6703): p. 702-7.
128. Yamanaka, Y., et al., Live imaging and genetic analysis of mouse notochord formation reveals regional morphogenetic mechanisms. Dev Cell, 2007. 13(6): p. 884-96.
129. Echelard, Y., et al., Sonic hedgehog, a member of a family of putative signaling molecules, is implicated in the regulation of CNS polarity. Cell, 1993. 75(7): p. 1417-30.


130. Ciruna, B. and J. Rossant, FGF signaling regulates mesoderm cell fate specification and morphogenetic movement at the primitive streak. Dev Cell, 2001. 1(1): p. 37-49.
131. Burdsal, C.A., C.H. Damsky, and R.A. Pedersen, The role of E-cadherin and integrins in mesoderm differentiation and migration at the mammalian primitive streak. Development, 1993. 118(3): p. 829-44.
132. Smith, D.E., F. Franco del Amo, and T. Gridley, Isolation of Sna, a mouse gene homologous to the Drosophila genes snail and escargot: its expression pattern suggests multiple roles during postimplantation development. Development, 1992. 116(4): p. 1033-9.
133. Brennan, J., et al., Nodal signalling in the epiblast patterns the early mouse embryo. Nature, 2001. 411(6840): p. 965-9.
134. Ciruna, B.G. and J. Rossant, Expression of the T-box gene Eomesodermin during early mouse development. Mech Dev, 1999. 81(1-2): p. 199-203.
135. Biben, C., et al., Murine cerberus homologue mCer-1: a candidate anterior patterning molecule. Dev Biol, 1998. 194(2): p. 135-51.
136. Belo, J.A., et al., Cerberus-like is a secreted BMP and nodal antagonist not essential for mouse development. Genesis, 2000. 26(4): p. 265-70.
137. Ang, S.L., et al., Positive and negative signals from mesoderm regulate the expression of mouse Otx2 in ectoderm explants. Development, 1994. 120(10): p. 2979-89.
138. Bouillet, P., et al., Sequence and expression pattern of the Stra7 (Gbx-2) homeobox-containing gene induced by retinoic acid in P19 embryonal carcinoma cells. Dev Dyn, 1995. 204(4): p. 372-82.
139. Tarca, A.L., et al., A novel signaling pathway impact analysis. Bioinformatics, 2009. 25(1): p. 75-82.
140. Wang, Q., et al., The p53 Family Coordinates Wnt and Nodal Inputs in Mesendodermal Differentiation of Embryonic Stem Cells. Cell Stem Cell, 2017. 20(1): p. 70-86.
141. Munoz-Fuentes, V., et al., The International Mouse Phenotyping Consortium (IMPC): a functional catalogue of the mammalian genome that informs conservation. Conserv Genet, 2018. 19(4): p. 995-1005.
142. Robbez-Masson, L., et al., The HUSH complex cooperates with TRIM28 to repress young retrotransposons and new genes. Genome Res, 2018.
143. Abe, K., et al., The first murine zygotic transcription is promiscuous and uncoupled from splicing and 3' processing. EMBO J, 2015. 34(11): p. 1523-37.
144. Karlic, R., et al., Long non-coding RNA exchange during the oocyte-to-embryo transition in mice. DNA Res, 2017. 24(2): p. 219-220.
145. Nieto-Estevez, V., et al., A global transcriptome analysis reveals molecular hallmarks of neural stem cell death, survival, and differentiation in response to partial FGF-2 and EGF deprivation. PLoS One, 2013. 8(1): p. e53594.
146. Saitou, M., S. Kagiwada, and K. Kurimoto, Epigenetic reprogramming in mouse pre-implantation development and primordial germ cells. Development, 2012. 139(1): p. 15-31.


147. Darzynkiewicz, Z., X. Huang, and H. Zhao, Analysis of Cellular DNA Content by Flow Cytometry. Curr Protoc Immunol, 2017. 119: p. 5.7.1-5.7.20.
148. Estandarte, A.K., et al., The use of DAPI fluorescence lifetime imaging for investigating chromatin condensation in human chromosomes. Sci Rep, 2016. 6: p. 31417.
149. Seo, Y., et al., In-Cell RNA Hydrolysis Assay: A Method for the Determination of the RNase Activity of Potential RNases. Mol Biotechnol, 2015. 57(6): p. 506-12.
150. Botti, V., et al., Cellular differentiation state modulates the mRNA export activity of SR proteins. J Cell Biol, 2017. 216(7): p. 1993-2009.
151. Timms, R.T., et al., ATF7IP-Mediated Stabilization of the Histone Methyltransferase SETDB1 Is Essential for Heterochromatin Formation by the HUSH Complex. Cell Rep, 2016. 17(3): p. 653-659.
152. Zhou, X., et al., Nodal is a novel TGF-beta-like gene expressed in the mouse node during gastrulation. Nature, 1993. 361(6412): p. 543-7.
153. Constam, D.B., Running the gauntlet: an overview of the modalities of travel employed by the putative morphogen Nodal. Curr Opin Genet Dev, 2009. 19(4): p. 302-7.
154. Meno, C., et al., Mouse Lefty2 and zebrafish antivin are feedback inhibitors of nodal signaling during vertebrate gastrulation. Mol Cell, 1999. 4(3): p. 287-98.
155. Stuckey, D.W., et al., Coordination of cell proliferation and anterior-posterior axis establishment in the mouse embryo. Development, 2011. 138(8): p. 1521-30.
156. Lewis, N.E. and J. Rossant, Mechanism of size regulation in mouse embryo aggregates. J Embryol Exp Morphol, 1982. 72: p. 169-81.
157. Power, M.A. and P.P. Tam, Onset of gastrulation, morphogenesis and somitogenesis in mouse embryos displaying compensatory growth. Anat Embryol (Berl), 1993. 187(5): p. 493-504.
158. Bergmann, J.H., et al., Regulation of the ESC transcriptome by nuclear long noncoding RNAs. Genome Res, 2015. 25(9): p. 1336-46.
159. Holubcova, Z., et al., Human oocytes. Error-prone chromosome-mediated spindle assembly favors chromosome segregation defects in human oocytes. Science, 2015. 348(6239): p. 1143-7.
160. Bhargava, S., et al., Author Correction: The epigenetic modifier Fam208a is required to maintain epiblast cell fitness. Sci Rep, 2018. 8(1): p. 5762.
161. Kim, K.H. and K.A. Lee, Maternal effect genes: Findings and effects on mouse embryo development. Clin Exp Reprod Med, 2014. 41(2): p. 47-61.
162. Gasca, S., et al., Identifying new human oocyte marker genes: a microarray approach. Reprod Biomed Online, 2007. 14(2): p. 175-83.
163. Du, Q. and I.G. Macara, Mammalian Pins is a conformational switch that links NuMA to heterotrimeric G proteins. Cell, 2004. 119(4): p. 503-16.
164. Kschonsak, Y.T. and I. Hoffmann, Activated ezrin controls MISP levels to ensure correct NuMA polarization and spindle orientation. J Cell Sci, 2018. 131(10).


165. Smith, T.C., Z. Fang, and E.J. Luna, Novel interactors and a role for supervillin in early cytokinesis. Cytoskeleton (Hoboken), 2010. 67(6): p. 346-64.
166. Bizzotto, S., et al., Eml1 loss impairs apical progenitor spindle length and soma shape in the developing cerebral cortex. Sci Rep, 2017. 7(1): p. 17308.
167. Wang, Y., et al., Exit from exit: resetting the cell cycle through Amn1 inhibition of G protein signaling. Cell, 2003. 112(5): p. 697-709.
168. Verdaasdonk, J.S. and K. Bloom, Centromeres: unique chromatin structures that drive chromosome segregation. Nat Rev Mol Cell Biol, 2011. 12(5): p. 320-32.
169. Ajduk, A. and M. Zernicka-Goetz, Polarity and cell division orientation in the cleavage embryo: from worm to human. Mol Hum Reprod, 2016. 22(10): p. 691-703.
170. O'Farrell, P.H., J. Stumpff, and T.T. Su, Embryonic cleavage cycles: how is a mouse like a fly? Curr Biol, 2004. 14(1): p. R35-45.
171. Hirasawa, R., et al., Maternal and zygotic Dnmt1 are necessary and sufficient for the maintenance of DNA methylation imprints during preimplantation development. Genes Dev, 2008. 22(12): p. 1607-16.
172. Rowe, H.M., et al., De novo DNA methylation of endogenous retroviruses is shaped by KRAB-ZFPs/KAP1 and ESET. Development, 2013. 140(3): p. 519-29.


Supplementary tables

Table S1

Oligo Name    Sequence
mCdkn1a-F     AACATCTCAGGGCCGAAA
mCdkn1a-R     TGCGCTTGGAGTGATAGAAA
mCcng1-F      TGGACAGATTCTTGTCTAAAATGAAG
mCcng1-R      CAGTGGGACATTCCTTTCCTC
mDkk1-F       CCGGGAACTACTGCAAAAAT
mDkk1-R       CCAAGGTTTTCAATGATGCTT
mWnt3-F       GATGTGGAGGCAGGTCTCTT
mWnt3-R       CAGAGCAGCCCATTCTTTCT
mEomes-F      AGCAGCCCAGAGGGTTAAA
mEomes-R      TGAAGAGCCCACTGTTAACTCA
mOct4-F       AATGCCGTGAAGTTGGAGAA
mOct4-R       CCTTCTGCAGGGCTTTCAT
mPlatr4-F     TGTGAGAATCAGGGAAAGTGG
mPlatr4-R     TGAGTGCTGAGTTGCAGGTT
mPlatr20-F    CGGGAAAGCAGAGTGCTG
mPlatr20-R    TTGCCTTGTTTTTCAAATAGTACCT
mPlatr27-F    GACTCAGCTGGGTTCCAGAG
mPlatr27-R    CTGGCTCTTCAAGTCTTCTGC
mGAPDH        GGGTTCCTATAAATACGGACTGC
mGAPDH        CCATTTTGTCTACGGGACGA

All real-time PCR primer pairs were designed with a Tm of approximately 60 °C using the Universal Probe Library Tool on the Roche website (https://lifescience.roche.com/en_cz/brands/universal-probe-library.html#assay-design-center).


Table S2

Yeast Two Hybrid

Name      Plasmid   Res.E.   Sequence
N'terF    pGBKT7    XmaI     AACCCGGGGACTGCCGCGGAGACG
N'terR    pGBKT7    SalI     AAAGTCGACAAGCCAATGGACTGTGGAGA
C'terF    pGBKT7    EcoRI    CAATGTAGAAAAGAATTCAAAACTAT
C'terR    pGBKT7    SalI     CAAGTCGACTTAATGGAGATTTCTCTGTACCC

Crispr/Cas9 cell line

Name             Sequence
F2F              CACCGTTGCAGCCTTTATGAAGTTG
F2R              AAACCAACTTCATAAAGGCTGCAAC
F3F              CACCGGTTTCCTTATAAAACAGTGC
F3R              AAACGCACTGTTTTATAAGGAAACC
M82bR            CACCGTGATGCTTGCCGCCGCCGGA
M82bR            AAACTCCGGCGGCGGCAAGCATCAC
VerifyFam2+3F    GGTTGGAAATATTGCCTGGCT
VerifyFam2+3R    CAGCAACAGACAGACACCTCA
VerifyM8a2abF    GGCAGGGTTACCACAAACCT
VerifyM8a2abR    GTCCATCGGCAGGAATACCA

Crispr/Cas9 mice

Name             Sequence
Genotyping F     TCAGAGCAGACCGATCACAC
Genotyping R1    GAAACGCTTCAAACCTGAGC
Genotyping R2    GCACTCCAGCCACAGAGAC
Fam208a 5c+T7    TAATACGACTCACTATAGGG ATGGCGTCGACGCTTTCCC
Fam208a 3b+T7    TAATACGACTCACTATAGGG ACCCTATCCTCTCGCACCAA
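The CRISPR/Cas9 oligo pairs in Table S2 (F2F/F2R, F3F/F3R and the two M82b oligos) appear to follow a common layout for annealed guide oligos: the top strand reads CACCG + guide, and the bottom strand reads AAAC + reverse complement of the guide + C. That layout is an assumption inferred from the sequences themselves, not something stated in the table. The sketch below, with hypothetical helper names, checks whether a pair fits this layout and recovers the 20-nt guide.

```python
from typing import Optional

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def guide_from_pair(top: str, bottom: str) -> Optional[str]:
    """Return the guide if the pair fits the assumed CACCG / AAAC...C layout."""
    if not top.startswith("CACCG"):
        return None
    if not (bottom.startswith("AAAC") and bottom.endswith("C")):
        return None
    guide = top[5:]        # strip the CACC overhang and the extra G
    insert = bottom[4:-1]  # strip the AAAC overhang and the trailing C
    return guide if insert == reverse_complement(guide) else None


# Oligo pairs copied from Table S2.
pairs = {
    "F2": ("CACCGTTGCAGCCTTTATGAAGTTG", "AAACCAACTTCATAAAGGCTGCAAC"),
    "F3": ("CACCGGTTTCCTTATAAAACAGTGC", "AAACGCACTGTTTTATAAGGAAACC"),
    "M82b": ("CACCGTGATGCTTGCCGCCGCCGGA", "AAACTCCGGCGGCGGCAAGCATCAC"),
}

for name, (top, bottom) in pairs.items():
    print(name, guide_from_pair(top, bottom))
```

All three pairs pass this check, which is consistent with the two oligos labelled M82bR forming a forward/reverse pair.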


Table S3

ISH probes

Gene        Forward                   Reverse                     Product (bp)
Gpsm2       CACGAGCAGCGTCTCCTAAT      TTGTCATCAGGCGCTCAAGT        1090
Alb         AGATGACAGGGCGGAACTTG      GGTTTGGACCCTCAGTCGAG        962
Itgb3bp     GGAGCACAGAAACGGACCAT      AGGAATTCAAAGCTGTCAAGATGA    354
Etfa        AGTAGCTGGCGTAGCAAAGG      GCCACCTGGAAAATTGGAGC        726
Psmd8       GAACCGGAAGAACCCGAACC      TGGCCAGTTCAGTAGAGGGG        1367
Inpp5a      CCTGCTGGTCACGGCCAA        TTCCGGTCATTGTCTGACTCC       792
Hcfc2       ATGGTTCTGGTGTTGTGGGC      GGAAGTGTTGGGTGCCAATG        908
Parpbp      CACATGCCAGAGTCACCAGT      AGTGAAGAGCAGACAAGGGC        928
Cntn1       GATCCTGCCTTGGACCTCAC      CATCACTGGAAGGTCCGCAT        1099
Amn1        AGCATTCGGGGTCGGATAAC      TGTTTGGCACATGGTCCACT        452
Svil        GGCCTTTGGTAGAGCACAGT      GCTGCATATCTGGTCTGGCT        984
Eml1        ACATCACGGAGGAGCAACAG      CCGCAAATACCGCTTCGTTC        978
Ncbp1       CCGACAAACCACATCCACAG      GTGCAGCTTCCCCTTATCACTC      1100
Mphosph8    GCTTGAAAGCACGAATGCCT      TGTTGCAGTCAGCTCCACAT        1043

Biomark primers

Gene        Forward                        Reverse                       Product (bp)
Gpsm2       TCTTCGACATCCTTGTAAAGTGC        GGACAGTCGGCCCCTTAG            90
Alb         GACTTTGCACAGTTCCTGGAT          TGCATCTAGTGACAAGGTTTGG        91
Itgb3bp     GGAACTTATCAGTTGAGCCCATT        AGTTACTCCGTTTCCTTGTTTCA       97
Etfa        GGAGCGTCTGCTTTTGGA             TGATGTCAGAAACTGGAGCAA         76
Psmd8       TCTACATCAAACACCCTGTTTCC        CTTTCGGCAGGGATGTTC            94
Inpp5a      ATTCGGACACTTTGGAGAGC           CCTTTTCTTGACCATTTGCAC         88
Hcfc2       CGTACCAAGCTACATCGTCTGA         CCTTGTCTGTGAGGGTCCA           76
Parpbp      CCAACAACATCAGTCCTGTCC          GACCAGCAAAATTTTCACAGC         92
Cntn1       AACAAGGAAATTACGCATATCCA        CATTTCGGATGAGCAGTTCC          76
Amn1        GGATAACAGATTCCAATATAAGTGAGG    GCTGGAGAGCCACATCTGA           93
Svil        AGGACCGTTCACACACACAG           AGGAAGTCTCGCCGTTACAG          113
Eml1        CATCTCCCCCACCATGTC             CGATGCGGTCTGACACCT            89
Ncbp1       CTACACTGCTAATCGAACTGTGC        GCATGTACAGCATCTCAGTCG         84
Mphosph8    GGGGAGGACGTTTTCGAG             GATGTATATCCTTTCCATCGAACTTT    92
Tmem100     TTGCTGCTGTCTCAGTCCAC           AAAGAGCCTGTCACCCACTG          88


Table S4

Primary antibodies

Antibody Cat. No. Dilution

anti-Ki67                          Abcam #ab15580             1:250
anti-phospho-Histone H3 (Ser28)    Cell Signaling #9713       1:250
anti-Cleaved Caspase3              Cell Signaling #9664       1:250
anti-p53                           Novocastra #NCL-p53-CM5p   1:250
anti-Oct4                          Santa Cruz #sc-5279        1:100
anti-Snail                         Cell Signaling #C15D3      1:250
anti-E-Cadherin                    Sigma Aldrich #U3254       1:500

Secondary antibodies

Antibody Cat. No. Dilution

Alexa Fluor 488-conjugated goat anti-rabbit    Invitrogen #A-11034     1:500
Alexa Fluor 594-conjugated goat anti-rabbit    Invitrogen #A-11012     1:500
Alexa Fluor 594-conjugated goat anti-rat       Invitrogen #A-11007     1:500
Alexa Fluor 488-conjugated goat anti-mouse     Jackson #715-545-150    1:500


Table S5

Homozygotes (Fam208aD6/D6)

Pathway_Name         Size   tstat.mean   Comp.Probability   NEk    NEk.p.val   NEk.q.val
p53/p73 (a)          23     4.505        <0.001             5.78   7.69E-09    <0.001
Platr (d)            20     6.389        <0.001             3.09   2.00E-03    0.003
KRAB (b)             132    0.479        0.001              1.71   8.70E-02    0.071
Oct4coexpress (c)    423    2.390        0.005              2.75   6.00E-03    0.006

Heterozygotes (Fam208a+/D6)

Pathway_Name         Size   tstat.mean   Comp.Probability   NEk     NEk.p.val   NEk.q.val
p53/p73 (a)          23     0.717        0.005              2.18    2.90E-02    0.094
Platr (d)            20     0.590        0.044              1.36    1.74E-01    0.347
Oct4coexpress (c)    423    0.138        0.242              0.60    5.51E-01    0.551
KRAB (b)             132    -0.009       0.957              -0.16   8.70E-01    0.938

(a) The p53-bound (50 kb from TSS) subset of genes significantly upregulated in response to combined p53/p73 depletion in mouse embryoid bodies (Wang et al, 2016).
(b) Gene set defined according to InterPro ID (IPR001909: Krueppel-associated box).
(c) Pluripotency-associated Oct4 co-expression module (Bergmann et al, 2015).
(d) Pluripotency-associated long non-coding transcripts (Bergmann et al, 2015).


List of appendices

• Article n.1
  o The epigenetic modifier Fam208a is required to maintain epiblast cell fitness
    Shohag Bhargava, Brian Cox, Christiana Polydorou, Veronika Gresakova, Vladimir Korinek, Hynek Strnad, Radislav Sedlacek, Trevor Allan Epp & Kallayanee Chawengsaksophak
    Scientific Reports, 7: 9322 | DOI:10.1038/s41598-017-09490-w, ISSN 2045-2322

• Article n.2
  o Cesty mutageneze
    Veronika Grešáková
    ŽIVA, 2/2017, XLIV-XLVI, ISSN 0044-4812

• Article n.3
  o Dual role of Fam208a in maintenance of genome stability in mammals
    Veronika Gresakova, Vendula Novosadova, Michaela Prochazkova, Shohag Bhargava, Irena Jenickova, Jan Prochazka and Radislav Sedlacek
    Experimental Cell Research, ISSN 0014-4827, submitted

• Abstract n.1
  o D14Abb1e – Tracking Down A Putative Suppressor Of Variegation
    Veronika Grešáková, Slavomir Kinsky, Radislav Sedláček and Trevor Epp
    XIV Mezioborové setkání mladých biologů, biochemiků a chemiků, Chemické listy, Vol 108 No 5 (2014), ISSN 1213-7103

• Abstract n.2
  o From ENU mutagenesis to knock-in reporter fusion alleles: using genetic technologies to study the role of Fam208a (ID 50)
    Veronika Grešáková, Shohag Bhattacharyya, Björn Schuster, Inken M. Beck, Radislav Sedláček, Kallayanee Chawengsaksophak and Trevor A. Epp
    13th Transgenic Technology Meeting, March 20-23, 2016, Prague. Transgenic Res (2016) 25: 195. DOI: 10.1007/s11248-016-9936-6, ISSN 1573-9368


www.nature.com/scientificreports

Correction: Author Correction

The epigenetic modifier Fam208a is required to maintain epiblast cell fitness

Shohag Bhargava1,6, Brian Cox5, Christiana Polydorou1, Veronika Gresakova1, Vladimir Korinek3, Hynek Strnad4, Radislav Sedlacek1,2, Trevor Allan Epp1,2 & Kallayanee Chawengsaksophak1,2

Received: 15 March 2017
Accepted: 26 July 2017
Published: xx xx xxxx

Gastrulation initiates with the formation of the primitive streak, during which cells of the epiblast delaminate to form the mesoderm and definitive endoderm. At this stage, the pluripotent cell population of the epiblast undergoes very rapid proliferation and extensive epigenetic programming. Here we show that Fam208a, a new epigenetic modifier, is essential for early post-implantation development. We show that Fam208a mutation leads to impaired primitive streak elongation and delayed epithelial-to-mesenchymal transition. Fam208a mutant epiblasts had increased expression of p53 pathway genes as well as several pluripotency-associated long non-coding RNAs. Fam208a mutants exhibited an increase in p53-driven apoptosis and complete removal of p53 could partially rescue their gastrulation block. This data demonstrates a new in vivo function of Fam208a in maintaining epiblast fitness, establishing it as an important factor at the onset of gastrulation when cells are exiting pluripotency.

Gastrulation is a critical developmental process whereby the three germ layers (ectoderm, mesoderm and definitive endoderm; DE) are specified. Immediately post implantation and prior to gastrulation (E5.5 to E6.5), the mouse embryo dramatically changes in size and shape. The embryonic epiblast shows the highest proliferation rate (2–8 hours1) in order to attain a critical cell number threshold2,3. Formation of the primitive streak (PS) at the posterior side of the embryo at E6.5 is the hallmark of gastrulation, and coincides with the completion of distal visceral endoderm (DVE) migration to the anterior side of the embryo to form the anterior visceral endoderm (AVE)4. As gastrulation progresses, the epiblast cells undergo an epithelial to mesenchymal transition (EMT) at the PS, giving rise to mesoderm and DE. Epiblast cells that do not ingress through the PS remain in the epiblast and give rise to ectodermal lineages such as the neurectoderm5. Gastrulation is also a period of dynamic epigenetic change, involving many different known epigenetic silencing factors, and likely others that are still to be discovered. Several epigenetic silencing factors have been discovered in a dominant ENU mutagenesis screen in the mouse for modifiers of transgene variegation6,7. These were designated as modifiers of murine metastable epialleles or Momme. One group of genes identified in this screen are specifically involved in writing or reading repressive H3K9me3 marks; these are MommeD9 (Trim28/Kap1), MommeD13 (Setdb1/Eset), MommeD33 (Suv39h1), and MommeD44 (Trim33/ectodermin)7–9. A new member to this list is Fam208a (MommeD6 and MommeD20)10, which in human has recently been shown to be a core factor of a new epigenetic silencing complex comprising FAM208A, MPHOSPH8, PPHLN and SETDB111. MPHOSPH8 through its chromodomain specifically binds H3K9me3, and SETDB1 catalyzes trimethylation of adjacent K9 residues. This complex, termed HUSH (human silencing hub), has been proposed to be important for heterochromatin spreading, as opposed to TRIM28-SETDB1 complexes, which may be more important for de novo trimethylation when recruited to specific genomic sequences by members of the KRAB-zinc finger protein family12.

1Laboratory of Transgenic Models of Diseases, Division BIOCEV, Institute of Molecular Genetics of the CAS, v.v.i., Vestec, Czech Republic.
2Czech Centre for Phenogenomics, Division BIOCEV, Institute of Molecular Genetics of the CAS, v.v.i., Vestec, Czech Republic.
3Laboratory of Cell and Developmental Biology, Institute of Molecular Genetics of the CAS, v.v.i., Krc, Czech Republic.
4Laboratory of Genomics and Bioinformatics, Institute of Molecular Genetics of the CAS, v.v.i., Krc, Czech Republic.
5Department of Physiology, Faculty of Medicine, University of Toronto, Ontario, Canada.
6Faculty of Science, Charles University, Prague, Czech Republic.
Correspondence and requests for materials should be addressed to T.A.E. (email: [email protected]) or K.C. (email: [email protected])

SCIEnTIFIC REports | 7: 9322 | DOI:10.1038/s41598-017-09490-w 1 www.nature.com/scientificreports/

H3K9me3 is associated with tightly packed constitutive heterochromatin, typically found at pericentromeric and subtelomeric repeats, whereas facultative heterochromatin, typically found in silenced gene-encoding regions, is associated with H3K9me213. More recently it has been found that, in embryonic stem cells, H3K9me3 also marks the poised state of master regulators of differentiation, allowing them to be acutely activated following inductive nodal-activin signalling14. These poised states are established by the action of Oct4, Sox2 and Nanog, which recruit Setdb1 to deposit the H3K9me3 mark15. Loss of function mutations in mice of the above-mentioned H3K9me3-related genes, identified as modifiers of transgene variegation in the mouse, have been independently studied in an embryological context. All, except for the X-linked Suv39h1, result in early embryonic lethality; Setdb1 null mice are lethal at the peri-implantation stage (E3.5–E5.5)16 while both Trim33 and Trim28 null mice fail to undergo gastrulation17,18. Previously, we reported that MommeD6 and MommeD20 homozygotes also die during the gastrulation stage. Here, we examine the mutant phenotype in more detail, characterizing their involvement in central morphogenetic events that occur during this stage, namely the establishment of anterior-posterior (A-P) patterning and EMT.

Results

Fam208a is widely expressed during early post implantation development. To investigate the role of Fam208a during post-implantation development, we first analysed its mRNA expression profile at embryonic stages preceding (E5.5), during (~E6.25 to 7.75) and following (E8.5) gastrulation. At E5.5 (egg cylinder; EC), Fam208a is specifically expressed only in the epiblast. At E6.5 (pre-streak; Pr-S), Fam208a expression extends into the extraembryonic ectoderm (ExE) and one day later, at E7.5 (early headfold; EHF), the expression is observed in embryonic ectoderm, allantois, amnion and chorion. From E8.5 to 9.5, Fam208a is ubiquitously expressed in the developing mouse embryo (Supplementary Fig. 1).

Fam208a mutation leads to defective primitive streak elongation. From E6.5 (early streak; ES) onwards, Fam208aD6/D6 embryos were increasingly growth retarded. At later stages, the embryonic region became increasingly delayed while extraembryonic tissues continued to develop. At E7.5, we observed the expansion of the exocoelomic cavity with a small amniotic cavity which appears to form by the abutting of ExE onto itself, a lack of an amnion and an allantoic bud that was severely restricted in size (Fig. 1). The disparity between embryonic growth impairment and the relatively more advanced development of extraembryonic structures was consistent in both Fam208aD6/D6 and Fam208aD20/D20 mutants and therefore, we focused our subsequent studies on one of the mutant alleles, Fam208aD6/D6.

We first investigated the ExE development in Fam208aD6/D6 mutant embryos, by examining the expression of key marker genes such as Cdx2, Elf5, Spc4, and Bmp4, which have been shown to be important in ExE development and maintenance at the ES stage (E6.5)19–22. We observed comparable expression of Elf5 (n = 4), Spc4 (n = 2), Cdx2 (n = 3) and Bmp4 (n = 4) between Fam208aD6/D6 mutants and their littermate controls, indicating that there is no major defect in ExE specification at E6.5 (Supplementary Fig. 2).

Gastrulation begins with the formation of the PS at E6.5. Brachyury (T) expression is widely used to mark the PS and axial mesoderm that migrates out of the PS but not the mesodermal wing23. While the expression of T in the E7.5 (late pre-headfold; LPHF) wild type embryo extends past the distal tip of the embryo and into the notochord precursor that extends anteriorly to the node (Fig. 1A), T expression in Fam208aD6/D6 embryos is restricted to the posterior of the embryo, extending distally about one-third of the length of the epiblast and never reaching the distal tip. This suggests that while gastrulation is initiated, there is a failure to elongate the PS (Fam208a;D6/D6 n = 7; Fig. 1A’ and Fam208a;D20/D20 n = 3; Supplementary Fig. 3B). This was confirmed by examining Cripto expression. At the onset of gastrulation, Cripto is expressed in the PS and later, at E7.5, is also expressed in the mesodermal wing that extends rostrally24. Cripto seemed to be correctly expressed in Fam208aD6/D6 mutant embryos when compared to its wildtype littermate at E6.5 (n = 2, Supplementary Fig. 6). Further, Cripto expression in E7.5 Fam208aD6/D6 embryos which morphologically resembled E7.0 (midstreak; MS) is observed in the PS, but again the expression domain does not extend to the distal tip. Cripto expression extends further laterally and distally than that of T, marking also the migratory mesoderm that overlays the PS (n = 3; Fig. 1B’).

Next, we checked if there is correct specification of anterior mesendoderm (AME) and its main derivative the node, the organizer of the mouse gastrula. At E7.5, Noto, a marker of the node, is normally confined to the distal tip of the EC25,26 (n = 3; Fig. 1C). In Fam208aD6/D6 mutant embryos, Noto is absent (n = 3; Fig. 1C’), although expression does appear at E8.5–E9.5 (n = 2; Supplementary Fig. 4B’ and n = 3; Supplementary Fig. 4D’ respectively). Like Noto, Nodal expression is similarly confined to the node in wild-type littermate embryos at the E7.75 (late headfold; LHF) stage (n = 5; Fig. 1D), but in mutant littermates, Nodal expression reflects an earlier developmental stage, being strongly expressed in both the anterior and posterior proximal epiblast, with expression gradually reducing towards the distal epiblast and with no discernible presumptive node (Fig. 1D’).

We sought to confirm the lack of formation of the node and anterior PS derivatives in Fam208aD6/D6 mutant embryos by studying the expression of Foxa2 and Shh during the late streak (LS) to EHF stage. While Foxa2 and Shh share expression in the anterior definitive endoderm (ADE) and axial mesoderm27–29, Foxa2 expression begins almost 24 hours prior to Shh27,30 at the ES stage, where it is localized to the posterior epiblast31 and delaminating mesoderm in the anterior PS. We observed absence of Foxa2 expression (2/3) to very faint expression (1/3) in Fam208aD6/D6 mutant embryos when compared to its wildtype littermate at E6.5 (n = 3, Supplementary Fig. 6). Shh expression is first detected at the early allantoic bud (EB) stage in the midline mesoderm of the head process. Later, at the late streak early allantoic bud (LSEB) stage, Shh expression is initiated in the node, the notochord, and later in the DE27, overlapping expression domains with Foxa2. In the littermate controls at E7.5 (EHF), Foxa2 expression is consistent with this stage of development, being in the node, ADE and axial mesoderm. In Fam208aD6/D6 mutants however, Foxa2 expression is delayed and is seen in the posterior epiblast and mesoderm similar to that of ES stage (n = 4, Fig. 1E’). This developmental delay is further confirmed by the complete absence of Shh at E7.5 (n = 5; Fig. 1F’) and E8.5 (n = 2; Supplementary Fig. 3C’) while Foxa2 was expressed in the anterior midline of both Fam208aD6/D6 as well as littermate controls at E7.5 (EHF). There is no change in Foxa2 expression in Fam208aD6/D6 embryos at E8.5, when the epiblast appeared as an EC with no headfold initiation, yet with a discernible allantoic bud (Fam208aD6/D6, n = 2; Supplementary Fig. 3D’ and Fam208a;D20/D20 n = 2; Supplementary Fig. 3D”). Despite defective elongation and inability to give rise to notochordal cells, we detect Noto positive cells, an indicator of node activity, in Fam208a mutants at E8.5 (Supplementary Fig. 4B’). Collectively, we conclude that embryos with Fam208a mutation can develop with no overt morphological changes to the ES stage and can initiate gastrulation, after which development becomes increasingly delayed and fails to progress beyond the EHF stage.

Figure 1. Fam208aD6/D6 mutants exhibit gastrulation failure defects. Whole mount in situ hybridization at E7.5-E7.75 of Fam208aD6/D6 mutants (A’–F’) and their wild-type littermate controls (A–F). The Fam208aD6/D6 mutant embryos at E7.5 are phenotypically distinguishable with severely retarded epiblast. (A’) In mutants, PS initiates but remains hardly 1/3rd in its length with no distal and anterior expression as seen by Brachyury expression (pan-mesodermal marker). (B’) Cripto, a PS and nascent mesoderm marker, is expressed slightly delayed in mutants. Together, they show arrested PS elongation. (C’, D’ and F’) The expression of Noto, Nodal and Shh is undetectable in the node of the Fam208aD6/D6 mutant embryos with (G’) reduced anterior expression of AME marker Foxa2. Line indicates the length of the PS. Dashed line in black demarcates the length of the PS and the blue dashed line indicates the node and head process. Scale bar: 30 µm. PS, primitive streak; LPHF, Late pre-head fold; LSEB, Late streak, early allantoic bud; LS, Late-streak; EPHF, Early pre-head fold; LHF, Late Head fold; EHF, Early head fold; al, Allantois. Also, see Supplementary Figs 1–2.

Fam208a is important for epithelial-to-mesenchymal transition at the onset of gastrulation. During gastrulation, epiblast cells undergo EMT, migrate and ingress through the PS and later emerge as differentiated cells to form the mesoderm, a new layer between the epiblast and the overlying visceral endoderm (VE)32,33. We investigated the expression of two markers of EMT, namely E-cadherin and Snail. Prior to gastrulation, E-cadherin, encoded by the Cdh1 gene, is expressed in the epiblast and endoderm34 and is downregulated in those epiblast progenitor cells that delaminate and undergo EMT at the PS, and is no longer expressed in the nascent mesoderm33. In contrast, Snail is first detected within the PS and in the migratory mesodermal wings35. Snail is a transcriptional repressor that acts downstream of Fgf signaling to repress Cdh1 gene expression33,36,37.
In Fam208aD6/D6 mutant embryos, we analysed the protein level of E-Cadherin and Snail by whole-mount immunostaining, both at the onset (E6.5; ES) and during the progression of gastrulation (E7.5, Early pre-head fold, EPHF). At E6.5 (ES), there are either no (n = 1/3) or just a couple of Snail-expressing epiblast cells (n = 2/3) with no significant reduction in E-cadherin expression (Fig. 2A’–D’) when compared to wildtype littermates (Fig. 2A–D). In Fam208aD6/D6 mutants at E7.5, which morphologically resemble E7 (MS), there were an increased number of Snail-expressing cells marking the nascent PS, which remained at the MS stage and never extended to the distal tip, as observed in control littermates (n = 5/5; Fig. 2F’–H’). Therefore, we conclude that Fam208aD6/D6 mutants can initiate EMT but are unable to sustain progression, leading to a shortened PS.

Figure 2. Fam208aD6/D6 mutants exhibit significantly delayed epithelial-to-mesenchymal transition during gastrulation. Whole mount immunofluorescence of mutant Fam208aD6/D6 embryos (E6.5: A’–D’ and E7.5: E’–H’) and their wildtype littermates (E6.5: A–D and E7.5: E–H). (D’) Confocal images show only a very few Snail-expressing (mesodermal marker, green) cells within the PS with failure to down-regulate E-cadherin (epiblast and endodermal marker, red) at E6.5. Snail expression increases along the elongated PS by E7.5 but gets arrested halfway. The boxed region to the bottom left is of 4-fold magnification. Scale bar: 30 µm. ES, Early streak; EPHF, Early pre-head fold; MS, Mid-streak; PS, Primitive streak; Al, allantois.

Fam208a mutant embryos exhibit an alteration in anterior-posterior patterning. Because of the delay in gastrulation progression in Fam208a mutants (both Fam208aD6/D6 and Fam208aD20/D20), we sought to analyse several regulatory genes expressed within the Fam208aD6/D6 epiblast from E6.5–7.5. First, we examined the expression of Nodal. In normal embryos, at E5.5 (EC stage), Nodal is expressed throughout the VE and epiblast38. As gastrulation progresses, Nodal is rapidly downregulated in the VE and anterior ectoderm and becomes concentrated in the posterior ectoderm, which is indeed observed in our E6.5 (ES) littermate embryos (Fig. 3A). In contrast, in E6.5-E6.75 Fam208aD6/D6 embryos, although downregulated in the VE (arrowhead), Nodal remains expressed in the anterior epiblast (n = 4; Fig. 3A’,B’). Because of this failure to downregulate anterior Nodal expression in Fam208aD6/D6 embryos, we investigated expression of Wnt3 and Eomesodermin (Eomes), two downstream targets of Nodal signaling38. In E6.5 (ES) Fam208aD6/D6 mutant embryos, Wnt3 is localized to the posterior epiblast, albeit with reduced expression and extending more anteriorly (arrowhead Fig. 3D’) to that of littermate controls (n = 2; Fig. 3C’). By E6.75 (n = 2; Fig. 3D’), Wnt3 expression in Fam208aD6/D6 embryos has increased, but its expression domain extends only one third the length of the epiblast, instead of the two-thirds seen in the control littermates, and remains unchanged at E7.5 (n = 2; Fig. 3E’).

Next, we studied the expression of Eomes because it is induced by Nodal in the ExE and in the posterior epiblast prior to PS formation38 and because its function is crucial for mesoderm formation39. During gastrulation, Eomes is expressed in the PS and nascent mesoderm, and later becomes confined to the anterior PS, where it abruptly disappears prior to node formation40–42. In Fam208aD6/D6 mutants at E6.5 (ES), Eomes is significantly down-regulated in both extra embryonic tissues and in the posterior epiblast (n = 5; Fig. 3F’). At E7.5, Fam208aD6/D6 embryos morphologically resemble E6.5 (LSEB) and Eomes expression slightly increases, but still remains restricted to the ExE and the PS (n = 2; Fig. 3G’). In contrast, in the littermate control at E7.5 (EPHF), extraembryonic expression is seen only in the chorion as embryonic expression has already disappeared.

To evaluate whether failure to downregulate Nodal expression in Fam208aD6/D6 mutants is simply due to a developmental delay or whether lack of Fam208a alters the regulatory network of Nodal signalling, we examined the expression profile of several known genes involved in A-P patterning. Cer1, a Nodal antagonist, is an important marker of the AVE43. In the ES stage, Cer1 expression is detected in the AVE extending towards the embryonic distal tip. By MS stage, Cer1 is detected in the VE and in DE emerging from the node. Cer1 expression in DE later disappears, remaining only in the midline (AME) underlying the future head formation44–46. In E6.5 (ES) Fam208aD6/D6 mutants Cer1 is correctly expressed in the AVE and is comparable to littermate controls (n = 3; Fig. 3H’). Strikingly, the Cer1-expression in E7.5 Fam208aD6/D6 embryos, which morphologically resembled the LSEB stage, either was absent (n = 3/7) or was confined to the endoderm at the distal tip, potentially marking the precursors of DE (n = 4/7; Fig. 3I’). We hypothesized that Cer1 positive cells in Fam208aD6/D6 embryos at the ESEB stage were endodermal cells and not AME cells because of the lack of Foxa2 (another AME marker) expression in Fam208aD6/D6 embryos (Fig. 1E’). To further confirm the lack of AME, we studied the expression of Lim1. In ES-LS stages, Lim1 expression is confined to the AVE and to the PS. By LS to EHF (E7.5–7.75), Lim1 is expressed in the mesendoderm (node and notochord) and lateral plate mesoderm47,48. Lim1 expression in Fam208aD6/D6 mutants at E7.5 (LSEB) showed patchy expression in the VE and in the PS, similar to the Lim1 expression pattern reported for normal embryos at ES stage (n = 4; Fig. 3J’).

Otx2, an anterior forebrain/midbrain marker, is widely expressed in the epiblast from the PrS-ES stage. During MS-LS, Otx2 expression is progressively reduced in the posterior epiblast and then becomes limited to the anterior half of the embryo49,50. At E6.5, Otx2 is robustly expressed within the entire epiblast of Fam208aD6/D6 embryos, while in littermate controls, its expression domain has already shifted anteriorly (n = 2; Supplementary Fig. 6). Otx2 expression in E7.5 Fam208aD6/D6 mutants (LS) still remains located throughout the epiblast, and is reminiscent of that of normal embryos at the PrS-ES stage (n = 2; Fig. 3K’). In the LS-EHF stage, Gbx2, a posterior neuroectoderm marker, is widely expressed in the posterior part of the embryo and is excluded from the headfold51. Gbx2 expression is completely absent in E7.5 Fam208aD6/D6 mutant embryos (n = 5; Fig. 3L’).

Figure 3. Gene marker expression in Fam208aD6/D6 mutant embryos. Whole mount in situ hybridization at E6.5-E7.5 of Fam208aD6/D6 mutants (A’–L’) and their wild-type littermates (A–L). (A’–D’) Posterior epiblast markers Wnt3 and Nodal fail to be completely down-regulated anteriorly in mutant embryos at E6.5–6.75. (F’) Eomes (key-regulator of EMT and inducer of mesoderm) is down-regulated in E6.5 mutant embryos. (I’) Complete absence of Gbx2 (posterior neuroectoderm; hindbrain marker) (J’) with expanded (both anterior and posterior) Otx2 (anterior forebrain marker) expression domain in E7.5 mutants. (H’) Slight Lim1 (anterior PS marker) expression is seen at E7.5 in mutants when compared to wild type. (K’,L’) Note that AVE migrates correctly in mutants at E6.5. At E7.5, Cer1 is expressed in the ADE overlaying future head formation, while in mutants Cer1 expression is reduced and remains in the distal epiblast. Dashed line in black indicates the length of the primitive streak and in blue indicates the length of the AVE. Scale bar: 30 µm. ES, Early streak; MS, Mid-streak; Pr-S, Pre-streak; EPHF, Early pre-head fold; LSEB, Late streak, early allantoic bud; LPHF, Late pre-head fold; al, allantois.

Determination of overall cell number and proliferation rates in Fam208a mutants. To investigate if the overt developmental phenotype of Fam208aD6/D6 mutant embryos was due to an overall decrease in cell number, we first quantified the number of cells in three distinct regions, namely the epiblast, ExE and VE at E6.5. We found a significant reduction in cell number for all three regions in Fam208aD6/D6 mutant embryos when compared to littermate controls (epiblast, n = 17; ExE, n = 15; VE, n = 15; Fig. 4B–D). Next, we performed immunofluorescence staining using the pan-proliferation marker Ki67 to determine the proliferative index in the epiblast, ExE and VE. We found a significant gene-dosage dependent increase in the proliferative index compared to its littermate controls (n = 4; Fig. 5A,B). To examine this defect further, we also checked the M-phase marker phospho-H3 (Ser 28)52 but found no significant change in the mitotic index in Fam208aD6/D6 mutant embryos (n = 3 embryos; Fig. 5C,D). This suggests that the increased percentage of cells that have entered the cell cycle (Ki67) are arrested or delayed in a phase of the cycle other than M-phase (phospho-H3). These findings cannot account for the diminished size of Fam208aD6/D6 mutant embryos and therefore, we shifted our focus to examining the rate of apoptosis.

Figure 4. Fam208aD6/D6 mutants have reduced cell numbers. Whole mount immunofluorescence of Fam208aD6/D6 mutant embryos and their wildtype littermate controls at E6.5. (A) Confocal images show that mutant embryos have a smaller epiblast as seen by the smaller expression domain of Oct4 (epiblast marker, green). Quantification of (B) epiblast, (C) ExE and (D) VE cell numbers. All results are mean ± SEM from 17 (Fam208a+/+), 15 (Fam208aD6/+) and 15 (Fam208aD6/D6) embryos. *p < 0.05, ****p < 0.0001. Scale bar: 50 µm. ES, Early streak; MS, Mid-streak; ExE, Extra-embryonic ectoderm; Epi, Epiblast; VE, Visceral endoderm.

Figure 5. Fam208aD6/D6 mutants exhibit altered proliferation and increased p53-mediated apoptosis. (A,B) Confocal images of mutant embryos with significantly increased Ki67-positive cells (green) in Epi and ExE; *p < 0.05. (C,D) The mutant embryos show no significant change in pH3 (red) expression, a measure of mitotic index. (E,F) Fam208aD6/D6 mutant embryos have significantly increased apoptosis as shown by Cleaved Caspase3 positive cells (Cl. Casp3; red), particularly in the epiblast, *p < 0.05, (G,H) also with pronounced increase in p53 level primarily in the epiblast and in part of the ExE region adjacent to the epiblast, *p < 0.05. All results are calculated as mean ± SEM from at least two different litters. The number of embryos analysed for each marker is indicated.
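The quantifications in Figs 4 and 5 are reported as mean ± SEM, and the proliferative, mitotic and apoptotic indices are fractions of marker-positive cells. A minimal sketch of both calculations follows. The counts are hypothetical placeholders, not data from the study, and defining an index as positive cells divided by total cells in a region is an assumption about how the indices were computed.

```python
import math
from statistics import mean, stdev


def mean_sem(values):
    """Return (mean, standard error of the mean) for a list of per-embryo values."""
    n = len(values)
    return mean(values), stdev(values) / math.sqrt(n)


def marker_index(positive_cells: int, total_cells: int) -> float:
    """Fraction of cells in a region that stain positive for a marker (assumed definition)."""
    return positive_cells / total_cells


# Hypothetical per-embryo epiblast cell counts, for illustration only.
hypothetical_counts = [10, 12, 14]
m, sem = mean_sem(hypothetical_counts)
print(f"mean = {m:.1f}, SEM = {sem:.2f}")

# Hypothetical Ki67-positive and total cell counts for one embryo.
print(f"index = {marker_index(30, 120):.2f}")
```

The sample standard deviation (n − 1 denominator) is used here, which is the usual choice when the embryos are treated as a sample from a larger population.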

Fam208a mutants exhibit increased apoptosis and are partially rescued by p53 mutation. Te epiblast, with its very high rates of cell proliferation, is under constant replicative stress and is particularly sensi- tive to genotoxic stresses53. In response to genotoxic stress, epiblast cells will normally undergo rapid apoptosis54. We investigated whether an increase in apoptosis was leading to impaired epiblast growth using cleaved caspase-3 imunofuorescence as a measure of the apoptotic index. Fam208aD6/D6 mutant epiblasts had a signifcant increase in the number of cleaved caspase-3 positive cells at E6.5. Tis increase was specifc to the epiblast, and was not observed in either the ExE or VE (n = 4; Fig. 5E,F). A p53-dependent apoptosis-mediated mechanism increases embryo ftness by removing mutated or damaged epiblast cells during early post-implantation development, allowing the selective clonal expansion of healthy

SCIEnTIFIC REports | 7: 9322 | DOI:10.1038/s41598-017-09490-w 8 www.nature.com/scientificreports/

Figure 6. Partial rescue of Fam208aD6/D6 mutant gastrulation block upon p53 removal. Gross morphology of E9.5 embryos obtained from Fam208a;+/D6 p53+/− intercrosses. (A) Representative bright-feld image of normally developing Fam208a;+/+ p53−/− control embryos, (B) no rescue of the Fam208aD6/D6 phenotype is observed as a result of the introduced mixed background (FVB/N and C57BL/6 J), however variable rescue of the Fam208aD6/D6 phenotype is observed in embryos with p53+/− (C,D) or p53−/− (E,F) genotypes. In all cases, representative embryos of each genotype are shown. Scale bar: 100 µm.

cells53. To determine whether p53 activation was associated with the observed increase in apoptosis in E6.5 Fam208aD6/D6 embryos, we investigated the expression of p53 using immunofluorescence. We found a significant increase in p53 expression in the epiblast as well as within the ExE-epiblast junction in Fam208aD6/D6 mutants (n = 4; Fig. 5G,H). Several gene knock-out models exhibiting gastrulation failure with p53-dependent apoptosis can be rescued by crossing to p53 mutant mice55–59. To investigate whether the gastrulation block seen in Fam208aD6/D6 mutants could be similarly rescued in a p53−/− background, we inter-crossed double heterozygous Fam208a+/D6; p53+/− mice and dissected at E9.5, a time point when Fam208aD6/D6 mutant embryos are severely retarded and are morphologically similar to the E6.75–7.0 stage. Indeed, we observed a partial rescue: Fam208aD6/D6; p53−/− double nullizygous embryos were alive, judging by a beating heart at the time of dissection. They reached developmental milestones associated with E8.5–9.0 with several developmental abnormalities, including neural tube closure defects (open mid- and hindbrain), an abnormal and enlarged pericardium, and irregular/smaller somites (n = 3; Fig. 6E,F and Supplementary Fig. 5). There was also detectable rescue in half of the Fam208aD6/D6 embryos in a p53+/− heterozygous background (n = 2/4; Fig. 6D); for the other half only an empty yolk sac was retrieved (n = 2/4; Fig. 6C). These results clearly indicate that the developmental phenotype seen in Fam208a mutant embryos is due to a p53-dosage-mediated increase in the rate of apoptosis.
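As a rough guide to how rare the informative genotypes are in a double heterozygous intercross of this kind, the expected genotype fractions can be sketched as below. The sketch assumes that the Fam208a and Trp53 loci segregate independently and that no embryos are lost before genotyping; both are assumptions made for illustration rather than results of this study, and the function name is hypothetical.

```python
from fractions import Fraction
from itertools import product


def offspring_distribution(parent_a, parent_b):
    """Expected genotype fractions at one locus (each parent is a pair of alleles)."""
    dist = {}
    for allele_a, allele_b in product(parent_a, parent_b):
        # Sort so that "+/D6" and "D6/+" are counted as the same genotype.
        genotype = "/".join(sorted((allele_a, allele_b)))
        dist[genotype] = dist.get(genotype, Fraction(0)) + Fraction(1, 4)
    return dist


# Both parents are heterozygous at each locus.
fam208a = offspring_distribution(("+", "D6"), ("+", "D6"))
p53 = offspring_distribution(("+", "-"), ("+", "-"))

# Assumption: independent segregation, so per-locus fractions multiply.
double_null = fam208a["D6/D6"] * p53["-/-"]   # D6/D6; p53-/-
d6_p53_het = fam208a["D6/D6"] * p53["+/-"]    # D6/D6; p53+/-

print("D6/D6; p53-/- :", double_null)  # 1/16
print("D6/D6; p53+/- :", d6_p53_het)   # 1/8
```

Under these assumptions only about one embryo in sixteen is expected to be double nullizygous, which is consistent with the small numbers of rescued embryos reported above.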

Transcription profiling in Fam208a mutant epiblasts. An expression microarray was performed using total RNA isolated from single dissected epiblasts at E6.25. The experiment consisted of four samples from each genotype (Fam208aD6/D6, Fam208a+/D6 and Fam208a+/+), with both sexes equally represented. As the MommeD6 line was originally identified as a semi-dominant suppressor of transgene variegation, we included heterozygous samples in our analysis in an attempt to identify any dosage-dependent changes in transcript abundance. Using signalling pathway impact analysis (SPIA)60 against the KEGG database, we saw significant enrichment of the p53 signalling pathway (mmu04115, p = 0.0023) (Fig. 7D). This enrichment is readily apparent in a related 24-member gene set defined recently61, comprising the p53-bound (50 kb from TSS) subset of genes significantly downregulated in response to combined p53/p73 depletion in mouse embryoid bodies (Fig. 7C). For this p53/p73 dataset there was significant enrichment not only in homozygotes (comp. probability <0.001, NEk q <0.001, Supplementary Table 1), but also in heterozygotes (comp. probability = 0.005, NEk q = 0.094). Also revealed in homozygotes were increased levels of several transcripts related to embryogenesis. These include Dkk1 (14.2-fold, q = 1.1 × 10^−5), Gsc (4.2-fold, q = 1.5 × 10^−4), Cfc1 (Cryptic) (5.9-fold, q = 0.0018), and Nodal (2.5-fold, q = 0.0012), indicating that altered expression of important patterning genes is already present at E6.25, a stage when mutant embryos are morphologically indistinguishable from wild-type. For Dkk1 this increase was confirmed by qRT-PCR (Fig. 7E). We also observed a significant overrepresentation of KRAB protein-containing (IPR001909) genes (comp. probability = 0.001, NEk q = 0.071), which is consistent with the results from FAM208A knockdown experiments using human cell lines11.


Figure 7. Increased p53 signaling and deregulation of pluripotency-associated transcripts in Fam208aD6/D6 epiblasts. (A) Volcano plot of the Fam208aD6/D6 contrast with Fam208a+/+, indicating the position of genes selected for validation by qPCR. (B) Density plots showing the position of Platr genes among all genes differentially expressed between Fam208aD6/D6 and Fam208a+/+ epiblasts, with data segregated according to sex (n = 2 each). (C) Heat map showing differential expression of a p53-bound, p53/p73-regulated gene set defined by [7]. (D) Statistical overrepresentation of the p53 signaling pathway using SPIA analysis. (E) Single-epiblast qPCR verification at E6.25 of gene expression normalized to Gapdh, with mutant values represented as a fold change relative to wild-type littermates (average wild-type expression set to 1). Expression measurements were carried out in duplicate per embryo (number of embryos: +/+, n = 6; +/D6, n = 4; D6/D6, n = 6). All results are calculated as mean ± SEM from at least two different litters.

Functional set enrichment was also found for a previously defined Oct4 coexpression-based pluripotency module62 (comp. probability = 0.005, NEk q = 0.006), suggesting that mutant epiblast cells may have a delayed transition from a naïve to a primed pluripotency state, or alternatively a delayed exit from pluripotency. The long


non-coding RNA fraction from this gene module, termed Platr (pluripotency-associated transcript), was particularly enriched (comp. probability <0.001, NEk q = 0.003) with a similar distribution profile between males and females (Fig. 7B). For Platr3, Platr20 and Platr27, this upregulation was confirmed by qRT-PCR (Fig. 7E). Notably, the Platr gene set was also enriched in heterozygotes (comp. probability <0.044, NEk q = 0.347).

Discussion

Extensive epigenetic changes occur at the onset of gastrulation as cells exit pluripotency and become committed to the different embryonic germ lineages. Here, we show that the epigenetic repressor Fam208a is vitally important at this stage, with Fam208aD6/D6 mutant embryos exhibiting profound developmental delay beginning by E6.5. The establishment of the anterior-posterior axis occurs during gastrulation and is coordinated with embryonic growth. In Fam208aD6/D6 embryos we observed significantly impaired growth in the epiblast at E6.5, a time corresponding with the onset of gastrulation and the establishment of A-P patterning. Strongly supporting the interpretation of developmental delay is the eventual appearance of node activity at E8.5 (Supplementary Fig. 4). However, we did observe discrepancies that cannot be readily attributed to developmental delay. The first is the increasingly desynchronized growth of extraembryonic tissue compared with embryonic tissue, and the second is a failure to downregulate anterior Nodal expression. Normally, expression of Lefty1 and Cer1 in the DVE inhibits Nodal expression, resulting in a proximal-distal Nodal gradient. This gradient rotates to become the A-P axis, whereby the DVE moves anteriorly to form the AVE, and the proximal epiblast moves posteriorly63–66. In E6.5 Fam208aD6/D6 embryos this rotation has occurred, with the AVE clearly visible as an anterior Cer1-expression domain, but without the expected effect of downregulating anterior Nodal expression (Fig. 3). This is not the case for Wnt3 expression, a source of inhibitory signalling to AVE, which rotates normally to the posterior epiblast. Similar to our Fam208aD6/D6 embryos, Drap1 and Lefty2 knockouts also have gastrulation failure with excessive Nodal signaling, but are able to correctly specify the AVE67,68. A possible reason for the rescue of AVE development is the highly increased expression of Dkk1 in E6.5 Fam208aD6/D6 epiblasts. Dkk1 is an attractive signal for AVE migration and exogenous administration of Dkk1 can rescue AVE migration defects caused by inhibition of proliferation69, suggesting some plasticity in the coordination of A-P axis formation with embryonic growth.

The gastrulation-defective Fam208aD6/D6 mutants exhibited upregulation of p53-signature genes and increased p53 protein stability. Critically, Fam208aD6/D6 mutants, when crossed into a p53 null background, showed a partial rescue of the gastrulation phenotype. The rescued embryos, with their kinked neural tube, cardiac defects and failure of anterior neural tube closure, resembled p53/p63/p73 triple knockout chimaeras in which the phenotypes were attributed to an impairment in mesendodermal specification with a corresponding proclivity to assume a neurectoderm fate61.

The epiblast has been shown to contain an amplified p53 signaling response and high cytoplasmic priming towards apoptosis. This heightened vigilance protects against the accumulation of mutations at a formative period when rapid cell cycling and a relatively open chromatin conformation can make epiblast cells more sensitive to genomic insults. Indeed, the period between E5.5 and E7.5 comprises a sensitive developmental window during which the deletion of many genes important for genome integrity is lethal70–74. Possible reasons for an increase in p53-dependent apoptosis in Fam208aD6/D6 embryos are increased genomic instability, for example due to impaired repression of endogenous retroviruses and satellite repeats, or stabilization of the p53 protein itself.

Fam208aD6/D6 embryos at E6.5 exhibited a gene-dosage-dependent increase in the percentage of cells labeled with the proliferative marker Ki67. This increased index of proliferation is possibly a compensatory response to cellular attrition through p53-mediated apoptosis and cell cycle arrest. Similar models of compensatory epiblast growth have been described2,3.

Fam208a (then termed D14Abb1e) was first proposed as a pluripotency-related gene based on its clustering within an Oct4 co-expression module and demonstration of Oct4 occupancy of its promoter75. Our expression profiles of Fam208aD6/D6 mutant epiblasts show gene set enrichment for an Oct4 co-expression module62, and in particular for the nuclear long non-coding subset of this co-expression module, which have been termed pluripotency-associated transcripts (Platr). Indeed, Platr3, -4, -20 and -27 were all within the top 95 up-regulated genes (q-value < 0.05 and fold change >2). However, not all members of this module were similarly affected, and Platr22 actually decreased in expression. None of the dysregulated lncRNAs in Fam208a mutants have been previously ascribed a functional role in maintaining embryonic stem cell pluripotency, although direct functional roles have so far been described for Platr11 (linc1405; ref. 76), Platr14 (ref. 62) and Platr18 (Lincenc1; ref. 77). It is also possible that these lncRNAs are markers for more global, repressive epigenetic changes associated with pluripotency exit. If so, then silencing of Platr3, -4, -20 and -27 may be especially sensitive to Fam208a-dependent expansion of H3K9me3 domains, whereas other Platr genes may rely more on other mechanisms of epigenetic silencing.

In summary, our results show an important role for Fam208a in maintaining epiblast fitness; in its absence, embryos are subject to loss via p53-mediated apoptosis. Rescuing the phenotype by mutating p53 raises the question of whether mutation or misexpression of Fam208a can similarly affect the fitness of other cell populations, and whether such fitness defects are similarly “rescued” by loss of the tumor suppressor p53, leading to cancer.

Materials and Methods

Ethics statement. Housing of mice and in vivo experiments were performed in compliance with the European Communities Council Directive of 24 November 1986 (86/609/EEC) and national and institutional guidelines. Animal care and the killing of mice by cervical dislocation were approved by the Animal Care Committee of the Institute of Molecular Genetics (Ethic approval ID 14/2015).


Mouse lines and embryo collection. The two mutant strains of Fam208a, namely MommeD6 (L130P) and MommeD20 (IVS1 + 2 C > T), were maintained by inbreeding on the FVB/NJ background and have been described previously10. Trp53tm1Tyj mice78 were obtained from the Jackson Laboratory (Bar Harbor, USA) and were maintained on a C57BL/6J background. All mice were kept under specific pathogen-free conditions according to Federation of European Laboratory Animal Science Associations (FELASA) recommendations, and all procedures were in strict accordance with local Animal Ethics Committee regulations. Embryos were harvested from timed matings of Fam208aD6/+ or Fam208aD20/+ intercrosses, with noon of the day on which the plug was observed designated embryonic day E0.5. For more accurate staging, we followed the revised Theiler staging of mouse development before organogenesis79.

Genotyping. For MommeD6 or MommeD20 strains, PCR products amplified from the whole embryo or the Reichert’s membrane were used for genotyping by Sanger sequencing as described10. For sex-determination PCR, Ube primers and conditions were used80, and for Trp53tm1Tyj genotyping we followed the distributor’s protocol (Jackson Laboratory, Bar Harbor, USA).

Whole-mount in situ hybridization and histology. Embryos were dissected from timed-mated females into cold PBS containing 10% FBS and were fixed overnight at 4 °C in 4% paraformaldehyde in PBS containing 0.1% Tween-20 (PBT). Single-color whole-mount in situ hybridization was carried out as described81. RNA probes were labelled with either digoxigenin (DIG; Roche Diagnostics, Germany) or FITC (Roche Diagnostics, Germany). The riboprobe template for Fam208a was prepared using the primers Fwd-ACCACTGGAGAAGCCTGAGA and Rev-GGAATCTTCCTGCTGCACTC, and templates for T, Nodal, Cer1, Foxa2, Shh, Noto, Wnt3, Eomes, Gbx2, Lim1, and Otx2 were obtained from Prof. Janet Rossant and were used previously82. After post-fixing overnight in 4% paraformaldehyde, embryos were imaged using an inverted microscope (SteREO Discovery V12, Zeiss). Selected embryos were then washed 5–6 times in PBT, embedded in agarose and then embedded in paraffin for sectioning at 3 µm for haematoxylin and eosin (H&E) staining. Sections were imaged using a Zeiss Imager.Z2 equipped with an N-Achroplan 40x/0.65 M27 objective and ZEN software for image acquisition.

Whole-mount immunofluorescence. Dissected embryos were fixed in 2% paraformaldehyde in PBT at room temperature for 20 min and washed twice in PBT. Embryos were permeabilized in 0.1 M glycine/0.1% Triton X-100 for 12 min (E6.5) or 15 min (E7.5) at room temperature and washed twice in PBT. The embryos were blocked in 10% FBS/1% BSA in PBT (blocking buffer) at room temperature for 3 h. For primary mouse antibodies, the embryos were further blocked using the mouse MOM IgG kit (Vector Laboratories) according to the manufacturer’s instructions. Embryos were incubated overnight with primary antibodies diluted in blocking buffer and, the following day, were incubated further in primary antibodies for 2 h at room temperature, washed three times in PBT for 10 min each and incubated in secondary antibodies diluted in blocking buffer for 3 h at room temperature. The embryos were washed three times with PBT, stained with DAPI (nuclei), mounted, and imaged by confocal microscopy using a Leica TCS SP5 AOBS Tandem microscope and Leica Application Suite Advanced Fluorescence (LAS AF version 2.7.3.9723) software. Objectives LP/-/C HC PL APO 40x/1,30 OIL CS2 and LP/0,14–0,20/D HC PL APO 63x/1,40 OIL were used for imaging. In all cases, a single confocal z-stack is shown from one representative embryo of each genotype. For each marker, the number of positive cells and the total number of DAPI-positive (nuclear marker, blue) cells were counted using the Cell Counter plugin in FIJI. The apoptotic index (cleaved caspase-3), pro-apoptotic index (p53), proliferative index (Ki67) and mitotic index (phospho-H3 Ser28) were calculated as the percentage of marker-positive cells relative to the total number of DAPI-positive cells in each of the Epi, ExE and VE, in a single confocal plane per embryo (at least 3 embryos per group).
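The index calculation described above amounts to a simple percentage per tissue compartment. A minimal sketch follows; the cell counts and the function name are hypothetical and serve only to illustrate the arithmetic, they are not data from this study.

```python
def marker_index(positive_cells, dapi_cells):
    """Percentage of marker-positive cells among all DAPI-positive nuclei."""
    if dapi_cells <= 0:
        raise ValueError("at least one DAPI-positive nucleus is required")
    if not 0 <= positive_cells <= dapi_cells:
        raise ValueError("positive cells must lie between 0 and the DAPI count")
    return 100.0 * positive_cells / dapi_cells


# Hypothetical counts from one confocal plane of one embryo:
# tissue -> (marker-positive cells, DAPI-positive cells)
counts = {"Epi": (18, 120), "ExE": (3, 90), "VE": (2, 80)}

indices = {tissue: marker_index(pos, total) for tissue, (pos, total) in counts.items()}

for tissue, value in indices.items():
    print(f"{tissue}: {value:.1f}%")
```

Computing the index separately for Epi, ExE and VE, rather than pooling all nuclei, keeps a tissue-restricted effect (such as one confined to the epiblast) from being diluted by the other compartments.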

RNA expression analysis and qPCR. After dissection of E6.25 MommeD6 embryos and removal of their Reichert’s membrane, the epiblasts were carefully dissected from the rest of the embryo (ExE/EPC). Each isolated epiblast was lysed and snap-frozen at −80 °C, and RNA was later extracted using an RNA micro kit (Qiagen) after the genotypes had been determined by Sanger sequencing. Quality and concentration of eluted RNA were assessed with the Agilent RNA 6000 Pico Kit. Only samples with an RNA integrity score >8 were further processed for microarray analysis (Affymetrix GeneChip Mouse Gene 2.0 ST Array). For quantitative real-time PCR (qPCR), total RNA from single epiblasts was isolated using a PicoPure RNA isolation kit (Life Technologies) and reactions were performed using the Roche LC480 LightCycler. All qPCR primers are listed in Supplementary Table 2.
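The normalization used for the qPCR data (expression relative to Gapdh, with the wild-type average set to 1) can be sketched as follows. This is an illustrative sketch only: it assumes a delta-Ct style calculation with perfect doubling per cycle, which the text does not specify, and all Ct values and function names are hypothetical.

```python
from statistics import mean


def relative_expression(target_cts, gapdh_cts):
    """Expression of a target relative to Gapdh for one embryo.

    Each argument holds the duplicate Ct values of that embryo.
    Assumption: amplification efficiency of exactly 2 per cycle.
    """
    delta_ct = mean(target_cts) - mean(gapdh_cts)
    return 2.0 ** (-delta_ct)


def fold_change_vs_wildtype(mutant_values, wildtype_values):
    """Scale values so that the average wild-type expression equals 1."""
    wildtype_mean = mean(wildtype_values)
    return [value / wildtype_mean for value in mutant_values]


# Hypothetical duplicate Ct values: (target Cts), (Gapdh Cts) per embryo.
wildtype = [
    relative_expression((24.0, 24.0), (18.0, 18.0)),
    relative_expression((25.0, 25.0), (19.0, 19.0)),
]
mutant = [relative_expression((22.0, 22.0), (18.0, 18.0))]

folds = fold_change_vs_wildtype(mutant, wildtype)
print(folds)  # the mutant embryo is 2 cycles earlier than wild-type, i.e. 4-fold higher
```

Dividing by the arithmetic mean of the wild-type values makes the wild-type group average exactly 1, which matches the convention stated in the legend of Fig. 7E.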

Microarray analysis. Microarray data were processed from .CEL files using the oligo library in R83. Data were normalized using the RMA method and batch-corrected using the removeBatchEffect function in limma84. Data were assessed before and after normalization and batch correction by principal component analysis using the prcomp function, with samples plotted in principal component space. Functional set enrichment was performed using SPIA60 and a modified version of sigPathway85 as described by Maciejewski86.
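To illustrate what a batch correction step does to a single gene, the sketch below removes additive batch shifts by re-centring each batch on the overall mean. It is a deliberately simplified stand-in written in plain Python, not the limma implementation (which fits a linear model and is not reproduced here); the expression values, batch labels and function name are hypothetical.

```python
from statistics import mean


def center_batches(values, batches):
    """Remove additive batch shifts from one gene's log-expression values.

    Each value is shifted so that its batch mean equals the overall mean.
    Simplification: no biological covariates are protected.
    """
    overall = mean(values)
    batch_means = {
        label: mean(v for v, b in zip(values, batches) if b == label)
        for label in set(batches)
    }
    return [v - batch_means[b] + overall for v, b in zip(values, batches)]


# Hypothetical log2 expression of one gene in four samples from two batches.
expression = [8.0, 8.2, 9.0, 9.2]
batch = ["a", "a", "b", "b"]

corrected = center_batches(expression, batch)
print([round(v, 2) for v in corrected])
```

After this step both batches share the same mean while the within-batch differences are preserved. If genotype were confounded with batch, such centring would also remove genuine biological signal, which is why a model-based correction that accounts for the experimental design is generally preferable to this simplified version.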

Statistical analysis. All graphs were generated using GraphPad Prism version 7 and data are shown as mean and SEM. The Mann-Whitney U test was used for analysing cell number and the apoptotic, pro-apoptotic, proliferative and mitotic indices; p < 0.05 was considered significant (*).
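For small groups such as those used here, the Mann-Whitney U test can be evaluated exactly by enumerating every way of relabelling the pooled observations. The sketch below is a generic, self-contained illustration of that idea and is not a reproduction of the GraphPad Prism computation; the index values and function names are hypothetical.

```python
from itertools import combinations


def u_statistic(x, y):
    """U for sample x: each (x, y) pair scores 1 if x > y and 0.5 if tied."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


def exact_mann_whitney(x, y):
    """Return (U, two-sided exact p-value) by enumerating all relabellings."""
    pooled = list(x) + list(y)
    n_x = len(x)
    center = n_x * len(y) / 2.0          # expected U under the null hypothesis
    u_observed = u_statistic(x, y)
    observed_distance = abs(u_observed - center)
    extreme = total = 0
    for chosen in combinations(range(len(pooled)), n_x):
        chosen_set = set(chosen)
        group_x = [pooled[i] for i in chosen]
        group_y = [pooled[i] for i in range(len(pooled)) if i not in chosen_set]
        total += 1
        # Count relabellings at least as far from the null expectation.
        if abs(u_statistic(group_x, group_y) - center) >= observed_distance - 1e-12:
            extreme += 1
    return u_observed, extreme / total


# Hypothetical apoptotic indices (%) for four mutant and four control embryos.
mutant = [12.0, 15.0, 17.0, 19.0]
control = [5.0, 7.0, 8.0, 10.0]

u_value, p_value = exact_mann_whitney(mutant, control)
print(u_value, round(p_value, 4))
```

With four values per group there are 70 possible relabellings, so the smallest attainable two-sided exact p-value is 2/70 (about 0.029), reached only when the two groups do not overlap at all, as in this hypothetical example.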

Data availability. Raw data are available from ArrayExpress (E-MTAB-5357).


References

1. Snow, M. H. L. Gastrulation in the mouse: Growth and regionalization of the epiblast. J. Embryol. exp. Morph. 42, 293–303 (1977).
2. Lewis, N. E. & Rossant, J. Mechanism of size regulation in mouse embryo aggregates. J Embryol Exp Morphol 72, 169–181 (1982).
3. Power, M. A. & Tam, P. P. Onset of gastrulation, morphogenesis and somitogenesis in mouse embryos displaying compensatory growth. Anat Embryol (Berl) 187, 493–504 (1993).
4. Arnold, S. J. & Robertson, E. J. Making a commitment: cell lineage allocation and axis patterning in the early mouse embryo. Nat Rev Mol Cell Biol 10, 91–103, doi:10.1038/nrm2618 (2009).
5. Tam, P. P. & Loebel, D. A. Gene function in mouse embryogenesis: get set for gastrulation. Nat Rev Genet 8, 368–381, doi:10.1038/nrg2084 (2007).
6. Blewitt, M. E. et al. An N-ethyl-N-nitrosourea screen for genes involved in variegation in the mouse. Proc Natl Acad Sci USA 102, 7629–7634, doi:10.1073/pnas.0409375102 (2005).
7. Daxinger, L. et al. An ENU mutagenesis screen identifies novel and known genes involved in epigenetic processes in the mouse. Genome Biol 14, R96, doi:10.1186/gb-2013-14-9-r96 (2013).
8. Whitelaw, N. C. et al. Reduced levels of two modifiers of epigenetic gene silencing, Dnmt3a and Trim28, cause increased phenotypic noise. Genome Biol 11, R111, doi:10.1186/gb-2010-11-11-r111 (2010).
9. Isbel, L. et al. Trim33 Binds and Silences a Class of Young Endogenous Retroviruses in the Mouse Testis; a Novel Component of the Arms Race between Retrotransposons and the Host Genome. PLoS Genet 11, e1005693, doi:10.1371/journal.pgen.1005693 (2015).
10. Harten, S. K. et al. The first mouse mutants of D14Abb1e (Fam208a) show that it is critical for early development. Mamm Genome 25, 293–303, doi:10.1007/s00335-014-9516-0 (2014).
11. Tchasovnikarova, I. A. et al. GENE SILENCING. Epigenetic silencing by the HUSH complex mediates position-effect variegation in human cells. Science 348, 1481–1485, doi:10.1126/science.aaa7227 (2015).
12. Timms, R. T., Tchasovnikarova, I. A. & Lehner, P. J. Position-effect variegation revisited: HUSHing up heterochromatin in human cells. BioEssays: news and reviews in molecular, cellular and developmental biology 38, 333–343, doi:10.1002/bies.201500184 (2016).
13. Lachner, M. & Jenuwein, T. The many faces of histone lysine methylation. Curr Opin Cell Biol 14, 286–298 (2002).
14. Young, R. A. Control of the embryonic stem cell state. Cell 144, 940–954, doi:10.1016/j.cell.2011.01.032 (2011).
15. Xi, Q. et al. A poised chromatin platform for TGF-beta access to master regulators. Cell 147, 1511–1524, doi:10.1016/j.cell.2011.11.032 (2011).
16. Dodge, J. E., Kang, Y. K., Beppu, H., Lei, H. & Li, E. Histone H3-K9 methyltransferase ESET is essential for early development. Mol Cell Biol 24, 2478–2486 (2004).
17. Morsut, L. et al. Negative control of Smad activity by ectodermin/Tif1gamma patterns the mammalian embryo. Development 137, 2571–2578, doi:10.1242/dev.053801 (2010).
18. Cammas, F. et al. Mice lacking the transcriptional corepressor TIF1beta are defective in early postimplantation development. Development 127, 2955–2963 (2000).
19. Beck, S. et al. Extraembryonic proteases regulate Nodal signalling during gastrulation. Nat Cell Biol 4, 981–985, doi:10.1038/ncb890 (2002).
20. Donnison, M. et al. Loss of the extraembryonic ectoderm in Elf5 mutants leads to defects in embryonic patterning. Development 132, 2299–2308, doi:10.1242/dev.01819 (2005).
21. Fujiwara, T., Dunn, N. R. & Hogan, B. L. Bone morphogenetic protein 4 in the extraembryonic mesoderm is required for allantois development and the localization and survival of primordial germ cells in the mouse. Proc Natl Acad Sci USA 98, 13739–13744, doi:10.1073/pnas.241508898 (2001).
22. Winnier, G., Blessing, M., Labosky, P. A. & Hogan, B. L. Bone morphogenetic protein-4 is required for mesoderm formation and patterning in the mouse. Genes Dev 9, 2105–2116 (1995).
23. Herrmann, B. G. Expression pattern of the Brachyury gene in whole-mount TWis/TWis mutant embryos. Development 113, 913–917 (1991).
24. Ding, J. et al. Cripto is required for correct orientation of the anterior-posterior axis in the mouse embryo. Nature 395, 702–707, doi:10.1038/27215 (1998).
25. Abdelkhalek, H. B. et al. The mouse homeobox gene Not is required for caudal notochord development and affected by the truncate mutation. Genes Dev 18, 1725–1736, doi:10.1101/gad.303504 (2004).
26. Yamanaka, Y., Tamplin, O. J., Beckers, A., Gossler, A. & Rossant, J. Live imaging and genetic analysis of mouse notochord formation reveals regional morphogenetic mechanisms. Dev Cell 13, 884–896, doi:10.1016/j.devcel.2007.10.016 (2007).
27. Echelard, Y. et al. Sonic hedgehog, a member of a family of putative signaling molecules, is implicated in the regulation of CNS polarity. Cell 75, 1417–1430 (1993).
28. Monaghan, A. P., Kaestner, K. H., Grau, E. & Schutz, G. Postimplantation expression patterns indicate a role for the mouse forkhead/HNF-3 alpha, beta and gamma genes in determination of the definitive endoderm, chordamesoderm and neuroectoderm. Development 119, 567–578 (1993).
29. Sasaki, H. & Hogan, B. L. Differential expression of multiple fork head related genes during gastrulation and axial pattern formation in the mouse embryo. Development 118, 47–59 (1993).
30. Ang, S. L. et al. The formation and maintenance of the definitive endoderm lineage in the mouse: involvement of HNF3/forkhead proteins. Development 119, 1301–1315 (1993).
31. Burtscher, I. & Lickert, H. Foxa2 regulates polarity and epithelialization in the endoderm germ layer of the mouse embryo. Development 136, 1029–1038, doi:10.1242/dev.028415 (2009).
32. Burdsal, C. A., Damsky, C. H. & Pedersen, R. A. The role of E-cadherin and integrins in mesoderm differentiation and migration at the mammalian primitive streak. Development 118, 829–844 (1993).
33. Ciruna, B. & Rossant, J. FGF signaling regulates mesoderm cell fate specification and morphogenetic movement at the primitive streak. Dev Cell 1, 37–49 (2001).
34. Damjanov, I., Damjanov, A. & Damsky, C. H. Developmentally regulated expression of the cell-cell adhesion glycoprotein cell-CAM 120/80 in peri-implantation mouse embryos and extraembryonic membranes. Developmental biology 116, 194–202 (1986).
35. Smith, D. E., Franco del Amo, F. & Gridley, T. Isolation of Sna, a mouse gene homologous to the Drosophila genes snail and escargot: its expression pattern suggests multiple roles during postimplantation development. Development 116, 1033–1039 (1992).
36. Yamaguchi, T. P., Harpal, K., Henkemeyer, M. & Rossant, J. fgfr-1 is required for embryonic growth and mesodermal patterning during mouse gastrulation. Genes Dev 8, 3032–3044 (1994).
37. Sun, X., Meyers, E. N., Lewandoski, M. & Martin, G. R. Targeted disruption of Fgf8 causes failure of cell migration in the gastrulating mouse embryo. Genes Dev 13, 1834–1846 (1999).
38. Brennan, J. et al. Nodal signalling in the epiblast patterns the early mouse embryo. Nature 411, 965–969, doi:10.1038/35082103 (2001).
39. Arnold, S. J., Hofmann, U. K., Bikoff, E. K. & Robertson, E. J. Pivotal roles for eomesodermin during axis formation, epithelium-to-mesenchyme transition and endoderm specification in the mouse. Development 135, 501–511, doi:10.1242/dev.014357 (2008).
40. Ciruna, B. G. & Rossant, J. Expression of the T-box gene Eomesodermin during early mouse development. Mech Dev 81, 199–203 (1999).
41. Hancock, S. N., Agulnik, S. I., Silver, L. M. & Papaioannou, V. E. Mapping and expression analysis of the mouse ortholog of Xenopus Eomesodermin. Mech Dev 81, 205–208 (1999).


42. Russ, A. P. et al. Eomesodermin is required for mouse trophoblast development and mesoderm formation. Nature 404, 95–99, doi:10.1038/35003601 (2000).
43. Belo, J. A. et al. Cerberus-like is a secreted BMP and nodal antagonist not essential for mouse development. Genesis 26, 265–270 (2000).
44. Belo, J. A. et al. Cerberus-like is a secreted factor with neutralizing activity expressed in the anterior primitive endoderm of the mouse gastrula. Mech Dev 68, 45–57 (1997).
45. Biben, C. et al. Murine cerberus homologue mCer-1: a candidate anterior patterning molecule. Developmental biology 194, 135–151, doi:10.1006/dbio.1997.8812 (1998).
46. Pearce, J. J., Penny, G. & Rossant, J. A mouse cerberus/Dan-related gene family. Developmental biology 209, 98–110, doi:10.1006/dbio.1999.9240 (1999).
47. Barnes, J. D., Crosby, J. L., Jones, C. M., Wright, C. V. & Hogan, B. L. Embryonic expression of Lim-1, the mouse homolog of Xenopus Xlim-1, suggests a role in lateral mesoderm differentiation and neurogenesis. Developmental biology 161, 168–178, doi:10.1006/dbio.1994.1018 (1994).
48. Tsang, T. E. et al. Lim1 activity is required for intermediate mesoderm differentiation in the mouse embryo. Developmental biology 223, 77–90, doi:10.1006/dbio.2000.9733 (2000).
49. Simeone, A. et al. A vertebrate gene related to orthodenticle contains a homeodomain of the bicoid class and demarcates anterior neuroectoderm in the gastrulating mouse embryo. EMBO J 12, 2735–2747 (1993).
50. Ang, S. L., Conlon, R. A., Jin, O. & Rossant, J. Positive and negative signals from mesoderm regulate the expression of mouse Otx2 in ectoderm explants. Development 120, 2979–2989 (1994).
51. Bouillet, P., Chazaud, C., Oulad-Abdelghani, M., Dolle, P. & Chambon, P. Sequence and expression pattern of the Stra7 (Gbx-2) homeobox-containing gene induced by retinoic acid in P19 embryonal carcinoma cells. Dev Dyn 204, 372–382, doi:10.1002/aja.1002040404 (1995).
52. Goto, H. et al. Identification of a novel phosphorylation site on histone H3 coupled with mitotic chromosome condensation. J Biol Chem 274, 25543–25549 (1999).
53. Laurent, A. & Blasi, F. Differential DNA damage signalling and apoptotic threshold correlate with mouse epiblast-specific hypersensitivity to radiation. Development 142, 3675–3685, doi:10.1242/dev.125708 (2015).
54. Heyer, B. S., MacAuley, A., Behrendtsen, O. & Werb, Z. Hypersensitivity to DNA damage leads to increased apoptosis during early mouse development. Genes Dev 14, 2072–2084 (2000).
55. Fernandez-Diaz, L. C. et al. The absence of Prep1 causes p53-dependent apoptosis of mouse pluripotent epiblast cells. Development 137, 3393–3403, doi:10.1242/dev.050567 (2010).
56. Guzman-Ayala, M. et al. Chd1 is essential for the high transcriptional output and rapid growth of the mouse epiblast. Development 142, 118–127, doi:10.1242/dev.114843 (2015).
57. Panic, L. et al. Ribosomal protein S6 gene haploinsufficiency is associated with activation of a p53-dependent checkpoint during gastrulation. Mol Cell Biol 26, 8880–8891, doi:10.1128/MCB.00751-06 (2006).
58. Ruland, J. et al. p53 accumulation, defective cell proliferation, and early embryonic lethality in mice lacking tsg101. Proc Natl Acad Sci USA 98, 1859–1864, doi:10.1073/pnas.98.4.1859 (2001).
59. Singh, A. P. et al. Brg1 Enables Rapid Growth of the Early Embryo by Suppressing Genes That Regulate Apoptosis and Cell Growth Arrest. Mol Cell Biol 36, 1990–2010, doi:10.1128/MCB.01101-15 (2016).
60. Tarca, A. L. et al. A novel signaling pathway impact analysis. Bioinformatics 25, 75–82, doi:10.1093/bioinformatics/btn577 (2009).
61. Wang, Q. et al. The p53 Family Coordinates Wnt and Nodal Inputs in Mesendodermal Differentiation of Embryonic Stem Cells. Cell Stem Cell. doi:10.1016/j.stem.2016.10.002 (2016).
62. Bergmann, J. H. et al. Regulation of the ESC transcriptome by nuclear long noncoding RNAs. Genome Res 25, 1336–1346, doi:10.1101/gr.189027.114 (2015).
63. Collignon, J., Varlet, I. & Robertson, E. J. Relationship between asymmetric nodal expression and the direction of embryonic turning. Nature 381, 155–158, doi:10.1038/381155a0 (1996).
64. Constam, D. B. Running the gauntlet: an overview of the modalities of travel employed by the putative morphogen Nodal. Curr Opin Genet Dev 19, 302–307, doi:10.1016/j.gde.2009.06.006 (2009).
65. Constam, D. B. Riding shotgun: a dual role for the epidermal growth factor-Cripto/FRL-1/Cryptic protein Cripto in Nodal trafficking. Traffic 10, 783–791, doi:10.1111/j.1600-0854.2009.00874.x (2009).
66. Zhou, X., Sasaki, H., Lowe, L., Hogan, B. L. & Kuehn, M. R. Nodal is a novel TGF-beta-like gene expressed in the mouse node during gastrulation. Nature 361, 543–547, doi:10.1038/361543a0 (1993).
67. Iratni, R. et al. Inhibition of excess nodal signaling during mouse gastrulation by the transcriptional corepressor DRAP1. Science 298, 1996–1999, doi:10.1126/science.1073405 (2002).
68. Meno, C. et al. Mouse Lefty2 and zebrafish antivin are feedback inhibitors of nodal signaling during vertebrate gastrulation. Mol Cell 4, 287–298 (1999).
69. Stuckey, D. W. et al. Coordination of cell proliferation and anterior-posterior axis establishment in the mouse embryo. Development 138, 1521–1530, doi:10.1242/dev.063537 (2011).
70. Brown, E. J. & Baltimore, D. ATR disruption leads to chromosomal fragmentation and early embryonic lethality. Genes Dev 14, 397–402 (2000).
71. Dobles, M., Liberal, V., Scott, M. L., Benezra, R. & Sorger, P. K. Chromosome missegregation and apoptosis in mice lacking the mitotic checkpoint protein Mad2. Cell 101, 635–645 (2000).
72. Hakem, R. et al. The tumor suppressor gene Brca1 is required for embryonic cellular proliferation in the mouse. Cell 85, 1009–1023 (1996).
73. Jeon, Y. et al. TopBP1 deficiency causes an early embryonic lethality and induces cellular senescence in primary cells. J Biol Chem 286, 5414–5422, doi:10.1074/jbc.M110.189704 (2011).
74. Kalitsis, P., Earle, E., Fowler, K. J. & Choo, K. H. Bub3 gene disruption in mice reveals essential mitotic spindle checkpoint function during early embryogenesis. Genes Dev 14, 2277–2282 (2000).
75. Campbell, P. A., Perez-Iratxeta, C., Andrade-Navarro, M. A. & Rudnicki, M. A. Oct4 targets regulatory nodes to modulate stem cell function. PLoS One 2, e553, doi:10.1371/journal.pone.0000553 (2007).
76. Guttman, M. et al. lincRNAs act in the circuitry controlling pluripotency and differentiation. Nature 477, 295–300, doi:10.1038/nature10398 (2011).
77. Ivanova, N. et al. Dissecting self-renewal in stem cells with RNA interference. Nature 442, 533–538, doi:10.1038/nature04915 (2006).
78. Jacks, T. et al. Tumor spectrum analysis in p53-mutant mice. Curr Biol 4, 1–7 (1994).
79. Lawson, K. A. & Wilson, V. in Kaufman’s Atlas of Mouse Development Supplement: With Coronal Sections (eds R. Baldock, J. B. Bard, D. R. Davidson, & G. Morriss-Kay) Ch. 3, 51–64 (Elsevier, 2015).
80. Chuma, S. & Nakatsuji, N. Autonomous transition into meiosis of mouse fetal germ cells in vitro and its inhibition by gp130-mediated signaling. Developmental biology 229, 468–479, doi:10.1006/dbio.2000.9989 (2001).
81. Georgiades, P. & Rossant, J. Ets2 is necessary in trophoblast for normal embryonic anteroposterior axis development. Development 133, 1059–1068, doi:10.1242/dev.02277 (2006).
82. Polydorou, C. & Georgiades, P. Ets2-dependent trophoblast signalling is required for gastrulation progression after primitive streak initiation. Nat Commun 4, 1658, doi:10.1038/ncomms2646 (2013).

Scientific Reports | 7: 9322 | DOI:10.1038/s41598-017-09490-w | www.nature.com/scientificreports/

83. Carvalho, B. S. & Irizarry, R. A. A framework for oligonucleotide microarray preprocessing. Bioinformatics 26, 2363–2367, doi:10.1093/bioinformatics/btq431 (2010).
84. Smyth, G. K. in Bioinforma. Comput. Biol. Solut. Using R Bioconductor (eds Gentleman, R. et al.) 397–420 (Springer, 2005).
85. Tian, L. et al. Discovering statistically significant pathways in expression profiling studies. Proc Natl Acad Sci USA 102, 13544–13549, doi:10.1073/pnas.0506577102 (2005).
86. Maciejewski, H. Gene set analysis methods: statistical models and methodological differences. Brief Bioinform 15, 504–518, doi:10.1093/bib/bbt002 (2014).

Acknowledgements
We thank Prof. Emma Whitelaw for kindly providing the MommeD6 and MommeD20 mouse strains; Dr. Libor Macůrek, Dr. Pantelis Georgiades and Dr. Heiko Lickert for sharing reagents; Barbora Singerová for technical assistance; the Animal Facility of the Czech Centre for Phenogenomics, Institute for Molecular Genetics, for animal husbandry; and the Microscopy Centre, Institute for Molecular Genetics, for expert technical assistance. This work was supported by the Academy of Sciences of the Czech Republic (RVO 68378050), the Czech Science Foundation (GACR 15-23165S and 17-16959S to K.C.), the Grant Agency of Charles University (GAUK-1000216 to S.B.), the Ministry of Education, Youth and Sports (MEYS) L01419 (NPU I to V.K.), and the following MEYS grants to R.S.: CZ.1.05/1.1.00/02.010 (BIOCEV), CZ.1.05/2.1.00/19.0395 (CCP), LM2011032, LM2015040 and LQ1604 (NPU II). B.C. is supported by a Canada Research Chair in maternal fetal health.

Author Contributions
T.A.E. and K.C. conceived and designed the experimental approach. S.B. performed all the WISH and IF experiments. S.B. and V.G. performed qPCR. S.B., C.P., T.A.E. and K.C. analyzed the data. B.C. and H.S. performed the microarray analysis. V.K. and R.S. provided new analytic tools. S.B., T.A.E. and K.C. wrote the manuscript.
Additional Information
Supplementary information accompanies this paper at doi:10.1038/s41598-017-09490-w

Competing Interests: The authors declare that they have no competing interests.

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

© The Author(s) 2017

Supplementary Information File

The epigenetic modifier Fam208a is required to maintain epiblast cell potency

Shohag Bhargava1,6, Brian Cox5, Christiana Polydorou1, Veronika Gresakova1, Vladimir Korinek3, Hynek Strnad4, Radislav Sedlacek1,2, Trevor Allan Epp1,2,* and Kallayanee Chawengsaksophak1,2,*

1 Laboratory of Transgenic Models of Diseases, Division BIOCEV, Institute of Molecular Genetics of the CAS, v.v.i., Vestec, Czech Republic
2 Czech Centre for Phenogenomics, Division BIOCEV, Institute of Molecular Genetics of the CAS, v.v.i., Vestec, Czech Republic
3 Laboratory of Cell and Developmental Biology, Institute of Molecular Genetics of the CAS, v.v.i., Krc, Czech Republic
4 Laboratory of Genomics and Bioinformatics, Institute of Molecular Genetics of the CAS, v.v.i., Krc, Czech Republic
5 Department of Physiology, Faculty of Medicine, University of Toronto, Ontario, Canada
6 Faculty of Science, Charles University, Prague, Czech Republic

* Authors for correspondence:

Kallayanee Chawengsaksophak, PhD & Trevor Allan Epp, PhD

Institute of Molecular Genetics AS CR, v.v.i. Videnska 1083, 142 20 Prague 4 Czech Republic Tel.: +420 241 063 391 Fax: +420 224 310 955 Email: [email protected], [email protected]

Running title: Fam208a sustains epiblast potency

Key words: Momme, modifiers of murine metastable epialleles; PS, primitive streak; EMT, epithelial mesenchymal transition; gastrulation

Supplementary Figure Legends

Supplementary Fig. 1: Fam208a is widely expressed during early development.

Whole-mount in situ hybridisation indicates widespread Fam208a expression at E5.5-7.5 (A-C), strong in the epiblast and weaker in the ExE. Later, at E8.5-9.5, Fam208a becomes strongly and ubiquitously expressed (D-E). Scale bar: 100 µm. EC, egg cylinder; Pr-S, pre-streak; al, allantois; EHF, early head fold; 8s, 8 somites.

Supplementary Fig. 2: Fam208aD6/D6 mutants exhibit minimal changes in extra-embryonic ectoderm (ExE) marker gene expression.

Whole-mount in situ hybridisation at E6.5 of Fam208aD6/D6 mutants (A’-D’) and their wild-type littermates (A-D) shows minimal expression changes of the ExE markers Elf5 (A’), Spc4 (B’), Bmp4 (C’) and Cdx2 (D’). Note that A-A’ show images from single-colored in situ hybridisation for the simultaneous detection of Elf5 and Brachyury (T). Scale bar: 100 µm. ES, early streak; Pr-S, pre-streak. * indicates BM purple precipitate.

Supplementary Fig. 3: Fam208a mutation leads to delayed AME derivatives at E8.5.

(A) Sanger sequencing confirmation of the Fam208aD6/D6 mutation detected in the mutant embryos by genotyping. Whole-mount in situ hybridisation at E7.5-E8.5 of Fam208a mutants (B, C’, D’ and D’’) and their wild-type littermate controls (C and D). (B) Arrested expression of the primitive streak marker Brachyury in the Fam208aD20/D20 mutant embryos. (C’) Note that the mutants remain developmentally retarded, with complete absence of expression of Shh (node and AME marker). (D’-D’’) Reduced and distal expression of the anterior mesendoderm (AME) marker Foxa2 is seen in both Fam208a allelic mutants (arrowheads), with little phenotypic variability. Scale bar: 100 µm. EPHF, early pre-head fold; LSEB, late streak, early allantoic bud; 8s, 8 somites; al, allantois.

Supplementary Fig. 4: Fam208a mutation leads to significant delay in the formation of node.

(A) Bright-field images of Fam208aD6/D6 mutant embryos at E8.5-9.5 (B and D) and their wild-type littermate controls (A and C). Whole-mount in situ hybridisation at E8.5-9.5 shows markedly delayed and weak, but correctly distal, expression of the node marker Noto. Note that the mutants also have an allantois. * indicates the node-forming region. Scale bar: 100 µm. al, allantois.

Supplementary Fig. 5: Overt developmental defects highlighted in partially rescued Fam208aD6/D6 embryos in a p53-/- background at E9.5.

(A-B) Bright-field images of a mutant Fam208aD6/D6 embryo and its corresponding rescued littermate at E9.5 (same embryo as shown in Fig. 8E). All homozygous Fam208a mutants in a p53-/- background are partially rescued up to E8.5-9.0, with abnormalities in mid-hindbrain closure, a kinky neural tube (arrowheads) and an enlarged pericardium, which can be seen in semi-thin transverse sections stained with Hematoxylin and Eosin (B’-B’’). The boxed regions at the bottom left and right are shown at 3-fold magnification. The dotted line indicates the plane of the transverse section. Arrowheads highlight developmental defects. Scale bar: 100 µm.

Supplementary Fig. 6: Altered expression of epiblast specification markers in Fam208aD6/D6 embryos at E6.5

Whole-mount in situ hybridisation at E6.5 of Fam208aD6/D6 mutants (A’-C’) and their wild-type littermates (A-C). The Fam208aD6/D6 mutant embryos show absent to faint Foxa2 expression (A’) and robust bilateral expression of Otx2 (B’). There is no discernible change in Cripto expression in the mutant embryo (C’). Scale bar: 100 µm.

Supplementary Table S1. Selected functional gene set enrichment statistics using sigPathways.

Homozygotes (Fam208aD6/D6)

Pathway_Name       Size   tstat.mean   Comp.Probability   NEk     NEk.p.val   NEk.q.val
p53/p73 (a)        23     4.505        <0.001             5.78    7.69E-09    <0.001
Platr (d)          20     6.389        <0.001             3.09    2.00E-03    0.003
KRAB (b)           132    0.479        0.001              1.71    8.70E-02    0.071
Oct4coexpress (c)  423    2.390        0.005              2.75    6.00E-03    0.006

Heterozygotes (Fam208a+/D6)

Pathway_Name       Size   tstat.mean   Comp.Probability   NEk     NEk.p.val   NEk.q.val
p53/p73 (a)        23     0.717        0.005              2.18    2.90E-02    0.094
Platr (d)          20     0.590        0.044              1.36    1.74E-01    0.347
Oct4coexpress (c)  423    0.138        0.242              0.60    5.51E-01    0.551
KRAB (b)           132    -0.009       0.957              -0.16   8.70E-01    0.938

(a) The p53-bound (50 kb from TSS) subset of genes significantly upregulated in response to combined p53/p73 depletion in mouse embryoid bodies (Wang et al, 2016).
(b) Gene set defined according to InterPro ID (IPR001909: Krueppel-associated box).
(c) Pluripotency-associated Oct4 co-expression module (Bergmann et al, 2015).
(d) Pluripotency-associated long non-coding transcripts (Bergmann et al, 2015).

Supplementary Table S2. qPCR primers

Oligo Name    Sequence
mCdkn1a-F     AACATCTCAGGGCCGAAA
mCdkn1a-R     TGCGCTTGGAGTGATAGAAA
mCcng1-F      TGGACAGATTCTTGTCTAAAATGAAG
mCcng1-R      CAGTGGGACATTCCTTTCCTC
mDkk1-F       CCGGGAACTACTGCAAAAAT
mDkk1-R       CCAAGGTTTTCAATGATGCTT
mWnt3-F       GATGTGGAGGCAGGTCTCTT
mWnt3-R       CAGAGCAGCCCATTCTTTCT
mEomes-F      AGCAGCCCAGAGGGTTAAA
mEomes-R      TGAAGAGCCCACTGTTAACTCA
mOct4-F       AATGCCGTGAAGTTGGAGAA
mOct4-R       CCTTCTGCAGGGCTTTCAT
mPlatr4-F     TGTGAGAATCAGGGAAAGTGG
mPlatr4-R     TGAGTGCTGAGTTGCAGGTT
mPlatr20-F    CGGGAAAGCAGAGTGCTG
mPlatr20-R    TTGCCTTGTTTTTCAAATAGTACCT
mPlatr27-F    GACTCAGCTGGGTTCCAGAG
mPlatr27-R    CTGGCTCTTCAAGTCTTCTGC
mGAPDH        GGGTTCCTATAAATACGGACTGC
mGAPDH        CCATTTTGTCTACGGGACGA

All real-time PCR primer pairs were designed with a Tm of approximately 60 °C using the Universal Probe Library Tool on the Roche website (https://lifescience.roche.com/en_cz/brands/universal-probe-library.html#assay-design-center).

Supplementary Table S3. List of primary and secondary antibodies

Primary antibodies

Antibody                            Cat. No.                   Dilution
anti-Ki67                           Abcam #ab15580             1:250
anti-phospho-Histone H3 (Ser28)     Cell Signaling #9713       1:250
anti-Cleaved Caspase3               Cell Signaling #9664       1:250
anti-p53                            Novocastra #NCL-p53-CM5p   1:250
anti-Oct4                           Santa Cruz #sc-5279        1:100
anti-Snail                          Cell Signaling #C15D3      1:250
anti-E-Cadherin                     Sigma Aldrich #U3254       1:500

Secondary antibodies

Antibody                                       Cat. No.                Dilution
Alexa Fluor 488-conjugated goat anti-rabbit    Invitrogen #A-11034     1:500
Alexa Fluor 594-conjugated goat anti-rabbit    Invitrogen #A-11012     1:500
Alexa Fluor 594-conjugated goat anti-rat       Invitrogen #A-11007     1:500
Alexa Fluor 488-conjugated goat anti-mouse     Jackson #715-545-150    1:500

Veronika Grešáková

Paths of mutagenesis

The long-term survival of animal species depends on their ability to change their own genetic information and so, under the influence of natural selection, to adapt better to changing living conditions. Spontaneous changes (mutations) can benefit their carrier, but in most cases they are disadvantageous or, from the carrier's point of view, neutral. Moreover, the same mutation can be advantageous in some situations and disadvantageous in others. It is on the basis of the overall effect of mutations on their carriers that natural selection ultimately picks the gene variants that are more beneficial at a given moment and allows them to spread through the population. Natural biological evolution is therefore by its very nature closely tied to natural selection and depends entirely on natural mutants. The deliberate creation of mutant forms of organisms by humans, by contrast, is one of the basic pillars of genetic research. The boom in methods for editing the genetic code has shaped the overall direction of the biological disciplines. We use mutagenesis, from physical through chemical to the most recent biological methods, in an effort to understand the molecular basis of various phenomena. We hope that this understanding will bring a better future in the form of effective treatment of genetically determined diseases. In the following lines I will try to outline the basic principles of mutagenesis and both the original and the newer mechanisms that can now be used.

We live in the (post)genomic era, in which we have at our disposal an enormous and constantly expanding body of data and information on the whole genomes (that is, all the genetic information stored in the DNA of a given organism) of an ever greater number of organisms. The reading of all human DNA (the human genome) was completed in 2003 (for details see Živa 2016, 5: 203–206), and since then thousands of other eukaryotic genomes have been read. The most interesting and most important part of the genome is, understandably, the genes, even though they make up only a minority of the genetic information. For the overwhelming majority of genes and the proteins derived from them, however, we have no idea of their function. The effort to obtain this information led to the development of a new branch of genetics, functional genomics, whose main goal is to find genes and determine their function. It works at several levels. On the basis of sequence similarity, together with the identification of known motifs (so-called domains) and the possibility of predicting the final structure of the protein, we can use computer simulation to propose the likely function of a gene. Such an approach is called in silico, meaning "calculated by computer or found by computer simulation". These methods gave rise to comparative genomics, which compares the genomes (and genes) of different organisms. It again operates at several levels. It can, for example, compare genome sequences themselves or short stretches of genes transcribed into RNA (so-called transcripts; Expressed Sequence Tag, EST databases). Unfortunately, in many cases these computer simulations or methods based on comparison with already known genes fail, because the similarity of the unknown gene to those already described is so small that not even computer simulation yields results. There is then nothing left but to study organisms in which the genes we have selected have been deliberately altered, that is, mutated.

In this case we can study gene function in multicellular organisms in vitro, for example through cell lines that carry a mutation in their genome and whose behaviour we observe. Alternatively, we may be interested in the structure and composition of the protein itself, which we can prepare in various modifications (mutations) in living cells and then extract and use in further experiments. Another option in functional genomics is in vivo study, that is, work with a living organism (vertebrates, insects, plants, unicellular organisms and so on) in which the gene of interest has been changed. We can currently alter these genes in several ways: switch them off completely, or change only very short stretches to find out how important they are for the whole protein.

There are two basic approaches to studying protein function in vivo. The first, so-called forward genetics, uses natural or artificially created mutants that are recognised by their outward manifestation, the phenotype (coat colour, tail length, skull shape and the like). It then tries to answer questions such as how a given trait is inherited, whether the phenotype arose from one or several mutations, and what the phenotype of the offspring of two different mutants would be. This approach is among the older ones; before the development of molecular methods it was most important to obtain a clearly "visible" manifestation (a particular phenotype) in the organism under study and to study that, rather than to identify the genes responsible for it. The method was based on creating a random mutation, which can occur anywhere in the cell's genome and is therefore not targeted in any way.

Today all the genes of the model organisms are known (read), and scientists now aim to describe the function of each individual gene. This is why the opposite approach arose, which proceeds from the gene to its manifestation (phenotype). With the development of new methods of genetic research that can analyse the function of a gene through targeted changes (mutations) in it, a new approach appeared: reverse genetics, in which we determine the biological function of a gene by knocking it out or, conversely, adding it to the organism, and by activating or deactivating it in a targeted way as required. In reverse genetics the starting point is therefore the gene. The main aim is to modify this gene or its expression (the path to its translation into protein molecules) and then to describe the phenotypic consequences of the modification in detail. A key requirement is not to impair the reproductive ability of the model organism, because otherwise the chance of obtaining a homozygous offspring, that is, an individual with the mutation in both the paternal and the maternal allele of the gene, would be almost zero. Whole-genome sequencing has revealed a large number of genes whose function we do not know and cannot easily predict. This is why high-throughput reverse genetics has become crucial in the postgenomic era.

Both strategies, based on the study of the phenotype and of the genes, are used in the effort to determine gene function. The forward genetic approach, however, can define the role of genes even when we have no idea of their possible function, so we examine it without "prejudices and assumptions". Many functional genomics procedures rely on the creation of null mutants, in which production of the functional protein is completely abolished. Reverse genetics methods can be divided into two groups. The first comprises direct mutagenesis using chemicals or the insertion of DNA fragments (transposons) into a gene, whereas the second uses methods that affect gene function (for example by reducing production of the protein that the gene encodes, or by eliminating it entirely). Mutagenesis underlies both approaches (Fig. 1).

Fig. 1: Types of mutagenesis with examples of individual mutagens, that is, agents that cause mutations at the DNA level. Diagram: mutagenesis is divided into physical (radiation: UV, X-rays, gamma; particles: alpha, beta, neutrons), chemical (alkylating agents: EMS, ENU; base analogues: 5-BU, 2-AP) and biological (insertional: transposon, virus; programmable nucleases: ZFN, TALEN, CRISPR-Cas9; RNAi). The primary division is by type of mutagen; the next level distinguishes the principle by which mutations are induced. Physical mutagenesis uses direct radiation, most often gamma or ultraviolet, or irradiation with particles such as neutrons or alpha and beta particles. Chemical mutagens create mutations in several ways; the key step is a change in base pairing, introduced either by the transfer of alkyl groups to DNA or through analogues of the original nucleotides. Biological mutagens manipulate DNA in their own characteristic ways: they incorporate various nucleic acid sequences into the genome by insertion, or they programme and regulate protein expression at all levels (both DNA and RNA). EMS, ethyl methanesulfonate; ENU, ethylnitrosourea; 5-BU, 5-bromouracil; 2-AP, 2-aminopurine; ZFN, Zinc Finger Nucleases; TALEN, TALE nucleases (Transcription Activator-Like Effector Nucleases); CRISPR, Clustered Regularly Interspaced Short Palindromic Repeats; RNAi, RNA interference. See the text for details.

Chemical mutagenesis

Two groups of chemicals are mainly used to create a collection of mutant organisms: alkylating agents, and analogues of nucleotide bases, which are incorporated into the newly synthesised DNA strand during replication. Examples of the latter are 5-bromouracil (5-BU) and 2-aminopurine (2-AP). The alkylating agents ethyl methanesulfonate (EMS) and ethylnitrosourea (ENU) are most commonly used to create random point mutations in the genome.

Ethyl methanesulfonate alkylates guanine bases (guanine is one of the four nucleotide bases that make up the DNA molecule), which causes guanine to pair with thymine instead of cytosine and so ultimately changes the genetic information at that site.

Chemical mutagenesis is a tool of both reverse and forward genetics. Because it is random, however, it is better suited to forward genetics, which examines the altered phenotype after the mutation has been induced and only then determines which mutated genes are responsible for it. In reverse genetics it has its limitations. Some regions of the genome are more prone to mutation, and if the target gene lies in a region with a low mutation frequency, chemical mutagenesis may fail to alter it. In general, such mutagenesis requires a high dose of mutagen, which always leads to multiple mutations outside the target genes as well, so the unwanted mutations must then be removed by repeated crossing with the original non-mutated individuals (so-called outcrossing, Hardy et al. 2010). For example, a large collection of mutants of the model plant thale cress (Arabidopsis thaliana), comprising as many as 3,712 lines, was created recently, and it was found that there is on average one mutation per 89 thousand nucleotides. Unfortunately, this means that each mutant line contains several hundred mutations. In practice it is quite impossible to eliminate that many by successive crossing with a non-mutated individual so that a given line carries a single mutation that could then be studied and analysed (Martin et al. 2009).

Insertional mutagenesis

The idea behind this method is the random incorporation (insertion) of a foreign stretch of DNA, such as a transposon or a retrovirus, into the genome (Fig. 2; see also the article on p. XLVII and the text below). Such an insertion into a region of DNA that encodes a gene can have a mutagenic effect and alter the function of the gene. Unfortunately, with this method we run the risk that the loss of function of the gene will be only partial. This can result from insertion either into the promoter region of the gene (the part of the DNA needed to start its transcription) or into regions where, after insertion and subsequent transcription of the gene, a truncated protein is produced that can still perform its original function at least in part. Insertional mutagenesis can also give rise to non-specific mutations (as in chemical mutagenesis), but the total number of changes in the DNA is incomparably lower (Hardy et al. 2010).

Transposons are sometimes also called "jumping genes", a name they owe to their ability to move within the genome. With mutagenesis by these elements, the original state of the gene can be restored using the appropriate transposase, the enzyme responsible for relocating the transposon, that is, for excising it from the target gene and inserting it at a new site in the genome.

Transposons have been widely used to create extensive mutant collections in various model organisms. Many mutant lines of the nematode Caenorhabditis elegans were obtained using the transposon Tc1, which is present in many copies in all strains of the nematode. Later, researchers switched to the transposon Mos1, which has only a single copy. In the zebrafish (Danio rerio), both methods of insertional mutagenesis are used, transposons and retroviruses. Research on the fruit fly Drosophila melanogaster has also relied on insertional mutagenesis. More than 6 thousand gene mutations were obtained using mobile transposons specific to the fruit fly, the so-called P elements. Many further lines came from the use of the piggyBac transposon, which has the exceptional ability to move a desired sequence from a vector into a chromosome, and not only within a chromosome as most other transposons do. The largest collection of insertional mutants exists for the house mouse (Mus musculus) and comprises 100 thousand independently mutated embryonic stem cells. These pluripotent cells, which can differentiate into any cell type, are introduced into a host embryo during its development (for details see, for example, Živa 2016, 4: 150–154 and 1: 7–9). In the next stage the mutated cells can contribute to the formation of any organ. The result is a so-called chimera, an organism made up of both original and mutant cells (Fig. 3).

Fig. 2: How a transposon works and how it is inserted into host DNA. Diagram labels: original DNA, transposon, transposase, transposon-transposase complex, target DNA. The transposon itself encodes the gene for transposase, the key enzyme responsible for excising the transposon and inserting it into the target sequence. Transposase detects the binding sites at the edges of the transposon sequence, cuts them and then "attaches" to the transposon. This complex moves to the target site, where transposase cuts the sequence and inserts the transposon DNA into it.

Fig. 3: Chimera (detail) by the Italian Renaissance painter Jacopo Ligozzi (1547–1627). Taken from Wikimedia Commons, in accordance with the terms of use.

Programmable nucleases

Programmable endonucleases are enzymes that can recognise specific sequences in the genome and then cut both strands of the DNA at that site (Fig. 4). Breaks in DNA increase the efficiency of homologous recombination (HDR, Homology Directed Repair) and trigger non-homologous end joining (NHEJ; see also p. 70 of this issue of Živa), which results in mutagenesis. Homologous recombination requires the presence of a homologous (that is, identical in sequence) unbroken DNA strand for the repair, so if we supply the cell with homologous DNA that we have modified, the cell will use it for the repair. If the cell cannot detect such a homologous sequence, it falls back on NHEJ. Non-homologous end joining needs no homologous template and often creates INDEL mutations (an insertion or deletion of several base pairs at the edges of the break). Put simply, the double-strand break in the DNA is smoothed over and rejoined, but with a certain error (a mutation). Homologous recombination takes place mainly in the S or G2 phase of interphase of the cell cycle, when the cell is not dividing (DNA is synthesised in S phase, and in G2 the cell prepares for division). NHEJ can occur throughout the cycle. Technologies based on this principle are developing rapidly. In the past only zinc finger nucleases (ZFN) were used; other types of programmable endonucleases appeared later. TALE nucleases (Transcription Activator-Like Effector Nucleases, TALENs) entered research in 2011, and the last to join them were RGENs (RNA-Guided Engineered Nucleases).

All of the nucleases mentioned above work by the same mechanism: they cleave chromosomal DNA at specific sites, which triggers the system for repairing cleaved DNA and so causes a gene modification. ZFNs have two domains; one binds DNA through zinc fingers and the other contains a nuclease domain derived from the Fok1 restriction (cutting) enzyme. Full function requires two molecules of this nuclease, which can then cleave DNA together. The disadvantage of ZFNs lies in the limited length of the inserted sequence, which in conditional knockout constructs can be thousands of bases long (see below).

TALENs have a structure similar to that of ZFNs. At one end they carry a Fok1 nuclease domain, but for DNA binding they use a different domain, the Transcription Activator-Like Effector (TALE). Each TALE domain recognises one base in the major groove of DNA. These nucleases can therefore be used on any target sequence (Hyongbum and Jin-Soo 2014).

Another option is to use RNA for mutagenesis. Single-stranded ribonucleic acid (ssRNA) is produced by transcription of one of the two strands of the DNA double helix (double-stranded DNA, dsDNA) and then serves as a template (messenger RNA, mRNA) for protein synthesis. The first method to target this intermediate step of protein synthesis is RNA interference (RNAi). Its basic principle is to prevent the translation of mRNA into protein. The method is based on introducing into the cytoplasm of cells a specific double-stranded RNA (dsRNA) whose sequence corresponds to the target ssRNA of the gene we want to switch off. Once the dsRNA has been introduced, natural defence mechanisms are triggered in the cells, because double-stranded RNA is not normally present there. These mechanisms cleave the foreign dsRNA molecule into shorter stretches of single-stranded RNA, which then seek out the compatible natural ssRNA. This creates an unnatural dsRNA, which is in turn eliminated by the defence mechanism. The process gradually removes all the ssRNA for the given gene, and the corresponding protein is not produced. A major complication of RNAi is the delivery of the foreign RNA molecule into cells (more in the following article by A. Morávková on p. XLVII).

The use of RNAi to silence selected genes has also opened the door to other techniques. Among the most recent and exceptionally successful is the method known for short as CRISPR-Cas9 (Clustered Regularly Interspaced Short Palindromic Repeats). This method uses RNA to alter the cell's DNA itself efficiently. With the help of a guide RNA (gRNA) it finds a specific stretch of the DNA double helix. The Cas9 complex is activated and cuts the original DNA, triggering repair mechanisms that, through either homologous recombination or non-homologous end joining, result in a mutation at the DNA level (Dominguez et al. 2016). Because this is a prokaryotic "immune" system, we must always supply not only the gRNA with its "locator" but also plasmids of bacterial origin (circular DNA) carrying the sequence for the Cas9 enzyme.

Thanks to the high efficiency and success of CRISPR systems, work has also begun on controllable elements of mutagenesis, and it has proved possible to insert into the genome a sequence that acts as a switch, making it possible to regulate when the target gene is turned off. Such mutants are called conditional and are often used to study genes that are essential for embryonic development. If these genes were already damaged in the gametes or the zygote, they would cause the death of the embryo and the mutants would never be born.

Fig. 4: Programmable nucleases and a comparison of how they work. Diagram labels: zinc finger nucleases (ZFN) with Fok1; TALE nucleases (TALEN) with Fok1; RGE nucleases (RGEN) with crisprRNA, transactivating RNA (tracrRNA), guide RNA and Cas9; non-homologous end joining (NHEJ): cleaved segments, joining by endonucleases, DNA with a mutation; homologous recombination (HDR): cleaved segments, replacement of the damaged site by the most similar DNA molecule, repaired DNA. Each nuclease has its own way of finding the target sequence in the DNA strand, but all of them then cut the double helix and trigger repair processes. Non-homologous end joining is simple: once the cleaved segments have been identified, their ends are trimmed and rejoined, which creates a mutation. If homologous recombination (HDR) occurs, the repair mechanisms search for the most similar DNA molecule and use it to replace the original, damaged one. This is why we introduce DNA that is as similar as possible but carries an artificially modified sequence, so that it is selected as the most suitable candidate for the repair. All figures drawn by V. Grešáková after various sources.

In conclusion

The study of the functions of unknown genes is a challenge for the whole scientific community. The procedure should be chosen with regard to the organism and the question being addressed. In reverse genetics we generally try to create viable mutants that then reproduce and can be studied further. It is essential to minimise so-called off-targets, that is, mutations in genes other than those intended. In forward genetics it sometimes happens that a mutation is embryonically lethal, so that all homozygous mutants die during intrauterine development. In such cases research must be partly restricted to work with stem cells or embryos, or to the use of reverse procedures for creating inducible mutations. It is also possible to work with RNA interference and to observe whether the same gene remains equally important after birth and in the fully developed organism. Moreover, a phenotype can often be the result of several mutations in different genes. It is therefore very important to verify the effect of mutagenesis at the whole-genome level.

Modern methods of forward and reverse genetics open up new possibilities for investigating the functions of genes about which we so far have almost no information. The continual development of new and powerful techniques makes it possible to test each sample comprehensively and in much greater detail, both at the global level (ChIP, microarray, RNAseq) and at the level of the building block of DNA, the nucleotide (methylation, splice variants, mutations). Let us hope that it is only a matter of time before we know the functions of all genes, which will give us a detailed view of how the molecular basis of the organism works and will allow us to treat or alleviate the manifestations of various genetically determined diseases more effectively (for details see, for example, the article on p. 70 of this issue of Živa).

The cited literature is listed on the Živa website, where the figures used in this text can also be found in their original colour versions.

Živa 2/2017, pp. XLIV–XLVI

© Nakladatelství Academia, SSČ AV ČR, v. v. i., 2017. Reprinting of articles, including figures, is expressly prohibited. All rights reserved, including the right of reproduction.

© Nakladatelství Academia, SSČ AV ČR, v. v. i., 2017. Reprinting of articles, including figures, is expressly prohibited. All rights reserved, including the right of reproduction.

Elsevier Editorial System(tm) for Experimental Cell Research
Manuscript Draft

Manuscript Number:

Title: Dual role of Fam208a in maintenance of genome stability in mammals

Article Type: Full Length Article

Keywords: Genome stability; Fam208a; multipolar spindle apparatus; HUSH

Corresponding Author: Dr. Radislav Sedlacek,

Corresponding Author's Institution: Institute of Molecular Genetics of the ASCR, v. v. i.

First Author: Veronika Gresakova

Order of Authors: Veronika Gresakova; Vendula Novosadova; Michaela Prochazkova; Shohag Bhargava; Irena Jenickova; Jan Prochazka; Radislav Sedlacek

Abstract: Maintenance of genome stability is essential for every living cell, as genetic information is repeatedly challenged during DNA replication in each cell division. Errors, defects, and delays that arise during mitosis or meiosis activate DNA repair processes and, if these fail, programmed cell death (apoptosis) can be initiated. Fam208a is a protein whose importance in heterochromatin maintenance has been described recently. In this work, we describe the crucial role of Fam208a in sustaining genome stability during cell division. Targeted depletion of Fam208a in mice using CRISPR/Cas9 leads to embryonic lethality before E12.5. We also used an siRNA approach to downregulate Fam208a in zygotes, in order to avoid the influence of maternal RNA in the early stages of development. This early downregulation increased the frequency of developmental arrest at the two-cell stage and the occurrence of multipolar spindles. To investigate this further, we used the yeast two-hybrid (Y2H) system and identified new putative interaction partners: Gpsm2, Amn1, Eml1, Svil, and Itgb3bp. Their co-expression with Fam208a was assessed by qRT-PCR profiling and in situ hybridization [1] in multiple murine tissues. Based on these results, we propose that Fam208a functions within the HUSH complex through interaction with Mphosph8, as these proteins not only interact physically but also co-localise. We provide new evidence that Fam208a is a multi-interacting protein affecting genome stability at the level of cell division in the earliest stages of development, and also through interaction with the methylation complex in adult tissues. In addition to its epigenetic functions, Fam208a appears to have an additional role in zygotic division, possibly via interaction with the newly identified putative partners Gpsm2, Amn1, Eml1, Svil, and Itgb3bp.

Research Data Related to this Submission
Title: Data for: Dual role of Fam208a in maintenance of genome stability in mammals
Repository: Mendeley Data
https://data.mendeley.com/datasets/94h3kn2rnx/draft?a=536c481e-4a84-461b-a26e-f6add57de478

Cover Letter

Prague, April 02, 2019

Dear Editor

I am enclosing our manuscript by Veronika Gresakova et al., entitled ‘Dual role of Fam208a in maintenance of genome stability in mammals’, to be considered for publication in Cell Division as a research article. Our work focuses on the function and interaction network of Fam208a, whose involvement in heterochromatin maintenance has been described recently. Besides the previously published function of Fam208a in epigenetic silencing as part of the HUSH complex, our study shows that Fam208a plays an important role in early embryonic development by orchestrating the assembly of the spindle poles and the spindle apparatus. Knocking down endogenous Fam208a by RNA interference in murine zygotes leads to the formation of multipolar spindles and an increased proportion of arrested or incorrectly developed embryos, suggesting that the early embryonic lethal phenotype is associated with improper regulation of cell division. These findings suggest a novel role of Fam208a during the earliest events of embryonic development, in contrast to adult somatic cells, where the major role of Fam208a has been associated mostly with its function within the HUSH complex. To elucidate the molecular role of Fam208a in the formation of multipolar spindles and the arrest of embryonic development, we used the yeast two-hybrid (Y2H) system to identify interacting partners. Interestingly, a novel group of putative interaction partners was identified, namely Gpsm2, Amn1, Eml1, Svil, and Itgb3bp, all known to play a role during cell division and cleavage spindle assembly. Their co-expression with Fam208a was profiled using in situ hybridization and RT-qPCR in multiple murine tissues. These results confirmed that Fam208a function is not only associated with the HUSH complex at the level of DNA methylation but is also involved in regulatory networks driving successful cell division.

Altogether, our work provides important insight into the molecular landscape of Fam208a as a multi-interacting protein that affects genome stability and also plays an essential role in the initial steps of embryonic development by orchestrating spindle pole assembly and chromosome segregation. This manuscript has not been submitted to other journals and all authors have approved the manuscript. To the best of our knowledge, no conflict of interest, financial or other, exists. Thank you for receiving our manuscript and considering it for review. On behalf of all authors, I look forward to receiving your response.

Yours sincerely,

Radislav Sedlacek
Laboratory of Transgenic Models of Diseases and the Czech Centre for Phenogenomics, Institute of Molecular Genetics of the Czech Academy of Sciences, Prumyslova 595, 252 50 Vestec, Czech Republic
E-mail: [email protected] | Phone: (+420) 325 873 243-2

Highlights:

• Homozygous Fam208a knockout is embryonically lethal;

• Downregulation of Fam208a in zygotes increased multipolar spindle formation;

• Fam208a ablation alters expression profile in embryos and cells;

• Establishment of new putative protein complex driving spindle assembly.

Dual role of Fam208a in maintenance of genome stability in mammals

Veronika Gresakova1,3, Vendula Novosadova2, Michaela Prochazkova2, Shohag Bhargava1, Irena Jenickova2, Jan Prochazka1,2 and Radislav Sedlacek*,1,2

1 Laboratory of Transgenic Models of Diseases, Institute of Molecular Genetics of the ASCR, v.v.i., Průmyslova 595, 252 50 Vestec, Czech Republic
2 Czech Centre for Phenogenomics, Institute of Molecular Genetics of the ASCR, v.v.i., Průmyslova 595, 252 50 Vestec, Czech Republic
3 Palacky University in Olomouc, Faculty of Medicine and Dentistry, Hněvotínská 3, 775 15 Olomouc, Czech Republic

Email addresses:
[email protected] ; [email protected] ; [email protected] ; [email protected] ; [email protected] ; [email protected] ; [email protected]

*Corresponding author: Radislav Sedlacek
[email protected]
(+420) 325 873 246
ORCID: 0000-0002-3352-392X

Co-corresponding author: Jan Prochazka
jan.prochazka@img.cas.cz
(+420) 325 873 259
ORCID: 0000-0003-4675-8995

Abstract

Maintenance of genome stability is essential for every living cell, as genetic information is repeatedly challenged during DNA replication in each cell division. Errors, defects, and delays that arise during mitosis or meiosis activate DNA repair processes and, if these fail, programmed cell death (apoptosis) can be initiated. Fam208a is a protein whose importance in heterochromatin maintenance has been described recently. In this work, we describe the crucial role of Fam208a in sustaining genome stability during cell division.

Targeted depletion of Fam208a in mice using CRISPR/Cas9 leads to embryonic lethality before E12.5. We also used an siRNA approach to downregulate Fam208a in zygotes, in order to avoid the influence of maternal RNA in the early stages of development. This early downregulation increased the frequency of developmental arrest at the two-cell stage and the occurrence of multipolar spindles. To investigate this further, we used the yeast two-hybrid (Y2H) system and identified new putative interaction partners: Gpsm2, Amn1, Eml1, Svil, and Itgb3bp. Their co-expression with Fam208a was assessed by qRT-PCR profiling and in situ hybridization [1] in multiple murine tissues. Based on these results, we propose that Fam208a functions within the HUSH complex through interaction with Mphosph8, as these proteins not only interact physically but also co-localise. We provide new evidence that Fam208a is a multi-interacting protein affecting genome stability at the level of cell division in the earliest stages of development, and also through interaction with the methylation complex in adult tissues.

In addition to its epigenetic functions, Fam208a appears to have an additional role in zygotic division, possibly via interaction with the newly identified putative partners Gpsm2, Amn1, Eml1, Svil, and Itgb3bp.

Keywords: Genome stability, Fam208a, multipolar spindle apparatus, HUSH

Introduction

Genome stability can be impaired by various processes, including erroneous DNA (deoxyribonucleic acid) replication or unequal sister chromatid segregation [2]. Moreover, natural decay and exogenous genotoxic agents such as ultraviolet light, oxidative stress, chemical mutagens, and radiation constantly affect the stability of the DNA. To compensate for this instability, multiple mechanisms have evolved that prevent the accumulation of these changes. The DNA repair system is a complex machinery involving a system of checkpoints, homologous recombination or non-homologous end-joining, posttranscriptional RNA (ribonucleic acid) modifications (m6A, alternative splicing) and posttranslational modifications of proteins (histone methylation, phosphorylation etc.) [3, 4]. Methylation processes are also involved in maintaining genome stability and are driven by several signalling pathways, one of which includes the HUSH (human silencing hub) complex with FAM208a at its core [5].

Fam208a is a large protein that was originally designated RAP140a and described as an interaction partner of the human partial retinoblastoma protein, Rb [6]. The first suggestions about its role were linked to the intracellular translocation of Rb and a general involvement in cell-cycle control, gene expression, and tumorigenesis. Subsequently, Fam208a was identified as a direct transcriptional target of Oct4, which is critical for cell pluripotency, differentiation activation, and gene repression [7].

In 2013, Fam208a was described as a potential suppressor of variegation in an ENU (N-ethyl-N-nitrosourea) mutagenesis screen using an integrated multi-copy green fluorescent protein (GFP) transgene under the control of the haemoglobin promoter [8]. Two independent lines with induced mutations in Fam208a, MommeD6 (L130P) and MommeD20 (introduction of a stop codon in the intron 1 region), were identified and described as embryonically lethal [9]. Studies on Fam208a-L130P suggested its involvement in early embryonic development [10].

A forward genetic screen in near-haploid KBM7 cells identified a complex of four proteins, FAM208a, MPHOSPH8, SETDB1, and PPHLN1, designated the HUSH (human silencing hub) complex [11]. The HUSH complex was shown to be involved in silencing integrated retroviruses as well as endogenous regions by recruiting the methyltransferase SETDB1 to H3K9 methylation sites. Recently, it was shown that FAM208a within the HUSH complex binds to sequences of endogenous retroviruses (ERVs) and long interspersed nuclear elements (LINE-1s/L1s). Moreover, the HUSH complex was shown to act together with TRIM28 in the co-repression of young retrotransposons and new genes promoted by noncoding DNA silencing [12].

Our study shows that Fam208a, known to be part of the HUSH complex, can also play an important role in spindle pole assembly and sister chromatid segregation in the initial steps of embryonic development. We identified several new putative interaction partners of Fam208a, namely SVIL, ITGB3BP, GPSM2, EML1 and AMN1, which are known to play a role in cell division and mitotic spindle assembly. Moreover, knocking down endogenous Fam208a by RNA interference in murine zygotes leads to the formation of multipolar spindles and an increased proportion of arrested or incorrectly developed embryos, suggesting that the early embryonic lethal phenotype might be associated with improper regulation of cell division resulting in chromosomal aberrations. These findings suggest a new role of Fam208a in the very early events of embryonic development, in contrast to adult somatic cells, where its major function is associated mostly with the HUSH complex.

Material and methods

Fam208a KO murine strain

The knockout (KO) mouse for Fam208a (first exon deleted, see Suppl. Table 2) was generated with CRISPR/Cas9 technology. The active RNA for the CRISPR/Cas9 complex was microinjected into C57BL/6NCrl zygotes with two pronuclei using a Leica micromanipulator equipped with an Eppendorf FemtoJet. The microinjected zygotes were transferred into ICR recipients on the same day. Mutant mice were screened by polymerase chain reaction (PCR). The DreamTaq (ThermoFisher, K1081) genotyping PCR was run on DNA prepared from tail tips (QuickExtract solution, Lucigen, QE09050) using the F, R1 and R2 primers (for sequences see Suppl. Table 2) under the following conditions: 95°C for 5 minutes; 40 cycles of denaturation at 95°C for 30 sec, annealing at 64°C for 40 sec, and extension at 72°C for 30 sec; and a final extension at 72°C for 5 minutes. PCR products were separated in 1% agarose gels. The animals used in the study came from at least 3 back-crosses of heterozygotes with wild-type C57BL/6NCrl mice.
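The genotyping program above can be written down as a small data structure, which makes it easy to check the programmed run time. The following is a minimal sketch, not part of the original protocol: the variable and function names are illustrative, and ramping time between temperatures is ignored.

```python
# Genotyping PCR program as stated in the text: (temperature in °C, hold time in s).
INITIAL_DENATURATION = (95, 5 * 60)
CYCLE_STEPS = [(95, 30), (64, 40), (72, 30)]  # denaturation, annealing, extension
N_CYCLES = 40
FINAL_EXTENSION = (72, 5 * 60)


def total_hold_time_s(initial, cycle_steps, n_cycles, final):
    """Sum the programmed hold times; ramping between steps is not included."""
    per_cycle = sum(seconds for _temp_c, seconds in cycle_steps)
    return initial[1] + n_cycles * per_cycle + final[1]


total_s = total_hold_time_s(INITIAL_DENATURATION, CYCLE_STEPS, N_CYCLES, FINAL_EXTENSION)
print(f"Programmed hold time: {total_s} s ({total_s / 60:.1f} min)")
```

With the values given in the text, the hold times add up to 4600 s, i.e. roughly 77 minutes before ramping is taken into account.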

Mutant Hek293t cell lines

CRISPR/Cas9 targeting sequences for both genes, FAM208a and MPHOSPH8, were cloned into the pX330 Venus vector (Table S2). All constructs were verified by sequencing. Hek293t cells were incubated with a transfection mixture consisting of 30 µg of vector DNA, 90 µl of X-tremeGENE HP DNA Transfection Reagent (Roche, 06 366 236 001) and DMEM (D6429). After 24 hours of incubation, GFP-positive cells were sorted by FACS and 10000 cells were plated for further cultivation. After a further 48 hours, GFP-negative cells were sorted and plated as single cells. The sorted cells were cultured until single-cell colonies formed, which were then analysed by PCR followed by sequencing. Finally, 6 stable lines were selected for further experiments: HekMT (no mutation); Fam-a1 (-39bp, exon 4/11); Fam-a2 (-34bp, exon 4/11); Mpp8-a (-19bp, exon 8) and Mpp8-b (-46bp, exon 8). Both transcript variants of FAM208a should be affected: exon 4 in FAM208a-202 and exon 11 in FAM208a-209. In MPHOSPH8, the CRISPR guides targeted the ankyrin-rich domain, which the Y2H system identified as the domain interacting with Fam208a, but the mutations led to complete protein ablation.

E9.5 LC-MS analysis

We collected 4 littermate embryos at stage E9.5 and separated them from the yolk sac and maternal residues. Two wild-type, one heterozygous and one homozygous embryo were lysed in 100 mM TEAB containing 2% SDC and boiled at 95°C for 5 minutes. Protein concentration was determined using the BCA protein assay kit (Thermo), and 20 µg of protein per sample was used for MS sample preparation in technical triplicates. Cysteines were reduced with TCEP at a final concentration of 5 mM (60°C for 60 minutes) and blocked with MMTS at a final concentration of 10 mM (10 minutes at room temperature). Samples were digested with trypsin (trypsin/protein ratio 1/20) at 37°C overnight. After digestion, samples were acidified with TFA to a final concentration of 1%. SDC was removed by extraction into ethyl acetate [1, 11] and peptides were desalted on a Michrom C18 column (Michrom BioResources Inc.).

Hek293t mutant cell LC-MS analysis

Mutant Hek293t cells were grown in 6-well plates to confluence. Cell pellets were lysed in 100 mM TEAB containing 2% SDC and boiled at 95°C for 5 minutes; the subsequent steps were identical to those described above (E9.5 LC-MS analysis).

nLC-MS2 analysis

Nano reversed-phase columns (EASY-Spray column, 50 cm x 75 µm ID, PepMap C18, 2 µm particles, 100 Å pore size) were used for LC/MS analysis. Mobile phase buffer A was composed of water, 2% acetonitrile and 0.1% formic acid. Mobile phase B was composed of 80% acetonitrile and 0.1% formic acid. Samples were loaded onto the trap column (Acclaim PepMap300, C18, 5 µm, 300 Å Wide Pore, 300 µm x 5 mm, 5 Cartridges) for 4 minutes at 15 µl/min; the loading buffer was composed of water, 2% acetonitrile and 0.1% trifluoroacetic acid. Peptides were eluted with a mobile phase B gradient from 2% to 40% B in 120 minutes. Eluting peptide cations were converted to gas-phase ions by electrospray ionization and analyzed on a Thermo Orbitrap Fusion (Q-OT-qIT, Thermo). Survey scans of peptide precursors from 350 to 1400 m/z were performed at 120K resolution (at 200 m/z) with a 5 × 10^5 ion count target. Tandem MS was performed by isolation at 1.5 Th with the quadrupole, HCD fragmentation with a normalized collision energy of 30, and rapid scan MS analysis in the ion trap. The MS2 ion count target was set to 10^4 and the maximum injection time was 35 ms. Only precursors with charge states 2–6 were sampled for MS2. The dynamic exclusion duration was set to 45 s with a 10 ppm tolerance around the selected precursor and its isotopes. Monoisotopic precursor selection was turned on. The instrument was run in top speed mode with 2 s cycles [1].
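To make the 10 ppm dynamic-exclusion tolerance concrete, the short sketch below converts a parts-per-million tolerance into an absolute m/z window. It is an illustration only: the function name is hypothetical, and the two example m/z values are simply the ends of the survey scan range quoted above.

```python
def ppm_window(mz, tolerance_ppm=10.0):
    """Return the (low, high) m/z bounds of a symmetric ppm tolerance window."""
    half_width = mz * tolerance_ppm * 1e-6
    return mz - half_width, mz + half_width


# Ends of the survey scan range from the text (350 to 1400 m/z).
for mz in (350.0, 1400.0):
    low, high = ppm_window(mz)
    print(f"m/z {mz:.1f}: excluded window {low:.4f} to {high:.4f}")
```

Because the tolerance is relative, the excluded window is four times wider in absolute terms at m/z 1400 than at m/z 350.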

Data analysis

All data were analyzed and quantified with the MaxQuant software (version 1.5.3.8) (Cox, Hein et al. 2014). The false discovery rate (FDR) was set to 1% for both proteins and peptides, and a minimum peptide length of seven amino acids was specified. The Andromeda search engine was used for the MS/MS spectra search against the Human database (downloaded from Uniprot in September 2015, containing 147,934 entries). Enzyme specificity was set as C-terminal to Arg and Lys, also allowing cleavage at proline bonds, with a maximum of two missed cleavages. Dithiomethylation of cysteine was selected as a fixed modification, and N-terminal protein acetylation and methionine oxidation as variable modifications. The ‘match between runs’ feature of MaxQuant was used to transfer identifications to other LC-MS/MS runs based on their masses and retention times (maximum deviation 0.7 minutes), and this was also used in quantification experiments. Quantifications were performed with the label-free algorithms described recently. Data analysis was performed using Perseus 1.5.2.4 software [13-15].
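The enzyme settings described above (cleavage C-terminal to Arg and Lys, including before proline, up to two missed cleavages, minimum peptide length of seven residues) can be illustrated with a small in-silico digest. This is a simplified sketch and not MaxQuant's or Andromeda's implementation; the function names and the toy sequence are hypothetical.

```python
def cleave_after_kr(sequence):
    """Split a protein sequence after every K or R (cleavage before P is allowed)."""
    fragments, current = [], ""
    for residue in sequence:
        current += residue
        if residue in "KR":
            fragments.append(current)
            current = ""
    if current:  # C-terminal remainder without a trailing K/R
        fragments.append(current)
    return fragments


def tryptic_peptides(sequence, max_missed=2, min_length=7):
    """Enumerate peptides with up to `max_missed` missed cleavages."""
    fragments = cleave_after_kr(sequence)
    peptides = set()
    for start in range(len(fragments)):
        for missed in range(max_missed + 1):
            end = start + missed + 1
            if end > len(fragments):
                break
            peptide = "".join(fragments[start:end])
            if len(peptide) >= min_length:
                peptides.add(peptide)
    return sorted(peptides)


# Hypothetical toy sequence, chosen only to exercise the rules above.
TOY_SEQUENCE = "AAAAAAAKPGGGGGGRCC"
for peptide in tryptic_peptides(TOY_SEQUENCE):
    print(peptide)
```

In the toy sequence, the K-P bond is cleaved because cleavage at proline bonds is allowed, and the two-residue C-terminal fragment only appears as part of longer missed-cleavage peptides because it is shorter than seven residues on its own.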

Immunofluorescence and RNAi in early embryos

C57BL/6NCrl females aged 4-6 weeks were stimulated for superovulation by injection of PMSG at 2:00 pm on day one. The next day, HCG was applied and males were added to the cages. On the morning of the third day, females were screened for vaginal plugs, and fertilized zygotes were isolated and incubated in M2 medium (Sigma, M7167). Downregulation of the desired proteins was mediated by ON-TARGETplus SMARTpool siRNA from Dharmacon (Fam208a: L-047440-00-0005; GAPDH: D-001830-20-05; Non-Targeting pool: D-001810-10-05). RNA was dissolved in PCR Ultra H2O from Top-Bio (P340) to a final concentration of 5.8 µM. Electroporation (EP) was performed with a NEPA21 electroporator (Nepagene) with the impedance set to the range 0.18-0.22 Ω and the transfer pulse set to 5 V for 50 msec, with 5 pulses and 5 zygotes per run.

After EP, cells were incubated at 37°C with 5% CO2 in M2 medium. After 24, 48 and 72 hours, zygotes were fixed with 4% PFA for 45 minutes. Three sets of washes in PBS with 5% FBS were performed, followed by permeabilization in 0.5% PBST for 60 minutes. The blocking step was performed with a solution containing 5% NGS, 0.3 M glycine and 0.1% Triton X in PBS for at least two hours. Incubation with the primary β-tubulin antibody (Cell Signaling, #2146; dilution 1:50) in 1% NGS was carried out overnight at 4°C. Another set of washes followed the next morning, and then incubation with the secondary antibody, donkey anti-rabbit Alexa Fluor 488 (Invitrogen, A21206), at a dilution of 1:1000 for 90 minutes. A final set of washes was followed by nuclear staining with DAPI and a glycerol series, with mounting in 90% glycerol with 5% NPG. Cells were visualized with a Dragonfly spinning disc microscope with a 40x objective.

Yeast two hybrid system

For the screen to identify Fam208a interaction partners, this large protein was divided into two overlapping parts, an N’ terminal (3-741aa) and a C’ terminal (536-1640aa) part. To create cloning sites for the Y2H vector (pGBKT7, Clontech, 630489), unique primers were designed (Suppl. Table 3) so that the PCR product kept the reading frame after subcloning into the vector sequence. The restriction sites XmaI/SalI were used for the N’ terminal part and EcoRI/SalI for the C’ terminal part. The standard protocol for the Phusion PCR reaction (NEB, M0532S) was used. Template cDNA was prepared by M-MLV reverse transcription (Promega, M170A) following the supplier's original protocol, using mRNA from the murine D3 cell line. The final cDNA product was diluted to a final concentration of 100 ng/µl. Both products were cloned into the pGBKT7 vector (Clontech, 630489) using the Phusion PCR reaction and primers with integrated XmaI/SalI or EcoRI/SalI restriction sites.

A set of constructs was prepared for the Y2H screen: the C’ terminal construct (536-1640aa) was used as construct C; construct A (536-947aa) was prepared from construct C digested with SmaI and blunted; construct B (536-1160aa) was construct C digested with PstI, with the cassette exon 17 (60bp) spliced out; α (536-1498aa) was construct C without exon 17 and with a 274bp deletion in exon 24; and β (536-1640aa) was construct C without exon 17 and with a 179bp deletion in exon 22. Further steps followed the manufacturer's manual for the Matchmaker Gold Yeast Two-Hybrid System (Cat. No. 630489) with a murine embryonic library of 11 days of age (Clontech, 630478). Finally, library plasmids were isolated with lysis buffer (50 mM Tris-HCl pH 8; 10 mM EDTA + 20 mg/ml RNase A) and purified by incubation with 200 mM NaOH and 1% SDS, followed after 5 minutes by the addition of 3 M sodium acetate, pH 4.8. Overnight precipitation with isopropanol was followed by standard ethanol washes. The identified vectors were transformed into DH5α E. coli cells to increase the yield, and were later re-purified and sent for sequencing to analyse the putative partners.

BIOMARK and qRT-PCR

RNA isolation

Gene expression analysis was performed using the BioMark high-throughput microfluidic qPCR platform (Fluidigm, San Francisco, CA). Prior to the qPCR, the samples were pre-amplified as follows: 2 µl of cDNA (10 ng RNA/µl) was mixed with 1.25 µl of 200 nM primer mix (all primers together, each at a final concentration of 25 nM), 5 µl of iQ Supermix (BioRad, Prague, Czech Republic) and 1.75 µl of RNase/DNase-free water (ThermoFisher Scientific). The mixture was first incubated for 3 minutes at 95°C, then 18 cycles of 15 s at 95°C, and finally 4 minutes at 59°C. Pre-amplified cDNA was diluted 40×. qPCR was carried out in the GE Dynamic Array 48.48 in a BioMark HD System (Fluidigm, San Francisco, CA). Each 5 µl of Fluidigm sample premix consisted of 1 µl of 40× diluted pre-amplified cDNA, 0.25 µl of 20× DNA Binding Dye Sample Loading Reagent (Fluidigm), 2.5 µl of SsoFast EvaGreen Supermix (Bio-Rad, Czech Republic), 0.1 µl of 4× diluted ROX (Invitrogen, USA) and 1.15 µl of RNase/DNase-free water. Each 5 µl of assay premix consisted of 2 µl of 10 µM primers (forward and reverse primer, each at a final concentration of 400 nM), 2.5 µl of DA Assay Loading Reagent (Fluidigm, USA) and 0.5 µl of RNase/DNase-free water. The thermal qPCR protocol was: 50°C for 5 s and 98°C for 3 minutes, followed by 40 cycles of 98°C for 5 s and 60°C for 5 s. The data were collected with the BioMark 4.2.2 Data Collection software and analysed with the BioMark Real-Time PCR Analysis Software 4.1.3 (Fluidigm, USA).
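The volumes and concentrations quoted for the pre-amplification and assay mixes can be cross-checked with the usual dilution relation C1·V1 = C2·V2. The sketch below does this for the numbers in the text. The helper name is illustrative, and the tenfold dilution of the assay mix inside the reaction chamber is an assumption made here to reconcile the 10 µM stock with the stated 400 nM final concentration; the text does not state this factor.

```python
def final_concentration(stock_conc, stock_volume, total_volume):
    """Dilution relation C1*V1 = C2*V2, solved for the final concentration C2."""
    return stock_conc * stock_volume / total_volume


# Pre-amplification mix (volumes in µl, as given in the text).
preamp_volumes_ul = {"cDNA": 2.0, "primer mix": 1.25, "iQ Supermix": 5.0, "water": 1.75}
preamp_total_ul = sum(preamp_volumes_ul.values())
preamp_primer_nm = final_concentration(200.0, preamp_volumes_ul["primer mix"], preamp_total_ul)

# Assay premix (volumes in µl); primer stock is 10 µM = 10000 nM.
assay_volumes_ul = {"primers": 2.0, "loading reagent": 2.5, "water": 0.5}
assay_total_ul = sum(assay_volumes_ul.values())
assay_primer_nm = final_concentration(10_000.0, assay_volumes_ul["primers"], assay_total_ul)

# ASSUMPTION (not stated in the text): the assay mix is diluted about tenfold
# when combined with the sample mix on the chip.
ASSUMED_CHIP_DILUTION = 10
reaction_primer_nm = assay_primer_nm / ASSUMED_CHIP_DILUTION

print(f"Pre-amplification: {preamp_total_ul} µl total, {preamp_primer_nm} nM per primer")
print(f"Assay premix: {assay_total_ul} µl total, {assay_primer_nm} nM per primer")
print(f"In-reaction primer concentration under the assumed dilution: {reaction_primer_nm} nM")
```

Under these inputs the pre-amplification mix totals 10 µl with 25 nM of each primer, matching the text, and the assay premix contains 4 µM of each primer, which corresponds to 400 nM only after the assumed tenfold dilution.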


Results

Full ablation of Fam208a leads to embryonic lethality at early somite stage

Since previously published data were obtained with mutant lines generated by ENU mutagenesis (MommeD6 and MommeD20) [9], where a random mutation does not necessarily result in a complete loss-of-function allele, we prepared a full Fam208a knockout (KO) mouse line, in which the whole critical exon 1 was deleted by CRISPR/Cas9 according to the IMPC standard [16]. Guide RNAs (G1 and G2) were designed to target intronic regions before and after exon 1 (Fig. 1A), resulting in the deletion of 866 nucleotides. Quantitative polymerase chain reaction (qPCR) verified decreased levels of Fam208a mRNA in homozygous E9.5 embryos (Fig. 1B) and a slight increase in its expression in heterozygotes, which might be caused by compensatory mechanisms, as the mutated mRNA is degraded and cells drive higher expression from the wt allele. In breeding of heterozygous mice, we observed a fully penetrant pre-weaning lethal phenotype, consistent with previously published data. No viable homozygotes were obtained: five litters with a total of 32 pups yielded 23 heterozygotes and 9 wild-type animals (Fig. 1C). To investigate the cause of this lethal phenotype, we analysed the embryonic development of Fam208a KO mutants with the aim of identifying the critical developmental period of Fam208a malfunction. We followed the IMPC embryonic lethal screen guidelines [17] and started at embryonic stage E12.5. As expected, no viable homozygous embryos were observed at this stage (Fig. 1C). The heterozygous littermates were of similar size and weight, and their somite number was the same as in the wild-type (wt) embryos (Suppl. Fig. S1).

Next, we examined embryos at stages E8.5 and E9.5. We did not observe any loss of Fam208a null embryos before placentation, as live null embryos harvested at E8.5 and E9.5 were comparable in number to their wild-type littermates (Fig. 1D). Fam208a null embryos went through gastrulation (Fig. 1E); however, they reached a maximum of 4 somites at E9.5, whereas wild-type embryos averaged 28 somites. Interestingly, delayed development was also observed in heterozygous embryos, which reproducibly formed up to 6 fewer somites than their wild-type littermates at E9.5 (Fig. 1F). In E8.5 and E9.5 embryos, we observed the same genotype distribution (similar numbers of wt and homozygotes and a higher number of heterozygotes; wt:het:homo, 13:18:12).
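The genotype counts reported above can be compared with the 1:2:1 ratio expected from a heterozygote intercross using a chi-square goodness-of-fit test. The sketch below is an illustration added for the reader and is not an analysis reported by the authors; the function name is hypothetical. It relies on the fact that, for two degrees of freedom, the chi-square survival function is exactly exp(-χ²/2), so no statistics library is needed.

```python
import math


def chi_square_mendelian(observed, ratio=(1, 2, 1)):
    """Goodness-of-fit of (wt, het, hom) counts to a Mendelian ratio."""
    total = sum(observed)
    expected = [total * part / sum(ratio) for part in ratio]
    chi2 = sum((obs - exp) ** 2 / exp for obs, exp in zip(observed, expected))
    # Three genotype classes give 2 degrees of freedom, for which the
    # chi-square survival function is exactly exp(-chi2 / 2).
    p_value = math.exp(-chi2 / 2)
    return chi2, p_value


# Counts from the text, ordered as (wild type, heterozygote, homozygote).
pups_chi2, pups_p = chi_square_mendelian((9, 23, 0))          # pups from 5 litters
embryos_chi2, embryos_p = chi_square_mendelian((13, 18, 12))  # E8.5 and E9.5 embryos

print(f"Pups: chi2 = {pups_chi2:.3f}, p = {pups_p:.4f}")
print(f"Embryos: chi2 = {embryos_chi2:.3f}, p = {embryos_p:.4f}")
```

With these counts, the pup distribution departs clearly from 1:2:1 (χ² ≈ 11.19, p below 0.01), whereas the E8.5/E9.5 embryo distribution does not (χ² ≈ 1.19, p above 0.5), which is in line with the statement that null embryos are not lost before placentation.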

Based on these findings, we conclude that Fam208a ablation causes embryonic lethality, with a robust developmental delay observed at E8.5, progressing through E9.5, and full lethality by E12.5. Remarkably, at earlier stages of development, a dose-dependent effect of the mutated allele is visible as delayed developmental progress of heterozygous embryos; however, this effect is fully compensated for at later developmental stages and results in fully viable and fertile mice. Compensatory mechanisms evidently help heterozygotes to proceed with development. Based on the available evidence, this compensation might be driven by increased expression (higher mRNA levels) or by alternative signalling pathways (Kap1, HUSH2).

Fam208a mutation massively impacts protein expression profile in homozygous mice

To reveal the impact of Fam208a depletion on the molecular landscape during early embryonic development of Fam208a null embryos, we performed unbiased differential proteomics by LC-MS analysis of E9.5 embryos. The proteomics data showed a clear divergence between homozygous and wild-type embryos: while wild-type and heterozygous embryos were almost identical, homozygous embryos exhibited a completely altered protein expression profile (Fig. 2A). We identified over 4800 proteins, of which 800 showed highly significant differences (Fig. 2B). The most numerous groups with a common function were proteins with nucleic acid binding activity (170), followed by proteins involved in transcription regulation (165). The third group included proteins playing different roles in epigenetic processes (122), which are crucial for embryonic development (Fig. 2C). Proteins involved in cell cycle control and cell division processes were also highly affected (83 and 103 proteins, respectively). Regarding the expression of the previously published interaction partners in the HUSH complex [11, 12], Pphln1 and Setdb1 showed only mild alteration of expression in Fam208a null embryos; however, the level of Fam208a itself was decreased, and Mphosph8 (Mpp8) in homozygotes was just at the detection limit of the method (Suppl. Fig. S2). None of the HUSH partners showed marked detectable differences in expression levels in heterozygous embryos (Fam208a+/-). These proteomic data suggest that there might be other complexes that include Fam208a and that are also influenced by heterozygous expression or milder ablation of the Fam208a protein. This would partially explain the observed developmental retardation in heterozygotes despite unchanged levels of HUSH proteins.

Downregulation of Fam208a in zygotes leads to immediate cell division phenotype

To further investigate the role of Fam208a in embryonic development preceding the post-gastrulation and pre-placentation lethal period (E8.5-E12.5), we focused on earlier events in embryonic development, employing microarray profiling of the whole zygotic transcriptome at several stages. To explore the effects of the waves of zygotic transcription activation, datasets from fully grown GV oocytes, metaphase II-arrested eggs, 1-cell zygotes, 2-cell zygotes, 4-cell zygotes, morulae, and blastocysts were analysed and compared [18, 19]. RNA levels of Fam208a clustered with genes with lower expression in the first three stages, i.e. GV oocytes, MII oocytes, and 1-cell zygotes. However, the expression increases after two-cell stage transcription activation and stays relatively high during subsequent stages, i.e. the 4-cell stage, morula, and blastocyst (Fig. 3A). Increased expression is represented by more frequent and higher peaks located in intronic regions (exons are represented by blue boxes in the gene scheme in the last row of Fig. 3A). Based on the transcriptomics data, we investigated the putative function of Fam208a in the first zygotic divisions. To avoid the effects of maternally delivered mRNA (from heterozygous mothers, as homozygotes are not viable) in early zygotes, which can significantly diminish the null phenotype [20], we used siRNA to knock down Fam208a in the zygote immediately after electroporation. We introduced the siRNA by electroporation into 1-cell stage zygotes of C57Bl/6NCrl mice and observed their ability to develop further. In zygotes that overcame the two-cell stage block, downregulation of Fam208a led to problems in proceeding through typical cell division, and multipolar spindles were formed (Fig. 3B). A tri-polar spindle apparatus was formed most frequently (n=9) in Fam208a siRNA-knockdown cells (Fig. 3C). These abnormal spindles were observed at all selected time points. The most dramatic differences were observed 48 and 72 hours after siRNA delivery. All controls (siGAPDH, non-targeting siRNA, and water) developed spindles without any observable disturbances (n=11) (Fig. 3B).

Our findings suggest that the role of Fam208a is not limited to epigenetic regulation such as H3K9 methylation, but that it can also play a role in other processes. Moreover, Fam208a may directly form a part of the regulatory pathways of the spindle apparatus, with a potentially novel spectrum of interacting proteins distinct from the previously discovered epigenetic modifiers (HUSH complex). This is also supported by the finding that the methylation state in the early zygote is stable until massive demethylation occurs in the 4-cell stage embryo [21].

Y2H screen revealed novel Fam208a interaction network in spindle assembly machinery

To study the Fam208a interactome, we used the Y2H system for unbiased identification of all potential binding partners. The advantage of the Y2H system is the possibility to search for putative interacting partners that were not identified by proteomic approaches in differentiated cell cultures. The already known and verified interaction partners of Fam208a from the HUSH complex (MPP8, PPHLN1, and SETDB1) served as an internal control. However, none of these has been linked with the establishment of spindle poles or with direct binding to the actin and tubulin fibres of the spindle apparatus.

Because of its large size, we split Fam208a (1610 aa) into two overlapping parts for the purpose of the Y2H system. The N’ terminal part covered the first 740 aa. The C’ terminal part, with 1100 amino acids, was further studied by generating four deletion constructs based on natural variants (alternative reading frame, exon skipping, and partial deletions). All constructs were used to describe potential protein-binding interaction domains more closely (Fig. 4A). Our results were consistent with the in silico prediction, as there were no verified interaction partners for the N’ terminal part (a possible DNA/RNA-binding part), while the Y2H screen identified 20 putative interaction partners for the C’ terminal part (Ncbp1, Eml1, Hcfc2, S100a10, Inpp5a, Amn1, Parpbp, Psmd8, Slc22a3, Capn7, Stk38, Pphln1, Alb, Etfa, Gpsm2, Itgb3bp, Mphosph8, Svil, Cntn1, and Tmem100).
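
As a quick arithmetic check on the construct design described above, the following minimal sketch computes the overlap implied by the stated fragment lengths. It assumes (this is not stated in the text) that the N’ terminal part starts at residue 1 and that the C’ terminal part ends at the last residue; the function name is purely illustrative.

```python
def fragment_overlap(total_len, n_part_len, c_part_len):
    """Return (c_start, overlap_len) for two fragments covering one protein.

    Assumption: the N-terminal fragment spans residues 1..n_part_len and the
    C-terminal fragment spans the last c_part_len residues of the protein.
    """
    # First residue of the C-terminal fragment (1-based numbering).
    c_start = total_len - c_part_len + 1
    # Residues shared by both fragments; zero if the fragments do not overlap.
    overlap_len = max(n_part_len - c_start + 1, 0)
    return c_start, overlap_len


# Lengths taken from the text: 1610 aa protein, 740 aa and 1100 aa parts.
c_start, overlap_len = fragment_overlap(1610, 740, 1100)
print(f"C-terminal part starts at residue {c_start}; overlap = {overlap_len} aa")
```

Under these assumptions, the two parts would share 230 residues (positions 511 to 740); the actual construct boundaries may differ.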

The size and colour intensity of the yeast colonies were used as markers of interaction strength for the different interaction partners (Fig. 4B). One of the strongest colour signals was observed in colonies with the Ankyrin domain of the Mphosph8 protein, which validated our approach, as the direct interaction between Fam208a and Mphosph8 has already been described [11]. Reciprocal verification mating ruled out four (Capn7, Stk38, Slc22a3, and Pphln1) of the twenty identified constructs (Fig. 4B). As an example, Periphilin1 was first identified in 39 out of 65 diploid yeast colonies and all possible transcription variants were pulled; surprisingly, none of them was ’in frame’. Consistent with this, our control mating did not confirm this protein as a direct binder of Fam208a.

In the screening assay, the Fam208a-Mphosph8 interaction was confirmed, as the Ankyrin repeat domain (a known protein-binding domain) was identified in the Mphosph8 sequences pulled by the Y2H system, and the region between amino acids 600 and 904 of Fam208a was responsible for their binding. In addition, novel putative interaction partners were identified from different functional groups, such as calcium-regulating proteins (S100a10, Inpp5a), DNA/RNA-binding proteins (Ncbp1, Hcfc2, Parpbp), and proteins involved in the regulation of cell division (Eml1, Svil, Gpsm2, Itgb3bp, and Amn1) (Table 1). These proteins play an important role in sister chromatid segregation, spindle assembly, and cytokinesis (Fig. 4C). Fam208a might therefore be involved in organising the formation of spindle poles, and its downregulation may give rise to multipolar spindles. Proteomics data support this hypothesis, as there is a selective elimination of all putative interaction partners in KO embryos analysed by LC-MS. Svil, Eml1, Gpsm2, and Itgb3bp were all below the detection limit of the LC-MS method in E9.5 homozygous embryos, whereas all these interaction partners exhibited stable and strong expression in wild-type and heterozygous samples. Taken together, we found an increased incidence of impaired spindle apparatuses following Fam208a downregulation and identified several presumed interaction partners linked to the establishment and functioning of spindle poles. These findings suggest that Fam208a plays an important role in spindle pole assembly during zygotic division and might also be part of a novel protein complex other than HUSH.

Fam208a has tissue-specific subsets of interacting partners

The Y2H system provided an unbiased view of the pleiotropy of putative binding partners of Fam208a. However, identifying the biological processes in which these possible interactions play a role is challenging. To study interactions between Fam208a and its potential binding partners, we analysed the expression pattern of all identified interacting proteins in adult murine tissues. Twenty murine organs were used for the preparation of an RNA library, which was subsequently used for a BIOMARK q-RT-PCR screen with 18 gene-specific primers (Suppl. Table 1), of which 16 were designed based on identified Y2H preys. Primers for Pphln1 and Setdb1 from the HUSH complex were also included in the screen. Different tissues showed various expression levels of Fam208a as well as of its binding partners. Fam208a is generally expressed ubiquitously at low levels in the majority of organs, and increased levels were detected in kidneys, spleen, thymus, seminal vesicles, uterus, and ovaries; however, its expression was almost six times higher in male tissues than in female tissues (Fig. 5A). Three other HUSH proteins (Mphosph8, Pphln1, and Setdb1) also exhibited higher expression in kidneys, uterus, seminal vesicles, and testes. In addition, their expression was higher in the brain and lungs. Genes involved in spindle apparatus assembly and function and in the regulation of cell division (Amn1, Eml1, Gpsm2, Itgb3bp, and Svil) are commonly highly expressed in seminal vesicles, lungs, duodenum, and brain. Amn1 and Itgb3bp had the highest expression in testes. Stomach and lungs exhibited higher levels of Eml1, whereas Gpsm2 appears to be predominantly expressed in the proximal colon and ileum. Svil is strongly expressed in the heart and tongue. Interestingly, Hcfc2 and Ncbp1, together with HUSH proteins, Psmd8, and Parpbp, exhibited very high expression in testes (Fig. 5B).

To investigate these findings further, we studied the expression of Fam208a and the other preys at single-cell resolution. All previously examined tissues are composites of several cell types, so it is not possible to state whether the measured expression levels reflect the heterogeneity of the tissue samples or of the protein partners. We selected three adult tissues with the highest expression of Fam208a (testes, ovaries, and brain) for in situ hybridisation [1]. The diversity of interaction partners co-expressed with Fam208a in the individual organ samples supported the contextual nature of Fam208a protein interactions at the cellular level as well. A closer investigation of testicular tissue (Fig. 5B) showed possible co-localisation of Fam208a and several other partners. Fam208a itself was highly expressed in seminiferous tubules, with a strong signal in Sertoli cells, spermatogonia, and spermatocytes. Eml1, Gpsm2, Psmd8, Inpp5a, Amn1, Cntn1, and Parpbp were also specifically expressed in Sertoli cells, which are located at the base of the epithelium and visually form ring-like staining around the edge of the seminiferous tubules. Itgb3bp, Etfa, Hcfc2, Mphosph8, and Ncbp1 were more abundant towards the lumen, and the signal was seen in spermatogonia and spermatocytes. Svil and Alb showed no expression in testes (Fig. 5B).

In the analysed ovaries, Fam208a had a strong signal in the granulosa cells surrounding the oocyte itself. The majority of our probes exhibited this pattern within the ovary. Parpbp, Gpsm2, Etfa, and Cntn1 had the strongest ovarian signal (Fig. 5C). Expression profiling in the brain revealed that Fam208a is strongly expressed in granular cells within the cerebellum. A similar signal was detected for Svil, Alb, Gpsm2, Etfa, Parpbp, Mphosph8, and Ncbp1.

To conclude, the general expression pattern of Fam208a and its interacting partners suggests that the role of these interactions is contextual and depends on the tissue type and the physiological process involved. Moreover, tissues with higher proliferation levels have higher expression of Fam208a and HUSH partners, while gametogenic tissues are richer in partners important for the control of the spindle apparatus.

Ablation of FAM208a in somatic cells did not impair the cell division processes

To observe the possible effects of FAM208a ablation in fully differentiated somatic cells, we used CRISPR/Cas9 to delete exon 4 (exon 10 in the alternative splicing variant) in Hek293t cells. To study the assumed overlapping roles of FAM208a and the HUSH complex, we also prepared an MPHOSPH8 deletion mutant by introducing a mutation in exon 7, in which the Ankyrin repeat region was identified; thus, only the function of Fam208a in the HUSH complex was targeted. We established three lines for FAM208a (Fam-a1, Fam-a2, and Fam-a3) and two lines for the mutated MPHOSPH8 protein (Mpp8-a and Mpp8-b).

To test whether FAM208a deletion in somatic cells also affects cell division, we performed Trypan Blue viability measurements. The evaluated parameters included the concentration of viable cells, their diameter, and total viability. We did not detect any significant differences between the FAM208a mutant lines and control samples (Fig. 6A). Conversely, MPHOSPH8 KO lines showed diminished cell viability after 48 hours. In addition, we noticed increased cellular diameters in the mutant lines Fam-a2, Fam-a3, and Mpp8-a. These data suggest a marginal effect of FAM208a depletion on cell proliferation as well as on the functionality of the HUSH complex.

To further analyse FAM208a, we performed proteomic analysis of all lines using an LC-MS approach, in which more than 4000 proteins were detected and used for differential proteomics. Original Hek293t cells (HekWT) and cells that underwent the whole mutagenesis procedure but did not have an edited genome (HekMT) were used as controls. LC-MS analyses verified the absence of FAM208a in lines Fam-a1, Fam-a2, and Fam-a3 and of MPHOSPH8 in lines Mpp8-a and Mpp8-b (Fig. 6B).

Evaluation of the other members of the HUSH complex showed no difference in the levels of PERIPHILIN1. On the other hand, SETDB1 could not be detected in any of our mutant lines (Fig. 6B). Approximately 25% of the proteins detected by LC-MS were differentially expressed among the mutated lines (Fig. 7A). To show the most significantly up- and downregulated FAM208a-interacting proteins, we selected 127 of them using the most stringent analytical criteria. The selection filter was set up at several levels as follows. First, the protein had to be detected either in both control samples or in neither of them. Second, the levels of the protein had to be below the detection limit in at least two out of three FAM208a KO lines, or vice versa. As a result, 104 proteins were detected in both controls, HekWT and HekMT, but were not measured in at least two of the FAM208a KO lines (downregulated expression due to FAM208a depletion). Moreover, 23 proteins were detected only in the mutant Hek293t lines (upregulated expression) (Fig. 7B). The same method was used for the MPHOSPH8 cell lines, which gave a final number of 67 downregulated and 16 upregulated proteins compared to the non-mutated HekWT and HekMT (Fig. 7B). Based on gene ontologies (Table 1) and identified functions, FAM208a-dependent proteins can be clustered into 6 groups: a DNA/RNA-binding group with 24 downregulated and 7 upregulated proteins, a transcription regulation group with 22 downregulated and 2 upregulated proteins, a cell cycle regulation group with 15 downregulated and 2 upregulated proteins, cell division-linked proteins with 9 downregulated and 3 upregulated proteins, and proteins connected with cellular apoptosis with 4 downregulated and 2 upregulated proteins (Fig. 7C). The majority of proteins had more than one role and can therefore be included in multiple categories. Proteomics data from somatic cell lines showed widely affected protein levels, allowing us to assume an impact of FAM208a depletion on transcription regulation, nucleic acid binding, and epigenetic regulation without real impairment of cell viability (based on Trypan Blue staining).
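
The two-level selection filter described above can be illustrated with a minimal sketch. It is not the analysis code used in this study: it assumes that each protein is reduced to a detected/not-detected call per sample, and the function name, the `min_ko` parameter, and the example calls are hypothetical.

```python
def classify_protein(controls, ko_lines, min_ko=2):
    """Classify one protein from presence/absence calls.

    controls: list of bools, True if detected in that control (e.g. HekWT, HekMT)
    ko_lines: list of bools, True if detected in that KO line
    Returns "down", "up", or None when the protein does not pass the filter.
    """
    n_ko_detected = sum(ko_lines)
    n_ko_missing = len(ko_lines) - n_ko_detected

    # Detected in both controls, below detection in at least min_ko KO lines.
    if all(controls) and n_ko_missing >= min_ko:
        return "down"
    # Detected in neither control, but detected in at least min_ko KO lines.
    if not any(controls) and n_ko_detected >= min_ko:
        return "up"
    # Inconsistent controls or too few KO lines changed: filtered out.
    return None


# Hypothetical detection calls: (controls, three KO lines).
examples = {
    "protein_A": ([True, True], [False, False, True]),
    "protein_B": ([False, False], [True, True, False]),
    "protein_C": ([True, False], [False, False, False]),
}
for name, (controls, ko_lines) in examples.items():
    print(name, classify_protein(controls, ko_lines))
```

A presence/absence filter of this kind is deliberately conservative: proteins with inconsistent controls are discarded, and quantitative fold changes are not considered.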

Thus, our data suggest that FAM208a in somatic cell lines is responsible for epigenetic silencing via a complex together with MPHOSPH8. However, the absence of a clear viability phenotype in FAM208a KO cell lines points to compensatory mechanisms that overcome the ablation of the protein.

In conclusion, FAM208a is involved in the regulation of cleavage during early zygote development, although its removal in stable somatic lines impairs neither the cell cycle nor cell division. It is also possible that both suggested roles, cleavage regulation and epigenetic silencing, are related and interlinked either via Fam208a itself or via some other regulatory proteins.


Discussion

Our findings suggest a new putative role of Fam208a in the organisation of the spindle apparatus and an indirect role in the maintenance of genome stability via interaction with MPHOSPH8 in the HUSH complex. In fact, heterochromatin instability can be tightly linked with spindle apparatus establishment and cell division. It is therefore possible that these functions are related and form a complex orchestration strategy involving Fam208a.

We created a new mouse model with complete loss of the critical first exon of Fam208a, which caused functional ablation of the protein. The homozygous state of this mutation is embryonically lethal. Moreover, we observed a delayed dosage effect in E9.5 heterozygous embryos, which had 22 somites compared to 28 somites in their wild-type littermates. The fact that heterozygotes are born fully developed and viable suggests that the role of Fam208a is crucial mostly in the very early stages of embryogenesis. Interestingly, compensatory mechanisms are activated during further development, as E12.5 heterozygous embryos are indistinguishable from wild types. Even homozygous embryos contain a subset of wild-type mRNA from the heterozygous female (maternal products), and the usage of wild-type RNA in the first zygotic events results in a delayed effect of Fam208a ablation. RNA-seq data from zygotes indeed show the presence of maternal Fam208a RNA until the maternal-to-zygotic transition of embryonic transcription at the 4-cell stage. There are also high levels of Fam208a in metaphase I stage oocytes, showing clear deposition of maternally originating molecules [22]. To overcome this problem, we downregulated maternal RNA with a pool of siRNAs directly in fertilized eggs. This manipulation resulted in an increased incidence of multipolar spindle formation and a higher risk of cell division arrest. None of the control groups developed a spindle apparatus with multiple poles; however, zygotes with downregulated Fam208a had impaired poles in 9 out of 11 detected spindles. Based on this and on the RNA-seq data, we suggest that Fam208a is critical for early zygotic division processes, spindle dynamics, and the establishment of a bipolar apparatus.

Our data show that Fam208a null mutants die between E9.5 and E12.5. This is remarkably later than in ENU mutagenesis-induced L130P mutants (MommeD6, a point mutation of Fam208a leading to an amino acid substitution), which are fully absorbed by E9.5 [23]. There might be several reasons for the milder effect of the complete knockout. L130P mutants have so far not been fully characterized, and it has not been reported whether the Fam208a mutation fully eliminates the endogenous protein. Thus, the presence of remaining and possibly not fully active Fam208a might cause a dominant negative effect. In addition, murine zygotes deficient for Fam208a may develop compensatory mechanisms (HUSH2, Kap1) and use pathways that provide a partial rescue for the embryos. Moreover, the presence of maternal RNA leads to the production of Fam208a protein in oocytes and thus postpones the onset of the effects caused by Fam208a zygotic mRNA decay. In this way, maternal Fam208a helps the embryo to pass through the first rounds of zygotic division [24].

Using the Y2H system, we identified and verified 16 putative Fam208a interaction partners. Five of these proteins (Gpsm2, Eml1, Svil, Amn1, and Itgb3bp) are directly linked with the establishment and correct functioning of the spindle apparatus and might be more important for interaction with Fam208a during the early rounds of zygotic division. One of the identified proteins is Gpsm2, a member of the cortical complex (consisting of NUMA and Dynactin/dynein) [25] that plays a key role in establishing proper spindle orientation [26]. Another interesting protein is Supervillin, which co-localises with endogenous myosin II and EPLIN in the cleavage furrow during early cytokinesis [27]. Eml1 is critical for the correct formation of the cleavage plane [28]. The function of Amn1 is linked with both the spindle assembly and nuclear orientation checkpoints [29]. Itgb3bp (CENPR) is a core centromere protein that prevents premature separation [30]. Considering that almost one third of the identified interaction partners are directly involved in the cell division mechanism, we propose that Fam208a is most likely involved in this process as well.

To further investigate the interaction partners of Fam208a in tissues, we performed expression profiling in murine organ samples, which showed variable expression patterns in different tissues. It is therefore possible that the cell-specific role of Fam208a is governed by different interactions. To map possible interactions between Fam208a and its partners in specific tissues, we performed in situ hybridisation with RNA probes (Suppl. Table 1). The hybridisation revealed that the co-localisation and co-expression of Fam208a and its putative interaction partners are strongly tissue- and cell type-specific. Therefore, in hyper-proliferative cells, Fam208a might play a different role than in slowly proliferating tissues.

To further describe the role of Fam208a in spindle apparatus assembly, we used CRISPR-Cas9 to prepare stable somatic cell lines with mutations in Fam208a and Mphosph8, which was identified as one of the interaction partners using the Y2H system. This interaction has previously been described and verified [11]. Fam208a, Mphosph8, Periphilin, and Setdb1 were designated as the HUSH complex, whose function was linked with gene silencing. To investigate whether these partners are also involved in mitotic cell division, we prepared knockout cell lines. No viability defects were observed in Fam208a mutant cells, indicating that, in contrast to the effect described during early zygotic division, the corresponding machinery in somatic cells might not be affected at all. A crucial difference between the cleavage of zygotic cells and normal cell division is that during cleavage there is no increase in cytoplasmic mass, in contrast to the increase in nuclear mass and overall cell number [31]. The cleavage cycle completely omits the G1 and G2 phases and consists only of quick sets of S and M phases [32]. Due to these differences, various proteins orchestrate this zygote-specific division. While Dnmt3a (a methyltransferase cooperating with Fam208a), which recruits the HUSH complex to its active sites, is not necessary during zygotic divisions [33], LGN (GPSM2) is important for nuclear positioning and the establishment of cellular polarity, particularly during embryonic cleavage [31]. Thus, Fam208a seems to act, besides the HUSH complex, during early zygotic division and cleavage. Data from expression profiling were supported by LC-MS proteomics, showing a similar variety of interactions. However, it appears that both experimental setups, embryonic and cellular, point to the involvement of Fam208a in DNA/RNA binding and transcription regulation, as the majority of affected proteins are functionally linked to these ontologies.

While the analysis of Fam208a interactions in cell lines revealed that the protein partners act more as cell cycle regulators, the analyses performed in embryos identified a protein group participating in epigenetic processes. It therefore seems that the epigenetic machinery in cell lines is more stable even if Fam208a is downregulated, while in embryos, Fam208a removal causes an accumulation of errors. In summary, the analysis of the proteomics data suggests that Fam208a has a crucial role during zygotic cleavage. Its role in epigenetic regulation becomes a key function once the methylation processes are needed, i.e. the epigenetic machinery in cell lines is not influenced by the ablation of Fam208a, while Fam208a removal in embryos causes gastrulation arrest and failure of primitive streak formation. Thus, the depletion of Fam208a does not seem to affect standard mitotic division. If the function of the HUSH complex is impaired (Fam208a KO), cells can operate through other mechanisms, e.g. through the Kap1 complex [34]. However, when MPHOSPH8 is downregulated and thus excluded from HUSH and other methylation complexes, e.g. with Dnmt3a [35], cells are barely able to cope with this loss. Indeed, ablation of MPHOSPH8 predominantly causes an increase in proliferation followed by lower viability and higher sensitivity to cell death.

Altogether, we identified 16 putative Fam208a-interacting partners, some of which, besides the HUSH complex, form a novel protein network (Eml1, Svil, Gpsm2, Amn1, and Itgb3bp) linking Fam208a to the maintenance of genome stability via control of the function of the spindle apparatus. This new role of Fam208a within a unique complex appears to affect the processes of early embryonic division, when the HUSH function is paused. The epigenetic role of Fam208a seems to be more crucial during and after the differentiation process, and the biological functions of this dual-acting protein may not only be relevant to developmental processes but also seem to be cell type- and tissue-specific. The molecular mechanism is yet to be discovered; nevertheless, based on the other putative interacting proteins, we conclude that Fam208a is a crucial protein involved in several mechanisms maintaining genome stability.


Conclusions

In summary, we demonstrated that Fam208a plays an important role in early embryonic development and is involved in the organisation of spindle poles and spindle apparatus assembly. To perform this role, distinct interaction partners must be present besides those responsible for epigenetic silencing (HUSH complex). We identified 16 proteins as putative interaction partners, 5 of which are linked with cell division processes. This library was profiled and showed high tissue specificity. Thus, the roles of Fam208a may differ in a cell-specific manner, as they likely depend on the availability of its interaction partners. Altogether, our work provides important insight into the molecular landscape of Fam208a as a multi-interacting protein affecting genome stability and playing an essential role in the initial steps of embryonic development by orchestrating spindle pole assembly and chromosome segregation.


Abbreviations

MPP8: Mphosph8; FAM: Fam208a; DAPI: 4’,6-diamidino-2-phenylindole hydrochloride; GFP: green fluorescent protein; A488: Alexa 488; IMPC: The International Mouse Phenotyping Consortium; ENU: N-ethyl-N-nitrosourea; KO: knockout; EPLIN: epithelial protein lost in neoplasm; Y2H: yeast two-hybrid

Authors’ contributions

VG performed the experiments and prepared the manuscript; VN was responsible for statistical analyses of high-throughput methods (LC-MS, BIOMARK); SB and IJ prepared the murine Fam208a KO strain; MP dissected mice and isolated embryos; JP and RS designed the experiments and proofread and corrected the manuscript. All authors read and approved the final manuscript.

Author details

1 Laboratory of Transgenic Models of Diseases, Institute of Molecular Genetics of the ASCR, v.v.i., Průmyslova 595, Vestec, Czech Republic

2 Czech Centre for Phenogenomics, Institute of Molecular Genetics of the ASCR, v.v.i., Průmyslova 595, Vestec, Czech Republic

Acknowledgements

We thank Karel Harant and Pavel Talacko from the Laboratory of Mass Spectrometry, Biocev, Charles University, Faculty of Science, where the proteomic and mass spectrometric analyses were performed. We also thank Dr. Epp Trevor for his help with the Y2H experimental design. We would like to acknowledge Dr. Peter Solc for discussions about oocyte staining and fluorescent labelling.

Competing interests

The authors declare that they have no competing interests.

Availability of data and materials

All data generated or analysed during this study are included in this published article and its supplementary information files.

Consent for publication

The authors agree with publishing this manuscript.

Ethics approval and consent to participate

Mice were bred and housed in accordance with animal welfare rules in a pathogen-free facility. All procedures used in this study were in accordance with applicable local laws and in conformity with the animal welfare regulations of the Czech Republic. All animal models and experiments of this study were ethically reviewed and approved by the Institute of Molecular Genetics (c.j. 115/2016).

Funding

The study was supported by RVO 68378050 by Academy of Sciences of the Czech Republic and by LM2015040 (Czech Centre for Phenogenomics), CZ.1.05/2.1.00/19.0395 (’Higher quality and capacity for transgenic models’), CZ.1.05/1.1.00/02.0109 (BIOCEV - Biotechnology and Biomedicine Centre of the Academy of Sciences and Charles University), LQ1604 (National Sustainability Program II project BIOCEV-FAR) funded by the Ministry of Education, Youth and Sports and the European Regional Development Fund.


Figure titles and legends

Figure 1

Downregulation of Fam208a in zygotes leads to embryonic lethality

A) The mutagenesis design shows the location of genotyping primers (F’, R1’, and R2’) and guide RNAs (G1 and G2), with the PAM sequence labelled in yellow. B) Quantitative analysis of mRNA levels shows a slight increase in Fam208a transcription levels in heterozygous mutants and a reduction of RNA in homozygous mutants. C) Within five litters, no viable homozygote was observed. D) Bright-field microscopy shows morphological differences between the development of Fam208a +/+, Fam208a -/+, and Fam208a -/- embryos at embryonic day 9.5. E) As no mutants were born, we examined embryos at stages E8.5 and E9.5. Of 43 dissected embryos (E8.5+E9.5), 12 were full mutants with the Fam208a -/- genotype, 13 were wild types, and 18 were heterozygous for the deletion in Fam208a. F) Graph representing the number of somites at embryonic stage E9.5, with milder differences observed between wild types and heterozygotes and a marked reduction in the number of somites in Fam208a -/- embryos.

Figure 2

Fam208a mutation massively impacts protein expression profile in homozygous mice

A) Heat map representing all identified genes in the LC-MS data obtained from E9.5 embryos. Wild types and a heterozygote show high similarity in the expression profile, while a homozygous embryo shows a different pattern. B) Detailed visualization of the affected proteins and distinct expression profiles identified in a homozygous Fam208a KO embryo. C) The affected proteins were grouped according to their ontologies, and the chart represents their distribution amongst these groups; numbers in lines represent common ontologies for a protein.


Figure 3

Downregulation of Fam208a in zygotes leads to formation of multipolar spindles

A) RNA-seq results show increased expression of Fam208a after two-cell stage activation. The first line of the expression profile, at the one-cell stage, shows a low amount of identified RNA, mainly of maternal origin. In the third line, presenting expression after two-cell stage activation (four cells), we see increased levels of both introns and exons, which is evidence of embryonic transcription. In the morula, the fourth line, the trend is still detectable, but the highest expression peak for Fam208a is during the first rounds of zygotic division. B) Immunofluorescence staining of zygotes either 42 hours or 72 hours after siRNA electroporation shows an increased incidence of multipolar spindle formation in the absence of Fam208a (n=9/11). Red arrows point to the spindle poles of dividing cells. C) Detailed view of the spindle apparatus with a multipolar spindle in a Fam208a-downregulated cell at different planes (z) to observe all formed spindle centers (left panel), and two different planes of a normal spindle apparatus with two spindle poles in a control cell (right panel).

Figure 4

Y2H screen revealed novel Fam208a interacting network in spindle assembly machinery

A) The yeast two-hybrid screen used several constructs as bait to pull down interaction partners; a scheme of the prepared constructs is shown with an outline of possible interaction domains with competitive binding partners. Constructs α, β, and B lack cassette exon 17 (60 bp); the α construct also had an identified alternative reading frame caused by splicing differences, and the β variant had a deletion in exon 22 that influences the reading frame in the rest of the protein. B) All identified preys are listed in a table with symbols indicating the observed strength of their interaction based on the size and color of the yeast diploid colonies after mating. C) Proposed scheme with the already identified functions of proteins involved in the cell division process. Gpsm2, mitotic spindle pole organization; Amn1, nuclear orientation checkpoints; Eml1, assembles and organizes microtubules and regulates orientation of the spindle apparatus; Itgb3bp, member of a centromere-specific complex, recruits histone H3 to the centromere region; Svil, coordinates actin filaments and myosin II during cell spreading.

Figure 5

Fam208a is differentially co-expressed with putative partner proteins in adult murine tissues

A) Complex heat map based on qRT-PCR data showing expression patterns of 18 pre-selected genes within 20 different murine tissues, with dark blue representing relatively low expression and bright red representing relatively high expression. The X axis is divided into columns representing genes, and the Y axis forms rows representing murine tissues. 'F' stands for female and 'M' for male samples. B) In situ hybridization (ISH) staining of paraffin sections of testes. C) ISH of ovaries. NC = negative control.

Figure 6

Ablation of FAM208a and MPHOSPH8 in somatic cells does not impair the cell division processes

A) Relative fold ratio of expression levels of HUSH complex proteins in mutated Hek293t cell lines. The levels of FAM208a in Fam208a mutants and of MPHOSPH8 in Mpp8 mutants are below the detection limit. There is almost no effect on PERIPHILIN (PPHLN1) levels, but SETDB1 could not be detected in any of the affected cell lines. B) To study the effect of downregulation of FAM208a and MPHOSPH8 in somatic cell lines, analyses of live cell concentration, total viability, and cell size were performed.

Figure 7

Ablation of FAM208a and MPHOSPH8 in somatic cells did not impair the cell division processes

A) Hek293t cells were mutated with the CRISPR/Cas9 system targeting either the Fam208a or the Mphosph8 gene. LC-MS was used to identify changes caused by these mutations. Normal wt Hek293t cells (HekWT) were used as a control and standard sample; the HekMT line does not carry any mutations and was used as a control for the mutagenesis process, which could itself have introduced changes; lines Fam-a1, Fam-a2 and Fam-a3 carry different deletions in FAM208a, and lines Mpp8-a and Mpp8-b carry a knockout of the MPHOSPH8 protein. B) Expression profiles of the cell lines differ in approximately one fourth of the proteins identified by LC-MS. C) Ontology graph representing 127 affected proteins grouped according to their ontologies. Proteins might be involved in several categories simultaneously, and numbers above the lines represent proteins common to the linked groups. The first number in brackets represents upregulated proteins, while the second shows downregulated proteins.

Supplementary Figure S1

A) Wild-type and Fam208a heterozygous E12.5 embryos do not exhibit any obvious morphological differences. Homozygotes are not viable, but resorptions are always present at this stage. B) Diagram of the measured weight of E12.5 embryos; no data are available for the resorbed embryos carrying the homozygous mutation. The difference between wild types and heterozygotes at this stage of embryonic development is negligible.

Supplementary Figure S2

Chart representing relative amounts of detected proteins in murine E9.5 embryos. Fam208a showed generally low expression, which is below the detection limit in homozygotes. Pphln1 and Setdb1 do not display any alterations in their expression, while Mphosph8 shows decreased values in homozygotes.


Table 1

List of identified and verified putative interaction partners with their ontologies.

| Gene | Name | Ontology |
|---|---|---|
| Eml1 | Echinoderm microtubule-associated protein-like | cell division, spindle |
| Svil | Supervillin | cell division, adhesion |
| Gpsm2 | G-Protein Signalling Modulator 2 | cell division, spindle |
| Itgb3bp | Integrin Subunit Beta 3 Binding Protein | cell division, kinetochore |
| Amn1 | Antagonist Of Mitotic Exit Network 1 Homolog | cell division, spindle |
| Cntn1 | Contactin 1 | adhesion |
| Etfa | Electron Transfer Flavoprotein Alpha Subunit | energy |
| Psmd8 | Proteasome 26S Subunit, Non-ATPase 8 | proteasome, degradation |
| Inpp5a | Inositol Polyphosphate-5-Phosphatase A | Ca regulation |
| S100a10 | Calpactin | Ca binding |
| Mphosph8 | M-Phase Phosphoprotein 8 | HUSH |
| Pphln1 | Periphilin | HUSH |
| Tmem100 | Transmembrane protein 100 | differentiation |
| Alb | Albumin | carrier protein |
| Parpbp | PARP1 Binding Protein | DNA/RNA mechanism |
| Hcfc2 | Host Cell Factor C2 | DNA/RNA mechanism |
| Ncbp1 | Nuclear Cap Binding Protein Subunit 1 | DNA/RNA mechanism |


Supplementary Table 1

Sequences of primers for ISH probes and the qRT-PCR Biomark assay.

ISH probes

| Gene | Forward | Reverse | Product (bp) |
|---|---|---|---|
| Gpsm2 | CACGAGCAGCGTCTCCTAAT | TTGTCATCAGGCGCTCAAGT | 1090 |
| Alb | AGATGACAGGGCGGAACTTG | GGTTTGGACCCTCAGTCGAG | 962 |
| Itgb3bp | GGAGCACAGAAACGGACCAT | AGGAATTCAAAGCTGTCAAGATGA | 354 |
| Etfa | AGTAGCTGGCGTAGCAAAGG | GCCACCTGGAAAATTGGAGC | 726 |
| Psmd8 | GAACCGGAAGAACCCGAACC | TGGCCAGTTCAGTAGAGGGG | 1367 |
| Inpp5a | CCTGCTGGTCACGGCCAA | TTCCGGTCATTGTCTGACTCC | 792 |
| Hcfc2 | ATGGTTCTGGTGTTGTGGGC | GGAAGTGTTGGGTGCCAATG | 908 |
| Parpbp | CACATGCCAGAGTCACCAGT | AGTGAAGAGCAGACAAGGGC | 928 |
| Cntn1 | GATCCTGCCTTGGACCTCAC | CATCACTGGAAGGTCCGCAT | 1099 |
| Amn1 | AGCATTCGGGGTCGGATAAC | TGTTTGGCACATGGTCCACT | 452 |
| Svil | GGCCTTTGGTAGAGCACAGT | GCTGCATATCTGGTCTGGCT | 984 |
| Eml1 | ACATCACGGAGGAGCAACAG | CCGCAAATACCGCTTCGTTC | 978 |
| Ncbp1 | CCGACAAACCACATCCACAG | GTGCAGCTTCCCCTTATCACTC | 1100 |
| Mphosph8 | GCTTGAAAGCACGAATGCCT | TGTTGCAGTCAGCTCCACAT | 1043 |

Biomark primers

| Gene | Forward | Reverse | Product (bp) |
|---|---|---|---|
| Gpsm2 | TCTTCGACATCCTTGTAAAGTGC | GGACAGTCGGCCCCTTAG | 90 |
| Alb | GACTTTGCACAGTTCCTGGAT | TGCATCTAGTGACAAGGTTTGG | 91 |
| Itgb3bp | GGAACTTATCAGTTGAGCCCATT | AGTTACTCCGTTTCCTTGTTTCA | 97 |
| Etfa | GGAGCGTCTGCTTTTGGA | TGATGTCAGAAACTGGAGCAA | 76 |
| Psmd8 | TCTACATCAAACACCCTGTTTCC | CTTTCGGCAGGGATGTTC | 94 |
| Inpp5a | ATTCGGACACTTTGGAGAGC | CCTTTTCTTGACCATTTGCAC | 88 |
| Hcfc2 | CGTACCAAGCTACATCGTCTGA | CCTTGTCTGTGAGGGTCCA | 76 |
| Parpbp | CCAACAACATCAGTCCTGTCC | GACCAGCAAAATTTTCACAGC | 92 |
| Cntn1 | AACAAGGAAATTACGCATATCCA | CATTTCGGATGAGCAGTTCC | 76 |
| Amn1 | GGATAACAGATTCCAATATAAGTGAGG | GCTGGAGAGCCACATCTGA | 93 |
| Svil | AGGACCGTTCACACACACAG | AGGAAGTCTCGCCGTTACAG | 113 |
| Eml1 | CATCTCCCCCACCATGTC | CGATGCGGTCTGACACCT | 89 |
| Ncbp1 | CTACACTGCTAATCGAACTGTGC | GCATGTACAGCATCTCAGTCG | 84 |
| Mphosph8 | GGGGAGGACGTTTTCGAG | GATGTATATCCTTTCCATCGAACTTT | 92 |
| Tmem100 | TTGCTGCTGTCTCAGTCCAC | AAAGAGCCTGTCACCCACTG | 88 |

Supplementary Table 2

Primer sequences used for cloning into the Y2H vector and for production of the mutant cell lines and the murine strain.

Yeast Two Hybrid

| Name | Plasmid | Restriction enzyme | Sequence |
|---|---|---|---|
| N'terF | pGBKT7 | XmaI | AACCCGGGGACTGCCGCGGAGACG |
| N'terR | pGBKT7 | SalI | AAAGTCGACAAGCCAATGGACTGTGGAGA |
| C'terF | pGBKT7 | EcoRI | CAATGTAGAAAAGAATTCAAAACTAT |
| C'terR | pGBKT7 | SalI | CAAGTCGACTTAATGGAGATTTCTCTGTACCC |

Crispr/Cas9 cell line

| Name | Sequence |
|---|---|
| F2F | CACCGTTGCAGCCTTTATGAAGTTG |
| F2R | AAACCAACTTCATAAAGGCTGCAAC |
| F3F | CACCGGTTTCCTTATAAAACAGTGC |
| F3R | AAACGCACTGTTTTATAAGGAAACC |
| M82bR | CACCGTGATGCTTGCCGCCGCCGGA |
| M82bR | AAACTCCGGCGGCGGCAAGCATCAC |
| VerifyFam2+3F | GGTTGGAAATATTGCCTGGCT |
| VerifyFam2+3R | CAGCAACAGACAGACACCTCA |
| VerifyM8a2abF | GGCAGGGTTACCACAAACCT |
| VerifyM8a2abR | GTCCATCGGCAGGAATACCA |

Crispr/Cas9 mice

| Name | Sequence |
|---|---|
| Genotyping F | TCAGAGCAGACCGATCACAC |
| Genotyping R1 | GAAACGCTTCAAACCTGAGC |
| Genotyping R2 | GCACTCCAGCCACAGAGAC |
| Fam208a 5c+T7 | TAATACGACTCACTATAGGG ATGGCGTCGACGCTTTCCC |
| Fam208a 3b+T7 | TAATACGACTCACTATAGGG ACCCTATCCTCTCGCACCAA |
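The primers in Supplementary Table 2 can be sanity-checked computationally. The following is a minimal Python sketch, not part of the original methods. It assumes that the leading CACC and AAAC of the Crispr/Cas9 cell-line oligos are cloning overhangs, which the table does not state, and it checks that the remainder of each pair is mutually reverse-complementary. It also checks that each Y2H primer contains the recognition site of the restriction enzyme listed for it (XmaI CCCGGG, SalI GTCGAC, EcoRI GAATTC). The function and variable names are hypothetical; the sequences are copied from the table, where both M82b oligos are labelled M82bR.

```python
# Illustrative consistency check for the primers in Supplementary Table 2.
# Assumption (not stated in the table): CACC / AAAC are 5' cloning overhangs.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def is_annealing_pair(top: str, bottom: str,
                      top_overhang: str = "CACC",
                      bottom_overhang: str = "AAAC") -> bool:
    """True if both oligos carry the assumed overhangs and their remaining
    parts are reverse complements of each other."""
    if not top.startswith(top_overhang) or not bottom.startswith(bottom_overhang):
        return False
    top_core = top[len(top_overhang):]
    bottom_core = bottom[len(bottom_overhang):]
    return reverse_complement(top_core) == bottom_core


# Oligo pairs from the "Crispr/Cas9 cell line" block of the table.
SGRNA_OLIGOS = {
    "F2": ("CACCGTTGCAGCCTTTATGAAGTTG", "AAACCAACTTCATAAAGGCTGCAAC"),
    "F3": ("CACCGGTTTCCTTATAAAACAGTGC", "AAACGCACTGTTTTATAAGGAAACC"),
    "M82b": ("CACCGTGATGCTTGCCGCCGCCGGA", "AAACTCCGGCGGCGGCAAGCATCAC"),
}

# Standard recognition sequences of the enzymes named in the table.
RECOGNITION_SITES = {"XmaI": "CCCGGG", "SalI": "GTCGAC", "EcoRI": "GAATTC"}

# Primers from the "Yeast Two Hybrid" block: name -> (enzyme, sequence).
Y2H_PRIMERS = {
    "N'terF": ("XmaI", "AACCCGGGGACTGCCGCGGAGACG"),
    "N'terR": ("SalI", "AAAGTCGACAAGCCAATGGACTGTGGAGA"),
    "C'terF": ("EcoRI", "CAATGTAGAAAAGAATTCAAAACTAT"),
    "C'terR": ("SalI", "CAAGTCGACTTAATGGAGATTTCTCTGTACCC"),
}


def check_all() -> dict:
    """Collect a pass/fail flag for every oligo pair and every Y2H primer."""
    results = {}
    for name, (top, bottom) in SGRNA_OLIGOS.items():
        results[f"pair:{name}"] = is_annealing_pair(top, bottom)
    for name, (enzyme, seq) in Y2H_PRIMERS.items():
        results[f"site:{name}"] = RECOGNITION_SITES[enzyme] in seq
    return results


if __name__ == "__main__":
    for label, ok in check_all().items():
        print(f"{label}: {'ok' if ok else 'MISMATCH'}")
```

Under the stated overhang assumption, a check of this kind is a quick way to catch transcription errors in a primer table before ordering oligos.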




D14Abb1e – Tracking Down A Putative Suppressor Of Variegation

VERONIKA GREŠÁKOVÁ a,b, SLAVOMIR KINSKY a, RADISLAV SEDLÁČEK a,c, TREVOR A. EPP a

a Laboratory of transgenic models of diseases, IMG ASCR, 142 20 Prague 4; b Faculty of Medicine, Palacky University, 771 26 Olomouc; c Czech Centre for Phenogenomics (BIOCEV/IMG), IMG AS CR, 142 20 Prague 4

Formation of heterochromatin is important for many aspects of nuclear function, including suppression of endogenous retroviruses, control of gene dosage, and silencing of genes affected by DNA damage (senescence-associated heterochromatic foci). In order to identify genes important for heterochromatin formation, an ENU (N-ethyl-N-nitrosourea) mutagenesis screen was conducted on a transgenic mouse line containing a GFP reporter transgene that exhibits variegated expression in erythrocytes [1]. Mutants that suppress or enhance variegation were termed the Modifiers of Murine Metastable Epialleles (Momme), and to date more than 40 mutant mouse lines have been identified in the dominant screen. The underlying causative mutations have been identified for many of the Momme mutants, and most correspond to genes well known for their role in heterochromatin formation (e.g. Dnmt1, Dnmt3b, Hdac1, Snf2h). However, one gene identified twice in the screen (MommeD6 and MommeD20) remains largely unknown: D14Abb1e. The MommeD6 line contains a nonconservative substitution and the MommeD20 line contains a splice site mutation [2]. Both are homozygous lethal during gastrulation, with compound heterozygosity replicating the homozygous phenotype. Our goal is to identify the molecular function of D14Abb1e and understand its possible role as a suppressor of variegation. Preliminary results from our yeast two-hybrid screen suggest that D14Abb1e could indeed be associated with chromatin remodeling complexes.

This work was supported by AS CR (RVO 68378050), and by Ministry of Education, Youth and Sports, CR (OP RDI CZ.1.05/1.1.00/02.0109 (BIOCEV)), and OPVK projects CZ.1.07/2.3.00/20.0102 and CZ.1.07/2.3.00/30.0027.

1. Daxinger L., Harten S.K., Oey H., Epp T., Isbel L., Huang E., Whitelaw N., Apendaile A., Sorolla A., Yong J., Bharti V., Sutton J., Ashe A., Pang Z., Wallace N., Gerhardt D., Blewitt M.E., Jeddeloh J.A., Whitelaw E.: Genome Biol. 14, R96 (2013).

2. Harten S.K., Bruxner T.J., Bharti V., Blewitt M., Nguyen T., Whitelaw E., Epp T.: submitted.

From ENU mutagenesis to knock-in reporter fusion alleles: using genetic technologies to study the role of Fam208a

Veronika Grešáková1, 2; Shohag Bhattacharyya1; Bjorn Schuster1; Inken M. Beck1; Radislav Sedláček1; Kallayanee Chawengsaksophak1 and Trevor A. Epp1

1 Biocev, IMG AV CR v.v.i., Videnska 1083, Praha 14220, Czech Republic

2 LF UPOL, Tr. Svobody 8, Olomouc 77126, Czech Republic

Heterochromatin formation is essential for suppressing undesirable or unrequired genetic elements, and thereby serves essential protective, structural and regulatory roles. In order to identify genes important for heterochromatin formation, a dominant ENU mutagenesis screen was conducted on a transgenic mouse line containing a GFP reporter under the control of a hemoglobin promoter. Mutants that suppress or enhance variegation of GFP expression were termed Modifiers of Murine Metastable Epialleles (Momme). We chose to focus on two mutant lines that suppressed transgene variegation and contain point mutations in the uncharacterized protein Fam208a: the MommeD6 line contains a non-conservative substitution and the MommeD20 line contains a splice site mutation. Both are homozygous lethal during gastrulation, with compound heterozygosity replicating the homozygous phenotype. Our results show that this protein is a putative player in heterochromatin formation via its interaction with the ankyrin repeat domain of the Mphosph8 protein. This interaction was identified by yeast two-hybrid screening and verified by immunoprecipitation and immunofluorescence in both human (HEK 293T) and murine (NIH 3T3) cells. Interestingly, the MommeD6 mutation seems to impair nuclear localization of Fam208a as well as its interaction with Mphosph8. More recently, we have successfully prepared a fluorescent reporter knock-in strain using a CRISPR/Cas9-driven strategy and have confirmed its single-copy integration by droplet digital PCR. This strain will help us to visualize and study the endogenous protein in its native form.

Keywords: Fam208a, citrine, CRISPR/Cas9, Momme, variegation, suppressor