Endothelial Heterogeneity: A Role for Epigenetic Pathways in Constitutive Gene Expression

by

Lucy Chen

A thesis submitted in conformity with the requirements for the degree of Master of Science

Department of Medical Biophysics
University of Toronto

© Copyright by Lucy Chen 2019

Endothelial Heterogeneity: A Role for Epigenetic Pathways in Constitutive Gene Expression

Lucy Chen

Master of Science

Department of Medical Biophysics
University of Toronto

2019

Abstract

Endothelial cells (ECs) have distinct gene expression and morphology depending on their location in the vasculature. A prevailing assumption is that these differences are largely caused by environmental factors, which differentially modulate gene expression in an otherwise homogeneous cell population. Here, we investigated the mechanistic basis for heterogeneity in EC gene expression and its functional importance, using P-selectin as a model gene. We found that heterogeneity exists basally in P-selectin expression across several human EC types. RNA FISH also showed heterogeneity in SELP transcription in isogenic populations of human umbilical vein endothelial cells (HUVEC). Differential proximal promoter DNA methylation, in part, accounted for this protein and RNA heterogeneity. In vitro methylation of the SELP promoter repressed transcription in episomal promoter reporter transfections. Importantly, promoter methylation was shown to be mitotically heritable. Using informative heterozygous SNPs in the SELP gene, we also report that P-selectin is monoallelically expressed in HUVEC.


Acknowledgments

A number of people have helped me get to this point in my scientific career, and without them none of this would have been possible.

First, I’d like to thank my supervisor Dr. Philip A. Marsden, for teaching me how to pay attention to detail, for encouraging me to think independently and critically, and for supporting me during those times when experiments just wouldn’t work. You are a brilliant mind and I am fortunate to have been mentored and coached by you. Thank you for making me both a stronger person and a better scientist. I will always remember to keep things in perspective and consider the bigger picture.

Graduate school has certainly had its share of challenges, both personal and professional. I definitely have had many ups and downs over the past few years. But one of the major highlights for me during graduate school was the development of some great friendships. In particular, I’d like to thank Paul Turgeon. To have worked with you, learned from you, and laughed with you was truly a gift and a pleasure. We have suffered together at times, but I’m glad that I had someone to complain with. I will miss your humor and advice.

I would also like to thank Eileen Tran and Joseph Samuel. We’ve had some great moments both in and out of the lab. I will always remember the times we laughed till we cried and the silly sense of humour we all shared. Thank you for inspiring me and for your constant willingness to help.

Next, I would like to thank all of the past and present members of the Marsden lab. I’m so thankful to have been welcomed and accepted into this wonderful group of brilliant and quirky individuals. Each of you has contributed in some way to making these last few years the best of my life. No matter how stressful, frustrating, or overwhelming things got, I still looked forward to coming into lab every day because of you guys. Thank you for all the lively discussions and helpful experimental advice. I’m so grateful to have shared in moments that will forever be remembered and retold as classic Marsden lab stories.

I would also like to express my deepest thanks to the amazing research facility core specialists at the Keenan Research Centre for Biomedical Science. Chris Spring, Caterina Di Ciano-Oliveira, and Pamela Plant have been invaluable resources. Their expertise and advice have contributed immensely to helping me troubleshoot, interpret, and move forward with my research.

Additionally, I would like to thank my thesis committee members, Dr. Mathieu Lupien, Dr. Myron Cybulsky, and Dr. Daniel De Carvalho. They have all provided me with great support and guidance over the years. I have appreciated our discussions and the reassurance of having them in my corner through all these years.

Finally, I would like to thank my family, who have supported me in chasing my dreams, and who have never stopped believing in me or in what I can achieve.


Accreditation of Work

The information in this thesis is the original work performed by Lucy Chen. Experimental design, execution, and analysis of data were completed by Lucy Chen and her supervisor Dr. Philip A. Marsden.

Samples treated with 5-Aza-2’-deoxycytidine and/or Trichostatin A were prepared by Kyung Ha Ku.

One HUVEC and VSMC single strand high resolution bisulfite experiment was done by Apurva V. Shirodkar. Ariel Gershon and Michael K. Lee assisted in some sodium bisulfite data analysis.

NanoString data used to compare the expression of MAE and BAE was generated by Aravin N. Sukumar.

ChIP-chip data from the custom ultra-high resolution tiling array was generated by Matthew Shu Ching Yan.

FISH experiments were performed by Noeline Subramaniam.


Table of Contents

Abstract ...... ii

Acknowledgments...... iii

Accreditation of Work ...... v

Table of Contents ...... vi

List of Abbreviations and Acronyms ...... x

List of Tables ...... xii

List of Figures ...... xiii

Chapter 1 ...... 1

Introduction ...... 1

1.1 Endothelial Cells and the Vascular Endothelium ...... 1

1.1.1 Physiology and Heterogeneity of the Vascular Endothelium ...... 1

1.1.2 Endothelial Adhesion Proteins and Inflammation ...... 3

1.1.2.1 von Willebrand Factor, P-selectin, and the Weibel-Palade Body ...... 3

1.1.3 Regulation of P-selectin Expression ...... 6

1.1.4 P-selectin in Disease ...... 10

1.2 Transcriptional Regulation of Endothelial Gene Expression ...... 13

1.2.1 Epigenetics ...... 13

1.2.1.1 RNA Based Mechanisms ...... 14

1.2.1.2 DNA Methylation ...... 15

1.2.1.3 Histone Density, Post-translational Modifications and Variants ...... 17

1.2.2 Maintenance of Epigenetic Marks ...... 19

1.3 Methods and Mechanisms that Determine Endothelial Heterogeneity ...... 20

1.3.1 Monoallelic and Allele-biased Gene Expression ...... 21

1.3.1.1 X-Chromosome Inactivation ...... 21


1.3.1.2 Imprinting ...... 22

1.3.1.3 Autosomal Allele-biased Expression ...... 23

1.3.2 Consequences of Monoallelic Gene Expression ...... 25

1.3.3 Generating Diversity Through Allele-biased Expression ...... 26

1.3.4 Single-Cell RNA Sequencing ...... 27

Chapter 2 ...... 28

Thesis Overview...... 28

2.1 Current Understanding and Outstanding Questions ...... 28

2.2 Heterogeneity of Endothelial Enriched Genes von Willebrand Factor and Vascular Cell Adhesion Molecule 1 ...... 30

2.3 Hypothesis...... 33

2.4 Specific Aims ...... 33

Chapter 3 ...... 34

Chromatin-based Pathways Mediate Basal Expression Heterogeneity of P-selectin in Endothelial Cells ...... 34

3.1 Materials and Methods ...... 34

3.1.1 Cell Culture ...... 34

3.1.2 Isolation of Single Cells for Clonal Expansion ...... 35

3.1.3 Fluorescence Activated Cell Sorting ...... 35

3.1.4 DNA Isolation ...... 36

3.1.5 RNA Isolation ...... 36

3.1.6 Absolute Quantification of Gene Expression by Reverse Transcriptase-qPCR (RT-qPCR) ...... 36

3.1.7 Sodium Bisulfite Genomic Sequencing ...... 38

3.1.8 Pyrosequencing of Bisulfite Converted DNA...... 41

3.1.9 Inhibition of DNA Methylation or Histone Deacetylation ...... 41

3.1.10 Plasmids Used for Promoter Reporter Experiments ...... 41


3.1.11 In Vitro Methylation of Promoter/Reporter Constructs ...... 43

3.1.12 Plasmid Transfection and Luciferase Assay ...... 43

3.1.13 Chromatin Immunoprecipitation (ChIP) ...... 44

3.1.14 Absolute Quantitative PCR Analysis of ChIP Samples ...... 44

3.1.15 PCR and Sanger Sequencing to Detect Heterozygous SNPs ...... 45

3.1.16 Immunofluorescence and Confocal Imaging ...... 47

3.1.17 Single-cell RNA Sequencing (scRNA-seq) ...... 48

3.1.18 Fluorescence In Situ Hybridization ...... 49

3.2 Results ...... 50

3.2.1 Existence of Basal P-selectin Protein Heterogeneity ...... 50

3.2.2 Heterogeneity of P-selectin at the Level of RNA ...... 51

3.2.2.1 Fluorescence In Situ Hybridization ...... 53

3.2.2.2 Single-cell RNA Sequencing ...... 54

3.2.3 P-selectin Protein Heterogeneity Correlates with RNA Expression ...... 69

3.2.4 Promoter Methylation Correlates with P-selectin mRNA Expression in Endothelial and Non-endothelial Cell Types ...... 71

3.2.5 Promoter Methylation Correlates with P-selectin Protein Expression in HUVECs ...... 88

3.3 The Role of Histone Post-translational Modifications on P-selectin Expression ...... 90

3.3.1 Comparison of Histone Acetylation ChIP-chip Data with ChIP-seq Data from the ENCODE Consortium...... 90

3.3.2 Allele-biased Regulation of P-selectin Expression ...... 93

3.4 Mitotic Stability of P-selectin Epialleles ...... 102

3.4.1 Conservation of Promoter DNA Methylation ...... 102

3.5 Discussion ...... 107

3.6 Future Directions ...... 113

3.7 Concluding Remarks ...... 116


References ...... 118


List of Abbreviations and Acronyms

5-Aza-CdR    5-aza-2'-deoxycytidine, decitabine
5hmC    5-hydroxymethylcytosine
5mC    5-methylcytosine
AUROC    Area under the receiver operator characteristic curve
BAE    Biallelic expression
BAEC    Bovine aortic endothelial cell
BSA    Bovine serum albumin
cAMP    Cyclic adenosine monophosphate
CGI    CpG island
ChIP    Chromatin immunoprecipitation
CTCF    CCCTC-binding factor
CVD    Cardiovascular disease
DAPI    4′,6-diamidino-2-phenylindole
dbMAE    The Database of Autosomal Monoallelic Expression
DNase I    Deoxyribonuclease I
DNMT    DNA methyltransferase
DNA    Deoxyribonucleic acid
EC    Endothelial cell
EDTA    Ethylenediaminetetraacetic acid
ENCODE    Encyclopedia of DNA Elements
ERG    ETS-related gene
ESM1    Endothelial-specific molecule 1
ETP    Early T-cell progenitors
ETS    E26 transforming sequence
FACS    Fluorescence activated cell sorting
FBS    Fetal bovine serum
FISH    Fluorescence in situ hybridization
HAEC    Human aortic endothelial cell
HCAEC    Human coronary artery endothelial cell
HDAC    Histone deacetylase
HEPES    4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid
HMVEC    Human dermal microvascular endothelial, neonatal
HuAoVSMC    Human aortic vascular smooth muscle cell
HUVEC    Human umbilical vein endothelial cell
ICAM1    Intercellular adhesion molecule 1
ICR    Imprinting control region
IGF2    Insulin-like growth factor 2
IL    Interleukin
KAT    Lysine acetyltransferase
lncRNA    Long non-coding RNA
LPS    Lipopolysaccharide
MAD    Median absolute deviation
MAE    Monoallelic expression
MAF    Minor allele frequency
MBD    Methyl-CpG binding domain
mRNA    Messenger RNA
NcRNA    Noncoding RNAs
NF-κB    Nuclear factor kappa B
NHEK    Normal human epidermal keratinocytes
Obs/Exp    Observed/Expected CpG ratio
OR    Odorant receptor
PAR    Protease-activated receptor
PBS    Phosphate buffered saline
PCA    Principal component analysis
PCNA    Proliferating cell nuclear antigen
PCR    Polymerase chain reaction
PI    Propidium iodide
Pol II    RNA polymerase II
PRC2    Polycomb repressive complex 2
PSGL-1    P-selectin glycoprotein ligand-1
qPCR    Quantitative PCR
RNA    Ribonucleic acid
RNase    Ribonuclease
Robo4    Roundabout 4
ROS    Reactive oxygen species
RT-qPCR    Reverse transcriptase quantitative polymerase chain reaction
scRNA-seq    Single-cell RNA sequencing
SCPC    Sickle cell pain crisis
SEM    Standard error of the mean
sLeX    Sialyl-Lewis X
SNP    Single nucleotide polymorphism
SNV    Single nucleotide variation
sP-selectin    Soluble P-selectin
SV40    Simian virus 40
TBP    TATA-binding protein
TET    Ten-eleven translocation
TF    Transcription factor
TNF-α    Tumor necrosis factor α
TSA    Trichostatin A
TSS    Transcriptional start site
tSNE    t-distributed stochastic neighbor embedding
UMI    Unique molecular identifier
VCAM1    Vascular cell adhesion molecule 1
VWF    von Willebrand Factor
WPB    Weibel Palade body
XCI    X-chromosome inactivation
Xic    X-inactivation centre


List of Tables

Table 1: ChIP-qPCR and RT-qPCR primer sequences...... 37

Table 2: Sodium bisulfite genomic sequencing primer set ...... 40

Table 3: Primers used to make pGL2-1796/+92 ...... 42

Table 4: H3K27me3 and H3K36me3 SNP PCR primers ...... 46

Table 5: H3K27me3 and H3K36me3 SNP Sanger sequencing primers ...... 47

Table 6: Top 25 marker genes for each cluster ...... 59

Table 7: Basal Expression Heterogeneity is Not Limited to P-selectin ...... 66

Table 8: scRNA-seq clusters 1-4 (ECs) top 5% P-selectin expressing cells vs bottom 5% (i.e., cells that do not express P-selectin) ...... 68


List of Figures

Figure 1. Morphology of Weibel-Palade bodies in non-stimulated cells...... 4

Figure 2. Constitutive expression of P-selectin in platelets and endothelial cells...... 7

Figure 3. Location of cis-regulatory elements in the proximal promoter of the human P-selectin gene...... 9

Figure 4. Mechanisms of epigenetic gene regulation...... 13

Figure 5. DNA methylation, hydroxymethylation, and demethylation...... 17

Figure 6. Sodium bisulfite sequencing...... 39

Figure 7. Heterogeneity of basal P-selectin protein and SELP transcription in human endothelial cells...... 51

Figure 8. Expression of SELP across multiple endothelial cell types from publicly available microarray data [215] ...... 53

Figure 9. P-selectin mRNA and hnRNA is heterogeneously expressed in cultured HUVEC ...... 54

Figure 10. Endothelial subsets identified by scRNA-seq in cells isolated fresh from a single-donor umbilical cord...... 57

Figure 11. Single-cell RNA sequencing reveals variability in housekeeping gene expression. ... 62

Figure 12. Single-cell RNA sequencing indicates that SELP heterogeneity is present in vivo in fresh human tissues, just as it is in vitro in cultured HUVEC...... 65

Figure 13. FACS analysis of histamine treatment on P-selectin expression in HUVEC...... 70

Figure 14. Interindividual heterogeneity in human P-selectin promoter DNA methylation using bisulfite sequencing of six independent HUVEC lines...... 75

Figure 15. Heterogeneity in human P-selectin promoter DNA methylation among endothelial cell types...... 78


Figure 16. Alignment of the human P-selectin sequence with other vertebrate mammals demonstrated weak conservation for many of the promoter CpG sites...... 79

Figure 17. Location of CG dinucleotides, SNPs, and cis-regulatory elements in the proximal promoter of the human P-selectin gene...... 80

Figure 18. DNA methylation and RNA Polymerase II patterns at the P-selectin and VWF gene promoters...... 84

Figure 19. Effects of 5-Aza-CdR and TSA on SELP steady-state mRNA and hnRNA levels in HUVEC and HuAoVSMC...... 86

Figure 20. Quantification of human P-selectin gene promoter activity...... 88

Figure 21. Methylation of SELP promoter in ECs sorted for high and low P-selectin protein. ... 89

Figure 22. Integrated Genome Browser view of histone modification and RNA polymerase II profiles of SELP...... 93

Figure 23. Frequency and location of SNPs in the P-selectin gene...... 96

Figure 24. Visualization of the H3K36me3 and H3K27me3 MAE signature of the P-selectin gene in HUVECs...... 97

Figure 25. Sanger sequencing of PCR products containing an informative SNV in HUVEC cells...... 100

Figure 26. Expression level comparison between monoallelic and biallelic genes...... 102

Figure 27. Mitotic conservation of promoter methylation...... 105

Figure 28. Heterogeneity of other EC-enriched genes...... 108


Chapter 1

Introduction

1.1 Endothelial Cells and the Vascular Endothelium

1.1.1 Physiology and Heterogeneity of the Vascular Endothelium

The endothelium and closed cardiovascular system evolved in a relatively short period over 500 million years ago during the Cambrian explosion, where it first appeared in an ancestral vertebrate following the divergence of cephalochordates from vertebrates (reviewed by Monahan-Earley, Dvorak, and Aird [1]). Comparative evolution studies have revealed that endothelial cells (ECs) are absent in invertebrates, cephalochordates, and tunicates but are present in two main groups of extant vertebrates: jawed (gnathostomes) and jawless (agnathans). This latter group is further divided into myxinoids (hagfish) and petromyzonids (lamprey). Endothelial cells form the inner lining of blood and lymphatic vessels and are key sensors and effectors of the closed cardiovascular system. Several hypotheses have been proposed to explain the evolution of the endothelium in vertebrate species. One theory suggests that the endothelium arose as a way for organisms to exert vasomotor control to regulate blood pressure and other hemostatic processes [1]. A second explanation ties the emergence of an endothelial layer to the development of coagulation and immune systems, as well as the development of other organs and tissues, whereby the endothelium served as both the structural framework and the interface for these components [1].

The vascular system is regionally specialized to accommodate the individual needs of distinct tissues and organs throughout the body. ECs play a critical role in many physiological processes, such as regulating vasomotor tone, trafficking of white blood cells, control of coagulation, angiogenesis, and the immune response [2-4]. It is well known that endothelial cells have characteristic gene expression profiles and morphology depending on their location in the vasculature [5]. The phenotypes of endothelial cells can vary, even between neighboring cells of the same blood vessel exposed to the same extracellular environment. That environment includes microenvironmental factors such as local cytokine levels, and macroenvironmental factors such as blood pressure and cholesterol levels. Cells with the same genotype that are grown under the same laboratory conditions can also display variability in appearance and behavior [6]. Importantly, EC heterogeneity within the same blood vessel may contribute to differences in propensities for development of cardiovascular disease (CVD) [7]. A prevailing assumption is that these differences are largely caused by environmental factors extrinsic to the ECs (such as variations in the local concentrations of inflammatory molecules and hemodynamic differences in the vessels), which differentially modulate gene expression in an otherwise seemingly homogeneous cell population. Our studies and others have clearly shown that ECs from different blood vessels have distinct expression profiles, and these differences might account for the selective involvement in disease processes.

Heterogeneity in biological systems exists as differences between individuals (inter-individual variability) or differences within an individual (intra-individual variability, or cell-to-cell variability across single cells of a population). Cell-to-cell heterogeneity is an essential feature of cellular decision-making. Despite being genetically identical, cells frequently respond differently to the same external stimulus. This heterogeneity in cellular behavior can be beneficial at the population level; variable expression of certain genes can serve as a bet-hedging mechanism, allowing some cells to respond to and survive a sudden change in the environment [8]. Transcriptional heterogeneity arises from stochastic switching between transcriptionally active and inactive states, so-called transcriptional bursting [9, 10]. Individual genes can exhibit different burst kinetics, meaning that the duration and frequency of these bursts can differ, and messenger RNA (mRNA) production between single cells can be highly variable [10-12]. Currently there is no clear model to explain transcriptional bursting, but several aspects of transcription, such as chromatin conformation, nucleosome occupancy, histone modifications, transcription factor (TF) binding, and enhancer–promoter interactions, can affect how often and how long the promoter is active. Differences in phenotypic states, such as cell cycle stage, also influence global gene expression at both the transcriptional and translational levels. Together, these factors can result in heterogeneous gene expression between individual cells of a population, both in terms of which genes they express and how much. It is currently unclear whether these differences are driven by stochastic or deterministic mechanisms.
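The stochastic switching between active and inactive promoter states described above is often formalized as a two-state ("telegraph") model. The following is a minimal illustrative sketch of that model, simulated with the Gillespie algorithm; it is not an analysis performed in this thesis. The function names and all rate constants (k_on, k_off, k_syn, k_deg) are hypothetical values chosen only for illustration and are not measurements for SELP or any other gene. The Fano factor (variance divided by mean) is used as a simple summary of cell-to-cell variability; a value near 1 would be expected for constant, non-bursty synthesis.

```python
import random


def simulate_telegraph(k_on, k_off, k_syn, k_deg, t_end, seed=0):
    """Gillespie simulation of a two-state promoter in a single cell.

    The promoter switches OFF->ON at rate k_on and ON->OFF at rate k_off.
    mRNA is made at rate k_syn only while the promoter is ON, and each
    mRNA molecule is degraded at rate k_deg. Returns the mRNA count at t_end.
    All rates are hypothetical, in arbitrary inverse time units.
    """
    rng = random.Random(seed)
    t, promoter_on, mrna = 0.0, False, 0
    while True:
        # Propensities of the three possible events in the current state.
        rates = [
            k_off if promoter_on else k_on,  # promoter switching
            k_syn if promoter_on else 0.0,   # transcription (ON state only)
            k_deg * mrna,                    # mRNA degradation
        ]
        total = sum(rates)
        t += rng.expovariate(total)          # waiting time to the next event
        if t > t_end:
            return mrna
        r = rng.random() * total             # choose which event occurs
        if r < rates[0]:
            promoter_on = not promoter_on
        elif r < rates[0] + rates[1]:
            mrna += 1
        else:
            mrna -= 1


def fano_factor(counts):
    """Variance divided by mean of per-cell mRNA counts."""
    mean = sum(counts) / len(counts)
    var = sum((c - mean) ** 2 for c in counts) / len(counts)
    return var / mean


if __name__ == "__main__":
    # 200 simulated cells sharing identical (hypothetical) parameters.
    cells = [simulate_telegraph(0.05, 0.5, 20.0, 0.1, 100.0, seed=i)
             for i in range(200)]
    print("mean mRNA per cell:", sum(cells) / len(cells))
    print("Fano factor:", fano_factor(cells))
```

In this sketch every simulated cell has the same parameters, so any spread in mRNA counts comes only from the random timing of promoter switching, synthesis, and degradation.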


1.1.2 Endothelial Adhesion Proteins and Inflammation

1.1.2.1 von Willebrand Factor, P-selectin, and the Weibel-Palade Body

The endothelium is a highly dynamic and responsive tissue whose function can be altered by exposure to biochemical stimuli, such as cytokines, and biomechanical stimuli, such as blood flow, both of which can exist at graded levels throughout the vasculature [13]. Adhesion molecules play an essential role in the recruitment of immune cells to the site of inflammation and tissue injury (reviewed in [14]). Leukocyte interaction with venular endothelial cells is essential for the inflammatory response and is mediated by several families of adhesion molecules, including integrins, selectins, and members of the Ig superfamily (such as intercellular adhesion molecule 1 (ICAM1) and vascular cell adhesion molecule 1 (VCAM1)) [14, 15]. Each is involved in a different stage of leukocyte transmigration through the endothelium, and the coordination of their expression and function is crucial for the normal recruitment of leukocytes from the blood stream to the tissue. Human endothelial cells basally express P-selectin (PADGEM, GMP-140, CD62P, SELP), whereas expression of E-selectin, ICAM1, and VCAM1 requires induction by cytokines (such as tumor necrosis factor α (TNF-α), interleukin (IL) 1, IL-1β, and/or interferon γ). This occurs in a delayed manner due to the time required to synthesize nascent mRNA and protein. E-selectin protein expression reaches maximal levels 2-4 hrs post-induction and returns to baseline after 24 hrs [16, 17]. The induction of VCAM1 is somewhat slower and longer-lasting than that of E-selectin. VCAM1 reaches maximal expression ~4 hrs after TNF-α treatment and is sustained for at least 48 hrs [18].

Both P- and E-selectin are adhesion molecules that are involved in the first stages of leukocyte rolling, allowing their rapid attachment to the endothelium within minutes or hours of exposure to inflammatory stimuli, respectively [19]. Importantly, platelets also roll along the endothelium in vivo. This rolling is mediated both by platelet PSGL-1 binding to endothelial P-selectin and by platelet P-selectin binding to endothelial PSGL-1 [20]. In unstimulated human umbilical vein endothelial cells (HUVECs), P-selectin is constitutively synthesized as a preformed pool in the membrane of intracellular Weibel-Palade bodies (WPBs) with low levels present on the surface of endothelial cells [21] (Figure 1). In healthy individuals the abundance of P-selectin is greatest in small veins and venules, although it is occasionally observed in small arteries, arterioles, and capillaries [21, 22]. The levels of P-selectin are known to vary significantly within individual venules but have been found to be the same across venules with different shear rates or diameters [23]. In vitro studies from our lab (unpublished data) and others confirmed that P-selectin mRNA is not affected by different shear conditions or the presence or absence of flow [24]. Others have also demonstrated that laminar flow does not cause major changes in the number and distribution of WPBs in ECs [25]. We also have data showing that human P-selectin is not significantly different in samples +/- 10 ng/mL TNF-α treatment for 4 hrs, +/- hypoxia (for 4 or 24 hrs), +/- Dicer knockdown, and +/- 100 ng/mL human recombinant VEGF-A 165 for 24 hrs, either in 2D cultures or 3D matrices (unpublished data).

Figure 1. Morphology of Weibel-Palade bodies in non-stimulated cells. (A) Immunofluorescence image of a human umbilical vein endothelial cell (HUVEC) labeled with an antibody against VWF, revealing the elongated cigar-shaped WPBs throughout the cell. The reticular staining represents the ER (white arrow). Scale bar, 5μm. (B) Electron microscope image of a HUVEC showing several WPBs (arrows). N, nucleus. Scale bar, 1μm.

Endothelial cells release WPBs in response to a large number of secretagogues such as thrombin, histamine, epinephrine, and vasopressin [26, 27]. These agonists can be divided into two distinct groups: those that act by elevating intracellular Ca2+ levels and those that act by raising cyclic adenosine monophosphate (cAMP) levels in the cell. The redistribution of existing P-selectin from WPBs to the surface of ECs occurs through fusion of the WPB membranes with the plasma membrane [21]. Surface expression is maximal 5-10 min after stimulation and the protein is rapidly cleared from the cell surface within the next 30-60 min by endocytosis [28] or proteolytic cleavage [29]. In vitro, P-selectin is not proteolytically shed, but in vivo, it is cleaved from the surfaces of both activated platelets and ECs [30, 31]. Prolonged exposure to cytokines, such as IL-3 [32], IL-4 [33], IL-13 [34], or oncostatin M [33], results in chronic expression of P-selectin on EC surfaces that is both transcription- and translation-dependent. This is associated with the establishment and maintenance of certain chronic inflammatory diseases.

Stimulated exocytosis of WPBs, which are endothelial-specific storage compartments, provides the endothelium with the ability to quickly respond to environmental changes and induce leukocyte adhesion within minutes after exposure to inflammatory mediators. The main constituents of WPBs are von Willebrand factor (VWF) (a secreted protein) and P-selectin; however, other components, including the chemokines IL-8 and eotaxin-3, tissue-type plasminogen activator, osteoprotegerin, endothelin 1, and angiopoietin 2, have also been reported to be stored within these granules, suggesting a role for WPBs in inflammation, hemostasis, modulation of vascular tone, and angiogenesis [35]. Studies suggest that not all of these components are present in all Weibel–Palade bodies at all times. The regulation of WPBs involves controlling both the number of stored biomolecules as well as the fusion of integral membrane components (i.e., P-selectin) with the cytosolic cell surface and release of soluble mediators (e.g., VWF) into circulation.

It is well established that the contents of Weibel-Palade bodies can be modified in response to inflammatory mediators. For example, IL-8 and eotaxin-3 are found in the WPBs of cells that have been exposed to inflammatory cytokines such as IL-1β and IL-4 [36, 37]. Studies have shown that the contents of WPBs not only differ depending on the type of ECs in which they are found, but can also vary within the same EC. P-selectin and angiopoietin-2 have been reported to be stored in different populations of WPBs even when both proteins are expressed in the same cell [38]. The sizes and number of WPBs in individual cells can also vary between ECs in different regions of the vascular tree. Especially high numbers of WPBs are found in ECs of pulmonary arteries, arterioles, and veins [35, 39-41]. The patterns of expression of VWF (the largest constituent of WPBs) are known to be heterogeneous in mRNA and protein levels across multiple EC types (particularly in human coronary artery ECs, human aortic ECs, and human pulmonary artery ECs) [2, 3, 42-44]. In HUVEC, however, VWF expression is homogeneous [42]. Additionally, P-selectin and VWF have been reported to be differentially released from HUVECs based on their stimulation with different secretagogues [45]. HUVECs stimulated with cAMP or an agonist peptide that binds to protease-activated receptor (PAR) 2 display a delayed release of VWF and a reduced release of P-selectin compared with HUVECs stimulated with histamine (binding to H1-receptors), thrombin (binding to PARs), or other agonists that bind to PAR1 [45]. It is currently unknown whether, and to what extent, P-selectin heterogeneity can be attributed to heterogeneity of WPBs.

1.1.3 Regulation of P-selectin Expression

The selectins are a family of three calcium-dependent type I transmembrane glycoproteins (L-, E-, and P-selectin) found on leukocytes, endothelial cells, and platelets. The genes for the selectin family are closely linked on chromosome 1, reflecting their common evolutionary genomic origin. L-selectin is expressed on all granulocytes and monocytes and on most lymphocytes. P-selectin is constitutively synthesized in Weibel-Palade bodies of endothelial cells and in the membranes of the α-granules of megakaryocytes/platelets (Figure 2) [21, 46, 47]. E-selectin is expressed in ECs but requires induction by inflammatory cytokines. Basal expression of P-selectin is greater than that of E-selectin, which is not normally constitutively expressed (except in dermal microvascular ECs [48]). All three selectins recognize and bind to glycoproteins containing the tetrasaccharide sialyl-Lewis X (sLeX). P-selectin glycoprotein ligand-1 (PSGL-1) is the primary ligand for P-selectin and is expressed on monocytes, neutrophils, lymphocytes, and platelets [49]. E-selectin and L-selectin also bind to PSGL-1.


Figure 2. Constitutive expression of P-selectin in platelets and endothelial cells. (A) P-selectin is constitutively synthesized by megakaryocytes and endothelial cells, where it is stored in the membranes of α-granules in platelets and Weibel-Palade bodies in endothelial cells. Within minutes after cellular activation by mediators such as thrombin or histamine, P-selectin is redistributed to the plasma membrane. (B) Most endothelial cells do not constitutively synthesize E-selectin. Inflammatory mediators such as TNF-α, IL-1β, or lipopolysaccharide (LPS) induce the transient synthesis of E-selectin, which is then transported to the cell surface. Because of the time required for de novo synthesis of RNA and protein, E-selectin expression only reaches maximal levels 2-4 hours post-induction.

The transcriptional regulation of P-selectin differs among many vertebrate species. In mouse and bovine ECs, P-selectin mRNA levels can be augmented by treatment with lipopolysaccharide (LPS), TNF-α, or other inflammatory molecules [50, 51]. In cultured HUVECs, LPS and TNF-α do not increase P-selectin expression [52], suggesting that the transcriptional regulation of human P-selectin differs from that of E-selectin, VCAM1, and ICAM1 in human ECs. Pan and McEver have previously characterized the 5′-flanking region of the human P-selectin gene using primer extension, ribonuclease (RNase) protection, and anchored polymerase chain reaction (PCR) [53]. It was discovered that transcription of the gene is initiated at multiple sites, which is characteristic of promoters lacking a canonical TATA box. The human sequence also lacks a GC box for binding of the Sp1 transcription factor. At least three functional regulatory elements have been identified: a GATA element (TTATCA), which binds to GATA-2; a non-canonical site for NF-κB (nuclear factor kappa B)/Rel transcription factors; and two Stat6 binding sites (Figure 3) [53, 54]. Unlike the mouse NF-κB/Rel element, the human variant was found to bind only homodimers containing p50 or p52, not those containing p65 (Rel A) [54]. This is believed to be the reason why TNF-α induces P-selectin transcription in mouse but not in human ECs [55]. The Stat6 elements overlap three putative E26 transforming sequence (ETS) sites. It is not clear whether Stat6 binding results in competition with ETS factor binding since the ETS functionality has not been elucidated. The region from -500 to +98 relative to the transcriptional start site contains 13 additional potential ETS binding sites (GGAA, TTCC, CCTT). The functional importance of these sites in the P-selectin gene has not been determined. The effect of promoter methylation on P-selectin transcription is also currently unknown.


Figure 3. Location of cis-regulatory elements in the proximal promoter of the human P-selectin gene. The locations of cis-regulatory elements and their position relative to the transcriptional start site (arrow) are indicated in green text. Black boxes denote putative cis elements and blue boxes denote elements which have been shown to be functionally important in basal P-selectin expression.

Endothelial cells use multiple mechanisms to regulate the synthesis, sorting, degradation, and recycling of P-selectin. These mechanisms are important for controlling the amount of P-selectin on the activated endothelial surface and the length of time it remains there before internalization. Sorting signals in the cytoplasmic domain of the protein allow for tight regulation of the steady-state levels on the plasma membrane56-60. The short-lived presence of P-selectin at the cell surface reflects its rapid internalization and either delivery to lysosomes for degradation or recycling through the trans-Golgi network61. The net effect is to remove P-selectin from the cell surface to limit the duration of the inflammatory response. However, under prolonged inflammation, an increase in the rate of protein synthesis may saturate the sorting pathway from the trans-Golgi network to the WPBs and plasma membrane, resulting in increased steady-state levels of P-selectin on the surface50. A higher synthetic rate could also deliver more nascent P-selectin protein to WPBs, providing larger stores of P-selectin that could be mobilized in response to inflammatory stimuli. The half-life of P-selectin mRNA in cultured HUVEC is at least 12 hours62. Because P-selectin transcripts have a long half-life, even a brief increase in the transcription rate could result in the accumulation of P-selectin mRNA. However, a substantial increase in P-selectin mRNA might only produce a relatively limited increase in protein due to the efficiency of removal from the cell surface.
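The consequence of a long mRNA half-life can be made concrete with a simple first-order decay model. The sketch below is illustrative only: it assumes constant synthesis and single-exponential decay, and it uses 12 hours (the lower bound quoted above) as the default half-life. The function names are hypothetical.

```python
import math

def fraction_remaining(hours, half_life_hours=12.0):
    """Fraction of a transcript pool remaining after `hours` with no new synthesis."""
    return 0.5 ** (hours / half_life_hours)

def mrna_fold_over_baseline(hours, transcription_fold, half_life_hours=12.0):
    """Relative mRNA level `hours` after the transcription rate changes.

    Assumes dM/dt = k_s - k_d * M with k_d = ln(2) / half-life, a baseline
    steady state normalized to 1, and a synthesis rate that changes by
    `transcription_fold` at time 0 and then stays constant.
    """
    decay_rate = math.log(2) / half_life_hours
    return transcription_fold + (1.0 - transcription_fold) * math.exp(-decay_rate * hours)

# Under these assumptions, a quarter of the transcripts made during a brief
# burst would still be present 24 hours later.
remaining_after_24h = fraction_remaining(24)
# Level reached 12 hours after a sustained two-fold increase in transcription.
level_after_12h = mrna_fold_over_baseline(12, 2.0)
```

In this model the approach to a new steady state is governed by the decay rate alone, so a long-lived transcript both accumulates and clears slowly.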

1.1.4 P-selectin in Disease

The recruitment and transmigration of circulating leukocytes into the arterial endothelium is an important pathogenic event in atherogenesis, and the association between monocytes and atherosclerotic lesions has long been recognized63. Ultrasound contrast imaging of P-selectin targeted microbubbles demonstrated enhanced signal intensity in areas of atherosclerotic plaque development and has also been used to detect early stages of disease64. Evidence for the role of adhesion molecules in the initiation and progression of plaques has been obtained from studies using gene knockout mouse models. Double knockout mice lacking both ICAM1 and apolipoprotein E (ApoE) or P-selectin and ApoE exhibited marked delay in atherosclerotic lesion formation65. VCAM1-/- mice, on the other hand, are embryonic lethal66. Mice lacking E-selectin do not display a reduction in lesion formation, but knockout of both P- and E-selectin in ApoE-/- mice has been shown to decrease fatty streak and atherosclerotic lesion development67. It is important to note that although regulation of P-selectin differs between mice and humans, increased expression of P-selectin has also been observed in surgically-excised human atherosclerotic plaques and fatty streaks, implicating its involvement in human cardiovascular disease68. Recently, a study using a transgenic mouse model that expressed human P-selectin from the human promoter found that these ApoE–deficient mice had augmented atherogenesis and developed large, -rich atheromas in their aortas when fed a Western diet69. Importantly, the expression of P-selectin in this model recapitulated that of the native gene in humans, thereby providing strong evidence to support the involvement of P-selectin in human atherosclerosis69. Studies have also implicated platelet P-selectin in atherosclerotic lesion development70, 71. Huo et al. found that circulating activated platelets and platelet-leukocyte aggregates promoted formation of atherosclerotic lesions71. This was attributed to the platelet P-selectin–mediated delivery of platelet-derived proinflammatory factors to monocytes/leukocytes and the vessel wall. Others have also described the positive association of platelet P-selectin expression with arterial wall thickness and stiffness of carotid arteries in humans72.

It is important to note that a soluble form of P-selectin (sP-selectin) has been identified in plasma, the majority of which is thought to arise from the proteolytic cleavage of membrane P-selectin from degranulated platelets73-75. In humans, a minor portion of sP-selectin originates from an alternatively spliced form in ECs and platelets that removes the exon encoding the transmembrane domain76. Increased levels of sP-selectin have been implicated in a number of conditions, including diabetes77, ischemic heart disease78, and coronary artery disease79, with some association to patient prognosis. Although elevated levels of circulating sP-selectin are largely thought to contribute directly to cardiovascular disease, other evidence suggests that sP-selectin is not prothrombotic or proinflammatory80. Further studies are needed to determine whether sP-selectin is a cause or consequence of CVD.

The physiological importance of the selectins is also demonstrated in human leukocyte adhesion deficiency type II syndrome, which is caused by deficiency in sLeX-containing ligands81. These individuals have a mutation in the gene encoding a fucose transporter and cannot effectively incorporate fucose into selectin ligands. As a result, these patients display neutrophilia and recurrent infections due to the failure of their neutrophils to bind P- and E-selectin81.

Several reports have also shown that endothelial P-selectin is central to impaired microvascular blood flow in sickle cell disease and is a main driver of the vaso-occlusive processes that lead to sickle cell pain crisis (SCPC)82. Recently, a completed Phase II trial (SUSTAIN trial83) of crizanlizumab, a humanized monoclonal antibody against P-selectin, showed that treatment decreased the frequency of SCPCs and was associated with a lower incidence of adverse events. This study demonstrated the effectiveness of selectin inhibitors as drugs and the importance of endothelial adhesion molecules in disease.

As briefly mentioned earlier, P-selectin also plays a prominent role in hemostasis and thrombosis84-86. Activated platelets and ECs both express P-selectin at their surface. Binding of P-selectin to PSGL-1 (expressed on platelets and endothelial cells) mediates platelet rolling and accumulation at the site of injury. P-selectin, either as sP-selectin or in the membranes of microparticles shed by activated platelets or ECs, binds to PSGL-1 on leukocytes and induces expression of tissue factor, the main initiator of coagulation. Microparticles containing PSGL-1 deliver tissue factor to the developing platelet thrombus via a mechanism dependent on P-selectin and PSGL-1. Overall, this promotes fibrin deposition at the site of injury.

Previous studies have shown that mice deficient in P-selectin exhibit a 40% increase in bleeding time after amputation of the tip of the tail87. Others have reported that P-selectin null mice have compromised leukocyte rolling and extravasation, and experience a 1-2 hour lag in the recruitment of neutrophils to the site of injury88. Despite this, these mice remained healthy under standard laboratory conditions up to at least 1 year. Only mice lacking both E- and P-selectin showed increased susceptibility to spontaneous disease or infections over time89, 90. This observation is likely due to the functional redundancy of these two selectins in mice. In humans, it is reasonable to speculate that deficiency in P-selectin will have more pronounced effects due to the selective recruitment of different leukocyte populations by endothelial adhesion molecules. P-selectin has been shown to have a higher affinity for eosinophils than E-selectin91, 92, and a lack of P-selectin may affect the immune regulatory capabilities of eosinophils93 in both health and disease.

1.2 Transcriptional Regulation of Endothelial Gene Expression

1.2.1 Epigenetics

There are a variety of genetic and epigenetic factors that determine gene expression (Figure 4). In contrast to the static DNA code, epigenetic marks may be transient and reversible and can vary between cell types94. Importantly, emerging evidence suggests that epigenetic mechanisms are responsive to changes in environmental stimuli95. Epigenetics can be broadly defined as chromatin-based mechanisms that regulate gene expression, which do not involve changes to the DNA sequence per se96. In addition to cis/trans-regulation, epigenetic mechanisms, such as DNA methylation, and histone density, variants, and post-translational modifications, can dynamically change the chromatin structure, leading to transcriptionally permissive (euchromatin) or repressive (heterochromatin) regions.

Figure 4. Mechanisms of epigenetic gene regulation. Epigenetic mechanisms can be divided into 3 distinct yet highly interrelated processes: 1) DNA methylation and hydroxymethylation, which almost exclusively occur in the context of CpG dinucleotides; 2) RNA-based mechanisms that involve regulatory non-coding RNAs, including lncRNAs; and 3) histone code and density. This figure is adapted from Webster et al.94

Non-protein-coding RNAs, particularly long non-coding RNAs (lncRNAs), also play a role in regulating gene expression. Together, these epigenetic mechanisms at the promoter and enhancer regions of a gene can contribute to the recruitment of transcription factors, coactivators, and RNA polymerase II (pol II) to regulate transcription initiation. This concept of an epigenome helps to explain how cells with identical DNA sequences can demonstrate variable gene expression, as in the case of neighboring cells of the same blood vessel exposed to the same extracellular environment. Epigenetics can also provide us with new perspectives for understanding how gene expression in the human vascular system may be perturbed by disease.

1.2.1.1 RNA Based Mechanisms

Since 2004, the Encyclopedia of DNA Elements (ENCODE) project has aimed to catalogue all functional genomic elements that contribute to human development and function, including data about DNA methylation, histone modifications, transcription factor binding, chromatin structure, and transcription from both non-coding and protein-coding regions97. According to ENCODE’s analysis, 80% of the genome contains elements that have biochemical functionality. Although this number has been heavily criticized, it remains true that the human genome contains only ~21,000 protein-coding genes (transcribed from ~1.5% of the genome) and that the majority is non-protein-coding DNA98. Most of this non-coding DNA consists of repetitive elements, which comprise over two-thirds of the human genome99. These include long terminal repeats, long interspersed nuclear elements, short interspersed nuclear elements, satellites, minisatellites, and microsatellites. Transposable elements provide a source of genetic variation and, in rare cases, may cause mutations that lead to disease100. Long non-coding RNAs are a more recently described class of functionally important non-coding RNA molecules101. They are classified as transcripts >200 nucleotides in length and appear to be functionally distinct from shorter RNA species, such as microRNAs, short interfering RNAs, piwi-interacting RNAs, and small nucleolar RNAs. Although initially viewed as being mainly epigenetic regulators (by interacting with TFs, altering splicing, or recruiting chromatin modifying enzymes), lncRNAs have also been shown to function through transcriptional and post-transcriptional mechanisms. These include lncRNAs that function as microRNA sponges, those that act as natural antisense transcripts, and those that correspond with transcribed-ultraconserved regions102.

In terms of endothelial biology, several recent studies have identified lncRNAs that regulate EC functions. For instance, our laboratory recently reported a number of endothelial-enriched lncRNAs103. Biomechanical shear stress regulates the lncRNA STEEL and overexpression of STEEL in vivo led to increased and network formation103. Endothelial-specific functions were also described for the non-coding RNA SENCR, which was found to control proliferation, migration, and angiogenic potential of HUVECs in vitro. Additionally, many other lncRNAs have been identified as important regulators of endothelial identity and function104, 105.

1.2.1.2 DNA Methylation

DNA methylation is an epigenetic modification in which a methyl group (-CH3) is covalently added to the 5-carbon position of the pyrimidine ring of cytosine (5mC), and in mammals, occurs predominantly in the context of CpG dinucleotides on the same strand of DNA106. Methylation of CpG sites is usually symmetrical, meaning that complementary CGs are also methylated on the opposite DNA strand. 5mC is known to be the most mutagenic base in the mammalian genome; spontaneous deamination of 5mC converts the residue directly into thymine, which results in a C-to-T substitution after DNA replication106. The human genome contains ~28 million CpG sites (representing approximately 1% of the genome). The human genome is largely CpG-depleted, which is mainly attributed to the high mutation rate of methylated cytosines into thymine107. It only contains about 20% of the CpGs expected based on the total G + C content108. Yet, dense regions of CpG sites do exist (termed CpG islands) where the occurrence of C followed by G is significantly higher and closer to the expected frequency. CpG islands (CGIs) are typically unmethylated and are associated with the 5ʹ-regulatory regions or promoters of most genes (about 70% of gene promoters are associated with CGIs)109. Exceptions to this are CGIs found on the inactivated X-chromosome110, those found in the context of parent-of-origin imprinting (discussed later on)111, and those in cancer cells112-115. In these cases, paradoxical hypermethylation in CGIs is noted. CpG sites in gene bodies, intergenic regions, and repetitive elements are also heavily methylated and transcriptionally silent116, 117. Although CGI promoters were originally considered a feature of housekeeping genes, it is now accepted that many tissue-specific genes also have CGI promoters118. 
CGI promoters have distinct patterns of transcription initiation and chromatin structure119 and recent studies have shown that the length of CGIs also coincides with different patterns of gene expression and promoter characteristics120. While the role of DNA methylation in controlling the expression of CGI-containing gene promoters is well documented, methylation of non-CpG-island promoters is also linked to gene expression121.
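The notion of CpG depletion relative to "expected" frequency can be illustrated with the observed/expected CpG ratio. The sketch below uses the common definition in which the expected CpG count is (number of C × number of G) / sequence length; this formula and the function names are assumptions introduced for illustration and are not taken from the analyses in this thesis.

```python
def gc_fraction(sequence):
    """Fraction of bases in `sequence` that are G or C."""
    sequence = sequence.upper()
    if not sequence:
        return 0.0
    return (sequence.count("G") + sequence.count("C")) / len(sequence)

def cpg_observed_expected(sequence):
    """Observed/expected CpG ratio for one strand of `sequence`.

    Expected CpG count is taken as (count of C * count of G) / length,
    i.e. the count expected if C and G were placed independently.
    """
    sequence = sequence.upper()
    c_count = sequence.count("C")
    g_count = sequence.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    observed = sequence.count("CG")
    expected = c_count * g_count / len(sequence)
    return observed / expected

# Toy sequences, for illustration only.
ratio_dense = cpg_observed_expected("CGCGCG")   # CpG-dense toy example
ratio_neutral = cpg_observed_expected("CCGG")   # observed equals expected
```

A ratio near 0.2 would correspond to the genome-wide depletion described above, whereas CpG islands show ratios much closer to 1.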

Non-CpG methylation, either in the form of CpH (H = A/C/T) or hydroxymethylation, also exists and has been recognized to have a functional role in human pluripotent cells and brain cells122, 123.

DNA methylation is carried out by three DNA methyltransferase enzymes (DNMT1, DNMT3a, and DNMT3b), which catalyze the transfer of a methyl group from S-adenosylmethionine to the fifth carbon on cytosine (Figure 5). DNMT1 is responsible for maintenance methylation during mitotic cell division (it recognizes and methylates hemimethylated DNA during replication), whereas DNMT3a and 3b are mainly responsible for de novo DNA methylation during development. DNA demethylation can occur passively through the loss of DNA methylation during DNA replication, or actively through enzymatic removal of methylated cytosines124. Active demethylation is mediated by the ten-eleven translocation methylcytosine dioxygenases (TET1, TET2, and TET3), which iteratively catalyze the conversion of 5mC to 5-hydroxymethylcytosine (5hmC) and subsequent oxidative products. These oxidative products act as substrates for thymine DNA glycosylase-dependent base excision repair, which ultimately restores an unmodified cytosine125, 126 (Figure 5).
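The difference between maintenance methylation and passive demethylation can be sketched with a simple strand-counting model. This is an illustrative assumption rather than a model used in this thesis: each replication is assumed to pair every methylated parental strand with a new strand that is methylated with probability `maintenance_efficiency`, and de novo methylation and active demethylation are ignored.

```python
def methylated_strand_fraction(initial_fraction, maintenance_efficiency, divisions):
    """Expected fraction of DNA strands methylated at a CpG after cell divisions.

    At each division the number of strands doubles. Parental methylated
    strands are retained, and each newly synthesized partner strand is
    methylated with probability `maintenance_efficiency` (hypothetical
    parameter standing in for DNMT1 activity).
    """
    fraction = initial_fraction
    for _ in range(divisions):
        fraction *= (1.0 + maintenance_efficiency) / 2.0
    return fraction

# Perfect maintenance keeps the mark through five divisions.
stable = methylated_strand_fraction(1.0, 1.0, 5)
# With no maintenance activity the mark is diluted by half per division.
diluted = methylated_strand_fraction(1.0, 0.0, 3)
```

Under these assumptions, even a modest drop in maintenance efficiency compounds over successive divisions, which is the basis of passive demethylation.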

Figure 5. DNA methylation, hydroxymethylation, and demethylation. The 5-carbon position on cytosine is a potential site for DNMT1/3a/3b activity when it is followed by guanine on the same DNA strand. The methylcytosine can be sequentially oxidized by the TET1/2/3 proteins into a hydroxyl, formyl, or carboxyl state. Hydroxymethylcytosine is not well recognized by the maintenance DNMT1 protein and thus can lead to passive DNA demethylation via cell division. The formyl and carboxylcytosine can be recognized by TDG, which excises the modified cytosine base, leaving an abasic site where the BER complex can add a new, non-modified cytosine base.

1.2.1.3 Histone Density, Post-translational Modifications and Variants

The genetic information of eukaryotes is packed into chromosomes, which are densely packed and highly organized structures consisting of chromatin, histones, and other proteins. Histones are highly conserved proteins that assemble into an octamer containing 2 copies each of the 4 core histone proteins: H2A, H2B, H3, and H4. Nucleosomes consist of 147 base pairs of DNA wrapped (1.75 times) around a histone octamer127. Nucleosomes are linked into a primary chromatin structure by linker histones, especially H1 variants, and non-nucleosomal linker DNA128. The length of linker DNA ranges from ~20–90 bp, varies among different species and cell types, and may even change over time128. The length of linker DNA may also be important for gene regulation129. The structure of nucleosomes can be altered post-translationally by biochemical modifications of the N-terminal histone tail. The most common are methylation, acetylation, ubiquitination, or SUMOylation of lysine residues, methylation of arginines, and phosphorylation of serines. At any given residue, these post-translational modifications are mutually exclusive. Histone modifications have a major influence over chromatin structure, as they dictate the accessibility or recruitment of transcriptional regulators to cis-DNA binding elements130. The modifications of histones are dynamic and require the action of specific enzymes that add and remove these modifications131. This occurs in the cytoplasm shortly after histones are synthesized, after which they are transported to the nucleus and integrated in newly synthesized DNA. For acetylation, lysine acetyltransferases (KATs) transfer the acetyl group from acetyl-Coenzyme A to the tail of histones and histone deacetylases (HDACs) remove these groups. Many transcriptional coactivators, such as p300/CBP, contain KAT activity132. Histone methyltransferases add methyl groups to histones and their corresponding demethylases remove these groups. Polycomb repressive complex 2 (PRC2), which includes the SET-domain-containing methyltransferase enhancer of zeste homolog 2, is important in the methylation of lysine 27 on histone H3, a modification associated with epigenetic gene silencing133. Another noteworthy methyltransferase is SETD2, which mediates the methylation of H3 lysine 36 co-transcriptionally, meaning that its activity is linked to transcriptional elongation by RNA polymerase II134, 135. 
Although this histone marker is found in gene bodies of actively transcribed genes, SETD2-dependent methylation of H3K36 aids in the recruitment of Eaf3 (the human homolog is MRG15), a component of the Rpd3C(S) HDAC complex, which inhibits aberrant transcription initiation from cryptic, intragenic promoters136, 137. In general, histone acetylation is associated with transcriptional activation, whereas histone methylation may specify active or repressed chromatin states depending on the lysine residue and methylation status (mono-, di-, or trimethylation)138-140. Generally, H3K4, H3K36 and H3K79 methylations are considered to mark active transcription, whereas H3K9, H3K27 and H4K20 methylations are thought to be associated with silenced chromatin states. For a comprehensive list of known histone modifications and their functions, please see the review from Zhao et al141.

Although not discussed in this work, non-canonical histone variants have also been found to impact nucleosome dynamics, and thus have important functions in chromatin and gene regulation142.

1.2.2 Maintenance of Epigenetic Marks

Epigenetic inheritance occurs transgenerationally through the germ line from parent to offspring. Epigenetic information is also maintained across cell divisions and epigenetic marks from the parental cell are largely re-established on newly synthesized DNA during mitosis. For the scope of this thesis, I will only be focusing on the maintenance of the epigenome during mitotic cell division. These mechanisms are well understood for DNA methylation and non-coding RNAs, but limited knowledge exists for the maintenance of histone modifications and chromatin structure. During DNA replication in S phase, DNA methylation marks are identified on the parental strand by DNMT1, which then methylates the newly synthesized strand of DNA. This model provides a simple explanation for the stability of methylation marks across cell divisions. However, a growing body of work has indicated that the maintenance of DNA methylation may be more complex than previously thought. For example, it has been demonstrated that the specificity of DNMT1 to hemimethylated DNA is dependent on its associations with proliferating cell nuclear antigen (PCNA) and Uhrf1 (also known as Nuclear Protein 95 or NP95)143. Uhrf1 localizes to replication forks through interactions with PCNA and H3K9me3/me2 and facilitates the recruitment of DNMT1. DNMT1 then coordinates the recruitment of G9a histone methyltransferase to complete the H3K9 methylation on the nascent chromatin nucleosomes144. This demonstrates that the maintenance of DNA and histone methylation may be interdependent and can be coordinated. However, our understanding of how histone post-translational modifications are maintained through cell division is still limited. Currently, we know that histone modifying enzymes can be recruited to nascent DNA by the DNA replication machinery (as discussed above) but can also be recruited through DNA methylation. 
Proteins containing a methyl-CpG binding domain (MBD) can recognize and bind to methylated DNA, thereby acting as a link between methylated DNA and histone modifiers. For example, MBD1 interacts with the Suv39h1-HP1 heterochromatic complex to direct the methylation of H3K9145. Importantly, the re-establishment of histone post-translational modifications is not necessarily limited to S-phase and may be independent of DNA replication.

1.3 Methods and Mechanisms that Determine Endothelial Heterogeneity

Polymorphisms in gene sequences, meaning differences in the DNA code (single nucleotide variants [SNVs; see footnote a], mutations, copy number variations), as well as variation in gene expression, form much of the basis of what makes individuals different within a population. During development, gene expression is tightly regulated, forming the basis of lineage specification and cellular identity. However, gene expression can also be stochastic and dynamic, contributing to cellular heterogeneity within an individual without changes in the DNA sequence. Each gene in a diploid cell exists in two copies: a paternal allele and a maternal allele. The copies may be identical or different, but in most cases both alleles are expressed, which is termed biallelic expression (BAE). When a gene is exclusively transcribed from one of the alleles, it is said to be monoallelically expressed (MAE). Recent transcriptome-wide studies have reported that monoallelic expression is much more widespread than previously thought146-150. The extent of MAE is still highly debated, as the experimental design and operational definition of MAE vary across different studies. In most of these studies, it can be argued that the investigators studied allele-biased expression rather than strict MAE. Herein, monoallelic expression and allele-biased expression will be used interchangeably to refer to genes that are expressed at least 2:1 in favor of either allele. This was the threshold cutoff used to define monoallelic genes in the database of autosomal monoallelic expression (dbMAE)151 as well as two other studies152, 153. Others have also used more conservative cutoffs154, 155. In a 2015 study by Perez et al. looking at parent-of-origin allelic expression biases in the murine brain, several parental allelic biases were tested (from weak biases slightly above 50:50-60:40, up to strong biases 90:10-100:0)156. 
The authors found that genes with weak allelic bias were enriched with newly identified imprinted genes (yet also included known imprinted genes) whereas stronger biases were enriched with known imprinted genes. Overall, the stringency in calling genes as MAE in these studies will affect the identification of novel monoallelic genes.
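To make the operational definition concrete, the sketch below classifies a gene from allele-specific read counts at a heterozygous SNP using the 2:1 cutoff adopted in this thesis. The function name, the minimum read depth, and the read counts are hypothetical and serve only as an illustration; the sketch performs no statistical testing and does not represent the analysis pipeline of any cited study.

```python
def classify_allelic_expression(allele_a_reads, allele_b_reads,
                                ratio_cutoff=2.0, min_total_reads=10):
    """Classify expression as 'allele-biased' or 'biallelic' from read counts.

    A gene is called allele-biased when the more highly expressed allele has
    at least `ratio_cutoff` times the reads of the other allele. Sites with
    fewer than `min_total_reads` reads (an assumed depth filter) are reported
    as 'insufficient coverage'.
    """
    total = allele_a_reads + allele_b_reads
    if total < min_total_reads:
        return "insufficient coverage"
    major = max(allele_a_reads, allele_b_reads)
    minor = min(allele_a_reads, allele_b_reads)
    # Comparing major >= cutoff * minor avoids division by zero when minor == 0.
    if major >= ratio_cutoff * minor:
        return "allele-biased"
    return "biallelic"

# Hypothetical read counts, for illustration only.
strong_bias = classify_allelic_expression(90, 10)
balanced = classify_allelic_expression(55, 45)
at_cutoff = classify_allelic_expression(20, 10)
```

Raising `ratio_cutoff` reproduces the more conservative definitions used in some studies and, as noted above, changes which genes are called monoallelic.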

a SNVs are single nucleotide positions in genomic DNA at which different sequence alternatives exist. SNPs are DNA variants detectable in >1% of the population. Here, we will use SNV and SNP interchangeably.

1.3.1 Monoallelic and Allele-biased Gene Expression

Examples of monoallelically regulated genes have been well-characterized and studied for several decades. In mammals, there are 3 main types of MAE or allele-biased expression: genomic imprinting, X-chromosome inactivation, and autosomal monoallelic expression that is independent of the parent-of-origin157.

1.3.1.1 X-chromosome Inactivation

X-chromosome inactivation (XCI) is the mechanism by which humans and other eutherian mammals equalize the expression of X-linked genes between XY males and XX females to achieve X-chromosome dosage compensation. It is commonly argued that XCI occurs early during female embryonic development, when cells randomly inactivate either the maternal or paternal X chromosome. The timing and permanence of XCI are a newer area of study. When XCI does occur, it results in monoallelic expression of the genes on the remaining active X158, 159. Once established, the choice of inactive X chromosome is stably maintained across mitotic cell division, generating mosaic expression in which some populations of cells express genes from the maternal X and others express from the paternal X.

In other species, sex chromosome dosage compensation is achieved using alternate mechanisms. For example, in Drosophila, the expression level from the two X chromosomes in females is normal and expression from the single X chromosome in males is increased two-fold160. X-inactivation in marsupial mammals is imprinted and the paternally derived X chromosome is always inactivated. This also occurs in the placental tissues of mice161 and cows162, where the paternal X is always inactivated. Historically, there has been a lot of controversy surrounding the XCI patterns in humans. In 2003, a study of 55 human cytotrophoblast samples (24 of which had matched fetal tissues) revealed random XCI in both. These results suggested that imprinted XCI is not required for normal placental or embryonic development in humans163.

Molecularly, the process of XCI involves several steps: counting the number of X chromosomes in a cell, choosing which X chromosome to inactivate, initiating and spreading chromosome-wide silencing, and maintenance of stable X chromosome repression through mitosis164. In mice, this is regulated in cis by the X-inactivation centre (Xic), which encodes two lncRNAs, X-inactivation specific transcript (Xist)165 and Tsix (the antisense transcript of Xist)166, 167. Xist is expressed exclusively from the Xic on the inactive X, whereas Tsix is transcribed from the Xic on the active X. On the active X, Tsix downregulates Xist expression by modulating the chromatin structure and DNA methylation status of the Xist promoter. On the inactive X, Tsix expression is reduced, which allows Xist to accumulate and coat the inactive X. This results in the loss of histone modifications associated with active chromatin and the recruitment of modifying complexes, such as PRC2, to induce gene silencing and chromatin compaction168. In humans, the function of TSIX is not conserved169. Recent work has pointed to another lncRNA, XACT, which coats the active X in human pre-implantation embryos and antagonizes the silencing ability of XIST170. Interestingly, not all genes on the inactive X chromosome are silenced. Approximately 3% of X-linked genes in mice171 and 15% of X-linked genes in humans escape X inactivation172. The molecular mechanisms that allow for escape from XCI remain largely unknown.

1.3.1.2 Imprinting

Imprinting occurs when a gene is preferentially expressed from only one of the parental alleles173, 174. The term “imprinted” describes the non-expressed allele; maternally imprinted refers to expression from the paternal allele and paternally imprinted refers to expression from the maternal allele. Specific parent-of-origin imprints are mediated by imprinting control regions (ICRs) and are fixed in the germline (i.e., oocytes and sperm cells)174. These ICRs may function as differentially methylated regions but can also include repetitive regions, non-coding RNAs (ncRNA), and histone modifications to establish and maintain imprinting. After fertilization, imprinted genes are maintained in the developing embryo and heritably transmitted in somatic tissues such that the parental marks are faithfully propagated to daughter cells during cell division175. Imprinted genes are always expressed from the same allele either in a tissue-specific manner or throughout the entire organism. Monoallelic expression due to imprinting often appears in genomic clusters and can be either maternally or paternally expressed (i.e., genes in a cluster may not be imprinted on the same chromosome)176.

There are two models of imprinting regulation in mammals: the insulator model and the long non-coding RNA model. The insulator model is demonstrated by the classic example of insulin-like growth factor 2 (IGF2), transcribed from the paternal allele (maternally imprinted) and the ncRNA H19, transcribed from the maternal allele (paternally imprinted)177. In this example, parental-allele specific MAE of both the H19 and IGF2 genes requires differential DNA methylation in the ICR that is inherited from the gametes178. The ICR is located between IGF2 and H19 and regulates the interactions of both gene promoters with their shared downstream enhancer elements179. On the maternal allele, the ICR is unmethylated, allowing the binding of CCCTC-binding factor (CTCF), a transcriptional repressor that prevents enhancers from interacting with the IGF2 gene. This results in the expression of H19 from the maternal allele. On the paternal allele, the ICR is methylated, preventing the binding of CTCF and allowing enhancer elements to activate the transcription of IGF2. Chromatin composition was also shown to differentially mark the maternal and paternal alleles of the H19/IGF2 locus178.

In contrast to the insulator model, the majority of imprinted genes use the ncRNA mechanism. The best example of this is the Igf2r cluster in mice. This cluster includes the protein coding genes Igf2r, Slc22a2, and Slc22a3, which are expressed from the maternal allele, and the lncRNA Air (also known as Airn), which is expressed from the paternal allele180. Igf2r and Air are transcribed on opposite strands of DNA and in opposite directions (i.e., antisense divergent transcription). The promoter of Air lies within intron 2 of Igf2r and the transcript overlaps exons 1 and 2 of Igf2r. Maternal expression of Igf2r, Slc22a2, and Slc22a3 requires methylation of the Air promoter in cis181. On the paternal allele, the Air promoter is unmethylated and Air RNA is expressed. Silencing of Igf2r, Slc22a2, and Slc22a3 on the paternal allele requires expression of Air in cis. Transcriptional overlap between Air RNA and the 5ʹ-flanking region of Igf2r may be involved in the silencing of paternal Igf2r, but studies have shown that the transcription of Air, rather than the transcript itself, can repress Igf2r by preventing RNA polymerase II from assembling on the promoter182. Because Air expression is also required to silence Slc22a2 and Slc22a3 without transcriptional overlap or prior silencing of Igf2r, it has been suggested that Air RNA may function in a manner analogous to Xist, whereby Air RNA coats Slc22a2 and Slc22a3 and recruits histone methyltransferases that methylate H3K9183.

1.3.1.3 Autosomal Allele-biased Expression

Monoallelic expression can occur on autosomes independently of the parent-of-origin and other chromosomal genes (i.e., not occurring in a chromosome-wide manner). In terms of this type of MAE, there are 2 fundamentally different models. In one model, monoallelic expression is conserved in daughter cells after mitotic division. In the second model, MAE occurs stochastically and is a dynamic process that is not mitotically transmitted. Importantly, some genes may be monoallelically expressed in one cell type but biallelic in another (known as cell type-dependent gene expression).

The best known examples of autosomal MAE are found in the immune system, which utilizes MAE to generate receptor diversity in T- and B-cells184, and in the nervous system where olfactory receptors in neurons use MAE to provide cell-identity185.

The immune system response to pathogens is mediated by lymphocytes, and each T and B cell recognizes only one antigen. T cell receptors (on T cells) and immunoglobulins (on B cells) both undergo DNA recombination and are monoallelically expressed. DNA rearrangement allows antigen receptor diversity without the need for individual genes to encode all the different receptors. Monoallelic expression (or allelic exclusion) ensures that each mature cell expresses a single antigen receptor, ensuring the specificity of the immune system. The mechanism that generates variation in the antigen-binding domains of these receptors involves recombining variable (V), diversity (D), and joining (J) gene segments in a process called V(D)J recombination, which requires the DNA recombinase enzymes RAG1 and RAG2186, 187. During B and T cell development, each antigen receptor locus becomes accessible to the rearrangement machinery. Only one of the two alleles undergoes recombination and the other is prevented from rearranging. Epigenetic processes, including DNA methylation, histone modifications, and chromatin conformation, as well as nuclear localization and asynchronous DNA replication, all appear to have a role in determining which allele will be rearranged and expressed188, 189. The choice of allele for recombination is independent of the parent-of-origin and is thought to be established early during development, following implantation. The allele that will be rearranged and expressed is replicated early during S phase, in contrast to the other allele, which is replicated later in S phase. This asynchronous locus replication is maintained throughout development189.

Odorant receptors (OR) are another family of well-studied monoallelic genes. There are ~350 OR genes in the human genome spread across nearly all chromosomes190. Each olfactory neuron expresses a single OR gene that can recognize multiple odorants185, 190, 191. Many variants exist for each OR gene; therefore, the expression of ORs from a single allele allows each neuron to specify one OR-odorant interaction192. The detection of odors is combinatorial; although each OR can recognize multiple odorant molecules, the odor itself is encoded by the combined activation of multiple ORs191. The monoallelic expression of ORs is driven by differential epigenetic marks on the two alleles, which determine the asynchronous replication of OR genes. During neuronal differentiation, immature olfactory neurons make an allelic choice of OR and this receptor is stably expressed for the life of the mature cell193. As with antigen receptors, feedback mechanisms act to maintain the initial OR choice and prevent activation from the other allele194. Currently, the mechanism by which the initial allelic choice is made is not known.

Asynchronous DNA replication (where one allele replicates earlier during S phase than the other allele) is a hallmark of all monoallelically expressed genes195, including X chromosome inactivation196, genomic imprinting197, and monoallelic expression of autosomal genes, such as antigen189 and olfactory receptors185. Recent studies have found that up to 20% of human genes and 10% of mouse genes may replicate asynchronously198, 199. It is currently unclear whether all genes that are replicated asynchronously exhibit monoallelic expression.

In these examples of MAE, allele-specific expression is thought to function in parental resource allocation (imprinting), dosage compensation between sexes (X-chromosome inactivation), or as a mechanism for a cell to achieve heterogeneity (e.g., olfactory receptor genes, antigen receptor genes). Monoallelic expression of autosomal genes, which occurs independently of parental allele, is thought to occur at a significant portion of mammalian genes based on recent genome-wide studies of monoallelic expression147, 200, 201. Despite growing interest in MAE over the last decade, there are still many unanswered questions regarding how the allelic choice is made, how it is maintained across cell division, and the importance of this type of gene regulation. Whether MAE of autosomal genes occurs deterministically or stochastically is still highly debated.

1.3.2 Consequences of Monoallelic Gene Expression

One potential function of monoallelic gene expression may be to regulate the total amount of mRNA that is transcribed for a particular gene: with only one active allele, total transcript levels would be expected to be halved. Alternatively, a cell may augment transcription from the single allele such that the total mRNA levels approach those that would be observed if the gene were biallelically regulated. Previous genome-wide studies of MAE have reported an overall decrease in transcript levels in monoallelically expressing cells versus biallelically expressing cells; however, rather than the expected 50% reduction, this difference was only 30-35%147, 201, 202. Examples have been discovered where expression from 0, 1, or 2 alleles functions to control the proportion of cells that express the gene, rather than the amount produced within an individual cell203. Therefore, monoallelic expression in some somatic cells may represent an additional layer of control for gene expression, and in other contexts may function to control the total amount of gene expression in a population of cells.
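The arithmetic behind this comparison can be made explicit. The sketch below is illustrative only: it assumes that a biallelic cell produces one unit of transcript split equally between its two alleles, and the function name is hypothetical. Under that assumption, a 50% reduction corresponds to no compensation, whereas the reported 30-35% reduction would imply that the single active allele produces roughly 1.3 to 1.4 times the output of one allele in a biallelic cell.

```python
def per_allele_output(total_reduction: float) -> float:
    """Output of the single active allele relative to one allele of a biallelic cell.

    Assumption (for illustration): a biallelic cell makes 1.0 unit of transcript
    in total, i.e. 0.5 units per allele.
    """
    monoallelic_total = 1.0 - total_reduction  # transcript left in the monoallelic cell
    return monoallelic_total / 0.5             # compare with a single biallelic allele


# 0.50 is the naive expectation; 0.35 and 0.30 bracket the reported range.
for reduction in (0.50, 0.35, 0.30):
    ratio = per_allele_output(reduction)
    print(f"{reduction:.0%} reduction -> active allele at {ratio:.2f}x")
```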

1.3.3 Generating Diversity Through Allele-biased Expression

Monoallelic expression may operate in a selective manner, as in the case of antigen receptor genes in B- and T-cells184 and olfactory receptor genes in neurons185, where MAE functions to generate receptor diversity. While these classic examples of monoallelic expression have been well-studied, the questions of why it occurs in specific genes and how it contributes to cellular heterogeneity within an individual are only beginning to be explored. A 2016 paper by Savova et al. reported that autosomal MAE genes, identified on the basis of a distinctive gene-body chromatin signature, were more genetically diverse in terms of DNA sequence variability than biallelic genes148. This was attributed, in part, to increased mutation and recombination rates in the MAE genes. The MAE signature was also enriched among those genes identified by Andrés et al., as undergoing balancing selection204. Functional analysis of these MAE genes revealed an enrichment of genes involved in cell adhesion, cell-environment interaction, and cell-cell signaling. MAE has the potential to generate different transcriptional signatures, resulting in elevated cellular diversity. This heterogeneity should, for example, reduce susceptibility of a cell population or tissue to a sudden change in the environment.

In recent work by Gimelbrant et al., a study of cloned cell populations of human B lymphocytes revealed that up to 10% of the genes assessed displayed monoallelic expression147. More importantly, the epigenetic patterns were found to be heritable within a population of cells derived from a single B cell. Interestingly, many of these genes were found to be biallelically expressed in other clonal B cell populations, but up to 20% of these originally identified genes were found to be consistently monoallelically regulated between clonal populations. Based on these findings, Gimelbrant et al. asserted that in a given sample consisting of a mixed population of cells (what we observe in vivo), extensive heterogeneity exists in terms of mono- and biallelically regulated genes. When combined with static DNA sequence variation on the alleles, different combinations of expressed alleles (0, 1 or 2 active alleles) can generate a broad repertoire of cellular functions that may be beneficial or may increase disease susceptibility.

1.3.4 Single-Cell RNA Sequencing

Single-cell RNA sequencing is becoming a widely used method to study the genome-wide expression profile of individual cells205. These studies are expensive, large-scale sequencing experiments that generate high-dimensional datasets. They can reveal valuable insights into cell-to-cell heterogeneity compared to bulk RNA-seq, since the latter method generates a sequencing library that represents a population of cells.

Atherosclerosis is a chronic inflammatory disease that is in part driven by the dysregulation of endothelial cells in the atherosclerotic vessel and the plaque. Although it is known that ECs influence disease progression, the complex phenotypic and transcriptional diversity of endothelial cells is poorly understood. Studies that measure population averages mask variability between cells and prevent accurate assessment of functional heterogeneity at the cellular and molecular level. Single cell analysis can distinguish whether all cells of a population express similar numbers of a gene transcript or whether a small number of cells account for most of the expression.
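A toy example illustrates why a population average cannot make this distinction. The transcript counts below are invented for illustration and are not data from this work; the helper name is hypothetical. Both populations have the same mean, yet in one every cell contributes equally and in the other a tenth of the cells account for all of the transcript.

```python
def top_fraction_share(counts, top_fraction=0.10):
    """Share of all transcripts contributed by the highest-expressing cells."""
    ranked = sorted(counts, reverse=True)
    n_top = max(1, round(len(ranked) * top_fraction))
    total = sum(ranked)
    return sum(ranked[:n_top]) / total if total else 0.0


# Hypothetical per-cell transcript counts for 100 cells each.
uniform = [10] * 100             # every cell carries 10 transcripts
skewed = [100] * 10 + [0] * 90   # 10 cells carry all of the transcripts

for name, counts in (("uniform", uniform), ("skewed", skewed)):
    mean = sum(counts) / len(counts)
    share = top_fraction_share(counts)
    print(f"{name}: mean = {mean:.1f}, top 10% of cells hold {share:.0%} of transcripts")
```

A bulk measurement would report the same average for both populations; only the per-cell view separates them.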

An increasing number of studies have investigated EC heterogeneity at the single cell level206-209. Many of these studies examine ECs in specific disease or developmental contexts and so far, none have looked at basal HUVEC isolated fresh off the cord.

Chapter 2 Thesis Overview

2.1 Current Understanding and Outstanding Questions

As discussed above, the concept of heterogeneity and examples of monoallelic expression have long been recognized; however, little is known about how monoallelic expression of these genes is established or about its functional importance. Recent studies have suggested that autosomal monoallelic expression may affect a large number of expressed genes in human tissues and cells throughout the body. This work addresses the questions of how epigenetics and MAE function to generate heterogeneity and whether these aspects of gene regulation are retained across mitotic cell division. Currently, it is believed that MAE of most genes is heritable: once an allelic choice is made, expression from the allele is maintained through mitosis and all daughter cells will express that gene from the same allele.

However, there are known examples of genes that are transiently monoallelic and can switch to biallelic expression in response to external stimuli. A murine study reported that stimulation with LPS induces a mono- to biallelic switch of Tnfα expression after 1hr of exposure210. Shortly after the biallelic switch, cells returned to Tnfα expression from one allele and the number of expressing cells steadily decreased to basal levels 24hrs post-stimulation. In this system, allelic regulation was thought to function as an mRNA dosage control mechanism, similar to the one seen in X inactivation.

Other studies have found that certain genes undergo transcriptional switching (from mono- to biallelic or bi- to monoallelic) as part of normal developmental processes211, 212. Ku et al. have described a switch that occurs in Gata3 transcription during T-cell development212. Hematopoietic stem cells found in bone marrow migrate to the thymus, where they undergo maturation and expansion to become early T-cell progenitors (ETPs). These ETPs sequentially develop through double-negative stages, double-positive stages, and finally into CD4+ helper T cells or CD8+ cytotoxic T cells (reviewed in 213). Cells isolated from adult mice at early stages of T cell maturation were found to have monoallelic expression of Gata3 whereas about half of those obtained from later stages showed biallelic expression. It was shown that the commitment to either mono- or biallelic expression of Gata3 is stable throughout T-cell development. The authors hypothesized that the switch to BAE in half of the cells may be linked with T cell receptor β allelic exclusion. It was later shown that GATA3 protein abundance, regulated in part by the mono- to biallelic transcriptional switch, had a role in determining successful TCR recombination214.

Evidence has also been found to suggest a category of monoallelic autosomal genes that can be expressed from both alleles, but for which RNA from only one allele is present at any given time149. This type of monoallelic regulation is more consistent with transcriptional bursting, in which bursts can range from minutes to hours in length and are not sustained over a longer period of time6, 10. Methods that have been traditionally used to study MAE, such as RNA fluorescence in situ hybridization (FISH) or single-cell RNA sequencing, examine cells at a single time point and cannot distinguish between dynamic MAE resulting from transcriptional bursting and mitotically stable monoallelic expression. In order to conclusively classify a gene as being allele-biased or monoallelic, the expression imbalance needs to be followed over a long time frame and throughout clonal expansion in vitro and in vivo.
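A simple probability sketch shows why a single time point is ambiguous. It assumes, purely for illustration, that the two alleles of a biallelically competent gene burst independently and that each is active at any instant with the same probability; the value of 0.10 and the function name are hypothetical. Under these assumptions most expressing cells in a snapshot appear monoallelic even though neither allele is stably silenced.

```python
def snapshot_fractions(p_on: float) -> dict:
    """Fractions of cells with 0, 1, or 2 active alleles at one time point.

    Assumption (for illustration): each allele is independently active with
    probability p_on, as in a simple bursting model.
    """
    return {
        0: (1 - p_on) ** 2,
        1: 2 * p_on * (1 - p_on),
        2: p_on ** 2,
    }


fractions = snapshot_fractions(0.10)
expressing = fractions[1] + fractions[2]
apparent_monoallelic = fractions[1] / expressing

print(f"cells with no active allele:  {fractions[0]:.2f}")
print(f"cells with one active allele: {fractions[1]:.2f}")
print(f"cells with two active alleles: {fractions[2]:.2f}")
print(f"expressing cells that look monoallelic: {apparent_monoallelic:.1%}")
```

Distinguishing this scenario from mitotically stable MAE therefore requires repeated measurements or clonal analysis rather than a single snapshot.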

Another question which remains to be addressed is the role of replication timing and DNA methylation in determining which allele is expressed. Generally, early replicating genes are actively expressed, while repressed genes are replicated later in S phase189. A caveat to this has been found in genes that are differentially methylated and expressed in different cell types215. For example, the promoter of endothelial nitric oxide synthase (eNOS/NOS3) is unmethylated in ECs where it is expressed and methylated in human aortic vascular smooth muscle cells (HuAoVSMCs) where it is not expressed. In both cell types, NOS3 is replicated early during S phase (rather than being replicated early in ECs and late in HuAoVSMCs). The majority of known monoallelic genes are thought to be replicated asynchronously, meaning that one allele is replicated earlier and the other later during S phase. If a monoallelic gene is also regulated by DNA methylation, how does this epigenetic mark influence the replication timing and coordination of expression from a single allele? For instance, it is unknown whether there are differences in epigenetics between the early and late replicating alleles. It is also unknown if early replication drives transcriptional activity of an allele. Currently, how replication timing and epigenetics are coordinated to direct gene expression is not well understood.

Endothelial cells found in different vascular beds are known to exhibit differences in gene expression. Human ECs can retain gene expression patterns that reflect their vascular bed of origin even when grown for extended periods in vitro under standard culture conditions216. Because of this, it has been argued that EC heterogeneity is innate and preprogrammed. This phenotypic heterogeneity is also argued to reflect hardwired cellular responses to the extracellular environment. We are interested in the basal heterogeneity that exists between adjacent ECs in the same blood vessel exposed to the same extracellular environment.

P-selectin is required for a healthy hemostatic and inflammatory response84. Yet, high levels of expression in ECs at sites of chronic inflammation, such as in atherosclerotic lesions, promote lesion progression69, 70. Too little P-selectin may induce a hemophilic state87. For P-selectin, we suspect that healthy endothelial cells hedge their bets as to where, when, and how much to express. Having heterogeneous expression may be advantageous at the population level, where variability in gene expression and phenotype across cells may contribute to effective resource allocation and task sharing. We consider the idea that a loss of heterogeneity may be a significant risk factor in the development of certain diseases. Our work will address how heterogeneity of EC gene expression is established, why it affects some genes more than others, and how it contributes to the structure and function of blood vessels both between and within different organs.

2.2 Heterogeneity of Endothelial Enriched Genes von Willebrand Factor and Vascular Cell Adhesion Molecule 1

Epigenetic processes are known to be functionally important in establishing endothelial heterogeneity across vascular beds. The importance of this heterogeneity in normal vascular physiology and various pathologies is widely appreciated. Inter- and intraindividual variability in endothelial gene expression plays an important role in susceptibility to disease. For example, atherosclerosis mainly occurs in medium to large sized arteries in regions of disturbed flow with low shear stress, characterized by low expression of eNOS and high expression of pro-inflammatory genes including p65, a subunit of NF-κB7.

We are interested in the specific mechanisms that regulate vascular heterogeneity. Epigenetic pathways, including DNA methylation and histone post-translational modifications, are fundamental for regulating EC gene expression. Importantly, epigenetic marks can be retained and passed on to daughter cells through mitotic cell division217. Despite the known biological roles of differential allelic gene expression resulting from imprinting and X-chromosome inactivation, an understanding of random autosomal monoallelic expression and how it contributes to human genetic diversity and cell-to-cell heterogeneity is lacking. Biallelic gene expression protects diploid organisms from single-hit mutations that may be deleterious with respect to development or may be responsible for disease. When a cell expresses a gene from only one allele, this could increase the risk of developing these disease-related phenotypes. This is most commonly observed in cancer, where loss of heterozygosity in a cell (whereby the normal function of one allele of a gene is lost while the other homologous allele has already been inactivated) is thought to contribute to malignant progression. In this instance, the deleterious risk of MAE is strikingly clear. The question then arises: why does random autosomal monoallelic gene expression exist at all?

In this work we will investigate whether DNA methylation can be passed on to daughter cells after mitotic cell division. DNA methylation is a known epigenetic mark that can distinguish active and inactive alleles of both imprinted and X-linked genes218, 219. It is one mechanism through which the transcriptional state of a gene can be inherited and maintained in daughter cells217, 220. Previous work from our lab has reported that VCAM1 is a cytokine-induced EC gene that displays heterogeneous expression upon stimulation with TNF-α (Turgeon et al., under revision). We observed that epigenetic modifications, namely DNA methylation, in part, explained this heterogeneity in VCAM1 inducibility. RNA polymerase II and p65 were found to preferentially bind hypomethylated VCAM1 DNA. We hypothesized that differential DNA methylation at the promoter results in differential binding of p65, which drives VCAM1 expression heterogeneity. Expansion of a single EC into a clonal population also demonstrated that the DNA methylation profile of the VCAM1 promoter was retained through mitotic cell divisions.

In prior collaborative work with our lab, Yuan et al. looked at heterogeneity in von Willebrand factor expression in endothelial cells42. We have also previously shown that methylation at the VWF promoter correlated with expression215. Four CpG sites downstream of the transcriptional start site (TSS) were assessed and low levels of methylation were found in HUVECs and human neonatal dermal microvascular ECs (HMVEC), whereas dense methylation was observed in the non-expressing HuAoVSMCs, human saphenous vein VSMC, keratinocytes, and hepatocytes.

When comparing the same CpG sites, these data were consistent with the Yuan et al. work where hypomethylation was observed at the 4 CpGs in HUVEC and HMVEC and hypermethylation in HuAoVSMC and hepatocytes42. Interestingly, when the authors used fluorescence activated cell sorting (FACS) to isolate HUVEC with the highest (top 10%) and lowest (bottom 10%) levels of VWF and compared their DNA methylation, the 4 downstream CpG sites were very similar among the high and low cells. Bisulfite sequencing of RNA polymerase II bound and unbound immunoprecipitated DNA also showed no difference in the 4 downstream CpGs. When the authors used in vitro methylation to assess the CpG sites in the VWF promoter/reporter constructs, compared with the mock-methylated control, the in vitro methylated plasmid demonstrated significantly less promoter activity when the 4 downstream sites were methylated. Overall the main finding from this paper was that some endothelial cells (such as coronary artery ECs and pulmonary artery ECs) had variable VWF expression whereas others (such as HUVEC) appeared to be more homogeneous. Interestingly, methylation patterns were clonally inherited in cells at the highest and lowest ends of VWF expression (HUVEC and human coronary artery vascular smooth muscle cells respectively) and not inherited in ECs that express medium levels of VWF. In these cells of intermediate VWF expression, the authors proposed that heterogeneity may be driven by upstream regulatory pathways that directed changes in the transcriptional/epigenetic machinery at the promoter. Heterogeneity of VWF at select vessels may be protective. Too little VWF results in von Willebrand disease, a bleeding disorder, and too much VWF causes thrombotic thrombocytopenic purpura, a disorder where blood clots form in small blood vessels throughout the body, which can limit or block the flow of oxygen-rich blood to the organs221.

Like VCAM1, P-selectin is a cellular adhesion molecule involved in leukocyte recruitment to the vascular endothelium during inflammation, but it is basally expressed. P-selectin is stored within WPBs, along with VWF, and upon stimulation with inflammatory molecules these WPBs fuse to the surface of ECs and expose P-selectin to the lumen where it can mediate leukocyte rolling. Our interrogation of the dbMAE151 suggested to us that P-selectin is monoallelically expressed in human and murine cell/tissue types whereas VCAM1 and VWF are biallelic.

The specific goal of this project is to gain further insight into how epigenetics and monoallelic expression contribute to EC heterogeneity and assess whether this type of allele-specific gene regulation is mitotically stable by using P-selectin as a model gene.

2.3 Hypothesis

P-selectin is heterogeneously expressed from a single allele in basal endothelial cells, epigenetics contributes to this differential expression, and this allele-specific regulation is mitotically heritable through cell division.

2.4 Specific Aims

Aim 1: Define the cellular and molecular basis for heterogeneous P-selectin expression in primary human endothelial cells grown in vitro.

Aim 2: Determine whether P-selectin epialleles are mitotically inherited.

Chapter 3 Chromatin-based Pathways Mediate Basal Expression Heterogeneity of P-selectin in Endothelial Cells

3.1 Materials and Methods

3.1.1 Cell Culture

Human umbilical vein endothelial cells were isolated from fresh single-donor umbilical cords according to the method described by Jaffe et al.222. Primary cultures of endothelial cells were plated on 100mm tissue culture dishes pre-coated with 0.2% (v/v) gelatin in water. Cells were maintained in Endothelial Cell Growth Medium 2 (PromoCell, Cat# C-22011) containing 2% fetal calf serum, 5ng/mL recombinant human epidermal growth factor, 10ng/mL recombinant human basic fibroblast growth factor, 20ng/mL insulin-like growth factor, 0.5ng/mL recombinant human vascular endothelial growth factor 165, 1μg/mL ascorbic acid, 22.5μg/mL heparin, and 0.2μg/mL hydrocortisone. 100U/mL penicillin and 100U/mL streptomycin were added to the supplemented endothelial cell growth media.

Bovine aortic endothelial cells (BAEC) were harvested according to the core facility method from the Vascular Research Division of the Brigham & Women’s Hospital and Harvard Medical School (http://vrd.bwh.harvard.edu/core_facilities/baec.html). Cells were cultured on 0.2% (v/v) gelatin-coated dishes in RPMI Medium 1640 containing 2mM L-Glutamine (Thermo Fisher Scientific, Cat# 11875093) supplemented with 15% heat-inactivated calf serum (Gibco, Cat# 16170-086) and an antibiotic solution of 100U/mL penicillin and 100U/mL streptomycin.

Single donor human aortic vascular smooth muscle cells, cryopreserved at passage 1, were obtained from ScienCell (Cat# 6110) and maintained as recommended by the supplier.

Passage 3 cryopreserved single donor human aortic endothelial cells (HAEC; Lonza, Cat# CC-2535), human coronary artery endothelial cells (HCAEC; Lonza, Cat# CC-2585), and human neonatal dermal microvascular endothelial cells (HMVEC; Lonza, Cat# CC-2505) were initiated and cultured according to supplier instructions.

All cell cultures were maintained at 37°C in a humidified 5% CO2 atmosphere and the media replaced every 48h. Cells were passaged at 80-90% confluency using StemPro Accutase® (Thermo Fisher Scientific, Cat# A1110501). HUVEC were expanded up to passage 3 for experiments, unless stated otherwise. BAEC, HMVEC, HCAEC, HAEC, and HuAoVSMC were used up to passage 5.

3.1.2 Isolation of Single Cells for Clonal Expansion

Single HUVEC clones were obtained by fluorescence activated cell sorting of individual cells into separate wells of 0.2% (v/v) gelatin-coated flat bottom 96-well plates (Sarstedt, Cat# 83-3924). Freshly harvested HUVECs were sorted using a FACSAria™ III (BD Biosciences) into 96-well plates containing 50µL/well of fetal bovine serum (FBS) (Wisent Bioproducts, Cat# 090-150). After sorting, each well was topped up with an additional 150µL of media. Media was changed every other day and wells were inspected by microscopy to exclude wells which contained more than one cell. These monoclonal wells were then expanded and subsequently passaged into 24-, 12-, 6-well, and finally into 100mm dishes.

3.1.3 Fluorescence Activated Cell Sorting

Flow cytometry was used to sort P-selectin high (top 5%) and low (bottom 5%) subpopulations. FACS experiments were performed using passage 3, post-confluent HUVEC preparations. Briefly, 100mm dishes of confluent HUVEC were washed with 10mL phosphate buffered saline without calcium and magnesium (PBS-/-). Cells were lifted and dissociated using 5mL of Accutase® (Millipore, Cat# SCR005) per dish for 10min at room temperature. Single cell homogenates were collected after centrifugation, re-suspended in sorting buffer (1 × PBS-/-, 1mM EDTA (ethylenediaminetetraacetic acid), 25mM HEPES (pH 7.0), 1% bovine serum albumin (BSA)) and filtered through a 40µm nylon mesh cell strainer to eliminate clumps and debris. Stimulated samples were treated with 10⁻⁴M histamine for 10min and labelled with Alexa Fluor 488-conjugated mouse anti-human P-selectin antibodies (BioLegend, Cat# 304916) or an isotype control (BioLegend, Cat# 400129). Propidium iodide (PI) staining (Millipore Sigma, Cat# P4170) was used to mark dead/dying cells. 1µL of 100µg/mL PI solution was added per 100µL of sample. Singlet discrimination was sequentially performed using plots for forward scatter and side scatter (FSC-W versus FSC-H, SSC-W versus SSC-H, and SSC-A versus FSC-A). Dead cells were excluded by scatter characteristics and PI staining. All FACS experiments were performed on a FACSAria III sorter (BD Biosciences) at the St. Michael’s Hospital Core Facility and FACS data were analyzed using BD FACSDIVA™ Software (V8.0.1).
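The logic of selecting the top and bottom 5% of cells by P-selectin signal can be sketched as a percentile cutoff. This is a minimal illustration and not the gating procedure of the sorter software: the function names and fluorescence values are hypothetical, and it assumes that live singlets have already been selected.

```python
def percentile_gates(intensities, tail=0.05):
    """Return (low_cutoff, high_cutoff) enclosing the bottom and top tails."""
    ranked = sorted(intensities)
    n_tail = max(1, int(len(ranked) * tail))
    low_cutoff = ranked[n_tail - 1]   # highest value still in the bottom tail
    high_cutoff = ranked[-n_tail]     # lowest value still in the top tail
    return low_cutoff, high_cutoff


def sort_cells(intensities, tail=0.05):
    """Split events into P-selectin low and high subpopulations."""
    low_cutoff, high_cutoff = percentile_gates(intensities, tail)
    low = [x for x in intensities if x <= low_cutoff]
    high = [x for x in intensities if x >= high_cutoff]
    return low, high


# Hypothetical fluorescence values for 100 live, singlet events.
signal = list(range(1, 101))
low, high = sort_cells(signal)
print(f"low gate: {len(low)} events, high gate: {len(high)} events")
```

With tied intensity values at a cutoff, the gates in this sketch can contain slightly more than 5% of events.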

3.1.4 DNA Isolation

Cells were lysed using a buffer containing 200mM NaCl, 100mM Tris (pH 8.0), 5mM EDTA, 0.2% sodium dodecyl sulfate, and 20μg/mL RNase A. 200µg of Proteinase K (Thermo Fisher Scientific, Cat# EO0491) was added per mL of lysate and samples were incubated at 56°C overnight. DNA was then isolated and precipitated using standard phenol/chloroform and salt-ethanol precipitation methods. For flow sorted samples, DNA was extracted using the QIAamp DNA Mini Kit (Qiagen, Cat# 51304).

3.1.5 RNA Isolation

Total cellular RNA was harvested according to the method of Chomczynski and Sacchi223 with several modifications. Cells were washed with ice-cold PBS-/- and lysed with 750µL denaturing solution (solution D) (4M guanidinium thiocyanate, 25mM sodium citrate (pH 7.0), 0.5% (wt/vol) N-lauroylsarcosine (sarkosyl) and 0.01M 2-mercaptoethanol) per ~3 × 10⁶ cells. The lysate was passed at least twenty times through a sterile 18G needle to shear the DNA. To control for differences in RNA extraction and cDNA synthesis efficiencies, a known amount of exogenous, in vitro synthesized, capped, and polyadenylated firefly luciferase mRNA was added to each sample. To eliminate traces of DNA, 5µL each of DNase I (deoxyribonuclease I) and DNase I buffer (New England Biolabs, Cat# M0303S) were added to the RNA samples before the second precipitation. Solubilized RNA was treated with 100µL of 8M LiCl to remove heparin, which can inhibit reverse transcription and PCR amplification, followed by a final round of precipitation and solubilization in DEPC-treated, RNase/DNase free water. RNA was extracted from flow sorted samples (containing <300 000 cells) using the RNeasy Mini Kit (Qiagen, Cat# 74104).

3.1.6 Absolute Quantification of Gene Expression by Reverse Transcriptase-qPCR (RT-qPCR)

Reverse transcription was carried out on 1µg of total RNA using the SuperScript™ III First-Strand Synthesis SuperMix for RT-qPCR (Thermo Fisher Scientific, Cat# 11752-050) according to the instructions. Quantitative PCRs (qPCRs) were performed on 2μL of cDNA with optimized primer concentration for each primer pair in a 10μL reaction using Power SYBR™ Green PCR Master Mix (Thermo Fisher Scientific, Cat# 4368708) (Table 1). All measurements were performed as technical triplicates in a ViiA™ 7 Real‐Time PCR System or QuantStudio™ 7 Flex Real-Time PCR System (Applied Biosystems) in 384‐well plates (Applied Biosystems, Cat# 4309849) using 40 cycles of a two‐step protocol with 15sec at 95°C and 1min at 60°C. Mean Ct‐values of technical replicates were used to calculate copy number based on a standard curve of known quantities of plasmid DNA containing cloned target DNA sequences. Gene expression was quantified after normalizing to a reference gene and luciferase efficiency where applicable.
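The standard-curve calculation can be sketched as follows. This is an illustrative outline rather than the analysis pipeline used here: the function names, the standard-curve values, and the exact form of the normalization (target copies divided by the fractional luciferase recovery and by reference gene copies) are assumptions. The curve is the usual linear fit of Ct against log10 of the input copy number.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to obtain a copy number from a Ct value."""
    return 10 ** ((ct - intercept) / slope)


def normalized_expression(target_copies, reference_copies, luciferase_recovery=1.0):
    """Assumed normalization: correct for spike-in recovery, then divide by reference."""
    return target_copies / luciferase_recovery / reference_copies


# Hypothetical plasmid dilution series and the Ct values measured for it.
standard_copies = [1e2, 1e3, 1e4, 1e5, 1e6]
standard_cts = [31.4, 28.1, 24.8, 21.5, 18.2]
slope, intercept = fit_standard_curve(standard_copies, standard_cts)

# Mean Ct of a hypothetical technical triplicate, converted to copies.
triplicate = [24.7, 24.8, 24.9]
mean_ct = sum(triplicate) / len(triplicate)
target = copies_from_ct(mean_ct, slope, intercept)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, copies = {target:.0f}")
```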

Table 1: ChIP-qPCR and RT-qPCR primer sequences

Gene | Location1 | Sequence | Strand
Firefly luciferase | | 5ʹ-ACTCCTCTGGATCTACTGGTC-3ʹ | Forward
Firefly luciferase | | 5ʹ-GTAATCCTGAAGGCTCCTCA-3ʹ | Reverse
TBP | | 5ʹ-CGCCAGCTTCGGAGAGTTC-3ʹ | Forward
TBP | | 5ʹ-GCAATGGTCTTTAGGTCAAGTTTACA-3ʹ | Reverse
PPIA | Exon 1/Exon 2 | 5ʹ-GACGGCGAGCCCTTGG-3ʹ | Forward
PPIA | Exon 1/Exon 2 | 5ʹ-TCTGCTTTTGGGACCTTGT-3ʹ | Reverse
SELP | Exon 5/Exon 6, +17201/+17842 | 5ʹ-GGCTTCTGGAATCTGGACAA-3ʹ | Forward
SELP | Exon 5/Exon 6, +17201/+17842 | 5ʹ-TGAAGCTGCAGCTAGACTGATG-3ʹ | Reverse
SELP | Intron 1/Exon 2, +10903/+10962 | 5ʹ-GTTGGTGGTGTATGGATACTA-3ʹ | Forward
SELP | Intron 1/Exon 2, +10903/+10962 | 5ʹ-TCTGGTACAAGATGGCTATT-3ʹ | Reverse
SELP | Intron 16, +40437/+40501 | 5ʹ-GTGGGTAGTTGAGTGTCCTACA-3ʹ | Forward
SELP | Intron 16, +40437/+40501 | 5ʹ-CCTGAGGCGCAGCATAAT-3ʹ | Reverse
MYT1 | -141/+7 | 5ʹ-CACCGGACTGGGACACCTCT-3ʹ | Forward
MYT1 | -141/+7 | 5ʹ-GGCGGAGGAGGCTCCTTTGTA-3ʹ | Reverse
NOS2 | -16/+121 | 5ʹ-GGCTGCCAGTGTGTTCATAAC-3ʹ | Forward
NOS2 | -16/+121 | 5ʹ-CTTCGGGACTGTCTAGAAGTGC-3ʹ | Reverse
VEGFR2 | +15671/+15524 | 5ʹ-AACCAAGCCAACAGTGTCAAC-3ʹ | Forward
VEGFR2 | +15671/+15524 | 5ʹ-ATTGCCCACCTGGTACACA-3ʹ | Reverse

1 Relative to the transcriptional start site.


3.1.7 Sodium Bisulfite Genomic Sequencing

Genomic DNA, isolated from quiescent, post-confluent cells, was subjected to sodium bisulfite treatment using the EZ DNA Methylation Kit (Zymo, Cat# D5001) according to the manufacturer’s recommendations. 500ng of input DNA was used for all bisulfite conversion reactions. 40ng of the bisulfite-treated DNA was subjected to 35 cycles of PCR amplification in a total volume of 50μL using Platinum™ Taq DNA Polymerase (Thermo Fisher Scientific, Cat# 10966018) and the outer primers listed in Table 2. 3μL of the PCR product was used as the template for another 35 cycles of PCR using the inner primers (Table 2). All PCR primers were specifically designed against the sodium bisulfite-converted sense strand (Figure 6). The final PCR products were subcloned into ElectroMAX™ DH10B™ Cells (Thermo Fisher Scientific, Cat# 18290015) using the TA Cloning® Kit (Thermo Fisher Scientific, Cat# K207020) and the Cell Porator® Electroporation System (Gibco BRL Life Technologies, Inc., Cat# 71600-050 and 11609-039) with 0.15cm gap disposable microelectroporation chambers (Cat# 11608-031). Using antibiotic and β-galactosidase selection, individual colonies were picked and mini-prepped using the QuickLyse Miniprep Kit (Qiagen, Cat# 27406) following the manufacturer’s protocol. The plasmids were then subjected to Sanger sequencing for base-resolution analysis using the standard M13R(-27) Invitrogen primer. For each cell type, 15-24 randomly chosen subclones were sequenced using the ABI 3730XL sequencer at The Centre for Applied Genomics (Peter Gilgan Centre for Research and Learning, Toronto, ON, CA).


Figure 6. Sodium bisulfite sequencing. One method of assessing DNA methylation is sodium bisulfite sequencing. The first step involves denaturing double-stranded genomic DNA. Bisulfite treatment deaminates cytosine to uracil, whereas methylated cytosines are largely resistant to this conversion. After treatment, the two DNA strands are no longer complementary, because unmethylated cytosines are converted to uracil while methylated cytosines remain unconverted. The target region is then PCR amplified using primers specifically designed for the bisulfite-converted DNA. The final step involves subcloning the PCR products into an appropriate vector and transforming them into competent E. coli cells. To analyze methylation using this method, 15-20 subclones are then sequenced to yield high-resolution analysis of the CpG methylation patterns in single alleles.
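The conversion logic summarized in Figure 6 can be illustrated with a short in silico sketch. The reference sequence, methylated positions, and function names below are hypothetical, and the sketch assumes complete conversion and clone reads that align to the reference without insertions or deletions.

```python
def bisulfite_convert(sequence, methylated_positions):
    """Return the sense strand as it reads after bisulfite treatment and PCR.

    Unmethylated C is deaminated to U and amplified as T; methylated C stays C.
    """
    methylated = set(methylated_positions)
    converted = []
    for index, base in enumerate(sequence.upper()):
        if base == "C" and index not in methylated:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)

def call_cpg_methylation(reference, clone_read):
    """Map each CpG position in the reference to True if the clone kept a C."""
    calls = {}
    for index in range(len(reference) - 1):
        if reference[index:index + 2] == "CG":
            calls[index] = clone_read[index] == "C"
    return calls

def percent_methylation(reference, clone_reads):
    """Percentage of clones that are methylated at each CpG position."""
    totals = {}
    for read in clone_reads:
        for position, is_methylated in call_cpg_methylation(reference, read).items():
            totals[position] = totals.get(position, 0) + int(is_methylated)
    return {pos: 100.0 * count / len(clone_reads) for pos, count in totals.items()}

# Hypothetical 12-bp reference with CpGs at positions 1, 5 and 9 (0-based).
reference = "ACGTTCGACCGT"
clone_a = bisulfite_convert(reference, methylated_positions=[1, 9])
clone_b = bisulfite_convert(reference, methylated_positions=[1])
summary = percent_methylation(reference, [clone_a, clone_b])
```

Each sequenced subclone corresponds to one call to `call_cpg_methylation`, which is why this approach reports methylation patterns on single alleles.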


Table 2: Sodium bisulfite genomic sequencing primer set

| Target | Primer Set | Sequence | Strand | Location1 | Ta (°C) | Amplicon Size (bp) |
|---|---|---|---|---|---|---|
| SELP Promoter | Outer | 5ʹ-TATGGGAAGAGGAAAATAAAAAGGTAT-3ʹ | Forward | -566 to -540 | 45.3 | 491 |
| SELP Promoter | Outer | 5ʹ-ATTAACAACTTAAAAATAATAATTACCAAAA-3ʹ | Reverse | -106 to -76 | 45.3 | 491 |
| SELP Promoter | Inner | 5ʹ-GGAAGAGGAAAATAAAAAGGTATTTAT-3ʹ | Forward | -562 to -536 | 47.4 | 451 |
| SELP Promoter | Inner | 5ʹ-AAAATCACCCCCTTCCATAAAAAATA-3ʹ | Reverse | -137 to -112 | 47.4 | 451 |

1 relative to the transcriptional start site


3.1.8 Pyrosequencing of Bisulfite Converted DNA

Pyrosequencing assay setup was done by EpigenDX (Hopkinton, MA). Two assays, ADS8536 and ADS8769, were created to cover the same 4 CpGs assessed using the single strand bisulfite analysis method. Samples were prepared according to the recommended PCR protocol and sent to EpigenDX for analysis using the PSQ™96HS system. This method is used to provide a quantitative assessment of DNA methylation at each CpG site in every product in the PCR reaction, as opposed to single DNA strand analysis.

3.1.9 Inhibition of DNA Methylation or Histone Deacetylation

Cells were treated with 5-Aza-2’-deoxycytidine (5-Aza-CdR) and/or Trichostatin A (TSA) to inhibit DNA methylation or histone deacetylation, respectively, as described by Man et al.103 and Chan et al.224 Briefly, exponentially growing cells were treated with 50μM 5-Aza-CdR and grown for 2 days, after which RNA was isolated. For TSA and combined 5-Aza-CdR and TSA treatments, TSA was added at a concentration of 1µM for the last 24h. Media was replaced every 48h.

3.1.10 Plasmids Used for Promoter Reporter Experiments

The P-selectin promoter/luciferase reporter construct, pGL2-1796/+92, was created according to the protocol by Xu et al.225 using pGL2-Basic as the backbone (5597bp) and the primers listed in Table 3. Plasmid sequences were confirmed with DNA sequencing. pGL2-Control (6046bp) was used as a control to compare the efficiencies of luciferase expression driven by the early simian virus 40 (SV40) enhancer and promoter versus the engineered P-selectin promoter in pGL2-1796/+92. All plasmids used for transfection were purified using the QIAGEN Plasmid Maxi Kit (Cat# 12163). Plasmids were screened by miniprep analysis and agarose gel electrophoresis.


Table 3: Primers used to make pGL2-1796/+92

| Primer | Sequence | Strand | Restriction site added | Tm (°C) | Ta (°C) |
|---|---|---|---|---|---|
| 1 | 5ʹ-TAACGGGGTACCTAACAGCGTGATAGGTATTGTTCCA-3ʹ | Forward | GGTACC (KpnI) | 76.9 | 57.9 |
| 2 | 5ʹ-ACTCCGCTCGAGCTCTGTGACTCTGCTGGTTTTCTG-3ʹ | Reverse | CTCGAG (XhoI) | 81.4 | 57.9 |


3.1.11 In Vitro Methylation of Promoter/Reporter Constructs

pGL2-1796/+92 was in vitro methylated using M.SssI CpG methyltransferase (New England Biolabs, Cat# M0226S) according to the manufacturer’s recommendations. Briefly, DNA was incubated with M.SssI (1 unit/μg DNA) in the presence of 1 × NEBuffer 2 and 160μM S-adenosylmethionine at 37°C for 4h. DNA was purified using the QIAquick PCR Purification Kit (Qiagen, Cat# 28104) and the recovered DNA was subjected to a second round of methylation overnight. The construct was then purified again using the QIAquick PCR Purification Kit. Mock methylation followed the same protocol with exclusion of the M.SssI enzyme. To test the efficacy of M.SssI, methylated DNA was digested with enzymes either blocked by CpG methylation (HpaII) or not blocked by methylation (MspI). Sensitivity to MspI and resistance to HpaII indicated that the DNA was efficiently methylated by M.SssI.

3.1.12 Plasmid Transfection and Luciferase Assay

18-24hrs before transfection, cells were seeded in antibiotic-free complete growth media to be 80–90% confluent at the time of transfection. BAEC were used at passage 4 and HUVEC were used at passage 3. BAEC were transfected using Lipofectamine® 2000 and HUVEC using Lipofectamine® LTX according to the manufacturer's instructions. All cells were co-transfected with the test plasmid, an internal control vector, and pcDNA3, which was used to increase the bulk mass of DNA to obtain an optimized ratio of DNA to transfection reagent (1:2.5 (mass/volume) ratio). The internal control vector used was pRL-SV40, which encodes the SV40 enhancer/promoter region upstream of the Renilla luciferase gene. This internal control is used to normalize the values of the test reporter gene for variations that may be caused by transfection efficiency and sample handling. Renilla luciferase is constitutively expressed from the control vector and its production is independent of experimental manipulations to the test promoter (such as methylation). pGL2-Control vector, containing the SV40 promoter and enhancer sequences upstream of Firefly luciferase, was used as a positive control. pGL2-Basic, which lacks eukaryotic promoter and enhancer sequences, was used as a negative control. Transfection complexes were formed at room temperature in Opti-MEM® I Reduced-Serum Medium (Thermo Fisher Scientific, Cat# 31985070). The DNA-lipid complexes were added to cells pre-washed with Opti-MEM® and incubated for 6hrs at 37°C, after which the transfection mix was replaced with either Endothelial Cell Growth Medium 2 supplemented with 12% (v/v) FBS (Wisent Bioproducts, Cat# 090-150) for HUVECs or RPMI 1640 supplemented with 45% (v/v) calf serum (Gibco, Cat# 16170-086) for BAECs. Twenty-four hours after transfection, the cells were harvested for the Dual-Luciferase® Reporter Assay (Promega, Cat# E1910) according to the manufacturer's protocol, and luciferase activity was measured using the BioTek Synergy NEO multimode plate reader. Each transfection experiment was performed as technical triplicates and repeated a minimum of three times. Samples were normalized by dividing the test reporter activity by the Renilla reporter activity and subtracting the mean of pGL2-Basic activity.
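The normalization in the last sentence above can be written out as a short sketch. The readings and function names are hypothetical, and the sketch assumes that the pGL2-Basic background is itself expressed as a Firefly/Renilla ratio before its mean is subtracted, which the text does not state explicitly.

```python
from statistics import mean

def firefly_to_renilla(firefly, renilla):
    """Per-well ratio of test (Firefly) to internal control (Renilla) signal."""
    return [f / r for f, r in zip(firefly, renilla)]

def normalized_activity(test_firefly, test_renilla, basic_firefly, basic_renilla):
    """Renilla-normalized reporter activity minus the mean pGL2-Basic background."""
    test_ratios = firefly_to_renilla(test_firefly, test_renilla)
    background = mean(firefly_to_renilla(basic_firefly, basic_renilla))
    return [ratio - background for ratio in test_ratios]

# Hypothetical luminescence readings from technical triplicate wells.
activity = normalized_activity(
    test_firefly=[1200.0, 1500.0, 900.0],
    test_renilla=[100.0, 100.0, 100.0],
    basic_firefly=[100.0, 300.0, 200.0],
    basic_renilla=[100.0, 100.0, 100.0],
)
```

Dividing by the Renilla signal first means that wells with different transfection efficiencies are put on a common scale before the promoterless background is removed.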

3.1.13 Chromatin Immunoprecipitation (ChIP)

The following antibodies were used in ChIP experiments: Anti-histone H3 (trimethyl Lys27) (Millipore Sigma, Cat# ABE44), anti-histone H3 (trimethyl K36) (Abcam, Cat# ab9050), and normal rabbit IgG (Santa Cruz, Cat# sc-2027).

ChIP was performed using the ChIP Assay Kit (Millipore Sigma, Cat# 17-295) according to the manufacturer's instructions with several modifications. Approximately 1 × 10^6 post-confluent cells were used per ChIP assay. Sonication was performed on ice using a Sonics and Materials Vibra-Cell™ ultrasonic processor with a 3mm tip set at 30% amplitude. Samples were pulsed 5 times using a cycle of 10sec ON followed by 10sec OFF. This was repeated 10 times with a 30sec rest interval after every 5 cycles for a total of 50 pulses. Post-sonication, chromatin fragment sizes ranged from 500-1500bp. Chromatin was pre-cleared using 80µL of salmon-sperm DNA/protein A or G-agarose for 60min (N.B. protein A has a better affinity for rabbit IgG; protein G for mouse IgG). A 50μL aliquot of pre-cleared chromatin was reserved as an input control. Immunoprecipitation was performed overnight using 5μg of antibody. Sample DNA was recovered using the QIAquick PCR Purification Kit (Qiagen, Cat# 28104) and eluted with 60μL 10mM Tris-Cl (pH 8.5).

3.1.14 Absolute Quantitative PCR Analysis of ChIP Samples

Quantitative PCRs were performed on 2μL of ChIP DNA or a 10-fold dilution of input chromatin. This was done using optimized primer concentrations for each primer pair in a 10μL reaction with Power SYBR™ Green PCR Master Mix (Thermo Fisher Scientific, Cat# 4368708) (Table 1). All measurements were performed as technical triplicates on a ViiA™ 7 Real‐Time PCR System or QuantStudio™ 7 Flex Real-Time PCR System (Applied Biosystems) in 384‐well plates (Applied Biosystems, Cat# 4309849) using 40 cycles of a two‐step protocol with 15sec at 95°C and 1min at 60°C. The number of copies of target DNA was calculated from a standard curve prepared using known copies of human genomic DNA (assuming 1ng of DNA represents 300 copies). The amount of immunoprecipitated DNA was determined by subtracting the number of target DNA copies in the control (IgG) IP from the number of copies in the specific-antibody IP and dividing the difference by the number of copies in the diluted input sample.
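The arithmetic in the preceding paragraph can be sketched as follows. The copy numbers and function names are hypothetical; the only constant taken from the text is the assumption that 1ng of human genomic DNA represents 300 copies.

```python
COPIES_PER_NG = 300  # assumption stated in the text for human genomic DNA

def genomic_standard_copies(nanograms):
    """Copies of target represented by a genomic DNA standard of the given mass."""
    return nanograms * COPIES_PER_NG

def chip_enrichment(specific_ip_copies, control_ip_copies, diluted_input_copies):
    """(specific IP - control IP) / diluted input, as described in the text."""
    return (specific_ip_copies - control_ip_copies) / diluted_input_copies

# Hypothetical standards (ng) and copy numbers read off the standard curve.
standards = [genomic_standard_copies(ng) for ng in (1, 10, 100)]
enrichment = chip_enrichment(
    specific_ip_copies=950.0,
    control_ip_copies=50.0,
    diluted_input_copies=3000.0,
)
```

Subtracting the control IP removes non-specific antibody background, and dividing by input corrects for differences in the amount of chromatin at each locus.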

3.1.15 PCR and Sanger Sequencing to Detect Heterozygous SNPs

The primer pairs used to PCR amplify heterozygous single nucleotide polymorphisms (SNPs) from HUVEC using Platinum™ Taq DNA Polymerase High Fidelity (Thermo Fisher Scientific, Cat# 11304011) can be found in Table 4. PCR products were directly subjected to Sanger sequencing at The Centre for Applied Genomics (Peter Gilgan Centre for Research and Learning, Toronto, ON, CA) using the corresponding primers listed in Table 5.


Table 4: H3K27me3 and H3K36me3 SNP PCR primers

| Primer Set | SNP Variant ID | Sequence | Strand | Ta (°C) | Amplicon Size (bp) |
|---|---|---|---|---|---|
| 1 | rs2235303 | 5ʹ-GCCCAACCCTGCTACTATC-3ʹ | Forward | 55.2 | 417 |
| 1 | rs2235303 | 5ʹ-CACAATGGCTCACACCTATAATC-3ʹ | Reverse | 55.2 | 417 |
| 2 (note 1) | rs3917783, rs3917786 | 5ʹ-CTACCGATAGTTCCACTTTAG-3ʹ | Forward | 54.4 | 632 |
| 2 (note 1) | rs3917783, rs3917786 | 5ʹ-GGTCACCAGAGGATCAATG-3ʹ | Reverse | 54.4 | 632 |
| 3 | rs3917811 | 5ʹ-CTTGGCCAGGAGAATCATC-3ʹ | Forward | 55.7 | 525 |
| 3 | rs3917811 | 5ʹ-GTTGGGTGAGACTTCAACATACA-3ʹ | Reverse | 55.7 | 525 |
| 4 (note 2) | rs3917820, rs3917824 | 5ʹ-CCTGCTATAGAAATTGGTAAAC-3ʹ | Forward | 48.6 | 513 |
| 4 (note 2) | rs3917820, rs3917824 | 5ʹ-GCTGCTAATGCATTGATAAATC-3ʹ | Reverse | 48.6 | 513 |
| 5 | rs3917854 | 5ʹ-GGCAGTAGGACACAGGTATG-3ʹ | Forward | 53.8 | 605 |
| 5 | rs3917854 | 5ʹ-GGCTGGTCCAGAGCTAATCTAAG-3ʹ | Reverse | 53.8 | 605 |

1 amplicon contains both rs3917783 and rs3917786
2 amplicon contains both rs3917820 and rs3917824


Table 5: H3K27me3 and H3K36me3 SNP Sanger sequencing primers

| Primer Set | SNP Variant ID | Sequence | Strand | Tm (°C)1 |
|---|---|---|---|---|
| 1 | rs2235303 | 5ʹ-TGCCTCAGCCTCCCGAATAG-3ʹ | Forward | 61.9 |
| 1 | rs2235303 | 5ʹ-GAGAGGCTGATGTGAGTGAATC-3ʹ | Reverse | 58.1 |
| 2 | rs3917783, rs3917786 | 5ʹ-CTGCCTAGTTGTTCCTCAG-3ʹ | Forward | 56.9 |
| 2 | rs3917783, rs3917786 | 5ʹ-GAAGGGCAAAGGCAGAGACAAG-3ʹ | Reverse | 59.2 |
| 3 | rs3917811 | 5ʹ-CTCTGACACTGATAGTTTTTC-3ʹ | Forward | 53.0 |
| 3 | rs3917811 | 5ʹ-GACATTGCACCCCTGGAGTAG-3ʹ | Reverse | 57.7 |
| 4 | rs3917820, rs3917824 | 5ʹ-GCCATACATGATATTTCCTGAAG-3ʹ | Forward | 54.7 |
| 4 | rs3917820, rs3917824 | 5ʹ-GCATGTCAGTATGTGTTTATTG-3ʹ | Reverse | 50.2 |
| 5 | rs3917854 | 5ʹ-GGAGTTACAAAAGAGTTCAC-3ʹ | Forward | 51.8 |
| 5 | rs3917854 | 5ʹ-GAGGCGCAGCATAATCTTTTC-3ʹ | Reverse | 57.7 |

1 all sequencing reactions are done at an annealing temperature of 50°C

3.1.16 Immunofluorescence and Confocal Imaging

The following antibodies were used in immunofluorescence experiments: mouse anti-CD62P (P-selectin) [AK-6] (Abcam, Cat# ab6632), rabbit anti-VWF (Dako, Cat# A0082), goat anti-VE-cadherin [C-19] (Santa Cruz Biotechnology, Cat# sc-6458), donkey anti-Mouse IgG (H+L), Alexa Fluor 488 (Thermo Fisher Scientific, Cat# A-21202), donkey anti-Rabbit IgG (H+L), Alexa Fluor 555 (Thermo Fisher Scientific, Cat# A-31572), and donkey anti-Goat IgG (H+L), Alexa Fluor 647 (Thermo Fisher Scientific, Cat# A-21447).

Freshly isolated or low passage HUVEC were seeded onto No. 1.5 glass coverslips (22mm × 22mm × 0.17mm) coated with 0.05mg/mL type I rat tail collagen in 0.02M acetic acid (Roche, Cat# 11179179001). Cells were fixed using 2% paraformaldehyde for 10min at room temperature, followed by quenching with 75mM glycine for 10min, and permeabilization with 0.1% Triton X-100 for 15min. Cells were blocked for 1hr at room temperature with 3% BSA. Samples were incubated with the appropriate primary antibodies, diluted in PBS with 1% BSA, for 4 hours or overnight at 4°C. Secondary antibodies were applied for 1hr at room temperature. Nuclei were counterstained using DAPI (4′,6-diamidino-2-phenylindole) or Hoechst. Coverslips were mounted on glass slides using Dako Fluorescence Mounting Medium (Cat# S3023).


Image acquisition was done on a Zeiss LSM 700 Inverted Confocal microscope using the ZEN 2012 Sp1 64-bit software (black edition; v8.1). Slides were imaged with a Plan Apochromat 63×/1.40 NA DIC M27 oil immersion objective as a z‐stack with 14 planes spaced 0.4μm apart.

Fluorophores were detected as a multitrack scan with the following settings: track 1 – 488nm laser (2.0% power, 650 gain), 647nm laser (2.0% power, 650 gain); track 2 – 405nm laser (2.0% power, 600 gain), 555nm laser (1.0% power, 475 gain). Images were analyzed using Fiji226. Imaging was repeated in at least 3 independent experiments.

3.1.17 Single-cell RNA Sequencing (scRNA-seq)

Freshly harvested cells were delivered on ice to the Princess Margaret Genomics Centre (Toronto, ON, CA) for processing. A separate line of third-passage HUVECs was also prepared for sequencing.

Sequencing libraries were prepared targeting ~6,000 cells per sample (90% viability) following the Chromium Single Cell 3ʹ Reagent Kits v2 User Guide (CG00052). This platform uses droplet cell capture technology and applies a 3ʹ-end counting sequencing approach. The libraries were sequenced on an Illumina® HiSeq 2500 instrument, targeting 100K reads per cell.

scRNA-seq analysis was performed by the bioinformatics team at the Princess Margaret Cancer Centre (Toronto, ON, CA). Raw Illumina sequencing data from Chromium Single Cell libraries were preprocessed using the CELLRANGER (v2.1.0) pipeline from 10X Genomics. First, cellranger mkfastq was used to demultiplex raw base call files based on sample indices and to convert the cell barcodes and read data to FASTQ files. Each transcript has an 8bp library barcode (sample index) that allows identification of pooled samples on one sequencing lane. Cell barcodes (16bp) are used to identify the cell the read came from and a 10bp unique molecular index (UMI) uniquely identifies each RNA molecule. cellranger count takes FASTQ files from cellranger mkfastq and performs alignment, filtering, and UMI counting. FASTQ sequences were mapped to the GRCh38 human reference genome using STAR aligner (STAR v2.5.2b)227. After aligning sequence data to the genome, the quality of the mapping was assessed using RNA-SeQC (v1.1.7) and SAMTOOLS (v1.3.1).

The primary analysis outputs of CELLRANGER are gene-barcode matrices. Two types of gene-barcode matrices were generated. The first (unfiltered) contained every barcode from the fixed list of known barcode sequences, including background (droplets with no cells) and non-cellular barcodes (from ambient RNA). The second (filtered) contained only the detected cellular barcodes. To remove cells representing background noise, the cumulative fraction of uniquely mapped reads was plotted against the cell barcodes ordered by descending number of reads and the inflection point of the curve was used as a cutoff. Gene-barcode matrices were normalized according to the method by Lun et al.228 using the R package SCRAN (v1.2.2). Low-quality cells with log-library sizes >4 median absolute deviations (MADs) below the median log-library size, log-number of genes detected >4 MADs below the median, or high mitochondrial gene expression (UMI counts) were removed from the data sets. 101 cells in the fresh tissue sample and 335 cells in the passage 3 sample were removed from further analysis.

Principal component analysis (PCA) dimensionality reduction of single cell transcriptomes, clustering, differential gene expression, and t-distributed Stochastic Neighbor Embedding (t-SNE) visualization were performed using the R packages SCATER229 (v1.2.0), CELLRANGERRKIT (v1.1.0), RTSNE230 (v0.11), SC3231 (v1.3.14), EDGER232 (v3.16.5), SEURAT (v2.1)233, 234, and PCAMETHODS235 (v1.50.0). Loupe™ Cell Browser (v2.0.0) was used to view and explore the dataset to find significant genes, cell types, and subpopulations.
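The role of the cell barcode and UMI in transcript counting can be illustrated with a small sketch. This is not the CELLRANGER implementation: the function names and example reads are hypothetical, the layout of 16bp barcode followed by 10bp UMI at the start of the first read is an assumption about the v2 chemistry, and no barcode or UMI error correction is attempted.

```python
from collections import defaultdict

CELL_BARCODE_LENGTH = 16  # bp, as stated in the text
UMI_LENGTH = 10           # bp, as stated in the text

def split_read1(read1_sequence):
    """Split the barcode-bearing read into (cell barcode, UMI)."""
    barcode = read1_sequence[:CELL_BARCODE_LENGTH]
    umi = read1_sequence[CELL_BARCODE_LENGTH:CELL_BARCODE_LENGTH + UMI_LENGTH]
    return barcode, umi

def count_umis(aligned_reads):
    """Count distinct UMIs per (cell barcode, gene).

    aligned_reads: iterable of (read1_sequence, gene_name) pairs, where the
    gene name comes from aligning the paired cDNA read to the genome.
    """
    seen = defaultdict(set)
    for read1_sequence, gene in aligned_reads:
        barcode, umi = split_read1(read1_sequence)
        seen[(barcode, gene)].add(umi)
    return {key: len(umis) for key, umis in seen.items()}

# Hypothetical reads: two PCR duplicates of one SELP molecule plus a second
# SELP molecule in cell A, and one VWF molecule in cell B.
cell_a = "A" * 16
cell_b = "C" * 16
reads = [
    (cell_a + "ACGTACGTAC", "SELP"),
    (cell_a + "ACGTACGTAC", "SELP"),
    (cell_a + "TTTTGGGGCC", "SELP"),
    (cell_b + "ACGTACGTAC", "VWF"),
]
gene_barcode_counts = count_umis(reads)
```

Collapsing reads that share a barcode, gene, and UMI is what lets the counts reflect molecules rather than PCR-amplified reads.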

3.1.18 Fluorescence In Situ Hybridization

FISH was performed using the QuantiGene ViewRNA ISH Cell Assay Kit (Affymetrix, Cat# QVC0001), following the manufacturer’s instructions with some modifications. Briefly, first-passage cells were plated on gelatin-coated coverslips, fixed in 4% paraformaldehyde for 1hr, permeabilized with detergent, and treated with protease (1:2000 Protease QS in 1X PBS). A working probe set solution (1:100 probe set into probe set diluent QF) was prepared, added to the appropriate well, and incubated as recommended (probes, SELP hnRNA: VA1-6001142; SELP mRNA: VA4-3083145; VWF mRNA: VA1-12338). The PreAmplifier Mix, Amplifier Mix, and Label Probe Mix were then added sequentially according to the protocol. Slides were rinsed three times in PBS and nuclei were stained using DAPI. Coverslips were mounted on glass slides using Dako Fluorescence Mounting Medium (Cat# S3023). Images were collected on a Quorum spinning disc confocal microscope using a 20X objective (0.6 numerical aperture) and a 63X oil immersion objective (1.4 numerical aperture). An EM-CCD camera (Hamamatsu ImageEMX2) was used for all images. Z-stacks were taken every 0.6µm for a total of 24 slices and used for cell and transcript counting in Imaris Image Analysis software. Representative images were processed using FIJI226.

3.2 Results

3.2.1 Existence of Basal P-selectin Protein Heterogeneity

To determine whether intercellular P-selectin protein heterogeneity exists, immunofluorescence assays were performed on fixed and permeabilized monolayers of cultured HUVEC using specific antibodies directed at human P-selectin and VWF (Figure 7A). Confocal microscopy revealed heterogeneous expression of P-selectin among cells isolated from the same umbilical cord. The intracellular distribution of VWF appeared to be largely homogeneous, which recapitulates our prior findings42 (Figure 7B). Comparison of the distribution of these proteins within the same cell indicated that P-selectin co-localizes with VWF in the same WPBs (Figure 7C). A comparison of P-selectin expression in primary, first-, and second-passage cultures of HUVEC demonstrated that P-selectin expression was almost exclusive to the primary and first-passage cells, which is similar to previously reported results28.


Figure 7. Heterogeneity of basal P-selectin protein and SELP transcription in human endothelial cells. (A-C) Representative confocal images of human umbilical vein endothelial cells, fixed, permeabilized, and labelled with Hoechst nuclear stain (blue-green colour), anti-P-selectin (green), and anti-VWF (red) antibodies. These experiments were repeated three times with independent HUVEC populations and nearly identical results were observed. (D-F) HUVEC were stimulated with histamine (10^-4 M) for 10min, stained with Alexa488–conjugated P-selectin antibody, and sorted using fluorescence activated cell sorting. qRT-PCR quantification of steady-state SELP mRNA (normalized to cyclophilin A and luciferase efficiency) and hnRNA (normalized to luciferase efficiency) in P-selectin-high and -low expressing subpopulations. Representative data from one HUVEC line are shown. RNA was assessed in 3 biological replicates, which showed comparable results. Error bars represent the SEM of triplicate wells. (F) Ratio of mature SELP hnRNA to SELP mRNA in P-selectin-high and -low expressing subpopulations.

3.2.2 Heterogeneity of P-selectin at the Level of RNA

To examine mRNA expression heterogeneity in various cultured human ECs, we used a publicly available microarray data set that profiles 53 EC samples representing 14 different vessel and tissue types. This sample set included ECs from five different arteries (aorta, coronary artery, pulmonary artery, iliac artery, and umbilical artery), two different veins (umbilical vein and saphenous vein), and seven different tissues (skin, lung, intestine, uterus myometrium, nasal polyps, bladder, and myocardium). We found that ECs from different blood vessels and tissues have varying levels of P-selectin expression (Figure 8). Some vascular beds, such as nasal polyps and lung, exhibit significant constitutive expression of P-selectin. HUVECs and coronary artery ECs expressed higher levels of P-selectin than aortic and neonatal skin microvascular ECs. This correlated with the percent methylation patterns (discussed later) (Figure 15). There was no obvious association between P-selectin mRNA expression and vessel size (large versus microvascular) or vessel type (vein versus artery). We cannot exclude differences between mRNA and protein expression in these studies, but neither we nor others have observed discordance between SELP mRNA and protein expression.


Figure 8. Expression of SELP across multiple endothelial cell types from publicly available microarray data216. Expression values represent log(base 2) of ratio of Cy5/Cy3. * indicates p < 0.05.

3.2.2.1 Fluorescence In Situ Hybridization

FISH assays were used to evaluate the relative steady-state amounts of P-selectin mRNA and hnRNA in basal, early passage HUVEC. As with protein expression, we found that monolayers of HUVEC had variable expression of both P-selectin hnRNA and mRNA in seemingly homogeneous populations of cells (Figure 9). As expected, P-selectin hnRNA was predominantly nuclear while P-selectin and VWF mRNA were predominantly cytosolic. VWF mRNA was more highly and uniformly expressed across the imaged cells. Quantification of the average number of detected transcripts per positive cell revealed an hnRNA to mRNA ratio of about 0.92.


Figure 9. P-selectin mRNA and hnRNA are heterogeneously expressed in cultured HUVEC. SELP hnRNA (A, E) and mRNA (B, F) are heterogeneously expressed. VWF mRNA (C, G) is homogeneously expressed. Dual-labelled SELP (green) and VWF (red) mRNA (D, H). (A-D) were imaged using a 20X objective lens; scale = 50µm. (E-H) were imaged using a 63X objective (oil) lens; scale = 10µm. Representative transcripts are indicated by arrowheads (n=1).

3.2.2.2 Single-cell RNA Sequencing

HUVECs isolated from fresh human tissue (umbilical cords) and cultured cells (passage 3) were characterized by scRNA-seq to determine the unique transcriptional landscape of ECs in normal blood vessels. Briefly, the 10X Genomics Chromium platform, a droplet-based method, was used to isolate individual cells. RNA was isolated from each cell using oligo-dT primers to capture polyadenylated RNA. Poly(T)-primed RNA is converted to cDNA by reverse transcriptase and PCR amplified for library preparation and high-throughput sequencing. Unique molecular identifiers and barcodes added during the amplification step mark individual RNAs and preserve information about which cell the RNA came from. UMIs are used to directly quantify the number of transcript molecules for each gene.

Our goal was to sequence 6,000 cells at a minimum read depth of 100,000 reads per cell. Overall, we were able to capture 2,929 cells with 117,920 reads per cell for the fresh tissue and 8,281 cells with 37,151 reads per cell for the passage 3 sample. (Additional results for fresh HUVEC: 2,742 median genes per cell; 9,673 median UMI counts per cell; and ~74% sequencing saturation. Additional results for passage 3 HUVECs: 3,627 mean genes per cell; 15,719 median UMI counts per cell; and ~24% sequencing saturation.) Summaries of the sequencing runs can be found at file:///C:/Users/lucyc/Desktop/HS27F_results/web_summary.html and ..\..\Desktop\Paul's 2 samples HUVEC + TNF\Lucy_2\H548M_HUVEC_untreated\web_summary.html. Only 42 cells were positive for P-selectin in the passage 3 sample; therefore, we focused mainly on the analysis of the fresh tissue sample.

Analysis of the single-cell RNA-seq data was done in collaboration with the bioinformatics team at the Princess Margaret Genomics Centre. The data was filtered for library size, number of expressed genes per cell, and mitochondrial proportion. A large proportion of transcripts mapping to mitochondrial genes was used as a proxy for low-quality cells that are dying or damaged236. Outliers were defined by the number of median absolute deviations (MADs) from the median value of each metric across all cells. Cells with log2-transformed library sizes that were ≥ 4 MADs below the median log2-transformed library size were removed. A log-transformation improves resolution at small values, especially when the MAD of the raw values is comparable to or greater than the median. Cells whose log-transformed number of expressed genes was 4 MADs below the median, or whose mitochondrial proportion was high, were also removed. Based on this, 101 cells were excluded from subsequent analysis. The data was normalized using a method based on CPM (counts per million) specifically formulated for single-cell data as described by Lun et al.228 All subsequent analysis was conducted on the normalized and log2-transformed expression data.
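The MAD-based filter described above can be sketched as follows. This is an illustration only, not the SCRAN/SCATER code that was run: the library sizes and function names are hypothetical, the MAD is used unscaled (R implementations typically multiply it by a consistency constant), and a strict "more than 4 MADs below the median" rule is assumed.

```python
import math
from statistics import median

def median_absolute_deviation(values):
    """Median of the absolute deviations from the median (unscaled)."""
    center = median(values)
    return median([abs(v - center) for v in values])

def low_outlier_flags(values, n_mads=4.0):
    """Flag values lying more than n_mads MADs below the median."""
    threshold = median(values) - n_mads * median_absolute_deviation(values)
    return [v < threshold for v in values]

# Hypothetical library sizes (total UMI counts) for six cells.
library_sizes = [9000, 10000, 11000, 9500, 10500, 200]
log_library_sizes = [math.log2(size) for size in library_sizes]
flags = low_outlier_flags(log_library_sizes)
kept_cells = [size for size, flagged in zip(library_sizes, flags) if not flagged]
```

The same function could be applied to the log-transformed number of expressed genes; a high-side version would be needed for the mitochondrial proportion.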

Two of the most common goals of an scRNA-seq experiment are identifying cell subpopulations and characterizing genes that have differential expression across these groups. Principal component analysis and t-distributed stochastic neighbor embedding are both commonly used dimensionality reduction methods that can be used to summarize highly complex (i.e., high-dimensional) data into low-dimensional space (e.g., a 2D graph). PCA is an algorithm that uses the mathematical concepts of standard deviation, covariance, eigenvectors, and eigenvalues. It is an unsupervised method that linearly transforms the high-dimensional data based on the Euclidean distance in PCA space237. Essentially, PCA uses the correlation between features (in this case, gene expression) to provide a minimum number of variables (principal components) that retains the maximum amount of variation, or information about how the original data is distributed. In contrast to PCA, t-SNE is a non-linear dimensionality reduction algorithm that essentially “maps” high-dimensional data onto low-dimensional space238. t-SNE creates a probability distribution (using the Gaussian distribution) that defines the relationships between the points in high-dimensional space. It then uses the Student t-distribution to recreate the probability distribution in two or three dimensions as a scatterplot. Because t-SNE analysis is computationally intensive, PCA is often used in conjunction with t-SNE to first reduce the dimensionality of the data, after which the PCA components are visualized in two dimensions using t-SNE.
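As a minimal illustration of the covariance and eigenvector ideas behind PCA, the sketch below finds only the first principal component of a tiny expression matrix by power iteration. Power iteration is a simple stand-in chosen for readability, not the algorithm used by the R packages cited in this work, and the matrix and function names are hypothetical.

```python
import math

def center_columns(matrix):
    """Subtract each gene's (column's) mean across cells (rows)."""
    n_rows = len(matrix)
    means = [sum(column) / n_rows for column in zip(*matrix)]
    return [[value - m for value, m in zip(row, means)] for row in matrix]

def covariance_matrix(centered):
    """Gene-by-gene sample covariance of a column-centered matrix."""
    n_rows, n_cols = len(centered), len(centered[0])
    return [[sum(row[i] * row[j] for row in centered) / (n_rows - 1)
             for j in range(n_cols)] for i in range(n_cols)]

def first_principal_component(cov, iterations=200):
    """Leading eigenvector and eigenvalue of cov by power iteration."""
    vector = [1.0] * len(cov)
    for _ in range(iterations):
        product = [sum(c * v for c, v in zip(row, vector)) for row in cov]
        norm = math.sqrt(sum(p * p for p in product))
        vector = [p / norm for p in product]
    cov_times_vector = [sum(c * v for c, v in zip(row, vector)) for row in cov]
    eigenvalue = sum(v * cv for v, cv in zip(vector, cov_times_vector))
    return vector, eigenvalue

# Hypothetical 4 cells x 3 genes: gene 2 is twice gene 1, gene 3 is constant.
expression = [[1.0, 2.0, 5.0], [2.0, 4.0, 5.0], [3.0, 6.0, 5.0], [4.0, 8.0, 5.0]]
centered = center_columns(expression)
cov = covariance_matrix(centered)
pc1, variance_along_pc1 = first_principal_component(cov)
total_variance = sum(cov[i][i] for i in range(len(cov)))
explained_fraction = variance_along_pc1 / total_variance
scores = [sum(v * w for v, w in zip(row, pc1)) for row in centered]
```

Because the two varying genes are perfectly correlated here, a single component captures all of the variance; the per-cell `scores` are the coordinates that would then be passed on to a method such as t-SNE.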

Over 21,000 genes were detected in our sample that could be used to cluster cells into subpopulations. To visualize and ultimately define the various cell subpopulations in the single-cell data, both PCA and t-SNE were used in the analysis of our data. t-SNE analysis segregated the cells into obvious clusters (Figure 10A). The visualized plot contains clusters of points, representing individual cells, which are distinguished into groups or subtypes depending on the approximate distance between the clusters. Eleven differentially regulated clusters were identified within the fresh tissue sample. The labeling of the cell groups (in terms of cell type) in a scRNA-seq dataset can either be performed experimentally (by using FACS before the actual sequencing experiment) or post hoc based on the expression level of genes coding for commonly used cell-enriched mRNA transcripts. The cluster identities in our sample were based on the analysis of well-characterized mRNAs from the literature. EC-containing clusters were identified using prototypical EC genes, including PECAM1, VWF, CDH5, CD34, TEK, and ENG (Figure 10B). Most of the HuAoVSMC-associated genes (ACTA2, MYH11, SMTN, and MYOCD) were downregulated in these groups, as were genes known to be expressed in lymphatic endothelial cells, such as PDPN, PROX1, LYVE1, and FLT4/VEGFR3 (data not shown). These EC subpopulations were close in PCA space and were well separated from clusters identified as natural killer cells, red blood cells, conventional dendritic cells, T-cells, granulocytes, Langerhans cells, and macrophages (Figure 10A). Because our sample was harvested from fresh tissue, it was anticipated that these cells would be present among the ECs.


Figure 10. Endothelial subsets identified by scRNA-seq in cells isolated fresh from a single-donor umbilical cord. Steady-state mRNA transcript profiles of single cells were obtained using the 10X Chromium platform. Single-cell RNA sequencing data processing, alignment, gene quantification, and quality control were performed using the CELLRANGER (v2.1.0) pipeline from 10X Genomics. Raw FASTQ files were aligned to GRCh38 using the STAR aligner (STAR v2.5.2b). The data set was normalized using a variant on CPM (counts per million) specifically formulated for single-cell data228. 2,828 cells passed quality control and filtering, for which an average of 2,742 genes per cell were measured with 117,920 mean reads per cell. (A) Single cell transcriptomes of cells analyzed with an unsupervised dimensionality reduction algorithm (principal component analysis, PCA) and visualized with t-distributed stochastic neighbor embedding (t-SNE). Clusters are demarcated by different colors demonstrating eleven clusters based on gene expression differences. Clusters: 1-4–Endothelial cells, 5–Natural killer cells, 6–Macrophages, 7–Conventional dendritic cells, 8–T-cells, 9–Granulocytes, 10–Langerhans cells, 11–Red blood cells. (B) Feature plots demonstrating expression levels of characteristic endothelial genes in the 2,828 cells. Color scale for the feature plots ranges from light grey (no expression) to blue (high expression).

Since annotation of the t-SNE clusters was performed manually, we also assessed marker genes using a less subjective method, namely SC3, a differential gene analysis method231. For every gene detected in the sequencing experiment, a binary classifier was constructed based on the mean cluster expression values. In other words, the mean expression of a gene in a particular cluster was compared to its mean expression in all other clusters. The area under the receiver operating characteristic curve (AUROC) was used to determine the accuracy of the gene as a marker for the cluster. A p-value was assigned to each gene using the Wilcoxon signed-rank test, and the p-values were adjusted for multiple comparisons using the Holm-Bonferroni method. By default, genes with an AUROC > 0.85 and p-value < 0.01 were defined as marker genes. Table 6 lists the top 25 marker genes for each cluster versus all other clusters. Most of the top-ranked genes in each cluster are not well-known cell type markers. This may be because most accepted markers are defined and used at the protein level and may not be reliable markers at the level of RNA. Notably, none of the classical endothelial markers appeared among the top 25 genes of the endothelial clusters. It is quite possible that novel RNA markers for endothelial cells have been identified; these will need to be validated using alternate methods. Another interesting observation is that many genes identified as “markers” are also found in a large percentage of cells in all other clusters (Table 6, comparing pct.1 to pct.2). This indicated to us that many of these markers may not be specific to a single cell type and that cells may be better identified using an aggregate signature consisting of multiple genes. The discovery of non-specific markers is not a new concept. For example, previous work has found that fibroblast-specific protein 1 (FSP1) can be expressed in cells other than fibroblasts103.
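
The marker-calling logic described above can be sketched as follows. This is a minimal illustration under stated assumptions and not the SC3 implementation: the function names (`auroc`, `holm_adjust`, `is_marker`) are hypothetical, and the AUROC is computed directly as the probability that a cell inside the cluster has higher expression than a cell outside it, with ties counted as one half.

```python
from itertools import product


def auroc(in_cluster, other):
    """AUROC of a gene as a classifier for one cluster.

    Equals the probability that a randomly chosen in-cluster cell has
    higher expression than a randomly chosen cell from the other
    clusters (ties contribute 0.5).
    """
    wins = 0.0
    for a, b in product(in_cluster, other):
        if a > b:
            wins += 1.0
        elif a == b:
            wins += 0.5
    return wins / (len(in_cluster) * len(other))


def holm_adjust(pvalues):
    """Holm-Bonferroni step-down adjustment, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        candidate = min(1.0, (m - rank) * pvalues[i])
        running_max = max(running_max, candidate)  # keep adjusted values monotone
        adjusted[i] = running_max
    return adjusted


def is_marker(auc, p_value, auc_cut=0.85, p_cut=0.01):
    """Apply the default thresholds quoted in the text."""
    return auc > auc_cut and p_value < p_cut
```

A gene that is detected in nearly every cell of all clusters can still pass these thresholds if its expression level is consistently higher in one cluster, which is one way a "marker" can have a high pct.2 value.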

Quantification of gene expression heterogeneity is technically challenging since there is no baseline against which to compare the expression data. Certain genes that are widely considered housekeeping genes, such as cyclophilin A and GAPDH, appeared to be largely homogeneous across all cells in every cluster (Figure 11). Others, such as TATA-binding protein (TBP) and hypoxanthine phosphoribosyltransferase 1 (HPRT1), were not ubiquitously expressed and showed some degree of expression heterogeneity.


Table 6: Top 25 marker genes for each cluster

Gene           Cluster  Adjusted p-value  Avg. log2 fold change  pct.1a  pct.2b
ADIRF          1        2.18E-117         0.84                   1.00    0.85
CYP1B1         1        4.88E-97          0.67                   0.95    0.55
ACKR1          1        3.28E-92          0.81                   0.77    0.36
NPDC1          1        9.54E-92          0.53                   1.00    0.83
FBLN2          1        1.38E-82          0.66                   0.97    0.70
IGF2           1        3.08E-79          0.67                   0.98    0.77
CXorf36        1        4.74E-79          0.49                   0.62    0.21
PALMD          1        3.71E-78          0.54                   0.98    0.68
NPY            1        6.73E-77          0.73                   0.76    0.36
IFI27          1        4.80E-76          0.59                   1.00    0.87
PTMS           1        3.69E-69          0.46                   1.00    0.87
YBX3           1        1.29E-65          0.49                   0.99    0.83
AQP1           1        1.11E-64          0.49                   0.99    0.81
HRCT1          1        1.28E-63          0.47                   0.95    0.65
LTBP2          1        4.21E-63          0.48                   0.98    0.78
IFI44L         1        4.18E-62          0.46                   0.81    0.43
MALL           1        1.98E-59          0.49                   0.61    0.25
PLAC8          1        4.84E-56          0.52                   0.84    0.49
NUAK1          1        1.29E-54          0.44                   0.82    0.45
TGM2           1        1.99E-51          0.45                   0.99    0.83
SERPINE1       1        1.64E-50          0.52                   0.93    0.63
LYPD2          1        3.42E-50          0.60                   0.40    0.13
PIR            1        3.68E-45          0.45                   0.75    0.41
ID1            1        4.06E-43          0.48                   0.94    0.71
NPW            1        1.94E-30          0.54                   0.47    0.23
CTGF           2        1.59E-63          0.79                   1.00    0.99
DKK3           2        4.14E-47          0.57                   0.97    0.80
OMD            2        3.24E-46          0.67                   0.92    0.73
MATN2          2        4.67E-46          0.73                   0.91    0.71
EDN1           2        3.09E-45          0.57                   0.99    0.88
PROCR          2        2.75E-44          0.55                   0.97    0.82
BMP4           2        6.21E-43          0.53                   0.97    0.78
EFEMP1         2        1.96E-42          0.56                   1.00    0.96
IGFBP7         2        1.05E-37          0.52                   1.00    0.95
TM4SF1         2        2.51E-37          0.53                   1.00    0.89
FHL2           2        8.84E-29          0.53                   0.85    0.71
EMCN           2        1.09E-24          0.47                   0.91    0.72
SDPR/CAVIN2    2        4.94E-24          0.46                   0.96    0.78
CYR61          2        7.84E-23          0.60                   0.94    0.81
ACSM3          2        2.44E-20          0.71                   0.57    0.42
RBP1           2        2.72E-16          0.48                   0.70    0.60
THBS1          2        7.02E-16          0.51                   0.70    0.59
MEG3           2        1.50E-12          0.53                   0.60    0.50
ART4           2        1.78E-11          0.47                   0.49    0.38
ARID5B         2        5.89E-08          0.48                   0.63    0.55
MEST           2        6.73E-08          0.46                   0.49    0.41
GDF6           2        5.58E-04          0.48                   0.26    0.19
C10orf10       2        2.90E-03          0.53                   0.65    0.60
LTBP1          3        3.46E-120         1.01                   0.80    0.22
BAMBI          3        1.67E-103         0.79                   0.68    0.17
CAMK2N1        3        1.56E-95          0.56                   0.29    0.02
SEZ6L2         3        9.23E-93          0.65                   0.69    0.19
MATN2          3        5.93E-79          0.78                   0.99    0.71
GDF7           3        2.52E-65          0.69                   0.99    0.66
SFRP5          3        3.44E-65          0.60                   0.44    0.10
BMP4           3        3.80E-65          0.69                   0.99    0.80
OGN            3        2.80E-64          0.96                   0.37    0.07
ART4           3        3.51E-64          0.57                   0.82    0.33
EDN1           3        5.41E-64          0.64                   1.00    0.89
CTGF           3        3.71E-63          0.69                   1.00    0.99
SULT1E1        3        5.17E-59          0.66                   0.71    0.28
CCDC80         3        5.91E-55          0.58                   0.89    0.45
SNHG7          3        1.81E-50          0.56                   0.99    0.83
ANKRD1         3        1.29E-45          0.62                   0.64    0.27
HHIP           3        2.56E-45          0.65                   0.72    0.32
LEPR           3        1.87E-42          0.55                   0.95    0.63
C10orf10       3        1.09E-41          0.76                   0.90    0.56
OMD            3        5.37E-41          0.55                   0.99    0.74
COL3A1         3        3.64E-33          0.61                   0.99    0.80
TPM2           3        2.53E-15          0.62                   0.97    0.72
COL1A2         3        1.51E-13          0.96                   0.83    0.53
TAGLN          3        9.42E-10          1.62                   0.79    0.49
ACTA2          3        2.06E-03          2.31                   0.48    0.28
FBLN2          4        3.32E-33          0.83                   0.91    0.76
IFI27          4        7.35E-30          0.66                   0.98    0.90
ADIRF          4        1.81E-27          0.70                   0.96    0.88
CPE            4        2.69E-27          0.51                   1.00    0.97
LTC4S          4        1.50E-25          0.58                   0.94    0.82
CD59           4        1.62E-25          0.55                   0.98    0.93
RNASE1         4        7.04E-24          0.46                   0.99    0.96
NPY            4        1.68E-18          1.04                   0.62    0.46
CLU            4        2.78E-17          0.49                   1.00    1.00
HRCT1          4        2.19E-15          0.60                   0.82    0.72
IGF2           4        2.59E-14          0.48                   0.92    0.82
PPIC           4        7.25E-14          0.52                   0.77    0.71
ACKR1          4        1.87E-11          1.03                   0.58    0.46
CYP1B1         4        2.19E-11          0.66                   0.75    0.65
BCAM           4        7.15E-10          0.52                   0.66    0.61
GFRA1          4        1.89E-07          0.49                   0.67    0.59
PERP           4        4.17E-07          0.54                   0.46    0.36
IFI6           4        7.59E-06          0.52                   0.78    0.76
HYAL2          4        1.32E-05          0.49                   0.77    0.75
PODXL          4        4.08E-04          0.50                   0.53    0.50
RAMP3          4        9.61E-04          0.48                   0.52    0.48
ID1            4        8.75E-03          0.54                   0.79    0.77
MATK           5        1.31E-233         2.87                   0.89    0.08
CD247          5        7.90E-222         2.93                   0.89    0.08
ZNF683         5        5.23E-221         3.31                   0.88    0.08
XCL2           5        8.37E-218         3.90                   0.93    0.10
CTSW           5        1.10E-195         3.93                   0.95    0.13
KRT86          5        4.73E-195         2.82                   0.74    0.06
XCL1           5        5.05E-193         3.63                   0.88    0.11
TNFRSF18       5        4.94E-192         3.35                   0.83    0.09
KLRC1          5        1.03E-168         4.19                   0.86    0.12
KRT81          5        7.18E-168         2.89                   0.75    0.08
TRDC           5        1.27E-161         3.71                   0.90    0.15
CD69           5        2.33E-153         3.03                   0.95    0.18
HOPX           5        3.89E-152         3.57                   0.94    0.19
KLRD1          5        7.60E-147         2.78                   0.84    0.13
KLRB1          5        2.42E-135         3.72                   0.90    0.20
CD7            5        2.39E-134         4.69                   0.99    0.30
ITM2C          5        4.86E-129         3.13                   0.86    0.18
GZMA           5        1.30E-121         4.09                   0.90    0.23
NKG7           5        9.24E-119         4.13                   0.99    0.37
GZMB           5        4.19E-117         3.49                   0.71    0.11
CD52           5        2.94E-89          2.75                   0.90    0.33
GNLY           5        2.12E-86          5.20                   0.94    0.52
CCL5           5        1.49E-76          3.45                   0.73    0.20
HMGB2          5        2.24E-23          2.83                   0.73    0.60
HIST1H4C       5        8.20E-02          2.93                   0.60    0.65
CSF1R          6        <2.23E-308        3.08                   0.91    0.04
FCGR1A         6        1.71E-294         2.93                   0.80    0.03
FCGR2A         6        2.23E-284         3.21                   0.97    0.06
MS4A4A         6        2.07E-273         2.99                   0.79    0.03
FCGR3A         6        1.36E-258         3.75                   0.89    0.06
MS4A7          6        1.76E-243         3.30                   0.90    0.06
VSIG4          6        4.25E-242         2.82                   0.62    0.02
FOLR2          6        2.41E-165         2.85                   0.49    0.02
C1QC           6        5.19E-155         5.34                   0.75    0.08
CD14           6        4.10E-151         3.83                   0.98    0.19
AIF1           6        1.03E-128         3.24                   1.00    0.26
CCL4L2         6        3.51E-127         4.03                   0.92    0.19
CCL3L3         6        3.63E-124         4.56                   0.94    0.22
MAFB           6        6.61E-119         2.98                   0.86    0.16
GPR183         6        5.58E-118         2.93                   0.84    0.15
C1QA           6        1.69E-109         5.40                   0.75    0.13
TYROBP         6        8.96E-100         2.96                   1.00    0.41
C1QB           6        3.32E-97          5.47                   0.72    0.14
CCL3           6        1.55E-89          3.94                   0.95    0.38
FTL            6        5.47E-84          2.93                   1.00    1.00
CXCL8          6        1.09E-82          2.97                   0.71    0.15
APOE           6        6.71E-47          4.43                   0.32    0.04
SPP1           6        1.29E-27          5.82                   0.26    0.04
CXCL3          6        2.50E-25          3.09                   0.42    0.13
SEPP1          6        2.71E-09          4.39                   0.46    0.29
CLEC9A         7        <2.23E-308        3.35                   0.89    0.01
LGALS2         7        1.73E-170         3.07                   0.98    0.08
DUSP4          7        1.12E-151         2.57                   0.79    0.06
RGCC           7        1.15E-131         3.42                   0.97    0.12
IRF8           7        1.61E-129         3.54                   0.97    0.13
CPVL           7        4.39E-115         3.68                   0.96    0.15
DNASE1L3       7        5.20E-107         3.06                   0.83    0.10
RGS10          7        3.06E-98          2.81                   1.00    0.21
BASP1          7        4.36E-88          2.46                   0.99    0.23
S100B          7        3.14E-84          3.01                   0.83    0.14
HLA-DQA1       7        9.21E-84          2.90                   1.00    0.25
HLA-DQB1       7        3.88E-83          2.93                   0.99    0.24
RAB32          7        2.99E-73          2.47                   0.98    0.30
LSP1           7        1.66E-72          2.47                   1.00    0.30
HLA-DPA1       7        4.74E-62          3.10                   1.00    0.43
PSMB9          7        8.12E-62          2.37                   0.98    0.39
HLA-DPB1       7        8.82E-59          3.48                   1.00    0.52
HLA-DRB1       7        9.99E-57          2.73                   1.00    0.47
C1orf54        7        1.60E-56          2.72                   0.97    0.44
SNX3           7        4.00E-54          2.99                   1.00    0.91
CD74           7        3.53E-53          3.39                   1.00    0.85
CST3           7        6.50E-52          3.46                   1.00    0.94
HLA-DRA        7        2.07E-51          2.74                   1.00    0.64
HLA-B          7        1.29E-48          2.46                   1.00    0.94
BIRC3          7        2.62E-35          2.85                   0.53    0.10
CD27           8        3.46E-66          1.53                   0.32    0.02
CD8B           8        1.57E-62          1.75                   0.29    0.01
CD3G           8        3.50E-55          1.66                   0.29    0.02
CD69           8        2.01E-49          1.74                   0.88    0.20
CD8A           8        6.23E-49          1.56                   0.34    0.03
CD3D           8        1.56E-48          2.17                   0.45    0.05
CD3E           8        3.41E-45          1.34                   0.51    0.07
TRBC2          8        5.40E-42          1.96                   0.57    0.09
CD2            8        1.21E-41          1.77                   0.39    0.04
RAC2           8        2.09E-35          1.22                   0.83    0.21
LTB            8        5.32E-35          2.56                   0.54    0.10
TRAC           8        2.44E-34          2.09                   0.49    0.08
LCK            8        6.36E-34          1.45                   0.51    0.08
LEF1           8        6.03E-33          1.42                   0.26    0.02
RHOH           8        2.94E-32          1.32                   0.44    0.07
TRBC1          8        1.00E-31          1.29                   0.50    0.09
IL2RG          8        1.85E-29          1.08                   0.57    0.11
CXCR4          8        4.75E-28          1.38                   0.84    0.27
CCL5           8        6.62E-23          2.07                   0.67    0.22
FYB            8        1.70E-21          1.06                   0.57    0.14
CD52           8        1.83E-20          1.66                   0.82    0.35
CLEC2D         8        2.18E-15          1.15                   0.37    0.08
IL32           8        1.40E-12          1.39                   0.81    0.49
CD37           8        3.28E-12          1.06                   0.61    0.21
AREG           8        7.11E-08          1.22                   0.45    0.16
APOBEC3A       9        9.93E-218         3.36                   0.68    0.01
CSTA           9        8.41E-198         3.25                   0.94    0.04
FCN1           9        1.13E-189         4.43                   0.98    0.05
S100A12        9        1.40E-185         4.20                   0.79    0.03
CFD            9        1.06E-105         3.30                   1.00    0.12
RP11-1143G9.4  9        1.79E-100         3.45                   0.94    0.10
MNDA           9        3.70E-92          3.57                   1.00    0.14
BCL2A1         9        3.01E-86          3.20                   0.91    0.11
STXBP2         9        2.12E-82          2.79                   0.89    0.12
EVI2B          9        5.65E-77          2.83                   0.95    0.15
IL1B           9        6.63E-76          3.76                   0.97    0.17
VCAN           9        7.04E-76          2.92                   0.91    0.13
CFP            9        1.22E-75          2.86                   0.89    0.13
LST1           9        3.16E-75          2.91                   1.00    0.17
PLAUR          9        8.34E-71          3.01                   0.91    0.14
CTSS           9        7.89E-61          3.52                   1.00    0.26
S100A9         9        2.71E-57          5.93                   1.00    0.29
S100A8         9        2.98E-55          5.59                   0.95    0.25
CXCL8          9        5.47E-50          3.12                   0.83    0.16
S100A4         9        4.91E-46          3.64                   1.00    0.41
LYZ            9        1.09E-45          4.04                   1.00    0.40
G0S2           9        2.39E-45          3.72                   0.54    0.07
TKT            9        6.88E-40          2.93                   1.00    0.52
SOD2           9        2.85E-39          3.07                   0.98    0.45
NFKBIA         9        1.09E-33          3.22                   0.98    0.83
CD1E           10       <2.23E-308        2.58                   0.73    0.01
CD1C           10       3.33E-288         3.31                   0.96    0.02
CD207          10       1.52E-257         2.51                   0.63    0.01
PKIB           10       1.71E-230         2.78                   0.90    0.02
FCER1A         10       4.67E-183         4.78                   1.00    0.04
CLEC10A        10       2.13E-122         3.28                   0.88    0.05
FCGR2B         10       5.15E-112         2.18                   0.90    0.06
PAK1           10       1.71E-100         2.25                   0.94    0.07
JAML           10       2.72E-89          2.71                   0.96    0.09
C15orf48       10       2.53E-85          2.50                   0.77    0.06
PHACTR1        10       7.35E-61          2.41                   0.90    0.12
SPI1           10       1.10E-60          2.26                   0.98    0.14
HLA-DMB        10       9.22E-60          2.29                   1.00    0.16
LST1           10       1.69E-53          2.55                   0.98    0.18
HLA-DQB1       10       1.47E-44          2.84                   1.00    0.25
HLA-DQA1       10       1.87E-44          3.05                   1.00    0.26
S100B          10       2.48E-42          3.09                   0.81    0.15
BASP1          10       7.64E-42          2.21                   0.98    0.25
RGS2           10       1.60E-39          2.21                   0.92    0.20
MS4A6A         10       3.34E-39          2.51                   1.00    0.28
HLA-DPA1       10       2.71E-30          2.59                   1.00    0.44
HLA-DPB1       10       1.21E-28          2.83                   1.00    0.53
HLA-DRB1       10       5.72E-28          2.50                   1.00    0.48
HLA-DRA        10       8.80E-26          2.56                   1.00    0.64
CD74           10       5.56E-24          2.38                   1.00    0.85
AHSP           11       <2.23E-308        6.80                   0.97    0.01
GYPB           11       <2.23E-308        4.87                   0.91    0.00
HEMGN          11       <2.23E-308        4.86                   0.80    0.00
SLC4A1         11       <2.23E-308        4.36                   0.83    0.00
GYPA           11       3.70E-303         4.65                   0.69    0.00
ALAS2          11       7.88E-278         6.45                   1.00    0.02
HBM            11       1.21E-252         7.13                   0.94    0.02
CA1            11       4.73E-209         6.34                   0.63    0.01
HBG1           11       6.83E-100         8.63                   0.97    0.06
FECH           11       8.07E-64          4.36                   0.89    0.09
GYPC           11       7.13E-47          4.29                   0.97    0.16
HBD            11       2.10E-34          6.19                   0.29    0.01
BPGM           11       9.25E-33          5.30                   0.94    0.23
BLVRB          11       6.76E-28          5.10                   1.00    0.36
MKRN1          11       8.72E-28          4.12                   1.00    0.36
SNCA           11       1.20E-23          5.90                   1.00    0.48
SLC25A39       11       5.92E-23          4.28                   1.00    0.51
HBA1           11       9.91E-22          9.24                   1.00    0.59
NCOA4          11       8.06E-21          4.17                   0.97    0.51
SLC25A37       11       1.42E-20          5.19                   1.00    0.73
HBB            11       1.91E-20          8.38                   1.00    0.75
BNIP3L         11       2.71E-20          4.30                   1.00    0.78
HBA2           11       3.80E-20          9.05                   1.00    0.88
HBG2           11       4.02E-20          8.82                   1.00    0.89
TMCC2          11       3.97E-04          4.30                   0.26    0.05

a percentage of cells that express the gene in this cluster
b percentage of cells that express this gene in all other clusters


Figure 11. Single-cell RNA sequencing reveals variability in housekeeping gene expression. Feature plots and violin plots show the expression of cyclophilin A (PPIA), glyceraldehyde-3-phosphate dehydrogenase (GAPDH), TATA-box binding protein (TBP) and hypoxanthine phosphoribosyltransferase 1 (HPRT1) in freshly isolated human umbilical vein endothelial cells. The gene name is indicated on top of each violin plot where the value on Y-axis represents the log2 expression level and cluster numbering (X-axis) is according to Figure 10A.


Overall, scRNA-seq analysis revealed a highly heterogeneous EC population: many classical EC genes showed varied expression, whereas others appeared more homogeneous (Figure 12B). There was considerable variation in eNOS (NOS3) expression, which has not been previously appreciated. Other genes, such as vascular endothelial growth factor receptor 2 (KDR/FLK1), showed very little expression. P-selectin appeared to have widespread expression heterogeneity among the ECs, a cell type that was previously considered to be largely homogeneous (Figure 12A, B). As expected, VWF was more homogeneous, and every cell in clusters 1-4 expressed some VWF. Table 7 shows examples of genes that appear to have basal mRNA heterogeneity among ECs and those that are uniformly expressed.


Figure 12. Single-cell RNA sequencing indicates that SELP heterogeneity is present in vivo in fresh human tissues, just as it is in vitro in cultured HUVEC. (A) Expression of P-selectin in the 2,828 cells. Violin plots (B-D) show the expression of genes of interest among the different clusters identified using t-SNE (Figure 10A). The gene name is indicated on top of each violin plot, where the value on the Y-axis represents the log2 expression level and cluster numbering (X-axis) is according to Figure 10A. These data indicate heterogeneity of SELP mRNA transcript expression in the ECs (clusters 1-4) versus an absence of heterogeneity of broadly expressed housekeeping genes, such as GAPDH, or other EC-enriched genes, such as PECAM1/CD31 and VWF.


Table 7: Basal Expression Heterogeneity is Not Limited to P-selectin

Genes Like P-selectin                      Genes Unlike P-selectin
(basal mRNA heterogeneity among ECs)       (basal mRNA homogeneity among ECs)
CDH5 (VE-cadherin)                         PPIA (cyclophilin A)
CD34                                       GAPDH (glyceraldehyde-3-phosphate dehydrogenase)
TEK (TIE2/TEK receptor tyrosine kinase)    PECAM1 (CD31/platelet and endothelial cell adhesion molecule 1)
ENG (endoglin)                             VWF (von Willebrand factor)
KLF4 (Kruppel like factor 4)               EDN1 (endothelin 1)
ECE1 (endothelin converting enzyme 1)      CD63

We then evaluated the expression of other Weibel-Palade body (WPB) genes and of genes that may be regulated along with P-selectin, to determine whether the heterogeneity observed for P-selectin was found broadly across WPB genes or in genes that may be concomitantly regulated with P-selectin. Generally, most WPB genes were either lowly expressed in very few cells (CXCL8/IL8, tissue-type plasminogen activator/PLAT, angiopoietin 2/ANGPT2, osteoprotegerin/TNFRSF11B) or highly expressed across most of the ECs in clusters 1-4 (EDN1 and CD63) (Figure 12C). This indicated that the heterogeneity observed in P-selectin was an independent phenomenon that did not appear in the expression patterns of most WPB genes. The gene most similar to P-selectin in terms of heterogeneous expression was ECE1 (endothelin converting enzyme 1), which displayed a wide distribution of expression levels, with many cells exhibiting no expression.

Genes that may be regulated with P-selectin were identified largely based on their genomic proximity or function in relation to P-selectin. F5 (coagulation factor V) is located 2217 bp from the 3ʹ end of SELP in the same orientation. Coagulation factor VIII (F8) is carried in its inactive state by VWF in circulation and has a crucial role in hemostasis; its expression is primarily restricted to liver sinusoidal ECs239. Both F5 and F8 were expressed in relatively few ECs (Figure 12D). ACE encodes angiotensin I-converting enzyme, which catalyzes the cleavage of angiotensin I into angiotensin II. Studies suggest that elevated levels of vasoconstrictors contribute to leukocyte–endothelial cell interactions via P-selectin surface expression240, 241. Angiotensin II is the dominant vasoconstrictor in many vascular diseases, and it may serve as a stimulus for the leukocyte recruitment and infiltration associated with these conditions. In our sample, ACE appeared to be lowly expressed across the ECs, with many expressing no ACE (Figure 12D). Interestingly, as mentioned above, ECE1 showed similarly low but heterogeneous expression. ECE1 is predominantly responsible for producing active endothelin-1, another vasoactive peptide linked to vasoconstriction and leukocyte attachment. These findings are consistent with our hypothesis that in healthy individuals, disease-associated genes should be lowly and heterogeneously expressed, and that a transition towards homogeneous expression promotes pathologic development.

Evaluation of genes involved in regulating P-selectin, such as the cytokines oncostatin M, IL-3, IL-4, and IL-13, did not reveal an expression pattern that could account for P-selectin expression. The same was true for the DNMT and TET genes (data not shown).

Interestingly, PROM1 (prominin 1/CD133), a typical “stem-like” marker, was observed in some cells within the endothelial group (data not shown). It will be interesting to explore which other genes are expressed in these cells.

We also performed differential gene analysis of the top 5% of P-selectin-expressing cells versus the bottom 5% (a group that contains only cells that do not express P-selectin at all) in the EC-containing clusters. The genes with the greatest fold change (MATN2, LTBP1, SAT1, and INMT) were only 2-2.5-fold higher in the P-selectin-high versus P-selectin-low cells (Table 8). A literature search of these genes did not reveal any biologically meaningful pathways that would explain the correlation with high P-selectin expression. Many mitochondrial genes were among the most significantly different between the top and bottom P-selectin-expressing cells; however, the actual fold changes were only in the range of 1.3-1.6. In general, there did not appear to be a functionally or biologically relevant subset of genes that was upregulated together with P-selectin.
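
The fold changes quoted in this paragraph follow from the average log2 fold change column of Table 8. The conversion can be sketched as below, assuming the tabulated values are true base-2 logarithms of the high/low expression ratio; `fold_change` is an illustrative helper, not part of any analysis package used here.

```python
def fold_change(avg_log2_fc):
    """Convert an average log2 fold change into a linear fold change."""
    return 2.0 ** avg_log2_fc


# Average log2 fold changes as listed in Table 8.
table8_log2_fc = {"MATN2": 1.17, "LTBP1": 1.10, "SAT1": 1.04, "INMT": 1.00}

for gene, log2_fc in table8_log2_fc.items():
    print(f"{gene}: {fold_change(log2_fc):.2f}-fold")
```

A log2 fold change of 1.00 corresponds to a doubling, so values between roughly 1.0 and 1.2 fall within the 2-2.5-fold range stated above.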


Table 8: scRNA-seq clusters 1-4 (ECs) top 5% P-selectin expressing cells vs bottom 5% (i.e., cells that do not express P-selectin)

Gene           Adjusted p-value  Avg. log2 fold change  pct.1a  pct.2b
SELP           4.45E-50          2.85                   1.00    0.00
MT-ND4         1.02E-12          0.70                   1.00    1.00
MT-ATP6        2.22E-10          0.62                   1.00    1.00
PLAC8          1.76E-09          -0.98                  0.39    0.79
MT-CO3         1.92E-09          0.50                   1.00    1.00
MT-CYB         1.06E-08          0.57                   1.00    1.00
MT-CO2         1.19E-08          0.71                   1.00    1.00
GAPDH          1.33E-08          -0.48                  1.00    1.00
MT-ND5         1.82E-08          0.67                   1.00    0.99
MATN2          1.68E-07          1.17                   0.87    0.80
CTGF           4.11E-07          0.91                   1.00    1.00
MT-ND3         1.88E-06          0.61                   1.00    1.00
CRIM1          2.63E-06          0.60                   1.00    1.00
MT-CO1         3.50E-06          0.47                   1.00    1.00
MT-ND1         1.13E-05          0.51                   1.00    1.00
APP            2.71E-05          0.50                   0.99    0.99
VWF            2.72E-05          0.43                   1.00    1.00
HSPG2          2.95E-05          0.52                   1.00    1.00
OMD            5.41E-05          0.88                   0.94    0.86
CST3           5.42E-05          -0.37                  0.92    1.00
LTBP1          8.85E-05          1.10                   0.45    0.18
ADIRF          1.45E-04          -0.33                  0.95    1.00
ADGRG6         1.50E-04          0.62                   0.86    0.80
YBX1           2.46E-04          -0.45                  0.99    1.00
PPDPF          3.13E-04          -0.40                  0.98    1.00
GDF7           3.91E-04          0.73                   0.86    0.72
SMOC1          5.05E-04          0.67                   0.38    0.11
FKBP1A         5.09E-04          -0.34                  1.00    1.00
MT-ND2         5.96E-04          0.41                   1.00    1.00
CGNL1          6.59E-04          0.88                   0.31    0.06
TFPI           9.20E-04          0.56                   0.98    0.99
BMP6           1.06E-03          0.66                   0.77    0.63
PRSS23         1.20E-03          -0.51                  0.23    0.55
SRP14          1.22E-03          -0.30                  0.99    1.00
PPA1           1.94E-03          -0.41                  0.41    0.76
PTGS1          2.20E-03          0.90                   0.70    0.57
ACTG1          5.33E-03          -0.28                  1.00    1.00
SAT1           7.81E-03          1.04                   0.87    0.84
CD93           8.24E-03          0.84                   0.48    0.25
ECE1           8.79E-03          0.59                   0.83    0.80
SFRP5          9.04E-03          0.83                   0.26    0.04
SPARC          9.69E-03          0.43                   1.00    0.98
TMSB4X         9.69E-03          -0.29                  1.00    1.00
LDHA           0.01              -0.48                  0.79    0.97
PERP           0.01              -0.48                  0.23    0.56
TAGLN2         0.01              -0.40                  0.94    0.99
DDX17          0.01              0.35                   0.95    0.95
CFL1           0.01              -0.32                  0.97    1.00
CCDC3          0.02              -0.31                  0.21    0.52
HSPB1          0.02              -0.38                  0.99    1.00
IL1R1          0.02              0.59                   0.82    0.73
BMP4           0.02              0.59                   0.97    0.96
SEZ6L2         0.02              0.89                   0.36    0.14
PTPRB          0.02              0.58                   0.88    0.84
ART4           0.03              0.81                   0.50    0.28
INMT           0.03              1.00                   0.43    0.21
CALCRL         0.04              0.48                   0.97    0.94
LEPR           0.04              0.69                   0.77    0.72
LGALS3         0.04              -0.30                  0.50    0.80

a percentage of cells that express the gene in the top 5% P-selectin-expressing cells within clusters 1-4
b percentage of cells that express this gene in cells that do not express P-selectin

3.2.3 P-selectin Protein Heterogeneity Correlates with RNA Expression

Using immunofluorescence imaging, we have shown that P-selectin is heterogeneously expressed at the protein level between ECs isolated from the same blood vessel and exposed to the same environment. We were interested in knowing whether protein heterogeneity was mediated, in part, by heterogeneity in mRNA and hnRNA expression. We used FACS to isolate the top and bottom 5% of P-selectin-expressing HUVECs. Since P-selectin expression is lost with passaging, a large number of cells were required for FACS in order to obtain sufficient cells for RNA and DNA isolation. Collins et al.242 have reported that trypsin, which is commonly used to detach cells for passaging, affects WPB degranulation. Others have also reported the sensitivity of EC adhesion molecules to proteolytic cleavage by trypsin243, 244. Because of this, Accutase® was used as the dissociation agent for passaging of HUVECs. Briefly, confluent third-passage HUVECs were harvested and surface expression of P-selectin was stimulated with 10⁻⁴ M histamine for 10 min. Cells were stained with Alexa 488-conjugated anti-human P-selectin mAb immediately prior to cell sorting. HUVECs were gated using forward- and side-scatter settings that were optimized using a small sample reserved for pre-sorting. Fluorescence of unstained and isotype-matched antibody controls was also analyzed to determine specificity and levels of background signal. Only viable cells, as determined by propidium iodide exclusion, were analyzed.

The time course of cellular P-selectin expression on ECs after stimulation with histamine was similar to that previously reported26. The response was maximal within 5-10min and decreased to near-basal levels by 60min. To obtain enough cells, sort times for FACS often exceeded 2hrs and repeated stimulation with histamine was required to induce re-expression of P-selectin on EC surfaces. This was done using the same concentration of histamine one hour after the initial stimulation.

Figure 13 shows a representative FACS plot demonstrating that, as with adherent cells, P-selectin is heterogeneously expressed in a HUVEC suspension after histamine stimulation. Note that the x-axis is a logarithmic scale that extends over a five-decade range, representing cells with fluorescence values that differ 100,000-fold between the lower and upper ends. Comparable heterogeneity in P-selectin cell surface expression was evident in >3 independent human donors.
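
The selection of the top and bottom 5% of P-selectin-expressing cells can be illustrated with a simplified, one-parameter sketch. This is an assumption-laden illustration only: the actual sort gates were drawn on the cytometer using fluorescence together with scatter parameters, and the function name `gate_extremes` and the toy intensities below are hypothetical.

```python
def gate_extremes(intensities, fraction=0.05):
    """Return the dimmest and brightest `fraction` of events.

    intensities: per-cell fluorescence values (arbitrary units).
    """
    ranked = sorted(intensities)
    n = max(1, round(len(ranked) * fraction))
    return ranked[:n], ranked[-n:]


# A five-decade logarithmic axis spans a 10**5 = 100,000-fold range.
dynamic_range = 10 ** 5

# Hypothetical events spread over the axis; not data from this study.
events = [float(v) for v in range(1, 101)]
p_selectin_low, p_selectin_high = gate_extremes(events)
```

With 100 events and a 5% gate, each gated subpopulation contains five cells, which illustrates why a large starting population is needed to recover enough material for RNA and DNA isolation.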


Figure 13. FACS analysis of the effect of histamine treatment on P-selectin expression in HUVEC. HUVECs were stimulated with histamine (10⁻⁴ M) for 10 min, stained with Alexa 488-conjugated P-selectin antibody, and sorted using fluorescence-activated cell sorting. Left and centre, histogram plots showing a single parameter (fluorescence intensity, log10 scale, horizontal axis) against the number of events detected (cell counts, vertical axis). Right, dot plot showing the gating strategy used for isolating the P-selectin-high and -low expressing subpopulations (top 5% and bottom 5% expressing cells). Side-scattered light area (SSC-A, y-axis) is proportional to cell granularity.

mRNA and hnRNA were quantified using RT-qPCR (Figure 7D-F). These results showed increased expression of P-selectin mRNA in the protein-high cells compared to the protein-low cells. A similar result was observed for hnRNA, which is used as a surrogate measure of gene transcription; cells high in P-selectin protein were also transcribing more P-selectin hnRNA. This confirmed that levels of P-selectin protein correlated with steady-state mRNA and hnRNA levels. The ratio of hnRNA to mRNA was essentially the same in low and high cells. This indicates that the difference in mRNA between the P-selectin-high and P-selectin-low cells is due to transcription rather than a difference in mRNA stability (if the low cells had less mRNA because post-transcriptional regulation destabilized the RNA, the hnRNA:mRNA ratio in the low cells relative to the high cells would be > 1). When assessed using FISH, the hnRNA:mRNA ratio in passage 1 cells was the same as the ratio in the P-selectin-high and -low expressing subpopulations of passage 3 HUVECs. This suggested that the loss of P-selectin expression with prolonged culture was not due to a change in the stability of the mRNA.
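
The reasoning behind the hnRNA:mRNA comparison can be made concrete with a small sketch. All quantities below are hypothetical values in arbitrary units chosen only to illustrate the two scenarios; they are not measurements from this study, and the helper names are illustrative.

```python
def hn_to_m_ratio(hnrna, mrna):
    """Unspliced (hnRNA) signal per unit of mature mRNA."""
    return hnrna / mrna


def low_vs_high(low, high):
    """hnRNA:mRNA ratio of the low cells divided by that of the high cells."""
    return hn_to_m_ratio(*low) / hn_to_m_ratio(*high)


# Hypothetical (hnRNA, mRNA) levels in arbitrary units.
high_cells = (8.0, 80.0)
low_transcriptional = (1.0, 10.0)  # less transcription, same mRNA stability
low_destabilized = (8.0, 10.0)     # same transcription, mRNA degraded faster

print(low_vs_high(low_transcriptional, high_cells))  # ratio of about 1
print(low_vs_high(low_destabilized, high_cells))     # ratio well above 1
```

In the first scenario hnRNA and mRNA fall in proportion, so the ratio of ratios stays near 1, which matches the observation reported above. In the second scenario the mRNA falls without a matching drop in hnRNA, and the ratio rises above 1.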


Our lab has data supporting the finding that post-transcriptional regulation of mRNA is not a main factor in the expression of P-selectin. MicroRNAs are a class of short (~22 nt) non-coding RNAs that can target mRNAs for cleavage or translational repression. Dicer is an essential enzyme for processing microRNAs into their functional mature form, and P-selectin expression has been found to be unaffected by Dicer deficiency (Marsden lab, unpublished data). Additionally, microRNA databases (e.g., TargetScan and MicroRNA.org) do not predict conserved microRNA binding sites within P-selectin transcripts.

3.2.4 Promoter Methylation Correlates with P-selectin mRNA Expression in Endothelial and Non-endothelial Cell Types

Previous work from our lab has shown that gene promoters that are poor CGIs tend to be differentially methylated between different cell types and that this methylation is linked to the cell-specific expression of that gene215. There are different approaches for classifying CpG enrichment. For example, UCSC defines CpG islands based on GC content >50%, Observed/Expected (Obs/Exp) CpG ratio >0.6, and length >200 bps245. An alternative classification is poor CGI (Obs/Exp <0.45), weak (Obs/Exp 0.45-0.75), and strong (Obs/Exp >0.75)246. The most biologically meaningful definition of a CGI remains to be determined. Several computer-based algorithms have been developed as bioinformatic tools to identify CGIs in the genome. EMBOSS Cpgplot is one such tool used to identify CGIs in silico247. O/E is calculated over a pre-set window length, which is moved along the sequence of interest. An example O/E calculation is provided below:

$$\frac{\mathrm{Obs}}{\mathrm{Exp}} = \frac{\text{actual number of observed CpGs in the sequence of interest}}{\text{number of expected instances of C followed by G, taking into account the number of Cs and Gs in the sequence of interest}}$$

$$\frac{O}{E} = \frac{\text{number of observed CpGs in the sequence of interest} / N}{\left(\dfrac{C}{N} \cdot \dfrac{G}{N}\right)}$$

where $C$ is the number of cytosines, $G$ is the number of guanines, and $N$ is the length of the sequence of interest.
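
The O/E calculation above can be sketched for a single window as follows. This is a minimal illustration and not the EMBOSS Cpgplot implementation, which slides a window along the sequence; the function names are hypothetical, and `classify_cgi` simply applies the poor/weak/strong thresholds quoted earlier in this section.

```python
def cpg_obs_exp(seq):
    """Observed/expected CpG ratio for one sequence window.

    O/E = (CpG count / N) / ((C count / N) * (G count / N))
    """
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    if n == 0 or c == 0 or g == 0:
        return 0.0
    return (cpg / n) / ((c / n) * (g / n))


def classify_cgi(obs_exp):
    """Poor (<0.45), weak (0.45-0.75), or strong (>0.75) CGI by O/E ratio."""
    if obs_exp < 0.45:
        return "poor"
    if obs_exp <= 0.75:
        return "weak"
    return "strong"


print(cpg_obs_exp("CCGG"))   # one CpG, two Cs, two Gs in four bases
print(classify_cgi(0.14))    # O/E reported for the P-selectin promoter region
```

Under this classification, an O/E ratio of 0.14 falls in the poor CGI category.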

Using the following parameters in EMBOSS Cpgplot, CGIs were defined248 as DNA regions longer than 500 nucleotides, with a minimum G plus C composition > 55% and a minimum average O/E CpG ratio > 0.65. When we analyzed a 451 nt region in the P-selectin promoter (from -562 to -112 relative to the transcriptional start site), the O/E ratio was 0.14. Although P-selectin has a non-CpG island promoter, studies in other cell types have demonstrated that DNA methylation can directly silence genes with non-CGI promoters and contribute to the establishment of tissue-specific methylation patterns121. We have previously demonstrated that proximal promoter DNA methylation is important for regulating EC-enriched gene expression, despite many of the assessed genes being classified as poor CpG islands215.

To better understand the role that DNA methylation has in the regulation of P-selectin expression, bisulfite analysis of 4 CpG sites in the proximal P-selectin promoter was performed. Bisulfite treatment of DNA causes the deamination of unmethylated cytosine to uracil, which is read out as thymine following PCR amplification and Sanger sequencing. Methylated and hydroxymethylated cytosine are resistant to this conversion, thus allowing single-base resolution of the methylation status in the original DNA.
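
The principle of bisulfite sequencing described above can be sketched in silico. The example below is a simplified illustration on a single strand with a hypothetical toy sequence; the function names are not from any library, and, as noted in the text, a protected cytosine may be either methylated or hydroxymethylated.

```python
def bisulfite_convert(seq, protected_positions):
    """Simulate the read-out after bisulfite treatment and PCR.

    Unprotected C is deaminated to U and read as T; cytosines at
    `protected_positions` (0-based indices) remain C.
    """
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in protected_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def call_cpg_status(reference, read):
    """Map each reference CpG position to True if the read retained C."""
    ref = reference.upper()
    calls = {}
    for i in range(len(ref) - 1):
        if ref[i:i + 2] == "CG":
            calls[i] = read[i] == "C"
    return calls


reference = "ACGTCGAC"  # hypothetical sequence with CpGs at positions 1 and 4
read = bisulfite_convert(reference, protected_positions={1})
print(read)
print(call_cpg_status(reference, read))
```

Comparing each sequenced clone against the unconverted reference in this way gives the per-CpG calls that are drawn as filled or unfilled circles in lollipop diagrams.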

Sodium bisulfite genomic sequencing was performed on six independent, passage 3 HUVEC preparations derived from genetically distinct individuals (Figure 14). Heterogeneous methylation was observed at the 4 CpG sites in both the pattern of methylation as well as the number of methylated CpGs across sequenced clones. Importantly this was observed both within a single individual and between individual donors, hinting at the existence of interhuman variability in P-selectin expression.
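
The summaries reported for the sequenced clones (overall percent methylation, percent methylation at each CpG site, and the distribution of the number of methylated CpGs per clone) can be sketched as follows. The clone data below are hypothetical and serve only to show the arithmetic; `summarize_clones` is an illustrative helper.

```python
from collections import Counter


def summarize_clones(clones):
    """Summarize bisulfite clones.

    clones: list of tuples, one per sequenced clone, holding 1
    (methylated) or 0 (unmethylated) for each CpG site.
    Returns (overall %, per-site % list, {methylated CpGs per clone: count}).
    """
    n_clones = len(clones)
    n_sites = len(clones[0])
    total_methylated = sum(sum(clone) for clone in clones)
    overall_pct = 100.0 * total_methylated / (n_clones * n_sites)
    per_site_pct = [
        100.0 * sum(clone[i] for clone in clones) / n_clones
        for i in range(n_sites)
    ]
    histogram = dict(Counter(sum(clone) for clone in clones))
    return overall_pct, per_site_pct, histogram


# Hypothetical clones covering four CpG sites; not data from this study.
example = [(1, 0, 0, 0), (1, 1, 0, 0), (0, 0, 0, 0), (1, 1, 1, 1)]
overall, per_site, histogram = summarize_clones(example)
```

Two populations can share the same overall percentage while differing in the histogram, which is why both the pattern and the number of methylated CpGs per clone are informative.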


Figure 14. Interindividual heterogeneity in human P-selectin promoter DNA methylation using bisulfite sequencing of six independent HUVEC lines. Top, Schematic showing the four consecutive CpG sites in the human P-selectin proximal promoter. Each circle represents a CpG site relative to the transcriptional start site, indicated by the arrow. The lollipop diagrams show the single strand analysis of the four CpGs, where unfilled circles denote unmethylated CpGs and filled circles represent methylated CpGs. The percentages refer to the summed methylated/total CpGs across all four CpG sites. The frequency histograms also indicate variation in the number of methylated CpGs. The frequency of sequenced clones with a given number of methylated CpGs varies within a single HUVEC population as well as across the 6 HUVEC lines. The % methylation graphs display the average percent methylation for each CpG site across all alleles that were sequenced. HUVEC, human umbilical vein endothelial cells.

Heterogeneity in basal P-selectin promoter methylation was also assessed in human aortic vascular smooth muscle cells (HuAoVSMC), which are a non-expressing cell type, as well as in three other endothelial cell types: human aortic endothelial cells (HAEC), human dermal microvascular endothelial cells (HMVEC), and human coronary artery endothelial cells (HCAEC) (Figure 15). Compared to the representative HUVEC line, the HuAoVSMCs showed hypermethylation and less heterogeneity at the promoter CpG sites. Of the other endothelial types, only HCAECs showed heterogeneity similar to that of HUVECs in terms of the distribution of the amount of promoter methylation. Both HAEC and HMVEC displayed hypermethylation and frequently had >2 CpG sites methylated. These results suggested that methylation may be involved in the cell-type-specific regulation of P-selectin expression.


Figure 15. Heterogeneity in human P-selectin promoter DNA methylation among endothelial cell types. Top, Schematic showing the four consecutive CpG sites in the human P-selectin proximal promoter. Each circle represents a CpG site relative to the transcriptional start site, indicated by the arrow. Lollipop diagrams show the bisulfite sequencing of the P-selectin promoter in four endothelial and one non-endothelial cell types. Filled circles represent methylated CpGs and unfilled circles are unmethylated CpGs. The percentages refer to the summed methylated/total CpGs across all four CpG sites. The frequency histograms indicate variation in the number of methylated CpGs in each strand. The % methylation graphs display the average percent methylation for each CpG site across all alleles that were sequenced. HUVEC, human umbilical vein endothelial cells; HCAEC, human coronary artery endothelial cells; HuAoVSMC, human aortic vascular smooth muscle cells; HAEC, human aortic endothelial cells; HMVEC, human dermal microvascular endothelial cells.

Cross-species alignment of the human P-selectin promoter sequence with those of other mammals demonstrated weak conservation for many of the promoter CpG sites (Figure 16). Moreover, none of the human CpG sites are located in cis-elements in the SELP promoter that have been shown to be functionally important (Figure 17). Of the 4 CpG sites assessed in our methylation analysis, only one site (-468) showed a high degree of conservation across Old World monkeys. Among the closely related primate species, one particular site (-433) appeared to be unique to humans, suggesting that this site may have arisen as a way to gain more control over human P-selectin expression. Interestingly, there was only one common CpG site (at -152) between mice and humans in the region of -1000 to +500. Bovine P-selectin appears to be largely devoid of promoter CpGs, suggesting that methylation of this region does not play a major role in determining transcriptional output in that species. Whether there are species-specific differences in SELP DNA methylation in other regions of the gene that affect P-selectin transcription in other species is not known.


Figure 16. Alignment of the human P-selectin sequence with those of other mammals demonstrated weak conservation for many of the promoter CpG sites. The observed/expected ratio calculated for the region of -1000 to +500 of the human SELP gene (relative to the transcriptional start site) is 0.145, indicating a poor CGI promoter. The four CpG sites indicated by the blue boxes are those assessed in the single strand and pyrosequencing bisulfite analyses. Yellow boxes with ✓ indicate the presence of an identical CpG site to the human sequence at the same location. Dashes ('-') represent gaps in the sequence of one genome relative to the human sequence in a pairwise comparison.


Figure 17. Location of CG dinucleotides, SNPs, and cis-regulatory elements in the proximal promoter of the human P-selectin gene. The locations of cis-regulatory elements and their positions relative to the transcriptional start site (arrow) are indicated in green text. Black boxes denote putative cis elements and blue boxes denote elements that have been shown to be functionally important in basal P-selectin expression. The CG dinucleotides assayed for methylation (denoted by bold font and blue highlighting) and their positions relative to the transcriptional start site (arrow) are indicated. Data for the annotated SNPs (highlighted in yellow) were obtained from dbSNP249. Green nucleotides denote an overlap between a CG dinucleotide and a SNP. Shown are the RefSNP alleles; the global minor allele frequency for all of the SNPs in this region is <0.01. Cis-regulatory elements are shown in the boxes.


Genome-wide DNA methylation data and a variety of ChIP-seq datasets have been published for HUVECs and many non-ECs. These data provide a valuable resource to gain information about the cell-specific epigenetic modifications that drive EC gene expression. Promoter DNA methylation was analyzed for P-selectin and VWF using whole genome bisulfite sequencing data from ENCODE/HAIB97, 250 retrieved from http://genome.ucsc.edu/ENCODE/downloads.html. Figure 18 shows the methylation status of select CpG dinucleotides in the given cell types as identified by the Illumina Infinium Human Methylation 450 Bead Array platform. This array uses bisulfite treated genomic DNA to assess the methylation status of 482,421 CpG sites covering all designatable RefSeq genes, including promoter, 5ʹ, and 3ʹ regions in the human genome. This represents <2% of the total number of CpGs in the human genome. Four CpG sites were assessed in the SELP proximal promoter at -1298, -348, -152, and +194 relative to the transcriptional start site. Two of these sites, -348 and -152, were also analyzed in our high resolution single strand analyses (Figure 14, Figure 15, Figure 21, Figure 27). Analysis of the P-selectin proximal promoter revealed partial methylation at -348 in HUVEC, HuAoVSMC, and hepatocytes (Figure 18C). The CpG site at -152 was methylated in HUVEC and HuAoVSMC, and partially methylated in hepatocytes. It is difficult to assess whether these results agree with our methylation analyses of HUVEC and HuAoVSMC since we did not evaluate all of the same CpG sites. We have reported a general trend in which less promoter methylation seems to correlate with RNA and protein expression, but we have not tested the importance of individual CpG sites. No strong RNA polymerase II peak was observed in the HUVEC promoter. This is consistent with the low basal expression of P-selectin in HUVEC (Figure 18A). Interestingly, E-selectin (SELE) showed a defined pol II peak near the transcriptional start site.
Although E-selectin is not known to be basally expressed in HUVECs, it is possible that pol II is preloaded and paused at the E-selectin promoter to allow rapid induction following stimulation by cytokines. There is also a marked repetitive DNA cluster ~50kb in length sitting between the 3ʹ end of SELL (L-selectin) and the 5ʹ end of SELP. There are two distinct H3K4me1 peaks within this region, which may indicate the presence of an enhancer. In terms of VWF, the region surrounding the start of transcription displayed no methylation in HUVEC, whereas the same region appeared to be partially and heavily methylated in HuAoVSMC and hepatocytes, respectively (Figure 18D). There is also a strong enrichment of RNA pol II binding near the transcriptional start site, which is consistent with decreased methylation of CpG sites within a promoter of a gene that is expressed (Figure 18B). It is important to note that current ENCODE based studies start with the assumption that chromatin structure and modifications are homogeneous in individual cells within the different primary cell types. Our work demonstrates that alleles and/or single cells can have heterogeneous epigenetic marks, especially DNA methylation profiles. This highlights an important limitation of studying groups of cells versus single cells.


Figure 18. DNA methylation and RNA Polymerase II patterns at the P-selectin and VWF gene promoters. Promoter profiles of (A) SELP/P-selectin and (B) VWF/VWF in HUVECs, AoSMCs, and hepatocytes are shown. Regions of repetitive elements are indicated in black, identified by the RepeatMasker program (http://www.repeatmasker.org). Levels of methylation are color coded, where blue represents no methylation, purple is partial methylation, and orange full methylation, as identified by the Illumina Infinium Human Methylation 450K Bead Array platform. This track was generated and analyzed by the labs of R. Myers and D. Absher at the HudsonAlpha Institute for Biotechnology (Huntsville, AL) as part of the ENCODE project97, 250. Signal enrichment of H3K4me1, H3K4me3, H3K27me3, and RNA polymerase II chromatin immunoprecipitation-seq data in HUVECs is shown in orange, with the signal enrichment shown on the y-axis. Input designates the sequencing of the whole cell lysate isolated from cells that have been cross-linked and fragmented under the same conditions as the immunoprecipitated DNA. The ChIP-seq data were generated by the Bernstein laboratory and the Broad Institute at the Massachusetts General Hospital/Harvard Medical School. All ENCODE sequence data were aligned to the GRCh37/hg19 (Feb. 2009) reference genome. Black arrows denote the transcriptional start sites. The UCSC Genome Browser was used to generate the track displays251, 252. HUVEC, human umbilical vein endothelial cells; AoSMC, human aortic vascular smooth muscle cells. (C, D) Zoomed-in views of the SELP and VWF promoter methylation profiles.


Promoter methylation can downregulate transcription by recruiting methyl-CpG binding proteins with histone deacetylase activity and blocking the binding of select TFs, resulting in a closed and transcriptionally repressive chromatin structure. Histone acetylation at proximal promoter regions is a chromatin modification associated with open chromatin conformation and transcriptional activation. Some genes can be activated by DNMT inhibition or by histone deacetylase inhibition, whereas others require the combined effects of both, which may lead to synergistic activation253.

To implicate proximal promoter methylation in the determination of steady-state RNA levels of P-selectin, we treated HUVEC and HuAoVSMC with 5-aza-2'-deoxycytidine (5-Aza-CdR), a DNA methyltransferase inhibitor. We also tested whether histone deacetylase (HDAC) activity was required for the maintenance of a transcriptionally silent state by treating cells with the HDAC inhibitor trichostatin A (TSA), either with or without 5-Aza-CdR.

Shown in Figure 19 are three independent experiments. We observed no marked effect when the DNA methyltransferase inhibitor was added to HUVECs or HuAoVSMCs. This was surprising, as we had predicted that both cell types would show a strong induction with 5-Aza-CdR treatment. With TSA treatment, either alone or in combination with 5-Aza-CdR, we also expected increased expression. In the HUVEC samples we observed a decrease in expression in both TSA conditions (either alone or with 5-Aza-CdR). An increase in expression was observed in the HuAoVSMC when treated with 5-Aza-CdR + TSA. A possible explanation for these findings is that treatment with 5-Aza-CdR and TSA results in global, non-specific DNA demethylation and prevention of histone deacetylation. This may affect multiple regulatory pathways that are indirectly implicated in P-selectin regulation. To address this possibility, we investigated whether DNA methylation plays a direct role in regulating P-selectin expression using a luciferase reporter assay.


Figure 19. Effects of 5-Aza-CdR and TSA on SELP steady-state mRNA and hnRNA levels in HUVEC and HuAoVSMC. Exponentially growing cells were treated with 50μM 5-Aza-CdR and grown for 2 days, after which RNA was isolated. For TSA and combined 5-Aza-CdR and TSA treatments, TSA was added at a concentration of 1µM for the last 24h. RT-qPCR analysis of SELP steady-state transcript levels revealed induction of SELP mRNA (A) and hnRNA (B) expression in HUVECs treated with 5-Aza-CdR. mRNA values were normalized to TATA-binding protein (TBP), which was used as an endogenous reference gene. hnRNA was normalized to luciferase efficiency. The black line indicates the mean of three biological replicates.

A luciferase reporter construct was created by ligating a 1888bp region of the human P-selectin promoter (from -1796 to +92 relative to the TSS) into the KpnI/XhoI sites of pGL2 (Figure 20A). This pGL2-1796/+92 reporter contains a functional GATA site53, two Stat6 sites, as well as several ETS elements. It also contains 13 CpG sites and allows us to assess whether proximal promoter cis-elements are important in driving expression of P-selectin or whether additional regulatory elements (which may be more distant) are required. Methylated and unmethylated pGL2-1796/+92 were transiently transfected into HUVEC and BAEC cells. Mock-treated plasmids that were processed using the same in vitro methylation steps, but in the absence of M.SssI, were also tested.


High luciferase activities were observed when unmethylated promoters, including mock-treated control plasmids, were transfected into HUVECs (Figure 20C). This indicated a mechanism whereby differential methylation determines steady-state RNA levels. Mock methylated plasmids displayed lower luciferase activities compared to non-methylated. This is expected since the process of purifying the plasmids using a column-based approach during the mock methylation procedure can introduce nicks into the plasmid DNA. The supercoiled, non-methylated pGL2-1796/+92 plasmid should experience higher RNA synthesis rates due to more effective and stable promoter-transcription factor interactions which can enhance the rate of initiation254,255. In contrast, low luciferase activities were observed when the promoters were methylated, indicating methylation can directly suppress promoter activity. As expected, BAEC cells responded almost identically to the human cells. Studies have shown that TNF-α and IL-1β, which initiate NF-κB signaling pathways, upregulate expression of murine and bovine P-selectin but downregulate expression of human SELP256-258. It will be interesting to see whether BAECs pre-activated with TNF-α will generate different results in these transfection studies (i.e., a downregulation of luciferase activity from the non-methylated episomes). Overall these results show that DNA methylation status can directly lead to changes in transcriptional activity.
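The way reporter readouts of this kind are typically normalized can be sketched as follows. This is a minimal illustration, assuming that the firefly signal is divided by the signal from the co-transfected pRL-SV40 reference plasmid (Renilla luciferase) and then expressed as a percentage of the unmethylated construct, as described in the Figure 20 legend. The function name and all readings are hypothetical, not measured values.

```python
def relative_luciferase(firefly, renilla, ref_firefly, ref_renilla):
    """Percent activity of a construct relative to a reference construct.

    Each firefly reading is first divided by its co-transfected Renilla
    reading to correct for transfection efficiency.
    """
    ratio = firefly / renilla
    ref_ratio = ref_firefly / ref_renilla
    return 100.0 * ratio / ref_ratio


# Hypothetical luminometer readings (arbitrary light units)
unmethylated = {"firefly": 1000.0, "renilla": 100.0}
methylated = {"firefly": 200.0, "renilla": 100.0}

methylated_percent = relative_luciferase(
    methylated["firefly"], methylated["renilla"],
    unmethylated["firefly"], unmethylated["renilla"],
)
print(methylated_percent)
```

By construction the unmethylated construct scores 100%, so the methylated and mock-methylated values can be read directly as a fraction of unmethylated promoter activity.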


Figure 20. Quantification of human P-selectin gene promoter activity. BAEC and HUVEC cells were transiently co-transfected with a reporter vector and an internal reference plasmid pRL-SV40 to control for transfection efficiency. (A) pGL2-1796/+92 contains a 1888bp fragment of the P-selectin promoter fused to a firefly luciferase cassette in a pGL2 plasmid. Methylated pGL2-1796/+92 is the in vitro methylated version of this construct. Mock methylated is pGL2-1796/+92 processed the same as the methylated construct but without the addition of M.SssI methylase. (B, C) Data are expressed as percentage of luciferase activity relative to unmethylated pGL2-1796/+92. pGL2-1796/+92 displayed ~10% of the activity of an SV40 promoter/enhancer-directed luciferase control vector, pGL2-control (data not shown). Error bars represent the SEM of three biological replicates. BAEC, bovine aortic endothelial cells; HUVEC, human umbilical vein endothelial cells.

3.2.5 Promoter Methylation Correlates with P-selectin Protein Expression in HUVECs

Finally, to test the relationship between P-selectin protein heterogeneity and promoter methylation, cell sorting was used to isolate subpopulations of HUVECs that expressed high and low P-selectin. DNA was isolated from these samples for bisulfite analysis. When we examined the P-selectin-high population, we found that they have a higher frequency of 0 or 1 methylated CpGs whereas the low population had a higher frequency of 2, 3, or 4 sites methylated (Figure 21B). In terms of the percent methylation, the high cells had lower average methylation across the majority of CpG sites compared to the P-selectin low cells (Figure 21C). In line with previous results, this indicated to us that DNA methylation has a role in the regulation of P-selectin RNA and protein and contributes to its heterogeneous expression. It further suggests that expression of P-selectin mRNA and protein may be best correlated with alleles that have no DNA methylation (0 of 4 sites methylated).

Figure 21. Methylation of SELP promoter in ECs sorted for high and low P-selectin protein. HUVEC cells were stimulated with histamine (10⁻⁴M) for 10min, stained with Alexa488-conjugated P-selectin antibody, and sorted using fluorescence activated cell sorting. DNA was isolated and converted with sodium bisulfite for subcloning and high-resolution analysis of promoter CpG methylation. Lollipop diagrams (A) show the strand analysis of 4 consecutive CpG sites in individually sequenced cloned alleles. Each row represents a single cloned allele and each circle represents a CpG site relative to the transcriptional start site. Filled circles represent methylated CpGs and unfilled circles are unmethylated CpGs. The frequency histogram (B) shows the number of methylated CpG sites in the P-selectin high and P-selectin low populations. The % methylation graph (C) displays the average percent methylation for each CpG site across all alleles that were sequenced.
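The two summaries described in the legend (the per-allele count of methylated sites and the per-site percent methylation) can be derived from the lollipop data as in the sketch below. It assumes each cloned allele is encoded as a tuple of 0/1 calls, one per CpG site; the function name and the four toy alleles are hypothetical and do not reproduce the sequenced clones.

```python
from collections import Counter


def summarize_clones(clones):
    """Return (percent methylation per CpG site, histogram of methylated sites per allele)."""
    n_sites = len(clones[0])
    per_site = [
        100.0 * sum(clone[i] for clone in clones) / len(clones)
        for i in range(n_sites)
    ]
    histogram = Counter(sum(clone) for clone in clones)
    return per_site, histogram


# Hypothetical alleles; columns correspond to four CpG sites (1 = methylated)
toy_clones = [
    (1, 1, 0, 1),
    (0, 0, 0, 0),
    (1, 1, 1, 1),
    (0, 1, 0, 0),
]
per_site, histogram = summarize_clones(toy_clones)
print(per_site)
print(sorted(histogram.items()))
```

Keeping the per-allele rows intact, rather than pooling calls per site, is what allows the frequency histogram to distinguish fully unmethylated alleles from partially methylated ones.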


3.3 The Role of Histone Post-translational Modifications on P-selectin Expression

3.3.1 Comparison of Histone Acetylation ChIP-chip Data with ChIP-seq Data from the ENCODE Consortium

Histone post-translational modifications and chromatin-associated protein complexes are known to play crucial roles in the control of gene expression. Bivalent (poised or paused) chromatin is characterized by the co-enrichment of activating (e.g., H3K4me1 or H3K4me3) and repressing (e.g., H3K27me3) chromatin modifications at the same location259, 260. Originally characterized by Bernstein et al., bivalent promoters were found to be strongly correlated with CGIs260. This combination of epigenetic marks at promoter or enhancer regions keeps genes expressed at low levels but poised for rapid activation. While many studies have reported on the importance of bivalent chromatin domains in the context of embryonic stem cells and development, differentiated cells may also have bivalent promoters. Matsumura et al. have reported the existence of a novel H3K4/H3K9me3 bivalent domain involved in maintaining developmental genes of preadipocytes261. Although P-selectin is not a CGI gene, we were still interested in knowing whether a bivalent domain existed at the P-selectin promoter which could explain its heterogeneous expression. We also wanted to explore the role of other histone modifications at the P-selectin gene.

The chromatin profiles of various histone marks were evaluated for HUVEC, HuAoVSMC, and normal human epidermal keratinocytes (NHEK) cells (Figure 22). We compared the ENCODE ChIP-seq profiles262 with data from our recently published ChIP-chip ultra-high resolution tiling array which assessed pan-H3Ac (H3K9 and K14 acetyl) and pan-H4Ac (H4K5, K8, K12, and K16 acetyl) profiles in HUVEC versus HuAoVSMC263. This tiling array consisted of 1 million 60nt oligonucleotide probes that tiled the non-repetitive genomic regions of 34 select genes of interest at a 5nt resolution. The probes were designed with Agilent eArray using the UCSC HG18 March 2006 human genomic assembly. The regions analyzed include 50kb regions that were upstream of the TSS and 50kb downstream of the cleavage and polyadenylation signal. The array had a maximum of 24 probes detecting every 5nt of unique genomic region with 12 probes detecting each DNA strand. The ChIP-chip data demonstrated enrichment of pan-H3Ac and -H4Ac at the proximal promoter of P-selectin in HUVEC compared to HuAoVSMC (Figure 22A). This was consistent with the low-resolution ChIP-seq H3K9Ac and H3K27Ac ENCODE data comparing HUVECs to NHEK, although the differences were less pronounced. P-selectin did not show differential enrichment of H3K36me3 in the intragenic region compared to either HuAoVSMC or NHEK (Figure 22B). This observation supports the conclusions from the RNA pol II ChIP-seq and is in line with the known low basal mRNA expression in HUVEC. In terms of bivalency, P-selectin did not appear to have the classical H3K4me3–H3K27me3 bivalent mark at the promoter (Figure 18A). As our understanding of bivalent chromatin states increases, other signatures for bivalency may emerge which are not contingent on the presence of a CGI in the promoter. For example, Mauser et al. have reported on a bivalent H3K9me3–H3K36me2/3 mark enriched in the gene bodies and enhancers of weakly transcribed genes264. Although the co-localization of these marks has been associated with alternative splicing265, 266, it is possible that the presence of this signature denotes other unknown biological functions.


Figure 22. Integrated Genome Browser view of histone modification and RNA polymerase II profiles of SELP. The tracks for SELP across HUVEC and two non-EC cell types, HuAoVSMC and NHEK, are shown. The x-axis of all tracks corresponds to genomic position. (A) Histone H3 (pan-H3Ac; H3K9Ac and H3K14Ac) and histone H4 (pan-H4Ac; H4K5Ac, H4K8Ac, H4K12Ac, and H4K16Ac) acetylation profiles of SELP in HUVEC and HuAoVSMC. The histone acetylation level at each genomic location is defined as the fold difference between the pan-H3Ac/pan-H4Ac ChIP (Cy5) and the isotype control rIgG ChIP (Cy3) hybridization signals. Only fold differences between pan-H3Ac/pan-H4Ac ChIP and isotype control ChIP that are ≥2-fold are shown. Biological replicates of pan-H3Ac (n=2) and pan-H4Ac (n=3) ChIP-chip analysis showed similar results. (B) H3K9Ac, H3K27Ac, H3K27me3, H3K36me3, and RNA pol II ChIP-seq profiles of SELP in HUVEC and NHEK. The HUVEC- and NHEK-associated ChIP-seq profiles were obtained from the ENCODE database262. These ENCODE sequence data were aligned to the GRCh37/hg19 (Feb. 2009) reference genome. The enrichment levels of H3K9Ac, H3K27Ac, H3K27me3, H3K36me3, and RNA pol II are represented by normalized intensities. HUVEC, human umbilical vein endothelial cells; HuAoVSMC, human aortic vascular smooth muscle cells; NHEK, normal human epidermal keratinocytes.

3.3.2 Allele-biased Regulation of P-selectin Expression

A major aim of this project was to define the contribution of epiallelic states to heterogeneity in P-selectin expression. Cellular heterogeneity can emerge from the expression of only one parental allele. Recent studies of allele-specific expression led to the surprising discovery that a significant proportion of human and mouse autosomal genes are subject to monoallelic expression (MAE)151, 155, 267. In every cell type assessed, between 10 and 30% of human and mouse autosomal genes were found to be regulated by MAE. An assessment of the extent and nature of genetic variation present in these genes revealed that they contribute disproportionately to genetic diversity in humans148, 268.

There are a number of techniques for detecting MAE157. Nag et al. recently reported an approach that distinguishes MAE genes200. They found that active and inactive alleles are differentially marked on the intragenic gene bodies by H3K36me3 and H3K27me3, respectively. This MAE signature is present in about 20% of ubiquitously expressed genes and over 30% of tissue-specific genes across cell types.

We first suspected that P-selectin was regulated by MAE when we evaluated a publicly available database151. This database of autosomal monoallelic expression predicted that P-selectin is monoallelically expressed in HUVECs. Thus, in a substantial fraction of HUVECs, only one of two P-selectin alleles is expressed in any given cell.

Chromatin immunoprecipitation with H3K27me3 and H3K36me3 antibodies was used to isolate chromatin from HUVEC. Following the pull-down and purification of the DNA, PCR was performed using genomic PCR primers to amplify regions of the gene containing informative SNPs. Sanger sequencing was used to analyze the phase of these PCR products.

A comprehensive SNP analysis was performed on the entire SELP gene (Chr 1q24.2; 41344bp) using data from The Single Nucleotide Polymorphism database (dbSNP build 148)249, 269, 270. dbSNP reports the minor allele frequency for each variant included in a default global population, which is currently the 1000 Genomes phase 3 genotype data from 2504 worldwide individuals. MAF is based on the total number of alleles in a population and does not give any indication of the number of homozygous or heterozygous individuals in a population. A total of 1419 SNPs were identified in the SELP gene (Figure 23A). A summary of SNPs located within 500 nucleotides upstream of the transcriptional start and including the first exon is shown in Figure 17. Importantly, several reported SNPs were found in functional cis-regulatory elements, and 3 overlapped the CpG sites at -348 and -433. However, all of the SNPs indicated appear to be rare variants as they were observed to occur at a global minor allele frequency (MAF) of <1%. ChIP-seq data for H3K36me3 and H3K27me3 in HUVEC were obtained from previously published data97, 251, 252 (Figure 24). Comparison of the ChIP-seq signal intensities allowed the identification of informative SNPs that would likely be present in the DNA isolated from ChIP experiments using H3K27me3 and H3K36me3 antibodies. Seven SNPs within the gene body of P-selectin having a MAF > 0.2 were chosen for the design of SNP-seq primers (Figure 23B). Using this cutoff minimizes the number of umbilical cords needed to find a heterozygote (~1 in 3 cords).
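The "~1 in 3 cords" estimate is consistent with a simple population-genetics calculation, sketched below. It assumes a biallelic SNP in Hardy-Weinberg equilibrium, which is an assumption of this illustration rather than something stated in the text; the function names are hypothetical. The allele count used for rs2235303 is the one quoted in the Figure 23 legend.

```python
def minor_allele_frequency(minor_allele_count, n_individuals):
    """MAF = minor allele count / total chromosomes (two per diploid individual)."""
    return minor_allele_count / (2 * n_individuals)


def expected_heterozygosity(maf):
    """Expected heterozygote fraction 2pq, assuming Hardy-Weinberg equilibrium."""
    return 2 * maf * (1 - maf)


# rs2235303: minor allele observed 1829 times among 2504 individuals
maf_rs2235303 = minor_allele_frequency(1829, 2504)

# Heterozygote fraction expected at the MAF cutoff of 0.2
het_at_cutoff = expected_heterozygosity(0.2)

print(round(maf_rs2235303, 4))
print(round(het_at_cutoff, 2))
```

At a MAF of 0.2 the expected heterozygote fraction is 0.32, or roughly one donor in three, and it rises toward one in two as the MAF approaches 0.5.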


Figure 23. Frequency and location of SNPs in the P-selectin gene. The P-selectin gene is 41344bp in length, containing 1419 SNPs in this region. (A) Location of the SNPs assessed in the H3K27me3 and H3K36me3 ChIP samples. (B) The global minor allele frequency (MAF) for each variant based on 2504 worldwide individuals in the 1000 Genomes phase 3 genotype data set. MAF reports the less (or least) frequent allele for each polymorphism included in a default global population. For each entry, the value before the slash is the frequency of the minor allele and the value after the slash is the number of times that allele is observed in the sample population. For example, the MAF for rs2235303 is A=0.3652/1829. This means that for rs2235303, the minor allele A has a frequency of 36.52% in the reference population and that 'A' is observed 1829 times in the sample population of 5008 chromosomes (or 2504 people).


Figure 24. Visualization of the H3K36me3 and H3K27me3 MAE signature of the P-selectin gene in HUVECs. Mapped H3K36me3 (blue), H3K27me3 (red), and control (green) ChIP-Seq signal intensities over the gene body region of P-selectin are shown. The control sample is DNA isolated from cells that have been cross-linked and fragmented under the same conditions as the immunoprecipitated DNA. ENCODE ChIP-Seq data in HUVECs was obtained in wig format from the UCSC genome browser for genome assembly UCSC hg18 (2006). Image was created using Integrated Genome Browser Version 9.0.0271 and the heights of the signal tracks were set to 15.


H3K27me3 and H3K36me3 ChIP experiments were performed on three different HUVEC lines. Figure 25 shows the Sanger sequencing results of the SNP analysis. The input samples from each line were used as the template DNA for PCR amplification of several high frequency SNPs, followed by Sanger sequencing. Two of the samples (HUVEC 1 and 2) were heterozygous for the same SNP, whereas the third sample (HUVEC 3) was heterozygous for a different SNP. DNA purified by ChIP was PCR amplified and sequenced across the heterozygous SNP that was identified in the input genomic DNA. RNA was extracted from these samples and the polyA- fraction was separated from the total RNA. It was important that we used the polyA- RNA to make cDNA because all of the informative SNPs that we identified were within intronic regions; performing the polyA- selection allowed more efficient amplification during the PCR step.


Figure 25. Sanger sequencing of PCR products containing an informative SNV in HUVEC cells. (A) Chromatograms from Sanger sequencing of PCR products containing informative intronic SNVs (arrows) from input DNA, DNA from chromatin immunoprecipitation of H3K27me3 and H3K36me3, as well as cDNA from three HUVEC samples. (B) Summary of peak heights and allele ratio normalized to input DNA.

The electropherogram pattern of a DNA sequence is highly reproducible and is determined by the local context of the sequence as well as the nature of the labelled chain-terminating nucleotides used in sequencing. In other words, if the same region of DNA is amplified from different individuals and sequenced, both individuals will have the same peak height patterns. Comparing the two HUVEC lines with the same hetSNP (HUVEC 1 and 2), we observed that the heights of the peaks surrounding the SNP are identical for these two individuals (Figure 25A). In HUVEC 1, we can see that the H3K27me3 sample had an enrichment of the ‘A’ allele, which indicates bias of ‘A’ in the “off” signal. On the other hand, the H3K36me3 sample as well as the hnRNA had an enrichment of the ‘G’ allele. Overall, this revealed bias of ‘G’ in the “on” signal and confirmed the presence of this allele in the RNA transcript. Interestingly, when looking at HUVEC 2 with the same hetSNP, we saw the opposite result in that the H3K27me3 DNA was enriched for ‘G’ and the H3K36me3 DNA and hnRNA were enriched for ‘A’. This showed that the specific nucleotide (either G or A) was not a factor in determining which allele was on or off, but rather confirmed to us that transcription was occurring from one allele and that P-selectin is monoallelically expressed. This SNP is not likely functional, as different alleles were expressed in different HUVEC preps. Finally, in our third line (HUVEC 3), again we saw enrichment of a specific allele in the two different ChIP samples. H3K27me3 was enriched for ‘G’ and both the H3K36me3 ChIP DNA and cDNA were enriched for ‘T’.
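One way an allele ratio "normalized to input DNA" (Figure 25B) could be computed from chromatogram peak heights is sketched below. The exact calculation used for the figure is not given in the text, so the formula, the function name, and the peak heights are all assumptions for illustration: the ratio of the two allele peaks in a ChIP or cDNA sample is divided by the same ratio in the input genomic DNA, where both alleles are present in equal copy number.

```python
def normalized_allele_ratio(sample_a, sample_b, input_a, input_b):
    """Allele A:B peak-height ratio in a sample, relative to the same ratio in input DNA.

    Values above 1 suggest enrichment of allele A; values below 1 suggest
    enrichment of allele B. Dividing by the input ratio corrects for
    sequence-context differences in peak height between the two bases.
    """
    return (sample_a / sample_b) / (input_a / input_b)


# Hypothetical peak heights (arbitrary fluorescence units) at a heterozygous SNP
input_heights = {"A": 200.0, "G": 200.0}
h3k27me3_heights = {"A": 300.0, "G": 100.0}

ratio_off = normalized_allele_ratio(
    h3k27me3_heights["A"], h3k27me3_heights["G"],
    input_heights["A"], input_heights["G"],
)
print(ratio_off)
```

In this toy case the repressive-mark sample is enriched three-fold for the 'A' allele relative to input, the kind of skew that would be read as bias of that allele in the "off" signal.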

The expression levels of monoallelic genes were also compared to those of biallelic genes to determine whether there were any differences in the steady-state transcript levels. Our lab has created a custom NanoString nCounter assay containing 231 genes of interest. This NanoString assay was recently used to profile the mRNA abundance of several HUVEC lines. We were interested in using these data to examine the expression level of these genes, which were classified as either MAE or BAE based on the dbMAE predictor. The results of this analysis are summarized in Figure 26. The mean expression of MAE genes was 82.38 counts (averaged across 52 genes) compared to the mean expression of BAE genes, which was 1974 counts (averaged across 153 genes). The Wilcoxon–Mann–Whitney (WMW) 2-sample rank sum test was used to check whether the expression of either MAE or BAE genes tended to be larger than the other (P<0.0001). Generally, the levels of transcripts from BAE genes were found to be higher than those from MAE genes. This is consistent with previous studies performed in other cell types that report that monoallelically expressed genes have lower transcript levels than biallelically expressed genes147.
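The rank sum comparison can be illustrated with a small standard-library sketch. It computes the Mann-Whitney U statistic using midranks and a two-sided P value from the normal approximation, without tie or continuity corrections; this is a simplified stand-in, not the software used for the analysis, and the count values are hypothetical.

```python
import math


def mann_whitney_u(x, y):
    """Return (U for sample x, two-sided P value from the normal approximation)."""
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        # Extend j over any run of tied values so they share a midrank
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        midrank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = midrank
        i = j + 1
    n1, n2 = len(x), len(y)
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u_x - mean_u) / sd_u
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return u_x, p_two_sided


# Hypothetical normalized counts for a few MAE and BAE genes
mae_counts = [12.0, 45.0, 80.0]
bae_counts = [150.0, 900.0, 2100.0, 4000.0]
u_mae, p_value = mann_whitney_u(mae_counts, bae_counts)
print(u_mae, round(p_value, 4))
```

A U of 0 for the MAE group means every MAE value ranks below every BAE value. With groups as large as those analyzed here (52 and 153 genes) the normal approximation is reasonable, although an established statistics package would normally be preferred for reporting.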


Figure 26. Expression level comparison between monoallelic and biallelic genes. Box and whisker plots showing the expression distribution for 205 genes classified as either biallelic (153 genes) or monoallelic (52 genes) in HUVEC. The expression of each gene was averaged across three HUVEC lines. The box plots show the median, lowest, and highest expressing gene in each category. Data were analysed by Wilcoxon-Mann-Whitney 2-sample rank sum test to compare the expression of BAE to MAE genes (P<0.0001).

3.4 Mitotic Stability of P-selectin Epialleles

3.4.1 Conservation of Promoter DNA Methylation

Having established that genetically identical ECs exhibit heterogeneous P-selectin protein, mRNA, hnRNA, and promoter DNA methylation, we then asked whether the DNA methylation status at the proximal promoter could be transmitted through mitotic cell division. To address this question, cell sorting was used to plate single, freshly isolated HUVEC cells into 96-well plates. Colonies were subsequently grown to confluence in 24-well, 12-well, 6-well and finally 10cm plates. DNA was isolated from the parental line (at passage 1) as well as the individual clones and methylation was analyzed using sodium bisulfite genomic sequencing. This was done for two independent HUVEC donors. We found that the proximal promoter methylation of individual clones differed from one another as well as from the original parental population (Figure 27). The clones did not re-establish the same heterogeneity that was observed in the parental DNA. Clones appeared to show some strand-to-strand variation, but stable average methylation of specific sites or stable patterns across all sites were observed. For example, in the hypomethylated clones (Parental Line 1 Clone 2 (Figure 27C) and Parental Line 2 Clone 2 (Figure 27F)), we observed consistently low average percent methylation across 3 CpGs (namely -468, -433, and -348). When considering the total average percent methylation of CpGs across all strands, both hypomethylated clones showed a lower percent methylation than the respective parental lines. In Parental Line 2 Clone 2 (Figure 27F), seven of the eighteen alleles had the same pattern of methylation at -152. Similar observations were made in the hypermethylated clones. Dominant patterns of methylation could be identified in the clones that appear to have originated from the parental line. Over the course of expansion from a single cell to confluence in a 10cm plate, a cell undergoes ~31-32 doublings. The presence of other allele patterns in the clones may be explained by imperfect conservation during mitosis. Numerous studies using the bisulfite method have clearly established that methylation patterns often show some molecule-to-molecule variation. In our results, Parental Line 1 Clone 1 (Figure 27B) appeared to have more consistency in terms of the methylation status at specific CpG sites. There are two main distinct patterns (methylation of all four CpG sites or methylation at -468, -433 and -152); however, more uniformity was observed in the site percent methylation: the CpG at -468 had an average 74% methylation across all strands, -433 had 95% methylation, and -152 had 85%. This finding provided exciting evidence that there may be some “memory” in terms of inheritance of epigenetic states.


Figure 27. Mitotic conservation of promoter methylation. HUVEC clones were generated from genetically identical single cells and analyzed using sodium bisulfite genomic sequencing. The parental methylation profile and two representative clones are shown. Lollipop diagrams (left) show the strand analysis of four consecutive CpG sites in individually sequenced cloned alleles. Each row represents a single cloned allele and each circle represents a CpG site relative to the transcriptional start site. Filled circles represent methylated CpGs and unfilled circles are unmethylated CpGs. The percentages refer to the summed methylated/total CpG across all four CpG sites. The frequency histograms (centre) show the number of methylated CpG sites. The % methylation graphs (right) display the average percent methylation for each CpG site across all alleles that were sequenced. HUVEC, human umbilical vein endothelial cells.

Another exciting observation was made in terms of the clonogenic and proliferative potential of HUVEC cells. Previous studies have described the presence of circulating endothelial progenitor cells isolated from adult peripheral blood and umbilical cord blood272-274. These cells were primarily defined by their ability to form replatable colonies when isolated and plated as single cells. Similar cells were also discovered in HUVECs and other cells derived from adult vessel walls. The authors found that 52% of single HUVECs underwent at least one cell division after two weeks in culture. Of those cells, 28% eventually formed colonies containing >2000 cells. When replated into wells of a 24-well plate, about 20% of those colonies formed secondary colonies or became confluent273. These are termed the high proliferative potential–endothelial colony-forming cells (HPP-ECFC)272. Cells which formed smaller colonies (50–500 cells) were defined as low proliferative potential–endothelial colony-forming cells (LPP-ECFCs). Overall this indicated that in any given population of HUVECs starting at passage 3, approximately 15% will have colony forming potential and about 3% will be able to be serially passaged and expanded.

In our studies, we obtained a much lower initial cell survival/divisibility potential. In Parental Line 1, out of 480 single HUVEC seeded into 96-well plates, only 64 (13.3%) formed colonies or grew to confluence after two weeks. We observed similar results in Parental Line 2, where 51 out of 288 wells (17.7%) had divided and either formed a colony or become confluent. After these initial two weeks, most colonies from both lines could be serially replated onto 24-well, 12-well, 6 cm, and finally 10 cm plates. We consistently observed more than 80% of colonies growing to confluence after each replating. This equated to a four times greater percentage (~12%) of cells that could be successively passaged from a 96-well plate to confluence in a 10 cm plate. A possible explanation for this discrepancy is that Ingram et al.273 utilized cryopreserved passage 3 HUVEC, while our experiments were done with cells freshly isolated from the umbilical cord. Because fresh cells have not undergone several rounds of subculturing and expansion, our starting population may have been enriched for cells with high survival and proliferative capacities. By the same reasoning, a passage 3 population would likely contain fewer cells capable of being serially passaged and propagated, since these cells have already undergone a larger number of divisions. This is consistent with the Hayflick limit, the observation that cultured normal human cells have a limited capacity to divide, after which they become senescent275, 276. Collectively, our results indicate the presence of a subset of HUVECs that are able to achieve at least 32 population doublings.
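The percentages compared in the two preceding paragraphs follow from simple chained fractions. The sketch below, in Python, recomputes them from the counts and percentages quoted in the text. The helper name `cascade` is hypothetical, and the ~12% serially passaged estimate is taken from the text as given rather than re-derived.

```python
def cascade(*fractions):
    """Multiply successive conditional fractions into one overall fraction."""
    overall = 1.0
    for fraction in fractions:
        overall *= fraction
    return overall


# Figures quoted from the cited colony-forming study of passage 3 HUVEC.
cited_colony_forming = cascade(0.52, 0.28)            # divided, then formed large colonies
cited_serially_passaged = cascade(0.52, 0.28, 0.20)   # ...then formed secondary colonies

# Plating efficiencies observed here after two weeks in 96-well plates.
line1_efficiency = 64 / 480
line2_efficiency = 51 / 288

# Fold difference, using the ~12% serially passaged estimate stated in the text.
fold_difference = 0.12 / cited_serially_passaged

print(f"cited colony-forming fraction:    {cited_colony_forming:.1%}")
print(f"cited serially passaged fraction: {cited_serially_passaged:.1%}")
print(f"Parental Line 1 efficiency:       {line1_efficiency:.1%}")
print(f"Parental Line 2 efficiency:       {line2_efficiency:.1%}")
print(f"fold difference (approx.):        {fold_difference:.1f}")
```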

3.5 Discussion

Epigenetic and transcriptional variability play key roles in human health and disease. Although heterogeneity has become an accepted characteristic of a population of cells, it is not routinely considered when designing strategies for the therapeutic treatment of diseases. The experiments presented here examine the molecular mechanisms of constitutive heterogeneous gene expression of endothelial cells grown in vitro. P-selectin expression was found to be heterogeneous in basal HUVECs at the level of both protein and RNA. This heterogeneity was observed both in cultured ECs and ECs isolated from fresh human tissues. This can be compared with VWF, which displayed homogeneous expression in RNA and protein across cells isolated from the same blood vessel. Therefore, this phenomenon of heterogeneity is not ubiquitous and does not affect all endothelial genes equally.

Although our assessment of transcriptional heterogeneity is largely based on qualitative assessments from RNA FISH and scRNA-seq experiments, we believe that certain genes are programmed to be more heterogeneous than others and a wide expression distribution may be more protective and advantageous under certain circumstances. It is likely that no single transcriptional state determines whether a cell may be prone to vascular disease development. It is more likely that many different transcriptional states can produce the same functionality and this heterogeneity is advantageous to the population as a whole.

In a 2016 paper, Yuan et al. reported that roundabout 4 (Robo4) and ETS-related gene (ERG) are examples of homogeneous EC-restricted genes, while endothelial-specific molecule 1 (ESM1) and EphrinB2 are heterogeneous42. Using RNA FISH, parental and clonal populations of HUVEC showed uniform expression of Robo4 and ERG and mosaic expression of ESM1 and EphrinB2. Our scRNA-seq results confirmed the EphrinB2 heterogeneity but also showed variation in Robo4 and ERG mRNA (Figure 28). ESM1 was not significantly expressed. This may be explained by the difference in cell passaging: our sample was freshly harvested from a single donor, whereas Yuan et al. used third-passage single-donor cells. Biases in quantification of FISH experiments may also contribute to the differences in our findings. Quantifying heterogeneity in gene expression among single cells can reveal information masked by cell-population averaged experiments. However, the lack of standardized metrics to accurately quantify heterogeneity in gene expression remains a significant challenge.

Figure 28. Heterogeneity of other EC-enriched genes. Feature plots and violin plots show the expression of endothelial-specific molecule 1 (ESM1), Ephrin B2 (EFNB2), Roundabout 4 (ROBO4) and ETS-related gene (ERG) in freshly isolated human umbilical vein endothelial cells. ESM1 displayed little expression in the endothelial clusters (1-4), whereas EFNB2, ROBO4, and ERG demonstrated heterogeneous mRNA expression among the ECs.

In terms of promoter methylation, an important finding was that heterogeneity existed basally in certain cell and tissue types but not in others. Many previous studies assumed that all cells within a tissue have identical or highly similar methylation patterns. When we examined promoter methylation in a variety of endothelial and non-endothelial cell types, P-selectin heterogeneity was observed in specific ECs, namely HUVEC and HCAEC. Aortic and dermal microvascular endothelial cells appeared to be more homogeneous. We expect follow-up studies will also show homogeneity in protein and RNA for HAEC and HMVEC. These findings allude to a mechanism where heterogeneous expression of certain genes aids in the function and survival of some cell types but not others. This is similar to the concept of adaptive resistance in bacteria, where phenotypic heterogeneity contributes to cell survival after exposure to an antibiotic277. Similar findings have been reported in yeast278 and tumor cells279. In these examples, phenotypic heterogeneity is thought to accelerate the rate of adaptive evolution in populations exposed to a drug treatment. The growth and expansion of resistant subclones is therefore driven by both pre-existing random fluctuations in gene expression and de novo variations (such as driver mutations). In our studies, we argue that cell-to-cell variation of certain physiological traits plays a similar role within different regions of the body. In the vascular system, heterogeneity seems to benefit some vessel types over others, and expression variation is a non-random process that is built into certain genes and occurs in the absence of environmental stresses. We believe that in an otherwise isogenic population, heterogeneity arises due to epigenetic differences and that this is an inherent property that can be beneficial to the tissue as a whole.

A second finding from our methylation studies was the presence of methylation heterogeneity in both early and late passage ECs. P-selectin promoter methylation heterogeneity was found in both the passage 1 parental HUVECs from the clonal studies (Figure 27) and cells that were expanded to passage 3 (Figure 14). Although these experiments were done in different donor lines, this suggests that methylation heterogeneity is a feature that endures culturing and cell expansion. In relating methylation to the observation that HUVEC lose the ability to express P-selectin within two to three passages28, it seems that promoter hypermethylation does not explain this phenomenon and that other factors are at play. It is possible that the expression of P-selectin is dependent on an in vivo stimulus that is not recapitulated in vitro. Future studies are needed to confirm that methylation heterogeneity is not passage-dependent by comparing promoter methylation in populations of serially passaged HUVEC derived from the same individual.

In our clonal studies, we have shown that differences in methylation are heritable through mitotic cell division. While we did not observe perfect conservation of methylation across all strands isolated from a population of cells expanded from a single founder, this may be explained by the processivity and fidelity of the DNMTs. DNMT1 plays an essential role in the maintenance of methylation; in association with the DNA replication machinery, DNMT1 methylates hemimethylated CpG sites with high efficiency but not absolute accuracy280. DNMT1 was reported to methylate daughter strands with ~95% efficiency280. Other groups have reported that in addition to the de novo methylation activities of DNMT3A/B, these methylases also function as proofreaders to fill the gaps of the hemimethylated CpG sites missed by DNMT1281. Ushijima et al. have calculated the overall fidelity of maintenance methylation to be 99.85–99.92% per cell division282. Although this value seems quite high, a large number of factors are likely to affect this rate, such as methylation density, cellular availability of S-adenosylmethionine, repetitive DNA, histone modifications, the activity of demethylases (TET proteins), and cell cycle kinetics283. Therefore, some clone-to-clone variation in our results is expected considering the number of cell divisions from a single cell to a confluent 10 cm dish.
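To illustrate why even a high per-division fidelity allows some drift during clonal expansion, the sketch below compounds the fidelity values quoted above over a number of divisions. It is a deliberately simplified model and not an analysis performed in this work: it assumes each CpG site is maintained independently, ignores de novo methylation and active demethylation, and uses 32 divisions purely as an illustrative figure borrowed from the population doubling estimate earlier in this chapter. The function names are hypothetical.

```python
def retained_fraction(fidelity, divisions):
    """Probability that one methylated CpG is still methylated after `divisions` rounds."""
    return fidelity ** divisions


def intact_allele_fraction(fidelity, divisions, n_sites=4):
    """Probability that all `n_sites` methylated CpGs on an allele are retained."""
    return retained_fraction(fidelity, divisions) ** n_sites


DIVISIONS = 32  # illustrative assumption, not a measured value for these clones

for label, fidelity in [
    ("lower fidelity estimate (99.85%)", 0.9985),
    ("upper fidelity estimate (99.92%)", 0.9992),
    ("DNMT1 efficiency alone (~95%)", 0.95),
]:
    per_site = retained_fraction(fidelity, DIVISIONS)
    per_allele = intact_allele_fraction(fidelity, DIVISIONS)
    print(f"{label}: per-site {per_site:.3f}, four-site allele {per_allele:.3f}")
```

Under these assumptions, the contrast between the ~95% and >99.8% cases shows why the proofreading activity described above matters: small per-division losses compound across many divisions.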

Taken together, it may appear that these results are contradictory: a population of HUVECs isolated from a single donor appears to have heterogeneous promoter methylation, but a clone of cells derived from a single HUVEC has little variability. A possible explanation involves the changes in methylation during embryonic development. At two unique periods of mammalian development (in primordial germ cells and in pre-implantation embryos), the epigenome is reprogrammed towards a globally demethylated state284, 285. Cell differentiation is directed by transcription factor networks, which establish gene expression patterns in response to developmental cues. Once established, these TF networks maintain robust lineage-restriction and ensure development towards defined differentiated cell-types. In parallel, epigenetic mechanisms act to reinforce these cell-fate decisions and prevent loss of cell identity. However, methylation at a given gene may change during differentiation due to local concentrations of DNMTs and TETs. The activity of the TET enzymes is highly oxygen-dependent and varying oxygen concentrations during tissue development may result in a mosaic population of cells with substantial variation in DNA methylation patterns286. Therefore, the cells in a mature vessel may have arisen from a defined number of progenitor cells, each with slight variations in methylation. Clonal expansion of these progenitors may give rise to patches of cells with similar methylation patterns. Since our HUVECs are harvested from the entire length of a cord, the resulting cells will be a mixed population with cells that originated from different patches. A second mechanism which contributes to mature EC heterogeneity is the replication-mediated dilution of methyl marks during DNA replication. As described above, the fidelity of maintenance methylation is not perfect and will lead to passive demethylation.
Finally, the activity of DNMTs and TETs is highly amenable to changes in local environmental factors. Reactive oxygen species (ROS) are generated in response to various stimuli, such as shear stress and hypoxia. It is well known that different regions of the vascular tree experience different levels of shear. DNA bases can be directly modified by ROS. For example, hydroxyl radicals can lead to the formation of 5hmC from 5mC287. 5hmC has been proposed to also interfere with DNMT1 activity to block the proper inheritance of methylation patterns, thus leading to indirect demethylation of CpG sites288. Overall, there are many cellular factors that will affect the epigenetic landscape, but epigenetic marks are generally somatically stable through mitotic cell division.

Here, we have highlighted the importance of heterogeneous expression of some genes, such as P-selectin, whereby vessels benefit from having a wide distribution of cell surface P-selectin expression. Having some cells express high levels of P-selectin allows rapid leukocyte attachment during acute inflammation or injury. Conversely, if many cells continually expressed high P-selectin, this would generate a pro-inflammatory and pro-atherosclerotic endothelial phenotype. The advantage of having basal expression heterogeneity is that it allows for quick response to a change in the environment without the need for new RNA and protein synthesis. The expression of E-selectin and other downstream adhesion molecules is delayed but can be more pronounced depending on the duration and amount of cytokine exposure. This is critical to the capacity of tissues to recruit leukocytes during chronic inflammatory conditions.

The present study is the first, to our knowledge, to explore the role of epigenetics and allele-biased expression in determining the expression heterogeneity of EC genes without the need for endothelial activation by cytokines. We have shown that the alleles of P-selectin are differentially marked by H3K27me3 and H3K36me3. We suspect that allelic expression contributes to the non-uniform expression of P-selectin across different ECs: whether P-selectin is expressed from one allele or two will determine the steady-state transcript levels. Studies have shown that the heterogeneous pattern of P-selectin expression affects leukocyte and platelet rolling, providing direct evidence for the local variation of adhesion molecule expression as a mechanism for the regulation of leukocyte and platelet recruitment23. We propose a model in which monoallelic expression keeps basal P-selectin expression low in healthy ECs. In this model, the transition to biallelic expression (and high P-selectin expression) in certain vascular regions will predispose those areas to high leukocyte attachment. Future studies are needed to confirm this, and we will be interested to know the allelic expression status of P-selectin in ECs isolated from vascular regions prone to developing atherosclerosis.

We have also alluded to the possibility of clonal inheritance of histone marks. The ChIP-Sanger sequencing experiments were done using dishes of passage 3 HUVEC that were not derived from a single cell. In the DNA immunoprecipitated with either H3K27me3 or H3K36me3 antibodies, we observed clear peak enrichment of a single nucleotide at the assayed SNP location. This suggested to us that in a majority of cells within that dish, a given histone mark was associated with the same allele. A recent study from Reveron-Gomez et al. has shown that histone post-translational modifications are accurately copied onto newly synthesized DNA during the cell cycle289. Using a method called ChOR-seq (chromatin occupancy after replication), the authors were able to show that H3K27me3 and H3K36me3 occupancy patterns are reproduced on newly replicated DNA in both repressed and active genomic regions. Although we cannot exclude the possibility that the deposition of H3K27me3 and H3K36me3 was random between the two alleles, we would have expected a mixed population of cells to result in overlapping peaks at a SNP if both alleles had an equal chance of receiving either histone mark.
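The expectation stated above, that random deposition would yield overlapping peaks, can be illustrated with a small simulation. This Python sketch is a toy model under stated assumptions: every cell contributes exactly one marked allele to the immunoprecipitated pool, and the Sanger peak height at the SNP is taken to track the allele fraction in that pool. The function name and the cell number are hypothetical and do not correspond to the experiments described here.

```python
import random


def allele_a_fraction(n_cells, prob_mark_on_a, rng):
    """Fraction of immunoprecipitated alleles that carry SNP allele A."""
    marked_a = sum(1 for _ in range(n_cells) if rng.random() < prob_mark_on_a)
    return marked_a / n_cells


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Random deposition: each cell marks allele A or allele B with equal chance.
random_marking = allele_a_fraction(10_000, 0.5, rng)

# Allele-specific marking shared by all cells: the mark always sits on allele A.
shared_marking = allele_a_fraction(10_000, 1.0, rng)

print(f"random deposition, allele A fraction: {random_marking:.3f}")
print(f"shared allele-specific marking:       {shared_marking:.3f}")
```

In this toy model, random deposition gives an allele fraction close to one half, which would appear as two overlapping peaks, whereas a single clean peak corresponds to a fraction near one.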

Transcription is regulated at several levels, and we have yet to resolve the contribution of replication timing and transcriptional bursting to P-selectin heterogeneity. Asynchronous DNA replication timing is a feature of several classical monoallelically expressed genes, such as imprinted genes, olfactory receptor genes, immunoglobulin loci and the X-chromosome185, 189, 290, 291. Whereas most genes have both alleles replicated relatively synchronously at a defined portion of the S phase, we suspect that P-selectin will have one allele replicating before the other and that there will be differential epigenetic marks on the early- vs late-replicating alleles. Early replication is generally thought to be a feature of transcribed genes. However, in many cases early replication is not sufficient for transcription. In previous work, we have shown that genes that are not expressed in a specific cell type may also be replicated early215. For example, CD31 and ICAM2 both displayed early S-phase replication in both ECs and non-ECs. Although we expect to find asynchronous replication in P-selectin, we do not assume the early replicating allele will be the transcriptionally active allele. Additionally, Gendrel et al. have found that asynchronous replication is not always correlated with monoallelic expression and can occur in biallelic clonal populations of cells292. Nevertheless, examination of a possible mechanism of monoallelic expression will be important to our understanding of P-selectin regulation. We believe that monoallelic regulation of P-selectin is important for fine-tuning levels of mRNA in a population of cells. To test whether P-selectin is replicated asynchronously, we can utilize BrdU pulse-labelling and FISH analysis of S-phase nuclei293.

To an extent, heterogeneity in gene expression is always present, even among isogenic cells in the same environment. Part of this variability is attributed to transcriptional bursting, the stochastic activation and inactivation of promoters that leads to the discontinuous production of mRNA294-297. Transcriptional bursting is typically described by its frequency (the number of bursts per unit time) and size (the mean number of transcripts produced per burst event). Many factors can affect transcriptional burst kinetics, such as nucleosome occupancy, histone modifications, cis-regulatory sequences, and chromatin conformation295. Recently, a negative correlation was reported between gene length and burst size, but not frequency11. The authors also reported a strong linear relationship between enhancer activity and burst frequency. In the present study, we have examined some aspects of burst kinetics, but further work is needed to fully dissect how different mechanisms are coordinated to tune burst size and frequency.
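The distinction between burst frequency and burst size can be made concrete with a simple stochastic simulation. The sketch below uses a generic bursty birth-death model (bursts arriving at a constant rate, a geometrically distributed number of transcripts per burst, first-order mRNA decay) simulated with the Gillespie algorithm. This is a textbook simplification and not the analysis used in this thesis; all parameter values, time units, and names are hypothetical.

```python
import random


def simulate_bursty_mrna(burst_freq, burst_size, decay_rate, t_end, rng):
    """Gillespie simulation; returns the time-averaged mean and Fano factor of mRNA counts."""
    t, mrna = 0.0, 0
    sum_m, sum_m2 = 0.0, 0.0
    while True:
        total_rate = burst_freq + decay_rate * mrna
        dt = rng.expovariate(total_rate)
        step = min(dt, t_end - t)
        # Accumulate time-weighted moments of the current mRNA count.
        sum_m += mrna * step
        sum_m2 += mrna * mrna * step
        if t + dt >= t_end:
            break
        t += dt
        if rng.random() < burst_freq / total_rate:
            # Burst event: geometric number of transcripts with mean burst_size.
            produced = 1
            while rng.random() > 1.0 / burst_size:
                produced += 1
            mrna += produced
        else:
            # Decay event: one transcript is degraded.
            mrna -= 1
    mean = sum_m / t_end
    variance = sum_m2 / t_end - mean * mean
    return mean, variance / mean


rng = random.Random(1)
# Same expected mean (frequency x size / decay), different burst kinetics.
bursty_mean, bursty_fano = simulate_bursty_mrna(1.0, 5.0, 0.1, 20_000.0, rng)
steady_mean, steady_fano = simulate_bursty_mrna(5.0, 1.0, 0.1, 20_000.0, rng)
print(f"infrequent large bursts: mean {bursty_mean:.1f}, Fano {bursty_fano:.2f}")
print(f"frequent small bursts:   mean {steady_mean:.1f}, Fano {steady_fano:.2f}")
```

Both parameter sets are chosen to give the same average mRNA level, so any difference in the Fano factor (variance divided by mean) reflects burst kinetics alone. This is the sense in which two genes with similar population-average expression can differ in cell-to-cell heterogeneity.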

Heterogeneity is a topic that is gaining widespread attention in the biomedical community; however, there are a number of challenges that need to be addressed. One of the largest issues in analyzing variability is differentiating technical variability from biological variability. Technical variability due to intrinsic experimental noise is greater for lowly expressed genes compared to those that are highly expressed. This is especially relevant in the analysis of single-cell data298. Considering the rapid pace and low cost at which large-scale transcriptomic datasets can be produced, data analysis often presents a significant bottleneck. There exists a plethora of strategies available for the analysis of gene expression data and new algorithms are constantly being developed. Currently there is no gold standard for the analysis of single-cell transcriptomic data. Oftentimes the incorporation of user knowledge into the analysis algorithms is required to generate interpretable results that facilitate hypothesis generation. However, this can also introduce biases into the analysis. For example, decisions about how to preprocess the data by removing certain cells or genes (based on mitochondrial or ribosomal reads, thresholds for expression, cell cycle regression, etc.) may help to remove sources of variation, but will also affect clustering and differential expression later on. Gaining a deeper understanding of how heterogeneity is established, controlled, and maintained will be fundamental to broadening our understanding of human health and disease.

3.6 Future Directions

The findings presented in this thesis highlight important and exciting questions regarding the role of epigenetic mechanisms in regulating the heterogeneous expression of select constitutively expressed EC genes. In the following section, several experiments have been outlined that can further our understanding of the relationships between proximal promoter methylation, transcription, monoallelic regulation, and the importance of DNA replication timing in determining the expression variation of P-selectin.

The spectrum of EC P-selectin expression heterogeneity and clonality will be characterized using an in vivo lineage tracing strategy. R26R-Confetti (R26R-Brainbow2.1) transgenic mice provide a way to label and distinguish individual adjacent cells using a series of fluorescent protein cassettes in Cre-recombined cells299. These mice carry an insertion of a Brainbow2.1 transgene in the Rosa26 locus that is ubiquitously expressed throughout the entire mouse300. The Brainbow2.1 allele contains green, yellow, red, and cyan fluorescent proteins arranged in tandem invertible segments flanked by loxP sites. A floxed STOP sequence in front of the fluorescent cassette prevents background fluorescence in the absence of Cre recombinase. These mice can be crossed to a second mouse line expressing tamoxifen-inducible Cre recombinase (Cre-ERT2) under the regulation of the vascular endothelial cadherin (CDH5/VE-cadherin) promoter301. In the resulting mice, tamoxifen induction of Cre mediates excision of the STOP sequence and random inversion of the fluorescent cassettes, causing ECs to express one of the four fluorescent proteins. Thus, each EC in a starting population acquires a random colour after Cre recombination. Following recombination, each dividing cell produces daughter cells that share the same colour, allowing clone identification. Other mouse strains with increased colour complexity can be developed by the insertion of multiple copies of the Brainbow2.1 allele into the mouse genome. Using this approach, different vessels, such as the aorta, can be isolated and the patterns of clonal expansion can be visualized along with P-selectin expression. Clonal cells can also be isolated using FACS and the differences in epigenetic and transcriptional states can be analyzed.

Our results demonstrated that P-selectin protein expression correlates with the steady-state levels of hnRNA and mRNA. A relationship between protein level and methylation status was also observed (high protein expression with low promoter methylation and vice versa). Importantly, promoter reporter experiments revealed a direct link between methylation and transcription.

To provide more evidence for a relationship between DNA methylation and transcription, we will use antibodies against RNA polymerase II to immunoprecipitate transcriptionally active chromatin fragments in basal, early passage HUVEC, and carry out bisulfite sequencing on the pol II-bound and -unbound chromatin fractions. We expect pol II to preferentially bind to hypomethylated regions of the P-selectin proximal promoter. Bisulfite analysis of the four proximal promoter CpGs should reveal an enrichment of alleles with 0 or 1 sites methylated in the pol II-bound fractions.

In our clonal studies, we have shown that differences in methylation are heritable through mitotic cell division. An interesting follow-up experiment will be to use the method of hairpin bisulfite sequencing developed by Laird et al.302. This method uses hairpin ligation to covalently join the complementary strands of individual DNA molecules, so that the methylation state of both strands at each CpG can be read from the same molecule. Using the same approach as Yuan et al.42, we can then use these data to estimate the rates of methylation maintenance, loss, and de novo methylation based on the mathematical model of dynamic DNA methylation proposed by Pfeifer et al.303.

We have provided compelling evidence that P-selectin is monoallelically expressed in HUVEC. Further work will need to be done to address whether there are differences in methylation on the two alleles. We will perform ChIP experiments using H3K36me3 and H3K27me3 antibodies and perform bisulfite analysis on the immunoprecipitated DNA. This will give us insight into whether methylation plays a role in allelic expression bias. In this work, H3K36me3 and H3K27me3 have been used as proxies to identify the active and inactive alleles of P-selectin. It is likely that other histone modifications are also involved in regulating P-selectin transcription. For example, in mammalian cells, active or primed enhancers are commonly marked by H3K4me1304. The P-selectin gene in HUVEC showed enrichment of this histone mark in several distinct regions within the promoter and gene body. A recent study has proposed a “seesaw” mechanism whereby the activity of H3K4me1 and H3K4me3 is controlled by DNA methylation305. H3K4me3 is a marker of gene promoters and the authors described a model in which a genomic region may only be enriched for one of these three marks (DNA methylation, H3K4me3, or H3K4me1). The switching from H3K4me1 to H3K4me3 at the promoter facilitates recruitment of RNA pol II and promotes transcriptional initiation through binding of TFIID. It is possible that this DNA methylation-dependent switching of histone marks on the P-selectin alleles determines which allele is on and which is off.

As previously mentioned, differential replication timing has long been thought to have a role in regulating the cell- or tissue-specific expression of certain genes. However, we and others have shown that some genes display early S phase replication in both expressing and non-expressing cell types215. Moreover, asynchronous replication is a defining characteristic of monoallelically expressed genes. Future studies are needed to determine whether replication timing and/or asynchronous replication plays a role in heterogeneous P-selectin expression. This will be done using S-phase fractionation methods (BrdU labelling) and DNA FISH293. In these FISH experiments, DNA probes designed to hybridize to SELP will allow the replication status to be observed: a single-single dot signal (2 spots) will indicate unreplicated SELP DNA and a double-double dot signal (4 spots) will indicate replicated SELP. If P-selectin undergoes asynchronous replication, a single-double dot signal (3 spots) should be seen in early S phase fractions, which indicates that one allele has replicated and the other has not. It will also be possible to use informative SNPs to distinguish the parental alleles in the DNA and RNA isolated from each cell cycle fraction. This will help us determine whether allelic expression bias is linked to replication timing.
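The scoring scheme described above maps directly onto a small classification routine. The following Python sketch assumes that each scored nucleus has already been reduced to a count of SELP FISH spots. The labels, function names, and example counts are hypothetical and are included only to show how the proportion of single-double nuclei in a fraction could be tallied.

```python
from collections import Counter

# Spot count per nucleus -> replication status, following the scheme in the text.
PATTERNS = {2: "unreplicated", 3: "asynchronous", 4: "replicated"}


def classify_nucleus(spot_count):
    """Return the replication status implied by the number of FISH spots."""
    return PATTERNS.get(spot_count, "unscorable")


def asynchronous_fraction(spot_counts):
    """Tally all patterns and compute the single-double fraction among scorable nuclei."""
    tally = Counter(classify_nucleus(n) for n in spot_counts)
    scorable = sum(tally[label] for label in PATTERNS.values())
    fraction = tally["asynchronous"] / scorable if scorable else 0.0
    return tally, fraction


# Hypothetical spot counts from ten nuclei in one S-phase fraction.
example_counts = [2, 2, 3, 4, 3, 2, 4, 1, 3, 2]
tally, fraction = asynchronous_fraction(example_counts)
print(dict(tally), f"asynchronous fraction: {fraction:.2f}")
```

Nuclei with any other spot count are set aside as unscorable rather than forced into a category, so that hybridization failures do not inflate any of the three patterns.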

To further our understanding of the molecular basis for P-selectin-high versus P-selectin-low expressing cells, we will also transiently transfect promoter/reporter constructs into HUVEC 48 h before using fluorescence-labelled P-selectin primary antibodies and FACS to sort cells. This will allow us to compare episomal promoter activity in P-selectin-high versus P-selectin-low fractions. We do not anticipate that classical cis-trans pathways are dominant in establishing P-selectin heterogeneity in ECs and therefore expect promoter/reporter activity to be equivalent in the two subpopulations. It will also be interesting to see whether P-selectin promoter/reporter plasmids behave differently in non-expressing cell types. If we observe expression of the episomal reporter in cells such as HuAoVSMC, this would indicate the involvement of chromatin-based mechanisms in keeping native P-selectin off in these non-expressing cells.

3.7 Concluding Remarks

The work presented in this thesis provides the first in-depth analysis of the role of epigenetics in determining the heterogeneous expression of select constitutively expressed genes in endothelial cells. We believe that cell-to-cell variation contributes to the macroscopic heterogeneity observed between ECs of different vascular beds. P-selectin was found to be heterogeneously expressed in basal HUVEC at the level of both protein and RNA. This heterogeneity was not observed in other ECs, like HAEC and HMVECs. Variability in HUVEC P-selectin expression was attributed, in part, to differences in proximal promoter DNA methylation that were subsequently maintained across mitotic cell divisions. P-selectin was also found to be monoallelically expressed and this allelic regulation was likely mediated through histone modifications that differentially mark the two alleles. Importantly, mitotically stable maintenance of the active allele was observed. These findings are consistent with our hypothesis that in healthy individuals, disease-associated genes, like P-selectin, should be lowly and heterogeneously expressed, and a transition towards homogeneous expression (at either extreme) promotes pathologic development. We believe that functional heterogeneity at select genes allows seemingly identical populations of endothelial cells to differentially respond to the environment. This has important biological significance, not only when considering the survival of single cells, but also in relation to the health of a tissue as a whole. Within the lifespan of an individual, inheritance of epigenetic states at select genes will affect the differential survival of cell populations. Importantly, we argue that mitotic cell lineages within multicellular organisms can evolve through selection among cells with identical genotypes. This contrasts with cancers, where the survival and propagation of tumor cells can be largely attributed to the accumulation of driver mutations. Mitotic evolution at the cellular level helps to explain why certain regions of the vascular tree display different propensities for developing cardiovascular disease. Overall, our mechanistic studies are relevant to the study of EC gene expression in general and form the basis for understanding how endothelial phenotype is established over the lifetime of an organism.

References

1. Monahan-Earley R, Dvorak AM, Aird WC. Evolutionary origins of the blood vascular system and endothelium. J. Thromb. Haemost. 2013;11 Suppl 1:46-66 2. Aird WC. Phenotypic heterogeneity of the endothelium: I. Structure, function, and mechanisms. Circ. Res. 2007;100:158-173 3. Aird WC. Phenotypic heterogeneity of the endothelium: Ii. Representative vascular beds. Circ. Res. 2007;100:174-190 4. Michiels C. Endothelial cell functions. J. Cell. Physiol. 2003;196:430-443 5. Aird WC. Endothelial cell heterogeneity. Cold Spring Harb. Perspect. Med. 2012;2:a006429 6. Elowitz MB, Levine AJ, Siggia ED, Swain PS. Stochastic gene expression in a single cell. Science. 2002;297:1183-1186 7. Won D, Zhu SN, Chen M, Teichert AM, Fish JE, Matouk CC, Bonert M, Ojha M, Marsden PA, Cybulsky MI. Relative reduction of endothelial nitric-oxide synthase expression and transcription in atherosclerosis-prone regions of the mouse aorta and in an in vitro model of disturbed flow. Am. J. Pathol. 2007;171:1691-1704 8. Dueck H, Eberwine J, Kim J. Variation is function: Are single cell differences functionally important?: Testing the hypothesis that single cell variation is required for aggregate function. Bioessays. 2016;38:172-180 9. Chubb JR, Trcek T, Shenoy SM, Singer RH. Transcriptional pulsing of a developmental gene. Curr. Biol. 2006;16:1018-1025 10. Raj A, Peskin CS, Tranchina D, Vargas DY, Tyagi S. Stochastic mrna synthesis in mammalian cells. PLoS Biol. 2006;4:e309 11. Larsson AJM, Johnsson P, Hagemann-Jensen M, Hartmanis L, Faridani OR, Reinius B, Segerstolpe A, Rivera CM, Ren B, Sandberg R. Genomic encoding of transcriptional burst kinetics. Nature. 2019;565:251-254 12. Dar RD, Razooky BS, Singh A, Trimeloni TV, McCollum JM, Cox CD, Simpson ML, Weinberger LS. Transcriptional burst frequency and burst size are equally modulated across the human genome. Proceedings of the National Academy of Sciences. 2012;109:17454 13. Aird WC. Endothelium as an organ system. Crit. Care Med. 
2004;32:S271-279 14. Vestweber D. How leukocytes cross the vascular endothelium. Nat. Rev. Immunol. 2015;15:692-704 15. Carlos TM, Harlan JM. Leukocyte-endothelial adhesion molecules. Blood. 1994;84:2068- 2101 16. Bevilacqua MP, Pober JS, Mendrick DL, Cotran RS, Gimbrone MA, Jr. Identification of an inducible endothelial-leukocyte adhesion molecule. Proc. Natl. Acad. Sci. U. S. A. 1987;84:9238-9242


17. Bevilacqua MP, Stengelin S, Gimbrone MA, Jr., Seed B. Endothelial leukocyte adhesion molecule 1: An inducible receptor for neutrophils related to complement regulatory proteins and lectins. Science. 1989;243:1160-1165 18. Osborn L, Hession C, Tizard R, Vassallo C, Luhowskyj S, Chi-Rosso G, Lobb R. Direct expression cloning of vascular cell adhesion molecule 1, a cytokine-induced endothelial protein that binds to lymphocytes. Cell. 1989;59:1203-1211 19. Blankenberg S, Barbaux S, Tiret L. Adhesion molecules and atherosclerosis. Atherosclerosis. 2003;170:191-203 20. Frenette PS, Johnson RC, Hynes RO, Wagner DD. Platelets roll on stimulated endothelium in vivo: An interaction mediated by endothelial p-selectin. Proceedings of the National Academy of Sciences. 1995;92:7450 21. McEver RP, Beckstead JH, Moore KL, Marshall-Carlson L, Bainton DF. Gmp-140, a platelet alpha-granule membrane protein, is also synthesized by vascular endothelial cells and is localized in weibel-palade bodies. J. Clin. Invest. 1989;84:92-99 22. Granger DN, Senchenkova E. Inflammation and the microcirculation. San Rafael (CA): Morgan & Claypool Life Sciences. 2010 23. Kim MB, Sarelius IH. Role of shear forces and adhesion molecule distribution on p-selectin-mediated leukocyte rolling in postcapillary venules. American Journal of Physiology-Heart and Circulatory Physiology. 2004;287:H2705-H2711 24. Sheikh S, Rainger GE, Gale Z, Rahman M, Nash GB. Exposure to fluid shear stress modulates the ability of endothelial cells to recruit neutrophils in response to tumor necrosis factor-alpha: A basis for local variations in vascular sensitivity to inflammation. Blood. 2003;102:2828-2834 25. Dragt BS, van Agtmaal EL, de Laat B, Voorberg J. Effect of laminar shear stress on the distribution of weibel-palade bodies in endothelial cells. Thromb. Res. 2012;130:741-745 26. Hattori R, Hamilton KK, Fugate RD, McEver RP, Sims PJ. 
Stimulated secretion of endothelial von willebrand factor is accompanied by rapid redistribution to the cell surface of the intracellular granule membrane protein gmp-140. J. Biol. Chem. 1989;264:7768-7771 27. Geng JG, Bevilacqua MP, Moore KL, McIntyre TM, Prescott SM, Kim JM, Bliss GA, Zimmerman GA, McEver RP. Rapid neutrophil adhesion to activated endothelium mediated by gmp-140. Nature. 1990;343:757-760 28. Kameda H, Morita I, Handa M, Kaburaki J, Yoshida T, Mimori T, Murota S, Ikeda Y. Re-expression of functional p-selectin molecules on the endothelial cell surface by repeated stimulation with thrombin. Br. J. Haematol. 1997;97:348-355 29. Vestweber D, Blanks JE. Mechanisms that regulate the function of the selectins and their ligands. Physiol. Rev. 1999;79:181-213 30. Hartwell DW, Mayadas TN, Berger G, Frenette PS, Rayburn H, Hynes RO, Wagner DD. Role of p-selectin cytoplasmic domain in granular targeting in vivo and in early inflammatory responses. J. Cell Biol. 1998;143:1129-1141


31. Michelson AD, Barnard MR, Hechtman HB, MacGregor H, Connolly RJ, Loscalzo J, Valeri CR. In vivo tracking of platelets: Circulating degranulated platelets rapidly lose surface p-selectin but continue to circulate and function. Proc. Natl. Acad. Sci. U. S. A. 1996;93:11877-11882 32. Khew-Goodall Y, Butcher CM, Litwin MS, Newlands S, Korpelainen EI, Noack LM, Berndt MC, Lopez AF, Gamble JR, Vadas MA. Chronic expression of p-selectin on endothelial cells stimulated by the t-cell cytokine, interleukin-3. Blood. 1996;87:1432- 1438 33. Yao L, Pan J, Setiadi H, Patel KD, McEver RP. Interleukin 4 or oncostatin m induces a prolonged increase in p- selectin mrna and protein in human endothelial cells. The Journal of Experimental Medicine. 1996;184:81-92 34. Woltmann G, McNulty CA, Dewson G, Symon FA, Wardlaw AJ. Interleukin-13 induces psgl-1/p–selectin–dependent adhesion of eosinophils, but not neutrophils, to human umbilical vein endothelial cells under flow. Blood. 2000;95:3146 35. Rondaij MG, Bierings R, Kragt A, van Mourik JA, Voorberg J. Dynamics and plasticity of weibel-palade bodies in endothelial cells. Arterioscler. Thromb. Vasc. Biol. 2006;26:1002-1007 36. Wolff B, Burns AR, Middleton J, Rot A. Endothelial cell "memory" of inflammatory stimulation: Human venular endothelial cells store interleukin 8 in weibel-palade bodies. J. Exp. Med. 1998;188:1757-1762 37. Oynebraten I, Bakke O, Brandtzaeg P, Johansen FE, Haraldsen G. Rapid chemokine secretion from endothelial cells originates from 2 distinct compartments. Blood. 2004;104:314-320 38. Fiedler U, Scharpfenecker M, Koidl S, Hegen A, Grunow V, Schmidt JM, Kriz W, Thurston G, Augustin HG. The tie-2 ligand angiopoietin-2 is stored in and rapidly released upon stimulation from endothelial cell weibel-palade bodies. Blood. 2004;103:4150-4156 39. Weibel ER. Fifty years of weibel-palade bodies: The discovery and early history of an enigmatic organelle of endothelial cells. J. Thromb. Haemost. 2012;10:979-984 40. 
Gebrane-Younès J, Drouet L, Caen JP, Orcel L. Heterogeneous distribution of weibel-palade bodies and von willebrand factor along the porcine vascular tree. The American Journal of Pathology. 1991;139:1471-1484 41. Ochoa CD, Wu S, Stevens T. New developments in lung endothelial heterogeneity: Von willebrand factor, p-selectin, and the weibel-palade body. Semin. Thromb. Hemost. 2010;36:301-308 42. Yuan L, Chan GC, Beeler D, Janes L, Spokes KC, Dharaneeswaran H, Mojiri A, Adams WJ, Sciuto T, Garcia-Cardena G, Molema G, Kang PM, Jahroudi N, Marsden PA, Dvorak A, Regan ER, Aird WC. A role of stochastic phenotype switching in generating mosaic endothelial cell heterogeneity. Nature communications. 2016;7:10160 43. Pusztaszeri MP, Seelentag W, Bosman FT. Immunohistochemical expression of endothelial markers cd31, cd34, von willebrand factor, and fli-1 in normal human tissues. J. Histochem. Cytochem. 2006;54:385-395


44. Muller AM, Hermanns MI, Skrzynski C, Nesslinger M, Muller KM, Kirkpatrick CJ. Expression of the endothelial markers pecam-1, vwf, and cd34 in vivo and in vitro. Exp. Mol. Pathol. 2002;72:221-229 45. Cleator JH, Zhu WQ, Vaughan DE, Hamm HE. Differential regulation of endothelial exocytosis of p-selectin and von willebrand factor by protease-activated receptors and camp. Blood. 2006;107:2736-2744 46. McEver RP, Martin MN. A monoclonal antibody to a membrane glycoprotein binds only to activated platelets. J. Biol. Chem. 1984;259:9799-9804 47. Hsu-Lin S, Berman CL, Furie BC, August D, Furie B. A platelet membrane protein expressed during platelet activation and secretion. Studies using a monoclonal antibody specific for thrombin-activated platelets. J. Biol. Chem. 1984;259:9121-9126 48. Keelan ET, Licence ST, Peters AM, Binns RM, Haskard DO. Characterization of e- selectin expression in vivo with use of a radiolabeled monoclonal antibody. Am. J. Physiol. 1994;266:H278-290 49. Sako D, Chang XJ, Barone KM, Vachino G, White HM, Shaw G, Veldman GM, Bean KM, Ahern TJ, Furie B, et al. Expression cloning of a functional glycoprotein ligand for p-selectin. Cell. 1993;75:1179-1186 50. Hahne M, Jäger U, Isenmann S, Hallmann R, Vestweber D. Five tumor necrosis factor- inducible cell adhesion mechanisms on the surface of mouse endothelioma cells mediate the binding of leukocytes. The Journal of Cell Biology. 1993;121:655 51. Weller A, Isenmann S, Vestweber D. Cloning of the mouse endothelial selectins. Expression of both e- and p-selectin is inducible by tumor necrosis factor alpha. J. Biol. Chem. 1992;267:15176-15183 52. McEver RP, Moore KL, Cummings RD. Leukocyte trafficking mediated by selectin- carbohydrate interactions. J. Biol. Chem. 1995;270:11025-11028 53. Pan J, McEver RP. Characterization of the promoter for the human p-selectin gene. J. Biol. Chem. 1993;268:22600-22608 54. Pan J, McEver RP. 
Regulation of the human p-selectin promoter by bcl-3 and specific homodimeric members of the nf-kappa b/rel family. J. Biol. Chem. 1995;270:23077- 23083 55. Pan J, Xia L, McEver RP. Comparison of promoters for the murine and human p-selectin genes suggests species-specific and conserved mechanisms for transcriptional regulation in endothelial cells. J. Biol. Chem. 1998;273:10058-10067 56. Disdier M, Morrissey JH, Fugate RD, Bainton DF, McEver RP. Cytoplasmic domain of p-selectin (cd62) contains the signal for sorting into the regulated secretory pathway. Mol. Biol. Cell. 1992;3:309-321 57. Green SA, Setiadi H, McEver RP, Kelly RB. The cytoplasmic domain of p-selectin contains a sorting determinant that mediates rapid degradation in lysosomes. J. Cell Biol. 1994;124:435-448


58. Blagoveshchenskaya AD, Hewitt EW, Cutler DF. A balance of opposing signals within the cytoplasmic tail controls the lysosomal targeting of p-selectin. J. Biol. Chem. 1998;273:27896-27903 59. Subramaniam M, Koedam JA, Wagner DD. Divergent fates of p- and e-selectins after their expression on the plasma membrane. Mol. Biol. Cell. 1993;4:791-801 60. Koedam JA, Cramer EM, Briend E, Furie B, Furie BC, Wagner DD. P-selectin, a granule membrane protein of platelets and endothelial cells, follows the regulated secretory pathway in att-20 cells. The Journal of Cell Biology. 1992;116:617 61. Straley KS, Green SA. Rapid transport of internalized p-selectin to late endosomes and the tgn: Roles in regulating cell surface expression and recycling to secretory granules. The Journal of cell biology. 2000;151:107-116 62. McEver R. Regulation of expression of e-selectin and p-selectin. In: D vestweber (ed): The selectins – initiators of leukocyte endothelial adhesion. Harwood academic, amsterdam. 1997:31-48 63. Ross R. Atherosclerosis--an inflammatory disease. N. Engl. J. Med. 1999;340:115-126 64. Kaufmann BA, Carr CL, Belcik JT, Xie A, Yue Q, Chadderdon S, Caplan ES, Khangura J, Bullens S, Bunting S, Lindner JR. Molecular imaging of the initial inflammatory response in atherosclerosis: Implications for early detection of disease. Arterioscler. Thromb. Vasc. Biol. 2010;30:54-59 65. Collins RG, Velji R, Guevara NV, Hicks MJ, Chan L, Beaudet AL. P-selectin or intercellular adhesion molecule (icam)-1 deficiency substantially protects against atherosclerosis in apolipoprotein e-deficient mice. J. Exp. Med. 2000;191:189-194 66. Gurtner GC, Davis V, Li H, McCoy MJ, Sharpe A, Cybulsky MI. Targeted disruption of the murine vcam1 gene: Essential role of vcam-1 in chorioallantoic fusion and placentation. Genes Dev. 1995;9:1-14 67. Dong ZM, Chapman SM, Brown AA, Frenette PS, Hynes RO, Wagner DD. The combined role of p- and e-selectins in atherosclerosis. J. Clin. Invest. 
1998;102:145-152 68. Blann AD, Nadar SK, Lip GY. The adhesion molecule p-selectin and cardiovascular disease. Eur. Heart J. 2003;24:2166-2179 69. Zhang N, Liu Z, Yao L, Mehta-D'souza P, McEver RP. P-selectin expressed by a human selp transgene is atherogenic in apolipoprotein e-deficient mice. Arterioscler. Thromb. Vasc. Biol. 2016;36:1114-1121 70. Burger PC, Wagner DD. Platelet p-selectin facilitates atherosclerotic lesion development. Blood. 2003;101:2661-2666 71. Huo Y, Schober A, Forlow SB, Smith DF, Hyman MC, Jung S, Littman DR, Weber C, Ley K. Circulating activated platelets exacerbate atherosclerosis in mice deficient in apolipoprotein e. Nat. Med. 2003;9:61-67 72. Koyama H, Maeno T, Fukumoto S, Shoji T, Yamane T, Yokoyama H, Emoto M, Shoji T, Tahara H, Inaba M, Hino M, Shioi A, Miki T, Nishizawa Y. Platelet p-selectin expression is associated with atherosclerotic wall thickness in carotid artery in humans. Circulation. 2003;108:524-529


73. Fijnheer R, Frijns CJ, Korteweg J, Rommes H, Peters JH, Sixma JJ, Nieuwenhuis HK. The origin of p-selectin as a circulating plasma protein. Thromb. Haemost. 1997;77:1081- 1085 74. Blann AD, Lip GY. Hypothesis: Is soluble p-selectin a new marker of platelet activation? Atherosclerosis. 1997;128:135-138 75. Berger G, Hartwell DW, Wagner DD. P-selectin and platelet clearance. Blood. 1998;92:4446-4452 76. Ishiwata N, Takio K, Katayama M, Watanabe K, Titani K, Ikeda Y, Handa M. Alternatively spliced isoform of p-selectin is present in vivo as a soluble molecule. J. Biol. Chem. 1994;269:23708-23715 77. Jilma B, Fasching P, Ruthner C, Rumplmayr A, Ruzicka S, Kapiotis S, Wagner OF, Eichler HG. Elevated circulating p-selectin in insulin dependent diabetes mellitus. Thromb. Haemost. 1996;76:328-332 78. Chong BH, Murray B, Berndt MC, Dunlop LC, Brighton T, Chesterman CN. Plasma p- selectin is increased in thrombotic consumptive platelet disorders. Blood. 1994;83:1535- 1541 79. Blann AD, Faragher EB, McCollum CN. Increased soluble p-selectin following myocardial infarction: A new marker for the progression of atherosclerosis. Blood Coagul. Fibrinolysis. 1997;8:383-390 80. Panicker SR, Mehta-D'souza P, Zhang N, Klopocki AG, Shao B, McEver RP. Circulating soluble p-selectin must dimerize to promote inflammation and coagulation in mice. Blood. 2017;130:181-191 81. Etzioni A. Adhesion molecules-their role in health and disease. Pediatr. Res. 1996;39:191-198 82. Matsui NM, Borsig L, Rosen SD, Yaghmai M, Varki A, Embury SH. P-selectin mediates the adhesion of sickle erythrocytes to the endothelium. Blood. 2001;98:1955-1962 83. Ataga KI, Kutlar A, Kanter J, Liles D, Cancado R, Friedrisch J, Guthrie TH, Knight- Madden J, Alvarez OA, Gordeuk VR, Gualandro S, Colella MP, Smith WR, Rollins SA, Stocker JW, Rother RP. Crizanlizumab for the prevention of pain crises in sickle cell disease. N. Engl. J. Med. 2017;376:429-439 84. Polgar J, Matuskova J, Wagner DD. 
The p-selectin, tissue factor, coagulation triad. J. Thromb. Haemost. 2005;3:1590-1596 85. Cambien B, Wagner DD. A new role in hemostasis for the adhesion receptor p-selectin. Trends Mol. Med. 2004;10:179-186 86. Palabrica T, Lobb R, Furie BC, Aronovitz M, Benjamin C, Hsu YM, Sajer SA, Furie B. Leukocyte accumulation promoting fibrin deposition is mediated in vivo by p-selectin on adherent platelets. Nature. 1992;359:848-851 87. Subramaniam M, Frenette PS, Saffaripour S, Johnson RC, Hynes RO, Wagner DD. Defects in hemostasis in p-selectin-deficient mice. Blood. 1996;87:1238-1242


88. Mayadas TN, Johnson RC, Rayburn H, Hynes RO, Wagner DD. Leukocyte rolling and extravasation are severely compromised in p selectin-deficient mice. Cell. 1993;74:541- 554 89. Bullard DC, Kunkel EJ, Kubo H, Hicks MJ, Lorenzo I, Doyle NA, Doerschuk CM, Ley K, Beaudet AL. Infectious susceptibility and severe deficiency of leukocyte rolling and recruitment in e-selectin and p-selectin double mutant mice. J. Exp. Med. 1996;183:2329- 2336 90. Frenette PS, Mayadas TN, Rayburn H, Hynes RO, Wagner DD. Susceptibility to infection and altered hematopoiesis in mice deficient in both p- and e-selectins. Cell. 1996;84:563-574 91. Wein M, Sterbinsky SA, Bickel CA, Schleimer RP, Bochner BS. Comparison of human eosinophil and neutrophil ligands for p-selectin: Ligands for p-selectin differ from those for e-selectin. Am. J. Respir. Cell Mol. Biol. 1995;12:315-319 92. Symon FA, Walsh GM, Watson SR, Wardlaw AJ. Eosinophil adhesion to nasal polyp endothelium is p-selectin-dependent. J. Exp. Med. 1994;180:371-376 93. Jacobsen EA, Helmers RA, Lee JJ, Lee NA. The expanding role(s) of eosinophils in health and disease. Blood. 2012;120:3882-3890 94. Webster AL, Yan MS, Marsden PA. Epigenetics and cardiovascular disease. Can. J. Cardiol. 2013;29:46-57 95. Jaenisch R, Bird A. Epigenetic regulation of gene expression: How the genome integrates intrinsic and environmental signals. Nat. Genet. 2003;33 Suppl:245-254 96. Yan MS, Matouk CC, Marsden PA. Epigenetics of the vascular endothelium. Journal of applied physiology (Bethesda, Md. : 1985). 2010;109:916-926 97. The encode (encyclopedia of DNA elements) project. Science. 2004;306:636-640 98. Clamp M, Fry B, Kamal M, Xie X, Cuff J, Lin MF, Kellis M, Lindblad-Toh K, Lander ES. Distinguishing protein-coding and noncoding genes in the human genome. Proc. Natl. Acad. Sci. U. S. A. 2007;104:19428-19433 99. Casa V, Gabellini D. A repetitive elements perspective in polycomb epigenetics. Front Genet. 2012;3:199 100. 
Bennett EA, Coleman LE, Tsui C, Pittard WS, Devine SE. Natural genetic variation caused by transposable elements in humans. Genetics. 2004;168:933-951 101. Guttman M, Amit I, Garber M, French C, Lin MF, Feldser D, Huarte M, Zuk O, Carey BW, Cassady JP, Cabili MN, Jaenisch R, Mikkelsen TS, Jacks T, Hacohen N, Bernstein BE, Kellis M, Regev A, Rinn JL, Lander ES. Chromatin signature reveals over a thousand highly conserved large non-coding rnas in mammals. Nature. 2009;458:223-227 102. Rinn JL, Chang HY. Genome regulation by long noncoding rnas. Annu. Rev. Biochem. 2012;81:145-166 103. Man HSJ, Sukumar AN, Lam GC, Turgeon PJ, Yan MS, Ku KH, Dubinsky MK, Ho JJD, Wang JJ, Das S, Mitchell N, Oettgen P, Sefton MV, Marsden PA. Angiogenic patterning by steel, an endothelial-enriched long noncoding rna. Proc. Natl. Acad. Sci. U. S. A. 2018;115:2401-2406 104. Michalik KM, You X, Manavski Y, Doddaballapur A, Zornig M, Braun T, John D, Ponomareva Y, Chen W, Uchida S, Boon RA, Dimmeler S. Long noncoding rna malat1 regulates endothelial cell function and vessel growth. Circ. Res. 2014;114:1389-1397 105. Yu B, Wang S. Angio-lncrs: Lncrnas that regulate angiogenesis and vascular disease. Theranostics. 2018;8:3654-3675 106. Bird A. DNA methylation patterns and epigenetic memory. Genes Dev. 2002;16:6-21 107. Bird AP. Cpg-rich islands and the function of DNA methylation. Nature. 1986;321:209-213 108. Bird AP. DNA methylation and the frequency of cpg in animal DNA. Nucleic Acids Res. 1980;8:1499-1504 109. Saxonov S, Berg P, Brutlag DL. A genome-wide analysis of cpg dinucleotides in the human genome distinguishes two distinct classes of promoters. Proc. Natl. Acad. Sci. U. S. A. 2006;103:1412-1417 110. Norris DP, Brockdorff N, Rastan S. Methylation status of cpg-rich islands on active and inactive mouse x chromosomes. Mamm. Genome. 1991;1:78-83 111. Morison IM, Reeve AE. A catalogue of imprinted genes and parent-of-origin effects in humans and animals. Hum. Mol. Genet. 1998;7:1599-1609 112. Sproul D, Meehan RR. Genomic insights into cancer-associated aberrant cpg island hypermethylation. Briefings in functional genomics. 2013;12:174-190 113. Herranz M, Esteller M. Cpg island hypermethylation of tumor suppressor genes in human cancer. Boston, MA: Springer; 2005. 114. van Vlodrop IJ, Niessen HE, Derks S, Baldewijns MM, van Criekinge W, Herman JG, van Engeland M. Analysis of promoter cpg island hypermethylation in cancer: Location, location, location! Clin. Cancer. Res. 2011;17:4225-4231 115. Bae MG, Kim JY, Choi JK. Frequent hypermethylation of orphan cpg islands with enhancer activity in cancer. BMC Med. Genomics. 2016;9 Suppl 1:38 116. Yang AS, Estecio MR, Doshi K, Kondo Y, Tajara EH, Issa JP. 
A simple method for estimating global DNA methylation using bisulfite pcr of repetitive DNA elements. Nucleic Acids Res. 2004;32:e38 117. Yoder J, Walsh C, H. Bestor T. Cytosine methylation and the ecology of intragenomic parasites. Trends Genet. 1997;13:335-340 118. Larsen F, Gundersen G, Lopez R, Prydz H. Cpg islands as gene markers in the human genome. Genomics. 1992;13:1095-1107 119. Vavouri T, Lehner B. Human genes with cpg island promoters have a distinct transcription-associated chromatin organization. Genome Biol. 2012;13:R110-R110 120. Elango N, Yi SV. Functional relevance of cpg island length for regulation of gene expression. Genetics. 2011;187:1077


121. Han H, Cortez CC, Yang X, Nichols PW, Jones PA, Liang G. DNA methylation directly silences genes with non-cpg island promoters and establishes a nucleosome occupied promoter. Hum. Mol. Genet. 2011;20:4299-4310 122. He Y, Ecker JR. Non-cg methylation in the human genome. Annual review of genomics and human genetics. 2015;16:55-77 123. Kinde B, Gabel HW, Gilbert CS, Griffith EC, Greenberg ME. Reading the unique DNA methylation landscape of the brain: Non-cpg methylation, hydroxymethylation, and mecp2. Proc. Natl. Acad. Sci. U. S. A. 2015;112:6800-6806 124. Miranda TB, Jones PA. DNA methylation: The nuts and bolts of repression. J. Cell. Physiol. 2007;213:384-390 125. Li Z, Gu TP, Weber AR, Shen JZ, Li BZ, Xie ZG, Yin R, Guo F, Liu X, Tang F, Wang H, Schar P, Xu GL. Gadd45a promotes DNA demethylation through tdg. Nucleic Acids Res. 2015;43:3986-3997 126. Wu X, Zhang Y. Tet-mediated active DNA demethylation: Mechanism, function and beyond. Nature Reviews Genetics. 2017;18:517 127. Luger K, Mader AW, Richmond RK, Sargent DF, Richmond TJ. Crystal structure of the nucleosome core particle at 2.8 a resolution. Nature. 1997;389:251-260 128. Szerlong HJ, Hansen JC. Nucleosome distribution and linker DNA: Connecting nuclear function to dynamic chromatin structure. Biochemistry and cell biology = Biochimie et biologie cellulaire. 2011;89:24-34 129. Nizovtseva EV, Clauvelin N, Todolli S, Polikanov YS, Kulaeva OI, Wengrzynek S, Olson WK, Studitsky VM. Nucleosome-free DNA regions differentially affect distant communication in chromatin. Nucleic Acids Res. 2017;45:3059-3067 130. Jenuwein T, Allis CD. Translating the histone code. Science. 2001;293:1074-1080 131. Kouzarides T. Chromatin modifications and their function. Cell. 2007;128:693-705 132. Ogryzko VV, Schiltz RL, Russanova V, Howard BH, Nakatani Y. The transcriptional coactivators p300 and cbp are histone acetyltransferases. Cell. 1996;87:953-959 133. 
Cao R, Wang L, Wang H, Xia L, Erdjument-Bromage H, Tempst P, Jones RS, Zhang Y. Role of histone h3 lysine 27 methylation in polycomb-group silencing. Science. 2002;298:1039-1043 134. Strahl BD, Grant PA, Briggs SD, Sun Z-W, Bone JR, Caldwell JA, Mollah S, Cook RG, Shabanowitz J, Hunt DF, Allis CD. Set2 is a nucleosomal histone h3-selective methyltransferase that mediates transcriptional repression. Mol. Cell. Biol. 2002;22:1298 135. Butler JS, Dent SY. Chromatin 'resetting' during transcription elongation: A central role for methylated h3k36. Nat. Struct. Mol. Biol. 2012;19:863-864 136. Carrozza MJ, Li B, Florens L, Suganuma T, Swanson SK, Lee KK, Shia WJ, Anderson S, Yates J, Washburn MP, Workman JL. Histone h3 methylation by set2 directs deacetylation of coding regions by rpd3s to suppress spurious intragenic transcription. Cell. 2005;123:581-592


137. Li B, Gogol M, Carey M, Pattenden SG, Seidel C, Workman JL. Infrequently transcribed long genes depend on the set2/rpd3s pathway for accurate transcription. Genes Dev. 2007;21:1422-1430 138. Wang Z, Zang C, Cui K, Schones DE, Barski A, Peng W, Zhao K. Genome-wide mapping of hats and hdacs reveals distinct functions in active and inactive genes. Cell. 2009;138:1019-1031 139. Barski A, Cuddapah S, Cui K, Roh T-Y, Schones DE, Wang Z, Wei G, Chepelev I, Zhao K. High-resolution profiling of histone methylations in the human genome. Cell. 2007;129:823-837 140. Black JC, Van Rechem C, Whetstine JR. Histone lysine methylation dynamics: Establishment, regulation, and biological impact. Mol. Cell. 2012;48:491-507 141. Zhao Y, Garcia BA. Comprehensive catalog of currently documented histone modifications. Cold Spring Harb. Perspect. Biol. 2015;7:a025064 142. Henikoff S, Furuyama T, Ahmad K. Histone variants, nucleosome assembly and epigenetic inheritance. Trends Genet. 2004;20:320-326 143. Rose NR, Klose RJ. Understanding the relationship between DNA methylation and histone lysine methylation. Biochim. Biophys. Acta. 2014;1839:1362-1372 144. Esteve PO, Chin HG, Smallwood A, Feehery GR, Gangisetty O, Karpf AR, Carey MF, Pradhan S. Direct interaction between dnmt1 and g9a coordinates DNA and histone methylation during replication. Genes Dev. 2006;20:3089-3103 145. Fujita N, Watanabe S, Ichimura T, Tsuruzoe S, Shinkai Y, Tachibana M, Chiba T, Nakao M. Methyl-cpg binding domain 1 (mbd1) interacts with the suv39h1-hp1 heterochromatic complex for DNA methylation-based transcriptional repression. J. Biol. Chem. 2003;278:24132-24138 146. Chess A. Monoallelic gene expression in mammals. Annu. Rev. Genet. 2016;50:317-327 147. Gimelbrant A, Hutchinson JN, Thompson BR, Chess A. Widespread monoallelic expression on human autosomes. Science. 2007;318:1136-1140 148. Savova V, Chun S, Sohail M, McCole RB, Witwicki R, Gai L, Lenz TL, Wu CT, Sunyaev SR, Gimelbrant AA. 
Genes with monoallelic expression contribute disproportionately to genetic diversity in humans. Nat. Genet. 2016;48:231-237 149. Deng Q, Ramskold D, Reinius B, Sandberg R. Single-cell rna-seq reveals dynamic, random monoallelic gene expression in mammalian cells. Science. 2014;343:193-196 150. Pinter SF, Colognori D, Beliveau BJ. Allelic imbalance is a prevalent and tissue-specific feature of the mouse transcriptome. 2015;200:537-549 151. Savova V, Patsenker J, Vigneau S, Gimelbrant AA. Dbmae: The database of autosomal monoallelic expression. Nucleic Acids Res. 2016;44:D753-756 152. Savova V, Vinogradova S, Pruss D, Gimelbrant AA, Weiss LA. Risk alleles of genes with monoallelic expression are enriched in gain-of-function variants and depleted in loss-of-function variants for neurodevelopmental disorders. Mol. Psychiatry. 2017;22:1785-1794


153. Eulenberg-Gustavus C, Bähring S, Maass PG, Luft FC, Kettritz R. Gene silencing and a novel monoallelic expression pattern in distinct neutrophil subsets. The Journal of experimental medicine. 2017;214:2089-2101 154. Zhao D, Lin M, Pedrosa E, Lachman HM, Zheng D. Characteristics of allelic gene expression in human brain cells from single-cell rna-seq data analysis. BMC Genomics. 2017;18:860-860 155. Reinius B, Mold JE, Ramskold D, Deng Q, Johnsson P, Michaelsson J, Frisen J, Sandberg R. Analysis of allelic expression patterns in clonal somatic cells by single-cell rna-seq. Nat. Genet. 2016;48:1430-1435 156. Perez JD, Rubinstein ND, Fernandez DE, Santoro SW, Needleman LA, Ho-Shing O, Choi JJ, Zirlinger M, Chen S-K, Liu JS, Dulac C. Quantitative and functional interrogation of parent-of-origin allelic expression biases in the brain. eLife. 2015;4:e07860 157. Reinius B, Sandberg R. Random monoallelic expression of autosomal genes: Stochastic transcription and allele-level regulation. Nat. Rev. Genet. 2015;16:653-664 158. Lee JT, Bartolomei MS. X-inactivation, imprinting, and long noncoding rnas in health and disease. Cell. 2013;152:1308-1323 159. Disteche CM. Dosage compensation of the sex chromosomes. Annu. Rev. Genet. 2012;46:537-560 160. Conrad T, Akhtar A. Dosage compensation in drosophila melanogaster: Epigenetic fine- tuning of chromosome-wide transcription. Nat. Rev. Genet. 2012;13:123-134 161. Takagi N, Sasaki M. Preferential inactivation of the paternally derived x chromosome in the extraembryonic membranes of the mouse. Nature. 1975;256:640-642 162. Xue F, Tian XC, Du F, Kubota C, Taneja M, Dinnyes A, Dai Y, Levine H, Pereira LV, Yang X. Aberrant patterns of x chromosome inactivation in bovine clones. Nat. Genet. 2002;31:216-220 163. Zeng SM, Yankowitz J. X-inactivation patterns in human embryonic and extra- embryonic tissues. Placenta. 2003;24:270-275 164. Avner P, Heard E. X-chromosome inactivation: Counting, choice and initiation. Nat. Rev. Genet. 
2001;2:59-67 165. Penny GD, Kay GF, Sheardown SA, Rastan S, Brockdorff N. Requirement for xist in x chromosome inactivation. Nature. 1996;379:131-137 166. Lee JT, Davidow LS, Warshawsky D. Tsix, a gene antisense to xist at the x-inactivation centre. Nat. Genet. 1999;21:400-404 167. Sado T, Hoki Y, Sasaki H. Tsix silences xist through modification of chromatin structure. Dev. Cell. 2005;9:159-165 168. Simon MD, Pinter SF, Fang R, Sarma K, Rutenberg-Schoenberg M, Bowman SK, Kesner BA, Maier VK, Kingston RE, Lee JT. High-resolution xist binding maps reveal two-step spreading during x-chromosome inactivation. Nature. 2013;504:465-469


169. Migeon BR, Lee CH, Chowdhury AK, Carpenter H. Species differences in tsix/tsix reveal the roles of these genes in x-chromosome inactivation. Am. J. Hum. Genet. 2002;71:286-293 170. Vallot C, Patrat C, Collier AJ, Huret C, Casanova M, Liyakat Ali TM, Tosolini M, Frydman N, Heard E, Rugg-Gunn PJ, Rougeulle C. Xact noncoding rna competes with xist in the control of x chromosome activity during human early development. Cell stem cell. 2017;20:102-111 171. Yang F, Babak T, Shendure J, Disteche CM. Global survey of escape from x inactivation by rna-sequencing in mouse. Genome Res. 2010;20:614-622 172. Carrel L, Willard HF. X-inactivation profile reveals extensive variability in x-linked gene expression in females. Nature. 2005;434:400-404 173. Ferguson-Smith AC. Genomic imprinting: The emergence of an epigenetic paradigm. Nat. Rev. Genet. 2011;12:565-575 174. Bartolomei MS, Ferguson-Smith AC. Mammalian genomic imprinting. Cold Spring Harb. Perspect. Biol. 2011;3:a002592 175. Barlow DP, Bartolomei MS. Genomic imprinting in mammals. Cold Spring Harb. Perspect. Biol. 2014;6:a018382 176. Verona RI, Mann MR, Bartolomei MS. Genomic imprinting: Intricacies of epigenetic regulation in clusters. Annu. Rev. Cell. Dev. Biol. 2003;19:237-259 177. Ohlsson R, Flam F, Fisher R, Miller S, Cui H, Pfeifer S, Adam GI. Random monoallelic expression of the imprinted igf2 and h19 genes in the absence of discriminative parental marks. Dev. Genes Evol. 1999;209:113-119 178. Singh P, Lee D-H, Szabo P. More than insulator: Multiple roles of ctcf at the h19-igf2 imprinted domain. Frontiers in Genetics. 2012;3 179. Ideraabdullah FY, Vigneau S, Bartolomei MS. Genomic imprinting mechanisms in mammals. Mutat. Res. 2008;647:77-85 180. Sleutels F, Zwart R, Barlow DP. The non-coding air rna is required for silencing autosomal imprinted genes. Nature. 2002;415:810-813 181. Wutz A, Smrzka OW, Schweifer N, Schellander K, Wagner EF, Barlow DP. 
Imprinted expression of the igf2r gene depends on an intronic cpg island. Nature. 1997;389:745-749 182. Latos PA, Pauler FM, Koerner MV, Senergin HB, Hudson QJ, Stocsits RR, Allhoff W, Stricker SH, Klement RM, Warczok KE, Aumayr K, Pasierbek P, Barlow DP. Airn transcriptional overlap, but not its lncrna products, induces imprinted igf2r silencing. Science. 2012;338:1469-1472 183. Sleutels F, Tjon G, Ludwig T, Barlow DP. Imprinted silencing of slc22a2 and slc22a3 does not need transcriptional overlap between igf2r and air. EMBO J. 2003;22:3696-3704 184. Pernis B, Chiappino G, Kelus AS, Gell PG. Cellular localization of immunoglobulins with different allotypic specificities in rabbit lymphoid tissues. J. Exp. Med. 1965;122:853-876


185. Chess A, Simon I, Cedar H, Axel R. Allelic inactivation regulates olfactory receptor gene expression. Cell. 1994;78:823-834
186. Hesslein DG, Schatz DG. Factors and forces controlling v(d)j recombination. Adv. Immunol. 2001;78:169-232
187. Oettinger MA, Schatz DG, Gorka C, Baltimore D. Rag-1 and rag-2, adjacent genes that synergistically activate v(d)j recombination. Science. 1990;248:1517-1523
188. Levin-Klein R, Bergman Y. Epigenetic regulation of monoallelic rearrangement (allelic exclusion) of antigen receptor genes. Front. Immunol. 2014;5:625-625
189. Mostoslavsky R, Singh N, Tenzen T, Goldmit M, Gabay C, Elizur S, Qi P, Reubinoff BE, Chess A, Cedar H, Bergman Y. Asynchronous replication and allelic exclusion in the immune system. Nature. 2001;414:221-225
190. Malnic B, Godfrey PA, Buck LB. The human olfactory receptor gene family. Proc. Natl. Acad. Sci. U. S. A. 2004;101:2584
191. Malnic B, Hirono J, Sato T, Buck LB. Combinatorial receptor codes for odors. Cell. 1999;96:713-723
192. Mainland JD, Keller A, Li YR, Zhou T, Trimmer C, Snyder LL, Moberly AH, Adipietro KA, Liu WL, Zhuang H, Zhan S, Lee SS, Lin A, Matsunami H. The missense of smell: Functional variability in the human odorant receptor repertoire. Nat. Neurosci. 2014;17:114-120
193. Shykind BM, Rohani SC, O'Donnell S, Nemes A, Mendelsohn M, Sun Y, Axel R, Barnea G. Gene switching and the stability of odorant receptor gene choice. Cell. 2004;117:801-815
194. Nguyen MQ, Zhou Z, Marks CA, Ryba NJ, Belluscio L. Prominent roles for odorant receptor coding sequences in allelic exclusion. Cell. 2007;131:1009-1017
195. Singh N, Ebrahimi FA, Gimelbrant AA, Ensminger AW, Tackett MR, Qi P, Gribnau J, Chess A. Coordination of the random asynchronous replication of autosomal loci. Nat. Genet. 2003;33:339-341
196. Schmidt M, Migeon BR. Asynchronous replication of homologous loci on human active and inactive x chromosomes. Proc. Natl. Acad. Sci. U. S. A. 1990;87:3685-3689
197. Simon I, Tenzen T, Reubinoff BE, Hillman D, McCarrey JR, Cedar H. Asynchronous replication of imprinted genes is established in the gametes and maintained during development. Nature. 1999;401:929-932
198. Karnani N, Taylor C, Malhotra A, Dutta A. Pan-s replication patterns and chromosomal domains defined by genome-tiling arrays of encode genomic areas. Genome Res. 2007;17:865-876
199. Farkash-Amar S, Lipson D, Polten A, Goren A, Helmstetter C, Yakhini Z, Simon I. Global organization of replication time zones of the mouse genome. Genome Res. 2008;18:1562-1570
200. Nag A, Savova V, Fung HL, Miron A, Yuan GC, Zhang K, Gimelbrant AA. Chromatin signature of widespread monoallelic expression. eLife. 2013;2:e01256

201. Jeffries AR, Perfect LW, Ledderose J, Schalkwyk LC, Bray NJ, Mill J, Price J. Stochastic choice of allelic expression in human neural stem cells. Stem Cells. 2012;30:1938-1947
202. Eckersley-Maslin MA, Spector DL. Random monoallelic expression: Regulating gene expression one allele at a time. Trends Genet. 2014;30:237-244
203. Guo L, Hu-Li J, Paul WE. Probabilistic regulation in th2 cells accounts for monoallelic expression of il-4 and il-13. Immunity. 2005;23:89-99
204. Andres AM, Hubisz MJ, Indap A, Torgerson DG, Degenhardt JD, Boyko AR, Gutenkunst RN, White TJ, Green ED, Bustamante CD, Clark AG, Nielsen R. Targets of balancing selection in the human genome. Mol. Biol. Evol. 2009;26:2755-2764
205. Linnarsson S, Teichmann SA. Single-cell genomics: Coming of age. Genome Biol. 2016;17:97
206. Vanlandewijck M, He L, Mae MA, Andrae J, Ando K, Del Gaudio F, Nahar K, Lebouvier T, Lavina B, Gouveia L, Sun Y, Raschperger E, Rasanen M, Zarb Y, Mochizuki N, Keller A, Lendahl U, Betsholtz C. A molecular atlas of cell types and zonation in the brain vasculature. Nature. 2018;554:475-480
207. Zhao Q, Eichten A, Parveen A, Adler C, Huang Y, Wang W, Ding Y, Adler A, Nevins T, Ni M, Wei Y, Thurston G. Single-cell transcriptome analyses reveal endothelial cell heterogeneity in tumors and changes following antiangiogenic treatment. Cancer Res. 2018;78:2370-2382
208. Sun Z, Wang C-Y, Lawson DA, Kwek S, Velozo HG, Owyong M, Lai M-D, Fong L, Wilson M, Su H, Werb Z, Cooke DL. Single-cell rna sequencing reveals gene expression signatures of breast cancer-associated endothelial cells. Oncotarget. 2017;9:10945-10961
209. Paik DT, Tian L, Lee J, Sayed N, Chen IY, Rhee S, Rhee JW, Kim Y, Wirka RC, Buikema JW, Wu SM, Red-Horse K, Quertermous T, Wu JC. Large-scale single-cell rna-seq reveals molecular signatures of heterogeneous populations of human induced pluripotent stem cell-derived endothelial cells. Circ. Res. 2018;123:443-450
210. Stratigi K, Kapsetaki M, Aivaliotis M, Town T, Flavell RA, Spilianakis CG. Spatial proximity of homologous alleles and long noncoding rnas regulate a switch in allelic gene expression. Proc. Natl. Acad. Sci. U. S. A. 2015;112:E1577
211. Eckersley-Maslin MA, Thybert D, Bergmann JH, Marioni JC, Flicek P, Spector DL. Random monoallelic gene expression increases upon embryonic stem cell differentiation. Dev. Cell. 2014;28:351-365
212. Ku CJ, Lim KC, Kalantry S, Maillard I, Engel JD, Hosoya T. A monoallelic-to-biallelic t-cell transcriptional switch regulates gata3 abundance. Genes Dev. 2015;29:1930-1941
213. Sitnicka E. From the bone marrow to the thymus: The road map of early stages of t-cell development. Crit. Rev. Immunol. 2009;29:487-530
214. Ku CJ, Sekiguchi JM, Panwar B, Guan Y, Takahashi S, Yoh K, Maillard I, Hosoya T, Engel JD. Gata3 abundance is a critical determinant of t cell receptor beta allelic exclusion. Mol. Cell. Biol. 2017;37
215. Shirodkar AV, St Bernard R, Gavryushova A, Kop A, Knight BJ, Yan MS, Man HS, Sud M, Hebbel RP, Oettgen P, Aird WC, Marsden PA. A mechanistic role for DNA methylation in endothelial cell (ec)-enriched gene expression: Relationship with DNA replication timing. Blood. 2013;121:3531-3540
216. Chi JT, Chang HY, Haraldsen G, Jahnsen FL, Troyanskaya OG, Chang DS, Wang Z, Rockson SG, van de Rijn M, Botstein D, Brown PO. Endothelial cell diversity revealed by global expression profiling. Proc. Natl. Acad. Sci. U. S. A. 2003;100:10623-10628
217. Pacchierotti F, Spano M. Environmental impact on DNA methylation in the germline: State of the art and gaps of knowledge. Biomed Res. Int. 2015;2015:123484
218. Kelsey G, Feil R. New insights into establishment and maintenance of DNA methylation imprints in mammals. Philos. Trans. R. Soc. Lond. B Biol. Sci. 2013;368:20110336
219. Schulz EG, Heard E. Role and control of x chromosome dosage in mammalian development. Curr. Opin. Genet. Dev. 2013;23:109-115
220. Smith ZD, Meissner A. DNA methylation: Roles in mammalian development. Nat. Rev. Genet. 2013;14:204-220
221. Veyradier A. Von willebrand factor--a new target for ttp treatment? N. Engl. J. Med. 2016;374:583-585
222. Jaffe EA, Nachman RL, Becker CG, Minick CR. Culture of human endothelial cells derived from umbilical veins. Identification by morphologic and immunologic criteria. J. Clin. Invest. 1973;52:2745-2756
223. Chomczynski P, Sacchi N. The single-step method of rna isolation by acid guanidinium thiocyanate-phenol-chloroform extraction: Twenty-something years on. Nat. Protoc. 2006;1:581+
224. Chan Y, Fish JE, D'Abreo C, Lin S, Robb GB, Teichert AM, Karantzoulis-Fegaras F, Keightley A, Steer BM, Marsden PA. The cell-specific expression of endothelial nitric-oxide synthase: A role for DNA methylation. J. Biol. Chem. 2004;279:35087-35100
225. Xu RT, Zhou H, Liu WL, Wu W, Liu XY, Zhang WQ, Tan J, Zhao M. [construction and identification of human p-selectin promotor luciferase reporter gene vector]. Nan Fang Yi Ke Da Xue Xue Bao. 2016;36:332-338
226. Schindelin J, Arganda-Carreras I, Frise E, Kaynig V, Longair M, Pietzsch T, Preibisch S, Rueden C, Saalfeld S, Schmid B, Tinevez J-Y, White DJ, Hartenstein V, Eliceiri K, Tomancak P, Cardona A. Fiji: An open-source platform for biological-image analysis. Nat. Methods. 2012;9:676
227. Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, Batut P, Chaisson M, Gingeras TR. Star: Ultrafast universal rna-seq aligner. Bioinformatics. 2013;29:15-21
228. Lun AT, Bach K, Marioni JC. Pooling across cells to normalize single-cell rna sequencing data with many zero counts. Genome Biol. 2016;17:75
229. McCarthy DJ, Campbell KR, Lun AT, Wills QF. Scater: Pre-processing, quality control, normalization and visualization of single-cell rna-seq data in r. Bioinformatics. 2017;33:1179-1186
230. Krijthe J, van der Maaten L, Krijthe MJ. Package ‘rtsne’. GitHub https://github.com/jkrijthe/Rtsne. 2017

231. Kiselev VY, Kirschner K, Schaub MT, Andrews T. Sc3: Consensus clustering of single-cell rna-seq data. 2017;14:483-486
232. Robinson MD, McCarthy DJ, Smyth GK. Edger: A bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010;26:139-140
233. Satija R. Seurat: R toolkit for single cell genomics.
234. Butler A, Hoffman P, Smibert P, Papalexi E, Satija R. Integrating single-cell transcriptomic data across different conditions, technologies, and species. 2018;36:411-420
235. Stacklies W, Redestig H, Scholz M, Walther D, Selbig J. Pcamethods--a bioconductor package providing pca methods for incomplete data. Bioinformatics. 2007;23:1164-1167
236. Ilicic T, Kim JK, Kolodziejczyk AA, Bagger FO, McCarthy DJ, Marioni JC, Teichmann SA. Classification of low quality cells from single-cell rna-seq data. Genome Biol. 2016;17:29
237. Wold S, Esbensen K, Geladi P. Principal component analysis. Chemometrics Intellig. Lab. Syst. 1987;2:37–52
238. Maaten LVD, Hinton G. Visualizing data using t-sne. J. Mach. Learn. Res. 2008;9:2579–2605
239. Pan J, Dinh TT, Rajaraman A, Lee M, Scholz A, Czupalla CJ, Kiefel H, Zhu L, Xia L, Morser J, Jiang H, Santambrogio L, Butcher EC. Patterns of expression of factor viii and von willebrand factor by endothelial cell subsets in vivo. Blood. 2016;128:104-109
240. Kanwar S, Woodman RC, Poon MC, Murohara T, Lefer AM, Davenpeck KL, Kubes P. Desmopressin induces endothelial p-selectin expression and leukocyte rolling in postcapillary venules. Blood. 1995;86:2760-2766
241. Sanz MJ, Johnston B, Issekutz A, Kubes P. Endothelin-1 causes p-selectin-dependent leukocyte rolling and adhesion within rat mesenteric microvessels. Am. J. Physiol. 1999;277:H1823-1830
242. Collins PW, Macey MG, Cahill MR, Newland AC. Von willebrand factor release and p-selectin expression is stimulated by thrombin and trypsin but not il-1 in cultured human endothelial cells. Thromb. Haemost. 1993;70:346-350
243. Brown MA, Wallace CS, Anamelechi CC, Clermont E, Reichert WM, Truskey GA. The use of mild trypsinization conditions in the detachment of endothelial cells to promote subsequent endothelialization on synthetic surfaces. Biomaterials. 2007;28:3928-3935
244. Gräbner R, Till U, Heller R. Flow cytometric determination of e-selectin, vascular cell adhesion molecule-1, and intercellular cell adhesion molecule-1 in formaldehyde-fixed endothelial cell monolayers. Cytometry. 2000;40:238-244
245. Gardiner-Garden M, Frommer M. Cpg islands in vertebrate genomes. J. Mol. Biol. 1987;196:261-282
246. Illingworth RS, Bird AP. Cpg islands--'a rough guide'. FEBS Lett. 2009;583:1713-1720
247. Rice P, Longden I, Bleasby A. Emboss: The european molecular biology open software suite. Trends Genet. 2000;16:276-277

248. Takai D, Jones PA. Comprehensive analysis of cpg islands in human chromosomes 21 and 22. Proc. Natl. Acad. Sci. U. S. A. 2002;99:3740-3745
249. Sherry ST, Ward MH, Kholodov M, Baker J, Phan L, Smigielski EM, Sirotkin K. Dbsnp: The ncbi database of genetic variation. Nucleic Acids Res. 2001;29:308-311
250. A user's guide to the encyclopedia of DNA elements (encode). PLoS Biol. 2011;9:e1001046
251. Kent WJ, Sugnet CW, Furey TS, Roskin KM, Pringle TH, Zahler AM, Haussler D. The human genome browser at ucsc. Genome Res. 2002;12:996-1006
252. Haeussler M, Zweig AS, Tyner C, Speir ML, Rosenbloom KR, Raney BJ, Lee CM, Lee BT, Hinrichs AS, Gonzalez JN, Gibson D, Diekhans M, Clawson H, Casper J, Barber GP, Haussler D, Kuhn RM, Kent WJ. The ucsc genome browser database: 2019 update. Nucleic Acids Res. 2018
253. Cameron EE, Bachman KE, Myohanen S, Herman JG, Baylin SB. Synergy of demethylation and histone deacetylase inhibition in the re-expression of genes silenced in cancer. Nat. Genet. 1999;21:103-107
254. McClure WR. Mechanism and control of transcription initiation in prokaryotes. Annu. Rev. Biochem. 1985;54:171-204
255. Wang JC. Interactions between twisted dnas and enzymes: The effects of superhelical turns. J. Mol. Biol. 1974;87:797-816
256. Liu Z, Miner JJ, Yago T, Yao L, Lupu F, Xia L, McEver RP. Differential regulation of human and murine p-selectin expression and function in vivo. J. Exp. Med. 2010;207:2975-2987
257. Yao L, Setiadi H, Xia L, Laszik Z, Taylor FB, McEver RP. Divergent inducible expression of p-selectin and e-selectin in mice and primates. Blood. 1999;94:3820-3828
258. Chen X, Cheng Z, Werling D, Pollott GE, Salavati M, Johnson KF, Khan FA, Wathes DC, Zhang S. Bovine p-selectin mediates leukocyte adhesion and is highly polymorphic in dairy breeds. Res. Vet. Sci. 2016;108:85-92
259. Mikkelsen TS, Ku M, Jaffe DB, Issac B, Lieberman E, Giannoukos G, Alvarez P, Brockman W, Kim TK, Koche RP, Lee W, Mendenhall E, O'Donovan A, Presser A, Russ C, Xie X, Meissner A, Wernig M, Jaenisch R, Nusbaum C, Lander ES, Bernstein BE. Genome-wide maps of chromatin state in pluripotent and lineage-committed cells. Nature. 2007;448:553-560
260. Bernstein BE, Mikkelsen TS, Xie X, Kamal M, Huebert DJ, Cuff J, Fry B, Meissner A, Wernig M, Plath K, Jaenisch R, Wagschal A, Feil R, Schreiber SL, Lander ES. A bivalent chromatin structure marks key developmental genes in embryonic stem cells. Cell. 2006;125:315-326
261. Matsumura Y, Nakaki R, Inagaki T, Yoshida A, Kano Y, Kimura H, Tanaka T, Tsutsumi S, Nakao M, Doi T, Fukami K, Osborne TF, Kodama T, Aburatani H, Sakai J. H3k4/h3k9me3 bivalent chromatin domains targeted by lineage-specific DNA methylation pauses adipocyte differentiation. Mol. Cell. 2015;60:584-596

262. Ernst J, Kheradpour P, Mikkelsen TS, Shoresh N, Ward LD, Epstein CB, Zhang X, Wang L, Issner R, Coyne M, Ku M, Durham T, Kellis M, Bernstein BE. Mapping and analysis of chromatin state dynamics in nine human cell types. Nature. 2011;473:43-49
263. Yan MS, Turgeon PJ, Man HJ, Dubinsky MK, Ho JJD, El-Rass S, Wang YD, Wen XY, Marsden PA. Histone acetyltransferase 7 (kat7)-dependent intragenic histone acetylation regulates endothelial cell gene regulation. J. Biol. Chem. 2018;293:4381-4402
264. Mauser R, Kungulovski G, Keup C, Reinhardt R, Jeltsch A. Application of dual reading domains as novel reagents in chromatin biology reveals a new h3k9me3 and h3k36me2/3 bivalent chromatin state. Epigenetics & Chromatin. 2017;10:45
265. Kolasinska-Zwierz P, Down T, Latorre I, Liu T, Liu XS, Ahringer J. Differential chromatin marking of introns and expressed exons by h3k36me3. Nat. Genet. 2009;41:376-381
266. Saint-Andre V, Batsche E, Rachez C, Muchardt C. Histone h3 lysine 9 trimethylation and hp1gamma favor inclusion of alternative exons. Nat. Struct. Mol. Biol. 2011;18:337-344
267. Savol AJ, Wang PI, Jeon Y, Colognori D, Yildirim E, Pinter SF. Genome-wide identification of autosomal genes with allelic imbalance of chromatin state. 2017;12:e0182568
268. Savova V, Vigneau S, Gimelbrant AA. Autosomal monoallelic expression: Genetics of epigenetic diversity? Curr. Opin. Genet. Dev. 2013;23:642-648
269. The 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature. 2015;526:68-74
270. Zerbino DR, Achuthan P, Akanni W, Amode MR, Barrell D, Bhai J, Billis K, Cummins C, Gall A, Girón CG, Gil L, Gordon L, Haggerty L, Haskell E, Hourlier T, Izuogu OG, Janacek SH, Juettemann T, To JK, Laird MR, Lavidas I, Liu Z, Loveland JE, Maurel T, McLaren W, Moore B, Mudge J, Murphy DN, Newman V, Nuhn M, Ogeh D, Ong CK, Parker A, Patricio M, Riat HS, Schuilenburg H, Sheppard D, Sparrow H, Taylor K, Thormann A, Vullo A, Walts B, Zadissa A, Frankish A, Hunt SE, Kostadima M, Langridge N, Martin FJ, Muffato M, Perry E, Ruffier M, Staines DM, Trevanion SJ, Aken BL, Cunningham F, Yates A, Flicek P. Ensembl 2018. Nucleic Acids Res. 2018;46:D754-D761
271. Nicol JW, Helt GA, Blanchard SG, Jr., Raja A, Loraine AE. The integrated genome browser: Free software for distribution and exploration of genome-scale datasets. Bioinformatics. 2009;25:2730-2731
272. Ingram DA, Mead LE, Tanaka H, Meade V, Fenoglio A, Mortell K, Pollok K, Ferkowicz MJ, Gilley D, Yoder MC. Identification of a novel hierarchy of endothelial progenitor cells using human peripheral and umbilical cord blood. Blood. 2004;104:2752-2760
273. Ingram DA, Mead LE, Moore DB, Woodard W, Fenoglio A, Yoder MC. Vessel wall-derived endothelial cells rapidly proliferate because they contain a complete hierarchy of endothelial progenitor cells. Blood. 2005;105:2783-2786
274. Yoder MC. Human endothelial progenitor cells. Cold Spring Harb. Perspect. Med. 2012;2:a006692

275. Hayflick L, Moorhead PS. The serial cultivation of human diploid cell strains. Exp. Cell Res. 1961;25:585-621
276. Hayflick L. The cell biology of aging. Clin. Geriatr. Med. 1985;1:15-27
277. Sánchez-Romero MA, Casadesús J. Contribution of phenotypic heterogeneity to adaptive antibiotic resistance. Proc. Natl. Acad. Sci. U. S. A. 2014;111:355
278. Bódi Z, Farkas Z, Nevozhay D, Kalapis D, Lázár V, Csörgő B, Nyerges Á, Szamecz B, Fekete G, Papp B, Araújo H, Oliveira JL, Moura G, Santos MAS, Székely Jr T, Balázsi G, Pál C. Phenotypic heterogeneity promotes adaptive evolution. PLoS Biol. 2017;15:e2000644
279. McGranahan N, Swanton C. Clonal heterogeneity and tumor evolution: Past, present, and the future. Cell. 2017;168:613-628
280. Vilkaitis G, Suetake I, Klimasauskas S, Tajima S. Processive methylation of hemimethylated cpg sites by mouse dnmt1 DNA methyltransferase. J. Biol. Chem. 2005;280:64-72
281. Chen T, Ueda Y, Dodge JE, Wang Z, Li E. Establishment and maintenance of genomic methylation patterns in mouse embryonic stem cells by dnmt3a and dnmt3b. Mol. Cell. Biol. 2003;23:5594
282. Ushijima T, Watanabe N, Okochi E, Kaneda A, Sugimura T, Miyamoto K. Fidelity of the methylation pattern and its variation in the genome. Genome Res. 2003;13:868-874
283. Lövkvist C, Dodd IB, Sneppen K, Haerter JO. DNA methylation in human epigenomes depends on local topology of cpg sites. Nucleic Acids Res. 2016;44:5123-5132
284. Seki Y, Hayashi K, Itoh K, Mizugaki M, Saitou M, Matsui Y. Extensive and orderly reprogramming of genome-wide chromatin modifications associated with specification and early development of germ cells in mice. Dev. Biol. 2005;278:440-458
285. Surani MA, Hayashi K, Hajkova P. Genetic and epigenetic regulators of pluripotency. Cell. 2007;128:747-762
286. Burr S, Caldwell A, Chong M, Beretta M, Metcalf S, Hancock M, Arno M, Balu S, Kropf VL, Mistry RK, Shah AM, Mann GE, Brewer AC. Oxygen gradients can determine epigenetic asymmetry and cellular differentiation via differential regulation of tet activity in embryonic stem cells. Nucleic Acids Res. 2018;46:1210-1226
287. Madugundu GS, Cadet J, Wagner JR. Hydroxyl-radical-induced oxidation of 5-methylcytosine in isolated and cellular DNA. Nucleic Acids Res. 2014;42:7450-7460
288. Branco MR, Ficz G, Reik W. Uncovering the role of 5-hydroxymethylcytosine in the epigenome. Nat. Rev. Genet. 2011;13:7-13
289. Reveron-Gomez N, Gonzalez-Aguilera C, Stewart-Morgan KR, Petryk N, Flury V, Graziano S, Johansen JV, Jakobsen JS, Alabert C, Groth A. Accurate recycling of parental histones reproduces the histone modification landscape during DNA replication. Mol. Cell. 2018;72:239-249.e235
290. Kitsberg D, Selig S, Brandeis M, Simon I, Keshet I, Driscoll DJ, Nicholls RD, Cedar H. Allele-specific replication timing of imprinted gene regions. Nature. 1993;364:459-463

291. Takagi N. Differentiation of x chromosomes in early female mouse embryos. Exp. Cell Res. 1974;86:127-135
292. Gendrel AV, Attia M, Chen CJ, Diabangouaya P, Servant N, Barillot E, Heard E. Developmental dynamics and disease potential of random monoallelic gene expression. Dev. Cell. 2014;28:366-380
293. Selig S, Okumura K, Ward DC, Cedar H. Delineation of DNA replication time zones by fluorescence in situ hybridization. EMBO J. 1992;11:1217-1225
294. Suter DM, Molina N, Gatfield D, Schneider K, Schibler U, Naef F. Mammalian genes are transcribed with widely different bursting kinetics. Science. 2011;332:472
295. Nicolas D, Phillips NE, Naef F. What shapes eukaryotic transcriptional bursting? Mol. Biosyst. 2017;13:1280-1290
296. Raj A, van Oudenaarden A. Nature, nurture, or chance: Stochastic gene expression and its consequences. Cell. 2008;135:216-226
297. Sanchez A, Golding I. Genetic determinants and cellular constraints in noisy gene expression. Science. 2013;342:1188-1193
298. Bacher R, Kendziorski C. Design and computational analysis of single-cell rna-sequencing experiments. Genome Biol. 2016;17:63
299. Livet J, Weissman TA, Kang H, Draft RW, Lu J, Bennis RA, Sanes JR, Lichtman JW. Transgenic strategies for combinatorial expression of fluorescent proteins in the nervous system. Nature. 2007;450:56
300. Snippert HJ, van der Flier LG, Sato T, van Es JH, van den Born M, Kroon-Veenboer C, Barker N, Klein AM, van Rheenen J, Simons BD, Clevers H. Intestinal crypt homeostasis results from neutral competition between symmetrically dividing lgr5 stem cells. Cell. 2010;143:134-144
301. Corey DM, Rinkevich Y, Weissman IL. Dynamic patterns of clonal evolution in tumor vasculature underlie alterations in lymphocyte-endothelial recognition to foster tumor immune escape. Cancer Res. 2016;76:1348-1353
302. Laird CD, Pleasant ND, Clark AD, Sneeden JL, Hassan KM, Manley NC, Vary JC, Jr., Morgan T, Hansen RS, Stoger R. Hairpin-bisulfite pcr: Assessing epigenetic methylation patterns on complementary strands of individual DNA molecules. Proc. Natl. Acad. Sci. U. S. A. 2004;101:204-209
303. Pfeifer GP, Steigerwald SD, Hansen RS, Gartler SM, Riggs AD. Polymerase chain reaction-aided genomic sequencing of an x chromosome-linked cpg island: Methylation patterns suggest clonal inheritance, cpg site autonomy, and an explanation of activity state stability. Proc. Natl. Acad. Sci. U. S. A. 1990;87:8252-8256
304. Banerji J, Olson L, Schaffner W. A lymphocyte-specific cellular enhancer is located downstream of the joining region in immunoglobulin heavy chain genes. Cell. 1983;33:729-740
305. Sharifi-Zarchi A, Gerovska D, Adachi K, Totonchi M, Pezeshk H, Taft RJ, Scholer HR, Chitsaz H, Sadeghi M, Baharvand H, Arauzo-Bravo MJ. DNA methylation regulates discrimination of enhancers from promoters through a h3k4me1-h3k4me3 seesaw mechanism. 2017;18:964
