Novel Single Cell Methods to Identify the Genetic Composition at a Single Nuclear Structure

by

David Roger Philip Anchel

A thesis submitted in conformity with the requirements for the degree of Doctorate of Philosophy Graduate Department of Biochemistry, in the University of Toronto © Copyright by David Roger Philip Anchel 2015

Abstract

David Anchel Novel Single Cell Methods to Identify the Genetic Composition at a Single Nuclear Structure Doctor of Philosophy, Department of Biochemistry, University of Toronto, 2015.

Reported here are two novel techniques that allow for the identification of DNA obtained from a nanoscale region of a single cell nucleus based on either direct DNA dissection with high-precision nano-tools operated inside a scanning electron microscope, or the ligation of oligo probes to subnuclear sites of double-strand breaks induced by 2-photon irradiation. It is demonstrated that these techniques afford the ability to identify loci occupying a shared site at a single nuclear substructure, and are thus uniquely suited for identifying specific sets of loci that converge non-randomly at a shared nuclear body (NB; the "gene hub" model of nuclear organization). The techniques are applied towards determining DNA in the vicinity of single PML nuclear bodies and yield novel loci that show significant frequency of association with PML bodies in a population of cells. As the "gene hub" model may be a general feature of nuclear organization, these techniques, applied to other NBs, may reveal specific sets of convergent loci that reflect novel gene co-regulatory relationships.


To Dad, of course.


Acknowledgements

To Professor David Bazett-Jones, my supervisor, without whom I would not be writing this, my sincere gratitude. Of course I take your scientific guidance for granted; the first half of this thesis is entirely due to your idea for direct dissection under an SEM that led to the Sun lab collaboration at a time when my original approach was flagging. But I especially appreciate the lucky break of having someone with your foresight and humour to let me continue with a side project against all common sense. Over the years when I have several times been on the brink, you reined me in with reminders of how, considering the lunacy of the project, my sheer panic was perfectly natural. And even through the most uncertain times in our work, the pressure you applied to keep me on track was an accident of your genuine enthusiasm for the science. I always wondered how you kept up the cheer even with all the uncertainty of experiments and the continuous pressures of grant proposals and administrative duties. As the real world is coming fast, I realize in the end it comes from what you reminded me of so many times: what a privilege it is to tinker with ideas for their own end. Oh, and one more thing, get the bloody job done. Your advice and example have never made more sense, and now I see there's no good reason not to suffer through another idea all over again.

To Professor Yu Sun, my collaborator, I'm grateful for your enthusiasm and pressure throughout our project. Apart from the technological expertise on which half of this thesis largely relies, it also owes its completion to those progress meetings with you that kept Brandon and me on our toes and mindful of our responsibility to get the job done. We had the pressure of these upcoming discussions to thank for keeping us on task on those late night dissections, and keeping us from sitting back on those afternoons that followed when, too often, the negative results came in. And towards the end, it was all the more satisfying to share in the success with you.

To Professors Shana Kelley and Christopher Yip, my committee members, I sincerely appreciate your patience through it all. The fear of an upcoming meeting with you greatly focused the work, and forced me to stare the data in the face. Believe it or not, this thesis would have been an even longer time coming had it not been for your well-intentioned criticism and pressure. Through the mess of data I presented to you at these meetings, you always remained engaged, and your ready advice kept a coherence to the direction of the project.

Also, accidentally, in watching the efficiency with which you handle your responsibilities, I saw what I desperately need for scientific success. I hope I can begin to imitate those habits for what lies ahead.


Table of Contents

TABLE OF ABBREVIATIONS

CHAPTER 1: INTRODUCTION

1.1 THE COMPARTMENTALIZED NUCLEUS

1.2 GENOME-WIDE CHROMATIN ORGANIZATION

1.3 NUCLEAR BODIES: NUCLEOLUS, PML BODIES, CAJAL BODIES, HISTONE LOCUS BODIES, TRANSCRIPTION FACTORIES
1.3.1 Nucleolus
1.3.2 Promyelocytic Leukemia (PML) Bodies
1.3.3 Cajal Bodies
1.3.4 Histone Locus Bodies (HLBs)
1.3.5 Transcription Factories

1.4 INTERPLAY OF NUCLEAR SUBSTRUCTURES WITH CHROMATIN AND IMPLICATIONS FOR GENE REGULATION: THE "GENE HUB" MODEL
1.4.1 The formation of specific chromatin loops: the active chromatin hub (ACH)
1.4.2 The convergence of at preferred nuclear sites
1.4.3 Active gene loci "shuttling" mechanisms

1.5 PML BODIES AS "GENE HUBS"

1.6 SPECIFIC CONVERGENCE OF GENES AT "UNIQUE" SUBTYPES OF PML BODIES

1.7 DETERMINING THE IN SITU LOCATION OF GENE LOCI WITH RESPECT TO NUCLEAR SUBSTRUCTURES: CHIP AND IMMUNO-FISH
1.7.1 Immuno-FISH
1.7.2 ChIP
1.7.3 DamID
1.7.4 Fractionization of Nuclei for Specific Substructures and Associated Chromatin

1.8 DEVELOPMENT OF NOVEL TECHNIQUES TO DETERMINE DNA AT SINGLE NUCLEAR SUBSTRUCTURES

CHAPTER 2: NANO-DISSECTION AND SEQUENCING OF DNA FROM SINGLE SUB-NUCLEAR STRUCTURES

2.1 INTRODUCTION

2.2 RESULTS
2.2.1 Outline of DNA nano-dissection technique
2.2.2 SEM Imaging, DNA Integrity, and Nano-Manipulation
2.2.3 Enrichment of Expected Sub-Chromosomal Regions from a Single Nano-Dissected Nuclear Body
2.3.4 Nano-Dissection of Single PML NBs Identifies Novel Loci-Body Associations

2.3 DISCUSSION

2.4 MATERIALS AND METHODS

CHAPTER 3: A NOVEL LIGHT MICROSCOPY-BASED METHOD TO DETERMINE DNA SEQUENCES AT SINGLE NUCLEAR SUB-STRUCTURES

3.1 INTRODUCTION
3.1.1 General Description of Technique
3.1.3 Targeted 2-photon DSB induction at single-nuclear substructures

3.2 TROUBLESHOOTING
3.2.1 Decreasing time between initial image scan and 2-photon targeting
3.3.2 Alignment of focal plane between laser lines Calibration of targeting offset
3.2.4 Reliable isolation of single targeted cells from coverslip into lysis buffer
3.3.5 Lessons from targeting of Single Mouse Chromocentres: Use of "Signature" Sequence
3.2.7 Lessons from targeting of integrated tandem lac arrays: Blocking of endogenous breaks and linker-free PCR protocol

3.8 RESULTS
3.8.1 Single molecule amplification and sequencing of DNA originating from a single nuclear body
3.8.2 FISHing of loci obtained from HLB targeting reveals a novel HLB-locus association

3.9 DISCUSSION

3.10 MATERIALS AND METHODS

CHAPTER 4: LTOL TO DETERMINE DNA SEQUENCES AT SINGLE PML BODIES REVEALS A PAIRED LOCI ASSOCIATION

4.1 INTRODUCTION

4.2 RESULTS
4.2.1 Identification of paired loci that associate with PML bodies
4.2.2 Inter-loci and PML association frequency in ATRA treated cells
4.2.3 Common transcription factor binding sites enriched at convergent loci

4.3 DISCUSSION

4.4 MATERIALS AND METHODS

CHAPTER 5: DISCUSSION/FUTURE DIRECTIONS

5.1 ADVANTAGE OF SINGLE CELL METHODS FOR GENOMIC ASSOCIATIONS WITH NUCLEAR BODIES

5.2 INVESTIGATING THE NATURE OF LOCI-SPECIFIC PML ASSOCIATION

5.3 IS MYB AND SP1'S ACTION ON THEIR TARGET LOCI POTENTIATED BY PML BODY ASSOCIATION?

5.4 INVESTIGATING THE COOPERATIVE LOADING MODEL

5.5 IMPROVEMENTS ON LTOL AND NANO-DISSECTION

5.6 ALTERNATIVE NUCLEAR BODY TARGETS

5.7 ALTERNATIVE TARGETING SCHEMES
5.7.1 Immuno-Trap of PML-associating locus
5.7.2 High-throughput targeted subnuclear damage by functional complementation

5.8 CONCLUSION

6 REFERENCES

7 APPENDIX

A.1 MEF CHROMOCENTRE TARGETING SEQUENCES

A.2 LAC ARRAY TARGETING SEQUENCES

A.3 HISTONE LOCUS BODY TARGETING SEQUENCES

A.4 NB4 PML BODY TARGETING SEQUENCES


Table of Figures

Chapter 1
Figure 1.1 The Compartmentalized Nucleus
Figure 1.2 The Perinucleolar compartment (PNC)
Figure 1.3 PML Domain structure
Figure 1.4 The PML body as a “Gene Hub”
Figure 1.5 “Unique” PML bodies found in mutant cell lines with attendant genomic dysregulation

Chapter 2
Figure 2.1 Chromatin extraction setup and experimental process
Figure 2.2 Nanospatula and manipulator description
Figure 2.3 Sequences obtained from HLB dissections
Figure 2.4 Summary of HLB dissection yields
Figure 2.5 Amplification Results obtained from dissections
Figure 2.6 Nano-dissection of single PML NBs

Chapter 3
Figure 3.1 Outline of LTOL procedure
Figure 3.2 Induction and labelling of DSBs is axially confined
Figure 3.3 Targeted DSB induction of mouse chromocentres and probe ligation
Figure 3.4 Amplification and identification of mouse alpha satellite DNA derived from a single targeted mouse chromocentre
Figure 3.5 Targeting of cells carrying stable Lac array
Figure 3.6 Amplification result from LTOL at single HLBs
Figure 3.8 LTOL results from HLB targeting
Figure 3.9 FISH of hits from HLB LTOL
Figure 3.10 ESI imaging of HLB body

Chapter 4
Figure 4.1 LTOL of single PML bodies in NB4 cell line
Figure 4.2 Chromosome 17 and 20 loci obtained by LTOL associate with each other and show interdependence in PML association
Figure 4.3 Frequency of PML body association of loci obtained by LTOL is preserved in all-trans retinoic acid (ATRA) treated NB4 cells
Figure 4.4 Frequency of inter-loci association of sequences obtained by LTOL is preserved in all-trans retinoic acid (ATRA) treated NB4 cells
Table 4.1 GSEA results of genes near chromosome 17 and 20 loci
Figure 4.5 "Hotspots" of SP1 binding sites in genome
Table 4.2 Enrichment of MYB and SP1 binding sites in genes obtained from two different PML body association screens

Chapter 5
Figure 5.1 "FISH-TRAP" scheme
Figure 5.2 "Complementation-TRAP" scheme


Table of Abbreviations

DNA (Deoxyribonucleic acid)
NPAT (Nuclear protein of the ataxia telangiectasia mutated locus)
PML (Promyelocytic leukemia)
RNA (Ribonucleic acid)
FISH (Fluorescence In Situ Hybridization)
ESI (Electron Spectroscopic Imaging)
GC (Granular Component)
DFC (Dense Fibrillar Component)
FC (Fibrillar Component)
CT (Chromosome Territory)
EM (Electron Microscopy)
MB (Megabase)
IFN (Interferon)
UBF (Upstream Binding Factor)
PNC (Perinucleolar Compartment)
PTB (Polypyrimidine tract-binding)
HSV (Herpes Simplex Virus)
HCMV (Human Cytomegalovirus)
AAV (Adeno-associated virus)
GFP (Green Fluorescent Protein)
CB (Cajal Body)
ELL (Eleven-Nineteen Lysine-rich Leukemia)
TERT (Telomerase reverse transcriptase)
SLBP (stem-loop binding protein)
FLASH (FADD-like IL-1β-converting enzyme (FLICE) associated huge protein)
HLB (Histone Locus Body)
ACH (Active Chromatin Hub)
MHC (Major Histocompatibility complex)
CBP (Creb Binding Protein)
MAR (Matrix Attachment Region)
ICF (Immunodeficiency, Centromeric region instability, Facial anomalies syndrome)
APL (Acute promyelocytic leukemia)
PCR (Polymerase Chain Reaction)
NB (Nuclear Body)
AFM (Atomic Force Microscopy)
SEM (Scanning Electron Microscope)
EBID (Electron Beam-Induced Deposition)
BAC (Bacterial Artificial Chromosome)
PBS (Phosphate Buffered Saline)
LTOL (Laser Targeted Oligo Ligation)
DSB (Double Strand Breaks)
UV (Ultraviolet)
ATOF (Acousto-Optic Tunable Filter)
MEF (Mouse Embryonic Fibroblast)
LMPC (Laser Microdissection Pressure Catapulting)
PEN (Polyethylene Naphthalate)
PFA (Paraformaldehyde)
NA (Numerical Aperture)
LSM (Laser Scanning Microscope)
AOM (Acousto Optical Modulator)
DTT (Dithiothreitol)
ISOL (In Situ Oligo Ligation)
PEG (Polyethylene Glycol)
ATP (Adenosine Triphosphate)
BSA (Bovine Serum Albumin)
DAPI (4',6-diamidino-2-phenylindole)
ATRA (All Trans Retinoic Acid)
GSEA (Gene Set Enrichment Analysis)
MYB (myeloblastosis viral oncogene)
FDR (False Discovery Rate)
TRAP (Tagging and Recovery of Associated Proteins)
ALL (Acute lymphoblastic leukemia)
LANDs (LYSP100-associated nuclear domains)
DIG (Digoxigenin)
HRP (Horse Radish Peroxidase)
Fab (fragment antigen-binding)
ROS (Reactive Oxygen Species)
OPV (p-phenylene vinylene)
BRET (Bioluminescence Resonance Energy Transfer)
KRED (Killer Red)


Chapter 1: Introduction

1.1 The compartmentalized nucleus

The ability of a given cell type to maintain a characteristic and robust expression profile is all the more remarkable considering that it must do so over successive cycles of structural disintegration and reorganization. Somehow arising from the preferential binding of transcription factors to their cognate DNA sequences or epigenetic marks, the genome-wide orchestration of gene expression emerges in a volume of freely diffusing transcription factors (Dehghani et al. 2005; Spector 2003; Bubulya and Spector 2004), with differential expression between individual genomic loci in close proximity along the same chromatin fibre, or shared expression between loci on separate chromosomes. How this occurs remains largely explained by the interplay of two established mechanisms: the control of chromatin accessibility via changes in chromatin structure (Kwon et al. 2007), and control of the local concentration of a transcription factor at its active site via expression level, protein modification, or sequestration (Resnick-Silverman and Manfredi 2006; Ecsedy et al. 2003; La et al. 2004; Li et al. 2000).

From our increasing understanding of the interdependence of nuclear structure and function, it is becoming clear that the non-homogenous distribution of nuclear proteins into dynamic, stable structures of concentrated regulatory factors ("nuclear bodies"; NBs; see Figure 1.1), together with the partitioning of chromatin fibres into individual chromosome territories (CTs) with variant domains of packing (Rapkin et al. 2012), imparts a modular organization to the nucleus whereby specific nuclear functions are carried out at discrete nuclear structures. This model of the nucleus, as a non-homogenous structure capable of providing biochemically distinct compartments, has led to the idea that compartmentalization provides a control over the local concentration of factors, and may underlie a system of gene regulation. Furthermore, the control over the spatial separation or convergence of genomic loci via these compartments could determine or at least contribute to the organization of a genome-wide regulatory network.


Figure 1.1 The Compartmentalized Nucleus. (A) Chicken fibroblast with each of its 12 chromosomes differentially labelled and false coloured. Some nuclear proteins form enriched foci or “nuclear bodies”, perhaps representative of a specialized structure serving a particular nuclear function: (B) – (F) Immunostaining of interphase nucleus for promyelocytic leukemia (PML) protein (B; PML bodies in SKN-SH nucleus), Coilin (C, red; Cajal body in HeLa nucleus), NPAT (C, green; Histone Locus Bodies in HeLa nucleus), NPAT (D; Histone Locus Bodies in HT1080 nucleus), Nucleolin (E; Nucleolus; adapted from Spector, 2001), active Pol II (F, blue; transcription factories) reveals nuclear compartmentalization. (F) RNA Immuno-FISH staining of a mouse erythrocyte nucleus shows transcription factories (Pol II foci, blue), and colocalizing Eraf and Hbb loci (green and red, respectively). (G) ESI images allow for the discrimination of nitrogen-rich protein (blue) and phosphorus-rich nucleic acid structures (yellow) of the Nucleolus (large blue structure) consisting of Granular Compartment (GC), Dense Fibrillar Compartment (DFC), and Fibrillar Compartment (FC). Note the presence of nucleic acid at the periphery of the FC (arrow head). Scale bar 300 nm. (Panel A adapted with permission from Macmillan Publishers Ltd: Nature Reviews Genetics Apr;2(4):292-301, copyright 2001; panel C adapted with permission from Macmillan Publishers Ltd: Nature Reviews Genetics Nov;3(9):843-854, copyright 2008; panel F adapted with permission from Macmillan Publishers Ltd: Nature Genetics Sep;36(10):1065-1071, copyright 2004; panel G adapted from Politz et al., 2005 Mol Biol Cell 16:3401-3410 under Attribution-Noncommercial-Share Alike 3.0 Unported Creative Commons License.)


1.2 Genome-wide chromatin organization

The large-scale arrangement of chromatin within the nuclear volume is not random. Although there is cell-to-cell variation in the location of chromosome territories (CTs) with respect to each other and to nuclear structures, there are spatial relationships that occur at a frequency above that expected from purely independent positioning of chromatin fibers. For example, measurements of the radial distance of CT centres from the nuclear centre show a general trend for more gene-dense chromosomes to reside at a more interior location compared with gene-poor chromosomes (Bolzer et al., 2005, Bridger et al., 2000 and Cremer et al., 2003). There is also a non-random relationship between CTs: pairs of CTs have smaller center-to-center distances in G0 cells than is predicted by a purely random arrangement (Bolzer et al., 2005), and CTs of daughter cells retain the spatial relationships of the mother, at least with respect to their orientation relative to the metaphase plane through mitosis (Gerlich et al., 2003). This is not to say that the arrangement of CTs is fixed between cells of a given type or even within a single cell throughout its cell cycle. For a given cell the relative CT position may be maintained through interphase and between daughter cells; however, this structure may be altered through mitosis and reconstitution in daughter cell nuclei (Strickfaden et al., 2010). The interplay of the transcription level of individual genes (Chambeyron and Bickmore, 2004), differentiation state (Stadler et al., 2004), and rearrangements upon cell division (Strickfaden et al., 2010) contributes to both large-scale variations of CT arrangements and changes in gene position within a CT.

The apparent stability in size and shape of CTs throughout interphase (Muller et al., 2010) belies a dynamic fibre, with alterations in the compaction and/or cis and trans associations of a chromatin segment in response to local changes in gene activity. For example, although the initial notion of transcription being confined to the "exterior surface" of a CT (Verschure et al., 1999) is not supported by EM observations that show a CT “interior” that is actually continuous with the CT surface and would permit the free diffusion of the transcription machinery, genes are often observed being "extruded" to the surface of their chromosome territory, and this movement to the CT surface correlates with their increased expression (Chambeyron and Bickmore, 2004). Consistent with this interdependence of chromatin folding with gene activity, the folding of the chromatin fiber appears to depend on the density of genes along it. For example, differential labelling of gene-rich regions separated by >400 kbp gene "deserts" shows striking zigzag patterns of alternating gene clusters segregated together on one side abutting gene deserts, and "hub" structures with deserts arranged on the periphery of a core of gene clusters (Shopland et al., 2006). There are also long-range interactions between distant genomic regions along chromosomes and between different chromosomal loci: Uros and Eraf associate with the β-globin gene 30 MB away (Carter et al., 2002 and Simonis et al., 2006) through the formation of chromatin fibre “loops” (Kumar et al., 2007 and Palstra et al., 2008), and the IFN-γ gene on mouse chromosome 10 and the TH2 locus on chromosome 11 functionally associate, with the loss of association correlated with the down-regulation of IFN-γ or TH2 transcription (Spilianakis et al. 2005, Williams et al., 2010). Whether these non-random arrangements of CTs and CT structural changes emerge through the course of transcriptional activity or are established prior to the formation of the nucleus, it is clear that variations in large scale chromatin structure have functional implications. One of the most striking examples can be seen in mouse retina cells, where the chromosomal arrangement is inverted, with the gene-dense chromosomes more peripheral compared with more interior gene-poor chromosomes (Solovei et al., 2009). This inverted arrangement is found in retina cells of a variety of mammals, and confers light scattering properties that improve nocturnal vision (Kreysing et al., 2010). This finding underscores the functional importance of the large scale organization of chromatin in the nucleus: the arrangement of chromosomes with respect to each other and within the nuclear volume is not entirely incidental during the course of nuclear activity, but rather a consequence of specific evolutionarily conserved mechanisms.

1.3 Nuclear bodies: Nucleolus, PML bodies, Cajal Bodies, Histone Locus Bodies, Transcription Factories

A number of nuclear processes are thought to be confined to specialized sub-nuclear organelles, or “nuclear bodies” (NBs) characterized by a local concentration of factors into foci distinct from the surrounding nucleoplasm (Spector 2001; Figure 1.1). A few examples of nuclear bodies are briefly reviewed here to underscore this "compartmentalized" organization of nuclear functions.


1.3.1 Nucleolus

The best characterized of all nuclear substructures, the nucleolus itself serves as the prototype of nuclear compartmentalization. It is the site of ribosomal RNA (rRNA) transcription, processing, and assembly with ribosomal proteins, and each of these steps is confined to a particular nucleolar sub-compartment. Although the size, number, and position of nucleoli vary between cell types and metabolic state (Junera et al. 1995), the division of labour in the process of ribosomal biogenesis corresponds to three regions (at least in amniotic vertebrates; Thiry et al. 2011) in the nucleolus that are discernible based on differences in electron density under electron microscopy, or by the concentration of specific factors. At the core of the nucleolus is the Fibrillar centre (FC), which contains foci of Pol I and other factors (e.g. UBF, Nopp140; Cisterna and Biggiogera 2010) associated with the nucleolar organizing regions (NORs), comprising tandem repeats of rRNA genes located on the short arms of acrocentric chromosomes (Boisvert et al. 2007). Active transcription of the rRNA genes is thought to take place in a thick region entirely or partially surrounding the FC, known as the Dense Fibrillar Component (DFC), as indicated by the presence of nascent transcripts either in the DFC proper, or at the FC/DFC border. These compartments are interspersed among the Granular component (GC) that is enriched in proteins (e.g. Nop52, B23) involved in final stages of ribosomal assembly (see Figure 1.1). Distinct nucleolar structures within these compartments have since been discovered that have either confirmed the location of previously reported nucleolar functions (e.g. yeast "No Bodys" as the site of degradation of defective ribosomal components; Sirri et al. 2008), or initiated a reconsideration of the roles of the individual factors that reside there. The presence of factors that have no obvious role in ribosomal assembly (e.g. viral proteins, tRNAs, RNAse P, signal recognition particle components, telomerase, and small nucleolar RNAs) points to alternative functions for this structure (Olson 2004, Pederson and Tsai 2009). It is likely that because of the abundance of chaperones and RNA processing enzymes in the nucleolus, it serves as a general nuclear factor assembly centre. The nucleolus may also indirectly serve as a regulatory centre for transcription of non-ribosomal genes by the post-translational modification and sequestration of factors that act on them (Morimoto and Boerkoel 2013). In this light, it is not surprising that nucleolar deregulation is associated with a wide variety of disease states (Olson 2004). In many cancers for example, there are marked changes in nucleolus structure that accompany the progression to metastasis; the Perinucleolar Compartment (PNC) forms in metastatic cancer cells either beside, or along invaginations in the nucleolar surface (see Figure 1.2; Slusarczyk 2010). It is likely that along with our increasing understanding of its role in ribosomal biogenesis and other nuclear functions, the nucleolus will be recognized to have a more complex structural organization than previously thought, as distinct structures dedicated to a particular process within these basic compartments are identified.


Figure 1.2 The Perinucleolar compartment (PNC). Inset: immunofluorescence of the PNC component PTB protein (green) reveals a focus abutting the nucleolus. ESI (electron spectroscopic imaging) reveals the PNC as an invagination interrupting the condensed chromatin (Ch, and arrows) surrounding the nucleolus (GC - granular compartment).

The above review of the nucleolus serves to illustrate a general theme of nuclear organization: the regulation of gene loci through their convergence at a subnuclear compartment. Although the association of rDNA loci with the nucleoli is expected given this compartment's role in ribosomal biogenesis, other non-rDNA loci frequently associate with nucleoli, suggesting a role for nucleoli as a regulatory centre for other loci (van Koningsbruggen et al. 2010). This appears to be a general theme of nuclear organization; like the nucleolus, other nuclear bodies make specific genomic associations, and may serve as regulatory centres for the loci convergent upon them.


1.3.2 Promyelocytic Leukemia (PML) Bodies

These distinct foci (typically between 1 and 30 per mammalian nucleus; see Figure 1.1) were first identified by electron microscopy as dense, proteinaceous structures (0.2-1 µm in diameter) distinct from surrounding chromatin (Maul et al. 2000). PML (promyelocytic leukemia) bodies derive their name from their characteristic PML proteins, which nucleate PML bodies through interactions between their Ring-B-box-Coiled-coil (RBBCC) domains. Further recruitment of PML body proteins (e.g. DAXX, SP100) occurs through interactions of their SUMOylation sites and SIMs (SUMO-interacting Motifs) with the respective SIMs and SUMOylation sites on PML protein (see Figure 1.3; Lallemand-Breitenbach and de Thé 2010). Seven PML protein isoforms have been identified in humans (PMLI-VII; although the PMLVII isoform lacks a nuclear localization signal (NLS) and is present only in the cytoplasm, the remaining six are each capable by themselves of nucleation; Beech et al. 2005, Nisole et al. 2013, and see Figure 1.3). The presence of functional PML protein is indispensable to the proper formation and structural integrity of the PML body: PML -/- MEFs (Mouse Embryonic Fibroblasts) fail to form foci of PML body partner proteins (Zhong et al. 2000), and the fusion of the PML protein with the retinoic acid receptor alpha (PML-RARα; a hallmark of acute promyelocytic leukemia (APL), see Figure 1.3) results in a differentiation block of myeloid progenitor cells and a concomitant dispersal of PML body components (Borden 2002; see Figure 1.5).


Figure 1.3 PML Protein Domain Structure. (A) PMLI isoform is presented indicating the amino acid (aa) positions of the ring box coiled coil (RBBCC/TRIM) domain, the nuclear localization signal (NLS), the nuclear export signal (NES), and the SUMO interactive motifs (SIMs; major SUMOylation sites indicated at K65, 160, and 490, and five minor sites indicated at K226, 380, 400, 497, 616). Unlike isoforms PMLI-V, isoforms PMLVI and PMLVIIb lack a SIM, and the NES is present only in PMLI. (B) Breakpoints in the APL phenotype are indicated, and the resultant PML-RARα fusion proteins are shown. Differences in the PML breakpoint (at 394aa and 552aa) result in "short" and "long" PML/RARα fusion proteins respectively (A-F indicate exons of RARα). Adapted from Nisole et al. 2013 under the Creative Commons Attribution License.


By virtue of the individual roles of its components (e.g. PML, Daxx, p53, c-Jun, PA28), PML bodies have been implicated in apoptosis (Bernardi et al. 2008), tumour suppression (Salomoni et al. 2002), innate immunity (Everett and Chelbi-Alix, 2007), DNA damage response (Dellaire and Bazett-Jones, 2004), proteasome function (Fabunmi et al. 2001), and transcriptional regulation (Lin et al. 2003). Several studies point to an active role for the PML body, in which PML protein levels correlate directly with the SUMOylation (Quimby et al. 2006), acetylation (Bischof et al. 2002), or phosphorylation (Louria-Hayon et al. 2003) levels of interacting proteins. Given the number of proteins that have been shown to localize to PML bodies (a recent survey finds 166 proteins; Van Damme et al. 2010), it is difficult to characterize its precise role, and so it has been proposed that the PML body is merely a “storage depot” for proteins that are not otherwise engaged at their sites of activity (Negorev et al. 2001).

However, it is telling that PML body components change in response to various cellular insults

(Glass and Everett 2013; Sahin 2014) and there are several lines of evidence that support a role for PML bodies as the centre of activity for specific nuclear processes. Firstly, they are present in the vicinity of viral DNA upon infection. HSV-1 (herpes simplex virus 1), Ad5 (adenovirus 5),

SV40 (simian virus 40), HCMV (human cytomegalovirus), and AAV (adeno-associated virus) retain their DNA at sites “adjacent" to PML bodies (de Bruyn Kops et al. 1994; Ishov et al.

1996; Fraefel et al. 2004). Viral proteins associate with and dramatically alter PML body structure, and this disruption can correlate with an increased efficiency of viral transcription or protein function (Glass and Everett 2013). For example, in Human Cytomegalovirus (HCMV) infections, PML and SP100 proteins are dispersed from PML bodies by the HCMV-coded immediate-early protein IE1 (Tavalai et al. 2011). And in Herpes Simplex virus (HSV-1) infection,

12

HSV-1-coded ICP0 mediates the proteosome-dependant degredation of PML and SP100 (Muller

S, Dejean A. 1999). Secondly, there are structural and functional differences between PML bodies within a single nucleus: There are differences in PML body protein constiuents from body to body within the same cell (e.g. LYSP100-associated nuclear domains- LANDs; Dent et al.

1996), preferential associations of viral genomes to a subset of PML bodies within a single cell

(De Bruyn-Kops and Knipe 1994; Maul 1998), and the formation of unique bodies in cell lines with a concomitant gene deregulation (Luciani et al. 2006, Torok et al. 2009). We would not expect this heterogeneity if the PML body constituents were dictated merely by the excess protein present throughout the nucleoplasm. Thirdly, PML bodies may also form functional associations with surrounding cellular chromatin. ESI (electron spectroscopic imaging; see

Figure 1.4) shows that the PML body makes numerous contacts with the surrounding chromatin, and by live cell imaging of GFP tagged PML protein, the oscillatory movement of

PML bodies is similar to that of chromatin during S phase (Dellaire et al. 2006; Ching et al.

2005). One obvious question then is whether the PML body is regulating the chromatin with which it is in contact. Studies combining immunolabelling of PML protein with probes to specific gene loci show non-random localization of particular genomic regions to the vicinity of PML bodies, suggesting that these chromatin contacts are not merely fortuitous (Shiels et al. 2001,

Wang et al. 2004, Sun et al. 2003, Ching et al. 2013). Given that chromatin modifying enzymes are found at PML bodies (Seeler 1998; Wu 2001), the implication is that these specific chromatin loci are somehow regulated by the PML body. There is, however, little direct demonstration of PML bodies regulating the chromatin of an endogenous gene locus, as opposed to PML proteins from the nucleoplasm (where the majority of PML protein resides;


Nisole et al. 2013 and references therein), as supporting studies do not differentiate between the two fractions (e.g. Li et al. 2014; Ulbricht et al. 2012). Even so, using a reporter system developed in our lab that targets biotinylated luciferase reporter plasmids to PML bodies incorporating a biotin-binding PML fusion protein, it was shown that PML bodies can affect the transcription of targeted plasmids in a promoter-dependent manner (Block et al. 2006). The lesson of the body of research to date is that there does not seem to be an obvious overarching role for PML bodies that reconciles the variety of cellular processes in which they have been implicated. It is likely that between cells, and even between bodies within the same cell, the function of a particular body is dictated by the balance of proteins that constitute it and by its particular biochemical environment, and so the notion of a generic PML body function without this context is not helpful.

Whatever their particular role(s) from body to body, it is increasingly clear that like the nucleolus, PML bodies are an example of a subnuclear compartment that serves as a regulatory centre for specific genomic loci that are convergent upon it. The other examples that follow are included as both a preparation for future chapters, and to further emphasize this general theme of nuclear organization.

1.3.3 Cajal Bodies

In higher eukaryotes, Cajal bodies (CBs) are foci (0.5 to 1 µm; see Cajal body immunofluorescence in Figure 1.1) that are characterized by, and dependent on, the presence of coilin protein. Cajal bodies vary in number between cells, with a dependence on cell type and differentiation state, and within a single nucleus, Cajal bodies show variation both in their components (with the exception of coilin) and size, suggesting that like PML bodies, there is functional heterogeneity between them. Also like PML bodies, their supposed function has been suggested by that of their constituent protein partners. Although the increasing diversity of protein partners found at the CB suggests a multifunctional body involved in transcription (i.e. transcription factors Pol II, ELL, EAF1), snoRNP processing (Fibrillarin, Nop140), telomere maintenance (telomerase RNA, TERT), and nuclear signalling (SUMO1, CDK2-cyclinE; Machyna et al. 2013), the most well-studied of these associations is with spliceosomal component snRNAs (U2, U4/U6). Cajal bodies associate with these snRNAs at all stages of their maturation: the association of actively transcribing U2 gene loci at Cajal bodies precedes the release of transcripts loaded with factors that facilitate their export to the cytoplasm, which then return to the body for further assembly including the addition of snRNP proteins and modifications at specific nucleotides (Machyna et al. 2013 and references therein). Other snRNA transcripts cycle to and from CBs during their assembly into mature snRNPs, and although CBs are not indispensable for snRNP formation, the concentration of factors at the CB greatly increases its efficiency (Novotny et al. 2010). This is consistent with the observation that in metabolically active cells (i.e. transformed cell lines), the CBs are larger in size, and show greater concentration of snRNA processing factors compared with the surrounding nucleoplasm

(Spector et al. 1992). Thus, CBs highlight the two underlying themes of nuclear bodies as organizers of nuclear activity: the preferential and functional association of specific gene loci, and the increased efficiency of a process achieved by the spatial concentration of its accessory factors. CBs are also a model for the interdependence of nuclear body formation and the organization of functional contacts with the surrounding chromatin. The integrity of the body is connected to that of the surrounding chromatin: live-cell imaging of CB movement shows a stable, intact structure with a constrained diffusion consistent with reversible chromatin contacts (Platani et al. 2002), and Cajal body composition and integrity are affected by DNA damaging agents (Dellaire and Bazett-Jones 2007). Conversely, even as CBs require active transcription to maintain their integrity (Shav-Tal et al. 2005), and can be seen nucleating de novo at lac array-tethered transcripts (Shevtsov and Dundr 2011), U2 gene loci can exhibit large (1-2 microns), directed movement towards an already intact CB (Dundr et al. 2007).

The above example further illustrates that a nuclear body can have a specific regulatory function for those loci that are convergent upon it. The following nuclear body shares some of the same protein components as the Cajal body, and is thus usually subsumed by it in most literature reviews. However, it is highlighted here separately, as it is a particularly striking example of a nuclear body association with specific gene loci, and as such, is the focus of experiments detailed in chapters 2 and 3.


1.3.4 Histone Locus Bodies (HLBs)

Originally thought to be Cajal bodies, Histone locus bodies (see Figure 1.1, and chapter

3) were first distinguished in Drosophila as a subset of Cajal bodies that contained the U7 snRNP constituent Lsm110 (Liu et al. 2009). Subsequent reports in humans confirmed HLBs as distinct coilin-negative bodies frequently found colocalizing with the histone gene clusters at chromosomes 6p22.1 and 1q21 (Bongiorno-Borbone et al. 2008; Ghule et al. 2008). HLBs are believed to be involved in the processing of pre-mRNA from the histone locus; however, this is only by virtue of their constituent 3' mRNA splicing factors (U7 snRNP, SLBP; Ghule et al. 2008; Machyna et al. 2013). It is possible that HLBs are not limited to this function, as they also contain transcription factors (e.g. Pol II, FLASH, NPAT), and are found throughout the cell cycle

(White et al. 2007), even though histone transcription and processing is confined to S phase

(Marzluff et al. 2005). Although a recent report suggests that the HLB is formed by the nucleation of HLB components on the histone gene cluster at 6p22.1 (Salzler et al. 2013), HLBs may not be limited to the regulation of histone gene clusters alone; it is possible that other loci may converge with the histone clusters at a shared HLB, or that specific subtypes of HLBs regulate their own dedicated locus. It has been suggested that HLBs may also regulate non-histone genes (Ma et al. 2000); consistent with this, HLBs still form in mutant Drosophila embryos that lack histone genes (White et al. 2007). From electron microscopy images from our lab, the HLB is a protein-rich structure that appears to be closely associated with, but structurally independent of, its surrounding chromatin (see Fig 3.9). Although its precise role in the regulation of the histone gene loci has not yet been firmly established, like rDNA at the nucleolus (Boisvert et al. 2007), and U2 snRNA gene loci at Cajal bodies (Dundr et al. 2007), HLBs are a model system in our understanding of nuclear bodies as centres of regulation for associating chromatin.

Emerging from the evidence surveyed above is a model of genetic control whereby the regulation of specific loci is restricted to, or is carried out with greater efficiency at (see Cajal bodies), those bodies that are concentrated in their cognate factors. The following body is reviewed to illustrate that the specific convergence of several genes can occur simultaneously at a shared body, and may thus provide a mechanism for the coordinated regulation of several genes in trans.

1.3.5 Transcription Factories

Transcription is compartmentalized. Active Pol II proteins converge at stable sites of transcription for genes that associate with them (see Figure 1.1). Initially discovered by the pulse-labelling of nascent transcripts in HeLa cells (Jackson et al. 1993), subsequent labelling studies that showed these transcripts and foci of active Pol II proteins colocalize (Grande et al.

1997; Osborne et al. 2004), as well as splicing, transcription factors, and chromatin remodelling enzymes (Melnik et al., 2011) suggested that these activities are integrated in this specialized compartment, the so-called “transcription factory”. These foci are resistant to extraction of the

soluble nuclear fraction and DNase I treatment, so these sites of transcription may be attached to some underlying insoluble component of the nucleus (Jackson et al. 1993). Although recent super-resolution live cell microscopy of fluorescent activated Pol II has suggested that these structures only appear transiently (mean lifetime ~5 seconds; Cisse et al. 2013), the Pol II signals appear to emerge and dissipate within a restricted region, suggesting that there is indeed a spatially stable, underlying scaffolding that active polymerase preferentially nucleates on (see Cisse et al. 2013 Supplemental). There is variation in the number of transcription factories found at a given time in a cell nucleus (from 100-300 in differentiated mouse erythrocytes derived from tissue; Osborne et al. 2004, to 1500 in cultured human erythroblasts; Brown et al., 2008), which may result from differences between cell types and/or a dependence on the type of imaging or criteria used to identify them (Rieder et al. 2012). In any case, the number of transcription factories is still far smaller than the estimated number of genes being transcribed at any one time (Osborne et al. 2004), and thus multiple active genes must be grouping together to occupy the same factory. As discussed below (see The convergence of genes at preferred nuclear sites), it is the convergence of genes at stable, pre-formed transcription factories that may underlie a system of gene regulation, whereby a subset of loci across several chromosomes are regulated at specialized transcription factories by virtue of their shared affinity for those factors enriched in them.


1.4 Interplay of nuclear substructures with chromatin and implications for gene regulation: the "Gene Hub" model

The transcription factory is an example of the nuclear body as "Gene Hub": a localized compartment of gene regulation, enriched in factors for those specific loci that are convergent on it. The gene hub model represents the cobbling together of related ideas contributed by researchers working on gene regulation at different scales and using different assays, and thus several terms (e.g. "gene hub", "the active chromatin hub") have come to be used interchangeably in the description of a variety of phenomena related by the common theme of long-range (i.e. tens of kilobases to megabases) chromatin interaction (de Laat and Grosveld 2003; Chakalova et al. 2005). Although there is no definitive articulation of the gene hub model in the literature, between its various incarnations three basic aspects consistently emerge. First, that there are specific convergences of genes either in cis or trans through the formation of chromatin "loops", and that this convergence contributes to the efficient co-regulation of the participating genes. Second, that the convergence of loci takes place at preferred sites in the nuclear volume, mediated by a specialized nuclear body containing their associated (but perhaps disparate) regulatory factors. Third, and perhaps most controversial, that there are active mechanisms that "shuttle" the co-regulated gene loci to these sites of convergence. Thus, the gene hub model could provide a mechanism to coordinate genes that are functionally related but under the control of different regulatory sequences, and could explain the phenomenon of cell-type specific gene expression profiles, by allowing for the co-regulation of multiple genes separated by large intrachromosomal regions (de Laat and Grosveld 2003) or across multiple chromosomes (Chakalova et al. 2005). Furthermore, if the required threshold concentration of a regulatory factor for the particular binding affinity of its cognate promoter element only occurs at discrete nuclear volumes, then the active spatial association and separation of gene loci to and from these compartments would provide a novel, genome-wide mechanism of gene regulation. Because these three hypotheses of the gene hub model have not been well differentiated from one another, the current evidence supporting each is often taken to tacitly imply the other two. It is helpful, then, to review the evidence for each separately.

1.4.1 The formation of specific chromatin loops: the active chromatin hub (ACH)

The growing evidence for the formation of chromatin loops has followed and spurred the development of technologies that detect specific long-range chromatin interactions. The first evidence of non-random chromatin folding came from EM studies of lampbrush chromosomes that revealed the formation of chromatin loops (Gall 1956). That this looping structure reflects a functional organization of gene expression domains, or more generally, that there are maintained chromatin structures associated with the regulation of genes contained in them, was supported by DNase hypersensitivity assays that mapped the division between open and closed chromatin regions to specific genomic elements (i.e. insulators; de Laat and

Grosveld 2003, and references therein, namely Chung et al. 1993, Hebbes et al. 1994, Litt et al.

2001a, Litt et al. 2001b), and the demonstration that the disruption of these elements results in the deregulation of nearby genes (Bonifer et al. 1994). Regulatory sequences found directly upstream (i.e. hundreds of basepairs) of promoters (Wittkopp and Kalay 2011) could establish distinct regulatory domains flanked by insulator elements, and thus provided a mechanistic link between chromatin structure and the propensity for co-regulated genes to cluster in close proximity on the same chromosome. Exceptions to this model of self-contained regulatory domains remained, however, most notably the human β-globin locus, which contains cis-regulatory elements upstream of an intervening and differentially expressed olfactory receptor gene cluster (de Laat and Grosveld 2003). The discovery of these "enhancer" elements that affected the expression of the β-globin genes found tens of kilobases away

(Smallwood and Bing 2013), suggested the existence of functional, long-range chromatin interactions (i.e. hundreds of kilobases). With the development of techniques that detect the interaction partners of specific sequences (see Determining the in situ location of gene loci with respect to nuclear substructures below; and de Wit and de Laat 2012), enhancer elements were shown to preferentially associate with their cognate promoters at distances of several hundred kilobases (Tolhuis et al. 2002; Carter et al. 2002). That the interaction of the enhancer with its cognate genes tens of kilobases away correlated with increased expression (Carter et al.

2002), and that this preferential association did not influence the expression of genes lying in the intervening region (de Laat and Grosveld 2003), suggested the formation of chromatin loops into an "active chromatin hub" (ACH) that allowed for the co-regulation of genes not clustered along the chromosome fibre (i.e. human α-globin locus, see de Laat and Grosveld 2003). That these preferential associations also occur between co-regulated genes residing on different chromosomes (Spilianakis et al. 2005; Osborne et al. 2004; Simonis et al. 2006; Schoenfelder et al. 2010), and that the maintenance of an ACH is required for the transcriptional activity of its convergent genes (Fanucchi et al. 2013), suggests that the ACH may underlie a general mechanism of co-regulation for sets of genes that are not contiguous along a chromosome fibre (Kosak and Groudine 2004).

1.4.2 The convergence of genes at preferred nuclear sites

In parallel with the above investigations into chromosome conformation that led to the

ACH model, in situ labelling of gene loci (Fluorescence In situ Hybridization - FISH; see 1.7.1) revealed that gene expression correlated with changes in a gene's position relative to both its chromosome territory, and with respect to the nuclear volume. This phenomenon is best illustrated with FISH studies of the Hox gene cluster, a model for the differential expression of genes within a restricted chromosomal region. Although the Hox genes (numbered 1 through

9) cluster in a region spanning 90 kb, they each exhibit distinct expression profiles during mouse development, with each activated in sequence according to their 3' to 5' order along the chromosome (Kmita and Duboule 2003). Chambeyron et al. (2005) noted that this sequential activation correlates with changes in Hox gene location with respect to its chromosome territory; by FISH (see below), transcriptional activation of the Hoxb1 gene correlates with its movement to the periphery of its chromosome territory, while the transcriptionally silent Hoxb9 remains in its interior (Chambeyron and Bickmore 2004; Chambeyron et al. 2005).

Although the notion of a chromosome territory "interior" is troublesome (see Genome-wide chromatin organization), the Hox gene phenomenon clearly underscores the interplay of gene regulation and changes in nuclear position. Other examples of this "extrusion" or looping out of an active gene from its chromatin territory have been reported (Ragoczy et al. 2003; Mahy et al. 2002), suggesting that the movement of loci, either preceding or as a result of their transcriptional activation, is a general feature of gene activation. It is possible that the extrusions of active gene loci represent the movement of genes to shared regions enriched in transcriptional machinery, namely transcription factories. This is consistent with the observed convergences at a shared transcription factory between genes separated by large distances

(tens of megabases) on the same chromatin fibre (Osborne et al. 2004; and Papantonis et al.

2012), as well as between genes on different chromosomes (Spilianakis et al. 2005; Osborne et al. 2007; Schoenfelder et al., 2010). In these studies, the presence of inactive Pol II, as well as colocalizing gene loci in the absence of their transcripts, is consistent with the transcription factory as a bona fide nuclear structure, and not an artifact of Pol II independently nucleating at actively transcribed genes separated by sub-resolution distances. This is supported by observations of genes that only converge within time scales consistent with their coincident expression (Papantonis et al. 2010). Furthermore, ultrastructural studies of active Pol II show the transcription factory as a protein-rich core surrounded by, but not nucleating on, the surrounding chromatin (Eskiw and Fraser 2011). An interesting consequence of this is that the particular factors of a transcription factory could determine the subset of genes that are convergent upon them. Gene convergence appears to be non-random, with specific pairs of genes in cis and trans preferentially sharing a factory in response to an activation signal

(Papantonis et al. 2012; Osborne et al. 2007). Although this phenomenon may simply be a consequence of a non-random chromatin distribution (see Genome-Wide Chromatin


Distribution), with preferentially nearby loci simply associating with the nearest transcription factory, it may in fact be the particular constituents of a transcription factory that determine the set of genes convergent upon them. Schoenfelder et al. (2010) reported specific pairs of loci that preferentially associated with a subset of transcription factories that contained the Klf1 transcription factor, consistent with the model that it is the specific affinities of loci for the constituents of the transcription factory itself that contribute to the spatial organization of the surrounding chromatin (Sexton et al. 2007). The evidence of non-random associations between specific genes and nuclear domains other than transcription factories (i.e. Cajal bodies, PML bodies; see above) suggests that gene movement to them may have additional regulatory consequences. For example, chromatin at the nuclear lamina, a filamentous structure extending from the inner surface of the nuclear membrane, is enriched in histone marks associated with repression (i.e. H3K27me3), and a majority of genes in these lamin-associated domains are expressed at very low levels (Guelen et al. 2008). Whatever the various consequences of these non-random gene movements and convergences are, they highlight one essential theme of the gene hub model: that the co-regulation of specific sets of loci, either in cis or trans, is conferred by their convergence at a shared nuclear body enriched for their cognate regulatory factors.


1.4.3 Active gene loci "shuttling" mechanisms

One mechanism proposed to explain the non-random movement of loci that occur either with respect to position within the chromosome territory (Chambeyron and Bickmore

2004), or towards a nuclear body enriched in those factors specific for its regulation (Dundr et al. 2007), is that there are underlying structures in place to facilitate them (Carmo-Fonseca

2007). Observations of directional movement have been reported by live-cell imaging of lac arrays (Chuang et al. 2006), or a U2 locus (Dundr et al. 2007) under control of an inducible promoter. In the case of the inducible lac array, these long-range movements (with respect to the nuclear periphery) followed a "linear" trajectory, and occurred within an hour of transcriptional activation, with an average displacement of 3 µm over two hours, and notably, included bursts of movement of up to 1 µm per minute. The U2 locus showed slower dynamics, initially making oscillatory, short-lived associations with a Cajal body (where it preferentially associates; see Cajal Bodies above) within 1-3 hours of activation, followed by more directed movement (0.1–0.2 μm/min) to the Cajal body after 6-7 hours. Both groups observed a marked inhibition of this movement upon overexpression of polymerization-defective actin mutants (Dundr et al. 2007; Chuang et al. 2006), or addition of myosin inhibitors

(Chuang et al. 2006). The authors suggest that this dependence on nuclear actin and myosin might reflect the activity of myosin motors directing loci along polymerized actin filaments.

This "shuttling" model has also been invoked to explain chromosomal movements induced by serum starvation (Mehta et al. 2010), that were inhibited by drugs that blocked actin polymerization and myosin activity. Although these studies point to the involvement of

functional actin filaments and myosin, it is controversial to suggest that they are forming stable, extended scaffoldings resembling a "nuclear matrix". Monomeric actin can be found in the nucleus (Pederson and Aebi 2005; Pederson 2008); however, previous studies pointing to the existence of a network of stable actin filaments in nuclear extracts (Amankwah and De Boni 1994) could not be confirmed using filamentous actin-specific antibodies in unextracted cells (Gonsior et al. 1999), and it is likely that the filamentous structures remaining after nuclear extraction are an artifact of the particular protocol used to isolate them (reviewed in Razin et al. 2014). More recent findings point to a role for nuclear actin and myosin associations in chromatin remodelling and transcription (de Lanerolle and Serebryannyy 2011). An alternative view, then, is that actin and myosin complexes contribute to the relaxation of chromatin sufficient for loops to form, which in turn increases the spatial sampling of the extruded locus within its microenvironment (i.e. surrounding volume on the order of microns; Pederson 2008).

A similar stochastic-diffusion model has been proposed for the preferential formation of interchromosomal contacts (Bohn and Heermann 2010). In this way, the initial, stochastic sampling of a locus is eventually refined to frequent, high affinity associations with specific factors enriched at a nearby nuclear body. Thus it is possible to have specific, albeit stochastic, loci movements that are dependent on actin/myosin complexes, without the requirement for stable, filamentous networks.
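The rates of locus movement quoted above are reported on different scales, so it can help to put them on a common footing. The following is a minimal sketch that uses only the figures cited in the text (Chuang et al. 2006; Dundr et al. 2007); the function name `speed_um_per_min` and the variable names are illustrative assumptions and are not taken from either study.

```python
def speed_um_per_min(distance_um: float, minutes: float) -> float:
    """Mean speed in micrometres per minute."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return distance_um / minutes


# Figures quoted in the text above (not re-measured here).
lac_mean = speed_um_per_min(3.0, 2 * 60)  # lac array: 3 µm over two hours
lac_burst = speed_um_per_min(1.0, 1.0)    # lac array: bursts of up to 1 µm per minute
u2_directed = (0.1, 0.2)                  # U2 locus: directed phase, µm/min

# How much faster a burst is than the two-hour average.
burst_fold = lac_burst / lac_mean

print(f"lac array, two-hour mean: {lac_mean:.3f} µm/min")
print(f"lac array, burst:         {lac_burst:.3f} µm/min")
print(f"U2 locus, directed phase: {u2_directed[0]:.1f}-{u2_directed[1]:.1f} µm/min")
print(f"burst / mean:             {burst_fold:.0f}-fold")
```

On these figures, the two-hour average for the lac array (0.025 µm/min) is lower than the directed phase of the U2 locus, whereas the reported bursts are roughly 40-fold faster than that average. This is only arithmetic on the published values and says nothing about the underlying mechanism.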


1.5 PML bodies as "Gene Hubs"

As described above (see PML Bodies), PML bodies may act as regulatory centres for the chromatin loci that are convergent upon them. One line of evidence supporting this is that specific loci are found preferentially associating with PML bodies. By observing the spatial relationship of genomic regions with PML bodies in a cell population in situ (see immuno-FISH below), several groups have found specific loci that either colocalize (by light microscopy) with PML bodies at a significantly greater frequency, or have a significantly smaller mean minimal distance (mmd) to them. In the T-lymphocyte Jurkat cell line, Sun et al. (2003) found a significant colocalization of the TP53 gene with PML bodies compared with the BCL2 gene. And in human fibroblasts, the mmd of the MHCI locus (an extremely gene-rich region on chromosome 6 with a gene on average every 16 kb) to a PML body is significantly less than that of the comparatively gene-poor 6p24 region of chromosome 6 (Shiels et al. 2001). Furthermore, these associations are cell-type specific: using a non-biased method to detect novel PML body-chromatin associations in a cell population (see below), Ching et al. (2013) reported additional, specific loci-PML body associations, including a frequent association of the PML gene in the HT1080 human fibrosarcoma cell line that is not recapitulated in normal human fibroblasts. In order to shed light on the significance of these associations, Wang et al. (2004) examined 54 loci across 10 chromosomes with immuno-FISH, comparing loci-PML mmds with gene density and gene transcription levels. Although there were no proximity – transcription correlations for particular genes, there was a positive correlation when comparing transcription levels across several genes within a contiguous chromosome region to PML body proximity, and when comparing the gene density of a region to PML body proximity. Indeed, PML bodies contain accumulations of nascent RNA at their periphery, and their surrounding chromatin is enriched in acetylated histones (i.e. transcriptionally active chromatin; Boisvert et al. 2000).

These findings are all consistent with the role of the PML body as a preferred site for the functional association of specific genes across the genome to associate with their cognate transcription factors. Given the presence of histone modification enzymes (i.e. CBP, LaMorte et al. 1998; HDAC7, Gao et al. 2008), there are likely additional roles for the PML body in the regulation of the associating loci (see Figure 1.4).
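The mean minimal distance (mmd) used in the studies above can be illustrated with a short sketch. It assumes that each observation is one FISH signal with 3D coordinates in µm, paired with the coordinates of every PML body in the same nucleus; the function names and the example coordinates are hypothetical and are not drawn from any of the cited datasets.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def minimal_distance(locus: Point, bodies: Sequence[Point]) -> float:
    """Distance from one locus signal to the nearest body in the same nucleus."""
    if not bodies:
        raise ValueError("nucleus has no bodies to measure against")
    return min(math.dist(locus, body) for body in bodies)


def mean_minimal_distance(observations: Sequence[Tuple[Point, Sequence[Point]]]) -> float:
    """Average of the per-signal minimal distances across all observations."""
    if not observations:
        raise ValueError("no observations supplied")
    distances = [minimal_distance(locus, bodies) for locus, bodies in observations]
    return sum(distances) / len(distances)


# Hypothetical measurements: (locus position, positions of bodies in that nucleus).
example = [
    ((0.0, 0.0, 0.0), [(3.0, 4.0, 0.0), (10.0, 0.0, 0.0)]),  # nearest body is 5 µm away
    ((1.0, 1.0, 1.0), [(1.0, 1.0, 3.0)]),                     # nearest body is 2 µm away
]
mmd = mean_minimal_distance(example)
print(f"mmd = {mmd:.2f} µm")
```

In practice the mmd of a locus of interest is compared with that of a control locus scored in the same cell population, as in the MHCI versus 6p24 comparison described above; the statistical test used for that comparison is not shown here.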

Figure 1.4 The PML body as a “Gene Hub”. (A) ESI micrograph of nitrogen rich PML body (blue) with contacts of surrounding chromatin (yellow) indicated with arrow heads. Scale bar 200nm. (B) Illustration of the PML body as a regulatory hub, enriched in chromatin remodelling enzymes, transcriptional activators/repressors that provide a microenvironment for the co-regulation of specific sets of convergent loci. MAR (Matrix Attachment Region); Daxx (constituent protein of PML bodies); pp71 (human cytomegalovirus protein that mediates association of its viral genome at PML bodies; Hofmann et al. 2002). (Panel B is reproduced from Ching et al. J Cell Sci. 2005 Mar 1;118(Pt 5):847-54 by permission from The Company of Biologists)


1.6 Specific convergence of genes at "unique" subtypes of PML bodies

Like transcription factories, because the associating loci are found across the genome and occur frequently at the comparatively few PML bodies found in a typical mammalian cell

(see above, PML bodies), it follows that there are frequent convergences of loci from multiple chromosomes at a shared PML body. As suggested recently with respect to transcription factories, these interchromosomal convergences may be mediated by a shared affinity for particular factors that are enriched in specific subtypes of PML bodies (i.e. those enriched in the transcription factor Klf1; Schoenfelder et al. 2010). If this were true, one would expect not only the frequent occurrence of several convergent loci at a shared PML body, but also that this frequency is greater than that expected if their associations were independent of one another. Although there are so far no direct observations of specific, interchromosomal convergences of loci at a shared PML body, PML and SatB1 proteins (a known constituent of

PML bodies) are necessary for the maintenance of large (spanning several hundred kilobases), intrachromosomal loops in the MHC1 locus (Kumar et al. 2007), and the detection of enriched PML body associations of loci separated by several megabases within a region of Chr17p13.2-13.1 (Ching et al. 2013) is consistent with the body as the hub of several convergent loops formed between large spans of intervening chromatin (see Figure 1.3). Furthermore, there is an isoform-specific association of PML protein with specific regions within the MHC1 locus

(Kumar et al. 2007) which, in light of the specific association of this locus with PML bodies

(Shiels et al. 2001), and differences in the relative amounts of PML isoform III between PML bodies within a single cell (Beech et al. 2005), suggests that their compositional differences

reflect distinct, functional specializations. Other examples of this body-to-body heterogeneity have been reported, with transcription factors (sp140, Bloch et al. 1999; sp110, Dent et al.

1996) or nascent RNA (LaMorte et al. 1998, Kiesslich et al. 2002) only contained within a subset of the PML bodies within a single cell. Apart from heterogeneity in PML body constituents, there are structurally unique bodies that can be found in cell lines with attendant genome-wide deregulation. For example, along with various epigenetic and chromosome structure abnormalities (i.e. whole-arm deletions, aberrant decondensation), in lymphocytes derived from immunodeficiency, centromeric instability and facial dysmorphy (ICF) syndrome patients, a single "giant" PML body (i.e. 2-4 µm in diameter; Luciani et al. 2006) is seen among other more typically sized bodies (i.e. 1-2 µm; see Figure 1.5a). And in cells derived from patients with acute promyelocytic leukemia (APL), a fusion product of PML protein with the retinoic acid receptor alpha results in smaller PML "microspeckles" that are devoid of typical PML body components

(i.e. Sp100, CBP; de The et al. 2012). However, in some APL-derived cells, there is a subset of PML bodies that retain their typical size and their constituent proteins (Torok et al. 2009; Figure 1.5b, and see chapter 4). That these unique bodies are found in cells with attendant genome deregulation, and that their large size or integrity may be contributed by their surrounding chromatin (Torok et al. 2009), raises the question of whether they are convergent hubs for specific, high affinity genomic associations. However, answering this requires a method to isolate the DNA in the vicinity of these particular bodies, even if their defining characteristic is entirely structural (i.e. a different size) or not clearly defined biochemically. As the following survey of the relevant technology makes clear, there is presently no method for the determination of chromatin associations in the vicinity of a randomly selected nuclear body.
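The independence argument made earlier in this section, that two loci sharing an affinity for a PML body subtype should be found together at one body more often than expected by chance, can be sketched numerically. The example below assumes that each scored PML body is one observation and that the presence of locus A and of locus B is recorded for each; the counts are hypothetical, and the one-sided Fisher's exact (hypergeometric tail) test is one reasonable choice here, not a method attributed to any of the cited studies.

```python
from math import comb


def expected_joint(n_bodies: int, n_a: int, n_b: int) -> float:
    """Bodies expected to carry both loci if the two associations were independent."""
    return n_a * n_b / n_bodies


def hypergeom_tail(n_bodies: int, n_a: int, n_b: int, n_both: int) -> float:
    """P(at least n_both shared bodies) under independence (one-sided Fisher's exact test)."""
    upper = min(n_a, n_b)
    total = comb(n_bodies, n_b)
    favourable = sum(
        comb(n_a, k) * comb(n_bodies - n_a, n_b - k)
        for k in range(n_both, upper + 1)
    )
    return favourable / total


# Hypothetical counts: 100 bodies scored, locus A at 20, locus B at 20, both at 12.
n_bodies, n_a, n_b, n_both = 100, 20, 20, 12
print(f"expected shared bodies under independence: {expected_joint(n_bodies, n_a, n_b):.1f}")
print(f"observed shared bodies: {n_both}")
print(f"one-sided p-value: {hypergeom_tail(n_bodies, n_a, n_b, n_both):.2e}")
```

With these illustrative counts, 4 shared bodies would be expected under independence against 12 observed. A population-level test of this kind still requires candidate loci to be chosen in advance, which is the limitation that motivates methods for sampling the DNA at a single, selected body.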

Figure 1.5 “Unique” PML bodies found in mutant cell lines with attendant genomic dysregulation. A. Mutation in the DNA methyltransferase gene Dnmt3b in immunodeficiency-centromeric instability syndrome-derived cells gives rise to aberrant heterochromatin formation and prominent “giant” PML bodies (green: PML protein; i. ICF-derived cells; ii. wild-type cell). B. In the acute promyelocytic leukemia-derived NB4 cell line, characteristic translocations between the PML and RARalpha genes correspond to a dispersal of PML body components, with the exception of a subset of PML bodies that maintain their integrity (indicated by square insets and colocalization of PML and SP100 signal). (Panel A reproduced from Luciani et al. J Cell Sci. 2006 Jun 15;119(Pt 12):2518-31, by permission from The Company of Biologists; Panel B reproduced from Torok et al. Front Biosci (Landmark Ed). 2009 Jan 1;14:1325-36, by permission from Frontiers in Bioscience)

1.7 Determining the in situ location of gene loci with respect to nuclear substructures: ChIP and Immuno-FISH

1.7.1 Immuno-FISH

FISH is the labelling of specific DNA sequences in fixed cells with fluorescent or hapten-labelled probes of complementary DNA fragments. Combined with immunostaining, immuno-FISH can be used to determine the spatial relationship of particular genomic loci with a given nuclear substructure (Solovei and Cremer 2010). Although refinements in this technique now allow for the nuclear localization of regions approaching 1 kb (Levsky and Singer, 2003; Jiang and Katz, 2002), and the simultaneous detection of multiple genomic and protein targets (Levsky et al. 2002; Zhao et al. 2014), FISH requires a candidate DNA sequence to use as a probe and is thus not suited for de novo identification of DNA sequences at nuclear substructures.

1.7.2 ChIP

Chromatin Immunoprecipitation (ChIP) is a method used to determine DNA sequences that bind a protein of interest in vivo. Sheared chromatin from fixed cells is isolated by virtue of its cross-linking to a protein that is affinity-precipitated by an insoluble substrate-antibody conjugate (Das et al. 2004). The DNA/protein cross-links are then reversed, and PCR using sequence-specific primers can detect the presence of a sequence of interest. End-tagging of the precipitated DNA with universal primers in conjunction with high-throughput sequencing techniques (i.e. "ChIP-seq"; Johnson et al. 2007) or microarrays (i.e. “ChIP on chip”; Rodriguez and Huang, 2005) now allows for genome-wide identification of DNA binding sites for the protein of interest without an a priori candidate locus. Thus, ChIP has been used for the determination of loci associating with a nuclear substructure (Kumar et al. 2007). However, ChIP is limited in several respects. Firstly, a constituent protein may fail to co-precipitate with any cross-linked DNA if it does not directly bind the surrounding chromatin. This requirement is troublesome for the dissection of nuclear bodies, as transcription factors that lack DNA binding domains are consistently found in multi-protein complexes (Xie et al. 2013). To circumvent this limitation, our lab has recently developed "Immuno-Trap", a variant of ChIP that relies on a tyramide-catalyzed deposition of biotin at chromatin in the vicinity of a nuclear body labelled with a tyramide-conjugated antibody (Ching et al. 2013). In this manner, the biotin-deposited chromatin can be specifically precipitated by a streptavidin-bound substrate for sequencing without the necessity for it to be cross-linked to the nuclear body in question.

Still, ChIP (including Immuno-Trap) is not feasible when the starting material is limiting (e.g. single cells, embryos, tissue, rare stem cell progenitors), as it requires large amounts of starting material (typically between 1 and 20 million cells, with the current benchmark for ChIP-seq on the order of 10^5 cells; Gilfillan et al. 2012). Although ChIP has been optimized to detect protein/DNA contacts involving particular loci from as few as 100 cells (O’Neill et al. 2006; Acevedo et al. 2007; Dahl and Collas 2007), these protocols are not compatible with high-throughput genome-wide analysis (Adli and Bernstein 2011), and thus cannot be used for a naive identification of DNA at a nuclear body. Lastly, ChIP does not distinguish the particular nuclear location at which a chromatin/protein association occurs. This is problematic if the protein targeted by the ChIP precipitation is present at nuclear sites other than the nuclear body of interest. Such is the case for those nuclear bodies that, as described above (see Specific convergence of genes at "unique" subtypes of PML bodies), are differentiated by their unique structural or biochemical features (e.g. ratio of proteins) from other bodies within the same nucleus with identical protein constituents.

1.7.3 DamID

DNA adenine methyltransferase identification (DamID) is a method that involves the specific methylation of chromatin in the vicinity of nuclear structures that contain a recombinant E. coli-derived methyltransferase fused to a protein of interest. The E. coli methyltransferase mediates methylation specifically at adenine bases, and as most eukaryotes do not exhibit methylation at adenine bases, loci that associate with the methyltransferase fusion protein (and by extension, associate with a nuclear structure of interest) can be determined by sequencing from linkers ligated to GmATC sites that have been specifically recognized and digested by the restriction enzyme DpnI (Greil et al. 2006). Because DamID relies on the in vivo activity of the fusion methyltransferase, it can detect transient loci associations with the fusion protein without the need for cross-linking, and it has been a powerful assay for the detection of specific loci associations with nuclear substructures, in particular the association of specific loci with the nuclear lamina (Guelen et al. 2008). However, because DamID does not allow for the discrimination of the timing of a locus's methylation by the fusion methyltransferase, it is not suitable for the detection of pairs of loci that converge on a nuclear body of interest.

1.7.4 Fractionation of Nuclei for Specific Substructures and Associated Chromatin

Chromatin associated with a particular nuclear substructure can be identified if the substructure can be isolated along with its associated chromatin via a nuclear fractionation strategy. Such a strategy was used by van Koningsbruggen et al. (2010) to identify loci that are nucleolar associated (nucleolar associated domains, NADs). Chromatin associated with biochemically isolated nucleoli was deep sequenced, and non-rDNA loci that were highly represented in the resultant sequences were verified to associate with nucleoli in a cell population by FISH. This strategy can be highly successful for other nuclear substructures that can be similarly isolated in a fractionation procedure (Lam et al. 2002); however, it is a cell population-based assay and thus is not suitable for the determination of loci convergent at a shared nuclear body.

1.8 Development of Novel Techniques to Determine DNA at Single Nuclear Substructures

The abiding lesson of the above discussion is that the convergence of particular loci to a specialized nuclear compartment may underlie a genome-wide mechanism of regulation. We are motivated then to look for preferential interactions between nuclear substructures and particular gene loci, as a starting point to identify the factors involved in the regulation of these interactions, and to ultimately understand these interactions in the larger context of genetic control. That there are preferential convergences of loci (both in trans and between distant cis elements) at transcription factories suggests that a similar phenomenon may take place at other nuclear bodies. Furthermore, because genes may converge on those nuclear body "subtypes" that are enriched in their particular regulatory factors, of particular interest are those nuclear bodies that are unique, either structurally or with respect to the ratios of their components, among otherwise biochemically indistinguishable nuclear bodies within the same cell. Given the aforementioned limitations in the current technology, we were motivated to develop novel approaches that can interrogate these unique bodies, or, more generally, that can identify DNA in the vicinity of a single, arbitrarily defined nuclear substructure. To this end, the following pages outline the development and proof-of-principle of two such techniques, and their application to the identification of novel sequences that preferentially associate with PML bodies, as well as the identification of a locus pair that shows an interdependence of association with a shared PML body.

Chapter 2: Nano-dissection and sequencing of DNA from single sub-nuclear structures

(Reproduced from Brandon K Chen,1 David Anchel,1 Zheng Gong, Rachel Cotton, Ren Li, Yu Sun, David P Bazett-Jones Small 2014. 10(16):3267-74 with permissions from Wiley VCH)

1 both authors contributed equally

Dr. B. K. Chen, Dr. Z. Gong, Prof. Y. Sun Department of Mechanical Engineering University of Toronto, M5S 3G8 E-mail: [email protected]

D. Anchel, R. Cotton, R. Li, Prof. D. P. Bazett-Jones Genetics and Genome Biology Program The Hospital for Sick Children, M5G 1X8 E-mail: [email protected]

Abstract

The relative positioning of gene loci within a mammalian nucleus is non-random and plays a role in gene regulation. Some sub-nuclear structures may represent "hubs" that bring specific genetic loci into close proximity where co-regulatory mechanisms can operate. The identification of loci in proximity to a shared sub-nuclear structure can provide insights into the function of the associated structure, and reveal relationships between the loci sharing a common association. A technique is introduced based on the nano-dissection of DNA from thin sections of cells by high-precision nano-tools operated inside a scanning electron microscope. The ability to dissect and identify gene loci occupying a shared site at a single sub-nuclear structure is demonstrated here for the first time. The technique is applied to the nano-dissection of DNA in the vicinity of a single promyelocytic leukemia nuclear body (PML NB), and reveals novel loci from several chromosomes that are confirmed to associate at PML NBs with statistical significance in a cell population. Furthermore, it is demonstrated that pairs of loci from different chromosomes congregate at the same nuclear body. It is proposed that this technique is the first that allows the de novo determination of gene loci associations with single nuclear sub-structures.

2.1 Introduction

In recent years, it has become accepted that spatial positioning of chromosomes and genes within the cell nucleus is critically important for accurate gene regulation and integrity of the genome. However, it is unclear how multiple loci, separated across large regions of the same chromosome, or across multiple chromosomes, are co-regulated and give rise to stable, genome-wide transcription profiles that characterize a cell type. The location of a gene within the cell nucleus, both with respect to other genes on the same chromosome (in cis) or on separate chromosomes (in trans), and with respect to multi-protein nuclear sub-structures (i.e., nuclear bodies or NBs), is an actively regulated process that underlies a mechanism of genetic control (Osborne et al. 2004, Dundr et al. 2007, Osborne et al. 2007, Zhao et al. 2009).

Examples of such locus associations are chromosome loci associating with nuclear foci containing high concentrations of their regulatory factors (Takizawa et al. 2008), the shared regulation of certain gene pairs requiring their inter-chromosomal convergence (Spilianakis et al. 2005), and NBs serving as regulatory centers for multiple convergent loci (Osborne et al. 2007, Schoenfelder et al. 2010). This has led to the idea of the NB as a “gene hub” (de Laat and Grosveld 2003, Ching et al. 2005; Razin et al. 2013), whereby the co-regulation of multiple loci, separated across large regions of the same chromosome or across multiple chromosomes, emerges by virtue of their shared affinity with a single NB.

Testing the gene hub model necessitates the identification of genomic regions that preferentially associate at a single NB. The identification of these gene locus sequences will also provide insights into the function of the associated body, and reveal novel co-regulatory relationships that may not be detectable using conventional genome-wide transcription profiling approaches (Levsky et al. 2002). Whereas fluorescence in situ hybridization (FISH) is effective at identifying interactions between specific loci or between specific loci and sub-nuclear structures (immuno-FISH), these methods require a priori information for selection of loci to probe. Hence, immuno-FISH is not appropriate for use in a naïve or unbiased context.

Although there are techniques available to detect protein/DNA interactions, including ChIP (Chromatin Immunoprecipitation; Das et al. 2004), the 3C technique and its variants (Simonis et al. 2007; van Berkum and Dekker 2009), e4C (Schoenfelder et al. 2010), and ImmunoTrap (Ching et al. 2013), these approaches detect the associations of gene loci in cell populations, and are therefore not suited for the de novo detection of pairs of loci that are convergent at a single nuclear body in a single cell. Moreover, there still remain NBs that, by virtue of their transience (Luciani et al. 2006, Ebrahimian et al. 2010), limited number (e.g. embryos), or structural heterogeneity within single cells (Lee et al. 2002, Condemine et al. 2006, Ebrahimian et al. 2010), cannot be assayed using the existing population-based approaches. When these “aberrant” or rare NBs occur in cell lines with concomitant genome-wide dysregulation (Zhao et al. 2000), they are particularly attractive models for studying NBs as regulatory centers for multiple convergent loci. The identification of convergent sequences at these otherwise inscrutable NBs requires an alternative approach, one capable of determining the genomic neighbourhood of a single NB in a single cell.

Physical extraction of sub-cellular structures under optical microscopes using laser-based microdissection (Emmert-Buck et al. 1996) and glass capillary needles (Wesley et al. 1990) has been reported; however, the minimum extractable area is limited to the micrometer scale. Atomic force microscopy (AFM) has been used for imaging, cutting, and extracting sub-micrometer-sized regions of an isolated chromosome (Hansma et al. 1992, Hu et al. 2001, Lü et al. 2006). The use of a single cantilever tip for both imaging and manipulation, however, leads to concerns of contamination. Furthermore, material extraction using AFM relies solely on adhesion to the sharp cantilever tip, which makes the process poorly reproducible.

We report a nano-dissection technique for determining DNA in the immediate vicinity of a sub-nuclear structure (e.g., NB) in a single cell. The technique involves using a nano-manipulation system and custom-made nano spatulas for DNA extraction inside a scanning electron microscope (SEM). The nano-manipulation system controls the motion of nano spatulas, under real-time SEM imaging, to accurately extract a minute amount of DNA that is then amplified and sequenced. As a proof-of-principle, we demonstrated the technique’s feasibility in the dissection of minute volumes in the vicinity of a single histone locus body (HLB). HLB is an NB that has previously been shown to associate with histone gene loci on both chromosomes 1 and 6 (Zhao et al. 2000). From nano-dissection of single HLBs in HT1080 cells, we obtained sequences that are enriched for chromosomes 1 and 6, including sequences within 500 kilobases (kb) and 30 kb of their histone gene clusters, respectively.

We also applied the technique to the dissection of single promyelocytic leukemia (PML) bodies in Jurkat cells. These NBs have been shown to specifically associate with particular gene loci in a cell population (by virtue of particular loci that have significant differences in loci-to-PML body mean minimal distance; Sun et al. 2003, Wang et al. 2004), although it has yet to be demonstrated that there are specific pairs of trans loci that preferentially associate at a shared PML NB. Because there are multiple PML bodies per nucleus in a mammalian cell (typically between 10 and 20), our single body dissection approach is ideally suited to find these convergent trans associated loci, as loci obtained via a population-based approach would not necessarily originate at a shared body. Importantly, the results confirm our capability to recover multiple loci in trans from the dissection of a single PML NB.

2.2 Results

2.2.1 Outline of DNA nano-dissection technique

Targeted regions within a single cell nucleus are removed via nano-dissection inside an SEM, followed by amplification and sequencing of the extracted DNA. Cell sample preparation (described in Materials and Methods) involves fixation, immuno-labelling, post-fixation, cryo-protection and freezing, before cryo-sectioning fixed cells into 300 nm-thick sections. The cell sections are first imaged by fluorescence microscopy to locate structures of interest before finding the same structures by correlative methods in the SEM, prior to nano-dissection.

Sections are supported on a smooth and clean working surface of a doped silicon substrate that is electrically conductive to reduce charging effects caused by electron-solid interactions under SEM imaging. This also mitigates the complex electrostatic force interactions during nano-manipulation. To control for contaminating DNA that may be introduced to the sample prior to PCR amplification, the cells are pre-ligated with fluorescent double-stranded DNA linkers that are complementary to primers used for the PCR (see Materials and Methods).

The extraction tool, termed the nano spatula, is made from solid glass rods heated and pulled into a gradually tapered needle, which is then precision ground to produce a bevel-shaped spatula with the tip end narrowed down to <100 nm (Figure 2.1a, Materials and Methods). The nano spatula is then carbon coated to make it electrically conductive.

The cell sample is mounted on a piezoelectric nano-manipulator (Figure 2.1b), which provides nanometer motion resolution along X, Y, and Z axes (Zhang et al. 2013). The assembly is then transferred into the high vacuum chamber of the SEM. The first step of the extraction process is to locate the cell of interest by correlating the SEM and fluorescence microscope images (Figure 2.1c,d). The target sub-nuclear region is then mechanically extracted using the nano spatula (Figure 2.1e-h). To ensure the extracted biomaterial stays attached to the spatula tip during SEM chamber venting and during manual handling, a fixation step called electron beam induced deposition (EBID) is conducted to deposit a line of hydrocarbon across the biomaterial and nano spatula to enhance adhesion (Figure 2.1i,j). The SEM is then vented, and the nano spatula with adhered chromatin fragments is carefully immersed into the buffer solution inside a test tube. This process is repeated until all desired sub-nuclear regions are dissected. The collected samples are then PCR amplified and sequenced.

Figure 2.1 Chromatin extraction setup and experimental process. a) SEM image of the nano spatula with tip size less than 100 nm in width. It is made from heated glass rod, mechanically ground to produce a beveled surface. b) DNA extraction setup within the SEM. The cell sample is mounted on an XYZ nano-manipulator, facing the stationary nano spatula. c-d) Image correlation between SEM image and fluorescence image. The arrows point to examples of matching features between the two images. e) Locating the cell of interest inside SEM, guided by correlated fluorescence image (not shown). f) Landing the nano spatula tip onto the target of interest. g) Pressing the nano spatula against the sample causes cell fragments to slide onto the bevel surface. h) Lifting up the nano spatula along with the extracted cell fragment. i,j) To ensure the cell fragment does not get detached due to vibration or airflow during SEM chamber venting, electron beam induced deposition (EBID) is used to ‘glue’ it in place. The nano spatula is then removed from the SEM and stored in buffer solution.

Figure 2.2 Nano spatula and manipulator description. a) The nano spatula consists of a sharp bevelled tip and a neck structure. After it is used for DNA extraction and removed from the SEM, the tip of the nano spatula is carefully immersed into the buffer solution inside a test tube. Applying a sideways bending motion to the nano spatula breaks the neck structure, leaving only the tip within the buffer solution. The scale bar represents 4 mm. b) SEM compatible nano-manipulation system. The scale bar represents 2 cm.

2.2.2 SEM Imaging, DNA Integrity, and Nano-Manipulation

Exposure to electrons of just tens of eV is sufficient to alter the organic chemical and biochemical integrity of DNA. Under electron beam irradiation inside the SEM, DNA within the volume penetrated by the electron beam is damaged and cannot be sequenced. Thus, minimizing the electron penetration depth is critical, which can be controlled by reducing the accelerating voltage at the cost of the image signal-to-noise ratio. The signal-to-noise ratio can be improved by increasing the number of irradiating electrons (e.g., by increasing current, spot size, and aperture size), with the trade-off of reduced imaging resolution. Increasing the sample tilting angle also reduces electron penetration depth, but requires image processing to compensate for the tilt-induced image distortions. In summary, a balance between preserving DNA’s biochemical integrity under electron irradiation and the ability to observe the extraction process must be achieved.
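The dependence of penetration depth on accelerating voltage can be illustrated with a rough estimate. The sketch below uses the empirical Kanaya-Okayama range expression, which is not taken from this work and is only approximate at very low beam energies; the carbon-like composition and the density of 1.3 g/cm³ are assumptions chosen for illustration, and the function name is hypothetical.

```python
def kanaya_okayama_range_nm(energy_kev, atomic_weight=12.01,
                            atomic_number=6, density_g_cm3=1.3):
    """Approximate electron range in nm (Kanaya-Okayama expression).

    Defaults describe a carbon-like material; the density is an assumed
    value, not a measured property of the cryosections.
    """
    range_um = (0.0276 * atomic_weight * energy_kev ** 1.67
                / (atomic_number ** 0.889 * density_g_cm3))
    return range_um * 1000.0  # convert micrometers to nanometers


SECTION_THICKNESS_NM = 300.0  # section thickness stated in the text

for kv in (0.4, 1.0, 5.0):
    depth = kanaya_okayama_range_nm(kv)
    share = depth / SECTION_THICKNESS_NM
    print(f"{kv:>4} kV: range ~ {depth:.0f} nm ({share:.1%} of the section)")
```

Under these assumptions the estimated range at 0.4 kV is a small fraction of the 300 nm section, whereas at a few kV it exceeds the section thickness, which is consistent with the motivation for imaging at low accelerating voltage.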

In experiments, we systematically varied the SEM imaging parameters (accelerating voltage, emission current, spot size, aperture size, and sample tilting angle) and conducted chromatin extraction under each set of parameters. These parameters were correlated to the success rate of DNA sequencing. We found that the optimal parameters for nano-dissection are: 0.4 kV accelerating voltage, 2 to 5 µA emission current, 50% spot size, aperture size 3 (50 µm diameter), and sample tilt of 65 degrees. No accumulation of negative charge on the sample surface was observed due to the low accelerating voltage used.

Despite the strong surface adhesion forces present at micro and nanometer scales (Chen et al. 2009), vibration induced by the nano-manipulators and manual handling of the nano spatula, and turbulence of the air flow from venting the SEM vacuum chamber, often detached the extracted materials from the nano spatula. To enhance the tool-sample adhesion, the nano spatulas were constructed using different materials (tungsten, glass, silicon, silicon dioxide, silicon nitride), combined with chemical coating (poly-L-lysine), physical coating (carbon and gold film), oxygen plasma treatment, and an applied voltage to induce electrostatic forces. However, none of these combinations provided consistent adhesion in the vacuum environment. Thus, we used the EBID procedure to fix the extracted cell fragments onto the nano spatula. This worked effectively regardless of the material and surface properties of the nano spatula or the substrate. Prolonged exposure of the sample to the electron beam during EBID did not lower the success rate of obtaining sequenceable DNA, likely because the depth of the electron interaction volume does not increase significantly with longer exposure time.

To extract DNA successfully, nano-manipulation must be properly performed. The thickness of the nano spatula tip and its angle relative to the sample substrate dictate how the fragment slides onto the bevelled surface of the nano spatula tip. If the cell fragment curls up or flips over during extraction, both sides of the cell fragment would be exposed to the electron beam thereby risking beam-induced specimen damage. Having the collected cell fragment break away cleanly from the rest of the cell when the nano spatula is lifted is also critical. A high success rate was achieved when the cell fragment was first pushed to an area outside of the cell, free of any surrounding tethering, before being lifted. To extract a target in the center of the cell, unwanted material is first scraped away, after which the target material is extracted.

2.2.3 Enrichment of Expected Sub-Chromosomal Regions from a Single Nano-Dissected Nuclear Body

To confirm the feasibility of extracting nano-volumes of chromatin from a specific targeted region, a nuclear structure that is known to associate with a particular gene locus is needed. We chose the histone locus body (HLB) as a model sub-nuclear target (see Chapter 1). These 0.5-1 µm diameter structures closely associate in a large percentage of cells with the histone gene clusters on both chromosomes 1 and 6 (Zhao et al. 2000). The protein component of the body is called NPAT, and immuno-labelling against this protein was used to identify it in sections imaged by fluorescence microscopy. Structures of interest were identified in the SEM by correlative imaging, and DNA was extracted by nano-dissection of individual HLBs from 300 nm cryosections of HT1080 cells (see Materials and Methods).

In five separate experiments, we obtained 11 sequences (bona fide, i.e. containing "signature sequence"; see Figure 2.5 and Materials and Methods, Sample Preparation section), 7 of which are from chromosome 6, indicating a significant enrichment for this chromosome (p<5.3×10^-7, Figures 2.3 and 2.4; see Materials and Methods). In one dissection of a single HLB, a sequence within 500 kb of the histone cluster on chromosome 1 was obtained (Figure 2.3a). More remarkably, a sequence mapping to within 30 kilobases (kb) upstream of the histone cluster HIST1H2BJ from chromosome 6 was identified (Figure 2.3b). The dissections thus yielded a significant enrichment of sequences in the vicinity of the histone gene clusters on chromosomes 1 and 6 (p<9.3×10^-5; see Materials and Methods). Taken together, these results, for the first time, demonstrate the feasibility of amplification and identification of DNA dissected from targeted nano-scale volumes of single NBs.
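To give a sense of how an enrichment of 7 out of 11 sequences on one chromosome can be assessed, the following is a minimal sketch of a one-sided binomial test. It is an illustration only: the test actually used is described in Materials and Methods, and the null probability below (chromosome 6 taken as roughly 5.5% of the genome) is an assumption introduced here.

```python
from math import comb


def binomial_tail(k, n, p):
    """Probability of observing at least k successes in n trials."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(k, n + 1))


# Assumption: a sequence drawn at random from the genome maps to
# chromosome 6 with probability equal to its approximate share of the genome.
CHR6_FRACTION = 0.055

p_value = binomial_tail(7, 11, CHR6_FRACTION)
print(f"P(at least 7 of 11 sequences on chromosome 6 by chance) ~ {p_value:.1e}")
```

The exact value depends on the null fraction chosen and on how mappable sequence is counted, so this should be read as an order-of-magnitude check and not as a reproduction of the reported p-value.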

Figure 2.3 Sequences obtained from HLB dissections. Five independent dissections of single HLBs yielded 11 bona fide DNA sequences (i.e., sequences flanked on at least one end by probe that contains “signature” sequence; see Figure 2.5 and Materials and Methods). The sequences obtained were enriched for those that mapped to chromosome 6 (7/11; p<5.3×10^-7). Red bars indicate the chromosomal location of sequences obtained that map to a) chromosome 1, or b) chromosome 6. The histone locus clusters found on chromosomes 6 and 1 are indicated by green bars. Furthermore, the obtained sequences were significantly enriched for those proximal to (within a 2 Mb window of) a histone gene cluster, with two sequences mapping to within 30 kb and 500 kb of the histone clusters on chromosomes 6 and 1, respectively (p<9.3×10^-5).

Figure 2.4 Summary of HLB dissection yields. Chromosomal locations of all sequences obtained from dissecting individual HLB bodies (indicated by red bars).

Figure 2.5 Amplification results obtained from dissections. a) A typical gel result after dissection and amplification. Upper leftmost lane: 300 bp ladder; 1, 14: negative controls (lysis buffer, no input DNA); 2-13: each lane corresponds to a single dissection of nanovolumes of nuclear material. In this experiment, 2 out of 12 lanes (lanes 5 and 6) yielded bona fide sequences as illustrated in b)i, whereby genomic sequence (green) is flanked by primer sequence (blue) and nesting “signature” sequence (pink) that is contributed by ligated probe (see Materials and Methods, Sample Preparation). Occasionally amplicons do arise in negative controls (as seen in lane 1); however, these can be discounted by a lack of signature sequence: as shown in b)ii, a sequenced amplicon from lane 1 contains an unidentified DNA sequence (green) flanked by primer sequences (blue), neither of which is adjacent to the “signature” sequence.
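The bona fide criterion described in this caption (primer followed by the nested signature on at least one end of the amplicon) can be expressed as a simple sequence filter. The sketch below is illustrative only: the primer, signature and genomic sequences are hypothetical placeholders, not the oligonucleotides used in this work, and exact string matching is assumed, with no tolerance for sequencing errors.

```python
PRIMER = "GTGAGTGATGGTTGAGGTAGTG"  # hypothetical primer sequence
SIGNATURE = "ACGTTGAC"             # hypothetical probe-derived signature

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_bona_fide(amplicon, primer=PRIMER, signature=SIGNATURE):
    """True if primer plus signature flanks the insert on at least one end."""
    amplicon = amplicon.upper()
    left_tag = primer + signature
    right_tag = reverse_complement(left_tag)
    return amplicon.startswith(left_tag) or amplicon.endswith(right_tag)


genomic = "TTAGGCCATTGACCGTA"  # hypothetical insert
# Amplicon from ligated, dissected material: signature present at both ends.
dissected = PRIMER + SIGNATURE + genomic + reverse_complement(PRIMER + SIGNATURE)
# Amplicon from contaminating DNA: primers present but no signature.
contaminant = PRIMER + genomic + reverse_complement(PRIMER)

print(is_bona_fide(dissected), is_bona_fide(contaminant))
```

A practical filter would likely also allow for mismatches and for reads that begin partway into the primer; those refinements are left out to keep the criterion itself visible.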

2.2.4 Nano-Dissection of Single PML NBs Identifies Novel Loci-Body Associations

The nano-dissection technique was developed to investigate whether a well-defined, limited set of genes cluster at a single PML NB. We applied the technique to isolate DNA from nano volumes (500 nm × 500 nm × 300 nm, Figure 2.6a) around and including PML NBs in 300 nm cryosections of Jurkat T-cell leukemia cells. Eight sequences were identified. Scraping of a single PML NB yielded sequences from multiple chromosomes. Moreover, owing to the small nuclear volumes extracted, clusters of genes were frequently observed (Figure 2.6b; chromosome 17 and 18 sequences). In fact, the single red bar on chromosome 17 represents two sequences that mapped to within 30 kb of each other. We then used FISH to determine which of these sequences by themselves associated with a PML NB in a population of Jurkat cells. Using BACs that contained the dissected loci DNA, we identified four genomic loci that were significantly associated with PML bodies in a population of Jurkat cells (Figure 2.6c). A BAC containing the BCL2 locus served as a negative control (Sun et al. 2003). Of these four loci, two loci (mapping to p15.4 and q12 of chromosomes 11 and 17, respectively; see Figure 2.6b, probes 5 and 6) originated from the dissection of a single PML NB. Using immuno-FISH on a cell population, we observed alleles containing these two loci together at a single PML NB in some cells (see Figure 2.6d), indicating that this co-association at a PML NB was not a rare occurrence. Hence, we demonstrate that our nano-dissection and identification strategy could be used to test whether pairs or sets of gene loci frequently cluster together at a single PML NB in a cell population. Such a clustering could correlate with the unique biochemical composition of that particular PML NB, such as the relative concentrations of certain transcription regulatory factors. Alternatively, that particular PML NB may be non-randomly positioned in the nucleus, such as next to the nucleolus or at the nuclear envelope. Such spatially confined PML NBs may then associate with particular subsets of loci. The nano-dissection technique is presently being applied to testing these ideas.
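A back-of-the-envelope calculation helps convey how little material a 500 nm × 500 nm × 300 nm dissection contains. The sketch below is a rough estimate under stated assumptions that do not come from this work: a spherical nucleus 8 µm in diameter, a nominal diploid genome of 6.4 × 10^9 bp, and DNA spread uniformly through the nucleus.

```python
import math


def dissected_dna_estimate(dims_nm=(500.0, 500.0, 300.0),
                           nucleus_diameter_um=8.0,  # assumed nucleus size
                           genome_bp=6.4e9):         # assumed diploid content
    """Return (dissected volume in µm³, fraction of nucleus, approximate bp)."""
    # 1 µm³ equals 1e9 nm³.
    volume_um3 = dims_nm[0] * dims_nm[1] * dims_nm[2] / 1e9
    nucleus_um3 = (4.0 / 3.0) * math.pi * (nucleus_diameter_um / 2.0) ** 3
    fraction = volume_um3 / nucleus_um3
    # Uniform DNA density is a simplification; chromatin is not homogeneous.
    return volume_um3, fraction, fraction * genome_bp


volume, fraction, base_pairs = dissected_dna_estimate()
print(f"dissected volume ~ {volume:.3f} µm³")
print(f"fraction of nucleus ~ {fraction:.1e}")
print(f"approximate DNA content ~ {base_pairs / 1e6:.1f} Mb")
```

With these inputs the dissected volume corresponds to DNA on the order of a megabase, which fits the observation that sequences recovered from one dissection can map close together in the genome. The figure shifts with the assumed nuclear size and ploidy.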

Figure 2.6 Nano-dissection of single PML NBs. a) Typical extraction volume is 500 nm × 500 nm × 300 nm on a 300 nm thick Jurkat cell sample (indicated by red circle). The scale bar represents 2 µm. b) Sequences yielded from three nano volume dissections of single PML bodies from 300 nm Jurkat cryosections. Mapped sequences contained in the same bracket were amplified from the same dissection. Interestingly, we obtained sequences across multiple chromosomes from two of these single PML NB dissections. Their mapped chromosomal location is indicated with a red bar (*with the exception of the mapped sequences within 30 kb of each other on chromosome 17, which are represented with a single bar). Sequences were found clustered in the genome (chromosome 17 and 18 sequences), as would be expected from dissection of a limited nuclear volume. c) Center-to-center distances of 3-D FISH signals of indicated BACs to PML bodies were measured in whole Jurkat nuclei. For each FISH experiment, at least 100 nuclei were analyzed. Significantly higher percentages of FISH signals within 1 µm and 1.5 µm of PML bodies were obtained compared with a BAC overlapping with the BCL2 gene. (* p<.0086; ** p<.0014; *** p<6×10^-7; **** p<.03; ***** p<5×10^-15; ****** p<.0006). d) 3-D double FISH was performed using probes 5 and 6, the two BACs that showed significant PML association, and that corresponded to sequences obtained from the same dissection. Shown is an optical section of a Jurkat cell nucleus in which both FISH signals were within 1.5 µm of a shared PML NB.

2.3 Discussion

The spatial arrangements of specific gene loci relative to each other and to specific sub-nuclear neighbourhoods represent a level of gene organization that has regulatory potential (Zhao et al. 2009). Hence, the ability to identify specific gene loci at particular sub-nuclear landmarks in single cells provides a powerful step in generating hypotheses on the mechanisms of regulating nuclear events. The work described herein provides proof-of-principle demonstrations of a transformative chromatin extraction and identification technique.

We demonstrate the feasibility of isolating nano volumes of chromatin followed by amplification and identification of the extracted DNA sequences. We first showed that the dissection is accurate enough to target a single HLB, and enrich for sequences previously shown to associate with it. We then applied the technique to dissect chromatin from single PML NBs.

Combined with FISH, we were able to identify novel loci that specifically associate with PML

NBs in a cell population. In its current state, the nano-dissection technique is useful as a hypothesis generator, uncovering novel NB-gene associations.

The nano-dissection technique is particularly suited for asking whether pairs or sets of genes come together at specific sub-nuclear structures. In these first nano-dissection experiments of single PML NBs, we identified a pair of loci that were together at a PML NB.

When probing these two loci in a cell population with immuno-FISH, we observed co-association at single PML NBs, although this occurrence was less frequent than each locus being found at a different PML NB. We thus concluded that these two loci, although spatially associated with PML NBs, do not have to come together at the same body.

At present, the technique has a success rate of ~16% (Figure 2.S2a), which represents the frequency of dissections yielding bona fide DNA sequences (i.e. those sequences that originated from the dissected region by virtue of the presence of a "signature" sequence; see Materials and Methods, Sample Preparation section) that can be unambiguously mapped to a genomic locus. The present accuracy of the scrape, based on the accuracy of correlative imaging, is +/- 350 nm, and has been greatly improved with recently developed automation of the correlation process (Gong et al. 2014). Further optimization of the technique is expected to improve the extraction accuracy and success rate.

We can claim that the sequences are significantly enriched for sequences within 500kb of the histone locus clusters at 6p22.1 and 1q21.2. The reported p-value of p<9.3×10⁻⁵ for this enrichment is based on comparing the proportion of sequences obtained from our HLB dissection experiment that are within 500kb of the histone locus cluster at 6p22.1 or 1q21.2, with the probability of obtaining this same proportion from randomly sampling the genome (as described in Materials and Methods). We can also claim that scraping of material in the vicinity of HLBs yielded a pool of sequences that are enriched in sequences within 500kb of the histone locus clusters at 6p22.1 and 1q21.2 by considering the volume of the nuclear material dissected compared with the total nuclear volume. In this present work, we chose nuclear bodies that were located at the edge of the nuclear volume to minimize chromatin material

extracted in the path of the dissection tool as it extended past the nuclear body of interest to the extra nuclear space. This extra-nuclear material was added to the tool tip as a measure to reduce possible chromatin damage incurred during EBID. Although this volume can be greatly reduced by first removing all unwanted nuclear material with a separate scraping tip, even with this extra nuclear material, our technique still allows for dissection of sub-µm³ volumes of nuclear material surrounding a nuclear body of interest. Thus, even with a conservative upper estimate of 1 µm³ of dissected nuclear material, given an average nuclear volume of approximately 413 µm³ for HT1080 cells (nuclear volumes of 20 DAPI-stained z-stacks of

HT1080 cells were measured with ImageJ), the dissected region represents less than .25% of the total nuclear volume, suggesting that it is highly unlikely that the two sequences obtained out of 11 sequences that were within 30kb and 500kb of the histone locus clusters at 6p22.1 and 1q21.2 were obtained by merely randomly sampling chromatin in the nuclear volume. A conservative estimate of this probability can be obtained as follows: Although the distribution of chromatin throughout the nuclear volume can vary depending on local differences in compaction (Rapkin et al. 2012), ESI imaging of HLBs does not indicate compacted chromatin in its vicinity (see Figure 3.10). Thus, assuming a constant density of chromatin throughout the nuclear volume (such an assumption is a conservative estimate of the density at the HLB as it is less compact than other nuclear regions), .25% of the nuclear volume can be used to represent that same proportion of the total diploid genome (i.e. 6Gb; our model below yields the same probability regardless of whether a diploid or haploid genome size is chosen, as the factor of two difference between them cancels out in the calculation). A p-value can now be calculated, based on the probability of obtaining the proportion of sequences within 500kb of the histone

clusters on chromosome 6 and 1 as obtained in the HLB dissection experiment (2 out of 11) with a null hypothesis that the sequences were obtained by randomly sampling the chromatin in the nuclear volume. If the dissected region represents .25% of the genome (i.e. .0025 × 6 Gb = 15 Mb) then, by randomly sampling 15 Mb regions of a genome binned into 1 Mb windows (based on the sequences being within +/-500 kb of the histone clusters at either 6p22.1 or 1q21.2, i.e. 6 Gb/(1 Mb × 2) = 3000 1 Mb windows), there is a 1/200 chance that a dissected region spans the histone locus cluster (i.e. 200 choices of 1 Mb windows for a 15 Mb random sampling of the genome; (3000 1 Mb windows)/15 Mb = 200). Thus a .005 (i.e. 1/200) probability was used in a binomial test that a randomly dissected volume yields a sequence that is within a 1 Mb window spanning the histone locus cluster at either 6p22.1 or 1q21.2, and the observed proportion of sequences obtained from the dissection of sub-micron volumes at HLBs (2 out of 11 sequences) yields a p-value ≈ 1.33×10⁻³ (cumulative probability that there are greater than or equal to 2 sequences out of 11 within 500kb of the histone locus cluster at either 6p22.1 or 1q21.2). Thus using two different statistical models, the chromatin obtained by dissection of HLBs is significantly enriched for sequences within 500kb of the histone locus clusters.
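The arithmetic of this volume-based estimate can be reproduced with a short script. The following is a minimal sketch, assuming a one-sided binomial tail P(X ≥ 2) with n = 11 and the per-sequence probability of 1/200 derived above; the function and variable names are hypothetical and are not taken from the original analysis.

```python
from math import comb

def binomial_tail(k_min: int, n: int, p: float) -> float:
    """Return P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))

# Quantities quoted in the text, expressed in Mb.
genome_mb = 6000                    # diploid genome, 6 Gb
dissected_mb = 0.0025 * genome_mb   # .25% of the genome, i.e. 15 Mb
n_windows = genome_mb / (1 * 2)     # 6 Gb / (1 Mb x 2) = 3000 windows

# Chance that a random 15 Mb sample spans a histone-cluster window (1/200).
p_hit = dissected_mb / n_windows

# Observed: 2 of 11 sequences near the histone clusters.
p_value = binomial_tail(2, 11, p_hit)
print(f"p_hit = {p_hit:.4f}, p-value = {p_value:.2e}")
```

Under these assumptions the tail probability comes out close to the 1.33×10⁻³ reported above.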

Density and porosity of the cell nucleus play a key role in the depth of the electron interaction volume. To minimize the electron beam induced damage on DNA in future studies, cell samples can be “encapsulated” to fill the pores, or coated with a thin conformal material layer deposited over the sample to act as the electron shield. Encapsulation with Tween-20 detergent, for example, has recently shown promise as a means to protect biological specimens from the ionizing damage of the SEM beam (Takaku et al. 2012).


In this study, we used correlative microscopy (i.e., correlating SEM and fluorescence microscopy images) to find structures of interest to nano-dissect. Imperfections in image alignments can lead to reduction of spatial resolution in dissecting the material of interest. A possible improvement is to immuno-label the target sub-nuclear structure with gold nanoparticles, which can be directly visualized within the SEM and hence, eliminate the need for correlative imaging.

It is possible that we successfully amplified only a portion of DNA that was dissected onto the tip of nano spatulas. Independent of electron-induced DNA damage, we suspect two other factors that could have limited the amplification efficiency. First, our approach used to reduce DNA contamination may have limited the sensitivity of the PCR amplification (See

Materials and Methods). DNA contained on the tip that is otherwise amplifiable (i.e., not damaged by the SEM beam) but not flanked by ligated probe, may be insufficiently amplified due to a lack of sites complementary to the PCR primer, or discounted after sequencing by a lack of probe “signature” sequence (See Materials and Methods, Sample Preparation section).

Thus, the sensitivity of PCR amplification is dependent on the proportion of DNA that is flanked by ligated probe. The proportion of DNA ligated with flanking probe may be improved by employing a staggered-ended probe instead. Second, because single molecule amplification requires a high number of PCR cycles for sufficient yields for cloning or direct sequencing, the final detectable PCR amplification products may only represent those dominant species that emerge over successive PCR cycles. As an alternative approach, a more representative population of template can be obtained by reducing the number of PCR cycles to yield amounts of DNA sufficient for either microarray analysis or next generation sequencing.


Our technique complements the scope of the present tools available to interrogate spatial relationships of gene loci (i.e. ChIP, 3C). It offers unprecedented sensitivity (e.g., the identification of genomic sequences converging on a single NB) and captures chromatin loci associations that may occur at distances that preclude direct or indirect chemical cross-linking to each other or a protein constituent of the nuclear substructure (i.e. as is required by 3C and

ChIP respectively). This allows for the interrogation of chromatin in an arbitrary vicinity of a nuclear structure, which may reveal otherwise undetectable transient or indirect chromatin associations. This work thus far serves as a proof-of-principle and is intended as a basis for further investigation. In addition to being particularly suited to the detection of pairs or limited sets of loci that converge at single NBs, our approach can be used to find novel chromatin/nuclear body associations that cannot be detected by the present population-based methods, either because of limiting starting material (embryos and tissue), or because the NB in question cannot be biochemically isolated. PML NBs, for example, show differences in the ratios of particular PML protein isoforms from body to body (Condemine et al. 2006), as well as other factors, and these compositional differences may reflect the preferential association of genomic loci to a particular variant of a PML NB among the several found within a single cell.

Such associations would not necessarily be detected by ChIP because it relies on the immunoprecipitation of a pooled population of cells.

The question of specific locus associations with PML NBs originally motivated the development of the nano-dissection technique. However, we expect that this technique has wide applicability and can be useful as a novel assay for general questions of interactions of chromatin with sub-nuclear domains that are so far intractable.


2.4 Materials and Methods

Sample Preparation: Cells are fixed in 2% formaldehyde for five minutes and washed in PBS. Cells are pelleted and prepared for cryosectioning as previously described (Ching et al. 2013). Cryosections (300 nm thick) are then washed with PBS, and restriction digested with Sau96I and MseI enzymes for 3 hours. The resultant double strand breaks are then blunted with for 1 hour at room temperature, washed 3X5 minutes in PBS and then ligated overnight at room temperature with the following fluorescently tagged double stranded oligo:

Link1 5' (Cy3)AGT GGG ATT CTT GCT GTC AGT TAG CTG 3', Link2 5' CAG CTA ACT GAC AG(ddC) 3' (ddC: dideoxy C). Note that this oligo Linker contains the priming sites for the PCR amplification. The section is then washed 3X5 minutes with 0.1 % Tween 20/2XSSC solution at 42 ºC, and 3X5 minutes at 60 ºC with 0.5XSSC solution. The efficiency of the ligation is confirmed by the Cy3 signal under fluorescence microscopy. Now the genomic DNA is flanked by priming sites specific to those primers used for the PCR amplification. Contaminating DNA introduced during downstream processing steps will not be recognized by the linker specific primers and thus not be amplified. By performing this pre-ligation step prior to the nuclei scraping we were able to reduce the background amplification problem. Invariably, contaminants do get amplified by mispriming events; however, the presence of a 4bp “signature sequence” in the ligated linker (denoted by the italicized subsequences of Link1 and Link2), allowed us to distinguish bona fide amplicons that originated from DNA in the sample from those that arose from a mispriming event (see Figure 2.5).
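The signature-sequence filter can be illustrated with a few lines of code. This is only a sketch under stated assumptions: the italics marking the signature are not preserved here, so the signature is taken to be the four 3' bases of Link1 that extend past the PCR primer listed under DNA Extraction and PCR Amplification. The function name and the example reads are hypothetical, and only reads in the linker orientation are considered.

```python
# Link1 as listed above (Cy3 label and spacing omitted).
LINK1 = "AGTGGGATTCTTGCTGTCAGTTAGCTG"
# PCR primer listed under "DNA Extraction and PCR Amplification".
PRIMER = "AGTGGGATTCTTGCTGTCAGTTA"
# ASSUMPTION: the signature is the part of Link1 not covered by the primer.
SIGNATURE = LINK1[len(PRIMER):]

def is_bona_fide(read: str) -> bool:
    """True if the read starts with the primer immediately followed by the signature."""
    cleaned = read.upper().replace(" ", "")
    return cleaned.startswith(PRIMER + SIGNATURE)

# Hypothetical reads: one from a ligated linker, one from a mispriming event.
ligated_read = LINK1 + "ACGTACGTAC"
misprimed_read = PRIMER + "TTTTACGTACGTAC"

print(SIGNATURE, is_bona_fide(ligated_read), is_bona_fide(misprimed_read))
```

A read produced by mispriming carries the primer sequence but, except by chance, not the four linker bases that follow it, which is why it can be discounted.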


Nano Spatula Fabrication: We used a micropipette puller (P-97 Sutter Instrument) to process a 1 mm diameter solid glass rod (GR100-4 from World Precision Instruments) for creating nano spatula tools. Fabrication parameters are summarized in Table 2.1. With the first recipe, a neck structure is created. With the second recipe, the sharp tip is created (Figure

2.2a). The sharp tip is then mechanically ground using a beveler (BV-10 Sutter Instrument) to create the beveled surface. The nano spatula is mounted on the nanomanipulation system (see

Figure 2.2b) for DNA extraction.

Table 2.1. Fabrication parameters for constructing nano spatulas.

Recipe   Loop Number   Heat   Pull   Velocity   Time
1        1             508    75     12         250
1        2             508    75     7          250
1        3             508    75     6          250
2        1             508    90     8          250
2        2             508    90     8          250
2        3             508    90     12         250

DNA Extraction and PCR Amplification: Samples are incubated overnight at 42 ºC in 4.5 µL of DNA extraction buffer as used in Langer et al. 2005: 0.5 µl of 10 One-Phor-All-Buffer-Plus (Amersham Pharmacia Biotech), 0.13 µl 10 % Tween 20 (Sigma, Germany), 0.13 µl 10 % Igepal CA-630 (Sigma), 0.13 µl Proteinase K (10 mg ml−1, Sigma). Proteinase K is then inactivated with an 80 ºC incubation for ten minutes. PCR is performed in 50 µl total volume using the Titanium Taq polymerase kit (Clontech) under the following conditions: 72 ºC 1 minute, 68 ºC 3 minutes, then 14 cycles of 94 ºC 40 seconds, 57 ºC 30 seconds, 68 ºC 1 minute 30 seconds increasing by one second each cycle. Then 8 cycles of 94 ºC 40 seconds, 57 ºC 30 seconds, 68 ºC 1 minute 45 seconds increasing by one second each cycle. Then 22 cycles of 94 ºC 40 seconds, 65 ºC 30 seconds, 68 ºC 1 minute 53 seconds increasing by one second each cycle. Then 68 ºC for 3 minutes 40 seconds. Primer sequence is as follows: 5' AGTGGGATTCTTGCTGTCAGTTA 3'.

Then 10 µl of the amplification product is analyzed on a 2 % agarose gel, and 2 µl is cloned for sequencing using the CloneJET PCR cloning kit (Thermo Scientific).

3-D FISH and Imaging: For Jurkat cells, immunofluorescence and FISH were performed as previously described (Ching et al. 2013). BACs corresponding with the mapped location of Jurkat sequences (Figure 2.5b) are described as follows: BCL2 Probe- rp11-299P2, Probe 1- rp11-164H24, Probe 2- rp11-881M16, Probe 3- rp11-366H2, Probe 4- rp11-66P2, Probe 5- rp11-15D18, Probe 6- rp11-510P20. Confocal stacks with a Z increment of 0.1 µm were taken with an Olympus IX81 microscope.

3-D FISH Measurements and Statistics: The percentage of cells with at least one BAC signal associating with a PML NB (using thresholds of 1 and 1.5 µm) was determined by measuring the Euclidean center-to-center distances between FISH signals and PML with ImageJ. Significant differences (p<0.05) in association frequencies were determined by taking the two-tailed p value under an exact Fisher test. To determine if the shared association at a PML NB of two BACs was not due to their independent association, the frequency of their association (using a 1.5 µm threshold) was compared to a model in which the loci associations are independent and equally likely between any PML NB within a cell. This gives a probability of association of two BACs at the same PML NB in a cell as: (frequency of BAC A) × (frequency of BAC B)/(Avg. # of PML bodies per cell). We use this as the probability of success in a binomial distribution test to determine the p value for the frequency of paired associations at a single PML NB. To calculate a lower estimate for the p value associated with finding two sequences within 500 kb of the two histone clusters on chromosome 6 and 1 (P_insidewindow), we considered it as the complement probability of a sequence not being within either of 2 Mb windows centered on these two histone clusters (P_outsidewindow). This gives us: P_insidewindow = 1 − P_outsidewindow = 1 − [((number of base pairs in human haploid genome) − (2 × 2 Mb))/(number of base pairs in human haploid genome)] ≈ 0.0013. We then used this value as the probability of success in a binomial test. To determine a p value for the enrichment of chromosome 6 sequences obtained from the HLB dissections, we considered an upper estimate of the chance of getting a sequence from chromosome 6 as: (size in base pairs of chromosome 6)/(size of haploid genome) ≈ (171×10⁶)/(3×10⁹) = 5.7×10⁻². We then used this value as the probability of success in a binomial test.
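The probabilities defined in this section can be written out as a short script. The sketch below is illustrative only: it assumes a haploid genome of 3×10⁹ bp (the value used in the chromosome 6 estimate), rounds the window probability to 0.0013 as quoted, and applies it to the 2-of-11 HLB result described in the Discussion. The function names and the frequencies passed to the paired-association model are hypothetical placeholders, not measured values.

```python
from math import comb

def binomial_tail(k_min: int, n: int, p: float) -> float:
    """Return P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))

HAPLOID_GENOME_BP = 3e9   # assumed haploid genome size, as used in the text
WINDOW_BP = 2e6           # 2 Mb window centred on each histone cluster
N_CLUSTERS = 2            # clusters at 6p22.1 and 1q21.2

# P_insidewindow = 1 - P_outsidewindow
p_outside = (HAPLOID_GENOME_BP - N_CLUSTERS * WINDOW_BP) / HAPLOID_GENOME_BP
p_inside = 1 - p_outside

# Upper estimate of drawing a chromosome 6 sequence at random.
p_chr6 = 171e6 / HAPLOID_GENOME_BP

# Binomial test for 2 of 11 sequences, using the rounded value quoted in the text.
p_window = binomial_tail(2, 11, round(p_inside, 4))

def paired_association_probability(freq_a: float, freq_b: float, mean_pml_bodies: float) -> float:
    """Independent-association model: both BACs at the same PML NB in a cell."""
    return freq_a * freq_b / mean_pml_bodies

# Hypothetical inputs, for illustration only.
p_pair = paired_association_probability(0.4, 0.3, 10)

print(f"P_inside = {p_inside:.4f}, P_chr6 = {p_chr6:.3f}")
print(f"p (2 of 11 in window) = {p_window:.2e}, P_pair = {p_pair:.3f}")
```

With the rounded probability of 0.0013, the binomial tail falls just under the 9.3×10⁻⁵ bound reported in the Discussion.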

Nucleus volume of HT1080 cells: Optical stacks of DAPI stained HT1080 cells were obtained, and nuclear volumes measured using the "3D point picker" macro of ImageJ. The results are summarized below:


stack1
1 85345 0.321 1980.162 210.579 42.639 18.678 210.638 42.694 19.28 608.416
2 92126 0.259 2074.689 95.011 78.739 18.385 95.182 78.594 19.018 656.757
3 71924 0.237 2223.856 381.638 102.511 18.184 381.566 102.522 18.745 512.739
4 67553 0.218 2528.892 267.979 166.307 18.226 267.822 166.468 18.714 481.5786
5 5976 0.836 1820.69 3.116 193.224 20.285 3.057 193.091 20.463 42.60231
6 65232 0.224 2786.218 154.398 236.715 18.038 154.408 236.55 18.068 465.0324
7 68413 0.242 2375.002 335.144 296.404 17.896 335.071 296.299 18.162 487.7094
8 68966 0.261 2087.778 121.844 408.063 17.626 122.082 407.868 17.723 491.6517
9 76841 0.286 2102.903 348.083 437.285 17.373 348.164 437.021 17.55 547.7918
10 35219 0.365 2292.318 9.723 482.854 18.626 9.173 482.881 18.162 251.0727
stack2
1 31421 0.344 2155.969 98.881 37.849 21.396 98.933 37.954 21.963 223.9972
2 53125 0.284 2380.806 242.901 149.573 21.48 242.857 150.068 22.377 378.7228
3 64298 0.221 2619.403 414.556 158.532 19.955 414.363 158.58 20.595 458.374
4 29564 0.308 2962.282 8.618 201.88 19.654 7.71 201.947 20.043 210.7588
5 60532 0.243 2489.616 300.143 232.183 20.588 299.951 232.153 21.376 431.5266
6 85824 0.282 2248.794 140.742 289.409 21.024 140.921 289.203 21.722 611.8307
8 55265 0.456 1995.639 254.917 337.405 23.694 254.718 337.658 23.876 393.9787
9 33357 0.294 2204.761 493.697 361.389 20.982 493.779 361.417 21.198 237.7987
10 96627 0.404 1991.434 32.984 393.629 20.505 32.92 393.416 20.743 688.8442
11 62476 0.235 2401.223 332.759 398.45 19.702 332.655 398.217 19.913 445.3852
stack3
1 41809 0.488 2162.402 192.832 36.181 24.703 192.776 36.334 25.336 298.0522
2 49420 0.415 2178.28 318.291 45.015 25.453 318.33 45.019 25.995 352.3102
4 59568 0.247 2573.739 303.799 204.72 22.341 303.75 204.797 23.024 424.6543
5 35491 0.318 2401.547 277.642 278.342 23.701 277.703 278.145 24.081 253.0118
6 88697 0.252 2389.072 218.621 327.434 21.996 218.761 326.986 22.492 632.312
7 103760 0.25 2372.21 51.757 388.79 20.89 51.015 388.662 21.133 739.6947
8 53807 0.371 2153.786 261.535 445.447 22.082 261.546 445.534 22.356 383.5847
9 49355 0.274 2293.732 38.522 465.594 20.471 38.612 465.568 20.563 351.8469
10 31336 0.555 2061.225 460.138 462.19 24.993 460.05 462.194 25.147 223.3912
12 37606 0.34 2223.675 157.574 271.484 27.332 157.525 271.543 27.493 268.0894
13 36052 0.376 2225.988 248.85 140.07 27.766 248.824 140.36 28.065 257.0111

Average Nuclear Volume: 413.2428 µm³    StDev: 165.2863 µm³
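The summary values can be recomputed from the final column of the measurements above. The sketch below assumes that this last column is the nuclear volume in µm³ (consistent with the reported average) and that the reported StDev is the sample standard deviation; the variable names are illustrative.

```python
from statistics import mean, stdev

# Final column of the table above, taken to be nuclear volume in µm³ (assumption).
volumes_um3 = [
    608.416, 656.757, 512.739, 481.5786, 42.60231, 465.0324, 487.7094,
    491.6517, 547.7918, 251.0727,
    223.9972, 378.7228, 458.374, 210.7588, 431.5266, 611.8307, 393.9787,
    237.7987, 688.8442, 445.3852,
    298.0522, 352.3102, 424.6543, 253.0118, 632.312, 739.6947, 383.5847,
    351.8469, 223.3912, 268.0894, 257.0111,
]

average_volume = mean(volumes_um3)
sample_sd = stdev(volumes_um3)   # sample (n - 1) standard deviation

# Fraction of the nucleus represented by a conservative 1 µm³ dissection.
dissected_fraction = 1.0 / average_volume

print(f"n = {len(volumes_um3)}, mean = {average_volume:.4f}, SD = {sample_sd:.4f}")
print(f"dissected fraction = {dissected_fraction:.4%}")
```

The resulting fraction is the basis for the "less than .25% of the total nuclear volume" figure used in the Discussion.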


Chapter 3: A novel light microscopy-based method to determine DNA sequences at single nuclear sub-structures

As reviewed in the introduction, gene loci make specific associations with components of the nucleus, and this association may determine or reflect a mechanism of genetic control. The identification of "gene hubs", i.e. convergent loci at a shared nuclear body, is not possible with current methods that rely on the use of loci-specific probes, or immunoprecipitation of bulk cells with a particular protein. The following is a summary of the major benchmarks leading to the development of a light microscopy-based method for the identification of loci contained within the vicinity of a single nuclear body in a single cell. As a proof-of-principle I demonstrate that DNA obtained from the targeting of single nuclear substructures is enriched in loci known to associate with them.

3.1 Introduction

A prevailing theme of nuclear organization is that processes of genomic regulation are both imparted by, and contribute to, the compartmentalization of nuclear factors into enriched foci called "nuclear bodies" (NBs). The observation of specific genomic/nuclear body associations, including U2 gene loci at Cajal bodies (Dundr et al. 2007), Hemoglobin b (Hbb) and other erythroid specific genes at transcription factories (Osborne et al. 2004), and the p53 locus at PML bodies (Shiels et al. 2001), has led to the idea of the nuclear body as a "gene hub", whereby the co-regulation of sets of genes, either in trans, or separated by large regions along a chromosome, is achieved by their shared association with a nuclear body enriched for their cognate regulatory factors (Ching et al. 2005; de Laat and Grosveld 2003; Razin et al. 2013).

Crucial to the testing of the "gene hub" model is the ability to detect multiple loci that are convergent at a shared nuclear body. Their detection would also provide insight into the function of the associated body, and may reveal novel co-regulatory states that precede a change in transcription and are thus undetectable by traditional gene profiling approaches (Levsky et al. 2002). So far, loci/nuclear body associations have been revealed by population-based approaches that rely on the crosslinking of DNA with either a constituent protein of the body (i.e. ChIP; Das et al. 2004) or a hapten deposited in its vicinity (i.e. immuno-Trap; Ching et al. 2013) that is precipitated in a biochemical pulldown, or by a direct observation of the nuclear location of an in-situ labelled candidate genomic locus (i.e. immuno FISH; Brown et al. 1997). However, there remain certain bodies that are not amenable to these approaches by virtue of limited starting material (e.g. embryos; Ebrahimian et al. 2010), structural heterogeneity (i.e. different sizes or ratio of constituent proteins between nuclear bodies) within single cells (Torok et al. 2009; Condemine et al. 2006; Luciani et al. 2006), or lack of a known constituent protein that can be cross-linked to its surrounding chromatin. In particular, the presently available techniques do not allow for the dissection of those "unique" NBs that, by virtue of their unique size or composition, cannot be biochemically distinguished from other structures within the same cell. Because such NBs occur in cell lines with concomitant genome-wide dysregulation (Lee et al. 2002), we have hypothesized that the presence of these "aberrant" bodies may be linked to the dysregulation of multiple loci that are convergent upon them (Torok et al. 2009). We were thus motivated to develop a novel approach that allows for the dissection of the genomic neighbourhood of a single nuclear body.

To this end, I introduce here a novel technique, laser targeted oligo ligation (LTOL), based on the amplification and sequencing of DNA at sites of double strand breaks (DSBs) induced by targeted 2-photon irradiation. We demonstrate that this approach is sensitive enough to yield DNA sequences originating from the targeted region of single nuclear bodies, of sufficient length to be mapped to a specific genomic locus. As proof-of-principle, we targeted single histone locus bodies (HLBs), and show that the resultant sequences are enriched for those that cluster near the expected colocalizing histone gene locus (Zhao et al. 2002). This work serves as both a benchmark for the determination of DNA sequences in selected volumes at single sub-nuclear structures in-situ, and as the introduction of a novel approach to find loci/nuclear body associations in those bodies that cannot be dissected by conventional biochemical assays.


3.1.1 General Description of Technique

The LTOL approach rests on the ability to ligate double stranded oligo probes to sub-nuclear volumes that contain DNA double strand breaks (DSBs) that have been induced by targeted 2-photon irradiation in the presence of a photosensitizer (Anchel 2009). As part of my Master's work, the basic methodology was shown to be feasible: oligos can be specifically ligated to DSBs induced in selected sub-nuclear regions with two-photon irradiation, and the genomic DNA adjacent to the ligated oligos from a few pooled cells can be amplified to yields sufficient for cloning. The basic method is outlined in Figure 3.1 and described in more detail in Materials and Methods. Briefly, targeted two-photon irradiation creates localized DSBs at a single immunolabelled subnuclear structure. The targeted cells are isolated by a laser catapulting microscope (Schutze et al. 1998), incubated in a lysis buffer, and subjected to PCR using a primer complementary to the ligated probe (see Materials and Methods for more detail). The amplified DNA is then cloned and sequenced, and for those sequences that can be mapped to specific genomic sites, their corresponding BACs are used for FISH to verify their colocalization with the targeted structure in a cell population. In this way, we can determine DNA sequences in the vicinity of chosen nuclear substructures, particularly those nuclear bodies that have so far been intractable by traditional biochemical pulldown-based approaches.

Although I was able to demonstrate previously that genomic DNA could be amplified from a few cells (as few as 8 pooled cells) laser targeted across relatively large subnuclear regions (rectangles that extend across the diameter of the nucleus), further improvements were needed in the procedure in order to adapt it to the dissection of single nuclear bodies. The work presented here outlines the development of those improvements made upon the targeting, probe labelling, amplification, and sequencing procedures, that allow for sufficient sensitivity and accuracy to identify DNA originating from the targeting of a single nuclear body.

Figure 3.1 Outline of LTOL procedure. (a) i. Cells on a keystoned or gridded coverslip are immunolabelled and Hoechst stained. ii. In the fluorescent channel of the immunolabelled signal, a z-stack is taken. A 2D region of interest confined to a chosen number of stacks is targeted with two-photon irradiation. This bleaches the Hoechst in the targeted volume, causing localized DNA double strand breaks (DSBs). iii. In order to blunt the DNA for ligation with the blunt-end probe, the cells are incubated with Klenow enzyme and dNTPs. iv. The cells are then incubated with a blunt-end oligo and T4 DNA Ligase. The oligo contains priming sites that are used for the subsequent amplification steps. To prevent intra-probe ligation the oligo lacks 3'OH groups. Ligation to blunted genomic DNA occurs between a 5'phosphate contained on one strand of the oligo (binding strand) and a 3'hydroxyl of the blunted genomic DSB. (b) Single targeted cells (red arrow heads) are located within the keystone for PALM isolation (ii) into lysis buffer. (c) The lysed cell is then subjected to PCR amplification with primers (red) complementary to the probe to sufficient yield for sequencing. Primers used for PCR are 4bp shorter than the probe sequence so that amplicons that occur by mispriming events can be discounted by the lack of a "signature sequence" immediately following the primer sequence (see Figure 3.6).


3.1.3 Targeted 2-photon DSB induction at single-nuclear substructures

In order to confine the effects of the UV irradiation axially (i.e. in the axis parallel to the laser path), the "two-photon" effect is exploited: an electron can "transition" from its ground state (S0) to its excited state (S2) by absorbing an incident photon whose energy is approximately equal to the potential difference between S2 and S0. In the "two-photon" effect, this same transition can take place if two photons, whose summed energy approximately equals the transition potential, are absorbed within the time interval required for an S0 to S2 transition (on the order of 10⁻¹⁸ seconds). This process rarely occurs unless a large number of photons per unit area (photon flux) are coincident on a sample (typically requiring 10²⁰–10³⁰ photons/cm²), so that for an incident laser focused through a high numerical aperture objective, sufficient photon flux is only achieved within a volume largely confined to the focal plane (Oheim et al. 2006; Meldrum et al. 2003). Thus, the induction of DSBs in a cell sample mediated by photo-excitation of the DNA-binding Hoechst 32258 dye (requiring incident photons in the range of 320-400 nm), can be largely confined to the focal plane of incident 780 nm photons (i.e. of approximately half the energy required for Hoechst excitation), where the two-photon absorption events are maximal. In a two-photon microscope equipped with rastering scanning mirrors, the timing of the laser emission can be precisely controlled to toggle on only when the light path is incident at user-defined coordinates in the field of view. Assuming a constant time for the scanning mirrors to complete a scan, and constant rotation speed throughout, this control is achieved by determining the fraction of the inputted x-y coordinate (xi,yi) to the

entire rastering path. For a raster pattern that starts at (0,0) and ends at (512,512), completing a horizontal line before continuing vertically, this fraction is:

$$\text{path fraction} = \frac{x_i}{512} \times \frac{y_i}{512}$$

Thus the timing of the laser toggle is set to:

$$t_{\text{toggle on}} = t_{\text{total raster time for field}} \times \text{path fraction}$$

i.e., a fraction of the total rastering time that is equal to the fraction of the inputted x-y coordinate to the entire rastering path.
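The timing calculation can be sketched in a few lines. The first function below implements the path fraction exactly as written above. The second is not from the text: it is a row-major indexing that is often used for line-by-line rasters, included here only as a labelled assumption for comparison. The field size of 512 and the constant scan speed follow the text, while the total raster time of 1 s and all names are hypothetical.

```python
FIELD_SIZE = 512  # raster runs from (0, 0) to (512, 512), as in the text

def path_fraction_as_written(x_i: int, y_i: int, size: int = FIELD_SIZE) -> float:
    """Path fraction as given in the text: (x_i / 512) * (y_i / 512)."""
    return (x_i / size) * (y_i / size)

def path_fraction_row_major(x_i: int, y_i: int, size: int = FIELD_SIZE) -> float:
    """ASSUMPTION (not from the text): y_i complete lines plus x_i pixels of the current line."""
    return (y_i * size + x_i) / (size * size)

def toggle_on_time(total_raster_time_s: float, path_fraction: float) -> float:
    """Time after scan start at which the laser is toggled on (constant scan speed assumed)."""
    return total_raster_time_s * path_fraction

# Hypothetical example: target at the centre of the field, 1 s per full raster.
total_time_s = 1.0
t_written = toggle_on_time(total_time_s, path_fraction_as_written(256, 256))
t_row_major = toggle_on_time(total_time_s, path_fraction_row_major(256, 256))
print(t_written, t_row_major)
```

Which indexing applies depends on how the scan head and control macro count position along the raster, so the values should be checked against the instrument before use.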

In this way, the irradiation can be directed at those x-y coordinates that correspond to the location of an immunolabelled sub-nuclear structure within an image obtained prior to the 2-photon scan, provided of course that the fluorophore used for the labelling has an excitation spectrum that does not overlap with the photosensitizer. Although this strategy has been exploited in observing the dynamics of DNA repair proteins at DSBs induced in relatively large subnuclear regions (Bradshaw et al. 2005), it was not known whether this same approach could be used to reliably target volumes approaching 1 µm³ (i.e. the diameter of a PML body).

Previous studies had demonstrated that with a high numerical aperture objective, the size of the damage induced by two-photon irradiation in the presence of a photosensitizer could be confined to a volume approximately equal to the focal spot volume (approximately 200nm radially, 500nm axially; Meldrum et al. 2003); however, it remained to be seen whether this focal spot could be accurately directed at the intended x-y coordinate. For example, discrepancies can occur between the user-defined x-y coordinate, and that of the two-photon irradiation, due to stage drift between images, differences in the timing of the scanning heads between imaging sessions, and/or differences in the light paths and focal plane between laser lines for a given scanning mirror orientation due to laser misalignments or chromatic aberration (Fellers and Davidson 2007; Conchello and Lichtman 2005). Indeed, in my initial attempts at targeting the centres of relatively large subnuclear structures (e.g. chromocentres), these errors were large enough to misdirect the beam several microns beside the intended coordinates. Several measures were taken to reduce these errors:

3.2 Troubleshooting

Throughout the development of this procedure, there were several hurdles to overcome before its final realization. These issues are reviewed below to illustrate the difficulties of its development, as well as to provide further details of the procedure.

3.2.1 Decreasing time between initial image scan and 2-photon targeting

The microscope manufacturer's control software was not entirely suitable for my needs, because the time required for manual target selection and irradiation introduced unacceptable amounts of stage drift. Furthermore, it was inadequate for targeting multiple ROIs (regions of interest) that each comprised only a single pixel, and in my initial tests I wanted to target, in a single scanning pass, multiple subnuclear structures with diameters approaching the focal spot size of the two-photon beam. Fortunately, James Jonkman of the UHN imaging facility had previously written a Visual Basic macro that used user-defined pixel "masks" (i.e. a binary array corresponding to each pixel in the imaging field) as input for directing the laser irradiation. Minor modifications were made to this macro to suit my needs, namely, ensuring that the laser shutter was used to toggle the beam off between targeted pixels, as the original macro only toggled the beam off using the AOTF (Acousto-Optic Tunable Filter) component, which often left a residual track of laser damage outside the intended volume. In order to create the binary image array for the microscope software, I adapted an ImageJ macro that creates a binary mask of an image from a user-controlled threshold that captures the immunofluorescence signals for the structures of interest, and then prompts the user to choose which of the thresholded objects are to be included, each represented by the coordinate of its centre of mass, in the resultant binary image array. This binary image array is then used as input for the microscope macro, to toggle the two-photon laser on only at those coordinates that correspond to the centres of the structures of interest. Thus, using this modified combination of ImageJ and microscope macros, the structures of interest can be quickly targeted from an initial image scan of the immunofluorescence signal, and stage drift minimized.
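The masking step described above can be sketched in a few lines. The function below is an illustrative stand-in, not the ImageJ macro itself: it thresholds a small image held as a list of rows, finds each 4-connected object above the threshold, and writes a single 1 at the rounded geometric centre of each object. The interactive step in which the user chooses which objects to keep is omitted, and all names are hypothetical.

```python
from collections import deque


def centroid_mask(image, threshold):
    """Return (mask, centroids) for a 2D list of pixel intensities.

    mask holds a 1 at the rounded geometric centre of every 4-connected
    object whose pixels are >= threshold, and 0 everywhere else.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    mask = [[0] * cols for _ in range(rows)]
    centroids = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] < threshold:
                continue
            # Flood-fill one connected object starting from (r, c).
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                pr, pc = queue.popleft()
                pixels.append((pr, pc))
                for nr, nc in ((pr + 1, pc), (pr - 1, pc), (pr, pc + 1), (pr, pc - 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not seen[nr][nc] and image[nr][nc] >= threshold):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            # Geometric centre of the thresholded object, rounded to a pixel.
            cr = round(sum(p[0] for p in pixels) / len(pixels))
            cc = round(sum(p[1] for p in pixels) / len(pixels))
            mask[cr][cc] = 1
            centroids.append((cr, cc))
    return mask, centroids
```

The resulting mask plays the role of the binary "text image" passed to the microscope macro: the laser would be toggled on only at the pixels set to 1.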

3.3.2 Alignment of focal plane between laser lines and calibration of targeting offset


Because the laser line used for the immunofluorescence image (typically the one exciting Alexa 488) is chosen to be spectrally separated from the 2-photon laser (790nm), the two differ significantly in their focal planes due to chromatic aberration. These differences were adjusted at the beginning of each session, using the reflected light from the coverslip surface as an alignment reference. To correct for the session-to-session variability in the speed of the scanning mirrors, I added an x-y coordinate offset in the microscope macro prior to each imaging session. To calculate this offset, I used H3K9trime antibody to label chromocentres (ref. h3k9trime for chromocentres) in MEFs (mouse embryonic fibroblasts), as they offer large targets, such that a clear photobleached spot is left after localized 2-photon irradiation that can be compared to the original target coordinate. Several chromocentres are targeted in different areas of the imaging field (typically the four corners and centre) and the differences in the x and y values between the centre of the resultant bleach spot (xr, yr) and the original targeting coordinate (xt, yt) are noted. Plots of xr vs. xt (and yr vs. yt) fit well to linear functions fxr->xt (and fyr->yt), and so each user-defined targeting coordinate is first corrected by evaluating these functions at that coordinate before it is inputted into the microscope targeting macro. This approach greatly reduced the targeting error, providing sufficient accuracy to consistently direct the laser spot to within the diameter of single nuclear bodies (Figure 3.2).
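The calibration above amounts to an ordinary least-squares line fit per axis. The sketch below is one possible reading of it: the entered target coordinate is fitted as a linear function of the observed bleach-spot coordinate, and the fit is then evaluated at the position one actually wants to hit. The calibration values are invented for illustration, and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired calibration points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("calibration points must differ in x")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def corrected_target(desired, slope, intercept):
    """Coordinate to enter so that the bleach spot lands at `desired`."""
    return slope * desired + intercept


# Hypothetical calibration along x: entered targets and observed bleach centres.
x_target = [50.0, 450.0, 250.0, 50.0, 450.0]
x_bleach = [1.02 * x + 4.0 for x in x_target]  # made-up systematic error

slope, intercept = fit_line(x_bleach, x_target)
x_to_enter = corrected_target(259.0, slope, intercept)
```

The same two functions would be applied independently to the y axis. With only five calibration points per session, the quality of the fit is worth checking by eye before relying on the correction.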


Figure 3.2 Induction and labelling of DSBs is axially confined. The effective wavelength necessary to induce breaks (390nm) arises only at the focus of the 750nm irradiation, where the 2-photon interactions occur, and so the damage is confined axially (see Materials and Methods). Shown here is a z-stack of a targeted mouse embryonic fibroblast (MEF) cell merging the probe signal (red) with the H3K9me3 signal (green). The enriched probe signal corresponding to the targeted region (indicated by arrow in frame at 1.4µm) is largely confined to between the 1.4µm and 2.24µm frames. Purple scale bar: 1µm.

3.2.4 Reliable isolation of single targeted cells from coverslip into lysis buffer

Following the targeting and labelling steps, single cells are "laser catapulted" using a PALM LMPC microscope (Schutze et al. 1998; see Materials and Methods). Briefly, the cells on the coverslip are alcohol dehydrated, hematoxylin stained, and taped onto a microscope slide. A quadrilateral "keystone" etched into the coverslip glass, which encloses a region containing the targeted cells, aided in their identification under the low magnification objective of the PALM microscope, by reference to an annotated image taken at the time of targeting (see Figure 3.1).

A single targeted cell can be "catapulted" intact into a lysis buffer droplet contained in an overhanging eppendorf tube cap, by a directed UV pulse. In my initial attempts however, the relatively strong adherence of the cell to the glass coverslip either caused portions of the catapulted cell to remain behind, or skewed the trajectory of the cell so that it did not consistently land in the overhanging lysis buffer. UV-scissible PEN (polyethylene naphthalate) membrane coated slides are provided by the manufacturers of the PALM system to address this problem, although I found they were not adequate for the upstream laser targeting and ligation steps because of autofluorescence from the plastic substrate. Xylene film and Teflon spray were tried in order to decrease the adherence of the cells to the coverslip, but these measures were not successful. I then decided to make my own PEN coated coverslips. I tried various methods (i.e. vacuum wax, super glue) to seal the PEN film flush against the coverslip without success, either because the adhesive was toxic to the cells upon seeding, or because it broke down during the processing procedure. I eventually found that "Best-Test" rubber cement (Union Rubber Inc.) was suitable, non-toxic, and sufficiently resistant. The use of these PEN filmed coverslips allowed for consistent non-contact catapulting of intact single cells, although the problem remained of reliably capturing the catapulted cell in the lysis buffer, as small variations in trajectory sent cells to the dry sides of the eppendorf cap. Instead of increasing the volume of lysis buffer (as I did not want to change the conditions of the PCR protocol), I found that there was a higher likelihood of capturing the cell in the lysis buffer if I pipetted it directly in the centre of the cap, so that it formed a droplet of maximal height from the cap surface. Perhaps this is because this method minimizes the distance that a catapulted cell needs to travel to contact the buffer surface, and thus maximizes the range of capturable trajectories from the coverslip. I also found that the catapulting trajectory was affected by where on the excised membrane I directed the UV pulse. Pulses aimed at the edges of a PEN membrane fragment typically catapulted at oblique angles and missed the lysis bubble, whereas pulses directed at the fragment centre catapulted normal to the coverslip surface. Therefore, when cutting the membrane prior to catapulting, I took care to extend the cut area well beyond the cell on one side, so that the pulse could be directed at the centre of the fragment without ablating the cell itself. With all of these measures in place, I typically achieved catapulting efficiencies (i.e. intact cells verified to be floating in the lysis buffer bubble in the eppendorf cap) on adherent cells of approximately 90%. Non-adherent cells (e.g. NB4 cells described in the next chapter) that require an adhering agent (i.e. poly-L-lysine) to remain on the PEN film were more difficult to successfully catapult, as they would often be ejected from the PEN film during membrane cutting.


3.3.5 Lessons from targeting of Single Mouse Chromocentres: Use of "Signature" Sequence

Once I could reliably direct the 2-photon targeting with sufficient accuracy to target single nuclear bodies, I attempted to test the ability of LTOL to identify genomic sequences known to be contained at a particular nuclear body in a single cell. In interphase mouse nuclei, centromeric chromatin regions are condensed into discrete foci known as chromocentres. A 234bp canonical sequence, the mouse major alpha satellite repeat, is almost exclusively contained within the chromocentres (Guenatri et al. 2004). As a first proof of principle, I targeted single mouse chromocentres, in an attempt to obtain sequences enriched for the mouse alpha satellite repeat. As described above, aiming of the 750nm 2-photon irradiation was standardized by an ImageJ macro which positioned the laser pulse at a region corresponding to the centre of mass of a chosen chromocentre immunofluorescence signal in a chosen optical section (see Materials and Methods “Induction of DNA Damage in Subnuclear Regions”). Figures 3.2 and 3.3 show typical results of our targeting of single chromocentres. Mouse chromocentres are large enough (>1µm) that the irradiated spot covers only a fraction of the entire structure. Thus, I am confident that we are sampling only those sequences contained within the chromocentre. Amplification products containing genomic sequence flanking the ligated probe and linker sequences were only found in samples containing single targeted cells (see Figure 3.4), and not in control samples that contained cells that were not targeted. As expected, cloning of the resultant amplicons from single targeted chromocentres yields subsequences of the major alpha satellite repeat sequence in approximately 18% of picked colonies (7/39) (this result agrees with the estimate that 20% of the chromatin within a chromocentre is alpha satellite sequence; see Materials and Methods). We also found several clones containing sequences that lay within 10Mb of the long-arm chromosome ends, indicating that perhaps both ends of the chromosome converge on a chromocentre.

Unfortunately however, out of 14 sequences amplified from cells that were targeted away from chromocentres ("off-body" lane; see Figure 3.3 and Appendix) that contained identifiable genomic DNA flanked by probe sequences, 1 clone mapped to the major alpha satellite repeat sequence. Thus, although the chromocentre-targeted cells yielded major alpha satellite sequences at a proportion that is consistent with their proportion of total DNA in the chromocentre, there was no significant enrichment compared with the sequences obtained from the off-body targeted cells (Fisher's exact test, p < 0.6650). Although these initial results were encouraging, they highlighted two problems that possibly prevented a more conclusive result. First, because the PCR procedure is optimized to single molecule sensitivity, it is especially prone to contamination, either from previous experiments, or between samples in the same PCR run. Thus, the presence of alpha satellite sequence in the off-body control can be explained by cross-contamination from the chromocentre-targeted samples. Second, it is possible that the amplified sequences are a mixture of those originating from genuine ligation events and those resulting from mispriming of random genomic fragments. Because the alpha satellite sequence is quite promiscuous (representing approximately 7% of total genomic DNA; Rattner et al. 1978), it would be expected to be present in random pools of genomic DNA, and thus we cannot discount the possibility that its presence in both the off-body and chromocentre-targeted pools was the result of random priming events.
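The comparison of the two clone counts (7 of 39 body-targeted clones versus 1 of 14 off-body clones carrying satellite sequence) can be checked with a short two-sided Fisher exact test. The sketch below uses only the standard library and the common convention of summing all tables that are no more probable than the observed one; whether this matches the exact procedure used for the value quoted above is an assumption.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c

    def weight(k):
        # Unnormalised hypergeometric weight of k counts in the top-left cell.
        return comb(row1, k) * comb(row2, col1 - k)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = weight(a)
    # Sum every table that is no more probable than the observed one.
    extreme = sum(weight(k) for k in range(lowest, highest + 1)
                  if weight(k) <= observed)
    return extreme / comb(row1 + row2, col1)


# Rows: body-targeted, off-body. Columns: satellite, non-satellite clones.
p_value = fisher_exact_two_sided(7, 32, 1, 13)
```

Working with integer weights avoids floating-point ties when deciding which tables count as "at least as extreme". For these counts the function returns approximately 0.665.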


Figure 3.3 Targeted DSB induction of mouse chromocentres and probe ligation. (A,D,E,H,I): 0.7µm optical sections of MEFs from the same coverslip immunolabelled with H3K9me3 prior to targeted laser damage or after as indicated. The 2-photon 750nm irradiation bleaches the Alexa 488 signal and can be used as confirmation of successful targeting (E,I indicated by arrows). After processing the coverslip for probe ligation (see Materials and Methods), the cells are visualized again to confirm the colocalization of the probe signal (B,F,J) with the targeted region (C,G,K). The amplification products of the DNA extracted from each cell (the two cells shown in the "Not Targeted" sample were pooled together for amplification) are shown in Figure 3.4.


Figure 3.4 Amplification and identification of mouse alpha satellite DNA derived from a single targeted mouse chromocentre. (a) Resultant products after DNA extraction and amplification procedure. From left to right: lane 1- ladder (upper most band 300bp); lane 2- lysis buffer alone as the starting material; lane 3- DNA extracted from cells that were not targeted ("Not Targeted" cells from Figure 3.3); lane 4- DNA extracted from cell that was targeted away from chromocentre ("Off-Body Targeted" cell from Figure 3.3); lane 5- DNA extracted from cell that was targeted at a chromocentre ("Body Targeted" cell from Figure 3.3). Note that there are only genuine amplicons in the samples that were targeted. The approximately 300bp band that is seen in lanes 2 and 3 is a primer dimer artifact. (b) Three cloned amplicons from the "Body Targeted" sample that align with the mouse major alpha satellite canonical sequence. The purple and blue subsequences flanking the orange satellite subsequence are contained in the probe sequence and linker sequence respectively.

To differentiate mispriming artifacts from those sequences that originate from probe-ligated DNA, I shortened the PCR primer so that upon annealing to the probe template, 4bp at the 3' end of the template strand remain uncomplemented. Thus, amplified genomic sequences can be deemed "bona fide" if the 4bp "signature" sequence is found immediately 3' to the primer, as it is contributed by polymerizing over the probe template (see Figure 3.6).
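The signature test lends itself to a simple string check on each sequenced clone. The sketch below is illustrative only: the primer and signature sequences are made-up placeholders, not the oligos used in this work, and only the first occurrence of the primer in the read is examined.

```python
def has_signature(read, primer, signature):
    """True if `signature` sits immediately 3' of the first primer match."""
    read, primer, signature = read.upper(), primer.upper(), signature.upper()
    start = read.find(primer)
    if start == -1:
        return False  # primer not present in this read at all
    end = start + len(primer)
    return read[end:end + len(signature)] == signature


# Hypothetical placeholder sequences, for illustration only.
PRIMER = "ACGTTGCA"
SIGNATURE = "GGAT"

ligated_read = "TTACGTTGCAGGATCCGTA"   # primer followed by the signature
misprimed_read = "TTACGTTGCACCGTA"     # primer followed by other sequence

bona_fide = has_signature(ligated_read, PRIMER, SIGNATURE)
artifact = has_signature(misprimed_read, PRIMER, SIGNATURE)
```

A read produced by mispriming carries the primer but lacks the 4bp that can only be copied from the probe template, so it fails the check. In practice one would also test the reverse complement of each read and tolerate sequencing errors, which this sketch does not do.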


Another factor confounding the interpretation of the chromocentre targeting was the possibility that residual, unligated probe could be ligated, during the subsequent linker ligation step, to breaks that incidentally occur during processing (e.g. throughout the heated overnight proteinase K digestion; see Materials and Methods). Thus, there would be a background amplification of random genomic sequences that were indistinguishable (as both would contain the "signature" sequence) from those sequences that were ligated by probe prior to the linker ligation step. And, owing to its promiscuity, alpha satellite sequence would be expected to be highly represented in this background. Although this background amplification could be eliminated by adopting an alternative PCR scheme that does not include a secondary ligation step (see Materials and Methods), I first turned to a different nuclear body target, one that could be clearly immunolabelled, and contained a known sequence that was not highly represented in the genome.

3.2.7 Lessons from targeting of integrated tandem lac arrays: Blocking of endogenous breaks and linker-free PCR protocol

Stably integrated tandem lac arrays provide a defined and unique sequence that is contained only within a well resolved nuclear focus. I stably integrated a GFP-Lac repressor fusion construct into a cell line containing stably integrated tandem LacO repeats on chromosome 3 (Soutoglou et al. 2007), and targeted the resultant GFP-foci (see Figure 3.5).


Although I was able to amplify sequences that clustered on chromosome 3, chromosome 3 was also enriched in my control samples, in which I targeted away from the Lac foci (see Appendix lac-array sequencing). I hypothesized that this was not cross-contamination (as the sequences in my control samples were not the same as those in the targeted samples), and that instead residual probe remaining during the DNA extraction is ligated to double stranded breaks during the linker ligation step. It was likely that, as was suggested by other groups using the same cell line (Jacome and Fernandez-Capetillo 2011), the integrated Lac array was in a “fragile site” for dsDNA breaks that arise either during downstream lysis and PCR processing, or prior to the targeting and ligation step. RNA transcripts in my sample may also be ligating to residual probe, or there may be cross-contamination between samples over the course of processing. Both to reduce this possibility of cross-contamination, and to eliminate the possibility of oligomer ligation to residual breaks during the second ligation step, I attempted the PCR amplification without a linker ligation step, and also added RNase immediately after the proteinase K digestion. My initial attempts to carry out the PCR amplification without the linker ligation step failed, likely because of a phenomenon known as “padlock suppression” whereby primer annealing is prevented by the annealing of complementary sequences on opposite ends of the template strand (Rand et al. 2005). Without linker ligation, the only templates that were likely to be amplified were those genomic sequences flanked by probe on both sides, and because I was using relatively long probe sequences (60bp), the annealing temperature of the formed padlocks was higher than that of my primers. I decided to try using a shorter probe sequence (30bp), to lessen the padlock suppression effect. In parallel with these attempts, I had found that the 30bp oligomer used in the single PCR protocol of the nano-dissection technique (see chapter 2) was yielding amplicons with single molecule sensitivity, and that some of the amplicons resulted from a genuine ligation event (i.e. presence of "signature" sequence) on only one end of the genomic DNA, indicating that the other end likely arose from a mispriming event. Therefore, I could use the same single primer complementary to the 30bp oligomer used in the nano-dissection technique to capture genomic DNA that is either flanked on both sides by oligomer, or ligated on only one side, by relying on the likelihood of a mispriming event on the other. To further avoid the possibility of oligomer probe ligating to endogenous dsDNA breaks, I added a "blocking" step, by ligating with a hairpin oligomer prior to the laser targeting (see Materials and Methods). Because of the possibility that the lac array target was itself a "hot spot" for endogenous breaks (Jacome and Fernandez-Capetillo 2011), I also decided to change the model target to an identifiable nuclear structure that frequently colocalizes with an endogenous, single copy sequence.


Figure 3.5 Targeting of cells carrying stable Lac array. (a) Single GFP foci corresponding to the stably integrated lac array were laser targeted and ligated with oligo probe. (i.) GFP focus before laser irradiation. (ii.) Image after laser irradiation and probe ligation. (iii.) Enrichment of ligated probe that colocalizes with the lac array as shown in the (iv) merge. (b) The approximate location on chromosome 3 of identifiable sequences (indicated by red lines) obtained from targeting of single lac array foci. Out of 24 sequences that contained signature sequence (see Figure 3.6, Materials and Methods, and Appendix), 6 mapped to chromosome 3. However, among sequences obtained from cells targeted away from the lac array, 3 out of 11 sequences mapped to chromosome 3 (corresponding to blue lines; see Appendix). Scale bar 5µm.

3.8 Results

3.8.1 Single molecule amplification and sequencing of DNA originating from a single nuclear body

An ideal candidate is the Histone Locus body (HLB), so-called because of its frequent colocalization with the histone clusters found at chromosomes 6p22.1 and 1q21.2 (Figure 3.2; Zhao et al. 2000). I immunolabelled HLBs in HT1080 cells with an antibody to the HLB's constituent NPAT protein (Bongiorno-Borbone et al. 2008; Nizami et al. 2010), and induced DSBs in the vicinity of a single HLB by targeted 2-photon irradiation (see Figure 3.1 and Materials and Methods). Subsequent labelling of targeted cells with fluorescent oligomers shows a clear enrichment of signal that overlaps with the NPAT signal, indicating the specific ligation of oligomers to chromatin in the vicinity of the HLB (Figure 3.7a). Although the spot size can be increased with increased laser power, I was able to confine the induced damage to approximately 600nm radially, and 1µm axially (Figure 3.2). Single targeted cells are laser catapulted into individual eppendorf tubes, and the DNA amplified in parallel using primers complementary to the oligo probe sequence (see Materials and Methods). Parallel amplification of DNA obtained from targeting of single HLBs in 9 cells resulted in 9 sequences with a flanked "signature" sequence that could be unambiguously mapped to a specific genomic locus (Figure 3.7 and 3.8). The sequences were both significantly enriched for chromosome 6 (3 out of 9 sequences; p < .01 by the binomial test, see Materials and Methods), and clustered to within 1 and 2Mb of the histone gene locus at 6p22.2 (2 out of 9 sequences; p < 7×10⁻⁵). As a negative control, cells that were not targeted were processed in parallel, and only yielded sequences that did not contain a flanked signature sequence (i.e. mispriming artifacts). These results indicate that the laser targeting technique is sufficiently sensitive to obtain sequences originating from the targeting of a single nuclear body, and furthermore, that the targeting is sufficiently accurate to yield sequences that cluster at the specific sub-chromosomal regions that colocalize with the nuclear body.
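The enrichment statistics above are upper-tail binomial probabilities: the chance of seeing at least k of n sequences fall in a given region if each sequence landed there independently with probability p. A minimal sketch is given below. The null probability used in the example (the fraction of the genome assigned to chromosome 6, taken here as roughly 0.055) is an assumed illustrative value, and the exact null model used in Materials and Methods may differ.

```python
from math import comb


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    if not (0 <= k <= n) or not (0.0 <= p <= 1.0):
        raise ValueError("require 0 <= k <= n and 0 <= p <= 1")
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(k, n + 1))


# Assumed, illustrative null probability: share of the genome on chromosome 6.
CHR6_FRACTION = 0.055

# Probability of 3 or more of 9 sequences mapping to chromosome 6 by chance.
p_chr6 = binomial_tail(3, 9, CHR6_FRACTION)
```

The same function applies to the clustering result, with p replaced by the fraction of the genome covered by the window around the histone locus. The outcome is sensitive to the value chosen for p, so the null fraction should be stated alongside any p-value.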


Figure 3.6 Amplification result from LTOL targeting at single HLBs. a) Gel result after laser targeting and amplification. Shown are combined amplicons from three targeting experiments of HLBs. Two lower and upper left most lanes: 1kb, 300bp ladders; lanes 1-5: Negative controls (lysis buffer); lanes 6-10: Non-targeted cells from the same coverslip as targeted cells; lanes 11-23: each lane corresponds to a single targeted and microdissected cell that was targeted at a single HLB. b) i. Genomic sequence (green) that originated from a bona fide ligation event is distinguished by a primer sequence (blue) and nesting “signature” sequence (pink). ii. Occasionally amplicons do arise both in lysis buffer controls (as seen in lane 1) and in non-targeted cells; however, these were discounted by a lack of signature sequence.


Figure 3.7 LTOL results from HLB targeting. (a) Laser targeting of single HLB. (i.) HLB before laser targeting (green NPAT antibody). After targeting a single HLB and ligating (ii.), an enriched focus of fluorescent oligo (red) colocalizes with the irradiated region (iii.). (b) Chromosomal locations of sequences obtained from targeting. Zoomed chromosomal locations of sequences obtained from targeting of single HLBs that mapped to chromosome 10p15.1 and 6p22.1 were found to significantly associate with HLBs. Two sequences were obtained that clustered to within 2Mb of the histone gene cluster (indicated by green brackets).


3.8.2 FISHing of loci obtained from HLB targeting reveals a novel HLB-locus association

Using BACs overlapping with the genomic coordinates of the obtained sequences, I performed immuno-FISH to verify whether they indeed colocalized with HLBs in a cell population. As expected, the two sequences obtained within 1Mb and 2Mb from the histone locus clusters on 6p22.2 colocalized with significant frequency with HLBs (see Figure 3.7b and 3.8). Using BACs, I also tested sequences obtained from targeting that did not map to chromosome 6, to see if these were false positives or did indeed associate with HLBs. Surprisingly, one sequence, mapping to within 10kb of the RBM17 gene on chromosome 10p15.1, and originating from a targeting that also yielded a sequence proximal to the histone locus (Figure 3.7b and Figure 3.8), significantly localizes in a cell population to within 1µm of an HLB compared with random controls (p<.016). It is noteworthy that the protein product of RBM17, SPF45, plays a role in RNA splicing associated with Cajal bodies (Nizami et al. 2010). Because Cajal bodies are frequently found adjacent to HLBs, this leads to the hypothesis that the RBM17 gene is regulated by its localization at its protein product's site of action.
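The frequency of association scored by immuno-FISH reduces to a distance threshold applied cell by cell. The sketch below shows one way such a score could be computed from centre coordinates; the data layout, function name, and example coordinates are hypothetical, and the statistical comparison against random BACs is not included.

```python
from math import dist


def association_frequency(locus_positions, body_positions, threshold_um):
    """Fraction of FISH signals lying within `threshold_um` of any body.

    locus_positions: one (x, y, z) centre per scored FISH signal, in µm.
    body_positions: for each signal, a list of (x, y, z) centres of the
        nuclear bodies in the same nucleus (may be empty).
    """
    if not locus_positions or len(locus_positions) != len(body_positions):
        raise ValueError("need one list of bodies per scored FISH signal")
    associated = 0
    for locus, bodies in zip(locus_positions, body_positions):
        # Centre-to-centre distance to the nearest body, if there is one.
        if bodies and min(dist(locus, b) for b in bodies) <= threshold_um:
            associated += 1
    return associated / len(locus_positions)


# Made-up coordinates for three scored signals.
loci = [(0.0, 0.0, 0.0), (5.0, 5.0, 0.0), (2.0, 2.0, 2.0)]
bodies = [[(0.5, 0.0, 0.0)], [(0.0, 0.0, 0.0), (5.0, 7.0, 0.0)], []]

freq_1um = association_frequency(loci, bodies, 1.0)
```

Scoring the same signals at two thresholds (for example 0.6µm and 1µm) only requires calling the function twice with different `threshold_um` values.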


Figure 3.8 FISH of hits from HLB LTOL. Measurements of the frequency of association (red and blue bars: 600nm and 1µm centre-to-centre thresholds respectively) of hits obtained from HLB targeting as well as two random BACs. X-axis labels correspond to BACs overlapping with sequence at chromosomal location indicated in Figure 3.7 (b). The green "histone" column refers to a BAC overlapping with the histone gene cluster at chromosome 6p22.1. See supplemental for listing of BACs used. (*p<.016; **p<.03; ***p<.0016; ****p<.0007; *****p<4e-9; ******p<.000008; *******p<1.3e-11). Inset is FISH of BAC corresponding to locus at g (red) with HLBs (green). Scale bar 5µm.


3.9 Discussion

This work demonstrates the feasibility of using targeted 2-photon irradiation to determine DNA sequences in the vicinity of selected sub-nuclear structures, and shows that it is sensitive enough to sequence DNA obtained from the targeting of a single nuclear body. In my hands, I was able to enrich for sequences within 1Mb of the expected gene loci associated with HLBs at the chromosome 6p22.1 locus. It has previously been suggested that there may be loci other than the histone clusters at chromosomes 6p22.1 and 1q21.2 that associate with the HLB (White et al. 2007). Electron microscopy images show that although the HLB maintains contacts with the surrounding chromatin, it does not appear to contain chromatin at its core, and thus may not be entirely dependent on its association with the histone gene clusters for its formation and maintenance (Figure 3.9). Indeed, the targeting has also revealed a locus at chromosome 10p15.1 that significantly associated with the HLB by FISH. It is noteworthy that the sequence at chromosome 10p15.1 overlaps with RBM17, the gene for the SPF45 protein, found in complex with the spliceosomal 17S U2snRNP (Nizami et al. 2010), a known component of Cajal bodies (Stanek and Neugebauer 2010), and that spliceosomal factors are also characteristic components of HLBs (i.e. NPAT, FLASH; Nizami et al. 2010). Because Cajal bodies are frequently found either close to or touching HLBs (Liu et al. 2006), it is possible that RBM17's significant association at the HLB is actually a secondary consequence of its functional association at a nearby Cajal body. In either case, this initial finding underscores the value of the LTOL technique: we have detected a novel, significant association of a gene locus with a nuclear body whose components are either in complex with, or at least functionally related to, its protein product.

Figure 3.9 ESI imaging of HLB. (a) The HLB is seen as a protein-rich (blue) structure making contacts with surrounding chromatin (yellow), but devoid of chromatin itself. (b) Phosphorus channel alone reveals puncta within the HLB structure (arrows) suggesting the presence of RNA. Scale bars 200nm. These images were taken by Ren Li.

The accuracy of this technique is limited by the size of the irradiation volume, and thus may be greatly improved when combined with super-resolution microscopy (Ivanchenko et al. 2007). In its present incarnation however, the HLB targeting demonstrates that the accuracy is sufficient to yield sub-chromosomal clusters that frequently associate with a targeted body. Thus, the LTOL approach can be used as a starting point for other biochemical approaches to home in on those specific sequences that may mediate the interaction (e.g. deletion analysis). I also expect that the coverage of sequences obtained from the targeting is limited by dominant amplicons that emerge during early cycles of the PCR. A modified approach based on a limited number of rounds of linear amplification followed by deep sequencing may improve our coverage. Even so, we were able to obtain sequences from multiple chromosomes from the targeting of a single nuclear body.

I propose that this laser targeting technique will have wide applicability as a complement to ChIP and immunoTRAP, particularly for those questions of chromatin association with nuclear substructures that preclude a population-based or biochemical pulldown approach. For example, the nuclear body in question may be insoluble, the chromatin contacts may be transient or not mediated by the precipitated protein, or the constituent protein may be present at nuclear sites other than the nuclear body. The "unique" intact PML bodies of the NB4 cell line present such a challenge to traditional biochemical approaches, as their constituent proteins are also found in other "microspeckles" in the same nucleus. They are also particularly interesting candidates as "gene hubs", as their integrity may be maintained by their affinity for specific chromatin contacts (Torok et al. 2009).

Our work serves both as a proof-of-principle for the laser targeting technique, and also as a demonstration of the advantages afforded by interrogating the chromatin neighbourhood of single arbitrary sub-nuclear structures. Although we were motivated to develop it for our own questions of PML biology (see chapter 3), we expect that it will find wider use for the detection of chromatin/nuclear body interactions that are so far intractable by the present methods, as it does not require specialized equipment beyond that found in most institutional microscopy facilities. As reviewed in chapter 2, we have also recently developed a parallel approach using nano-dissection tools inside an SEM. Together, these serve as the first demonstrations of DNA isolation from single nuclear substructures in situ.

3.10 Materials and Methods

Preparation of PEN coverslips: PEN coverslips were prepared as follows: A small “keystone” shape was etched into 25mm round glass coverslips using a diamond knife and overlaid stencil. The keystoned coverslips were washed and dried in 90% alcohol. Sheets of 1.35µm PEN film (kindly provided by PALM) were cut in approximately 18mm square pieces and floated on nuclease free water. The keystone coverslips were brought underneath the floating PEN film and lifted out so that the PEN film was evenly overlaid on the coverslips with the etched keystone roughly in the centre of the PEN film. The “PEN coverslips” were then dried in a 60 degree oven overnight. Rubber cement was then applied to the border of the overlaid PEN film and dried at room temperature overnight. Prior to cell culture the PEN coverslips were UV irradiated in a biohood for 30 minutes.


Immunolabelling/Preparation for Laser Targeting: Cells were incubated on PEN coverslips overnight. After rinsing out media with phosphate buffered saline (PBS), cells were fixed for 10 minutes with 4% paraformaldehyde (PFA) in PBS. Cells were then washed three times for five minutes each in PBS, and permeabilized for five minutes with 0.1% Triton X-100 in PBS followed by three more PBS washes of five minutes each. In order to prevent endogenous DSBs from being substrates for probe ligation and subsequent amplification, we blocked cells with a 13bp hairpin oligo (5’ GCG CTA GAC C*G GTC TAG CGC 3'; *internal Cy5 conjugate) that did not contain sequences that are complementary to the primers used for the amplification procedure. We followed the “Oligo Ligation of Targeted Double Strand Breaks” protocol as detailed below except that the hairpin blocking oligo was used in place of the targeting probe (see "Details of Oligos"). Cells were then immuno-labeled as previously described (ref. 27) using mouse anti-NPAT (Abcam; HT1080 cells) and anti-mouse (HT1080 cells) Alexa 488 secondary antibody (Invitrogen). Cells were then incubated with Hoechst 33258 diluted to 0.5µg/mL in PBS, and covered in tin foil to avoid light exposure.

Induction of DNA Damage in Subnuclear Regions: DNA damage was induced using a confocal fluorescence microscope (LSM510 META; Carl Zeiss MicroImaging, Inc.) equipped with an argon laser tunable to 458, 488, and 514nm transmission, and a Chameleon two-photon laser transmitting a maximum power of 1300mW at 780nm (Coherent). Keystones were located and situated under the objective either by direct visual inspection or under white light through a 10X Plan-Neofluar NA 0.3 objective. The field was scanned with 488nm excitation, visualized through a short pass filter (505-530nm) with the 10X objective, and a low magnification image was taken. A chosen cell or cluster of cells within the keystone was visualized as above under the 63X C-Apochromat NA 1.4 objective, and a high magnification image was taken. Using the LSM 510 photobleaching software, regions of interest (ROIs) were drawn in the high magnification image corresponding to subnuclear regions of the chosen cell(s) to dictate the path of the two-photon laser. The ROIs were irradiated by femtosecond 780nm pulses from the Chameleon laser, for an effective two-photon absorption event of 390nm. Various laser settings (combinations of 25%, 30%, or 40% transmission power with 25 or 50 iterations per pixel) were used in the initial labeling experiments. In the targeted laser damage and subsequent amplification of DNA extracted here, a laser power of 30% with 50 iterations per pixel was used. Single nuclear bodies were targeted as follows: a pinhole size corresponding to an optical section of 0.7µm was used to take an image under 488nm excitation. Using ImageJ, an 8-bit grey scale 512X512 image was created from a mask of the original image after thresholding to the minimum level of the chosen chromocentre’s fluorescent signal. The pixel corresponding to the centre of mass of the chosen chromocentre’s fluorescence signal in the masked image was located using an automated macro, and the location of this pixel was encoded as a binary 512X512 array or “text image”. A Zeiss macro then positioned the laser pulse according to the resultant text image, and a 750nm two-photon pulse was applied for a duration of 1 second at approximately 900mW with AOM attenuation set at 4%. A subsequent image was taken under 488nm excitation to confirm that the irradiated spot was accurately targeted (the 750nm two-photon irradiation is sufficient to bleach the Alexa 488 signal in the targeted region).
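The centre-of-mass targeting step can be illustrated with a minimal Python sketch. This is not the ImageJ or Zeiss macro used in this work: the function names are hypothetical, the centre of mass is taken as the unweighted centroid of the thresholded mask, and it is assumed that the chosen body is the only object above threshold in the image.

```python
import numpy as np

def centroid_text_image(image, threshold):
    """Return a binary array with a single 1 at the centroid of the mask.

    Assumption: every pixel at or above `threshold` belongs to the one
    chosen nuclear body (no connected-component labelling is done).
    """
    image = np.asarray(image, dtype=float)
    mask = image >= threshold
    if not mask.any():
        raise ValueError("no pixel reaches the threshold")
    rows, cols = np.nonzero(mask)
    # Unweighted centre of mass of the masked pixels, rounded to a pixel.
    r = int(round(rows.mean()))
    c = int(round(cols.mean()))
    target = np.zeros(image.shape, dtype=np.uint8)
    target[r, c] = 1
    return target, (r, c)

def to_text_image(target):
    """Encode the binary array as tab-separated text, one row per line."""
    return "\n".join("\t".join(str(int(v)) for v in row) for row in target)
```

The tab-separated layout is an assumed format for the “text image”; the actual macro may expect a different delimiter.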


Oligo Ligation of Targeted Double Strand Breaks: To ligate the probe oligo to the laser targeted double strand breaks, the protocol of Didenko et al. (2003)28 was followed with slight modifications. For the sake of illustration, any original details of the protocol that have been changed since its optimization are indicated in square brackets ([]). After the laser damage, the cells were washed three times in PBS for five minutes each, and incubated for 1 hour [30 minutes] at 37ºC in Klenow buffer (70mM TrisCl pH 7.5, 70mM MgCl2, 1M dithioerythritol (DTT)) with 100U/mL Klenow enzyme and 2.5mM each of dGTP, dATP, dCTP, and dTTP. After the Klenow reaction, the cells were washed three times for five minutes each in PBS. Next, the cells were incubated for ten minutes in ISOL buffer (1X T4 DNA ligase reaction buffer (Fermentas) supplemented with 15% polyethylene glycol 8000 (PEG 8000), 0.5mM adenosine triphosphate (ATP), and 0.05mg/mL bovine serum albumin (BSA)). The cells were then incubated in ISOL buffer with the addition of 100U/mL T4 DNA Ligase (Fermentas) and the appropriate oligo (hairpin oligo 35µg/mL, double stranded oligo 0.29nM). After 18 hours [3 hours], the cells were washed 3X5 minutes with 0.1% Tween 20/2XSSC solution at 42ºC, and 3X5 minutes at 60ºC with 0.5XSSC solution. The efficiency of the ligation was confirmed by the Cy3 signal under fluorescence microscopy, and the cells were then prepared for microdissection as follows: the cells were dipped for two minutes in a filter purified 5% dilution of hematoxylin (Sigma) in PBS, washed in nuclease-free water for 30 seconds, dehydrated for two minutes each in 70%, 90%, and 100% ethanol dilutions, and then air dried for one hour.

Details of Oligos: The Cy3 conjugated double-stranded targeting oligo used for the ligation of DSBs subsequent to induced laser damage is: strand 1 - 5' (Cy3)AGT GGG ATT CTT GCT GTC AGT TAG CTG 3'; strand 2 - 5' CAG CTA ACT GAC AG(ddC) 3' (ddC: dideoxy C). The nucleotides in bold indicate the "signature sequence" used to indicate a bona-fide ligation event (Figure 3.1). Note that this oligo linker contains the priming sites for the PCR amplification (adapted from Langer et al. 2005)29.
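As a consistency check on the oligo design, the short Python sketch below tests whether strand 2 is the reverse complement of the 3' end of strand 1, and whether the PCR primer listed under "DNA Extraction and PCR Amplification" is a 5' prefix of strand 1. The function and variable names are hypothetical. The sketch only reports the bases of strand 1 that lie 3' of the primer; whether these correspond to the signature sequence is an assumption that the text does not confirm.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

# Sequences copied from the text; the final C of strand 2 is the dideoxy C.
STRAND_1 = "AGTGGGATTCTTGCTGTCAGTTAGCTG"
STRAND_2 = "CAGCTAACTGACAGC"
PRIMER = "AGTGGGATTCTTGCTGTCAGTTA"

def duplex_report(strand_1=STRAND_1, strand_2=STRAND_2, primer=PRIMER):
    """Summarize how strand 2 and the primer relate to strand 1."""
    paired = reverse_complement(strand_2)
    return {
        "strand_2_pairs_with_3prime_end": strand_1.endswith(paired),
        "unpaired_5prime_bases": len(strand_1) - len(strand_2),
        "primer_is_5prime_prefix": strand_1.startswith(primer),
        "bases_3prime_of_primer": strand_1[len(primer):],
    }
```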

Microdissection of Cells and DNA Extraction: After probe ligation, targeted cells were visualized using the Zeiss LSM 510 microscope to confirm enrichment of probe ligation in the targeted region. A low magnification (10X objective) bright field image of the keystone was taken and annotated with the location of the targeted cells. In preparation for the microdissection, the cells were stained with hematoxylin as follows: cells were dried at 37ºC for thirty minutes, dipped for two minutes in a filter purified 5% dilution of hematoxylin (Sigma) in PBS, washed in nuclease-free water for 30 seconds, and then air dried for one hour. For microdissection, a PALM (PALM Microsystems Inc.) LMPC microscope was used. The targeted cells corresponding to the annotated low magnification bright field image were located in the keystone under the 40X objective of the microscope. The targeted cells were catapulted into the cap of a 200µL nuclease-free eppendorf tube (Ambion) containing 4.5µL of DNA extraction buffer: 0.5µl of 10X One-Phor-All-Buffer-Plus (Amersham Pharmacia Biotech), 0.13µl 10% Tween 20 (Sigma, Germany), 0.13µl 10% Igepal CA-630 (Sigma), and 0.13µl Proteinase K (10 mg/ml, Sigma). The eppendorf tube, with the lysis buffer and suspended cell in the cap, was then incubated in a PCR machine with the heated lid at 42ºC and the block temperature at 70ºC for 16 hours. The reaction was then spun down and heated to 80ºC for 10 minutes in order to inactivate the Proteinase K.


DNA Extraction and PCR Amplification: We adapted the protocol of Langer et al. (2005)29 as follows: PCR was performed in 50µl total volume using the Titanium Taq polymerase kit (Clontech) under the following conditions: 72ºC for 1 minute and 68ºC for 3 minutes; then 14 cycles of 94ºC for 40 seconds, 57ºC for 30 seconds, and 68ºC for 1 minute 30 seconds, increasing by one second each cycle; then 8 cycles of 94ºC for 40 seconds, 57ºC for 30 seconds, and 68ºC for 1 minute 45 seconds, increasing by one second each cycle; then 22 cycles of 94ºC for 40 seconds, 65ºC for 30 seconds, and 68ºC for 1 minute 53 seconds, increasing by one second each cycle; then 68ºC for 3 minutes 40 seconds. The primer sequence is as follows: 5' AGTGGGATTCTTGCTGTCAGTTA 3'. Then 10µl of the amplification product was analyzed on a 2% agarose gel, and 2µl was cloned for sequencing using the CloneJET PCR cloning kit (Thermo Scientific).
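The incrementing extension step of this program can be written out explicitly. The Python sketch below is illustrative only: the names are hypothetical, and it assumes that the first cycle of each stage uses the stated extension time and that each later cycle in that stage adds one second.

```python
def extension_times(start_seconds, cycles, increment=1):
    """Extension time (s) for each cycle of one stage."""
    return [start_seconds + increment * i for i in range(cycles)]

# (number of cycles, annealing temperature in C, starting extension in s),
# taken from the three cycling stages described in the text.
STAGES = [
    (14, 57, 90),
    (8, 57, 105),
    (22, 65, 113),
]

def program_summary(stages=STAGES):
    """Total cycle count and total time spent in the 68 C extension step."""
    total_cycles = sum(cycles for cycles, _, _ in stages)
    total_extension = sum(
        sum(extension_times(start, cycles)) for cycles, _, start in stages
    )
    return total_cycles, total_extension
```

Calling `extension_times(90, 14)` lists the fourteen extension times of the first stage, which is a convenient way to check a thermocycler program entered by hand.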

Original Linker Ligation PCR Scheme: Prior to the finding that a single PCR primer may be used to amplify probe ligated genomic DNA in a single PCR reaction to sufficient yield for sequencing (see Lac-Array targeting section), a nested PCR protocol was used with an initial linker ligation step. Briefly, a mix of 0.25μL of Csp6I (Fermentas) and 0.25μL of nuclease-free water was added to the sample of extracted DNA and incubated at 37ºC for 3 hours before heat inactivating the reaction at 75ºC for five minutes. Csp6I recognizes and cleaves the following sequence (cleavage sites indicated by asterisks): 5’ G*TAC 3’ / 3’ CAT*G 5’. 3μL of the Blockerette linker mix (1μL of 50μM Blockerette linker, 0.5μl One-Phor-All buffer PLUS (GE Healthcare), and 1.5μl nuclease-free water) was added and annealed to the free ends left after the Csp6I digest by cycling down from 65ºC to 15ºC in 1ºC increments for one minute each cycle. The Blockerette linker is detailed as follows: the sequence of the first strand of the linker is 5’ AGT GTG AGT CAC AGT AGT CTC GCG TTC GAA TTC AAG CGG CCG CTG 3’, and the sequence of the second strand of the linker is 5’ /5Phos/TAC AGC GGC CGC TTG AAT T/3ddC/, where /5Phos/ and /3ddC/ denote a 5’ phosphate group and a dideoxycytosine nucleotide respectively. The annealed linkers were ligated to the extracted DNA by adding 1μl 10mM ATP (GE Healthcare) and 1μL T4 DNA Ligase (Fermentas) and incubating at 15ºC for 16 hours. The resultant ligation product presumably leaves genomic DNA flanked either by linker ligated ends or, in the case of genomic DNA ligated by the initial probe oligo, by probe oligo on one end and linker on the other. A subsequent PCR using both a Blockerette linker specific primer (LSP1: 5’ TCT CGC GTT CGA ATT CAA GCG GCC GCT A 3’) and a probe specific primer (PSP1: 5’ CGT GAC AAC AGA TGG AAC AGC TGA ATA TG 3’) was then performed. Prior to the use of the signature sequence (see mouse chromocentre targeting section), a probe specific primer was used that extended the full length of the complementary strand on the targeting probe (PSP without signature sequence: 5’ CGT GAC AAC AGA TGG AAC AGC TGA ATA TGAATATGC 3’). 40μL of PCR reaction mix (0.5μM PSP1, 0.5μM LSP1, 0.2mM dNTPs, 1X SA Buffer (Clontech), and 1μL Advantage 2 Polymerase (Clontech), all in nuclease-free water) was added to the 10μL sample on ice and an initial PCR was carried out as follows: 94ºC for 1 minute, 5 cycles of 94ºC for 30 seconds and 72ºC for 30 seconds, then 5 cycles of 94ºC for 30 seconds, 70ºC for 30 seconds, and 72ºC for 3 minutes.
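To make the Csp6I digestion step concrete, the following Python sketch cuts the top strand of a sequence at every G*TAC site. It is a simplified in silico illustration with a hypothetical function name; it ignores the bottom strand and digestion efficiency, and the example sequence used to exercise it is invented.

```python
def csp6i_fragments(seq):
    """Split the top strand at each Csp6I site (5' G*TAC 3').

    The cut falls after the G, so every fragment downstream of a site
    begins with TAC on the top strand.
    """
    site = "GTAC"
    cut_offset = 1  # number of site bases left on the upstream fragment
    seq = seq.upper()
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        fragments.append(seq[start:pos + cut_offset])
        start = pos + cut_offset
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments
```

Joining the returned fragments reproduces the input, which gives a quick sanity check that no bases were lost at the cut sites.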


3-D FISH and Imaging: FISH was performed as previously described27. BACs corresponding with the mapped location of HT1080 sequences are detailed in Table 1. Confocal stacks with a Z increment of 0.1µm were taken with an Olympus IX81 microscope.

List of BACs used in this study

| Map designation (from Figure 3.7) | Overlapping BAC | Number of cells analyzed by FISH |
|---|---|---|
| a | Rp11-642E22 | 118 |
| b | Rp11-24C3 | 68 |
| c | Rp11-1145O11 | 97 |
| d | Rp11-12N18 | 99 |
| e | Rp11-609G19 | 83 |
| f | Rp11-430O23 | 65 |
| g | Rp11-1147O22 | 64 |
| h | Rp11-836G11 | 92 |
| i | Rp11-14P15 | 93 |
| histone | Rp11-2P4 | 98 |
| Rndm1 | Rp11-314G12 | 85 |
| Rndm2 | Rp11-1149E8 | 104 |

3-D FISH Measurements and Statistics: The percentage of cells with at least one BAC signal associating with a PML or HLB NB (using distance thresholds as indicated) was determined by measuring the Euclidean centre-to-centre distances between FISH signals and the NBs with ImageJ.
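The association measurement can be sketched in Python as follows. This is not the ImageJ procedure used here: the function names, the voxel dimensions, and the input layout (one pair of coordinate lists per cell, in voxel units) are all assumptions made for illustration.

```python
import numpy as np

def min_distance_um(fish_xyz, body_xyz, voxel_um):
    """Smallest centre-to-centre distance (µm) between any FISH signal
    and any nuclear body in one cell. Coordinates are in voxel units."""
    scale = np.asarray(voxel_um, dtype=float)
    fish = np.atleast_2d(np.asarray(fish_xyz, dtype=float)) * scale
    bodies = np.atleast_2d(np.asarray(body_xyz, dtype=float)) * scale
    diffs = fish[:, None, :] - bodies[None, :, :]
    return float(np.sqrt((diffs ** 2).sum(axis=2)).min())

def association_frequency(cells, voxel_um, threshold_um):
    """Fraction of cells with at least one signal within the threshold.

    `cells` is a list of (fish_xyz, body_xyz) pairs, one pair per cell.
    """
    hits = sum(
        min_distance_um(fish, bodies, voxel_um) <= threshold_um
        for fish, bodies in cells
    )
    return hits / len(cells)
```

Scaling by the voxel size before taking the distance matters because the axial and lateral sampling of a confocal stack usually differ.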

Significant differences (p<0.05) in association frequencies were determined by taking the two-tailed p value under an exact Fisher test compared with random controls. To calculate a lower estimate for the p value associated with finding two sequences within 2 Mb of the histone cluster on chromosome 6 ($P_{\text{insidewindow}}$), we considered it as the complement of the probability of a sequence not being within a 4 Mb window centered on the histone cluster ($P_{\text{outsidewindow}}$). This gives us: $P_{\text{insidewindow}} = 1 - P_{\text{outsidewindow}} = 1 - \frac{G - 4\ \text{Mb}}{G} \approx 0.0013$, where $G$ is the number of base pairs in the human haploid genome. We then used this value as the probability of success in a binomial test. To determine a p value for the enrichment of chromosome 6 sequences obtained from the HLB dissections, we considered an upper estimate of the chance of getting a sequence from chromosome 6 as: (size in base pairs of chromosome 6)/(size of haploid genome) $\approx (171 \times 10^{6})/(3 \times 10^{9}) = 5.7 \times 10^{-2}$. We then used this value as the probability of success in a binomial test. To estimate the percentage of alpha satellite DNA in mouse chromocentres, I measured the percentage of DNA in chromocentres by DAPI staining, $C_{tot}$ (approximately 30% from integrated intensity measurements of DAPI staining), and compared it with the reported estimate of alpha satellite sequence in the mouse genome, $\mathrm{Alpha}_{tot}$ (7%; Rattner et al. 1987). Assuming all alpha satellite DNA is found in chromocentres, the percentage of chromocentre DNA that is alpha satellite, $\mathrm{Alpha}_{c}$, is then: $\mathrm{Alpha}_{c} = \mathrm{Alpha}_{tot}/C_{tot}$. This estimation of course does not account for the sequence specificity of DAPI staining (i.e. higher affinity for A-T rich sequences), and so may underestimate the actual percentage of alpha satellite sequence in the chromocentre.
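The window probability and the binomial test described above can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the function names are hypothetical, the genome size is rounded to 3 x 10^9 bp as in the text, and the upper tail P(X >= k) is taken as the p value. The number of sequences tested (n) must be supplied by the user and is not given in this section.

```python
from math import comb

GENOME_BP = 3e9  # rounded human haploid genome size used in the text

def window_probability(window_bp, genome_bp=GENOME_BP):
    """Chance that a random sequence falls inside a window of this size,
    written as the complement of falling outside it."""
    return 1.0 - (genome_bp - window_bp) / genome_bp

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(
        comb(n, i) * p ** i * (1.0 - p) ** (n - i) for i in range(k, n + 1)
    )
```

For example, `binomial_upper_tail(2, n, window_probability(4e6))` gives the chance of at least two of n sequences landing in the 4 Mb window, for whatever n applies to the experiment.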


Chapter 4: LTOL to determine DNA sequences at single PML bodies reveals a paired loci association

David Anchel, Rachel Cotton, Ren Li, David P Bazett-Jones*

*Prof. D. P. Bazett-Jones, Genetics and Genome Biology Program, The Hospital for Sick Children, M5G 1X8

4.1 Introduction

As outlined in the last chapter, the development of LTOL allows for the identification of DNA in the vicinity of a selected nuclear sub-structure in a single cell. The ability to identify genetic loci that localize in the vicinity of a single nuclear body is particularly suited to the interrogation of genomic/nuclear body associations that are not amenable to population-based biochemical approaches. This can occur when the nuclear body in question is “unique” in some way from other bodies in the same cell with similar biochemical or compositional properties. It is noteworthy that these bodies are often found in cell lines with an accompanying gene dysregulation (Luciani et al. 2006; Torok et al. 2009), as it suggests that the formation or maintenance of these structures is dependent on, or contributes to, the regulation of the surrounding chromatin microenvironment. Such "unique" bodies are found in the acute promyelocytic leukemia derived NB4 cell line. In these cells, a fusion of the PML gene with the Retinoic Acid Receptor Alpha gene (PML-RARα) results in a differentiation block of myeloid progenitor cells and a dysregulation of RARα target genes (de Thé et al. 2012) that is concomitant with a dispersal of PML protein from some PML body components (e.g. SP100) into smaller, more numerous "micro-PML" accumulations. While the majority of these "micro-PML" foci are representative of the PML-RARα fusion protein at RARα target genes (Pitha-Rowe et al. 2003; Torok et al. 2009; de Thé et al. 2012), our lab has observed that in some cells there are one to three PML bodies that remain intact, as judged by their larger size (Figure 4.1a) and the presence of PML body constituent proteins (i.e. SP100; Torok et al. 2009; Figure 4.1b). We hypothesize that the integrity of the intact bodies is dependent on specific genomic contacts that are perhaps stronger than the affinity of the PML-RARα fusion protein for RARα target genes (Torok et al. 2009).

If so, then these bodies represent "genomic hubs" that, as described in the introduction, may act as specialized regulatory centres for specific gene sets (either separated by relatively large distances along the same chromosome fibre or lying on separate chromosomes) that are convergent on them. Identifying these convergent genomic interactions may allow us to identify the shared factors involved in their regulation, and ultimately to understand these interactions in the larger context of genetic control. Their identification may also reveal co-regulatory relationships that, as they may represent nascent regulatory steps that precede an epigenetic or transcriptional state change, cannot be detected with traditional approaches (i.e. bisulphite sequencing, Frommer et al. 1992; transcriptome analysis, Wang and Bodovitz 2010). Furthermore, just as the histone processing role of HLBs was revealed by their frequent co-localization with the histone gene cluster, a starting point towards elucidating a structure's regulatory role is to identify its frequent genomic associations.

As explained above, ChIP would not be suitable to elucidate the genomic contacts surrounding these “unique” PML bodies, since the precipitate would also contain genomic DNA in contact with the micro-PML accumulations. I report here the application of the LTOL technique to the "unique" PML bodies found in the NB4 APL cell model (Torok et al. 2009). From the targeting of these bodies, I found two DNA sequences, originating from the same targeting event, that frequently co-localize with each other and with PML bodies in a cell population. These sequences lie in genomic regions that are significantly enriched in shared transcription factor binding sites. Furthermore, these same shared binding sites are also found enriched in loci obtained from a population-based screen of PML body associating genes in the Jurkat cell line, thus suggesting a common determinant of PML body association. These results underscore the value of our technique in the detection of convergent loci at a shared nuclear body, both as a starting point for generating hypotheses about the sequence-specific nature of the association, and more generally as an engineering benchmark, given its ability to find loci/nuclear body associations in those bodies that cannot be dissected by conventional biochemical assays.


4.2 Results

4.2.1 Identification of paired loci that associate with PML bodies

As explained above, only between 1 and 3 intact PML bodies are found in the acute promyelocytic leukemia-derived NB4 cell line, in a background of numerous micro-PML foci. The small foci presumably arise from the dispersal of normal PML bodies when the fusion transgene is expressed (Pitha-Rowe et al. 2003; Torok et al. 2009; de Thé et al. 2012). The LTOL technique, then, is particularly suited to interrogate those particular PML bodies in the NB4 cell line that retain their size and compositional integrity, since they may prevail because of particularly strong affinities with specific, convergent genomic contacts. We obtained several bona-fide sequences (i.e. containing the flanking "signature" sequence; see above and Materials and Methods) from the targeting of single intact PML bodies (i.e. containing both SP100 and PML) that mapped unambiguously to genomic loci (Figure 4.1c). To determine whether any of these regions are specifically associated with these intact bodies in a population of NB4 cells, we performed immuno-FISH using fluorescently labeled BACs that overlapped with these sequences, and measured their frequency of association with intact PML bodies. We observed a significant association of two BACs with PML bodies, mapping to p11.2 and q13.31 of chromosomes 17 and 20 respectively (Figure 4.1d). These two sequences originated from the targeting of a single PML body, raising the possibility that their convergence at a shared PML body may be more frequent than that predicted by considering their independent association. Indeed, paired FISH of these two loci revealed that they significantly co-localized with each other compared with the random controls (Figure 4.2a), and although their colocalization could occur in the absence of a common PML body, the chromosome 20 locus was significantly closer to a PML body when a chromosome 17 locus was within 1µm of the body (one-tailed Mann-Whitney U test, p<.008; Figure 4.2b). Taken together, these results lead us to hypothesize that regulatory elements within these two loci are in some way driving their paired association with each other and/or with the PML body.
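The one-tailed comparison of the two distance sets can be sketched in pure Python. This is an illustrative simplification, not the statistical software used for the reported p value: the function name is hypothetical, the p value uses the normal approximation to the U statistic with no continuity or tie correction, and the distances used to exercise it are invented.

```python
from math import erf, sqrt

def mann_whitney_u_less(set_1, set_2):
    """One-tailed Mann-Whitney U test that set_1 tends to be smaller
    than set_2. Returns (U for set_1, approximate p value)."""
    n1, n2 = len(set_1), len(set_2)
    # U counts the pairs in which the set_1 value exceeds the set_2 value;
    # ties contribute one half.
    u1 = sum((a > b) + 0.5 * (a == b) for a in set_1 for b in set_2)
    mean = n1 * n2 / 2
    sd = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mean) / sd
    # Lower-tail probability of the standard normal distribution.
    p_value = 0.5 * (1 + erf(z / sqrt(2)))
    return u1, p_value
```

With small samples an exact test would be preferable to the normal approximation used in this sketch.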


Figure 4.1 LTOL of single PML bodies in NB4 cell line. (a) Immunofluorescence with PML antibody in NB4 cells. Arrows indicate the presence of "unique" PML bodies, identified by intense foci in a "microspeckled" background. (b) Laser targeting of intact PML bodies. "Unique" PML bodies are identified by overlapping immunofluorescence of PML and SP100 signals prior to laser targeting. After targeted irradiation and ligation with a fluorescent double stranded probe, the probe signal is verified to be enriched at the volume containing the PML body. (c) Summary of genomic mapping of sequences obtained by laser targeting of single PML bodies. Sequences obtained from the targeting of the same PML body are indicated by the colour of the bar, which is later used to indicate the genomic location of the sequence. (d) Measurement of the 3D distance of FISH signals from PML bodies in NB4 cells indicated a significant association frequency of the hits mapping to chromosomes 17 and 20 (*p<.015; **p<.006). Inset: an example of loci associating with a PML body: green (PML), red (chr20 locus, left inset; chr17 locus, right inset), scale bars 5µm.


Figure 4.2 Chromosome 17 and 20 loci obtained by LTOL associate with each other and show interdependence in PML association. (a) Significant frequency of association (i.e. less than 1µm distance) of chromosome 17 locus (a) with chromosome 20 locus (b) compared with random controls (*p<.003). Inset: example of frequent interchromosomal association of chromosome 17 (red) and chromosome 20 (green) loci. (b) A Mann-Whitney U test was performed to compare the distances of the chromosome 20 locus to a PML body that has a chromosome 17 locus within 1µm of it (set 1) with the distances to those PML bodies that do not (set 2). The chromosome 20 locus was significantly closer in set 1 compared with set 2 (p<.008). (c) Example of the closer association of the chromosome 20 locus (green) at a PML body (blue) with a chromosome 17 locus (red) within 1µm of it.


4.2.2 Inter-loci and PML association frequency in ATRA treated cells

The frequent inter-loci association and the interdependence of PML association seen with the chromosome 17 and 20 loci could be explained by their shared affinity for a factor, which, considering their shared origin in the vicinity of a single targeted "unique" PML body, could be present in particular subsets of PML bodies within a single cell. One way to address this hypothesis is to see whether the interdependence of the chromosome 17/20 loci association with a shared PML body is retained in the presence of an increased number of reconstituted PML bodies. To observe the frequency of inter-loci and loci-PML associations in a background of increased PML body number, I treated NB4 cells with all-trans retinoic acid (ATRA), which reverses the disintegration of PML body components through degradation of the PML-RARα fusion protein (de Thé et al. 2012). Consistent with their specific affinity for PML bodies, both the chromosome 17 and 20 loci showed significantly greater PML association compared with random controls in the ATRA treated cells (Figure 4.3). Although the frequency of 17/20 inter-loci association remained unchanged in ATRA-treated vs. untreated cells (Figure 4.4), the interdependence of PML association seen in the untreated cells (Mann-Whitney U test, as described in Materials and Methods) was not seen in the ATRA-treated cells. These results suggest that the chromosome 17 and 20 loci do not necessarily share an affinity for a specific PML "subtype" among a normal complement of intact PML bodies. However, because their inter-loci association is maintained in ATRA treated cells, their frequent association in untreated cells is not merely an artifact of their shared affinity for the relatively limited number of PML bodies that are available to associate with.


Figure 4.3 Frequency of PML body association of loci obtained by LTOL is preserved in all-trans retinoic acid (ATRA) treated NB4 cells. NB4 cells were treated for 24 with ATRA and the distances from PML bodies of FISH signals corresponding to the indicated loci were measured (r1, r2, a, b correspond to two random BACs, and the loci on 17 and 20 respectively, as illustrated in Figure 4.1). Shown are the means and standard deviations of two independent experiments (*p<.002; **p<.0002).


Figure 4.4 Frequency of inter-loci association of sequences obtained by LTOL is preserved in all-trans retinoic acid (ATRA) treated NB4 cells. NB4 cells were treated for 24 with ATRA and the distances between FISH signals corresponding to the two loci found to frequently associate with PML bodies were measured (the loci on 17 and 20 respectively, as indicated by "a" and "b" in Figure 4.1). Shown are the means and standard deviations of two independent experiments.

4.2.3 Common transcription factor binding sites enriched at convergent loci

I hypothesized that regulatory elements within these two loci are in some way driving their paired association with each other and/or with a PML body. As a starting point for this investigation, I explored the possibility that a shared regulatory factor may mediate the interaction with the PML body and with each locus. Using the GSEA ontology database (Ashburner et al. 2000), we analyzed a 500kb window on either side of the sequences from chromosome 17 and chromosome 20. Remarkably, there is a significant enrichment of shared consensus sequences for several transcription factors between these two regions. In particular, an enrichment for SP1 and MYB consensus sites was found within the 500kb window on chromosome 17 (p<.0002 and FDR-q value<.03; see Tables 4.1 and 4.2). Furthermore, the SP1 consensus site is non-randomly distributed throughout the genome, and the cytogenetic band of chromosome 17 which contains the sequence found in our LTOL assay (chr17p11) is the third most highly enriched SP1 consensus site cluster in the genome (p<.000175, FDR-q value<.015; see Figure 4.5). SP1 is a known component of PML bodies that transactivates the targets of the PML/RARα fusion protein in NB4 cells (van Wageningen et al. 2007), and MYB's interaction with the PML component CBP has been implicated in the differentiation block of promyelocytes (Pattabiraman et al. 2009; Gonda 2008). Hence, we asked whether these consensus sites could be found at other loci that are frequently found associated with PML bodies. We examined the top 100 sequences on a promoter microarray of PML body-associated sequences obtained by Immuno-Trap (Ching et al. 2013), and indeed, they were significantly enriched in the same SP1 and MYB consensus sites (p value<.0000015 and FDR q value<.00003, SP1 consensus; p value<.0000275 and FDR q value<.00026, MYB consensus; Table 4.2). Thus, as is uniquely afforded by the LTOL approach, through determining DNA sequences in the vicinity of a single nuclear body, we have found convergent loci that share commonality among the transcription factor binding sites significantly enriched in them. A similar finding of gene convergence at subsets of transcription factories that are enriched in their cognate transcription factors (Schoenfelder et al. 2010) leads us to conclude that the spatial association of co-regulated genes at a shared nuclear body is a common theme in nuclear organization, and may underlie a novel co-regulatory mechanism. If so, we propose LTOL as a crucial method that can be generally applied towards detecting this phenomenon at other unique nuclear body subtypes.
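One common way to score the overlap between a list of genes and an annotated gene set is a hypergeometric tail probability. The Python sketch below shows that calculation with hypothetical names; whether it matches the exact statistic behind the GSEA p values reported here is an assumption, and it does not compute an FDR q value.

```python
from math import comb

def hypergeom_upper_tail(overlap, set_size, sample_size, population):
    """P(X >= overlap) when `sample_size` genes are drawn without
    replacement from `population` genes, `set_size` of which carry the
    annotation (for example, a consensus site in the promoter)."""
    max_overlap = min(set_size, sample_size)
    total = comb(population, sample_size)
    favourable = sum(
        comb(set_size, k) * comb(population - set_size, sample_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return favourable / total
```

Here `population` would be the number of genes in the annotation universe and `sample_size` the number of genes in the 500kb windows or in the top 100 microarray hits; both are inputs the user must supply.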

Table 4.1 GSEA results of genes near chromosome 17 and 20 loci. Significant ontology clustering for genes within 500kb of chromosome 17 and chromosome 20 hits (Gene sets refer to the GSEA gene annotation see ref[21]; FDR - False Discovery Rate). Gray rectangle at the intersection of BMP7 and the SP1 consensus binding site gene set indicates that BMP7 contains a degenerate variant of the SP1 binding site (see Table 4.2).


Figure 4.5 "Hotspots" of SP1 binding sites in genome. Pink shaded regions represent chromosomal regions enriched in SP1 consensus binding sites as determined by GSEA cross-referencing the members of this gene set against the positional gene set database. Within these regions, there is further clustering of SP1 binding sites: indicated by the dark red bars are those regions that have at least 3 genes that contain the SP1 binding site in their promoter within a 100kb span. Note that the chromosome 17 sequence obtained by LTOL in NB4 cells is contained in one of these highly enriched sites.
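The clustering criterion in this caption (at least 3 genes with an SP1 site in their promoter within a 100kb span) can be expressed as a sliding-window scan. The Python sketch below is illustrative: the function name is hypothetical, positions are assumed to be single coordinates (for example, transcription start sites) on one chromosome, and overlapping qualifying windows are merged, so a reported span can be longer than 100kb.

```python
def clustered_spans(positions, min_genes=3, window_bp=100_000):
    """Return (start, end) spans holding at least `min_genes` positions
    within `window_bp`; overlapping qualifying windows are merged."""
    pos = sorted(positions)
    spans = []
    for i in range(len(pos) - min_genes + 1):
        last = pos[i + min_genes - 1]
        if last - pos[i] <= window_bp:
            if spans and pos[i] <= spans[-1][1]:
                # This window overlaps the previous span, so extend it.
                spans[-1] = (spans[-1][0], max(spans[-1][1], last))
            else:
                spans.append((pos[i], last))
    return spans
```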


Column headings: Top 100 genes ordered by integrated intensity in microarray from Immuno-TRAP of PML bodies in Jurkat cells9; +/-500kb Chromosome 20 locus; +/-500kb Chromosome 17 locus.

DMD MUM1L1 SRY GPR132 MC3R RAI1 TSPY1 EFNA2 CCDC57 GNAS RTFDC1 smcr5 LOC728137 PPP1R1B C16orf30 LRCH2 FAM210B sreBf1 RBMY1A1 HDAC10 SLCO4A1 ABBA-1 AURKA Tom1L2 NDRG4 SYT8 TMSB4Y SNAPC2 CSTF1 LrrC48 CYorf15A FGF13 CYorf15B RAB40C CASS4 ATPaF2 NLGN4Y C10orf39 ELF4 TCEA2 TFAP2C DRG2 DDX3Y C11orf2 FAM57B TRAF7 BMP7 Myo15A DTNA WDR45 STAG2 ADSSL1 GCNT7 ALKBH5 CDY1B RNF126 CDY2A BGN FAM209A FLII PCDH11Y AIFM3 TSPY2 WDR24 FAM209B SMCR7 TBL1Y SLIT1 ATP2B3 KCNJ12 TOP3A HSFY2 CRAMP1L PLS3 TNFRSF6B SHMT1* RBMY1F COL18A1 EVI5L CYB561 TRIM16L RP1-32F7.2 FURIN KIAA0182 SNN FOX03B AMELY PTGER1 ABCA7 OPN4 EVPLL PRR5 RAI1 DGKZ BCL9L FBXW10 ADORA2A CDC42BPG LOC197322 NRXN2 WDR90 SLC4A1 HTR2C GNA11 DUSP9 ANKRD13D WDR81* RARA NARFL ZFY CENTD2 CBFA2T3 SLC25A22 DPEP1 AIRE NR4A1 TMEM105 FSCN2 MAP2K3 SSBP4 C19orf34 NXPH4 ITGB2 SOX5* UBE1 CCND2 CLUL1 ATP6V0C

Table 4.2 Enrichment of MYB and SP1 binding sites in genes obtained from two different PML body association screens. Genes containing MYB consensus sequence binding sites (either NAACNGNCN with 267 sites in the entire genome, FDR < $1.2 \times 10^{-4}$; or NNNGNCAGTTN with 250 sites in the entire genome, FDR < $1.07 \times 10^{-4}$) in their promoter are highlighted in yellow. Genes with the degenerate SP1 consensus sequence (GGGCGGR; 2940 in entire genome, FDR < $1.18 \times 10^{-4}$) are highlighted in red. Genes that also contain the non-degenerate SP1 consensus sequence (GGGGCGGGGC; 243 in entire genome, FDR < $1 \times 10^{-4}$), or also contain a MYB consensus sequence, are also in bold and/or asterisked respectively.
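The degenerate consensus sequences in this caption use IUPAC codes (R for A or G, N for any base). The Python sketch below shows one way to scan a promoter sequence for such sites. The function names are hypothetical, only the forward strand is searched, and this is not the annotation pipeline behind the gene sets used here.

```python
import re

# IUPAC codes needed for the consensus sequences quoted in the caption.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",    # purine
    "N": "[ACGT]",  # any base
}

def consensus_to_regex(consensus):
    """Compile a consensus into a lookahead pattern so that overlapping
    matches are all counted."""
    body = "".join(IUPAC[code] for code in consensus.upper())
    return re.compile("(?=(" + body + "))")

def count_sites(sequence, consensus):
    """Number of forward-strand matches of the consensus in the sequence."""
    return len(consensus_to_regex(consensus).findall(sequence.upper()))
```

Scanning the reverse complement as well would be needed to count sites on both strands.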


4.3 Discussion

The work here demonstrates the application of the LTOL method to identify DNA sequences in the vicinity of "unique" sub-nuclear structures that cannot be isolated biochemically. As demonstrated in the previous chapter, LTOL is sufficiently sensitive to provide sequenceable DNA from the targeting of a single nuclear body in a single cell. It could therefore be applied to the identification of sequences that originated from their shared association at single intact (i.e. containing both PML and SP100 protein) PML bodies in the APL model NB4 cell line. The one or two such "unique" intact PML bodies in NB4 cells present a challenge to conventional biochemical approaches, since their constituent PML proteins are found in other PML-containing microfoci in the same nucleus. They are also particularly interesting candidates as "gene hubs", because their integrity may be maintained by their affinity for specific chromatin contacts (Torok et al. 2009). Our initial targeting has so far revealed two loci that frequently associate both with PML bodies and with each other. The specific sequences involved in these associations are not known. However, our finding of shared transcription factor binding consensus sites between these loci leads us to hypothesize that shared regulatory factor(s) may mediate the interaction with the PML body and with each locus. Given its role in the differentiation of promyelocytes (Pattabiraman et al. 2009; Ramsay and Gonda 2008), we are particularly interested in the shared enrichment between the two loci for the MYB transcription factor consensus sequence (see Table 4.1). The enrichment of SP1 consensus sequences at the chromosome 17 locus is also noteworthy in light of SP1's role as a transactivator through PML-RARα in NB4 cells (van Wageningen et al. 2008), and the presence of SP1 consensus sites in the promoters of the PML body-associated Adenoviral genome (Ishov and Maul 1996; Parks et al. 1997; Heysen et al. 1991). Furthermore, the significant enrichment of these two transcription factor binding sites in the promoter regions of genes obtained from a previous PML-association screen in Jurkat cells (Ching et al. 2013) suggests they may act as a common determinant of PML association between cell lines. Although MYB shows no identifiable clustering throughout the genome, the SP1 consensus site is highly clustered along regions of chromosomes 11, 12, 19 and 17 (Figure 4.5). We predict that these regions also frequently associate with PML bodies.

It is unclear from the results whether the frequent association of the chromosome 17 and 20 loci with each other is related to their individual affinities for PML bodies. Although there appears to be an interdependence in each locus's PML association in untreated cells (see Figure 4.4), this effect is not seen upon ATRA treatment, even though the frequency of loci/loci association remains unaffected. One possibility is that the loci are brought together through complexes nucleated on common binding sites enriched in each, and which contain factors that in turn promote PML association; after disintegration of this complex, each locus remains loaded with high-affinity factors which will promote a PML body association. Thus, although there may be no direct inter-loci dependence for PML body association at a shared body, the loci may be "poised" for PML body association by those factors that are cooperatively loaded in an initial loci/loci association. A consequence of this cooperative loading model is that the nucleation of high-affinity PML-binding factors on one locus will in turn increase the likelihood of nucleation of those same factors onto the other, and thus they would show similar dynamics in their PML association. Thus, in the untreated NB4s where there are relatively few PML bodies to associate with, cooperatively "poised" loci are more likely to find the same PML body.

Although my data does not show evidence of PML body "subtypes" with unique constituent proteins that mediate interactions with specific subsets of convergent loci, it certainly does not rule out their existence. In any case, this initial application of LTOL has demonstrated its value in finding genomic associations with a shared affinity for each other and a nuclear substructure, and as a starting point for future investigations into the nature of these associations. I expect its further application, both to other "unique" PML bodies in the NB4 cell line and in other cell models with concomitant gene dysregulation (Luciani et al. 2006), will yield similar sets of convergent loci that share enrichments in regulatory factor binding sites, which will shed further light on a more general mechanism of co-regulation through shared nuclear body association.


4.4 Materials and Methods

Table M1. List of BACs used in this study

Map designation (from Figure)    Overlapping BAC    Number of cells analyzed by FISH
a (4.3b)                         Rp11-145H11        154
b (4.3b)                         Rp11-936C14        137
c (4.3b)                         Rp11-100C13        133
d (4.3b)                         Rp11-164P9         127
e (4.3b)                         Rp11-254M15        125
f (4.3b)                         Rp11-768D16        130
g (4.3b)                         Rp11-493D24        76
Rndm1 (4.3b)                     Rp11-79E8          132
Rndm2 (4.3b)                     Rp11-89D23         103

Preparation of PEN coverslips: A small “keystone” shape was etched into 25mm round glass coverslips using a diamond knife and an overlaid stencil. The keystoned coverslips were washed and dried in 90% alcohol. Sheets of 1.35µm PEN film (kindly provided by PALM) were cut into approximately 18mm square pieces and floated on nuclease-free water. The keystoned coverslips were brought underneath the floating PEN film and lifted out so that the PEN film was evenly overlaid on the coverslips with the etched keystone roughly in the centre of the PEN film. The “PEN coverslips” were then dried in a 60ºC oven overnight. Rubber cement was then applied to the border of the overlaid PEN film and dried at room temperature overnight. Prior to cell culture the PEN coverslips were UV irradiated in a biohood for 30 minutes.

Immunolabelling/Preparation for Laser Targeting: Cells were incubated on PEN coverslips overnight. After rinsing out the media with phosphate buffered saline (PBS), cells were fixed for 10 minutes with 4% paraformaldehyde (PFA) in PBS. Cells were then washed three times for five minutes each in PBS, and permeabilized for five minutes with 0.1% Triton X-100 in PBS followed by three more PBS washes of five minutes each. In order to prevent endogenous DSBs from being substrates for probe ligation and subsequent amplification, we blocked cells with a 13bp hairpin oligo (5’ GCG CTA GAC C*G GTC TAG CGC 3'; *internal Cy5 conjugate) that did not contain sequences complementary to the primers used for the amplification procedure. We followed the “Oligo Ligation of Targeted Double Strand Breaks” protocol as detailed below, except that the hairpin blocking oligo was used in place of the targeting probe (see "Details of Oligos"). Cells were then immuno-labeled as previously described27 using either mouse anti-NPAT (Abcam; HT1080 cells) or rabbit anti-PML (PGM-3, Santa-Cruz; NB4 cells) primary antibody and anti-rabbit (NB4 cells) or anti-mouse (HT1080 cells) Alexa 488 secondary antibody (Invitrogen). Cells were then incubated with Hoechst 33258 diluted to 0.5µg/mL in PBS, and covered in tin foil to avoid light exposure.


Induction of DNA Damage in Subnuclear Regions: Using a confocal fluorescence microscope (LSM510 META; Carl Zeiss MicroImaging, Inc.) equipped with an argon laser tunable to 458, 488, and 514nm transmission, and a Chameleon two-photon laser transmitting a maximum power of 1300mW at 780nm (Coherent), keystones were located and situated under the objective either by direct visual inspection or under white light through a 10X Plan-Neofluar NA 0.3 objective. The field was scanned with 488nm excitation, visualized through a band pass filter (505-530nm) with the 10X objective, and a low magnification image was taken. A chosen cell or cluster of cells within the keystone was visualized as above under the 63X C-Apochromat NA 1.4 objective, and a high magnification image was taken. Using the LSM 510 photobleaching software, regions of interest (ROIs) were drawn in the high magnification image corresponding to subnuclear regions of the chosen cell(s) to dictate the path of the two-photon laser. The ROIs were irradiated by femtosecond 780nm pulses from the Chameleon laser, for an effective two-photon absorption wavelength of 390nm. Various laser settings (combinations of 25%, 30%, and 40% transmission power with 25 or 50 iterations per pixel) were used in the initial labeling experiments. In the targeted laser damage and subsequent amplification of DNA described here, a laser power of 30% with 50 iterations per pixel was used. Single nuclear bodies were targeted as follows: a pinhole size corresponding to an optical section of 0.7µm was used to take an image under 488nm excitation. Using ImageJ, an 8-bit grey scale 512X512 image was created from a mask of the original image after thresholding to the minimum level of the chosen chromocentre’s fluorescent signal. The pixel corresponding to the centre of mass of the chosen chromocentre’s fluorescence signal in the masked image was located using an automated macro, and the location of this pixel was encoded as a binary 512X512 array or “text image”. A Zeiss macro then positioned the laser pulse according to the resultant text image, and a 750nm two-photon pulse was applied for a duration of 1 second at approximately 900mW with AOM attenuation set at 4%. A subsequent image was taken under 488nm excitation to confirm that the irradiated spot was accurately targeted (the 750nm two-photon irradiation is sufficient to bleach the Alexa 488 signal in the targeted region).
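The single-body targeting step described above (threshold, centre of mass, binary "text image") can be sketched in a few lines. This is a minimal pure-Python illustration and not the ImageJ or Zeiss macro that was actually used; the function name, the intensity-weighted form of the centre of mass, and the toy 8X8 image (in place of a 512X512 image) are all assumptions made for illustration.

```python
def target_text_image(image, threshold):
    """Return (row, col, mask), where mask is a binary array holding a single 1
    at the intensity-weighted centre of mass of all pixels >= threshold."""
    total = 0.0
    row_sum = 0.0
    col_sum = 0.0
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            if value >= threshold:
                total += value
                row_sum += r * value
                col_sum += c * value
    if total == 0:
        raise ValueError("no pixel reaches the threshold")
    r0 = round(row_sum / total)
    c0 = round(col_sum / total)
    # Binary "text image": zero everywhere except the pixel to be irradiated.
    mask = [[0] * len(image[0]) for _ in image]
    mask[r0][c0] = 1
    return r0, c0, mask


# Toy 8-bit image (hypothetical): dim background with one bright 3x3 "body".
image = [[10] * 8 for _ in range(8)]
for r in range(3, 6):
    for c in range(4, 7):
        image[r][c] = 200
image[4][5] = 250

row, col, mask = target_text_image(image, threshold=128)
print(row, col)
for line in mask:
    print(" ".join(str(v) for v in line))
```

Thresholding before taking the centre of mass keeps background pixels and neighbouring structures from pulling the target away from the chosen body.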

Oligo Ligation of Targeted Double Strand Breaks: To ligate the oligo (either hairpin or double stranded) to the laser targeted double strand breaks, the protocol of Didenko et al. (2003)20 was followed with slight modifications. For the sake of illustration, any original details of the protocol that have been changed since its optimization are indicated in square brackets ([]). After the laser damage, the cells were washed three times in PBS for five minutes each, and incubated for 1 hour [30 minutes] at 37ºC in Klenow buffer (70mM TrisCl pH 7.5, 70mM MgCl2, 1M dithioerythritol (DTT)) with 100U/mL Klenow, and 2.5mM each of dGTP, dATP, dCTP, and dTTP. After the Klenow reaction, the cells were washed three times for five minutes each in PBS. Next, the cells were incubated for ten minutes in ISOL buffer (1X T4 DNA ligase reaction buffer (Fermentas) supplemented with 15% polyethylene glycol 8000 (PEG 8000), 0.5mM adenosine triphosphate (ATP), and 0.05mg/mL bovine serum albumin (BSA)). The cells were then incubated with ISOL buffer with the addition of 100U/mL T4 DNA Ligase (Fermentas), and the appropriate oligo (hairpin oligo 35µg/mL, double stranded oligo 0.29nM). After 18 hours [3 hours], the cells were washed 3X5 minutes with 0.1% Tween 20/2XSSC solution at 42ºC, and 3X5 minutes at 60ºC with 0.5XSSC solution. The efficiency of the ligation was confirmed by the Cy3 signal under fluorescence microscopy, and the cells were then prepared for microdissection as follows: the cells were dipped for two minutes in a filter purified 5% dilution of hematoxylin (Sigma) in PBS, washed in nuclease-free water for 30 seconds, dehydrated for two minutes each in 70%, 90%, and 100% ethanol dilutions, and then air dried for one hour.

Details of Oligos: The sequence of the hairpin is: 5’ GCG CTA GAC C*G GTC TAG CGC 3', where the asterisk denotes an internal Cy5 conjugate. The Cy3 conjugated double-stranded targeting oligo used for the ligation of DSBs subsequent to induced laser damage is: strand 1- 5' (Cy3)AGT GGG ATT CTT GCT GTC AGT TAG CTG 3', strand 2- 5' CAG CTA ACT GAC AG(ddC) 3' (ddC: dideoxy C). The nucleotides in bold indicate the "signature sequence" used to indicate a bona-fide ligation event (Figure 4.1). Note that this oligo linker contains the priming sites for the PCR amplification (adapted from Langer et al. 2005)36.
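The relationships between these oligos can be checked directly from the sequences as printed. The sketch below is an illustrative check only: the helper name is hypothetical, the fluorophore labels are dropped, and the 3' dideoxy C is treated as a plain C purely for string comparison.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


# Sequences copied from the text with spaces and modifications removed
# (assumption: ddC is represented as C; Cy3/Cy5 labels are ignored).
HAIRPIN = "GCGCTAGACCGGTCTAGCGC"
STRAND_1 = "AGTGGGATTCTTGCTGTCAGTTAGCTG"
STRAND_2 = "CAGCTAACTGACAGC"
PRIMER = "AGTGGGATTCTTGCTGTCAGTTA"

# The blocking hairpin should fold back on itself.
hairpin_is_self_complementary = reverse_complement(HAIRPIN) == HAIRPIN
# Strand 2 should anneal to the 3' end of strand 1, leaving a 5' overhang.
strand_2_pairs_with_3prime_end = STRAND_1.endswith(reverse_complement(STRAND_2))
# The PCR primer should correspond to the 5' end of strand 1.
primer_matches_5prime_end = STRAND_1.startswith(PRIMER)
overhang_nt = len(STRAND_1) - len(STRAND_2)

print(hairpin_is_self_complementary, strand_2_pairs_with_3prime_end,
      primer_matches_5prime_end, overhang_nt)
```

A check of this kind is a quick way to confirm that the blocking hairpin shares no priming site with the targeting linker before ordering or reusing oligos.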

Microdissection of Cells and DNA Extraction: After probe ligation, targeted cells were visualized using the Zeiss LSM 510 microscope to confirm enrichment of probe ligation in the targeted region. A low magnification (10X objective) bright field image of the keystone was taken and annotated with the location of the targeted cells. In preparation for the microdissection, the cells were stained with hematoxylin as follows: cells were dried at 37ºC for thirty minutes, dipped for two minutes in a filter purified 5% dilution of hematoxylin (Sigma) in PBS, washed in nuclease-free water for 30 seconds, and then air dried for one hour. For microdissection, a PALM (PALM Microsystems Inc.) LMPC microscope was used. The targeted cells corresponding to the annotated low magnification bright field image were located in the keystone under the 40X objective of the microscope. The targeted cells were catapulted into the cap of a 200µL nuclease-free eppendorf tube (Ambion) containing 4.5µL of DNA extraction buffer: 0.5µl of 10X One-Phor-All-Buffer-Plus (Amersham Pharmacia Biotech), 0.13µl 10% Tween 20 (Sigma, Germany), 0.13µl 10% Igepal CA-630 (Sigma), 0.13µl Proteinase K (10 mg/ml, Sigma). The eppendorf tube with lysis buffer and suspended cell in cap was then incubated in a PCR machine with a heated lid at 42ºC and block temperature at 70ºC for 16 hours. The reaction was then spun down and heated to 80ºC for 10 minutes in order to inactivate the Proteinase K.
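The extraction buffer recipe above can be worked through as simple arithmetic. The sketch below is only an illustration: it assumes that the balance of the 4.5 µL volume is nuclease-free water, which the text does not state, and the variable and function names are hypothetical.

```python
# Component volumes in microlitres, as listed in the text.
components_ul = {
    "One-Phor-All-Buffer-Plus": 0.5,
    "10% Tween 20": 0.13,
    "10% Igepal CA-630": 0.13,
    "Proteinase K (10 mg/ml)": 0.13,
}
TOTAL_UL = 4.5

listed_ul = sum(components_ul.values())
# Assumption: the remainder is nuclease-free water (not stated in the text).
balance_ul = TOTAL_UL - listed_ul


def final_concentration(stock, volume_ul, total_ul=TOTAL_UL):
    """Dilute a stock concentration by the fraction volume_ul / total_ul."""
    return stock * volume_ul / total_ul


tween_percent = final_concentration(10.0, components_ul["10% Tween 20"])
proteinase_k_mg_per_ml = final_concentration(
    10.0, components_ul["Proteinase K (10 mg/ml)"])

print(f"listed components: {listed_ul:.2f} ul, balance: {balance_ul:.2f} ul")
print(f"Tween 20: {tween_percent:.2f} %, "
      f"Proteinase K: {proteinase_k_mg_per_ml:.2f} mg/ml")
```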

DNA Extraction and PCR Amplification: We adapted the protocol of Langer et al. (2005)36 as follows: PCR was performed in 50µl total volume using the Titanium Taq polymerase kit (Clontech) under the following conditions: 72ºC for 1 minute and 68ºC for 3 minutes; then 14 cycles of 94ºC for 40 seconds, 57ºC for 30 seconds, and 68ºC for 1 minute 30 seconds, increasing by one second each cycle; then 8 cycles of 94ºC for 40 seconds, 57ºC for 30 seconds, and 68ºC for 1 minute 45 seconds, increasing by one second each cycle; then 22 cycles of 94ºC for 40 seconds, 65ºC for 30 seconds, and 68ºC for 1 minute 53 seconds, increasing by one second each cycle; and finally 68ºC for 3 minutes 40 seconds. The primer sequence was: 5' AGTGGGATTCTTGCTGTCAGTTA 3'. Then 10µl of the amplification product was analyzed on a 2% agarose gel, and 2µl was cloned for sequencing using the CloneJET PCR cloning kit (Thermo Scientific).
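The cycling program above can be written out as data to check the total number of cycles and the programmed hold time. This is a sketch under one stated assumption: "increasing by one second each cycle" is read as the 68ºC extension starting at the listed duration and growing by one second on every subsequent cycle of that block. Ramp times between temperatures are ignored, and all names are hypothetical.

```python
# (cycles, denature_s, anneal_s, first_extension_s) for the three cycling blocks.
BLOCKS = [
    (14, 40, 30, 90),    # 94C 40 s, 57C 30 s, 68C 1 min 30 s (+1 s per cycle)
    (8, 40, 30, 105),    # 94C 40 s, 57C 30 s, 68C 1 min 45 s (+1 s per cycle)
    (22, 40, 30, 113),   # 94C 40 s, 65C 30 s, 68C 1 min 53 s (+1 s per cycle)
]
INITIAL_S = 60 + 180     # 72C for 1 min, then 68C for 3 min
FINAL_S = 220            # final 68C hold of 3 min 40 s


def block_hold_time(cycles, denature_s, anneal_s, first_extension_s):
    """Total hold time of one block; the extension grows by 1 s per cycle."""
    total = 0
    for i in range(cycles):
        total += denature_s + anneal_s + first_extension_s + i
    return total


total_cycles = sum(block[0] for block in BLOCKS)
hold_time_s = INITIAL_S + sum(block_hold_time(*block) for block in BLOCKS) + FINAL_S

print(f"{total_cycles} cycles, {hold_time_s / 60:.1f} min of programmed holds")
```

Under this reading the extension time rises almost continuously across the three blocks (the first block ends at 103 seconds and the second begins at 105), which is consistent with a program designed to lengthen extension steadily as product accumulates.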


3-D FISH and Imaging: FISH was performed as previously described9. For ATRA treatment, cell culture medium was split in half 72 hours prior to fixation for FISH processing, with one half of the culture complemented with normal medium, and the other half complemented with an equal volume of medium containing ATRA, for a final ATRA concentration of 1µM. BACs corresponding with the mapped location of NB4 sequences (Figure 4.1c) are detailed in Table M1. Confocal stacks with a Z increment of 0.1µm were taken with an Olympus IX81 microscope.

3-D FISH Measurements and Statistics: The percentage of cells with at least one BAC signal associating with a PML body (using distance thresholds as indicated) was determined by measuring the Euclidean center to center distances between FISH signals and PML bodies with ImageJ. Significant differences (p<0.05) in association frequencies were determined by taking the two-tailed p value under Fisher's exact test. To determine if there was a relationship between the association of one locus at a PML body with the association of the other locus with that same body, we compared the distances of locus a (or b) to a PML body that had locus b (or a) within 1µm of it, with the distances of locus a (or b) to a body that did not have locus b (or a) within 1µm of it. A Mann-Whitney U test was performed on these two sets of distances. To calculate a lower estimate for the p value associated with finding two sequences within 2 Mb of the histone cluster on chromosome 6, we took the probability of a sequence falling inside a 4 Mb window centered on the histone cluster ($P_{\mathrm{insidewindow}}$) as the complement of the probability of it falling outside that window ($P_{\mathrm{outsidewindow}}$). This gives us:

$$P_{\mathrm{insidewindow}} = 1 - P_{\mathrm{outsidewindow}} = 1 - \frac{G - 4\,\mathrm{Mb}}{G} \approx 0.0013,$$

where $G$ is the number of base pairs in the human haploid genome. We then used this value as the probability of success in a binomial test.
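The window probability and the binomial test built on it can be sketched as follows. The genome size of 3.1 Gb and the number of sequences (20) are assumptions chosen for illustration; neither value is stated in the text, and the function names are hypothetical.

```python
from math import comb

GENOME_BP = 3.1e9   # assumed haploid genome size; not stated in the text
WINDOW_BP = 4e6     # 4 Mb window centred on the histone cluster


def p_inside_window(genome_bp=GENOME_BP, window_bp=WINDOW_BP):
    """Probability that a randomly placed sequence falls inside the window."""
    p_outside = (genome_bp - window_bp) / genome_bp
    return 1 - p_outside


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


p = p_inside_window()
N_SEQUENCES = 20    # hypothetical number of sequences recovered
p_value = binomial_tail(2, N_SEQUENCES, p)

print(f"P(inside window) = {p:.4f}")
print(f"P(at least 2 of {N_SEQUENCES} inside) = {p_value:.2e}")
```

Because the success probability is so small, the tail probability is dominated by the two-hit term and scales roughly with the square of the number of sequences tested, so the conclusion is sensitive to how many sequences are counted as trials.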


Chapter 5: Discussion/Future Directions

5.1 Advantage of Single Cell Methods for Genomic Associations with Nuclear Bodies

The work here adds to the evidence that PML bodies frequently associate with specific loci (Ching et al. 2013; Shiels et al. 2001; Sun et al. 2003). The discovery of two loci on separate chromosomes that frequently associate with PML, with each other, and that show an interdependence of PML association, justifies the development of the two novel techniques that aim to determine loci convergent at a shared PML body. The significant frequency of transcription factor binding sites in promoters within a 500kb window of these two loci suggests a mechanism for their shared association with each other and independence of association with a shared PML body. As PML association may be a genome-wide mechanism of regulation, it is likely that there is not a common DNA sequence that confers loci association for all PML bodies. Thus, sequences obtained from population-based screens of PML body association (Ching et al. 2013) may be less enriched for a particular DNA sequence that confers PML body association than sequences obtained from a single PML body. Although the yield of DNA obtained from a single LTOL targeting or nano-dissection can be improved upon (see section 5.6), these two techniques still yield sequences from multiple chromosomes from single nuclear bodies. Apart from establishing the feasibility of these two methods, these initial experiments with LTOL have uncovered intriguing biological findings that suggest a mechanism for PML body association.


5.2 Investigating the Nature of Loci-Specific PML association

From the initial data provided by LTOL, nano-dissection, and a previous screen of PML associating loci by Immuno-TRAP (Ching et al. 2013), we hypothesize that MYB and SP1 transcription factor binding sites may contribute to their frequent association at PML bodies, and more generally, that distant loci (i.e. in trans) enriched in these binding sites may have an affinity for each other, and for particular PML bodies enriched in their cognate factors. One approach to test this hypothesis is to FISH with BACs that correspond to loci that are significantly enriched in these consensus sites. The genomic distribution of MYB binding sites shows no significant clustering; however, SP1 sites are significantly enriched at chromosomal regions 11q13.2, 19p13.2, 19q13.32, 17p11.2, and 17p13.1 (Figure 4.4). These clusters were initially identified by cross-referencing the GSEAS database of SP1 consensus sites with the GSEAS annotation of gene sets categorized by cytogenomic band; within each band, these particular regions contained at least 3 consensus binding sites within a 100kb span. BACs corresponding to these SP1 binding site "islands" could be tested for their association frequency with PML bodies, and compared with the BCL2 locus as a negative control (Ching et al. 2013). This FISH experiment should be performed first in Jurkat cells, as this cell line was used in the original screen that identified the PML-associating loci that were also significantly enriched in the SP1 binding site (Ching et al. 2013).
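The criterion used for the SP1 "islands" (at least 3 consensus sites within a 100kb span) amounts to a sliding-window scan over sorted site coordinates. The sketch below is an illustration of that criterion only, not the procedure used to query the database; the function name and the example coordinates are hypothetical.

```python
def find_islands(positions, min_sites=3, span_bp=100_000):
    """Return merged (start, end) intervals built from every run of `min_sites`
    consecutive sites that fits within `span_bp`; overlapping runs are merged,
    so a merged island may be longer than `span_bp`."""
    positions = sorted(positions)
    islands = []
    for i in range(len(positions) - min_sites + 1):
        start = positions[i]
        end = positions[i + min_sites - 1]
        if end - start <= span_bp:
            if islands and start <= islands[-1][1]:
                # This run overlaps the previous island, so extend it.
                islands[-1] = (islands[-1][0], max(islands[-1][1], end))
            else:
                islands.append((start, end))
    return islands


# Hypothetical site coordinates (bp) on one chromosome, for illustration only.
sites = [120_000, 150_000, 190_000, 240_000, 900_000,
         2_000_000, 2_030_000, 2_090_000]
islands = find_islands(sites)
print(islands)
```

Each reported interval could then be matched to an overlapping BAC to choose FISH probes for the association test described above.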


5.3 Is the specific action of MYB and SP1 on their target loci affected by PML body association?

Although the MYB consensus site is almost identical to that of the two related transcription factors MYBL1 and MYBL2, and all three bind interchangeably at these sites in vitro (Rushton and Ness 2001), these three proteins do not regulate an identical set of genes. Therefore, there must be an additional mechanism conferring MYB's specificity of action for its target loci, presumably through post-translational modifications or recruitment of MYB-specific factors that are required to potentiate its activity at its interacting locus, or through locus-specific chromatin modifications that enhance MYB binding. I hypothesize that recruitment of a MYB gene target to a PML body facilitates this potentiation, through the action of MYB-specific co-factors or post-translational modifiers that are enriched there (Van Damme et al. 2010; Pearson et al. 2000), or because the occlusion of MYB binding sites is reversed through PML-body associated chromatin modification enzymes (e.g. histone acetylases, DNA methylases; Torok et al. 2009). Several lines of evidence support the involvement of PML bodies in MYB function. Firstly, although endogenous staining of MYB appears homogeneous throughout the nucleus, endogenous MYB co-immunoprecipitates with PML protein in Jurkat cells, and overexpression of PML and MYB proteins results in MYB's enhanced activation of a reporter plasmid, and enrichment at PML bodies (Dahle et al. 2004). Secondly, MYB's transcriptional activation activity is enhanced by SUMO1 conjugation (Dahle et al. 2003), and PML body components, including PML protein, are known to associate through their SUMO interacting motifs (SIMs; Lin et al. 2006; Shen et al. 2006). PML body localization is also associated with SUMO1 conjugation, and several PML body components are players in the SUMOylation pathway (e.g. the E2 conjugating enzyme UBC9; Van Damme et al. 2010; Quimby et al. 2006). Thirdly, the PML body component CBP (CREB Binding Protein) is a coactivator of MYB, directly binding MYB in vivo, and enhancing MYB's transcriptional activation of a reporter gene containing tandem repeats of the MYB binding sites in its promoter (Dai et al. 1996).

As with MYB, the specificity of SP1 regulation for its target loci is not entirely conferred by the consensus binding site: SP3 protein also binds to the SP1 consensus site but the two proteins have different regulatory targets. As proposed for MYB, there may be a similar involvement of PML bodies in mediating the action of SP1 at its target loci. Consistent with this model, SP1 binds PML directly (Vallian et al. 1998), and PML overexpression results in SP1 SUMOylation and recruitment to PML bodies (Li et al. 2014). It should be stressed, however, that PML bodies are not always permissive for transcription (depending on promoter elements and the particular protein constituents present at the body, PML bodies can also be repressive environments for colocalizing genes; Block et al. 2006), and SP1's transactivation activity can be repressed upon overexpression of PML (Li et al. 2014).

It is difficult, however, to test the direct involvement of PML bodies in potentiating MYB and/or SP1 action on their target loci. Other studies that examine the role of PML bodies in transcriptional activation rely on reporter plasmid assays, or on exogenous PML or co-factor expression, which makes it difficult to implicate the PML body per se in the observed transcriptional effect. Their interpretation can be confounded by the contribution of freely diffusible protein, artifacts of exogenous protein expression levels that are not physiologically relevant, or chromatin-mediated effects that cause the reporter plasmid to behave differently under the influence of a transcription factor from that of its corresponding endogenous locus. And as described above, in the case of MYB and SP1, there are extra factors that mediate their effect on a locus, apart from the presence of the binding site. Thus, in testing the hypothesis that the PML body is involved in facilitating MYB or SP1's specificity for its target loci, it would be more fruitful to avoid reporter plasmid systems. A better experimental scheme is to observe the epigenetic (i.e. DNA methylation, acetylation) or transcriptional effect (i.e. RNA levels) of PML or MYB/SP1 knockdowns on those endogenous loci that both contain their binding sites, and associate frequently with PML bodies. As a control, the same epigenetic or transcriptional readout could be observed in a locus not containing a MYB or a SP1 binding site, as well as at a locus that is enriched in their binding sites, but does not frequently associate at PML bodies (although it remains to be seen whether there are such loci, as the MYB or SP1 binding site alone may promote a PML body association). It is hoped that these experiments may shed light on the nature of their target specificity, and explain the significant enrichment of MYB or SP1 binding sites found in our PML body-association screens.


5.4 Investigating the cooperative loading model

The chromosome 17 and 20 loci identified by the LTOL procedure (chapter 4) frequently associated with PML bodies in NB4 cells, and also frequently associated with each other. Although the chromosome 20 locus was more likely to be closer to a PML body that had a chromosome 17 locus associated with it, this relationship was lost upon PML body reconstitution in response to ATRA treatment. These observations are consistent with a "cooperative loading" model, whereby loci/loci interactions increase the likelihood of each being bound by shared factors that confer the affinity for a PML body. Thus, interacting loci are more likely to be "poised" for a PML association, although not necessarily at the same time or for a particular body. A prediction of this model is that interacting loci (loci X and Y) would be more likely to be loaded with factors that have affinity for PML body components than loci that were not interacting. To test this indirectly, one could observe the frequency of PML association of locus X in a locus Y homozygous negative cell line, compared with a wild type background. The loss of association should decrease the likelihood of locus X complexing with PML body-binding factors, and thus, one would observe a relative decrease in the frequency of PML body association. It would of course be more informative to be able to isolate the proteins in complex with associating loci, vs. the protein cohort of the non-associating loci. Although it is not a fully realized assay, the "functional complementation" scheme outlined below would provide this discrimination (i.e. by labelling each FISH probe for loci X and Y with complementary components of the "split" catalyst), and in combination with Mass-Spectrometry would allow for the identification of proteins in complex with the associating loci pair. The relative levels of PML associating factors could then be compared between the Mass-Spec profile of the paired loci immunoprecipitate, and the immunoprecipitate of the combined paired and the unpaired loci.

5.5 Implication of Findings for Acute Promyelocytic Leukemia

The frequent presence of binding sites for MYB and SP1 in promoters within a 500kb window of the loci on chromosome 17 and 20 determined from the LTOL targeting of PML bodies in the APL patient-derived NB4 cell line supports previous findings that these transcription factors may play a role in the disease phenotype. RARα genes that are upregulated in NB4 cells compared with wild-type cells require SP1 to potentiate their transactivation by the PML/RARα fusion protein (i.e. the ID1 and ID2 genes; van Wageningen et al. 2008). This SP1-mediated upregulation only occurs in the presence of the PML/RARα fusion protein, as SP1 does not bind wild-type RARα. Our findings of SP1 sites in the promoters of loci that associate with PML bodies in the NB4 cell line suggest a mechanism by which PML bodies containing both wild-type PML and PML/RARα fusion protein may contribute to the APL phenotype by potentiating the dysregulation (activation) of RARα genes. A prediction of this hypothesis is that, similar to the van Wageningen et al. (2008) study, the genes found at the chromosome 17 and 20 loci that contain SP1 binding sites in their promoters are dysregulated compared with wild-type cells. SP1 activation of the epidermal growth factor receptor (EGFR) gene is repressed upon PML overexpression (Li et al. 2014), suggesting that wild-type PML bodies are a repressive environment for some SP1 target genes. SP1-bound loci that associate with PML/RARα-containing PML bodies may thus be upregulated compared with their wild-type counterparts. Alternatively, the SP1-mediated dysregulation of RARα genes may occur indirectly through SP1's association and potentiation at PML bodies containing PML/RARα fusion protein when it binds the PML-associating loci on chromosome 17 and 20. In this scenario, RARα gene dysregulation can still occur even though PML-loci association may remain unchanged compared with the wild-type cell line. To test this hypothesis, immuno-FISH could be performed using probes to the PML-associating loci found in the NB4 cell line to see if they also associate in wild-type cells. MYB's involvement in loci associating with PML bodies in an APL patient-derived cell line is consistent with MYB's role in hematopoietic differentiation. The presence of MYB binding sites in PML-associated loci in T-lymphoblasts (Jurkat cells) as well is consistent with MYB's role as a "master" transcriptional regulator in hematopoietic lineages: MYB is required for the maintenance of hematopoietic precursor cells (Mucenski et al. 1991), and translocation of the MYB gene (c-myb) characterizes two subsets of T-cell acute lymphocytic leukemias (T-ALL; Zhou et al. 2011). MYB's possible role in mediating the association of cognate loci with PML bodies may thus link PML body disruption to the block in granulocytic differentiation seen in APL patients. To explore this mechanism, PML-associating loci could be screened in a wild-type promyelocyte cell line for those related to hematopoietic differentiation, and then those particular genes observed for their change in transcription level upon MYB siRNA knockdown in both wild-type and the NB4 cell line.


5.6 Improvements on LTOL and Nano-Dissection

In their present incarnation, the LTOL and nano-dissection techniques are capable of identifying multiple DNA sequences originating from a single nuclear body. However, it is likely that some DNA is being rendered unamplifiable due to the damage incurred by the electron beam or UV irradiation during the nano-dissection or LTOL procedure respectively. To mitigate this damage, two measures could be taken: the first, as mentioned in chapter two, is to treat the cell section with a detergent coat, as this strategy was shown to provide sufficient protection to live specimens placed in an SEM chamber (Takaku et al. 2012). Secondly, the DNA damage of the electron or UV irradiation (i.e. cyclobutane pyrimidine dimers, strand nicks; Dinant et al. 2007) could be repaired prior to PCR by incubating the lysate with DNA repair enzymes (e.g. Uracil-DNA Glycosylase, and DNA Ligase; Do et al. 2012). A commercial enzyme "repair cocktail" is available, and it has been shown to be effective in the amplification of UV damaged DNA obtained in forensic samples (i.e. degraded samples; Diegoli et al. 2012). It is also likely that the final PCR pool may only be representative of a minority of dominant amplicons that are more frequently copied in the initial rounds of PCR due to sequence-specific differences in the efficiency of the DNA polymerase. To more broadly represent the DNA that is in the initial pool from the nano-dissection or LTOL targeting, PCR could be performed for a limited number of cycles to obtain a yield that is sufficient for multiplex sequencing. The efficacy of these measures could be easily tested by comparing the amplicon yield of increasingly limited amounts of nano-dissected or two-photon targeted DNA between "protected" or "repaired" samples and untreated controls. The possible gains in DNA yield per dissection will improve the throughput of each dissection, and consequently increase the likelihood of finding significant convergent trans associations at single nuclear bodies.
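The argument for limiting the number of PCR cycles can be illustrated with a simple exponential model in which each template is multiplied by (1 + efficiency) per cycle. This is a sketch only: the per-cycle efficiencies are hypothetical values chosen for illustration, plateau effects are ignored, and 44 cycles corresponds to the 14 + 8 + 22 cycles of the program given in the Chapter 4 methods.

```python
def amplicon_fraction(efficiencies, cycles, start_copies=1):
    """Fraction of the final pool contributed by each template after `cycles`
    rounds, where each template is multiplied by (1 + efficiency) per cycle."""
    copies = [start_copies * (1 + e) ** cycles for e in efficiencies]
    total = sum(copies)
    return [c / total for c in copies]


# Hypothetical per-cycle efficiencies for three templates, one copy each.
EFFICIENCIES = [0.95, 0.90, 0.80]

fractions_15 = amplicon_fraction(EFFICIENCIES, 15)
fractions_44 = amplicon_fraction(EFFICIENCIES, 44)

for cycles, fractions in ((15, fractions_15), (44, fractions_44)):
    print(cycles, [round(f, 3) for f in fractions])
```

Even small differences in efficiency compound over many cycles, so the least efficient template makes up a much smaller share of the pool after 44 cycles than after 15, which is the motivation for stopping amplification as early as the sequencing input allows.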

The precision of the LTOL targeting is also limited by the axial confinement of the two-photon-induced damage. Using a two-photon system similar to my experimental setup (see chapter 4 Materials and Methods), Meldrum et al. (2003) achieved an axial precision of 500nm, although with our own system, the practical limits of the axial precision may be closer to 1µm (see Figure 3.2). A possible improvement in the axial confinement of the induced damage could be achieved with a 4pi microscope (Heintzmann and Ficz, 2006). Ivanchenko et al. (2007) used a 4pi microscope to induce the conversion of a photoactivatable protein in individual mitochondrial tubes at an axial precision within 500nm. Alternatively, I could also adapt LTOL to 300nm sections (blocking the DSBs caused by sectioning prior to targeting with a blocking oligo or by performing terminal deoxynucleotidyl transferase dUTP nick end labeling (TUNEL) in the presence of dideoxynucleotides) to provide an improvement in axial precision.

5.7 Alternative Nuclear Body Targets

We are interested in the "unique" PML bodies of the NB4 cell line as they are a model of PML body structural heterogeneity with an associated genomic dysregulation. Other cell models also fit these criteria and so are intriguing candidates as possible "gene hubs": LANDS (LySp100-associated nuclear domains), for example, are a subset of PML bodies that colocalize with the SP100 homolog SP110 (Dent et al. 1996), and may represent specialized regulatory centres for SP110 target loci. And the "giant" (i.e. 1-2µm diameter) PML bodies found in the ICF (immunodeficiency, centromeric instability and facial dysmorphy) syndrome-derived cell line, which carries DNA methyltransferase defects and accompanying alterations in heterochromatin formation, are ideal candidates for finding non-random convergences of co-regulated loci (Luciani et al. 2006).

Nano-dissection and LTOL then, could be used to dissect these bodies, and the resultant sequences verified by FISH to find non-random associations in the cell population. The problem remains however, that due to the few loci that are obtained per experiment, there may not be a large enough sample size to mine for statistically significant commonalities among their regulatory sequences. Although the present techniques developed here offer unprecedented sensitivity, and discrimination in the sub-nuclear structures they can interrogate, they are intended as "first strike" approaches: the few loci that are detected per experiment, and verified to share common affinities for a sub-nuclear structure, generate hypotheses that shape further experiments involving these loci. In the NB4 cell LTOL targeting, however small the dataset, there were still significant overlaps in consensus binding sites between the two loci found to frequently associate with PML bodies. However, there are likely other significant mediators of PML body association that are being missed. If an alternative method could be developed that offered the same discriminatory power of LTOL and nano-dissection, but yielded a larger number of sequences per experiment, then there would be a greater chance of finding statistically significant overlaps between the regulatory sequences in the dataset, and in doing so, revealing a possible mechanism for the nuclear body association.


5.8 Alternative targeting schemes

5.8.1 Immuno-TRAP of a PML-associating locus

In this scheme, a locus previously found to frequently associate with PML bodies is FISH-labelled using a corresponding BAC that is conjugated with a hapten (e.g. digoxigenin, DIG).

This DIG tag is then labelled by an anti-DIG antibody that is conjugated with horseradish peroxidase (HRP). Biotinylated-Tyramide is subsequently deposited specifically in the vicinity of the FISH signal through its catalysis into an active intermediate species by the HRP tag (Speel et al. 1999). The DNA is then sonicated, immunoprecipitated on streptavidin beads, end-labelled for whole-genome amplification, and then deep-sequenced (see Figure 5.1).


Figure 5.1 "FISH-TRAP" scheme. Analogous to "Immuno-TRAP" (Ching et al. 2013), a DIG-labelled FISH probe directs the deposition of biotinylated-tyramide in the vicinity of a genomic region of interest. The genomic DNA is then sonicated and run down a streptavidin column, where the tyramide-conjugated DNA is precipitated and subjected to deep sequencing, thus identifying DNA originating in the vicinity of the FISH probe (adapted with permission from Wanlei Lei).

In this manner, analogous to the Immuno-TRAP method, DNA in the vicinity of the "bait" locus can be identified, including possibly those convergent partners at a shared PML body.

Although there will be a background contribution from random associations with the bait locus, as well as from associations that originate from a bait locus that is not localized at a PML body, we expect that, if there are indeed specific partners at a shared PML body, these sequences will be highly represented in the sequencing data, and will thus correspond to "peaks" in the genomic profile of the entire dataset. BACs corresponding to these peaks can then be used as FISH probes in parallel with the "bait" probe to verify their specific convergence at shared PML bodies in a cell population. This approach will thus provide more comprehensive coverage than the LTOL and nano-dissection methods and, owing to the greater number of sequences yielded, it is more likely that significant trends will be found in the data (e.g. shared transcription factor binding sites).

The first step in the development of this technique is to demonstrate that the enrichment of genomic DNA immunoprecipitated after tyramide labelling depends on the presence of the DIG-labelled probe: we could compare the PCR yields of immunoprecipitated DNA from tyramide-labelled cells that were hybridized with a DIG-conjugated BAC vs. a control preparation that was hybridized with a BAC that was not conjugated with DIG. A similar proof-of-principle experiment was carried out by Ching et al. (2013, Figure 1c therein). As a bait probe, one could use the same BAC mapping to chromosome 17 found to frequently associate with PML bodies in the NB4 cell line (see chapter 4). As a demonstration that the resultant DNA is enriched for those sequences in the vicinity of the BAC, one could verify that there is indeed a "peak" of sequence representation that clusters to the chromosome 17 region of the BAC, and also to the chromosome 20 locus that was found to frequently colocalize with it (see chapter 4). Although this procedure is only a slight variation of the previously published Immuno-TRAP method, the amount of starting material will have to be adjusted, as the number of tyramide blooms per cell will be less than in its original application (2-4 per cell vs. 10-20 PML bodies in a typical Jurkat cell; Ching et al. 2013).
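The "peak" criterion described above can be sketched as a binned comparison of the DIG-probe pull-down against the no-DIG control. This is an illustrative sketch only: the function names, the 100 kb bin size, the fold threshold and the pseudocount are assumptions, not parameters taken from this work or from Ching et al. (2013), and a real analysis would more likely rely on an established peak caller.

```python
from collections import Counter


def bin_counts(reads, bin_size):
    """Count reads per (chromosome, bin index); reads are (chrom, position) pairs."""
    return Counter((chrom, pos // bin_size) for chrom, pos in reads)


def enriched_bins(pulldown, control, bin_size=100_000, min_fold=5.0, pseudocount=1.0):
    """Return bins whose library-size-normalised pull-down signal exceeds the control.

    Each result is (chrom, bin_start, bin_end, fold), sorted by decreasing fold.
    All thresholds are hypothetical defaults chosen for illustration.
    """
    pd_counts = bin_counts(pulldown, bin_size)
    ct_counts = bin_counts(control, bin_size)
    pd_total = max(len(pulldown), 1)
    ct_total = max(len(control), 1)
    peaks = []
    for (chrom, idx), n in pd_counts.items():
        # The pseudocount avoids division by zero for bins absent from the control.
        pd_signal = (n + pseudocount) / pd_total
        ct_signal = (ct_counts.get((chrom, idx), 0) + pseudocount) / ct_total
        fold = pd_signal / ct_signal
        if fold >= min_fold:
            peaks.append((chrom, idx * bin_size, (idx + 1) * bin_size, fold))
    return sorted(peaks, key=lambda peak: -peak[3])
```

Under this sketch, a successful proof-of-principle would return the bin containing the chromosome 17 bait BAC, and ideally the chromosome 20 partner locus, among the top-ranked entries.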


5.8.2 High-throughput targeted subnuclear damage by functional complementation

This scheme is at present only a fanciful idea, but if shown to be feasible, it could allow for high-throughput, localized damage in the vicinity of biochemically "unique" nuclear sub-structures, such as those PML bodies in the NB4 line that retain their constituent PML body components (i.e. CBP, SP100). It is based on a hetero-dimerized photosensitizer (or peroxidase) that induces DSBs (or tyramide deposition, respectively) only in those sub-nuclear regions where it is complemented by the interaction of its endogenous protein conjugates (see Figure 5.2). Briefly, a photosensitizer (or peroxidase) is engineered to comprise two fragments (X and Y) that are each conjugated with a hapten or Fab fragment, so that they may separately immuno-label two different proteins (Px and Py) that are characteristic constituents of a sub-nuclear structure of interest. In this manner, the complementation of X and Y through the interaction or close association of Px and Py will enrich the functional photosensitizer (or peroxidase) at the sub-nuclear structure of interest. Thus, the targeted DSB induction (or tyramide deposition) at characteristic bodies can be carried out in a cell population, and the resultant breaks ligated with PCR-specific linkers for amplification and deep sequencing (or tyramide-conjugated DNA isolated on streptavidin beads, amplified, and deep sequenced). Like the tyramide scheme outlined above, this alternative would provide for a population-based isolation of DNA in the vicinity of a nuclear body with a specific biochemical profile, thus providing greater sequence coverage and improving the possibility of identifying statistically significant features in the dataset. Although not yet demonstrated specifically for the induction of DSBs (or tyramide deposition) localized to the vicinity of interacting proteins, this general scheme is analogous to bimolecular fluorescence complementation (BiFC) assays that rely on the association of a protein pair to reconstitute a fragmented or "split" fluorescent reporter (Kodama and Hu, 2012). It remains to be seen, however, whether there is a suitable "split" molecule that can be functionally conjugated to hapten or antibody fragments and that, when reconstituted, either generates short-lived reactive oxygen species (ROS) to induce localized DSBs, or acts as a functional peroxidase that can catalyze the localized deposition of tyramide.

However, there are promising candidates. One possibility is to exploit the induction of OPV (p-phenylene vinylene) ROS production by bioluminescence resonance energy transfer (BRET) from luminol in the presence of hydrogen peroxide (Yuan et al. 2012). An antibody-conjugated luminol (Yang et al. 2010) can label Px, and OPV can be similarly conjugated to an antibody to label Py, so that DSBs are restricted to those regions where, by the association of Px and Py, luminol and OPV are brought into close enough proximity for BRET to take place. It is unclear, however, whether OPV is an efficient mediator of DSBs, and whether the addition of hydrogen peroxide would by itself cause widespread DSBs (Nakamura et al. 2003). Alternatively, a recently identified protein photosensitizer, KillerRed (KRED; Bulina et al. 2006), has been shown to catalyze the formation of short-lived ROS that induce DSBs (among other forms of DNA damage) in its immediate vicinity (~0.1 µm; Bulina et al. 2006) when exposed to high-intensity green light (540-580 nm; Bulina et al. 2006). Along with the recent publication of the crystal structure of KillerRed (Pletnev et al. 2009), there has been progress in engineering KillerRed variants for a variety of cell biology applications, including a monomeric form (Takemoto et al. 2013) and its fusion to a Fab fragment for antibody targeting (Serebrovskaya et al. 2009). In particular, a laminB1-KRED fusion protein was used to specifically fragment DNA at nuclear lamina-localized genes, so that the fragmented DNA could be isolated by electrophoresis and sequenced (Waldeck et al. 2013). Although KRED is not yet available as a "split" variant, its homology to eGFP (Bulina et al. 2006), and the development of several "split" eGFP variants (Lindman et al. 2010; Magliery et al. 2005), suggest that a "split" KRED could be similarly engineered. At present there may be technical limitations to the idea of targeting DNA damage or tyramide deposition by complementation; however, the near future may see the development of other tools that are ideally suited to its implementation.

Figure 5.2 "Complementation-TRAP" scheme. (i) Antibodies conjugated to complementary components of a "split" enzyme immuno-label the corresponding proteins Px and Py. (ii) The colocalization of Px and Py (perhaps in a particular nuclear structure of interest) brings their labelled enzyme components (X and Y) into close enough proximity to combine into a functional enzyme. The complemented enzyme catalyzes the activation of a hapten (yellow circles; e.g. tyramide as the hapten and HRP as the activating enzyme) into a reactive intermediate that conjugates to chromatin in the vicinity of the Px/Py complex. (iii) Thus, hapten is deposited only on chromatin in the immediate vicinity of the Px/Py complex. This chromatin can then be isolated by virtue of its hapten label and identified by deep sequencing.


5.9 Conclusion

The work outlined here introduces two complementary techniques that allow for the dissection of arbitrary sub-nuclear volumes, and the identification of those DNA sequences contained therein. Because the DNA in the vicinity of a single nuclear body can thus be determined, these two techniques are uniquely suited to the problem of identifying "gene hubs", i.e. specific subsets of genes that converge on a shared nuclear body. Although this work is presented mainly as a proof-of-principle, the initial application has already identified novel PML-associating loci in Jurkat cells, as well as two loci in the NB4 cell line that share a specific affinity for each other and for PML bodies, and that share significant commonalities among their transcription factor consensus binding sites. These initial findings underscore the utility of LTOL and nano-dissection in answering pressing questions concerning the structural basis of gene regulation, and it is hoped that with further optimization, they will be of use to the wider research community.


6 References

Acevedo, L. G., A. L. Iniguez, H. L. Holster, X. Zhang, R. Green and P. J. Farnham (2007). "Genome-scale ChIP-chip analysis using 10,000 human cells." Biotechniques 43(6): 791-797.

Adli, M. and B. E. Bernstein (2011). "Whole-genome chromatin profiling from limited numbers of cells using nano-ChIP-seq." Nat Protoc 6(10): 1656-1668.

Amankwah, K. S. and U. De Boni (1994). "Ultrastructural localization of filamentous actin within neuronal interphase nuclei in situ." Exp Cell Res 210(2): 315-325.

Ashburner, M., C. A. Ball, J. A. Blake, D. Botstein, H. Butler, J. M. Cherry, A. P. Davis, K. Dolinski, S. S. Dwight, J. T. Eppig, M. A. Harris, D. P. Hill, L. Issel-Tarver, A. Kasarskis, S. Lewis, J. C. Matese, J. E. Richardson, M. Ringwald, G. M. Rubin and G. Sherlock (2000). "Gene ontology: tool for the unification of biology. The Gene Ontology Consortium." Nat Genet 25(1): 25-29.

Beech, S. J., K. J. Lethbridge, N. Killick, N. McGlincy and K. N. Leppard (2005). "Isoforms of the promyelocytic leukemia protein differ in their effects on ND10 organization." Exp Cell Res 307(1): 109-117.

Bernardi, R., A. Papa and P. P. Pandolfi (2008). "Regulation of apoptosis by PML and the PML-NBs." Oncogene 27(48): 6299-6312.

Bischof, O., O. Kirsh, M. Pearson, K. Itahana, P. G. Pelicci and A. Dejean (2002). "Deconstructing PML-induced premature senescence." EMBO J 21(13): 3358-3369.

Bloch, D. B., J. D. Chiche, D. Orth, S. M. de la Monte, A. Rosenzweig and K. D. Bloch (1999). "Structural and functional heterogeneity of nuclear bodies." Mol Cell Biol 19(6): 4423-4430.

Block, G. J., C. H. Eskiw, G. Dellaire and D. P. Bazett-Jones (2006). "Transcriptional regulation is affected by subnuclear targeting of reporter plasmids to PML nuclear bodies." Mol Cell Biol 26(23): 8814-8825.

Bohn, M. and D. W. Heermann (2010). "Diffusion-driven looping provides a consistent framework for chromatin organization." PLoS One 5(8): e12218.

Boisvert, F. M., M. J. Hendzel and D. P. Bazett-Jones (2000). "Promyelocytic leukemia (PML) nuclear bodies are protein structures that do not accumulate RNA." J Cell Biol 148(2): 283-292.

Boisvert, F. M., S. van Koningsbruggen, J. Navascues and A. I. Lamond (2007). "The multifunctional nucleolus." Nat Rev Mol Cell Biol 8(7): 574-585.

Bolzer, A., G. Kreth, I. Solovei, D. Koehler, K. Saracoglu, C. Fauth, S. Muller, R. Eils, C. Cremer, M. R. Speicher and T. Cremer (2005). "Three-dimensional maps of all chromosomes in human male fibroblast nuclei and prometaphase rosettes." PLoS Biol 3(5): e157.


Bongiorno-Borbone, L., A. De Cola, P. Vernole, L. Finos, D. Barcaroli, R. A. Knight, G. Melino and V. De Laurenzi (2008). "FLASH and NPAT positive but not Coilin positive Cajal Bodies correlate with cell ploidy." Cell Cycle 7(15): 2357-2367.

Bonifer, C., N. Yannoutsos, G. Kruger, F. Grosveld and A. E. Sippel (1994). "Dissection of the locus control function located on the chicken lysozyme gene domain in transgenic mice." Nucleic Acids Res 22(20): 4202-4210.

Borden, K. L. (2002). "Pondering the promyelocytic leukemia protein (PML) puzzle: possible functions for PML nuclear bodies." Mol Cell Biol 22(15): 5259-5269.

Bradshaw, P. S., D. J. Stavropoulos and M. S. Meyn (2005). "Human telomeric protein TRF2 associates with genomic double-strand breaks as an early response to DNA damage." Nat Genet 37(2): 193-197.

Bridger, J. M., S. Boyle, I. R. Kill and W. A. Bickmore (2000). "Re-modelling of nuclear architecture in quiescent and senescent human fibroblasts." Curr Biol 10(3): 149-152.

Brown, J. M., J. Green, R. P. das Neves, H. A. Wallace, A. J. Smith, J. Hughes, N. Gray, S. Taylor, W. G. Wood, D. R. Higgs, F. J. Iborra and V. J. Buckle (2008). "Association between active genes occurs at nuclear speckles and is modulated by chromatin environment." J Cell Biol 182(6): 1083-1097.

Bubulya, P. A. and D. L. Spector (2004). ""On the move"ments of nuclear components in living cells." Exp Cell Res 296(1): 4-11.

Bulina, M. E., D. M. Chudakov, O. V. Britanova, Y. G. Yanushevich, D. B. Staroverov, T. V. Chepurnykh, E. M. Merzlyak, M. A. Shkrob, S. Lukyanov and K. A. Lukyanov (2006). "A genetically encoded photosensitizer." Nat Biotechnol 24(1): 95-99.

Carmo-Fonseca, M. (2007). "How genes find their way inside the cell nucleus." J Cell Biol 179(6): 1093-1094.

Carter, D., L. Chakalova, C. S. Osborne, Y. F. Dai and P. Fraser (2002). "Long-range chromatin regulatory interactions in vivo." Nat Genet 32(4): 623-626.

Carter, S. L. (2002). "Analysis and visualization of functional relationships between RNA expression and clinical annotation using PathlinX." Proc AMIA Symp: 121-125.

Chakalova, L., E. Debrand, J. A. Mitchell, C. S. Osborne and P. Fraser (2005). "Replication and transcription: shaping the landscape of the genome." Nat Rev Genet 6(9): 669-677.

Chambeyron, S. and W. A. Bickmore (2004). "Chromatin decondensation and nuclear reorganization of the HoxB locus upon induction of transcription." Genes Dev 18(10): 1119-1130.

Chambeyron, S., N. R. Da Silva, K. A. Lawson and W. A. Bickmore (2005). "Nuclear re-organisation of the Hoxb complex during mouse embryonic development." Development 132(9): 2215-2223.

Chen, B. K., D. Anchel, Z. Gong, R. Cotton, R. Li, Y. Sun and D. P. Bazett-Jones (2014). "Nano-Dissection and Sequencing of DNA at Single Sub-Nuclear Structures." Small 10(16): 3267-3274.

Chen, B. K., Y. Zhang and Y. Sun (2009). "Active release of microobjects using a MEMS microgripper to overcome adhesion forces." J Microelectromech Syst 18(3): 652-659.

Ching, R., K. Ahmed, P. Boutros, L. Penn and D. Bazett-Jones (2013). "Identifying gene locus associations with promyelocytic leukemia nuclear bodies using immuno-TRAP." J Cell Biol 201(2): 325-335.

Ching, R. W., G. Dellaire, C. H. Eskiw and D. P. Bazett-Jones (2005). "PML bodies: a meeting place for genomic loci?" J Cell Sci 118(Pt 5): 847-854.

Chuang, C. H., A. E. Carpenter, B. Fuchsova, T. Johnson, P. de Lanerolle and A. S. Belmont (2006). "Long-range directional movement of an interphase chromosome site." Curr Biol 16(8): 825-831.

Chung, J. H., M. Whiteley and G. Felsenfeld (1993). "A 5' element of the chicken beta-globin domain serves as an insulator in human erythroid cells and protects against position effect in Drosophila." Cell 74(3): 505-514.

Cisse, I. I., I. Izeddin, S. Z. Causse, L. Boudarene, A. Senecal, L. Muresan, C. Dugast-Darzacq, B. Hajj, M. Dahan and X. Darzacq (2013). "Real-time dynamics of RNA polymerase II clustering in live human cells." Science 341(6146): 664-667.

Cisterna, B. and M. Biggiogera (2010). "Ribosome biogenesis: from structure to dynamics." Int Rev Cell Mol Biol 284: 67-111.

Conchello, J. A. and J. W. Lichtman (2005). "Optical sectioning microscopy." Nat Methods 2(12): 920-931.

Condemine, W., Y. Takahashi, J. Zhu, F. Puvion-Dutilleul, S. Guegan, A. Janin and H. de The (2006). "Characterization of endogenous human promyelocytic leukemia isoforms." Cancer Res 66(12): 6192-6198.

Cremer, M., K. Kupper, B. Wagler, L. Wizelman, J. von Hase, Y. Weiland, L. Kreja, J. Diebold, M. R. Speicher and T. Cremer (2003). "Inheritance of gene density-related higher order chromatin arrangements in normal and tumor cell nuclei." J Cell Biol 162(5): 809-820.

Dahl, J. A. and P. Collas (2007). "A quick and quantitative chromatin immunoprecipitation assay for small cell samples." Front Biosci 12: 4925-4931.

Dahle, O., T. O. Andersen, O. Nordgard, V. Matre, G. Del Sal and O. S. Gabrielsen (2003). "Transactivation properties of c-Myb are critically dependent on two SUMO-1 acceptor sites that are conjugated in a PIASy enhanced manner." Eur J Biochem 270(6): 1338-1348.

Dahle, O., O. Bakke and O. S. Gabrielsen (2004). "c-Myb associates with PML in nuclear bodies in hematopoietic cells." Exp Cell Res 297(1): 118-126.

Dai, P., H. Akimaru, Y. Tanaka, D. X. Hou, T. Yasukawa, C. Kanei-Ishii, T. Takahashi and S. Ishii (1996). "CBP as a transcriptional coactivator of c-Myb." Genes Dev 10(5): 528-540.

Das, P. M., K. Ramachandran, J. vanWert and R. Singal (2004). "Chromatin immunoprecipitation assay." Biotechniques 37(6): 961-969.

de Bruyn Kops, A. and D. M. Knipe (1994). "Preexisting nuclear architecture defines the intranuclear location of herpesvirus DNA replication structures." J Virol 68(6): 3512-3526.

de Laat, W. and F. Grosveld (2003). "Spatial organization of gene expression: the active chromatin hub." Chromosome Res 11(5): 447-459.

de Lanerolle, P. and L. Serebryannyy (2011). "Nuclear actin and myosins: life without filaments." Nat Cell Biol 13(11): 1282-1288.

de The, H., M. Le Bras and V. Lallemand-Breitenbach (2012). "The cell biology of disease: Acute promyelocytic leukemia, arsenic, and PML bodies." J Cell Biol 198(1): 11-21.

de Wit, E. and W. de Laat (2012). "A decade of 3C technologies: insights into nuclear organization." Genes Dev 26(1): 11-24.

Dehghani, H., G. Dellaire and D. P. Bazett-Jones (2005). "Organization of chromatin in the interphase mammalian cell." Micron 36(2): 95-108.

Dellaire, G. and D. P. Bazett-Jones (2004). "PML nuclear bodies: dynamic sensors of DNA damage and cellular stress." Bioessays 26(9): 963-977.

Dellaire, G. and D. P. Bazett-Jones (2007). "Beyond repair foci: subnuclear domains and the cellular response to DNA damage." Cell Cycle 6(15): 1864-1872.

Dellaire, G., R. W. Ching, H. Dehghani, Y. Ren and D. P. Bazett-Jones (2006). "The number of PML nuclear bodies increases in early S phase by a fission mechanism." J Cell Sci 119(Pt 6): 1026-1033.

Dent, A. L., J. Yewdell, F. Puvion-Dutilleul, M. H. Koken, H. de The and L. M. Staudt (1996). "LYSP100-associated nuclear domains (LANDs): description of a new class of subnuclear structures and their relationship to PML nuclear bodies." Blood 88(4): 1423-1426.

Diegoli, T. M., M. Farr, C. Cromartie, M. D. Coble and T. W. Bille (2012). "An optimized protocol for forensic application of the PreCR Repair Mix to multiplex STR amplification of UV-damaged DNA." Forensic Sci Int Genet 6(4): 498-503.

Dinant, C., M. de Jager, J. Essers, W. A. van Cappellen, R. Kanaar, A. B. Houtsmuller and W. Vermeulen (2007). "Activation of multiple DNA repair pathways by sub-nuclear damage induction methods." J Cell Sci 120(Pt 15): 2731-2740.

Do, H. and A. Dobrovic (2012). "Dramatic reduction of sequence artefacts from DNA isolated from formalin-fixed cancer biopsies by treatment with uracil-DNA glycosylase." Oncotarget 3(5): 546-558.

Doucas, V., A. M. Ishov, A. Romo, H. Juguilon, M. D. Weitzman, R. M. Evans and G. G. Maul (1996). "Adenovirus replication is coupled with the dynamic properties of the PML nuclear structure." Genes Dev 10(2): 196-207.

Dundr, M., J. K. Ospina, M. H. Sung, S. John, M. Upender, T. Ried, G. L. Hager and A. G. Matera (2007). "Actin-dependent intranuclear repositioning of an active gene locus in vivo." J Cell Biol 179(6): 1095-1103.

Ebrahimian, M., M. Mojtahedzadeh, D. Bazett-Jones and H. Dehghani (2010). "Transcript isoforms of promyelocytic leukemia in mouse male and female gametes." Cells Tissues Organs 192(6): 374-381.

Ecsedy, J. A., J. S. Michaelson and P. Leder (2003). "Homeodomain-interacting protein kinase 1 modulates Daxx localization, phosphorylation, and transcriptional activity." Mol Cell Biol 23(3): 950-960.

Emmert-Buck, M. R., R. F. Bonner, P. D. Smith, R. F. Chuaqui, Z. Zhuang, S. R. Goldstein, R. A. Weiss and L. A. Liotta (1996). "Laser capture microdissection." Science 274(5289): 998-1001.

Eskiw, C. H. and P. Fraser (2011). "Ultrastructural study of transcription factories in mouse erythroblasts." J Cell Sci 124(Pt 21): 3676-3683.

Everett, R. D. and M. K. Chelbi-Alix (2007). "PML and PML nuclear bodies: implications in antiviral defence." Biochimie 89(6-7): 819-830.

Fabunmi, R. P., W. C. Wigley, P. J. Thomas and G. N. DeMartino (2001). "Interferon gamma regulates accumulation of the proteasome activator PA28 and immunoproteasomes at nuclear PML bodies." J Cell Sci 114(Pt 1): 29-36.


Fanucchi, S., Y. Shibayama, S. Burd, M. S. Weinberg and M. M. Mhlanga (2013). "Chromosomal contact permits transcription between coregulated genes." Cell 155(3): 606-620.

Fellers, T. J. and M. W. Davidson (2007). "Introduction to Confocal Microscopy." Olympus Fluoview Resource Center. National High Magnetic Field Laboratory.

Fraefel, C., A. G. Bittermann, H. Bueler, I. Heid, T. Bachi and M. Ackermann (2004). "Spatial and temporal organization of adeno-associated virus DNA replication in live cells." J Virol 78(1): 389-398.

Frommer, M., L. E. McDonald, D. S. Millar, C. M. Collis, F. Watt, G. W. Grigg, P. L. Molloy and C. L. Paul (1992). "A genomic sequencing protocol that yields a positive display of 5-methylcytosine residues in individual DNA strands." Proc Natl Acad Sci U S A 89(5): 1827-1831.

Gall, J. G. (1956). "On the submicroscopic structure of chromosomes." Brookhaven Symp Biol(8): 17-32.

Gao, C., C. C. Ho, E. Reineke, M. Lam, X. Cheng, K. J. Stanya, Y. Liu, S. Chakraborty, H. M. Shih and H. Y. Kao (2008). "Histone deacetylase 7 promotes PML sumoylation and is essential for PML nuclear body formation." Mol Cell Biol 28(18): 5658-5667.

Gerlich, D., J. Beaudouin, B. Kalbfuss, N. Daigle, R. Eils and J. Ellenberg (2003). "Global chromosome positions are transmitted through mitosis in mammalian cells." Cell 112(6): 751-764.

Ghule, P. N., Z. Dominski, X. C. Yang, W. F. Marzluff, K. A. Becker, J. W. Harper, J. B. Lian, J. L. Stein, A. J. van Wijnen and G. S. Stein (2008). "Staged assembly of histone gene expression machinery at subnuclear foci in the abbreviated cell cycle of human embryonic stem cells." Proc Natl Acad Sci U S A 105(44): 16964-16969.

Gilfillan, G. D., T. Hughes, Y. Sheng, H. S. Hjorthaug, T. Straub, K. Gervin, J. R. Harris, D. E. Undlien and R. Lyle (2012). "Limitations and possibilities of low cell number ChIP-seq." BMC Genomics 13: 645.

Glass, M. and R. D. Everett (2013). "Components of promyelocytic leukemia nuclear bodies (ND10) act cooperatively to repress herpesvirus infection." J Virol 87(4): 2174-2185.

Gong, Z., B. Chen, J. Liu, C. Zhou, D. Anchel, X. Li, J. Ge, D. Bazett-Jones and Y. Sun (2014). "Fluorescence and SEM correlative microscopy for nanomanipulation of subcellular structures." Light: Science & Applications 3: e224.

Gonsior, S. M., S. Platz, S. Buchmeier, U. Scheer, B. M. Jockusch and H. Hinssen (1999). "Conformational difference between nuclear and cytoplasmic actin as detected by a monoclonal antibody." J Cell Sci 112(Pt 6): 797-809.

Grande, M. A., I. van der Kraan, L. de Jong and R. van Driel (1997). "Nuclear distribution of transcription factors in relation to sites of transcription and RNA polymerase II." J Cell Sci 110(Pt 15): 1781-1791.

Greil, F., C. Moorman and B. van Steensel (2006). "DamID: mapping of in vivo protein-genome interactions using tethered DNA adenine methyltransferase." Methods Enzymol 410: 342-359.

Grossman, S. A., V. R. Sheidler, K. Swedeen, J. Mucenski and S. Piantadosi (1991). "Correlation of patient and caregiver ratings of cancer pain." J Pain Symptom Manage 6(2): 53-57.

Guelen, L., L. Pagie, E. Brasset, W. Meuleman, M. B. Faza, W. Talhout, B. H. Eussen, A. de Klein, L. Wessels, W. de Laat and B. van Steensel (2008). "Domain organization of human chromosomes revealed by mapping of nuclear lamina interactions." Nature 453(7197): 948-951.


Guenatri, M., D. Bailly, C. Maison and G. Almouzni (2004). "Mouse centric and pericentric satellite repeats form distinct functional heterochromatin." J Cell Biol 166(4): 493-505.

Hansma, H. G., J. Vesenka, C. Siegerist, G. Kelderman, H. Morrett, R. L. Sinsheimer, V. Elings, C. Bustamante and P. K. Hansma (1992). "Reproducible imaging and dissection of plasmid DNA under liquid with the atomic force microscope." Science 256(5060): 1180-1184.

Hebbes, T. R., A. L. Clayton, A. W. Thorne and C. Crane-Robinson (1994). "Core histone hyperacetylation co-maps with generalized DNase I sensitivity in the chicken beta-globin chromosomal domain." EMBO J 13(8): 1823-1830.

Heintzmann, R. and G. Ficz (2006). "Breaking the resolution limit in light microscopy." Brief Funct Genomic Proteomic 5(4): 289-301.

Heysen, A., P. Verwaerde and J. C. D'Halluin (1991). "Nucleotide sequence and regulation of the adenovirus type 3 E2A early promoter." Virology 181(1): 241-250.

Hiscox, J. A. (2007). "RNA viruses: hijacking the dynamic nucleolus." Nat Rev Microbiol 5(2): 119-127.

Hofmann, H., H. Sindre and T. Stamminger (2002). "Functional interaction between the pp71 protein of human cytomegalovirus and the PML-interacting protein human Daxx." J Virol 76(11): 5769-5783.

Hu, J., Y. Zhang, H. Gao, M. Li and U. Hartmann (2001). "Artificial DNA patterns by mechanical nanomanipulation." Nano Lett. 2(1): 55-57.

Ishov, A. M. and G. G. Maul (1996). "The periphery of nuclear domain 10 (ND10) as site of DNA virus deposition." J Cell Biol 134(4): 815-826.

Ivanchenko, S., S. Glaschick, C. Rocker, F. Oswald, J. Wiedenmann and G. U. Nienhaus (2007). "Two-photon excitation and photoconversion of EosFP in dual-color 4Pi confocal microscopy." Biophys J 92(12): 4451-4457.

Jackson, D. A., A. B. Hassan, R. J. Errington and P. R. Cook (1993). "Visualization of focal sites of transcription within human nuclei." EMBO J 12(3): 1059-1065.

Jacome, A. and O. Fernandez-Capetillo (2011). "Lac operator repeats generate a traceable fragile site in mammalian cells." EMBO Rep 12(10): 1032-1038.

Jiang, F. and R. L. Katz (2002). "Use of interphase fluorescence in situ hybridization as a powerful diagnostic tool in cytology." Diagn Mol Pathol 11(1): 47-57.

Johnson, D. S., A. Mortazavi, R. M. Myers and B. Wold (2007). "Genome-wide mapping of in vivo protein-DNA interactions." Science 316(5830): 1497-1502.

Junera, H. R., C. Masson, G. Geraud and D. Hernandez-Verdun (1995). "The three-dimensional organization of ribosomal genes and the architecture of the nucleoli vary with G1, S and G2 phases." J Cell Sci 108(Pt 11): 3427-3441.

Kiesslich, A., A. von Mikecz and P. Hemmerich (2002). "Cell cycle-dependent association of PML bodies with sites of active transcription in nuclei of mammalian cells." J Struct Biol 140(1-3): 167-179.

Kmita, M. and D. Duboule (2003). "Organizing axes in time and space; 25 years of colinear tinkering." Science 301(5631): 331-333.

Kodama, Y. and C. D. Hu (2012). "Bimolecular fluorescence complementation (BiFC): a 5-year update and future perspectives." Biotechniques 53(5): 285-298.


Kosak, S. T. and M. Groudine (2004). "Form follows function: The genomic organization of cellular differentiation." Genes Dev 18(12): 1371-1384.

Kreysing, M., L. Boyde, J. Guck and K. J. Chalut (2010). "Physical insight into light scattering by photoreceptor cell nuclei." Opt Lett 35(15): 2639-2641.

Kumar, P. P., O. Bischof, P. K. Purbey, D. Notani, H. Urlaub, A. Dejean and S. Galande (2007). "Functional interaction between PML and SATB1 regulates chromatin-loop architecture and transcription of the MHC class I locus." Nat Cell Biol 9(1): 45-56.

Kwon, C. S. and D. Wagner (2007). "Unwinding chromatin for development and growth: a few genes at a time." Trends Genet 23(8): 403-412.

La, M., K. Kim, J. Park, J. Won, J. H. Lee, Y. M. Fu, G. G. Meadows and C. O. Joe (2004). "Daxx-mediated transcriptional repression of MMP1 gene is reversed by SPOP." Biochem Biophys Res Commun 320(3): 760-765.

Lam, Y. W., C. E. Lyon and A. I. Lamond (2002). "Large-scale isolation of Cajal bodies from HeLa cells." Mol Biol Cell 13(7): 2461-2473.

LaMorte, V. J., J. A. Dyck, R. L. Ochs and R. M. Evans (1998). "Localization of nascent RNA and CREB binding protein with the PML-containing nuclear body." Proc Natl Acad Sci U S A 95(9): 4991-4996.

Lee, K., M. Chang, J. Ahn, D. Yu, S. Jung, J. Choi, Y. Noh, Y. Lee and M. Ahn (2002). "Differential gene expression in retinoic acid-induced differentiation of acute promyelocytic leukemia cells, NB4 and HL-60 cells." Biochem Biophys Res Commun 296(5): 1125-1133.

Levsky, J., S. Shenoy, R. Pezo and R. Singer (2002). "Single-cell gene expression profiling." Science 297(5582): 836-840.

Levsky, J. M. and R. H. Singer (2003). "Fluorescence in situ hybridization: past, present and future." J Cell Sci 116(Pt 14): 2833-2838.

Li, H., C. Leo, J. Zhu, X. Wu, J. O'Neil, E. J. Park and J. D. Chen (2000). "Sequestration and inhibition of Daxx-mediated transcriptional repression by PML." Mol Cell Biol 20(5): 1784-1796.

Li, J., W. X. Zou and K. S. Chang (2014). "Inhibition of Sp1 functions by its sequestration into PML nuclear bodies." PLoS One 9(4): e94450.

Lin, D. Y., Y. S. Huang, J. C. Jeng, H. Y. Kuo, C. C. Chang, T. T. Chao, C. C. Ho, Y. C. Chen, T. P. Lin, H. I. Fang, C. C. Hung, C. S. Suen, M. J. Hwang, K. S. Chang, G. G. Maul and H. M. Shih (2006). "Role of SUMO-interacting motif in Daxx SUMO modification, subnuclear localization, and repression of sumoylated transcription factors." Mol Cell 24(3): 341-354.

Lin, D. Y., M. Z. Lai, D. K. Ann and H. M. Shih (2003). "Promyelocytic leukemia protein (PML) functions as a glucocorticoid receptor co-activator by sequestering Daxx to the PML oncogenic domains (PODs) to enhance its transactivation potential." J Biol Chem 278(18): 15958-15965.

Lindman, S., A. Hernandez-Garcia, O. Szczepankiewicz, B. Frohm and S. Linse (2010). "In vivo protein stabilization based on fragment complementation and a split GFP system." Proc Natl Acad Sci U S A 107(46): 19826-19831.

Litt, M. D., M. Simpson, M. Gaszner, C. D. Allis and G. Felsenfeld (2001). "Correlation between histone lysine methylation and developmental changes at the chicken beta-globin locus." Science 293(5539): 2453-2455.

Litt, M. D., M. Simpson, F. Recillas-Targa, M. N. Prioleau and G. Felsenfeld (2001). "Transitions in histone acetylation reveal boundaries of three separately regulated neighboring loci." EMBO J 20(9): 2224-2235.

Liu, J. L., C. Murphy, M. Buszczak, S. Clatterbuck, R. Goodman and J. G. Gall (2006). "The Drosophila melanogaster Cajal body." J Cell Biol 172(6): 875-884.

Liu, J. L., Z. Wu, Z. Nizami, S. Deryusheva, T. K. Rajendra, K. J. Beumer, H. Gao, A. G. Matera, D. Carroll and J. G. Gall (2009). "Coilin is essential for Cajal body organization in Drosophila melanogaster." Mol Biol Cell 20(6): 1661-1670.

Louria-Hayon, I., T. Grossman, R. V. Sionov, O. Alsheich, P. P. Pandolfi and Y. Haupt (2003). "The promyelocytic leukemia protein protects p53 from Mdm2-mediated inhibition and degradation." J Biol Chem 278(35): 33134-33141.

Lü, J., H. An, H. Li, X. Li, Y. Wang, M. Li, Y. Zhang and J. Hu (2006). "Nanodissection, isolation, and PCR amplification of single DNA molecules." Surf. Interface Anal. 38(6): 1010-1013.

Luciani, J., D. Depetris, Y. Usson, C. Metzler-Guillemain, C. Mignon-Ravix, M. Mitchell, A. Megarbane, P. Sarda, H. Sirma, A. Moncla, J. Feunteun and M. Mattei (2006). "PML nuclear bodies are highly organised DNA-protein structures with a function in heterochromatin remodelling at the G2 phase." J. Cell Sci. 119(12): 2518-2531.

Ma, T., B. A. Van Tine, Y. Wei, M. D. Garrett, D. Nelson, P. D. Adams, J. Wang, J. Qin, L. T. Chow and J. W. Harper (2000). "Cell cycle-regulated phosphorylation of p220(NPAT) by cyclin E/Cdk2 in Cajal bodies promotes histone gene transcription." Genes Dev 14(18): 2298-2313.

Machyna, M., P. Heyn and K. M. Neugebauer (2013). "Cajal bodies: where form meets function." Wiley Interdiscip Rev RNA 4(1): 17-34.

Magliery, T. J., C. G. Wilson, W. Pan, D. Mishler, I. Ghosh, A. D. Hamilton and L. Regan (2005). "Detecting protein-protein interactions with a green fluorescent protein fragment reassembly trap: scope and mechanism." J Am Chem Soc 127(1): 146-157.

Mahy, N. L., P. E. Perry and W. A. Bickmore (2002). "Gene density and transcription influence the localization of chromatin outside of chromosome territories detectable by FISH." J Cell Biol 159(5): 753-763.

Marzluff, W. F. (2005). "Metazoan replication-dependent histone mRNAs: a distinct set of RNA polymerase II transcripts." Curr Opin Cell Biol 17(3): 274-280.

Maul, G. G. (1998). "Nuclear domain 10, the site of DNA virus transcription and replication." Bioessays 20(8): 660-667.

Maul, G. G., D. Negorev, P. Bell and A. M. Ishov (2000). "Review: properties and assembly mechanisms of ND10, PML bodies, or PODs." J Struct Biol 129(2-3): 278-287.

Mehta, I. S., M. Amira, A. J. Harvey and J. M. Bridger (2010). "Rapid chromosome territory relocation by nuclear motor activity in response to serum removal in primary human fibroblasts." Genome Biol 11(1): R5.

Meldrum, R. A., S. W. Botchway, C. W. Wharton and G. J. Hirst (2003). "Nanoscale spatial induction of ultraviolet photoproducts in cellular DNA by three-photon near-infrared absorption." EMBO Rep 4(12): 1144-1149.

Melnik, S., B. Deng, A. Papantonis, S. Baboo, I. M. Carr and P. R. Cook (2011). "The proteomes of transcription factories containing RNA polymerases I, II or III." Nat Methods 8(11): 963-968.

Morimoto, M. and C. F. Boerkoel (2013). "The role of nuclear bodies in gene expression and disease." Biology (Basel) 2(3): 976-1033.

Mucenski, M. L., K. McLain, A. B. Kier, S. H. Swerdlow, C. M. Schreiner, T. A. Miller, D. W. Pietryga, W. J. Scott, Jr. and S. S. Potter (1991). "A functional c-myb gene is required for normal murine fetal hepatic hematopoiesis." Cell 65(4): 677-689.

Muller, I. and A. Dejean (1999). "Viral immediate-early proteins abrogate the modification by SUMO-1 of PML and Sp100 proteins, correlating with nuclear body disruption." J Virol 73: 5137-5143.

Muller, I., S. Boyle, R. H. Singer, W. A. Bickmore and J. R. Chubb (2010). "Stable morphology, but dynamic internal reorganisation, of interphase human chromosomes in living cells." PLoS One 5(7): e11560.

Nakamura, J., E. R. Purvis and J. A. Swenberg (2003). "Micromolar concentrations of hydrogen peroxide induce oxidative DNA lesions more efficiently than millimolar concentrations in mammalian cells." Nucleic Acids Res 31(6): 1790-1795.

Negorev, D. and G. G. Maul (2001). "Cellular proteins localized at and interacting within ND10/PML nuclear bodies/PODs suggest functions of a nuclear depot." Oncogene 20(49): 7234-7242.

Nisole, S., M. A. Maroui, X. H. Mascle, M. Aubry and M. K. Chelbi-Alix (2013). "Differential Roles of PML Isoforms." Front Oncol 3: 125.

Nizami, Z., S. Deryusheva and J. G. Gall (2010). "The Cajal body and histone locus body." Cold Spring Harb Perspect Biol 2(7): a000653.

Novotny, I., M. Blazikova, D. Stanek, P. Herman and J. Malinsky (2011). "In vivo kinetics of U4/U6.U5 tri-snRNP formation in Cajal bodies." Mol Biol Cell 22(4): 513-523.

Oheim, M., D. J. Michael, M. Geisbauer, D. Madsen and R. H. Chow (2006). "Principles of two-photon excitation fluorescence microscopy and other nonlinear imaging approaches." Adv Drug Deliv Rev 58(7): 788-808.

Olsen, M. (2004). Non Traditional Roles for the Nucleolus. The Nucleolus, Landes Bioscience: 14.

Olson, M. O. (2004). "Sensing cellular stress: another new function for the nucleolus?" Sci STKE 2004(224): pe10.

O'Neill, L. P., M. D. VerMilyea and B. M. Turner (2006). "Epigenetic characterization of the early embryo with a chromatin immunoprecipitation protocol applicable to small cell populations." Nat Genet 38(7): 835-841.

Osborne, C. S., L. Chakalova, K. E. Brown, D. Carter, A. Horton, E. Debrand, B. Goyenechea, J. A. Mitchell, S. Lopes, W. Reik and P. Fraser (2004). "Active genes dynamically colocalize to shared sites of ongoing transcription." Nat Genet 36(10): 1065-1071.

Osborne, C. S., L. Chakalova, J. A. Mitchell, A. Horton, A. L. Wood, D. J. Bolland, A. E. Corcoran and P. Fraser (2007). "Myc dynamically and preferentially relocates to a transcription factory occupied by Igh." PLoS Biol. 5(8): e192.

Palstra, R. J., W. de Laat and F. Grosveld (2008). "Beta-globin regulation and long-range interactions." Adv Genet 61: 107-142.

Papantonis, A., T. Kohro, S. Baboo, J. D. Larkin, B. Deng, P. Short, S. Tsutsumi, S. Taylor, Y. Kanki, M. Kobayashi, G. Li, H. M. Poh, X. Ruan, H. Aburatani, Y. Ruan, T. Kodama, Y. Wada and P. R. Cook (2012). "TNFalpha signals through specialized factories where responsive coding and miRNA genes are transcribed." EMBO J 31(23): 4404-4414.

Papantonis, A., J. D. Larkin, Y. Wada, Y. Ohta, S. Ihara, T. Kodama and P. R. Cook (2010). "Active RNA polymerases: mobile or immobile molecular machines?" PLoS Biol 8(7): e1000419.

Parks, C. L. and T. Shenk (1997). "Activation of the adenovirus major late promoter by transcription factors MAZ and Sp1." J Virol 71(12): 9600-9607.

Pattabiraman, D. R., J. Sun, D. H. Dowhan, S. Ishii and T. J. Gonda (2009). "Mutations in multiple domains of c-Myb disrupt interaction with CBP/p300 and abrogate myeloid transforming ability." Mol Cancer Res 7(9): 1477-1486.

Pearson, J. C., D. Lemons and W. McGinnis (2005). "Modulating Hox gene functions during animal body patterning." Nat Rev Genet 6(12): 893-904.

Pearson, M., R. Carbone, C. Sebastiani, M. Cioce, M. Fagioli, S. Saito, Y. Higashimoto, E. Appella, S. Minucci, P. P. Pandolfi and P. G. Pelicci (2000). "PML regulates p53 acetylation and premature senescence induced by oncogenic Ras." Nature 406(6792): 207-210.

Pederson, T. (2008). "As functional nuclear actin comes into view, is it globular, filamentous, or both?" J Cell Biol 180(6): 1061-1064.

Pederson, T. and U. Aebi (2005). "Nuclear actin extends, with no contraction in sight." Mol Biol Cell 16(11): 5055-5060.

Pederson, T. and R. Y. Tsai (2009). "In search of nonribosomal nucleolar protein function and regulation." J Cell Biol 184(6): 771-776.

Pitha-Rowe, I., W. J. Petty, S. Kitareewan and E. Dmitrovsky (2003). "Retinoid target genes in acute promyelocytic leukemia." Leukemia 17(9): 1723-1730.

Platani, M., I. Goldberg, A. I. Lamond and J. R. Swedlow (2002). "Cajal body dynamics and association with chromatin are ATP-dependent." Nat Cell Biol 4(7): 502-508.

Pletnev, S., N. G. Gurskaya, N. V. Pletneva, K. A. Lukyanov, D. M. Chudakov, V. I. Martynov, V. O. Popov, M. V. Kovalchuk, A. Wlodawer, Z. Dauter and V. Pletnev (2009). "Structural basis for phototoxicity of the genetically encoded photosensitizer KillerRed." J Biol Chem 284(46): 32028-32039.

Lallemand-Breitenbach, V. and H. de Thé (2010). "PML nuclear bodies." Cold Spring Harb Perspect Biol 2(5): a000661.

Quimby, B. B., V. Yong-Gonzalez, T. Anan, A. V. Strunnikov and M. Dasso (2006). "The promyelocytic leukemia protein stimulates SUMO conjugation in yeast." Oncogene 25(21): 2999-3005.

Ragoczy, T., A. Telling, T. Sawado, M. Groudine and S. T. Kosak (2003). "A genetic analysis of chromosome territory looping: diverse roles for distal regulatory elements." Chromosome Res 11(5): 513-525.

Ramsay, R. G. and T. J. Gonda (2008). "MYB function in normal and cancer cells." Nat Rev Cancer 8(7): 523-534.

Rand, K. N., T. Ho, W. Qu, S. M. Mitchell, R. White, S. J. Clark and P. L. Molloy (2005). "Headloop suppression PCR and its application to selective amplification of methylated DNA sequences." Nucleic Acids Res 33(14): e127.

Rapkin, L. M., D. R. Anchel, R. Li and D. P. Bazett-Jones (2012). "A view of the chromatin landscape." Micron 43(2-3): 150-158.

Rattner, J. B. and B. A. Hamkalo (1978). "Higher order structure in metaphase chromosomes. I. The 250 A fiber." Chromosoma 69(3): 363-372.

Rattner, J. B., G. Krystal and B. A. Hamkalo (1978). "Selective digestion of mouse metaphase chromosomes." Chromosoma 66(3): 259-268.

Razin, S. V., A. A. Gavrilov, E. S. Ioudinkova and O. V. Iarovaia (2013). "Communication of genome regulatory elements in a folded chromosome." FEBS Lett. 587(13): 1840-1847.

Razin, S. V., O. V. Iarovaia and Y. S. Vassetzky (2014). "A requiem to the nuclear matrix: from a controversial concept to 3D organization of the nucleus." Chromosoma 123(3): 217-224.

Resnick-Silverman, L. and J. J. Manfredi (2006). "Gene-specific mechanisms of p53 transcriptional control and prospects for cancer therapy." J Cell Biochem 99(3): 679-689.

Rieder, D., Z. Trajanoski and J. G. McNally (2012). "Transcription factories." Front Genet 3: 221.

Rodriguez, B. A. and T. H. Huang (2005). "Tilling the chromatin landscape: emerging methods for the discovery and profiling of protein-DNA interactions." Biochem Cell Biol 83(4): 525-534.

Rushton, J. J. and S. A. Ness (2001). "The conserved DNA binding domain mediates similar regulatory interactions for A-Myb, B-Myb, and c-Myb transcription factors." Blood Cells Mol Dis 27(2): 459-463.

Sahin, U., O. Ferhi, M. Jeanne, S. Benhenda, C. Berthier, F. Jollivet, M. Niwa-Kawakita, O. Faklaris, N. Setterblad, H. de The and V. Lallemand-Breitenbach (2014). "Oxidative stress-induced assembly of PML nuclear bodies controls sumoylation of partner proteins." J Cell Biol 204(6): 931-945.

Salomoni, P. and P. P. Pandolfi (2002). "The role of PML in tumor suppression." Cell 108(2): 165-170.

Salzler, H. R., D. C. Tatomer, P. Y. Malek, S. L. McDaniel, A. N. Orlando, W. F. Marzluff and R. J. Duronio (2013). "A sequence in the Drosophila H3-H4 Promoter triggers histone locus body assembly and biosynthesis of replication-coupled histone mRNAs." Dev Cell 24(6): 623-634.

Schoenfelder, S., T. Sexton, L. Chakalova, N. Cope, A. Horton, S. Andrews, S. Kurukuti, J. Mitchell, D. Umlauf, D. Dimitrova, C. Eskiw, Y. Luo, C. Wei, Y. Ruan, J. Bieker and P. Fraser (2010). "Preferential associations between co-regulated genes reveal a transcriptional interactome in erythroid cells." Nat. Genet. 42(1): 53-61.

Schutze, K. and G. Lahr (1998). "Identification of expressed genes by laser-mediated manipulation of single cells." Nat Biotechnol 16(8): 737-742.

Seeler, J. S., A. Marchio, D. Sitterlin, C. Transy and A. Dejean (1998). "Interaction of SP100 with HP1 proteins: a link between the promyelocytic leukemia-associated nuclear bodies and the chromatin compartment." Proc Natl Acad Sci U S A 95(13): 7316-7321.

Serebrovskaya, E. O., E. F. Edelweiss, O. A. Stremovskiy, K. A. Lukyanov, D. M. Chudakov and S. M. Deyev (2009). "Targeting cancer cells by using an antireceptor antibody-photosensitizer fusion protein." Proc Natl Acad Sci U S A 106(23): 9221-9225.

Sexton, T., D. Umlauf, S. Kurukuti and P. Fraser (2007). "The role of transcription factories in large-scale structure and dynamics of interphase chromatin." Semin Cell Dev Biol 18(5): 691-697.

Shav-Tal, Y., J. Blechman, X. Darzacq, C. Montagna, B. T. Dye, J. G. Patton, R. H. Singer and D. Zipori (2005). "Dynamic sorting of nuclear components into distinct nucleolar caps during transcriptional inhibition." Mol Biol Cell 16(5): 2395-2413.

Shen, T. H., H. K. Lin, P. P. Scaglioni, T. M. Yung and P. P. Pandolfi (2006). "The mechanisms of PML-nuclear body formation." Mol Cell 24(3): 331-339.

Shevtsov, S. P. and M. Dundr (2011). "Nucleation of nuclear bodies by RNA." Nat Cell Biol 13(2): 167-173.

Shiels, C., S. A. Islam, R. Vatcheva, P. Sasieni, M. J. Sternberg, P. S. Freemont and D. Sheer (2001). "PML bodies associate specifically with the MHC gene cluster in interphase nuclei." J Cell Sci 114(Pt 20): 3705-3716.

Shopland, L. S., C. R. Lynch, K. A. Peterson, K. Thornton, N. Kepper, J. Hase, S. Stein, S. Vincent, K. R. Molloy, G. Kreth, C. Cremer, C. J. Bult and T. P. O'Brien (2006). "Folding and organization of a contiguous chromosome region according to the gene distribution pattern in primary genomic sequence." J Cell Biol 174(1): 27-38.

Simonis, M., P. Klous, E. Splinter, Y. Moshkin, R. Willemsen, E. de Wit, B. van Steensel and W. de Laat (2006). "Nuclear organization of active and inactive chromatin domains uncovered by chromosome conformation capture-on-chip (4C)." Nat Genet 38(11): 1348-1354.

Simonis, M., J. Kooren and W. de Laat (2007). "An evaluation of 3C-based methods to capture DNA interactions." Nat. Methods 4(11): 895-901.

Sirri, V., S. Urcuqui-Inchima, P. Roussel and D. Hernandez-Verdun (2008). "Nucleolus: the fascinating nuclear body." Histochem Cell Biol 129(1): 13-31.

Slusarczyk, A., R. Kamath, C. Wang, D. Anchel, C. Pollock, M. A. Lewandowska, T. Fitzpatrick, D. P. Bazett-Jones and S. Huang (2010). "Structure and function of the perinucleolar compartment in cancer cells." Cold Spring Harb Symp Quant Biol 75: 599-605.

Smallwood, A. and B. Ren (2013). "Genome organization and long-range regulation of gene expression by enhancers." Curr Opin Cell Biol 25(3): 387-394.

Smith, K. P. and J. B. Lawrence (2000). "Interactions of U2 gene loci and their nuclear transcripts with Cajal (coiled) bodies: evidence for PreU2 within Cajal bodies." Mol Biol Cell 11(9): 2987-2998.

Solovei, I. and M. Cremer (2010). "3D-FISH on cultured cells combined with immunostaining." Methods Mol Biol 659: 117-126.

Solovei, I., M. Kreysing, C. Lanctot, S. Kosem, L. Peichl, T. Cremer, J. Guck and B. Joffe (2009). "Nuclear architecture of rod photoreceptor cells adapts to vision in mammalian evolution." Cell 137(2): 356-368.

Soutoglou, E., J. F. Dorn, K. Sengupta, M. Jasin, A. Nussenzweig, T. Ried, G. Danuser and T. Misteli (2007). "Positional stability of single double-strand breaks in mammalian cells." Nat Cell Biol 9(6): 675-682.

Spector, D. L. (2001). "Nuclear domains." J Cell Sci 114(Pt 16): 2891-2893.

Spector, D. L. (2003). "The dynamics of chromosome organization and gene regulation." Annu Rev Biochem 72: 573-608.

Spector, D. L., G. Lark and S. Huang (1992). "Differences in snRNP localization between transformed and nontransformed cells." Mol Biol Cell 3(5): 555-569.

Speel, E. J. and P. Komminoth (1999). "CARD In Situ Hybridization: Sights and Signals." Endocr Pathol 10(3): 193-198.

Spilianakis, C., M. Lalioti, T. Town, G. Lee and R. Flavell (2005). "Interchromosomal associations between alternatively expressed loci." Nature 435(7042): 637-645.

Stadler, S., V. Schnapp, R. Mayer, S. Stein, C. Cremer, C. Bonifer, T. Cremer and S. Dietzel (2004). "The architecture of chicken chromosome territories changes during differentiation." BMC Cell Biol 5(1): 44.

Stanek, D. and K. M. Neugebauer (2006). "The Cajal body: a meeting place for spliceosomal snRNPs in the nuclear maze." Chromosoma 115(5): 343-354.

Strickfaden, H., A. Zunhammer, S. van Koningsbruggen, D. Kohler and T. Cremer (2010). "4D chromatin dynamics in cycling cells: Theodor Boveri's hypotheses revisited." Nucleus 1(3): 284-297.

Sun, Y., L. Durrin and T. Krontiris (2003). "Specific interaction of PML bodies with the TP53 locus in Jurkat interphase nuclei." Genomics 82(2): 250-252.

Suzuki, T., H. Izumi and M. Ohno (2010). "Cajal body surveillance of U snRNA export complex assembly." J Cell Biol 190(4): 603-612.

Takahashi, Y., V. Lallemand-Breitenbach, J. Zhu and H. de The (2004). "PML nuclear bodies and apoptosis." Oncogene 23(16): 2819-2824.

Takaku, Y., H. Suzuki, I. Ohta, D. Ishii, Y. Muranaka, M. Shimomura and T. Hariyama (2012). "A thin polymer membrane, nano-suit, enhancing survival across the continuum between air and high vacuum." Proc Natl Acad Sci U S A 110(19): 7631-7635.

Takemoto, K., T. Matsuda, N. Sakai, D. Fu, M. Noda, S. Uchiyama, I. Kotera, Y. Arai, M. Horiuchi, K. Fukui, T. Ayabe, F. Inagaki, H. Suzuki and T. Nagai (2013). "SuperNova, a monomeric photosensitizing fluorescent protein for chromophore-assisted light inactivation." Sci Rep 3: 2629.

Takizawa, T., K. J. Meaburn and T. Misteli (2008). "The meaning of gene positioning." Cell 135(1): 9-13.

Tavalai, N., M. Adler, M. Scherer, Y. Riedl and T. Stamminger (2011). "Evidence for a dual antiviral role of the major nuclear domain 10 component Sp100 during the immediate-early and late phases of the human cytomegalovirus replication cycle." J Virol 85: 9447-9458.

Thiry, M., F. Lamaye and D. L. Lafontaine (2011). "The nucleolus: when 2 became 3." Nucleus 2(4): 289-293.

Tolhuis, B., R. J. Palstra, E. Splinter, F. Grosveld and W. de Laat (2002). "Looping and interaction between hypersensitive sites in the active beta-globin locus." Mol Cell 10(6): 1453-1465.

Torok, D., R. W. Ching and D. P. Bazett-Jones (2009). "PML nuclear bodies as sites of epigenetic regulation." Front Biosci (Landmark Ed) 14: 1325-1336.

Ulbricht, T., M. Alzrigat, A. Horch, N. Reuter, A. von Mikecz, V. Steimle, E. Schmitt, O. H. Kramer, T. Stamminger and P. Hemmerich (2012). "PML promotes MHC class II gene expression by stabilizing the class II transactivator." J Cell Biol 199(1): 49-63.

Vallian, S., K. V. Chin and K. S. Chang (1998). "The promyelocytic leukemia protein interacts with Sp1 and inhibits its transactivation of the epidermal growth factor receptor promoter." Mol Cell Biol 18(12): 7147-7156.

van Berkum, N. and J. Dekker (2009). "Determining spatial chromatin organization of large genomic regions using 5C technology." Methods Mol. Biol. 567: 189-213.

Van Damme, E., K. Laukens, T. H. Dang and X. Van Ostade (2010). "A manually curated network of the PML nuclear body interactome reveals an important role for PML-NBs in SUMOylation dynamics." Int J Biol Sci 6(1): 51-67.

van Wageningen, S., M. C. Breems-de Ridder, J. Nigten, G. Nikoloski, C. A. Erpelinck-Verschueren, B. Lowenberg, T. de Witte, D. G. Tenen, B. A. van der Reijden and J. H. Jansen (2008). "Gene transactivation without direct DNA binding defines a novel gain-of-function for PML-RARalpha." Blood 111(3): 1634-1643.

Verschure, P. J., I. van Der Kraan, E. M. Manders and R. van Driel (1999). "Spatial relationship between transcription sites and chromosome territories." J Cell Biol 147(1): 13-24.

Waldeck, W., G. Mueller, K. H. Glatting, A. Hotz-Wagenblatt, N. Diessl, S. Chotewutmonti, J. Langowski, W. Semmler, M. Wiessler and K. Braun (2013). "Spatial localization of genes determined by intranuclear DNA fragmentation with the fusion proteins lamin KRED and histone KRED und visible light." Int J Med Sci 10(9): 1136-1148.

Wang, D. and S. Bodovitz (2010). "Single cell analysis: the new frontier in 'omics'." Trends Biotechnol 28(6): 281-290.

Wang, J., C. Shiels, P. Sasieni, P. Wu, S. Islam, P. Freemont and D. Sheer (2004). "Promyelocytic leukemia nuclear bodies associate with transcriptionally active genomic regions." J. Cell Biol. 164(4): 515-526.

Wesley, C. S., M. Ben, M. Kreitman, N. Hagag and W. F. Eanes (1990). "Cloning regions of the Drosophila genome by microdissection of polytene chromosome DNA and PCR with nonspecific primers." Nucleic Acids Res. 18: 599-603.

White, A. E., M. E. Leslie, B. R. Calvi, W. F. Marzluff and R. J. Duronio (2007). "Developmental and cell cycle regulation of the Drosophila histone locus body." Mol Biol Cell 18(7): 2491-2502.

Williams, A., C. G. Spilianakis and R. A. Flavell (2010). "Interchromosomal association and gene regulation in trans." Trends Genet 26(4): 188-197.

Wittkopp, P. J. and G. Kalay (2011). "Cis-regulatory elements: molecular mechanisms and evolutionary processes underlying divergence." Nat Rev Genet 13(1): 59-69.

Wu, W. S., S. Vallian, E. Seto, W. M. Yang, D. Edmondson, S. Roth and K. S. Chang (2001). "The growth suppressor PML represses transcription by functionally and physically interacting with histone deacetylases." Mol Cell Biol 21(7): 2259-2268.

Xie, D., A. P. Boyle, L. Wu, J. Zhai, T. Kawli and M. Snyder (2013). "Dynamic trans-acting factor colocalization in human cells." Cell 155(3): 713-724.

Yang, X., Y. Guo and A. Wang (2010). "Luminol/antibody labeled gold nanoparticles for chemiluminescence immunoassay of carcinoembryonic antigen." Anal Chim Acta 666(1-2): 91-96.

Yuan, H., H. Chong, B. Wang, C. Zhu, L. Liu, Q. Yang, F. Lv and S. Wang (2012). "Chemical molecule-induced light-activated system for anticancer and antifungal activities." J Am Chem Soc 134(32): 13184-13187.

Zhang, Y. L., Y. Zhang, C. Ru, B. K. Chen and Y. Sun (2013). "A load-lock-compatible nanomanipulation system for scanning electron microscope." IEEE/ASME Trans. Mechatron. 18(1): 230-237.

Zhao, F. Y., X. Yang, D. Y. Chen, W. Y. Ma, J. G. Zheng and X. M. Zhang (2014). "A method for simultaneously delineating multiple targets in 3D-FISH using limited channels, lasers, and fluorochromes." Eur Biophys J 43(1): 53-58.

Zhao, J., B. Kennedy, B. Lawrence, D. Barbie, A. Matera, J. Fletcher and E. Harlow (2000). "NPAT links cyclin E-Cdk2 to the regulation of replication-dependent histone gene transcription." Genes Dev. 14(18): 2283-2297.

Zhao, R., M. S. Bodnar and D. L. Spector (2009). "Nuclear neighborhoods and gene expression." Curr. Opin. Genet. Dev. 19(2): 172-179.

Zhou, Y. and S. A. Ness (2011). "Myb proteins: angels and demons in normal and transformed cells." Front Biosci (Landmark Ed) 16: 1109-1131.

Zhong, S., S. Müller, S. Ronchetti, P. S. Freemont, A. Dejean and P. P. Pandolfi (2000). "Role of SUMO-1-modified PML in nuclear body formation." Blood 95(9): 2748-2752.

Zhu, J., M. H. Koken, F. Quignon, M. K. Chelbi-Alix, L. Degos, Z. Y. Wang, Z. Chen and H. de The (1997). "Arsenic-induced PML targeting onto nuclear bodies: implications for the treatment of acute promyelocytic leukemia." Proc Natl Acad Sci U S A 94(8): 3978-3983.

7 Appendix

A.1 MEF chromocentre targeting sequences

mouse genomic

probe

linker

April 14th 08

2-B

Mouse

chr5:24,323,596-24,348,054

cgtaggaatcaacaaattgaccttcaacaaactgaattgtcaatgaggactttcccacagaccgatagcccagg atggcgatcttccgggacttggactgaggcatggatctgagtccggacttgtacagcatattcagctgttcctctgt tgtcacg

July 10th 08

1 Mouse alpha satellite sequence

Cgtgacaacagatggaacagctgaatatgcgtgagaaacatccacttgccgacttgaaaaatgacgaaatcactaaaaaacgtgaaaaatg agaaatgcacactgaaggacagcggccgcttgaattcgaacgcgaga

August 14th 08 (from S2 of July 31st 08)

2 Mouse, repetitive all over chromosome 14

tgggacatcgtcctgctggagttcgtgacgacagcggccgcttgaattcgaacgcgagacagcggccgctgtcctgg actttttggggctgggagactattgatgactgcttctatttctttagggaaatgagagtgtttacatcgttaatctg attctgatttaactttggtatctggtatctgtctaggaagttgtttatttcatacaggttttcagttttcctgagta tagcctgcatattcagctgttccatctgttgtcacg

5-chrX:66733497-66733620

gtgacaacagatggaacagctgaatatgcactctaagccaccatcaccccccactcccccgaccaccctgtgcaaca gggcaggttgtaaatcataggttttgtgtctcagttggtatccactaggagcctagtctgattacagacggccagtg aattgtaatacgactcactataggctgggacatcgtcctgctggagttcgtgacgacagcggccgcttgaattcgaa cgcgagaatctttctagaagatctcctacaatat

6 Mouse alpha satellite DNA!

Agcaagatcgtgacaacagatggaacagctgaatatgatggaaaatgagaaacatccacttgacgacttgaaaaatg atgaaatcactaaaaatcgtgaaaaatgagaaatgcacactgaaggacagcggccgctgtctcgcgttcgaattcaa gcggccgctgtct

8 Mouse chromosome 2 chr2:22,444,490-22,444,597

Gctgggacatcgtcctgctggagttcgtgacagcggccgcttgaattcgaacgcragacagcggccgctstcccytc ctccagysgratcaaagaaagttgtgtttaggttgcggtctgttagtagtatagtaatgcctgcggctagcactggt agtgataataggagcagtacggctgtgcatattcagctgttccatctgttgtcacg

9 Mouse

chr11:94,049,342-94,049,481 (dark green)

chr11:92,573,333-92,573,407 (light green)

Inter-linker ligation

gtgacaacagatggaacagctgaatatgcaggtcascggagggatgcatgcaaagakcctggggtacccactgtagt ctcatcctagaaaaagctaattgtctgacaaggctgtaakgtayataccttttgtgaggtgacagkkkacactggwa aagtgagtatcttgggttggccagcggccgctgtstcgygttcraattcaagcggccgctgtcyckcktkcgarttc argcggccgctstccyyawctykwktmkwgtgtctcattccacttcaggcttaagtgcacctgaggttgtcaaaatt gagaaagtgctggaatagctgagtggcatattcagctgttccatctgttgtcacgatctttctagaagatctcctac aatattctcagctgccatggaaaatc

12 Mouse chromosome 15

chr15:99,797,381-99,797,500

tgaattcgaacgcgagacagcggccgctGTCCAAGTAGTGTTCCTTGTAACCTCGGGAGTCCTTGAAGCT TCCTAGTGGACACTTAGTAGCATTTGTTTATATTCTGTGAAAAACAGAATAGAAACAAAGAAATCACCCA AATTTTCAgcatattcagctgttccatctgttgtcacg

14 Mouse chr7:144,119,406-144,119,488

acagcggccgcttgaattcgaacgcgagacagcggccgctgtcctgagttccaatcccagcaaccacatgggattgt attttaaagaaatctttaaaaaagaagaggaggaggaagaggaggagcatattcagctgttccatctgttgtcacg
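The reads listed above share flanking motifs that appear in either orientation: entry 5 opens with gtgacaacagatggaacagctgaatatgc, which is the reverse complement of the gcatattcagctgttccatctgttgtcac found near the end of entry 14. Several reads also carry IUPAC ambiguity codes (w, r, k, m, s, y). The following is a minimal sketch, not part of the original analysis, of how such reads and their genome coordinates could be handled programmatically. The function names are hypothetical, and treating the coordinates as 1-based inclusive (genome-browser style) is an assumption.

```python
import re

# Complement table covering the standard bases and the IUPAC ambiguity
# codes that occur in sequencing reads (e.g. w, r, k, m, s, y).
_COMPLEMENT = {
    "a": "t", "c": "g", "g": "c", "t": "a",
    "r": "y", "y": "r", "k": "m", "m": "k",
    "s": "s", "w": "w", "n": "n",
    "b": "v", "v": "b", "d": "h", "h": "d",
}


def reverse_complement(seq):
    """Return the reverse complement of a read, in lower case.

    Whitespace (such as line-wrap spaces inside a listed sequence)
    is removed before the read is processed.
    """
    cleaned = "".join(seq.split()).lower()
    return "".join(_COMPLEMENT[base] for base in reversed(cleaned))


def parse_region(text):
    """Parse a position such as 'chr7:144,119,406-144,119,488'.

    Returns (chromosome, start, end); thousands separators are optional.
    """
    match = re.fullmatch(r"(chr\w+):([\d,]+)-([\d,]+)", text.strip())
    if match is None:
        raise ValueError("not a recognised region: %r" % text)
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    return match.group(1), start, end


def region_length(text):
    """Length of a region, ASSUMING 1-based inclusive coordinates."""
    _, start, end = parse_region(text)
    return end - start + 1


if __name__ == "__main__":
    print(reverse_complement("gcatattcagctgttccatctgttgtcac"))
    print(region_length("chr7:144,119,406-144,119,488"))
```

Under the inclusive-coordinate assumption, the region given for entry 14 spans 83 bp, which can be compared against the length of the genomic insert lying between the flanking motifs of that read.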

Nov. 27th 08 (from Nov. 25th 08 gel result)

chr2:173,279,732-173,279,862

TTWRAGWGTGAMACAGATGGACAGCTGAATATGCTCTGAAGCCCTGTTATGACCTTGCCATCACCCGCTGTC CCATTTTGCAGAGGAGGAAACCAAACTGAGGTGAAGACACTCAACTGAGATCACACAGCCGTGCTCACTAAGCTC TATGCCCGGCAGGTAGGATAGCGGCCGCTTGAATTCGAACGCGA

chr5:4,400,153-4,400,262

TTRGCAGATCTCGCGTTCGATTCAGCGGCCGCTGTCCTGGTCTTGTTTCAAGAAATCAGAGTTTTCTGGAATAA TTCAAGCTCCCAAGAAAACAGAACACAGACAGACCTTTAAATGTGTTTTAACCATTAAGGGTGGGTGGATGCATAT TCAGCTGTTCCATCTGTTATC

chr1:172,470,983-172,471,041

CAGATGGACAGCTGAATATGCTAGGTGTCTCAGTTGGCTAGCTTAATGGGAGCTCTGGTGCAGTGTTACCCA GTAGAGGACAGCGGCGCTTGAATTCGAACGCGAGA

chr19:6,963,709-6,963,783

TTGAGATCTCGCGTTCGATTCAGCGGCCGCTGTCCCTCCTCCAAGGCTCCCAGCACTGCCCCACTCCCTTGAAC TATGGAATCCCCAGATACCGTCTAGAGCTCCCTGCATATTCAGCTGTTCCATCTGTTGTCAC

(S2-7) -mouse major alpha sat!

CACAGATGGACAGCTGAATATGCTGACGACTTGGAAAATGACGAAATCACTAAAAAACGTGAAAAATGAGAA ATGCACACTGAAGGACAGCGGCCGCTTGAATTCGAACGCGAGA

chr7:99,533,461-99,533,619

TTAAGWGTGMACAGATGGAACAGCTGAATATGCCTCACCCAAATCCACAGACTCTTTTTTTTTAAAGATTTAT TTATTTTATGTATATGGGTGATCTATCTGCATATACACCTGCATGCACGAAGAGGGCTTCTGCACTCTCTAGAAATA GCTGTGAGCCACCATGTGGTTCCTGGGAATTGAACTCAGGACAGCGGCCGCTTGAATTCGAACGCGAGA

S2-9B

Mouse alpha satellite sequence and unique!

twrtaggwamaacagatggacagctgaatatgctccactgtaggacgtggaatatggcaagaaaactgaaaatcatggaaaatgagaaac atccacttgacgacttgaaaaatgaagaaatcactgaaaaacgtgaaaaatgaaaatgcacactgtaggacagcggccgcttgaattcgaacgcga ga

S2-1B

Mouse alpha satellite sequence! and unique!

crgagawtctcgcgttcgattmagcggccgctgtcctacagtgtgcattttcatttttcacgtttttcagtgatttcttcatttttcaagtcgccaag tggatgtttctcattttccatgattttcagttttcttgccatattccacgtcctacagtggagcatattcagctgttccatc tgttgtcac

S2-8B

chr10:75,764,351-75,764,441

tkarwtgtgmacagatggacagctgaatatgcagactcctgcgcgcgacaaaccaacaaggaacgctctcattggctggagtgtcagccttat cagccaatcggagtaggctgtttgggcggacagcggccgcttgaattcgaacgcgaga

S2-7B

chr7:68,818,824-68,819,055

trtagatctcgcgttcgattcagcggccgctgtccataggttaacaacctcagcatgaatatctgtgaaagagtgggcacagaggaacatattctt gaccgtcagcacagggttctagaactttctggtataaagcactcatcttccttcctctcagttcctcaccttactcctgatgcttcctcatttccctttgact gagcccatcccccctgaggaactaacacaaaaggtaagagaactctgaaggttgtgaagctcttcgcatattcagctgttccatctgttgtcac

S2-5B

Mouse major alpha satellite sequence! and unique!

Agtrgwtctcgcgttcgattcagcggccgctgtccttcagtgtgcatttcacatttttcacgtcttttagtgatttcgtcatttttcaagtcgtgcatat tcagctgttccatctgttgtca

S2-3B

chr5:136,414,848-136,414,923

tagwtggmacagatggacagctgaatatgcctagggatggcatgtgtgcgagctgtatccctagctatgctttctgtatttttaaagagctgga ggatttgataggacagcggccttgaattcgaacgcgaga

S2-10B

chrX:133,455,972-133,456,090

Trcagwtcgttcgattcagcggccgctgtcctacaacaactggagcaggggctatccctaaagstgtgagctgactgtggtgtctgttccgaaac agggctcccttatctggsctcagtgagagagaaagattcttatcctgtagaggcatattcagctgttccatctgktgkcac

S2-2B

Completely paints chromosome 16,11,1,5,X,Y and Blasts to an UN contig

tgcgatctcgcgttcgattcagcggccgctcccatcatttcatcacagctccaaattttgtctctgtaactccttccatgggtgttttgttcccatttct aagaaagggtaaagtgtccacactttggtctttgttcgcatattcagctgttccatctgttgtcac

April 22nd 09 Sequencing

S5-2

Highly repetitive including UN contig

cggattgtaggagwcttctagaagattctcgcgttcgaattcaagcgccgctgtccagtggatgtttctcattttccatgattttcagttttcttgcc atattccatgtcctacagtggacatttctaaattatccaactttttcagtgcatattcagctgttccatctgttgtca

S5-3

Highly repetitive including UN contig

gtgacaacagatggaacagctgaatatgcaatgttatcgaatttcctagcttctcctctgaggatcacctaccctttctaccccctgctccccaacc cacccaatcctgctttctgccccaggcattcccctatactgggtcatagaaccttcacacgacagcggccgcttgaattcgaacgcgagaatcttgctg

S5-4

chr4:36,968,937-36,969,007

cggcattgtaggagwcttctagaagwtctcgcgttcgaattcaagcggccgctgtcccagcaatactggtgcataaaatgaaggtcagcttagc tataagttaatgaaatcggtgtagatattagcatattcagctgttccatctgttgt

S5-8

Highly repetitive including UN contig

tctcgcgttcgaattcaagcggccgctgtccttcagtgtgcatttctcatttttcacgttttttaggcgctagacgtcgtctagcgcgcatattcagct gttccatctgttgtcac

January 7th 09 Sequencing:

S14-2: repetitive all over chromosome 14

Tktagwgtgmacagatggacagctgaatatgcaatacccaaaattctaattttgagatgacaggcatctgtggaagactcattatctggtatg gtaggggacagcggccgcttgaattcgaacgcgaga

S3-5

Mitochondrial

Ttwwagatctcgcgttcgattcaagcggccgctgtccgattccaccccctcacgactaataataactttattttaacaactatactttgcctcgga gccctaaccacattatttacagctatttgtgctctcacccaaaatgacatcaaaaaaatcattgccttctctacatcaagccaactaggcctgataatag tgacgctaggaataaaccaaccacacctagcattcctacacatctgtgcatattcagctgttccatctgttgtcac

CS5-3 chr4:93,342,209-93,342,303

ttgcgatctcgcgttcgattcagcggccgctgtcctgctgttgtgcctgtcactggctgc catgcttcctcaccatgatggactccttcttctctagaaccagaagggcacataaacttt ttcttctgcatattcagctgttccatctgttgtcac

CS5-4

(Mouse Alpha Satellite Sequence)

tgagagtgmacagatggacagcgaatatgcctttaggacgtgaaatatggcgaggaaaac tgaaaaaggtggaaaatttagaaatgtccactgtaggacgtggaatatggcaagaaaact gaaatcatgaaaaatgagaaacatccacttgacgacttgaaaaatgacgaaatcactaaa aaacgtgaaaaatgagaaatgcacactgaaggacagcggccgcttgaattcgaacgcgag a

CS5-5 chr4:106,825,468-106,825,607

ttwgagwgtgmacagatggacagctgaatatgcataaaatgtcagagcaaaatggagcat ctgtgtgctcaattgtacagtgcgtgcttccacttctccctggaggtggtggaggggtct tgggctgcagcagctgagggtgtctcccagattccttcttgcagaaggggaggacagcgg ccgcttgaattcgaacgcgaga

CS5-9 chr17:69,607,290-69,607,362

twrtagatctcgcgttcgattcagcggccgctgtcctgtgagataagctttacctgcaag actgtggcctttgaagcttgtgtaaaactgttgggaggcaagtacagcatattcagctgt tccatctgttgtcac

S5-Trunc-2 Chromosome 4 contig

chr4:36,968,937-36,969,003

Ctggtaggagwcttctagaagwagtggtcctctcgcgttcgaattcaagcggccgctgtcgcagcaatactggtgcataaaatgaaggtcagc ttagctataagttaatgaaatcggtgtagatattagcatattcagctgttccatctgttgtcacggaccactatcttgctgaaaaactcgagccatccgg aaga

S5-Trunc-4 Chromosome 7 contig

chr7:40,856,401-40,856,575

Atgtacgagwcttctagaagwagtggtcctctcgcgttcgaattcaagcggccgctgtccatgaacagaacagcaaggcgagtcatccaattt tgtgaatccaagtctgctgttcttcactccttcccattctcctttcatttattaagtcacccatccacacattcggttatacattaccttacttgatcatctga atgtcaagtttatgccggcctctgtgcctggtatttagcatattcagctgttccatctgttgtcacggaccactatcttg

S5-Trunc-6

Repetitive all over including UN contig

Atggtaggagwcttctagaagwagtggtcctctcgcgttcgaattcaagcggccgctgtccttcagtgtgcatttctcatttttcacgttttttagg cgctagacgtcgtctagcgcgcatattcagctgttccatctgttgtcacggaccactatcttgct

S3-Trunc-1

chr6:31,523,587-31,523,651

Atgtaggagwcttctagaagwagtggtccgtgacaacagatggaacagctgaatatgtaaacaaacaaggctccttcagagcagcctgaca ggacttttcagtggtcaaggtgaatagggtggacagcggccgcttgaattcgaacgcgagaggaccactatcttg

S3-Trunc-2

repetitive all over genome

Attgtaggagwcttctagaagwagtggtccgtgacaacagatggaacagctgaatatgctgcacccaaccaatggacagaagcagctgact gctggtgttgaatttggaatggctgaaagaagctgaggagaagggcaaccctgtagcaggacagcggccgcttgaattcgaacgcgagaggaccac tatcttg

S3-Trunc-4

repetitive all over genome

Cggattgtaggagwcttctagaagwagtggtcctctcgcgttcgaattcaagcggccgctgtccagtctatttggaattctgtaggcttcttttat gttcatgggcatgtcattctttaggtttgggaagttttcttctgcatattcagctgttccatctgttgtcacggaccactatcttg

Chromocentre off body Targeting:

NUCS4-1

chr8:16,263,798-16,263,991

twkcagwgtgacacagatggacagctgaatatgctactgcaaaggcagaaagcgtgaaccttgagtagagggcagcatcagaagcctccct gggatgcttcaatgctgagcattccttaaactgcatcagcaagtgggcatcttgtttgtgaccaggacttgggcagtgagagcttcatctgggctgcatt aagtcatgattgaagtggccagtgrwgkrgctagaaa

NUCS4-2

chr4:80,004,091-80,004,358

Gwtctcgcgttcgattcagcggccgctgtcccttctaggagtctgcctaatagtccaaatcattacaggtcttttcttagccatacactacacatc agatacaataacagccttttcatcagtaacacacatttgtcgagacataaattacgggtgactaatccgatatatacacgcaaacggagcctcaatatt ttttatttgcttattccttcatgtcggacgaggcttatattatggatcatatacatttatagaaacctgaaacattggagtacttctactgttcgcagtcagc atattcagctgttccatctgttgtcac

NUCS4-5

repetitive all over including UN contig

tggaacagctgaatatgccatggaaaatgagaaacatccacttgacgacttaaaaaatgacgaaatcactaaaaaacgtgaaaaatgagaaa tgcatcttgaaagacctggaatatggcgataaaacttaaaatcacggaacatgagatatacacactttaggacgtgaaatatggcgaggaaaactga aaaaggtggaatatttagaaatgtccactgtaggacgtgaaatatggcaagaaaactgaaaatcatggaaaatgataaacatccacttgacgacttg aaaaatgacgaaatcactaaaaaacgtgaaaaatgagaaattcacactgaaggacagcggccgcttgaattcgaacgcgaga

NUCS4-6

chr18:45,531,078-45,531,145

Tkcagawcacagatggacagctgaatatgccacagagatgcattgactcctcaagtctacctttagggttttgaagtgagattatgtgtaatata ggacagcggccgcttgaattcgaacgcgaga

NUCS4-7

chr17:39,844,032-39,844,143

Cwgagatcacagatggacagctgaatatgcccccagccaccaccacacaaccggagccacatggctccgcagcaacggcaggacgacagac aggctctgccccgcgtgatccctccccgaactcggagcggggaggcgcgggcagcggccgcttgaattcgaacgcgaga

NUCS4-8

highly repetitive including UN contig

agatggacagctgaatatgccatggaaaatgagaaacatccacttgacgacttaaaaaatgacgaaatcactaaaaaacgtgaaaaatgaga aatgcatcttgaaagacctggaatatggcgataaaacttaaaatcacggaacatgagatatacacactttaggacgtgaaatatggcgaggaaaact gaaaaaggtggaatatttagaaatgtccactgtaggacgtgaaatatggcaagaaaactgaaaatcatggaaaatgataaacatccacttgacgact tgaaaaatgacgaaatcactaaaaaacgtgaaaaatgagaaattcacactgaaggacagcggccgcttgaattcgaacgcgaga

NUCS4-10

highly repetitive including UN contig

tkgtgatctcgcgttcgattcagcggccgctgtcgtcacgaactccagcaggacgatgtcccagcctatagtgagtcgtcaagtggatgtttctca ttttccatggcatattcagctgttccatctgttg

>201815 (OB1-1)

chr5:117,111,525-117,111,605

ttrcagatctcgcgttcgattcaGCGGCCGCTGTCGTGCTGCTCTGCCAGCTTTTTACTGGCTGAGTCCCGCCTGCTTGA AACTTGCTCTCACAGACTCTTCTCCGTGGATGTGCATATTCAGCTGTTCCATCTGTTGTCAC

>201816 (OB1-2)

chr8:16,263,811-16,263,991

TKGAGATGMACAGATGGACAGCTGAATATGCTACTGCAAAGGCAGAAAGCGTGAACCTTGAGTAGAGGGCA GCATCAGAAGCCTCCCTGGGATGCTTCAATGCTGAGCATTCCTTAAACTGCATCAGCAAGTGGGCATCTTGTTTGT GACCAGGACTTGGGCAGTGAGAGCTTCATCTGGGCTGCATTAAGTCATGATTGAAGTGGCCAGTGAATTGTAATA CGACTCACTATAGGCTGGGACATCGTCCTGCTGGAGTTCGTGACGACAGCGGCCGCTTGAATTCGAACGCGAGA

>201819 (OB1-5)

highly repetitive all over genome

GCCTATAGTGAGTCGTATTACAATTCACTGGCCAGTGTGAGTCACAGTAGTCTCGCGTTCGAATTCAAGCGGC CGCTGTCCGGGGACAGCCGGCCACCTTCCGGACCGGAGGACAGGTGCCCGCCCGGCTGGGCATATTCAGCTGTTC CATCTGTTGTCAC

OB1-4B

highly repetitive all over genome

Garatgtgmacagatggacagctgaatatgcccagccgggcgggcacctgtcctccggtccggaaggtggccggctgtccccggacagcggc cgcttgaattcgaacgcgaga

OB1-10B

chr4:93,342,217-93,342,303

Araagawgattcagcggccgctgtcctgctgttgtgcctgtcactggctgccatgcttcctcaccatgatggactcc ttcttctctagaaccagaagggcacataaactttttcttctgcatattcagctgttccatc

OB1-2B

chr5:117,111,525-117,111,605

tctcgcgttcgaattcaagcggccgctgtcctgctgctctgccagctttttactggctgagtcccgcctgcttgaaacttgctctcacagactcttctc cgtggatgtgcatattcagctgttccatctgtt

OB1-9B

Mouse alpha satellite sequence

Cagatgacacagatggacagctgaatatgcacgacctgaaaaatgacggaatcactaaaaaacgtaaaaaatgagaaatgcacactgaagg acagcggccgcttgaattcgaacgcgagact
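The reads above carry the same flanking sequences in either orientation: for example, S5-Trunc-2 contains tctcgcgttcgaattcaagcggccgc, while S3-Trunc-1 contains its reverse complement, gcggccgcttgaattcgaacgcgaga. A minimal Python sketch for checking orientation is given below. The function names are hypothetical, and the motif strings are simply read off the sequences in this appendix; they are assumptions, not the designed oligo sequences.

```python
# Complement table covering the IUPAC ambiguity codes (R, Y, K, M, S, W, B, D, H, V, N)
# that appear in the Sanger base calls listed in this appendix.
_COMPLEMENT = str.maketrans("ACGTRYKMSWBDHVN", "TGCAYRMKSWVHDBN")


def reverse_complement(seq):
    """Return the reverse complement, ignoring whitespace and letter case."""
    cleaned = "".join(seq.split()).upper()
    return cleaned.translate(_COMPLEMENT)[::-1]


def motif_orientation(read, motif):
    """Report whether an exact copy of `motif` occurs in `read` as written
    ("forward"), as its reverse complement ("reverse"), both, or neither."""
    cleaned = "".join(read.split()).upper()
    motif = "".join(motif.split()).upper()
    hits = []
    if motif in cleaned:
        hits.append("forward")
    if reverse_complement(motif) in cleaned:
        hits.append("reverse")
    return hits
```

For instance, `reverse_complement("TCTCGCGTTCGAATTCAAGCGGCCGC")` returns `"GCGGCCGCTTGAATTCGAACGCGAGA"`. Exact matching is deliberately strict here, so a read with an ambiguous or miscalled base inside the motif will report no hit.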

A.2 Lac array targeting sequences

probe sequence probe signature

linker sequence linker signature

genomic

S15-2

chr19: 41,311,117-41,311,162

GTACTCTCGCGTTCGAATTCNAGCGGCCGCTCTACCCACAAAAGCNCNCTTCTTATCACGTCTCCAGAGGTGC AGCTTCGCATATTCNGCTGTTCCATCTGTTGTCACGTACAC

S15

chr2: 89,479,134-89,479,211

GTACTCTCGCGTTCGAATTCAAGCGGCCGCTCTACCTTCACTAACTGTCACACCCACTGGGAATCCATAGGAG GCCAGTGGCCTCGAGACCTATCATCCCCCAAACCCTCTGCATATTCAGCTGTTCCATCTGTTGTCACGTAC

S7-7

chr6: 54,885,264 - 54,885,325

GNACAGTTCGTGACAACAGATGGAACAGCTGAATATGCTCATGNNNTTAACTACTCTGACCCTTCAGGAATTC TTCCTGGACAATGTGGTCATGGTATGTAGAGCGGCCGCTTGAATTCGAACGCGAGACTACTGTGACTCACACT

S19-6

chr X: 93,750,376-93,750,423

AGTGTACGTGACAACAGATGGAACAGCTGAATATGCTGCATGCAAGTGNATGNNCGAGGTTAACTTTGGTGT TGTGTCTCAGGTAGAGCGGCCGCTTGAATTCGAACGCGAGAGTACTGGATCTTGCTGAAAAACTCGAGCC
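Each mapped clone in this appendix is annotated with a coordinate string such as `chr4:93,342,209-93,342,303`, sometimes written with extra spaces (e.g. `chr X: 93,750,376-93,750,423`). The following Python sketch shows one way to turn such strings into numeric intervals. The helper names and the regular expression are hypothetical, inferred only from the notation used here, and the length calculation assumes that both endpoints are inclusive.

```python
import re

# Assumed pattern: "chr", a chromosome name, a colon, then two
# comma-grouped integers separated by a hyphen, with optional spaces.
_REGION = re.compile(
    r"chr\s*([0-9A-Za-z]+)\s*:\s*([\d,]+)\s*-\s*([\d,]+)", re.IGNORECASE
)


def parse_region(text):
    """Return (chromosome, start, end) for the first coordinate found in `text`."""
    match = _REGION.search(text)
    if match is None:
        raise ValueError(f"no coordinate found in {text!r}")
    chrom = "chr" + match.group(1)
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    return chrom, start, end


def region_length(text):
    """Interval length in bases, assuming inclusive endpoints."""
    _, start, end = parse_region(text)
    return end - start + 1
```

Under these assumptions, `region_length("chr X: 93,750,376-93,750,423")` gives 48, which can be compared against the length of the genomic insert between the flanking linker and probe sequences in the read.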

S7-9

chr18:25,854,179-25,854,288

NNNNNNNNNNTTCNNNAAANNNAGTGTACaGTTCGTGACAACAGATGGAACAGCTGAATATGCGTTTCCG GCattcCACCAAACCAGATGCCAGGAAGAGAGTATTTCCTCAGTGCCaATTAgCTATGCAGGCTCCAGGACTCCTGT CAAATTAAACTCCAACTCTAATCAGTAGAGCGGCCGCTTGAATTCGAACGCGAGACTACTGTGACTCACAC

S7-9

chr18:58,676,989-58,677,039

NNNNNNNNNNNNGTTTTTNNNNNNNANNNNANNGTTCGTGACAACAGATGGAACAGCTGAATATNNNN NNNNGNCTCTATGCAAACACTTAGAAGCCTGTTCTAAAAGCTTTNNATTTTCAGTAGAGCGGCCGCTTGAATTCN NACGCGAGA

S15-1

chr13:18,043,338-18,043,390

NNNNNNNNNNNNNNNNCTTCNANAANGNNGTGTACAGTTCGTGACaACAGATGGAACAGCTGAATATGC CCACAGcccacacacCtTCAAAGGAACTGCCTTCCATTCTGCAAGTTTAAGtaCaGCGGCCGCTTGAATTCGAACGCGA GAGTACTGGATCTTGCTGAAAAACTCTATGAT

S15-7

chr3:16,514,678-16,514,750

NNNNNNNNNNNNNNNNNNCTTCNANAAGNNAGTGTGAGTCACAGTAGTCTCGCGTTCGAATTCAAGCGG CCGCTCTACtTgcagtGcAGAACTGAATAGAAGTAATGATGACATCGTCACCTGATCTCAGTGGAAAAGTGTTTAATG TTCATATTCAGCTGTTCCATCTGTTGTCACGAACTGTACACTATCTTGCTGAAAAAC

S15-5

chr3:96,332,866-96,332,911

NNNNNNNNNNNNNNNNNNCTTCTANAAGNNAGTGTACAGTTCGTGACAACAGATGGAACAGCTGAATAT GCCCAAAAGCtCCAAAgAcTTTGGGGATCAGGCCTTTCACTCTTGTACAGCGgCCGCTTGAATTCGAACGCGAGAGT ACTGGATCTTGCTGAAAAACTCGA

S15-3

chr3:103,710,164-103,710,241

NNNNNNNNNNNNNNNNCTTCTNNNNNNCCAGTACTCTCGCGTTCGAATTCAAGCGGCCGCTGTACTAGTA TAAGGaccacaAttcAGATCCCCAGCATCCAGGAGATGCTGTGCAGATGTGGCTGCAGGGGAGTGGTTCTGGCATAT TCAGCTGTTCCATCTGTTGTCACGAACTGTACACTATCTTGCTGAAAAACT

S7-8

chr3:140,521,798-140,521,918

GNNNNNNNNNNNNTTTTTNNCAAGNNTCTCGCGTTCGAATTCAAGCGGCCGCTCTACATGGACCAAACCAA NGNAAAACTGTCTTTAAGTATGGTCACAATCAGAACATAGCAGCTGACTGTAACAAGCCACAAAGTTTTAAAAGA GAGGAAATGTTTATGAGATGAGATAAGCATGCATATTCAGCTGTTCCATCTGTTGTCACGAACTGTAC

S7-10

chr3:89,177,726-89,177,906

NNNNNNNNNNNTCNNNTTTTTNGCAAGANAGTGNACAGTTCGTGACAACAGATGGAACAGCTGAATATGN NNNTNNGNNTTTTGACCCAGGGTAGAAGTGTCGCTATGTGAANNNGNNGGNAGCTCGAGACCAGTGAACATGG GGAGCTAAAGGGTTGCGGGCTCTGCCCCTTTAATGAGCGGGTTCTAGCTATGTAGAGTGATAGACTTGAAGGCAA ATGTTGGGAACAACGGCGTGGCACCAAAGGCTCAGTAGAGCGGCCGCTTGAATTCGAACGCGAGAGGGCCGCTC T

S19-2

chr3:133,912,647-133,912,690

CCNGTACTCTCGCGTTCGAATTCNGCGGCCGCTCTACCGAGTTANANNNTNNNCTGGAACCTATGAGGCCCT GCCAAGAGCATATTCANNTGTTCCATNTGNTGTCA

S7-1

chr9:98,451,234-98,451,389

CAGTTCGTGACAACAGATGGAGCAGCTGAATATGCTGGAGCCTTNANNCTAGGTTGGCAGCAGCTCGCATCT TGGTGACTGCTCANNCNTNTTTGTGGCCCGTTAGCCCCTCTCACCTCCAGGGCCAAGCTACTGATGCTACCACAAC TTCAAACCCTGCCNGACCAACATGTGGATGTCCTTGTCATGGGN

S7-5

chr9:72,912,886-72,913,007

CTCGCGTTCGAATTCaAGCGGCCGCTCTACATAGGGGATCGTGTCCTTGCAaagcTGTTGGGTTTGGGACACCA GAGCAGGAGAATTTTgCtccagcCTGCATCAGCCTGCCCTCAGCCTGCAAAACCACCTTTCCCTTTGCTCCACAGCGC ATATTCAGCTGTTCCATctgttGTCACGAACTGTAC

S7-2

chr1:127,218,931-127,219,019

gtACAGTTCGTGACaaCAGATGGAACAGCTGAATATGCCAATAGGgAAgaAacaatGGGAGAGCCCTTGCTCAG CAAGGTGGAAGGCGAGAATacTCTCGAATGTATGTCCACATACATGCCATGGTAGAGCGGCCGCTTGAATTCGAA CGCGAGATT

S7-8

chr1:52,690,879-52,691,044

GTACAGTTCGTGACAACAGATGGAACAGCTGAATATGCTCCACTGgtGatGAACTTGTTCTGACTAGCTTCTCTT TCTCACCCATCAgCaCAGATAAAACAAATCTAAACATCGAAAGCCACTAACTGTGATTCCTAAAGAAAAGCTCTAAC TGGCTCACTTTATTGCCTACTTAATAATTCCATCCCAGACCCACCCAATTGTAGAGCGGCCGCTTGAATTCGAACGC GAGACTACTGTGACTCAC

S7-3 (from June7th repeat)

chr4:78,428,661-78,428,795

ANNNNTCTCGCGTTCGAATTCNAGCGGCCGCTCTACTTAACAATGCTACTGTNAATAAANNGAGCAAGAAAT AGCTATTAATCTTCATTAGAATTGNNNNNAGNCTCCACTCTAGATACCATGGACAGGAAAGCCATAAGCATAGAA CCTTCTAGTATATTATTACATAGNATATTCAGCTGTTCCATCTGTTGTCACGAACTGTAC

S7-1

chr4:66,483,770-66,483,851

GTACAGTTCGTGACAACAGATGGAACAGCTGAATATGCAGAAGGGAGGAGGGCAGGGGATGGAGTAAATG GAGACAGAGCTGTAAGCATGCAGAGAAGTGCGGCCTCATAACAAGAATGTAGAGCGGCCGCTTGAATTCGAACG CGAG

S7-4

chr17:12,672,141-12,672,345

NNNNNNNNCNNNNNGANNNNAGTGtACaGTTCGTGACaACAGATGGAACAGCTGAATATGCATCAGAgGA caaCTtGCaGGAGATGATCTTCTCCTCCACTtCCAGGGaCTGAacTCAGGTGGtCAGGcTTGGtGGcGAGGACCTCTA CCAGCTGAaCCAGCTCACCAGCCCTTCATTCGTTTACCCTTATGTGCACCaagCTAgCTGGCCCACAAgGTTTGAGGG ACTCTTTCCATTTCCCATCTCACTAGAGGAGCAATGGGGTAGAGCGGCCGCTTGAATTCgAACGCNNNA

S7-6 (June7th resequenced)

chr8:88,667,705-88,667,773

CGNGTTCGAATTNNAGCGGCCGCTCTACNNNNNNNACNNATNNNNNNTNNCNNGCTCAGCACCCATCCAT CCCTGGAATTTTTAAATGNTCCTTTTTTTTGATGATTAAGAACGTGCATATTCAGCTGTTCCATCTGTTGTCACGAAC TGTAC

S7-3

chr12:116,323,756-116,323,839

GNACAGTTCGTGACAACAGATGGAACAGCTGAATATCNTNNNNNNCCANNTCANNCNCNNNNNTNCCNAC NNNNCTANNNCNNANAAATTCTGCATTACNTANNNNGANTGGAATTAANCATCCTCTTANCTATCTATTANNATT ANKCNTTGCAAAGNAGAGCGNNCGCTTGANTTCNANNNNNANAA

S7-4

chrM:12,492-12,619

NNNCAAGANTCTCGCGTTCNNNTTCNAGCGGCCGCTCTACAATAGTAGNNNNNNNNNNNNCCTACTGGTC CGATTCCACCCCCTCACGACTAATNNNNNNTTTATTTTAACAACTATACTTTGCCTCGGAGCCCTAACCACATTATT TACAGCTATTTGTGCTCTGCATATTCAGCTGTTCCATCTGTTGTCACGAACTGTAC

S7-12

chr12:49,827,033-49,827,094

CGTGACAACAGATGGAACAGCTGAATATGCTANTGNNTAGTTCATATTGTTGTTCCACCTATAGGGTTGCAGA CCCCTTTAGCTCCTTGGGTAGAGCGGCCGCTTGAATTCGAACGCGAGA

From June 7th 11 OB-3 sequencing

OB3-2

chr11:121,483,125-121,483,193

NNACAGTTCGNGACAACAGATGGAACAGCTGAATATGCNATGACCAGACATTGGTCACCAGCCACACAGAT GTTCAAAGCTTCATCAGTTCAGTTTCCTAGCCCACGTAGAGCGGCCGCTTGAATTCAACGCGAGA

OB3-3

chr3:119,107,483-119,107,538

TTNNNNNNANNNANNNNACAGTTCGTGACAACAGATGGAACAGCTGAATATGCNNNNTTATAGAGAGAGA GCTATGTTTGTCTTTCCTCCTCAACCAGCTGTCATAGTAAATGTAGAGCGGCCGCTTGAATTCGAACGCGAGAA

OB3-2

chr3:118,144,797-118,144,897

GTACAGNTCGNGACNACAGANGGAACAGCTGAATATGCANTTGTANGACAGCTNNGANCTCNNCNCCNCA NTTTNGATCNNAANNTANTTTANNAAANCCATAACATGGCGNTTNGNNGNNTGNTCCCNATGAGGCTGGNAGA GCGGCCGCNTGANNTCNAACGNGANACTACTGTGACTCNCACT

OB3-10

chr3:16,534,429-16,534,521

NNNNNNNNNNNTTTTTNNNCAAGNNAGTGNACAGTTCGTGACAACAGATGGAACAGCTGAATATGCNCTA GCATAGAACATTCTACTTAAATGAGGCTGGTCTTTTATTCNCAGACATCTACTTGCATGTCCCTCCCAAATGATAGA ATTAAGGGCATGTAGAGCGGCCGCTTGAATTCGAACGCGAGAATCTTTCTAGAAGATCTCCTACA

OB3-8

Highly repetitive over genome

CAAGNNTCTCGCGTTCGAACTCAAGCGGCCGCTCTACCAGGATACAGGTGAATCCAATTTGGTGGAGATTTG CCCCTGCTGCCCTGATTAGCTGCATATTCAGCTGTTCCATCTGTTGTCACGAACTGTAC

OB3-12

highly repetitive over genome

Repeat element

GNACAGTTCGTGACAACAGATGGAACAGCTGAATATNNTAGNNNNAGGAGAGGCGTGTCTTACACCCGGAT TGGTTATGCNCNNCGCCTCATTTGCATGTTCCTCATCTGATTGGCTACTCTCTCTCAGTANAGCGGNCGCTTGAATT CGAACGCGAGAATCTTTCTAGAAGATCTCCTACA

OB3-5

Highly repetitive all over genome

GNACAGTTCGTGACAACAGATGGAACAGCTGAATATGCTCAGCTATTAAAAAGAATGAATTAATGAAATTCCT AGCCAAATGGATGGACCTGGAGGGCATCATCCTGAGTGAGGTAACACATTCACAAAGGAACTCACACAATATGTA GAGCGGCCGCTTGAATTCGAACGCGAGA

OB3-7

chr6:128,171,948-128,171,987

CCNGTACTCTCGCGTTCGAATTCAAGCGGCCGCTCTACAATGGGATCaCTGgcacCACAGGATCAGTCAGTTCG CTGCATATTCAGCTGTTCCATCTGTTGTCACGTACACT

OB3-6

chr6:27,744,426-27,744,512

CGCGTTCGAATTCNAGCGGCCGCTCTACAGGTAGGCACNTNNNNGNAAGAGATGATTAAGAGGCCATAAGG GAGTGCTGTTGACTGACTTATGCCTTATGGTTTGCCCAGCCTGCATATTCAGCTGTTCCATCTGTTGTCACGAACTG TAC

OB3-4

chr8:9,108,288-9,108,340

AGTGTACAGTTCGTGACAACAGATGGAACAGTGNATATGNCNCANNGTNCGNAGATGTGANGCCCTCTTTC AGAATTGNCTGNGNNCTACAGTAGAGCGNCCGCTTGAATTCGAACGCGAGAATCTTTCTANAAGATCTCCTACA

Chromosome 18

chr18:3,759,938-3,759,997

AGTTCGTGACAACAGATGGAACAGCTGAATATGCNNCCNTTTGTGACTGATGGCCACAAATCTATGGGGGAC TGTGGCTAGTGATAAGGGCCTGGTAGAGCGGCCGCTTGAATTCGAACGCGAGAATCTTTCTAGAAGATCTCCTAC

A.3 Histone locus Body targeting sequences

probe sequence

signature sequence

S3-2 Sept. C

FishBAC: Rp11-836G11

chr6:29,843,347-29,843,665

Nice Chrom. 6 hit with signature sequence on one side

NNNNNNNNNNNNNNNNNNTCNNNNNNNNNNNNAGANGGNGGNNNTNNNNNTGTCNGTTAACTAGCC CNNTCTATTNNNNNNNNNNGGNCCATCCCTTCGTTTCCTATAAGGGATACTTNNNNNTNANTNAATATCCATAG AAACGATGCTAATGACAGGTTTGCTGTTAATAAATATGTGGATAAATCTCTGTTCCGGGCTCTCANCTCTGAAGGC TGTGANACCCCTGATTTCCCNCTTCNCNCCTCTATATTNCNGTGTGTGTGTCTTTAATTCCTCTAGCACCACTGGGT TAGGGTCTCCCTGACCGAGCTGGTCTCANCAGGAGACTCCATCCTGAGTCTTCTGATACATATNCCTTTGTATTCCT TCAAANATTTGTGCAATAAGTATATATATATATACAGCTAACTGACAGCAAGANTCCCACCNNNTT

S3-3 Sept. E

FishBAC:Rp11-609G19

chr10:6,125,167-6,125,236

Chrom. 10 with signature sequence – rest seems to have been cut off during cloning

(have previously cloned this S3(4) from Oct. 25th sequencing)

NNNNNNNNNNNNNNNNCGNNTTTTTNNNNNANGGTGGGANTCTTGCTGTCAGTTAGCTGGGGGAGCAA GGCGNGNNNTTTCANAGTACGGCTTTGGAAGCTGGGAACTCTGGGTCCNGGTATTTATGACATCTTTCTAGAAGA TCTCCTACAATATTCTCAGCTGCCATGGAAAATCGATGTT

1S8-2 A

FishBAC: Rp11-12N18

chr7:99,981,091-99,981,248

NNNNNNNNNNNNNNNNCNNNTTTTNNGCAAGANAGTGGGANTCTTGCTGTCAGTTAGATCGGAAGAGCA CANNNNNNNNNNNNNGTCACGCCAATATCTCGTATGCCGTCTTCTNCNNNNAANNNACAGCAAGAATCCCAAG TGGGATTCTTGCTGTCAGTTANACATGGGACAAGTGTGGCTGAAGACAGNNNNGAGGNANAGGGTGGAAGGAG GCGGGGAAGATCCTGTGTGGCTTTGAAAGCCAAGGAAACNACTCACTTTTGCTCNGAGTGNCATGGNAGNGANG GAGCGTTTCCTGCAGGAGTGCNGAGATCTGACTCAGCTAACTGACAGCAAGAATCCCNCTATCTTTCT

1S8-3 A

FishBAC:Rp11-642E22

chr3:10,247,195-10,247,466

NNNNNNNNNNNNNNNNNNNNNTTTTTNNNNNNTAGTGGGANTCTTGCTGTCAGTTAGCTGAATTCCAGT GTTATNNNNNNGNCTTGTGTTAAACAGTAATATCCATGAAGTCCCNNNTTCTTCTACACATGATCTATATTTATGC TAACACCCAGGTTCTGTGTACCACTGCAGGTGTGGGCCCCACACTGGGAACCGCTGCTCCAGACTAACTTGAGTTC CATCCTGCTGCCAGGCCTGGCCACAGGGCAGGGCTGTCTCTGCACACAGGGTCCTTCTGTGCCCAGTAAATCTGAG GTCAGAGCTTGCCACACACAAGTCATGAGTTTCCTAACTGACAGCAAGAATCCCACTATCTTTCTAG

1S8-3 B

FishBAC: Rp11-14P15

chr6:158,956,812-158,957,050

Chromosome 6

NNNNNNNNNNNNNNNNNNCGNNTTTTTNNNNNANAGTGGGANTCTTGCTGTCAGTTAGCCAGCTAACTG ACAGGTTTTGTCCCTAAAAGTAAGAGGTTTCCCCCAGTGCAGGCAGGATNAATCTTGCGTGGAAACCTGCCCTGG GTTCAGTTGGAAATGCACTAACATCGTCTGGAAAGACTGTTTTTCCACCTATTATTGGAAAAATTCACAAGAGGAG ATCACACCACGCTTTGGAGAAGCCATTCCGTAGAAGACGTGTTAACTGGTTAACTATAGTTTGCACACTAGTGTTG AGCTTGTAGAAGGTGAGCTAACTGACAGCAAGAATCCCACTATCTTTCTAGAAGATCTCCTACAATAT

1S9-1 A

FishBAC: Rp11-24C3

Chrom. 3 hit

chr3:48,482,774-48,482,867

NNNNNNNNNNNNNNNNNTCGNNNNNNNNNNNNNNAGTGGGANTCTTGCTGTCAGTTAGCTGTAATCCG AGCACNCNNNNNNNNNGAGGCGGGTGAATCACTTGAAGTCAGGAGTTCNANACCAGCTTGGCCAACATGGTGA AACTCCGTCTCTACTAACTGACAGCAAGAATCCCACTATCTTTCTAGAAGATC

S42-3

FishBAC: Rp11-430O23

chr10:14,498,510-14,498,677

Note: Chromosome 10 hit. One side is a genuine ligation event (the other side may be mispriming of the probe), but the sequence has 87% identity to a region of chromosome 6 that lies within 200 kbp of histone genes. This chromosome 10 region may therefore contain an NPAT binding site.

NNNNNNNNNNNNNNNNNCTTCNAGAAGNTGGGANTCTTGCTGTCAGTTAGCTGAAAGGTAGAAGTTAAA AGAATAGNCNNNNNNGAAATTCACAGTCAGACAGCCCAGCACCACATTTCANACNTGGTCATTAAAAGTCAACC CCTGCCCTCACCGCTTGTGGTATCTATAGATTCTAGACACTGCATGAGGAAGCATTGTGAAATTCTCTGTTCTTTTCT TGCTGTCAGTTAGCTGCAGCTAACTGACAGCAAGAATCCCACTTATCTTGCTGAAAAACTCGAGCCA
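Deciding whether one end of a read such as S42-3 carries the expected flanking sequence is complicated by ambiguous base calls (N and other IUPAC codes), which defeat exact string matching. The Python sketch below illustrates an ambiguity-tolerant search. The function names are hypothetical, and the motif used in the usage note is only an assumption: it is the reverse complement of the sequence that recurs at the 3' end of the reads in this section, not a confirmed probe design.

```python
# Bases represented by each IUPAC nucleotide code.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "K": "GT", "M": "AC", "S": "CG", "W": "AT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def compatible(a, b):
    """True if two IUPAC codes share at least one possible base."""
    return bool(set(IUPAC[a]) & set(IUPAC[b]))


def find_ambiguous(read, motif):
    """Return every 0-based offset at which `motif` is compatible with `read`,
    treating ambiguity codes in either sequence as wildcards."""
    read = "".join(read.split()).upper()
    motif = "".join(motif.split()).upper()
    last_start = len(read) - len(motif)
    return [
        i
        for i in range(last_start + 1)
        if all(compatible(r, m) for r, m in zip(read[i:], motif))
    ]
```

Applied to the start of the S42-3 read, `find_ambiguous("CTTCNAGAAGNTGGGANTCTTGCTGTCAGTTAGCTGAAAGG", "GTGGGATTCTTGCTGTCAGTTAGCTG")` returns `[10]`, whereas an exact substring test finds nothing because of the two N calls. Note that a window consisting mostly of N would match any motif, so hits in low-quality stretches should be read with caution.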

Also from Sept.19th11 sequencing:

S7-2

chr6:28,658,264-28,658,322

NNNNNNNNNNNNNNNNCTTCTNNNNNNNAGTGGGATTCTTGCTGTCNGTTAGCTGACGTACCTACTTTCN GTACNNNNNNNNANNNACTTATCACATGGTAAGTGGCTGTCAGTTAGCTNNNTNNNNNANCTGCAGCTAACTG ACAGCAAGAATCCCACTATCTTGCTGAAAAACTCGAGC

From Oct.25th 11

S7-2

BAC: rp11-258M21

Chromosome 5

NNNNNNNNNNNNNNNNNNNNNNNAANNNAAGTGGGANTCTTGCTGTCNGTTAGCTGCTGTATTTAGATA ACCCTAAANNNNNNNNTCTTCTATATAACATAAATATAATTGTTAAGAATGNNANCACATAACTGACAGCAAGAA TCCCACTATCTTGCTGAAAAACTCGAGCCATCCGG

Sequencing of negative controls for Histone locus body LTOL targeting

probe sequence

signature sequence

NC - no cell sample

NT - not targeted sample

NC1 sept.2

mispriming ref|NT_008413.18| Homo sapiens chromosome 9 genomic contig, GRCh37.p5 Primary Assembly Length=39964796

NNNNNNNNNNNNNNNNNNNNNNNTTTTTNNNAGANAGTGGGANTCTTGCTGTCAGTTAAGGCTCTCTTA ATGTCAGttttctttgcatAAtGCCTTGAAATGAAGACTGATGAAGTTCTTCaaaTTgAtGCCGCTTATATAAACTGACCCC GATTTGCCCTATTCTGCaTGACAACTTAaCTGACAGCAAGAATCCCACTATCTTTCTAGA

NT4-2 3

ref|NG_006884.3| Mus musculus vomeronasal 2, receptor, pseudogene 16 (Vmn2r-ps16) on chromosome 4

NNNNNNNNNNNNNNNNNNTCNNNTTTTTNNNNNNNNGTGNNNNNNTTGCTGTCAgTTAGCTGcTGTCAg TTAGCTGcagttaActgacAGCCTTGATTTTGACTTTCTTTGTTCTGTAAAaCTgTgaAAtTGCTTCCACATTGGGACATA GAGTTGGGtAACTTCAATCTGTGTTCCTGGACTATGGTCACTTGAAATGGcTTCAAAATAAACTGTCCTCTAgCTAAC TGACAGCAAGAATCCCACTATCTTTCTAgAA

NT1-1 2

mispriming ref|NT_011362.10| Homo sapiens chromosome 20 genomic contig, GRCh37.p5 Primary Assembly Length=31409461

NNNNNNNNNNNNNNNNNNNNCNNNNNNNNNNCAAGNNAGTGGNNTCTTGCTGTCAGTTACAATGGCA GCAAAGAGCAGtggctaactGtTGTTTTGTGGTACTTGAGCTTCTCTACTCAgAggcTCaTCCCTCTCCTTAACTGACAGC AAGAATCCCACTATCTTTCTAGAAgA

NC1-2

mispriming gb|CP002897.1| Paracoccus denitrificans SD1, complete genome Length=2985589

NNNNNNNNNNNNGNNCGagTTtTtcagcaagatAGTGGGatTCTTGCTGTCAgTTACAAGCAgaAgACGGCATac gAgatattGgcGTGACtGGAGTTCAgACGTGTGCTCTTCCGatctaactGacaGCAAGAATCCCACTATCTTTCT

NC2-1

mispriming gb|CP002897.1| Paracoccus denitrificans SD1, complete genome Length=2985589

NNNNNNNNNNNNNNTCgaGtttttcagCAAGAtAGTGGGAtTCTTGCtGtcagttacaagcagAAgACGGCATacNA ANNttgGCGTGACTGGAGTTCAgACGTGTGCTCTTCCgatctaNNNNNNNNCAAgAATCCCAcTATCTTTCtAgA

NT4-1 1

mispriming gb|CP002897.1| Paracoccus denitrificans SD1, complete genome Length=2985589

NNNNNNNNNNNNNNNCGNNTTTTTcagCAAGAtAGTGGGatTCTTGCTGTCAgTTACAAGCAgAAgACGGCA TACgAgAtattGGCGTGACTGGAGTTCAgACGTGTGCTCTTCCGATCTAACtGACAGCAAGAATCCCACTATCTTTCTA G

NT4-3 1

mispriming ref|NT_008818.16| Homo sapiens chromosome 10 genomic contig, GRCh37.p5 Primary Assembly Length=6758678

NNNNNNNNNNNNNNNNNTCNNNNTTTTCNGCAAGANAGTGGGNNTCTTGCTGTCAGTTAAGTTCTCTGTC AGGTCNNNNTTCNGCACTGGAGTTCCCAAAAACGTGTTGATGTCTTTCNCTTCACCTACTGCTGGTTTGGGTGTGT GCATGGCTTTGCCTGCTGATGGCGTTTGTTTCCTAAATGCTAAAAATTCTTCTTCAGTGTCTGCTTTCCTGANACTTC TCTTGGGCTGTGGCTTGGAGCTTGTTGGGGTGTCCACTAGGTCTGGCTGTGAAGCTCTGTAGGATACTTTGGTANT TTTTTCATTAGTCATTGATTCCTTGGTGTGACTTGNTGTCTGGAANAGCTCGATGAAGCCGGCCAGGTCTTCAGGG ACTTCANACTTTCCCTTAGGAGTTCTCAGCTGCCTCTTGCTGCCAGTTAGACTTGCTGCTGAGTCTAACTGACAGCA AGAATCCCACTATCTTTCTAGA

NT1-2 4

mispriming ref|NT_026446.14| Homo sapiens chromosome 15 genomic contig, GRCh37.p5 Primary Assembly Length=5594590

NNNNNNNNNNNNNNNNGGNTCNAGTTTTTNNNNNNNAGTGGGANTCTTGCTGTCAGTTAGGAGACCTGA CTTAAATNNNNNNNNTGNAGGAGACTGAGCCTCATTTGGAATGGAACTTANTATATCNGGGANAAGGGCTTAAA ACATCTCTGAGGTGACTGACATTGAGGCTCTAAAAGGCAGCCTCNAAGACTANCCTGATGCTCGTTTCGTCCCATA ACTATCTGCAGGGAATATTCCAGGAAGANTTGTANAGCTTGCTTANAAAANAGTCTGTTCTGGTNANNTCACTGT GGTCNNNNNNNGNGCTCTTNNNGNANAANNGNGNNNANNN

NT4 4- 1

mispriming ref|NT_008818.16| Homo sapiens chromosome 10 genomic contig, GRCh37.p5 Primary Assembly Length=6758678

NNNNNNNNNNNNNNNGCTNNNGTTTTTNNNANNNAGTGNGANTCTTGCTGTCAGTTANACTTGCTGCTG AGTCNANNNNNNGNNTTGGAGACTCCATAAATGCTTTCATGCTCTTACCANCTCNNGTTGGCTCTGTGTGTGTGT GTGTAGTCTCTCCTGATGTCTGTGTGAGCTTGCCAACTGCTAGGAGCTCTTCTTTCACGCCCACTTTCCCCAGGGAT GTCTTGAGCCGTCGCTTGGAGCTTGCTGGGTTTTTGTCTGGGTCTGGTTGTGAANATTTGCAGGCTANTTTGGCAN TTTTATCGTTAGTCATTGATTCCTCAGTGTGACCTCGTGTCTGGAANAGCTCTTTAAAGCCAGCCAGGTCTTCTAGA GCCTGGGCCTTTTCCTTAGGAGTTTGTAGCCGTCTCTTGCTGCCAGTTAAGTTCTCTNTCAGGTCCAGTTTCTGCACT GNAGTTCCCAAAAACGTGTTGATGTCTTTCTCTTCACCTACTGCTGGTTTGGGTGTGTGCATGGNTTTGCCTGCTNA TGGNGTTTGTTTCCTAAATGCTAAAAATTCTTCTTNAGTGTCTGCTTTCCTGANACTTCNCTTGGGCTGTGGCTTGN AGCTTGTTGNGGTGTCNACNANNNCNGGNTGTGAANNTCTGNANNANACTNTGGTAGTTTTTNNATTANNNNN NN

NT1-1 Sept. 1

Mispriming unknown

NNNNNNNNNNNGGNNCGNGTTTTTNNCNNGANAGTGGGNNTCTTGCTGTCAGTTAGGGACTAAAGAAAA TGTNNNNCNNANTTTGATTTGCTTGCTGATCTTCGGTTAACTCCATNNANNANNTNTGTACTGATTCAGTTAACTG ACAGCAAGAATCCCACTATCTTTCTAGAANATC

NT3-smear 1

mispriming gb|AC114491.1| Homo sapiens chromosome 1 clone RP11-270C12, complete sequence Length=207698

NNNNNNNNNNNNNGGCNCNNGTTTTTNNCNNANAGTGGGANTCTTGCTGTCAGTTACTGAATCCCTTTTTT AAATCNTNNGTTCCCAAACCCAAGTCCTTTTGGACCACAATTCCACATGGCTCCTTTTGACAATCCTCTGGGAGGAC TGGAACCTTGGGCCTGGATCCCTTGGACTCTTCCTGATATCGCCTACTGCCTCCCTGGCAGTGCCTTCCTAAATGCC TGCTACTCTCAGCTCATCGCTGCAACACCACTGTCAGCCCTACTCTCCCACTGGTTTTCTGCTTGGCATTGATTTTGT GTTGCCAGGTTAATAGCTCTTCTACTAATACNGTGGCCAAATCCAAGGANGGGTTCCACTCAGGGCAAGTGGAAG TCAGGGTGAGCAGTGGAANAGGGGAAAGGAANACAAGGATGCACTCAAGGGGCATTTTTGAGTGCATCCATGT GTGTGAGGACAGGAGCAAAATGAAAAAAAAAATTTATCAACCTTCCANCACCAACTAATCANAATGCACATCAAC TATAATCCTGNAGCAGGTGCTAGTGGCATACTACTGTGACACGTGTTACCTCGTTATTGCTGATCCTTGGTGACAG CCCAANAGATCCCAGGGGAAGGCACTTCAGGATCTTGGTGAAGTAGCCATTTACTAACTGACAGCAAGAATCCCA CTATCTTTCTAGAANA

A.4 NB4 PML body targeting sequences

linker

genomic

signature

probe

S6 sample:

S6-12

chr20:55,320,769-55,320,811

NNNNNNNNNNNNNCNNNTTTTTNNNNNANAGTGTGAGTCACAGTAGTCTCGCGTTCGAATTCAAGCGGCC gcTCTAcACTTGGCTGTGCCACAGCCTTACTGAGGGCAAACATCCAGCAGCTAACTGACAGCAAGAATCCCACTATC TTTCTAGAAGATCTCCTACAATATTCTCAGCTGCCATGGAAAAT

S6-18

chr17:18,157,959-18,158,120

AGTGTGAGTCacagtAgTCTCGCGTTCGAATTCaAGCGGCCGCTCTACacttaggatggtccATAAACATGAGGTTGA GCACCAGCATGTTCTTGgCacaGctAACTGACAGCAAGAATCCCACT

S7-1

chr12:92,783,501-92,783,569

NNNNNNNNNNNNNNNNNCTTCtAgAAgatAGTgtGAGTCACAGTAGTCTCGCGTTCGAATTCAGCGGCCGCT CTACCCTcagtcctGtTGtTGGATCATCTGGTTGGAGGCTTCTGGCCCAGAGAacctttgtCCTCtGGAGCAGCTAACTGA CAGCAAGAATCCCACTATCTTGCTGAAAAACTCGA

S7-5

chr1:162,352,130-162,352,171

NNNNNNNNNNNNNNNNNNNNNCNNGTTTTTNNNNGANAGTGGGANTCTTGCTGTCAGTTAGCTGCCAGC TCCCGAGAANGTGGCCTGATCCTGCAGCTTTTCAGTAGAGCGGCCGCTTGAATTCGAACGCGAGACTACTGTGAC TCACACTATCTTTCTAGAAGATCTCCTACAATATTCTCAGCTGCCATGGAAAATCG

S7-9

chr1:38,902,652-38,902,694

NNNNNNNNNNNNNGNGTCACAGTAGTCTCGCGTTCGAATTCAAGCGGCCGCTNTNNNNNNNAGGAGTGA GCTTGTTAATGAGCTAAACTGAACCCCTGCCACCTCAGCTAACTGACAGCAAGAATCCCACTATCTTGCTGAAAAA CTCG

S7-11

chr1:183,052,564-183,052,613

NNNNNNNNNNNNNNNNCTTCNANAANNNAGTGNGAGTCACAGTAGTCTCGCGTTCGAATTCAAGCGGCC GCTCTNCNTANANNAAAGTTTCAGTCTCAGGACATACCAGCATACATCAGGTAGCAGCTAACTGACAGCAAGAAT CCCACTATCTTGCTGAAAAACTCGAGCCATC

S9-7

chr5:37,765,867-37,765,908

NNNNNNNNNNNNNNGGGNNTCTTGCTGNCAGTTAGCTGCTTCCAGTGACTCCTGNGNNTNNNGGGNGAC CAAGAGAGGTAGAGCGGCCGCTTGAATTCNNNNGCGAGACTACTGTGACTCACACTATCTTGCTGAAAAACTCGA GCCGTGGCGCGGTATTATCCCGTATTGACGCCNGGCAAGAGCAACTCGGTCGCCGCATACNCTATTCTCAGAATG ACTT
