The Pennsylvania State University

The Graduate School

The Huck Institutes of the Life Sciences

INTERACTIONS BETWEEN THE COHESIN COMPLEX AND ITS

DNA PARTNERS:

A STRUCTURAL AND PHYLOGENETIC APPROACH

A Thesis in

Integrative Biosciences

by

Alexandra Surcel

© 2007 Alexandra Surcel

Submitted in Partial Fulfillment of the Requirements for the Degree of

Doctor of Philosophy

August 2007

The thesis of Alexandra Surcel was reviewed and approved* by the following:

Hong Ma
Professor of Biology
Thesis Advisor
Chair of Committee

David S. Gilmour
Associate Professor of Molecular and Cell Biology

William O. Hancock
Assistant Professor of Bioengineering

Wendy Hanna-Rose
Assistant Professor of Molecular and Cell Biology

Douglas Koshland
Special member
Senior staff member of the Carnegie Institution of Washington, Department of Embryology
Investigator, Howard Hughes Medical Institute

Peter J Hudson
Director of the Integrative Biosciences Graduate Degree Program
The Huck Institutes of the Life Sciences

*Signatures are on file in the Graduate School

ABSTRACT

Cohesin is an evolutionarily conserved complex responsible for maintaining sister chromatid cohesion from early S phase to the metaphase-anaphase transition. The cohesin complex comprises four proteins: Smc1 and Smc3, which heterodimerize, and Scc1 and Scc3. Imaging of the Smc heterodimer shows that it forms a V-shaped molecule, and imaging of the entire cohesin holocomplex suggests that it forms an enclosed ring structure. The cohesin complex binds to specific loci along the chromosome arms and at centromeres, known as Cohesin Attachment Regions (CARs). In the absence of consensus binding sequences at CAR loci, several models have been proposed for cohesin interactions at CAR sites, though no direct structural information about the in vivo interaction of cohesin at CARs has been obtained. This thesis presents the first documented successful isolation and imaging of cohesin-chromatin complexes assembled in vivo. A minichromosome containing a CAR sequence and a CEN3 sequence was isolated from Saccharomyces cerevisiae using the Minichromosome Affinity Purification (MAP) method. The MAP protocol had to be overhauled to obtain a high yield from this low-copy centromeric construct. Adjustments to MAP resulted in a 100-fold increase over the yield obtained with the published protocol. Samples were isolated from G1-synchronized cultures, in which cohesin is not bound to CARs, and from M-phase-synchronized cultures, in which cohesin is bound to CARs. These MAP-isolated samples were negatively and positively stained for TEM analysis. Images showed that replicated minichromosomes always interact with one end of a flexible rod. Length measurements of this protrusion are consistent with a collapsed cohesin ring, and width measurements of the rod suggest that multiple cohesins interact at CAR loci. These images lead to a model in which conformational changes within the cohesin complex may be responsible for the topological binding of cohesin to chromatin.
Phylogenetic analyses of the Structural Maintenance of Chromosome (SMC) family of proteins were undertaken in an effort to identify conserved sequences within the arms of Smc1 and Smc3 that may explain the interaction observed via TEM along the coiled-coil domains of these molecules. Additional disruptions to the coiled-coil domains among SMC members were identified. These disruptions further support an alternative to the evolutionary history of these proteins that is presented in phylogenetic trees based solely on sequence alignments. In addition, phylogenetic analysis of the meiotic form of Smc1, known as Smc1β, suggests that it arose via a gene duplication event early in animal divergence, even though it has been maintained only in vertebrate lineages. This study of Smc1β lends support to the evolution of various meiotic regulatory mechanisms among animals.

TABLE OF CONTENTS

LIST OF FIGURES

LIST OF TABLES

LIST OF ABBREVIATIONS

Chapter 1 Introduction: Cohesin Structure and Function

    Eukaryotic Cell Cycle
    Cohesin structural components
    SMC1 and SMC3
    SCC1 and SCC3
    Cohesin mechanics
    Loading of the mitotic cohesin complex
    Binding of the mitotic cohesin complex to DNA/chromatin
    Dissolution of the mitotic cohesin complex
    Cohesin in meiosis
    Other proteins essential for cohesin function
    Other roles for cohesin
    Overview of thesis

Chapter 2 The Construction of CAR-Containing Minichromosomes and the Development of the Minichromosome Affinity Purification (MAP) Technique for Low-Copy Plasmids

    Abstract
    Introduction
    Results
    pCM26-1 plasmid generation and characterization
    pAS1 plasmid generation and characterization
    Changes to MAP methodology for pCM26-1
    Additional considerations for the MAP technique
    Protein analysis of isolated minichromosomes detects presence of cohesin
    Discussion and significance
    MAP methodology
    Important cis-elements needed for cohesin binding constructs
    Improvements to the MAP technique
    Material and Methods
    Strains
    Plasmids
    Cell Synchronization
    Smash and Grab DNA isolation from yeast
    Minichromosome Affinity Purification (MAP)
    LacIZ Column Preparation
    Chromatin Immunoprecipitation (ChIP), PCR, and Data Analysis
    Western blot

Chapter 3 Cohesin Interaction at One CAR Locus Shows a Flexible Rod Multi-Complex Structure

    Abstract
    Introduction
    Results
    TEM analysis from alpha-factor arrested cells shows a singular minichromosome
    TEM analysis shows a flexible rod protruding from replicated minichromosomes
    Nucleosome mapping at several CAR loci shows well-positioned nucleosomes and no protein footprint
    Discussion
    Characterization of the pCM26-1 minichromosome
    Characterization of cohesin bound pCM26-1 and assessment of binding models
    Cohesin binding may involve conformational changes to the protein ring
    Materials and Methods
    Strains and Plasmids
    Transmission Electron Microscopy Sample Preparation and Analysis
    Immunogold labeling
    Analysis of TEM images
    Nuclei Isolation and Nucleosome Mapping

Chapter 4 Phylogenetic Analysis of the SMC family and Characterization of the SMC1 clade

    Abstract
    Introduction
    Tools for Phylogenetic Analyses
    Phylogeny of the SMC family of proteins
    Results
    Phylogenetic analysis of SMC family
    COILS analysis of SMC proteins identifies additional interruptions to coiled-coil domains
    Phylogenetic analysis of SMC1
    Discussion
    Implications for structure based on the phylogeny of the SMC family of proteins
    Possible origin of SMC1β and the evolution of meiotic cohesin function
    Materials and Methods

Chapter 5 Conclusions and Future Direction for the Study of Cohesin-Chromatin Interactions

    MAP method – a viable approach for isolating low-copy plasmids
    TEM analysis of cohesin-bound minichromosomes and mapping data pave the way for future interaction analyses
    Phylogeny of the SMC family – an important tool for deciphering structure-function relationships for cohesin

Appendix A PCR primers and additional constructs

    Construct 1: pCM27-1
    Construct 2: pCM26-1 backbone
    Construct 3: pCM26-1 with AluI mutations

Appendix B Additional experiments: Detection of noncoding RNAs across CAR loci

    Non-coding RNAs detected at multiple CAR loci

Appendix C Phylogenetic analysis: complete sequence alignments and neighbor-joining trees

    Alignment of the SMC1 clade
    Neighbor joining tree for the entire SMC family of proteins

BIBLIOGRAPHY

LIST OF FIGURES

Figure 1.1: Structural Maintenance of Chromosome family of proteins (SMC) protein structure
Figure 1.2: Cycle of ATP binding and hydrolysis at the cohesin Smc1 and Smc3 head interface
Figure 1.3: The cohesin holocomplex
Figure 1.4: Cohesin interfaces for ring opening and model for cohesin loading onto DNA
Figure 1.5: Models for cohesin binding
Figure 1.6: Mitotic sister chromatid cohesion cycle

Figure 2.1: Plasmid map of pDTL, the backbone of the minichromosome constructs
Figure 2.2: Minichromosome pCM26-1
Figure 2.3: Minichromosome pAS1
Figure 2.4: Chromatin immunoprecipitation of pAS1
Figure 2.5: Southern blot of pCM26-1 in the tween and triton trials
Figure 2.6: Southern blot of pCM26-1 concentrated under varying pre-treatment conditions
Figure 2.7: Ethidium bromide stained agarose gel of pCM26-1 washes of the precipitate that formed from a large-scale preparation
Figure 2.8: Changes to the sucrose gradient
Figure 2.9: Southern blots of two MAP experiments of pCM26-1
Figure 2.10: Protein analysis of purified pCM26-1

Figure 3.1: Anticipated structures of cohesin bound minichromosomes based on predicted models
Figure 3.2: Minichromosome pCM26-1 isolated from alpha-factor arrested cells
Figure 3.3: Negatively stained pCM26-1 isolated from nocodazole-arrested cells
Figure 3.4: Minichromosome pCM26-1 isolated from nocodazole-arrested cells
Figure 3.5: Positively-stained pCM26-1 from nocodazole arrested cells
Figure 3.6: Measurements of the minichromosomes
Figure 3.7: Nucleosome mapping of CARC1 and CARC3 in G1 and M phases
Figure 3.8: Models for cohesin binding at CAR loci

Figure 4.1: Proposed structures of SMC-containing complexes
Figure 4.2: Tree topologies for the SMC family of proteins
Figure 4.3: Conserved disruptions in the SMC coiled-coil domains
Figure 4.4: Partial view of alignment of the SMC family of proteins across eukaryotic and prokaryotic kingdoms
Figure 4.5: NJ tree for the SMC family of proteins
Figure 4.6: COILS output for Smc1-6 from S. cerevisiae
Figure 4.7: NJ tree for the Smc1 clade
Figure 4.8: Alignment of Smc1α and Smc1β residues in the hinge domain
Figure 4.9: A generalized phylogeny of the Smc1 family with mapped unique amino acid residue changes

Figure A.1: pCM27-1
Figure A.2: pCM26-2
Figure A.3: pCM26-1 with AluI mutations
Figure B.1: Noncoding RNA are transcribed at CAR loci

LIST OF TABLES

Table 1.1: Mitotic and meiotic specific cohesin components
Table A.1: Primer information

LIST OF ABBREVIATIONS

ABC: ATP-Binding Cassette
APC: Anaphase promoting complex
ATP: Adenosine triphosphate
Bp: Base pairs
BSA: Bovine Serum Albumin
BST1: Bypass of Sec Thirteen 1
CAR: Cohesin Attachment Region
CDC20: Cell Division Cycle 20
Cdks: Cyclin-dependent protein kinases
CHL1: CHromosome Loss 1
CTF7: Chromosome Transmission Fidelity 7; also known as ECO1
DSB: Double stranded breaks
ECO1: Establishment of Cohesin 1; also known as Ctf7
ESP1: Extra Spindle Pole bodies 1
FISH: Fluorescence in situ Hybridization
GIMP: GNU Image Manipulation Program
H2AX: Histone 2
HAT: Histone Acetyl-Transferase
HEPES: 4-(2-Hydroxyethyl)piperazine-1-ethanesulfonic acid
HU: Hydroxyurea
IRR1: IRRegular 1
kDa: kilo-Dalton
MAP: Minichromosome Affinity Purification
MCD1: Mitotic Chromosome Determinant 1; also known as Scc1
MEGA: Molecular Evolutionary Genetic Analysis
MRE11: Meiotic REcombination 11
PBS: Phosphate Buffered Saline Solution
PCR: Polymerase Chain Reaction
PDS1: Precocious Dissociation of Sisters 1
POL30: POLymerase 30
REC11: RECombination 11
RNAi: RNA interference
RSC: Remodels the Structure of Chromatin
SA: Stromal Antigen
SCC: Sister Chromatid Cohesion
SCC1: Sister Chromatid Cohesion 1; also known as MCD1
SCC3: Sister Chromatid Cohesion 3
SGO: Shugoshin
SIR: Silent Information Regulator
SMC1: Structural Maintenance of Chromosome 1
SMC3: Structural Maintenance of Chromosome 3
STE2: STErile 2
TEL1: TELomere maintenance 1
TEM: Transmission Electron Microscope
TEV: Tobacco Etch Virus
TRIS: Tris(hydroxymethyl) aminomethane hydrochloride
UA: Uranyl acetate

ACKNOWLEDGEMENTS

When you come to the end of all the light you know, and it's time to step into the darkness of the unknown, faith is knowing that one of two things shall happen: Either you will be given something solid to stand on or you will be taught to fly.

Edward Teller

I would like to thank those individuals who gave me both a solid place to plant my feet and those who gave me a push to find my wings. To Dr. George B. Chapman of Georgetown University, who told me early on that if you love what you do, no matter what happens around you, you will always find a way to make it happen. To Dr. Chris Woodcock of the University of Massachusetts, who had the patience to guide me on my EM work and sample preparation and who taught me that being a giant doesn’t preclude one from being generous and kind. To my first mentor, Dr. Robert T. Simpson, whose words at the start of my thesis project were “Finally. Good. Go.” Bob taught me to think before I do, to listen before I speak, and to put garlic in everything. His most important advice to me was to “Work as hard as a banshee when things are going badly. You’ll never need motivation when things are going well.” And he was right. To all the Simpsonites who made 301 Althouse feel alive, thank you. I would like to especially thank my friends who gave me scientific advice and encouragement whenever I needed it: Bob Boor – for making sure I always had what I needed to do my work, including support and caring words, Dr. John Diller – for teaching me everything that I learned about yeast maintenance and what I needed to know to get my experiments off the ground, Sevinc Ercan – for sifting through my data with good advice and lots of laughs, and Julie Norseen – for making me feel like I finally had a younger sister. When Bob passed away in April of 2004, there was a lot of uncertainty about the future of this project. A handful of wonderfully supportive people ensured that I had the time and space to pursue this work. I would like to thank Dr. Dick Frisque who quickly assembled my committee, voiced support and concern for my graduate career, read and commented on all of my progress reports even though he wasn’t on my committee, and told me that all I had to worry about were my experiments.
I would like to thank Dr. Hong Ma for being an exceptional committee member and eventual advisor for my project. Hong not only gave me a new home in his lab, but he also taught me how to think critically of my experiments, foresee the obtainable future, and be optimistic. Additionally, he taught me a lot about the nuts and bolts of being in academia and that advice is proving to be invaluable. I would like to thank Dr. Doug Koshland who joined my committee in late 2004. His excitement over my project was contagious and I learned a lot about the cohesin field from the many conversations I had with him. Many thanks also for the remaining members of my committee for their support and help – Dr. Dave Gilmour, Dr. Wendy Hanna-Rose, and Dr. Will Hancock. I would like to extend much gratitude to the Ma lab members for being friendly and helpful to the non-plant person who one day showed up in their lab, asking many basic questions about plant meiosis. I am especially indebted to Asela Wijerante, Yujin Sun, and Zhenguo Lin for teaching me the rudiments of phylogenetic analysis. This work could not have been done without the patience and technical assistance of members of several of the facilities and groups on the University Park campus. Many thanks to the EM group, especially Missy Hazen, for showing me how to use the microscope, making suggestions to improve my imaging, and answering my endless questions. Thanks also to Elaine Kunze, Susan Magargee, and Nicole Bern at the Center for Quantitative Cell Analysis, for helping me with my flow cytometry experiments, always asking how I’m doing, and being cheerfully pleasant on any given day. I would like to thank members of the Transcriptional Regulation group (Tan, Reese, and Pugh labs especially) for their critiques and advice during our weekly mega meetings. I would like to acknowledge the Uhlmann lab for their generous gift of several yeast strains, which were of tremendous help in my experiments. 
At the end of the day though, it is the support of friends and family that carries me through. Aside from Bob, John, and Sevinc, I am indebted to many people who knew somehow to call or to write in the exact moments when I was feeling down and who did cartwheels when I was successful. To Doina, Zeno, Dan, Chiquita Banana Erica, Geoff, Simon, John N., Carmen, Tom, Ann Marie, Lena and Sheila, who always asked me if my yeast were behaving and if my gels were running well – you made me laugh and have hope and my life is filled with stories of you. To Nigam, Prachi, and Paja who shared in my graduate student woes and happiness – you are some of the best of what Penn State has given me. To Jo and Jan who gave me loads of advice and lots of hugs – you are the best neighbors we could have ever hoped for. And lastly, my biggest thanks to my biggest fans: To my mom, Anca Surcel, who by example taught me to work hard, be strong, and have high expectations for myself. To my husband, John Debes, who has been my rock and my support and who taught me to believe in myself at least half as much as others believed in me. And to my weebeastie daughter Katrina, who has taught me that every rough patch can be cured with the laugh of a child.

Chapter 1

Introduction: Cohesin Structure and Function

Eukaryotic Cell Cycle

The cell cycle is the universal sequence of events that cells undergo from one division to the next. This highly regulated process includes two key events – the replication of genetic material and the subsequent segregation into distinct daughter cells. In order for offspring to be viable, the cellular machinery must guarantee the appropriate partition of the cytoplasm, organelles, and genetic material. This is achieved by a complex network of regulatory proteins that not only play a role in the final stages of cell division, but that are also active throughout all of the cell cycle stages.

The cell cycle can be divided into four distinct phases – G1, S, G2, and M phase. During the synthesis, or S, phase, nuclear DNA is replicated. S phase is flanked by the two gap phases, G1 and G2, which separate it from mitosis, or M phase. During mitosis, nuclear division occurs, followed by cytoplasmic division known as cytokinesis. The length of the cell cycle varies among organisms; in the budding yeast Saccharomyces cerevisiae the entire cell cycle takes approximately 1.5 hours. Mitosis is further subdivided into six stages, mainly distinguishable by changes in chromosome morphology and cell shape. During prophase, the replicated chromatids, each closely associated with its sister, condense. The chromatids are attached to spindle microtubules during prometaphase and migrate to the central plane. At metaphase, the paired kinetochore microtubules from each sister chromatid attach to opposite poles of the spindle. During the metaphase to anaphase transition, sister chromatids separate and are pulled to opposite ends of an elongating cell. Nuclear division is finished in telophase, when chromosomes are packaged into distinct nuclei. Mitosis is completed when the cytoplasm of the mother cell is divided in two during cytokinesis.

This complicated cytoplasmic and chromosome movement is regulated by an intricate feedback loop through a series of checkpoints, the three most prominent occurring at the G1 to S phase transition, at the G2 to M phase transition, and at the metaphase-anaphase transition. At the first two transitions, the integrity of DNA is monitored, while at the last one, the proper orientation of sister chromatids is verified. Much of cell cycle regulation is modulated by the phosphorylation and dephosphorylation of key proteins. The kinases responsible for these protein modifications are known as cyclin-dependent protein kinases or Cdks.
Cdks are so named because their activity is dependent upon their interaction with a group of differentially expressed cell cycle proteins known as cyclins, which also aid in targeting Cdks to their various substrates. In addition to cyclins and Cdks, there are other proteins that play an integral role in cell cycle control and maintenance. One such protein complex is the tightly regulated anaphase-promoting complex (APC). The APC is composed of at least 13 subunits in yeast (11 in vertebrates) and is essential for both anaphase initiation and progression. The APC is inhibited by the spindle checkpoint and released only once all of the kinetochores are correctly attached to opposite poles of the mitotic spindle. The APC is able to launch anaphase by stimulating the proteolysis of various cell cycle regulators via its ubiquitin-ligase activity. One of its activators is Cell Division Cycle 20 (Cdc20), a cell cycle specific protein that binds to the APC only during mitosis and whose phosphorylation, in part, leads to its dissociation from the complex and eventual degradation (reviewed in Castro et al., 2005). Another essential protein complex is responsible for preventing improper chromosome segregation, which is causative of such disorders as Down syndrome, Cornelia de Lange syndrome, and tumorigenesis (Pati et al., 2002; Gilliland and Hawley, 2005; Ren et al., 2005; Musio et al., 2006; Deardorff et al., 2007; Ohbayashi et al., 2007). To ensure proper chromosome segregation so that each progeny has a complete set of chromosomes, cells guard against aneuploidy by pairing replicated sister chromatids together both at their centromere and along their arms. This pairing starts in S phase and continues through DNA replication to the metaphase-anaphase transition in mitosis. This evolutionarily conserved process is facilitated by a multimeric protein complex known as cohesin.
Cohesin’s interaction with chromatin throughout the cell cycle is highly coordinated by a bevy of proteins involved in the loading, possible sliding, establishment, and dissolution of the complex. While much is known about the four-part structure of cohesin and its role in sister chromatid cohesion in both mitosis and meiosis, in vivo data regarding the binding of cohesin to its DNA counterparts are minimal.

Cohesin structural components

The mitotic cohesin holocomplex is composed of four proteins – Smc1p, Smc3p, Scc1p, and Scc3p. The first two are members of the six-member, chromatin-related class of proteins known as the Structural Maintenance of Chromosome (SMC) family, while Scc1 is part of the kleisin family. The cohesin complex is over 400 kDa and is cell cycle regulated: all four proteins have peak expression in G1.

SMC1 and SMC3

Like all members of the SMC family of proteins, Smc1p and Smc3p share a unique structure. SMC proteins have globular N- and C-terminal domains, each attached to a long coiled-coil domain, separated by a flexible hinge domain (see Figure 1.1A). The N-terminal domains contain a Walker A motif (consensus G-X-S/T-G-X-G-K-S/T-S/T), but surprisingly do not contain a complementary Walker B motif (h-h-h-h-D, where h is a hydrophobic residue) characteristic of ATP-binding proteins. It was suggested that, in a novel configuration, the Walker B motif is located in the C-terminal DA box, a highly conserved 35-amino-acid stretch containing alanine and aspartic acid, and that these two motifs at either end of the long SMC protein interact in concert to form a functional ATPase (Saitoh et al., 1995). The physical interaction of the two end domains was hypothesized to be brought about by the formation of an antiparallel coiled-coil domain, which could be achieved either intramolecularly, with the N- and C-termini of one SMC interacting with each other, or intermolecularly, with the N-terminal domain of one SMC interacting with the C-terminal domain of another (see Figure 1.1B and 1.1C). Because antiparallel coiled-coil domains had not yet been shown to exist in other lengthy proteins, it was also suggested that ATP binding might occur on separate rod-shaped parallel SMC dimers, generating long filaments (Hirano et al., 1997). To distinguish between these possibilities, transmission electron microscopy (TEM) was used for the first time to image SMC proteins. SMC proteins from B. subtilis and the MukB proteins from E. coli (the SMC counterpart in this species) were recombinantly expressed and purified. Hydrodynamic data for these isolates were consistent with homodimer formation. Subsequent rotary shadow imaging of these homodimers showed that they were folded into a V-shaped molecule, with globular domains at either end and at the hinge area of the V (Melby et al., 1998).

Figure 1.1: Structural Maintenance of Chromosome family of proteins (SMC) protein structure

(A) SMC proteins contain N-terminal and C-terminal globular domains that are separated from a flexible hinge domain by two long coiled-coil arms. (B) and (C) show two possible orientations of SMC dimers. (B) In the intermolecular model of dimerization, the N-terminal domain of one SMC binds to the C-terminal domain of a different SMC molecule. (C) In the intramolecular model of dimerization, supported by several in vitro studies discussed in the text, each SMC molecule folds in on itself and dimerization is mediated by the hinge domain. Arrows show directionality of each SMC molecule.

In addition, the arms of these molecules took on a number of different conformations, including a completely open configuration in which their end globular domains remained uncoupled, a configuration in which the arms interacted along approximately two-thirds of their lengths but not at their globular domains, and one in which the V shape appeared collapsed, with tight association along the entire arm length. These TEM images suggested that SMCs do not form long filaments, but they did not distinguish between inter- and intramolecular SMC dimers or provide definitive support that the coiled-coil arms were indeed antiparallel. These questions were resolved by several different experiments. First, the antiparallel nature of SMC arms was demonstrated by an experiment that replaced the N-terminal domain of MukB with a rod-shaped fibronectin segment and removed the C-terminal domain (Melby et al., 1998). Rotary-shadowed images of the ensuing complexes showed V-shaped molecules with fibronectin rods at either end, but never two at one end with none on the other end of the V. This would only be possible if the coiled-coil domains were arranged in an antiparallel fashion (Melby et al., 1998).

Intermolecular interaction between SMC dimers was also argued against by the following data: recombinantly expressed and isolated Smc1 and Smc3 from budding yeast each formed one half of the V molecule in the absence of their partner and did not homodimerize, as visualized by TEM analysis (Haering et al., 2002). Additionally, Smc3 formed dimers in the presence of a modified Smc3 whose hinge domain had been replaced with that of Smc1. The crystal structure of the hinge domain from the Smc homodimer from T. maritima showed a novel donut-hole structure in which two β-strands of one domain interacted with three β-strands from another domain (Chiu et al., 2004). This interaction could only be completed if two SMC dimers abutted each other at the hinge domain.
With each SMC in a dimer pair folded over at the hinge domain, the Walker A and Walker B motifs of each protein are oriented in close proximity to each other, forming a functional ATPase of the ATP-Binding Cassette (ABC) transporter family (reviewed in Hirano, 2005; Pedersen, 2005). See Figure 1.2 for the ATP binding and hydrolysis cycle characteristic of ABC proteins. Members of this large family are typically membrane-embedded proteins that use the energy of ATP hydrolysis to undergo a conformational change that allows them to transport solutes across membranes. These proteins contain two domains: a hydrophobic one that crosses the membrane and a dimeric hydrophilic one that binds and hydrolyzes ATP. In addition to containing a Walker A and a Walker B motif that together form an ATP binding pocket, the hydrophilic dimer contains a highly conserved signature motif (L-S-G-G-Q-Q/R/K-Q-R), essential for ATP hydrolysis (Covitz et al., 1994). SMC proteins also contain this signature motif.
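The three consensus motifs quoted in this section (Walker A, Walker B, and the ABC signature motif) can be written as simple sequence patterns. The following is a minimal illustrative sketch in Python, not a method used in this thesis. It assumes that X stands for any residue, that "hydrophobic" means one of A, V, L, I, M, F, or W (a simplification chosen here for illustration), and it scans a short made-up peptide rather than a real SMC sequence. The names MOTIFS, find_motifs, and toy_sequence are hypothetical.

```python
import re

# Consensus motifs from the text, translated into regular expressions.
# Assumption: "X" is any residue; "h" is approximated by the set AVLIMFW.
MOTIFS = {
    "Walker A": re.compile(r"G.[ST]G.GK[ST][ST]"),    # G-X-S/T-G-X-G-K-S/T-S/T
    "Walker B": re.compile(r"[AVLIMFW]{4}D"),         # h-h-h-h-D
    "ABC signature": re.compile(r"LSGGQ[QRK]QR"),     # L-S-G-G-Q-Q/R/K-Q-R
}


def find_motifs(sequence):
    """Return, for each motif, a list of (0-based start, matched text)."""
    seq = sequence.upper()
    return {
        name: [(m.start(), m.group()) for m in pattern.finditer(seq)]
        for name, pattern in MOTIFS.items()
    }


# Made-up peptide for illustration only; it is not a real protein sequence.
toy_sequence = "MKGPSGSGKSTAAALSGGQKQRPPLIVVDEE"

if __name__ == "__main__":
    for name, hits in find_motifs(toy_sequence).items():
        print(name, hits)
```

Short patterns such as h-h-h-h-D match by chance in many proteins, so a hit from a scan like this would only be a starting point for inspection, not evidence of a functional ATP-binding site.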

SCC1 and SCC3

Sister Chromatid Cohesion 1 or Scc1 (also known as Mcd1 for Mitotic Chromosome Determinant) is a member of the subfamily of proteins identified as α-kleisins. These proteins are characterized by their association with SMC proteins and their requirement for the function of SMC-containing complexes. They tend to be tripartite proteins of 550-700 amino acids in length, with highly conserved globular N- and C-terminal domains separated by a region rich in polar residues, containing a central partially hydrophobic segment of 20-30 residues (Schleiffer et al., 2003). Scc1 is cell cycle regulated, with peak expression occurring in late G1/early S phase (Guacci et al., 1997). The N- and C-terminal domains of Scc1 interact with the globular heads of Smc3 and Smc1, respectively, forming an in vitro cohesin ring structure (Haering et al., 2002; Gruber et al., 2003) (see Figure 1.3). Scc1 is the proteolytic substrate for separase, whose cleavage of Scc1 leads to the dissociation of cohesin from chromosome arms at the metaphase-anaphase transition (discussed in detail below). The interaction of Scc3 (also known as IRRegular 1 or Irr1) with Scc1 is essential for cohesin complex formation, but little is known about its function in cohesin loading or binding. Scc3 was first identified in the yeast mitotic cohesin complex (Toth et al., 1999); its meiotic counterparts are speculated to play a role in differential centromere and arm cohesin binding in S. pombe (Kitajima et al., 2003) and are required for monopolar kinetochore orientation in meiosis I in A. thaliana (Chelysheva et al., 2005). Differences between the mitotic and meiotic cohesin complexes are discussed in further detail in the Cohesin in meiosis section below.

Figure 1.2: Cycle of ATP binding and hydrolysis at the cohesin Smc1 and Smc3 head interface. Binding of ATP (small red circles) by the Walker A domains in the Smc heads leads to the head-head engagement that occurs within Smc dimers. Binding and engagement are two separate events, as mutations in the C-terminus prevent engagement but not ATP binding and a mutation in the Walker A motif in the N-terminus prevents ATP binding. Hydrolysis of the ATP molecules leads to the disengagement of the head-head interaction.

Cohesin mechanics

In order for cohesin to promote sister chromatid cohesion, it must undergo several structural modifications or changes for each of its functional steps. First, cohesin must load onto DNA prior to replication, an event hypothesized to be facilitated by the opening of the cohesin tripartite ring complex of Smc1, Smc3, and Scc1. This ring is believed to form independently of chromatin. Second, it must remain localized to specific chromosome regions for the duration of mitosis prior to chromosome segregation. Third, it must dissociate from these loci at the metaphase to anaphase transition of mitosis. While dissociation is fairly well understood, a solid understanding of how cohesin loads onto and binds DNA remains elusive. Many in vitro experiments have been done, providing support for a number of different models discussed below.

Figure 1.3: The cohesin holocomplex. Smc1 and Smc3 heterodimerize at their hinge regions to form a V-shaped molecule. The two globular heads of the Smc molecules bind ATP and are engaged when Scc1 (aqua tripartite structure) binds to each head. The interaction of Scc1 with both Smc1 and Smc3 gives cohesin its characteristic ring structure observed in solution from TEM analyses. Scc3, an essential cohesin protein that binds to Scc1, is colored orange in this figure.

Loading of the mitotic cohesin complex

Cohesin maintains sister chromatid cohesion by binding to specific sites at the centromere and along the arms of all chromosomes. However, cohesin loading is a distinctly separate event from cohesin binding and is speculated to occur at different loci.

The cohesin complex interacts with chromatin during G1 and establishes cohesion simultaneously with DNA replication during S phase (Takeda et al., 2001). Cohesin loading is dependent upon several proteins, two of which, Scc2 (Mis4p in S. pombe) and Scc4, form a protein complex. Both proteins are essential for cell viability, and scc2 and scc4 mutant strains lose cohesin binding, though cohesin complex assembly is unaffected (Toth et al., 1999). In addition, the Scc2/Scc4 complex associates along chromosome arms at sites different from cohesin binding sites, and localization of either Scc2 or Scc4 is dependent upon the other (Toth et al., 1999). One study has identified Scc2/Scc4 binding sites in S. cerevisiae (Lengronne et al., 2004). Many of these sites are located both upstream and downstream of cohesin binding loci that are found between convergently transcribed genes. In the Lengronne study, ChIP data showed that cohesin loads at the sites of the Scc2/Scc4 protein complex and is then relocated along the length of the transcribed genes, finally ending up at cohesin binding sites (see below for definition and characterization of those loci). Abolishing transcription at the STErile 2 (STE2) locus subsequently abolished cohesin binding downstream of STE2 (Lengronne et al., 2004). From these experiments, the authors hypothesized that the Scc2/Scc4 protein complex acts as a loading dock for the completely assembled cohesin complex and that the transcriptional machinery pushes the cohesin ring to the cohesin binding loci. This hypothesis is in keeping with the embrace model described below. However, it should be noted that several of these experiments have not been reproducible in other laboratories (Gerton, personal communication). While there is no doubt that Scc2 and Scc4 are essential for proper cohesin loading, the exact mode of interaction remains unclear.
Another protein essential for cohesin loading is Chromosome Transmission Fidelity 7 or Ctf7 (also known as Establishment of COhesion 1 or Eco1). Work on temperature-sensitive Ctf7 strains demonstrates that it is essential for SCC, that it is required during S phase, and that it interacts with both the POLymerase Pol30, a DNA replication processivity factor (Toth et al., 1999), and Chromosome Loss Chl1, a nuclear protein essential for SCC with predicted DNA helicase activity (Petronczki et al., 2004). Additionally, biochemical in vitro work demonstrates that Ctf7/Eco1 acetylates cohesin proteins, but not histone tails. Even though this activity was not measured in in vivo samples, the role of Ctf7/Eco1 in SCC suggests that it may acetylate other proteins involved in cohesin loading (Ivanov et al., 2002).

While many protein players involved in cohesin loading have been identified, the exact mode of cohesin loading remains vague. How does the pre-formed cohesin ring open to accommodate loading onto DNA? The cohesin molecule has three candidate interfaces whose dissociation could result in a transiently open ring. The first is between the Smc3 globular head and the N-terminal domain of Scc1; the second is between the Smc1 globular head and the C-terminal domain of Scc1; the third is the hinge domain interface between Smc1 and Smc3. See Figure 1-4A for a schematic of these three interfaces. Because ATPase activity is essential for cohesin function, many have speculated that the ATPase heads on Smc1 and Smc3 quickly hydrolyze ATP, allowing the ring to open, and then rebind ATP to lock the heads in position. However, recent data on all three interfaces suggest another, more likely mechanism for cohesin opening. Gruber et al. made constructs covalently fusing each of the potential candidate openings and tested their ability to complement deletion mutants of the corresponding genes (Gruber et al., 2006).
Their data show that preventing the Smc heads from dissociating from Scc1 did not affect yeast viability, whereas impairing the ability of the hinge domains to dissociate affected cell growth. Further experiments with hinge mutants demonstrated that dimerization did occur and that Scc1 did bind to the Smc1 and Smc3 heads to form a functional ATPase, but that this complex did not associate with chromatin or DNA in vivo (Gruber et al., 2006). To bypass the problem that the hinge domain interface has a tight interaction that would preclude transient dissociation, the authors proposed that cohesin loads onto DNA via a stepwise opening of the hinge domain that lowers the activation barrier for this structural change (see Figure 1-4B). Two points are important to note for this thesis work. First, while individual changes to the Smc1 and Smc3 interfaces with Scc1 did not affect cell viability, a double mutant failed to function in vivo. Second, Smc mutants that have abolished ATP hydrolysis activity fail to bind at CARs. Gruber et al. propose that the hinge opening may be coupled to the head ATPase cycle, either through direct interaction of these domains or through conformational changes to the protein.

Figure 1-4: Cohesin interfaces for ring opening and model for cohesin loading onto DNA. (A) Cohesin has, in principle, three interactions that can be interrupted to load the closed ring onto DNA. Interface 1 is between the N-terminal domain of Scc1 and the Smc3 globular head. Interface 2 is between the C-terminal domain of Scc1 and the Smc1 globular head. Interface 3 is between the two hinge domains that mediate Smc dimerization. (B) Recent work suggests that the hinge domains of cohesin's Smc1 and Smc3 open to accommodate cohesin binding. (i) The hinge and part of the coiled-coil domain of Smc1 interact with the DNA. (ii) In a possible long-range domain interaction, the heads bind ATP, inducing a conformational change that disrupts the hinge dimerization and opens the cohesin ring. (iii) The cohesin ring closes, now trapping the DNA. Figure adapted from Hirano, Nature Reviews Molecular Cell Biology 7, 311-322 (May 2006).
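The outcome of the interface-fusion experiments described above can be summarized as a simple lookup. The following is a minimal sketch of that summary only: the function name is hypothetical, the interface numbering follows Figure 1-4A, and collapsing "affected cell growth" and "failed to function in vivo" into a single False value is a simplification made here, not a result reported by Gruber et al.

```python
from typing import Iterable

# Interface numbering follows Figure 1-4A.
INTERFACES = {
    1: "Scc1 N-terminal domain / Smc3 head",
    2: "Scc1 C-terminal domain / Smc1 head",
    3: "Smc1 hinge / Smc3 hinge",
}


def functional_in_vivo(locked: Iterable[int]) -> bool:
    """True if cohesin with these interfaces locked shut was described as functional.

    Simplification: any reported growth or function defect maps to False.
    """
    locked_set = set(locked)
    unknown = locked_set - set(INTERFACES)
    if unknown:
        raise ValueError(f"unknown interface(s): {sorted(unknown)}")
    if 3 in locked_set:
        return False  # impaired hinge opening affected cell growth
    if {1, 2} <= locked_set:
        return False  # the double head-Scc1 mutant failed to function in vivo
    return True       # a single locked head-Scc1 interface was tolerated


if __name__ == "__main__":
    for combo in ([1], [2], [1, 2], [3]):
        names = "; ".join(INTERFACES[i] for i in combo)
        print(f"{names}: functional = {functional_in_vivo(combo)}")
```

Read this way, neither head-Scc1 interface is individually required to open, which is the reasoning behind the proposal that the hinge serves as the entry gate.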

Binding of the mitotic cohesin complex to DNA/chromatin

Cohesin binding for SCC does not occur at arbitrary locations along chromosomes. Within the past seven years, efforts have been made to identify the cis-acting elements involved in cohesin binding (Blat and Kleckner, 1999; Megee et al., 1999; Tanaka et al., 1999; Laloraya et al., 2000). Laloraya et al. used Chromatin Immuno-Precipitation (ChIP) and high-resolution PCR-based chromosome walking to catalog specific sites along arms, in rDNA, and in sub-telomeric repeat regions (Laloraya et al., 2000). Although these sites, named Cohesin Attachment Regions (CARs), share no sequence homology with each other, they have several characteristics in common. All are 0.8-1.0 kb in length, are A-T rich, and are spaced approximately 10 kb from one another. An analysis of chromosomes III, IV, V, and VI showed that 91% (276 of 304) of CARs lay in the intergenic regions between convergently transcribed genes and that, of the 328 convergent intergenic regions, cohesin localized to 84% (Lengronne et al., 2004).

A genome-wide ChIP-on-chip study of CAR loci done by Glynn et al. in 2004 further characterized these regions. Besides binding to previously documented CAR loci and centromeric regions, this study demonstrated low-level binding to a subset of telomeres. In addition, while the authors could assign over 70% of the sites to intergenic regions, more than two-thirds of the 230 remaining sites were located in ORFs, comprising more than half of all of the cohesin-associated ORFs (Glynn et al., 2004).
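The two percentages quoted from Lengronne et al. (2004) can be checked against the raw counts. The short sketch below does that arithmetic; it assumes that the 84% figure refers to the same 276 CAR-containing regions (that is, one CAR per convergent intergenic region), which the text does not state explicitly. The variable names are illustrative.

```python
# Counts quoted in the text for chromosomes III, IV, V, and VI.
cars_total = 304            # all CARs mapped
cars_convergent = 276       # CARs lying between convergently transcribed genes
convergent_regions = 328    # all convergent intergenic regions

# Fraction of CARs that fall in convergent intergenic regions.
frac_of_cars = cars_convergent / cars_total

# Fraction of convergent intergenic regions occupied by cohesin,
# assuming one CAR per occupied region (an assumption, see lead-in).
frac_of_regions = cars_convergent / convergent_regions

print(f"CARs in convergent intergenic regions: {frac_of_cars:.1%}")    # 90.8%
print(f"Convergent regions bound by cohesin:   {frac_of_regions:.1%}")  # 84.1%
```

Under that assumption the counts are self-consistent: both rounded values match the 91% and 84% given in the text.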

The lack of shared sequence homology among CAR loci, coupled with their observed chromosomal spacing and the structure of the cohesin complex, has led to several models of cohesin binding (see Figure 1-5). Rotary shadowed images of the cohesin holocomplex show a ring structure, which has supported the embrace or ring model (Haering et al., 2002). In this model, two 10-nm chromatids (DNA plus nucleosomes) are trapped within the cohesin ring, and the strength of this topological interaction is sufficient for maintaining SCC (see Figure 1-5A). In the embrace model, the cohesin ring is opened upon ATP hydrolysis, allowing the DNA to enter the ring, which is then closed upon the subsequent binding of another ATP molecule. This model accounts for a number of experimental results. To begin with, cleavage at multiple points along the cohesin ring, generated by insertions of Tobacco Etch Virus (TEV) protease sites along the Smc1 and Smc3 arms, leads to cohesin dissociation, as does premature cleavage of Scc1 (Gruber et al., 2003). In addition, SCC is abolished upon the linearization of minichromosomes with cohesin attached to them (Ivanov and Nasmyth, 2005), suggesting that cohesin is able to fall off linearized chromatin. However, the embrace model fails to explain one important observation: the continuous maintenance of binding at CARs, each spanning over 800 bp.

The next two models are variations of the ring model. In the extended embrace model, multiple cohesin molecules interact intermolecularly, forming a larger cohesin ring surrounding the two sister chromatids (see Figure 1-5C). This is facilitated by the interaction of Scc1 from one cohesin molecule with its own Smc1 head and with the Smc3 head of the second cohesin molecule. Currently, no evidence for this model exists.

Figure 1-5: Models for cohesin binding. (A) In the ring or embrace model, the cohesin ring encircles replicated DNA strands and does not interact directly with DNA to maintain binding. (B) In the physical interaction model, the globular head of each Smc partner interacts with an individual replicated strand of DNA. (C) In the extended embrace model, two cohesin molecules interact with each other at their head-Scc1 interfaces to make a larger ring that encircles both replicated strands of DNA. (D) In the snap model, one of two interactions is proposed. In the top panel, two cohesin complexes interact with each other while their Smc heads interact with individual strands of replicated DNA; this is a hybrid of the ring and physical interaction models. In the lower panel, one cohesin ring encircles each replicated strand of DNA, and the cohesin rings interact with some unknown protein to bring the sister strands together. Black lines represent sister chromatids.

The second model that invokes part of the embrace model is the snap model, which suggests that individual sister chromatids are trapped inside separate cohesin rings that are topologically interconnected or bound by as yet unidentified protein partners (see Figure 1-5D). Alternatively, the two cohesin molecules in the snap model can also interact directly with chromatin, as predicted in the fourth model (Huang et al., 2005). This fourth model for cohesin binding is the physical interaction model, which is based on several lines of evidence (see Figure 1-5B). Support for this model can be gleaned from studies on the interaction of the condensin complex with chromatin. Like cohesin, condensin contains two Smc proteins, Smc2 and Smc4. These proteins heterodimerize at their hinge region, forming a long rod as visualized by rotary shadow imaging. Condensin, the complex responsible for chromatid condensation, introduces positive supercoils in DNA, believed to be caused by the winding of the chromosome around each of the Smc2 and Smc4 heads (Kimura and Hirano, 1997; Bazett-Jones et al., 2002). In addition to studies on the condensin complex, it has been reported that Smc1 and Smc3 C-terminal fragments are capable of binding DNA in vitro (Akhmedov et al., 1998), though the in vivo counterpart of this experiment remains to be done. The structural and functional overlap of individual protein partners in the cohesin and condensin complexes implies a possible mechanistic overlap as well.

While each of these models is based on an extensive body of data, none of them fully answers all of the following questions: How does cohesin interact with pre- and post-replicated DNA? How does cohesin stably maintain sister chromatid cohesion? How does that interaction position cohesin such that Scc1 cleavage is the sole process necessary to eliminate SCC? An in vivo analysis of cohesin binding at CAR loci has the potential to directly address these questions.

Dissolution of the mitotic cohesin complex

As mentioned above, cleavage of Scc1 leads to the dissolution of the mitotic cohesin complex. The cleavage of this α-kleisin protein is both necessary and sufficient for cohesion abolishment and is part of a required checkpoint for the metaphase-anaphase transition (Uhlmann et al., 1999). In budding yeast, the cleavage of Scc1 is accomplished by the protease separase, which is released from its inhibitory chaperone by the APC complex at mitosis (Kitagawa et al., 2002). Esp1, or Extra Spindle Pole bodies 1, is the separin responsible for Scc1 cleavage (Uhlmann et al., 1999). Separins share a conserved C-terminus similar to the active site of caspases (Cysteine-dependent ASPartyl-specific proteASEs), which are peptidases of the peptidase family C14. Caspases are cysteine proteases that use the cysteine sulfhydryl group to catalyze cleavage at aspartyl residues in their target proteins. Additionally, caspases have a catalytic dyad consisting of histidine-glycine and alanine-cysteine residues pocketed by blocks of hydrophobic residues (Chen et al., 1998). Esp1 is bound by the securin Pds1, or Precocious Dissociation of Sisters 1, prior to metaphase, which inhibits its proteolytic activity (Ciosk et al., 1998; Uhlmann et al., 1999). Pds1 is degraded by the 26S proteasome upon ubiquitination by APC in conjunction with the Cdc20 protein (Cohen-Fix et al., 1996; Funabiki et al., 1996; Yamamoto et al., 1996). Securins share two functional roles: they are involved in the nuclear transport of separases, and they inhibit separase catalytic activity. In addition to binding the separin Esp1, Pds1 also blocks mitotic exit and cyclin destruction during early mitosis. Thus, it is the interplay of Esp1 and Pds1 that renders cohesin an integral part of the metaphase-anaphase checkpoint.

Unlike budding yeast, other eukaryotes from fission yeast to vertebrates undergo a two-step dissolution of cohesin (Losada et al., 2000).
Release from chromosome arms occurs during prophase and is necessary for the proper resolution of sister chromatids (Losada et al., 2002). Cohesin at the pericentric regions is released at the initiation of anaphase and not before, as it plays a vital role in generating the tension between centromeric cohesion and the antagonistic pulling force of the spindle microtubules (Waizenegger et al., 2000). Here, proteolysis of Scc1 is responsible for the removal of most of the cohesin from the centromere, while the release of Scc1 from the arms is accomplished by two mitotic kinases, aurora B and polo (Losada et al., 2002; Sumara et al., 2002). When phosphorylation sites on the vertebrate Scc3 isoform known as Stromal Antigen 2 (SA2) are blocked, cohesin dissociation from arm CARs is reduced (Hauf et al., 2005). Why higher eukaryotes have developed this two-step approach to cohesin dissolution remains an intriguing question.

Cohesin in meiosis

During meiosis, a diploid cell undergoes one round of DNA replication followed by two rounds of cell division, generating four haploid cells. In this process, chromosomes separate twice: homologous chromosomes in meiosis I and sister chromatids in meiosis II. It seems likely that the temporally distinct roles of cohesin in meiosis I and meiosis II drove the rise of meiosis-specific counterparts to Scc1, Smc1 and Scc3/SA. Published data show that in mammals, two SMC1 isoforms exist: SMC1α (or SMC1L1 for SMC1-like 1) and SMC1β (or SMC1L2 for SMC1-like 2) (Revenkova et al., 2004). SMC1β-deficient mice are sterile and are defective in chromosome recombination, synapsis and cohesin maintenance. It is speculated that SMC1α is responsible for cohesion during the premeiotic S phase, whereas SMC1β plays a later role, as it is detected in the zygotene phase (Revenkova et al., 2004; Hodges et al., 2005). In addition to SMC isoforms, Scc3 variants have been identified in S. pombe. The mitotic version of Scc3 (known in fission yeast as Psc3) is localized to centromeres, while the meiotic version, known as Rec11, is detected along the chromosome arms (Kitajima et al., 2003). Table 1-1 outlines the different mitotic and meiotic cohesin components in various organisms.

Other proteins essential for cohesin function

One of the more baffling questions in the cohesin field has been how cohesin at centromeres is protected against cleavage during the removal of cohesin from chromosome arms. Screens in budding and fission yeast have identified proteins related to the Drosophila centromeric protein known as Mei-S332 (Kerrebrock et al., 1995; Katis et al., 2004; Rabitsch et al., 2004). These proteins have recently been described as members of the shugoshin (or Sgo) family of proteins. Depletion of one of the human isoforms, Sgo1, leads to precocious chromatid separation that is partially rescued by the nonphosphorylatable SA2 used in cohesin dissociation studies (see above) (McGuinness et al., 2005). Localization of Sgo1 is regulated by Bub1, the spindle checkpoint protein. In a bub1- background, Sgo1 and cohesin are found spread throughout the chromosome arms (Kitajima et al., 2005; Watanabe and Kitajima, 2005). These data suggest that Sgo proteins protect centromeric cohesin from prophase dissolution, in a pathway that probably overlaps with the spindle checkpoint signaling pathway.

Differential binding of cohesin to intergenic regions and heterochromatic regions such as centromeres and mating-type loci can in part be reconciled by the prospect of nucleosomal changes at these regions. In addition to post-translational modifiers of histone proteins such as histone acetyltransferases (HATs), another group of proteins known as ATP-dependent chromatin remodelers is used by the cell to increase contact with nucleosomal DNA. One such ATPase-containing complex is the Remodels the Structure of Chromatin (or RSC) complex. Budding yeast have at least two RSC complexes, involving either Rsc1 or Rsc2. While single deletions of either Rsc1 or Rsc2 are non-lethal, the double mutant is nonviable (Cairns et al., 1999). Interestingly, several rsc mutants show defects in the G2/M phase transition (Angus-Hill et al., 2001; Damelin et al., 2002), suggesting a possible role in chromosome segregation for the complex. Indeed, rsc mutations lead to failure of cohesin to bind along chromosome arms, but not at centromeres, leading to premature sister chromatid separation (Huang et al., 2004; Huang and Laurent, 2004). Additionally, these studies showed that RSC binding to CAR loci precedes Scc1 binding by 15 minutes. However, genome-wide ChIP analyses have been unable to reproduce these data in their entirety, though they do show that RSC positively affects cohesin binding to a very small subset of CAR loci (Gerton, personal communication).

Another essential gene whose protein product plays a role in SCC is the Precocious Dissociation of Sisters 5, or PDS5, gene. Pds5 is localized at both centromeres and along chromosome arms, and its binding is dependent upon Scc1 (Hartman et al., 2000). Besides affecting mitosis, pds5- mutations cause premature separation of chromosomes in meiosis I (Ren et al., 2005). An overview of some of these proteins and their role in mitotic SCC is shown in Figure 1-6.

Table 1-1: Mitotic and meiotic specific cohesin components

                Mitotic specific components                    Meiotic specific components
Organism        SMC1     SMC3     SCC1          SCC3           SMC1β    Kleisin    SCC3 counterpart
S. cerevisiae   Smc1     Smc3     Scc1          Scc3           ---      Rec8       ---
S. pombe        Psm1     Psm3     Rad21         Psc3           ---      Rec8       Rec11
C. elegans      SMC-1    SMC-3    SCC-1/COH-2   SCC-3          ---      REC-8      ---
H. sapiens      hSMC1α   hSMC3    hRAD21        hSA1/hSA2      hSMC1β   hRec8      hSA3/STAG3
X. laevis       XSMC1    XSMC3    XRAD21        XSA1/XSA2      ---      AAH87346   ?
A. thaliana     AtSMC1   AtSMC3   SYN2-4        CAB45374       ---      SYN1       ---

Other roles for cohesin

Figure 1-6: Mitotic sister chromatid cohesion cycle. Scc2 is involved in cohesin loading during late G1, in conjunction with Scc4. RSC may play a role in chromatin remodeling at specific CAR loci during G1. Eco1/Ctf7 aids in cohesin establishment by acetylating cohesin subunits, while Pds5 stabilizes the localization of cohesin at CAR loci, to which cohesin may be pushed by the transcriptional machinery. Polo and aurora are the two mitotic kinases responsible for the phosphorylation of cohesin during prophase that results in its dissociation from chromosome arms. Sgo, in conjunction with the spindle checkpoint protein Bub1, protects centromeric cohesin from removal during this phase. Separase, released from securin by the APC during late metaphase, cleaves Scc1 from condensed chromosomes. This release triggers anaphase and the migration of sister chromatids to opposite ends of the dividing cell. Cohesin: aqua circles; centromere: red blocks; known cohesin support proteins: italicized under their function; speculated, but not fully documented, support proteins and complexes: magenta lettering.

In addition to its involvement in sister chromatid cohesion, the cohesin complex participates in a multitude of cellular processes. Of particular interest to some of this thesis work is its postulated role in RNAi. Data from S. pombe experiments suggest that RNAi plays a regulatory role in the establishment of heterochromatin by transcriptional silencing at both the centromeres and the mating-type locus. Deletion of RNAi machinery components leads to an accumulation of small dsRNAs with homology to centromeric repeats, along with loss of histone H3 lysine-9 methylation and reduced centromeric function (Hall et al., 2003). Other data identified a functional link between RNAi and cohesin, also in S. pombe. Fluorescence in situ hybridization (FISH) analysis of centromeres and arm sequences in RNAi mutants revealed chromosomal segregation defects (Hall et al., 2003). While RNAi is not conserved across all kingdoms, chromatid cohesion is, and these data suggest a possible role for noncoding RNAs in the evolutionarily conserved process of sister chromatid cohesion.

Cohesin has also been found to play a role in maintaining heterochromatin at several loci throughout the budding yeast genome. In S. cerevisiae, repression of gene expression occurs at the silenced mating-type loci and telomeres (Rusche et al., 2003). Boundary cis elements prevent the spread of heterochromatin and nucleate the assembly of proteins, such as Sir2 (a histone deacetylase), Sir3, and Sir4, involved in silencing at these domains (Donze et al., 1999).
The efficiency of these boundary elements was adversely affected in some cohesin mutants (Donze et al., 1999). Likewise, cohesin binding at the mating-type locus on the right arm of chromosome III, known as HMR, was reduced in Sir mutant backgrounds, though unaffected at non-silenced CAR loci (Chang et al., 2005). These observations are consistent either with a role of cohesin in maintaining silencing or with a mode of cohesin binding that is dependent upon Sir proteins at silenced loci.

Besides these two putative roles in RNAi and heterochromatin maintenance, the cohesin subunit Scc1 has also been implicated in DNA repair (Birkenbihl and Subramani, 1992). This overlap of Scc1's function with DNA repair occurs in a variety of species from yeast to human (Sjogren and Nasmyth, 2001; Schar et al., 2004). When the two sister chromatids are held together during S and G2, homologous recombination is the preferred method to repair double-stranded breaks (DSBs). The fact that DSB repair is weakened in cohesin mutant backgrounds suggests that cohesin plays a vital role in this pathway. Three interesting papers suggest that de novo cohesin binding occurs at DSBs. In the first, the authors showed that in mammalian cells, cohesin accumulated at sites of laser-induced DNA damage (Kim et al., 2002). In the latter two papers, cohesin recruitment occurs de novo in S. cerevisiae, covering 100 kb regions bordering a single DSB (Unal et al., 2004). This increase of cohesin is dependent upon the phosphorylation of histone H2AX over a wide region by Mitosis Entry Checkpoint 1 (Mec1) and TELomere maintenance 1 (Tel1), two DNA damage checkpoint kinases. Cohesin binding here is reliant upon the repair protein Meiotic REcombination 11 (Mre11) and on Scc2, one of two proteins speculated to form a cohesin loading dock (see the above section on cohesin loading) (Unal et al., 2004).

Cohesin function may also play a role in enhancer activation.
Studies of the Drosophila Scc2 ortholog known as Nipped-B have shown that it is essential for the activation of the homeotic cut gene, which is regulated by a distant enhancer, while Scc3 inhibits that long-range activation of cut. These antagonistic interactions suggest that Scc2/Nipped-B acts both as a loader of cohesin and as an unloader of the cohesin that blocks the enhancer-promoter interaction (Rollins et al., 2004).

Overview of thesis

As mentioned above, there are multiple models proposed for how cohesin interacts with CAR loci. While there is a large body of in vitro work on cohesin structure, no model can adequately explain all of the data. Any model that is put forth must be able to address three key components of sister chromatid cohesion: (1) the interaction of the cohesin holocomplex with DNA/chromatin both pre- and post-DNA replication, (2) the stable maintenance of cohesin at CAR loci over a lengthy period of time, and (3) the positioning of the cohesin Scc1 subunit for easy access for cleavage by separase. An in vivo analysis of cohesin binding at CAR loci has the potential to directly address these questions.

The subsequent chapters of this thesis focus on several approaches undertaken to tease apart the cohesin structure and how it interacts with its DNA partners. To start with, a minichromosome affinity purification scheme was modified to allow for the isolation of cohesin bound in vivo to a plasmid backbone. In-depth revisions to published protocols were undertaken to generate significant yield for subsequent analysis. Chapter 2 comprehensively covers modifications to this procedure. In Chapter 3, isolated minichromosomes are observed by negative staining under the electron microscope. Two populations of minichromosomes are evaluated: one isolated from alpha factor-arrested cells in G1, when cohesin is not bound to CAR loci, and one isolated from nocodazole-arrested cells in M phase, when cohesin is bound to CAR loci. Coupled with nucleosome mapping at various CAR loci, the TEM data allow for the reevaluation of current models.
Due to the structural configuration of the coiled-coil arms of the cohesin complex bound to minichromosomes, a phylogenetic analysis of the Structural Maintenance of Chromosomes (SMC) family was undertaken to search for regions of high similarity in the arm domains among its members, which are part of the condensin complex and a DNA repair complex, in addition to cohesin. These phylogenetic analyses, documented in Chapter 4, additionally involved an examination of the origin of the meiosis-specific SMC1 protein.

Chapter 2

The Construction of CAR-Containing Minichromosomes and the Development of the Minichromosome Affinity Purification (MAP) Technique for Low-Copy Plasmids

This chapter contains data that will be submitted for publication by Alexandra Surcel (1), Douglas Koshland (3), Hong Ma (1,2,4), and Robert T. Simpson (1,2,5,6).

(1) Integrative Biosciences Graduate Degree Program, Pennsylvania State University
(2) The Huck Institutes of the Life Sciences, Pennsylvania State University
(3) Carnegie Institution of Washington, Department of Embryology
(4) Biology Department, Pennsylvania State University
(5) Biochemistry and Molecular Biology Department, Pennsylvania State University
(6) Deceased

Abstract

This chapter covers the construction of two CAR-containing minichromosomes from Saccharomyces cerevisiae and the development of a minichromosome affinity purification technique specific for low-copy, CEN3-containing minichromosomes. As previously published, the MAP method is based on the high-affinity interaction between the E. coli lac operator present in the minichromosome backbone and the lac repressor protein, which is immobilized on chitin beads in a column via a chitin-binding domain added to the fusion protein. Minichromosomes were generated by PCR cloning, coupled with restriction-enzyme internal deletions of two of the CAR loci. The fragments were inserted into the minichromosome backbone pALT, modified with a multiple restriction enzyme site. Characterization by ChIP of the minichromosome containing the STE2-BST1 CAR locus showed that cohesin binding occurred throughout the cell cycle, making this high-copy plasmid unsuitable for further analysis or as a control for centromeric plasmids. A minichromosome containing the CARC1 locus and an added CEN3 sequence was shown by ChIP to bind cohesin in S phase, but not in G1. Due to the presence of the CEN3 sequence, the published protocol had to be revamped to account for low copy number and to increase yield. This was achieved through a variety of modifications that resulted in a 45-50% yield, a 100-fold increase over yields obtained using published methods.
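The yield figures in the abstract imply a baseline for the published protocol. As a rough check, the arithmetic can be sketched as below. It assumes that the 100-fold figure compares percentage yields measured in the same way, which the abstract does not state; the baseline value is an inference, not a number reported here, and the function name is illustrative.

```python
def implied_baseline_yield(improved_pct: float, fold_increase: float) -> float:
    """Baseline yield (%) implied by an improved yield and a fold increase."""
    if fold_increase <= 0:
        raise ValueError("fold_increase must be positive")
    return improved_pct / fold_increase


# Reported range for the modified protocol: 45-50% yield, 100-fold increase.
low = implied_baseline_yield(45.0, 100.0)
high = implied_baseline_yield(50.0, 100.0)
print(f"Implied published-protocol yield: {low:.2f}-{high:.2f}%")
```

Under that assumption, the published protocol would have recovered roughly half a percent of this low-copy construct.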

Introduction

Protein-DNA interactions are paramount in a multitude of cellular processes and biochemical pathways. Such interactions have been documented for transcriptional regulation, chromatin packaging and dynamics, DNA repair, and structural regulation, among others. To further understand how these interactions occur in vivo, a plethora of techniques have been developed. Chromatin immunoprecipitation, fluorescence microscopy, protein-DNA crystallization, and DNA footprinting are all methods that address localization and interaction between DNA and proteins. Within this body of data, however, there is a dearth of information regarding specific structural interactions between DNA and its protein partners in vivo.

One technique that has been developed to address this need is the Minichromosome Affinity Purification (MAP) system. The MAP method is unique in that it allows chromatin-protein complexes assembled in vivo to be isolated from other cellular components. The isolation of such complexes allows for further structural characterization and biochemical analysis. MAP was originally established to compare the secondary chromatin structure of a locus in an actively transcribed state versus a repressed state, as well as to gain insight into the trans-acting factors involved in chromatin structural maintenance.

The principle of the MAP method was derived from two well-known facts. The first is that Saccharomyces cerevisiae stably packages and maintains episomal DNA as chromatin. This capacity has paved the way for using minichromosome systems to study a variety of biological processes, including the effect of nucleosome positioning and twisting on the function of genomic loci (Simpson et al., 1993), the interactions between trans-acting elements and chromatin (Ducker and Simpson, 2000), the effect of protein binding on chromatin remodeling (Xu et al., 1998), and the effect of chromatin structure on DNA repair (Smerdon and Thoma, 1990).
The backbone for MAP plasmids contains the TRP1/ARS1 sequences. TRP1 encodes N-phosphoribosyl-anthranilate isomerase and ARS1 is an autonomously replicating sequence. Both were identified as an element derived from a genomic EcoRI digested fragment that could be autonomously replicated and maintained at high copy numbers upon circularization and transformation into budding yeast (Zakian and Scott, 1982). The second fact upon which the MAP system is based is the repressive activity of the lacI protein. In E. coli and select other enteric bacteria, the lac operon is required for the metabolism and transport of lactose. When lactose is not available, the transcription of three lactose transporters and catalytic enzymes is prevented by the binding of the repressor protein lacI to the lac operator. This interaction is reversed in the presence of lactose. The cloning of the lac operator into the backbone of the minichromosome takes advantage of its affinity to the lac repressor-β-galactosidase (lacIZ), which is immobilized onto chitin beads by its fusion to a chitin-binding domain (CDB). The lacZ fusion protein also contains lacI to stabilize the protein and an intein domain, which was originally cloned as a release mechanism from the chitin column in the presence of DTT (Ducker and Simpson, 2000). However, the intein domain is not used because bound minichromosomes can be released by competitive interaction with added IPTG in the presence of high salt. Both the lac operator and the multi-enzyme cloning site were placed in nuclease hypersensitive regions of TRP1 that did not have repressive well-positioned nucleosomes (Ducker and Simpson, 2000). Figure 2.1 shows a schematic of the minichromosome backbone. The MAP protocol involves four main steps. Cells containing the minichromosome are harvested and washed two times in a sorbitol buffer prior to zymolyase digestion of the cell walls.
The cells are then washed two more times in sorbitol buffer to remove remaining zymolyase and resuspended in a low salt (50mM NaCl), tween-containing buffer in which they are homogenized. Lysed cells are incubated on ice for several hours to allow for the passive diffusion of minichromosomes. Afterwards, the cells are spun and the supernatant containing the minichromosomes is applied to the lac-IZ-containing column. The column is washed and the sample is eluted in high salt containing IPTG. The eluate is concentrated by centrifugation and applied to a 15-40% sucrose gradient, from which the aliquots containing the purified minichromosome are collected. The samples are then used for a variety of different analyses.

Figure 2-1: Plasmid map of pDTL, the backbone of the minichromosome constructs. pDTL contains the sequence for TRP1/ARS1. TRP1 functions as a marker, while ARS1 functions as an origin of replication for the construct. The lacO sequence is inserted in the 3’ end of the ARS1 locus (black region on map). Genes of interest are inserted in the linker region (purple). The AmpR gene cloned into a pUC19 insertion confers ampicillin resistance for work in E. coli. Additionally, ori – an origin of replication for bacteria – is located in the pUC19 sequence. The bacterial part of the construct is removed upon EcoRI digestion, and the construct is ligated prior to transformation into yeast.

MAP has been used successfully with high copy minichromosomes to study the interaction of the repressor complex Ste6-Tup1 and to identify novel proteins involved in yeast mating type switching (Ducker and Simpson, 2000; Simpson et al., 2004; Ruan et al., 2005). MAP was initially employed for this study to isolate CAR containing minichromosomes from alpha-factor arrested cells (in G1 phase when cohesin is not bound to CAR loci) and from nocodazole-arrested cells (in M phase when cohesin is bound to CAR loci). Isolated samples were then to be used for structural TEM analysis and for mass spec analysis to identify novel candidates involved in sister chromatid cohesion. In this chapter, I show that the standard MAP protocol cannot be applied to CAR containing minichromosomes, especially those with a CEN3 sequence inserted into the backbone. The MAP protocol had to be reworked specifically for low-copy CEN3 containing plasmids and in so doing, I was able to increase the yield from MAP experiments 100-fold from the original recovery. The samples obtained from this revamped protocol were of sufficient quality to be used for the TEM analysis discussed in Chapter 3.

Results

pCM26-1 plasmid generation and characterization

In order to study cohesin-chromatin interactions, a construct was generated that would undergo segregation in a manner similar to chromosomes. This plasmid, named pCM26-1, contains both a CEN3 sequence and CARC1 – an 829 bp locus on chromosome arm III, whose 5’ end overlaps with BUD3, a non-essential gene involved in bud neck development (Chant et al., 1995; Megee et al., 1999). See Figure 2.2A for a map of the pCM26-1 construct. Cloning into the backbone pDTL and subsequent ChIP work was done in the Koshland lab. pDTL is a plasmid derived from pALT, the original parental minichromosome (Ducker and Simpson, 2000). It has a multi-cloning site designed and engineered by Dr. John Diller, who graciously donated pDTL for this work. Cohesin could be localized to the plasmid CARC1 in nocodazole-arrested cells and not in alpha-factor arrested cells as determined by ChIP (Koshland, personal communication). Directionality of CEN3 had no bearing on cohesin binding at CARC1. Southern blot analysis of crudely extracted DNA from cells transformed with pCM26-1 showed that the copy number per cell was ~2 (see Figure 2.2B). Additionally, the presence of pCM26-1 did not affect cell division time – transformed cultures doubled every 1.7 hours, nearly identical to untransformed lines (see Figure 2.2C). Similarly, transformed cells exhibited small bud morphology at G1 and large bud morphology at M phase that was indistinguishable from wildtype (data not shown). As a control for pCM26-1, a construct was generated without the CEN3 sequence as the MAP method had been previously used only for high-copy number plasmids. ChIP of pCM26, which contained only CARC1, showed no cohesin binding at either G1 or S phase of the cell cycle (Koshland, personal communication). Because it could not bind cohesin, this construct was not used for further MAP analysis.

Figure 2-2: Minichromosome pCM26-1. (A) Schematic map of pCM26-1. CARC1 is located between the centromeric sequence CEN3 and the TRP1/ARS1 locus. The AmpR and ori sequences are removed by an EcoRI digest to eliminate bacterial sequence from the backbone. (B) Southern blot of the TRP1/ARS1 locus and pCM26-1 from the same cell line, showing a comparable copy number of the minichromosome to the genomic TRP1/ARS1 locus. The probe was generated by a digest of the minichromosome backbone and contains the 3’ end of the TRP1 gene and all of the ARS1 locus. (C) Growth curve of cell lines 8803 (wt) and 8803 transformed with pCM26-1 shows nearly identical doubling times for both populations.
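A doubling time such as the 1.7 hours quoted for pCM26-1 transformants is typically derived from two optical density readings taken during exponential growth. The following is a minimal sketch of that calculation; the function name and the OD readings are hypothetical and are not taken from the growth curve in Figure 2-2C.

```python
import math


def doubling_time(t0_h, od0, t1_h, od1):
    """Doubling time in hours between two exponential-phase readings."""
    if od0 <= 0 or od1 <= od0 or t1_h <= t0_h:
        raise ValueError("readings must increase in both time and OD")
    return (t1_h - t0_h) * math.log(2) / math.log(od1 / od0)


# Hypothetical readings (assumption, not thesis data): OD rising from
# 0.10 to 0.40 over 3.4 h corresponds to two doublings.
td = doubling_time(0.0, 0.10, 3.4, 0.40)
print(f"doubling time: {td:.2f} h")
```

Comparing this value between transformed and untransformed cultures gives the kind of side-by-side check described above.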

pAS1 plasmid generation and characterization

The metaphase-anaphase transition is partially dependent upon opposing forces between the kinetochore attached to centromeres and sister chromatid cohesion. As such, centromeric interactions are integral to cohesin dynamics and the presence of the CEN sequence on pCM26-1 ensured that cohesin binding on the minichromosome would be most similar to binding at genomic loci. However, the MAP method was devised for high-copy constructs, not low-copy ones. Additionally, the inability of cohesin to be detected by ChIP on a minichromosome with a CAR locus not containing a CEN3 sequence suggested that there might be an important cis-element that was excluded from the design of that minichromosome. Research identified such an element as Scc2/4 binding sites, which act as cohesin loading docks. When further research in the Uhlmann lab implied that cohesin localization to CAR regions was facilitated by the pushing of the cohesin complex by the transcriptional machinery used by convergently transcribed genes, a new minichromosome was created taking this information into account (Lengronne et al., 2004). This minichromosome, named pAS1, was designed as both a construct containing important cis elements possibly necessary for cohesin binding, and as a control for the CEN3 containing pCM26-1, since CARC1-containing pCM26 that lacks the CEN3 sequence could not bind cohesin (Koshland, personal communication). pAS1 was generated by a two-step cloning procedure that involved the insertion of a 2119 bp region into the parental minichromosome backbone pDTL (see Figure 2.3A for a construct map). The region cloned into pAS1 spanned both STE2 and BST1 genes and included their 5’-UTRs, which each contained one Scc2/4 binding site (Lengronne et al., 2004). The 268 bp intergenic region of these two convergently transcribed genes was identified as a CAR locus in addition to the 3’ end of both genes (Glynn et al., 2004; Lengronne et al., 2004). Because the size of STE2 and BST1 (1295 bp and 3089 bp respectively) was prohibitive to generating a minichromosome small enough to undergo passive diffusion out of the nucleus, in-frame internal deletions were produced in both genes. The final size of pAS1 was 3605 bp, in line with previously published minichromosome constructs. A Southern blot of crudely extracted DNA from cells transformed with pAS1 showed that the copy number of the minichromosome was approximately 35 copies/cell (see Figure 2.3B).

Figure 2-3: Minichromosome pAS1. (A) Schematic map of pAS1. Internal deletions were made of both BST1 and STE2. The construct contains an intergenic CAR with overlap of the 3’-end of both genes. (B) Southern blot of the TRP1/ARS1 locus and pAS1 from the same cell line, showing a copy number of over 35/cell. The probe was generated by a digest of the minichromosome backbone and contains the 3’ end of the TRP1 gene and all of the ARS1 locus.
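Because the same TRP1/ARS1 probe detects both the minichromosome and the genomic locus, a copy number per cell can be estimated from the ratio of band intensities on the Southern blot. The sketch below illustrates that arithmetic under stated assumptions: the genomic locus is treated as single copy per cell, probe hybridization is assumed equally efficient for both targets, and all intensity values are hypothetical rather than measured from Figure 2-2B or 2-3B.

```python
def copies_per_cell(plasmid_bands, genomic_band, genomic_copies=1):
    """Estimate plasmid copies per cell from Southern blot band intensities.

    plasmid_bands: intensities of all plasmid forms (e.g. nicked, supercoiled).
    genomic_band: intensity of the genomic TRP1/ARS1 band in the same lane.
    genomic_copies: assumed copies of the genomic locus per cell.
    """
    if genomic_band <= 0:
        raise ValueError("genomic band intensity must be positive")
    return genomic_copies * sum(plasmid_bands) / genomic_band


# Hypothetical intensities in arbitrary units (assumption, not thesis data).
high_copy = copies_per_cell([2600, 900], 100)  # a pAS1-like, high-copy case
low_copy = copies_per_cell([150, 50], 100)     # a pCM26-1-like, CEN case
print(high_copy, low_copy)
```

Summing the nicked and supercoiled bands before taking the ratio avoids undercounting plasmid that has relaxed during extraction.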
To ensure that cohesin localized to the CAR locus of pAS1, chromatin immunoprecipitation was performed with three primer pairs spanning STE2, BST1 and their shared CAR, using two genomic controls. The two genomic loci were CARC1 and CARL2 with primer pairs donated by the Koshland lab. The ChIP experiment demonstrated that cohesin binding occurred at pAS1’s CAR locus during both G1 and M phases, whereas cohesin binding to the genomic controls occurred only during M phase as predicted (see Figure 2.4). Binding of cohesin to pAS1 was reduced in the alpha-factor arrested cells as compared to the nocodazole-arrested cells, but its reduction was only two-fold. These ChIP results showed two things. First, the additional cis elements cloned into pAS1 were sufficient for inducing cohesin binding. Second, the inability of cohesin to be cleaved from the entire population of pAS1 at the metaphase-anaphase transition suggests that interactions with the centromere are important for proper cohesin dissolution. As such, pAS1 could not function as a proper control for pCM26-1.
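The comparison between arrests rests on a percentage ChIP value for each primer pair. The text does not state how this percentage was computed, so the sketch below uses one common convention as an assumption: the antibody signal minus the no-antibody background, divided by the total (input) signal scaled to the same amount of chromatin. The function name, the input fraction, and all band intensities are hypothetical.

```python
def percent_chip(ip, mock, total, total_fraction):
    """Percent of input chromatin recovered in the immunoprecipitate.

    ip: band intensity with antibody.
    mock: band intensity without antibody (background).
    total: band intensity of the total (input) lane.
    total_fraction: fraction of the IP-equivalent chromatin loaded as total.
    """
    if total <= 0 or not 0 < total_fraction <= 1:
        raise ValueError("total must be positive and fraction in (0, 1]")
    input_equivalent = total / total_fraction
    return 100.0 * (ip - mock) / input_equivalent


# Hypothetical intensities (assumption, not values from Figure 2-4).
m_phase = percent_chip(ip=500, mock=20, total=1200, total_fraction=0.1)
g1_phase = percent_chip(ip=260, mock=20, total=1200, total_fraction=0.1)
fold_reduction = m_phase / g1_phase
print(f"M: {m_phase:.1f}%  G1: {g1_phase:.1f}%  fold: {fold_reduction:.1f}")
```

With these illustrative numbers the G1 signal is half the M phase signal, the same order of difference as the two-fold reduction reported for pAS1.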

Figure 2-4: Chromatin Immunoprecipitation of pAS1. (A) Ethidium bromide stained agarose gel of PCR products from pAS1 generated after ChIP to 6-HA-Scc1. Three sets of primers were used to ChIP pAS1 in nocodazole arrested and alpha factor arrested cells – one over the BST1 deletion, one over the STE2 deletion and one over the intervening CAR. Samples are marked as follows: A – nocodazole arrested with no antibody; B – nocodazole arrested with antibody; C – nocodazole arrested total; D – alpha factor arrested with no antibody; E – alpha factor arrested with antibody; F – alpha factor arrested total. (B) Graphic representation of percentage ChIP from PCR shown in (A). All primer sets chromatin immunoprecipitated Scc1 in both G1 and M phase, demonstrating that pAS1 was unreliable for further analyses. (C) Ethidium bromide stained agarose gel of PCR products of genomic loci known to ChIP cohesin in a cell cycle dependent manner. Lettering is the same as in (A).

Changes to MAP methodology for pCM26-1

Following the MAP protocol used in previous published studies for the isolation of pCM26-1 from arrested cells generated an extremely low yield of less than 0.5% of starting material. In order to increase the sample yield to sufficient quantities for the analysis undertaken in Chapter 3, changes were made at several key steps of the MAP method. The first significant loss of minichromosome occurred in the passive diffusion of minichromosome step. Over 80% of the sample was lost in this single step. Although the buffer used here (MB) contained 0.5% tween, it was thought that additional detergent could aid in disturbing interactions between a CEN containing plasmid and the nuclear envelope. A trial series of varying concentrations of triton and tween added to the homogenate buffer was undertaken. The supplementation of an additional 0.5% tween generated an increase of 40% more pCM26-1 (see Figure 2.5). Consequently, future MAP preparations included this change in the MB passive-diffusion of nuclei step. The second largest loss of sample occurred during the concentration of the eluate during centrifugation. Although the Millipore column type used in previously published studies was no longer commercially available, the Amicon Ultra centrifugal filter device, with a 30k cutoff had the same specifications. However, use of this filter device led to a complete loss of sample eluted off of the column. Several alternatives were pursued to modify this step. Firstly, another filter type was used – Microcon centrifugal devices with a 30k cutoff. When this failed to concentrate retrievable sample, a series of different pre-washes of the centrifugal device was carried out. These included 5mM and 10mM BSA, the elution buffer alone, and the elution buffer with 0.5% added tween. The highest recovery was obtained from the prewash with the elution buffer plus added 0.5% tween (see Figure 2.6). 
While not noticeable with 2, 4, and 8L preparations, when a 40L preparation of nocodazole-arrested cells was performed, the entire eluate precipitated during concentration. The precipitate could not be resuspended despite multiple incubations with elution buffer performed on ice. Analysis of fractions on an ethidium bromide stained agarose gel showed that all of the minichromosome was trapped in the precipitate (see Figure 2.7).

Figure 2-5: Southern blot of pCM26-1 in the tween and triton trials. Identical amounts (1% of total volume) of homogenate and supernatant were loaded onto the gel. Homogenate (H) is defined as the MAP sample after passive diffusion of minichromosome and supernatant (S) is defined as the homogenate after centrifugation. The supernatant contains the sample that is loaded onto the column. In all conditions shown, the homogenate is on the left and the supernatant is on the right. Percentage of pCM26-1 retained from H to S was calculated as follows: For the 0.5% tween – 24%; for the 0.5% triton – 18%; for the 1% tween – 43%; for the 1% triton – 32%. The top band is the nicked plasmid, while the lower band is the supercoiled plasmid. The blot was probed with a fragment of the TRP1/ARS1 locus.
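The retention percentages in this caption compare plasmid signal in the supernatant with that in the homogenate. A minimal sketch of that calculation follows, assuming that the nicked and supercoiled bands are summed and that equal fractions of each sample were loaded, as the caption states. The band intensities are hypothetical; the dictionary holds the percentages reported in the caption.

```python
def percent_retained(homogenate_bands, supernatant_bands):
    """Percent of plasmid signal carried from homogenate (H) to supernatant (S).

    Each argument is a (nicked, supercoiled) pair of band intensities.
    """
    h_total = sum(homogenate_bands)
    if h_total <= 0:
        raise ValueError("homogenate signal must be positive")
    return 100.0 * sum(supernatant_bands) / h_total


# Hypothetical intensities in arbitrary units (assumption, not blot data).
example = percent_retained((600, 400), (140, 100))

# Retention values as reported in the Figure 2-5 caption.
reported = {
    "0.5% tween": 24,
    "0.5% triton": 18,
    "1% tween": 43,
    "1% triton": 32,
}
best_condition = max(reported, key=reported.get)
print(example, best_condition)
```

Ranking the reported values this way picks out 1% tween, the condition adopted for later preparations.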

Figure 2-6: Southern blot of pCM26-1 concentrated under varying conditions. 5% of eluate and concentrate was loaded onto a 0.7% TBE-agarose gel, whose blot was probed with a restriction enzyme digest generated fragment of the TRP1/ARS1 locus from pDTL. Samples were isolated from asynchronous cultures and concentrated by centrifugation in devices that were pretreated with 5 and 10 mM BSA, elution buffer, or elution buffer with tween. The highest yield was obtained in the elution buffer plus tween pre-washed column. Differences in the eluate are attributed to differences in the starting quantities of the sample among the varying preparations.

Figure 2-7: Ethidium bromide stained agarose gel of pCM26-1 washes of the precipitate that formed from a large-scale preparation. Despite multiple incubations in elution buffer, the precipitate remained. This analysis shows that pCM26-1 was found mainly in that precipitate and could not be resolubilized in EB.

This suggested that the high density of minichromosome plus RNA and genomic contaminant led to precipitation and loss of sample that could not be seen with preparations from smaller cell volumes. Several protocol modifications were performed to reduce the presence of contaminants, which in turn eliminated precipitation visible by eye and which generated a retention of almost 90% of the sample during concentration. The first modification involved washing the column with a higher salt MB solution – trials with 50mM, 100mM, and 200mM NaCl in MB showed that a significant fraction of the genomic contaminant was eliminated by a column wash with 200mM NaCl/MB. Because RNA adheres strongly to glass columns and the high salt EB strips the RNA from the column surface during minichromosome elution, the second modification involved adding EB in 1ml increments instead of 5ml all at once, in order to reduce contact time with the glass column. These two changes eliminated much of the contamination from the elution step. The final problem that was overcome in this procedure for pCM26-1 involved generating a better separation of the minichromosomes from other material in the sucrose gradient. The original MAP method called for an overnight run at 30K. This led to a significant overlap of upper weight genomic contaminant, as well as RNA. Several variations of the run time were performed, and the optimal running conditions were 4 hours at 40K (see Figure 2.8). Overall, the changes to the MAP method – altering the MB buffer to contain added tween, preincubating the filter centricons with EB plus 1% tween, changing how the column is loaded and washed, decreasing the sucrose gradient spin – led to a 100-fold increase in the amount of isolated pCM26-1 over the yield using the original method, as shown in Figure 2.9.
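The 100-fold figure follows from the yields quoted in this chapter: less than 0.5% of starting material with the published protocol versus 45-50% after the modifications. The sketch below reproduces that arithmetic and also illustrates why losses at individual steps compound in the overall yield. The per-step retention values in the second example are hypothetical and are not measurements from this work.

```python
import math


def fold_increase(new_yield_pct, old_yield_pct):
    """Fold change between two yields expressed as percentages."""
    if old_yield_pct <= 0:
        raise ValueError("old yield must be positive")
    return new_yield_pct / old_yield_pct


def overall_yield(step_retentions):
    """Overall fractional yield as the product of per-step retentions."""
    return math.prod(step_retentions)


# Yields quoted in the text: 0.5% (upper bound, published protocol)
# versus 45-50% (modified protocol).
low = fold_increase(45, 0.5)
high = fold_increase(50, 0.5)

# Hypothetical per-step retentions (assumption), for illustration only:
# diffusion, column binding, concentration, gradient.
illustrative = overall_yield([0.5, 0.8, 0.9, 0.9])
print(low, high, round(illustrative, 3))
```

Because the original yield is an upper bound, the 90- to 100-fold range computed here is a minimum estimate of the improvement.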

Figure 2-8: Changes to the sucrose gradient. (A) Sucrose gradient according to the published protocol. (i) Ethidium bromide stained agarose gel (ii) Southern blot of the agarose gel, probed with a TRP1/ARS1 specific probe (iii) Ethidium bromide stained gel with aliquot of concentrated fractions from this prep. The black arrows are minichromosome and the brackets are RNA. (B) Sucrose gradient with modified run conditions yields better separation from RNA and less genomic contamination. (i) Ethidium bromide stained agarose gel of fractions (ii) Southern blot of the agarose gel (iii) Ethidium bromide stained gel with aliquot of concentrated fractions from this prep. All samples were treated with proteinase K for five hours prior to gel electrophoresis.

Figure 2-9: Southern blots of two MAP experiments of pCM26-1. The top panel shows aliquots taken throughout the minichromosome affinity purification using the published protocol (Ducker and Simpson, 2000). The bottom panel shows the same aliquots taken throughout a similar minichromosome affinity purification after all of the modifications were undertaken. The homogenate refers to sample after passive diffusion from nuclei and the supernatant is the sample that is applied to the lacIZ column. Percentages are by volume of total sample. The blot was probed with a fragment of the TRP1/ARS1 locus.

Additional considerations for the MAP technique

Besides changing steps in the MAP technique to accommodate the low-copy number CEN3 containing plasmid pCM26-1, there are several other matters that needed to be closely monitored over time. After thirty MAP experiments and over 15 nuclei preps that involve a highly viscous buffer, the homogenizer-pestle fit deteriorated, leading to a decrease in the homogenization efficiency prior to the passive diffusion of nuclei step. This was resolved by using a new homogenizer/pestle combination. Furthermore, the effectiveness of the chitin beads in binding the lac-IZ-CDB fusion protein also decreased over time, well within the expiration date provided by the manufacturer. New beads were purchased that bound the fusion protein at a higher rate, leading to more sample binding to the column.

Protein analysis of isolated minichromosomes detects presence of cohesin

In order to ensure that pCM26-1 isolated from nocodazole-arrested cells contained cohesin, a western blot was performed. Concentrated samples from both nocodazole and alpha-factor arrested cells showed the presence of cohesin only in the sample from M phase arrested cells (see Figure 2.10A). Additionally, the sample was run on a 12% SDS-polyacrylamide gel and stained with SYPRO Orange. This protein gel showed that the minichromosome samples had at least sixteen proteins, eight over the expected histone and cohesin components (see Figure 2.10B).

Figure 2-10: Protein analysis of purified pCM26-1. (A) Western blot of 20% volume of concentrated pCM26-1 isolated from an eight-liter preparation shows the presence of HA-Scc1 in M phase (M) versus G1. (B) SDS-PAGE of pCM26-1 stained with SYPRO Orange shows a minimum of sixteen individual proteins co-eluting with the minichromosome complex. These bands are designated by black arrows.

Discussion and significance

MAP methodology

In this chapter, I have developed a minichromosome affinity purification technique specific for the isolation of low-copy number, centromeric minichromosomes. I have characterized two minichromosomes – pAS1 and pCM26-1 – and determined that only a minichromosome with a centromeric backbone binds cohesin in a cell-cycle dependent manner. Additionally, I have improved upon previously published protocols for MAP using the chitin-lacIZ column and I have identified key steps in the purification procedure that can be readily modified to meet the specifications of other minichromosomes. The changes to the MAP method have led to sample of sufficient yield and purity with which to do structural TEM analysis (Chapter 3).

Important cis-elements needed for cohesin binding constructs

One of the main advantages to the MAP method was the generation and use of a high-copy number plasmid based on the TRP1/ARS1 locus. However, when CAR loci (both CARC1 and CARL2) were inserted into the pDTL backbone, cohesin could not be chromatin immunoprecipitated to that sequence. This suggested that there were important cis-elements that were missing from the construct. One such element was proposed to be Scc2/Scc4 binding sites believed to be cohesin loading docks (Lengronne et al., 2004). Based on the model of transcriptional machinery pushing cohesin rings to intergenic CARs, two Scc2/Scc4 binding sites had to be incorporated into the minichromosome backbone, to sandwich the CAR and maintain cohesin in a small, defined space. Another interpretation of this data is that the centromeric sequences nucleate cohesin binding more efficiently than individual CARs or Scc2/Scc4 binding sites. The STE2/BST1 sequences were chosen based on a preponderance of work done with their intergenic CAR (Lengronne et al., 2004). The use of the complete ORFs for both of these genes would have made a minichromosome that exceeded the size recommendation for constructs using the MAP technique (Ducker, thesis). To overcome this obstacle while leaving both the Scc2/4 binding sites and the CAR site unaltered, internal in-frame deletions were generated of both STE2 and BST1. The final construct, named pAS1, retained the high-copy number similar to its parental construct, but was unable to bind cohesin in a cell-cycle dependent manner. ChIP experiments show that cohesin was bound to the minichromosome during both G1 and M phase. This indiscriminate binding of cohesin suggests that constructs not subject to normal chromosome segregation may bypass correct cohesin dissolution at the metaphase to anaphase transition. The presence of the centromeric locus in pCM26-1 demonstrated that CAR sequences alone are not sufficient for cohesin binding.
In all eukaryotes, centromeres have a high affinity for cohesin, starting in early S phase. While cohesin dissolution from centromeres is mechanistically different than that from chromosome arms in most higher eukaryotes, Saccharomyces cerevisiae does not have this bi-modal removal system. The requirement of CEN3 for cohesin binding to pCM26-1 therefore implies one of two occurrences: Either cohesin binding along arms requires a nucleating cohesin binding site that activates binding at CARs, or the cohesin binding that was observed in the ChIP experiments is reflective of cohesin interaction at CEN3 and not CARC1, as the two are separated by a short distance making them indistinguishable from each other with this technique. It is more likely that the latter is true, but it is equally important to consider other possible explanations for this observation.

Improvements to the MAP technique

As outlined in the introduction, the MAP protocol was effective for isolating a number of minichromosome constructs that were used to characterize a variety of chromatin/protein interactions. However, the introduction of the centromeric sequence into pCM26-1, required for cohesin binding, generated a recovery of less than 0.5% of starting material. This loss from a low-copy number system renders the MAP technique unfeasible for studying cohesin-chromatin interactions. In an attempt to salvage this method for use with pCM26-1, numerous adjustments were made throughout the protocol. The first major loss occurred during the passive diffusion of minichromosomes out of the nuclei. The addition of a detergent was tested with the argument that the CEN3 or CARC1 locus impeded the diffusion process and that added Triton X-100 or Tween-20 would aid in the dissociation of the minichromosome from the nuclear envelope (Li et al., 1997; Montpetit et al., 2005). Detergent did in fact increase the recovery of minichromosome at this step, from 24% to 43% retention with an additional 0.5% tween (see Figure 2.5). Higher detergent concentrations were not pursued in an effort to keep potential protein denaturation to a minimum. The second major loss of sample occurred during the concentration of the eluate via centrifugation by filter columns. Almost all of the eluted sample was lost at this step. Besides using other commercially available centricon systems, a series of different pre-washes was tested for their ability to increase recovery. Low concentrations of BSA, elution buffer alone, and elution buffer with added tween were all evaluated because of their ability to non-specifically interact with a number of different membrane surfaces. Of the attempted conditions, pre-washing with EB plus added tween generated almost 80% recovery from the eluate step. This suggests that the tween in the preincubation buffer non-specifically interacted with the membrane, blocking the binding of the MAP samples to the membrane.
This discovery helps to explain the loss of sample at this step since EB alone contains some tween and most likely led to the nonspecific binding of the sample to the membrane and subsequent loss during concentration. While the quality of the minichromosome isolated with the MAP technique was high as verified by both Southern and western blot analyses, preparations of pCM26-1 were not pure enough for TEM analysis. Significant amounts of genomic DNA and RNA co-purified with the minichromosome and could not be sufficiently removed during the sucrose gradient purification step. Whereas purity (defined as the ratio of plasmid to genomic contaminants) of other constructs as determined by ethidium bromide staining was in the 50-70% range (Ducker and Simpson, 2000), the fact that pCM26-1 is a low-copy number plasmid presented a set of challenges that high-copy number construct MAPs did not. In reviewing the purification of minichromosomes undertaken by other members of the Simpson lab, all preparations contained some upper weight genomic contaminant and/or RNA contaminant. This was not a problem with high-copy number plasmids where the ratio of minichromosome to contaminants was high. However, it becomes problematic when the final material is in low supply, making TEM work impractical. The amount of RNA in the sample was reduced by adding the EB in 1ml aliquots, instead of applying all five or ten milliliters at once. Because RNA binds to the glass column and the high salt in the elution buffer strips the RNA from the sides of the column, reducing the amount of interaction between the column sides and EB reduced the amount of RNA in the sample by almost 60%. RNase A treatments were not performed because the presence of noncoding RNAs from CAR loci was detected by Northern blots (see Appendix 2), suggestive of a possible role in cohesin binding.
The genomic contaminant could not be reduced by changes to the sucrose gradient procedure as there was significant overlap with the minichromosome samples. Instead of attempting to reduce the amount of genomic contaminant in the concentrate, I focused on removing it from the eluate. This led to a trial of varying the wash conditions of the column. 200mM NaCl was sufficient to wash about 40% of the genomic contaminant away, without eluting off the minichromosome. While the genomic material could not be removed in its entirety, its reduction made TEM work more feasible. This host of changes to the MAP technique helped to identify two steps that can be adjusted for different constructs, namely the make-up of the MB buffer for the passive diffusion step and the sucrose gradient conditions. The remaining changes – the pretreatment of the filter, the handling of the column, the wash conditions – coupled with attention to reagent deterioration over time (the homogenizer integrity and chitin beads’ affinity), should be considered permanent changes to the protocol irrespective of the construct used. The work presented in this chapter represents three years of research and over 600 liters of culture and has culminated not only in the development of the MAP technique for low-copy number and CEN containing constructs, but also in the attainment of sample for other analyses (see Chapter 3).

Material and Methods

Strains

All cloning work and plasmid purifications were done in Escherichia coli DH5α. The lac-IZ fusion protein was expressed from pTLIZ plasmid (constructed by Dr. Charles Ducker) in Escherichia coli BL21-DE3 cells. All yeast strains were isogenic with strain W303 (MATa/MATa ADE2/ade2 CAN1/can1-100 CYH2/cyh2 his3-11,15/his3-11,15 LEU1/leu1-c LEU2/leu2-3,112 trp1-1:URA3:trp1-3' /trp1-1 ura3-1/ura3-1). Strains 8803 (6HA-Mcd1p) and 1607 were generously provided by the Uhlmann laboratory (described in (Lengronne et al., 2004)).

Plasmids

The minichromosome backbone pALT was designed by Charles Ducker and modified with a multi-cloning site into pDTL by John Diller who generously donated it for this project. pCM26-1, pCM26-2, pCM27-1, and pCM27-2 were cloned into the pDTL backbone and donated by members of Doug Koshland’s laboratory. The pCM backbone was generated by digesting pCM26-1 with NotI. pAS1 was generated by a two-step cloning procedure. STE2 and BST1 with their respective UAS were individually cloned by PCR into pGEM-T vector (Promega) with BamHI and XmaI sites engineered into each vector respectively. An internal deletion of 1089 bp from 134 to 1220 of STE2 was obtained by HpaI and PstI double digestion of pGEM-T-STE2. An internal deletion of 2545 bp from 211 to 2736 of BST1 was obtained by a partial digestion with HincII of pGEM-T-BST1. The ∆STE2 fragment was isolated from a pGEM-T-∆STE2 restriction enzyme digestion with XmaI and a partial digestion with HindIII and subsequently ligated to pGEM-T-∆BST1 linearized with BamHI. The entire insert was amplified via PCR, ligated into pGEM-T to generate pGEM-T-∆STE2∆BST1, isolated following BamHI and XmaI digestion, and ligated into pDTL to create pAS1. All minichromosomes were digested with EcoRI to remove the bacterial backbone and religated prior to transformation into yeast strains.

Cell Synchronization

Exponentially growing cells were cultured in appropriate minimal media with 20% dextrose at 30°C, 250rpm. Cells were arrested in G1, S, or M phase by the addition of 10mg/ml alpha factor, 100mM hydroxyurea (USBiological), or 15µg/ml nocodazole (USBiological), respectively. 1% final volume DMSO was added concurrently with the addition of nocodazole to cultures arrested in M phase. Cell synchronization was monitored by FACS analysis of propidium iodide (Sigma) stained cells on a Coulter XL-MCL I single laser cytometer.

Smash and Grab DNA isolation from yeast

Two milliliters of exponentially growing cells were centrifuged and the pellet resuspended in 0.2ml S/G lysis buffer (10mM Tris-HCl, pH 8.0, 2% Triton X-100, 1% SDS, 100mM NaCl, 1mM EDTA, pH 8.0). 0.3g acid-washed beads and 0.2 ml phenol chloroform were added. The sample was vortexed for 10 minutes and then centrifuged for 5 minutes. The supernatant was transferred to a fresh tube and precipitated with 1ml room temperature 100% ethanol plus 20 µl 3M NaOAc, pH 5.3 by centrifugation for 15 minutes. The pellet was rinsed twice with 70% ethanol and speed-vac dried. The pellet was resuspended in 20 µl 0.1XTE with 5 µg/ml RNase A.

Minichromosome Affinity Purification (MAP)

MAP was adapted from the procedure previously described by Ducker, 2001. The following protocol is for a 4L arrested culture. The procedure was scaled up linearly for larger cultures. Harvested cells were washed two times in 30ml SB (1.4M sorbitol, 40mM HEPES, 0.5mM MgCl2, pH 7.5) with 1mM phenylmethylsulfonyl fluoride (PMSF, Sigma-Aldrich) and 10mM β-mercaptoethanol (Fisher) and spun at 5,000 rpm in an SS34 or Sorvall G-20 for five minutes. Cells were resuspended in 30ml SB, 1mM PMSF to a total volume equivalent to 4X wet pellet weight. 10mg/ml freshly made zymolyase (Cape Cod) was added to a final concentration of 0.5mg/ml and the sample was incubated at 30°C for approximately 20 minutes or until spheroblasting was completed, as determined microscopically. Volume was brought up to 30ml SB, 1mM PMSF and the sample was spun at 5,000rpm for 5 minutes. All subsequent steps were performed on ice. Pellets were gently resuspended with plastic 25ml pipettes and washed twice in 30ml cold SB, 1mM PMSF. Pellets were resuspended in 10ml MBB (150mM NaCl, 20mM HEPES, 1mM EDTA, 0.1% Tween-20, pH 8.0), with 1mM PMSF, 10µg/ml aprotinin, 2µg/ml leupeptin, 2µg/ml pepstatin A (Sigma) and incubated on ice for 15 minutes. MAP of pCM26-1 and pCM27-1 used an MBB buffer with 0.6% Tween-20. The chilled spheroblasts were lysed in a Thomas glass homogenizer and Teflon motor-driven pestle with 8 strokes. Samples were incubated on ice for 3-4 hours with occasional gentle agitation on a platform shaker to allow for the passive diffusion of minichromosome from the nuclei, then centrifuged at 18,000rpm for 30 minutes. The supernatant was incubated on a rotator with charged column resin (see column preparation) at 4°C for one hour, before being loaded onto the column by gravity at a flow rate of 0.5ml/minute. The column was washed with 20ml MBB, followed by 20ml MBB-200 (MBB with 200mM NaCl). 1ml elution buffer (300mM NaCl, 20mM HEPES, 1mM EDTA, 0.1% Tween-20, pH 8.0) was applied to the column and allowed to flow through. Although IPTG was initially used as a competitor for the minichromosome interacting with the column, high salt proved to be a better competitor and has been used for all MAPs.
An additional 1ml elution buffer (EB) was added and was incubated on the column for 30 minutes. An additional 3 ml EB was added to the column and the eluate was collected in its entirety and diluted immediately 1:1 in cold distilled water. The diluted eluates were centrifuged in a Millipore 30K centricon filter to a final volume of 300-400µl. Eluates were loaded onto 15-40% sucrose gradients (Ultrapure sucrose, Gibco BRL) and spun at 40K, 4°C for 4 hours. Samples were harvested in 0.5ml aliquots, of which 25µl was proteinase K treated at 50°C for 2 hours before being loaded onto a 0.8% TBE agarose gel (SeaKem ME), transferred to Hybond-NX membrane (Amersham-Biosciences), and assessed by Southern blot analysis with a probe specific to TRP1-ARS1. Sucrose gradient aliquots containing minichromosome were centrifuged in Nanosep 30K concentrators (Pall Gelman Lab), with a volume reduction to 50-100µl. To monitor recovery of sample throughout MAP, 1-10% of sample was removed at the following steps: after passive diffusion of nuclei (1%), prior to application to the column matrix (1%), flowthrough off the column (1%), wash (2%), and eluate (10%). These samples were treated with 100µg/ml RNase A (Sigma) at 37°C for 30-120 minutes. 2µl proteinase K was added in addition to SDS to a final concentration of 1%. Samples were incubated at 50°C for a minimum of 2 hours, phenol chloroform extracted, and ethanol precipitated. Precipitates were resuspended in 20µl 0.1XTE, run on a 0.8% TBE agarose gel (SeaKem ME), transferred to Hybond-NX membrane (Amersham-Biosciences), and assessed by Southern blot analysis with a probe specific to TRP1-ARS1. Band intensities were quantified using the ImageQuant software program.
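
As a minimal sketch of how recovery could be estimated from the quantified Southern blot bands, the following Python snippet scales each band intensity by the fraction of sample removed at that step and expresses it relative to the first monitoring step. Only the sampled fractions (1%, 1%, 1%, 2%, 10%) come from the protocol above; the function name, the step labels, and the band intensities are hypothetical, and the calculation assumes that band intensity is proportional to the amount of minichromosome loaded.

```python
# Fraction of the total sample removed at each monitoring step (from the protocol).
SAMPLED_FRACTION = {
    "post_diffusion": 0.01,
    "pre_column": 0.01,
    "flowthrough": 0.01,
    "wash": 0.02,
    "eluate": 0.10,
}


def percent_recovery(band_intensity, reference_step="post_diffusion"):
    """Scale each band by its sampled fraction and report the result as a
    percentage of the scaled signal at the reference step.

    Assumption: band intensity is linear in the amount of DNA loaded.
    """
    totals = {
        step: intensity / SAMPLED_FRACTION[step]
        for step, intensity in band_intensity.items()
    }
    reference = totals[reference_step]
    return {step: 100.0 * total / reference for step, total in totals.items()}


# Hypothetical band intensities (arbitrary units), for illustration only.
example = {
    "post_diffusion": 1000.0,
    "pre_column": 900.0,
    "flowthrough": 200.0,
    "wash": 100.0,
    "eluate": 5000.0,
}
recovery = percent_recovery(example)
for step, value in recovery.items():
    print(f"{step}: {value:.1f}% of reference")
```

Dividing by the sampled fraction is what makes the 10% eluate sample comparable with the 1% samples taken earlier in the procedure.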

LacIZ Column Preparation

BL21(DE3) cells containing pTLIZ were grown to an OD of 0.8 in 2XYT (16g bacto-tryptone, 10g yeast extract, 5g NaCl per litre) at 37°C. Cells were incubated on ice for 15 minutes after which expression was induced with 40µM isopropylthio-β-D-galactopyranoside (IPTG) and the cultures were grown overnight at 15°C. Harvested cultures were resuspended in 15ml cold CBB (20mM HEPES pH 8.0, 1mM EDTA, 500mM NaCl, 0.1% Tween-20) and sonicated 10s at 50% six times with 30s incubation on ice between sonications. Clarified lysate obtained by centrifuging the sample at 12,000xg for 30 minutes at 4°C was applied by gravity at 0.5ml/minute to 1ml chitin beads (New England Biolabs) pre-equilibrated with 20ml CBB. The column was washed with 15ml CBB, followed by 10ml MBB.

Chromatin Immunoprecipitation (ChIP), PCR, and Data Analysis

ChIP was performed as previously described (Megee et al, 1999), after a 1-hour fixation with 1% formaldehyde (Fisher). Samples were immunoprecipitated with anti-HA antibody (Covance). Input DNA was diluted 45-fold prior to PCR analysis. 50µl PCR reactions containing 3µl IP were run on a thermo-cycler. Primer information can be found in Appendix 1. PCR products were resolved on 2.3% agarose gels in 1XTBE buffer stained with ethidium bromide. Gels were visualized on a Typhoon Imager and band intensities were quantified with the ImageQuant software program.

Western blot

Samples were added 1:1 to 2XSDS loading buffer and boiled for five minutes before being briefly centrifuged and placed on ice. The samples were loaded onto a 12% SDS-polyacrylamide gel and run at 120V for 1.5-2 hours. The samples were transferred onto Immuno-blot PVDF membrane (Bio-Rad) in transfer buffer (192mM glycine; 25mM Tris-HCl) overnight at 30V. Blots were blocked for one hour at room temperature in 1% non-fat milk/PBS-T (1XPBS, 0.1% Tween). After being rinsed twice with PBS-T for fifteen minutes per wash, the blots were incubated at room temperature for one hour with primary antibody (HA, 1:5000, from Covance) in 0.5% non-fat milk in PBS-T and then rinsed again, twice with PBS-T. The blots were incubated with a secondary antibody (goat anti-mouse, 1:2000, from Covance) for one hour. After the final 2 X 15 minute washes in PBS-T, the blots were incubated with pico-chemiluminescence reagents (Bio-Rad) for 2 minutes and exposed to film (Kodak) for between 2 minutes and overnight.

Chapter 3

Cohesin Interaction at One CAR Locus Shows a Flexible Rod Multi-Complex Structure

This chapter includes data that will be submitted for publication by Alexandra Surcel1, Douglas Koshland3, Hong Ma1,2,4, and Robert T. Simpson1,2,5,6.

1 Integrative Biosciences Graduate Degree Program, Pennsylvania State University
2 The Huck Institutes of the Life Sciences, Pennsylvania State University
3 Carnegie Institution of Washington, Department of Embryology
4 Biology Department, Pennsylvania State University
5 Biochemistry and Molecular Biology Department, Pennsylvania State University
6 Deceased

Abstract

Models of cohesin binding suggest various conformations and numbers of the cohesin complex at CAR loci. Although in vitro studies show that cohesin forms a single ring, in vivo interactions with chromatin have yet to be visualized. The MAP technique provides an excellent method for isolating chromatin-cohesin complexes assembled in vivo from Saccharomyces cerevisiae. This chapter investigates the structural interaction of cohesin with pCM26-1, a CARC1 containing minichromosome. Using the MAP technique, pCM26-1 was isolated from alpha-factor arrested and nocodazole-arrested cells, at G1 and M phase respectively. TEM analysis of minichromosomes from G1 phase without cohesin bound to them shows a singular circular minichromosome with a diameter of approximately 60 nm and identifiable nucleosomes measuring on average 10nm in diameter. TEM analysis of minichromosomes from M phase shows two replicated minichromosomes interacting with a long flexible rod, suggestive of collapsed cohesin rings. These minichromosomes interact with the cohesin rod in a position-dependent manner, not a randomly positioned manner as the widely accepted embrace model proposes. Nucleosome mapping of genomic CAR loci shows well-positioned nucleosomes, whose arrangement is unaltered upon cohesin binding, implying that it is unlikely that cohesin induces large-scale changes in the chromatin structure as the physical model implies. Additionally, the TEM data suggest that multiple cohesin molecules interact at either CARC1 or CEN3 and that they may cooperate at multiple sites along the Smc1 and Smc3 coiled-coil arms to form a thick cohesin rod.

Introduction

Cohesin binding is essential for the proper segregation of chromosomes and thus for cell viability. Cohesin is comprised of two SMC members that alone form a V shaped molecule, plus Scc1 and Scc3. Imaging of in vitro assembled complexes isolated from bacteria and insect cells suggests that cohesin forms a singular ring. This structure has given rise to several models for cohesin binding, the most prominent of which is the embrace model. All are discussed in detail in Chapter 1. However, the exact mode of binding maintenance at CAR loci remains elusive. Understanding how cohesin interacts with DNA or chromatin can shed light not only on how sister chromatid cohesion is maintained, but also on how other members of the SMC family of proteins involved in chromosome condensation and DNA repair function. A method that images either in vivo cohesin binding or in vivo assembled cohesin-DNA/chromatin complexes has such a potential. One useful tool for isolating in vivo assembled chromatin-protein complexes is the minichromosome affinity purification technique, described in Chapter 2. In addition to identifying possible chromatin secondary structures such as nucleosome arrays that may be recognized by the cohesin complex, TEM analysis of the CARC1-containing minichromosome pCM26-1 was undertaken to provide data that could discern between the various published models of cohesin binding. Figure 3.1 shows a schematic of anticipated structures based on the following models. The ring/embrace model implies that sister chromatid cohesion is maintained solely by the cohesin ring surrounding two 10 nm fibers of chromosomes (Haering et al., 2002; Gruber et al., 2003). This suggests that TEM analysis of the replicated minichromosomes should find them randomly distributed around the circumference of the cohesin ring.
Similarly, in the extended ring model, a permutation of the embrace model, the minichromosomes again would be randomly distributed around a ring, whose circumference now is equal to that of two or more cohesin rings. The physical interaction model based on the condensin-chromosome relationship predicts that the minichromosomes would be localized to one end of the cohesin ring (Kimura and Hirano, 1997; Hirano and Hirano, 1998; Hirano, 1998; Huang et al., 2005). Finally, the snap model predicts one of two structures. In the first, the two replicated minichromosomes interact with one domain of each cohesin ring, separated by a distance of two cohesin lengths. In the second, if the cohesin molecules interact independently of the minichromosomes, then the two pCM26-1 minichromosomes would be randomly distributed along the cohesin snapped complex and would likely be indistinguishable from the expected results of the extended embrace model (Huang et al., 2005). While each of these models has in vitro support discussed in Chapter 1, direct observations of cohesin-chromatin in vivo interactions are needed to validate a correct model or to establish a new one. To date, the cohesin complex has been directly imaged only in the absence of DNA or chromatin.
Recombinantly expressed Smc1 and Smc3 have been rotary shadow imaged as V shaped dimers (Anderson et al., 2002) and the entire holocomplex has also been rotary shadow imaged as an open ring. Even though these images have provided a plethora of information regarding the structure of the cohesin complex, they have been unable to resolve the differences cited in the four models of cohesin binding. In this chapter I demonstrate the use of a minichromosome affinity purification system for studying in vivo assembled chromatin-cohesin interactions. Samples isolated from M-phase arrested Saccharomyces cerevisiae (when cohesin has been ChIP’ed to CARs) and observed under the TEM show that in these samples, cohesin always binds to replicated minichromosomes at one end of the protein complex. These data also show that instead of maintaining an open ring structure, cohesin forms a flexible rod-like structure with several kinks, indicative of a collapsed ring. In addition, I hypothesize that multiple cohesins interact with each other along their coiled-coil domains to maintain sister chromatid cohesion. Nucleosome mapping at CAR loci was performed to look for major changes in secondary chromatin structure upon cohesin binding. This low-resolution mapping identified the presence of well-positioned nucleosomes at these sites throughout the cell cycle. Additionally, the nucleosome mapping was unable to confirm either the presence of specific chromatin secondary structures used to flag cohesin binding to CARs or the binding of cohesin to chromatin via a protein footprint at these loci.

Figure 3-1: Anticipated structures of cohesin bound minichromosomes, based on predicted models. (A) In the ring model, we expect to see replicated minichromosomes in a random configuration along the cohesin holocomplex. (B) In the physical interaction model, we expect to see two replicated minichromosomes always interacting at one end of the cohesin complex. (C) Just like in the ring model, with the extended embrace model, we expect a random distribution of two minichromosomes separated by a distance of up to two cohesin lengths. (D) Two different structures are possible with the snap model. In the first, if the minichromosomes interact with the Smc heads, we anticipate a single structure in which the minichromosomes are within two cohesin lengths apart. In the second, we foresee an arbitrary distribution of replicated minichromosomes along a structure of up to two cohesin lengths, similar to the extended embrace model. Minichromosomes are depicted as black circles.

Results

TEM analysis from alpha-factor arrested cells shows a singular minichromosome

As shown in Chapter 2 by ChIP analysis and western blot analysis, pCM26-1 is associated with cohesin prior to mitosis, but not before DNA replication. This is in agreement with known information about cohesin loading and binding in vivo at CAR loci. An examination of the structure of pCM26-1 without cohesin is imperative for proper analysis for cohesin-chromatin association. Cohesin-free minichromosomes serve as a negative control for the structural analysis done on cohesin-bound minichromosomes. Measurements taken and morphology observed of these samples aid also in the identification of replicated minichromosomes from M-phase arrested cells.

As alpha-factor arrests budding yeast cells in late G1 prior to cohesin binding, minichromosomes isolated from alpha-factor arrested cells provide a structural pre-DNA replication snapshot of pCM26-1. Negatively stained samples were identified as circular structures exhibiting the characteristic beads-on-a-string morphology of chromatin. Fourteen minichromosomes in total were imaged from alpha-factor arrested cells. The minichromosomes could be subdivided into two subsets. In the first, the minichromosome takes a typical circular conformation expected of a singular minichromosome. In the second group, the minichromosomes all exhibit a small extension of variable length and width. Both categories of minichromosomes are shown in Figure 3.2. Two measurements were taken of each minichromosome imaged: the diameter of the minichromosome and the diameter of several nucleosomes on the minichromosome. The average from the fourteen samples of the minichromosome diameter was 60 ± 4.3nm, and measurements of the nucleosome diameters yielded an average of 10.6 ± 1.4nm. There were approximately 13 nucleosomes per minichromosome, although individual nucleosomes could not be resolved in every image. In addition to these two measurements taken of all minichromosomes, the small protrusion that was found in a subset of minichromosomes was also measured. Seven samples had such a protrusion (marked with an arrow in Figure 3.2). The protrusion length varied from approximately 10nm to 30nm; the width varied from approximately 10 to 14 nm.

Figure 3-2: Minichromosome pCM26-1 isolated from alpha-factor arrested cells. All minichromosome samples from alpha-factor arrested cells are continuous stretches of chromatin characterized by a closed circular shape with beads-on-a-string morphology. The diameter of these samples was measured to be 60nm. Diameter measurements of nucleosomes showed them to be 10.2nm. The schematic panel is a 1:1 diagram of the sample in panel C. The nucleosomes are shown in aqua circles. A subset of minichromosomes exhibited a small protrusion of varying length. This structure is shown in panels F-K and is indicated by the black arrows. Scale bar = 50nm.

TEM analysis shows a flexible rod protruding from replicated minichromosomes

In order to assess the structural interaction between replicated minichromosomes and cohesin, minichromosomes were isolated from M-phase cells arrested with nocodazole, a microtubule-destabilizing reagent. Samples were identified by TEM if they contained two minichromosomes in close proximity to each other or up to three cohesin lengths apart. The minichromosome morphology in these samples was matched to that from alpha-factor arrested samples, in that minichromosomes were identified as continuous stretches of circular chromatin exhibiting a beads-on-a-string morphology and having a diameter close to 60nm. All images taken from M-phase samples demonstrate a singular and consistent structure that I termed the “closed scissors conformation” (see Figure 3.3 for a schematic and Figures 3.3 and 3.4 for images). The two replicated minichromosomes are always situated next to each other, and in some instances, even exhibit partial overlap (see Figure 3.4 A, B, D, G, and H). In addition, each reveals a long flexible rod extending beyond the minichromosomes. While this rod appears fairly straight in most images, some show flexibility at either end. A twisting pattern along the length of this protrusion is discernible in a handful of images and a bifurcation of the rod at the end emanating from the minichromosomes can be seen in at least two images (see Figure 3.4I). Several measurements were taken of each of the twenty-nine samples imaged. In addition to those of the minichromosome mentioned above, the length and width of the rod were also measured and shown to be 70 ± 11nm and 14 ± 4nm respectively. Minichromosome diameter of 54 ± 2.6nm and nucleosome diameter of 10 ± 1.5nm were consistent with those taken of samples from alpha-factor arrested cells.

Figure 3-3: Negatively stained pCM26-1 isolated from nocodazole-arrested cells. Image on the left shows two replicated minichromosomes interacting at one end of a long flexible rod protruding from the minichromosomes. The panel on the right is a schematic of the image. Aqua circles represent nucleosomes of ~10nm in diameter on chromatin rings ~60nm in diameter. Scale bar = 100nm.

Figure 3-4: Minichromosome pCM26-1 isolated from nocodazole-arrested cells. All samples are negatively stained and were identified by the presence of two minichromosomes of the same morphology as those observed from alpha-factor arrested cells. Panels B, C, and K show twisting along the length of the protruding rod. Kinks in the rod can be seen in panels E and F. Panel I shows bifurcation at the end of the rod closest to the minichromosomes. Scale bar = 100nm.

Figure 3-5: Positively-stained pCM26-1 from nocodazole arrested cells. These images were taken from attempts at immunogold labeling of Scc1. Both images show the closed scissors conformation present in negatively-stained samples of two replicated minichromosomes with a long flexible protruding rod. The rod in the right panel shows a helical twisting along its length as seen in some of the negatively stained images above.

Figure 3-6: Measurements of the minichromosomes. (A) A typical image in which minichromosome diameter (red), nucleosome diameter (blue), rod length (purple), and rod width (yellow) are measured. Scale bar = 50nm. (B) Distribution of rod length measurements.

Positively stained immunogold samples show an identical closed scissors structure with similar measurements (see Figure 3.5). Unfortunately, the background of the immunogold particles was too high for an accurate orientation of the Smc dimer or a consistent number of particles to be counted for each image.

Nucleosome mapping at several CAR loci shows well-positioned nucleosomes and no protein footprint

One of the interpretations of the TEM images is that the interaction of the cohesin collapsed ring with the minichromosomes is most like what is expected for the physical interaction model. In the interaction between chromatin and condensin (upon which this model is based), the chromatin wraps around the heads of the SMC proteins. This is likely to induce a large change in the local chromatin structure, either by nucleosome sliding and/or footprinting by the SMC heads. If cohesin follows the same model, then one way to lend weight to this model would be to observe changes in the chromatin structure at CARs. One method for observing protein-chromatin interactions and changes in local chromatin structure is to map the position of nucleosomes across a locus of interest. It is likely that if cohesin binds to chromatin at CAR loci, it will induce a large-scale change in nucleosome positioning visible in MNase digestion and low-resolution mapping of CARs. If a dramatic change occurs, it can be monitored by comparing nucleosome positioning in alpha-factor arrested cells when cohesin is not bound to CARs versus nocodazole-arrested cells when cohesin is found at those loci. Additionally, this method can be used to examine whether or not a unique nucleosome arrangement exists at CAR loci, acting as a chromatin flag for cohesin binding. Several CARs were examined under these two conditions, including CARC1, the locus introduced in the minichromosome pCM26-1. At both G1 and M phase, well-positioned nucleosomes cover all four CARs examined (CARC1, CARC3, CARL2, and CARL3). The nucleosomes are evenly spaced and do not suggest a unique underlying secondary structure at these loci. Additionally, the positioning of these nucleosomes did not change from G1 to M phase and no protein footprint could be detected over the CAR loci. Figure 3.7 shows mapping at CARC1 and CARC3.

Figure 3-7: Nucleosome mapping of CARC1 and CARC3 in G1 and M phases. Nuclei were isolated from arrested cells and digested with increasing concentrations of MNase, amounts schematically shown by the triangles at the top of both panels. The samples were further digested with EcoRI and probed with locus-specific probes. The rectangular maps on the left side of both images indicate the genomic region covered by mapping. The ovals represent nucleosomes mapped to the region. The nucleosome positioning does not change in alpha-factor versus nocodazole-arrested cells.

Discussion

The work in this chapter presents the first direct imaging of CAR-containing replicated minichromosomes with cohesin. The minichromosome affinity purification technique allowed for the isolation of minichromosomes containing the CARC1 sequence from both alpha-factor arrested and nocodazole-arrested budding yeast cells, without and with cohesin bound respectively. This analysis reveals that replicated minichromosomes are always oriented at one end of the cohesin complex. This work also shows that cohesin forms a flexible rod suggestive of a collapsed ring and that the width of this rod implies that multiple cohesins are bound per minichromosome pair. Furthermore, nucleosome mapping at CAR loci suggests that cohesin binding does not induce large conformational changes of the chromatin.

Characterization of the pCM26-1 minichromosome

The CARC1 CEN3 containing minichromosome pCM26-1 was designed for this project and best characterized in samples isolated from alpha-factor arrested cells. Identified by their distinctive circular shape, each minichromosome exhibited the typical chromatin structure. The various elliptical shapes seen with the minichromosomes are more likely a consequence of the adhesion angle to the grid, given the flexibility of the minichromosome, than an inherent morphology of the minichromosome. Another positive identifier of minichromosomes was the diameter measurements. Other constructs that have been isolated and imaged by the same techniques have been approximately 6 kb in length, with a diameter of 90-120nm (Ducker and Simpson, 2000) (Wang, unpublished). By comparison, pCM26-1 is almost 3 kb in length upon the removal of the bacterial backbone. Its measured diameter of approximately 60 nm is in keeping with both its size and observed measurements of other minichromosomes. A third characteristic gleaned from pCM26-1 minichromosome imaging was a nucleosome number of, on average, thirteen. This closely reflects the five nucleosomes mapped at CARC1 (see Figure 3.7), the seven over the minichromosome backbone (Ducker, unpublished), and the one predicted for the CEN3 locus, plus an additional couple for remaining sequences. While thirteen is fewer than the predicted eighteen (construct size/165 bp per nucleosome), it is in accordance with other TEM images of minichromosomes that also image fewer than the predicted number of nucleosomes (Ducker, unpublished; Wang, unpublished).
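
The size arguments in the preceding paragraph can be checked with a short calculation. The Python sketch below reproduces the predicted nucleosome number (construct size divided by 165 bp per nucleosome) and estimates the diameter expected for pCM26-1 from the 6 kb constructs. The rounded construct size of 3000 bp, the function names, and the assumption that diameter scales linearly with DNA length at equal compaction are illustrative choices, not values or methods taken from this work.

```python
def predicted_nucleosomes(construct_bp, repeat_bp=165):
    """Predicted nucleosome count: construct length over nucleosome repeat length."""
    return construct_bp / repeat_bp


def scaled_diameter_nm(ref_diameter_nm, ref_bp, construct_bp):
    """Diameter expected if a circular minichromosome's circumference (and
    therefore its diameter) scales linearly with DNA length.

    Assumption: equal chromatin compaction in both constructs.
    """
    return ref_diameter_nm * construct_bp / ref_bp


PCM26_1_BP = 3000  # "almost 3 kb"; rounded for illustration

n_predicted = round(predicted_nucleosomes(PCM26_1_BP))
diameter_range = (
    scaled_diameter_nm(90, 6000, PCM26_1_BP),
    scaled_diameter_nm(120, 6000, PCM26_1_BP),
)

print(n_predicted)     # 18
print(diameter_range)  # (45.0, 60.0)
```

Under these assumptions a 3 kb construct would be expected to measure roughly 45-60nm across, which brackets the measured diameters reported in this chapter.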

Two different morphologies were observed in pCM26-1 isolated from G1 arrested cells. Half of the images were of the open minichromosome circle, with the other half showing small extensions also from the open minichromosome circle. These protrusions exhibit a beads-on-a-string morphology similar to the minichromosomes, and as such are believed to be chromatin extensions. Due to the variety of lengths measured of this protrusion, they are speculated to be parts of the ring that have folded in sample preparations and not indicative of a structure assembled in vivo.

Characterization of cohesin bound pCM26-1 and assessment of binding models

In addition to verifying the presence in my MAP sample of minichromosome and cohesin via Southern blot and western blot analysis (shown in Chapter 2), the TEM identification of cohesin bound minichromosomes was based on two criteria: the minichromosomes had to be similar in shape, size, and nucleosome number to those isolated from alpha-factor arrested cells and the samples had to contain two minichromosomes either together or in close proximity to one another of up to three cohesin lengths. Using these standards, a series of images was taken, revealing an identical structure, which I have called the “closed scissors conformation.” In this conformation, the two replicated minichromosomes are found close together (in some cases, with partial overlap), with a long rod structure emanating from the two rings. This long rod structure is pliant, being able to exhibit at least two bends along its length. Pairs of minichromosomes were always associated with this protrusion and always in the same orientation, that is, at one end of the protrusion. Measurements of the length of the rod show that it is 70 ± 11nm with a width of 14 ± 4nm. The length measurements are consistent with other measurements of the cohesin holocomplex (approximately 64 ± 6nm) and longer than published measurements of the Smc1/Smc3 arms alone (approximately 59 ± 4nm) (Anderson et al., 2002; Haering et al., 2002). The cohesin length distribution is shown in Figure 3.6B. The minor distribution of length measurements may be in part due to an inability to distinguish the boundary between the cohesin rod and the minichromosomes, thus leading to a handful of longer measurements. The width of the flexible rod, however, is significantly larger than the expected width of a singular coiled-coil domain.
The width of the cohesin rod could not be directly compared to recombinant cohesin holocomplex, as those published samples were rotary shadowed, making their measurable width a reflection of shadow length and not actual sample width. Comparison to several other negatively stained antiparallel coiled-coil domains, such as the stalk from dynein, suggests a width in the range of 2nm per antiparallel coiled-coil domain. This implies that the approximately 14nm width of the collapsed rod is indicative of 6-8 SMC arms, or 3-4 collapsed cohesin rings per pair of minichromosomes. This is suggestive of two cohesin complexes per cohesin binding locus: one for CARC1 and one for the CEN3 locus on the minichromosome backbone. These images are inconsistent with predicted observations of proposed models for cohesin binding (see Figure 3.1). If cohesin formed an open ring around CAR loci with no chromatin-cohesin interaction occurring during binding, as suggested by the embrace or extended-embrace models, we would anticipate seeing a random distribution of replicated minichromosomes along the circumference of an open ring or along the length of a collapsed ring. Instead, the positioning of replicated minichromosomes at one end of the cohesin rod may be indicative of a preferential interaction with one part of the cohesin complex. This favored arrangement, coupled with the rod length of the cohesin holocomplex, is also inconsistent with the snap model, which would dictate a separation between the minichromosomes of at least two cohesin lengths. Of the models currently published in the cohesin literature, only the physical interaction model, based on condensin-chromatin interactions, predicts positional dependence of sister chromatids with respect to the cohesin complex.
In addition to studies on the condensin complex, it has been reported that Smc1 and Smc3 C-terminal fragments are capable of in vitro binding of DNA (Akhmedov et al., 1998), though the in vivo counterpart of this experiment remains to be done. Despite the minichromosome-cohesin orientation, our data do not directly support the notion that chromatin interacts with the globular heads of the Smc1 and Smc3 proteins, as directionality of the cohesin complex cannot be assessed here. However, based on the DNA binding data for both condensin and the condensin-like bacterial MukB complexes, which demonstrate tight DNA-SMC head interactions, as well as imaging of the condensin complex bound to DNA, which reveals a head domain to DNA association, this remains a possibility (Case et al., 2004). Nucleosome mapping at CAR loci was undertaken under the assumption that if the chromatin wraps around the Smc heads, either a change in nucleosome positioning would occur upon binding or a protein footprint would be detected at these sites. My low-resolution mapping identified well-positioned nucleosomes over the span of several CARs. This positioning was not altered in M phase compared to its G1 phase mapping as was hypothesized based on the TEM data. These data do not exclude the possibility that cohesin interacts directly with chromatin: cohesin may in fact interact with chromatin at CAR loci in a manner that does not alter nucleosome binding, or it may interact loosely at these sites such that the interaction is not measurable via low-resolution mapping over a large population of samples. The data do show, however, that if binding occurs, it does not promote large-scale changes to the chromatin secondary structure.

Cohesin binding may involve conformational changes to the protein ring

One of the most interesting observations from this work is the appearance of cohesin as a flexible rod, instead of a ring. Unlike the condensin dimer, which in solution forms a rod-like shape, the cohesin dimer in solution forms a V shaped molecule. Likewise, the cohesin arms remain apart in images taken of the cohesin holocomplex. How then does the cohesin ring result in a pair of collapsed rings that forms a rod? Two explanations can be proposed. The first is that the isolation of these in vivo assembled cohesin-chromatin complexes by MAP and the TEM sample preparation affect the structure of these complexes in such a way as to promote cohesin ring collapse and the accumulation of minichromosomes at one end of the cohesin rod. This explanation implies that the TEM structures are the result of sample preparation and are not naturally occurring. This is a valid concern for samples isolated via any technique and is not unique to the MAP method. If this is the case, these images still give us important insights into in vivo assembled cohesin-chromatin complexes. Firstly, we learn that multiple cohesins interact per pair of replicated minichromosomes. Secondly, we discover that the cohesin molecules may have the ability to interact with each other, even if this interaction is influenced by sample preparation. Since the TEM data imply the presence of at least two cohesins per replicated CAR locus, the second explanation for the preferential association of chromatin with one end of the cohesin complex may lie in how multiple cohesins interact with each other at CAR loci. As has been previously suggested, cohesin molecules may form polymers by the bracelet mechanism, where the Smc heads of different heterodimers interact with one another, instead of with their intramolecular counterpart (Huang et al., 2005).
Alternatively, or perhaps in addition to a head domain intermolecular interaction, cohesin complexes may interact along the length of their coiled-coil arms. This type of tetrameric or higher-order interaction of coiled-coil domains is not novel, as it forms the basis of myosin filament assembly and is found in proteins that have tetrameric coiled-coil interactions, such as the influenza hemagglutinin HA2 (Whitson et al., 2005; Deng et al., 2006; Liu et al., 2006a; Liu et al., 2006b; Deng et al., 2007). Studies of the amino acid composition of the arm domains of Smc1 and Smc3 within vertebrates show low divergence, suggestive of coiled-coils that function as more than just spacers between two active domains (White and Erickson, 2006). A multimerization interaction along the coiled-coil arms is also supported by recent studies that have demonstrated the requirement of three loops within the coiled-coil domains for cohesin binding and for cell viability (Milutinovich et al., 2007). Figure 3.8 shows a schematic representation of these model interactions. The suggestion that in vivo assembled, chromatin-bound cohesin forms a multimeric rod structure is in keeping with numerous observations from in vitro studies. To begin with, cleavage at multiple points within the cohesin coiled-coil domains, generated by various insertions of Tobacco Etch Virus (TEV) protease sites along the Smc1 and Smc3 arms, leads to cohesin dissociation (Gruber et al., 2003). While these findings were previously used to support the notion that cohesin forms a ring whose cleavage leads to sister chromatid dissociation, it is equally likely that cleavage along the coiled-coil domains affects intermolecular interactions along the SMC arms necessary for cohesion.
The embrace model was further buoyed by data showing the abolishment of SCC upon the linearization of minichromosomes with cohesin attached to them (Ivanov and Nasmyth, 2005), suggesting that cohesin is able to fall off of short stretches of linearized chromatin. However, the decrease of the centromeric protein Cse4p in these samples upon linearization suggests that these data, obtained from only 10% of isolated minichromosomes, may not be characteristic of the majority of cohesin-chromatin interactions. Additionally, several studies have shown that sister chromatid cohesion can be abolished without dissociation of the cohesin complex in the absence or overexpression of other protein factors involved in the cohesion pathways (Lam et al., 2006; Chang, 2005; Losada, 2005). One possibility is that additional proteins may be involved in driving the interaction between coiled-coil domains and that it is this interaction of multiple cohesin complexes, not the mere binding of cohesin, that is in fact responsible for SCC. These additional protein interactions may be modified by changing conformations affecting the rigidity of the cohesin multi-complex, as demonstrated by several images showing flexibility within the rod. While new information about cohesin loading cannot be ascertained from this study, the images obtained using the MAP technique give a snapshot of the cohesin-chromatin interactions at centromeres and CAR loci, post-replication. Even though it can be determined that multiple cohesins bind at these loci in a position-dependent manner, how the rod structure is initially formed can only be speculated. I conjecture that cohesin rings are loaded individually as the embrace model suggests, and that their close localization to each other precipitates contact between their arm domains. Similarly feasible is that multiple complexes interact immediately upon binding (such as via their head domains as suggested by the snap model) or by interactions with other proteins and that this contact promotes arm interactions. Further experiments to test these hypotheses will shed light on the evolving understanding of sister chromatid cohesion. Images of minichromosomes containing other CAR sequences, or isolated from mutant backgrounds in which cohesin is loaded but binding is not maintained, may help tease apart these interactions and lead to a better understanding of how cohesin binding is established and maintained. Additionally, immunogold-labeling experiments can address the orientation of cohesin molecules with respect to replicated minichromosomes.

Figure 3-8: Models for cohesin binding at CAR loci. (A) Multiple cohesin rings interact with each other at a CAR locus. One of the most feasible modes of interaction would occur between the SMC heads of multiple cohesins, akin to the bracelet proposal. This would precipitate physical interactions among coiled-coil domains (as indicated by the red arrows) or simple stacking that would collapse the rings. (B) In the reverse orientation, SMC heads may directly interact with chromatin at replicated CAR loci, akin to condensin’s physical interaction model. This interface may propagate interaction of the coiled-coil domains to generate the flexible rod seen in the TEM images.

Materials and Methods

Strains and Plasmids

All cloning work and plasmid purifications were done in Escherichia coli DH5α. The lac-IZ fusion protein was expressed from the pTLIZ plasmid in Escherichia coli BL21-DE3 cells (Ducker and Simpson, 2000). All yeast strains were isogenic with strain W303 (MATa/MATa ADE2/ade2 CAN1/can1-100 CYH2/cyh2 his3-11,15/his3-11,15 LEU1/leu1-c LEU2/leu2-3,112 trp1-1:URA3:trp1-3' /trp1-1 ura3-1/ura3-1). Strain 8803 (6HA-Mcd1p) was generously provided by the Uhlmann laboratory (described in Lengronne et al., 2004). Plasmids, cell growth and synchronization, as well as the minichromosome affinity purification, were previously described in Chapter 2.

Transmission Electron Microscopy Sample Preparation and Analysis

Samples concentrated from the MAP sucrose gradients were dialyzed against HEN10 buffer (10mM NaCl, 10mM HEPES pH 7.5, 1mM EDTA) in a Slide-A-Lyzer MINI dialysis unit (Pierce) at 4°C overnight. Samples were fixed by dialysis against HEN buffer with 1% glutaraldehyde (Electron Microscopy Sciences) for 4-6 hours at 4°C. Excess glutaraldehyde was removed by dialysis against HEN buffer at 4°C. A 5µl sample drop was diluted 1:1 with HEN100 (100mM NaCl, 10mM HEPES pH 7.5, 1mM EDTA) buffer on parafilm. A carbon-coated 400 mesh copper grid (Spi Supplies, Inc), glow discharged for 2 minutes, was floated on the sample drop for 10 minutes at 22°C. Excess solution was removed by touching the grid to the edge of Whatman paper. The grid was washed three times with HEN50 (50mM NaCl, 10mM HEPES pH 7.5, 1mM EDTA) for 15 seconds each wash. For positive staining with aqueous uranyl acetate (UA), the grid was placed on three successive drops of 2% UA for 30 seconds on each drop. The grid was then washed with water for 30 seconds per drop. After the last drop, the grid was air-dried overnight. For negative staining with UA, the grid was washed as above after adhesion, and then stained on three successive drops of 2% UA for 30 seconds per drop. Grids were viewed with a JEOL 1200 Ex-II TEM and pictures were taken on a TIETZ camera.

Immunogold labeling

Immunogold labeling experiments were attempted on minichromosomes isolated from nocodazole arrested cells expressing 6-HA-Scc1. The protocol was modified from protocols published by Maria Schnos at the University of Wisconsin-Madison on her website (http://www.biochem.wisc.edu/faculty/inman/empics/gold.htm). Samples were adhered to carbon grids as described above. The grids were floated on a drop of 0.1% BSA in PBS (150 mM NaCl, 8.3 mM Na2HPO4, 1.85 mM NaH2PO4) containing 20µl of 1:5000 antibody against HA. The grids were agitated on the drop for 10 minutes at room temperature. The grids were washed with 0.1% BSA in PBS twice, for three minutes each. The grids were then incubated with agitation for 20 minutes at room temperature on a 20µl drop of HA-gold (5nm beads from Sigma Aldrich) diluted 1:20 in 0.1% BSA in PBS. The grids were washed twice on a 50µl drop of 0.1% BSA in PBS for 3 minutes, followed by two washes with water. The grids were stained on two successive drops of 2% UA for 30 seconds each, followed by six successive washes on water drops. The grids were dried overnight.

Analysis of TEM images

Images were imported into the drawing program GIMP. Pixel lengths of the minichromosome diameter, nucleosome diameters, cohesin rod length, and cohesin rod width were each measured three times, and the average of these measurements was calculated. These averages were converted to nm using the pixel length of the scale bar. Similar measurements were taken of published cohesin arms and holocomplexes.
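The conversion described above can be sketched in a few lines. This is only an illustration of the averaging and scaling steps: the function name, the pixel values, and the scale bar dimensions are hypothetical and are not measurements from this work.

```python
from statistics import mean


def pixels_to_nm(pixel_measurements, scale_bar_pixels, scale_bar_nm):
    """Average repeated pixel measurements and convert the mean to nanometers."""
    if scale_bar_pixels <= 0:
        raise ValueError("scale bar length in pixels must be positive")
    # nm per pixel is scale_bar_nm / scale_bar_pixels; apply it to the mean.
    return mean(pixel_measurements) * scale_bar_nm / scale_bar_pixels


# Hypothetical example: three repeated width measurements of one rod,
# on an image whose 50 nm scale bar spans 150 pixels.
rod_width_px = [41, 43, 42]
rod_width_nm = pixels_to_nm(rod_width_px, scale_bar_pixels=150, scale_bar_nm=50)
print(f"{rod_width_nm:.1f} nm")
```

Because each image carries its own scale bar, the same function can be applied to published images taken at a different magnification by passing that image's scale bar values.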

Nuclei Isolation and Nucleosome Mapping

The following protocol is for a 4L arrested culture or 2L asynchronous culture. Harvested cells were washed two times in 30ml SB (1.4M sorbitol, 40mM HEPES, 0.5mM MgCl2, pH 7.5) with 1mM phenylmethylsulfonyl fluoride (PMSF, Sigma-Aldrich) and 10mM β-mercaptoethanol (Fisher) and spun at 5,000 rpm in an SS34 or Sorvall G-20 for five minutes. Cells were resuspended in 30ml SB, 1mM PMSF to a total volume equivalent to 4X wet pellet weight. 10mg/ml freshly made zymolyase (Cape Cod) was added to a final concentration of 0.5mg/ml and the sample was incubated at 30°C for approximately 20 minutes or until spheroplasting was completed, as determined microscopically. Volume was brought up to 30ml with SB, 1mM PMSF and the sample was spun at 5,000 rpm for 5 minutes. All subsequent steps were performed on ice. Pellets were gently resuspended with plastic 25ml pipettes and washed twice in 30ml cold SB, 1mM PMSF. The pellet was resuspended in 20ml cold FB (18% Ficoll 400; 20mM PIPES, pH 6.5; 0.5mM MgCl2) and homogenized in a Thomas glass homogenizer with a Teflon motor-driven pestle with 8 strokes. The homogenate was layered carefully over 20ml cold GB (7% Ficoll 400; 20% glycerol; 20mM PIPES, pH 6.5; 0.5mM MgCl2) and centrifuged at 11,500 rpm for 30 minutes. The pellet was resuspended in 20ml cold FB and vortexed five times, each cycle consisting of one minute of vortexing followed by one minute on ice. The sample was centrifuged at 4,500 rpm for 15 minutes and the supernatant was pelleted in a clean tube at 11,500 rpm for 30 minutes. The pellet was resuspended in 10ml DB (10mM HEPES, pH 7.5; 0.5mM MgCl2; 0.05mM CaCl2) and centrifuged at 11,500 rpm for 15 minutes. The pellet was completely resuspended in 2.4ml DB and divided into six tubes, 400µl each. Samples were digested with 0, 2, 4, 8, and 16 units of MNase (Worthington Company) for nine minutes at 37°C. 8µl of EDTA was added to each tube to stop digestion. Samples were RNAse A treated for two hours at 37°C. 60µl of 22% Sarkosyl, 20µl of 5M NaClO4 and 2.5µl of 10mg/ml proteinase K were added to each sample, which was subsequently incubated at 37°C for two hours to overnight. Samples were phenol-chloroform extracted, ethanol precipitated and resuspended in 50µl 0.1X TE. Integrity of each sample was verified by running 2µl on a 1.4% agarose gel. All samples were digested with an individual restriction enzyme at 37°C overnight. Samples that were mapped at CARC1, CARC3, and CARL2 were digested with EcoRI, while the other samples were digested with HaeIII. The samples were then phenol-chloroform extracted, ethanol precipitated and resuspended in 20µl. Samples in their entirety were run on a 1.4% agarose gel at 80 volts for 12 hours and transferred overnight onto nitrocellulose paper. The blots were probed with CAR-specific probes that had been generated by PCR.

Chapter 4

Phylogenetic Analysis of the SMC family and Characterization of the SMC1 clade

This chapter contains data that will be submitted for publication by Alexandra Surcel (1) and Hong Ma (1,2).
(1) Integrative Biosciences Graduate Degree Program, Pennsylvania State University
(2) The Huck Institutes of the Life Sciences, Pennsylvania State University
(3) Biology Department, Pennsylvania State University


Abstract

Members of the Structural Maintenance of Chromosome (SMC) family of proteins form dimer pairs that are involved in a number of DNA metabolism pathways. Smc1/3 are involved in sister chromatid cohesion, Smc2/4 are involved in condensation, and Smc5/6 are involved in the DNA repair pathway. Eukaryotic species contain all six paralogs of SMC, while prokaryotes contain only one copy. TEM data shows that Smc1/3 may form a flexible rod, similar to images of the recombinant Smc2/4 dimer pair. The structure of these heterodimeric pairs is a source of debate whose answer may be further illuminated by a thorough phylogenetic analysis. Evolutionary trees on the SMC family of proteins have already suggested that Smc5 and Smc6 evolved from an early duplication event separate from the generation of the cohesin and condensin Smc partners, yet a different tree topology is generated solely from the secondary structural analysis of the coiled-coil domains of these proteins. Also, no study to date includes a comprehensive investigation of the emergence of the meiotic specific form of Smc1, known as Smc1β or Smc1L2. This chapter revisits the phylogeny of the SMC family and the secondary structural information of the coiled-coil arms of the Smc proteins. In addition, phylogenetic analysis indicates that Smc1β arose from a gene duplication event at the origin of animals, not at the later origin of vertebrates as previously speculated.


Introduction

Phylogenetic analysis is a useful tool for molecular and evolutionary biologists alike. In the broadest sense, phylogeny is the evolutionary history of a group of species. This organismal organization can be attained through the characterization of a single gene, a gene family, or a collection of gene families. Phylogenetic analyses are useful not only for identifying inter-species relationships but also for assigning putative functions to unknown proteins, for gaining insight into protein function via comparison to homologs, and for answering how genes are propagated and/or lost, when they arise and by what method they arise (for example, via a gene duplication event).

Tools for Phylogenetic Analyses

A number of tools have been developed for assessing phylogenetic relationships. In this section, I will describe several of the most commonly used programs, which are applied to studying SMC phylogeny in the rest of this chapter. All phylogenetic analyses have at a minimum the first two steps in common: data mining and sequence alignment. The most popular method for data mining is the Basic Local Alignment Search Tool (BLAST), a web-based software program run through the National Center for Biotechnology Information (NCBI). BLAST uses the sequence of a known gene or protein to identify other genes or proteins across species that share identity and similarity with the queried sequence. BLAST is often used to identify new members of a gene family or to cluster together genes with shared domains, where a domain is defined as a part of a protein that exhibits a characteristic folding pattern indicative of a specific function. The second, and perhaps more important, step is the correct alignment of the mined sequences. Alignment refers to the arrangement of sequences in multiple rows in such a way that identical or similar residues are placed in the same column, while non-similar residues are either placed opposite a gap in the other sequence(s) or, if placed in the same column, are scored as mismatches. Correct alignments are not trivial to accomplish, and as such, a wide variety of algorithms are employed that perform either pair-wise (between two sequences) or multiple sequence alignments (MSA). In MSAs, three or more sequences are aligned such that each column represents changes in evolution at that one residue, including substitutions, insertions and deletions. MSAs are deemed to be the starting point for phylogenetic analyses. One of the most popular alignment programs available is ClustalX.
Alignments in ClustalX are considered to be standard progressive alignments, which are performed in the following manner. First, pair-wise alignments are performed between all of the sequences, generating alignment scores or best fits for all pair-wise alignments. These scores are used to generate a phylogenetic tree. This tree provides a guide for the phylogenetic relationships between sequences, which are then sequentially aligned to one another. One of the advantages of ClustalX over other alignment programs is the variety of scoring matrices that are compatible with this program. A protein-scoring matrix is a table of scores used to determine the quality of an aligned region: it favors alignments of amino acid residues that are related and penalizes poorly matched residue alignments or gaps. Scoring matrices differ in the criteria that are used, but most look at the probability that a given stretch or pair of residues occurs in the alignment across all compared sequences, and at the distribution of particular amino acids and amino acid pairs in a protein. The most commonly used family of protein-scoring matrices is the BLOcks SUbstitution Matrix (BLOSUM) series, and the matrix used for the work in this chapter is BLOSUM30. BLOSUM matrices score residue positions and alignments based on the substitution rate found in large protein family alignments of mixed sequence similarity. These family alignments are derived from the BLOCKS database, a protein classification system that looks for and identifies ungapped, conserved regions in proteins. The BLOSUM system is weighted in the sense that over-represented sequences found in BLOCKS alignments are grouped together to reduce their contribution in generating the scoring matrix. Despite the use of well-documented, popular programs for generating what is essentially a best-fit protein alignment, alignments often need to be adjusted by hand.
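To illustrate how a scoring matrix is applied to an existing alignment, the following minimal sketch sums per-column scores for two aligned sequences. The scores in the table are hypothetical toy values chosen for illustration and are NOT taken from BLOSUM30; the flat gap penalty, the sequences, and the function names are likewise assumptions, and ClustalX itself uses a more elaborate gap model.

```python
# Hypothetical toy substitution scores (not BLOSUM30 values).
TOY_SCORES = {
    ("L", "L"): 4, ("I", "I"): 4, ("K", "K"): 5, ("E", "E"): 5,
    ("L", "I"): 2,  # related hydrophobic residues score above a mismatch
    ("K", "E"): 1,
}
MISMATCH_SCORE = -1  # assumed score for any pair not listed above
GAP_PENALTY = -4     # assumed flat penalty for a residue opposite a gap


def pair_score(a, b):
    """Score one alignment column containing residues (or gaps) a and b."""
    if a == "-" or b == "-":
        return GAP_PENALTY
    # The table is symmetric, so look the pair up in either order.
    return TOY_SCORES.get((a, b), TOY_SCORES.get((b, a), MISMATCH_SCORE))


def alignment_score(seq1, seq2):
    """Sum the column scores of two already-aligned, equal-length sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have equal length")
    return sum(pair_score(a, b) for a, b in zip(seq1, seq2))


# Columns: L/I (+2), K/K (+5), E/E (+5), -/K (-4), L/L (+4)
print(alignment_score("LKE-L", "IKEKL"))  # prints 12
```

Adjusting an alignment by hand, as described above, amounts to moving gaps so that more columns pair related residues, which raises a score of this kind.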
The program that I employed to view alignments and adjust individual residues was GeneDoc. GeneDoc is a program with a user-friendly interface that allows for the movement of individual residues or entire regions, as well as the introduction of individual gaps in a sequence and gaps across most sequences, referred to as gap columns. While ClustalX with the BLOSUM matrix was efficient in aligning the conserved domains of the SMC family of proteins (see below), it did not always generate the best alignments in the less conserved coiled-coil domains, which had to be aligned manually in GeneDoc. Once a sequence alignment is completed, a series of analyses can be performed on the alignment. These include, but are not limited to, the identification of conserved and variable residues and the construction of evolutionary trees. Phylogenetic trees are a visual representation of the predicted model for the evolutionary history of a set of sequences. Here, two highly similar sequences are represented by lines or branches that intersect at a common point or node. Sequences that are more distantly related are united at additional nodes by longer branches. Branch length is directly proportional to the number of residue changes between adjacent nodes on a tree. Phylogenetic trees are used to study how gene families evolve, as well as how function and structure convergently or divergently evolve over time. Several different methods are used to construct phylogenetic trees. The most popular are Neighbor-Joining (NJ), Maximum Likelihood (ML) and Maximum Parsimony (MP). NJ is a distance-based method that relies on the number of differences between pairs of sequences in a multiple sequence alignment. Its algorithm builds a tree by successively adding the next most-similar sequence as an additional branch to the existing tree, using the distance values generated between pairs of sequences.
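The distance values that a distance-based method starts from can be sketched as follows. This example computes only the pair-wise proportion of differing residues (the p-distance) from an alignment and reports the closest pair; it is not an implementation of the full NJ algorithm, which additionally corrects for unequal rates among branches. The sequence names, the aligned fragments, and the choice to skip gapped columns are hypothetical.

```python
from itertools import combinations


def p_distance(seq1, seq2):
    """Fraction of differing residues over columns where neither sequence has a gap."""
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have equal length")
    compared = [(a, b) for a, b in zip(seq1, seq2) if a != "-" and b != "-"]
    if not compared:
        raise ValueError("no ungapped columns to compare")
    return sum(a != b for a, b in compared) / len(compared)


def distance_matrix(alignment):
    """Return {(name1, name2): p-distance} for every pair of aligned sequences."""
    return {
        (n1, n2): p_distance(alignment[n1], alignment[n2])
        for n1, n2 in combinations(alignment, 2)
    }


# Hypothetical aligned fragments, for illustration only.
alignment = {
    "seqA": "MKLEELLK",
    "seqB": "MKLEDLLK",
    "seqC": "MRIE-LVR",
}
distances = distance_matrix(alignment)
closest_pair = min(distances, key=distances.get)
print(closest_pair, distances[closest_pair])  # ('seqA', 'seqB') 0.125
```

In a distance-based tree, seqA and seqB would be joined first here, with seqC attached at a deeper node.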
NJ trees are fast and easy to generate; however, because this method produces trees in a step-wise fashion, it does not always give the best or true tree topology as compared to other methods. NJ trees are usually complemented by additional analyses or trees generated by ML or MP. In Maximum Likelihood, a tree is generated using an algorithm that relies on the expected arrangement of residue changes coupled with the probability of the most likely branch arrangement that would generate the sequences used in the alignment. Because it relies on a probability function, ML tree topology can vary depending on the model to which the data are fit. Maximum Parsimony generates a tree without regard for pair-wise sequence comparisons. Its trees are based on which topology will yield the fewest evolutionary changes for the given data set, by creating the best fit for the sequence variation at each residue across an MSA. Although MP, like neighbor-joining, is easy to use, one of the biggest critiques of this method is that in some cases it increases the likelihood of long-branch attraction, a phenomenon that links two sequences together based not on their similarity to each other but on their dissimilarity to other sequences. Long-branch attraction is also problematic with neighbor-joining and maximum likelihood, but appears more pronounced with MP. Phylogenetic tools are ultimately only as good as the quality of the data that are used and the models that are employed. While there is no single alignment program or tree building program that is optimal for all phylogenetic studies, a combinatorial approach using these tools can lead to significant understanding, not just about evolution, but also about structure/function relationships of proteins and protein domains.
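The parsimony criterion described above, choosing the topology that requires the fewest changes, can be illustrated with Fitch's counting procedure for a fixed rooted binary tree. This is a minimal sketch and not the procedure of any particular phylogenetics package: the nested-tuple tree encoding, the four toy sequences, and the function names are hypothetical, and a gap character would simply be counted as another state.

```python
def fitch(tree, states):
    """Return (candidate ancestral states, minimum changes) for one alignment column.

    A tree is either a leaf name (str) or a pair (left_subtree, right_subtree).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_changes = fitch(tree[0], states)
    right_set, right_changes = fitch(tree[1], states)
    shared = left_set & right_set
    if shared:
        # The children agree on at least one state: no extra change needed.
        return shared, left_changes + right_changes
    # The children disagree: one substitution is required at this node.
    return left_set | right_set, left_changes + right_changes + 1


def parsimony_score(tree, alignment):
    """Total minimum number of changes over all columns of the alignment."""
    length = len(next(iter(alignment.values())))
    total = 0
    for col in range(length):
        column_states = {name: seq[col] for name, seq in alignment.items()}
        total += fitch(tree, column_states)[1]
    return total


# Hypothetical three-residue alignment of four sequences.
alignment = {"A": "KLE", "B": "KLD", "C": "RIE", "D": "RID"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))
print(parsimony_score(tree_1, alignment))  # prints 4
print(parsimony_score(tree_2, alignment))  # prints 5
```

Here the first topology requires fewer changes and would be preferred under parsimony, even though the third column on its own favors the second topology.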

Phylogeny of the SMC family of proteins

The Structural Maintenance of Chromosome (SMC) family is the subject of considerable interest among both molecular and evolutionary biologists. This family is comprised of one gene in prokaryotes whose protein forms a homodimer, and six paralogs (genes arising from gene duplication events) in eukaryotes that form three distinctive heterodimers. These proteins are all involved in DNA metabolism. In eukaryotes, the Smc1/3-containing complex is required for chromatid cohesion, the Smc2/4-containing complex is essential for chromosome condensation, and the Smc5/6 heterodimer is involved in DNA repair. As mentioned in Chapter 1, protein members of this family share a similar structure of two globular domains, each connected to a long coiled-coil domain that is separated by a flexible hinge region. The N- and C-termini join together upon folding at the hinge domain to form a functional ATPase of the ABC family of ATPases. Much of what we know about how SMC dimer pairs interact within their complexes comes from TEM analysis, summarized in Figure 4.1. As discussed in Chapter 1, rotary shadowed images of MukB homodimers demonstrated that they form a V shaped molecule (Melby et al., 1998). Additionally, this study showed that the SMC pairs took on a number of different conformations. TEM data on recombinantly expressed cohesin and condensin complexes have shown that in vitro, cohesin forms a ring, whereas condensin forms a rod-like structure. This structural information, coupled with data shown in Chapter 3 demonstrating that cohesin likely forms a flexible rod when complexed with CAR-containing minichromosomes, raises an interesting question: is there structural information within the coiled-coil domains of the SMCs that influences or dictates complex structure? Phylogenetic analysis may provide a key. To date, several phylogenetic analyses have been conducted on the SMC family, resulting in a few different interpretations.
Using maximum-likelihood analysis, Cobbe and Heck investigated the relationship between SMCs from both prokaryotes and eukaryotes and included the similar ABC ATPase Rad50 and MukB superfamilies as outgroups, or clades that have branched earlier than other monophyletic ones (Cobbe and Heck, 2004). Their phylogenetic tree shows that eukaryotic SMCs evolved from several ancient gene duplication events. They concluded that Smc5 and Smc6 evolved separately and likely with a higher rate of evolution, and that the larger cohesin and condensin subunits (Smc1 and Smc4) formed a separate clade (grouping with a common ancestor) from the smaller subunits (Smc3 and Smc2). Figure 4.2A shows the general topology of their tree. This work supported the proposition that the most conserved domains of SMCs are the N and C termini, plus the dimerization or hinge domain. Additionally, they examined correlated rates of evolution of different regions of the SMC proteins: regions that interact with one another should co-evolve and demonstrate similar rates of residue substitution. Their data show a strong correlation between the terminal domains of the same SMC, but low correlation between the N- and C-terminal domains of dimer pairs. This further lends support to the model that SMC dimers interact intramolecularly, instead of intermolecularly. Of great interest was the strong correlation, also found using correlated rates of evolution, along the coiled-coil domains, indicating an intramolecular interaction for condensin but not for the SMCs involved in the other complexes. Finally, their consensus tree showed that plant and animal SMC proteins group together with fungi as an outgroup, in contradiction to rRNA trees that show plants as the outgroup. The Cobbe and Heck study supported phylogenetic work done on prokaryotic SMCs (Melby et al., 1998; Soppa, 2001). These analyses identified the presence of a single copy of SMC in all prokaryotes. Additionally, using both NJ and MP to develop phylogenetic trees, they deduced that two horizontal gene transfer events occurred, one between archaea and cyanobacteria and one between archaea and the thermophilic eubacterium Aquifex aeolicus. Although sequence alignments coupled with phylogenetic trees contain useful information, alone they may not provide the most accurate evolutionary relationship. Studies on the secondary structure of the coiled-coil domains using the COILS program noted that each SMC arm contained one or two interruptions to the coiled-coil (Beasley et al., 2002). These interruptions occurred in opposite orientation among dimer partner pairs. One disruption occurred in the coiled-coil between the N-terminal domain and hinge (left arm) and two occurred in the coiled-coil between the hinge and the C-terminal domain (right arm) in Smc1, Smc4, and Smc5, while two occurred in the left arm and one in the right arm of Smc2, Smc3, and Smc6. The distribution of these disruptions is shown in Figure 4.3. This evaluation of the secondary structure of the coiled-coil arms suggests that, unlike in the tree generated by Cobbe and Heck, the SMCs could be categorized into two clades, implying that there was one main gene duplication event in the ancestral SMC that gave rise to two dimer partners that were further duplicated to meet various functions. This tree topology is shown in Figure 4.2B. All of these phylogenetic analyses fail to include the meiotic specific form of SMC1. First identified in mice, the meiotic specific form of Smc1, named Smc1L2 or Smc1β, has no homologs in budding yeast or C. elegans. A preliminary alignment and phylogenetic tree suggest that it originated early in vertebrate evolution (Revenkova et al., 2001).

Figure 4-1: Proposed structures of SMC containing complexes. Rotary shadowed imaging of condensin, the Smc2/Smc4 containing multi-protein complex, shows that it forms a rod structure with interaction or overlap between the SMC arms. In solution, recombinantly expressed cohesin complex shows that Smc1 and Smc3 form a ring structure. The DNA repair complex that includes Smc5 and Smc6 forms a V shaped molecule in vitro.

Figure 4-2: Tree topologies for the SMC family of proteins. (A) The predicted evolution of the SMC family of proteins based on sequence alignment and correlated evolutionary rates of SMC domains (Soppa, 2001; Cobbe and Heck, 2004). (B) The predicted evolution of the SMC family of proteins based on analysis of the secondary structure of their coiled-coil domains.

Figure 4-3: Conserved disruptions in the SMC coiled-coil domains. Smc1, Smc4, and Smc5 all have one disruption in their first coiled-coil domain and two in the coiled-coil domain flanked by the C terminus and the hinge region, with the second being larger than the other two interruptions. Smc3, Smc2, and Smc6 also share interruptions to their coiled-coil domains in an antiparallel orientation to their partners. These three proteins have one interruption in their C terminal flanked coiled-coil, with two interruptions in the region separated by the hinge and the N terminal domain. Figure adapted from (Melby et al., 1998).

The aforementioned studies have generated both some contention regarding the evolutionary history of SMCs and some interest in using the SMC phylogeny to address other structural problems. Some of the questions that remain to be answered are:
• Is the tree topology using all six SMC paralogs different from that published in the literature if additional species are taken into account? What if the coiled-coil interruptions are reexamined: would extensive sequence analysis be able to support the tree topology rendered by the COILS analysis? (see Figure 4.2)
• Can the kingdoms be resolved within the SMC family tree upon the addition of more species or upon a better MSA?
• Are there similarities between the coiled-coil domains of Smc1/3 and Smc2/4 that would suggest that the cohesin arms interact with each other in the same way that condensin’s arms interact to form a rod?
• When and how did the meiotic SMC1 arise?

In this chapter, I evaluate the current phylogeny of the SMC family taking into account the interrupted coiled-coil domains. I discover several instances of incorrect gene annotation and find that the addition of more species does not resolve the kingdom discrepancy observed in other analyses. I also identify additional interruptions to the coiled-coil domains that may affect partner interaction. Finally, I examine the emergence of the meiotic-specific SMC1 in vertebrates and conclude that the responsible gene-duplication event occurred early in the evolution of animals, despite the lack of SMC1L2 in non-vertebrate animals.

Results

Phylogenetic analysis of SMC family

Alignments of each SMC paralog showed expected regions of high conservation at the globular domains and at the hinge domain (see Figure 4.4 for a partial alignment across SMC paralogs). The coiled-coil or arm domains remained highly divergent, even among clades, though Smc1 showed more conservation than the other SMC proteins within the arm domains. The final neighbor-joining tree for the entire SMC family is presented in Figure 4.5. This tree contains sequences from over 250 species and includes members of all kingdoms. This tree is consistent with the published observations that within the eukaryotic clade, the larger subunits of cohesin and condensin, Smc1 and Smc4 respectively, branch together, as do their smaller subunits, Smc3 and Smc2, while Smc5 and Smc6, which are involved in DNA repair, branch separately into their own clade. Trees with identical topologies are obtained when using the whole protein sequence or just the three conserved domains – the N and C termini plus the hinge region. The topology of the tree suggests either that Smc5 and Smc6 originated from the first gene duplication event of the Smc ancestor at the origin of eukaryotes or that Smc5 and Smc6 group together due to long-branch attraction.

In addition, as predicted from other trees, this alignment and phylogenetic analysis shows a horizontal gene transfer event occurring from the archaeal SMC sequence, resulting in the SMC proteins of the cyanobacteria and the eubacterial species Aquifex aeolicus. While this gene transfer event is in disagreement with speciation trees derived from ribosomal sequences, it does support two previous phylogenies.

A measure of the validity of phylogenetic relationships shown in a given tree is the bootstrap value, depicted from 1 to 100 at each node. Bootstrap values are generated by randomly sampling, with replacement, the columns within an alignment and are then used to determine how well the data from the overall alignment support the tree. By comparison, the bootstrap

Figure 4-4: Partial view of the alignment of the SMC family of proteins across eukaryotic and prokaryotic kingdoms. This figure shows the highly conserved C-terminal domain. Residue columns highlighted in black show conservation among all sequences, while dark gray columns show 80% conservation and light gray show 60% conservation among sequences.


Figure 4-5: NJ tree for the SMC family of proteins. The smaller Smc components of the cohesin and condensin complexes – Smc2 (green) and Smc3 (red) – grouped together as did the larger components – Smc1 (light aqua) and Smc4 (blue). Smc5 (green) and Smc6 (purple) cluster together. The eubacteria Smc (yellow) and archaebacteria Smc (olive) are ancestral to the eukaryotic paralogs. The meiotic form of Smc1 – Smc1β – is shown in bright blue. Bootstrap supports for major clades are shown.

values for this tree are higher than the bootstrap values from published trees, suggesting that the MSA is more accurate. Of all the internal nodes, only six have bootstrap values less than 50 and of the rest, the majority are over 75.

When this tree was first generated, several incorrect gene annotations were discovered. The sequences for the putative Smc2 from the filamentous fungus Aspergillus oryzae, as well as that for the putative Smc1, were in fact Smc3 and Smc4 respectively. Likewise, the putative Smc1 sequence for the yeast Kluyveromyces lactis was in fact Smc4 and the putative Smc3 sequence for the filamentous fungus Aspergillus nidulans was its Smc2 paralog. A recent review of these BLAST searches reveals that the correct annotations for the K. lactis and the A. nidulans sequences have been assigned in the interim, while the A. oryzae genome has been removed from the NCBI database and will be restored upon completion of genome assembly.
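The column-resampling idea behind the bootstrap values described above can be illustrated with a minimal sketch. It is not the procedure implemented in MEGA: instead of rebuilding a full NJ tree for each pseudoreplicate, it asks only whether two chosen sequences remain each other's nearest neighbors, using a simple proportion-of-differences distance. The function names and the toy alignment are hypothetical and chosen for illustration.

```python
import random


def p_distance(a, b):
    """Fraction of differing residues, skipping columns where either sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 1.0
    return sum(x != y for x, y in pairs) / len(pairs)


def nearest_neighbor(alignment, name):
    """Name of the sequence closest to `name` (ties broken alphabetically)."""
    others = [n for n in alignment if n != name]
    return min(others, key=lambda n: (p_distance(alignment[name], alignment[n]), n))


def bootstrap_support(alignment, a, b, replicates=1000, seed=0):
    """Percentage of pseudoreplicates in which a and b are mutual nearest neighbors.

    Each pseudoreplicate is built by sampling alignment columns with replacement
    until it has as many columns as the original alignment.
    """
    rng = random.Random(seed)
    length = len(next(iter(alignment.values())))
    hits = 0
    for _ in range(replicates):
        cols = [rng.randrange(length) for _ in range(length)]
        resampled = {n: "".join(s[c] for c in cols) for n, s in alignment.items()}
        if nearest_neighbor(resampled, a) == b and nearest_neighbor(resampled, b) == a:
            hits += 1
    return 100.0 * hits / replicates


# Hypothetical toy alignment: two pairs of identical sequences.
toy = {"A": "MKTLLV", "B": "MKTLLV", "C": "GHWPQR", "D": "GHWPQR"}
support = bootstrap_support(toy, "A", "B", replicates=100)
```

A grouping that is supported by most columns of the alignment survives resampling and receives a value near 100, whereas a grouping that depends on a handful of columns is lost in many pseudoreplicates and receives a low value.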

COILS analysis of SMC proteins identifies additional interruptions to coiled-coil domains

Because the phylogenetic analysis of the SMC family did not generate a different tree than had been previously published, despite higher bootstrap values and the addition of more sequences, an examination of the secondary structure of these proteins was undertaken. Preliminary analysis reveals that in addition to the symmetrical distribution of interruptions to the coiled-coil domains, a pattern of other interruptions emerges (Figure 4.6). In the COILS outputs for SMCs from S. cerevisiae, one additional interruption was seen in the right arm (the domain stretching from the hinge to the C-terminus) in Smc1, Smc4, and Smc5. This interruption is more pronounced in the cohesin and DNA repair partners than in the condensin subunit Smc4. For Smc4, Smc1, and Smc5, an interruption was noticed in the left arm (the domain stretching from the N-terminus to the hinge region), though it was more pronounced in Smc1 and Smc4. Additionally, the left arm of Smc1 and Smc5 had one more interruption and the right arm of Smc5 also had an interruption to the coiled-coil domain. Overall, the interruptions in


Figure 4-6: COILS output for Smc1-6 from S. cerevisiae. On the left, Smc2, Smc3, and Smc6 show the characterized interruptions – 2 on the left arm and one on the right arm of the protein. On the right, Smc4, Smc1, and Smc5 show the characterized interruptions – 1 on the left arm and two on the right arm. The previously characterized interruptions are marked by *. New interruptions are identified in all proteins and shown by a black arrow.

the cohesin and DNA repair pathway Smcs of both partners are more pronounced than the corresponding interruptions found in the condensin partners.

Phylogenetic analysis of SMC1

In order to properly identify when the gene duplication event that gave rise to Smc1β occurred, an Smc1 alignment with all known Smc1β sequences was performed. In addition to the Smc1β sequences, the alignment also contained Smc1α sequences from vertebrates and Smc1 sequences from budding yeast, trypanosome, and other animals – sea urchin, sea squirt, sea anemone, and the flying insects Drosophila and mosquito. An NJ tree of the Smc1 alignment showed that the meiotic and mitotic Smc1 separated into two clades (see Figure 4.7). All non-vertebrate animal Smc1 sequences clustered with the vertebrate Smc1α clade. The budding yeast and trypanosome Smc1s formed the outgroup for the phylogenetic tree.

Because the bootstrap values were low at the nodes connecting the urochordate sea squirt, sea urchin, and sea anemone with the Smc1α clade, further analysis needed to be performed to establish whether these three sequences are indeed more closely related to Smc1α than to Smc1β or to the vertebrates as an individual clade. A residue-by-residue assessment of the alignment was performed. Figure 4.8 shows a partial alignment with conserved residues highlighted. Thirty-six amino acid residues were conserved among non-vertebrate animals and vertebrate Smc1α. Six conserved residues support the sea anemone node and ten support the insect node. The conserved residues span the entire length of the protein (see Figure 4.9). A total of 12 residues were conserved in all vertebrate sequences, but not in the other animal sequences or in the outgroups. Of these, seven occurred in the coiled-coil domains, one in the N terminus, one in the hinge, and three in the C-terminal domain. Despite the presence of these twelve residues, the high number of conserved residues at the nodes identified in the NJ tree further supports the tree topology, which suggests that SMC1β arose from a gene duplication event in early animals and that only the vertebrates retained the meiotic form of this cohesin protein.
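A residue-by-residue assessment of this kind can be sketched in a few lines. The function below is an illustration under stated assumptions rather than the procedure used in this chapter: it reports every alignment column in which all members of one group share a single residue that no member of a comparison group carries. The sequence names and the toy alignment are hypothetical.

```python
def diagnostic_columns(alignment, ingroup, outgroup):
    """Return (column_index, residue) pairs supporting the ingroup.

    A column qualifies when every ingroup sequence carries the same residue
    and no outgroup sequence carries that residue. Column indices are 0-based
    and a shared gap ("-") is never counted as a conserved residue.
    """
    length = len(next(iter(alignment.values())))
    hits = []
    for col in range(length):
        in_residues = {alignment[name][col] for name in ingroup}
        out_residues = {alignment[name][col] for name in outgroup}
        if len(in_residues) == 1:
            (residue,) = in_residues
            if residue != "-" and residue not in out_residues:
                hits.append((col, residue))
    return hits


# Hypothetical four-residue alignment of two mitotic and two meiotic sequences.
toy = {
    "alpha_1": "MKAL",
    "alpha_2": "MKAL",
    "beta_1": "MRAL",
    "beta_2": "MRSL",
}
supporting = diagnostic_columns(toy, ["alpha_1", "alpha_2"], ["beta_1", "beta_2"])
```

In the toy alignment only the second column separates the two groups, so it is the only column returned. Counting such columns for each candidate node gives tallies of the kind reported above.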

Figure 4-7: NJ tree for the SMC1 clade. This tree includes both meiotic- and mitotic-specific forms. With trypanosome and budding yeast as the two outgroups for this tree, the non-vertebrate animals – fruit fly, mosquito, sea urchin, and sea anemone – cluster with the urochordate sea squirt and are more similar to SMC1α (blue) than to SMC1β (red).


Figure 4-8: Alignment of Smc1α and Smc1β residues in the hinge domain. Residues highlighted in green support the node differentiating the two animal clades. Residues highlighted in purple support the node with the insect outgroup. C. familiaris – dog; B. taurus – cow; H. sapiens – human; R. norvegicus – rat; M. musculus – mouse; G. gallus – chicken; X. laevis – frog; T. rubripes – pufferfish; D. rerio – zebrafish; S. purpuratus – sea urchin; C. intestinalis – sea squirt; N. vectensis – sea anemone; D. melanogaster – fruit fly; A. aegypti – mosquito; M. mulatta – rhesus monkey; P. troglodytes – chimpanzee; T. cruzi – trypanosome (protist); S. cerevisiae – budding yeast.


Figure 4-9: A generalized phylogeny of the Smc1 family with mapped unique amino acid residue changes.

Discussion

Implications for structure based on the phylogeny of the SMC family of proteins

Previous studies on the phylogeny of the SMC family of proteins yielded two different tree topologies. Based solely on sequence alignments, the phylogenetic trees showed that a gene duplication event occurred either right before or soon after the emergence of eukaryotes (Cobbe and Heck, 2004). This gene duplication event gave rise to two Smcs – one that evolved into the DNA repair Smc5 and Smc6 dimer, and the other that gave rise to two Smcs each of slightly different length that further duplicated into the small and large dimer partners involved in cohesin and condensin. Closer examination of the tree generated from the Cobbe and Heck study suggested that the sequence alignment and phylogeny should be revisited. Firstly, as the authors stated, their overall family tree shows that plant and animal SMC proteins group together with fungi as an outgroup. This is in contradiction with numerous trees of other sequences, primarily rRNA trees that position plants as the outgroup. Secondly, it seems logical for evolution to favor a gene duplication event that would give rise to two dimer partners involved in a variety of DNA metabolism events rather than a gene duplication event that would lead in the eukaryotic ancestor to two homodimers – one involved in DNA repair and one involved in cohesion and condensation. Thirdly, while the bootstrap values of individual clades from published trees are 100 or near 100, the bootstrap values from the consensus tree are low. The second major analysis of the SMC family came from examining the secondary structure via the COILS program and demonstrated that based on the placement of conserved interruptions to the coiled-coil domains, the tree topology for the SMC family of proteins is more likely to involve a gene duplication event in the early ancestor that would give rise to an early heterodimer pair (Beasley et al., 2002). With these two phylogenies at odds with each other, I embarked on a reexamination of the SMC phylogeny. 
In addition to trying to resolve these two trees, I also took a closer look at the coiled-coil domains, especially those of the condensin and cohesin proteins, in search of candidate stretches of sequence along the coiled-coil domains that would provide support for intermolecular interactions along cohesin arms (see Chapter 3).

More than 200 sequences across all kingdoms were aligned to each other and were used to construct a phylogenetic tree. While ClustalX was exceptionally good at the alignment of the three conserved domains – the N-terminus, hinge domain, and C-terminus – the coiled-coil domains were exceptionally difficult to align. Manual alignments of the heptad repeats in the coiled-coil regions were attempted in GeneDoc, but did not yield completely satisfactory results. Nevertheless, neighbor-joining trees were generated from individual conserved domains, all the conserved domains together, and the entire protein sequence. All trees yielded the same topology, with Smc1 and Smc4 forming a clade, Smc2 and Smc3 forming another clade, and Smc5 and Smc6 branching off in a separate clade prior to the duplication events leading to the rise of the condensin and cohesin Smcs. Because the differences among the coiled-coil sequences were so great even among the different clades (with the exception of Smc1), it was not possible to align these regions from cohesin and condensin subunits in order to identify candidate regions that may play a role in how the dimers interact with each other in vivo.

Despite the fact that the tree topology groups Smc5 and Smc6 together in the same clade, it is quite possible that this clustering is due to long-branch attraction and not a true phylogenetic relationship. Smc5 and Smc6 have already been speculated to have a higher evolutionary rate than the other members of the Smc family (Cobbe and Heck, 2004).
This divergence is likely to influence the assignment of Smc5 and Smc6 to the same clade, an assumption that is further supported by the results from the COILS outputs discussed below.

All monophyletic clades showed good orthology of the individual SMC proteins. However, the inclusion of additional sequences over the ones used in previous studies did not resolve the kingdom differences seen in the consensus tree. The current understanding of speciation places fungi closer to animals than to plants, which become the outgroup to the other two. This relationship is sometimes masked, though, by the more rapid evolution of fungal species. Their greater divergence on occasion places them as the outgroup, with animals and plants more closely related. It is likely that this masking is what is happening here.

Several studies have identified regions of the coiled-coil domains that are important from both an evolutionary standpoint and a biological one. Use of the COILS program identified conserved interruptions in the coiled-coil domains in opposite orientation within SMC dimer pairs (Melby et al., 1998). Some of these disruptions were predicted to form functionally significant loops within the arms of the complexes containing these dimers. This supposition was substantiated by mutations generated in the Loop1 region of Smc1. In these mutant lines, Smc1 could load onto chromosomes, but could not accumulate at CARs and pericentric regions (Milutinovich et al., 2007).

A cursory examination of the COILS output of Smc1 from S. cerevisiae showed that there was at least one additional interruption to its coiled-coil domain. This prompted an inspection of all Smc sequence outputs from the COILS program. Maintaining the characterization that Smc2, Smc3, and Smc6 belong to the same clade, an additional disruption was found on the right arm of their coiled-coil domains.
Similarly, Smc4, Smc1, and Smc5 had one extra conserved interruption, with Smc1 and Smc5 showing one and two additional interruptions respectively. These supplementary findings to the original COILS assessments suggest that the coiled-coil domains are more discordant than initially believed and that these interruptions conserved across Smcs may play more of a role in the structure of the various Smc-containing complexes.

Of the pairs of dimers, Smc5 and Smc6 showed the largest number of interruptions, closely followed by Smc1 and Smc3 of the cohesin complex. This implies that interruptions in the arm regions of these two pairs may affect the function or structure of these complexes differently than the interruptions in the condensin pair. While no direct imaging of Smc5/6 has been performed, there have been several imaging studies done on cohesin, suggesting either a ring structure or a collapsed rod (Anderson et al., 2002; Gruber et al., 2003; Chapter 3). The presence of multiple interruptions to the coiled-coil regions gives the proteins the greater flexibility necessary to form a cohesin ring. Another interpretation of these breaks is that the cohesin complex may need to form several different conformations – such as a ring for loading and a rod for maintaining binding at CARs – and that these interruptions facilitate such flexibility.

In addition to confirming phylogenetic relationships based on sequence and on secondary structure, the analysis performed on the SMC family of proteins identified several genes with incorrect annotations. While these annotations have been resolved since the first alignments and trees were produced, their identification highlights the role that this type of phylogenetic analysis can play in accurately distinguishing paralogs from each other.

Possible origin of SMC1β and the evolution of meiotic cohesin function

Several meiotic-specific forms of the cohesin components Smc1, Scc1, and Scc3 have been identified in recent years (Eijpe et al., 2000; Revenkova et al., 2001; Hodges et al., 2005; Revenkova and Jessberger, 2005, 2006). The presence of these cohesin proteins involved specifically in meiosis supports the idea that the regulation of cohesin in meiosis may be different from cohesin regulation in mitosis. Unlike in mitosis, during meiosis two rounds of cell division occur, following one round of DNA replication. Cohesion is essential prior to the formation of cross-overs and is removed from the arms during anaphase I, allowing for the separation of homologs, but not sister chromatids. The cohesin that remains at the centromeres is important for appropriate bipolar attachment of sister chromatids in meiosis II. This preferential dissolution is in part accomplished by the presence of meiotic-specific isoforms of cohesin proteins. The evolutionary history of these meiotic-specific isoforms may shed more light on the differential regulation of meiosis across species.

Smc1β is the meiotic form of Smc1 found in mouse and human testes that coprecipitates and colocalizes with Smc3 (Revenkova et al., 2001, 2004; Hodges et al., 2005). Putative homologs for Smc1β have also been found in rats (Revenkova et al., 2004). Because no homolog for Smc1β was found in frogs and Smc1 from frogs formed a clade with Smc1α from other vertebrates, it was speculated that the gene duplication event that led to the meiotic-specific form occurred early in the divergence of vertebrates, a speculation that is refuted by the work presented in this chapter.

The Smc1 sequence mining showed that while there is no known meiotic homolog in frogs, there are meiotic homologs in other vertebrate animals, namely in monkey, chimpanzee, chicken, and fish genomes. There are no Smc1β homologs in the urochordate sea squirt, an early member of the chordate family to which all vertebrates belong.
Additionally, there are no Smc1β homologs in other animals – sea anemone, sea urchin, and insects – all lower on the evolutionary tree. Phylogenetic analysis grouped together all the animal mitotic Smc1s as a clade separate from the meiotic-specific Smc1. This suggests that the gene duplication event that gave rise to Smc1β occurred much earlier, at the diversification of animals. This conclusion is sustained by the finding of a large number of residues conserved among animal Smc1α, supporting the inner nodes. It is of interest to note that twelve residues are conserved among the vertebrate Smc1 of both forms, with a higher preponderance occurring at the C-terminal domain. This implies that this domain is disproportionately important for cohesin function as compared to the other domains. Perhaps it is involved in the binding of other conserved vertebrate-specific proteins and thus must maintain a high level of conservation within this sequence.

The results of the phylogenetic analysis of the Smc1 clade are at first glance surprising. The neighbor-joining tree suggests that non-vertebrate animals not only gained the meiotic form of Smc1, but also subsequently lost it. This loss had to occur at several points throughout evolution. It is known, however, that at least in Drosophila, meiosis differs significantly from that in vertebrates. While synaptonemal complexes form in females, their formation is not essential for recombination and in fact does not occur in males. This differential mode of recombination suggests that the role of cohesin in meiosis may also be altered among different animals. It is therefore not surprising that non-vertebrate animals have evolved an Smc1 that can be used in both meiosis and mitosis, whereas vertebrate meiosis dictates the maintenance of both a mitotic and a meiotic form of Smc1.

Materials and Methods

Sequences were mined using BLAST, with the Smc1-6 protein sequences from Saccharomyces cerevisiae as query sequences. For each subfamily, a BLAST score cutoff of 22% similarity was used. Whole protein sequences were aligned with ClustalX, using the following Pairwise Alignment parameters – 5 for gap opening, 0.5 for gap extension, and BLOSUM30 for the protein weight matrix. Alignments were inspected manually and adjusted using the GeneDoc program. Neighbor-joining trees were constructed using MEGA 3.0 (Kumar et al., 2004). The reliability of internal branches was calculated with 1000 bootstrap pseudoreplicates using the “pairwise deletion option” of amino acid sequences. Tree files were viewed by using MEGA. NJ trees are shown with bootstrap support.
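For readers unfamiliar with neighbor joining, the selection step at the heart of the method can be sketched as follows. This is a minimal illustration of the standard criterion, not the MEGA implementation: at each step the pair (i, j) minimising Q(i, j) = (n − 2)·d(i, j) − Σk d(i, k) − Σk d(j, k) is joined. The function name and the five-taxon distance table are hypothetical.

```python
def nj_pair_to_join(names, dist):
    """One neighbor-joining step: return the pair of taxa with the smallest Q value.

    `dist` is a symmetric dict of dicts, dist[i][j] being the distance between
    taxa i and j. Q(i, j) = (n - 2) * d(i, j) - sum_k d(i, k) - sum_k d(j, k).
    """
    n = len(names)
    totals = {i: sum(dist[i][k] for k in names if k != i) for i in names}
    best = None
    for idx, i in enumerate(names):
        for j in names[idx + 1:]:
            q = (n - 2) * dist[i][j] - totals[i] - totals[j]
            if best is None or q < best[0]:
                best = (q, i, j)
    return best[1], best[2]


# Hypothetical distances between five taxa, entered once per pair.
raw = {
    ("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
    ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
    ("c", "d"): 8, ("c", "e"): 7,
    ("d", "e"): 3,
}
names = ["a", "b", "c", "d", "e"]
dist = {i: {} for i in names}
for (i, j), value in raw.items():
    dist[i][j] = value
    dist[j][i] = value

first_pair = nj_pair_to_join(names, dist)
```

In this example the pair with the smallest raw distance (d and e) is not the pair selected, because the criterion also subtracts each taxon's total distance to all other taxa. This correction is what allows neighbor joining to tolerate unequal evolutionary rates among lineages.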

SMC sequences from S. cerevisiae were input into the COILS 2.1 program with sliding windows of 14 and 28, using the MTIDK matrix. The COILS program is available online at http://www.ch.embnet.org/software/COILS_form.html. Disruptions to the coiled-coil domains were marked by measurements over 0.7 for more than 10 consecutive amino acids.
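The marking criterion stated above can be expressed as a simple scan over a per-residue profile. The sketch below applies the wording literally (values over 0.7 for more than 10 consecutive residues); how the per-residue measurement is derived from the COILS output is not specified here, so the input list, the function name, and the example profile are assumptions made for illustration.

```python
def runs_over_threshold(scores, threshold=0.7, min_length=10):
    """Return (start, end) positions of qualifying runs in a per-residue profile.

    A run qualifies when the score stays over `threshold` for more than
    `min_length` consecutive residues. Positions are 1-based and inclusive.
    """
    runs = []
    start = None
    for pos, score in enumerate(scores, start=1):
        if score > threshold:
            if start is None:
                start = pos
        else:
            # The run that just ended covers residues start..pos-1.
            if start is not None and pos - start > min_length:
                runs.append((start, pos - 1))
            start = None
    # Handle a run that extends to the last residue.
    if start is not None and len(scores) - start + 1 > min_length:
        runs.append((start, len(scores)))
    return runs


# Hypothetical profile: a 12-residue run that qualifies and a 10-residue run that does not.
profile = [0.1] * 5 + [0.9] * 12 + [0.2] * 3 + [0.95] * 10 + [0.1] * 2
marked = runs_over_threshold(profile)
```

Both the threshold and the minimum run length are parameters, so the same scan can be repeated with other cutoffs to check how sensitive the marked regions are to the chosen values.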

Chapter 5

Conclusions and Future Direction for the Study of Cohesin-Chromatin Interactions


The work presented in this thesis addresses specifically how cohesin molecules interact with cohesin attachment regions (CARs) and with other cohesin molecules by employing both a structural and a phylogenetic approach. The minichromosome affinity purification technique was modified for CAR-containing constructs that were low-copy due to the addition of the CEN3 sequence. These modifications led to sufficient yield for TEM structural analysis of these in vivo assembled complexes. Results from TEM imaging demonstrated that several cohesins are found at CAR loci and that it is highly likely that these protein complexes interact with each other. These speculations lend themselves to the establishment of a new model for cohesin binding that is partially substantiated by phylogenetic analysis of the Structural Maintenance of Chromosomes (SMC) family. A brief overview of the thesis results is presented here, along with proposals for future work.

MAP method – a viable approach for isolating low-copy plasmids

The MAP technique was originally designed for use with minichromosomes whose backbone was derived from the TRP1/ARS1 locus, rendering them high-copy constructs that segregate randomly. For these constructs, MAP was used to isolate in vivo assembled complexes in sufficient quantities for TEM analysis and protein characterization. To ensure proper assessment of cohesin-chromatin interaction, the minichromosomes designed to study this interaction should have normal segregation. Therefore, a CEN3 sequence was inserted into the construct for proper segregation by the spindle at anaphase. This rendered pCM26-1 a low-copy number plasmid, and as such, the published protocol was inadequate for generating sufficient yield of this CARC1-containing minichromosome.

Results displayed in Chapter 2 show that the MAP protocol could be modified to make it a useful technique for studying in vivo assembled cohesin-chromatin complexes generated from CEN3-containing minichromosomes. Adjustments to the make-up of the MB buffer used in the passive diffusion step, coupled with changes to the sucrose gradient conditions, generated a 100-fold increase in sample yield over that obtained by using the published protocol. Work on identifying steps involved in sample loss also led to the discovery of changes in several steps that can be used regardless of minichromosome characteristics. These include the pretreatment of the filter, the handling of the column, and the wash conditions, coupled with attention to reagent deterioration over time (the homogenizer integrity and chitin beads’ affinity).

TEM analysis of cohesin-bound minichromosomes and mapping data pave the way for future interaction analyses

The TEM data shown in Chapter 3 of pCM26-1 isolated from alpha-factor arrested and nocodazole arrested cells demonstrate that structural information about cohesin-chromatin interactions can be attained from complexes isolated by the MAP method. Replicated minichromosomes, identified by their similarity to those isolated from alpha-factor arrested cells when cohesin is not bound to chromatin, were always associated with a long flexible rod. Measurements of this rod indicate a length similar to published measurements of the cohesin holocomplex and a width indicative of up to four cohesin complexes. The presence of these multiple cohesins as a singular rod with replicated minichromosomes at one end is suggestive of a new model for cohesin binding. At a minimum, the TEM data support the notion that multiple cohesins bind at CAR loci. Additionally, the data support the possibility that cohesin complexes are able to interact with each other along the lengths of their arms. If the flexible rod imaged is reflective of in vivo interactions, then cohesin binding at CAR loci may be maintained by a topological constraint imposed between the chromatin and collapsed rings by the interactions along the cohesin arms. The interactions along the cohesin arms do not preclude interaction of different cohesin complexes at their head domains as postulated by the bracelet model. While nucleosome mapping data demonstrate that large secondary chromatin structural changes occur, they do not eliminate the possibility that cohesin may still physically bind to chromatin at CAR loci. These data lay the foundation for a number of other studies that can be done to tease apart both the chromatin-cohesin interaction and the interactions between cohesin molecules.

To directly test the assertion that the flexible rod bound to these minichromosomes is indeed cohesin, MAP samples can be applied to columns containing active recombinantly expressed separase that will cleave the Scc1 protein and should release cohesin from the replicated minichromosomes. This should result in the imaging only of singular minichromosomes with a similar morphology to those isolated from alpha-factor arrested cells. Similar results may be attainable by mild protease treatment of the minichromosome-cohesin complex. However, singular minichromosomes may exhibit a slightly different morphology than those isolated from alpha-factor arrested samples, as the protease treatment may affect nucleosome structure as well.

Information about the interaction of cohesin with different CAR loci can be attained by analyzing additional constructs. Ones containing larger CAR sequences or just the CEN3 sequence alone may bind more or fewer cohesin molecules respectively. Constructs containing just Scc2/4 binding sites can be used to test the theory that cohesin loading occurs at the binding sites for this protein complex. Immunogold labeling experiments of isolated minichromosomes can be conducted to address the orientation of cohesin with respect to the replicated minichromosomes, as well as the number of cohesins bound to the minichromosome pair. However, this experiment is technically challenging and requires the additional labeling of the cohesin complex alone as a positive control.

The interactions between the cohesin molecules can be addressed by studying minichromosomes isolated from different yeast strains. Mutations that affect cohesin binding but not loading may show structures that are unable to interact with each other to form the long flexible rod seen in images shown in Chapter 3.
Mutations specifically of the loops present in the coiled-coil domains identified in published data and in Chapter 4 can address whether they are necessary for the interaction between cohesin arms and for the formation of the collapsed rings. Finally, mass spectrometry of isolated minichromosome samples can be carried out to identify novel candidate proteins involved in sister chromatid cohesion establishment or maintenance.

Phylogeny of the SMC family – an important tool for deciphering structure-function relationships for cohesin

The phylogenetic analyses, whose results are shown in Chapter 4, were originally undertaken in an attempt to find stretches of conserved sequences in the coiled-coil arms that may be important for promoting intermolecular cohesin interactions similar to the interactions observed in isolated Smc2/4 dimers. As an adjunct reason, a revisit of the SMC family phylogeny was carried out in an attempt to resolve different tree topologies. The composite phylogenetic tree of the SMC family was similar to those published previously from multiple sequence alignments despite the presence of additional sequences. Smc5 and Smc6 form a separate clade originating prior to the gene duplication events that gave rise to the small and large subunit specific clades for cohesin and condensin. Even though the bootstrap values for the composite tree were higher than those from published trees, which would suggest a better alignment, the kingdom relationships remained unchanged. The SMC family MSA and phylogeny did however reinforce the idea that phylogeny can be a useful tool for the correct annotation of newly described or sequenced genes. Despite the output from the family phylogenetic tree, there are several other analyses that can be done to strengthen the resulting topology or to support the topology generated from an examination of the secondary structure of these proteins. As with the analysis done on the Smc1 clade, a residue by residue breakdown of this family can be performed. If Smc5 and Smc6 do form an individual clade, then there should be more conserved residues between them than between Smc1/4/5 and Smc2/3/6. A preliminary analysis of the secondary structure through the COILS program of the Smc proteins from S.cerevisae show that there are additional conserved interruptions to the coiled-coil domains of all of the proteins. 
These interruptions segregate in the 111 same manner as those previously described; that is an additional interruption was identified in the right arm of Smc2, Smc3, and Smc6 and an interruption was identified in the left arm of Smc4, Smc5, and Smc6. These results suggest that a more comprehensive analysis of these interruptions is warranted. It is important to discern whether or not these interruptions maintain a conserved sequence both among the budding yeast protein sequences and those from other organisms. Likewise, it is of interest to determine whether or not these interruptions facilitate and/or are necessary for the intermolecular interactions observed from images of the condensin heterodimers and from images of the cohesin bound minichromosomes. Mutation studies of these loops and the generation of protein chimeras with swapped coiled-coil domains may yield more information regarding these interactions. The results from the COILS program can also be applied to the family phylogeny, which can be revisited with these conserved interruptions aligned. Additionally, it would be interesting to run several prokaryotic sequences through the COILS program. Since there is only one SMC member per prokaryotic sequence, an assessment of the interruptions present in these coiled-coil domains may shed light on both the importance of certain interruptions over others as well as the evolution of these breaks in the coiled- coil domains. The high conservation of the SMC family across kingdoms also makes it an attractive family for studying species evolution. While phylogeny of genes and gene families are useful for learning about the evolution of a subset of the genome, alignments between species of tandemly arranged families of proteins can aid in dissecting species evolution among highly similar phyla. 
Finally, the phylogenetic analysis of the Smc1 clade in Chapter 4 suggests that the duplication event responsible for the meiosis-specific form of Smc1 occurred right before or early in the divergence of animals. The gain of an additional copy of Smc1, coupled with its subsequent loss in non-vertebrate animals, suggests that meiosis in vertebrates may be regulated differently than in other animals. Additionally, the grouping of non-vertebrate animal Smc1 with the meiotic form of Smc1 in a single clade suggests that cohesion in both mitosis and meiosis may be regulated differently in these animals than in vertebrates. Based on this phylogenetic analysis, a closer examination of meiosis in urochordates, sea urchin, and sea anemone may reveal unexpected features of meiotic control.

Appendix A

PCR primers and additional constructs

This appendix contains two bodies of information. The first is a list of all primers used throughout this thesis – their name, their sequence, and their purpose. The second is a list of additional constructs not discussed in Chapter 2 that were made, including rationale and plasmid maps.

Table A-1: Primer Information

Primer name | Sequence | Purpose

STE2 locus primers:
HAC1, AvaII F | CCAAGAGACTTCATGGGAGCT | Mapping STE2 area, Ava II
HAC1, AvaII R | CCGCTATATCGTCGCAGAGT | Mapping STE2 area, Ava II
AGX1a, AvaII F | ACCTTAGTCATCATATCACTCTCA | Mapping STE2 area, Ava II
AGX1a, AvaII R | ATGAGCTTTCGTCTTCAACC | Mapping STE2 area, Ava II
AGX1b, AvaII F | ACCCAAGTATATCCGTATTGGAC | Mapping STE2 area, Ava II
AGX1b, AvaII R | GGAGCTGTCAAAGAATCTTGCA | Mapping STE2 area, Ava II
CAK1a, AvaII F | GTCCAGTACATACTTGCCATCG | Mapping STE2 area, Ava II
CAK1a, AvaII R | ATTGATGACGGCAGCGAC | Mapping STE2 area, Ava II
CAK1b, AvaII F | GTCCGCAACAATTGGGATACT | Mapping STE2 area, Ava II
CAK1b, AvaII R | GGGCAACAAATGTAAGCACATC | Mapping STE2 area, Ava II
GYP8, AvaII F | GACCAACGCCCTTGTCTTG | Mapping STE2 area, Ava II
GYP8, AvaII R | CAGCATGTGGAAAGCCAGGAT | Mapping STE2 area, Ava II
STE2/BST1 RNA probe F | AATACTAGGATAGGACCGTTTGC | STE2/BST1 RNA probe
STE2/BST1 RNA probe R | CATCACAATATACTAGCAGTGGC | STE2/BST1 RNA probe
STE2 CAR F | AAAGAGGGAGAAGTTGAACCC | Mapping STE2 area, HaeII
STE2 CAR R | TGCCTGGTTGCTATTTTTCG | Mapping STE2 area, HaeII
HAC1 probe F, HaeII | ACTGGACACCAGGGCCAGTT | Mapping STE2 area, HaeII
HAC1 probe R, HaeII | CGACTCTGGTACATTTTCCGTCT | Mapping STE2 area, HaeII
CAK1 probe F, HaeII | CGGGTCTTTCTATCTCGCTAT | Mapping STE2 area, HaeII
CAK1 probe R, HaeII | ATGCAACCCCTCTTCGAGAAT | Mapping STE2 area, HaeII
GYP8 probe F, HaeII | AACTGTTTGTATAGCACCATGGC | Mapping STE2 area, HaeII
GYP8 probe R, HaeII | GGGATTTCATGATGAACTCAC | Mapping STE2 area, HaeII
Bst1 F | GTTCAAGCTTTTTTCTTGCCAT | Cloning BST1 for pAS1
Bst1 R with BamHI site | CGGGATCCAGCAGCACAACTTTGTACCTCTC | Cloning BST1 for pAS1
Ste2 F with XmaI site | CCCGGGTCGTATTTTGTTAATTGGCTTGT | Cloning STE2 for pAS1
Ste2 R | TGCCTGAGAGTTCTAGATCATGG | Cloning STE2 for pAS1
Bst1 F for ChIP | AGGCTTACCATCATACAAAAATC | ChIP
Bst1 R for ChIP | TTGTTTCGAAAAATAGCAACC | ChIP
CAR btwn Bst1 and Ste2 F | GAACTACAACCCAGTAAAAAAAGA | ChIP
CAR btwn Bst1 and Ste2 R | GATCAAAATTTACGGCTTTGA | ChIP
Ste2 F1 for ChIP | CTTTTTCAAAGCCGTAAATTTTG | ChIP
Ste2 R1 for ChIP | ATGTTTGGTGTCAGATGTGGTG | ChIP
Ste2 F2 for ChIP | CCATAAACATGAACGTCACCTCT | ChIP
Ste2 R2 for ChIP | TCTCGTGCATTAAGACAGGCTA | ChIP
UAS Ste2 F for ChIP | ATGCACGAGACATTTACCCA | ChIP
Bkbn pDTL R1 for ChIP | TGCAAAGAAACCACTGTGTTT | ChIP
UAS Ste2, HincII - F | TTTGTCGTGGCTACTCTGATTAG | Mapping
UAS Ste2, HincII - R | GGAAGGGAAGATTATCTTGAATTG | Mapping
STE2/CAR F | TGAGTTGCAAGGTTTAGTTGACA | ChIP
STE2/CAR R | TTGAACTCGTAAAAGCAAAGGTG | ChIP
CAR/BST1 F | CAAGCTTTTTTCTTGCCATGATC | ChIP
CAR/BST1 R | TCGCCTTTTCTCTATTCTTGG | ChIP
STE2, HincII - F | GTTACTCAGGCCATTATGTTTGG | Mapping
STE2, HincII - R | AGTGCAGAATGCAAAATGATTAA | Mapping
STE2 3', HincII - F | ACTCCCGATACGGCAGCTGAT | Mapping
STE2 3', HincII - R | TGCATCTGATGAGCACCTGAAT | Mapping
BST1, HincII - F | ACCAATCTCTGATCCCCAAA | Mapping
BST1, HincII - R | ACGCTGTTCTTAGCTTCGCCT | Mapping
FET5 CAR probe - F | AGAGGTTTCTCGCGGCTAATTA | RNA
FET5 CAR probe - R | GCTTTCATTGCTGTCATTGTGC | RNA
MOB2 CAR probe - F | TTTTTTCTCGTGAGCCGTGA | RNA
MOB2 CAR probe - R | GCTGCTACCGTTAATTGAAAGC | RNA

Bkbn
Bkbn pDTL R1 for ChIP | TGCAAAGAAACCACTGTGTTT | ChIP
Bkbn Fragment 1 F | AAAAGAAAAGGAGAGGGCCA | ChIP
Bkbn Fragment 1R | ACCGGGTCAATTGTTCTCTT | ChIP
Bkbn Fragment 2F | TGCAAGGAAAATTTCAAGTCTTG | ChIP
Bkbn Fragment 2R | AATGAGGTTTCTGTGAAGCTGC | ChIP
Bkbn Fragment 3F | TTTGATTCAGAAGCAGGTGGG | ChIP
Bkbn Fragment 3R | GCATTTTTGACGAAATTTGC | ChIP
Bkbn Fragment 4F | TGTGAGCGGATAACAATTTAAGT | ChIP
Bkbn Fragment 4R | CCTGTACAATCAATCAAAAAGCC | ChIP
Bkbn Fragment 5F | ATCGCAGGGGGTTGACTTTTA | ChIP
Bkbn Fragment 5R | CTTTCATCAAATCGTGGTCG | ChIP
Bkbn Fragment 6F | CGACCACGATTTGATGAA | ChIP
Bkbn Fragment 6R | CCGAAAGTTAAAAAAGAAATAGT | ChIP
Bkbn Fragment 7F | AACTTTCGGAAATCAAATACAC | ChIP
Bkbn Fragment 7R | GTTTTTACAGCGAAAAGACG | ChIP
Bkbn Fragment 8F | TCTCAAATACACTTATTAACCGC | ChIP
Bkbn Fragment 8R | AATAGTGAAGGAGCATGTTCGG | ChIP

CARC1
CARC1 Fragment 1 F | AGTTCAAGAGTCTCCTAGTGGAC | Cloning
CARC1 Fragment 1 R | GTATTACGCTCTGGCGTCAAT | Cloning
CARC1 Fragment 2 F | GTTCCTTTGAAAAACAACGACAG | Cloning
CARC1 Fragment 2 R | CAATAAAATTTCAGTTTCTCTGG | Cloning
CARC1 Fragment 3 F | GAGTGTTTTATTGATTTTTTTTTT | Cloning
CARC1 Fragment 3 R | ATTATATATGGAAAGAAAGTTCGG | Cloning
CARC1 Fragment 4 F | AGTCTAGCCTTATCGAAATCTAC | Cloning
CARC1 Fragment 4 R | GCCCTTGTAATAGCCGACAAAT | Cloning
C1 replacement F, C1 replacement R, Bud3 5'end | TCCCCCCGGGAATTTGGTTGTGGATAACCCA | Cloning
Bud3 C1 deletion F | CGGGATCCATGACTCGGGTATTGAAAAAAGTG | Cloning
Bud3 C1 deletion R | TCCCCCCGGGCTGAGCATTTCCTAACACGGTT | Cloning
Spacer C1 deletion F | TCCCCCCGGGAGCTGTTAGTATTTGATACGGTTTGC | Cloning
Spacer C1 deletion R | CGGGATCCGAAGCGGGCGGGTTATTAAATAA | Cloning
pCM26-1 F | CCTTTTCTCGGTCTTGCAAA | Cloning
pCM26-1 R | AGTCGACCCGAGATCATATCA | Cloning

CARC3
CARC3 EcoRI F | GATGATTTCAATGGTAGAAGCAG | Mapping
CARC3 EcoRI R | CTCATTTCTTATGACATTTTCCT | Mapping
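Several primers in Table A-1 are described as carrying a restriction site. As a minimal sketch, the check below scans a few of the listed sequences for the standard BamHI (GGATCC) and XmaI (CCCGGG) recognition sequences. The function name `find_sites` and the choice of which primers to test are illustrative assumptions, not part of the original cloning workflow. Both sites are palindromic, so scanning one strand is sufficient here.

```python
# Standard recognition sequences for the two enzymes named in Table A-1.
SITES = {"BamHI": "GGATCC", "XmaI": "CCCGGG"}

# Primer sequences copied from Table A-1 (internal line-wrap spaces removed).
PRIMERS = {
    "Bst1 R with BamHI site": "CGGGATCCAGCAGCACAACTTTGTACCTCTC",
    "Ste2 F with XmaI site": "CCCGGGTCGTATTTTGTTAATTGGCTTGT",
    "Bud3 C1 deletion F": "CGGGATCCATGACTCGGGTATTGAAAAAAGTG",
    "Bud3 C1 deletion R": "TCCCCCCGGGCTGAGCATTTCCTAACACGGTT",
}


def find_sites(seq):
    """Return {enzyme: [0-based start positions]} for every site found in seq."""
    hits = {}
    for enzyme, site in SITES.items():
        positions = [
            i for i in range(len(seq) - len(site) + 1)
            if seq[i:i + len(site)] == site
        ]
        if positions:
            hits[enzyme] = positions
    return hits


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, find_sites(seq))
```

A primer whose name mentions a site but whose sequence returns no hit would be worth rechecking against the original table before ordering or reuse.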


Construct 1: pCM27-1

This construct was cloned by the Koshland lab, which inserted CARL2 into the pDTL backbone. CARL2 is a 1388 bp CAR located on the right arm of chromosome XII. As with pCM26-1, cohesin could not be detected on it by ChIP unless CEN3 was also inserted into the backbone. CARL2 was selected because it is a longer CAR than CARC1 (829 bp), and part of my original comprehensive proposal involved looking at different CARs isolated by MAP to see whether they maintain the same structure under the TEM and whether the same number of cohesin molecules interact at each CAR locus.

Figure A-1: pCM27-1


Construct 2: pCM26-1 backbone

This construct was designed as a control for pCM26-1 as it contains the pCM26-1 sequence without CARC1. It was generated by an XhoI digest and religation. This plasmid was discontinued under the assumption that due to the presence of the CEN3 sequence, cohesin binding to this minichromosome would be indistinguishable from cohesin binding to the CARC1-containing minichromosome.

Figure A-2: pCM26-2

Construct 3: pCM26-1 with AluI mutations

To ensure that the minichromosome containing CAR can be used as an accurate representation of the genomic CAR, four mutations that generate AluI restriction enzyme recognition sites in CARC1 were designed, along with PCR primers amplifying those regions. Following an AluI digest, the genomic CARC1 could therefore be distinguished from the minichromosome CARC1 by PCR. In the wild-type CARC1, the PCR primers would amplify four regions of 182, 197, 217, and 237 bp. In the minichromosome, these four regions would be cut into fragments of 80 + 102, 72 + 125, 97 + 120, and 63 + 174 bp, respectively. These AluI mutations would have allowed a side-by-side ChIP analysis (as well as primer extension experiments) of both genomic and minichromosome CARC1 within the same experiment. The mutations would also have helped elucidate the minimal DNA recognition domain necessary for cohesin binding, under the assumption that cohesin interacts with specific regions within a CAR. This aspect of MAP was shelved following the published finding that cleavage of a minichromosome leads to the removal of cohesin from that minichromosome.

Figure A-3: pCM26-1 with AluI mutations
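The fragment sizes quoted for Construct 3 can be checked with simple arithmetic: each pair of AluI fragments should add up to the corresponding uncut amplicon. The sketch below does only that bookkeeping, using the sizes given in the text. The function names are illustrative, and the pairing of each fragment pair with each amplicon follows the order in which the text lists them, which is an assumption.

```python
# Amplicon sizes (bp) expected from wild-type genomic CARC1, as listed in the text.
WILD_TYPE_AMPLICONS = [182, 197, 217, 237]

# Fragment pairs (bp) expected once AluI cuts the engineered site in each region
# of the minichromosome copy, as listed in the text.
ALUI_FRAGMENTS = [(80, 102), (72, 125), (97, 120), (63, 174)]


def check_fragments(amplicons, fragments):
    """Confirm that each fragment pair adds up to its uncut amplicon."""
    if len(amplicons) != len(fragments):
        raise ValueError("one fragment pair is needed per amplicon")
    for full, (left, right) in zip(amplicons, fragments):
        if left + right != full:
            raise ValueError(f"{left} + {right} != {full}")
    return True


def expected_sizes(amplicons, fragments):
    """List, per region, the sizes expected for each template."""
    return [
        {"genomic": [full], "minichromosome": sorted(pair)}
        for full, pair in zip(amplicons, fragments)
    ]


if __name__ == "__main__":
    check_fragments(WILD_TYPE_AMPLICONS, ALUI_FRAGMENTS)
    for region, sizes in enumerate(
        expected_sizes(WILD_TYPE_AMPLICONS, ALUI_FRAGMENTS), start=1
    ):
        print(region, sizes)
```

All four pairs sum to their amplicon sizes, so the listed values are internally consistent.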

Appendix B

Additional experiments—Detection of noncoding RNAs across CAR loci

Non-coding RNAs detected at multiple CAR loci

Research in S. pombe implies an overlap between the cohesin complex and the RNA interference (RNAi) machinery. These data suggest that RNAi plays a regulatory role in the establishment of heterochromatin by transcriptional silencing at both the centromeres and the mating type locus, two loci where cohesin has been implicated in the maintenance of heterochromatin. Deletion of RNAi machinery components leads not only to an accumulation of small dsRNAs with homology to centromeric repeats, but also to a loss of histone H3 lysine-9 methylation and reduced centromeric function (Volpe et al., 2003; Hall et al., 2003). Other data from the Grewal lab identified a functional link between RNAi and cohesin, also in S. pombe (Hall et al., 2003). Fluorescence in situ hybridization (FISH) analysis of centromeres and arm sequences in RNAi mutants revealed chromosomal segregation defects. In my comprehensive proposal, I speculated that because no RNAi components have been identified in S. cerevisiae, researchers have not considered a link between cohesin binding and RNA in budding yeast. However, a role for noncoding RNAs in the evolutionarily conserved process of sister chromatid cohesion remained possible. The published research raises the possibility that cohesin involvement in silencing and RNAi (along with other functions) can be attributed to a single molecular mechanism, or that the cohesin complex functions as a unit in multiple cellular pathways.

To further investigate the overlap between small RNAs and cohesin, northern blots were performed to look for noncoding RNAs at several CAR loci: CARC1, CARL2, CARC3, the MOB2 CAR, and the FET2 CAR. Of these loci, only CARC3 did not generate an RNA detectable by northern blot (see Figure B-1A). Total RNA was isolated from cells arrested at G1, S, and M phase by alpha-factor, hydroxyurea, and nocodazole, respectively, and hybridized with probes specific to each CAR locus. For those loci that overlapped with a gene, the probe was designed to cover both the CAR locus and the upstream gene to ensure that the RNA identified was CAR specific and not attributable to transcriptional read-through of the upstream gene. Noncoding RNAs from the other four CARs were detectable throughout the cell cycle, with a peak during S phase (see Figure B-1B). The presence of these noncoding RNAs during M phase presents a conundrum for the model in which cohesin is positioned at CAR sites by being pushed by the transcriptional machinery of convergently transcribed genes (Lengronne et al., 2004), and suggests that this model may not accurately reflect in vivo conditions. In addition to these data, low levels of transcription have been observed at other cohesin attachment regions (Gerton, personal communication).

Figure B-1: Noncoding RNAs are transcribed at CAR loci. (A) Northern blot of total RNA with probes specific to CARC1, CARL2, the MOB2 CAR, and the FET2 CAR shows transcripts at G1, S, and M phase. (B) Transcript levels are slightly elevated in RNA isolated from S phase arrested cells. The graph shows the fold difference of each transcript in S phase and M phase relative to G1 amounts.
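The fold difference plotted in panel B is a ratio of each phase's transcript signal to the G1 signal for the same probe. A minimal sketch of that normalization follows. The band intensities are hypothetical placeholders rather than measurements from this work, and the function name `fold_over_g1` is an assumption introduced for illustration.

```python
def fold_over_g1(intensities):
    """Return each non-G1 phase's signal divided by the G1 signal.

    `intensities` maps a cell cycle phase ("G1", "S", "M") to a
    background-corrected band intensity for one probe.
    """
    g1 = intensities["G1"]
    if g1 <= 0:
        raise ValueError("G1 intensity must be positive to normalize against it")
    return {
        phase: value / g1
        for phase, value in intensities.items()
        if phase != "G1"
    }


if __name__ == "__main__":
    # Hypothetical intensities (arbitrary units), for illustration only.
    example = {"G1": 100.0, "S": 150.0, "M": 120.0}
    print(fold_over_g1(example))
```

Normalizing within each probe keeps probes with different hybridization efficiencies comparable, since only the ratio to G1 is reported.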

Appendix C

Phylogenetic analysis—complete sequence alignments and neighbor-joining trees


Alignment of the SMC1 clade

Residue columns highlighted in black show conservation among all sequences, while dark gray columns show 80% conservation and light gray show 60% conservation among sequences. C.familiaris – dog; B.taurus – cow; H.sapiens – human; R.norvegicus – rat; M.musculus – mouse; G.gallus – chicken; X.laevis – frog; T.rubripes – pufferfish; D.rerio – zebrafish; S.purpuratus – sea urchin; C.intestinalis – sea squirt; N.vectensis – sea anemone; D.melanogaster – fruitfly; A.aegypti – mosquito; M.mulatta – rhesus monkey; P.troglodytes – chimpanzee; T.cruzi – trypanosome (protist); S.cerevisiae – budding yeast.
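The shading scheme in this legend amounts to a per-column threshold rule. The sketch below classifies alignment columns as black (100%), dark gray (at least 80%), or light gray (at least 60%). It assumes that conservation means the fraction of sequences sharing the single most common residue and that gaps never count as conserved; the program actually used for shading may instead group similar residues. The toy sequences are hypothetical and are not taken from the Smc1 alignment.

```python
from collections import Counter


def column_conservation(column):
    """Fraction of sequences sharing the most common residue in one column.

    Gaps ('-') are excluded from the count but kept in the denominator.
    """
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0
    top_count = Counter(residues).most_common(1)[0][1]
    return top_count / len(column)


def shade(fraction):
    """Map a conservation fraction to the shading named in the legend."""
    if fraction >= 1.0:
        return "black"
    if fraction >= 0.8:
        return "dark gray"
    if fraction >= 0.6:
        return "light gray"
    return "none"


def shade_alignment(seqs):
    """Return one shading label per column of an aligned set of sequences."""
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return [shade(column_conservation(col)) for col in zip(*seqs)]


if __name__ == "__main__":
    # Hypothetical five-sequence alignment, for illustration only.
    toy = ["MKLAE", "MKLAD", "MKIAE", "MRIGE", "MK-SE"]
    print(shade_alignment(toy))
```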


Neighbor joining tree for the entire SMC family of proteins

The smaller Smc components of the cohesin and condensin complexes – Smc2 (green) and Smc3 (red) – grouped together, as did the larger components – Smc1 (light aqua) and Smc4 (blue). Smc5 (green) and Smc6 (purple) cluster together. The eubacterial Smc (yellow) and archaebacterial Smc (olive) are ancestral to the eukaryotic paralogs. The meiotic form of Smc1, Smc1β, is shown in bright blue. A breakdown of each clade with visible bootstrap values is shown on the following pages.


BIBLIOGRAPHY

Akhmedov, A.T., Frei, C., Tsai-Pflugfelder, M., Kemper, B., Gasser, S.M., and Jessberger, R. (1998). Structural maintenance of chromosomes protein C- terminal domains bind preferentially to DNA with secondary structure. The Journal of biological chemistry 273, 24088-24094. Anderson, D.E., Losada, A., Erickson, H.P., and Hirano, T. (2002). Condensin and cohesin display different arm conformations with characteristic hinge angles. The Journal of cell biology 156, 419-424. Angus-Hill, M.L., Schlichter, A., Roberts, D., Erdjument-Bromage, H., Tempst, P., and Cairns, B.R. (2001). A Rsc3/Rsc30 zinc cluster dimer reveals novel roles for the chromatin remodeler RSC in gene expression and cell cycle control. Molecular cell 7, 741-751. Bazett-Jones, D.P., Kimura, K., and Hirano, T. (2002). Efficient supercoiling of DNA by a single condensin complex as revealed by electron spectroscopic imaging. Molecular cell 9, 1183-1190. Beasley, M., Xu, H., Warren, W., and McKay, M. (2002). Conserved disruptions in the predicted coiled-coil domains of eukaryotic SMC complexes: implications for structure and function. Genome research 12, 1201-1209. Birkenbihl, R.P., and Subramani, S. (1992). Cloning and characterization of rad21 an essential gene of Schizosaccharomyces pombe involved in DNA double-strand- break repair. Nucleic acids research 20, 6605-6611. Blat, Y., and Kleckner, N. (1999). Cohesins bind to preferential sites along yeast chromosome III, with differential regulation along arms versus the centric region. Cell 98, 249-259. Cairns, B.R., Schlichter, A., Erdjument-Bromage, H., Tempst, P., Kornberg, R.D., and Winston, F. (1999). Two functionally distinct forms of the RSC nucleosome- remodeling complex, containing essential AT hook, BAH, and bromodomains. Molecular cell 4, 715-723. Case, R.B., Chang, Y.P., Smith, S.B., Gore, J., Cozzarelli, N.R., and Bustamante, C. (2004). The bacterial condensin MukBEF compacts DNA into a repetitive, stable structure. 
Science (New York, N.Y 305, 222-227. Castro, A., Bernis, C., Vigneron, S., Labbe, J.C., and Lorca, T. (2005). The anaphase- promoting complex: a key factor in the regulation of cell cycle. Oncogene 24, 314-325. Chang, C.R., Wu, C.S., Hom, Y., and Gartenberg, M.R. (2005). Targeting of cohesin by transcriptionally silent chromatin. Genes & development 19, 3031-3042. Chant, J., Mischke, M., Mitchell, E., Herskowitz, I., and Pringle, J.R. (1995). Role of Bud3p in producing the axial budding pattern of yeast. The Journal of cell biology 129, 767-778. Chelysheva, L., Diallo, S., Vezon, D., Gendrot, G., Vrielynck, N., Belcram, K., Rocques, N., Marquez-Lema, A., Bhatt, A.M., Horlow, C., Mercier, R., Mezard, C., and Grelon, M. (2005). AtREC8 and AtSCC3 are essential to the 146 monopolar orientation of the kinetochores during meiosis. Journal of cell science 118, 4621-4632. Chen, J.M., Rawlings, N.D., Stevens, R.A., and Barrett, A.J. (1998). Identification of the active site of legumain links it to caspases, clostripain and gingipains in a new clan of cysteine endopeptidases. FEBS letters 441, 361-365. Chiu, A., Revenkova, E., and Jessberger, R. (2004). DNA interaction and dimerization of eukaryotic SMC hinge domains. The Journal of biological chemistry 279, 26233-26242. Ciosk, R., Zachariae, W., Michaelis, C., Shevchenko, A., Mann, M., and Nasmyth, K. (1998). An ESP1/PDS1 complex regulates loss of sister chromatid cohesion at the metaphase to anaphase transition in yeast. Cell 93, 1067-1076. Cobbe, N., and Heck, M.M. (2004). The evolution of SMC proteins: phylogenetic analysis and structural implications. Molecular biology and evolution 21, 332- 347. Cohen-Fix, O., Peters, J.M., Kirschner, M.W., and Koshland, D. (1996). Anaphase initiation in Saccharomyces cerevisiae is controlled by the APC-dependent degradation of the anaphase inhibitor Pds1p. Genes & development 10, 3081- 3093. Covitz, K.M., Panagiotidis, C.H., Hor, L.I., Reyes, M., Treptow, N.A., and Shuman, H.A. 
(1994). Mutations that alter the transmembrane signalling pathway in an ATP binding cassette (ABC) transporter. The EMBO journal 13, 1752-1759. Damelin, M., Simon, I., Moy, T.I., Wilson, B., Komili, S., Tempst, P., Roth, F.P., Young, R.A., Cairns, B.R., and Silver, P.A. (2002). The genome-wide localization of Rsc9, a component of the RSC chromatin-remodeling complex, changes in response to stress. Molecular cell 9, 563-573. Deardorff, M.A., Kaur, M., Yaeger, D., Rampuria, A., Korolev, S., Pie, J., Gil- Rodriguez, C., Arnedo, M., Loeys, B., Kline, A.D., Wilson, M., Lillquist, K., Siu, V., Ramos, F.J., Musio, A., Jackson, L.S., Dorsett, D., and Krantz, I.D. (2007). Mutations in cohesin complex members SMC3 and SMC1A cause a mild variant of cornelia de Lange syndrome with predominant mental retardation. American journal of human genetics 80, 485-494. Deng, Y., Liu, J., Zheng, Q., Eliezer, D., Kallenbach, N.R., and Lu, M. (2006). Antiparallel four-stranded coiled coil specified by a 3-3-1 hydrophobic heptad repeat. Structure 14, 247-255. Deng, Y., Zheng, Q., Liu, J., Cheng, C.S., Kallenbach, N.R., and Lu, M. (2007). Self- assembly of coiled-coil tetramers in the 1.40 A structure of a leucine-zipper mutant. Protein Sci 16, 323-328. Donze, D., Adams, C.R., Rine, J., and Kamakaka, R.T. (1999). The boundaries of the silenced HMR domain in Saccharomyces cerevisiae. Genes & development 13, 698-708. Ducker, C.D., and Simpson, R.T. (2000). The organized chromatin domain of the repressed yeast a cell-specific gene STE6 contains two molecules of the corepressor Tup1p per nucleosome. The EMBO journal 19, 400-409. 147 Eijpe, M., Heyting, C., Gross, B., and Jessberger, R. (2000). Association of mammalian SMC1 and SMC3 proteins with meiotic chromosomes and synaptonemal complexes. Journal of cell science 113 ( Pt 4), 673-682. Funabiki, H., Yamano, H., Kumada, K., Nagao, K., Hunt, T., and Yanagida, M. (1996). Cut2 proteolysis required for sister-chromatid seperation in fission yeast. 
Nature 381, 438-441. Gilliland, W.D., and Hawley, R.S. (2005). Cohesin and the maternal age effect. Cell 123, 371-373. Glynn, E.F., Megee, P.C., Yu, H.G., Mistrot, C., Unal, E., Koshland, D.E., DeRisi, J.L., and Gerton, J.L. (2004). Genome-wide mapping of the cohesin complex in the yeast Saccharomyces cerevisiae. PLoS biology 2, E259. Gruber, S., Haering, C.H., and Nasmyth, K. (2003). Chromosomal cohesin forms a ring. Cell 112, 765-777. Gruber, S., Arumugam, P., Katou, Y., Kuglitsch, D., Helmhart, W., Shirahige, K., and Nasmyth, K. (2006). Evidence that loading of cohesin onto chromosomes involves opening of its SMC hinge. Cell 127, 523-537. Guacci, V., Koshland, D., and Strunnikov, A. (1997). A direct link between sister chromatid cohesion and chromosome condensation revealed through the analysis of MCD1 in S. cerevisiae. Cell 91, 47-57. Haering, C.H., Lowe, J., Hochwagen, A., and Nasmyth, K. (2002). Molecular architecture of SMC proteins and the yeast cohesin complex. Molecular cell 9, 773-788. Hall, I.M., Noma, K., and Grewal, S.I. (2003). RNA interference machinery regulates chromosome dynamics during mitosis and meiosis in fission yeast. Proceedings of the National Academy of Sciences of the United States of America 100, 193-198. Hartman, T., Stead, K., Koshland, D., and Guacci, V. (2000). Pds5p is an essential chromosomal protein required for both sister chromatid cohesion and condensation in Saccharomyces cerevisiae. The Journal of cell biology 151, 613- 626. Hauf, S., Roitinger, E., Koch, B., Dittrich, C.M., Mechtler, K., and Peters, J.M. (2005). Dissociation of cohesin from chromosome arms and loss of arm cohesion during early mitosis depends on phosphorylation of SA2. PLoS biology 3, e69. Hirano, M., and Hirano, T. (1998). ATP-dependent aggregation of single-stranded DNA by a bacterial SMC homodimer. The EMBO journal 17, 7139-7148. Hirano, T. (1998). SMC protein complexes and higher-order chromosome dynamics. Current opinion in cell biology 10, 317-322. 
Hirano, T. (2005). SMC proteins and chromosome mechanics: from bacteria to humans. Philosophical transactions of the Royal Society of London 360, 507-514. Hirano, T., Kobayashi, R., and Hirano, M. (1997). Condensins, chromosome condensation protein complexes containing XCAP-C, XCAP-E and a Xenopus homolog of the Drosophila Barren protein. Cell 89, 511-521. Hodges, C.A., Revenkova, E., Jessberger, R., Hassold, T.J., and Hunt, P.A. (2005). SMC1beta-deficient female mice provide evidence that cohesins are a missing link in age-related nondisjunction. Nature genetics 37, 1351-1355. 148 Huang, C.E., Milutinovich, M., and Koshland, D. (2005). Rings, bracelet or snaps: fashionable alternatives for Smc complexes. Philosophical transactions of the Royal Society of London 360, 537-542. Huang, J., and Laurent, B.C. (2004). A Role for the RSC chromatin remodeler in regulating cohesion of sister chromatid arms. Cell Cycle 3, 973-975. Huang, J., Hsu, J.M., and Laurent, B.C. (2004). The RSC nucleosome-remodeling complex is required for Cohesin's association with chromosome arms. Molecular cell 13, 739-750. Ivanov, D., and Nasmyth, K. (2005). A topological interaction between cohesin rings and a circular minichromosome. Cell 122, 849-860. Ivanov, D., Schleiffer, A., Eisenhaber, F., Mechtler, K., Haering, C.H., and Nasmyth, K. (2002). Eco1 is a novel acetyltransferase that can acetylate proteins involved in cohesion. Curr Biol 12, 323-328. Katis, V.L., Galova, M., Rabitsch, K.P., Gregan, J., and Nasmyth, K. (2004). Maintenance of cohesin at centromeres after meiosis I in budding yeast requires a kinetochore-associated protein related to MEI-S332. Curr Biol 14, 560-572. Kerrebrock, A.W., Moore, D.P., Wu, J.S., and Orr-Weaver, T.L. (1995). Mei-S332, a Drosophila protein required for sister-chromatid cohesion, can localize to meiotic centromere regions. Cell 83, 247-256. Kim, S.T., Xu, B., and Kastan, M.B. (2002). 
Involvement of the cohesin protein, Smc1, in Atm-dependent and independent responses to DNA damage. Genes & development 16, 560-570. Kimura, K., and Hirano, T. (1997). ATP-dependent positive supercoiling of DNA by 13S condensin: a biochemical implication for chromosome condensation. Cell 90, 625-634. Kitagawa, R., Law, E., Tang, L., and Rose, A.M. (2002). The Cdc20 homolog, FZY-1, and its interacting protein, IFY-1, are required for proper chromosome segregation in Caenorhabditis elegans. Curr Biol 12, 2118-2123. Kitajima, T.S., Yokobayashi, S., Yamamoto, M., and Watanabe, Y. (2003). Distinct cohesin complexes organize meiotic chromosome domains. Science (New York, N.Y 300, 1152-1155. Kitajima, T.S., Hauf, S., Ohsugi, M., Yamamoto, T., and Watanabe, Y. (2005). Human Bub1 defines the persistent cohesion site along the mitotic chromosome by affecting Shugoshin localization. Curr Biol 15, 353-359. Kumar, S., Tamura, K., and Nei, M. (2004). MEGA3: Integrated software for Molecular Evolutionary Genetics Analysis and sequence alignment. Briefings in bioinformatics 5, 150-163. Laloraya, S., Guacci, V., and Koshland, D. (2000). Chromosomal addresses of the cohesin component Mcd1p. The Journal of cell biology 151, 1047-1056. Lam, W.W., Peterson, E.A., Yeung, M., and Lavoie, B.D. (2006). Condensin is required for chromosome arm cohesion during mitosis. Genes & development 20, 2973-2984. Lengronne, A., Katou, Y., Mori, S., Yokobayashi, S., Kelly, G.P., Itoh, T., Watanabe, Y., Shirahige, K., and Uhlmann, F. (2004). Cohesin relocation from 149 sites of chromosomal loading to places of convergent transcription. Nature 430, 573-578. Li, J., Xu, M., Zhou, H., Ma, J., and Potter, H. (1997). Alzheimer presenilins in the nuclear membrane, interphase kinetochores, and centrosomes suggest a role in chromosome segregation. Cell 90, 917-927. Liu, J., Deng, Y., Zheng, Q., Cheng, C.S., Kallenbach, N.R., and Lu, M. (2006a). A parallel coiled-coil tetramer with offset helices(,). 
Biochemistry 45, 15224-15231. Liu, J., Zheng, Q., Deng, Y., Cheng, C.S., Kallenbach, N.R., and Lu, M. (2006b). A seven-helix coiled coil. Proceedings of the National Academy of Sciences of the United States of America 103, 15457-15462. Losada, A., Hirano, M., and Hirano, T. (2002). Cohesin release is required for sister chromatid resolution, but not for condensin-mediated compaction, at the onset of mitosis. Genes & development 16, 3004-3016. Losada, A., Yokochi, T., Kobayashi, R., and Hirano, T. (2000). Identification and characterization of SA/Scc3p subunits in the Xenopus and human cohesin complexes. The Journal of cell biology 150, 405-416. McGuinness, B.E., Hirota, T., Kudo, N.R., Peters, J.M., and Nasmyth, K. (2005). Shugoshin prevents dissociation of cohesin from centromeres during mitosis in vertebrate cells. PLoS biology 3, e86. Megee, P.C., Mistrot, C., Guacci, V., and Koshland, D. (1999). The centromeric sister chromatid cohesion site directs Mcd1p binding to adjacent sequences. Molecular cell 4, 445-450. Melby, T.E., Ciampaglio, C.N., Briscoe, G., and Erickson, H.P. (1998). The symmetrical structure of structural maintenance of chromosomes (SMC) and MukB proteins: long, antiparallel coiled coils, folded at a flexible hinge. The Journal of cell biology 142, 1595-1604. Milutinovich, M., Unal, E., Ward, C., Skibbens, R.V., and Koshland, D. (2007). A Multi-Step Pathway for the Establishment of Sister Chromatid Cohesion. PLoS Genet 3, e12. Montpetit, B., Thorne, K., Barrett, I., Andrews, K., Jadusingh, R., Hieter, P., and Measday, V. (2005). Genome-wide synthetic lethal screens identify an interaction between the nuclear envelope protein, Apq12p, and the kinetochore in Saccharomyces cerevisiae. Genetics 171, 489-501. Musio, A., Selicorni, A., Focarelli, M.L., Gervasini, C., Milani, D., Russo, S., Vezzoni, P., and Larizza, L. (2006). X-linked Cornelia de Lange syndrome owing to SMC1L1 mutations. Nature genetics 38, 528-530. 
Ohbayashi, T., Oikawa, K., Yamada, K., Nishida-Umehara, C., Matsuda, Y., Satoh, H., Mukai, H., Mukai, K., and Kuroda, M. (2007). Unscheduled overexpression of human WAPL promotes chromosomal instability. Biochemical and biophysical research communications 356, 699-704. Pati, D., Zhang, N., and Plon, S.E. (2002). Linking sister chromatid cohesion and apoptosis: role of Rad21. Molecular and cellular biology 22, 8267-8277. Pedersen, P.L. (2005). Transport ATPases: structure, motors, mechanism and medicine: a brief overview. Journal of bioenergetics and biomembranes 37, 349-357. 150 Petronczki, M., Chwalla, B., Siomos, M.F., Yokobayashi, S., Helmhart, W., Deutschbauer, A.M., Davis, R.W., Watanabe, Y., and Nasmyth, K. (2004). Sister-chromatid cohesion mediated by the alternative RF-CCtf18/Dcc1/Ctf8, the helicase Chl1 and the polymerase-alpha-associated protein Ctf4 is essential for chromatid disjunction during meiosis II. Journal of cell science 117, 3547-3559. Rabitsch, K.P., Gregan, J., Schleiffer, A., Javerzat, J.P., Eisenhaber, F., and Nasmyth, K. (2004). Two fission yeast homologs of Drosophila Mei-S332 are required for chromosome segregation during meiosis I and II. Curr Biol 14, 287- 301. Ren, Q., Yang, H., Rosinski, M., Conrad, M.N., Dresser, M.E., Guacci, V., and Zhang, Z. (2005). Mutation of the cohesin related gene PDS5 causes cell death with predominant apoptotic features in Saccharomyces cerevisiae during early meiosis. Mutation research 570, 163-173. Revenkova, E., and Jessberger, R. (2005). Keeping sister chromatids together: cohesins in meiosis. Reproduction (Cambridge, England) 130, 783-790. Revenkova, E., and Jessberger, R. (2006). Shaping meiotic prophase chromosomes: cohesins and synaptonemal complex proteins. Chromosoma 115, 235-240. Revenkova, E., Eijpe, M., Heyting, C., Gross, B., and Jessberger, R. (2001). Novel meiosis-specific isoform of mammalian SMC1. Molecular and cellular biology 21, 6984-6998. 
Revenkova, E., Eijpe, M., Heyting, C., Hodges, C.A., Hunt, P.A., Liebe, B., Scherthan, H., and Jessberger, R. (2004). Cohesin SMC1 beta is required for meiotic chromosome dynamics, sister chromatid cohesion and DNA recombination. Nature cell biology 6, 555-562. Rollins, R.A., Korom, M., Aulner, N., Martens, A., and Dorsett, D. (2004). Drosophila nipped-B protein supports sister chromatid cohesion and opposes the stromalin/Scc3 cohesion factor to facilitate long-range activation of the cut gene. Molecular and cellular biology 24, 3100-3111. Ruan, C., Workman, J.L., and Simpson, R.T. (2005). The DNA repair protein yKu80 regulates the function of recombination enhancer during yeast mating type switching. Molecular and cellular biology 25, 8476-8485. Rusche, L.N., Kirchmaier, A.L., and Rine, J. (2003). The establishment, inheritance, and function of silenced chromatin in Saccharomyces cerevisiae. Annual review of biochemistry 72, 481-516. Saitoh, N., Goldberg, I., and Earnshaw, W.C. (1995). The SMC proteins and the coming of age of the chromosome scaffold hypothesis. Bioessays 17, 759-766. Schar, P., Fasi, M., and Jessberger, R. (2004). SMC1 coordinates DNA double-strand break repair pathways. Nucleic acids research 32, 3921-3929. Schleiffer, A., Kaitna, S., Maurer-Stroh, S., Glotzer, M., Nasmyth, K., and Eisenhaber, F. (2003). Kleisins: a superfamily of bacterial and eukaryotic SMC protein partners. Molecular cell 11, 571-575. Simpson, R.T., Ducker, C.E., Diller, J.D., and Ruan, C. (2004). Purification of native, defined chromatin segments. Methods in enzymology 375, 158-170. 151 Simpson, R.T., Roth, S.Y., Morse, R.H., Patterton, H.G., Cooper, J.P., Murphy, M., Kladde, M.P., and Shimizu, M. (1993). Nucleosome positioning and transcription. Cold Spring Harbor symposia on quantitative biology 58, 237-245. Sjogren, C., and Nasmyth, K. (2001). Sister chromatid cohesion is required for postreplicative double-strand break repair in Saccharomyces cerevisiae. 
Curr Biol 11, 991-995. Smerdon, M.J., and Thoma, F. (1990). Site-specific DNA repair at the nucleosome level in a yeast minichromosome. Cell 61, 675-684. Soppa, J. (2001). Prokaryotic structural maintenance of chromosomes (SMC) proteins: distribution, phylogeny, and comparison with MukBs and additional prokaryotic and eukaryotic coiled-coil proteins. Gene 278, 253-264. Sumara, I., Vorlaufer, E., Stukenberg, P.T., Kelm, O., Redemann, N., Nigg, E.A., and Peters, J.M. (2002). The dissociation of cohesin from chromosomes in prophase is regulated by Polo-like kinase. Molecular cell 9, 515-525. Takeda, T., Ogino, K., Tatebayashi, K., Ikeda, H., Arai, K., and Masai, H. (2001). Regulation of initiation of S phase, replication checkpoint signaling, and maintenance of mitotic chromosome structures during S phase by Hsk1 kinase in the fission yeast. Molecular biology of the cell 12, 1257-1274. Tanaka, T., Cosma, M.P., Wirth, K., and Nasmyth, K. (1999). Identification of cohesin association sites at centromeres and along chromosome arms. Cell 98, 847-858. Toth, A., Ciosk, R., Uhlmann, F., Galova, M., Schleiffer, A., and Nasmyth, K. (1999). Yeast cohesin complex requires a conserved protein, Eco1p(Ctf7), to establish cohesion between sister chromatids during DNA replication. Genes & development 13, 320-333. Uhlmann, F., Lottspeich, F., and Nasmyth, K. (1999). Sister-chromatid separation at anaphase onset is promoted by cleavage of the cohesin subunit Scc1. Nature 400, 37-42. Unal, E., Arbel-Eden, A., Sattler, U., Shroff, R., Lichten, M., Haber, J.E., and Koshland, D. (2004). DNA damage response pathway uses histone modification to assemble a double-strand break-specific cohesin domain. Molecular cell 16, 991-1002. Volpe, T., Schramke, V., Hamilton, G.L., White, S.A., Teng, G., Martienssen, R.A., and Allshire, R.C. (2003). RNA interference is required for normal centromere function in fission yeast. Chromosome Res 11, 137-146. 
Waizenegger, I.C., Hauf, S., Meinke, A., and Peters, J.M. (2000). Two distinct pathways remove mammalian cohesin from chromosome arms in prophase and from centromeres in anaphase. Cell 103, 399-410.
Watanabe, Y., and Kitajima, T.S. (2005). Shugoshin protects cohesin complexes at centromeres. Philosophical transactions of the Royal Society of London 360, 515-521, discussion 521.
Whitson, S.R., LeStourgeon, W.M., and Krezel, A.M. (2005). Solution structure of the symmetric coiled coil tetramer formed by the oligomerization domain of hnRNP C: implications for biological function. Journal of molecular biology 350, 319-337.
Xu, M., Simpson, R.T., and Kladde, M.P. (1998). Gal4p-mediated chromatin remodeling depends on binding site position in nucleosomes but does not require DNA replication. Molecular and cellular biology 18, 1201-1212.
Yamamoto, A., Guacci, V., and Koshland, D. (1996). Pds1p, an inhibitor of anaphase in budding yeast, plays a critical role in the APC and checkpoint pathway(s). The Journal of cell biology 133, 99-110.
Zakian, V.A., and Scott, J.F. (1982). Construction, replication, and chromatin structure of TRP1 RI circle, a multiple-copy synthetic plasmid derived from Saccharomyces cerevisiae chromosomal DNA. Molecular and cellular biology 2, 221-232.

VITA

Alexandra Surcel

Education
The Pennsylvania State University, PhD in Cell and Developmental Biology, August 2007
Johns Hopkins University, Natural Sciences Public Health BA and Chemistry BA, May 2000

Awards
2006 Intercollege Graduate Student Outreach Achievement Award
2005-2007 NASA Space Grant Consortium Fellowship
2004 Second Place, Graduate Exhibition, Health and Life Science category, PSU
2000 National Science Foundation (NSF) honorable mention for the Graduate Research Fellowship program
1999, 1998 Provost Undergraduate Research Award for project entitled: “Characterization of Histidine Rich Proteins of Plasmodium falciparum”
1999 Howard Hughes Summer Research Fellowship for the proposal entitled: “Characterization of Histidine Rich Proteins of Plasmodium falciparum”
Selected to receive the Second Decade Society Summer grant for the proposal entitled: “Characterization of Histidine Rich Proteins of Plasmodium falciparum”

Presentations and Publications
In preparation: “Cohesin Interaction with Centromeric Minichromosomes Shows a Flexible Rod Multi-Complex Structure”
In preparation: “The origin of the cohesin specific vertebrate meiotic SMC1”
2005 American Society for Cell Biology national meeting, poster entitled “The interaction between DNA and the cohesin complex,” San Francisco, CA

Public Outreach and Service
2003-2007 Volunteer at Penn State BioDays
2003-2004 Founder and coordinator for Penn State BioDays
2001-2003 Volunteer and board member for Penn State Science Lions, school science education program
2000-2003 Volunteer at Penn State Astrofest

Work Experience from 1995
01/00 – 05/00 Naval Medical Research Institute; Lab of Dr. Wei-Mei Ching
09/97 – 01/00 “Characterization of Histidine Rich Proteins of Plasmodium falciparum and their Role in Hemozoin Formation;” W. Harry Feinstone Department of Molecular Microbiology and Immunology, Johns Hopkins School of Hygiene and Public Health; Lab of Dr. David Sullivan
09/97 – 12/99 Undergraduate tutor in Organic Chemistry I, various lab classes
09/99 – 01/00 Study Consultant; Office of Academic Advising, Johns Hopkins University
07/97 – 03/98 Contractor for the National Bureau of Economic Research, Cambridge, MA
06/96 – 07/97 “Deformed Homeodomain Site Directed Mutagenesis;” Chemistry Department, Johns Hopkins University; Lab of Dr. Thomas Tullius
09/95 – 05/96 “Characterization of Gal Repressor Protein through Site Directed Mutagenesis;” Biology Department, Johns Hopkins University; Lab of Dr. Ludwig Brand