THE PENNSYLVANIA STATE UNIVERSITY SCHREYER HONORS COLLEGE

DEPARTMENT OF BIOCHEMISTRY AND MOLECULAR BIOLOGY

EXPRESSION AND PURIFICATION OF THE HBO1/JADE1 HISTONE ACETYLTRANSFERASE COMPLEX

VIKTOR TOLLEMAR SPRING 2013

A thesis submitted in partial fulfillment of the requirements for a baccalaureate degree in Premedicine with honors in Biochemistry and Molecular Biology

Reviewed and approved* by the following:

Song Tan Professor of Biochemistry and Molecular Biology Thesis Supervisor

David Gilmour Professor of Biochemistry and Molecular Biology Honors Adviser

Scott Selleck Professor and Head, Department of Biochemistry and Molecular Biology

* Signatures are on file in the Schreyer Honors College.

ABSTRACT

Normal human development and cell function require the regulation of gene expression. Our genetic material is packaged as chromatin and this packaging directly affects how genes are regulated. The repeating unit of chromatin, the nucleosome, consists of DNA wrapped around an octamer of histone proteins. An important field of epigenetic research focuses on chromatin enzymes that regulate gene expression by chemically modifying histone proteins in a nucleosome. Specifically, the histone acetyltransferase (HAT) class of enzymes acetylates histone tails, a chemical modification that is closely associated with gene activation. The histone acetyltransferase binding to ORC (HBO1) HAT enzyme is of particular interest to me. HBO1 binds to the origin recognition complex (ORC), which is critical to the initiation and regulation of DNA replication. The exact relationship between acetylation and initiation of DNA replication remains unclear, but it is known that HBO1’s HAT enzymatic activity is significant in DNA replication. It is also known that HBO1 requires a protein partner called JADE1 in order to effectively carry out its catalytic function on the nucleosome. How the HBO1/JADE1 complex interacts with the nucleosome is currently under investigation. The ultimate goal of my project is to determine the structure of the HBO1/JADE1 complex bound to the nucleosome in order to elucidate the specific physical interactions between complex and nucleosome. To that end, I have created a polycistronic expression vector containing both genes, I have co-expressed the HBO1/JADE1 complex, and I have attempted to purify the complex using metal affinity chromatography.

TABLE OF CONTENTS

List of Figures...... v

List of Tables ...... vii

Acknowledgements ...... viii

Chapter 1 Introduction ...... 1

1.1 Chromatin ...... 1

1.2 Chromatin Modification...... 3

1.3 DNA Replication ...... 7

1.4 Regulation by HBO1 ...... 8

1.5 Protein Expression ...... 10

1.6 Polycistronic Expression System ...... 10

1.7 Protein Purification ...... 14

1.8 Summary ...... 15

Chapter 2 Materials & Methods ...... 17

2.1 Optimizing Purification ...... 17

2.2 Nomenclature Guide ...... 19

2.3 Subcloning ...... 20

2.3.1 PCR ...... 20

2.3.2 Restriction Enzyme Digestion ...... 21

2.3.3 Agarose Gel Purification ...... 22

2.3.4 Ligation ...... 23

2.3.5 Transformation ...... 23

2.3.6 PCR Screening ...... 24

2.3.7 Plasmid Isolation ...... 25

2.3.8 Restriction Mapping ...... 26

2.3.9 Sequencing ...... 27

2.3.10 Subcloning Flow Charts...... 27

2.4 Supplemental DNA Methods ...... 32

2.4.1 Ethanol Precipitation ...... 32

2.4.2 Phenol/Chloroform Extraction ...... 32

2.4.3 Agarose Gel Electrophoresis ...... 33

2.4.4 UV Quantitation of DNA ...... 34

2.4.5 Site Directed PCR Mutagenesis ...... 34

2.5 Protein Expression ...... 36

2.5.1 Small-Scale Protein Expression ...... 36

2.5.2 Experiments Performed ...... 37

2.6 Purification ...... 39

2.6.1 Checking Solubility ...... 39

2.6.2 Small-Scale TALON™ Affinity Purification ...... 40

2.6.3 Experiments Performed ...... 41

2.7 Supplemental Protein Methods ...... 43

2.7.1 SDS-PAGE ...... 43

2.7.2 Purification by Q-Sepharose Ion Exchange ...... 44

Chapter 3 Results/Discussion ...... 46

3.1 Poly-histidine Affinity Tag Purification ...... 46

3.2 LYTAG Affinity Tag Purification ...... 50

3.3 Expressing the HBO1/JADE1 Complex ...... 52

Chapter 4 Conclusion ...... 69

4.1 Future Directions ...... 69

4.2 Summary ...... 72

4.3 Epigenetics and Cancer ...... 73

Appendix A DNA Sequences ...... 74

Appendix B Molecular Weights ...... 89

Appendix C Solutions ...... 89

REFERENCES ...... 91

LIST OF FIGURES

Figure 1-1. The Nucleosome Core Particle...... 2

Figure 1-2. Acetylation/Deacetylation's Effect on Chromatin Structure...... 4

Figure 1-3. Bromodomain Interactions with Histone H4 Acetylated Lysine ...... 6

Figure 1-4. pST44 Polycistronic Expression Vector Map...... 12

Figure 1-5. pST66Tr Transfer Vector Map...... 13

Figure 1-6. pST69 Polycistronic Expression Vector Map...... 13

Figure 1-7. Model for TALON Affinity Interactions ...... 15

Figure 1-8. Project Roadmap...... 16

Figure 2-1. Typical PCR Thermal Cycle Pattern ...... 21

Figure 2-2. Creating 8x and 10x.HIS Tagged yEpl1 for Piccolo NuA4 ...... 28

Figure 2-3. Creating HST/LYTAG Tagged DHFR and EGFP...... 29

Figure 2-4. Creating HBO1/JADE1 Complex, SUMO/HST Tagged on HBO1 ...... 30

Figure 2-5. Creating HBO1/JADE1 Complex and HBO1/JADE1∆1 Complex...... 31

Figure 3-1. TALON Purification of 6x, 8x, and 10x Histidine Tagged Piccolo NuA4 Complex...... 47

Figure 3-2. TALON Purification of 6x, 8x, and 10x Histidine Tagged LSD1/CoREST Complex...... 49

Figure 3-3. Q-Sepharose and TALON purification of HST/LYTAG Tagged DHFR and EGFP...... 51

Figure 3-4. Problematic Arginine Residues in SUMO/HST Tagged HBO1 ...... 53

Figure 3-5. Expression (28°C and 37°C) and TALON Purification of HBO1/JADE1 Complex in BL21 (DE3) pLysS...... 55

Figure 3-6. Expression (37°C) and TALON Purification of HBO1/JADE1 Complex in BL21-CodonPlus (DE3) ...... 58

Figure 3-7. Expression of HBO1 in BL21-CodonPlus (DE3) at 37°C...... 60

Figure 3-8. Expression of SUMO/HST tagged HBO1 in BL21-CodonPlus (DE3) at 37°C. .. 62

Figure 3-9. Expression of JADE1 and JADE1∆1 in BL21-CodonPlus (DE3) at 37°C ...... 63

Figure 3-10. Expression of SUMO/HST tagged HBO1/JADE1 complex in BL21-CodonPlus (DE3) at 28°C and 37°C...... 65

Figure 3-11. Expression (18°C and 23°C) and Purification of SUMO/HST tagged HBO1 in BL21-CodonPlus (DE3) ...... 67

LIST OF TABLES

Table 2-1. Appropriate Agarose Concentrations for Visualization of Variable-length DNA Fragments ...... 33

Table 3-1. Piccolo NuA4 Complex Molecular Weights ...... 46

Table 3-2. LSD1 Complex Molecular Weights...... 48

Table 3-3. DHFR & EGFP Molecular Weights...... 50

Table 3-4. HBO1/JADE1 Complex (Version 1) Molecular Weights ...... 52

Table 3-5. HBO1/JADE1 Complex (Version 2) Molecular Weights ...... 64

ACKNOWLEDGEMENTS

I would like to thank all the members of the Tan laboratory who have helped me in my studies and worked with me for the past three and a half years. These include post-doctoral fellows, graduate students, research technicians, and fellow undergraduates. The road I have taken has been challenging and certainly also frustrating at times, but I value the skills and independence I have gained from my experiences in the laboratory. Finally, I cannot neglect to mention the profound impact Dr. Tan’s interest in my work has had on me. The contributions that Dr. Tan makes are invaluable to me. The processes involved can be tedious, and meticulous attention to detail and technique is necessary. I greatly appreciate having access to Dr. Tan and his years of experience. As a mentor to me, he has also taught me far-reaching lessons of how to do science. Dr. Tan has shown me that science involves so much more than following protocols and compiling data. It is a way to think critically, to apply knowledge, and to express oneself effectively.

Chapter 1

Introduction

1.1 Chromatin

Chromatin refers to DNA complexed with protein within the nucleus of every eukaryotic cell. Chromatin has three primary functions. The first is spatial packaging, so that the large amount of DNA can fit inside the small enclosure of the nucleus (Li et al, 2011). The second is to facilitate the condensation of chromosomes during mitosis (Wei et al, 1999). The third is the regulation of gene expression. The way DNA is packaged directly affects how genes are expressed within the cell, and it is this function that is critically important to gene regulation research.

The repeating unit of chromatin, the nucleosome, consists of roughly 146 base pairs of DNA wrapped around an octamer of histone proteins (Fig. 1-1) (Li et al, 2011). The octamer contains two copies each of four histone proteins called H2A, H2B, H3, and H4. Dipoles resulting from successive peptide bonds in alpha helices are present within all four histones and result in net positive charges along the outer perimeter of the histone octamer (Luger et al, 1997). The positive charges localize on the N-termini of the alpha helices and facilitate the formation of the nucleosome by fixing the position of individual phosphate groups in the DNA backbone (Luger et al, 1997). Hydrogen bonds also form between the DNA backbone and amide nitrogens near the N-terminal ends of these alpha helices (Luger et al, 1997). Additional hydrogen bonding occurs between oxygen atoms of the DNA phosphate groups and basic side chains of the protein (Luger et al, 1997). Thus, basic residues such as lysine and arginine play a critical role in DNA/histone interactions.

Figure provided by Song Tan; Ref: Davey et al, 2002 (PDB ID: 1KX5)

Figure 1-1. The Nucleosome Core Particle

In chromatin, nucleosome core particles (pictured above) are joined end to end by linker DNA of variable length. Many enzymes bind to the nucleosome and influence gene expression through chemical modifications. These enzymes are vitally important to the regulation of gene expression and fall within the field of epigenetics. In its broadest sense, epigenetics is the study of gene expression variability that results from means other than direct changes in the genomic sequence (Holliday, 2006). Nucleosome-modifying enzymes constitute a key component of epigenetics and are a large contributing factor to the plasticity and conformational variability of chromatin (Holliday, 2006). How some of these epigenetic enzymes interact with the nucleosome is less clear and is the focus of ongoing structural studies in the Tan laboratory.

1.2 Chromatin Modification

Normal human development and cell function require the regulation of gene expression. Our genetic material is packaged as chromatin and this packaging directly affects how genes are regulated. Repeating nucleosome core particles are able to condense and pack DNA very tightly, much too tightly to allow transcription factors to initiate transcription and gene expression. Chromatin-modifying enzymes are necessary for "unwinding" tightly packaged chromatin and allowing transcription initiation. In order to achieve this unwinding effect, these enzymes function in a variety of ways ranging from direct loosening of the DNA/histone interaction, to rearranging nucleosome positioning along a stretch of DNA, to recruiting other enzymes in a gene-activation pathway (Serna et al, 2006).

Some epigenetic chromatin enzymes are able to regulate gene expression by chemically modifying the histone proteins in a nucleosome. In particular, the histone acetyltransferase (HAT) class of enzymes acetylates lysine residues on N-terminal histone tails, a chemical modification that is generally closely associated with gene activation. One theory of how this process works is that acetylation alters the charge of the histone core, reducing its affinity for DNA (Grunstein et al, 1997). The acetyl functional group contains a methyl group bonded to a carbonyl. Attaching it to the side-chain amino group of a lysine converts the positively charged amine into an uncharged amide, which neutralizes the histone's positive charge at that residue and thereby weakens the interaction between histone octamer and DNA within the nucleosome.

Chromatin composed of nucleosomes in this acetylated, loosened conformation is generally in a state of active transcription and is called euchromatin. This contrasts sharply with heterochromatin, chromatin that is tightly packed, not actively transcribing genes, and normally hypoacetylated (Fig. 1-2) (Kueh et al, 2011).

Key: A, acetyl group; M, methyl group; P, phosphate group; U, ubiquitin group

Image Source: Shukla et al, 2008

Figure 1-2. Acetylation/Deacetylation's Effect on Chromatin Structure (A) Heterochromatin is tightly packed and characterized by hypoacetylation. (B) Euchromatin illustrates a looser conformation. (C) Transcriptionally active euchromatin is characterized by acetylated histone tails.

However, this is a generalization that certainly does not hold true in all cases. Histone acetylation, specifically H4 acetylation at lysine 16, has been linked with gene silencing in Drosophila melanogaster and in yeast (Grunstein et al, 1997). It seems that the roles of histone acetylation are varied and diverse, and the mechanisms by which many HAT enzymes exert their effects remain unknown.

For example, weakening the histone/DNA interaction is not the only possible mechanism for how histone acetylation influences gene expression. Acetylation can also establish epigenetic signals that recruit other proteins. Nuclear effector proteins often contain protein domains that recognize chemical modifications such as acetylation, ubiquitination, methylation, and phosphorylation (Saksouk et al, 2009). Specifically, bromodomains are capable of recognizing acetylated lysine residues on histone H4 (Moriniere et al, 2009). Figure 1-3 illustrates the bonding network formed by the acetylated N-terminal lysine on histone H4 in a complex with a bromodomain protein module. A system of hydrogen bonds forms between nearby asparagine and proline residues on the bromodomain and adjacent water molecules (Owen et al, 2000). An unacetylated lysine would not form this network of bonds, leaving a localized positive charge and making binding to the bromodomain thermodynamically unfavorable. Thus, the charge-neutralizing effect of histone acetylation can serve a dual function: to reduce histone affinity for its surrounding DNA and/or to create favorable conditions for bromodomain binding.

Image Source: Owen et al, 2000

Figure 1-3. Bromodomain Interactions with Histone H4 Acetylated Lysine

Histone chemical modification, specifically acetylation, is also intimately involved in the initiation of DNA replication. The tight regulation of this process is essential for cellular survival. Epigenetic mechanisms regulate the assembly of a pre-replicative complex (pre-RC) and ensure that the genome is copied only during S-phase and only once each cell cycle. Various regulatory mechanisms have already been characterized and include ATP utilization, cyclin-dependent kinase abundance, protein turnover rate, and Cdt1-geminin binding (Iizuka et al, 2012).

1.3 DNA Replication

DNA replication is a process that is essential to all living organisms and is a necessary precursor for cell division. In eukaryotes, replication of each chromosome begins at many origins of replication, which are recognized by an origin recognition complex (ORC). The ORC recruits additional proteins and facilitates the formation of the pre-replicative complex (pre-RC), which unwinds and separates the double helix at the "origin" (Ladenburger et al, 2002). Origins of replication typically form along regions of DNA rich in adenine-thymine base pairs (Ladenburger et al, 2002). Adenine and thymine form only two hydrogen bonds between them whereas guanine and cytosine form three. A-T rich regions of DNA thus have a relatively weak interaction between complementary strands, facilitating separation by the pre-RC.
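The hydrogen-bond argument above can be made concrete with a small calculation. The following is a minimal sketch in Python, not part of the thesis methods: the two 12 bp sequences are hypothetical and chosen only for illustration, and counting hydrogen bonds is a crude proxy for duplex stability because it ignores base stacking.

```python
# Watson-Crick hydrogen bonds per base pair (A-T: 2, G-C: 3), as stated in the text.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def at_fraction(seq: str) -> float:
    """Return the fraction of A or T bases in one strand of a duplex."""
    seq = seq.upper()
    return sum(base in "AT" for base in seq) / len(seq)


def hydrogen_bonds(seq: str) -> int:
    """Count the Watson-Crick hydrogen bonds holding the duplex together."""
    return sum(H_BONDS[base] for base in seq.upper())


# Hypothetical 12 bp sequences, chosen only for illustration.
at_rich = "ATTTATAATTTA"
gc_rich = "GCCGGCGCCGGC"

for name, seq in (("AT-rich", at_rich), ("GC-rich", gc_rich)):
    print(name, f"AT fraction = {at_fraction(seq):.2f}", f"H-bonds = {hydrogen_bonds(seq)}")
```

For two duplexes of equal length, the A-T rich one is held together by fewer hydrogen bonds, which is the sense in which it is easier to separate.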

Through a series of protein-recruiting pathways, pre-RC initiates the formation of the replisome, a complex containing DNA polymerase as well as various other proteins necessary for replication itself. DNA helicase is one such protein and serves to destabilize and unwind the double helix at each replication fork of the origin of replication (Benkovic et al, 2001). Single-strand binding (SSB) proteins prevent reannealing of separated DNA strands at the replication forks. Topoisomerase relieves adjacent chromatin from its native, tightly-coiled conformation (Nitiss, 1998). These proteins, as well as many more, are critical to the function of the replisome.

Importantly, DNA polymerase can only add nucleotides to the 3' end of an existing DNA strand. Thus there will necessarily be a "leading strand," which is extended continuously in the 5' to 3' direction, and a "lagging strand," which can only be polymerized in short segments from 5' to 3' but has a net growth from 3' to 5'. The fragments of the lagging strand, known as Okazaki fragments, are ligated together by yet another component of the replisome called DNA ligase (Benkovic et al, 2001). Because DNA polymerase can only add nucleotides to an existing strand, each new Okazaki fragment of the lagging strand also requires a short RNA primer to extend from. This primer is laid down by another replisome protein called primase (Benkovic et al, 2001).

DNA replication in eukaryotes is a tightly regulated process and requires very precise coordination in order to ensure that only one additional copy of the genome is made each cell cycle. This precision is achieved through a series of pathways involving the formation of several key regulatory complexes at the origins of replication. Three have already been discussed above: ORC, pre-RC, and replisome. Interestingly, these protein assemblies are highly conserved among nearly all eukaryotic organisms, despite major structural differences in the origin of replication of the respective organisms (Bell et al, 2002). The structural consistency of these regulatory complexes across species is indicative of their importance in promoting survival, both on the cellular and organismal level.

1.4 Regulation by HBO1

Histone acetylation is believed to be a critical mechanism in the regulation of pre-RC assembly and the initiation of DNA replication (Tomas et al, 2007). The histone acetyltransferase binding to ORC (HBO1) binds to the origin recognition complex (ORC), which serves as a platform for pre-RC assembly and is critical to the initiation and regulation of DNA replication (Iizuka et al, 2006). The exact relationship between acetylation and initiation of DNA replication remains unclear, but it is known that HBO1’s HAT enzymatic activity is significant in DNA replication (Iizuka et al, 2006). Evidence for this includes in vitro and in vivo binding of HBO1 to both ORC and another important protein in the pre-RC, mini-chromosome maintenance protein MCM2 (Burke et al, 2001).

HBO1 is a member of the MYST (so named because of the founding protein members MOZ, Ybf2/Sas3, Sas2, and Tip60) family of HAT enzymes, characterized by a highly conserved MYST region containing a canonical acetyl-CoA-binding site and a C2HC zinc finger motif (Burke et al, 2001). The MYST family proteins play a key role in histone modification and thus profoundly impact chromatin structure and transcription (Avvakumov et al, 2007). Specifically, MYST histone acetyltransferases are major participants in a wide array of important nuclear processes. Unsurprisingly, their malfunction has been linked to a host of diseases involving gene regulation, including cancer (Avvakumov et al, 2007). HBO1 is one such critical MYST family HAT protein. It functions in DNA replication initiation and its HAT domain is largely responsible for H4 acetylation at lysine residues 5, 8, and 12 (Saksouk et al, 2009).

HBO1 requires other protein subunits to perform its catalytic function efficiently. In particular, HBO1 forms a protein complex together with a protein named JADE1. This binary complex is able to bind to and acetylate N-terminal histone tails of histone H4 in the nucleosome (Foy et al, 2008). Small interfering RNA-mediated diminution of endogenous JADE1 effectively reduces H4 acetylation, demonstrating the importance of complex formation in synergistically enhancing HBO1's catalytic activity (Foy et al, 2008). Still, the precise physical interactions between HBO1 and JADE1 remain uncharacterized. In order to effectively study HBO1/JADE1 function, it is critical to develop and optimize a protocol for expression and purification of this complex.

1.5 Protein Expression

A common scheme for expressing recombinant proteins from plasmids in E. coli uses a modified lac operon system. This system utilizes an E. coli strain that has been engineered to carry a gene coding for T7 RNA polymerase under the control of a lac operator. Lac repressor binding to the lac operator in the E. coli genome prevents T7 RNA polymerase expression. Isopropyl β-D-1-thiogalactopyranoside (IPTG) binds to lac repressor and triggers its release from the lac operator. Thus, the introduction of IPTG is used to induce expression of T7 RNA polymerase. T7 RNA polymerase will, in turn, bind to a second, lac operator-regulated T7 promoter in the expression plasmid and initiate expression of the protein(s) of interest. This more complicated T7-based system of recombinant gene expression is preferred over a system utilizing endogenous RNA polymerase because it largely prevents "leaky" background expression. With the T7-based method, it is possible to grow expression plasmid-containing cells that express next to none of the target protein or complex until induced by IPTG (pET system manual, 1999).
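The two layers of control described above can be summarized as simple logic. The following is an idealized sketch in Python, assuming strictly on/off behavior: the function names are hypothetical, and real cells show some residual background expression that a Boolean model cannot capture.

```python
def t7_polymerase_made(iptg_present: bool) -> bool:
    """Genomic T7 RNA polymerase gene: silent while lac repressor sits on its operator."""
    lac_repressor_bound = not iptg_present  # IPTG releases the repressor
    return not lac_repressor_bound


def target_expressed(iptg_present: bool, has_expression_plasmid: bool = True) -> bool:
    """Plasmid T7 promoter: needs T7 RNA polymerase and a repressor-free lac operator."""
    plasmid_operator_free = iptg_present
    return (
        has_expression_plasmid
        and t7_polymerase_made(iptg_present)
        and plasmid_operator_free
    )


for iptg in (False, True):
    print(f"IPTG added: {iptg} -> target protein expressed: {target_expressed(iptg)}")
```

In this simplified picture the target gene is blocked at two points before induction, which is why the T7-based scheme gives lower background than a single lac-controlled promoter.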

1.6 Polycistronic Expression System

Polycistronic expression involves an expression vector designed to transcribe several linked genes into a single strand of messenger RNA (mRNA). Coding genes may be inserted into one of four cassettes from a corresponding expression plasmid, or vector, using the appropriate restriction enzyme pair. Expression vectors specially designed to contain an insert flanked by particular restriction enzyme pairs (called "transfer vectors" in typical laboratory jargon) facilitate transfer into one of the four cassettes in the corresponding polycistronic expression vector (Fig. 1-4) (Tan, 2001). Each cassette contains a translational enhancer sequence, a Shine-Dalgarno (ribosome binding) sequence, and initiation and termination codons. T7 promoter and terminator sequences are only present before cassette 1 and after cassette 4, respectively. The net effect of the polycistronic expression template is efficient translation of unlinked protein subunits from a single strand of transcribed mRNA.

The Tan laboratory has developed and published a T7-based polycistronic expression plasmid, pST44 (Fig. 1-4), to coexpress protein complexes in E. coli (Tan et al, 2005). The polycistronic expression plasmid used to express the HBO1/JADE1 complex was pST69 (Fig. 1-6), an unpublished derivative of the pST44 plasmid. Importantly, pST69 contains sites for the SgrAI and RsrII restriction enzymes in its first translation cassette. These sites are extremely rare in genomic DNA, so the chance of a confounding internal restriction site within an insert is minimized (Bertaux et al, 2004).

As indicated in Figure 1-4, pST50Tr is the transfer vector precursor to pST44. pST50Tr facilitates transfer of genes into pST44 by utilizing matching restriction enzyme pairs. Figure 1-5 provides a vector map of pST66Tr, which is the transfer vector precursor (also unpublished) to pST69. pST66Tr facilitates transfer of genes into pST69 by utilizing matching restriction enzyme pairs.

A. Cassette Transfer Scheme for pST44

B. pST44 Vector Map

Modified from Figure 1, Tan et al, 2005

Figure 1-4. pST44 Polycistronic Expression Vector Map (A) pST50Tr transfer plasmids can be used to efficiently transfer genes into one of four expression cassettes in the pST44 polycistronic expression plasmid. (B) pST44 is designed with one T7 promoter upstream of cassette 1 and one terminator sequence downstream of cassette 4 so that all four genes can be transcribed into a single mRNA. Each cassette contains a Shine-Dalgarno sequence upstream of the gene of interest so that subunits are translated individually from the same length of mRNA.

Figure Provided by Song Tan

Figure 1-5. pST66Tr Transfer Vector Map pST66Tr is a transfer plasmid similar to pST44 but contains SgrAI and RsrII restriction sites in place of XbaI and BglII for insertion into cassette 1. A unique pair of restriction enzymes for each cassette facilitates transfer of up to four genes into the pST69 polycistronic expression plasmid.

Figure Provided by Song Tan

Figure 1-6. pST69 Polycistronic Expression Vector Map pST69 is a polycistronic expression plasmid designed with one T7 promoter upstream of cassette 1 and one terminator sequence downstream of cassette 4 so that all four genes can be transcribed into a single mRNA. Each cassette contains a Shine-Dalgarno sequence upstream of the gene of interest so that subunits are translated individually from the same length of mRNA.

1.7 Protein Purification

A simple means of purifying protein involves a poly-histidine affinity tag and a metal ion-based resin (typically nickel or cobalt). A solution of endogenous E. coli protein and poly-histidine tagged protein can be run through a column of cobalt-based resin, and only the tagged protein, or proteins with a high concentration of accessible histidine residues, will bind tightly enough to remain on the column after several washes. The cobalt-based resin used was TALON™ resin, whose affinity structure is centered on a cobalt cation. The positive charge on the cobalt binds a lattice of carboxyl groups and also the partial negative charge of a histidine side chain (Figure 1-7 A) (BD TALON™, 2004).

Imidazole (Figure 1-7 B) shares the structure of the ringed histidine side chain and, more importantly, also shares its cobalt-binding properties. Imidazole competitively binds to cobalt allowing for elution of the purified protein upon treatment with an imidazole solution. In the case of a multi-subunit complex purification, only one subunit should be tagged. The other subunit(s) will co-purify based on binding properties with the tagged subunit and/or with each other. These principles have been applied in the Tan laboratory to purify binary, ternary and quaternary protein complexes.

A. Poly-histidine/TALON Interaction B. Imidazole Structure

Image Source: Clontech Image Source: Sigma-Aldrich (Imidazole)

Figure 1-7. Model for TALON Affinity Interactions (A) TALON's cobalt-based affinity structure forms weak interactions with the side chains of histidine residues in the poly-histidine tag. The partial negative charge of the side chains interacts with the positively charged cobalt atom. (B) Imidazole shares structure and cobalt-binding function with the histidine side chain. This allows for competitive inhibition and elution of the protein of interest using imidazole solution washes.

1.8 Summary

The ultimate goal of my project is to produce a recombinant HBO1/JADE1 complex and to determine its crystal structure by X-ray crystallography. To that end, I created a recombinant plasmid to coexpress HBO1 complexed with JADE1 in E. coli and then attempted to purify the complex (see Fig. 1-8).

[Flow chart. Labels: coexpression vector (HBO1, JADE1), bacterial DNA, nucleosome. Steps: express HBO1/JADE1 complex in E. coli; purify complex; crystallize complex with nucleosome in order to determine structure by X-ray crystallography.]

Figure 1-8. Project Roadmap This basic flow chart broadly outlines the steps necessary for structure determination by X-ray crystallography. First the complex is expressed using a polycistronic expression plasmid in E. coli. The complex is then purified by means such as metal affinity chromatography and ion-exchange chromatography. Finally, crystal trials are set up, where various conditions are used to grow non-mosaic crystal lattices composed of repeated units of nucleosome bound to the complex.

Chapter 2

Materials & Methods

2.1 Optimizing Purification

Prior to beginning the construction of a polycistronic expression vector containing HBO1 and JADE1, I wanted to optimize metal affinity purification of recombinant proteins. Our laboratory previously used cobalt metal affinity chromatography with a 6x histidine affinity tag, but reports in the literature suggested better purification was possible with longer histidine tags (Mohanty et al, 2004). I wanted to test the merits of those claims. To that end, I first compared the relative protein purity following cobalt metal affinity column chromatography using 6x, 8x, or 10x histidine affinity tags. I performed the necessary experiments by creating three expression plasmids containing the Piccolo NuA4 complex (6x, 8x, and 10x HIS tagged on yEpl1). Piccolo NuA4 is a nucleosome-binding HAT complex consisting of three major subunits: yEpl1, Yng2, and yEsa1. In the experiments described herein, Piccolo NuA4 was used only as a means to compare the effectiveness of variable-length poly-histidine tags. E. coli (BL21 (DE3) pLysS) cells carrying these plasmids were grown, and the tagged proteins were expressed and purified. In the purification step, the complex was eluted with a series of increasing concentrations of imidazole.

A follow-up experiment using the LSD1/CoREST complex (6x, 8x, and 10x HIS tagged on LSD1) was performed in order to corroborate my findings. This second complex is a lysine-specific demethylase (LSD) which demethylates methylated lysine residues on histone tails. It consists of two subunits: LSD1 and its corepressor protein CoREST. Like the Piccolo complex, LSD1/CoREST was only used to compare the effectiveness of 6x, 8x, and 10x poly-histidine tags.

Second, I wanted to diversify the Tan laboratory's available purification techniques by developing a novel method of purification. This novel system utilized a LYTAG affinity tag in both phase partitioning purification and in affinity resin chromatography. LYTAG is a large, choline-binding affinity tag that allows purification by a choline-containing chromatographic matrix (Hernandez-Rocamora et al, 2008). I wanted to test whether the choline-like head group on Q-Sepharose resin would have sufficient affinity for LYTAG to form the basis of a new means of affinity resin purification. Additionally, I wanted to confirm the literature's claims that LYTAG tagged proteins can be purified via phase partitioning. LYTAG's hydrophobic binding site is reportedly capable of binding polyethylene glycol (PEG), allowing LYTAG tagged proteins to be purified by phase partitioning using an aqueous two-phase system containing PEG (Maestro et al, 2008).

Experiments were performed to test whether the LYTAG affinity tag could be used for affinity purification on Q-Sepharose resin. Constructs containing genes encoding 10x HIS/LYTAG tagged DHFR and 10x HIS/LYTAG tagged EGFP were created. Dihydrofolate reductase, or DHFR, is an enzyme that regulates cellular tetrahydrofolate levels. DHFR is a commonly used control protein in the Tan laboratory and was chosen for these experiments for its reliability, ease of expression, and high degree of solubility (Lichty et al, 2005). Enhanced green fluorescent protein, or EGFP, is a fluorescent protein that emits green fluorescence when exposed to blue and ultraviolet light. I originally intended to test LYTAG's ability to purify proteins by phase partitioning, and EGFP's fluorescent properties made it an excellent candidate: it would have been easy to visually gauge experimental success. However, those experiments were never performed, and the LYTAG tagged EGFP was instead used to mirror the DHFR purification experiments. These proteins were expressed in E. coli (BL21 (DE3) pLysS). The 10x HIS/LYTAG tagged DHFR and EGFP were purified via both Q-Sepharose affinity chromatography and traditional TALON™ affinity chromatography.

2.2 Nomenclature Guide

• In expression vectors, the name of the protein is followed by its truncation number (∆1, ∆2, ∆3...).

• In expression vectors, the truncation number is followed by the protein's version number (x1, x2, x3...).

• HST signifies a 10x poly-histidine affinity tag.

• HIS signifies a poly-histidine tag with the number of histidine residues indicated by a #x preceding it.

• In all experiments, a truncated version of HBO1 is used, but rather than refer to it as HBO1∆#, it is consistently referred to only as HBO1.

• "DE3" is used to indicate the type of E. coli strain that has been engineered to contain a modified lac operon expressing the gene coding for T7 RNA polymerase.

• In discussions of expression vectors, the arbitrary term "transfer vector" simply refers to an expression vector specially designed to facilitate transfer of translation cassettes into a polycistronic expression vector.
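The naming convention above can be illustrated with a short parsing sketch. This is only an illustration and was not part of the laboratory workflow; the function name parse_construct is hypothetical, and the sketch assumes names written in the form used in the text (for example HBO1∆1x3), not the full plasmid names.

```python
import re

# Protein name, optional truncation number (after a delta), optional version (after "x").
# \u2206 and \u0394 are the two delta characters that commonly appear in typeset text.
_CONSTRUCT = re.compile(
    r"(?P<protein>.+?)"
    r"(?:[\u2206\u0394](?P<truncation>\d+))?"
    r"(?:x(?P<version>\d+))?"
)

def parse_construct(name):
    """Split a construct name into (protein, truncation, version).

    Truncation and version are returned as integers, or None when absent.
    """
    match = _CONSTRUCT.fullmatch(name)
    if match is None:
        raise ValueError(f"unrecognized construct name: {name!r}")
    truncation = match.group("truncation")
    version = match.group("version")
    return (
        match.group("protein"),
        int(truncation) if truncation is not None else None,
        int(version) if version is not None else None,
    )

if __name__ == "__main__":
    for example in ("HBO1\u22061x3", "JADE1\u22061", "HBO1x2"):
        print(example, parse_construct(example))
```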

2.3 Subcloning

Expression and purification experiments first required the creation of an expression plasmid. This process involved a series of subcloning procedures including preparation of both insert and vector DNA, ligation in order to create the plasmid, transformation into E. coli, PCR screening, plasmid isolation, restriction mapping, and DNA sequencing. The following sections outline the basic subcloning steps necessary for creating an expression vector.

2.3.1 PCR

Polymerase Chain Reaction, or PCR, is an automated process by which a

predetermined length of DNA can be exponentially amplified through repeated temperature

cycles. The process requires a buffer, free deoxynucleotides, forward and reverse primers

designed to produce the desired PCR product, Taq polymerase, and a small amount of

template DNA. PCR was used for site-directed mutagenesis, PCR screening, as well as

amplifying a target gene insert from a parent plasmid.

In all PCR experiments performed, DNA-containing sample comprised 5 % of the

total reaction volume, while PCR master-mix comprised the other 95 %. PCR was used for

screening bacterial colonies (see section 2.3.6) and for site-directed mutagenesis (see

section 2.4.5), but its primary use was in amplifying genes from a plasmid solution. For this

latter use, 10 µl of 10 ng/µl of template DNA solution was added to a master-mix

consisting of 10 µl of 10x Thermo Pol buffer (New England BioLabs), 10 µl of 2.5 mM

dNTP solution, 5 µl of 10 µM forward primer solution, 5 µl of 10 µM reverse primer

solution, and 2 units/µl Pfu Polymerase (PCR mutagenesis experiments utilized higher-fidelity Pfu Turbo Polymerase). MilliQ water was used to bring the final reaction volume to

100 µl. The PCR reaction was run in a thermocycler programmed to execute the following cycle pattern.

2 minutes 95°C (initial denaturation) → 25x [30 seconds 95°C (denaturation) → 30 seconds 50°C (annealing) → 1 minute 75°C (extension)]

Figure 2-1. Typical PCR Thermal Cycle Pattern

Annealing temperature and extension time varied with the primer set used and the target DNA fragment. Annealing temperature was calculated as Tm - 5°C, where Tm was the lower of the two primer melting temperatures. Extension time varied with fragment size, with every 1000 bp roughly corresponding to 1 minute (e.g. a 1500 bp target would have an extension time of 90 seconds). Typically, 25 thermal cycles were used (12 cycles were used for single point mutation PCR mutagenesis).
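The two rules of thumb above can be written out as a minimal sketch. The function names and the example primer Tm values are hypothetical and serve only to illustrate the arithmetic.

```python
def annealing_temp_c(primer_tms_c):
    """Annealing temperature: the lowest primer Tm minus 5 degrees C."""
    return min(primer_tms_c) - 5.0

def extension_time_s(fragment_bp, seconds_per_kb=60.0):
    """Extension time, at roughly 1 minute per 1000 bp of target."""
    return fragment_bp / 1000.0 * seconds_per_kb

if __name__ == "__main__":
    # Hypothetical primer pair with Tm values of 58.0 and 61.5 degrees C
    print(annealing_temp_c([58.0, 61.5]))  # 53.0
    # The 1500 bp example from the text
    print(extension_time_s(1500))          # 90.0
```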

2.3.2 Restriction Enzyme Digestion

In restriction digestion, one or two restriction enzymes were used to cleave the "blunt" ends of the PCR product. These restriction enzymes cut in a way that leaves overhanging single stranded DNA on each end, such that it may anneal with another length of DNA cut by the same restriction enzyme. The end result was a PCR product with "sticky" ends that could be easily inserted into a plasmid vector cut with the same two restriction enzymes.

Both insert and vector had to be prepared by restriction enzyme digestion. For inserts prepared by PCR, 5 µl of PCR product (dissolved in TE (10, 0.1)) was added to 3 µl of the appropriate 10x buffer (New England BioLabs), 3 µl of 1 mg/ml BSA, 100 mM

DTT, between 0.25 µl and 1 µl of 5-50 units/µl restriction enzymes, and water to a final reaction volume of 30 µl. Vector DNA was prepared by adding 2-4 µl of parent plasmid

DNA to the appropriate New England BioLabs (NEB) buffer, BSA, DTT, restriction enzymes, and water. The reaction mixture was incubated in a 37°C water bath for 1 to 2 hours. If the desired restriction enzyme pair was incompatible with the same NEB buffer, a staggered digestion was performed in which the enzyme active in the low salt buffer was added first for ~1 hour, followed by the addition of salt solution and the second enzyme and

37°C incubation for an additional ~1 hour.

2.3.3 Agarose Gel Purification

Agarose Gel Purification was necessary to isolate restriction enzyme digested DNA fragments. Predicted recovery of the desired DNA is roughly estimated at 50-70 % using this method. The digested DNA fragment was run on an agarose gel (normally 1% for

~800-1000 bp fragment) and visualized under UV light (302 nm or 365 nm). The desired band was cut out using a razor blade and spun through a filter apparatus at 4,700 rcf for 5 minutes. The filter apparatus was constructed by placing a 0.5 ml Eppendorf tube, with a small hole at the base covered with siliconized glass wool, inside a larger 1.5 ml Eppendorf tube. The isolated DNA-containing flow-through was collected and stored at -20°C.

*For more specific information about agarose gel electrophoresis, see section 2.4.3 Agarose

Gel Electrophoresis.

2.3.4 Ligation

Isolated insert and vector DNA with matching "sticky ends" were combined in the presence of T4 DNA ligase, which catalyzes the formation of phosphodiester linkages between the 5' and 3' ends on both sides of the double stranded DNA. The reaction mixture consisted of 1 µl of 10x T4 DNA ligase buffer, 0.5 µl of 100 mM DTT, 2 µl of vector

DNA, 1.5 µl of insert DNA, 40 units/µl T4 DNA ligase (750 units/µl was used for

NdeI/BsrGI ligations), and MilliQ water up to a final reaction volume of 10 µl. The reaction was always performed in parallel with a negative control consisting of all the same components minus the insert DNA. Both reaction mixtures were incubated at room temperature for 30 to 60 minutes. Subsequent transformation of both reaction mixtures provided a rough estimation of whether the ligation was successful - it was expected that the negative control plate would have significantly fewer colonies than the ligation plate.

2.3.5 Transformation

2 µl of the newly formed plasmid DNA solution was introduced to a 100 µl suspension of E. coli cells, which had been thawed on ice. After roughly 15 minutes of incubation on ice, the cells were heat shocked in a 42°C water bath for 30 seconds in order to initiate uptake of the plasmid. 0.5 ml of 2x Tryptone-Yeast (2x TY) media was added to the cell suspension, which was then placed in a 37°C shaking incubator (200 rpm) for roughly 15 minutes. The plasmid was designed to carry a gene for ampicillin resistance, and this incubation period was necessary to allow time for expression of that gene. 0.3 ml of the cell suspension was plated onto ampicillin-containing Tryptone-Yeast Extract (TYE) plates (in some cases, other appropriate antibiotics were used in addition to ampicillin), and incubated at 37°C.

2.3.6 PCR Screening

Bacterial colonies on the transformation plate were restreaked on fresh ampicillin-containing TYE plates and screened by PCR for the correct vector/insert combination.

Primers were chosen to produce a PCR product that overlaps with both insert and vector but one that does not cover the entire insert. This is to ensure the highest possible confidence in the results. Depending on the ratio of colonies on ligation plate to negative control plate, between 4 and 16 colonies were swirled in 100 µl MilliQ water aliquots and restreaked on fresh TYE plates. The restreaked plates were incubated at 37°C for 10-18 hours. PCR master-mix (for screening 16 colonies) was prepared by combining 232.2 µl of

MilliQ water, 36 µl of 10x Thermo Pol buffer (New England BioLabs), 36 µl of 2.5 mM dNTP solution, 10 µl of forward primer (10 ng/µl), 10 µl of reverse primer (10 ng/µl), and

1.8 µl of 2 units/µl Pfu DNA polymerase. 19 µl of this master-mix was added to the appropriate number of PCR tubes, followed by 1 µl of cell suspension from the 100 µl water aliquots.

*PCR was performed in a thermocycler using the same procedure as previously described in section 2.3.1 PCR.
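As a check on the master-mix volumes listed above, the following sketch totals the components and counts how many 19 µl aliquots the mix provides. The helper names are hypothetical, and the linear scaling to other colony counts is an assumption for illustration only, not a documented laboratory procedure.

```python
# Component volumes (µl) of the 16-colony screening master-mix, as listed in the text.
MASTER_MIX_UL = {
    "MilliQ water": 232.2,
    "10x Thermo Pol buffer": 36.0,
    "2.5 mM dNTP solution": 36.0,
    "forward primer": 10.0,
    "reverse primer": 10.0,
    "Pfu DNA polymerase (2 units/µl)": 1.8,
}
ALIQUOT_UL = 19.0  # master-mix volume dispensed per PCR tube

def total_volume_ul(mix):
    """Total master-mix volume, rounded to 0.1 µl to absorb float error."""
    return round(sum(mix.values()), 1)

def full_aliquots(mix, aliquot_ul=ALIQUOT_UL):
    """Number of complete aliquots the master-mix can supply."""
    return int(total_volume_ul(mix) // aliquot_ul)

def scale_mix(mix, n_colonies, reference_colonies=16):
    """Assumed linear scaling of the 16-colony recipe to another colony count."""
    factor = n_colonies / reference_colonies
    return {name: round(volume * factor, 2) for name, volume in mix.items()}

if __name__ == "__main__":
    print(total_volume_ul(MASTER_MIX_UL))  # 326.0
    print(full_aliquots(MASTER_MIX_UL))    # 17
    print(scale_mix(MASTER_MIX_UL, 8))
```

The recipe therefore yields 17 full aliquots, which leaves a small excess over the 16 colonies screened.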

2.3.7 Plasmid Isolation

Two flasks, each containing 100 ml of sterile 2x TY media (with appropriate antibiotics), were inoculated with colonies from the restreaked plates, chosen based on the results of the PCR screen. The cells were grown overnight in a 37°C shaking incubator (200 rpm), usually for between 16 and 18 hours. The resulting cell cultures were then spun at 3,005 rcf for 5 minutes and the supernatant was discarded. The remaining cell pellet was resuspended in 5 ml of cell LYSIS buffer (50 mM glucose, 25 mM Tris-Cl pH 8.0, 10 mM EDTA, Na). 10 ml of NaOH/SDS solution (0.2 M NaOH, 1 % SDS) was added, after which the sample was shaken forcefully and incubated on ice for 5 minutes. 10 ml of cold 5 M KAc/2.5 M HAc was added, and the sample was again shaken forcefully and incubated on ice for 5 minutes. The sample was spun at 3,005 rcf for 3 minutes, leaving a pellet and a thin film of precipitated DNA at the liquid surface. It was generally sufficient to pour off the liquid into a new tube, while taking great care to avoid the DNA precipitate. 12.5 ml of isopropanol was added to the decanted sample, which was incubated at room temperature for 10 min, and then spun at

20,199 rcf for 5 minutes. The pellet was resuspended in 150 µl of TE (10, 50) (10 mM Tris-

Cl pH 8.0, 50 mM EDTA), and the resuspended sample was incubated with 1.5 µl of 10 mg/ml RNase A for 15 minutes at 37°C. The RNase and other remaining cellular proteins were removed by extracting twice with 150 µl phenol/chloroform and once with 500 µl chloroform (See section 2.4.2 Phenol/Chloroform Extraction).

A Sephacryl S400 column assembly was prepared by fitting a Gilson blue micropipet tip, with a small amount of siliconized glass wool stuffed into the bottom, inside the top half of a 1.5 ml Eppendorf tube. The half-Eppendorf tube casing allowed the column to fit snugly on the end of a 5 ml polypropylene tube. S400 high resolution resin equilibrated in TE (10, 0.1) was poured into the Gilson pipet tip, and the whole column assembly was spun at 751 rcf for 5 minutes to elute the excess TE (10, 0.1). The extracted sample (aqueous phase) was then placed onto the S400 Sephacryl column and spun at 751 rcf for 5 min. The column removed small RNA molecules left in the sample, leaving only plasmid DNA dissolved in TE (10, 0.1) in the eluate.

2.3.8 Restriction Mapping

Restriction mapping was a procedure used to gain further confirmation that the isolated plasmid contained the correct insert prior to submission of the sample to the sequencing facility. A small sample of plasmid DNA was digested with one or two restriction enzymes so that at least one cut was made in the insert and at least one was made in the vector. The digested fragments were run on an agarose gel for visualization of the fragments. The reaction mixture for a restriction mapping was prepared by combining 1.5

µl of plasmid DNA (~0.2 µg/µl), 1 µl of the appropriate 10x New England BioLabs buffer,

1 µl of 1 mg/ml BSA, 0.5 µl of 100 mM DTT, 0.25-1 µl of the appropriate restriction enzymes (5-50 units/µl), and MilliQ water to a final reaction volume of 10 µl. The reaction was incubated in a 37°C water bath for 1 to 2 hours, then combined with 2 µl of 6x gel loading buffer (GLB) (2.5 mg/ml bromophenol blue, 2.5 mg/ml xylene cyanol, 0.3 g/ml glycerol, 60 mM EDTA), and visualized by agarose gel electrophoresis.

2.3.9 Sequencing

All samples were sequenced by the Nucleic Acid Facility in Chandlee Laboratory,

University Park. Samples were prepared by diluting with water to 0.2 µg/µl. 5 µl per reaction requested was submitted together with an order confirmation sheet and order number. The Nucleic Acid Facility has several stock primers, including T7, T7-term, M13 univ., etc., but if requesting sequencing using custom primers, an additional 5 µl aliquot per reaction of 1 µM primer is required. The text file and chromatogram results could be downloaded and analyzed in FinchTV (FinchLab™).
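The dilution to 0.2 µg/µl follows the usual C1V1 = C2V2 relationship. The sketch below is illustrative only; the function name and the example stock concentration are hypothetical.

```python
def dilution_volumes_ul(stock_ug_per_ul, target_ug_per_ul, final_ul):
    """Return (stock volume, water volume) in µl for a C1V1 = C2V2 dilution."""
    if stock_ug_per_ul <= 0 or target_ug_per_ul <= 0 or final_ul <= 0:
        raise ValueError("concentrations and volume must be positive")
    if target_ug_per_ul > stock_ug_per_ul:
        raise ValueError("stock is more dilute than the target; concentrate it first")
    stock_ul = target_ug_per_ul * final_ul / stock_ug_per_ul
    return stock_ul, final_ul - stock_ul

if __name__ == "__main__":
    # Hypothetical 0.5 µg/µl plasmid stock, two sequencing reactions at 5 µl each
    stock_ul, water_ul = dilution_volumes_ul(0.5, 0.2, 10.0)
    print(f"{stock_ul:.1f} µl plasmid + {water_ul:.1f} µl water")
```

A stock that is already below 0.2 µg/µl raises an error, since such a sample would first need to be concentrated (see section 2.4.1 Ethanol Precipitation).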

2.3.10 Flow Chart Outline of Subcloning Schemes

The following are diagrams outlining the basic subcloning steps necessary to produce the expression plasmids used in the experiments discussed herein.

Figure 2-2. Creating 8x and 10x HIS Tagged yEpl1 for Piccolo NuA4

8x HIS tagged yEpl1 for Piccolo NuA4 complex

1. PCR amplified 8x HIS tagged yEpl1 from existing transfer vector

2. Isolated NdeI-BsrGI digested PCR product to create insert

3. Ligated NdeI-BsrGI digested 8x HIS tagged yEpl1 insert into NdeI-BsrGI digested transfer vector

4. Isolated XbaI-BglII digested 8x HIS tagged yEpl1 insert from transfer vector to create new insert

5. Ligated XbaI-BglII digested 8x HIS tagged insert into XbaI-BglII digested polycistronic expression vector containing Yng2 and yEsa1 subunits

Plasmid Created: pST44-8xHISNyEpl1t3x3-Yng2t3x3-yEsa1x3

10x HIS tagged yEpl1 for Piccolo NuA4 complex

1. PCR amplified 10x HIS tagged yEpl1 from existing transfer vector

2. Isolated NdeI-BsrGI digested PCR product to create insert

3. Ligated NdeI-BsrGI digested 10x HIS tagged yEpl1 insert into NdeI-BsrGI digested transfer vector

4. Isolated XbaI-BglII digested 10x HIS tagged yEpl1 insert from transfer vector to create new insert

5. Ligated XbaI-BglII digested 10x HIS tagged insert into XbaI-BglII digested polycistronic expression vector containing Yng2 and yEsa1 subunits

Plasmid Created: pST44-10xHISNyEpl1t3x3-Yng2t3x3-yEsa1x3

*Bolded squares indicate that small-scale expression was performed using that vector

Figure 2-3. Creating HST/LYTAG Tagged DHFR and EGFP

HST/LYTAG tagged DHFR

1. PCR amplified 10x HIS tagged DHFR from transfer vector

2. Isolated BglII-BsrGI digested PCR product to create insert

3. Ligated BglII-BsrGI digested insert into BamHI-BsrGI digested transfer vector

4. Isolated BglII-BamHI digested 10x HIS tagged DHFR insert from created transfer vector to create new insert

5. Ligated BglII-BamHI digested LYTAG insert from existing transfer vector into BglII-BamHI digested 10x HIS tagged DHFR-containing transfer vector

Plasmid Created: pST50Tr-STRaHSTLYTNDHFR

HST/LYTAG tagged EGFP

1. Isolated BglII-BamHI digested LYTAG insert from transfer vector to create new insert

2. Ligated BglII-BamHI digested LYTAG insert into phosphatased BamHI digested transfer vector (Note: Ligation possible with insert in either orientation, but only the forward orientation was desired and checked by PCR screening, restriction mapping, and sequencing)

3. Isolated BamHI-BsrGI digested EGFP insert from existing transfer vector to create new insert

4. Ligated BamHI-BsrGI digested EGFP insert into BamHI-BsrGI digested LYTAG-containing transfer vector

Plasmid Created: pST50Tr-STRaHSTLYTNEGFP

*Bolded squares indicate that small-scale expression was performed using that vector

Figure 2-4. Creating HBO1/JADE1 Complex, SUMO/HST Tagged on HBO1

SUMO/10x HIS tagged HBO1

1. PCR mutagenesis to mutate Ile153 to Phe in HBO1x1, creating HBO1x2 (Note: Corrected nonconservative point mutation causing Ile153Phe)

2. PCR mutagenesis to mutate Ala561 to Val in HBO1x2, creating HBO1x3 (Note: Corrected nonconservative point mutation causing Ala561Val)

3. PCR amplified HBO1∆1x3 from transfer vector containing HBO1x3

4. Isolated BamHI-BsrGI digested PCR product to create insert

5. Ligated BamHI-BsrGI digested HBO1∆1x3 insert into BamHI-BsrGI digested SUMO/10x HIS tag-containing transfer vector

Plasmid Created: pST50Tr-SMAHSTNhHBO1t1x3

6. Isolated NdeI-BsrGI digested HBO1∆1x3 insert from created transfer vector to create new insert

7. Ligated NdeI-BsrGI digested HBO1∆1x3 insert into NdeI-BsrGI digested transfer vector

8. Isolated AscI-SbfI digested SUMO/10x HIS tagged HBO1∆1x3 insert from created polycistronic expression vector to create new insert (Note: Insert to be used in creation of HBO1 complex; see next cloning scheme)

*Bolded squares indicate that small-scale expression was performed using that vector

Figure 2-5. Creating HBO1/JADE1 Complex and HBO1/JADE1∆1 Complex

HBO1/JADE1

1. Ligated existing BamHI-BsrGI digested JADE1 insert into BamHI-BsrGI digested transfer vector

Plasmid Created: pST66Tr-hJADE1

2. Isolated NdeI-BsrGI digested JADE1 insert from existing transfer vector to create new insert

3. Ligated NdeI-BsrGI digested JADE1 insert into NdeI-BsrGI digested polycistronic expression vector

4. Ligated AscI-SbfI digested SUMO/10x HIS tagged HBO1∆1x3 insert into AscI-SbfI digested JADE1-containing polycistronic expression vector

Plasmid Created: pST69-hJADE1-SMAHSTNhHBO1t1x3

HBO1/JADE1∆1

1. PCR amplified JADE1∆1 from existing transfer vector containing JADE1

2. Isolated BamHI-BsrGI digested PCR product to create insert (Note: Partial digest performed because PCR product contained internal BamHI site)

3. Ligated BamHI-BsrGI digested JADE1∆1 insert into BamHI-BsrGI digested transfer vector

Plasmid Created: pST66Tr-hJADE1t1

4. Isolated NdeI-BsrGI digested JADE1∆1 insert from created transfer vector to create new insert

5. Ligated NdeI-BsrGI digested JADE1∆1 insert into NdeI-BsrGI digested polycistronic expression vector

6. Ligated AscI-SbfI digested SUMO/10x HIS tagged HBO1∆1x3 insert into AscI-SbfI digested JADE1∆1-containing polycistronic expression vector

Plasmid Created: pST69-hJADE1t1-SMAHSTNhHBO1t1x3

*Bolded squares indicate that small-scale expression was performed using that vector


2.4 Supplemental DNA Methods

2.4.1 Ethanol Precipitation

In order to concentrate a plasmid solution to a desired concentration for

sequencing, it was often necessary to precipitate the DNA and resuspend in a smaller

volume. In order to precipitate the DNA out of solution, 0.1 volumes of 3 M sodium acetate

pH 5.2 were added, followed by addition of 2.5 volumes of 100% ethanol. The sample was

vortexed for 5 seconds and centrifuged at 17,000 rcf for 10 minutes. The supernatant was

aspirated off, and the pellet was resuspended in the desired volume of TE (10, 0.1) (10 mM

Tris-Cl pH 8.0, 0.1 mM EDTA).
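The reagent volumes for a precipitation can be worked out as in the sketch below. It assumes that both the 0.1 volumes of sodium acetate and the 2.5 volumes of ethanol are taken relative to the starting sample volume; the text does not state this explicitly, and the function name is hypothetical.

```python
def ethanol_precipitation_volumes_ul(sample_ul):
    """Return (3 M sodium acetate, 100% ethanol) volumes in µl for a DNA sample.

    Assumption: both reagents are measured relative to the starting sample volume.
    """
    if sample_ul <= 0:
        raise ValueError("sample volume must be positive")
    sodium_acetate_ul = 0.1 * sample_ul
    ethanol_ul = 2.5 * sample_ul
    return sodium_acetate_ul, ethanol_ul

if __name__ == "__main__":
    acetate, ethanol = ethanol_precipitation_volumes_ul(100.0)
    print(f"{acetate:.1f} µl sodium acetate, {ethanol:.1f} µl ethanol")
```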

2.4.2 Phenol/Chloroform Extraction

Extraction was a common technique used to remove cellular proteins and RNase A from solution during plasmid isolation. Phenol/chloroform extraction removes proteins from solution, leaving nucleic acids in the aqueous phase. Equal volumes of TE-equilibrated phenol and chloroform were added to the sample, which was then vortexed and centrifuged at 17,000 rcf for 1 min. Two phases, upper aqueous and lower organic, separated upon centrifugation. The undesired proteins were extracted into the organic phase, and so the aqueous phase, containing the desired DNA sample, was removed and collected in a new tube.

2.4.3 Agarose Gel Electrophoresis

DNA's negatively charged phosphate backbone causes it to migrate through a porous agarose gel when a voltage is applied. The DNA travels toward the positively charged anode at a rate inversely related to its molecular weight; higher molecular weight (longer) DNA fragments migrate more slowly than lower molecular weight fragments, and vice versa.

Gels of various agarose concentrations were made depending on the size of the

DNA fragment being visualized or isolated (Table 2-1). Agarose was dissolved in 30 ml of

0.5x TBE buffer (45 mM Tris base, 45 mM boric acid, 1.5 mM EDTA) by heating in a microwave for roughly one minute. The solution was placed at room temperature for 5 to

10 minutes in order to cool, after which 1.5 µl of 10 µg/ml ethidium bromide was added.

The solution was then poured into a gel block with an 8-well or 15-well comb and allowed to solidify for approximately 20 to 40 minutes.

Table 2-1. Appropriate Agarose Concentrations for Visualization of Variable-length DNA Fragments

DNA Fragment Length (bp)    Agarose Concentration

< 300 bp                    2 %
300-600 bp                  1.5 %
600-1500 bp                 1 %
1500-3000 bp                0.7 %
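Table 2-1 can be expressed as a simple lookup, sketched below. The handling of the boundary values (for example, whether a 600 bp fragment belongs to the 1.5 % or the 1 % row) is not specified in the table, so the sketch assumes each range includes its lower bound; the function name is hypothetical.

```python
def agarose_percent(fragment_bp):
    """Agarose concentration (% w/v) suggested by Table 2-1 for a fragment length.

    Assumption: each range includes its lower bound; 3000 bp is included in the last row.
    """
    if fragment_bp <= 0:
        raise ValueError("fragment length must be positive")
    if fragment_bp < 300:
        return 2.0
    if fragment_bp < 600:
        return 1.5
    if fragment_bp < 1500:
        return 1.0
    if fragment_bp <= 3000:
        return 0.7
    raise ValueError("fragment length is outside the range covered by Table 2-1")

if __name__ == "__main__":
    for bp in (250, 450, 900, 2000):
        print(bp, "bp ->", agarose_percent(bp), "%")
```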

The solidified gel was transferred to a gel electrophoresis box and covered completely with 0.5x TBE. The comb was removed and samples mixed with 6x gel loading buffer (GLB) were pipetted into the appropriate wells. Samples were made to contain 1x GLB by mixing 5 volumes of sample with 1 volume of 6x GLB. The loaded gel was run at 125 V for 40 minutes. In some cases, gels were run for shorter or longer times depending on the degree of separation necessary.
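The 5:1 mixing ratio follows from requiring the concentrated loading buffer to end up at 1x in the combined volume. A minimal sketch of that calculation, with a hypothetical function name:

```python
def loading_buffer_volume_ul(sample_ul, stock_x=6.0, final_x=1.0):
    """Volume of concentrated loading buffer to add so the mixture is at final_x.

    Solves stock_x * V_buffer = final_x * (V_sample + V_buffer) for V_buffer.
    """
    if stock_x <= final_x:
        raise ValueError("stock must be more concentrated than the final mixture")
    return sample_ul * final_x / (stock_x - final_x)

if __name__ == "__main__":
    # A 10 µl restriction mapping reaction needs 2 µl of 6x GLB (see section 2.3.8)
    print(loading_buffer_volume_ul(10.0))
```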

2.4.4 UV Quantitation of DNA

A spectrophotometer set to measure an absorbance spectrum from 320 nm to 220 nm was used to determine plasmid DNA concentrations. For DNA samples dissolved in TE

(10, 0.1), a blank sample of TE (10, 0.1) was prepared to blank the spectrophotometer.

Dilutions were prepared using TE (10, 0.1) so that absorbance readings were still at least

0.01 (for best accuracy absorbance should be between 0.1 and 1.0, but it was not critical that accurate plasmid concentrations be determined). The resulting absorbance spectrum should contain a peak at 260 nm (A260). The adjusted A260 was calculated by subtracting

A320 from A260. Using the standard conversion of A260 = 1.0 for a 50 µg/ml DNA solution, the concentration of the sample solution was calculated by multiplying 50 µg/ml by the adjusted A260 extrapolated to the undiluted sample.
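The calculation described above can be sketched as follows. The function name and the example readings are hypothetical; the 50 µg/ml conversion factor and the A320 background subtraction are taken from the text.

```python
UG_PER_ML_PER_A260 = 50.0  # A260 of 1.0 corresponds to 50 µg/ml of DNA

def dna_concentration_ug_per_ml(a260, a320, dilution_factor=1.0):
    """Concentration of the undiluted sample from a background-corrected A260.

    dilution_factor is the fold dilution of the measured sample (e.g. 20 for 1:20).
    """
    adjusted_a260 = a260 - a320
    if adjusted_a260 < 0:
        raise ValueError("A320 exceeds A260; check the blank and the sample")
    return UG_PER_ML_PER_A260 * adjusted_a260 * dilution_factor

if __name__ == "__main__":
    # Hypothetical reading: A260 = 0.52 and A320 = 0.02 on a 1:20 dilution
    conc = dna_concentration_ug_per_ml(0.52, 0.02, dilution_factor=20)
    print(f"{conc:.0f} µg/ml ({conc / 1000:.2f} µg/µl)")
```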

2.4.5 Site Directed PCR Mutagenesis

Site directed mutagenesis follows the same basic principles as PCR, but uses oligonucleotide primers that are not exactly complementary to their target sequence. The primer mismatches are tailored to a specific desired mutation and facilitate amplification of the desired mutant DNA in subsequent rounds of amplification. The protocol used in the Tan laboratory is largely based on the QuikChange method from Stratagene. This method was used to introduce conservative point mutations in genes of interest in order to eliminate internal restriction sites and to correct nonconservative point mutations.

A starting volume of 0.5 µl of 10 ng/µl template plasmid solution was combined with 2.5 µl of 10x Pfu DNA polymerase buffer, 2.5 µl of 2.5 mM dNTP solution, 0.7 µl of

10 ng/µl forward mutagenesis primer, 0.7 µl of 10 ng/µl reverse mutagenesis primer, and

MilliQ water to a total reaction volume of 25 µl. PCR was performed in a thermocycler programmed to denature the DNA for an initial 2 minutes at 95°C, and then run through 12 cycles of (30 seconds 95°C, 1 minute 55°C, extension time 68°C). As with regular PCR, extension time varied with DNA length, with every 1000 bp corresponding to 1 minute of extension time. 2 µl of the amplified sample was removed to be used as a DpnI-undigested control.

DpnI is a restriction endonuclease that cleaves methylated DNA. 0.5 µl of 20 units/µl DpnI was added to the remaining reaction mix and incubated in a 37°C water bath for roughly 1 hour. The digested sample and the undigested control were then transformed into competent E. coli TG1 cells and plated onto ampicillin-containing TYE plates.

Endogenous wild-type DNA is highly methylated, while amplifying DNA by artificial means (PCR) produces unmethylated DNA. Digested plasmid DNA will not transform correctly, but if the PCR reaction was successful, there will be unmethylated, and thus undigested, plasmid DNA that will successfully transform. Accordingly, the DpnI-digested transformation plate was expected to have some but significantly fewer colonies than the undigested control plate. To confirm correct mutagenesis, samples were PCR screened and treated with a restriction enzyme whose restriction site was removed; lack of digestion would suggest successful mutagenesis.

2.5 Protein Expression

2.5.1 Small-Scale Protein Expression

All expression vectors used in these experiments were designed by Dr. Song

Tan and utilize a T7-based expression system (Tan et al, 2005). The desired plasmid

was transformed into the appropriate E. coli strain and grown up on a Tryptone-Yeast

Extract (TYE) agar plate with appropriate antibiotics (usually 50 µg/ml ampicillin). 100 ml

of 2x Tryptone-Yeast (TY) media with appropriate antibiotics were inoculated with one

colony (roughly 50-100 colonies were used for slow-growing CodonPlus cells). These

flasks were grown up in a shaking incubator at 200 rpm and 37°C. If the expression was to

be carried out at a temperature lower than 37°C, the flask was transferred to the appropriate

temperature shaking incubator at an OD600 of between 0.1 and 0.2. *It is important that at

least one replication cycle occur at the expression temperature prior to induction.

At an OD600 of between 0.5 and 0.9 the culture was induced with 100 µl of 0.2 M

Isopropyl β-D-1-thiogalactopyranoside (IPTG). For overnight expressions (12 hours) the

culture was induced at an OD600 between 0.4 and 0.5. Expressions performed at 28°C or

37°C were harvested at 3 hours post-induction, while expressions performed at 18°C or

23°C were harvested at 12 hours post-induction. During harvesting, 50 ml of culture was

spun down and resuspended in 8 ml of P300 - EDTA (50 mM NaPO4 pH 7.0, 300 mM

NaCl, 1mM benzamidine, 5mM 2-mercaptoethanol). This suspension was either flash

frozen using liquid nitrogen, or immediately tested for solubility and purified. All samples

were mixed with equal volumes of PGLB, boiled, and run on an SDS-PAGE gel.
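For reference, the induction and harvesting parameters above can be summarized in a short sketch. It computes the final IPTG concentration that results from adding 100 µl of 0.2 M stock to a 100 ml culture, and looks up the harvest time for each expression temperature named in the text. The function names are hypothetical.

```python
# Harvest times (hours post-induction) for the expression temperatures given in the text.
HARVEST_HOURS = {37: 3, 28: 3, 23: 12, 18: 12}

def final_concentration_mM(stock_M, added_ul, culture_ml):
    """Final concentration (mM) after adding a stock solution to a culture.

    The added volume is included in the final volume.
    """
    total_ul = culture_ml * 1000.0 + added_ul
    return stock_M * 1000.0 * added_ul / total_ul

def harvest_hours(temp_c):
    """Hours post-induction at which cultures were harvested."""
    if temp_c not in HARVEST_HOURS:
        raise ValueError(f"no harvest time recorded for {temp_c} degrees C")
    return HARVEST_HOURS[temp_c]

if __name__ == "__main__":
    iptg = final_concentration_mM(0.2, 100.0, 100.0)
    print(f"final IPTG: {iptg:.3f} mM")  # approximately 0.2 mM
    print("harvest at 18 degrees C:", harvest_hours(18), "hours")
```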


*All protein expressions were carried out in 2x TY media with 50 µg/ml ampicillin using E. coli cells.

2.5.2 Experiments Performed:

Expression of Piccolo NuA4 complex, tagged (6x, 8x, and 10x HIS) on yEpl1

1. 100 ml expression of Piccolo NuA4 complex, 6x HIS tagged on yEpl1, using

BL21(DE3) pLysS cells. The expression was carried out at 37°C.

2. 100 ml expression of Piccolo NuA4 complex, 8x HIS tagged on yEpl1, using

BL21(DE3) pLysS cells. The expression was carried out at 37°C.

3. 100 ml expression of Piccolo NuA4 complex, 10x HIS tagged on yEpl1, using

BL21(DE3) pLysS cells. The expression was carried out at 37°C.

4. 100 ml expression of LSD1/CoREST complex, 6x HIS tagged on LSD1, using BL21(DE3)

pLysS cells. The expression was carried out at 28°C.

5. 100 ml expression of LSD1/CoREST complex, 8x HIS tagged on LSD1, using BL21(DE3)

pLysS cells. The expression was carried out at 28°C.

6. 100 ml expression of LSD1/CoREST complex, 10x HIS tagged on LSD1, using BL21(DE3)

pLysS cells. The expression was carried out at 28°C.

Expression of LYTAG tagged DHFR and LYTAG tagged EGFP

7. 100 ml expression of 10x HIS/LYTAG tagged DHFR using BL21(DE3) pLysS

cells. The expression was carried out at 28°C.

8. 100 ml expression of 10x HIS/LYTAG tagged EGFP using BL21(DE3) pLysS

cells. The expression was carried out at 28°C.

Expression of HBO1/JADE1 complexes

9. 100 ml expressions of HBO1/JADE1 complex as well as HBO1/JADE1∆1

complex (both versions HST tagged on JADE1) using BL21 cells. The

expressions were carried out at both 28°C and 37°C.

10. 100 ml expressions of HBO1/JADE1 complex and HBO1/JADE1∆1 complex

(versions HST tagged on JADE1) using BL21-CodonPlus cells. The expressions

were carried out at both 28°C and 37°C.

11. 100 ml expressions of both HST tagged and untagged HBO1 in an expression

vector using CodonPlus BL21 cells. The expressions were carried out at 37°C.

12. 100 ml expressions of untagged JADE1 and JADE1∆1 in an expression vector

using CodonPlus BL21 cells. The expressions were carried out at 37°C.

13. 100 ml expressions of HBO1/JADE1 complex as well as HBO1/JADE1∆1

complex (both versions SUMO tagged and HST tagged on HBO1). The

expressions were carried out at 28°C and 37°C.

14. 100 ml expressions of HBO1/JADE1 complex as well as HBO1/JADE1∆1

complex (both versions SUMO tagged and HST tagged on HBO1). The

expressions were carried out at 18°C and 23°C.

2.6 Purification

2.6.1 Checking Solubility

BL21 (DE3) pLysS cells contain a plasmid carrying a gene for T7 lysozyme

(pLysS), which lowers background expression of T7 RNA polymerase. pLysS also serves

to weaken the cell wall structure of E. coli cells. This second function greatly facilitates cell

lysis and often causes spontaneous lysing upon thawing. Expressions using BL21-

CodonPlus (DE3) cells require addition of lysozyme before sonication to ensure complete

cell lysis.

The 8 ml cell suspension was lysed via sonication. Two rounds of 10x 0.5 seconds at 30% power were used for BL21 (DE3) pLysS cells and 5 rounds of 10x 0.5 seconds at 35% power were used for BL21-CodonPlus (DE3) cells. 0.5 ml of the lysed cells, or whole cell extract, was spun down at 17,000 rcf to separate pellet and supernatant; the supernatant was decanted into a new tube, and the pellet was resuspended in 0.5 ml of P300-EDTA. 25 µl samples of whole cell extract, resuspended pellet, and supernatant were collected, mixed with PGLB, boiled, and analyzed on an SDS-PAGE gel to determine solubility.

2.6.2 Small-Scale TALON™ Affinity Purification

Roughly 6 ml of whole cell extract was spun down in five 1.5 ml Eppendorf tubes.

The supernatants were pooled and added to 0.5 ml of washed [with P300 - EDTA (see

Appendix A)] TALON™ resin, forming a resin suspension. This suspension was incubated for 20 to 30 min at room temperature on a nutator to prevent sedimentation of the resin.

Following incubation, the suspension was spun down and the supernatant ("flow through") was collected. The sedimented resin was then washed twice with 10 ml of P300 - EDTA to remove any unbound protein that was not already removed in the flow through.

The TALON™ resin was resuspended a final time in 3 ml of P300 - EDTA and poured into a Micro Bio-Spin® chromatography column. The buffer was allowed to drain into a collection tube, leaving a column of 0.5 ml of TALON™ resin. The bound protein was eluted with P300 - EDTA with an appropriate concentration of imidazole. Six fractions of the same volume as the TALON™ column (roughly 0.5 ml) were collected.

*Only soluble proteins can be purified in this way.

2.6.3 Experiments Performed

Purification of Piccolo NuA4 complex, tagged (6x, 8x, and 10x HIS) on yEpl1

1. Purification procedure utilizing TALON™ metal-affinity resin was performed on

Piccolo NuA4 complex (6x HIS tagged on yEpl1) (from expression experiment

#1).

2. Purification procedure utilizing TALON™ metal-affinity resin was performed on

Piccolo NuA4 complex (8x HIS tagged on yEpl1) (from expression experiment

#2).

3. Purification procedure utilizing TALON™ metal-affinity resin was performed on

Piccolo NuA4 complex (10x HIS tagged on yEpl1) (from expression experiment

#3).

4. Purification procedure utilizing TALON™ metal-affinity resin was performed on

LSD1/CoREST complex (6x HIS tagged on LSD1) (from expression experiment

#4).

5. Purification procedure utilizing TALON™ metal-affinity resin was performed on

LSD1/CoREST complex (8x HIS tagged on LSD1) (from expression experiment

#5).

6. Purification procedure utilizing TALON™ metal-affinity resin was performed on

LSD1/CoREST complex (10x HIS tagged on LSD1) (from expression

experiment #6).

Purification of LYTAG tagged DHFR and LYTAG tagged EGFP

7. Purification procedure utilizing TALON™ metal-affinity resin was performed on

HST/LYTAG tagged DHFR (from expression experiment #7).

8. Purification procedure utilizing TALON™ metal-affinity resin was performed on

HST/LYTAG tagged EGFP (from expression experiment #8).

Purification of HBO1/JADE1 complexes

9. Co-purification procedure utilizing TALON™ metal-affinity resin was

performed on HBO1/JADE1 complex as well as HBO1/JADE1∆1 complex (both

versions HST tagged on JADE1) (from expression experiment #9).

10. Co-purification procedure utilizing TALON™ metal-affinity resin was

performed on HBO1/JADE1 complex as well as HBO1/JADE1∆1 complex (both

versions HST tagged on JADE1) (from expression experiment #10).

11. Purification procedure utilizing TALON™ metal-affinity resin was performed on

SUMO/HST tagged HBO1 (from expression experiment #11).

12. Co-purification procedure utilizing TALON™ metal-affinity resin was

performed on HBO1/JADE1 complex as well as HBO1/JADE1∆1 complex (both

versions SUMO/HST tagged on HBO1) (from expression experiment #14).

2.7 Supplemental Protein Methods

2.7.1 SDS-PAGE

Sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) was the

method used for protein visualization following expression and purification. SDS is a

negatively charged detergent and is added to the protein sample to serve a dual purpose: (1)

keeping proteins in a denatured conformation and (2) adding a roughly uniform negative

charge per unit mass to all proteins so that separation occurs only based on differences in

size rather than differences in total charge. Gels were run using a Mini-Protean II

electrophoresis apparatus filled with protein gel running buffer (PGRB) (10 mM Tris, 76

mM glycine, 0.02 % SDS). Protein samples were mixed with an equal volume of protein

gel loading buffer (PGLB) (125 mM Bis-Tris pH 6.8, 20% glycerol, 4% SDS, 15% 2-

mercaptoethanol, 0.04% bromophenol blue) and then boiled for 5 min to denature and limit

. The loaded samples were electrophoresed at 10 watts (~300 V, ~35 mA) for

roughly 30 min (or until bromophenol blue had run off the bottom of the gel).

After electrophoresis, the gels were soaked in FIX solution (45% ethanol, 9%

acetic acid) for 5 minutes and then soaked in STAIN solution (0.5% Coomassie Blue R,

45% ethanol, 9% acetic acid) for 5 to 10 minutes. The STAIN was decanted to be reused,

and the gel was soaked in DESTAIN solution (7% ethanol, 5% acetic acid) in a 65°C water bath.

The gels used were 18% polyacrylamide and were prepared using 8 ml of MilliQ water, 36 ml of 30% acrylamide/0.5% bisacrylamide, 120 µl of bromophenol blue in ethanol, and 15 ml of 3 M Tris-Cl pH 8.8. The solution was deaerated, after which 600 µl of 10% SDS, 60 µl of tetramethylethylenediamine (TEMED), and 240 µl of 25% AMPS were added. The resulting solution was injected into the gel-pouring block up to roughly 70% capacity. Water-saturated butanol was poured over the gel block and the acrylamide was allowed to polymerize at room temperature. A stacking solution was prepared using 5 ml of MilliQ water, 10 ml of 10% acrylamide/0.5% bisacrylamide, and 4.8 ml of 0.5 M Bis-Tris solution. The stacking solution was deaerated, after which 200 µl of 10% SDS, 15 µl of TEMED, and 80 µl of 25% AMPS were added. The water-saturated butanol was poured off the block, and the stacking solution was poured over the top of the gel block. Combs were placed and the gels were allowed to polymerize at room temperature. Gels were stored at 4°C wrapped in wet paper towels.
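The stated 18% acrylamide concentration can be checked from the recipe by simple dilution arithmetic. The short Python sketch below is only an illustration: the function name and variable names are my own, and it assumes that volumes are additive on mixing.

```python
def final_percent(stock_percent, stock_ml, all_volumes_ml):
    """Final concentration (%) of one component after mixing, by simple dilution.

    Assumes the volumes are additive, which is an approximation.
    """
    return stock_percent * stock_ml / sum(all_volumes_ml)


# Resolving gel volumes in ml, in the order listed in the recipe:
# water, 30% acrylamide, bromophenol blue, Tris-Cl, SDS, TEMED, AMPS
resolving_ml = [8.0, 36.0, 0.120, 15.0, 0.600, 0.060, 0.240]

# Stacking gel volumes in ml:
# water, 10% acrylamide, Bis-Tris, SDS, TEMED, AMPS
stacking_ml = [5.0, 10.0, 4.8, 0.200, 0.015, 0.080]

resolving_pct = final_percent(30.0, 36.0, resolving_ml)
stacking_pct = final_percent(10.0, 10.0, stacking_ml)

print(f"Resolving gel: {resolving_pct:.1f}% acrylamide")  # 18.0%
print(f"Stacking gel:  {stacking_pct:.1f}% acrylamide")   # 5.0%
```

The resolving gel works out to 18% in a total volume of roughly 60 ml, in agreement with the text, and the stacking gel to roughly 5%.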

2.7.2 Purification by Q-Sepharose Ion Exchange

Q-Sepharose ion-exchange resin has a choline-like charged head (Paterson, 2006) that allows it to be repurposed for affinity purification of LYTAG tagged proteins. LYTAG binds choline and similarly binds the choline-like moiety on Q-Sepharose resin. The bound LYTAG tagged protein can be eluted with choline chloride, which competitively binds to the LYTAG tag.

*Experiments using Q-Sepharose purification were performed by laboratory technicians Allen Minns and Bryan Thurston.


Chapter 3

Results/Discussion

3.1 Poly-histidine Affinity Tag Purification

SDS-PAGE gel electrophoresis analysis of the purified 6x, 8x, and 10x Piccolo NuA4 complex elution fractions revealed that the 10x affinity tag was optimal for obtaining the highest level of purity.

Table 3-1. Piccolo NuA4 Complex Molecular Weights:

6x HIS tagged yEpl1 41 896.0 Da

8x HIS tagged yEpl1 42 298.4 Da

10x HIS tagged yEpl1 42 572.6 Da

Yng2 21 030.0 Da

yEsa1 52 481.0 Da

As shown in Figure 3-1, washing with 20 mM imidazole solution (lanes 1, 2, & 3) elutes less of the 10x HIS (HST) tagged complex than either the 6x or 8x HIS tagged complex. This suggests higher affinity, or tighter binding, for the HST tag, which would in turn allow for more numerous and more concentrated imidazole washes before final elution of the complex. This might allow for elution of undesired contaminants while keeping the complex bound to the resin, thus leading to higher purity. Observation of elution fractions 7 and 8 also seemed to indicate that the final elution of HST tagged complex was purer than either 6x HIS tagged or 8x HIS tagged complexes: a band clearly visible at roughly 40 kD in lanes 4, 5, 7, and 8 was much fainter in lanes 6 and 9, suggesting the absence of a significant contaminant. This apparent purity improvement justified additional purification experiments with the LSD1/CoREST complex.

Figure 3-1. TALON Purification of 6x, 8x, and 10x Histidine Tagged Piccolo NuA4 Complex

Side-by-side comparison of elution fraction 5 (20 mM imidazole (Im) wash) for 6x, 8x, and 10x HIS tagged complex revealed that the smallest amount of complex was eluted during 20 mM washes when tagged with an HST tag rather than a 6x or 8x HIS tag. Fractions 7 and 8 also indicated that the HST tagged complex was the purest upon final elution with 200 mM imidazole. *The asterisk in the molecular weights column indicates ~40 kD, where a significant contaminant (mentioned in text) migrated. *yEpl1 migrated anomalously slowly.

SDS-PAGE gel electrophoresis analysis of the purified 6x, 8x, and 10x HIS tagged LSD1 complex (LSD1/CoREST) elution fractions confirmed the results from experiments with the Piccolo NuA4 complex: the HST tag allowed for a higher degree of purification.

Table 3-2. LSD1 Complex Molecular Weights

6x HIS tagged LSD1 80 715.8 Da

8x HIS tagged LSD1 79 791.8 Da

10x HIS tagged LSD1 80 066.1 Da

CoREST 25 817.8 Da

Figure 3-2 C shows that less of the desired complex was eluted at lower concentration imidazole washes when using the HST tag. As seen in Figure 3-2 A, 6x HIS tag showed the lowest affinity for the TALON™ resin and eluted most during the low concentration imidazole washes. As shown in Figure 3-2 B, 8x HIS tag showed the second highest affinity and eluted less in the washes. It is clearly observable from the final 200 mM imidazole elution fractions in Figure 3-2 A, B, and C that more of the HST tagged complex was left on the column following the washes than either the 6x HIS or 8x HIS tagged complexes. It was concluded that the HST tag has a greater affinity for the cobalt-based TALON™ resin and allows for low concentration imidazole washes without eluting the protein of interest. This ultimately yields a higher purity protein or protein complex.


Figure 3-2. TALON Purification of 6x, 8x, and 10x Histidine Tagged LSD1/CoREST Complex

(A) Most of the 6x HIS tagged complex eluted during the low-concentration imidazole (Im) washes, especially 30 mM and 40 mM concentrations. Very little complex remained on the column to be eluted in the final 200 mM imidazole wash. (B) Most of the 8x HIS tagged complex eluted during the imidazole washes, especially the 40 mM and 50 mM concentrations. Very little complex (but more than the 6x HIS tagged version) remained on the column to be eluted by the final 200 mM imidazole wash. (C) Some of the 10x HIS (HST) tagged complex eluted during the late imidazole washes. Much more complex than in the 6x or 8x HIS tagged versions remained on the column to be eluted by the final 200 mM imidazole wash.

Based on these experiments, the default poly-histidine affinity tag in our laboratory was switched to HST, and the most frequently used constructs have been successfully converted.

3.2 LYTAG Affinity Tag Purification

The HST/LYTAG tagged DHFR and EGFP were purified via both Q-Sepharose affinity chromatography and traditional TALON™ affinity chromatography.

Table 3-3. DHFR & EGFP Molecular Weights

HST/LYTAG tagged DHFR 34 663.4 Da

HST/LYTAG tagged EGFP 43 605.5 Da

SDS-PAGE gel electrophoresis analysis of the eluted fractions showed that LYTAG tagged proteins could be purified via Q-Sepharose affinity chromatography to an equal or greater degree of purity than using the traditional TALON™ resin method. Figure 3-3 A shows the comparison of Q-Sepharose purified and TALON™ purified HST/LYTAG tagged DHFR. Apart from the single extra band at roughly 31 kD, the Q-Sepharose purification method yielded a slightly purer final sample of DHFR (compare lanes 6 and 7 to lanes 10 and 11). Figure 3-3 B shows the comparison of Q-Sepharose purified and TALON™ purified HST/LYTAG tagged EGFP. As shown by the small amount of protein in the supernatant as opposed to the large amount in the pellet, very little of the expressed EGFP was soluble. However, the soluble fraction that could be purified showed that Q-Sepharose purification yielded a higher purity sample than the TALON™ purification (compare lane 6 to lanes 10 and 11).


Figure 3-3. Q-Sepharose and TALON Purification of HST/LYTAG Tagged DHFR and EGFP

(A) Overall purity of final elution fractions (lanes 6 & 7 vs lanes 10 & 11) seemed to be higher for DHFR purified by Q-Sepharose resin than DHFR purified by TALON™ resin. (B) Overall purity of final elution fractions (lane 6 vs lanes 10 & 11) also seemed to be higher for EGFP purified by Q-Sepharose resin than EGFP purified by TALON™ resin. *Samples on the Q-Sepharose column were eluted with a solution of 20 mM Tris chloride and 150 mM choline chloride, after two T1500 washes.

3.3 Expressing the HBO1/JADE1 Complex

The original cloning scheme involved creating two constructs with different versions of the HBO1/JADE1 complex (one with full-length JADE1 and the other with a truncated JADE1∆1 version), HST tagged on JADE1.

Table 3-4. HBO1/JADE1 Complex (Version 1) Molecular Weights

6x HIS tagged JADE1 62 033.0 Da

6x HIS tagged JADE1∆1 49 136.4 Da

HBO1 35 609.2 Da

When attempting to clone HBO1 into the first translation cassette in the pST69 polycistronic expression vector, however, there was a problem with ligation so that no clones were produced with the correct insert. An alternate cloning strategy was devised using NdeI and BsrGI restriction enzymes to insert both versions of JADE1 into pST69's first cassette. The parent pST69 vector already contained a 6x HIS tag in cassette 1. AscI and SbfI restriction enzymes were used to insert HBO1 into the second cassette.

The initial co-expression experiments of HBO1/JADE1 complex and HBO1/JADE1∆1 complex (both versions HST tagged on JADE1) using BL21 (DE3) pLysS cells failed to show convincing evidence of expression of HBO1, although Figure 3-5 shows there was probable expression of JADE1 as well as JADE1∆1. Sources in the literature confirmed that HBO1 could be expressed in E. coli (Iizuka, 1999), so I hypothesized that the lack of expression of HBO1 was due to adjacent arginine amino acid residues coded for by AGA and AGG. The tRNAs with complementary anticodons to AGA and AGG codons are extremely rare and poorly produced in E. coli (Imamura, 1999). These codons in close proximity in highly expressed recombinant genes have been associated with poor translation (Imamura, 1999).

Figure 3-4. Problematic Arginine Residues in SUMO/HST Tagged HBO1

1 M S D S E V N Q E A K P E V K P E V K P 20
1 ATGTCTGACTCAGAAGTCAATCAAGAAGCTAAGCCAGAGGTCAAGCCAGAAGTCAAGCCT 60

21 E T H I N L K V S D G S S E I F F K I K 40
61 GAGACTCACATCAATTTAAAGGTGTCCGATGGATCTTCAGAAATCTTCTTCAAGATCAAA 120

41 K T T P L R R L M E A F A K R Q G K E M 60
121 AAGACCACTCCTTTAAGAAGGCTGATGGAAGCGTTCGCTAAAAGACAGGGTAAGGAAATG 180

61 D S L R F L Y D G I R I Q A D Q T P E D 80
181 GACTCCTTAAGATTCTTGTACGACGGTATTCGTATTCAAGCTGATCAGACCCCTGAAGAT 240

81 L D M E D N D I I E A H R E Q I G P S S 100
241 TTGGACATGGAGGATAACGATATTATTGAGGCTCACAGAGAACAGATTGGCCCGAGCAGC 300

101 H H H H H H H H H H S S G S G G G G G E 120
301 CATCATCATCATCATCATCATCATCATCACAGCAGCGGATCTGGTGGTGGTGGTGGTGAA 360

121 N L Y F Q G S F R R A Q A R A S E D L E 140
361 AACCTGTACTTCCAGGGATCCTTCCGAAGAGCACAAGCCCGGGCTTCAGAGGATTTGGAG 420

141 K L R L Q G Q I T E G S N M I K T I A F 160
421 AAGTTAAGGCTGCAAGGCCAAATCACAGAGGGAAGCAACATGATTAAAACAATTGCTTTT 480

161 G R Y E L D T W Y H S P Y P E E Y A R L 180
481 GGCCGCTATGAGCTTGATACCTGGTACCATTCTCCATATCCTGAAGAATATGCACGGCTG 540

181 G R L Y M C E F C L K Y M K S Q T I L R 200
541 GGACGTCTCTATATGTGTGAATTCTGTTTAAAATATATGAAGAGCCAAACGATACTCCGC 600

201 R H M A K C V W K H P P G D E I Y R K G 220
601 CGGCACATGGCCAAATGTGTGTGGAAACACCCACCTGGTGATGAGATATATCGCAAAGGT 660

221 S I S V F E V D G K K N K I Y C Q N L C 240
661 TCAATCTCTGTGTTTGAAGTGGATGGCAAGAAAAACAAGATCTACTGCCAAAACCTGTGC 720

241 L L A K L F L D H K T L Y Y D V E P F L 260
721 CTGTTGGCCAAACTTTTTCTGGACCACAAGACATTATATTATGATGTGGAGCCCTTCCTG 780

261 F Y V M T E A D N T G C H L I G Y F S K 280
781 TTCTATGTTATGACAGAGGCGGACAACACTGGCTGTCACCTGATTGGATATTTTTCTAAG 840

281 E K N S F L N Y N V S C I L T M P Q Y M 300
841 GAAAAGAATTCATTCCTCAACTACAACGTCTCCTGTATCCTTACTATGCCTCAGTACATG 900

301 R Q G Y G K M L I D F S Y L L S K V E E 320
901 AGACAGGGCTATGGCAAGATGCTTATTGATTTCAGTTATTTGCTTTCCAAAGTCGAAGAA 960

321 K V G S P E R P L S D L G L I S Y R S Y 340
961 AAAGTTGGCTCCCCAGAACGTCCACTCTCAGATCTGGGGCTTATAAGCTATCGCAGTTAC 1020

341 W K E V L L R Y L H N F Q G K E I S I K 360
1021 TGGAAAGAAGTACTTCTCCGCTACCTGCATAATTTTCAAGGCAAAGAGATTTCTATCAAA 1080

361 E I S Q E T A V N P V D I V S T L Q A L 380
1081 GAAATCAGTCAGGAGACGGCTGTGAATCCTGTGGACATTGTCAGCACTCTGCAGGCCCTT 1140

381 Q M L K Y W K G K H L V L K R Q D L I D 400
1141 CAGATGCTCAAATACTGGAAGGGAAAACACCTAGTTTTAAAGAGACAGGACCTGATTGAT 1200

401 E W I A K E A K R S N S N K T M D P S C 420
1201 GAGTGGATAGCCAAAGAGGCCAAAAGGTCCAACTCCAATAAAACCATGGACCCCAGCTGC 1260

421 L K W T P P K G T 429
1261 TTAAAATGGACCCCTCCCAAGGGCACT 1287

Figure 3-4. Problematic Arginine Residues in SUMO/HST Tagged HBO1

In the above sequence of the HBO1 truncation used in these experiments, all arginine residues are indicated by a red letter, and the problematic codons (AGA & AGG) are highlighted in red. One potential problem area can be seen between base pairs 135 and 192. There are four problematic arginine residues in that stretch, and most worrisome are the adjacent AGA and AGG codons between base pairs 135 and 141.
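The search for problematic codons described in this caption can be reproduced with a short script. The Python sketch below is only an illustration (the function names are my own, not part of any published tool). It reads the DNA in frame, three bases at a time, because AGA or AGG triplets that straddle two codons are irrelevant to translation. The test fragment is bases 121-192 of the sequence in Figure 3-4, and positions are reported as the 1-based position of the first base of each codon.

```python
RARE_ARG_CODONS = {"AGA", "AGG"}


def find_rare_arg_codons(dna, first_base=1):
    """Return (position, codon) for every in-frame AGA/AGG codon.

    `dna` must begin on a codon boundary. `first_base` is the 1-based
    position of its first nucleotide within the full-length sequence.
    """
    hits = []
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    for i in range(0, usable, 3):
        codon = dna[i:i + 3].upper()
        if codon in RARE_ARG_CODONS:
            hits.append((first_base + i, codon))
    return hits


def adjacent_pairs(hits):
    """Return pairs of rare codons that are directly next to each other."""
    return [(a, b) for a, b in zip(hits, hits[1:]) if b[0] - a[0] == 3]


# Bases 121-192 of the SUMO/HST tagged HBO1 sequence (Figure 3-4)
fragment = (
    "AAGACCACTCCTTTAAGAAGGCTGATGGAAGCGTTCGCTAAAAGACAGGGTAAGGAAATG"
    "GACTCCTTAAGA"
)

hits = find_rare_arg_codons(fragment, first_base=121)
print(hits)
# [(136, 'AGA'), (139, 'AGG'), (163, 'AGA'), (190, 'AGA')]
print(adjacent_pairs(hits))
# [((136, 'AGA'), (139, 'AGG'))]
```

The script finds four rare arginine codons in this stretch, including one adjacent AGA/AGG pair, consistent with the description in the caption.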

55

Figure 3-5. Expression (28°C and 37°C) and TALON Purification of HBO1/JADE1 Complex in BL21 (DE3) pLysS

(A & B) There appears to be an induced band at roughly where JADE1 would be expected to migrate. It is not clear whether there is any expression of HBO1. (C & D) There appears to be an induced band at roughly where JADE1∆1 would be expected to migrate. It is not clear whether there is any expression of HBO1. In all four gels, the fractions were eluted with 100 mM imidazole (Im), but no purified protein was observed. *Both JADE1 and JADE1∆1 migrated anomalously slowly.

An important and unexpected result was that what was believed to be JADE1 and JADE1∆1 migrated anomalously slowly on the SDS-PAGE gel. This is not an uncommon observation with many proteins, but more experiments were necessary to confirm the expression.

One possible explanation for the anomalous migration rates was incomplete SDS detergent binding. Some structural motifs observed in membrane proteins have been associated with irregular SDS binding (Rath, 2009). With less SDS bound per unit mass of denatured protein, for example, the protein would be less negatively charged than predicted and would migrate toward the anode more slowly than expected. JADE1 and JADE1∆1 may share the structural characteristics that cause some membrane proteins to bind SDS less efficiently and, in turn, migrate more slowly on a gel. An alternative explanation is that the proteins were not sufficiently denatured by heating to fully bind SDS along their entire length. As the coming experiments will illustrate, however, this slow migration was consistently observed with JADE1 and JADE1∆1, making the latter explanation much less likely.

In order to ameliorate the problem of close-proximity rare arginine codons, I attempted the same HBO1/JADE1 and HBO1/JADE1∆1 expressions in BL21-CodonPlus (DE3) cells. The CodonPlus cells used contain extra copies of the argU gene coding for the tRNA that recognizes the AGA and AGG codons. Thus, CodonPlus cells will have a higher concentration of these problematic tRNAs and theoretically facilitate the translation of adjacent arginine residues. Figure 3-6 shows that there was expression of both JADE1 versions in CodonPlus cells, but it was unclear whether HBO1 expressed at either 28°C or 37°C. Expression of both JADE1 versions was stronger in CodonPlus cells than in regular BL21 cells, so further experiments were carried out using BL21-CodonPlus (DE3).

Importantly, all or nearly all of the expressed JADE1 and JADE1∆1 was insoluble. This was evident from lanes 4, 5, & 6 in Figure 3-6 A and lanes 5, 6, & 7 in Figure 3-6 B. These lanes show the relative amounts of protein in the whole cell extract (WCE), pellet, and supernatant, and the gels clearly show that the vast majority of either JADE1 version stays in the pellet. The presence of other bands in the supernatant argues against incomplete cell lysis and suggests that the expressed JADE1 and JADE1∆1 were insoluble. This was a problem to be tackled later, but only after both JADE1 and HBO1 could be coexpressed.


Figure 3-6. Expression (37°C) and TALON Purification of HBO1/JADE1 Complex in BL21-CodonPlus (DE3)

(A) There is clear expression of JADE1 in CodonPlus cells, indicated by the labeled induced bands. (B) There is clear expression of JADE1∆1 in CodonPlus cells, indicated by the labeled induced bands. In both gels, the fractions were eluted with 100 mM imidazole (Im), but no purified protein was observed. *Both JADE1 and JADE1∆1 migrated anomalously slowly.

A noteworthy observation regarding Figure 3-6 is the strong bands at 14.4 kD. These bands represent the lysozyme added to the sample prior to sonication in order to facilitate cell lysis. This conclusion can be made with great confidence because lysozyme is the protein used in the molecular weight marker to create the lowest (14.4 kD) band. In every case, the added lysozyme band migrates alongside the lysozyme in the molecular weight marker. Lysozyme was added in all future expressions and purifications using CodonPlus, and the band will be clearly visible in many ensuing figures.

SDS-PAGE analysis of expression of HBO1 alone in BL21-CodonPlus cells similarly showed good evidence that HBO1 could be expressed at 37°C in CodonPlus cells (Figure 3-7).

However, misinterpretation of the molecular weight markers initially led to the conclusion that the strong band at ~35 kD was closer to ~25 kD. At first, we did not believe that the strong band was HBO1 (MW: 35.6 kD), and future experiments moved forward based on this assumption.


Figure 3-7. Expression of HBO1 in BL21-CodonPlus (DE3) at 37°C A band migrating at ~35 kD in lanes 6 and 7 suggests moderate expression of HBO1 in CodonPlus cells at 37°C. The sample appears to be nearly completely insoluble based on only very minute quantities in the supernatant (lane 8).

Expression levels can often be improved by using an N-terminal protein tag. Transcribed genes sometimes produce an mRNA with a convoluted secondary structure, a characteristic that may prevent a ribosome from binding to the Shine-Dalgarno sequence and initiating translation (Hansted et al, 2011). The idea behind an N-terminal tag is to alter the 5' mRNA secondary structure, thus facilitating translation (Hansted et al, 2011). Based on what constructs were readily available in the laboratory, I decided to test a dual-tag fusion with HBO1 consisting of HST and modified SUMO. SUMO proteins are attached to target proteins as post-translational modifications in eukaryotic cells and are involved in apoptosis, protein activation, conformational stability, stress response, cell cycle progression, and cytosolic transport (Panavas et al, 2009). As an N-terminal protein tag, SUMO may offer a variety of advantages, including protection from degradation, improved protein folding, and enhanced protein expression (Butt et al, 2005). In all experiments performed, the SUMO tag used was a mutant version excluding the recognition site for SUMO protease. Other members of the Tan laboratory had previously found that the SUMO protease recognition site could lead to cleavage in E. coli by an unknown protease.

In order to test whether the addition of an affinity tag would improve expression of HBO1, an expression vector containing SUMO/HST tagged HBO1 (MW: 49.7 kD) was created (see section 2.3.10 Flow Chart Outline of Subcloning Schemes). SDS-PAGE analysis of tagged HBO1 expression revealed good expression in BL21-CodonPlus (DE3) cells at 37°C (Figure 3-8). Two 100 ml expressions were performed from separate colonies in order to demonstrate that this result was reproducible. In both experiments, the expressed HBO1 appeared to be slightly soluble (~30-40%) based on analysis of lanes 4, 5, 6, 10, 11, and 12. These lanes allowed comparison of relative amounts of HBO1 in the pellet and supernatant of spun-down whole cell extract. Thus, although Figure 3-7 showed that the SUMO/HST tags were not necessary for expression, Figure 3-8 showed that the tags may have improved solubility, which was certainly a worthwhile development. Future experiments were performed using the SUMO/HST tagged version of HBO1.


Figure 3-8. Expression of SUMO/HST tagged HBO1 in BL21-CodonPlus (DE3) at 37°C

A band migrating at ~50 kD in lanes 4, 5, 6, 10, 11, and 12 suggests moderate expression of SUMO/HST tagged HBO1 in CodonPlus cells at 37°C. The sample appears to be slightly soluble based on a significant quantity present in the supernatant (lanes 6 & 12).

Because of the redundancy of having two poly-histidine tagged subunits in a co-purification procedure, it was necessary to create an untagged version of both full-length and truncated JADE1. Two expression vectors containing both versions of JADE1 were created (see section 2.3.10 Flow Chart Outline of Subcloning Schemes), and SDS-PAGE gel electrophoresis analysis conclusively demonstrated that untagged JADE1 (MW: 58.4 kD) and JADE1∆1 (MW: 45.5 kD) could be successfully expressed in BL21-CodonPlus (DE3) cells at 37°C (Figure 3-9). Figure 3-9 also shows that both versions were nearly completely insoluble (no visible target protein band in the supernatant).

Figure 3-9. Expression of JADE1 and JADE1∆1 in BL21-CodonPlus (DE3) at 37°C

A band migrating at ~50 kD in lanes 4 and 5 suggests moderate expression of JADE1∆1 in CodonPlus cells at 37°C. The sample appears to be nearly completely insoluble based on the lack of sample present in the supernatant. A band migrating at ~66 kD in lanes 10 and 11 suggests moderate expression of JADE1 in CodonPlus cells at 37°C. JADE1 also appears to be nearly completely insoluble based on the lack of sample present in the supernatant. *Both JADE1∆1 and JADE1 migrated anomalously slowly.
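The extent of the anomalous migration can be put in rough numbers by comparing the apparent masses read from the gel with the predicted masses. The Python sketch below is only an illustration: the names are my own, and the apparent values are the approximate band positions quoted in the caption above, not precise measurements.

```python
# Predicted masses from the text; apparent masses are the approximate
# band positions quoted in the Figure 3-9 caption (all values in kD).
proteins = {
    "JADE1": {"predicted_kd": 58.4, "apparent_kd": 66.0},
    "JADE1_delta1": {"predicted_kd": 45.5, "apparent_kd": 50.0},
}


def migration_ratio(predicted_kd, apparent_kd):
    """Apparent mass divided by predicted mass (1.0 means normal migration)."""
    return apparent_kd / predicted_kd


ratios = {
    name: round(migration_ratio(**masses), 2)
    for name, masses in proteins.items()
}

for name, ratio in ratios.items():
    print(f"{name}: apparent/predicted = {ratio}")
```

Both versions run at roughly 10-13% above their predicted mass, so the shift is similar for the full-length and truncated proteins.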

A final cloning scheme was carried out to create two pST69 polycistronic expression vectors containing the HBO1/JADE1 and HBO1/JADE1∆1 complexes, both SUMO/HST tagged on HBO1 (see section 2.3.10 Flow Chart Outline of Subcloning Schemes).

Table 3-5. HBO1/JADE1 Complex (Version 2) Molecular Weights

JADE1 58 383.0 Da

JADE1∆1 45 485.7 Da

SUMO/HST tagged HBO1 49 705.7 Da

100 ml expressions carried out in BL21-CodonPlus (DE3) cells at 28°C and 37°C showed good expression of both subunits but low solubility (Figure 3-10). From the small amount of sample that appeared to be in the supernatant of HBO1/JADE1 expressions at both temperatures, it was suggested that both versions of JADE1 may have been slightly soluble (Figure 3-10 A). However, upon examination of Figure 3-10 B, we can observe a band in the supernatant of HBO1/JADE1∆1 expressions where JADE1 would have migrated. The presence of this band suggests that the apparent solubility of JADE1 in Figure 3-10 A was a product of an endogenous soluble protein that happened to be the same size as JADE1. The SUMO/HST tagged HBO1 protein also appears to be insoluble.


Figure 3-10. Expression of SUMO/HST tagged HBO1/JADE1 and HBO1/JADE1∆1 Complexes in BL21-CodonPlus (DE3) at 28°C and 37°C

(A) A band migrating at ~66 kD in lanes 4, 5, 10, and 11 suggests moderate expression of JADE1 in CodonPlus cells at 28°C and 37°C. The sample appears to be slightly soluble (lanes 6 & 12), but this is most likely the result of a co-migrating endogenous protein (see explanation in text). There is good evidence of SUMO/HST tagged HBO1 expression based on a sharp band present in lanes 4, 5, 10, and 11 migrating between 45 kD and 50 kD. (B) A band migrating at ~50 kD in lanes 4, 5, 10, and 11 suggests moderate expression of JADE1∆1 in CodonPlus cells at 28°C and 37°C. There is little indication that JADE1∆1 is soluble at all (lanes 6 & 12). There is good evidence of SUMO/HST tagged HBO1 expression based on a sharp band present in lanes 4, 5, 10, and 11 migrating between 45 kD and 50 kD. *Both JADE1∆1 and JADE1 migrated anomalously slowly.

The 100 ml small-scale expressions of both complexes were repeated at 18°C as well as 23°C in hopes of achieving higher solubility. Figure 3-11 reveals similar levels of expression of both versions of the complex, but the lower temperatures did not improve solubility. There is clear expression of tagged HBO1, JADE1, and JADE1∆1 in all four expressions (Figure 3-11 A, B, C, D; lanes 2 & 3). However, there is little to no soluble protein, as indicated by the near absence of HBO1, JADE1, or JADE1∆1 bands in the supernatant of any of the expressions (Figure 3-11 A, B, C, D; lane 4).

Interestingly, there does appear to be some purification of SUMO/HST tagged HBO1 (lanes 7-10), as a faint band consistent with the molecular weight of HBO1 is eluting off the resin during 100 mM and 200 mM imidazole washes. However, this is far from convincing evidence, and alternate conditions and/or truncations should be tested to improve overall solubility.


Figure 3-11. Expression (18°C and 23°C) and Purification of SUMO/HST Tagged HBO1 in BL21-CodonPlus (DE3)

(A & B) Both bands of the HBO1/JADE1 complex can be seen in the whole cell extract (WCE) and pellet, migrating at ~50 kD and ~66 kD respectively. Thus, the complex can be expressed at 18°C and 23°C. Solubility is not improved from that shown in Figure 3-10, but some SUMO/HST tagged HBO1 appears to have been purified (lanes 7-10). (C & D) Both bands of the HBO1/JADE1∆1 complex migrate very close together and appear as one wide band that can be seen in WCE and pellet at ~50 kD. Thus, the complex can be expressed at 18°C and 23°C. Solubility is not improved from that shown in Figure 3-10, but some SUMO/HST tagged HBO1 appears to have been purified (lanes 7-10). *There were two holes in the gel (D) in lanes 3 and 8, leading to abnormally slow migration in those lanes. *Both JADE1∆1 and JADE1 migrated anomalously slowly.

In summary, the above experiments have shown that poly-histidine tag length does affect affinity for cobalt-based TALON™ resin. It was found that the 10x HIS, or HST, tag bound tightest to the TALON™ resin, and HST tagged proteins were able to achieve the highest purity. Experiments also determined that the LYTAG affinity tag could be used to successfully purify proteins via Q-Sepharose ion exchange chromatography. Lastly, I was able to show that HBO1 and JADE1 could be individually expressed and coexpressed in E. coli. However, solubility seemed to be the limiting factor, as I was not successful in expressing a soluble complex.


Chapter 4

Conclusion

4.1 Future Directions

One of the major problems currently faced with performing structural studies on the HBO1/JADE1 complex is producing large quantities of soluble protein. It has been demonstrated in the experiments discussed herein that the vast majority of co-expressed complex was insoluble. The native conformations of both HBO1 and JADE1 are soluble, suggesting that the complexes produced in the aforementioned experiments did not fold properly. Two possible approaches can be taken to try to resolve the issue of solubility.

The first approach is based on the presumption that producing a soluble complex using the currently available truncations is possible. In this case, future experiments should focus on altering expression conditions. A range of temperatures (18°C, 23°C, 28°C, and 37°C) has already been investigated and it appears that the lower two temperatures produce the best (but not satisfactory) results. A second condition to explore is the media in which the cells are grown and expressed. Optimizing the media conditions has been shown to dramatically improve solubility of proteins expressed in E. coli (Broedel et al, 2001). Specifically, the inclusion of sorbitol in growth media has been suggested to improve solubility of some expressed proteins (Blackwell et al, 1991) and is one condition that should be tested. Additionally, both HBO1 and JADE1 contain a zinc finger motif (Burke et al, 2001) (Tzouanacou et al, 2003). This functional domain is characterized by a central zinc atom, which may play an important role in structural stabilization. The mere addition of zinc to the expression media (the 2x TY media I used did not include additional zinc) might facilitate stable zinc finger formation and overall proper folding.

The second possibility is that, regardless of conditions, the particular truncations of HBO1 and JADE1 previously expressed simply cannot achieve proper, solubility-conferring conformations. In this case, new versions of both protein subunits, with different regions excluded from the sequence, need to be created and tested for soluble expression. HBO1 and JADE1 may also simply not be sufficient to form a soluble complex. As previously discussed, both proteins are part of a much larger complex, the pre-RC, and form interactions with other protein subunits. The key to continued progress on this project might lie with other components of the pre-RC that are required to form a soluble complex. It was suggested to our laboratory by our collaborator on this project, Jacques Côté, that it may be possible to form a binary complex with HBO1 and JADE1 alone. However, I have been unable to find any indication in the literature that such a complex has been made.

It is highly likely that HBO1 and JADE1 make contacts with several other protein subunits in the pre-RC complex. The question is whether those contacts are critical for proper conformation of HBO1 and JADE1. It is conceivable that those contacts are not critical, and that HBO1 and JADE1 are sufficient for forming a soluble and functional binary complex. The experiments performed so far have been based on this hypothesis that HBO1 and JADE1 are sufficient, but future experimentation may reveal the contrary.

Exploring the use of different affinity tags may help improve solubility as well as open new options for purification. An undergraduate in the Tan laboratory, Ryan Henrici, has observed a significant increase in solubility of several proteins upon switching from a /HST tagged protein to a SUMO/HST tagged protein. A similar scenario may be tested with the HBO1/JADE1 complex. Although I have already performed several unsuccessful experiments using SUMO/HST tagged HBO1/JADE1, switching to a different tag may grant some improvement in solubility. There are numerous examples in the literature of affinity tags or fusion partners improving expression levels of low or non-expressed proteins (Hewitt et al, 2011). For example, an "expressivity tag" derived from the InfB gene in E. coli conferred a 3-5 fold increase in green fluorescent protein (GFP) expressivity (Hansted et al, 2011). The LYTAG affinity tag may also be tested in conjunction with purification by Q-Sepharose affinity chromatography.

Once a large quantity of pure and soluble complex can be isolated, the recombinant complex should be assayed for enzymatic HAT activity on both histones and nucleosomes. It is important that the truncated versions used do not severely limit, or remove altogether, the complex's functionality. The structure of a catalytically inactive protein may not correlate well with the native, catalytically active state, so determining the structure of such a complex would be of limited use. HAT activity assays would ensure that the complex produced is catalytically active.

Following these assays, crystallization trials must be set up to find conditions optimal for growing non-mosaic HBO1/JADE1/nucleosome crystals. To define optimal conditions for crystallization, it may be necessary to create multiple additional truncated versions of the complex by deleting regions of HBO1 and/or JADE1. Each of these truncated versions must also be assayed for HAT activity to prevent deletion of regions critical to functionality. This will help define the minimum HBO1 and JADE1 regions needed for efficient histone acetylation. Furthermore, since proteins are less likely to pack together in a crystal if they contain extraneous segments, using a minimal, yet enzymatically active, HBO1/JADE1 complex should increase the likelihood of successful crystallization of the protein complex on the nucleosome.

4.2 Summary

Through the experimental procedures contained herein, I have shown that the length of a poly-histidine tag has a direct effect on the tag's ability to facilitate protein purification. Of the poly-histidine tags tested (6x, 8x, and 10x (HST)), the 10x histidine tag (HST) was found to have the highest affinity for cobalt-based TALON™ resin and to achieve the highest degree of purity. As a result, the Tan laboratory has switched its default histidine tag for small-scale TALON™ purifications from 6x histidine to HST. Other purification experiments have also demonstrated the effectiveness of the LYTAG affinity tag in Q-Sepharose ion-exchange purification. It was concluded that LYTAG in conjunction with Q-Sepharose resin is at least as effective as HST in conjunction with TALON™ resin in small-scale purification.

Finally, a successful method for expressing a multi-subunit complex in a pST69 polycistronic expression vector has been established. Genes encoding HBO1 as well as full-length and truncated JADE1 have been successfully incorporated into two versions of a pST69 polycistronic expression plasmid. Both plasmids can be successfully used to produce the correct complex. The major hurdle has been producing a soluble complex, which I have not been successful in doing. There is some evidence that a small amount of SUMO/HST tagged HBO1 can be purified by cobalt-based TALON™ metal affinity chromatography (Fig. 3-11), which suggests a small degree of solubility. However, the binary HBO1/JADE1 complex has not copurified, and there was no significant evidence for solubility of either truncation of JADE1. Perhaps altering expression conditions and/or testing new versions/truncations of HBO1/JADE1 will result in the production of a soluble complex that can be the subject of structural studies. The overall goal of this project has been to expand understanding of the HBO1/JADE1 complex, of histone acetyltransferases, and more generally of epigenetic enzymes interacting with the nucleosome. It is my hope that my work will be expanded on and contribute to the advancement of the above areas of structural studies.

4.3 Epigenetics and Cancer

Virtually all our cells contain exactly the same genetic material, and yet cell differentiation occurs already in the very early stages of development. We are the product of an incredible diversity of cooperating cells, made possible by epigenetic gene regulation. Any glitches in these regulatory pathways can have dire consequences both for the affected cell and for the organism of which it is a part. Most notably, cancer is the result of a failure in gene regulation. The more we know about the enzymes that dictate gene expression, the closer we come to understanding, preventing, and possibly reversing the causes of cancer, and other diseases related to gene regulation.

Appendix A

DNA Sequences

yEpl1t3 (part of Piccolo NuA4 complex)

1 M G S N S R F R H R K I S V K Q H L K I 20 1 ATGGGATCCAATAGTCGATTTAGACATCGAAAAATATCTGTGAAGCAACATCTTAAGATA 60 1 TACCCTAGGTTATCAGCTAAATCTGTAGCTTTTTATAGACACTTCGTTGTAGAATTCTAT 60

21 Y L P N D L K H L D K D E L Q Q R E V V 40 61 TATCTGCCTAACGATCTGAAACACCTGGATAAAGATGAATTGCAACAGAGAGAGGTGGTT 120 61 ATAGACGGATTGCTAGACTTTGTGGACCTATTTCTACTTAACGTTGTCTCTCTCCACCAA 120

41 E I E T G V E K N E E K E V H L H R I L 60 121 GAGATCGAAACTGGTGTAGAAAAAAATGAAGAAAAGGAGGTCCATTTGCATCGAATATTA 180 121 CTCTAGCTTTGACCACATCTTTTTTTACTTCTTTTCCTCCAGGTAAACGTAGCTTATAAT 180

61 Q M G S G H T K H K D Y I P T P D A S M 80 181 CAAATGGGTTCTGGTCATACAAAGCACAAAGACTATATTCCGACCCCGGATGCTTCTATG 240 181 GTTTACCCAAGACCAGTATGTTTCGTGTTTCTGATATAAGGCTGGGGCCTACGAAGATAC 240

81 T W N E Y D K F Y T G S F Q E T T S Y I 100 241 ACTTGGAATGAATATGATAAGTTTTATACGGGCAGTTTCCAGGAGACCACCAGCTACATC 300 241 TGAACCTTACTTATACTATTCAAAATATGCCCGTCAAAGGTCCTCTGGTGGTCGATGTAG 300

101 K F S A T V E D C C G T N Y N M D E R D 120 301 AAATTTTCTGCCACTGTGGAGGATTGCTGTGGCACCAACTACAATATGGATGAAAGAGAT 360 301 TTTAAAAGACGGTGACACCTCCTAACGACACCGTGGTTGATGTTATACCTACTTTCTCTA 360

121 E T F L N E Q V N K G S S D I L T E D E 140 361 GAGACCTTTTTAAATGAACAAGTCAACAAAGGTTCATCAGACATTTTAACTGAAGACGAA 420 361 CTCTGGAAAAATTTACTTGTTCAGTTGTTTCCAAGTAGTCTGTAAAATTGACTTCTGCTT 420

141 F E I L C S S F E H A I H E R Q P F L S 160 421 TTTGAAATACTTTGTTCCAGTTTTGAACATGCTATTCACGAGCGTCAACCATTCTTGAGT 480 421 AAACTTTATGAAACAAGGTCAAAACTTGTACGATAAGTGCTCGCAGTTGGTAAGAACTCA 480

161 M D P E S I L S F E E L K P T L I K S D 180 481 ATGGACCCGGAAAGTATACTTTCTTTTGAAGAATTGAAGCCCACATTAATAAAATCAGAT 540 481 TACCTGGGCCTTTCATATGAAAGAAAACTTCTTAACTTCGGGTGTAATTATTTTAGTCTA 540

181 M A D F N L R N Q L N H E I N S H K T H 200 541 ATGGCTGATTTCAATCTAAGAAACCAGCTGAATCATGAAATAAATTCTCATAAAACACAT 600 541 TACCGACTAAAGTTAGATTCTTTGGTCGACTTAGTACTTTATTTAAGAGTATTTTGTGTA 600

201 F I T Q F D P V S Q M N T R P L I Q L I 220 601 TTTATCACACAATTCGACCCCGTATCTCAAATGAATACGAGACCTTTAATTCAGCTGATA 660 601 AAATAGTGTGTTAAGCTGGGGCATAGAGTTTACTTATGCTCTGGAAATTAAGTCGACTAT 660

221 E K F G S K I Y D Y W R E R K I E V N G 240 661 GAGAAGTTCGGCTCTAAAATTTATGATTATTGGAGAGAAAGAAAAATTGAAGTTAACGGG 720 661 CTCTTCAAGCCGAGATTTTAAATACTAATAACCTCTCTTTCTTTTTAACTTCAATTGCCC 720

241 Y E I F P Q L K F E R P G E K E E I D P 260 721 TACGAAATTTTTCCGCAGCTGAAATTTGAAAGGCCGGGTGAAAAAGAAGAGATTGATCCC 780 721 ATGCTTTAAAAAGGCGTCGACTTTAAACTTTCCGGCCCACTTTTTCTTCTCTAACTAGGG 780

261 Y V C F R R R E V R H P R K T R R I D I 280 781 TACGTCTGTTTCAGAAGAAGAGAAGTGAGACATCCGCGGAAAACAAGACGTATAGATATC 840 781 ATGCAGACAAAGTCTTCTTCTCTTCACTCTGTAGGCGCCTTTTGTTCTGCATATCTATAG 840

281 L N S Q R L R A L H Q E L K N A K D L A 300 841 TTAAACAGTCAGCGTTTAAGGGCGTTGCATCAAGAATTGAAAAACGCGAAGGACTTGGCC 900 841 AATTTGTCAGTCGCAAATTCCCGCAACGTAGTTCTTAACTTTTTGCGCTTCCTGAACCGG 900

301 L L V A K R E N V S L N W I N D E L K I 320 901 CTGCTTGTTGCTAAACGTGAGAACGTTTCCCTAAATTGGATTAATGATGAATTAAAAATA 960 901 GACGAACAACGATTTGCACTCTTGCAAAGGGATTTAACCTAATTACTACTTAATTTTTAT 960

321 F D Q R V K I K N L K R 332 961 TTCGATCAAAGGGTAAAAATTAAGAATTTGAAAAGA 996 961 AAGCTAGTTTCCCATTTTTAATTCTTAAACTTTTCT 1020
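Each block in these listings carries three rows: the translated protein sequence, the coding strand, and the complementary strand. The following is a minimal Python sketch, not part of the original work, of how the protein and complementary rows can be recomputed from the coding strand as a consistency check. The function names `translate` and `complement` are illustrative, and the standard genetic code is assumed.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate a coding strand in frame 1; '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def complement(dna):
    """Return the base-paired strand in the same left-to-right order
    as the listing (complemented, not reversed)."""
    return dna.upper().translate(COMPLEMENT)


# First 60 nucleotides of the yEpl1t3 coding strand as listed above.
fragment = "ATGGGATCCAATAGTCGATTTAGACATCGAAAAATATCTGTGAAGCAACATCTTAAGATA"
print(" ".join(translate(fragment)))  # protein row
print(fragment)                       # coding strand row
print(complement(fragment))           # complementary strand row
```

Running the same two functions over any other coding strand in this appendix should reproduce its protein and complementary rows if the listing is internally consistent.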

yEsa1x3 (part of Piccolo NuA4 complex)

1 M G S H D G K E E P G I A K K I N S V D 20 1 ATGGGATCCCATGACGGAAAAGAAGAACCTGGTATTGCCAAAAAGATAAACTCAGTAGAT 60 1 TACCCTAGGGTACTGCCTTTTCTTCTTGGACCATAACGGTTTTTCTATTTGAGTCATCTA 60

21 D I I I K C Q C W V Q K N D E E R L A E 40 61 GATATTATTATCAAATGTCAATGCTGGGTCCAAAAAAATGATGAAGAACGATTAGCTGAA 120 61 CTATAATAATAGTTTACAGTTACGACCCAGGTTTTTTTACTACTTCTTGCTAATCGACTT 120

41 I L S I N T R K A P P K F Y V H Y V N Y 60 121 ATTTTATCCATAAACACAAGAAAAGCACCACCAAAATTCTATGTTCACTATGTTAATTAC 180 121 TAAAATAGGTATTTGTGTTCTTTTCGTGGTGGTTTTAAGATACAAGTGATACAATTAATG 180

61 N K R L D E W I T T D R I N L D K E V L 80 181 AACAAGCGTTTAGATGAGTGGATTACCACTGACAGAATAAACCTGGATAAAGAAGTACTA 240 181 TTGTTCGCAAATCTACTCACCTAATGGTGACTGTCTTATTTGGACCTATTTCTTCATGAT 240

81 Y P K L K A T D E D N K K Q K K K K A T 100 241 TATCCGAAACTAAAGGCTACTGATGAAGATAATAAGAAACAAAAAAAGAAGAAGGCAACA 300 241 ATAGGCTTTGATTTCCGATGACTACTTCTATTATTCTTTGTTTTTTTCTTCTTCCGTTGT 300

101 N T S E T P Q D S L Q D G V D G F S R E 120 301 AATACTAGTGAAACGCCACAAGACTCTCTGCAAGATGGTGTAGATGGTTTCTCAAGAGAA 360 301 TTATGATCACTTTGCGGTGTTCTGAGAGACGTTCTACCACATCTACCAAAGAGTTCTCTT 360

121 N T D V M D L D N L N V Q G I K D E N I 140 361 AATACGGATGTTATGGACTTAGATAATCTAAACGTACAGGGAATAAAAGATGAGAACATA 420 361 TTATGCCTACAATACCTGAATCTATTAGATTTGCATGTCCCTTATTTTCTACTCTTGTAT 420

141 S H E D E I K K L R T S G S M T Q N P H 160 421 TCACACGAGGATGAGATAAAAAAGCTGAGAACCTCCGGCTCTATGACACAAAATCCACAT 480 421 AGTGTGCTCCTACTCTATTTTTTCGACTCTTGGAGGCCGAGATACTGTGTTTTAGGTGTA 480

161 E V A R V R N L N R I I M G K Y E I E P 180 481 GAGGTGGCTCGAGTTAGAAATCTCAATCGAATCATTATGGGGAAATATGAAATAGAACCA 540 481 CTCCACCGAGCTCAATCTTTAGAGTTAGCTTAGTAATACCCCTTTATACTTTATCTTGGT 540

181 W Y F S P Y P I E L T D E D F I Y I D D 200 541 TGGTACTTTTCTCCATATCCTATTGAATTAACTGATGAAGATTTTATATATATCGACGAT 600 541 ACCATGAAAAGAGGTATAGGATAACTTAATTGACTACTTCTAAAATATATATAGCTGCTA 600

201 F T L Q Y F G S K K Q Y E R Y R K K C T 220 601 TTCACGTTGCAGTATTTTGGATCTAAGAAACAATACGAACGCTACAGGAAGAAATGTACC 660 601 AAGTGCAACGTCATAAAACCTAGATTCTTTGTTATGCTTGCGATGTCCTTCTTTACATGG 660

221 L R H P P G N E I Y R D D Y V S F F E I 240 661 TTAAGACATCCGCCAGGAAATGAAATCTACAGAGACGATTATGTTTCATTCTTTGAAATC 720 661 AATTCTGTAGGCGGTCCTTTACTTTAGATGTCTCTGCTAATACAAAGTAAGAAACTTTAG 720

241 D G R K Q R T W C R N L C L L S K L F L 260 721 GATGGTAGAAAGCAAAGGACTTGGTGTCGAAACTTGTGTTTACTTTCCAAACTTTTCCTA 780 721 CTACCATCTTTCGTTTCCTGAACCACAGCTTTGAACACAAATGAAAGGTTTGAAAAGGAT 780

261 D H K T L Y Y D V D P F L F Y C M T R R 280 781 GATCACAAAACATTATACTATGACGTTGATCCGTTTTTGTTTTATTGCATGACGAGACGA 840 781 CTAGTGTTTTGTAATATGATACTGCAACTAGGCAAAAACAAAATAACGTACTGCTCTGCT 840

281 D E L G H H L V G Y F S K E K E S A D G 300 841 GATGAGTTGGGTCACCATCTGGTGGGATATTTTTCCAAGGAAAAAGAATCCGCGGATGGT 900 841 CTACTCAACCCAGTGGTAGACCACCCTATAAAAAGGTTCCTTTTTCTTAGGCGCCTACCA 900

301 Y N V A C I L T L P Q Y Q R M G Y G K L 320 901 TACAATGTTGCATGTATCTTAACACTACCACAATACCAAAGGATGGGATATGGTAAGTTA 960 901 ATGTTACAACGTACATAGAATTGTGATGGTGTTATGGTTTCCTACCCTATACCATTCAAT 960

321 L I E F S Y E L S K K E N K V G S P E K 340 961 TTGATTGAATTTTCGTATGAATTGTCGAAAAAGGAAAACAAAGTTGGTTCTCCCGAGAAA 1020 961 AACTAACTTAAAAGCATACTTAACAGCTTTTTCCTTTTGTTTCAACCAAGAGGGCTCTTT 1020

341 P L S D L G L L S Y R A Y W S D T L I T 360 1021 CCTTTGTCTGATTTGGGTCTCTTATCCTATAGAGCCTATTGGTCGGACACTCTCATAACG 1080 1021 GGAAACAGACTAAACCCAGAGAATAGGATATCTCGGATAACCAGCCTGTGAGAGTATTGC 1080

361 L L V E H Q K E I T I D E I S S M T S M 380 1081 CTATTAGTGGAACACCAGAAGGAAATTACTATAGACGAAATAAGCTCCATGACTTCGATG 1140 1081 GATAATCACCTTGTGGTCTTCCTTTAATGATATCTGCTTTATTCGAGGTACTGAAGCTAC 1140

381 T T T D I L H T A K T L N I L R Y Y K G 400 1141 ACCACTACAGATATATTACACACAGCAAAGACACTGAATATCCTGCGATATTACAAGGGT 1200 1141 TGGTGATGTCTATATAATGTGTGTCGTTTCTGTGACTTATAGGACGCTATAATGTTCCCA 1200

401 Q H I I F L N E D I L D R Y N R L K A K 420 1201 CAGCATATTATTTTCCTGAATGAAGATATTTTAGATAGGTACAATCGACTTAAAGCCAAA 1260 1201 GTCGTATAATAAAAGGACTTACTTCTATAAAATCTATCCATGTTAGCTGAATTTCGGTTT 1260

421 K R R T I D P N R L I W K P P V F T A S 440 1261 AAGAGAAGGACAATAGACCCTAATAGACTCATATGGAAACCACCGGTATTTACTGCCTCT 1320 1261 TTCTCTTCCTGTTATCTGGGATTATCTGAGTATACCTTTGGTGGCCATAAATGACGGAGA 1320

441 Q L R F A W 446 1321 CAGTTACGCTTTGCCTGG 1338 1321 GTCAATGCGAAACGGACC 1380

Yng2 (part of Piccolo NuA4 complex)

1 M D P S L V L E Q T I Q D V S N L P S E 20 1 ATGGATCCAAGTTTAGTTTTAGAGCAAACGATACAAGATGTGTCCAACCTCCCATCAGAA 60 1 TACCTAGGTTCAAATCAAAATCTCGTTTGCTATGTTCTACACAGGTTGGAGGGTAGTCTT 60

21 F R Y L L E E I G S N D L K L I E E K K 40 61 TTTCGTTACCTCTTAGAGGAGATCGGTTCAAATGATTTGAAGCTCATCGAAGAAAAAAAG 120 61 AAAGCAATGGAGAATCTCCTCTAGCCAAGTTTACTAAACTTCGAGTAGCTTCTTTTTTTC 120

41 K Y E Q K E S Q I H K F I R Q Q G S I P 60 121 AAATACGAGCAAAAAGAATCACAAATACACAAATTTATAAGACAGCAAGGCTCAATACCG 180 121 TTTATGCTCGTTTTTCTTAGTGTTTATGTGTTTAAATATTCTGTCGTTCCGAGTTATGGC 180

61 K H P Q E D G L D K E I K E S L L K C Q 80 181 AAACATCCACAGGAAGATGGGCTTGACAAAGAAATAAAAGAATCACTTTTGAAATGTCAG 240 181 TTTGTAGGTGTCCTTCTACCCGAACTGTTTCTTTATTTTCTTAGTGAAAACTTTACAGTC 240

81 S L Q R E K C V L A N T A L F L I A R H 100 241 TCTTTGCAAAGAGAAAAATGCGTTCTGGCGAACACTGCCTTGTTTCTAATTGCTAGACAC 300 241 AGAAACGTTTCTCTTTTTACGCAAGACCGCTTGTGACGGAACAAAGATTAACGATCTGTG 300

101 L N K L E K N I A L L E E D G V L A P V 120 301 TTGAATAAGTTGGAAAAAAACATCGCTTTATTGGAGGAAGATGGTGTTCTAGCCCCCGTG 360 301 AACTTATTCAACCTTTTTTTGTAGCGAAATAACCTCCTTCTACCACAAGATCGGGGGCAC 360

121 E E D G D M D S A A E A S R E S S V V S 140 361 GAAGAAGATGGAGACATGGATAGCGCTGCTGAAGCCTCTAGAGAAAGTTCAGTTGTGAGT 420 361 CTTCTTCTACCTCTGTACCTATCGCGACGACTTCGGAGATCTCTTTCAAGTCAACACTCA 420

141 N S S V K K R R A A S S S G S V P P T L 160 421 AACAGTAGCGTGAAAAAGAGAAGAGCTGCATCAAGCTCAGGATCCGTTCCACCCACTTTG 480 421 TTGTCATCGCACTTTTTCTCTTCTCGACGTAGTTCGAGTCCTAGGCAAGGTGGGTGAAAC 480

161 K K K K T S R T S K L Q N E I D V S S R 180 481 AAAAAGAAAAAAACTAGTCGAACATCTAAACTGCAAAATGAAATTGACGTTTCTTCAAGA 540 481 TTTTTCTTTTTTTGATCAGCTTGTAGATTTGACGTTTTACTTTAACTGCAAAGAAGTTCT 540

181 E K S V T P V S P S I E K K I A R T K E 200 541 GAAAAGTCTGTTACTCCAGTGAGCCCAAGCATTGAAAAGAAGATTGCAAGAACAAAAGAA 600 541 CTTTTCAGACAATGAGGTCACTCGGGTTCGTAACTTTTCTTCTAACGTTCTTGTTTTCTT 600

201 F K N S R N G K G Q N G S P E N E E E D 220 601 TTCAAAAACAGTAGAAATGGTAAAGGCCAAAACGGTTCCCCTGAAAACGAGGAAGAGGAC 660 601 AAGTTTTTGTCATCTTTACCATTTCCGGTTTTGCCAAGGGGACTTTTGCTCCTTCTCCTG 660

221 K T L Y C F C Q R V S F G E M V A C D G 240 661 AAAACTTTATACTGCTTCTGTCAAAGAGTTTCGTTTGGAGAAATGGTTGCATGTGATGGA 720 661 TTTTGAAATATGACGAAGACAGTTTCTCAAAGCAAACCTCTTTACCAACGTACACTACCT 720

241 P N C K Y E W F H Y D C V N L K E P P K 260 721 CCCAACTGTAAATATGAATGGTTTCATTATGATTGTGTAAATTTAAAAGAACCTCCGAAA 780 721 GGGTTGACATTTATACTTACCAAAGTAATACTAACACATTTAAATTTTCTTGGAGGCTTT 780

261 G T W Y C P E C K I E M E K N K L K R K 280 781 GGAACATGGTACTGTCCCGAATGTAAAATTGAGATGGAAAAAAACAAACTGAAAAGAAAA 840 781 CCTTGTACCATGACAGGGCTTACATTTTAACTCTACCTTTTTTTGTTTGACTTTTCTTTT 840

281 R N * 282 841 CGTAACTGA 849 841 GCATTGACT 900

HST tagged LSD1

1 M G S S H H H H H H H H H H S S G S G G 20 1 ATGGGCAGCAGCCATCATCATCATCATCATCATCATCATCACAGCAGCGGATCTGGTGGT 60 1 TACCCGTCGTCGGTAGTAGTAGTAGTAGTAGTAGTAGTAGTGTCGTCGCCTAGACCACCA 60

21 G G G E N L Y F Q G S P S G I N G V E G 40 61 GGTGGTGGTGAAAACCTGTACTTCCAGGGATCTCCATCAGGTATCAACGGTGTAGAAGGG 120 61 CCACCACCACTTTTGGACATGAAGGTCCCTAGAGGTAGTCCATAGTTGCCACATCTTCCC 120

41 A A F Q S R L P H D R M T S Q E A A C F 60 121 GCCGCCTTCCAGAGCCGTCTCCCACATGACCGTATGACATCACAGGAAGCTGCATGCTTC 180 121 CGGCGGAAGGTCTCGGCAGAGGGTGTACTGGCATACTGTAGTGTCCTTCGACGTACGAAG 180

61 P D I I N G P Q H T Q K V F L Y I R N R 80 181 CCTGACATAATTAATGGACCACAACATACTCAGAAAGTGTTCTTGTATATTCGAAACCGC 240 181 GGACTGTATTAATTACCTGGTGTTGTATGAGTCTTTCACAAGAACATATAAGCTTTGGCG 240

81 T L Q L W L D N P K V Q L T F E A T V Q 100 241 ACACTTCAGCTGTGGTTGGACAACCCCAAAGTCCAGCTCACCTTTGAGGCCACAGTGCAG 300 241 TGTGAAGTCGACACCAACCTGTTGGGGTTTCAGGTCGAGTGGAAACTCCGGTGTCACGTC 300

101 Q L E A P Y N S D A V L V H R I H S Y L 120 301 CAGCTAGAAGCTCCATACAACAGTGATGCCGTCCTGGTCCACCGGATACACAGCTATTTA 360 301 GTCGATCTTCGAGGTATGTTGTCACTACGGCAGGACCAGGTGGCCTATGTGTCGATAAAT 360

121 E R H G F I N F G I Y K R V K P L P T K 140 361 GAGAGACACGGGTTTATCAACTTTGGCATCTACAAGAGAGTAAAGCCTCTGCCAACAAAG 420 361 CTCTCTGTGCCCAAATAGTTGAAACCGTAGATGTTCTCTCATTTCGGAGACGGTTGTTTC 420

141 K T G K V I V I G A G V S G L A A A R Q 160 421 AAGACTGGAAAGGTGATCGTGATTGGTGCCGGGGTGTCAGGGCTGGCCGCAGCAAGACAA 480 421 TTCTGACCTTTCCACTAGCACTAACCACGGCCCCACAGTCCCGACCGGCGTCGTTCTGTT 480

161 L Q S F G M D V T V L E S R D R V G G R 180 481 CTGCAGAGTTTTGGGATGGATGTGACTGTTCTGGAGTCAAGAGATCGAGTTGGTGGACGA 540 481 GACGTCTCAAAACCCTACCTACACTGACAAGACCTCAGTTCTCTAGCTCAACCACCTGCT 540

181 V A T F R K G N Y V A D L G A M V V T G 200 541 GTGGCCACTTTCAGAAAAGGGAACTATGTAGCTGACTTGGGTGCTATGGTGGTGACTGGC 600 541 CACCGGTGAAAGTCTTTTCCCTTGATACATCGACTGAACCCACGATACCACCACTGACCG 600

201 L G G N P M A V V S K Q V N M E L A K I 220 601 CTGGGTGGAAACCCTATGGCTGTGGTCAGCAAACAGGTCAATATGGAGCTGGCTAAGATA 660 601 GACCCACCTTTGGGATACCGACACCAGTCGTTTGTCCAGTTATACCTCGACCGATTCTAT 660

221 K Q K C P L Y E A N G Q A G E R C T S V 240 661 AAACAGAAGTGTCCCCTCTATGAGGCCAATGGACAGGCTGGAGAGCGCTGCACAAGTGTC 720 661 TTTGTCTTCACAGGGGAGATACTCCGGTTACCTGTCCGACCTCTCGCGACGTGTTCACAG 720

241 P K E K D E M V E Q E F N R L L E A T S 260 721 CCTAAAGAGAAGGATGAGATGGTAGAGCAAGAGTTTAATCGCCTGTTAGAGGCAACATCA 780 721 GGATTTCTCTTCCTACTCTACCATCTCGTTCTCAAATTAGCGGACAATCTCCGTTGTAGT 780

261 Y L S H Q L D F N F L N N K P V S L G Q 280 781 TACCTCAGCCACCAGCTTGACTTCAACTTTCTTAACAACAAACCTGTGTCTCTGGGACAA 840 781 ATGGAGTCGGTGGTCGAACTGAAGTTGAAAGAATTGTTGTTTGGACACAGAGACCCTGTT 840

281 A L E V V I Q L Q E K H V K D E Q I E H 300 841 GCACTGGAAGTGGTCATACAGCTACAGGAAAAACATGTGAAAGATGAACAGATCGAACAC 900 841 CGTGACCTTCACCAGTATGTCGATGTCCTTTTTGTACACTTTCTACTTGTCTAGCTTGTG 900

301 W K K I V K T Q E E L K D L L N K M V T 320 901 TGGAAGAAGATTGTGAAGACTCAAGAAGAGCTCAAAGACCTACTGAACAAGATGGTGACT 960 901 ACCTTCTTCTAACACTTCTGAGTTCTTCTCGAGTTTCTGGATGACTTGTTCTACCACTGA 960

321 T K E K V K E L H Q Q Y K E A S E V K P 340 961 ACTAAGGAGAAAGTGAAGGAGCTTCATCAGCAGTACAAAGAGGCCAGTGAAGTAAAGCCA 1020 961 TGATTCCTCTTTCACTTCCTCGAAGTAGTCGTCATGTTTCTCCGGTCACTTCATTTCGGT 1020

341 P R D I T A E F L V K S K H R D L T A L 360 1021 CCCAGAGACATCACAGCTGAGTTCCTGGTGAAGAGCAAACACAGAGACCTCACAGCACTC 1080 1021 GGGTCTCTGTAGTGTCGACTCAAGGACCACTTCTCGTTTGTGTCTCTGGAGTGTCGTGAG 1080

361 C K E Y D E L V E M Q V K L E E R L Q E 380 1081 TGCAAGGAGTATGATGAGCTGGTGGAGATGCAGGTTAAACTAGAGGAGAGACTGCAGGAG 1140 1081 ACGTTCCTCATACTACTCGACCACCTCTACGTCCAATTTGATCTCCTCTCTGACGTCCTC 1140

381 L E A N P P S D V Y L S S R D R Q I L D 400 1141 CTGGAGGCCAATCCACCCAGTGATGTGTATCTGTCTTCAAGAGACCGTCAGATTCTTGAC 1200 1141 GACCTCCGGTTAGGTGGGTCACTACACATAGACAGAAGTTCTCTGGCAGTCTAAGAACTG 1200

401 W H F A N L E F A N A T P L S T L S L K 420 1201 TGGCACTTTGCAAACCTGGAGTTTGCCAATGCCACACCGCTGTCAACACTCTCACTCAAA 1260 1201 ACCGTGAAACGTTTGGACCTCAAACGGTTACGGTGTGGCGACAGTTGTGAGAGTGAGTTT 1260

421 H W D Q D D D F E F T G S H L T V R N G 440 1261 CACTGGGATCAGGATGATGATTTTGAGTTTACGGGCAGTCATCTGACTGTGCGGAACGGG 1320 1261 GTGACCCTAGTCCTACTACTAAAACTCAAATGCCCGTCAGTAGACTGACACGCCTTGCCC 1320

441 Y S C V P V A L A E G L D I K L N T A V 460 1321 TACTCGTGTGTCCCGGTGGCCCTTGCCGAGGGTTTGGATATCAAACTCAACACTGCAGTA 1380 1321 ATGAGCACACAGGGCCACCGGGAACGGCTCCCAAACCTATAGTTTGAGTTGTGACGTCAT 1380

461 R Q V R Y T S S G C E V I A V N T R S T 480 1381 AGACAGGTCCGATACACATCATCTGGTTGCGAGGTGATTGCGGTGAATACTCGCTCCACT 1440 1381 TCTGTCCAGGCTATGTGTAGTAGACCAACGCTCCACTAACGCCACTTATGAGCGAGGTGA 1440

481 T Q T F I Y K C D A V L C T L P L G V M 500 1441 ACCCAGACATTCATTTATAAATGTGACGCCGTGTTGTGTACCCTACCTTTGGGAGTGATG 1500 1441 TGGGTCTGTAAGTAAATATTTACACTGCGGCACAACACATGGGATGGAAACCCTCACTAC 1500

501 K Q Q P P A V Q F V P P L P E W K T A A 520 1501 AAACAGCAGCCGCCGGCCGTGCAGTTTGTCCCTCCTCTGCCAGAATGGAAGACGGCTGCC 1560 1501 TTTGTCGTCGGCGGCCGGCACGTCAAACAGGGAGGAGACGGTCTTACCTTCTGCCGACGG 1560

521 I Q R M G F G N L N K V V L C F D R V F 540 1561 ATCCAGAGAATGGGCTTTGGAAATCTCAACAAGGTGGTGTTGTGTTTTGATAGAGTATTC 1620 1561 TAGGTCTCTTACCCGAAACCTTTAGAGTTGTTCCACCACAACACAAAACTATCTCATAAG 1620

541 W D P S V N L F G H V G S T T A S R G E 560 1621 TGGGATCCCAGTGTTAACCTCTTTGGCCATGTGGGAAGCACGACAGCCAGTCGAGGAGAA 1680 1621 ACCCTAGGGTCACAATTGGAGAAACCGGTACACCCTTCGTGCTGTCGGTCAGCTCCTCTT 1680

561 L F L F W N L Y K A P I L L A L M A G E 580 1681 CTGTTTCTCTTCTGGAACCTCTACAAAGCCCCCATACTGTTAGCTCTGATGGCAGGAGAG 1740 1681 GACAAAGAGAAGACCTTGGAGATGTTTCGGGGGTATGACAATCGAGACTACCGTCCTCTC 1740

581 A A G I M E N I S D D V I V G R C L A I 600 1741 GCTGCAGGCATTATGGAGAATATCAGCGATGATGTCATCGTTGGTCGTTGTCTGGCCATC 1800 1741 CGACGTCCGTAATACCTCTTATAGTCGCTACTACAGTAGCAACCAGCAACAGACCGGTAG 1800

601 L K G I F G S S A V P Q P K E T V V S R 620 1801 CTCAAGGGCATTTTTGGAAGCAGCGCAGTCCCTCAGCCGAAGGAGACGGTGGTGAGCCGC 1860 1801 GAGTTCCCGTAAAAACCTTCGTCGCGTCAGGGAGTCGGCTTCCTCTGCCACCACTCGGCG 1860

621 W R A D P W A R G S Y S Y V A A G S S G 640 1861 TGGCGTGCAGACCCCTGGGCTCGAGGCTCATACTCGTACGTCGCAGCCGGCTCTTCTGGT 1920 1861 ACCGCACGTCTGGGGACCCGAGCTCCGAGTATGAGCATGCAGCGTCGGCCGAGAAGACCA 1920

641 N D Y D L M A Q P I T P G P A I P G A S 660 1921 AATGACTATGATCTCATGGCTCAGCCTATCACACCTGGTCCTGCCATACCTGGAGCCTCA 1980 1921 TTACTGATACTAGAGTACCGAGTCGGATAGTGTGGACCAGGACGGTATGGACCTCGGAGT 1980

661 Q P V P R L F F A G E H T I R N Y P A T 680 1981 CAGCCTGTTCCACGTCTGTTCTTCGCTGGTGAACACACGATCAGAAACTATCCTGCCACA 2040 1981 GTCGGACAAGGTGCAGACAAGAAGCGACCACTTGTGTGCTAGTCTTTGATAGGACGGTGT 2040

681 V H G A L L S G L R E A G R I A D Q F L 700 2041 GTACATGGTGCTTTGCTGAGTGGCCTGAGGGAGGCCGGGAGGATAGCCGATCAGTTCCTG 2100 2041 CATGTACCACGAAACGACTCACCGGACTCCCTCCGGCCCTCCTATCGGCTAGTCAAGGAC 2100

701 G A M Y T M P R Q A T A N P N P Q P S P 720 2101 GGAGCCATGTATACTATGCCACGCCAGGCCACAGCCAATCCAAACCCACAGCCCTCCCCC 2160 2101 CCTCGGTACATATGATACGGTGCGGTCCGGTGTCGGTTAGGTTTGGGTGTCGGGAGGGGG 2160

721 S I Q 723 2161 AGCATCCAA 2169 2161 TCGTAGGTT 2220
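The length of the poly-histidine tag on a construct such as this one can be read directly from its protein sequence. The following is a minimal Python sketch, not part of the original work, that reports the longest uninterrupted histidine run; the helper name `longest_his_run` is illustrative.

```python
import re


def longest_his_run(protein):
    """Length of the longest uninterrupted run of histidine (H) residues."""
    runs = re.findall(r"H+", protein.upper())
    return max((len(run) for run in runs), default=0)


# First 20 residues of the HST tagged LSD1 listing above.
n_terminus = "MGSSHHHHHHHHHHSSGSGG"
print(longest_his_run(n_terminus))
```

Applied to the N-terminal residues listed above, the function counts the ten consecutive histidines of the HST tag.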

CoREST

1 M P A I L A E K P M H V K K E A Q G L A 20 1 ATGCCAGCAATTCTTGCAGAGAAGCCAATGCATGTAAAGAAGGAAGCCCAGGGTCTTGCA 60 1 TACGGTCGTTAAGAACGTCTCTTCGGTTACGTACATTTCTTCCTTCGGGTCCCAGAACGT 60

21 G R N L N R A K K K P P K G M Y L S A D 40 61 GGGAGGAACCTGAACAGGGCTAAGAAAAAACCTCCCAAGGGAATGTATTTAAGTGCTGAT 120 61 CCCTCCTTGGACTTGTCCCGATTCTTTTTTGGAGGGTTCCCTTACATAAATTCACGACTA 120

41 D V T A M S S S G P A A V S V L R G L D 60 121 GATGTGACTGCAATGTCCAGCAGTGGCCCGGCTGCAGTTAGTGTGCTGAGAGGGCTGGAT 180 121 CTACACTGACGTTACAGGTCGTCACCGGGCCGACGTCAATCACACGACTCTCCCGACCTA 180

61 M E L I A I K R Q I Q S I K Q H N S A L 80 181 ATGGAGCTCATTGCCATAAAACGTCAGATTCAGAGCATTAAACAGCACAATAGTGCTCTC 240 181 TACCTCGAGTAACGGTATTTTGCAGTCTAAGTCTCGTAATTTGTCGTGTTATCACGAGAG 240

81 R E K L D T G V D E F R P S E S N Q K F 100 241 AGAGAAAAGCTGGACACTGGTGTGGATGAGTTCAGGCCATCTGAGTCGAACCAGAAATTT 300 241 TCTCTTTTCGACCTGTGACCACACCTACTCAAGTCCGGTAGACTCAGCTTGGTCTTTAAA 300

101 N T R W T T E E Q L L A V Q A I R K Y G 120 301 AATACCCGCTGGACCACAGAAGAGCAGCTGCTTGCAGTGCAAGCTATAAGGAAGTACGGG 360 301 TTATGGGCGACCTGGTGTCTTCTCGTCGACGAACGTCACGTTCGATATTCCTTCATGCCC 360

121 R D F Q A I S D V I G N K S V V Q V K N 140 361 CGGGATTTCCAGGCCATTTCTGATGTGATCGGCAACAAGTCTGTGGTGCAAGTCAAAAAC 420 361 GCCCTAAAGGTCCGGTAAAGACTACACTAGCCGTTGTTCAGACACCACGTTCAGTTTTTG 420

141 F F V N Y R R R F N L D E V L Q E W E A 160 421 TTTTTTGTGAATTACCGGCGTCGCTTCAACCTGGACGAGGTGCTGCAGGAGTGGGAAGCA 480 421 AAAAAACACTTAATGGCCGCAGCGAAGTTGGACCTGCTCCACGACGTCCTCACCCTTCGT 480

161 E H G V E E R K G L D E E K M E V S S E 180 481 GAGCACGGTGTGGAGGAACGCAAGGGTTTAGATGAAGAGAAGATGGAGGTGTCATCTGAA 540 481 CTCGTGCCACACCTCCTTGCGTTCCCAAATCTACTTCTCTTCTACCTCCACAGTAGACTT 540

181 D G T T S G P A E N Q T E E Q T P M E T 200 541 GACGGCACAACCTCAGGGCCTGCAGAAAACCAAACAGAGGAACAAACGCCAATGGAAACA 600 541 CTGCCGTGTTGGAGTCCCGGACGTCTTTTGGTTTGTCTCCTTGTTTGCGGTTACCTTTGT 600

201 Q N P S V S 206 601 CAGAACCCTTCGGTTTCC 618 601 GTCTTGGGAAGCCAAAGG 660

DHFR

1 M I S L I A A L A V D R V I G M E N A M 20 1 ATGATCAGTCTGATTGCGGCGCTAGCGGTAGATCGCGTTATCGGCATGGAAAACGCCATG 60 1 TACTAGTCAGACTAACGCCGCGATCGCCATCTAGCGCAATAGCCGTACCTTTTGCGGTAC 60

21 P W N L P A D L A W F K R N T L N K P V 40 61 CCATGGAACCTGCCTGCCGATCTCGCCTGGTTTAAACGCAACACCTTAAATAAACCCGTG 120 61 GGTACCTTGGACGGACGGCTAGAGCGGACCAAATTTGCGTTGTGGAATTTATTTGGGCAC 120

41 I M G R H T W E S I G R P L P G R K N I 60 121 ATTATGGGGCGCCATACCTGGGAATCAATCGGTAGGCCTTTGCCCGGCCGCAAAAATATT 180 121 TAATACCCCGCGGTATGGACCCTTAGTTAGCCATCCGGAAACGGGCCGGCGTTTTTATAA 180

61 I L S S Q P G T D D R V T W V K S V D E 80 181 ATCCTCAGCAGTCAACCCGGGACCGATGATCGGGTTACCTGGGTTAAATCGGTCGACGAA 240 181 TAGGAGTCGTCAGTTGGGCCCTGGCTACTAGCCCAATGGACCCAATTTAGCCAGCTGCTT 240

81 A I A A C G D V P E I M V I G G G R V Y 100 241 GCCATCGCGGCGTGTGGTGACGTACCAGAAATCATGGTGATTGGCGGCGGACGCGTTTAT 300 241 CGGTAGCGCCGCACACCACTGCATGGTCTTTAGTACCACTAACCGCCGCCTGCGCAAATA 300

101 E Q F L P K A Q K L Y L T H I D A E V E 120 301 GAACAGTTCTTGCCAAAAGCGCAAAAACTGTATCTGACGCATATCGATGCAGAAGTGGAA 360 301 CTTGTCAAGAACGGTTTTCGCGTTTTTGACATAGACTGCGTATAGCTACGTCTTCACCTT 360

121 G D T H F P D Y E P D D W E S V F S E F 140 361 GGCGACACCCATTTTCCGGATTACGAGCCGGATGACTGGGAATCGGTATTCAGCGAATTC 420 361 CCGCTGTGGGTAAAAGGCCTAATGCTCGGCCTACTGACCCTTAGCCATAAGTCGCTTAAG 420

141 H D A D A Q N S H S Y C F E I L E R R 159 421 CACGATGCTGATGCGCAGAACTCGCATAGCTATTGTTTCGAAATCCTCGAGCGTCGT 477 421 GTGCTACGACTACGCGTCTTGAGCGTATCGATAACAAAGCTTTAGGAGCTCGCAGCA 480

EGFP

1 M S K G E E L F T G V V P I L V E L D G 20 1 ATGTCTAAAGGTGAAGAATTATTCACTGGCGTTGTCCCAATTTTGGTTGAATTAGATGGT 60 1 TACAGATTTCCACTTCTTAATAAGTGACCGCAACAGGGTTAAAACCAACTTAATCTACCA 60

21 D V N G H K F S V S G E G E G D A T Y G 40 61 GATGTTAATGGTCACAAATTTTCTGTCTCCGGTGAAGGTGAAGGTGACGCTACTTACGGT 120 61 CTACAATTACCAGTGTTTAAAAGACAGAGGCCACTTCCACTTCCACTGCGATGAATGCCA 120

41 K L T L K F I C T T G K L P V P W P T L 60 121 AAATTGACCTTAAAATTTATTTGTACTACTGGTAAATTGCCAGTTCCATGGCCAACCTTA 180 121 TTTAACTGGAATTTTAAATAAACATGATGACCATTTAACGGTCAAGGTACCGGTTGGAAT 180

61 V T T F G Y G V Q C F A R Y P D H M K Q 80 181 GTCACTACTTTCGGTTATGGTGTTCAATGTTTTGCGAGATACCCAGATCACATGAAACAA 240 181 CAGTGATGAAAGCCAATACCACAAGTTACAAAACGCTCTATGGGTCTAGTGTACTTTGTT 240

81 H D F F K S A M P E G Y V Q E R T I F F 100 241 CATGACTTTTTCAAGTCTGCCATGCCAGAAGGTTATGTTCAAGAAAGAACTATTTTTTTC 300 241 GTACTGAAAAAGTTCAGACGGTACGGTCTTCCAATACAAGTTCTTTCTTGATAAAAAAAG 300

101 K D D G N Y K T R A E V K F E G D T L V 120 301 AAAGATGACGGTAACTACAAGACCAGAGCTGAAGTCAAGTTTGAAGGTGATACCTTAGTT 360 301 TTTCTACTGCCATTGATGTTCTGGTCTCGACTTCAGTTCAAACTTCCACTATGGAATCAA 360

121 N R I E L K G I D F K E D G N I L G H K 140 361 AATAGAATCGAATTAAAAGGTATTGATTTTAAAGAAGATGGTAACATTTTAGGTCACAAA 420 361 TTATCTTAGCTTAATTTTCCATAACTAAAATTTCTTCTACCATTGTAAAATCCAGTGTTT 420

141 L E Y N Y N S H N V Y I M A D K Q K N G 160 421 TTGGAATACAACTATAACTCTCACAATGTTTACATCATGGCTGACAAACAAAAGAATGGT 480 421 AACCTTATGTTGATATTGAGAGTGTTACAAATGTAGTACCGACTGTTTGTTTTCTTACCA 480

161 I K V N F K I R H N I E D G S V Q L A D 180 481 ATCAAAGTTAACTTCAAAATTAGACACAACATTGAAGATGGTTCTGTTCAATTAGCTGAC 540 481 TAGTTTCAATTGAAGTTTTAATCTGTGTTGTAACTTCTACCAAGACAAGTTAATCGACTG 540

181 H Y Q Q N T P I G D G P V L L P D N H Y 200 541 CATTATCAACAAAATACTCCAATTGGTGATGGTCCAGTCTTGTTACCAGACAACCATTAC 600 541 GTAATAGTTGTTTTATGAGGTTAACCACTACCAGGTCAGAACAATGGTCTGTTGGTAATG 600

201 L S T Q S A L S K D P N E K R D H M V L 220 601 TTATCCACTCAATCTGCCTTATCCAAAGATCCAAACGAAAAGAGAGACCACATGGTCTTG 660 601 AATAGGTGAGTTAGACGGAATAGGTTTCTAGGTTTGCTTTTCTCTCTGGTGTACCAGAAC 660

221 L E F V T A A G I T G S 232 661 TTAGAATTTGTTACTGCTGCTGGTATTACCGGATCC 696 661 AATCTTAAACAATGACGACGACCATAATGGCCTAGG 720

LYTAG

1 M L A D R W R K H T D G N W Y W F D N S 20 1 ATGCTTGCAGACCGCTGGAGGAAGCACACAGACGGCAACTGGTACTGGTTCGACAACTCA 60 1 TACGAACGTCTGGCGACCTCCTTCGTGTGTCTGCCGTTGACCATGACCAAGCTGTTGAGT 60

21 G E M A T G W K K I A D K W Y Y F N E E 40 61 GGCGAAATGGCTACAGGCTGGAAGAAAATCGCTGATAAGTGGTACTATTTCAACGAAGAA 120 61 CCGCTTTACCGATGTCCGACCTTCTTTTAGCGACTATTCACCATGATAAAGTTGCTTCTT 120

41 G A M K T G W V K Y K D T W Y Y L D A K 60 121 GGTGCCATGAAGACAGGCTGGGTCAAGTACAAGGACACTTGGTACTACTTAGACGCTAAA 180 121 CCACGGTACTTCTGTCCGACCCAGTTCATGTTCCTGTGAACCATGATGAATCTGCGATTT 180

61 E G A M V S N A F I Q S A D G T G W Y Y 80 181 GAAGGCGCCATGGTATCAAATGCCTTTATCCAGTCAGCGGACGGAACAGGCTGGTACTAC 240 181 CTTCCGCGGTACCATAGTTTACGGAAATAGGTCAGTCGCCTGCCTTGTCCGACCATGATG 240

81 L K P D G T L A D R P E F T V E P D G L 100 241 CTCAAACCAGACGGAACACTGGCAGACAGGCCAGAATTCACAGTAGAGCCAGATGGCTTG 300 241 GAGTTTGGTCTGCCTTGTGACCGTCTGTCCGGTCTTAAGTGTCATCTCGGTCTACCGAAC 300

101 I T V K L A E A A A K E A A A K E A A A 120 301 ATTACAGTAAAACTGGCAGAAGCGGCTGCTAAAGAGGCTGCCGCGAAGGAAGCAGCGGCG 360 301 TAATGTCATTTTGACCGTCTTCGCCGACGATTTCTCCGACGGCGCTTCCTTCGTCGCCGC 360

121 K E A A A K A A A G S D D D D K G R M L 140 361 AAAGAAGCCGCAGCAAAAGCGGCAGCGGGTTCTGATGACGATGACAAGGGTCGCATGCTC 420 361 TTTCTTCGGCGTCGTTTTCGCCGTCGCCCAAGACTACTGCTACTGTTCCCAGCGTACGAG 420

141 D I G I R R A E L G I R R P D L S S G P 160 421 GATATCGGGATCCGGCGCGCCGAGCTCGGAATTCGTCGACCAGATCTCTCGAGCGGGCCC 480 421 CTATAGCCCTAGGCCGCGCGGCTCGAGCCTTAAGCAGCTGGTCTAGAGAGCTCGCCCGGG 480

161 G W P V P S L I S * 169 481 GGGTGGCCGGTACCAAGCTTAATTAGCTGA 510 481 CCCACCGGCCATGGTTCGAATTAATCGACT 540

HBO1∆1 (*referred to only as HBO1 above)

1 M G S F R R A Q A R A S E D L E K L R L 20 1 ATGGGATCCTTCCGAAGAGCACAAGCCCGGGCTTCAGAGGATTTGGAGAAGTTAAGGCTG 60 1 TACCCTAGGAAGGCTTCTCGTGTTCGGGCCCGAAGTCTCCTAAACCTCTTCAATTCCGAC 60

21 Q G Q I T E G S N M I K T I A F G R Y E 40 61 CAAGGCCAAATCACAGAGGGAAGCAACATGATTAAAACAATTGCTTTTGGCCGCTATGAG 120 61 GTTCCGGTTTAGTGTCTCCCTTCGTTGTACTAATTTTGTTAACGAAAACCGGCGATACTC 120

41 L D T W Y H S P Y P E E Y A R L G R L Y 60 121 CTTGATACCTGGTACCATTCTCCATATCCTGAAGAATATGCACGGCTGGGACGTCTCTAT 180 121 GAACTATGGACCATGGTAAGAGGTATAGGACTTCTTATACGTGCCGACCCTGCAGAGATA 180

61 M C E F C L K Y M K S Q T I L R R H M A 80 181 ATGTGTGAATTCTGTTTAAAATATATGAAGAGCCAAACGATACTCCGCCGGCACATGGCC 240 181 TACACACTTAAGACAAATTTTATATACTTCTCGGTTTGCTATGAGGCGGCCGTGTACCGG 240

81 K C V W K H P P G D E I Y R K G S I S V 100 241 AAATGTGTGTGGAAACACCCACCTGGTGATGAGATATATCGCAAAGGTTCAATCTCTGTG 300 241 TTTACACACACCTTTGTGGGTGGACCACTACTCTATATAGCGTTTCCAAGTTAGAGACAC 300

101 F E V D G K K N K I Y C Q N L C L L A K 120 301 TTTGAAGTGGATGGCAAGAAAAACAAGATCTACTGCCAAAACCTGTGCCTGTTGGCCAAA 360 301 AAACTTCACCTACCGTTCTTTTTGTTCTAGATGACGGTTTTGGACACGGACAACCGGTTT 360

121 L F L D H K T L Y Y D V E P F L F Y V M 140 361 CTTTTTCTGGACCACAAGACATTATATTATGATGTGGAGCCCTTCCTGTTCTATGTTATG 420 361 GAAAAAGACCTGGTGTTCTGTAATATAATACTACACCTCGGGAAGGACAAGATACAATAC 420

141 T E A D N T G C H L I G Y F S K E K N S 160 421 ACAGAGGCGGACAACACTGGCTGTCACCTGATTGGATATTTTTCTAAGGAAAAGAATTCA 480 421 TGTCTCCGCCTGTTGTGACCGACAGTGGACTAACCTATAAAAAGATTCCTTTTCTTAAGT 480

161 F L N Y N V S C I L T M P Q Y M R Q G Y 180 481 TTCCTCAACTACAACGTCTCCTGTATCCTTACTATGCCTCAGTACATGAGACAGGGCTAT 540 481 AAGGAGTTGATGTTGCAGAGGACATAGGAATGATACGGAGTCATGTACTCTGTCCCGATA 540

181 G K M L I D F S Y L L S K V E E K V G S 200 541 GGCAAGATGCTTATTGATTTCAGTTATTTGCTTTCCAAAGTCGAAGAAAAAGTTGGCTCC 600 541 CCGTTCTACGAATAACTAAAGTCAATAAACGAAAGGTTTCAGCTTCTTTTTCAACCGAGG 600

201 P E R P L S D L G L I S Y R S Y W K E V 220 601 CCAGAACGTCCACTCTCAGATCTGGGGCTTATAAGCTATCGCAGTTACTGGAAAGAAGTA 660 601 GGTCTTGCAGGTGAGAGTCTAGACCCCGAATATTCGATAGCGTCAATGACCTTTCTTCAT 660

221 L L R Y L H N F Q G K E I S I K E I S Q 240 661 CTTCTCCGCTACCTGCATAATTTTCAAGGCAAAGAGATTTCTATCAAAGAAATCAGTCAG 720 661 GAAGAGGCGATGGACGTATTAAAAGTTCCGTTTCTCTAAAGATAGTTTCTTTAGTCAGTC 720

241 E T A V N P V D I V S T L Q A L Q M L K 260 721 GAGACGGCTGTGAATCCTGTGGACATTGTCAGCACTCTGCAGGCCCTTCAGATGCTCAAA 780 721 CTCTGCCGACACTTAGGACACCTGTAACAGTCGTGAGACGTCCGGGAAGTCTACGAGTTT 780

261 Y W K G K H L V L K R Q D L I D E W I A 280 781 TACTGGAAGGGAAAACACCTAGTTTTAAAGAGACAGGACCTGATTGATGAGTGGATAGCC 840 781 ATGACCTTCCCTTTTGTGGATCAAAATTTCTCTGTCCTGGACTAACTACTCACCTATCGG 840

281 K E A K R S N S N K T M D P S C L K W T 300 841 AAAGAGGCCAAAAGGTCCAACTCCAATAAAACCATGGACCCCAGCTGCTTAAAATGGACC 900 841 TTTCTCCGGTTTTCCAGGTTGAGGTTATTTTGGTACCTGGGGTCGACGAATTTTACCTGG 900

301 P P K G T 305 901 CCTCCCAAGGGCACT 915 901 GGAGGGTTCCCGTGA 960

JADE1

*Section highlighted in red was removed in JADE1∆1 truncation

1 M K R G R L P S S S E D S D D N G S L S 20 1 ATGAAACGAGGTCGCCTTCCCAGCAGCAGTGAGGATTCTGACGACAATGGCAGCCTGTCA 60 1 TACTTTGCTCCAGCGGAAGGGTCGTCGTCACTCCTAAGACTGCTGTTACCGTCGGACAGT 60

21 T T W S Q N S R S Q H R R S S C S R H E 40 61 ACTACTTGGTCCCAGAATTCCCGATCCCAGCATAGGAGAAGCTCCTGCTCCAGACATGAA 120 61 TGATGAACCAGGGTCTTAAGGGCTAGGGTCGTATCCTCTTCGAGGACGAGGTCTGTACTT 120

41 D R K P S E V F R T D L I T A M K L H D 60 121 GATCGAAAGCCTTCAGAGGTGTTTAGGACAGACCTGATCACTGCCATGAAGTTGCATGAC 180 121 CTAGCTTTCGGAAGTCTCCACAAATCCTGTCTGGACTAGTGACGGTACTTCAACGTACTG 180

61 S Y Q L N P D E Y Y V L A D P W R Q E W 80 181 TCCTACCAGCTGAATCCGGATGAGTACTATGTGTTGGCAGATCCCTGGAGACAGGAATGG 240 181 AGGATGGTCGACTTAGGCCTACTCATGATACACAACCGTCTAGGGACCTCTGTCCTTACC 240

81 E K G V Q V P V S P G T I P Q P V A R V 100 241 GAGAAAGGGGTCCAGGTGCCTGTGAGCCCGGGGACCATCCCTCAGCCTGTGGCCAGGGTT 300 241 CTCTTTCCCCAGGTCCACGGACACTCGGGCCCCTGGTAGGGAGTCGGACACCGGTCCCAA 300

101 V S E E K S L M F I R P K K Y I V S S G 120 301 GTGTCTGAAGAGAAATCCCTCATGTTCATCAGGCCCAAGAAGTACATCGTGTCATCAGGC 360 301 CACAGACTTCTCTTTAGGGAGTACAAGTAGTCCGGGTTCTTCATGTAGCACAGTAGTCCG 360

121 S E P P E L G Y V D I R T L A D S V C R 140 361 TCTGAGCCTCCCGAGTTGGGCTATGTGGACATCCGGACGCTGGCTGACAGCGTGTGTCGC 420 361 AGACTCGGAGGGCTCAACCCGATACACCTGTAGGCCTGCGACCGACTGTCGCACACAGCG 420

141 Y D L N D M D A A W L E L T N E E F K E 160 421 TATGACCTCAATGACATGGATGCTGCATGGCTGGAACTGACCAATGAAGAATTTAAGGAG 480 421 ATACTGGAGTTACTGTACCTACGACGTACCGACCTTGACTGGTTACTTCTTAAATTCCTC 480

161 M G M P E L D E Y T M E R V L E E F E Q 180 481 ATGGGAATGCCTGAACTAGATGAATACACCATGGAGAGGGTCCTAGAGGAATTTGAGCAG 540 481 TACCCTTACGGACTTGATCTACTTATGTGGTACCTCTCCCAGGATCTCCTTAAACTCGTC 540

181 R C Y D N M N H A I E T E E G L G I E Y 200 541 CGATGCTACGACAATATGAATCATGCCATAGAGACTGAGGAAGGCCTGGGGATCGAATAT 600 541 GCTACGATGCTGTTATACTTAGTACGGTATCTCTGACTCCTTCCGGACCCCTAGCTTATA 600

201 D E D V V C D V C Q S P D G E D G N E M 220 601 GATGAAGATGTTGTCTGTGATGTCTGCCAGTCTCCTGATGGTGAGGACGGCAATGAGATG 660 601 CTACTTCTACAACAGACACTACAGACGGTCAGAGGACTACCACTCCTGCCGTTACTCTAC 660

221 V F C D K C N I C V H Q A C Y G I L K V 240 661 GTGTTCTGTGACAAATGCAACATCTGTGTGCACCAGGCCTGTTATGGAATCCTCAAGGTA 720 661 CACAAGACACTGTTTACGTTGTAGACACACGTGGTCCGGACAATACCTTAGGAGTTCCAT 720

241 P E G S W L C R T C A L G V Q P K C L L 260 721 CCAGAGGGCAGCTGGCTGTGCCGGACATGTGCCCTGGGGGTTCAGCCAAAATGTCTGCTG 780 721 GGTCTCCCGTCGACCGACACGGCCTGTACACGGGACCCCCAAGTCGGTTTTACAGACGAC 780

261 C P K K G G A M K P T R S G T K W V H V 280 781 TGTCCGAAGAAGGGTGGAGCTATGAAGCCCACCCGTAGCGGAACCAAGTGGGTCCACGTT 840 781 ACAGGCTTCTTCCCACCTCGATACTTCGGGTGGGCATCGCCTTGGTTCACCCAGGTGCAA 840

281 S C A L W I P E V S I G S P E K M E P I 300 841 AGCTGTGCTCTGTGGATCCCTGAGGTGAGCATTGGCAGCCCAGAGAAGATGGAGCCCATC 900 841 TCGACACGAGACACCTAGGGACTCCACTCGTAACCGTCGGGTCTCTTCTACCTCGGGTAG 900

301 T K V S H I P S S R W A L V C S L C N E 320 901 ACCAAGGTGTCACACATTCCCAGCAGCCGGTGGGCGCTAGTGTGCAGCCTCTGCAATGAG 960 901 TGGTTCCACAGTGTGTAAGGGTCGTCGGCCACCCGCGATCACACGTCGGAGACGTTACTC 960

321 K F G A S I Q C S V K N C R T A F H V T 340 961 AAGTTTGGGGCCTCTATACAGTGCTCTGTGAAGAACTGCCGCACAGCCTTCCATGTGACC 1020 961 TTCAAACCCCGGAGATATGTCACGAGACACTTCTTGACGGCGTGTCGGAAGGTACACTGG 1020

341 C A F D R G L E M K T I L A E N D E V K 360 1021 TGTGCTTTTGACCGGGGCCTGGAGATGAAGACCATCTTAGCAGAGAATGATGAAGTCAAG 1080 1021 ACACGAAAACTGGCCCCGGACCTCTACTTCTGGTAGAATCGTCTCTTACTACTTCAGTTC 1080

361 F K S Y C P K H S S H R K P E E S L G K 380 1081 TTCAAGTCCTATTGCCCAAAGCACAGCTCACATAGGAAACCCGAGGAGAGTCTTGGCAAG 1140 1081 AAGTTCAGGATAACGGGTTTCGTGTCGAGTGTATCCTTTGGGCTCCTCTCAGAACCGTTC 1140

381 G A A Q E N G A P E C S P R N P L E P F 400 1141 GGGGCTGCACAGGAGAATGGGGCCCCTGAGTGTTCCCCCCGGAATCCGCTGGAGCCCTTT 1200 1141 CCCCGACGTGTCCTCTTACCCCGGGGACTCACAAGGGGGGCCTTAGGCGACCTCGGGAAA 1200

401 A S L E Q N R E E A H R V S V R K Q K L 420 1201 GCCAGCCTTGAGCAGAACCGGGAGGAGGCCCACCGGGTGAGTGTCCGTAAGCAGAAGCTG 1260 1201 CGGTCGGAACTCGTCTTGGCCCTCCTCCGGGTGGCCCACTCACAGGCATTCGTCTTCGAC 1260

421 Q Q L E D E F Y T F V N L L D V A R A L 440 1261 CAGCAGTTGGAGGATGAGTTCTACACCTTCGTCAACCTGCTGGATGTTGCCAGGGCTCTG 1320 1261 GTCGTCAACCTCCTACTCAAGATGTGGAAGCAGTTGGACGACCTACAACGGTCCCGAGAC 1320

441 R L P E E V V D F L Y Q Y W K L K R K V 460 1321 CGGCTGCCTGAGGAAGTAGTGGATTTCCTGTACCAGTACTGGAAGTTGAAGAGGAAGGTC 1380 1321 GCCGACGGACTCCTTCATCACCTAAAGGACATGGTCATGACCTTCAACTTCTCCTTCCAG 1380

461 N F N K P L I T P K K D E E D N L A K R 480 1381 AACTTCAACAAGCCCCTGATCACCCCAAAGAAAGATGAAGAGGACAATCTAGCCAAGCGG 1440 1381 TTGAAGTTGTTCGGGGACTAGTGGGGTTTCTTTCTACTTCTCCTGTTAGATCGGTTCGCC 1440

481 E Q D V L F R R L Q L F T H L R Q D L E 500 1441 GAGCAGGATGTCTTATTTAGGAGGCTGCAGCTGTTCACGCACCTGCGGCAGGACCTGGAG 1500 1441 CTCGTCCTACAGAATAAATCCTCCGACGTCGACAAGTGCGTGGACGCCGTCCTGGACCTC 1500

501 R V M I D T D T L * 509 1501 AGGGTAATGATTGACACTGACACCTTATAG 1530 1501 TCCCATTACTAACTGTGACTGTGGAATATC 1560

JADE1∆1

1 G S K Y I V S S G S E P P E L G Y V D I 20 1 GGATCCAAGTACATCGTGTCATCAGGCTCTGAGCCTCCCGAGTTGGGCTATGTGGACATC 60 1 CCTAGGTTCATGTAGCACAGTAGTCCGAGACTCGGAGGGCTCAACCCGATACACCTGTAG 60

21 R T L A D S V C R Y D L N D M D A A W L 40 61 CGGACGCTGGCTGACAGCGTGTGTCGCTATGACCTCAATGACATGGATGCTGCATGGCTG 120 61 GCCTGCGACCGACTGTCGCACACAGCGATACTGGAGTTACTGTACCTACGACGTACCGAC 120

41 E L T N E E F K E M G M P E L D E Y T M 60 121 GAACTGACCAATGAAGAATTTAAGGAGATGGGAATGCCTGAACTAGATGAATACACCATG 180 121 CTTGACTGGTTACTTCTTAAATTCCTCTACCCTTACGGACTTGATCTACTTATGTGGTAC 180

61 E R V L E E F E Q R C Y D N M N H A I E 80 181 GAGAGGGTCCTAGAGGAATTTGAGCAGCGATGCTACGACAATATGAATCATGCCATAGAG 240 181 CTCTCCCAGGATCTCCTTAAACTCGTCGCTACGATGCTGTTATACTTAGTACGGTATCTC 240

81 T E E G L G I E Y D E D V V C D V C Q S 100 241 ACTGAGGAAGGCCTGGGGATCGAATATGATGAAGATGTTGTCTGTGATGTCTGCCAGTCT 300 241 TGACTCCTTCCGGACCCCTAGCTTATACTACTTCTACAACAGACACTACAGACGGTCAGA 300

101 P D G E D G N E M V F C D K C N I C V H 120 301 CCTGATGGTGAGGACGGCAATGAGATGGTGTTCTGTGACAAATGCAACATCTGTGTGCAC 360 301 GGACTACCACTCCTGCCGTTACTCTACCACAAGACACTGTTTACGTTGTAGACACACGTG 360

121 Q A C Y G I L K V P E G S W L C R T C A 140 361 CAGGCCTGTTATGGAATCCTCAAAGTACCAGAGGGCAGCTGGCTGTGCCGGACATGTGCC 420 361 GTCCGGACAATACCTTAGGAGTTTCATGGTCTCCCGTCGACCGACACGGCCTGTACACGG 420

141 L G V Q P K C L L C P K K G G A M K P T 160 421 CTGGGGGTTCAGCCAAAATGTCTGCTGTGTCCGAAGAAGGGTGGAGCTATGAAGCCCACC 480 421 GACCCCCAAGTCGGTTTTACAGACGACACAGGCTTCTTCCCACCTCGATACTTCGGGTGG 480

161 R S G T K W V H V S C A L W I P E V S I 180 481 CGTAGCGGAACCAAGTGGGTCCACGTTAGCTGTGCTCTGTGGATCCCTGAGGTGAGCATT 540 481 GCATCGCCTTGGTTCACCCAGGTGCAATCGACACGAGACACCTAGGGACTCCACTCGTAA 540

181 G S P E K M E P I T K V S H I P S S R W 200 541 GGCAGCCCAGAGAAGATGGAGCCCATCACCAAGGTGTCACACATTCCCAGCAGCCGGTGG 600 541 CCGTCGGGTCTCTTCTACCTCGGGTAGTGGTTCCACAGTGTGTAAGGGTCGTCGGCCACC 600

201 A L V C S L C N E K F G A S I Q C S V K 220 601 GCGCTAGTGTGCAGCCTCTGCAATGAGAAGTTTGGGGCCTCTATACAGTGCTCTGTGAAG 660 601 CGCGATCACACGTCGGAGACGTTACTCTTCAAACCCCGGAGATATGTCACGAGACACTTC 660

221 N C R T A F H V T C A F D R G L E M K T 240 661 AACTGCCGCACAGCCTTCCATGTGACCTGTGCTTTTGACCGGGGCCTGGAGATGAAGACC 720 661 TTGACGGCGTGTCGGAAGGTACACTGGACACGAAAACTGGCCCCGGACCTCTACTTCTGG 720

241 I L A E N D E V K F K S Y C P K H S S H 260 721 ATCTTAGCAGAGAATGATGAAGTCAAGTTCAAGTCCTATTGCCCAAAGCACAGCTCACAT 780 721 TAGAATCGTCTCTTACTACTTCAGTTCAAGTTCAGGATAACGGGTTTCGTGTCGAGTGTA 780

261 R K P E E S L G K G A A Q E N G A P E C 280 781 AGGAAACCCGAGGAGAGTCTTGGCAAGGGGGCTGCACAGGAGAATGGGGCCCCTGAGTGT 840 781 TCCTTTGGGCTCCTCTCAGAACCGTTCCCCCGACGTGTCCTCTTACCCCGGGGACTCACA 840

281 S P R N P L E P F A S L E Q N R E E A H 300 841 TCCCCCCGGAATCCGCTGGAGCCCTTTGCCAGCCTTGAGCAGAACCGGGAGGAGGCCCAC 900 841 AGGGGGGCCTTAGGCGACCTCGGGAAACGGTCGGAACTCGTCTTGGCCCTCCTCCGGGTG 900

301 R V S V R K Q K L Q Q L E D E F Y T F V 320 901 CGGGTGAGTGTCCGTAAGCAGAAGCTGCAGCAGTTGGAGGATGAGTTCTACACCTTCGTC 960 901 GCCCACTCACAGGCATTCGTCTTCGACGTCGTCAACCTCCTACTCAAGATGTGGAAGCAG 960

321 N L L D V A R A L R L P E E V V D F L Y 340 961 AACCTGCTGGATGTTGCCAGGGCTCTGCGGCTGCCTGAGGAAGTAGTGGATTTCCTGTAC 1020 961 TTGGACGACCTACAACGGTCCCGAGACGCCGACGGACTCCTTCATCACCTAAAGGACATG 1020

341 Q Y W K L K R K V N F N K P L I T P K K 360 1021 CAGTACTGGAAGTTGAAGAGGAAGGTCAACTTCAACAAGCCCCTGATCACCCCAAAGAAA 1080 1021 GTCATGACCTTCAACTTCTCCTTCCAGTTGAAGTTGTTCGGGGACTAGTGGGGTTTCTTT 1080

361 D E E D N L A K R E Q D V L F R R L Q L 380 1081 GATGAAGAGGACAATCTAGCCAAGCGGGAGCAGGATGTCTTATTTAGGAGGCTGCAGCTG 1140 1081 CTACTTCTCCTGTTAGATCGGTTCGCCCTCGTCCTACAGAATAAATCCTCCGACGTCGAC 1140

381 F T H L R Q D L E R V M I D T D T L 398 1141 TTCACGCACCTGCGGCAGGACCTGGAGAGGGTAATGATTGACACTGACACCTTA 1194 1141 AAGTGCGTGGACGCCGTCCTGGACCTCTCCCATTACTAACTGTGACTGTGGAAT 1200
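Each block in the listings above gives the protein sequence, the sense strand, and the complementary strand. These three rows can be cross-checked with a short script. The following is a minimal Python sketch and was not part of the original workflow; the function names `translate` and `complement` are illustrative, and the input is the first 15 nucleotides of the JADE1∆1 listing.

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate a sense-strand DNA string (frame 1) into one-letter codes."""
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def complement(dna):
    """Return the base-by-base complement, written in the same direction."""
    return dna.translate(COMPLEMENT)


# First 15 nucleotides of the JADE1∆1 listing and the strand printed beneath them.
sense = "GGATCCAAGTACATC"
antisense_as_listed = "CCTAGGTTCATGTAG"

print(translate(sense))                           # GSKYI
print(complement(sense) == antisense_as_listed)   # True
```

The same two functions can be applied line by line to any of the listings to confirm that the protein row and the complementary row agree with the sense strand.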

Appendix B

Molecular Weights

6x HIS tagged yEpl1 41 896.0 Da

8x HIS tagged yEpl1 42 298.4 Da

10x HIS tagged yEpl1 42 572.6 Da

Yng2 21 030.0 Da

yEsa1 52 481.0 Da

6x HIS tagged LSD1 80 715.8 Da

8x HIS tagged LSD1 79 791.8 Da

10x HIS tagged LSD1 80 066.1 Da

CoREST 25 817.8 Da

HST/LYTAG tagged DHFR 34 663.4 Da

HST/LYTAG tagged EGFP 43 605.5 Da

6x HIS tagged JADE1 62 033.0 Da

6x HIS tagged JADE1∆1 49 136.4 Da

HBO1 35 609.2 Da

JADE1 58 383.0 Da

JADE1∆1 45 485.7 Da

SUMO/HST tagged HBO1 49 705.7 Da
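Molecular weights such as those listed above can be estimated from a protein sequence by summing average residue masses and adding one water. The sketch below is illustrative only: the thesis does not state which tool produced the listed values, the residue masses are standard average values, and the function name `average_mass` is an assumption. As a consistency check, the difference between the 8x and 10x HIS constructs in the table should be close to the mass of two histidine residues.

```python
# Average residue masses in Da (free amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # added once for the free N- and C-termini


def average_mass(protein):
    """Average molecular mass (Da) of a one-letter protein sequence."""
    seq = protein.rstrip("*")  # drop a trailing stop symbol if present
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


# Two extra histidines should account for the 8x -> 10x HIS difference.
two_his = 2 * RESIDUE_MASS["H"]
yepl1_diff = 42572.6 - 42298.4   # 10x minus 8x HIS tagged yEpl1, from the table
lsd1_diff = 80066.1 - 79791.8    # 10x minus 8x HIS tagged LSD1, from the table

print(round(two_his, 1))     # 274.3
print(round(yepl1_diff, 1))  # 274.2
print(round(lsd1_diff, 1))   # 274.3
```

Passing a full sequence from Appendix A to `average_mass` gives an estimate that can be compared with the corresponding table entry; small differences are expected if a different mass table or rounding was used.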

Appendix C

Solutions

2x TY Media: 1.6% bacto tryptone; 1.0% yeast extract; 0.5% sodium chloride

TYE Agar Plates: 1.0% bacto tryptone; 0.5% yeast extract; 0.8% sodium chloride; 1.5% agar

LYSIS Buffer: 50 mM glucose; 25 mM Tris-Cl pH 8.0; 10 mM EDTA, Na

NaOH/SDS: 0.2 M NaOH; 1% SDS

5 M KAc/2.5 M HAc: 5 M KAc; 2.5 M HAc

TE (10, 0.1): 10 mM Tris-Cl pH 8.0; 0.1 mM EDTA

TE (10, 50): 10 mM Tris-Cl pH 8.0; 50 mM EDTA

T1500: 20 mM Tris-Cl pH 8; 1500 mM NaCl; 1 mM benzamidine; 0.5 mM EDTA, Na; 10 mM 2-mercaptoethanol

P300 - EDTA: 50 mM sodium phosphate pH 7.0; 300 mM sodium chloride; 1 mM benzamidine; 5 mM 2-mercaptoethanol

0.5x TBE: 45 mM Tris base; 45 mM boric acid; 1.5 mM EDTA

6x Gel Loading Buffer: 2.5 mg/ml bromophenol blue; 2.5 mg/ml xylene cyanol; 0.3 g/ml glycerol; 60 mM EDTA

Protein Gel Loading Buffer: 125 mM Bis-Tris pH 6.8; 20% glycerol; 4% SDS; 15% 2-mercaptoethanol; 0.04% bromophenol blue

Protein Gel Running Buffer: 10 mM Tris; 76 mM glycine; 0.02% SDS

FIX Solution: 45% ethanol; 9% acetic acid

STAIN Solution: 0.5% Coomassie Blue R; 45% ethanol; 9% acetic acid

DESTAIN Solution: 7% ethanol; 5% acetic acid
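The recipes above give final concentrations only. Converting them to weighed amounts is simple arithmetic, sketched below in Python. The helper names are hypothetical, the percentages for solid components are assumed to be weight/volume (1% = 10 g per liter), and the formula weight of sodium chloride (58.44 g/mol) is supplied here rather than taken from the thesis.

```python
def grams_for_molar(conc_mM, formula_weight, volume_L):
    """Mass in grams needed for a target millimolar concentration."""
    return conc_mM / 1000.0 * formula_weight * volume_L


def grams_for_percent_wv(percent, volume_L):
    """Mass in grams for a percent weight/volume solution (1% w/v = 10 g/L)."""
    return percent * 10.0 * volume_L


NACL_FW = 58.44  # g/mol, sodium chloride

# Sodium chloride for 1 L of P300 - EDTA (300 mM) and 1 L of T1500 (1500 mM).
print(round(grams_for_molar(300, NACL_FW, 1.0), 2))   # 17.53
print(round(grams_for_molar(1500, NACL_FW, 1.0), 2))  # 87.66

# Bacto tryptone for 1 L of 2x TY Media (1.6%, assumed w/v).
print(round(grams_for_percent_wv(1.6, 1.0), 1))       # 16.0
```

Liquid components such as glycerol, ethanol, and acetic acid are more commonly measured by volume, so the weight/volume helper should not be applied to them without checking how the percentage was defined.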

REFERENCES

Avvakumov, N.; Cote, J. The MYST Family of Histone Acetyltransferases and Their Intimate Links to Cancer. Oncogene [Online] 2007. 26, 5395-5407.

Bell, S. P.; Dutta, A. DNA Replication in Eukaryotic Cells. Annual Review of Biochemistry [Online] 2002. 71, 333-374.

BD TALON™ Metal Affinity Resins User Manual. BD Biosciences Clontech [Online] 2004. Protocol No. PT1320-1. Version No. PR42678.

Benkovic, S. J.; Valentine, A. M.; Salinas, F. Replisome-mediated DNA Replication. Annual Review of Biochemistry [Online] 2001. 70, 181-208.

Bertaux, F.; Lagardette, F.; Lapize, C. Double Selection Cloning Method and Vectors Therefore. U.S. Patent 0253688. December 16, 2004.

Blackwell, J. R.; Horgan, R. A Novel Strategy for Production of a Highly Expressed Recombinant Protein in an Active Form. FEBS Letters [Online] 1991. 295 (1-3), 10-12.

Broedel, S. E.; Papciak, S. M.; Jones, W. R. The Selection of Optimum Media Formulations for Improved Expression of Recombinant Proteins in E. coli. Athena Environmental Sciences [Online] 2001. 2.

Burke, T. W.; Cook, J. G.; Asano, M.; Nevins, J. R. Replication Factors MCM2 and ORC1 Interact with the Histone Acetyltransferase HBO1. The Journal of Biological Chemistry [Online] 2001. 276, 15397-15408.

Butt, T. R.; Edavettal, S. C.; Hall, J. P.; Mattern, M. R. SUMO Fusion Technology for Difficult-to-express Proteins. Protein Expression and Purification [Online] 2005. 43 (1), 1-9.

Clontech Laboratories. His-Tagged Protein Purification.

Davey, C. A.; Sargent, D. F.; Luger, K.; Maeder, A. W.; Richmond, T. J. Solvent Mediated Interactions in the Structure of the Nucleosome Core Particle at 1.9 Å Resolution. Journal of Molecular Biology [Online] 2002. 319, 1097-1113.

Foy, R. L.; Song, I. Y.; Chitalia, V. C.; Cohen, H. T.; Saksouk, N.; Cayrou, C.; Vaziri, C.; Côté, J.; Panchenko, M. V. Role of Jade-1 in the histone acetyltransferase (HAT) HBO1 complex. Journal of Biological Chemistry [Online] 2008. 283, 18817-18826.

Grunstein, M. Histone Acetylation in Chromatin Structure and Transcription. Nature [Online] 1997. 389, 349-352.

Hansted, J. G.; Pietikainen, L.; Hog, F.; Sperling-Petersen, H. U.; Mortensen, K. K. Expressivity Tag: A Novel Tool for Increased Expression in Escherichia coli. Journal of Biotechnology [Online] 2011. 155 (3), 275-283.

Hernandez-Rocamora, V. M.; Maestro, B.; Molla-Morales, A.; Sanz, J. M. Rational Stabilization of the C-LytA Affinity Tag by Protein Engineering. Protein Engineering, Design and Selection [Online] 2008. 21 (12), 709-720.

Hewitt, S. N.; Choi, R.; Kelly, A.; Crowther, G. J.; Napuli, A. J.; Van Voorhis, W. C. Expression of Proteins in Escherichia coli as Fusions with Maltose-binding Protein to rescue Non-expressed Targets in a High-throughput Protein-expression and Purification Pipeline. Structural Biology and Crystallization Communications [Online] 2011. 67 (9), 1006-1009.

Holliday, R. Epigenetics: A Historical Overview. Epigenetics [Online] 2006. 1 (2), 76-80.

Iizuka, M.; Matsui, T.; Takisawa, H.; Smith, M. Regulation of Replication Licensing by Acetyltransferase Hbo1. Molecular and Cellular Biology [Online] 2006. 26, 1098-1108.

Iizuka, M.; Stillman, B. Histone Acetyltransferase HBO1 Interacts with the ORC1 Subunit of the Human Initiator Protein. The Journal of Biological Chemistry [Online] 1999. 274, 23027-23034.

Imamura, H.; Jeon, B.; Wakagi, T.; Matsuzawa, H. High Level Expression of Thermococcus litoralis 4-α-glucanotransferase in a Soluble form in Escherichia coli with a Novel Expression System Involving Minor Arginine tRNAs and GroELS. FEBS Letters [Online] 1999. 457 (3), 393-396.

Imidazole; MSDS No. 56750 [Online]; Sigma-Aldrich.

Kueh, A. J.; Dixon, M. P.; Voss, A. K.; Thomas, T. HBO1 Is Required for H3K14 Acetylation and Normal Transcriptional Activity during Embryonic Development. Molecular and Cellular Biology [Online] 2011. 31 (4), 845-860.

Ladenburger, E.; Keller, C.; Knippers, R. Identification of a Binding Region for Human Origin Recognition Complex Proteins 1 and 2 That Coincides with an Origin of DNA Replication. Molecular and Cellular Biology [Online] 2002. 22 (4), 1036-1048.

Li, G.; Reinberg, D. Chromatin Higher-order Structures and Gene Regulation. Current Opinion in Genetics & Development [Online] 2001. 21 (2), 175-186.

Lichty, J. L.; Malecki, J. L.; Agnew, H. D.; Michelson-Horowitz, D. J.; Tan, S. Comparison of Affinity Tags for Protein Purification. Protein Expression and Purification [Online] 2005. 41, 98-105.

Maestro, B.; Velasco, I.; Castillejo, I.; Arevalo-Rodriguez, M.; Cebolla, A.; Sanz, J. M. Affinity Partitioning of Proteins Tagged with Choline-binding Modules in Aqueous Two-phase Systems. Journal of Chromatography A [Online] 2008. 1208 (1-2), 189-196.

Mohanty, A. K.; Wiener, M. C. Membrane Protein Expression and Production: Effects of Polyhistidine Tag Length and Position. Protein Expression and Purification [Online] 2004. 33 (2), 311-325.

Moriniere, J.; Rousseaux, S.; Steuerwald, U.; Soler-Lopez, M.; Curtet, S.; Vitte, A.; Govin, J.; Gaucher, J.; Sadoul, K.; Hart, D. J.; Krijgsveld, J.; Khochbin, S.; Muller, C. W.; Petosa, C. Cooperative Binding of Two Acetylation Marks on a Histone Tail by a Single Bromodomain. Nature [Online] 2009. 461, 664-668.

Nitiss, J. L. Investigating the Biological Functions of DNA Topoisomerases in Eukaryotic Cells. Biochimica et Biophysica Acta (BBA) - Gene Structure and Expression [Online] 1998. 1400 (1-3), 63-81.

Owen, D. J.; Ornaghi, P.; Yang, J.; Lowe, N.; Evans, P. R.; Ballario, P.; Neuhaus, D.; Filetici, P.; Travers, A. A. The Structural Basis for the Recognition of Acetylated Histone H4 by the Bromodomain of Histone Acetyltransferase Gcn5p. The EMBO Journal [Online] 2000. 19 (22), 6141-6149.

Panavas, T.; Sanders, C.; Butt, T. R. SUMO Fusion Technology for Enhanced Protein Production in Prokaryotic and Eukaryotic Expression Systems. Methods in Molecular Biology: SUMO Protocols [Online] 2009. 497, 303-317.

Paterson, N. G.; Riboldi-Tunnicliffe, A.; Mitchell, T. J.; Isaacs, N. W. Overexpression, purification and crystallization of a choline-binding protein CbpI from Streptococcus pneumoniae. Acta Crystallographica Section F: Structural Biology and Crystallization Communications [Online] 2006. 62 (Pt 7), 672-675.

pET System Manual. Novagen 1999. TB055 8th Edition 02/99.

Rath, A.; Glibowicka, M.; Nadeau, V. G.; Chen, G.; Deber, C. M. Detergent binding explains anomalous SDS-PAGE migration of membrane proteins. Proceedings of the National Academy of Sciences of the United States of America [Online] 2009. 106 (6), 1760-1765.

Saksouk, N.; Avvakumov, N.; Champagne, K. S.; Hung, T.; Doyon, Y.; Cayrou, C.; Paquet, E.; Ullah, M.; Landry, A.; Cote, V.; Yang, X.; Gozani, O.; Kutateladze, T. G.; Cote, J. HBO1 HAT Complexes Target Chromatin Throughout Gene Coding Regions via Multiple PHD Finger Interactions with Histone H3 Tail. Molecular Cell [Online] 2000. 33, 257-265.

Serna, I. L.; Ohkawa, Y.; Imbalzano, A. N. Chromatin Remodelling in Mammalian Differentiation: Lessons from ATP-dependent Remodellers. Nature Reviews Genetics [Online] 2006. 7, 461-473.

Tan, S. A Modular Polycistronic Expression System for Overexpressing Protein Complexes in Escherichia coli. Protein Expression and Purification [Online] 2001. 21 (1), 224-234.

Tan, S.; Kern, R.; Selleck, W. The pST44 Polycistronic Expression System for Producing Protein Complexes in Escherichia coli. Protein Expression and Purification [Online] 2005. 40, 385-395.

Thomas, T.; Voss, A. K. The Diverse Biological Role of MYST Histone Acetyltransferase Family Proteins. Cell Cycle [Online] 2007. 6, 696-706.

Tzouanacou, E.; Tweedie, S.; Wilson, V. Identification of Jade1, a Gene Encoding a PHD Zinc Finger Protein, in a Gene Trap Mutagenesis Screen for Genes Involved in Anteroposterior Axis Development. Molecular and Cellular Biology [Online] 2003. 23 (23), 8553-8562.

Wei, Y.; Yu, L.; Bowen, J.; Gorovsky, M. A.; Allis, C. D. Phosphorylation of Histone H3 Is Required for Proper Chromosome Condensation and Segregation. Molecular Cell [Online] 1999. 1 (2), 99-109.

ACADEMIC VITA

Viktor Tollemar

______

Education

B.S., Premedicine, 2013, Pennsylvania State University, University Park, PA

Honors and Awards

• President's Freshman Award (March, 2010)

• President Sparks Award (March, 2011)

• Evan Pugh Scholar Award (Junior) (March, 2012)

• Duffy Premedicine Endowment (September, 2012)

• Evan Pugh Scholar Award (Senior) (March, 2013)

• Pennsylvania State Undergraduate Research Fund Grant Recipient (2012, 2013)

Association Memberships/Activities

• Eberly College of Science Executive Board (President, Vice President, Academic Affairs Chairman)

• Biology Department Teaching Assistant

• Schreyer Honors Orientation Mentor

• Morale THON Committee

• Alpha Epsilon Delta: Beta Chapter (Nationally Inducted and Distinguished Member)

• Phi Beta Kappa Lambda of Pennsylvania (Nationally Inducted and Treasurer)

Professional Experience

• Pediatric Oncology Education Research Fellowship (St. Jude Children's Research Hospital)

• Project Healthcare Program (Bellevue Hospital, New York, NY)

• Clinic Intern Program (University Health Services)

Professional Presentations

• Characterization of Specific Interactions Between Talin & Focal Adhesion Kinase (St. Jude Children's Research Hospital; Memphis, TN; August 17, 2012)

• Preparing the Genome for DNA Replication (Penn State Undergraduate Poster Exhibition; University Park, PA; April 10, 2013)

• Effectiveness of Interactive Computer Modules in Delivering Hypertension Information (Project Healthcare; Bellevue Hospital, New York, NY; August 19, 2011)