
ORIGIN AND EVOLUTION OF THE HUMAN 70-kDa HEAT SHOCK PROTEIN (HSPA1A)

A Thesis

Presented to the

Faculty of

California State University, Fullerton

In Partial Fulfillment

of the Requirements for the Degree

Master of Science

in

Biology

By

Ryan Oliverio

Thesis Committee Approval:

Nikolas Nikolaidis, Department of Biological Science, Chair

Veronica Jimenez-Ortiz, Department of Biological Science

Kristy L. Forsgren, Department of Biological Science

Fall, 2017

ABSTRACT

Determining how and why evolutionary mechanisms establish or remove genetic variation within a population is a fundamental question in evolutionary biology. By tying together the origin, evolutionary patterns, molecular mechanisms, and functional outcomes of genotype to phenotype, we gain insight into how and why genes are conserved or changed. This methodology was followed to investigate the stress-inducible 70-kDa heat shock protein in humans (HSPA1A), a critical component of the cellular stress response whose modification has been associated with a variety of human diseases. My results indicate that the HSPA1A gene originated in placental mammals. Additionally, a combination of purifying selection and gene conversion with its closely related paralog, HSPA1B, has protected the amino acid sequence of this gene from possibly deleterious mutations in multiple mammalian species. This pattern also applies to human microevolution, as seen in how single-nucleotide polymorphisms are distributed in both HSPA1A and HSPA1B, in contrast to their closely linked homolog HSPA1L. This finding is further supported by the fact that the natural variants of HSPA1A that were most likely to change function cause only small changes to the primary function of HSPA1A and have very low allele frequencies within human populations. Altogether, my experimental observations show that HSPA1A is subject to purifying selection and provide functional characterization of two positions, R36 and I480, that appear to be important for the folding function of HSPA1A.

Supplementary Tables 1 and 2 of extended mammalian evolutionary trends are provided.


TABLE OF CONTENTS

ABSTRACT

LIST OF TABLES

LIST OF FIGURES

ACKNOWLEDGMENTS

Chapter

1. INTRODUCTION
   Single-Nucleotide Polymorphisms as a Tool for Human Adaptation
   Observing Adaptation Through the Cellular Stress Response
   High Conservation Shapes Structure and Function of Hsp70s
   The Hsp70 Family and its Evolution Between Species
   The Expansion of Cytosolic Hsp70 Genes and Their Modes of Evolution

2. PURPOSE AND HYPOTHESIS

3. MATERIALS
   Cell Lines
   Chemicals
   DNAs
   Enzymes and Proteins
   Equipment
   Growth Media
   Reagents
   Software
   Other Materials

4. METHODS
   Origin and Evolution of HSPA1A
      Sequence Collection and Analyses
      Collection and Analysis of SNPs
      Computational Predictions of Non-synonymous SNPs
   Protein Activity Tests on Recombinant HSPA1A
      Generation of Mutated Recombinant Clones, Proteins, and Protein Purification
      Thermal Shift Assay
      ATPase Activity Assay
      Isothermal Titration Calorimetry
   Functional Assays on Overexpressed HSPA1A in Mammalian Cells
      Protein Localization
      Protein Refolding
      Protein Aggregation

5. RESULTS
   Origin and Evolution of HSPA1A
      Phylogenetic Analyses and Genomic Context of HSPA1A
      Determining Evolutionary Mechanisms that Represent how HSPA1A is Maintained
      Signs of Purifying Selection and Gene Conversion in Human Microevolution
   Structural and Functional Consequences of SNPs on HSPA1A
      Results of SNP Curation and Collection
      Effects of SNPs on Thermal Stability of HSPA1A
      Effects of SNPs on Binding Kinetics of HSPA1A to ATP, ADP, and Protein Substrate
      Effects of SNPs on ATPase Function of HSPA1A
   Functional Consequences of SNPs on Cellular Stress Response
      Effects of SNPs on Protein Localization of HSPA1A Inside a Cell
      Effects of SNPs on Protein Refolding Function of HSPA1A
      Effects of SNPs on HSPA1A Protecting Against Protein-Aggregate Cell Toxicity

6. DISCUSSION
   The Evolution of HSPA1A is Shaped by Purifying Selection and Gene Conversion
   Purifying Selection and Gene Conversion is Observed in Human Microevolution
   Protein Activity Tests on Natural HSPA1A Mutations Verify Purifying Selection
   Variation in Protein Activity Translate to Functional Consequences
   Final Conclusions

APPENDICES

A. PRIMERS USED FOR SITE-DIRECTED MUTAGENESIS
B. COMPARISONS OF HSPA1A AND HSPA1B AMINO ACID SEQUENCES IN MULTIPLE MAMMALIAN SPECIES
C. COMPARISONS OF HSPA1A AND HSPA1B AMINO ACID SEQUENCES IN MULTIPLE MAMMALIAN SPECIES
D. TABLE OF SYNONYMOUS AND NON-SYNONYMOUS DISTANCES BETWEEN REPRESENTATIVE HSPA1 CLUSTER GENES
E. SYNONYMOUS AND NON-SYNONYMOUS DISTANCES BETWEEN REPRESENTATIVE HSPA1 CLUSTER GENES IN HUMANS
F. SYNONYMOUS AND NON-SYNONYMOUS SNP DISTRIBUTION BETWEEN REPRESENTATIVE HSPA1 CLUSTER GENES
G. BINDING AND THERMAL RESULTS OF HSPA1A WITH ATP
H. BINDING AND THERMAL RESULTS OF HSPA1A WITH ADP
I. BINDING AND THERMAL RESULTS OF HSPA1A WITH PROTEIN SUBSTRATE
J. COLOCALIZATION OF HSPA1A WITH NUCLEUS
K. COLOCALIZATION OF HSPA1A WITH MITOCHONDRIA
L. COLOCALIZATION OF HSPA1A WITH LYSOSOMES
M. COLOCALIZATION OF HSPA1A WITH PLASMA MEMBRANE
N. TABLES OF STATISTICAL SIGNIFICANCE FOR COLOCALIZATION DATA
O. REPRESENTATIVE IMAGES OF CELLS WITH HUNTINGTIN AGGREGATES

REFERENCES

SUPPLEMENTARY TABLES 1 AND 2 SHOWING NUCLEOTIDE AND AMINO ACID PAIRWISE COMPARISONS AND PS/PN VALUES FOR SELECTED MAMMALIAN SPECIES ARE PROVIDED


LIST OF TABLES

Table

1. Sequence Comparisons Between Representative HSPA1 Cluster Genes

2. The Distribution of SNP Type Is Significantly Different between the HSPA1 and Their Neighboring Genes

3. Criteria Data for SNPs of Interests in Functional Characterization

4. Melting temperature (Tm) of the Wild Type and Mutated HSPA1A Variants

5. Equation, R², rate (s⁻¹), and Standard Deviation for Linear Regression Lines of the Observed Data for ATP Hydrolysis

LIST OF FIGURES

Figure

1. Hierarchy and characteristics of coding single-nucleotide polymorphisms (SNPs)

2. Pie charts of SNPs collected in NCBI (dbSNP)

3. Environmental and pathological stressors that can activate the cellular stress response

4. Structural and functional schematics of Hsp70s

5. Genomic organization and phylogenetic analysis of HSPA1 cluster genes shows similar evolutionary patterns

6. The HSPA1A/LB cluster originated early during the evolution of placental mammals

7. HSPA1A and HSPA1B genes are highly conserved at both the nucleotide and amino acid levels

8. Representative Isothermal Titration Calorimetry (ITC) assays using purified recombinant HSPA1A proteins and ATP

9. Variants of HSPA1A can affect the rate of ATP hydrolysis for the given protein over a 90-minute period of time

10. Histogram with data points (circles) of CTCF ratios for nucleus, mitochondria, lysosome, and plasma membrane

11. Change in luciferase refolding rate for WT and HSPA1A variants

12. Percent active caspase-3/-7 activity in mammalian cells co-transfected with mutant HSPA1A and Q74 constructs

13. Percent cell viability in mammalian cells co-transfected with mutant HSPA1A and Q74 constructs

ACKNOWLEDGMENTS

To my graduate advisor, Dr. Nikolas Nikolaidis, you have provided me, as someone who had no knowledge of the evolutionary process, the chance to better educate myself and obtain a deeper outlook on the science of genetics. Your guidance and patience have molded me into a bolder and more inquisitive person, one who explores many different perspectives and skills to attain the answers I am looking for.

Without your breadth of knowledge, opinions, and fields of research in exploring molecular evolution through combining computational, biochemical, and cellular processes, I would not have found my interests in helping change the culture of our education on genetics and how to apply that to human health.

To my wife, Tamara Oliverio, without you I would not nearly be as appreciative of my abilities and grounded in my life. You have been my best friend and support through all the late nights in lab and days where we do not see each other until late at night. Though you have no idea what I have been doing as a thesis project your willingness to help and critique me has been a driving force for my life.

For my son, Grayson Oliverio, although you are still in your mom’s womb as I am finishing this paper, I hope one day that you find this thesis and it inspires you to pursue your own passions and interests. If somebody had asked me five years ago whether I would be doing research in genetics, I would have said they were crazy, but I would not be the same man without the time and experiences it took to create this thesis, and I hope this can be an example to you of how determination, luck, and sacrifice can create wonderful things.

To my cohorts in lab. Kyle Hess, you are the most insightful helper and teacher I could have asked for. Be it teaching me the nuances of the lab or all the other techniques I needed to learn, you were always there to help, critique, and brainstorm. Andrei Bilog, you have grown so much since I first met you, and your natural leadership and charisma will help you succeed in whatever is next. Peter Nguyen, I know I am hard on you, but I see so much potential in your capabilities. You are the ever-present source of comedy and energy in the lab, and if you can focus those energies you will go on to do great things. Brianna Kdeiss, the “lab mom”, you served as a pillar for me and the lab, and your presence has been missed greatly since you graduated. Sara Ord, Jacqueline Ellis, Chaiyon Park, Brandon Mauch, and Bryan Dighera, thank you for the laughs, support, and helpful discussions about my project.

Lastly, I want to thank my family. My father, who has been my inspiration for providing care and support for those who are underserved and overlooked. My mother, for always being my rock, you have always told me to “think outside the box” and to never leave a question unanswered. Last, but not least, my brother Tyler, you are one of my closest friends and have provided me a lifetime of laughs, I cannot wait to see the doctor you will end up being.


CHAPTER 1

INTRODUCTION

Single-Nucleotide Polymorphisms as a Tool for Human Adaptation

A fundamental question of molecular evolution is determining the mechanisms that govern how species evolve to adapt to their environment. These adaptations are seen as incremental changes to a gene’s structure, causing a change in the encoded protein’s function to provide the best chance for survival in a specific environment. Mutations, genomic recombination, and other molecular events shaped by natural selection and drift are key mechanisms by which genetic and cellular systems change to help organisms adapt to their environment. However, the relationship between these molecular events and their phenotypic outcomes is still unclear.

A similar idea holds true in the genetics of human health. Through human evolution, we have adapted to a variety of drastically different environments, yet the genomes of any two human individuals are approximately 99.5% identical (Levy et al., 2007). Therefore, it is the remaining 0.5% of our genome that provides clues to how molecular events have shaped human phenotypes.

For humans, a significant portion of these events are single-nucleotide polymorphisms (SNPs) scattered throughout the genome (Levy et al., 2007). SNPs are point mutations, single-nucleotide changes commonly used as markers to trace an individual’s ancestral genetic history or the probable cause of a disease. SNPs can be found nearly anywhere in the human genome and are categorized by their location, either in the coding or non-coding regions of genes (causative SNPs) or in the intergenic regions between genes (linked SNPs) (Genetic Science Learning Center, 2015) (Figure 1). Although their effects remain unclear, linked SNPs appear to have no direct effect on their neighboring genes, whereas causative SNPs found in either the coding or non-coding regions of a gene may have direct physiological consequences by changing the structure and, potentially, the function or expression of the protein (Figure 1).

Figure 1. Hierarchy and characteristics of coding single-nucleotide polymorphisms (SNPs). (a) SNPs are sub-classified depending on their location and how they affect the gene product. (b) SNPs found in protein-encoding regions of the gene may alter the codon, changing the amino acid being encoded, or stop protein synthesis altogether.

Causative SNPs can be further organized into two distinct categories: synonymous and nonsynonymous (Figure 1) (Genetic Science Learning Center, 2015). Synonymous SNPs are silent point mutations that change the codon but do not alter the encoded amino acid, and presumably have a neutral effect on the protein’s structure and function. Nonsynonymous SNPs, on the other hand, change the codon to one that encodes a different amino acid, which may have significant structural and functional consequences for the synthesized protein.
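The distinction between synonymous and nonsynonymous SNPs can be illustrated with a minimal Python sketch. It is not part of the analyses in this thesis: the function name `classify_snp` and the example codons are hypothetical, and the sketch assumes the standard genetic code and a single-nucleotide change within one codon.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order
# ("*" marks a stop codon). Assumption: nuclear standard code only.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(ref_codon, alt_codon):
    """Label a single-nucleotide codon change (hypothetical helper)."""
    ref_codon, alt_codon = ref_codon.upper(), alt_codon.upper()
    if len(ref_codon) != 3 or len(alt_codon) != 3:
        raise ValueError("codons must be three nucleotides long")
    differences = sum(a != b for a, b in zip(ref_codon, alt_codon))
    if differences != 1:
        raise ValueError("codons must differ at exactly one position")
    # Same encoded residue means the change is silent at the protein level.
    if CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]:
        return "synonymous"
    return "nonsynonymous"


print(classify_snp("GCT", "GCC"))  # both codons encode alanine
print(classify_snp("CGT", "TGT"))  # arginine codon changed to a cysteine codon
```

A change that creates a stop codon (for example TGG to TGA) is also reported as nonsynonymous by this sketch, which matches the broad two-category split used above.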

Thus, to better understand how molecular events shape human phenotype, it is vital to understand the effects that nonsynonymous SNPs have on the function of a protein and to determine their role in adaptation and disease. However, the human SNP data collected in NCBI (dbSNP; https://www.ncbi.nlm.nih.gov/snp/) show that only a small fraction of polymorphisms have been experimentally tested to determine their functional and clinical significance and to elucidate their role in genetic diversity and phenotype (Figure 2).


Figure 2. Pie charts of SNPs collected in NCBI (dbSNP). (a) Total and (b) non-synonymous SNP data for Homo sapiens collected from dbSNP. *Filtered for “likely pathogenic” or “pathogenic” clinical significance; 90% of these are clinical submissions. **Filtered for SNPs annotated as “Cited in PubMed”; 80% of these are from computational studies.

Observing Adaptation Through the Cellular Stress Response

There are several cellular systems that have evolved to help humans cope with their environment and are intensively studied to elucidate the relationship between genetic diversity and phenotype. These systems include, among others, the immune system, the mitochondria, and the cellular stress response (CSR), which help our cells adapt dynamically to environmental perturbation and survive until the stress is relieved (Pruett, 2003; Richter, Haslbeck, & Buchner, 2010; Shaughnessy et al., 2014). These perturbations, also known as stressors, are acute or chronic changes in external (temperature, osmotic pressure, heavy metals, etc.) or internal (pathological) factors that can disrupt cellular structures and systems, eventually leading to cell death (Figure 3) (Macario & Conway de Macario, 2005). The CSR, in particular, has a large protein-protein interaction network whose central core of proteins is highly conserved from bacteria to humans (Morimoto, 2011). This response involves hundreds of proteins, the predominant class being molecular chaperones, that act to defend the cell against the internal damage caused by cellular stress (Richter et al., 2010). A key orchestrator of this response, and the most conserved of all chaperones, is a family of proteins known as the 70-kilodalton heat shock proteins (Hsp70s) (Murphy, 2013).

Figure 3. Environmental and pathological stressors that can activate the cellular stress response.


High Conservation Shapes Structure and Function of Hsp70s

The hallmark of Hsp70s is their remarkably high level of amino acid conservation throughout organismal evolution (Hunt & Morimoto, 1985). This feature is credited to the action of strong purifying selection, which prevents amino acid mutations that would alter their function (Kominek, Marszalek, Neuvéglise, Craig, & Williams, 2013; McCallister, Siracusa, Shirazi, Chalkia, & Nikolaidis, 2015; Nikolaidis & Nei, 2004). This strong conservation provides a degree of structural similarity to all Hsp70 proteins, which consist of two major domains. The first is a 44-kDa N-terminal adenosine triphosphatase (ATPase) domain (nucleotide binding domain [NBD]) (Figure 4). The NBD is composed of four lobes, which together create the nucleotide binding pocket. The second is the substrate binding region, which is composed of the 18-kDa β-sandwich substrate binding domain (SBD) and the α-helical “lid” region (Figure 4). The NBD and SBD communicate via allosteric coupling through an interdomain hydrophobic linker (Figure 4) (Zhuravleva, Clerico, & Gierasch, 2012).

The primary chaperone function of Hsp70s depends on their interaction with regulatory co-chaperones (e.g., nucleotide exchange factors [NEFs], J-domain proteins [Hsp40s]) (Figure 4). Through interaction with Hsp40s, ATP hydrolysis occurs at the NBD. The ATP hydrolysis causes a conformational change that is communicated via the linker to the SBD (Zhuravleva et al., 2012). The hydrolysis of ATP greatly increases the binding affinity of the SBD for hydrophobic residues of denatured proteins and causes the α-helical lid to “close” onto the protein substrate, holding it in place (Figure 4). To release the protein substrate, Hsp70 interacts with a NEF to release the ADP. This release opens the lid and lowers the binding affinity of the SBD for the protein substrate, releasing the protein back into the cellular environment and freeing the nucleotide binding pocket for another ATP. This cycle gives Hsp70s their chaperoning function: holding misfolded or denatured protein substrates and assisting their folding.

Figure 4. Structural and functional schematics of Hsp70s. (a) Schematic of amino acid residues for domains and subdomains of Hsp70s (numbers are representative of amino acids for HSPA1A). (b) Cartoon representation of ATP-dependent cycle for Hsp70s in ATP- and ADP- bound states. NBD is blue, hydrophobic linker is red, β-sandwich is orange, α-helical lid is green, unfolded protein substrate is black, Hsp40s are purple, and NEFs are yellow.

The Hsp70 Family and its Evolution Between Species

Due to the strong action of purifying selection, Hsp70s across species have very similar amino acid sequences, with the prokaryotic Hsp70 (dnaK) and eukaryotic Hsp70s being approximately 50% similar (Daugaard, Rohde, & Jäättelä, 2007; Lindquist & Craig, 1988). This mode of selection establishes a structure that is very similar for all Hsp70s and creates constraints on the protein’s function (Lindquist & Craig, 1988).


In eukaryotes, Hsp70 genes have been amplified, and almost all species have several Hsp70 genes (Bettencourt & Feder, 2001; Daugaard et al., 2007; Kominek et al., 2013; Nikolaidis & Nei, 2004). The emergence of these multiple Hsp70 members suggests important adaptive reasons for why these genes arose throughout the species tree. Thus, multiple theories have arisen to explain how the different members of the Hsp70 family have evolved, for example, divergent evolution that resulted in Hsp70s localized to distinct cellular compartments dating back to the first eukaryotes (cytosol, mitochondria, and endoplasmic reticulum [ER]) and diversification in the functions of closely related Hsp70 members through sub- and neofunctionalization (Daugaard et al., 2007; Kourtidis et al., 2006; Krenek, Schlegel, & Berendonk, 2013; McCallister, Siracusa et al., 2015). Thus, the nebulous nature of the evolutionary history of Hsp70s provides an ideal framework for examining different models of how multigene families evolve, but still leaves many questions as to why and how the different Hsp70 homologs found between and within species originated and evolved.

The Expansion of Cytosolic Hsp70 Genes and Their Modes of Evolution

As stated previously, eukaryotic Hsp70 genes can be classified based on their subcellular localization. However, the cytosolic Hsp70 members greatly outnumber those found in the ER or mitochondria in a variety of eukaryotes, which raises the question of why so many cytosolic Hsp70 genes evolved.

One major difference among the cytosolic Hsp70 genes is their mode of expression, being either constitutive or heat-inducible (Brocchieri, Conway de Macario, & Macario, 2008; Kudla, Helwak, & Lipinski, 2004; McCallister, Siracusa et al., 2015; Tavaria, Gabriele, Kola, & Anderson, 1996). The constitutively expressed genes are typical examples of evolution by the birth-and-death model, in which genes are duplicated and the copies are then either maintained or accumulate deleterious mutations that remove their function (Brocchieri et al., 2008; Nei & Rooney, 2005). On the other hand, heat-inducible Hsp70s have been observed to appear independently in multiple lineages and are almost identical in both their amino acid and nucleotide sequences (McCallister, Siracusa et al., 2015). These evolutionary patterns suggest that heat-inducible Hsp70s arose through convergent evolution in these lineages and are maintained by a concerted model of evolution, using a mechanism resembling gene conversion as modeled in yeast (Chen, Cooper, Chuzhanova, Férec, & Patrinos, 2007; Kudla et al., 2004; McCallister, Siracusa et al., 2015; Nikolaidis & Nei, 2004).

Closer inspection of flies and nematodes, which have multiple heat-inducible Hsp70 genes, has shed some light on how heat-inducible Hsp70s have evolved (Bettencourt & Feder, 2002; Nikolaidis & Nei, 2004). These studies indicate that their genes are not fully homogenized and have a few synonymous mutations in many divergent lineages, suggesting that purifying selection is the mechanism by which amino acid mutations are eliminated. For mammalian species, however, it is still unclear how their heat-inducible Hsp70 genes, HSPA1A and HSPA1B, have evolved.

Mammals provide a unique opportunity to see how closely linked heat-inducible and constitutive Hsp70 genes evolve. The genomic locus of the mammalian heat-inducible Hsp70s is found in tandem with a constitutive homolog, HSPA1L (Figure 5). A previous study and my own observations comparing human and rodent Hsp70 genes show that HSPA1L follows a pattern of birth-and-death evolution common to other constitutive Hsp70 genes (Figure 5) (Kudla et al., 2004). On the other hand, the heat-inducible Hsp70 genes cluster into species-specific clades, suggesting concerted evolution by a gene conversion-like mechanism (Figure 5) (Kudla et al., 2004). However, it is still unclear when HSPA1A originated and whether it is experiencing purifying selection like the heat-inducible genes seen in non-mammalian species.

Figure 5. Genomic organization and phylogenetic analysis of HSPA1 cluster genes shows similar evolutionary patterns. A) Genomic context of HSPA1A, HSPA1B, and HSPA1L (HSPA1 cluster) in humans and mice; the figure shows directionality but not size or distance between the genes. B) Rooted maximum-likelihood tree of the HSPA1 family genes for humans and mice (m). Numbers on the tree represent bootstrap values of 1000 repetitions, and the scale bar represents evolutionary distance. Red asterisks (*) indicate heat-inducible Hsp70 genes.

Furthermore, as well documented as the inter-species evolution of Hsp70s is, their evolution within a species, including humans, remains unknown. A handful of studies have associated altered forms of HSPA1A with a variety of human diseases (Ayub et al., 2010; Dulin, García-Barreno, & Guisasola, 2012; He et al., 2009; Ramos-Arroyo et al., 2001). However, none of these studies have investigated how purifying selection or gene conversion manifest, or whether these patterns hold true within species. Therefore, how naturally occurring mutations manifest functional consequences, and whether they are subject to the same evolutionary mechanisms observed in other species, remains unknown.


CHAPTER 2

PURPOSE AND HYPOTHESIS

The HSPA1A protein orchestrates an important system in how a cell copes with stressors. Because disruptions in this system can have a variety of consequences leading to cell death and disease, a better understanding of its evolutionary role can help elucidate the relationship between its genotype and phenotype. By determining the origin and evolution of HSPA1A both between and within species, I provided more information about why the gene appeared, how it is maintained, and what consequences arise when mutation occurs. Furthermore, by determining how function is altered by point mutations, I surveyed the importance of specific sites to protein function.

To elucidate the molecular mechanisms that govern the evolution of these important proteins, I determined the origin and mode of selection of HSPA1A from patterns observed in information collected from multi-species and human-specific genetic databases, and experimentally verified my theorized mode of evolution by testing how natural variations translate to functional outcomes. I hypothesized that HSPA1A originated within the evolution of mammals and has evolved through a combination of concerted evolution and purifying selection. If this is the case, then purifying selection can be experimentally tested by using natural variations, which should cause only minor alterations to function.


CHAPTER 3

MATERIALS

Cell Lines

Human embryonic kidney cells (HEK293, CRL-1573) and human cervical cells (HeLa, CCL-2) were purchased from ATCC (Manassas, VA).

Chemicals

Ampicillin, Dimethyl sulfoxide (DMSO), Isopropyl β-D-1-thiogalactopyranoside (IPTG), Phenylmethylsulfonyl fluoride (PMSF), Mono- and dibasic sodium phosphate, Sodium chloride, Lysozyme, Imidazole, beta-mercaptoethanol, Tris-Base, Hydrochloric acid, Urea, Dithiothreitol (DTT), HEPES, Magnesium chloride, Potassium chloride, Ethylenediaminetetraacetic acid (EDTA), Tween-20, Chloroform, and Methanol were purchased from Fisher Scientific (St. Louis, MO). Deoxyribonucleotides (dNTPs), 2-(N-morpholino)ethanesulfonic acid (MES), and 3-(N-morpholino)propanesulfonic acid (MOPS) were purchased from Life Technologies (Waltham, MA). Ortho-Nitrophenyl-β-galactoside (ONPG), Adenosine triphosphate (ATP), Glycylglycine, and Cycloheximide solution (100 mg/mL in DMSO) were purchased from Sigma-Aldrich (St. Louis, MO).

DNAs

The HSPA1A cDNA clone (accession number BC054782) was subcloned into pET-22b and pEGFP-C2 vectors, Lamp1 was subcloned into an mRFPC1 vector, and the Q23 and Q74 huntingtin variants were fused to a red fluorescent protein gene and then subcloned into a pcDNA3.1 vector. All clones were then transformed into DH5α Escherichia coli cells (Life Technologies; Waltham, MA), selected, and grown in LB media. Variants were created from the same recombinant HSPA1A template by site-directed mutagenesis using long-PCR amplification followed by DpnI digestion (Liu & Naismith, 2008). The pGL4 vector for luciferase was purchased from Promega.

Enzymes and Proteins

Recombinant HSPA1A was produced from sequence-verified recombinant clones transformed into BL21(DE3) E. coli cells (Life Technologies; Waltham, MA). Variants were created from the same recombinant HSPA1A template by site-directed mutagenesis using long-PCR amplification followed by DpnI digestion (Liu & Naismith, 2008). After the clones were sequence verified, the proteins were purified as described in McCallister, Siracusa et al. (2015).

Equipment

GloMax 96 Microplate Luminometer (Promega, graciously provided by Dr. Nilay Patel at CSUF), Leica TCS SP2 Inverted Scanning Confocal Microscope equipped with an acoustic optical beam splitter, acoustic optical tunable filters, eight laser lines, and a two-channel photomultiplier (Leica Microsystems GmbH, available on CSUF campus), Olympus CKX41 inverted microscope (Olympus; Waltham, MA), CoolLED pE-300 LED fluorescent illuminator (CoolLED; Hampshire, UK), Nano ITC (TA Instruments, available on CSUF campus), CFX96 Touch Real-Time PCR Detection System (Bio-Rad; Hercules, CA), and VWR Unstirred Digital Dual Water Bath (Radnor, PA).


Growth Media

Super Optimal Broth with 20 mM glucose (SOC) was purchased from Life Technologies (Waltham, MA). Luria-Bertani (LB) media and agarose for DNA electrophoresis were purchased from Fisher Scientific (St. Louis, MO). LB agar was purchased from Becton, Dickinson and Company (Franklin Lakes, NJ). Corning Cellgro® Minimum essential media (MEM) and Corning Cellgro® Dulbecco’s modified eagle medium (DMEM) were purchased from Thermo Fisher Scientific (St. Louis, MO).

Reagents

The QuantiChrom ATPase/GTPase Assay kit was purchased from BioAssay Systems (Hayward, CA). Protein Thermal Shift Assay kit was purchased from Thermo Fisher Scientific (St. Louis, MO). Dual-Glo Luciferase Refolding Assay, Caspase 3/7 Glo Assay, and Cell-Titer 2.0 Glo Assay were purchased from Promega (Madison, WI). Fluorescent cellular stains: DAPI-Fluoromount-G was purchased from Southern Biotech (Birmingham, AL). MitoTracker Red and Wheat germ agglutinin, Alexa Fluor 555 Conjugate were purchased from Thermo Fisher Scientific (St. Louis, MO). PolyJet In Vitro DNA Transfection Reagent was purchased from SignaGen Laboratories (Rockville, MD).

Software

SIFT (http://sift.jcvi.org/), PolyPhen-2 (http://genetics.bwh.harvard.edu/pph2/), ImageJ (https://imagej.nih.gov/ij/download.html), Wright Cell Imaging Facility ImageJ Plug-ins (http://www.uhnresearch.ca/wcif), MEGA6 (http://megasoftware.net/), PyMOL (v0.99; https://pymol.org/), SWAAP (v.1.0.3; http://asiago.stanford.edu/SWAAP/SwaapPage.htm). All were obtained or are available online.

Other Materials

Nuclon Delta Surface 6- and 24-well cell culture plates, Corning 96-well clear bottom white polystyrene microplates, Fisherbrand circular cover glasses, and Fisherbrand InkJet microscope slides were purchased from Thermo Fisher Scientific.


CHAPTER 4

METHODS

Origin and Evolution of HSPA1A

Sequence Collection and Analyses

To determine the origin and evolution of HSPA1A and HSPA1B between species, protein sequences were collected from the NCBI mammalian reference sequence protein database using protein-BLAST searches with the human HSPA1A, HSPA1B, and HSPA1L (HSPA1 cluster) sequences as queries and default parameters (Altschul, Gish, Miller, Myers, & Lipman, 1990). The proteins collected in this initial search were mapped back to the corresponding mammalian genome sequences to identify the genomic organization of the genes and ensure collection of all three Hsp70 genes. Based on this search, nucleotide sequences (genomic and coding) were also collected for the corresponding genes. Furthermore, at least two neighboring genes 5’ and 3’ of the HSPA1 cluster were collected and used to determine the conservation of synteny. Although more than 100 mammalian genomic sequences at various levels of completion are available at the NCBI website, I identified 30 species in which the region and all three Hsp70 genes were complete, and nine representative species were selected from these for further analysis.

The collected sequences were aligned with CLUSTALW in MEGA 6.0 using default parameters and manually corrected with BioEdit (Tamura, Stecher, Peterson, Filipski, & Kumar, 2013). Pairwise sequence alignments and analyses of synonymous/non-synonymous substitutions over sliding windows were performed using SWAAP (v.1.0.3). Nucleotide and amino acid pairwise identities, as well as synonymous and non-synonymous substitutions (modified Nei-Gojobori method), were calculated with MEGA 6.0.
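The sliding-window identity comparison described above can be sketched as follows. This is only an illustration of the idea, not the SWAAP implementation: the function name, the window size, and the step are hypothetical choices, since the settings used in SWAAP are not stated here, and the two sequences are assumed to be already aligned to the same length.

```python
def sliding_window_identity(seq_a, seq_b, window=60, step=3):
    """Percent identity between two aligned sequences in each window.

    Assumption: seq_a and seq_b are pre-aligned and of equal length.
    The window and step defaults are illustrative, not SWAAP settings.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    results = []
    for start in range(0, len(seq_a) - window + 1, step):
        a = seq_a[start:start + window]
        b = seq_b[start:start + window]
        matches = sum(1 for x, y in zip(a, b) if x == y)
        results.append((start, 100.0 * matches / window))
    return results


# Toy example with made-up sequences (not HSPA1A/HSPA1B data).
profile = sliding_window_identity("AAAAAA", "AAAAAT", window=3, step=3)
for start, identity in profile:
    print(start, round(identity, 2))
```

Windows with 100% identity over long stretches are the pattern that the text interprets as possible evidence of sequence homogenization.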

Maximum-likelihood (ML) was used to find the best model of evolution (MEGA

6.0) for the protein sequences (Tamura et al., 2013). Phylogenetic trees were generated using the neighbor-joining (NJ), ML, and maximum parsimony algorithms as implemented in MEGA 6.0. One thousand bootstrap replicates were used to test the reliability of the inferred trees and any positions with gaps were removed.

Collection and Analysis of SNPs

To investigate the microevolutionary processes of the HSPA1A and HSPA1B genes, each gene’s genomic coordinates and single nucleotide polymorphisms (SNPs) were collected from Ensembl’s 1000 Genomes browser

(http://grch37.ensembl.org/Homo_sapiens/Info/Index) using all available data present within the 1000 Genomes database (phase 3). The data were filtered to include SNPs found only in the gene region and in the 1000 Genomes project. For each HSPA1 gene,

SNP data were also collected from the ExAC database (August 2016). This dataset contains variants found in the exomes of 60,706 unrelated individuals from disease-specific and population genetic studies and is 24.24 times larger than the 1000 Genomes dataset. The patterns of SNP density within the HSPA1 genes and surrounding genes were also calculated by dividing the total number of SNPs within a given region by that region’s sequence length in base-pairs (bp).
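As a minimal sketch of the SNP density calculation just described, the function below divides a SNP count by the length of the region. The function name and the coordinates in the example are placeholders for illustration only; they are not the genomic coordinates used in this study.

```python
def snp_density(snp_count, region_start, region_end):
    """SNPs per base pair for a region with inclusive 1-based coordinates."""
    length_bp = region_end - region_start + 1
    if length_bp <= 0:
        raise ValueError("region_end must not be smaller than region_start")
    return snp_count / length_bp


# Hypothetical 2,000 bp region containing 39 SNPs.
density = snp_density(39, 1001, 3000)
print(density)
```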


Computational Predictions of Non-Synonymous SNPs

To prioritize the non-synonymous SNPs found in the HSPA1A and HSPA1B genes and determine which mutations would be experimentally tested first, six major criteria were used to best predict their functional effect. First, the SNPs were categorized based on whether they change an amino acid of known function, because a mutation at a site of established function would most likely have a functional impact. Second, I categorized the SNPs based on whether they occur at a highly conserved amino acid position by determining the amino acid conservation level of each position, because I predicted that changes to highly conserved amino acids would have a higher probability of causing a functional change. Third, the SNPs were sorted based on their frequency in a population, because I aimed to identify mutations that either are established in a population and may be related to major adaptations, or are new and may be related to a disease. Fourth, the SNPs were classified based on whether the amino acid change was predicted to be radical (different amino acid class and negative or zero scores in both BLOSUM 65 and 80). The rationale for this criterion is that radical changes are more likely to alter function than non-radical amino acid changes. Fifth, I generated three-dimensional models of the WT and mutated proteins and predicted whether a mutation alters the local conformation or the molecular surface. Sixth, I used PolyPhen (Adzhubei et al., 2010) and SIFT (Kumar, Henikoff, & Ng, 2009) to predict whether any of these mutations were expected to alter the function of the protein.
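The fourth criterion (a radical amino acid change) lends itself to a short sketch. The grouping of residues into classes below is an assumption chosen to be consistent with the class changes listed in Table 3 (for example, tyrosine is treated as nonpolar); the function name is hypothetical, and the BLOSUM scores are passed in by the caller rather than looked up.

```python
# Assumed amino acid classes (one-letter codes); this grouping is illustrative.
AA_CLASS = {}
for residues, label in (
    ("AVLIMFWPGY", "nonpolar"),
    ("STNQC", "polar"),
    ("KRH", "basic"),
    ("DE", "acidic"),
):
    for aa in residues:
        AA_CLASS[aa] = label


def is_radical(ref, alt, blosum65, blosum80):
    """True if the class changes and both BLOSUM scores are negative or zero."""
    class_changed = AA_CLASS[ref] != AA_CLASS[alt]
    return class_changed and blosum65 <= 0 and blosum80 <= 0


# Variants and BLOSUM 65 (80) scores as listed in Table 3.
variants = [("S", "P", -1, -1), ("R", "C", -4, -4), ("I", "N", -3, -4)]
for ref, alt, b65, b80 in variants:
    print(ref, alt, is_radical(ref, alt, b65, b80))
```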


Protein Activity Tests on Recombinant HSPA1A

To test the effects of the mutations on the structure and function of HSPA1A, I determined whether they affect (i) protein stability, (ii) ATP hydrolysis, and (iii) binding to nucleotides (ATP, ADP) and protein substrate. All experiments were repeated at least three times using different batches of protein.

Generation of Mutated Recombinant Clones, Proteins, and Protein Purification

The cDNA clone containing the HSPA1A gene sequence (Accession number: BC054782) was used to generate the recombinant clones used in this study. Site-directed mutagenesis using long-PCR amplification followed by DpnI digestion was used to generate the mutated Hsp70 variants (Liu & Naismith, 2008). Specifically, 10 ng of plasmid DNA was mixed with 125 ng of each primer (see Appendix A) in a 50 µL reaction containing 1 µL DMSO, 5 µL Buffer, 1 µL of 10 mM dNTPs, and 1 µL (2.5 units) of native PFU polymerase. The whole plasmid was then amplified using the following conditions: 5 min at 95 °C; 16 cycles of 30 sec at 95 °C, 1 min at 60 °C, and 15 min at 72 °C; and a final 15 min at 72 °C. After sequencing verification, the mutated and wild-type (WT) constructs were used to generate and purify recombinant proteins as described in McCallister, Siracusa et al. (2015).

Thermal Shift Assay

To determine the effect of naturally occurring mutations on protein stability, a Thermal Shift Assay (TSA; Thermo Scientific) was employed, and protein stability was determined as a function of the protein melting temperature (Tm). In this assay, 5 µM of each protein was mixed with 5 µL of Protein Thermal Shift™ Buffer (Thermo Scientific) and 2.5 µL of diluted Protein Thermal Shift™ Dye (8X) in a 20 µL total volume. The mixture was incubated in a CFX96 Real-Time PCR Detection System (BioRad) under continuous ramp mode, from 16 to 95 °C, using a 0.05 °C/s ramp rate. The experiment was repeated using three different batches of each protein. The data were then plotted and fitted to the Boltzmann equation using SigmaPlot v10 to calculate the Tm. The obtained values were averaged per protein and the standard deviation was calculated.
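To illustrate what fitting a melt curve to the Boltzmann equation involves, the sketch below uses the common sigmoidal form F(T) = Fmin + (Fmax - Fmin) / (1 + exp((Tm - T) / a)) and a brute-force grid search. This is a stand-in for the nonlinear regression performed in SigmaPlot, not a reproduction of it; the function names, the grid ranges, and the synthetic data are all assumptions made for the example.

```python
import math


def boltzmann(temp, f_min, f_max, tm, slope):
    """Boltzmann sigmoid: fluorescence as a function of temperature."""
    return f_min + (f_max - f_min) / (1.0 + math.exp((tm - temp) / slope))


def fit_tm(temps, fluorescence, tm_step=0.1):
    """Grid-search Tm and slope that minimize the squared error.

    Assumption: f_min and f_max are taken from the data extremes, and the
    slope grid (0.25 to 5.0) is an arbitrary illustrative range.
    """
    f_min, f_max = min(fluorescence), max(fluorescence)
    t_low, t_high = min(temps), max(temps)
    n_steps = int(round((t_high - t_low) / tm_step))
    best = None
    for i in range(n_steps + 1):
        tm = t_low + i * tm_step
        for j in range(1, 21):
            slope = 0.25 * j
            sse = sum(
                (boltzmann(t, f_min, f_max, tm, slope) - f) ** 2
                for t, f in zip(temps, fluorescence)
            )
            if best is None or sse < best[0]:
                best = (sse, tm, slope)
    return best[1], best[2]


# Synthetic, noise-free curve over the 16-95 degree ramp (not measured data).
temps = [16 + i for i in range(80)]
signal = [boltzmann(t, 100.0, 1000.0, 50.0, 2.0) for t in temps]
tm_fit, slope_fit = fit_tm(temps, signal)
print(round(tm_fit, 1), slope_fit)
```

In practice a least-squares optimizer would replace the grid search, and real melt curves are usually truncated at the fluorescence maximum before fitting.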

ATPase Activity Assay

HSPA1A’s chaperone function requires the hydrolysis of ATP. Therefore, to test whether a mutation altered the ability of HSPA1A to hydrolyze ATP, an ATPase assay was employed. The assay was performed by mixing 1 µM of recombinant protein with 4 mM ATP in ATPase Buffer [20 mM Tris-HCl pH 7.5; 40 mM NaCl; 4 mM Mg(CH3COO)2; 0.5 mM ethylenediaminetetraacetic acid (EDTA)] to a 120 µL final volume. The reaction mixtures were then incubated at 37 °C for 90 min, and the release of inorganic phosphate (Pi) was quantified every 30 min using the colorimetric QuantiChrom™ ATPase/GTPase Assay kit (BioAssay Systems). To determine the amount of Pi released, a standard curve was generated by measuring the absorbance produced by known Pi concentrations. No-chaperone controls were used to account for spontaneous ATP hydrolysis. The final amounts of Pi released were calculated by subtracting the control values from each sample. The data were then plotted and analyzed in the SigmaPlot and JMP software packages. The phosphate released in three independent experiments was averaged and plotted as a histogram over time. A linear regression line was then fitted to these values, and rates (s-1) were determined from the slope of the line divided by 60 to convert minutes to seconds.
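The rate calculation described above (control subtraction, linear regression of Pi against time, conversion from minutes to seconds) can be sketched as below. The function name and the example values are hypothetical. Dividing by the enzyme concentration to obtain a rate in s-1 is the usual convention and is an assumption here, since the text does not state that step explicitly.

```python
def hydrolysis_rate(times_min, pi_sample, pi_control, enzyme_conc=1.0):
    """Rate of Pi release per second per unit of enzyme.

    times_min   : time points in minutes
    pi_sample   : Pi measured with chaperone (same units as enzyme_conc)
    pi_control  : Pi measured in the no-chaperone control
    enzyme_conc : assumed normalization; 1.0 matches the 1 uM protein used
    """
    corrected = [s - c for s, c in zip(pi_sample, pi_control)]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_p = sum(corrected) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (p - mean_p) for t, p in zip(times_min, corrected))
    slope_per_min = sxy / sxx          # least-squares slope, Pi per minute
    return slope_per_min / 60.0 / enzyme_conc


# Made-up values for illustration only.
rate = hydrolysis_rate([0, 30, 60, 90], [1, 4, 7, 10], [1, 1, 1, 1])
print(rate)
```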


Isothermal Titration Calorimetry

HSPA1A binds to nucleotides and protein substrates. Therefore, to determine whether mutated variants of HSPA1A bind ATP, ADP, and protein substrate differently from the WT protein, Isothermal Titration Calorimetry (ITC) was employed. ITC measurements were performed at 25 °C with a NanoITC calorimeter (TA Instruments). For these experiments, 4 µM of each protein was diluted to a final volume of 250 µL in ATPase Buffer, and 8 mM of ATP or ADP, or 16 mM of protein substrate, was diluted to 50 µL in the same buffer. After the samples were degassed, the protein was loaded into the cell, and the ligands (ATP, ADP, peptide) were loaded into the titration syringe. For each injection, a 2.5 µL portion of the ligand in the syringe was injected into the protein in the cell, at 240-second intervals. The data were processed with the NanoITC software (NanoAnalyze Software v3.7.0). Each measurement was repeated three times with different protein batches, and the values for each experiment were presented as independent values.

Functional Assays on Overexpressed HSPA1A in Mammalian Cells

To test the chaperone functions of HSPA1A within a cell, I determined whether the mutations affect (i) protein localization, (ii) refolding of stress-denatured proteins, and (iii) protection against the cellular toxicity caused by protein-aggregate stress. Hek293 (ATCC® CRL-1573™) and HeLa (ATCC® CCL-2) cells were maintained in a humidified 5% CO2 atmosphere at 37 °C in complete medium consisting of DMEM (Hek293) or MEM (HeLa) supplemented with 10% fetal bovine serum, 2 mM L-glutamine, and penicillin-streptomycin.


Protein Localization

To determine the change in subcellular localization that occurs due to natural variation in HSPA1A, the following procedure was used. Twenty-four hours prior to transfection, poly-D-lysine treated coverslips were placed into two different 24-well plates, one labeled 37 °C and another labeled 42 °C. After 18 hours, cells were transiently transfected with the WT or mutant HSPA1A-GFP construct using PolyJet In Vitro DNA Transfection Reagent (SignaGen) as per the manufacturer’s instructions. Transfection continued for 18 hours, and then the transfection medium was removed and replaced with fresh complete medium. At that time the cells were either kept at 37 °C or placed in a 42 °C water bath for 60 minutes. Immediately after the 60-minute heat stress, coverslips were fluorescently labeled for specific subcellular organelles following the procedures below:

Nucleus. Coverslips were fixed by aspirating the culture medium and adding 200 µL of fresh 4% paraformaldehyde (PFA) in complete growth medium. Coverslips were then placed in the dark and incubated for 15 minutes at room temperature. The PFA was aspirated from the coverslips, and the coverslips were washed five times with 1x PBS. Coverslips were then mounted on glass slides using 4 µL of DAPI-Fluoromount-G mounting medium (Southern Biotech), and gentle pressure was applied to remove air bubbles. Newly mounted slides were then placed in the dark at room temperature and allowed to dry for 24 hours prior to long-term storage at 4 °C.

Mitochondria. The culture medium was aspirated from confluent coverslips, and the cells were stained with 200 µL of 100 nM MitoTracker Red (Invitrogen) in complete growth medium at 37 °C for 40 minutes. Coverslips were then fixed, washed, mounted, and stored as described above.


Lysosome. Cells on coverslips were co-transfected with the lysosome-associated membrane protein 1 (LAMP1)-mRFPC1 construct. After 18 hours, the transfection medium was removed and replaced with fresh complete medium. Coverslips were then fixed, washed, mounted, and stored as described above.

Plasma Membrane. Coverslips were first fixed as described in the Nucleus section. After the final 1x PBS wash was aspirated from the coverslips, 200 µL of 500 ng/mL wheat germ agglutinin-AF555 (Thermo Fisher) was added. Coverslips were then placed in the dark and incubated at room temperature for 30 minutes. Coverslips were then washed, mounted, and stored as described above.

Quantification of Cell Images. Cell images were taken on a Leica DM IRE2 inverted scanning confocal microscope equipped with a 63x 1.4 oil objective (available on CSUF campus). Using the provided Leica software, raw image files were split and collected as three channels: channel 0 (green [λ = 488 nm]), channel 1 (red [λ = 555 nm]), and channel 2 (blue [λ = 358 nm]). Imaging was performed on at least 30 cells from three independent experiments. Images were analyzed in ImageJ (https://imagej.nih.gov/ij/download.html) using the corrected total cell fluorescence (CTCF) method (Burgess, 2011; Burgess et al., 2010; McCloy et al., 2014).

To measure CTCF, image files were opened in ImageJ and a region of interest (ROI) was defined by tracing the organelle with the free-hand ROI tool in channel 1 (mitochondria, lysosome, plasma membrane) or channel 2 (nucleus). This ROI was duplicated onto the image containing HSPA1A-GFP of the same cell. The area, mean fluorescence intensity, standard deviation, minimum, maximum, and integrated density of the ROI were measured using the built-in measuring program (Analyze > Measure), and the data were collected. The same process was then repeated to measure the cytosol, using the channel 0 image to outline the entire cytosol of the cell and the channel 2 image to exclude the nucleus. The only exception to this method was the nucleus measurement, for which the CTCF of the entire cell was measured instead of the cytosol. Lastly, a background measurement was taken by averaging the fluorescence intensity of the channel 0 background in three circles of 20-pixel radius, using the circular ROI tool provided in ImageJ. CTCF was then calculated for the organelle and the cytosol/total cell using the formula:

CTCF = Integrated Density – (Area of Region of Interest * Fluorescence of background reading)

The ratio of the organelle CTCF to the cytosolic CTCF was then determined for each cell. Quartiles, means, medians, and standard deviations of the individual cell CTCF ratios were calculated and displayed as box plots.
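The CTCF formula and the organelle-to-cytosol ratio can be written out as a short sketch. The function names, dictionary keys, and numbers below are hypothetical and stand in for values exported from the ImageJ measurement table; the three background readings correspond to the three circular background ROIs.

```python
def ctcf(integrated_density, area, mean_background):
    """CTCF = integrated density - (ROI area * mean background fluorescence)."""
    return integrated_density - area * mean_background


def organelle_to_cytosol_ratio(organelle, cytosol, background_means):
    """Ratio of organelle CTCF to cytosolic (or total cell) CTCF for one cell."""
    background = sum(background_means) / len(background_means)
    organelle_ctcf = ctcf(organelle["int_den"], organelle["area"], background)
    cytosol_ctcf = ctcf(cytosol["int_den"], cytosol["area"], background)
    return organelle_ctcf / cytosol_ctcf


# Made-up measurements for a single cell.
ratio = organelle_to_cytosol_ratio(
    {"int_den": 5000.0, "area": 100.0},
    {"int_den": 13000.0, "area": 500.0},
    [9.0, 10.0, 11.0],
)
print(ratio)
```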

Protein Refolding

Firefly luciferase provides stable and consistent measurements of the rate at which heat-denatured proteins refold after stress. Luciferase reporter systems have long been used to assay the changes that result from alterations of molecular chaperone systems, and can provide insight into the effect that loss or alteration of chaperone function has on protein refolding in both prokaryotic and eukaryotic systems (Freeman, Michels, Song, Kampinga, & Morimoto, 2000; Freeman, Myers, Schumacher, & Morimoto, 1995; Hageman, Vos, van Waarde, & Kampinga, 2007; He et al., 2009; Mohanan & Grimes, 2014; Nollen et al., 2001; Schröder, Langer, Hartl, & Bukau, 1993).

HeLa cells (ATCC, CCL-2) were split into 6-well plates at 2.0 x 10^6 cells/well and grown in complete media (MEM, 10% NGS, P/S) for 24 hours under standard cell culture conditions. Cells were then split into three labeled 24-well plates: 37 °C control, 45 °C with 0-minute recovery, and 45 °C with 60-minute recovery. Cells were then transiently co-transfected with the WT or mutant HSPA1A-GFP construct and firefly luciferase using PolyJet In Vitro DNA Transfection Reagent (SignaGen), following the manufacturer’s protocol. Transfection proceeded for 18 hours, and then the transfection medium was removed and replaced with fresh complete medium. After six hours, the medium was replaced once again with a 20 mM solution of 3-morpholinopropane-1-sulfonic acid (MOPS; Life Technologies) in fresh complete medium. Cells were incubated under standard culture conditions in the MOPS-containing medium for 20 hours, at which time it was replaced with the “heat-shock buffer,” which contained final concentrations of 20 mM MOPS and 40 µg/mL cycloheximide (Sigma-Aldrich) diluted in fresh complete medium. Plates were then returned to the incubator for 30 minutes to allow the cells to equilibrate to the new medium and to inhibit protein synthesis. After the 30-minute incubation, the heat-shock plates were placed into a 45 °C water bath for 30 minutes to denature luciferase. After the stress, the cells in the 37 °C control and the 45 °C with 0-minute recovery plates were resuspended through trypsinization and counted with a hemocytometer. Cell counts were calculated and averaged to create a fixed concentration of cells for each construct. Resuspended cells were spun down (700 x g for 2 minutes), and the medium was aspirated. Cells were then resuspended in ice-cold 1x PBS to a concentration of 8 x 10^5 cells/mL, and 25 µL of cell suspension was aliquoted in triplicate into a white, clear-bottomed 96-well plate. An equal volume (25 µL; 1:1 ratio) of Dual-Glo Luciferase reagent (Promega) was then added, and the plate was gently swirled and incubated in the dark for 10 minutes. After incubation, luminescence was measured using a Promega GloMax luminometer. The same procedure was then repeated for the 45 °C with 60-minute recovery plate to obtain the recovery measurement.

The raw luminescence measurements were averaged and normalized as percentages of the ratio of the experimental average to the GFP 37 °C control average, and the standard deviation was calculated. For the averages of three independent experiments, quartiles, means, medians, and standard deviations were calculated and displayed as box plots.

Protein Aggregation

Modulation of cellular toxicity due to protein aggregation is commonly assayed to determine the effect that the chaperone activity of HSPA1A has on cellular (Nollen & Morimoto, 2002; Sakahira, Breuer, Hayer-Hartl, & Hartl, 2010). Though the exact reason(s) for cell toxicity are still controversial, the consensus is that impairment of cellular and transcriptional functions by protein aggregates is most likely to blame (Blum, Hourez, Galas, Popoli, & Schiffmann, 2003).

One commonly used model of protein-aggregate stress simulates the symptoms of Huntington’s disease (HD) by transfecting eukaryotic cells or organisms with a mutated form of exon 1 of the huntingtin gene (HTT). In the wild-type form of the huntingtin protein, the first exon contains a stretch of 6-30 glutamines that begins at position 17 of the huntingtin protein, commonly called a polyglutamine (polyQ) stretch (Didszun, 2009). In an individual with HD, a trinucleotide (CAG)n repeat-expansion mutation creates an elongated polyQ stretch ranging from 36 to 120 glutamines long, with the age of onset of HD inversely related to the length of the polyQ stretch (Kuiper, de Mattos, Jardim, Kampinga, & Bergink, 2017). In cellular and animal studies, the HD phenotype has been reproduced using only exon 1 of HTT, which has thus become a common method for assaying protein-aggregate-induced cellular toxicity (Kakkar, Prins, & Kampinga, 2012).

In this study, two different constructs of HTT exon 1 were used: a 23-glutamine control (Q23) and a 74-glutamine experimental construct (Q74). To monitor transfection efficiency, Q23 and Q74 were C-terminally fused to a red fluorescent protein (RFP).

HeLa cells were split onto 6-well plates, with each well labelled with the different constructs of HSPA1A and huntingtin (i.e., wild-type Q23, wild-type Q74). These were placed in a 37 °C incubator and allowed to grow for 24 hours under standard cell culture conditions to reach ~80-90% confluency. The next day, cells were transiently transfected with the labelled constructs using PolyJet In Vitro DNA Transfection Reagent (SignaGen), following the manufacturer’s protocols. The following day, cells were observed under the microscope for transfection efficiency and split in triplicate into a 24-well plate for testing. At 48 hours post transfection, the cells were trypsinized and placed into centrifuge tubes. A cell count was taken by hemocytometer and all construct concentrations were equalized to 8 x 10^5 cells/mL in ice-cold 1x PBS. Then, 25 µL of each cell suspension was placed in duplicate into two white, clear-bottom 96-well plates, one plate to measure caspase activity and the other to measure cell viability. Next, 25 µL of Caspase-3/7 Glo (Promega) reagent or 25 µL of CellTiter-Glo 2.0 (Promega) reagent was added to the cell suspension in the appropriately labelled 96-well plate, and the plate was gently swirled. The plate was then incubated for 1 hour (Caspase-3/7 Glo) or 12 minutes (CellTiter-Glo 2.0) at room temperature in the dark. After the incubation, the plate was read with the GloMax luminometer to measure luminescence.


The results for each construct were then averaged, and percentages were determined by calculating the ratio of the average luminescence for the construct to the average luminescence for a GFP-Q74 control (100% control). The difference between each construct’s calculated ratio and the 100% control was then calculated and displayed as a bar graph, and the standard deviations of the differences from the GFP-Q74 control were calculated.
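The normalization to the GFP-Q74 control can be sketched as follows. The function names and the luminescence readings are made up for illustration; only the arithmetic (ratio of averages expressed as a percentage, then the difference from 100%) follows the description above.

```python
from statistics import mean


def percent_of_control(sample_readings, control_readings):
    """Average sample luminescence as a percentage of the control average."""
    return 100.0 * mean(sample_readings) / mean(control_readings)


def difference_from_control(sample_readings, control_readings):
    """Difference between the sample percentage and the 100% control."""
    return percent_of_control(sample_readings, control_readings) - 100.0


# Hypothetical replicate readings (arbitrary luminescence units).
sample = [50, 70]
control = [100, 140]
print(percent_of_control(sample, control))
print(difference_from_control(sample, control))
```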


CHAPTER 5

RESULTS

Origin and Evolution of HSPA1A

The HSPA1 cluster genes provide a unique model for observing how closely-related constitutive and heat-induced cytosolic Hsp70 genes arose and are maintained through mammalian evolution. However, clear models for when these genes arose and how they have evolved have yet to be determined. To address this, I collected nucleotide and amino acid sequences, genomic information, and human SNP information for the HSPA1 cluster genes and performed phylogenetic analyses, genomic context analyses, sequence comparisons, and SNP profile comparisons.

Phylogenetic Analyses and Genomic Context of HSPA1A

To determine the origin of HSPA1A, I analyzed the relationships of the HSPA1 cluster genes relative to each other through phylogenetic analysis and by comparing changes in their genomic positions (synteny) between species. As HSPA1 genes are first observed in the earliest mammals, I restricted my query to mammalian sequences in order to determine where the HSPA1A/B genes emerged.

First, the HSPA1A/B genes appear to have arisen in placental mammals, which is supported by marsupials having only one heat-inducible HSPA1A/B gene in addition to the HSPA1L gene (Figure 6). Second, the HSPA1A, HSPA1B, and HSPA1L genes are in conserved synteny in all mammals, and it appears that an event early in the divergence of placental mammals and marsupials gave rise to HSPA1A and HSPA1B as independent genes. Third, based on phylogenetic analysis, the paralogous HSPA1A/B sequences in placental mammals show intra-species clustering, with HSPA1A and HSPA1B grouping together within each species. In contrast, the HSPA1L genes show only inter-species clustering, forming monophyletic clades. Lastly, HSPA1A and HSPA1B are intronless, while HSPA1L contains one intron in the 5' UTR (not shown), a trend that is conserved in all the species studied.

Figure 6. The HSPA1A/B cluster originated early during the evolution of placental mammals. (a) The evolutionary history was inferred using the Neighbor-Joining and Maximum Likelihood methods. The percentages of replicate trees in which the sequences clustered together in the bootstrap test (1000 replicates) are shown next to the branches. All positions containing gaps and missing data were eliminated. (b) The HSPA1 genes HSPA1A, HSPA1B, and HSPA1L are in conserved synteny in all mammals. Collected proteins were mapped to the corresponding mammalian genome sequences, and the genomic context was compared between sequences to determine syntenic relationships.

Determining Evolutionary Mechanisms that Represent how HSPA1A is Maintained

The phylogenetic analyses revealed that HSPA1A and HSPA1B are highly conserved. This high level of sequence conservation can be explained by strong purifying selection, non-reciprocal recombination (gene conversion), or a combination of both mechanisms. To begin distinguishing between these three possibilities, I compared the nucleotide and amino acid identities among the nine representative mammalian species to see whether nucleotide or amino acid sequences are better preserved between the HSPA1 cluster genes (Suppl. Table 1; Table 1). This comparison revealed that distinct species and different genes show different patterns of amino acid and nucleotide sequence conservation. Yet, for nearly all sequence comparisons the amino acid identity is considerably higher than the nucleotide identity. This type of conservation is suggestive of purifying selection, as the number of synonymous (nucleotide) differences is higher than the number of non-synonymous (amino acid) differences when comparing the HSPA1 genes.

Table 1. Sequence Comparisons between Representative HSPA1 Cluster Genes in Multiple Species

             human 1A   human 1B   human 1L   goat 1A   goat 1B   goat 1L   opossum 1   opossum 1L
human 1A     -          99.73      81.16      94.14     94.56     78.54     78.54       80.37
human 1B     100.00     -          81.11      94.19     94.61     78.49     78.44       80.22
human 1L     89.95      89.95      -          79.49     80.01     88.64     81.94       80.79
goat 1A      97.65      97.65      88.70      -         98.95     77.55     77.44       79.28
goat 1B      98.74      98.74      89.80      98.74     -         78.02     77.97       79.75
goat 1L      86.50      86.50      92.15      85.40     86.50     -         80.06       79.38
opossum 1    94.66      94.66      90.27      92.94     94.20     87.13     -           92.09
opossum 1L   92.31      92.31      92.31      91.05     92.15     88.38     93.09       -

Numbers are percentages (%). Nucleotide identities above diagonal in white. Amino acid identities below diagonal in grey.

To better elucidate where and how these nucleotide changes emerge, I used a sliding window analysis to determine which regions of the nucleotide and amino acid sequences of HSPA1A and HSPA1B are similar (Appendices B-C, Figure 7). I first calculated the identities (both nucleotide and amino acid) between HSPA1A and HSPA1B in humans and cattle. These analyses revealed the presence of large genic regions, covering 50-100% of the gene sequence, that are identical at both the nucleotide and the amino acid levels. Together with the results presented in Table 1, these results reveal that different species have different patterns of sequence conservation between HSPA1A and HSPA1B. Additionally, these results support purifying selection, as the differences in sequence between HSPA1A and HSPA1B are mostly synonymous. These results also raise the possibility that genetic recombination in the form of gene conversion plays a role in homogenizing large portions of the nucleotide sequences, leading to identical nucleotide sequences between HSPA1A and HSPA1B.

Figure 7. HSPA1A and HSPA1B genes are highly conserved at both the nucleotide (a, b) and amino acid (c, d) levels. Pairwise sequence alignments between human (a, c) and cattle (b, d) HSPA1A and HSPA1B sequences over sliding windows were performed using SWAAP (v.1.0.3). Full analysis can be found in Appendices B and C.


However, it is difficult to determine whether the high amino acid conservation is due to purifying selection or gene conversion (Arnheim, 1983). To distinguish between these hypotheses, I also calculated the proportions of synonymous (ps) and non-synonymous (pn) differences per synonymous and non-synonymous site of the gene, respectively (Appendix D). Comparing the ps and pn values shows how these genes evolve and whether change is skewed towards keeping the protein as it is (ps > pn) or changing it (ps < pn) (Nei & Kumar, 2000; Nikolaidis & Nei, 2004). As can be seen in Appendix D and Supplementary Table 2, the ps values are considerably higher than the pn values, supporting the previous finding that purifying selection is acting against amino acid mutations in all HSPA1 genes.
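To make the distinction between synonymous and non-synonymous differences concrete, the sketch below classifies codon differences between two aligned coding sequences using the standard genetic code. It is a deliberately crude count of differing codons and is not the modified Nei-Gojobori method used in MEGA, which also estimates the number of synonymous and non-synonymous sites to obtain ps and pn. The function name and the example sequences are hypothetical.

```python
# Standard genetic code, with bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def count_codon_differences(cds_a, cds_b):
    """Count differing codons that keep (synonymous) or change the amino acid.

    Assumption: both sequences are in frame, aligned without gaps, and of
    equal length; codons with ambiguous bases are skipped.
    """
    synonymous = 0
    nonsynonymous = 0
    for pos in range(0, min(len(cds_a), len(cds_b)) - 2, 3):
        codon_a = cds_a[pos:pos + 3].upper()
        codon_b = cds_b[pos:pos + 3].upper()
        if codon_a == codon_b:
            continue
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# Toy sequences: GCT -> GCC keeps alanine, AAA -> AGA changes lysine to arginine.
counts = count_codon_differences("ATGGCTAAA", "ATGGCCAGA")
print(counts)
```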

However, purifying selection alone does not fully explain the occurrence of intra-species clades for the paralogous HSPA1A/B genes or the fact that large portions of the nucleotide sequences are identical between the two genes (Appendix B; Figure 7), because the presence of the HSPA1A/B gene cluster in all placental mammals suggests that these genes emerged from an older primary gene duplication event. Thus, another possibility could be the suggested gene conversion-like mechanism that homogenizes the two sequences (Kudla et al., 2004).

To clarify where the HSPA1 cluster genes differ, I created sliding window comparisons of synonymous substitutions per synonymous site (ps) and non-synonymous substitutions per non-synonymous site (pn) over the length of the HSPA1 genes (Appendix E). First, the number of synonymous substitutions and the pn values between HSPA1A and HSPA1L are 10 to 30 times higher than their equivalent values between HSPA1A and HSPA1B. Second, for the human HSPA1A/B genes there are only a few synonymous substitutions and no non-synonymous substitutions. Third, large parts of the HSPA1A and HSPA1B genic regions show no substitutions. Altogether, these data strongly support the notion that the human HSPA1A/B gene cluster is evolving independently from HSPA1L, that HSPA1A/B follow a concerted model of evolution through a gene conversion-like mechanism, and that the variability observed is synonymous, as would be expected for genes under purifying selection.

Signs of Purifying Selection and Gene Conversion in Human Microevolution

To better understand evolution within a species, I next investigated the distribution of the different SNP types among the human HSPA1 genes and their neighbors using the 1000 Genomes dataset. This analysis revealed that for the intronless HSPA1A and HSPA1B, most SNPs lie in the 3' and 5' UTRs, and the most common SNP type in the coding region is synonymous (Table 2). For the other six genes, which have one or more introns, many of the SNPs lie in the intron(s) and the 5' or 3' UTR, and the SNPs found in the coding region are mostly non-synonymous. The distribution of SNP types is also significantly different between the HSPA1 genes and their neighbors.


Table 2. The Distribution of SNP Type is Significantly Different Between the HSPA1 and their Neighboring Genes

Gene       3'UTR        5'UTR        CDS_other    intron          missense      synonymous    Total
HSPA1A     7 (17.95)    16 (41.03)   0 (0.00)     0 (0.00)        5 (12.82)     11 (28.21)    39
HSPA1B     11 (23.91)   18 (39.13)   0 (0.00)     0 (0.00)        7 (15.22)     10 (21.74)    46
HSPA1L     6 (3.92)     26 (16.99)   2 (1.31)     60 (39.22)      37 (24.18)    22 (14.38)    153
C6orf48    7 (4.35)     17 (10.56)   2 (1.24)     125 (77.64)     9 (5.59)      1 (0.62)      161
LSM2       9 (3.95)     10 (4.39)    1 (0.44)     206 (90.35)     0 (0.00)      2 (0.88)      228
NEU        16 (18.82)   7 (8.24)     2 (2.35)     36 (42.35)      12 (14.12)    12 (14.12)    85
SLC44A4    11 (2.61)    0 (0.00)     11 (2.61)    339 (80.33)     42 (9.95)     19 (4.50)     422
VARS       0 (0.00)     0 (0.00)     10 (2.44)    309 (75.37)     56 (13.66)    35 (8.54)     410
Total      67 (4.34)    94 (6.09)    28 (1.81)    1075 (69.62)    168 (10.88)   112 (7.25)    1544

Values are counts, with row percentages (%) in parentheses.

Test               ChiSquare   Prob > ChiSq
Likelihood Ratio   628.751     < 0.0001*
Pearson            627.218     < 0.0001*
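For readers who wish to see the arithmetic behind the Pearson test reported with Table 2, the sketch below computes the Pearson chi-square statistic and its degrees of freedom from the SNP counts in the table. It is a plain-Python illustration with hypothetical names, not the JMP procedure used for the reported values, and it is not guaranteed to reproduce the software output to the last digit.

```python
# SNP counts per gene from Table 2, in the column order:
# 3'UTR, 5'UTR, CDS_other, intron, missense, synonymous.
COUNTS = {
    "HSPA1A": [7, 16, 0, 0, 5, 11],
    "HSPA1B": [11, 18, 0, 0, 7, 10],
    "HSPA1L": [6, 26, 2, 60, 37, 22],
    "C6orf48": [7, 17, 2, 125, 9, 1],
    "LSM2": [9, 10, 1, 206, 0, 2],
    "NEU": [16, 7, 2, 36, 12, 12],
    "SLC44A4": [11, 0, 11, 339, 42, 19],
    "VARS": [0, 0, 10, 309, 56, 35],
}


def pearson_chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a count table."""
    rows = list(table.values())
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for row, row_total in zip(rows, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / grand_total
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return chi2, dof


chi2, dof = pearson_chi_square(COUNTS)
print(round(chi2, 3), dof)
```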

A sliding window analysis reveals that both HSPA1A and HSPA1B contain very few mutations, and those that they do have are found only at the far N- and C-termini of the encoded protein, mirroring the findings observed in Figure 7 and Appendices B-C (Appendix F). Additionally, this result reveals that more than 90% of both HSPA1A/B genes has zero mutational load, which could imply that gene conversion homogenizes these two genes in this region. In contrast, HSPA1L contains SNPs that cover the entire length of the gene, showing a lack of recombination with HSPA1A or HSPA1B. These patterns still hold true when compared with the 24 times larger ExAC dataset (not shown). Altogether, these data suggest that human microevolution mirrors the patterns seen in inter-species evolution. First, purifying selection creates SNP profiles that are skewed towards having more synonymous than non-synonymous mutations within HSPA1A and HSPA1B. Second, a gene conversion-like mechanism between the HSPA1A and HSPA1B genes homogenizes the regions between residues 100-450 of these proteins, which thus show no mutational load. On the other hand, HSPA1L shows a very different SNP profile, suggesting that it evolves independently from these two genes through a different mechanism.

Structural and Functional Consequences of SNPs on Protein Activity of HSPA1A

Based on the low number of non-synonymous SNPs and how the SNPs are distributed within HSPA1A, it appears that purifying selection is already acting even within the relatively short timeframe of human evolution. To test this notion, I experimentally assessed the consequences that the most radical natural mutations have on the structure and protein activities of HSPA1A. My approach is based on the logic that if this gene is undergoing purifying selection, then even the most radical natural mutations will have little to no effect on the function of the protein.

Results of SNP Curation and Collection

To determine which SNPs would be most likely to cause functional change, I applied a set of criteria to rank the mutations, as I did not identify any natural mutations at sites of known function. From this collection, six variants (Table 3) were selected as they ranked highest in most if not all of the criteria (i.e., being found in known populations, high amino acid conservation for the position, high minor allele frequency, negative BLOSUM scores, and deleterious/damaging results from PolyPhen and SIFT). Amino acid class changes are also presented to provide a general picture of how radical each amino acid change is with respect to molecular interactions within the protein.

Along with these variants, two HSPA1A controls were used for functional comparison: the wild-type (WT) HSPA1A protein and a synthetic HSPA1A variant, K71A, which has been previously described as unable to hydrolyze ATP (Bonini, 2002; Chow, Steel, & Anderson, 2009; O’Brien, Flaherty, & McKay, 1996; Zeng et al., 2004). First, none of the positions of my collected SNPs has been functionally characterized in the literature. Second, judging from their minor allele frequencies (MAF), these mutations are very rare, if not private, variations (Table 3). Such low values are suggestive of purifying selection and imply that the penetrance of these variations into a population is low, possibly due to the functional consequences that these variants may have.

Table 3. Criteria Data for SNPs of Interest in Functional Characterization.

| Variant | Population | Conservation | MAF | BLOSUM 65 (80) | PolyPhen | SIFT | AA Change |
|---|---|---|---|---|---|---|---|
| S16P | European American (Clinical) | >95% | 0.1% | -1 (-1) | Probably Damaging | Deleterious | # to 0 |
| S16Y | African American | >95% | 0.0008% | -2 (-2) | Probably Damaging | Deleterious | # to 0 |
| R36C | European (ExAC)† | 92% | 0.000825% | -4 (-4) | Benign | Deleterious | + to # |
| I74T | European (ExAC)† | 94% | 0.0002% | -1 (-1) | Probably Damaging | Deleterious | 0 to # |
| I480N | European American (Clinical) | >95% | N/A | -3 (-4) | Probably Damaging | Deleterious | 0 to # |
| F592S | European American (ExAC)† | 86% | 0.0358% | -2 (-3) | Possibly Damaging | Deleterious | 0 to # |

AA class change is based on the charge of the amino acid in a solution at neutral pH. 0 - nonpolar, # - polar, uncharged, + (-) - basic (acidic). † Population as defined by Lek et al. (2016).
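The "AA Change" column of Table 3 can be illustrated with a short sketch. This is only an illustration under stated assumptions: the text gives the four class symbols but not the residues in each class, so the grouping below is hypothetical, and tyrosine is placed in the nonpolar group solely because Table 3 lists S16Y as "# to 0". The names `AA_CLASS` and `class_change` are invented for this example.

```python
# Class symbols follow the Table 3 footnote:
# "0" nonpolar, "#" polar uncharged, "+" basic, "-" acidic.
# ASSUMPTION: class membership is not given in the text. Tyrosine (Y) is
# grouped as nonpolar only to match the "# to 0" entry for S16Y; other
# classification schemes treat it as polar.
AA_CLASS = {}
for residues, symbol in (
    ("AVLIMFWPGY", "0"),
    ("STCNQ", "#"),
    ("KRH", "+"),
    ("DE", "-"),
):
    for aa in residues:
        AA_CLASS[aa] = symbol


def class_change(variant):
    """Return the class change for a variant name such as 'R36C'."""
    ref, alt = variant[0], variant[-1]
    return f"{AA_CLASS[ref]} to {AA_CLASS[alt]}"


for name in ("S16P", "S16Y", "R36C", "I74T", "I480N", "F592S"):
    print(name, class_change(name))
```

Under these assumptions the sketch returns the same class changes as listed in Table 3 for all six variants.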


Effects of SNPs on Thermal Stability of HSPA1A

To determine whether a mutation affects protein stability, which could imply specific changes in its structure, I measured how much the collected mutations alter the melting temperature (Tm) of the protein. The results of this assay revealed that all mutations tested except I480N caused the protein to denature at a significantly different temperature and thus are predicted to alter its stability (Table 4). S16P, S16Y, I74T, and F592S lowered the Tm, while R36C and I480N raised it, although the change for I480N was not statistically significant (Table 4). Altogether, the changes in Tm caused by these mutations are subtle, ranging from approximately 0.1 to 3 °C. These results suggest that even the most radical variants have only a minor effect on the structural integrity of the protein, which supports purifying selection, although interactions with ligands also need to be tested.

Table 4. Melting Temperature (Tm) of the Wild Type and Mutated HspA1A Variants

| Protein | Tm (°C) | SD | t test |
|---|---|---|---|
| WT | 43.49 | 0.1485 | n/a |
| K71A | 46.79 | 0.1464 | <0.0001 |
| S16P | 43.08 | 0.1092 | 0.0188 |
| S16Y | 41.32 | 0.2594 | 0.0002 |
| R36C | 44.24 | 0.2604 | 0.0125 |
| I74T | 40.65 | 0.1025 | <0.0001 |
| I480N | 43.59 | 0.2940 | 0.6276 |
| F592S | 40.37 | 0.3196 | 0.0001 |

Tm values were calculated by fitting the melting data to the Boltzmann equation. The values are averages of three independent experiments (N = 3).
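The t test column of Table 4 can be approximately reproduced from the reported means and standard deviations. The sketch below is an illustration only, not the software used in this study. It assumes an unpaired Student's t test with equal variances and N = 3 per group, which is not stated explicitly for Table 4, and the function names `student_t` and `two_tailed_p` are hypothetical. Because the tabulated means and SDs are rounded, small deviations from the reported p values are expected.

```python
import math


def student_t(mean_a, sd_a, mean_b, sd_b, n=3):
    """Unpaired Student's t statistic from summary data (equal n per group)."""
    pooled_var = (sd_a ** 2 + sd_b ** 2) / 2.0
    std_err = math.sqrt(pooled_var * 2.0 / n)
    return (mean_b - mean_a) / std_err, 2 * n - 2


def two_tailed_p(t, df, steps=20000):
    """Two-tailed p value: integrate the t density from 0 to |t| (Simpson)."""
    coef = math.gamma((df + 1) / 2) / (
        math.sqrt(df * math.pi) * math.gamma(df / 2)
    )

    def pdf(x):
        return coef * (1 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3
    return max(0.0, 1.0 - 2.0 * area)


# Mean Tm and SD copied from Table 4.
TABLE4 = {
    "WT": (43.49, 0.1485), "K71A": (46.79, 0.1464),
    "S16P": (43.08, 0.1092), "S16Y": (41.32, 0.2594),
    "R36C": (44.24, 0.2604), "I74T": (40.65, 0.1025),
    "I480N": (43.59, 0.2940), "F592S": (40.37, 0.3196),
}

wt_mean, wt_sd = TABLE4["WT"]
for name, (mean, sd) in TABLE4.items():
    if name == "WT":
        continue
    t, df = student_t(wt_mean, wt_sd, mean, sd)
    print(f"{name}: t = {t:.2f}, df = {df}, p = {two_tailed_p(t, df):.4f}")
```

For I480N this gives a p value close to the tabulated 0.6276, consistent with the statement that its shift in Tm is not statistically significant.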


Effects of SNPs on Binding Kinetics of HSPA1A to ATP, ADP, and Protein Substrate

To get a more comprehensive picture of how the variants may alter the function of HSPA1A, I determined whether natural mutations alter the ability of HSPA1A to bind the ligands essential to its function (ATP, ADP, and protein substrate). My results suggest that the effect of the mutations is small but reveal some interesting findings about particular positions on the protein (Appendices G-I; Figure 8). First, most variants bind ATP with an affinity similar to that of the WT, with the exception of S16P. Furthermore, analyses of the enthalpy (dH) and the entropy (dS) of the interactions suggest that the different types of molecular forces, electrostatic and hydrophobic, that govern binding to ATP remain relatively unchanged (Appendix G). Second, the affinity for ADP is altered only in the S16P and R36C mutations, and the enthalpy and entropy of the interactions again remain relatively unchanged (Appendix H). Third, the affinity for protein substrate is not greatly altered by the presence of mutations (Appendix I). However, the enthalpy values for S16P, S16Y, and R36C show small alterations. As for the entropy, S16P, R36C, I480N, and F592S show reversals in their values compared to the WT, similar to what is observed in the K71A null mutation (Appendix I).


Figure 8. The wild type (WT) and mutated (I74T) HSPA1A variant bind similarly to ATP. Representative Isothermal Titration Calorimetry (ITC) assays using purified recombinant HspA1A proteins and ATP. (a, b) Represent the ITC raw data for 20 automatic injections of ATP into the sample cell containing either WT or mutated HspA1A. (c, d) Represent ITC binding curves obtained for the interaction between ATP and HspA1A. The data shown are representative of three independent experiments (N = 3).

These results suggest two things. First, the relatively minor changes that occur in nucleotide binding suggest that the vital function of ATP hydrolysis and replacement remains conserved and subject to strong purifying selection, as it is essential to the overall function of the enzyme. Second, these natural variants cause noticeable alterations in the molecular forces that govern the ability of HSPA1A to interact with protein substrates. These alterations in the stability of HSPA1A variants and in their binding interactions with protein substrates raise the question of whether these variants affect the primary chaperone functions of HSPA1A.


Effects of SNPs on ATPase Function of HSPA1A

I next tested the effect of these mutations on ATP hydrolysis, a function critical for HSPA1A to act as a molecular chaperone (Mayer & Bukau, 2005). My results reveal that all mutant proteins retain the ability to hydrolyze ATP (Figure 9 and Table 5). However, the rate of hydrolysis is affected in some variants when compared to the WT and the K71A null mutation. Using the slope of the reaction divided by time as a measure of rate (s-1), three mutations, R36C, I74T, and F592S, have a relatively slow rate of ATP hydrolysis (Table 5). Their rates are significantly lower than the rate of the WT protein but remain higher than the rate produced by the K71A mutation, which is known to have negligible ATPase activity (O’Brien et al., 1996). The difference from K71A is statistically significant for I74T and F592S but not for R36C (p = 0.0668; Table 5).

Figure 9. Variants of HSPA1A can affect the rate of ATP hydrolysis for the given protein over a 90-minute period of time. Data is presented as box plots of averages for three independent experiments (N = 3).


Table 5. Equation, R², Rate (s-1), and Standard Deviation for Linear Regression Lines of the Observed Data for ATP Hydrolysis

| Variant | Slope (regression equation) | R² | Rate (s-1) | SD | t test (WT) | t test (K71A) |
|---|---|---|---|---|---|---|
| WT | y = 22.522x - 18.248 | 0.9999 | 0.3754 | 0.0526 | n/a | n/a |
| K71A | y = 5.4838x - 2.3661 | 0.9565 | 0.0914 | 0.0128 | 0.0008 | n/a |
| S16P | y = 19.82x + 2.8333 | 0.7070 | 0.3303 | 0.0462 | 0.3276 | n/a |
| S16Y | y = 18.809x + 3.8542 | 0.8007 | 0.3135 | 0.0439 | 0.1925 | n/a |
| R36C | y = 7.334x + 6.9279 | 0.9150 | 0.1222 | 0.0171 | 0.0014 | 0.0668 |
| I74T | y = 13.222x + 0.8529 | 0.8172 | 0.2204 | 0.0309 | 0.0116 | 0.0026 |
| I480N | y = 22.127x - 20.011 | 0.9623 | 0.3688 | 0.0516 | 0.8845 | n/a |
| F592S | y = 9.3597x - 0.3532 | 0.8765 | 0.1560 | 0.0218 | 0.0026 | 0.0115 |

SD is standard deviation for three independent experiments (N = 3). Student’s unpaired t test comparing rates of ATP hydrolysis for WT to variants. For those with significantly slowed ATP hydrolysis a second t test comparing the slowed variants with K71A was conducted.
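The relationship between the regression slopes and the rates in Table 5 can be sketched as follows. This is an illustration under an assumption: the units of the slope are not stated in this section, but treating each slope as a change per minute and dividing by 60 reproduces the tabulated s-1 values. The names `SLOPES_PER_MIN` and `rate_per_second` are hypothetical.

```python
# Regression slopes copied from Table 5.
# ASSUMPTION: slopes are expressed per minute, so dividing by 60 s/min
# gives the rate per second reported in the table.
SLOPES_PER_MIN = {
    "WT": 22.522, "K71A": 5.4838, "S16P": 19.82, "S16Y": 18.809,
    "R36C": 7.334, "I74T": 13.222, "I480N": 22.127, "F592S": 9.3597,
}


def rate_per_second(slope_per_min):
    """Convert a per-minute slope into a per-second rate."""
    return slope_per_min / 60.0


wt_rate = rate_per_second(SLOPES_PER_MIN["WT"])
for name, slope in SLOPES_PER_MIN.items():
    rate = rate_per_second(slope)
    # Also report each rate as a percentage of the WT rate.
    print(f"{name}: {rate:.4f} s-1 ({100 * rate / wt_rate:.0f}% of WT)")
```

Expressing each rate as a fraction of the WT rate makes it easier to see that the three slowed variants sit between the WT and the K71A control.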

Interestingly, none of these amino acid positions is known to be involved in the binding or hydrolysis steps; instead, they may affect secondary movements of the molecule that result in alterations of ATP hydrolysis. However, it is currently unknown whether these small changes are physiologically significant and alter the chaperone function inside a cell.

Overall, these findings support purifying selection as a plausible mechanism with which HSPA1A deals with mutations, by showing that even the most radical of natural coding mutations impart only small changes to stability and molecular interactions with ligands. Also, ATP hydrolysis, which is essential to the primary function of HSPA1A to hold and refold proteins, appears to be conserved.


Functional Consequences of Polymorphisms on Cellular Stress Response

Findings about changes to HSPA1A seen in recombinant proteins can only provide so much insight into possible physiological effects. As there are alterations in protein substrate interactions and ATP hydrolysis, the question remains whether HSPA1A variants show cellular consequences. To provide insights about what these consequences may be, I analyzed the effects that these mutations have on the cellular localization and chaperone functions of HSPA1A.

Effects of SNPs on Protein Localization of HSPA1A Inside a Cell

To determine if variants disrupt the normal localization of HSPA1A in mammalian cells, colocalization was determined before (37 °C) and immediately after a heat stress (42 °C). Under normal physiological conditions, HSPA1A is found in the cytosol and nucleus in quantities that depend on the cell type and the stage of the cell cycle (Daugaard et al., 2007; Milarski & Morimoto, 1986). When cells are exposed to a stressor, these quantities change and movement towards different cellular organelles, namely the nucleus, plasma membrane, and lysosomes, is observed (Bivik, Rosdahl, & Öllinger, 2007; Garrido et al., 2006; Gyrd-Hansen, Nylandsted, & Jaattela, 2004; Kotoglou et al., 2009; Multhoff et al., 1995; Multhoff & Hightower, 1996; Nylandsted et al., 2004; Zeng et al., 2004). This methodology was used to observe whether HSPA1A variants caused the protein to aggregate, become trapped, or translocate differently from what is expected.

First, none of these variations cause HSPA1A to be misfolded, aggregate, or become trapped, any of which could render the protein ineffective (Appendices J-M). Second, my results reveal that colocalization of WT HSPA1A with the nucleus follows the expected trend of increased localization to the nucleus immediately after heat stress (Figure 10). The variants surveyed do not visibly change the nuclear localization of HSPA1A but, in some cases, do alter how much HSPA1A is localized when quantified (Appendices J-M). For the S16Y and F592S variants there is an overall increase in the CTCF ratio at both 37 °C and 42 °C, but the pattern of translocation is preserved. With the R36C and I480N constructs, HSPA1A is localized similarly to the WT under normal growth conditions but shows significantly less translocation after stress.

Figure 10. Histogram with data points (circles) of CTCF ratios for A) nucleus, B) mitochondria, C) lysosome, and D) plasma membrane. White bars (37 °C) and black bars (42 °C). X represents the mean, line represents the median, error bars represent the min and max data points. Statistical significance for 30 cells of each construct (N = 30) was determined using an unpaired t test. The comparisons are: construct 37 to construct 42 (α p value ≤ 0.05, β p value < 0.01), WT 37 to variant 37 (γ p value ≤ 0.05, δ p value < 0.01), WT 42 to variant 42 (ε p value ≤ 0.05, ζ p value < 0.01). Unpaired t test values can be found in Appendix N.

The trends for the localization of HSPA1A to mitochondria before and after heat stress have not been previously published. However, it could be expected that HSPA1A localizes to mitochondria to help shuttle proteins destined for the organelle and that, when a stress occurs, HSPA1A moves away from the mitochondria to support other vital systems, as protein synthesis is halted during stress. My data follow this hypothesized trend and show a decrease in localization across all HSPA1A variants, with subtle variations, immediately after heat stress (Figure 10).

Previous studies have shown that HSPA1A will localize to the lysosomes to prevent lysosomal-based . This trend is seen in the WT with an increased localization towards lysosomes after stress (Figure 10). As for the variants, my results show that all constructs except for S16Y follow a similar trend of increased localization immediately after stress with alterations to how HSPA1A localizes during normal growth conditions.

Lastly, for the plasma membrane (PM), it is well established both in published work and within my lab that HSPA1A localizes to the PM immediately after a heat stress. For all variants except R36C, this trend is preserved, with alterations in the average amount of HSPA1A that is localized to the PM after stress (Figure 10).

Overall, these data show minor changes in the amount of protein that translocates to cellular organelles, and none of the variants becomes trapped or aggregates (Figure 10).

Effects of SNPs on Protein Refolding Function of HSPA1A

To determine how variants affect the refolding ability of HSPA1A, I assessed the percent of refolded luciferase immediately after (0-minute recovery) and following a 60-minute recovery from a 16-minute heat stress at 45 °C (Figure 11). My results reveal that the majority of mutations refold denatured luciferase at the same rate as the WT. However, two natural variants, R36C and I480N, both significantly alter the percent of refolded luciferase after 60 minutes of recovery. For the R36C variant, the percent of refolded luciferase increased by approximately 21% compared to the WT. The I480N variant, on the other hand, showed a decrease in refolded luciferase of approximately 7%, similar to the rate of refolding seen in the K71A null mutation.

| Variant | t test |
|---|---|
| GFP | 0.0537 |
| WT | n/a |
| S16P | 0.3562 |
| S16Y | 0.4006 |
| R36C | 0.0074 |
| I74T | 0.7959 |
| I480N | 0.0333 |
| F592S | 0.1327 |
| K71A | 0.0457 |

Figure 11. The natural variants R36C and I480N show significant changes to the ability of HSPA1A to refold denatured luciferase. Percentages are ratios of experimental luminescence to a GFP-luciferase construct at 37 °C. Unpaired t test of three independent experiments (N = 3) comparing WT 60-minute values to variant 60-minute values. * p value ≤ 0.05, *** p value < 0.01.


These results suggest that the majority of variants do not significantly change the refolding properties of HSPA1A. However, key properties of the amino acids found at positions 36 and 480 of HSPA1A appear to be important to the protein’s ability to promote denatured protein refolding.

Effects of SNPs on HSPA1A Protecting Against Protein-Aggregate Cell Toxicity

To determine the effect of SNPs on the ability of HSPA1A to protect against protein-aggregate stresses, I observed alterations in cellular survival when cells were subjected to elongated huntingtin protein mutants. In general, HSPA1A can negatively regulate multiple pathways that would lead to apoptotic cell death (Garrido et al., 2006). HSPA1A does this by surrounding the protein inclusions, reducing effector caspase activity, and thus promoting cellular viability (Zeng et al., 2004; Zhou, Li, & Li, 2001). However, the overexpression of HSPA1A alone is not capable of reducing the number of aggregates present within the cell (Jana, Tanaka, Wang, & Nukina, 2000; Rujano, Kampinga, & Salomons, 2007; Zhou et al., 2001). A similar event was seen in the variants of HSPA1A, which all seem capable of surrounding the Q74 inclusions to a similar extent as the WT (see Appendix O).

To see if the variants alter the ability of HSPA1A to inhibit caspase activity, I tested associations of the variants with changes in the activity of two effector caspases (caspase-3/-7). My results show that the WT protein inhibits the activity of caspase-3/-7, resulting in a 15% decrease in caspase-3/-7 activity compared to the GFP-Q74 control (Figure 12). A similar decrease is also observed in the variants S16P and S16Y. The I74T variant also inhibits caspase activity, but to a greater extent than the WT protein. Lastly, the constructs R36C, I480N, and F592S show no inhibition of caspase activity, with levels comparable to those of the GFP-Q74 control.

Lastly, to determine whether these changes in caspase inhibition lead to changes in cellular viability, I tested how cell viability is altered after prolonged exposure to huntingtin aggregates in the presence of HSPA1A. My results reveal that the WT protein provides an approximately 25% improvement in cell viability compared to the GFP-Q74 control (Figure 13). This improvement is also seen for S16Y, R36C, and, to a lesser extent, S16P. Decreased cell viability is observed in the I74T, I480N, and F592S variants when compared to the GFP-Q74 control.

| Variant | t test |
|---|---|
| WT | n/a |
| S16P | 0.2857 |
| S16Y | 0.4202 |
| R36C | 0.0208 |
| I74T | 0.0080 |
| I480N | 0.0072 |
| F592S | 0.0301 |
| K71A | 0.1210 |

Figure 12. HSPA1A constructs can vary in their inhibition of caspase-3/-7 activity. Activity is normalized to a GFP-Q74 control 48 hours post-transfection and the difference between the percentages of the construct and the control is calculated. Error bars represent SD (N = 3). An unpaired t test was used to determine significant differences between the WT and the mutants. * p value ≤ 0.05, *** p value < 0.01.


Overall, these observations show that changes in caspase activity do not necessarily translate to changes in cell viability. For variants I480N and F592S, increased caspase activity correlates with decreased cell viability. However, R36C and I74T show that changes in caspase activity are not necessarily directly correlated with cell viability. The unusual dichotomy of these two mutations suggests that HSPA1A may be interacting with other cellular processes, aside from the effector caspase systems, to help defend the cell against protein-aggregate stresses.

| Variant | t test |
|---|---|
| WT | n/a |
| S16P | < 0.0129 |
| S16Y | < 0.0822 |
| R36C | < 0.3163 |
| I74T | < 0.0026 |
| I480N | < 0.0008 |
| F592S | < 0.0001 |
| K71A | < 0.0002 |

Figure 13. HSPA1A constructs can vary in their ability to maintain cell viability. Cell viability is normalized to a GFP-Q74 control 48 hours post-transfection and the difference between the percentages of the construct and the control is calculated. Error bars represent SD (N = 3). An unpaired t test was used to determine significant differences between the WT and the mutants. * p value ≤ 0.05, *** p value < 0.01.


CHAPTER 6

DISCUSSION

The Evolution of HSPA1A is Shaped by Purifying Selection and Gene Conversion

My results on the evolution of the mammalian, cytosolic, heat-inducible Hsp70 genes suggest that gene diversification and sequence conservation are shaped by two phenomena: purifying selection and gene conversion. These two mechanisms have been considered mutually exclusive, although they have been observed together before in a study of Hsp70 genes in nematodes (Nei & Rooney, 2005; Nikolaidis & Nei, 2004).

My findings provide the basis for a hypothetical scenario that could explain the origin of the HSPA1 cluster in mammals. This scenario suggests that the very first mammals had both HSPA1L and an ancestral HSPA1A/B gene within their MHC-III region (Figure 6). At the dawn of placental mammals, a single gene duplication event produced the HSPA1A and HSPA1B gene cluster, which remains within the MHC-III region of all mammals (Figure 6). Furthermore, this scenario suggests that the HSPA1A/B genes have evolved in concert, but independently of HSPA1L, attaining distinct patterns of both synonymous and non-synonymous mutations (Suppl. Tables 1-2; Tables 1-2).

My results also suggest that the HSPA1A/B gene cluster evolved through a combination of purifying selection and a gene conversion-like mechanism. Neither evolutionary mechanism alone (i.e., gene conversion or purifying selection) can account for the patterns seen in these genes. This combined mechanism is supported by the ps/pn distances between HSPA1A and HSPA1B, which show significantly higher synonymous differences that are localized only at the ends of the genes (Appendices D-F; Suppl. Table 2). Based on this scenario, these recombination events cover only particular regions of the genes and are complemented by the action of purifying selection, which results in the high similarity of amino acid sequences between the two genes (Appendix C; Suppl. Table 1; Table 1).

Purifying Selection and Gene Conversion Are Observed in Human Microevolution

These findings are further exemplified by intra-species patterns of polymorphisms found in humans (Appendix F; Table 2). These results show that the patterns seen across species also occur within a single species, humans, despite the very short human evolutionary history. The microevolutionary processes within humans support the presence of both purifying selection and gene conversion acting on the HSPA1A/B gene cluster. Very few SNPs are present in the coding sequences of HSPA1A and HSPA1B in humans, which is even more apparent in comparison to HSPA1L (Table 2): the HSPA1A/B gene sequences contain very few mutations, whereas HSPA1L has several mutations throughout the length of its sequence. These patterns are seen first in the 1000 Genomes dataset and are further supported by the ExAC dataset, which is 24 times larger than the former (Appendix F). These patterns support the idea that gene conversion homogenizes the sequences of HSPA1A and HSPA1B, and show that the inter-species evolutionary patterns previously described occur even in the relatively short history of human evolution.


Purifying selection is also supported by the rarity with which SNPs are found within the HSPA1 cluster compared to their neighboring genes, and by the fact that the natural mutations are either private or have very low minor allele frequencies (Tables 2-3). This finding suggests that these mutations have a low penetrance in human populations and that, if negative selection is a factor in maintaining the function of HSPA1A, then the functional effects that manifest from natural mutations should be relatively minor.

Overall, these observations point to strong purifying selection and gene conversion acting on both HSPA1A and HSPA1B genes. Purifying selection acts by removing amino-acid changing mutations, which is represented by the majority of SNPs found on HSPA1A/B being synonymous and those that are most likely to confer a functional change have a small effect on a given population. Gene conversion, on the other hand, works to homogenize and preserve large portions of the nucleotide sequence of HSPA1A with HSPA1B and vice versa, creating gene sequences that are nearly identical and have portions of the gene with virtually no mutational load (Appendix E-F).

Protein Activity Tests on Natural HSPA1A Mutations Verify Purifying Selection

Changes in protein function help to verify the notion of purifying selection, as even the most radical of mutations do not drastically change the stability of the protein itself or its ability to bind essential substrates (Appendices G-I). Shifts in protein stability are consistent and, for most variants, statistically significant; however, these changes are relatively small and none of the mutations resulted in a drastic shift in protein stability (Appendix I). This finding suggests that these variants result in relatively small changes to the structural stability of the proteins and may not have a significant physiological effect or dramatically alter the half-life of the proteins inside the cell.


Purifying selection is further supported by the fact that the affinity of HSPA1A for protein substrates shows only minor changes (Appendix I). The most obvious change that does occur is in the molecular mechanisms (enthalpy and entropy) involved in the binding of HSPA1A to protein substrates. However, these data do not directly correlate with functional changes, as the methods of this study measure only the direct interaction between HSPA1A and peptide substrate alone, not in conjunction with nucleotide binding. One possible explanation, though, is that a change in binding mechanisms is associated with the stabilization of Hsp70-substrate interactions, as seen in K71A variants of HSPA8 (Taguwa et al., 2015).

Extending these observations to functional changes, my results on ATPase activity also support purifying selection, as none of the variants cause a loss of ATP hydrolysis like that seen in the K71A variant (Figure 9; Table 5). The hydrolysis of ATP is essential for the primary functions of Hsp70s, and it appears that these functions may be altered but are not lost in even the most radical of coding mutations. As ATP hydrolysis is directly related to changes in the affinity of HSPA1A for misfolded protein substrates, these changes, together with the alterations observed by ITC, could be correlated with the ability of HSPA1A variants to bind to and interact with various substrates.

Variation in Protein Activity Translate to Functional Consequences

Purifying selection also appears to prevent drastic modulations of HSPA1A variants during the cellular stress response. However, my findings reveal that a couple of these variants may be affecting key positions for the chaperone function of HSPA1A. Similar to my protein activity findings, the changes in intracellular localization of HSPA1A to different organelles are fairly minor and may not generate a physiologically relevant outcome within the cells (Appendices J-M; Figure 10). Another possibility is that the NBD is particularly robust in protecting its functions, as, along with the conservation of nucleotide interactions, the NBD appears to be largely responsible for many lipid interactions that associate HSPA1A with cellular membranes (McCallister, Kdeiss, & Nikolaidis, 2015). Thus, purifying selection may act so that natural variants have a minimal effect on NBD functions such as ATP hydrolysis and intracellular localization.

This raises the question of where functional consequences would manifest. I propose that functional consequences translate down to how HSPA1A interacts with protein substrates. Since the function of Hsp70s is allosterically controlled by ATP hydrolysis and the conformational changes that follow, even mutations found in the NBD could lead to consequences in chaperone function (Chiappori, Merelli, Milanesi, Colombo, & Morra, 2012; Zhuravleva et al., 2012). For example, this idea is supported by how R36C significantly alters the function of HSPA1A in refolding luciferase and inhibiting effector caspase activity (Figures 11-12).

Yet, for some HSPA1A variants (S16P and S16Y) there are little to no functional changes in how the protein interacts with substrates (Figs. 11-13). This neutral effect is consistent with purifying selection, both in the low penetrance of these mutations in populations and in their relatively minor changes in function. The variants that showed multiple significant differences in my assays (i.e., R36C and I480N) appear to mark specific positions that are important for protein refolding and/or defending against stresses. In particular, these variants appear to show how protein refolding and the rate of ATP hydrolysis could be related, as the R36C variant shows improved rates of refolding with a slowed rate of ATP hydrolysis (Figs. 9, 11; Table 5). In contrast, I480N, which has a normal rate of ATP hydrolysis, has significantly lessened rates of protein refolding (Figs. 9, 11; Table 5). This dichotomy shows how amino acids from completely different parts of the protein can affect the entire allosteric process when it comes to the primary chaperone function of HSPA1A. As for secondary protein-protein interactions related to cellular survival, it is hard to determine the exact pathways that are being affected by my variants. What my results do show is that some variants alter how HSPA1A abates pathways towards cellular toxicity, and further analysis of these specific pathways may help us to understand previously characterized or unknown interactions. No matter how dramatic the change may be, though, because the penetrance of these variants is very low or they are private, the overall theory that HSPA1A is subjected to purifying selection still stands (Figs. 11-13; Table 3).

Altogether, these findings suggest that purifying selection works to prevent loss of function for HSPA1A, and for many of these variants, which are considered the most radical, function is only slightly altered rather than removed. The role that positions R36 and I480 play in the function of HSPA1A is still unknown, as these positions have not been previously characterized, but my study provides a starting point to further elaborate what importance these positions may have for the chaperone functions of HSPA1A.

Final Conclusions

Despite the breadth of information concerning the evolution and variation of Hsp70s between species, there are still unanswered questions about the adaptive reasons why eukaryotes can have such dynamically different numbers of Hsp70 genes. My results from both inter- and intra-species comparisons provide an origin and a model by which HSPA1A has evolved, supporting the notion that HSPA1A is conserved through a process of negative selection and gene conversion. In support of these conclusions, functional studies help to demonstrate how we can test natural mutations to model modes of evolution based on functional outcomes and provide insight on potential positions that are key to function (i.e., R36 and I480). However, the reason why these genes are under such strong purifying selection remains unknown, as according to mouse studies HSPA1A/B are non-essential (Daugaard et al., 2007). Also, there is some redundancy in their primary function, which can be compensated by other Hsp70s (e.g., HSPA8) (Kabani & Martineau, 2008). One possibility could be that these genes, which are already under strong purifying selection, are also undergoing frequent recombination that homogenizes their gene sequences, decreasing the likelihood that deleterious mutations emerge. Another, more likely hypothesis is that Hsp70s have additional secondary functions that cannot be compensated by their paralogs. For example, HSPA8 and HSPA1A have very similar primary functions but are differentiated by their lipid-binding profiles (McCallister, Siracusa et al., 2015). Therefore, selection is working to preserve not only their important primary function but also these lesser-known secondary functions that could be vital to support other systems in the CSR (e.g., lipid binding, cell signaling).

Altogether, there are still many unanswered questions concerning the evolution and natural variation of molecules that function in the CSR. Nevertheless, my findings provide insights on positions that may be important for chaperone function and provide a model of fast-acting evolution that maintains elements essential to the cellular systems that help organisms adapt and survive stress.


APPENDIX A

PRIMERS USED FOR SITE-DIRECTED MUTAGENESIS

| Primer Name | Primer Sequence (5'-3') |
|---|---|
| A1A_S16P_F | CGACCTGGGCACCACCTACCCGTGCGTGGGCGTGTTCCAG |
| A1A_S16P_R | CTGGAACACGCCCACGCACGGGTAGGTGGTGCCCAGGTCG |
| A1A_S16Y_F | GACCTGGGCACCACCTACTATTGCGTGGGCGTGTTCCAGC |
| A1A_S16Y_R | GCTGGAACACGCCCACGCAATAGTAGGTGGTGCCCAGGTC |
| A1A_R36C_F | CGCCAACGACCAGGGCAACTGCACGACCCCCAGCTACGTG |
| A1A_R36C_R | CACGTAGCTGGGGGTCGTGCAGTTGCCCTGGTCGTTGGCG |
| A1A_I74T_F | GTTCGACGCGAAGCGGCTGACCGGCCGCAAGTTCGGCGATGC |
| A1A_I74T_R | GCATCGCCGAACTTGCGGCCGGTCAGCCGCTTCGCGTCGAAC |
| A1A_I480N_F | GATCGAGGTGACCTTCGACAATGACGCCAACGGCATCCTGAAC |
| A1A_I480N_R | GTTCAGGATGCCGTTGGCGTCATTGTCGAAGGTCACCTCGATC |
| A1A_F592S_F | GCTGGCCGACAAGGAGGAGAGCGTGCACAAGCGGGAGGAG |
| A1A_F592S_R | CTCCTCCCGCTTGTGCACGCTCTCCTCCTTGTCGGCCAGC |
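Each reverse primer in this appendix appears to be the reverse complement of its forward partner, as is typical for site-directed mutagenesis primer pairs. The sketch below is a simple consistency check and is not part of the original methods; the function name `reverse_complement` is hypothetical, and only two of the six pairs are included to keep the example short.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Forward and reverse primers copied from Appendix A.
PRIMER_PAIRS = {
    "S16P": (
        "CGACCTGGGCACCACCTACCCGTGCGTGGGCGTGTTCCAG",
        "CTGGAACACGCCCACGCACGGGTAGGTGGTGCCCAGGTCG",
    ),
    "R36C": (
        "CGCCAACGACCAGGGCAACTGCACGACCCCCAGCTACGTG",
        "CACGTAGCTGGGGGTCGTGCAGTTGCCCTGGTCGTTGGCG",
    ),
}

for name, (forward, reverse) in PRIMER_PAIRS.items():
    matches = reverse_complement(forward) == reverse
    print(f"{name}: reverse primer is reverse complement of forward: {matches}")
```

The same check can be extended to the remaining pairs by adding their sequences to `PRIMER_PAIRS`.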


APPENDIX B

COMPARISIONS OF HSPA1A AND HSPA1B NUCLEOTIDE SEQUENCES IN MULTIPLE MAMMALIAN SPECIES

HSPA1A and HSPA1B genes in multiple mammalian species have largely similar, if not identical, nucleotide sequences. Sliding-window comparisons of the HSPA1A and HSPA1B sequences were performed using SWAAP (v.1.0.3).
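As an illustration of what a sliding-window comparison involves, the sketch below computes percent identity between two pre-aligned sequences of equal length. It is not SWAAP's implementation; the function name, the window size, and the step size are assumptions chosen for the example only.

```python
def sliding_window_identity(seq_a, seq_b, window=100, step=10):
    """Percent identity between two aligned sequences over sliding windows.

    Returns a list of (window_start, percent_identity) tuples.
    Window and step defaults are illustrative, not the values used in the thesis.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")

    results = []
    for start in range(0, len(seq_a) - window + 1, step):
        a = seq_a[start:start + window]
        b = seq_b[start:start + window]
        # Count positions at which the two sequences carry the same character.
        matches = sum(1 for x, y in zip(a, b) if x == y)
        results.append((start, 100.0 * matches / window))
    return results


if __name__ == "__main__":
    # Toy sequences, not real HSPA1A/HSPA1B data.
    for start, identity in sliding_window_identity("ATGGCCAAAG", "ATGGCTAAAG", window=4, step=2):
        print(start, identity)
```

The same function applies to aligned amino acid sequences, since it only compares characters position by position.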


APPENDIX C

COMPARISONS OF HSPA1A AND HSPA1B AMINO ACID SEQUENCES IN MULTIPLE MAMMALIAN SPECIES

HSPA1A and HSPA1B genes in multiple mammalian species have largely similar, if not identical, amino acid sequences. Sliding-window comparisons of the HSPA1A and HSPA1B sequences were performed using SWAAP (v.1.0.3).


APPENDIX D

TABLE OF SYNONYMOUS AND NON-SYNONYMOUS DISTANCES BETWEEN REPRESENTATIVE HSPA1 CLUSTER GENES

            human 1A    human 1B    human 1L    goat 1A     goat 1B     goat 1L     opossum 1   opossum 1L
human 1A    -           0.00±0.000  0.07±0.009  0.01±0.004  0.01±0.003  0.09±0.009  0.05±0.008  0.04±0.007
human 1B    0.01±0.004  -           0.06±0.009  0.01±0.004  0.01±0.003  0.09±0.009  0.05±0.008  0.04±0.007
human 1L    0.48±0.021  0.49±0.021  -           0.07±0.009  0.06±0.009  0.04±0.006  0.05±0.007  0.06±0.009
goat 1A     0.16±0.016  0.16±0.016  0.52±0.022  -           0.01±0.002  0.09±0.009  0.06±0.008  0.05±0.006
goat 1B     0.16±0.016  0.16±0.015  0.52±0.022  0.02±0.006  -           0.09±0.009  0.05±0.008  0.04±0.006
goat 1L     0.52±0.022  0.53±0.022  0.28±0.020  0.54±0.023  0.54±0.022  -           0.08±0.007  0.08±0.009
opossum 1   0.59±0.023  0.60±0.023  0.49±0.021  0.62±0.023  0.62±0.023  0.50±0.021  -           0.04±0.007
opossum 1L  0.56±0.022  0.57±0.022  0.49±0.023  0.58±0.022  0.58±0.023  0.50±0.021  0.16±0.017  -

Synonymous (below diagonal, white) and non-synonymous (above diagonal, grey) distances (and standard errors) computed using the modified Nei-Gojobori method. Analysis between clusters shows that synonymous values are considerably higher than non-synonymous ones.
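To make the comparison concrete, the sketch below takes a few synonymous and non-synonymous distances from the table above and computes their ratio. A ratio well below 1 is the pattern generally expected under purifying selection. The function name and the choice of gene pairs are illustrative assumptions, and standard errors are ignored for simplicity.

```python
# (synonymous, non-synonymous) distances copied from the table above.
DISTANCES = {
    ("human 1A", "human 1L"): (0.48, 0.07),
    ("human 1A", "goat 1A"): (0.16, 0.01),
    ("human 1L", "goat 1L"): (0.28, 0.04),
    ("human 1A", "opossum 1"): (0.59, 0.05),
}


def nonsyn_to_syn_ratio(synonymous, nonsynonymous):
    """Ratio of non-synonymous to synonymous distance; None when undefined."""
    if synonymous == 0:
        return None
    return nonsynonymous / synonymous


if __name__ == "__main__":
    for (gene_a, gene_b), (syn, nonsyn) in DISTANCES.items():
        ratio = nonsyn_to_syn_ratio(syn, nonsyn)
        print(f"{gene_a} vs {gene_b}: ratio = {ratio:.3f}")
```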


APPENDIX E

SYNONYMOUS AND NON-SYNONYMOUS DISTANCES BETWEEN REPRESENTATIVE HSPA1 CLUSTER GENES IN HUMANS

Synonymous substitutions per synonymous site (ps) are 10 to 30 times higher between HspA1A and HspA1L (a) than between HspA1A and HspA1B (b, c). Sliding-window analyses of synonymous and non-synonymous substitutions were performed in SWAAP (v.1.0.3) using the modified Nei-Gojobori method. Panels (b) and (c) show the same data with different y-axis scaling.

APPENDIX F

SYNONYMOUS AND NON-SYNONYMOUS SNP DISTRIBUTION BETWEEN REPRESENTATIVE HSPA1 CLUSTER GENES

The distribution of SNPs differs markedly among the three HSPA1 genes. Synonymous and non-synonymous SNPs were mapped along the amino acid sequence of (a) HSPA1A, (b) HSPA1B, and (c) HSPA1L using sliding window analysis. The domain organization of Hsp70s (from HSPA1A) is depicted at the top.


APPENDIX G

BINDING AND THERMAL RESULTS OF HSPA1A WITH ATP

Protein (Cell)  Syringe (ligand)  N      Kd (mM)   dH (kJ/mol)  dS (J/mol*K)
WT1             ATP               1.041  2.93E-05  -53.93       -104.66
WT2             ATP               1.001  3.09E-05  -57.76       -107.41
WT3             ATP               1.006  2.59E-05  -60.63       -118.62
K71A_1          ATP               1.016  3.52E-04  -38.38       -62.59
K71A_2          ATP               1.039  2.94E-04  -42.78       -67.22
K71A_3          ATP               1.065  3.57E-04  -41.27       -60.74
S16P_1          ATP               0.957  5.65E-04  -53.46       -104.14
S16P_2          ATP               1.006  2.60E-04  -52.70       -108.11
S16P_3          ATP               0.953  3.11E-04  -54.31       -106.42
S16Y_1          ATP               1.059  2.81E-05  -49.38       -78.49
S16Y_2          ATP               1.008  3.13E-05  -48.93       -77.82
S16Y_3          ATP               1.046  2.70E-05  -51.68       -79.23
R36C_1          ATP               0.991  2.25E-05  -53.73       -97.95
R36C_2          ATP               1.017  3.73E-05  -52.36       -97.91
R36C_3          ATP               0.987  2.67E-05  -52.03       -99.11
I74T_1          ATP               1.037  3.20E-05  -55.86       -86.03
I74T_2          ATP               1.035  4.16E-05  -52.36       -85.87
I74T_3          ATP               1.009  3.72E-05  -50.87       -84.68
I480N_1         ATP               1.028  1.77E-05  -50.51       -78.44
I480N_2         ATP               1.033  2.25E-05  -52.02       -82.57
I480N_3         ATP               1.060  1.75E-05  -52.62       -81.78
F592S_1         ATP               1.031  2.12E-05  -55.09       -95.28
F592S_2         ATP               1.028  2.36E-05  -54.54       -94.36
F592S_3         ATP               1.034  2.14E-05  -53.66       -95.13

Binding and thermal results of the Isothermal Titration Calorimetry (ITC) assays using purified recombinant HspA1A proteins (WT and mutated variants) with ATP. N: reaction stoichiometry; dH: enthalpy; dS: entropy; Kd: dissociation constant; Cell: instrument cell containing protein; Syringe: instrument syringe for ligand titration
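Each protein was measured in triplicate, so the replicate values can be condensed into a mean and a sample standard deviation. The sketch below does this for the wild-type and K71A dissociation constants with ATP, using values copied from the table above, and reports the fold change in mean Kd relative to wild type. The helper names are hypothetical, and this summary is only an illustration, not the statistical treatment used in the thesis.

```python
from statistics import mean, stdev

# Kd values (mM) for ATP binding, copied from the table above.
KD_ATP_MM = {
    "WT": [2.93e-05, 3.09e-05, 2.59e-05],
    "K71A": [3.52e-04, 2.94e-04, 3.57e-04],
}


def summarize(replicates):
    """Return (mean, sample standard deviation) of replicate measurements."""
    return mean(replicates), stdev(replicates)


def fold_change(variant, reference):
    """Mean of the variant replicates divided by the mean of the reference replicates."""
    return mean(variant) / mean(reference)


if __name__ == "__main__":
    for name, values in KD_ATP_MM.items():
        avg, sd = summarize(values)
        print(f"{name}: Kd = {avg:.2e} ± {sd:.2e} mM")
    ratio = fold_change(KD_ATP_MM["K71A"], KD_ATP_MM["WT"])
    print(f"K71A / WT mean Kd: {ratio:.1f}-fold")
```

A larger Kd corresponds to weaker binding, so a fold change above 1 indicates reduced affinity relative to wild type.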


APPENDIX H

BINDING AND THERMAL RESULTS OF HSPA1A WITH ADP

Protein (Cell)  Syringe (ligand)  N      Kd (mM)   dH (kJ/mol)  dS (J/mol*K)
WT1             ADP               0.994  7.12E-05  -39.95       -58.67
WT2             ADP               1.024  8.28E-05  -42.71       -65.06
WT3             ADP               0.997  9.85E-05  -42.75       -66.67
K71A_1          ADP               1.039  4.79E-04  -28.74       -32.84
K71A_2          ADP               1.037  4.56E-04  -26.85       -35.61
K71A_3          ADP               1.091  4.82E-04  -30.58       -28.19
S16P_1          ADP               1.004  1.00E-04  -38.25       -55.25
S16P_2          ADP               0.981  1.45E-04  -41.94       -55.92
S16P_3          ADP               0.988  1.88E-04  -39.53       -54.48
S16Y_1          ADP               1.017  5.67E-05  -44.65       -69.46
S16Y_2          ADP               1.020  7.33E-05  -45.18       -72.38
S16Y_3          ADP               1.096  7.55E-05  -46.56       -71.24
R36C_1          ADP               0.984  1.91E-04  -44.15       -43.31
R36C_2          ADP               1.010  1.69E-04  -48.31       -41.43
R36C_3          ADP               1.016  1.16E-04  -47.60       -44.43
I74T_1          ADP               1.043  8.69E-05  -44.28       -70.79
I74T_2          ADP               1.031  9.82E-05  -43.84       -68.18
I74T_3          ADP               1.043  8.17E-05  -44.31       -70.35
I480N_1         ADP               1.090  5.79E-05  -40.95       -49.20
I480N_2         ADP               1.031  6.23E-05  -39.48       -51.90
I480N_3         ADP               1.009  5.87E-05  -41.68       -48.79
F592S_1         ADP               1.043  4.86E-05  -35.82       -31.12
F592S_2         ADP               1.024  5.22E-05  -34.23       -32.81
F592S_3         ADP               1.004  6.13E-05  -33.62       -35.15

Binding and thermal results of the Isothermal Titration Calorimetry (ITC) assays using purified recombinant HspA1A proteins (WT and mutated variants) with ADP. N: reaction stoichiometry; dH: enthalpy; dS: entropy; Kd: dissociation constant; Cell: instrument cell containing protein; Syringe: instrument syringe for ligand titration


APPENDIX I

BINDING AND THERMAL RESULTS OF HSPA1A WITH PROTEIN SUBSTRATE

Protein (Cell)  Syringe (ligand)  N      Kd (mM)   dH (kJ/mol)  dS (J/mol*K)
WT1             Peptide           0.993  3.35E-04  -25.84       -36.93
WT2             Peptide           1.002  5.29E-04  -19.75       -23.52
WT3             Peptide           1.023  3.64E-04  -20.03       -31.35
K71A_1          Peptide           0.982  5.28E-04  -11.58       21.91
K71A_2          Peptide           1.002  6.15E-04  -13.87       20.25
K71A_3          Peptide           0.989  6.35E-04  -11.97       19.89
S16P_1          Peptide           0.991  7.38E-04  -12.47       16.92
S16P_2          Peptide           1.005  6.79E-04  -13.43       15.61
S16P_3          Peptide           1.027  5.99E-04  -13.48       16.47
S16Y_1          Peptide           1.038  3.35E-04  -42.71       -72.72
S16Y_2          Peptide           1.021  3.31E-04  -41.66       -73.09
S16Y_3          Peptide           1.007  3.15E-04  -40.76       -71.18
R36C_1          Peptide           1.028  5.76E-04  -15.71       11.53
R36C_2          Peptide           1.001  4.72E-04  -15.95       12.52
R36C_3          Peptide           0.984  5.09E-04  -16.24       10.55
I74T_1          Peptide           0.998  9.99E-04  -18.91       -5.973
I74T_2          Peptide           0.985  9.95E-04  -19.67       -4.888
I74T_3          Peptide           1.014  9.35E-04  -20.28       -6.559
I480N_1         Peptide           1.017  7.10E-04  -18.53       23.28
I480N_2         Peptide           1.007  6.61E-04  -21.18       22.95
I480N_3         Peptide           1.025  6.97E-04  -19.74       24.83
F592S_1         Peptide           1.071  5.59E-04  -16.07       8.356
F592S_2         Peptide           1.037  5.00E-04  -19.52       11.56
F592S_3         Peptide           1.062  3.74E-04  -15.32       10.92

Binding and thermal results of the Isothermal Titration Calorimetry (ITC) assays using purified recombinant HspA1A proteins (WT and mutated variants) with peptide substrate. N: reaction stoichiometry; dH: enthalpy; dS: entropy; Kd: dissociation constant; Cell: instrument cell containing protein; Syringe: instrument syringe for ligand titration


APPENDIX J

COLOCALIZATION OF HSPA1A WITH NUCLEUS

Representative images of colocalization of fluorescently tagged HSPA1A (green) with the nucleus (blue). Positive Product of the Differences from the Mean (PDM) values are shown on a scale from black to white, with white representing complete colocalization between the green and blue pixels. Scale bar represents 20 µm.
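For readers unfamiliar with the PDM measure, the following minimal sketch shows how a per-pixel PDM value is commonly defined: the product of each channel's deviation from its own mean intensity. Positive values arise where both channels are above (or both below) their means. The function name and the toy pixel intensities are assumptions for illustration and do not reproduce the image-analysis software used for these figures.

```python
def pdm_values(channel_a, channel_b):
    """Per-pixel Product of the Differences from the Mean for two channels.

    Both inputs are flat sequences of pixel intensities of equal length.
    """
    if len(channel_a) != len(channel_b) or not channel_a:
        raise ValueError("channels must be non-empty and of equal length")

    mean_a = sum(channel_a) / len(channel_a)
    mean_b = sum(channel_b) / len(channel_b)
    # Positive where both pixels deviate from their channel mean in the same direction.
    return [(a - mean_a) * (b - mean_b) for a, b in zip(channel_a, channel_b)]


if __name__ == "__main__":
    green = [10, 50, 200, 220]  # toy intensities, not measured data
    blue = [12, 40, 180, 240]
    print(pdm_values(green, blue))
```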


APPENDIX K

COLOCALIZATION OF HSPA1A WITH MITOCHONDRIA

Representative images of colocalization of fluorescently tagged HSPA1A (green) with mitochondria (red). Positive Product of the Differences from the Mean (PDM) values are shown on a scale from black to white, with white representing complete colocalization between the green and red pixels. Scale bar represents 20 µm.


APPENDIX L

COLOCALIZATION OF HSPA1A WITH LYSOSOMES

Representative images of colocalization of fluorescently tagged HSPA1A (green) with lysosomes (red). Positive Product of the Differences from the Mean (PDM) values are shown on a scale from black to white, with white representing complete colocalization between the green and red pixels. Scale bar represents 20 µm.


APPENDIX M

COLOCALIZATION OF HSPA1A WITH PLASMA MEMBRANE

Representative images of colocalization of fluorescently tagged HSPA1A (green) with the plasma membrane (red). Positive Product of the Differences from the Mean (PDM) values are shown on a scale from black to white, with white representing complete colocalization between the green and red pixels. Scale bar represents 20 µm.


APPENDIX N

TABLES OF STATISTICAL SIGNIFICANCE FOR COLOCALIZATION DATA

Statistical analysis of CTCF ratios using Student’s unpaired t test to compare construct 37 to construct 42, wild-type (WT) 37 to variant 37, and WT 42 to variant 42. Values were calculated from CTCF ratios of 30 cells (N = 30).

Nucleus
Variant  Construct 37 to Construct 42 (t test)  WT 37 to Variant 37 (t test)  WT 42 to Variant 42 (t test)
WT       0.0012                                 n/a                           n/a
S16Y     0.0468                                 < 0.0005                      0.0181
R36C     0.8610                                 < 0.5599                      0.0131
I480N    0.0740                                 < 0.7734                      0.2814
F592S    0.3923                                 < 0.0001                      0.0068

Mitochondria
Variant  Construct 37 to Construct 42 (t test)  WT 37 to Variant 37 (t test)  WT 42 to Variant 42 (t test)
WT       0.0003                                 n/a                           n/a
S16Y     0.0018                                 0.0467                        0.6126
R36C     0.2707                                 0.3194                        0.0556
I480N    0.0109                                 0.0228                        0.4659
F592S    0.7130                                 0.1298                        0.0654

Lysosome
Variant  Construct 37 to Construct 42 (t test)  WT 37 to Variant 37 (t test)  WT 42 to Variant 42 (t test)
WT       0.0102                                 n/a                           n/a
S16Y     0.0246                                 < 0.0441                      0.0044
R36C     0.0058                                 < 0.0200                      0.0227
I480N    0.0047                                 < 0.0003                      0.0672
F592S    0.0002                                 < 0.0001                      0.0002

Plasma Membrane
Variant  Construct 37 to Construct 42 (t test)  WT 37 to Variant 37 (t test)  WT 42 to Variant 42 (t test)
WT       0.0406                                 n/a                           n/a
S16Y     0.0075                                 0.1542                        0.0241
R36C     0.5502                                 0.0241                        0.4126
I480N    0.0096                                 0.0072                        0.0007
F592S    0.0436                                 0.0868                        0.0758
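The comparisons above rely on Student's unpaired t test. As a minimal sketch of the statistic behind such a comparison, the code below computes the pooled-variance t value and its degrees of freedom for two independent groups. The function name and the example numbers are hypothetical and are not measured CTCF ratios; converting t to a P value additionally requires the t distribution, which is left to a statistics package.

```python
from math import sqrt
from statistics import mean, variance


def student_t_unpaired(group_x, group_y):
    """Student's unpaired t statistic (equal variances assumed) and degrees of freedom."""
    n_x, n_y = len(group_x), len(group_y)
    if n_x < 2 or n_y < 2:
        raise ValueError("each group needs at least two observations")

    df = n_x + n_y - 2
    # Pooled estimate of the common variance of the two groups.
    pooled_var = ((n_x - 1) * variance(group_x) + (n_y - 1) * variance(group_y)) / df
    std_err = sqrt(pooled_var * (1 / n_x + 1 / n_y))
    t_stat = (mean(group_x) - mean(group_y)) / std_err
    return t_stat, df


if __name__ == "__main__":
    # Made-up ratios for illustration only.
    ratios_a = [0.42, 0.38, 0.45, 0.40, 0.41]
    ratios_b = [0.52, 0.49, 0.55, 0.47, 0.51]
    t_stat, df = student_t_unpaired(ratios_a, ratios_b)
    print(f"t = {t_stat:.3f}, df = {df}")
```

With 30 cells per group, as in the analysis above, the test would have 58 degrees of freedom.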


APPENDIX O

REPRESENTATIVE IMAGES OF CELLS WITH HUNTINGTIN AGGREGATES

Representative images of mammalian cells overexpressing fluorescently tagged HSPA1A (green) together with a 74-glutamine-containing variant of the huntingtin protein (red). Aggregates appear as red clumps, with yellow/orange indicating colocalization between HSPA1A and huntingtin aggregates. Scale bar represents 20 µm.


REFERENCES

Adzhubei, I. A., Schmidt, S., Peshkin, L., Ramensky, V. E., Gerasimova, A., Bork, P., . . . Sunyaev, S. R. (2010). A method and server for predicting damaging missense mutations. Nature Methods, 7, 248-249.

Altschul, S. F., Gish, W., Miller, W., Myers, E. W., & Lipman, D. J. (1990). Basic local alignment search tool. Journal of Molecular Biology, 215, 403-410.

Arnheim, N. (1983). Concerted evolution of multigene families. In M. Nei and R. K. Koehn (Eds.), Evolution of genes and proteins (pp. 38-61). Sunderland, MA: Sinauer Associates.

Ayub, H., Khan, M. I., Micheal, S., Akhtar, F., Ajmal, M., Shafique, S., . . . Qamar, R. (2010). Association of eNOS and HSP70 gene polymorphisms with glaucoma in Pakistani cohorts. Molecular Vision, 16, 18-25.

Bettencourt, B. R., & Feder, M. E. (2001). Hsp70 duplication in the Drosophila melanogaster species group: How and when did two become five? Molecular Biology and Evolution, 18, 1272-1282.

Bettencourt, B. R., & Feder, M. E. (2002). Rapid concerted evolution via gene conversion at the Drosophila hsp70 genes. Journal of Molecular Evolution, 54, 569-586.

Bivik, C., Rosdahl, I., & Öllinger, K. (2007). Hsp70 protects against UVB induced apoptosis by preventing release of cathepsins and cytochrome c in human melanocytes. Carcinogenesis, 28, 537-544.

Blum, D., Hourez, R., Galas, M. C., Popoli, P., & Schiffmann, S. N. (2003). Adenosine receptors and Huntington’s disease: Implications for pathogenesis and therapeutics. The Lancet Neurology, 2, 366-374.

Bonini, N. M. (2002). Chaperoning brain degeneration. Proceedings of the National Academy of Sciences of the United States of America, 99, 16407-16411.

Brocchieri, L., Conway de Macario, E., & Macario, A. J. L. (2008). Hsp70 genes in human genome: Conservation and differentiation patterns predict a wide array of overlapping and specialized functions. BMC Evolutionary Biology, 8, 19.


Burgess, A. (2011, May 24). Measuring cell fluorescence using ImageJ [Web blog post]. Retrieved from https://sciencetechblog.com/2011/05/24/measuring-cell-fluorescence-using-imagej/

Burgess, A., Vigneron, S., Brioudes, E., Labbé, J. C., Lorca, T., & Castro, A. (2010). Loss of human Greatwall results in G2 arrest and multiple mitotic defects due to deregulation of cyclin B-Cdc2/PP2A balance. Proceedings of the National Academy of Sciences of the United States of America, 107, 12564-12569.

Chen, J. M., Cooper, D. N., Chuzhanova, N., Férec, C., & Patrinos, G. P. (2007). Gene conversion: Mechanisms, evolution and human disease. Nature Reviews Genetics, 8, 762-775.

Chiappori, F., Merelli, I., Colombo, G., Milanesi, L., & Morra, G. (2012). Molecular mechanism of allosteric communication in Hsp70 revealed by molecular dynamics simulations. PLoS Computational Biology, 8, e1002844.

Chow, A. M., Steel, R., & Anderson, R. L. (2009). Hsp72 chaperone function is dispensable for protection against stress-induced apoptosis. Cell Stress and Chaperones, 14, 253-263.

Craig, E. A., & Marszalek, J. (2017). How do J-proteins get Hsp70 to do so many different things? Trends in Biochemical Sciences, 42, 355-368.

Daugaard, M., Rohde, M., & Jäättelä, M. (2007). The heat shock protein 70 family: Highly homologous proteins with overlapping and distinct functions. FEBS Letters, 581, 3702-3710.

Didszun, C. M. (2009). The role of protein aggregation in Huntington’s disease (Doctoral thesis). Retrieved from Europe PMC (THESIS:530566)

Dulin, E., García-Barreno, P., & Guisasola, M. C. (2012). Genetic variations of HSPA1A, the heat shock protein levels, and risk of atherosclerosis. Cell Stress and Chaperones, 17, 507-516.

Freeman, B. C., Michels, A., Song, J., Kampinga, H. H., & Morimoto, R. I. (2000). Analysis of molecular chaperone activities using in vitro and in vivo approaches. In Keyse, S. M. (Ed.), Methods in Molecular Biology, vol. 99 (pp. 393-419). Totowa, NJ: Humana Press Inc.

Freeman, B. C., Myers, M. P., Schumacher, R., & Morimoto, R. I. (1995). Identification of a regulatory motif in Hsp70 that affects ATPase activity, substrate binding and interaction with HDJ-1. The EMBO Journal, 14, 2281-2292.

Garrido, C., Brunet, M., Didelot, C., Zermati, Y., Schmitt, E., & Kroemer, G. (2006). Heat shock proteins 27 and 70. Cell Cycle, 5, 2592-2601.


Genetic Science Learning Center. (2015, January 7). Learn.Genetics. Retrieved from http://learn.genetics.utah.edu/

Gyrd-Hansen, M., Nylandsted, J., & Jäättelä, M. (2004). Heat shock protein 70 promotes cancer cell viability by safeguarding lysosomal integrity. Cell Cycle, 3(12), 1484-1485.

Hageman, J., Vos, M. J., van Waarde, M. A. W. H., & Kampinga, H. H. (2007). Comparison of intra-organellar chaperone capacity for dealing with stress-induced protein unfolding. Journal of Biological Chemistry, 282(47), 34334-34345.

He, M., Guo, H., Yang, X., Zhang, X., Zhou, L., Chang, L., . . . Wu, T. (2009). Functional SNPs in HSPA1A gene predict risk of coronary heart disease. PLoS ONE, 4(3), e4851.

Hunt, C., & Morimoto, R. I. (1985). Conserved features of eukaryotic hsp70 gene revealed by comparison with the nucleotide sequence of human hsp70. Proceedings of the National Academy of Sciences of the United States of America, 82, 6455-6459.

Jana, N. R., Tanaka, M., Wang, G., & Nukina, N. (2000). Polyglutamine length-dependent interaction of Hsp40 and Hsp70 family chaperones with truncated N-terminal huntingtin: Their role in suppression of aggregation and cellular toxicity. Human Molecular Genetics, 9(13), 2009-2018.

Kabani, M., & Martineau, C. N. (2008). Multiple hsp70 isoforms in the eukaryotic cytosol: Mere redundancy or function specificity? Current Genomics, 9, 338-348.

Kakkar, V., Prins, L. C. B., & Kampinga, H. H. (2012). DNAJ proteins and protein aggregation disease. Current Topics in Medicinal Chemistry, 12, 2479-2490.

Kominek, J., Marszalek, J., Neuvéglise, C., Craig, E. A., & Williams, B. L. (2013). The complex evolutionary dynamics of Hsp70s: A genomic and functional perspective. Genome Biology and Evolution, 5(12), 2460-2477.

Kotoglou, P., Kalaitzakis, A., Vezyraki, P., Tzavaras, T., Michalis, L. K., Dantzer, F., . . . Angelidis, C. (2009). Hsp70 translocates to the nuclei and nucleoli, binds to XRCC1 and PARP-1, and protects HeLa cells from single-strand DNA breaks. Cell Stress and Chaperones, 14, 391-406.

Kourtidis, A., Drosopoulou, E., Nikolaidis, N., Hatzi, V. I., Chintiroglou, C. C., & Scouras, Z. G. (2006). Identification of several cytoplasmic HSP70 genes from the Mediterranean mussel (Mytilus galloprovincialis) and their long-term evolution in Mollusca and Metazoa. Journal of Molecular Evolution, 26, 446-459.

Krenek, S., Schlegel, M., & Berendonk, T. U. (2013). Convergent evolution of heat-inducibility during subfunctionalization of the Hsp70 gene family. BMC Evolutionary Biology, 13, 49.


Kudla, G., Helwak, A., & Lipinski, L. (2004). Gene conversion and GC-content evolution in mammalian Hsp70. Molecular Biology and Evolution, 21(7), 1438-1444.

Kuiper, E. F. E., de Mattos, E. P., Jardim, L. B., Kampinga, H. H., & Bergink, S. (2017). Chaperones in polyglutamine aggregation: Beyond the Q-stretch. Frontiers in Neuroscience, 11, 145.

Kumar, P., Henikoff, S., & Ng, P. C. (2009). Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm. Nature Protocols, 4(7), 1073-1081.

Lek, M., Karczewski, K. J., Minikel, E. V., Samocha, K. E., Banks, E., Fennell, T., . . . Exome Aggregation Consortium. (2016). Analysis of protein-coding genetic variation in 60,706 humans. Nature, 536, 285-291.

Levy, S., Sutton, G., Ng, P. C., Feuk, L., Halpern, A. L., Walenz, B. P., . . . Venter, J. C. (2007). The diploid genome sequence of an individual human. PLoS Biology, 5(10), e254.

Lindquist, S., & Craig, E. A. (1988). The heat-shock proteins. Annual Review of Genetics, 22, 631-677.

Liu, H., & Naismith, J. H. (2008). An efficient one-step site-directed deletion, insertion, single and multiple-site plasmid mutagenesis protocol. BMC Biotechnology, 8, 91.

Macario, A. J. L., & Conway de Macario, E. (2005). Sick chaperones, cellular stress, and disease. New England Journal of Medicine, 353, 1489-1501.

Mayer, M. P., & Bukau, B. (2005). Hsp70 chaperones: Cellular functions and molecular mechanism. Cellular and Molecular Life Sciences, 62(6), 670-684.

McCallister, C., Kdeiss, B., & Nikolaidis, N. (2015). HspA1A, a 70-kDa heat shock protein, differentially interacts with anionic lipids. Biochemical and Biophysical Research Communications, 467(4), 835-840.

McCallister, C., Siracusa, M. C., Shirazi, F., Chalkia, D., & Nikolaidis, N. (2015). Functional diversification and specialization of cytosolic 70-kDa heat shock proteins. Scientific Reports, 5, 9363.

McCloy, R. A., Rogers, S., Caldon, C. E., Lorca, T., Castro, A., & Burgess, A. (2014). Partial inhibition of Cdk1 in G2 phase overrides the SAC and decouples mitotic events. Cell Cycle, 13, 1400-1412.

Milarski, K. L., & Morimoto, R. I. (1986). Expression of human HSP70 during synthetic phase of the cell cycle. Proceedings of the National Academy of Sciences of the United States of America, 83, 9517-9521.


Mohanan, V., & Grimes, C. L. (2014). The molecular chaperone HSP70 binds to and stabilizes NOD2, an important protein involved in Crohn disease. Journal of Biological Chemistry, 289(27), 18987-18998.

Morimoto, R. I. (2011). The heat shock response: System biology of proteotoxic stress in aging and disease. Cold Spring Harbor Symposia on Quantitative Biology, 76, 91-99.

Multhoff, G., Botzler, C., Wiesnet, M., Müller, E., Meier, T., Wilmanns, W., & Issels, R. D. (1995). A stress-inducible 72-kDa heat-shock protein (HSP72) is expressed on the surface of human tumor cells, but not on normal cells. International Journal of Cancer, 61, 272-279.

Multhoff, G., & Hightower, L. E. (1996). Cell surface expression of heat shock proteins and the immune response. Cell Stress and Chaperones, 1(3), 167-176.

Murphy, M. (2013). The HSP70 family and cancer. Carcinogenesis, 34(6), 1181-1188.

Nei, M., & Rooney, A. P. (2005). Concerted and birth-and-death evolution of multigene families. Annual Review of Genetics, 39, 121-152.

Nei, M., & Gojobori, T. (1986). Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Molecular Biology and Evolution, 3(5), 418-426.

Nei, M., & Kumar, S. (2000). Molecular Evolution and Phylogenetics. Oxford, England: Oxford University Press.

Nikolaidis, N., & Nei, M. (2004). Concerted and nonconcerted evolution of the Hsp70 gene superfamily in two sibling species of nematodes. Molecular Biology and Evolution, 21(3), 498-505.

Nollen, E. A. A., Salomons, F. A., Brunsting, J. F., van der Want, J. J. L., Sibon, O. C. M., & Kampinga, H. H. (2001). Dynamic changes in the localization of thermally unfolded nuclear proteins associated with chaperone-dependent protection. Proceedings of the National Academy of Sciences of the United States of America, 98(21), 12038-12043.

Nollen, E. A. A., & Morimoto, R. I. (2002). Chaperoning signaling pathways: Molecular chaperones as stress-sensing ‘heat shock’ proteins. Journal of Cell Science, 115(14), 2809-2816.

Nylandsted, J., Gyrd-Hansen, M., Danielewicz, A., Fehrenbacher, N., Lademann, U., Høyer-Hansen, M., . . . Jäättelä, M. (2004). Heat shock protein 70 promotes cell survival by inhibiting lysosomal membrane permeabilization. Journal of Experimental Medicine, 200(4), 425-435.


O’Brien, M. C., Flaherty, K. M., & McKay, D. B. (1996). Lysine 71 of the chaperone protein Hsc70 is essential for ATP hydrolysis. Journal of Biological Chemistry, 271(27), 15874-15878.

Pruett, S. B. (2003). Stress and the immune system. Pathophysiology, 9, 133-153.

Ramos-Arroyo, M. A., Feijoo, E., Sanchez-Valverde, F., Aranburu, E., Irisarri, N., Olivera, J. E., & Valiente, A. (2001). Heat-shock protein 70-1 and HLA class II gene polymorphisms associated with celiac disease susceptibility in Navarra (Spain). Human Immunology, 62, 821-825.

Richter, K., Haslbeck, M., & Buchner, J. (2010). The heat shock response: Life on the verge of death. Molecular Cell, 40, 253-266.

Rujano, M. A., Kampinga, H. H., & Salomons, F. A. (2007). Modulation of polyglutamine inclusion formation by the Hsp70 chaperone machine. Experimental Cell Research, 333, 3568-3578.

Sakahira, H., Breuer, P., Hayer-Hartl, M. K., & Hartl, F. U. (2002). Molecular chaperones as modulators of polyglutamine protein aggregation and toxicity. Proceedings of the National Academy of Sciences of the United States of America, 99, 16412-16418.

Schröder, H., Langer, T., Hartl, F. U., & Bukau, B. (1993). DnaK, DnaJ and GrpE form a cellular chaperone machinery capable of repairing heat-induced protein damage. The EMBO Journal, 12(11), 4137-4144.

Shaughnessy, D. T., McAllister, K., Worth, L., Haugen, A. C., Meyer, J. N., Domann, F. E., . . . Tyson, F. L. (2014). Mitochondria, energetics, epigenetics, and cellular responses to stress. Environmental Health Perspectives, 122(12), 1271-1278.

Taguwa, S., Maringer, K., Li, X., Bernal-Rubio, D., Rauch, J. N., Gestwicki, J. E., . . . Frydman, J. (2015). Defining Hsp70 subnetworks in dengue virus replication reveals key vulnerability in flavivirus infection. Cell, 163(5), 1108-1123.

Tamura, K., Stecher, G., Peterson, D., Filipski, A., & Kumar, S. (2013). MEGA6: Molecular evolutionary genetics analysis version 6.0. Molecular Biology and Evolution, 30, 2725-2729.

Tavaria, M., Gabriele, T., Kola, I., & Anderson, R. L. (1996). A hitchhiker’s guide to the human Hsp70 family. Cell Stress and Chaperones, 1(1), 23-28.

Zeng, X. C., Bhasin, S., Wu, X., Lee, J. G., Maffi, S., Nichols, C. J., . . . Eisenberg, E. (2004). Hsp70 dynamics in vivo: Effect of heat shock and protein aggregation. Journal of Cell Science, 117, 4991-5000.

Zhou, H., Li, S. H., & Li, X. J. (2001). Chaperone suppression of cellular toxicity of huntingtin is independent of polyglutamine aggregation. Journal of Biological Chemistry, 276(51), 48417-48424.


Zhuravleva, A., Clerico, E. M., & Gierasch, L. M. (2012). An interdomain energetic tug-of-war creates the allosterically active state in Hsp70 molecular chaperones. Cell, 151(6), 1296-1307.