Graduate Theses and Dissertations
Iowa State University Capstones, Theses and Dissertations

2012
The evolution of stress response and complex life history traits in natural populations of garter snakes
Tonia Sue Schwartz
Iowa State University

Follow this and additional works at: https://lib.dr.iastate.edu/etd Part of the Bioinformatics Commons, Evolution Commons, and the Genetics Commons

Recommended Citation
Schwartz, Tonia Sue, "The evolution of stress response and complex life history traits in natural populations of garter snakes" (2012). Graduate Theses and Dissertations. 12916. https://lib.dr.iastate.edu/etd/12916

This Dissertation is brought to you for free and open access by the Iowa State University Capstones, Theses and Dissertations at Iowa State University Digital Repository. It has been accepted for inclusion in Graduate Theses and Dissertations by an authorized administrator of Iowa State University Digital Repository.

The evolution of stress response and complex life history traits in natural populations of garter snakes

by

Tonia Sue Schwartz

A dissertation submitted to the graduate faculty

in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Major: Genetics

Program of Study Committee:
Anne M. Bronikowski, Co-Major Professor
Jo Anne Powell-Coffman, Co-Major Professor
Drena Dobbs
Chris Tuggle
Jonathan Wendel

Iowa State University
Ames, Iowa
2012

Copyright © Tonia Sue Schwartz, 2012. All rights reserved.

TABLE OF CONTENTS

LIST OF FIGURES

LIST OF TABLES

ABSTRACT

CHAPTER 1. INTRODUCTION
  Dissertation organization
  Chapter 2. Molecular stress pathways and the evolution of life histories in reptiles
  Chapter 3. A garter snake transcriptome: pyrosequencing, de novo assembly, and sex-specific differences
  Chapter 4. Dissecting molecular stress networks: identifying nodes of divergence between life-history phenotypes
  Chapter 5. Mitochondrial response to heat stress: similarities and divergence between life-history ecotypes in respiration, transcription and haplotypes
  Chapter 6. Functional ecological transcriptomes: using quantitative RNA-seq to study heat stress response in a non-model vertebrate

CHAPTER 2. MOLECULAR STRESS PATHWAYS AND THE EVOLUTION OF LIFE HISTORIES IN REPTILES
  Chapter summary
  Reptiles possess remarkable variation and plasticity in life history
  The molecular stress pathways - what is known in reptiles
  Metabolic pathways
  Molecular mechanisms to regulate production of ROS
  Molecular mechanisms to neutralize ROS
  Tolerance and resistance to ROS
  Molecular pathways for repair
  Insulin and insulin-like growth factor 1 (IGF1) signaling pathways (IIS)
  Environmental stress and evolving molecular pathways - evidence in reptiles
  Temperature (heat) stress
  Hibernation: supercooling, freeze tolerance and anoxia tolerance
  Caloric stress: availability and type of food
  Genomic level selection
  Summary
  References
  Box 1: Definitions of terms used in this chapter
  Box 2: Case study on life-history ecotypes

CHAPTER 3. A GARTER SNAKE TRANSCRIPTOME: PYROSEQUENCING, DE NOVO ASSEMBLY, AND SEX-SPECIFIC DIFFERENCES
  Abstract
  Background
  Results and discussion
  Sampling and 454 GS-FLX titanium sequencing
  Assembly and annotation
  Phylogenetic assessment of BLAST results
  Additional clustering
  Sequence variants
  Comparison between female and male
  Conclusions
  Methods
  Sampling
  RNA isolation
  Library preparation and sequencing
  Assembly and annotation
  Additional clustering
  Sequence variants
  Comparisons between female and male
  Authors' contributions
  Acknowledgements
  References
  Additional files

CHAPTER 4. DISSECTING MOLECULAR STRESS NETWORKS: IDENTIFYING NODES OF DIVERGENCE BETWEEN LIFE-HISTORY PHENOTYPES
  Abstract
  Introduction
  Methods
  Study populations
  Experimental design
  Assays
  Circulating hormone: corticosterone
  Circulating ROS: within blood cells (superoxide within the mitochondria)
  Circulating ROS: hydrogen peroxide (H2O2) in plasma
  ROS (H2O2) produced by live liver mitochondria
  Candidate gene analysis
  Quantitative gene expression at candidate genes
  Sequence variation of candidate genes
  DNA damage within blood cells
  Statistical analyses
  Results
  Corticosterone increases with heat stress in both ecotypes
  Ecotypes differ in levels of circulating ROS in unstressed condition and in response to stress
  Superoxide in blood cells
  H2O2 in plasma
  H2O2 production by live liver mitochondria
  Genetic data
  Microsatellites
  Gene expression
  Genetic (mRNA) variation
  Whole organism heat stress causes a hormetic effect by limiting DNA damage due to H2O2
  Repeated measures
  Discussion
  Ecotypes differ at specific nodes of the stress response network in unstressed condition and in response to stress
  Sexes respond differently to heat stress
  The evolutionary consequences of divergence in stress response for life-history traits
  Acknowledgements
  References
  DNA sequences
  Physiological data
  Galaxy workflows
  Supplemental information
  Supplemental methods
  Microsatellite genotyping
  Quantitative PCR
  DNA damage within blood cells
  Supplemental References

CHAPTER 5. MITOCHONDRIAL RESPONSES TO HEAT STRESS IN TWO LIFE-HISTORY PHENOTYPES: RESPIRATION, TRANSCRIPTION AND HAPLOTYPE
  Abstract
  Introduction
  Materials and methods
  Samples
  Physiology: mitochondrial respiration
  Sequence data
  Reconstructing the Thamnophis elegans mitochondrial reference sequence
  Reconstructing individual mitochondrial transcriptomes from Illumina RNA-seq data
  Evaluating genetic diversity
  Quantitative gene expression
  Results
  Mitochondrial respiration
  Sequence data
  Gene expression
  Mitochondrial gene expression shows increased segment 1 expression (rRNAs) with heat stress
  Ecotypes differ in expression level of mtDNA
  Discussion
  Active response of mitochondrial energetics to heat stress
  Mitochondrial divergence between life-history ecotypes
  Conclusion
  Acknowledgements
  References

CHAPTER 6. FUNCTIONAL ECOLOGICAL TRANSCRIPTOMICS: USING QUANTITATIVE RNA-SEQ TO STUDY HEAT STRESS IN A NON-MODEL VERTEBRATE
  Abstract
  Introduction
  Methods
  Animals
  Experimental design
  Quantitative expression of the liver transcriptome using Illumina RNA-seq
  Quantitative real-time PCR of candidate genes
  Results
  Sequence data
  RNA-seq differential gene expression
  Quantitative PCR of IGFs and comparison to RNA-seq
  Discussion
  Effects of stress on gene expression
  Conclusions
  Acknowledgements
  References

CHAPTER 7. CONCLUSION
  Future directions

APPENDIX A. CO-AUTHORED MANUSCRIPTS RELATED TO THIS DISSERTATION
  Manuscript 1. Evolutionary rates vary in vertebrates for insulin-like growth factor-1 (IGF-1), a pleiotropic locus involved in life history traits
  Manuscript 2. Multiple paternity and male reproductive success in a metapopulation of garter snakes with evolutionarily divergent life-histories
  Manuscript 3. Reptilian telomerase remains active in adult tissues and its activity peaks under heat stress

APPENDIX B. CODE FOR PROCESSING DATA AND STATISTICAL ANALYSES
  Galaxy workflows for processing Illumina data
  SAS code for statistical analyses
  Individual physiological variables, full ANOVA model
  Individual physiological variables, simple ANOVA model
  Mitochondrial physiological variables, ANOVA model
  Repeated measures ANOVA on comet data
  R-code for gene expression analyses in edgeR and GoSeq

APPENDIX C. DATA ACCESSIBILITY
  GenBank accessions for DNA sequences
  Next-generation sequencing data
  Individual gene sequences
  DRYAD

ACKNOWLEDGEMENTS
  Family
  Dissertation committee
  Additional academic supporters
  Funding

BIOGRAPHICAL SKETCH


LIST OF FIGURES

Figure 1-1. Images of the study system.
Figure 2-1. Simplified vertebrate phylogeny illustrating the relationships of reptiles relative to other major vertebrate
Figure 2-2. Conceptual diagram illustrating environmental stresses as selective pressures affecting the organism through physiology and molecular stress response pathways that have pleiotropic effects on multiple life-history traits.
Figure 3-1. Flow-diagram of the assembly procedure.
Figure 3-2. Venn diagram of BlastX results.
Figure 3-3. Taxonomic assignments of sequences.
Figure 3-4. BlastX results for the components identified in the graph-clusters.
Figure 3-5. MHC class I graph-cluster.
Figure 3-6. Variants.
Figure 4-1. Example of a simplified molecular stress network with nodes indicated by thick dark circles, connected by arrows.
Figure 4-2. Design of comet assay to assess DNA damage.
Figure 4-3. Interaction plots of the ecotypes' differential response to stress in levels of reactive oxygen species (ROS).
Figure 4-4. Gene expression in the liver based on RNA-seq data.
Figure 4-5. Interaction plots of DNA damage measured by comet assay.
Supplemental Figure 4-1. Interaction plot of backtransformed LSM ± SE of the plasma corticosterone concentration in response to heat stress treatment.
Supplemental Figure 4-2. Range of gene expression in the liver for the HSP and antioxidant genes in response to heat stress, based on RNA-seq data.
Supplemental Figure 4-3. Catalase gene expression from the liver.
Supplemental Figure 4-4. MDS plot of antioxidant gene expression.
Supplemental Figure 4-5. Catalase nucleotide alignment of 10 unrelated individuals used in RNA-seq.
Supplemental Figure 4-6. SOD2 nucleotide alignment of 10 unrelated individuals used in RNA-seq.
Supplemental Figure 4-7. Allele frequency differences among populations for SOD2 based on 10 unrelated individuals with RNA-seq data.
Figure 5-1. Thamnophis elegans mitochondrial genome.
Figure 5-2. Mitochondrial oxygen consumption.
Figure 5-3. Normalized quantitative expression of the mitochondrial genome at control temperature (i.e., standard body temperature) and in response to heat stress (37ºC).
Figure 5-4. Haplotype network illustrating the relationships among the four mitochondrial haplotypes identified in this study.
Supplemental Figure 5-1. Individual mitochondrial transcriptomes for all protein coding genes and the rRNAs for 10 unrelated garter snakes from two populations: ELFS (L-fast ecotype), PAP (M-slow ecotype).
Supplemental Figure 5-2. Nucleotide alignment of the two control regions of the Thamnophis elegans mitochondrial genome.
Supplemental Figure 5-3. Mapping of sequence reads to potential mitochondrial transcripts.
Figure 6-1. PCA plot of 12 individuals based on the FPKM of all transcripts.
Figure 6-2. Log-fold-changes against log-counts per million reads.
Figure 6-3. Heat map of differentially expressed transcripts (q-value = 0.05, variance greater than 0.05).
Figure 6-4. PCA plots of FPKM of the transcripts that have variance greater than 0.05 and A) q-value 0.05, and B) p-value 0.05.


LIST OF TABLES

Table 1-1. Habitat and life-history characteristics of the garter snake (Thamnophis elegans) ecotypes.
Table 2-1. Summary of differences between slow-living and fast-living garter snake ecotypes around Eagle Lake, California.
Table 3-1. Summary statistics for 454 sequencing and de novo assembly.
Table 3-2. The number of unique homologues in each database identified through BlastX using four e-value cut-offs. Databases as of January 2010.
Table 4-1. Statistical results from ANOVAs, F(df1, df2), on physiological traits. NS indicates non-significant results; a dash "-" indicates that the variable was not included in the final model.
Table 4-2. Genetic variation based on the RNA-seq data for the antioxidant genes, and microsatellite loci.
Table 4-3. Summary of differences in this study testing for the main effects of heat stress, sex, and ecotype.
Supplemental Table 4-1. Table of characteristics of slow-living and fast-living ecotypes of garter snakes (Thamnophis elegans) from Lassen Co., California.
Supplemental Table 4-2. RNA-seq data.
Supplemental Table 4-3. Quantitative real-time PCR primers.
Supplemental Table 4-4. Sample sizes for all analyses.
Supplemental Table 4-5. Pairwise FST values among the four populations used in this study; all are significant at p<0.05.
Table 5-1. Differences between Thamnophis elegans ecotypes that are relevant to this study, adapted from Schwartz & Bronikowski (2011).
Table 5-2. Statistical results from ANOVAs, F(df1, df2), on mitochondrial respiration and gene expression.
Table 5-3. Diversity among the 10 unrelated individuals in the protein coding genes.
Table 6-1. Summary of Illumina RNA-seq data from 12 female livers.
Table 6-2. The number of genes and transcripts that were differentially regulated in the T. elegans liver in response to heat stress at FDR ≤ 0.05.
Table 6-3. Effect of heat stress on the expression of candidate genes based on qPCR and RNA-seq.
Table 6-4. Annotation clusters with enrichment scores (ES) greater than one for the differentially expressed transcripts and genes.
Table 6-5. KEGG pathways that are over-represented in the differentially expressed genes.
Supplemental Table 6-1. Quantitative PCR primers.
Supplemental Table 6-2. List of genes differentially regulated that could be annotated to a chicken ENSEMBL gene (328 out of 686).


ABSTRACT

How an individual responds to environmental stress (temperature changes, toxins) affects how that individual behaves, its ability to reproduce, and its lifespan. We have limited understanding of how stress response is integrated across levels of biological organization, from single genes, proteins, and metabolites to whole genomes and transcriptomes, and ultimately of its effect on whole-organism lifetime fitness. Expanding our understanding of the stress response across generations, and placing it in an evolutionary context, is an even more daunting challenge. Yet without an in-depth understanding of the plasticity and evolution of stress responses, we cannot predict how environmental changes will affect species’ ability to respond and survive in our dramatically changing world. The complex molecular network that underlies physiological stress response comprises nodes (proteins, metabolites, mRNAs, etc.) whose connections span cells, tissues, and organs. Variable nodes are points in the network upon which natural selection may act. The aim of my dissertation research was to identify variable nodes that will reveal how the molecular stress network may evolve among populations in different habitats, and how the evolution of these stress networks might impact life-history evolution. To address this, I use closely related natural populations of garter snakes (Thamnophis elegans) that occur in discrete habitats that have enhanced their divergence along the pace-of-life continuum: the slow-living phenotype has slower growth, smaller reproductive effort per bout, and extended median lifespan relative to the fast-living phenotype. I take a multifaceted molecular approach (cellular physiology, transcriptomics, and genomics) to test whether lab-born juveniles of these divergent phenotypes vary concomitantly at candidate nodes of the stress response network under unstressed and induced-stress conditions. As molecular resources for this species were lacking, I used next-generation sequencing techniques to generate a large-scale multi-organ transcriptome that provided the genetic resources necessary for the rest of this research.

I found that, in response to heat stress, several measures increased in both life-history phenotypes: plasma corticosterone; gene expression of heat shock proteins (HSPs); expression of environmental sensing pathways; gene expression of mitochondrial rRNAs; and State III mitochondrial respiration. These results supported predicted relationships among these traits. The phenotypes also diverged at multiple nodes, both under unstressed conditions and in their response to stress. Under unstressed conditions the slow-living phenotype had higher expression of the mitochondrial genome, higher State IV mitochondrial respiration, higher circulating levels of reactive oxygen species, lower liver gene expression of a key antioxidant, and higher erythrocyte DNA damage relative to the fast-living phenotype. In response to heat stress the fast-living phenotype increased its expression of the mitochondrial genome, its circulating levels of reactive oxygen species, and its DNA damage relative to the slow-living phenotype. Additionally, mitochondrial haplotypes, defined by nonsynonymous changes, were unique to each phenotype, suggesting diversifying selection between the phenotypes. These results support the hypothesis that these evolutionarily divergent life-history phenotypes have also diverged in their molecular stress response networks. In addition, I identified specific nodes involved in oxidative stress and mitochondrial function at which selection appears to be acting in these divergent life-history phenotypes. More broadly, these results lend further support to the prediction of tightly integrated molecular interactions between stress networks and life-history traits. Finally, this research furthers our understanding of how changing environmental stresses may drive the evolution of molecular networks.


CHAPTER 1. INTRODUCTION

How an individual responds to environmental stress (temperature changes, toxins) affects how that individual behaves, its ability to reproduce, and its lifespan. Currently, we have a rudimentary understanding of how stress response is integrated across levels of biological complexity, from single genes, proteins, and metabolites to whole genomes and transcriptomes, and ultimately of its effect on whole-organism lifetime fitness. Expanding our understanding of the stress response across generations, and placing it within an evolutionary context, is an even more daunting challenge. Yet without an in-depth understanding of the plasticity and evolution of stress responses, we cannot predict how changes in environmental stresses will affect the ability of species (including humans) to reproduce, fight disease, and ultimately persist in our dramatically changing world.

The focus of this dissertation is to understand how organisms living in habitats with different types of stresses evolve at the molecular level; in other words, how molecular networks evolve in habitats with different types of physiological stress. One method employed by many researchers to address this question is to develop genetic strains of model animals (mice, nematodes, fruit flies) in the laboratory and probe the molecular stress network to identify how changing specific genes affects the stress response and downstream life-history traits. This is a very powerful approach to defining the structure of the molecular network, but the results are potentially limited to the genetic background of the animals tested and may be specific to the laboratory environment. Thus it is unclear how the results from this model-organism approach translate to what may be happening in natural populations with genetically diverse individuals in environments that change over time and space. Therefore, I take a different approach.
By taking advantage of 10,000-100,000 years of ‘mother nature’s work’, I use natural populations of animals that have been evolving via natural selection in very divergent habitats. I employ molecular techniques to compare the animals living in the divergent habitats to determine how their genes, proteins, and metabolites have changed.

The last glacial retreat in the Sierra Nevada Mountains in California occurred ~10,000 years ago, leaving lakes in ancient volcanic craters and marshy meadow habitats scattered across the range. Since then, the Western terrestrial garter snake (Thamnophis elegans) has moved into these mountainous habitats, likely migrating from coastal populations, to establish populations either around the lakes or in the higher-elevation marshy meadows (Figure 1-1). Biologists (particularly Dr. Steve Arnold and his “academic descendants”, including my advisor Dr. Anne Bronikowski) have been studying these snake populations for 30+ years. Using mark-recapture methods and detailed laboratory experiments over decades, they have determined that the Western terrestrial garter snakes from the marshy meadow populations have relatively slow growth and small reproductive bouts, but on average survive twice as long as the snakes living in populations immediately adjacent to Eagle Lake. Thus these researchers discovered that the populations of this snake species fall into two categories: “slow-lived” animals in mountain meadow habitat and “fast-lived” animals in the lakeshore habitat. Because these two types of animals of the same species live in different ecological habitats, we refer to them as the slow- and fast-living ecotypes. Further studies on these ecotypes have described how they differ in their morphology (e.g. size, coloration), their behavior, and their physiology. Please see the table and pictures below for more characteristics of these ecotypes.

Because of the detailed work by the dedicated researchers who characterized these ecotypes (and the funding sources, such as the National Science Foundation, that have supported them), I had the opportunity to leverage this unusually rich base of knowledge and use these garter snake ecotypes to address three questions: how animals respond to stress, how stress response evolves at the molecular level in natural populations, and how divergence in the genes, proteins, and metabolites underlying the stress response may affect life-history traits such as reproduction and lifespan. Detailed studies on organisms in the lab (mice, fruit flies, nematodes) provide strong evidence that genes and proteins involved in stress response are tightly integrated with genes and proteins regulating reproduction and longevity/aging. These studies set up the overarching hypothesis of this dissertation, i.e.
that different physiological stresses imposed by the divergent habitats have targeted the molecular networks underlying stress response, which has consequently driven the divergence of the life-history traits that we see in the garter snake ecotypes. Chapter 2 provides a literature review that builds the background information that led to the development of this hypothesis.

To address this hypothesis, I used a heat stress experiment on lab-born juveniles derived from these two ecotypes to characterize the physiological response to a general stress in the garter snake, and to identify differences between the fast- and slow-lived ecotypes. For the cellular physiological traits, I use a targeted approach to test for the effect of stress on specific genes, proteins, metabolites, and cellular organelles (see Chapters 4 and 5). Employing recent advances in DNA sequencing technology (RNA-seq), I attempt to capture a global picture of how genes are activated or repressed in response to heat stress and how that response has evolved.

This research will help us to understand which genes are important for stress response in natural populations, and how they evolve in response to different environmental stress conditions. Many of the genes investigated here are involved in metabolism, diabetes and aging. Because animals share many of the same genes with only slight differences, experiments using snake populations that naturally differ in reproduction, lifespan and stress response can help us to understand the genes responsible for these traits in many species, including livestock and humans. By using naturally evolved populations of animals, I strive to put these findings in a ‘real world’ context.

Table 1-1. Habitat and life-history characteristics of the garter snake (Thamnophis elegans) ecotypes.

Trait                       Slow-lived ecotype               Fast-lived ecotype
HABITAT                     Grassy meadow                    Rocky lakeshore
Elevation                   1630-2055 m                      1555 m
Summer daytime temperature  15-30 ºC                         20-34 ºC
Avian predators             Medium bodied raptors, robins    Eagle, osprey, robin
Food/water availability     Variable across years            Continuous
Major prey types            Anurans, leech                   Fish, leech
Most common parasites       Tail trematodes                  Mites
MORPHOLOGY/LIFE HISTORY
Color and stripe patterns   Black with bright yellow stripe  Checkered, muted greys, browns
Adult body size (mean)      538 mm (range: 370-598 mm)       660 mm (range: 425-876 mm)
Female maturation size/age  400 mm / 5-7 years               450 mm / 3 years
Reproductive rate           Infrequent/resource dependent    Annual
Litter size (mean)          4.3 liveborn (range: 1-6)        8.8 liveborn (range: 1-21)
Baby mass                   2.85 g (range: 1.8-3.5 g)        3.27 g (range: 2.5-4.2 g)
Median life span (years)    8                                4

Figure 1-1. Images of the study system. Top: Eagle Lake, California. Bottom left: marshy meadow habitat with the slow-lived ecotype. Bottom right: lakeshore habitat with the fast-lived ecotype.


DISSERTATION ORGANIZATION

Chapters 2 and 3 provide preliminary information and data that were necessary to conduct the experimental research. Chapters 4, 5, and 6 are based on a large-scale heat stress experiment conducted in 2007 on laboratory-born babies of both ecotypes of the garter snake Thamnophis elegans. Chapter 7 summarizes the final conclusions from this research.

Chapter 2. Molecular stress pathways and the evolution of life histories in reptiles. This dissertation chapter is published as a chapter in the peer-reviewed book “Molecular Mechanisms of Life History Evolution: The Genetics and Physiology of Life History Traits and Trade-Offs”. This chapter synthesizes the empirical evidence for how environmental stresses can induce molecular stress pathways in reptiles. Further, in an evolutionary framework, we describe how these pathways can respond to natural selection to shape life histories, especially reproduction and life span. We focus on molecular pathways (gene networks and DNA repair; enzyme and hormonal regulation; and mitochondrial and cell function) that are induced by the stresses found in the natural environment: thermal, oxidative, and caloric (prey type and abundance). Finally, we propose areas of research that should be the foci of future studies to elucidate the evolution of molecular mechanisms that determine life history traits in reptiles.

Chapter 3. A garter snake transcriptome: pyrosequencing, de novo assembly, and sex-specific differences. This chapter is a published manuscript in the peer-reviewed scientific journal BMC Genomics. Often, when working with ‘non-model’ organisms, there are few to no molecular resources available. Hence, while the Western terrestrial garter snake is an excellent system from an ecological perspective for addressing questions of stress response and life-history evolution, it lacked molecular resources. Therefore, we took advantage of new (in 2007) sequencing technology to rapidly generate a large amount of genetic data and begin to develop molecular resources for this species. I used 454 pyrosequencing to sequence the mRNA transcripts expressed in the garter snake brain, liver, gonads, kidneys, blood, spleen, and heart. The sequences were assembled into a transcriptome that represents the majority of the mRNAs expressed across these multiple tissues. Because we used multiple individuals, we were able to use the sequence data to identify potential locations in the genome that differ between individuals (single nucleotide polymorphisms, SNPs). We were also able to predict sequences (genes) that were expressed only in males or only in females. These sequences are provided to the public as a resource for comparative transcriptomics/genomics, and they provide a starting point for my experimental research into this garter snake system.


Chapter 4. Dissecting molecular stress networks: identifying nodes of divergence between life-history phenotypes. This chapter is a published manuscript in the peer-reviewed scientific journal Molecular Ecology. The complex molecular network that underlies physiological stress response comprises nodes (proteins, metabolites, mRNAs, etc.) whose connections span cells, tissues, and organs. Variable nodes are points in the network upon which natural selection may act. The aim of this chapter was to use the slow- and fast-lived ecotypes to identify variable nodes that will reveal how the molecular stress network may evolve among populations in different habitats, and how it might impact life-history evolution. We take a multifaceted molecular approach (cellular physiology, transcriptomics, and genomics) to test whether lab-born juveniles of these divergent phenotypes vary concomitantly at candidate nodes of the stress response network under unstressed and induced-stress conditions. We found that two common measures of stress, levels of corticosterone in the plasma and liver gene expression of heat shock proteins, increased under stress in both life-history ecotypes. In contrast, the ecotypes diverged at four nodes both under unstressed conditions and in response to stress: levels of reactive oxygen species (superoxide, H2O2) in the blood; liver gene expression of the antioxidant GPX1; and DNA damage in the blood cells. Interestingly, the differences between the life-history ecotypes were more pronounced in females than in males. This study provides support for the hypothesis that these life-history ecotypes have diverged at the molecular level in how they respond to stress, particularly in nodes regulating oxidative stress.
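
The comparison summarized above, in which ecotypes are tested both under unstressed conditions and in their response to stress, amounts to asking whether there is an ecotype-by-treatment interaction. As a minimal sketch, and not the analysis used in this dissertation (the statistical models were fit in SAS; see Appendix B), the interaction contrast for a balanced 2 x 2 design can be computed from the four cell means. The function name, the group labels, and all values below are hypothetical placeholders chosen for illustration; they are not data from this study.

```python
from statistics import mean


def interaction_contrast(cells):
    """Difference-in-differences of cell means for a 2 x 2 design.

    `cells` maps (ecotype, treatment) -> list of measurements.
    Returns (stress response of "fast") - (stress response of "slow"),
    where a stress response is the heat mean minus the control mean.
    """
    m = {key: mean(values) for key, values in cells.items()}
    fast_response = m[("fast", "heat")] - m[("fast", "control")]
    slow_response = m[("slow", "heat")] - m[("slow", "control")]
    return fast_response - slow_response


# Arbitrary placeholder values in arbitrary units (NOT data from the study).
example = {
    ("slow", "control"): [4.0, 5.0, 6.0],
    ("slow", "heat"): [5.0, 6.0, 7.0],
    ("fast", "control"): [2.0, 3.0, 4.0],
    ("fast", "heat"): [5.0, 6.0, 7.0],
}

print(interaction_contrast(example))  # prints 2.0
```

A contrast of zero would mean that both groups shift by the same amount under stress; a nonzero contrast only describes the cell means, and whether it is statistically supported is a question for the ANOVA.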

Chapter 5. Mitochondrial response to heat stress: similarities and divergence between life-history ecotypes in respiration, transcription and haplotypes. This chapter is a manuscript prepared for submission to the peer-reviewed journal Functional Ecology. Mitochondria are organelles within the cell that are responsible for the production of energy that is used throughout the body. The function of the mitochondria and how they respond to changes in the organism’s environment are predicted targets of natural selection. The aim of this chapter is to understand how the mitochondria in this ectothermic species, the western terrestrial garter snake, respond to an acute whole-organism heat stress by focusing on mitochondrial respiration (i.e., how the mitochondria consume oxygen to produce ATP, chemical energy) and the regulation of transcription (the activation or repression of genes) of the mitochondrial genome. In this chapter I describe the reconstructed T. elegans mitochondrial genome sequence, confirming the presence of the duplicated control regions found in other snakes. I found that heat stress in garter snakes differentially affects the transcriptional regulation of different types of genes (ribosomal versus protein-coding) in the mitochondrial genome, thus providing the first evidence that the duplicated control regions in the snake mitochondrial genome may provide a novel mechanism for transcriptional plasticity in response to environmental stress. Additionally, I found that stress affects the mitochondria, in an unknown way, to cause a sustained increase in the rate of oxygen consumption. Finally, I found that the garter snake fast- and slow-lived ecotypes differ at three levels of mitochondrial function: (1) the rate of resting mitochondrial oxygen consumption when they are unstressed, (2) the degree to which they transcriptionally activate their mitochondrial genome both when unstressed and in response to stress, and (3) the sequence of their mitochondrial genome. In summary, these data provide further evidence for mitochondrial and metabolic divergence between these garter snake life-history ecotypes.

Chapter 6. Functional ecological transcriptomes: using quantitative RNA-seq to study heat stress response in a non-model vertebrate. This chapter is a manuscript prepared for submission to the peer-reviewed journal Physiological Genomics. Temperature is a fundamental determinant of animal physiology and a driver of acclimation and evolutionary adaptation. How ectothermic reptiles respond to temperature and extreme temperature events has been a focus of modeling the effects of climate change. Experimental data elucidating the molecular response to environmental variation are needed to incorporate into these models to improve predictions of how temperature change will impact species. We take an initial step towards these ultimate goals by providing the first whole-transcriptome study of heat stress on an ectothermic reptile, the garter snake (Thamnophis elegans). With the objective of identifying genetic pathways and networks that are activated, we characterize the plastic response of the liver transcriptome to heat stress using Illumina RNA-seq and qPCR on targeted genes. This is the first large-scale transcriptomic study of the effect of heat stress on reptile gene expression. We found over 600 genes that are affected by heat stress and that are involved in cellular protection from heat shock, transcriptional activation, environmental information processing and cell signaling, as well as immune function, growth and reproduction. The functional categories of genes and pathways that are affected by heat stress present hypotheses for future studies on the effect of extreme thermal events (natural or human-induced) on natural populations of reptiles. Furthermore, this chapter demonstrates that a sequenced genome is not necessary for quantitative expression analysis using RNA-seq in a non-model organism. This study provides proof-of-concept for the use of RNA-seq to study large-scale transcriptomes in a non-model vertebrate.

CHAPTER 2. MOLECULAR STRESS PATHWAYS AND THE EVOLUTION OF LIFE HISTORIES IN REPTILES

This is a published book chapter within Molecular Mechanisms of Life History Evolution: The Genetics and Physiology of Life History Traits and Trade-Offs. Editors T. Flatt and A. Heyland. Oxford Press, Oxford.

Tonia S. Schwartz and Anne M. Bronikowski

Author Contributions: Both co-authors drafted sections of this chapter and both revised all sections of this chapter.

CHAPTER SUMMARY This chapter synthesizes the empirical evidence for how environmental stresses can induce molecular stress pathways in the reptiles. Further, in an evolutionary framework, we describe how these pathways can respond to natural selection to shape life histories, especially reproduction and life span. We focus on molecular pathways, including gene networks and DNA repair; enzyme and hormonal regulation; and mitochondrial and cell function, that are induced by the stresses found in the natural environment: thermal, oxidative, and caloric (prey type abundance). Finally, we propose areas of research that should be the foci of future studies to elucidate the evolution of molecular mechanisms that determine life history traits in reptiles.

REPTILES POSSESS REMARKABLE VARIATION AND PLASTICITY IN LIFE HISTORY
Reptiles are extraordinarily diverse in their life history traits (reviewed in Shine 2005). Post-embryonic life ranges from a few months (Labord’s chameleon, painted dragon) to centuries (tortoises, sea turtles), with concomitant completion of growth, maturation, and reproduction within these highly variable time frames. These life history traits are environmentally plastic, but have an underlying pleiotropic genetic basis. Thus selection on either development or reproduction should have accompanying evolution across the life history. Within the amniote vertebrates, the reptilian lineage is characterized by a high degree of flexibility in physiological processes on the one hand (excepting Aves), and remarkable adaptations on the other. This combination of variability and novelty has led to in-depth studies of the life histories of avian and non-avian reptiles. A relatively small molecular tool kit applicable to wild-living species, however, has resulted in far fewer studies of molecular pathways in reptiles. Here, we focus on the recent literature of molecular pathways in ectothermic lineages of reptiles (crocodilians, turtles, tuatara, lizards and snakes) (Fig. 1). These pathways are highly conserved across a diversity of organisms; for many model organisms, the links between molecular pathway and life history have been, or are being, identified. By analogy we hypothesize that these molecular pathways will be found to determine reptilian life history as well. Evolutionary theory posits the life history as a set of co-evolved traits that have responded to natural selection operating through mortality and reproductive success. Thus, the continuum of fast- to slow-pace-of-life (i.e., short-lived with fast growth, early maturation, rapid and high effort reproduction, vs. long-lived with slow growth, late maturation, and extended reproduction, potentially with a post-reproductive stage) results from variation in time and space in the sources of mortality that shape the life history (Williams 1957). Evolution is therefore expected to occur due to changes in external or internal mortality environments (Hamilton 1966). The reptiles possess many adaptations related to mortality selection that recommend them for studies that link morphological and physiological evolution with underlying molecular events and pathways. These include an external ribcage (turtles); venom (snakes); limblessness (snakes and some lizards); extended metabolic shutdown (all); starvation resistance, including remodeling of the digestive tract (snakes); supercooling, freeze tolerance, and heat tolerance (all); and extended hypoxia resistance (turtles, crocodilians, lizards).
Although there are definable life spans and reproductive schedules within each species, in many there is remarkably little physiological deterioration with advancing age. Furthermore, many reptile species have indeterminate growth and indeterminate fecundity; the oldest individuals in natural populations are often the most fecund and robust (Sparkman et al. 2007). Therefore, in many reptile species strong positive selection is maintained on older individuals in the population, leading to increased longevity and longer reproductive life spans. Reptiles are proposed as a model for the trade-off between life span and reproduction (Bronikowski 2008 and references therein) because they have evolved plastic responses to external stresses and, putatively, plastic modulation of stress cell signaling pathways. Ectothermic reptiles have different physiological and cellular responses to environmental and metabolic stress relative to endotherms. This is mostly driven by the ability to regulate their metabolism (metabolic rate) by behaviorally regulating their body temperature, resulting in low energy needs relative to endotherms, which must use their metabolism to maintain a constant body temperature. This underlying metabolic difference drives the evolution of many reptilian life history traits and allows for plasticity and the ability to acclimate to environmental conditions (Shine 2005). Understanding the molecular pathways that have been conserved versus specifically evolved in reptile lineages will lead to an understanding of how these pathways and interactions evolve under different selective pressures. Some of these reptilian adaptations to environmental stress are known to activate molecular pathways linked to mechanistic theories of aging, specifically the free radical theory of aging (and its derivations) that posits oxidative damage as a life span determinant. It has long been suggested, and is now becoming increasingly apparent, that the molecular mechanisms underlying the complex traits of life history, stress response, and metabolism are controlled by evolutionarily conserved complex gene networks, which are the focus of this chapter. Here we focus on the molecular level to highlight the best characterized metabolic and stress response pathways in reptiles, and suggest causal hypotheses, when they are supported in other organisms, of how they modulate the trade-off between reproduction (including growth and maturation rates) and longevity. The central theme of this chapter is that past and current natural selection on reptilian (environmental) stress response (e.g., oxygen, thermal, and dietary stresses) has resulted in exceptional molecular stress responses that impact many life history traits. We (1) summarize what is known about these molecular pathways in reptiles; (2) describe how three types of environmental stresses affect these pathways; and (3) point out potential consequences for life-history evolution. Additionally, we provide a list of key terms and the definitions that we are using, as many of these terms are used differently across disciplines from ecology to molecular biology (Box 1), and provide a case study on garter snake life history evolution (Box 2).

THE MOLECULAR STRESS PATHWAYS - WHAT IS KNOWN IN REPTILES
Many pathways can affect the reproduction/longevity life-history trade-off. Experiments on lab animals have clearly illustrated the importance of single allele differences in these pathways on longevity. Gene network analyses indicate that these longevity genes are highly conserved and are highly connected nodes (Bell et al. 2009). Thus, the following pathways may be shaped in wild populations through selection on genetic variation in stress response, and likely have pleiotropic effects on the trade-offs among growth, reproduction and longevity.

Metabolic pathways
Metabolic pathways produce energy that is essential for growth, reproduction, response to stress, and cellular maintenance/repair; concomitantly they also produce deleterious byproducts that can contribute to cellular dysfunction and aging. Under normal conditions, most of the energy produced within an organism is through oxidative phosphorylation by the electron transport chain (ETC) in the mitochondria, resulting in ATP production. The core of the ETC consists of five protein complexes (I-V) in the inner membrane of the mitochondria. Electrons from dietary carbohydrates and fats oxidized by the tricarboxylic acid (TCA) cycle and β-oxidation are donated to the ETC, which uses them to produce a proton gradient by pumping protons from the matrix to the inter-membrane space of the mitochondria. The subsequent movement of these protons down this gradient through ATP-synthase (complex V) into the matrix drives production of ATP. This process of oxidative phosphorylation is a major source of reactive oxygen species (ROS), as “waiting” electrons during the early stages of the ETC (particularly complexes I and III) can interact with oxygen resulting in ROS, specifically superoxide (O2-), and subsequently other species including hydroxyl radicals (•OH), nitric oxide (NO), hydrogen peroxide (H2O2), and hypochlorous acid. In this way, ROS are produced continuously throughout life and may be related to species-specific longevity (Barja 2004). ROS have unpaired electrons that “scavenge” electrons from other molecules including DNA, proteins, and lipids, thereby damaging these molecules (reviewed in: Pamplona and Barja 2007). Although the deleterious effects of ROS are typically studied with respect to life history traits, ROS are also essential for many cellular and physiological processes including cell signaling, immune function, and normal reproduction. Therefore, cells need to manage ROS levels to maintain their essential functions while minimizing their potential deleterious effects to the cell. ROS levels can be regulated through (i) the production of ROS and (ii) the levels of antioxidants that can neutralize ROS. Oxidative stress results if this balance is disrupted, causing an overabundance of ROS and ultimately oxidative damage (see Box 1).
Oxidative stress and oxidative damage have established roles in reproductive pathologies, particularly with age, and ROS have been implicated in age-related decline in oocyte quality and hormone biosynthesis, and the degradation of reproductive tissues (reviewed in: Martin and Grotewiel 2006). Additionally, the accumulation of DNA mutations, inefficient proteins and damaged lipid membranes due to ROS can lead to cellular senescence and have predictable impacts on organismal aging and ultimately lifespan - the free radical theory of aging (Finkel and Holbrook 2000 and references therein). The cell can defend itself from oxidative stress and potential damage by increasing the tolerance of cellular components to oxidation by ROS and/or activating mechanisms to repair oxidative damage due to ROS (reviewed in: Monaghan et al. 2009). Thus, there are four components to managing oxidative stress and its effects in the cell: (1) ROS production, (2) ROS neutralization through antioxidants, (3) cellular protection from and tolerance to ROS, and (4) repair of damage due to ROS. The pathways underlying these four components of oxidative stress regulation can contribute to the trade-off between investing in the regulation of oxidative stress for cell maintenance and longevity and investing in growth and reproduction (Monaghan et al. 2009: Fig 2).

Molecular mechanisms to regulate production of ROS. Production of ROS is highly correlated with longevity in mammals. Additionally, as most ROS production occurs in the mitochondria, this is where most of the oxidative damage ensues, causing the mitochondrial genome to accumulate mutations with age (Barja 2004). As damage accumulates, energy production becomes increasingly inefficient, which in turn causes more ROS production. In this way, mitochondrial function plays an important role in the aging process (Balaban et al. 2005). Understanding the evolution of mitochondrial function is essential for understanding the evolutionary relationship of energy production, reproduction, and aging. The rate of metabolism is not a cause of aging in mammals, but this has not been well explored in reptiles (Robert et al. 2007), which can have extremely plastic metabolic rates depending on environmental conditions. Typically reptile metabolic rate, as measured by mitochondrial O2 consumption, is 5-10X lower than that of similar-sized endotherms at the same body temperature (Brand et al. 1991). Ultimately, this is due to the low energy requirements of reptiles; proximately, this may be due to differences in membrane properties and the proposed remodeling of the ETC and mitochondrial function. Hulbert and colleagues (reviewed in 2008) have determined that the types of phospholipids, and their degree of saturation (i.e. number of double bonds in the fatty acid chains), in the plasma and mitochondrial membranes can alter membrane fluidity and metabolic rate and predict lifespan across mammal and bird species. Reptile species typically have membranes with lower polyunsaturated fatty acid content relative to endotherms (birds and mammals) at the same body temperature; this is consistent with less leaky membranes and increased resistance to peroxidation (Brand et al. 1991; Brookes et al. 1998).
Since body temperature also affects membrane fluidity, understanding how reptile membrane fatty acid composition changes with metabolic rate and body temperature would be fruitful. Recent molecular studies on reptiles have proposed a remodeling of the reptile ETC, which may have drastic effects on metabolic rate, energy production, and ROS production. The genes that code for the protein subunits of the ETC are located in either the nuclear or mitochondrial genomes. Allelic variants (mutations) at many of these genes have been shown to affect not only mitochondrial function but also lifespan in mice and C. elegans under laboratory conditions (Greer and Brunet 2008b). Research on the reptile ETC has focused on genes encoded by the mitochondrial genome, which are essential core subunits of the ETC protein complexes. Comparative studies reveal varying rates of evolution among the different reptile lineages. While turtles and some lizards appear to have relatively slow rates of mitochondrial evolution, the crocodilian (Janke et al. 2001), tuatara (Hay et al. 2008) and snake lineages (and some lizards) demonstrate accelerated evolution relative to other vertebrates (Castoe et al. 2009; Jiang et al. 2007). Changes in mitochondrial genomic architecture have also occurred in reptiles, the most striking of which is the duplicated control region (D-loop) found in tuatara, most snakes, and some lizards, as well as rRNA gene rearrangements. It is hypothesized that these changes may be adaptive for energy conservation and increased plasticity in mitochondrial function and proliferation (Jiang et al. 2007). Further evidence of reptile mitochondrial evolution is the rapid evolution of gene sequences encoding the ETC protein subunits (e.g. COX1, COX2, ATP6, ATP8, ND5, ND6, CytB) in the lineage leading to snakes and unprecedented convergent evolution among agamid lizards and snakes (e.g. COX1, ND1) (Castoe et al. 2009 and references within). The rapid adaptive evolution of the snake mitochondrial proteome is predicted to affect mitochondrial function and metabolism, including the production of ROS, ultimately influencing how energy production, aging, and stress response will evolve (Castoe et al. 2009). Interestingly, birds (having evolved from reptiles) have lower rates of ROS production than mammals of the same size and metabolic rate (reviewed in: Pamplona and Barja 2007). Other proteins in the mitochondrial membrane can influence the production of ROS, such as uncoupling proteins (UCP). Members of this protein family reside in the inner membrane of the mitochondria and provide a proton leak that allows protons to move across the inner membrane into the matrix without passing through ATP-synthase. This uncoupling of the ETC from ATP production can result in the production of heat (UCP1 - only in mammals) and a decrease in ROS production (UCP2 and/or UCP3) (Skulachev 1998).
Additionally, these proteins can be activated by ROS to increase the proton leak. All reptiles (including birds) have lost UCP1, but the ectothermic reptiles still have, and express, UCP2 and UCP3, whereas birds have lost UCP2 (Mezentseva et al. 2008; Schwartz et al. 2008). Interestingly, peroxisome proliferator-activated receptor gamma coactivator 1-α (PGC-1α) has been found to regulate production of UCPs and mitochondrial biogenesis, both of which can lead to increased energy production with lower production of ROS.

Molecular mechanisms to neutralize ROS. Antioxidants are molecules that neutralize ROS. These include endogenously produced enzymes, such as superoxide dismutase (SOD), catalase, and glutathione peroxidase (GPX) that act on H2O2; and those obtained from the diet such as Vitamin E (e.g. tocopherols and tocotrienols) and carotenoids (Catoni et al. 2008). Metabolically active tissues generally produce the most antioxidants.

For example, liver, kidney and muscle had the highest levels of SOD, catalase, and GPX in green sea turtles (Valdivia et al. 2007). Additionally, the production of endogenous antioxidants can be induced through stress response pathways, including IIS, discussed below. Dietary antioxidants such as carotenoids contribute to bright coloration in plants and some animals. It has been hypothesized that these antioxidants may represent an honest signal of health (via levels of oxidative stress and immune function) and are thereby used in mate choice and sexual selection. Although this hypothesis has received some support in some birds and fish, it has yet to be supported in reptiles (reviewed in: Catoni et al. 2008; Olsson et al. 2008a). Antioxidant levels are affected by reproduction and aging. In birds, reproduction causes a decrease in total antioxidant defenses in blood (Alonso-Alvarez et al. 2004). Interestingly, vitellogenin, a precursor to yolk, a biomarker of reproduction, and part of the IIS pathway (see below), is known to shorten lifespan in C. elegans. In contrast, it is an antioxidant and is correlated with survival in honeybees (Seehuus et al. 2006). Vitellogenin may function as an antioxidant in lizards as well. Olsson et al. (2008b) determined that babies from large clutches, in which mothers produced more vitellogenin, had lower levels of O2- in their circulating blood than babies from small clutches. Catalase activity decreases or stays the same during maturation but then increases with age in the brain of a short-lived lizard, but catalase is also increasingly susceptible to inhibition with age (Jena et al. 1998). As well, lipid peroxidation increases in various tissues with age, suggesting lower catalase expression or activity with age (Majhi et al. 2000 and references therein).

Tolerance and resistance to ROS. Not all types of proteins and fatty acids are equally susceptible to oxidation, thus the composition of macromolecules can determine susceptibility of cells to attack by ROS (Pamplona and Barja 2007). Proteins with methionine residues and membranes with polyunsaturated fatty acids are most susceptible to ROS and both of these types of macromolecules are found in lower concentrations in long-lived mammals and birds. Lipid membranes with more saturated fatty acids are more resistant to peroxidation. Hulbert et al. (2006) have proposed membrane composition in the naked mole rat as the reason they can tolerate high levels of ROS and have an extended lifespan relative to other rodents of the same size. Furthermore, lipid peroxidation produces additional ROS species, so membrane composition not only affects tolerance to ROS but can also promote additional ROS. Based on the few reptile species examined, reptiles have relatively low levels of polyunsaturated fatty acids in their membranes, and thus are predicted to have high tolerance to ROS. Lipid peroxidation potential increases in many tissues with age in mammals and short-lived lizards (Majhi et al. 2000). More research is needed to understand how these fatty acids change under environmental stress and with age across reptiles with different aging rates. Heat shock proteins (HSPs) are molecular chaperones found in the cytoplasm and mitochondria. They are responsible for maintaining the conformational structure of cellular proteins, particularly under stress conditions. Different HSPs (from different genes and/or splice variants) are important under different stress conditions. The transcription of all HSPs is controlled by the transcription factor HSF-1, which is normally present in the cytoplasm and, when activated by stress, moves into the nucleus to bind the promoters of HSP genes. HSPs have been identified in lizards and turtles and play important roles in temperature and hypoxia stress response (see below). DNA damage leads to cellular apoptosis unless it can be repaired; therefore long-lived organisms must invest in either preventative measures to protect the DNA or repair mechanisms. Telomeres, consisting of repetitive DNA and protective proteins found at the ends of linear chromosomes, protect the ends of the chromosomes from degradation during replication and mitotic division. They are considered a biomarker of biological aging because they typically decrease in length with every cell division. Telomeres can be further eroded with oxidative stress, chronic stress, reproduction, infection, and increased growth rate. Thus telomeres provide a clear link between stress and aging, which may reflect differences in life histories (Ilmonen et al. 2008; Monaghan and Haussmann 2006; and references within). Studies on mammals and birds have documented decreased telomere length with age. In reptiles, alligators have long telomeres (~31 Kb) relative to mammals, birds, and snakes (14-25 Kb), but not as long as turtles (>60 Kb) (Paitz et al. 2004). Furthermore, in both alligators and snakes, telomeres decrease with age (or size as a proxy for age) (Bronikowski 2008; Scott et al. 2006).
A study on loggerhead sea turtles found no significant association between age and telomere length, but this study had low sample sizes (Hatase et al. 2008). Because turtles are an extremely long-lived clade, more research on their telomeres is warranted.

Molecular pathways for repair. Most damaged proteins are targeted for destruction, but methionine sulfoxide reductase can repair some oxidized proteins (Pamplona and Barja 2007); however, this has not been explored in reptiles and therefore will not be discussed further here. As described above, maintaining telomere length is important for longevity. The enzyme telomerase can actively increase the length of the telomeres and thereby repair damage from cell division and oxidation; conversely, hyperactive telomerase can increase the risk of cancer in some species, although this has rarely been found in reptiles. Telomerase activity is tissue- and species-specific; in most tissues adult cells have low levels of telomerase activity, whereas the gonads and germ cells display high levels. Telomerase has a highly conserved structure that consists of protein and an RNA that serves as a template to elongate the telomeres (reviewed in Monaghan and Haussmann 2006). Although there are no peer-reviewed publications on telomerase activity in reptiles, other ectotherms (fish, lobster) (Klapper et al. 1998) have been found to have continuously active telomerase, as has a long-lived species of bird (Haussmann et al. 2007), in contrast to most mammalian tissues. Considering the extreme longevity of some reptile clades, their lack of cancer, and their dynamic temperature range, understanding how such an important enzyme functions in reptiles is obviously an area that needs further exploration.

Insulin and insulin-like growth factor 1 (IGF1) signaling pathways (IIS)
The insulin-like signaling pathway is a well-characterized, evolutionarily conserved molecular network associated with mediating a trade-off between reproduction/growth and stress response/longevity (Piper et al. 2008). In vertebrates, binding of the insulin-like growth factor 1 (IGF1) to its cell membrane receptor activates a signaling cascade that promotes the activation of genes for growth/reproduction and prevents the activation of genes for stress response/longevity. This “switch” is controlled by the phosphorylation/acetylation of the FOXO transcription factor. Increased IGF1 signaling keeps FOXO phosphorylated, preventing it from entering the nucleus. Decreased IGF1 signaling causes a decrease in FOXO phosphorylation, allowing it to enter the nucleus and promote the transcription of genes for antioxidants (e.g. SOD), DNA repair, and other stress response genes (Greer and Brunet 2008a). Decreases in IIS signaling and mutations in this pathway are linked to increased lifespan. Levels of IGF1 measured in blood samples from sea turtles, garter snakes, and alligators indicate that levels change seasonally, that IGF1 increases during reproduction, and that it is positively correlated with reproductive output (Crain et al. 1995; Guillette et al. 1996; Sparkman et al. 2009). Multiple signaling networks interact with the IIS cascade and contribute to its importance in reproduction and aging (Greer and Brunet 2008b). From the hypothalamus-pituitary axis (HPA), growth hormone (GH) stimulates production of IGF1 from the liver. Interestingly, the transcription of the antioxidant SOD resulting from down-regulation of the IIS pathway is a regulator of GH signaling (Juarez et al. 2008). Mutations in DNA repair mechanisms (nucleotide excision repair) also affect IGF levels (van der Pluijm et al. 2007), possibly due to the reallocation of resources from growth to maintenance.
DNA damage can also decrease IIS through P53 (Hinkal and Donehower 2008). Furthermore, this network is linked to life extension through caloric restriction possibly through sirtuin and mTOR/nutrient sensing pathways. This has yet to be explored in reptiles.

ENVIRONMENTAL STRESS AND EVOLVING MOLECULAR PATHWAYS - EVIDENCE IN REPTILES
To integrate evolution with molecular mechanisms, we discuss three major environmental selective pressures (temperature, anoxia, caloric restriction). We summarize the empirical evidence in reptiles for how these environmental stressors affect the underlying molecular mechanisms and pathways discussed above and their implied pleiotropic effects on aging and reproduction based on evidence in other lineages. This approach illustrates the related molecular basis for responses to multiple environmental stressors and the consequential pleiotropic effect these responses have on many life-history traits and their evolution. To discuss the evolution of life history traits through selective forces on stress response pathways, clear definitions are needed of the timeframe and the hierarchical level (cell, tissue, individual, population, species) upon which the different processes are acting. Selective force on metabolic and molecular stress pathways would focus on the reaction norms (see Box 1) of individuals in two timeframes: (1) an individual’s ability to respond to daily changes in stressors, including extreme/immediate stress (e.g. heat shock); and (2) an individual’s ability to acclimate their metabolic pathways over seasonal (weeks to months) fluctuations. The fitness consequences of individual abilities to respond to these stress pressures will drive the evolution of these pathways and their pleiotropic traits at the population level leading to local adaptation (See Box 1; Figure 2). How particular reptile species have evolved the ability to respond to stress may define their current location in life history trait space (e.g. Fig 2, a simplified, 2-D continuum trade-off in aging and reproduction). Selection on allelic variations for both the protein coding sequences and regulatory elements in these pathways would affect an individual’s response.

Temperature (heat) stress

Thermodynamics determine the biochemical reaction rates that underlie an organism’s ability to function; thus temperature is a driving force in reptile life histories. In contrast to most mammals and birds, reptiles do not use metabolic processes to maintain a constant body temperature; rather, the ambient temperature and their thermoregulatory behavior determine body temperature, which can be regulated with remarkable precision. Despite this, reptiles are found in extremely diverse thermal habitats ranging in altitude and latitude. While each species has a different life history, they likely have adapted to their local temperature profiles using the same underlying molecular pathways, which could be selected for
in a multitude of directions (permutations), defined by genetic variation in those pathways. This would result in the variation in basal metabolic rate and in the plastic response (reaction norms) of metabolic rate to temperature that can be found within and among reptile populations.

Both metabolic rate and growth rate increase with body temperature in reptiles. Faster growth rate leads to higher mortality in lizards (Olsson and Shine 2002). Although this may be due to increased risk associated with higher foraging rates, there is also evidence that fast growth rate has a metabolic cost (Peterson et al. 1999) and is linked to an increase in aging-related mortality (Ricklefs 2006). In crocodiles most oxidative damage in a cell occurs within the first year of life, likely due to fast growth (Furtado et al. 2007). Furthermore, studies on both lizards and snakes indicate that local adaptations to cooler temperatures include slow growth/late maturity, low reproduction, and longer lifespan (Bronikowski 2000; Ibarguengoytia and Casalins 2007), all of which are predicted with a decrease in IIS.

Daily and seasonal temperature fluctuations can have a dramatic effect on energy metabolism, metabolic rate, and the production of free radicals unless offset by thermoregulatory behavior and/or thermal compensation in physiological and molecular pathways (Seebacher 2005). This requires the integration of temperature sensors, cellular signaling pathways, and ultimately behavioral responses. Thus far, two transient receptor potential ion channels (TRPs), which are temperature-sensitive cell membrane channels associated with nerve endings, have been identified in crocodiles and lizards as internal temperature sensors: the heat-sensing TRPV1 and the cold-sensing TRPM8 (Seebacher and Murray 2007). The signals from these TRPs are likely passed through the hypothalamus and activate neuronal β-adrenergic receptors on cell membranes. These latter receptors are related to cardiovascular response and PGC-1α expression, which in birds and mammals causes an up-regulation of metabolic enzymes (CCO and ATPase), UCPs, and mitochondrial biogenesis, suggesting an increase in energy production with less ROS.

Heat stress causes metabolic rate to increase, ROS to be overproduced, and cellular proteins to denature. During heat stress, heat shock proteins (e.g., HSP70) can protect other cellular proteins. Studies on the master transcription factor HSF1 and HSP70 in lizards report that both temperate and desert lizards express HSP70 at high levels under heat stress. Interestingly, under normal conditions desert lizards constitutively expressed HSP70, whereas temperate species maintained higher concentrations of the transcription factor HSF1. This suggests that within these two species, selection on the same pathway (but different components of that pathway) has allowed each species to become locally adapted to its thermal fluctuations: diurnal desert lizards seem to maintain a level of thermal tolerance that can be induced higher under extreme stress, but the
temperate species may be able to respond faster to moderate heat stress through HSP synthesis, allowing for more temperature sensitivity (as reviewed in Evgen'ev et al. 2007).

Physiological and phenotypic flexibility (i.e., reversible phenotypic plasticity, acclimation) of reptiles to seasonal temperature changes has recently been reviewed (Seebacher 2005). Generally, winter-active reptiles that experience seasonal temperature fluctuations use metabolic compensation during cooler temperatures to maintain performance. They accomplish this by increasing thermosensitivity, increasing metabolic enzyme activities (via transcriptional up-regulation), and changing the membrane fatty acid composition to increase their metabolic potential (Seebacher 2005 and references therein). They may also decrease the negative effects of metabolic up-regulation (ROS production) by increasing UCP activity (Schwartz et al. 2008) and antioxidants. Activating these pathways to assist in acclimation may also promote overall cellular maintenance and longevity, and once these pathways are activated they may have a hormetic effect on longevity through future responses to stress. The selective advantage of precise acclimation and balanced metabolic up-regulation is likely profound. Furthermore, evolving the ability to thermally acclimate may select for allelic variants in regulating pathways that also promote longevity.
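
The thermal reaction norm of metabolic rate discussed in this section is often summarized in comparative physiology with the Q10 temperature coefficient, the factor by which a rate changes for a 10 °C change in temperature. The following is a minimal sketch of that calculation, not an analysis from the studies cited here; the function names and the numeric values in the usage example are illustrative assumptions only.

```python
def q10(rate1, rate2, temp1, temp2):
    """Q10 coefficient from two rates measured at two temperatures (deg C)."""
    if temp1 == temp2:
        raise ValueError("temperatures must differ to estimate Q10")
    return (rate2 / rate1) ** (10.0 / (temp2 - temp1))


def rate_at(rate_ref, temp_ref, temp, q10_value):
    """Predict a rate at `temp` from a reference rate, assuming a constant Q10."""
    return rate_ref * q10_value ** ((temp - temp_ref) / 10.0)


def reaction_norm(rate_ref, temp_ref, temps, q10_value):
    """Predicted rates across a list of temperatures (a simple thermal reaction norm)."""
    return [rate_at(rate_ref, temp_ref, t, q10_value) for t in temps]


if __name__ == "__main__":
    # Hypothetical values: a metabolic rate of 1.0 (arbitrary units) at 20 deg C
    # that doubles by 30 deg C.
    coefficient = q10(1.0, 2.0, 20.0, 30.0)
    print(coefficient)
    print(reaction_norm(1.0, 20.0, [10.0, 20.0, 30.0], coefficient))
```

Under this simplification, two populations with different Q10 values have reaction norms of different steepness, and metabolic compensation during acclimation would appear as a shift in the reference rate or a change in Q10 between seasons.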

Hibernation: supercooling, freeze tolerance and anoxia tolerance

Hibernation is a strong selective force on life history traits, as the ability to hibernate allows reptile populations to persist, and individuals to reproduce for multiple years, in colder climates. Hibernation is a drastic response to seasonal temperature fluctuation in which, rather than attempting to compensate for the thermal (or dry) conditions, the organism further intensifies the down-regulation of metabolic rate to a level of dormancy. The overall decrease in metabolism for an extended period of time has prompted the suggestion that hibernation or torpor could increase lifespan through “suspended animation”. In addition to this, the evolution of the ability to hibernate (and recover) may actually select for molecular mechanisms that increase lifespan as a by-product. Three types of stress associated with hibernation would be selective forces at the molecular level: 1) cold stress, 2) anoxia and the build-up of toxic anaerobic by-products, and 3) oxidative stress due to reperfusion during rewarming.

Reptile species that experience freezing or close to freezing temperatures have evolved mechanisms for freeze avoidance through supercooling, which prevents freezing of tissues, and/or for freeze tolerance, the ability to recover from being frozen. Decreased blood circulation is concomitant with these conditions, resulting in tissue anoxia and a switch to anaerobic metabolism. This is most extreme in turtles, as they typically hibernate under water; consequently turtles have evolved an exceptional ability to deal with anoxia as well as cold stress (Bickler and Buck 2007; Storey 2006). Post-hibernation rewarming or thawing has
immense potential for injury due to oxidative stress as the tissues are reperfused with oxygenated blood and aerobic metabolism is reestablished.

At the molecular level, it is difficult to separate the responses to these stresses, as they have been selected for concurrently as an integrated hibernation response. In this way, some environmental stressors may provide cues for upcoming stresses the organism must prepare for at the cellular level. This can cause problems experimentally, as a particular stress treatment may also activate molecular responses for other naturally occurring stresses. For example, some freeze tolerance genes are known to be regulated by the same signal transduction pathways as anoxia (e.g. the transcription factor hypoxia-inducible factor: HIF-1, Cowan 2003). Thus, an experimental anoxia treatment would also activate pathways for cold tolerance, as natural selection has acted at the organismal level to deal with these stresses concordantly.

Appropriate anticipatory responses to these cues are important to prepare the organism at a cellular level. Under extended anoxia, turtles go into a state of suspended animation such that essentially all protein degradation pathways and transcription/translation are silenced (except for a subset of genes essential for hibernation survival); thus no new proteins are made and no proteins are broken down (Storey 2007). Therefore, to survive the impending oxidative stress during the rewarming process after hibernation, they need an anticipatory response to prior cues before they enter anoxia and/or hibernation.

Metabolic regulation is obviously important for hibernation. Freeze tolerance in reptiles is intimately related to water content in the cell as well as to oxidative stress and metabolic pathways. For example, in a study on turtle hatchlings placed at -3 °C for 72 hours, survival was predicted by plasma LDH, and cryoinjury was predicted by body size and water content (Costanzo et al. 2006). Anoxia and freezing studies on turtles and frogs consistently detect the up-regulation of mitochondrial genes (COX1, NAD5, NAD4, CytB), although different genes have different patterns across tissues (Storey 2006; Storey 2007). Experiments on freezing and thawing lizards demonstrated an up-regulation of the mitochondrial protein UCP3 (along with PGC-1α and PPAR, regulators of UCP expression in mammals) and a consequential decrease in mitochondrial superoxide production during thawing (Rey et al. 2008).

Protection of cellular proteins during anoxia is provided by the up-regulation of particular heat-shock proteins, the most notable of which is HSP70-9b (also called mortalin-2). Mortalin-2 is normally found in mitochondria and is associated with cell survival, possibly through interaction with the tumor suppressor protein p53 (reviewed in: Storey 2007). Increasing antioxidant defense is necessary to neutralize ROS and prevent oxidative damage during rewarming. Gene expression for iron-binding
proteins (e.g. hemoglobin, ferritin, and transferrin) is up-regulated during hibernation, and this is hypothesized to protect against oxidative damage: free iron in the cell can interact with H2O2 and lipid peroxides, producing highly reactive free radicals, so it is important to keep free iron bound until needed for cellular processes. Many of these genes are under the control of the transcription factor HIF-1, which is also up-regulated in response to freezing and hypoxia in hatchling but not adult turtles (reviewed in Storey 2006). Studies across species vary widely in which antioxidants (SOD, GPX, catalase, etc.) are up-regulated, in which tissues, and in which life history stages, but the overall consensus is that antioxidants are up-regulated during cold stress and/or anoxia, most likely in preparation for free-radical generation when recovering from hibernation (Storey 2006). Additionally, the up-regulation is more extreme in species that have (or had) less frequent cold stress/anoxia, such as garter snakes and hatchling turtles. In contrast, adult turtles that have repeated cold stress/anoxia exposure due to diving and underwater hibernation have constitutively high levels of antioxidants relative to other ectothermic vertebrates, but similar to levels in endotherms (reviewed in Storey 2006, 2007). The first hibernation event may be a strong selective force for maintaining and activating these pathways. It may also provide a hormetic effect, allowing these pathways to be activated faster under other types of cellular stress.

In summary, hibernation adaptations may have a pleiotropic effect on longevity as well. This was proposed by Lutz et al. (2003) while studying the effects of anoxia on the turtle brain, which has evolved mechanisms for increased protection from oxidative stress and repair mechanisms to combat damage. For example, anoxic turtles have increased GABA, which protects neurons, the loss of which is associated with aging diseases; in addition, binding of transcription factor NF-kappaB typically declines with age, but increases in turtle brains under hypoxia (reviewed in: Lutz et al. 2003).
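
The reactivity of free iron with H2O2 and lipid peroxides noted in this section is usually attributed to Fenton-type chemistry. As a brief reminder of the standard reactions (textbook chemistry, not a result of the studies cited here; LOOH denotes a generic lipid hydroperoxide):

```latex
\begin{align}
\mathrm{Fe^{2+}} + \mathrm{H_2O_2} &\rightarrow \mathrm{Fe^{3+}} + \mathrm{OH^{-}} + {}^{\bullet}\mathrm{OH} \\
\mathrm{Fe^{2+}} + \mathrm{LOOH} &\rightarrow \mathrm{Fe^{3+}} + \mathrm{OH^{-}} + \mathrm{LO}^{\bullet}
\end{align}
```

Both products on the right, the hydroxyl radical and the lipid alkoxyl radical, are highly reactive, which is consistent with the hypothesis that sequestering iron in binding proteins limits oxidative damage during rewarming.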

Caloric stress: availability and type of food

Many animals have adaptive physiological and biochemical responses to fluctuations in diet (prey availability and quality). In laboratory model systems (C. elegans, Drosophila, rodents), caloric restriction (CR), particularly protein restriction, decreases growth rates, maximum body size, and reproductive output, but increases stress resistance, immune response and lifespan (Pamplona and Barja 2007). Additionally, CR generally results in an overall decrease in IIS and oxidative damage (Caro et al. 2008). Genetic experiments on lab models have implicated multiple pathways likely involved in this life history trade-off response to CR, including metabolic, mTOR, and particularly the IIS pathway. In rodents, the effect of CR is not due to a decrease in mass-specific metabolic rate, but likely due to a decrease in ROS production and/or an increase in resistance to ROS, with changes in lipid membrane composition causing increased resistance to oxidative damage (Faulks et al. 2006).
Additionally, under CR there is also mitochondrial biogenesis and up-regulation of UCPs, which are associated with increased mitochondrial efficiency and decreased ROS production (reviewed in: Pamplona and Barja 2007).

Our focus here is on the adaptations in reptiles to caloric stress that utilize pleiotropic molecular pathways for the evolution of growth/reproduction and aging. As seen in laboratory model systems, reptiles show a correlation of reproductive rate with prey abundance (Shine and Madsen 1997) and quality (Warner et al. 2007). Beyond that, comparing caloric studies between endotherms and reptiles is difficult, as the latter have much lower resting metabolic rates (RMR), can lower their metabolic rates even further during fasting, and have extraordinary abilities to go without any food for long periods of time (up to two years in some snakes), whereas endotherms need daily intake (Wang et al. 2006). Consequently, even the terms fasting and starvation do not have the same meaning across birds/mammals and reptiles.

While we have found no studies on caloric restriction at the molecular level in reptiles, there are many studies on food deprivation in reptiles, particularly snakes, likely due to their extreme ability to forgo feeding. During fasting and starvation, reptiles can down-regulate their RMR (up to 70% in some snakes) and drastically increase aerobic metabolic rate upon food intake (reviewed in Secor 2009). Different species have evolved different levels of plasticity of metabolic rate with feeding; very infrequent feeders such as pythons have the largest fluctuations. Interestingly, in many snake species either gravid or pregnant females stop eating voluntarily to allow for the more precise thermoregulation that is necessary for embryogenesis. In contrast to CR and starvation in mammals, these gravid females typically have increased levels of IGF-1 (Sparkman et al. 2009). During food deprivation, lipid content studies indicate an increase in unsaturated fatty acids in snakes and lizards (McCue 2008) and a decrease in selected body temperature in (male, but not female) lizards (Brown and Griffin 2005).

Type of food

The composition of the diet can impact reproduction and aging: a higher protein diet increases reproductive output in lizards (Warner et al. 2007); lifespan in herbivores can be increased by eating stressed plants that are producing phenols (Lamming et al. 2004); decreasing protein intake can decrease ROS production and increase lifespan in rats (Pamplona and Barja 2007); and dietary antioxidants may negate oxidative stress (Olsson et al. 2008a). Most studies on reptiles have focused on the fatty acid composition (polyunsaturated vs saturated fats) of the diet. In summary, changes in fatty acid composition of the diet can change the composition in tissues, blood plasma, and the plasma membranes, but different patterns across species and across tissues are reported (Cartland-Shaw et al. 1998; Simandle et al. 2001). Additionally, diets rich in polyunsaturated fats cause a
decrease in selected body temperature, an ability to decrease RMR with temperature, and a decrease in membrane rigidity (Geiser et al. 1992; Geiser and Learmonth 1994), which is particularly interesting in that polyunsaturated fats are important energy stores for hibernation.

It is apparent that local diet restrictions (seasonal fluctuations in prey type and availability) can impact individual reproduction and longevity. A heritable difference in prey preference has been documented in snakes (e.g., Arnold 1995). Thus, natural selection on prey preference may drive populations to evolve along the reproduction/longevity continuum (Fig. 2). Selection on one life history trait also affects other life history traits (genetically correlated traits), and the ability of one trait to evolve may actually be constrained by the trait on the flip side of the network.

In relation to oxidative stress, it is important to look at all aspects of oxidative stress (free radical production, antioxidant defense, protection, repair, and overall damage accumulation) to make predictions of how these molecular events can influence the evolution of life history traits. Evolutionary pressures may affect these underlying pathways differently in different species/populations. More detailed experiments in reptiles (both in natural populations and under controlled laboratory conditions) are necessary to tease apart these components of oxidative stress regulation in response to different environmental stresses. Monaghan et al. (2009) provide a useful review of procedures for measuring these components.

GENOMIC LEVEL SELECTION

As of yet, no complete annotated sequence of a reptile genome is available (although the first draft of the Anolis genome has been released). We anticipate that new high-throughput sequencing will change this quickly. While genomic data will soon be accessible, there will always be difficulty in doing laboratory-based molecular studies on life history evolution in reptiles due to their long generation time. Although some reptiles, such as Anolis lizards that reach sexual maturity in under a year and reproduce readily in captivity, will be good laboratory models, they represent only one side of the longevity continuum. Therefore, more studies will need to focus on wild populations.

For natural selection to have an effect there must be heritable genetic variation, and there is an extreme paucity of studies measuring genetic variation in stress response or ROS regulation in wild populations, much less attempting to estimate heritability. One exception in reptiles demonstrated genetic variation in ROS production (measured as superoxide, O2−) that was heritable (Olsson et al. 2008c). Thus, at least in this lizard species, ROS production has the ability to respond to selection. Predictably, advances in genomic and computational technologies will allow the reconstruction of population/pedigree history with genomic data and detection of heritability of traits in wild
populations along with detailed association studies (Stinchcombe and Hoekstra 2008). In addition, more gene expression studies would greatly assist in our understanding of stress effects and regulation at the genomic level. These types of studies on wild reptile populations may allow for further discovery of genes and pathways important for these life-history trade-offs that have not been identified in inbred laboratory animal models that have been selected for laboratory conditions.

SUMMARY

1. The life-history trade-off between reproduction and aging has been well established. Reptiles are excellent model systems to study the trade-off between reproduction and aging and the role of stress response pathways in the evolution of this trade-off.
2. Many types of stress responses use integrated networks, which may constrain or accelerate their ability to evolve. Genes known to be involved in longevity are also incorporated in these stress response networks, often as a major (highly connected) node.
3. Natural selection on the stress response will have a pleiotropic effect on longevity and the trade-off between longevity and reproduction. External stresses can be a source of mortality; thus they are agents of selection on the pathways that underlie the ability to respond appropriately to environmental cues.
4. Rate of aging has a genetic basis and longevity can evolve under direct selection. Longevity may also be able to evolve as a pleiotropic response to selection on stress response.
5. Future research needs to focus on network thinking and capitalize on the advances in genomics technology to understand the molecular basis for how evolution has shaped conserved molecular networks into unique evolutionary adaptations in reptiles.

REFERENCES

Alonso-Alvarez, C., Bertrand, S., Devevey, G., Prost, J., Faivre, B., and Sorci, G. 2004. Increased susceptibility to oxidative stress as a proximate cost of reproduction. Ecology Letters 7(5):363-368.
Balaban, R.S., Nemoto, S., and Finkel, T. 2005. Mitochondria, oxidants, and aging. Cell 120(4):483-495.
Barja, G. 2004. Aging in vertebrates, and the effect of caloric restriction: A mitochondrial free radical production-DNA damage mechanism? Biological Reviews 79(2):235-251.
Bell, R., Hubbard, A., Chettier, R., Chen, D., Miller, J.P., Kapahi, P., Tarnopolsky, M., Sahasrabuhde, S., Melov, S., and Hughes, R.E. 2009. A human protein interaction network shows conservation of aging processes between human and invertebrate species. PLoS Genet 5(3):e1000414.
Bickler, P.E., and Buck, L.T. 2007. Hypoxia tolerance in reptiles, amphibians, and fishes: Life with variable oxygen availability. Annual Review of Physiology 69:145-170.
Brand, M.D., Couture, P., Else, P.L., Withers, K.W., and Hulbert, A.J. 1991. Evolution of energy metabolism: Proton permeability of the inner membrane of liver mitochondria is greater in a mammal than in a reptile. Biochemical Journal 275:81-86.
Bronikowski, A.M. 2000. Experimental evidence for the adaptive evolution of growth rate in the garter snake Thamnophis elegans. Evolution 54(5):1760-1767.
Bronikowski, A.M. 2008. The evolution of aging phenotypes in snakes: A review and synthesis with new data. AGE 30(2-3):169-176.
Bronikowski, A.M., and Arnold, S.J. 1999. The evolutionary ecology of life history variation in the garter snake Thamnophis elegans. Ecology 80(7):2314-2325.
Bronikowski, A.M., and Arnold, S.J. 2001. Cytochrome b phylogeny does not match subspecific classification in the western terrestrial garter snake, Thamnophis elegans. Copeia 2001(2):508-513.
Brookes, P.S., Buckingham, J.A., Tenreiro, A.M., Hulbert, A.J., and Brand, M.D. 1998. The proton permeability of the inner membrane of liver mitochondria from ectothermic and endothermic vertebrates and from obese rats: Correlations with standard metabolic rate and phospholipid fatty acid composition. Comparative Biochemistry and Physiology Part B: Biochemistry and Molecular Biology 119(2):325-334.
Brown, R.P., and Griffin, S. 2005. Lower selected body temperatures after food deprivation in the lizard Anolis carolinensis. Journal of Thermal Biology 30(1):79-83.
Caro, P., Gomez, J., Lopez-Torres, M., Sanchez, I., Naudi, A., Jove, M., Pamplona, R., and Barja, G. 2008. Forty percent and eighty percent methionine restriction decrease mitochondrial ROS generation and oxidative stress in rat liver. Biogerontology 9(3):183-196.
Cartland-Shaw, L.K., Cree, A., Skeaff, C.M., and Grimmond, N.M. 1998. Differences in dietary and plasma fatty acids between wild and captive populations of a rare reptile, the tuatara (Sphenodon punctatus). Journal of Comparative Physiology B: Biochemical, Systemic, and Environmental Physiology 168(8):569-580.
Castoe, T.A., de Koning, A.P.J., Kim, H.-M., Gu, W., Noonan, B.P., Naylor, G., Jiang, Z.J., Parkinson, C.L., and Pollock, D.D. 2009. Evidence for an ancient adaptive episode of convergent molecular evolution. Proceedings of the National Academy of Sciences 106(22):8986-8991.
Catoni, C., Peters, A., and Martin Schaefer, H. 2008. Life history trade-offs are influenced by the diversity, availability and interactions of dietary antioxidants. Animal Behaviour 76(4):1107-1119.
Costanzo, J.P., Baker, P.J., and Lee, R.E. 2006. Physiological responses to freezing in hatchlings of freeze-tolerant and -intolerant turtles. Journal of Comparative Physiology B: Biochemical, Systemic, and Environmental Physiology 176(7):697-707.
Crain, D.A., Bolten, A.B., Bjorndal, K.A., Guillette, L.J., and Gross, T.S. 1995. Size-dependent, sex-dependent, and seasonal changes in insulin-like growth factor I in the loggerhead sea turtle (Caretta caretta). General and Comparative Endocrinology 98(2):219-226.
Evgen'ev, M.B., Garbuz, D.G., Shilova, V.Y., and Zatsepina, O.G. 2007. Molecular mechanisms underlying thermal adaptation of xeric animals. Journal of Biosciences 32(3):489-499.
Faulks, S.C., Turner, N., Else, P.L., and Hulbert, A.J. 2006. Calorie restriction in mice: Effects on body composition, daily activity, metabolic rate, mitochondrial reactive oxygen species production, and membrane fatty acid composition. Journals of Gerontology Series A: Biological Sciences and Medical Sciences 61(8):781-794.
Finkel, T., and Holbrook, N.J. 2000. Oxidants, oxidative stress and the biology of ageing. Nature 408:239-247.
Furtado, O.V., Polcheira, C., Machado, D.P., Mourao, G., and Hermes-Lima, M. 2007. Selected oxidative stress markers in a South American crocodilian species. Comparative Biochemistry and Physiology C: Toxicology & Pharmacology 146(1-2):241-254.
Geiser, F., Firth, B.T., and Seymour, R.S. 1992. Polyunsaturated dietary lipids lower the selected body temperature of a lizard. Journal of Comparative Physiology B: Biochemical, Systemic, and Environmental Physiology 162(1):1-4.
Geiser, F., and Learmonth, R.P. 1994. Dietary fats, selected body temperature and tissue fatty acid composition of agamid lizards (Amphibolurus nuchalis). Journal of Comparative Physiology B: Biochemical, Systemic, and Environmental Physiology 164(1):55-61.
Greer, E.L., and Brunet, A. 2008a. FOXO transcription factors in ageing and cancer. Acta Physiologica 192(1):19-28.
Greer, E.L., and Brunet, A. 2008b. Signaling networks in aging. J Cell Sci 121(4):407-412.
Guillette, J.L.J., Cox, M.C., and Crain, D.A. 1996. Plasma insulin-like growth factor-I concentration during the reproductive cycle of the American alligator (Alligator mississippiensis). General and Comparative Endocrinology 104(1):116-122.
Hatase, H., Sudo, R., Watanabe, K.K., Kasugai, T., Saito, T., Okamoto, H., Uchida, I., and Tsukamoto, K. 2008. Shorter telomere length with age in the loggerhead turtle: A new hope for live sea turtle age estimation. Genes & Genetic Systems 83(5):423-426.
Haussmann, M.F., Winkler, D.W., Huntington, C.E., Nisbet, I.C.T., and Vleck, C.M. 2007. Telomerase activity is maintained throughout the lifespan of long-lived birds. Experimental Gerontology 42(7):610-618.
Hay, J.M., Subramanian, S., Millar, C.D., Mohandesan, E., and Lambert, D.M. 2008. Rapid molecular evolution in a living fossil. Trends in Genetics 24(3):106-109.
Hinkal, G., and Donehower, L.A. 2008. How does suppression of IGF-1 signaling by DNA damage affect aging and longevity? Mechanisms of Ageing and Development 129(5):243-253.
Hulbert, A.J. 2008. The links between membrane composition, metabolic rate and lifespan. Comparative Biochemistry and Physiology A: Molecular & Integrative Physiology 150(2):196-203.
Hulbert, A.J., Faulks, S.C., and Buffenstein, R. 2006. Oxidation-resistant membrane phospholipids can explain longevity differences among the longest-living rodents and similarly-sized mice. J Gerontol A Biol Sci Med Sci 61(10):1009-1018.
Ibarguengoytia, N.R., and Casalins, L.M. 2007. Reproductive biology of the southernmost Homonota darwini: Convergent life-history patterns among southern hemisphere reptiles living in harsh environments. Journal of Herpetology 41(1):72-80.
Ilmonen, P., Kotrschal, A., and Penn, D.J. 2008. Telomere attrition due to infection. PLoS ONE 3(5):e2143.
Janke, A., Erpenbeck, D., Nilsson, M., and Arnason, U. 2001. The mitochondrial genomes of the iguana (Iguana iguana) and the caiman (Caiman crocodylus): Implications for amniote phylogeny. Proceedings of the Royal Society of London. Series B: Biological Sciences 268(1467):623-631.
Jena, B.S., Nayak, S.B., and Patnaik, B.K. 1998. Age-related changes in catalase activity and its inhibition by manganese (II) chloride in the brain of two species of poikilothermic vertebrates. Archives of Gerontology and Geriatrics 26(2):119-129.
Jiang, Z., Castoe, T., Austin, C., Burbrink, F., Herron, M., McGuire, J., Parkinson, C., and Pollock, D. 2007. Comparative mitochondrial genomics of snakes: Extraordinary substitution rate dynamics and functionality of the duplicate control region. BMC Evolutionary Biology 7(1):123.
Juarez, J.C., Manuia, M., Burnett, M.E., Betancourt, O., Boivin, B., Shaw, D.E., Tonks, N.K., Mazar, A.P., and Donate, F. 2008. Superoxide dismutase 1 (SOD1) is essential for H2O2-mediated oxidation and inactivation of phosphatases in growth factor signaling. Proceedings of the National Academy of Sciences 105(20):7147-7152.
Klapper, W., Heidorn, K., Kuhne, K., Parwaresch, R., and Krupp, G. 1998. Telomerase activity in 'immortal' fish. FEBS Letters 434(3):409-412.
Lutz, P.L., Prentice, H.M., and Milton, S.L. 2003. Is turtle longevity linked to enhanced mechanisms for surviving brain anoxia and reoxygenation? Experimental Gerontology 38(7):797-800.
Majhi, S., Jena, B.S., and Patnaik, B.K. 2000. Effect of age on lipid peroxides, lipofuscin and ascorbic acid contents of the lungs of male garden lizard. Comparative Biochemistry and Physiology C: Toxicology & Pharmacology 126(3):293-298.
Manier, M.K., Seyler, C.M., and Arnold, S.J. 2007. Adaptive divergence within and between ecotypes of the terrestrial garter snake, Thamnophis elegans, assessed with FST-QST comparisons. Journal of Evolutionary Biology 20(5):1705-1719.
Martin, I., and Grotewiel, M.S. 2006. Oxidative damage and age-related functional declines. Mechanisms of Ageing and Development 127(5):411-423.
McCue, M.D. 2008. Fatty acid analyses may provide insight into the progression of starvation among squamate reptiles. Comparative Biochemistry and Physiology A: Molecular & Integrative Physiology 151(2):239-246.
Mezentseva, N., Kumaratilake, J., and Newman, S. 2008. The brown adipocyte differentiation pathway in birds: An evolutionary road not taken. BMC Biology 6(1):17.
Monaghan, P., and Haussmann, M.F. 2006. Do telomere dynamics link lifestyle and lifespan? Trends in Ecology & Evolution 21(1):47-53.
Monaghan, P., Metcalfe, N.B., and Torres, R. 2009. Oxidative stress as a mediator of life history trade-offs: Mechanisms, measurements and interpretation. Ecology Letters 12(1):75-92.
Nappi, A.J., and Ottaviani, E. 2000. Cytotoxicity and cytotoxic molecules in invertebrates. Bioessays 22(5):469-480.
Olsson, M., and Shine, R. 2002. Growth to death in lizards. Evolution 56(9):1867-1870.
Olsson, M., Wilson, M., Isaksson, C., Uller, T., and Mott, B. 2008a. Carotenoid intake does not mediate a relationship between reactive oxygen species and bright colouration: Experimental test in a lizard. J Exp Biol 211(8):1257-1261.
Olsson, M., Wilson, M., Uller, T., Mott, B., and Isaksson, C. 2008b. Variation in levels of reactive oxygen species is explained by maternal identity, sex and body-size-corrected clutch size in a lizard. Naturwissenschaften.
Olsson, M., Wilson, M., Uller, T., Mott, B., Isaksson, C., Healey, M., and Wanger, T. 2008c. Free radicals run in lizard families. Biology Letters 4(2):186-188.
Paitz, R.T., Haussmann, M.F., Boden, R.M., Janzen, F.J., and Vleck, C. 2004. Long telomeres may minimize the effect of aging in the painted turtle. Integrative and Comparative Biology 44(617).
Pamplona, R., and Barja, G. 2007. Highly resistant macromolecular components and low rate of generation of endogenous damage: Two key traits of longevity. Ageing Research Reviews 6(3):189-210.
Peterson, C.C., Walton, B.M., and Bennett, A.F. 1999. Metabolic costs of growth in free-living garter snakes and the energy budgets of ectotherms. Functional Ecology 13:500-507.
Piper, M.D.W., Selman, C., McElwee, J.J., and Partridge, L. 2008. Separating cause from effect: How does insulin/IGF signalling control lifespan in worms, flies and mice? Journal of Internal Medicine 263(2):179-191.
Rey, B., Sibille, B., Romestaing, C., Belouze, M., Letexier, D., Servais, S., Barre, H., Duchamp, C., and Voituron, Y. 2008. Reptilian uncoupling protein: Functionality and expression in sub-zero temperatures. J Exp Biol 211(9):1456-1462.
Ricklefs, R. 2006. Embryo development and ageing in birds and mammals. Proc. R. Soc. Lond. B 273(1597):2077-2082.
Robert, K.A., Brunet-Rossinni, A., and Bronikowski, A.M. 2007. Testing the 'free radical theory of aging' hypothesis: Physiological differences in long-lived and short-lived colubrid snakes. Aging Cell 6(3):395-404.
Robert, K.A., Vleck, C., and Bronikowski, A.M. In Press. The effects of maternal corticosterone levels on offspring behavior in fast and slow growth garter snakes (Thamnophis elegans). Hormones and Behavior.
Schwartz, T.S., Murray, S., and Seebacher, F. 2008. Novel reptilian uncoupling proteins: Molecular evolution and gene expression during cold acclimation. Proceedings of the Royal Society B: Biological Sciences 275:979-985.
Scott, N.M., Haussmann, M.F., Elsey, R.M., Trosclair, P.L., and Vleck, C.M. 2006. Telomere length shortens with body length in Alligator mississippiensis. Southeastern Naturalist 5(4):685-692.
Secor, S. 2009. Specific dynamic action: A review of the postprandial metabolic response. Journal of Comparative Physiology B: Biochemical, Systemic, and Environmental Physiology 179(1):1-56.
Seebacher, F. 2005. A review of thermoregulation and physiological performance in reptiles: What is the role of phenotypic flexibility? Journal of Comparative Physiology B: Biochemical, Systemic, and Environmental Physiology 175(7):453-461.

Seebacher, F., and Murray, S.A. 2007. Transient receptor potential ion channels control thermoregulatory behaviour in reptiles. PLoS ONE 2(3):e281. Seehuus, S.-C., Norberg, K., Gimsa, U., Krekling, T., and Amdam, G.V. 2006. Reproductive protein protects functionally sterile honey bee workers from oxidative stress. Proceedings of the National Academy of Sciences of the United States of America 103(4):962-967. Shine, R. 2005. Life-history evolution in reptiles. Annual Review of Ecology, Evolution, and Systematics 36(1):23-46. Shine, R., and Madsen, T. 1997. Prey abundance and predator reproduction: Rats and pythons on a tropical australian floodplain. Ecology 78(4):1078-1086. Simandle, E.T., Espinoza, R.E., Nussear, K.E., and Tracy, C.R. 2001. Lizards, lipids, and dietary links to animal function. Physiological and Biochemical Zoology 74(5):625-640. Skulachev, V.P. 1998. Uncoupling: New approaches to an old problem of bioenergetics. Biochimica et Biophysica Acta 1363:100-124. Sparkman, A., and Palacious, M.G. In Press. A test of life history theories of immune defense in two ecotypes of the garter snake, thamnophis elegans. Animal Ecology. Sparkman, A.M., Arnold, S.J., and Bronikowski, A.M. 2007. An empirical test of evolutionary theories for reproductive senescence and reproductive effort in the garter snake thamnophis elegans. Proceedings of the Royal Society B-Biological Sciences 274(1612):943-950. Sparkman, A.M., Vleck, C.M., and Bronikowski, A.M. 2009. Evolutionary ecology of endocrine-mediated life-history variation in the garter snake thamnophis elegans. Ecology 90(3):720-728. Stinchcombe, J.R., and Hoekstra, H.E. 2008. Combining population genomics and quantitative genetics: Finding the genes underlying ecologically important traits. Heredity 100:158-170. Storey, K.B. 2006. Reptile freeze tolerance: Metabolism and gene expression. Cryobiology 52(1):1-16. Storey, K.B. 2007. Anoxia tolerance in turtles: Metabolic regulation and gene expression. 
Comparative Biochemistry and Physiology - Part A: Molecular & Integrative Physiology 147:236-276. Valdivia, P.A., Zenteno-Savin, T., Gardner, S.C., and Aguirre, A.A. 2007. Basic oxidative stress metabolites in eastern pacific green turtles (chelonia mydas agassizii). Comparative Biochemistry and Physiology C-Toxicology & Pharmacology 146(1-2):111-117. van der Pluijm, I., Garinis, G.A., Brandt, R.M.C., Gorgels, T.G.M.F., Wijnhoven, S.W., Diderich, K.E.M., de Wit, J., Mitchell, J.R., van Oostrom, C., Beems, R. and others. 2007. Impaired


genome maintenance suppresses the growth hormone-insulin-like growth factor 1 axis in mice with cockayne syndrome. PLoS Biology 5(1):e2. Wang, T., Hung, C.C.Y., and Randall, D.J. 2006. The comparative physiology of food deprivation: From feast to famine. Annual Review of Physiology 68(1):223. Warner, D., Lovern, M., and Shine, R. 2007. Maternal nutrition affects reproductive output and sex allocation in a lizard with environmental sex determination. Proceedings of the Royal Society B: Biological Sciences:FirstCite.

BOX 1: DEFINITIONS OF TERMS USED IN THIS CHAPTER
Acclimation (acclimatization, reversible phenotypic plasticity, phenotypic flexibility): Biochemical and physiological changes within an individual, occurring over a period of days to months, that allow it to function better in a changed environment. The ability to acclimate can be considered a trait itself.
Adaptation: Evolutionary adaptation occurs at the population level over generations. Natural selection acting on individuals leads to changes in genotypic frequencies in the next generation, allowing the population as a whole to be more fit in that environment.
Aging (senescence): A multi-causal process of physiological decline associated with a decrease in reproductive success and an increase in the probability of mortality.
Anoxia (hypoxia): Lack of oxygen.
Caloric restriction: Decreasing the daily caloric intake (e.g. a 30% reduction in the amount of calories consumed).
Fasting: In mammals and birds, this is characterized by the depletion of liver glycogen and the switch to gluconeogenesis using amino acids from muscle and glycerol from adipose tissue.
Hormesis: The beneficial effects of a non-lethal stress treatment. A single exposure to a stress causes a faster response to that same stress, and to other stresses, at a subsequent exposure. “That which does not kill us makes us stronger”.
Oxidative Stress: Occurs when there is an overproduction of ROS relative to the levels of antioxidants and other mechanisms that mitigate them, potentially resulting in oxidative damage.
Oxidative Damage: Damage to macromolecules (nucleic acids, proteins, lipid membranes) due to reactive oxygen species (ROS).
Pleiotropy: The influence of one gene (or a gene pathway) on multiple traits.


Phenotypic Plasticity: The ability of an organism (a genotype) to respond differently under different conditions. Acclimation is one type of phenotypic plasticity.
Reactive Oxygen Species (ROS): Highly reactive molecules, produced within the cell, that scavenge electrons from other molecules.
Reaction Norm: The environmental sensitivity of a genotype.
Starvation: In mammals and birds, this is the stage following fasting, characterized by depletion of adipose stores and rapid degradation of muscle.
Stress response: The immediate (~0-6 hours) response to environmental changes, often due to extreme environmental fluctuations.

BOX 2: CASE STUDY ON GARTER SNAKE LIFE-HISTORY ECOTYPES
While natural selection acts on the individual, evolution occurs at the level of the population. Therefore, studies of diverging populations of the same species are particularly informative as to how genomes can evolve under differing selection pressures and what the pleiotropic consequences are for life-history traits. This opportunity is provided by a unique natural study system: 25 populations of the riparian-aquatic garter snake, Thamnophis elegans, arrayed over a 100 km² study area at 1555-2055 m at the northern end of the Sierra Nevada Mountains in northeastern California. These populations consist of two distinct garter snake ecotypes (Slow- vs Fast-living) that inhabit contrasting environments (Mountain Meadow vs Lake-shore). The garter snake ecotypes have evolved different morphologies and life-history strategies along the growth/reproduction-longevity continuum (Table 1; Figure 2), as revealed by systematic mark/recapture studies since 1976 and by laboratory breeding and common-garden experiments (Bronikowski 2000; Bronikowski and Arnold 1999; Sparkman et al. 2007). Studies of mitochondrial DNA and nuclear markers (microsatellites) show that these two ecotypes have nearly identical genomes and that the ecotypic differentiation is driven by selection that counteracts moderate gene flow between ecotypes (Bronikowski and Arnold 2001; Manier et al. 2007). The contrasting habitat selection pressures and the evolved differences between the ecotypes match up in ways predicted by evolutionary theory. The Fast-living ecotype, found at widely spaced intervals along the shore of Eagle Lake, has higher extrinsic mortality due to predation, lower caloric restriction due to high prey abundance, and warmer temperatures due to lower elevation. In contrast, the Slow-living ecotype, found in the surrounding mountain meadows, has lower predation and probability


of mortality, high caloric variation because prey availability depends upon ephemeral ponds, and cooler temperatures at the higher elevations. One goal of on-going research is to identify the underlying cellular and molecular basis for the evolved ecotype differences. In doing so, researchers have identified key differences between the ecotypes in how they respond to stress in their behavior, hormone levels, free radical production, and DNA damage and repair (Bronikowski 2008; Robert et al. 2007; Robert et al. In Press; Sparkman et al. 2009). In summary, the Fast-living ecotype has higher ROS production (with and without stress), less efficient DNA repair mechanisms, and a stronger innate immune response, which includes the use of ROS to kill pathogens (Nappi and Ottaviani 2000; Sparkman and Palacious In Press). Future experiments will continue to characterize differences in stress response between these ecotypes, including whole-transcriptome gene expression and additional aspects of the regulation of ROS and oxidative stress. Thus far, evidence suggests that the Slow-living ecotype has a more efficient stress response and less potential for oxidative damage, which is in line with the idea presented in this chapter: that selection on stress response pathways may have pleiotropic effects on life-history traits.


Table 2-1: Summary of differences between slow-living and fast-living garter snake ecotypes around Eagle Lake, California.

Trait | Slow-living Ecotype | Fast-living Ecotype

HABITAT¹ | Grassy meadow | Rocky lakeshore
Elevation | 1630-2055 m | 1555 m
Summer daytime temperature | 15-30 °C | 20-34 °C
Avian predators | Medium bodied raptors, robins | Eagle, osprey, robin
Food/water availability | Variable across years | Continuous
Major prey types | Anurans, leech | Fish, leech (anurans in flood years)
Most common parasites | Tail trematodes | Mites

MORPHOLOGY/LIFE HISTORY²,³
Color and stripe patterns | Black with bright yellow stripe | Checkered, muted greys, browns
Adult body size (mean) | 538 mm (range: 370-598 mm) | 660 mm (range: 425-876 mm)
Female maturation size/age | 400 mm/5-7 years | 450 mm/3 years
Reproductive rate | Infrequent/resource dependent | Annual
Litter size (mean) | 4.3 liveborn (range: 1-6) | 8.8 liveborn (range: 1-21)
Baby mass | 2.85 g (range: 1.8-3.5 g) | 3.27 g (range: 2.5-4.2 g)
Annual adult Pr(survival) | 0.77/year | 0.48/year
Median life span (years) | 8 | 4

CELLULAR PHYSIOLOGY⁴,⁵
Whole animal metabolic rate at 28 °C | 0.48 ± 0.02 ml O2/hr | Statistically equivalent
H2O2 production under stress | 56 pmol/min x mg mitochondria | 240 pmol/min x mg mitochondria
DNA repair efficiency to UV damage | 73% | 35%
Field baseline corticosterone levels | 50±8 ng/ml plasma | 7.7±12 ng/ml plasma
IGF-1 levels | Lower/resource dependent | Consistently higher
Innate immune response | Low: Natural Antibodies (SheepRBC), Complement-mediated lysis, and Bactericidal Competence | Higher for all three immune measures

¹ Bronikowski & Arnold 1999
² Bronikowski 2000
³ Sparkman et al. 2007
⁴ Sparkman & Palacios 2009
⁵ Robert & Bronikowski, Accepted Pending Revision, “Physiological evolution in populations of garter snakes with divergent life histories” Am Nat


Figure 2-1: Simplified vertebrate phylogeny illustrating the relationships of reptiles relative to other major vertebrate groups. Branch lengths have no meaning.


Figure 2-2: Conceptual diagram illustrating environmental stresses as selective pressures affecting the organism through physiology and molecular stress response pathways that have pleiotropic effects on multiple life-history traits. Fitness consequences of being able to modulate these pathways allow for natural selection on the individual, causing the population to evolve along the trade-off continuum between longevity and growth/reproduction. The simplified molecular pathways depicted in the cell represent stress responses (or signaling cascades) that cause cytoplasmic transcription factors (HSF-1, CR, FOXO) to enter the nucleus upon perceived stress and drive transcription of genes involved in stress response and longevity. The mitochondrion is represented as the major source of energy and of ROS production, which antioxidants neutralize.

Abbreviations.
Extracellular: cort (corticosterone); CR (corticosterone receptor); IGF-1 (insulin-like growth factor-1); IGFR-1 (insulin-like growth factor-1 receptor).
Cytoplasmic: HSF-1 (heat shock factor-1); P-FOXO (phosphorylated Forkhead transcription factor, homologous to DAF-16 in C. elegans); P53 (protein 53); H2O2 (hydrogen peroxide); GPX (glutathione peroxidase-1); Fe- (iron); OH- (hydroxyl radical).
Nucleus: HSP genes (genes encoding heat shock proteins); GRE genes (genes containing glucocorticoid response elements); PGC-1α (peroxisome proliferator-activated receptor γ coactivator 1-α).
Mitochondria: UCP2 (uncoupling protein 2); I (ETC complex I, NADH dehydrogenase); CoQ (co-enzyme Q10); II (ETC complex II, succinate dehydrogenase); III (ETC complex III, bc1 complex); CytC (Cytochrome c); IV (ETC complex IV, cytochrome c oxidase); V (ETC complex V, ATP synthase); e- (electrons); SO (superoxide); SOD (superoxide dismutase); H2O2 (hydrogen peroxide).


CHAPTER 3. A GARTER SNAKE TRANSCRIPTOME: PYROSEQUENCING, DE NOVO ASSEMBLY, AND SEX-SPECIFIC DIFFERENCES

This is a published manuscript in the peer-reviewed scientific journal BMC Genomics.

Tonia S. Schwartz*, Hongseok Tae*, Youngik Yang, Keithanne Mockaitis, John Van Hemert, Stephen R. Proulx, Jeong-Hyeon Choi, and Anne M. Bronikowski

Authors' contributions: * authors contributed equally to this manuscript. T.S.S., S.P., and A.M.B. conceived and designed the research plan. A.M.B. collected animals. T.S.S. and A.M.B. dissected tissues and isolated the RNA. K.M. optimized the normalization and library preparation for the Titanium platform and conducted the sequencing. J-H.C. designed the bioinformatic analysis strategy and improved the assembly. H.T. analyzed homology-based annotation and ORF prediction; Y.Y. analyzed the graph-clustering and sequence variants. J.L.V. and T.S.S. conducted Ka/Ks analyses, additional annotation, and metagenomic analysis. T.S.S. drafted the manuscript. All authors contributed to the interpretation of the results and the content of the final manuscript.

ABSTRACT
Background
The reptiles, characterized by both diversity and unique evolutionary adaptations, provide a comprehensive system for comparative studies of metabolism, physiology, and development. However, molecular resources for ectothermic reptiles are severely limited, hampering our ability to study the genetic basis for many evolutionarily important traits such as metabolic plasticity, extreme longevity, limblessness, venom, and freeze tolerance. Here we use massively parallel sequencing (454 GS-FLX Titanium) to generate a transcriptome of the western terrestrial garter snake (Thamnophis elegans) with two goals in mind. First, we develop a molecular resource for an ectothermic reptile; and second, we use these sex-specific transcriptomes to identify differences in the presence of expressed transcripts and potential genes of evolutionary interest.


Results
Using sex-specific pools of RNA (one pool for females, one pool for males) representing 7 tissue types and 35 diverse individuals, we produced 1.24 million sequence reads, which averaged 366 bp in length after cleaning. Assembly of the cleaned reads from both sexes with NEWBLER and MIRA resulted in 96,379 contigs containing 87% of the cleaned reads. Over 34% of these contigs and 13% of the singletons were annotated based on homology to previously identified proteins. From these homology assignments, additional clustering, and ORF predictions, we estimate that this transcriptome contains ~13,000 unique genes that were previously identified in other species and over 66,000 transcripts from unidentified protein-coding genes. Furthermore, we use a graph-clustering method to identify contigs linked by NEWBLER-split reads that represent divergent alleles, gene duplications, and alternatively spliced transcripts. Beyond gene identification, we identified 95,295 SNPs and 31,651 INDELs. From these sex-specific transcriptomes, we identified 190 genes that were only present in the mRNA sequenced from one of the sexes (84 female-specific, 106 male-specific), and many highly variable genes of evolutionary interest.
Conclusions
This is the first large-scale, multi-organ transcriptome for an ectothermic reptile. This resource provides the most comprehensive set of EST sequences available for an individual ectothermic reptile species, increasing the number of snake ESTs 50-fold. We have identified genes that appear to be under evolutionary selection and those that are sex-specific. This resource will assist studies on gene expression and comparative genomics, and will facilitate the study of evolutionarily important traits at the molecular level.

BACKGROUND
Comparative studies are invaluable for understanding the evolution of complex traits. The reptile lineage has given rise to both a metabolically endothermic group (birds) and diverse ectothermic groups [1], thus providing a comprehensive system for comparative studies of metabolism, physiology, development and aging [2]. Ectothermic reptiles (e.g. turtles, crocodilians, tuatara, lizards, and snakes) exhibit extreme plasticity in their ability to modulate their metabolism in response to external stresses such as thermal and food stress [2-5]. Furthermore, they show extraordinary diversity in body structure (e.g. turtle shells, squamate limblessness), sex determining systems (e.g. temperature-dependent vs. genotypic), and sexual dimorphism. Snakes, in particular, have evolved dramatic adaptations (e.g. venom, limblessness) and have been useful models for evolution and ecology [6], but the pursuit to understand these evolutionarily important traits at the molecular level has been limited by the molecular resources available. The heightened interest in using reptiles in molecular genetic studies is signified by the first two on-going ectothermic reptile genome projects (Anolis lizard and painted turtle) conducted by NIH [7, 8], and the additional five ectothermic reptile genomes selected to be sequenced by BGI [9]. In addition, the recent publications of viper venom gland transcriptomes [10] and the first ectothermic reptile linkage map (crocodile) [11] emphasize that developing genomic resources for this interesting group is of utmost importance. First glimpses into the evolution of reptile genomes (endothermic and ectothermic) have revealed unique genomic attributes such as microchromosomes, the evolution of gene structure and gene synteny [12-14], as well as dramatic evolutionary changes in functionally important genes [15].

Ultimately we are interested in understanding how reptiles use their genomes in sex-specific ways to respond to environmental and evolutionary pressures, and how these responses affect reproduction and aging. A necessary first step is to develop and characterize molecular resources for a species of interest. The main focus of this paper is to develop a garter snake transcriptome so it can be used as a reference for future studies. The western terrestrial garter snake (Thamnophis elegans) is an emerging model for understanding the molecular basis for the evolution of life-history trade-offs between cellular maintenance and longevity versus growth and reproduction [16]. Closely related populations of this species, found in the Sierra Nevada Mountains, harbor two ecotypes that contrast in physical appearance, physiological and behavioural stress response, natural lifespan, and reproductive traits. These two ecotypes can be characterized as either “fast-living” or “slow-living” based on their overall life history [17-20].
These ecotypes derive from both genetic and environmental differences [17], and selection acts strongly on divergent traits despite low levels of gene flow among these populations [21, 22]. Additionally, males and females have different costs associated with reproduction, which is expected to cause sexual conflict at the genomic level. This conflict may be resolved through differential regulation of how these sexually antagonistic genes are utilized in each sex [23].

The paucity of molecular resources available for the garter snake and other ectothermic reptile species has limited our ability to identify the genetic basis for evolutionarily important traits, thus providing the inspiration behind this project. The application of new sequencing technologies, such as pyrosequencing, to traditional ecological models expands the horizon for ecological genomic studies [24-29]. Indeed, we are on the verge of elucidating the genetic basis of ecologically and evolutionarily relevant traits in natural populations. Here we apply this technology to develop a large scale, multi-tissue, multi-individual transcriptome using massively parallel sequencing with two goals in mind. Our first goal is to develop a molecular resource for the garter snake and make it available to the scientific community. This resource is a generalized transcriptome (i.e., RNA was pooled across ecotypes, populations, individuals, and tissues) for use as a reference for future studies. Our second goal is to identify sex-specific differences in the presence/absence of expressed transcripts by identifying transcripts that were present in the normalized library of one of the sexes but not the other. These data are accessible through the Garter Snake Transcriptome Browser at Indiana University CGB (https://lims.cgb.indiana.edu/cgi-bin/gbrowse/telegans_bronikowski_2/), the Bronikowski Lab Data Server (http://eco.bcb.iastate.edu/), and through NCBI Short Read Archive (SRA010134).

RESULTS AND DISCUSSION

Sampling and 454 GS-FLX Titanium sequencing
Our goal in sampling was to maximize the identification of unique transcripts, while capturing the diversity of expressed transcripts across tissues, individuals, populations, and stress conditions. Therefore, keeping male and female samples separate, we pooled RNA from 35 garter snakes (T. elegans) of varying sizes/ages (at least 1 year old) into two sex-specific RNA samples (sampling details in Additional file 1). The snakes were both laboratory-born and field-caught from seven focal populations of the Sierra Nevada Mountains in California. These sex-specific pools of RNA were used to develop normalized cDNA libraries that were sequenced on separate halves of a GS-FLX Titanium (Roche/454 Life Sciences) PicoTitre plate. We also obtained an extra quarter plate of male library reads for quality control assessment. This resulted in 446 Mbp of sequence data, 1.24 million reads (i.e. expressed sequence tags) averaging 366 bp in length after cleaning (Table 1; see Additional file 2 for size distribution of reads). The cleaned reads have been deposited in the NCBI Short Read Archive (SRA010134).

Assembly and annotation
The male and female reads were pooled for assembly, but the sex-of-origin for each read was tracked, which allowed contigs to be categorized as containing reads from both sexes, from males only, or from females only (Figure 1). Relative to the garter snake, the sequenced (draft) genome closest in evolutionary relationship is that of the Anolis lizard (Anolis carolinensis), which shared its most recent common ancestor with snakes ~215 million years ago [30]. The next evolutionarily closest species with a sequenced genome is the chicken (Gallus gallus), which shared its most recent common ancestor with snakes ~285 million years ago [31]. We used 454 gsMapper 2.3 to map the garter snake reads to the draft Anolis lizard and the chicken genomes. Even with a minimum percent identity of 80%, 773,997 (62%) and 536,900 (43%) of the cleaned reads, respectively, were mapped. Those mapped reads were assembled into 255,211 and 175,279 contigs for the lizard and chicken genomes, respectively.

For de novo assembly, reads were assembled using GS de novo Assembler (NEWBLER v2.0.00.22; Roche), resulting in 82,134 contigs, with 84% of the reads being placed, at an average depth of 7.6X coverage (Table 1). We then used the program MIRA [32] for additional assembly of the remaining singletons. This resulted in 33,338 singletons (25%) being assembled into an additional 14,245 contigs. The remaining singletons were mapped to contigs in order to remove sequence redundancy. Overall, 86.8% of reads were assembled into 96,379 contigs, with 7.5% of the reads remaining as singletons and 5.7% of reads being discarded (Table 1). Because many more of the reads were placed with the de novo assembly, we use the de novo assembly for all subsequent analyses.

The percentage of reads assembled de novo is similar to that in other studies that have applied pyrosequencing to non-model organisms [25, 27, 33]. The large number of contigs is likely due to the extensive diversity in the initial RNA samples (pooled across individuals and populations) in the form of sequence variants and alternative splicing. Different organs, different sexes, and different environmental/stress conditions are known to produce extensive alternatively spliced transcripts in vertebrates [34]. This variation causes misalignments between reads arising from the same genomic region, preventing correct assembly by most algorithms. For this reason, NEWBLER was used for assembly because it splits reads at the boundaries of variation in order to build contigs reflecting both static and variable regions [35] (see Additional file 3 for graphical representation of Newbler-split reads and assembly).
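The sex-of-origin bookkeeping described at the start of this section can be illustrated with a minimal Python sketch. It assumes a hypothetical data layout (a dictionary from contig ID to its read IDs, and a dictionary from read ID to a sex label); the function and variable names are illustrative only and are not taken from the assembly pipeline used in this study.

```python
def categorize_contigs(contig_reads, read_sex):
    """Label each contig as 'both', 'male_only' or 'female_only'.

    contig_reads: dict, contig ID -> iterable of read IDs (hypothetical layout)
    read_sex:     dict, read ID -> 'M' or 'F'
    """
    categories = {}
    for contig, reads in contig_reads.items():
        # Collect the set of sexes that contributed reads to this contig.
        sexes = {read_sex[r] for r in reads}
        if sexes == {"M", "F"}:
            categories[contig] = "both"
        elif sexes == {"M"}:
            categories[contig] = "male_only"
        elif sexes == {"F"}:
            categories[contig] = "female_only"
        else:
            raise ValueError(f"contig {contig} has no reads or an unknown sex label")
    return categories


# Toy data, invented for illustration.
toy_contig_reads = {
    "contigA": ["r1", "r2"],
    "contigB": ["r3"],
    "contigC": ["r4", "r5"],
}
toy_read_sex = {"r1": "M", "r2": "F", "r3": "M", "r4": "F", "r5": "F"}
toy_categories = categorize_contigs(toy_contig_reads, toy_read_sex)
```

Because the libraries were normalized, a contig labelled as single-sex in such a scheme indicates presence or absence in the sequenced pool rather than a quantitative expression difference.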
The NEWBLER contigs, the MIRA contigs, and the singletons were compared to three reference databases for annotation using BlastX: NCBI non-redundant protein database (NR); NCBI HomoloGene; and UniGene (Chicken). The sequences were also compared to the 18,031 Ensembl annotated genes (17,672 coding and 359 pseudogenes) from the Anolis lizard draft genome (AnoCar1.0; http://uswest.ensembl.org/Anolis_carolinensis/) using tBLASTx. Over 24% of all sequences (contigs and singletons) identified a homologue in at least one of these reference databases at e-value 1e-5 (Figure 2; annotated data can be downloaded from http://eco.bcb.iastate.edu/). As expected with longer sequences, a higher proportion of the contigs found an identity (34%) relative to the singletons (13%). Most of the contigs that found an identity in one database were identified in all four of the reference databases (Figure 2). This subset of contigs likely contains the more conserved genes that are well characterized and thus found in all of these databases. The proportion of contigs matching a homologue is similar to or higher than that in other sequencing studies on non-model organisms [28, 33].
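As a rough illustration of how the annotation tallies above could be computed, the following Python sketch filters hits at an e-value cut-off and counts sequences annotated in at least one database and in all databases. The input layout (tuples of sequence ID, database name, and e-value), the database labels, and the use of "less than or equal to" for the cut-off are assumptions made for this example, not details of the original analysis.

```python
E_VALUE_CUTOFF = 1e-5  # cut-off reported in the text; '<=' is an assumption


def annotated_sets(hits, databases, cutoff=E_VALUE_CUTOFF):
    """Return {database: set of sequence IDs with a hit passing the cut-off}.

    hits: iterable of (sequence_id, database, e_value) tuples (hypothetical layout)
    """
    per_db = {db: set() for db in databases}
    for seq_id, db, e_value in hits:
        if db in per_db and e_value <= cutoff:
            per_db[db].add(seq_id)
    return per_db


def summarize(per_db, n_sequences):
    """Count sequences annotated in any database and in every database."""
    in_any = set().union(*per_db.values())
    in_all = set.intersection(*per_db.values())
    return {
        "any": len(in_any),
        "all": len(in_all),
        "fraction_any": len(in_any) / n_sequences,
    }


# Toy data, invented for illustration: four sequences (s1..s4), four databases.
toy_databases = ["NR", "HomoloGene", "UniGene", "Anolis"]
toy_hits = [
    ("s1", "NR", 1e-30), ("s1", "HomoloGene", 1e-25),
    ("s1", "UniGene", 1e-22), ("s1", "Anolis", 1e-40),
    ("s2", "NR", 1e-8),
    ("s3", "NR", 1e-3),  # fails the cut-off, so s3 stays unannotated
]
toy_summary = summarize(annotated_sets(toy_hits, toy_databases), n_sequences=4)
```

The per-database sets returned by `annotated_sets` are also the natural input for a four-way Venn diagram of the kind referred to as Figure 2.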


The number of unique homologous genes identified in each database ranged from 26,232 in NCBI-NR to 12,953 in UniGene (Chicken) (Table 2). The BlastX results against the HomoloGene database and the Anolis lizard Ensembl annotation likely give the most realistic estimate of the number of unique genes in our dataset for which we could assign a homology ID, approximately 13,000. Naturally, the more quickly evolving genes and the snake-specific genes would be unlikely to find homologues in any of these databases. Thus, undoubtedly, there are uncharacterized genes yet to be discovered in the other 76% of the sequences for which we could not assign an ID based on these reference databases. ORF predictions indicate that 97% of the non-annotated sequences had a predicted open reading frame of at least 30 bp, which suggests that these were transcribed from protein-coding genes. The majority of the GO annotations assigned to the snake sequences were for the biological processes of metabolism and regulation, although there were also a smaller number of sequences assigned to reproduction, behaviour, and stress response (see Additional file 4 for GO pie graphs).

Using tBlastX, 55,715 snake transcripts (contigs and singletons) were also mapped to the lizard draft genome (AnoCar1.0). To identify 5’ and 3’ UTRs and non-coding RNAs, these matches to the Anolis genome were compared to the matches to the AnoCar1.0 Ensembl annotation of coding (18,031) and non-coding (2,939) RNA. We identified 2,322 snake transcripts that contained 5’UTR (286 matched to 5’UTR only, the rest matched 5’UTR along with protein coding sequence and/or intergenic sequence), and 3,680 that contained 3’UTR (1,018 matched to 3’UTR only, the rest matched 3’UTR along with protein coding sequence and/or intergenic sequence). Furthermore, 2,534 of our snake transcripts matched the Anolis non-coding RNAs, and 36,188 matched other intergenic regions of the Anolis genome and therefore may also be non-coding RNAs (see Additional file 5 to access these transcripts). A Snake Transcriptome Browser (GBrowse) of the snake transcripts mapped against the Anolis draft genome (AnoCar1.0) has been set up to visualize the data, and to access the assembled and annotated data files (https://lims.cgb.indiana.edu/cgi-bin/gbrowse/telegans_bronikowski_2/).

To check for additional non-coding RNAs, we used RNAmmer 1.2 [36] and tRNAscan-SE 1.23 [37]. One contig was predicted to be 8S rRNA, although this was not supported by the homology search through NCBI non-redundant protein sequences. Thus, it is likely a false prediction. Five singletons were identified as tRNAs: two sequences for tRNA-Asp, one for tRNA-His, one for tRNA-Lys, and one sequence for tRNA-Ser. Additionally, pseudogenes were predicted from 5 contigs and 11 singletons.
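The ORF criterion used earlier in this section (a predicted open reading frame of at least 30 bp) can be made concrete with a short Python sketch. The ORF definition here is an assumption for illustration: an ATG start codon through the first in-frame stop codon, searched in all six reading frames, with length counted in bp including both codons. The ORF prediction tool and settings actually used are described in the methods and may differ.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse-complement a DNA string (characters other than ACGT are kept)."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_orf_length(seq):
    """Length in bp of the longest ATG...stop ORF over six frames (0 if none)."""
    seq = seq.upper()
    best = 0
    for strand in (seq, reverse_complement(seq)):
        for frame in range(3):
            start = None
            # Walk the strand codon by codon in this frame.
            for i in range(frame, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    best = max(best, i + 3 - start)
                    start = None
    return best


def has_orf(seq, min_bp=30):
    """True if the sequence contains an ORF of at least min_bp."""
    return longest_orf_length(seq) >= min_bp


# Toy sequences, invented for illustration.
toy_long = "ATG" + "GCT" * 9 + "TAA"   # one 33 bp ORF on the forward strand
toy_short = "ATGGCTTAA"                # one 9 bp ORF
```

A 30 bp threshold is permissive (ten codons), so passing it is weak evidence on its own; this is one reason the text treats the result as suggestive of protein-coding origin rather than conclusive.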



Phylogenetic assessment of BLAST results 216 We use the phylogenetic distribution of the Blast hits to identify off-target sequences and as a 217 qualitative assessment of the assembly. Because we indiscriminately used all RNA isolated from six 218 tissue types and blood, which would contain disease and commensal organisms, we expected our 219 dataset to contain off-target species sequences that did not originate from the garter snake genome. To 220 identify these off-target sequences we used the metagenomics program MEGAN [38]. MEGAN 221 assigns a sequence to the lowest common ancestor of all its Blast assignments at a particular cut-off e- 222 value. Thus sequences placed at the deeper nodes of the phylogeny represent more conserved genes 223 relative to the genes on the leaves the tree that are more specific to that species. We used the output 224 from BlastX (e-value = 1e-5) against the NCBI-NR for the contigs and singletons (categorized based 225 on sex-of-origin) to map the hits on the NCBI taxonomic tree (Figure 3). We identified 979 sequences 226 (0.5% of the dataset) that were likely from off-target species including 51 that assigned to bacteria, 227 135 that assigned to viruses, and 11 that assigned to fungi. Interestingly, there was a bias towards 228 bacterial sequences originating from the male reads suggesting the possibility that one or more of the 229 males in our sample may have had a high bacterial infection load. 230 Equal proportions of sex-specific contigs and singletons were assigned to the nodes of the 231 taxonomic tree, which implies that our ability to map to NCBI was not biased towards one or the 232 other sex (Figure 3). Overall, 88% of the assigned sequences were assigned to or a lineage 233 within chordates; and 62% were assigned to tetrapods or a lineage within tetrapods (Figure 3). 
The placement of sequences at a particular node is highly dependent upon the relative abundance (or presence) of sequences available in NCBI-NR from each taxonomic group. Although there are far fewer protein sequences from reptiles (Sauropsida) in NCBI-NR compared to mammals (Mammalia), both lineages had roughly equal numbers of genes assigned within them. The species with the greatest number of genes matched was Gallus gallus (chicken), which is the evolutionarily closest species with a completed genome in the NCBI-NR protein database.

As an additional evaluation of the quality of our assembly and sequencing, we identified potential chimeric sequences, that is, sequences made up of two unique sequences that have been concatenated, such that the sequence had two highly significant (<1e-20) NCBI-NR hits to different genes that aligned to different ends of the contig or singleton. We identified 24 contigs and 2 singletons that had such signatures (i.e., representing < 0.025%). Of these, two of the contigs were adjacent mitochondrial genes representing a correct assembly, as the mitochondrial genome is known to be transcribed in large multi-gene segments [39]. The other 22 contigs and 2 singletons seemed to be true chimeras based on visual inspection of the BlastX hits. These chimeric sequences could be due to misassembly, chimerization during library construction, true biological gene fusion events, or errors in GenBank. Overall, the phylogenetic distribution of the BlastX hits and the low percentage of chimeric sequences provide qualitative support for our assembly.
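
The chimera screen described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure used in the study: the `Hit` record and the `is_potential_chimera` helper are hypothetical names, and "different ends of the contig" is approximated here as two hits whose aligned regions do not overlap on the query. Only the e-value threshold (1e-20) and the requirement of hits to two different genes come from the text.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Hit:
    """One BlastX hit on a query sequence (hypothetical record layout)."""
    gene: str      # identifier of the matched gene
    q_start: int   # 1-based start of the aligned region on the query
    q_end: int     # inclusive end of the aligned region on the query
    evalue: float


def is_potential_chimera(hits, max_evalue=1e-20):
    """Return True if two strong hits to different genes cover
    non-overlapping regions of the same query sequence."""
    strong = [h for h in hits if h.evalue < max_evalue]
    for a, b in combinations(strong, 2):
        if a.gene == b.gene:
            continue
        # Assumption: "different ends" is modelled as no overlap on the query.
        if a.q_end < b.q_start or b.q_end < a.q_start:
            return True
    return False
```

A candidate flagged this way would still need the visual inspection described in the text, since adjacent genes transcribed together (as in the mitochondrial genome) produce the same signature.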

Additional clustering

Complex vertebrate transcriptomes are characterized by numerous alternatively spliced transcripts and transcripts from duplicated genes [34, 40], making de novo assembly difficult. We used two methods for additional clustering of the contigs and singletons into reference gene sets: clustering based on homology, and clustering based on contig-graphs (Figure 1). Homology clustering uses BlastX against reference databases to group contigs and singletons into clusters that are likely to have originated from the same gene. Homology clustering of contigs and singletons based on the HomoloGene database produced 8,771 to 13,346 homology clusters, depending on the e-value cut-off used (Table 2). At e-value 1e-20, each of these homology clusters contained 1 to 93 contigs with an average of two contigs (see Additional file 6 Panels A and B for distributions of contigs and clusters). Additionally, ~2,000 of the contigs were assigned to two or more homology clusters (HomoloGene accessions); these contigs likely belong to gene families or are contigs with a strong domain found in multiple HomoloGenes. For example, contig00258 was assigned to four HomoloGene clusters, each of which contains a ubiquitin-conjugating enzyme E2 catalytic domain (UBCc). Homology comparisons to the draft gene models from the lizard genome had a smaller range of minimal gene sets (Table 2). These homology clusters contain contigs and singletons that represent alternative alleles, alternatively spliced transcripts, and non-overlapping contigs from the same gene and possibly closely related genes in the same family (particularly at the lower e-value).

Output clusters from the contig-graphing approach were used to group NEWBLER-generated contigs based on read-assembly information (see Methods and Additional file 3 for a complete description).
For alternatively spliced transcripts, highly diverged alleles and repeat regions of duplicated genes, NEWBLER (454 GS Assembler) splits reads into separate contigs. During assembly, split reads are tracked and that information can be used to reconstruct how contigs may be related. This can be illustrated graphically using networks (i.e., graph-clusters) such that nodes in the graphs represent contigs and an edge between two nodes represents a read split between two contigs. This clustering is independent of the contigs’ homology to other databases, and allows for further identification of divergent alleles, alternatively spliced transcripts, and gene families based on the structure of the network and nucleotide similarity of the contigs in the network (see Additional file 3 for more details of this method). We constructed 6,860 graph-clusters containing 27,860 NEWBLER contigs. The graph-clusters contained on average four contigs (range 2-95) (see Additional file 6 panel C for distribution of contigs in graph-clusters). This clustering method can be used to further merge variable alleles of the same gene, construct alternatively spliced transcripts, and identify gene families. Within these graph-clusters, we identified 554 components that are predicted to represent highly divergent alleles of the same gene that can then be merged, 1,293 components that are predicted to represent alternative splicing events, and 158 components (pairs of duplicated contigs) that are predicted to represent duplicated genes (Figure 4).

Evaluating the BlastX NCBI-NR results of the contigs within each graph-cluster component revealed differences among the classes of components. As expected for alleles of the same gene, the contigs from merged components (highly variable alleles of the same gene) were more likely to match the same gene in NCBI-NR than were contigs from alternatively spliced or duplicated gene components. Likewise, contigs in the 158 duplicated gene components were more likely to match different genes in NCBI-NR than contigs in either the merged components or the alternatively spliced components (Figure 4).

As an example of the usefulness of this graph-clustering method, we highlight graph-cluster05625 (Figure 5), which consists of 27 contigs, of which nine contigs have identical hits in NCBI-NR, three contigs have different hits in NCBI-NR, and 15 contigs have no hits in NCBI-NR. Ten of these NCBI-NR matches are to the hypervariable immunological gene MHC class I, which is well documented as being under diversifying selection in other species [41]. This clustering method allowed us to identify additional contigs (not identified by our homology searches) that represent multiple alleles, alternatively spliced transcripts and a potential gene duplication event for a highly important immunological gene complex.
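
The idea of treating contigs as nodes and split reads as edges can be illustrated with a short connected-components sketch. It is not the authors' implementation (which is described in Additional file 3); the function name `graph_clusters` and the input format, one `(contig_a, contig_b)` pair per split read, are assumptions made for illustration.

```python
from collections import defaultdict


def graph_clusters(split_reads):
    """Group contigs into clusters linked by split reads.

    split_reads: iterable of (contig_a, contig_b) pairs, one per read that
    the assembler split between two contigs (hypothetical input format).
    Returns a list of sets, each holding the contigs of one cluster.
    """
    neighbours = defaultdict(set)
    for a, b in split_reads:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen = set()
    clusters = []
    for start in neighbours:
        if start in seen:
            continue
        # Depth-first traversal collects every contig reachable from `start`.
        stack, members = [start], set()
        while stack:
            node = stack.pop()
            if node in members:
                continue
            members.add(node)
            stack.extend(neighbours[node] - members)
        seen |= members
        clusters.append(members)
    return clusters
```

Each returned set corresponds to one graph-cluster; deciding whether a component reflects divergent alleles, alternative splicing, or gene duplication requires the additional sequence-similarity criteria described in the Methods.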

Sequence variants

One of our goals for including diverse individuals from multiple populations was to identify sequence variants (SNPs and INDELs) that would be useful in future studies. At a Bayesian probability of 90% we identified 126,946 variants: 95,295 SNPs and 31,651 INDELs in either single contigs (23,615) or across contigs that had been merged into graph-clusters (549). Over 110,000 of these variants had >99% probability of being a true variant rather than a sequencing error. Of the single contigs that had variants, the average number of variants per contig was 5.16 (SNPs = 3.84, INDELs = 1.32), with an average rate of one variant per 200 bp window (see Additional file 7 for details on variants). The number of variants (SNPs and INDELs) per contig had a weak but significant correlation with contig length (F = 1980, DF = 23,615, R² = 0.07, p-value < 0.01: Figure 6 panel A), with the scatter suggesting that some contigs represented highly conserved genes with low variation and others had high levels of variation for their length.


The degree and type of variability within a contig can indicate selection acting on the gene. As expected, large contigs with few variants tended to be from highly conserved genes, including MACF1 (microtubule-actin cross-linking factor 1) and myosin. We identified 236 contigs that were highly variable (in the 99th percentile for highest ratio of variants per contig length) and, as expected of quickly evolving genes, few of these (19%) had homology hits in the BlastX searches, but 98% of them had predicted open reading frames (ORFs) larger than 30 bp. This size criterion suggests that the ORFs were within true protein-coding sequences. Of the 45 contigs that were highly variable and had a BlastX hit in NCBI-NR, at least 8 were from off-target sequences and 11 were associated with transposable elements, but there were also many sequences of interest that are known to be hypervariable and/or quickly evolving in vertebrate taxa. These included three immunological genes (MHC class I, complement factor-H related protein, interferon regulatory factor 7), a snake venom gene (venom factor 1), and a predicted pheromone receptor (see Additional file 8 for full annotation data on these genes of interest). Interestingly, there were also two highly variable genes involved in lipid metabolism or lipid oxidation: peroxisomal long-chain acyl-CoA thioesterase, and fatty acid desaturase 1. These genes may be particularly relevant for future studies on the evolution of metabolism and stress response in these garter snake populations.

Transitions (TS; mutations from a purine to a purine, or from a pyrimidine to a pyrimidine) occur more often than transversions (TV; mutations from a purine to a pyrimidine or vice versa). Thus, a TS/TV ratio < 1 may reveal sequences subjected to diversifying selection [42]. We found 73,836 TSs and 21,459 TVs in this dataset.
We identified 2,165 contigs with a TS/TV < 1. For SNPs within predicted coding regions, we determined whether they were non-synonymous polymorphisms (Ka) that changed the amino acid, or were synonymous polymorphisms (Ks). Overall, 29,883 of the SNPs found in a coding region were non-synonymous and 23,252 were synonymous. We found 8,417 contigs (8.7% of all contigs) with a Ka/Ks ratio > 1. This indicates that mutations have changed the amino acid sequence more than would be expected under a neutral model, and that these genes may be under diversifying selection within or among these snake populations. The distributions of TS/TV and Ka/Ks are in Figure 6A. Of most interest are the 16 contigs at the intersection of Ka/Ks > 1, TS/TV < 1, and the 99th percentile of highly variable contigs. Only three of these could be assigned a putative identification based on homology: the immune complement factor-H related protein, fatty acid desaturase 1, and a KRAB transcription factor. Revisiting the MHC class I graph-cluster05625 (Figure 5), which consists of 27 contigs: of the 20 contigs that had variants, 10 had Ka/Ks > 1. As predicted above, this further supports diversifying selection across this complex. Additional highly variable genes with high Ka/Ks ratios are likely to be targets of diversifying selection, potentially diversifying across the populations (or ecotypes) of the garter snakes.

Comparison between female and male

When the male and female reads were pooled and assembled into contigs, each read was tracked by the sex from which it was generated. Thus, the contigs and singletons could be classified by the sex of origin of their reads. Focusing only on the sequences for which we could assign an ID based on homology, NCBI-NR BlastX hits (1e-50) were summarized based on whether they were unique to females (i.e., found only in female contigs and/or female singletons), were unique to males, or were composed of reads from both sexes. In this way we identified 190 genes (195 contigs) that were only present in the mRNA sequenced from one of the sexes (see Additional file 8 for full annotation data on these genes of interest). Of these 190 genes, 84 were expressed only in females and 106 were expressed only in males. While this is a relatively low number of sex-specific genes, recall that our sex-specific pools of RNA included seven tissue types and were normalized in order to maximize the number of unique transcripts. Therefore, these data are not quantitative differences in expression in a particular tissue type, but are presence/absence data across all seven tissues. Most other studies that look at sex-specific differences use microarrays (or RNA-seq) on non-normalized libraries from a particular tissue and thus have quantitative data to show differences in the levels of expression between the sexes. Indeed, it is at the quantitative level that most genes are biased in their expression between the sexes [43]. Additionally, most genes that are biased in their expression between sexes are specific to a particular tissue. For example, a microarray study on chicken brain, gonad, and heart identified ~13,000 genes that had quantitative differences in one particular tissue, but only four genes were significantly different in all three tissues [44].
In addition, sex-specific genes are typically quickly evolving; thus, homologues may not be present in the available databases [43].

The female-specific sequences were enriched for GO terms for “biosynthetic processes” relative to the male-specific sequences (FDR < 0.006, p-value < 0.0002: see Additional file 9 for distribution of GO terms by sex). Although many of these sex-specific genes were uncharacterized or classified as “predicted coding genes”, some met our expectations for being sex-specific. For example, two of the male-specific genes are known to be involved in spermatogenesis (SPATA18, SPATA22) and two in sperm motility (CATSPERG, CATSPER2). Additionally, two female-specific genes are known to be involved in hormonal signalling and regulation (GnRH receptor and Irg1) [45-47]. Five of these 190 sex-specific genes also had a TS/TV ratio ≤ 1: the male-specific BTBD16, CATSPERG and CATSPER2, and the female-specific RAG1 and CDX4. Additionally, 13 of the sex-specific genes had a Ka/Ks ratio > 1; three of these overlapped with those that had a low TS/TV: male-specific CATSPERG, female-specific CDX4 (also on the human X chromosome), and female-specific RAG1.

These analyses suggest the CATSPER genes may be under strong selective pressure. The CATSPER genes (1-4) produce proteins that form an ion channel that is specific to the sperm flagellum, and are required for hyperactivated motility during the fertilization process [48-50]. Mutations in the CATSPER genes cause infertility in mice and humans [49-51]. The CATSPERG gene encodes an additional protein that is associated with the CATSPER ion channel [52]. As with many reptiles, garter snakes mate multiply and have multiple-paternity litters [53], which could lead to sperm competition and selective pressure on sperm function and motility. Further studies are needed to test the role of diversity in the CATSPER genes in reproductive fitness and sperm competition in this snake system. Although we hypothesize that these male-specific genes associated with sperm production and fertilization are being expressed in the gonads, we cannot address the question of tissue-of-origin with these data at this time. However, we are currently looking at gene-expression differences in individual-based and tissue-specific libraries.

Both snakes and birds have ZW / ZZ sex chromosomes, in which the female is the heterogametic sex. The chicken and snake are separated by ~285 million years and it has been well documented that reptile sex chromosomes have undergone drastic rearrangements over that time [54]. Therefore, any alignment of snake sequences to the chicken sex chromosomes would be a very interesting result. Indeed, the snake sequences aligned to 15 genes on the chicken Z chromosome.
Interestingly, four of these had female-specific expression and two had male-specific expression (see Additional file 8 for full annotation data of these genes of interest). It remains to be determined whether these genes reside on the garter snake Z (or W) chromosome. Equally interesting is that one gene known to be on the human X chromosome, CDX4, had female-specific expression, a TS/TV < 1, and a Ka/Ks > 1, suggesting it may be a sex-conflict gene that is quickly evolving in these snake populations.

Because the cDNA libraries used for sequencing were normalized, the identification of sex-specifically expressed genes is based on presence/absence rather than quantitative measures. Some of the genes identified here as sex-specific genes are likely under sex-biased expression, potentially the result of sex conflict resolved at the level of gene expression. Additionally, some of these sex-specific genes may reside on the garter snake sex chromosomes. Additional large-scale studies at the quantitative level are needed to verify the sex-specific expression of these genes.


CONCLUSIONS

We have successfully sequenced the first large-scale, multi-organ transcriptome for an ectothermic vertebrate using pyrosequencing and de novo assembly. In the process, we used a method for graphically clustering contigs after NEWBLER assembly that allowed us to identify divergent alleles, alternatively spliced transcripts and gene families. We have identified a number of interesting genes that are sex-specifically expressed and/or predicted to be quickly evolving, and that warrant additional investigation. These are the starting points for genetic studies on the evolution of metabolic and immune function, sexual conflict resolution, and the evolution of sex chromosomes.

This transcriptome is the most comprehensive set of published EST sequences available for an individual ectothermic reptile species. It has increased the number of nucleotide ESTs available for ectothermic reptiles 5X, and for snakes 50X. Additionally, we have identified over 100,000 high-confidence variants (SNPs and INDELs) that can be used for population genetic studies and quantitative trait mapping in this and related species.

These sequence data are a tool for future gene expression experiments, and comparative transcriptomic, genomic, and metabolomic studies. They can assist interested researchers in addressing evolutionary- and ecological-genomic questions in this and other reptile species. Ongoing and future studies can use this generalized transcriptome as a reference for mapping quantitative expression and sequence data from experiments that use, for example, short-read sequencing technologies. By combining these new sequencing technologies our labs hope to gain insight into how these snake life-history ecotypes have evolved, as well as how sex-conflict genes evolve. Overall, this is a valuable resource for the study of evolutionarily important traits at the molecular level.

METHODS

Sampling

A total of 35 western terrestrial garter snake (Thamnophis elegans) individuals of varying sizes/ages (at least 1 year old) were included in this transcriptome (17 females and 18 males: see Additional file 1 for details on sampling). The snakes were either laboratory-born and raised or field-caught from seven populations at the northern end of the Sierra Nevada Mountains in Lassen County, California, which included the two life-history ecotypes as previously described [2, 18, 55]. Adult field-caught snakes were shipped live to Iowa State University. The laboratory snakes were 2-year-old animals that had been used in a thermal experiment. Within 10 minutes of euthanizing, all organs and blood were either snap-frozen in liquid nitrogen and stored at -80°C, or put in RNAlater and kept at room temperature for 24 hours and then stored at -20°C. Organs included brain, gonads (from sexually mature individuals), heart, kidney, liver, spleen and blood (ISU IACUC 3-2-5125-J).

RNA isolation

Individual snake samples were pooled by tissue type and by sex for total RNA isolation using TriReagent and cleaned up using Qiagen RNeasy columns. The TriReagent manufacturer’s protocol was followed, but instead of precipitating the RNA, the supernatant from the chloroform step was added (0.75:1) to the RLT/BME buffer from the Qiagen RNeasy kit. From this point on the Qiagen protocol was followed. Quality of the RNA was verified with a Bioanalyzer nanochip (Agilent). Total RNA quantity was determined by both the Bioanalyzer nanochip and a Nanodrop. RNA from each tissue type was pooled in equal amounts into the respective Male and Female samples, and concentrations were determined by fluorometry using Quant-iT OliGreen (Invitrogen).

Library preparation and sequencing

Library preparations for GS FLX Titanium (Roche/454 Life Sciences) sequencing were developed in the Center for Genomics and Bioinformatics, Indiana University, based partially on methods for use in GS FLX standard sequencing described in Meyer et al. [25], with modifications (K. Mockaitis, unpublished). Briefly, cDNA was synthesized from 630 ng of each total RNA pool (male, female) in a manner similar to Clontech SMART protocols, using primers optimized for the 454 sequencing process, and amplified by PCR to generate dsDNA. For partial normalization to reduce sequence coverage of high copy number transcripts, amplified cDNA was subjected to controlled in-solution hybridization and duplex-specific nuclease (DSN) digestion using the Trimmer Direct kit (Evrogen) after reaction optimization. Normalized cDNA was then fragmented by sonication, and ends were enzymatically blunted and ligated to customized 454 adaptors. Amplification of ligation products exploited adaptor-mediated PCR suppression [25]. This method induces homo-adapted fragment hairpins, thereby severely limiting amplification of mis-ligated products. All amplification steps utilized high-fidelity polymerases. Final libraries were size selected by excision of the 500-800 bp fraction from agarose gels.

Emulsion PCR reactions were performed according to the manufacturer (Roche/454 Life Sciences). To optimize pyrosequencing throughput, final libraries were titrated by emulsion PCR bead enrichment prior to sequencing. Sequencing of male and female libraries was performed in parallel on a picotitre plate according to the manufacturer, and yielded 259 Mb (male) and 219 Mb (female) of sequence data in 1,291,869 reads with an average length of 386 nts.


Sequencing adapters (A and B) were automatically removed from the reads using the signal processing software (Roche/454 Life Sciences). The reads were further cleaned and the adaptors removed by a program developed in-house at CGB, Indiana University (http://sourceforge.net/projects/estclean/). After cleaning, sequences ≤ 30 bp were removed from the dataset. Thus the final cleaned dataset before assembly contained 1,238,280 reads with an average length of 366 nts (Table 1).

Assembly and annotation

The pooled reads were mapped to the lizard and chicken genomes using the GS mapper v2.3 with 80% identity. For de novo assembly, the reads were assembled into contigs using the GS de novo assembler (NEWBLER v2.0.00.22, Roche/454 Life Sciences) with the default parameters (40 bp overlap; 90% identity), resulting in 82,134 contigs and 134,971 singletons (Table 1). We found that some singletons could be aligned to the contigs with 95% identity through the entire region except less than 10 bp from both 5’ and 3’ ends. We used Blast to map 5,407 singletons to 2,471 NEWBLER contigs. An additional attempt was made to assemble the remaining singletons using MIRA [32]. The recommended parameters for 454 reads were used, resulting in an additional 14,245 contigs and 93,000 singletons remaining. Of these singletons, 339 reads were mapped to 380 contigs. Therefore, the final number of singletons is 92,561 (Table 1).

For gene identification, contigs were compared to the NCBI-NR protein database, HomoloGene, and UniGene (Chicken) [56] databases using BlastX and tBlastX with cut-off e-values of 1e-5, 1e-10, 1e-20 and 1e-50. Databases accessed in January 2010 were used for these analyses. Additionally, the contigs were mapped to the draft Anolis lizard genome (AnoCar1.0) [8] and Ensembl annotated gene models using tBLATx and BlastX. Open reading frames (ORFs) were predicted using OrfPredictor [57] with the NR BLAST hits and NCBI’s ORF Finder. Databases accessed in August 2010 were used for these analyses.

The NCBI-NR BlastX hits (e-value = 1e-5) for the NEWBLER contigs, MIRA contigs, and singletons were used with the program MEGAN [38] to map the sequences to the NCBI taxonomy (databases accessed in January 2010 for BlastX and February 2010 for taxonomy).
Sequences that were assigned to the very tips of the branches outside of Chordata were considered to have originated from off-target species (i.e., not from T. elegans). MEGAN was used to evaluate the GO annotation (GO Slim) assigned to each term (as of February 2010).
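
The lowest-common-ancestor rule that MEGAN applies can be illustrated with a simplified sketch. This is not MEGAN's code: the function names and the representation of each Blast hit as a root-to-leaf list of taxon names are assumptions, and MEGAN's score and support filters are not modelled here.

```python
def shared_lineage(lineages):
    """Return the taxa shared, from the root down, by every lineage.

    lineages: list of root-to-leaf lists of taxon names, one per Blast hit
    (hypothetical input format).
    """
    prefix = []
    for taxa in zip(*lineages):
        if len(set(taxa)) != 1:
            break
        prefix.append(taxa[0])
    return prefix


def lowest_common_ancestor(lineages):
    """Deepest taxon common to all hits, or None if nothing is shared."""
    prefix = shared_lineage(lineages)
    return prefix[-1] if prefix else None
```

With simplified lineages, a sequence whose hits are to chicken (`["Eukaryota", "Chordata", "Tetrapoda", "Sauropsida", "Gallus gallus"]`) and human (`["Eukaryota", "Chordata", "Tetrapoda", "Mammalia", "Homo sapiens"]`) would be placed at Tetrapoda, whereas a sequence with only chicken hits would be placed on the Gallus gallus leaf.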


Additional clustering

We used two methods for additional clustering: homology clustering, and a newly developed method we refer to as graph-clustering or contig-graphs (Figure 1). Homology clustering is the grouping of singletons and contigs based on their BlastX hits in HomoloGene and the draft lizard gene models (AnoCar1.0). The homology clustering was based on four different cut-off e-values: 1e-5, 1e-10, 1e-20 and 1e-50.

The graph-clusters were assembled independently of any comparisons to other databases, depending solely on the information derived from the original reads. The GS de novo (NEWBLER, Roche/454 Life Sciences) assembly program is set up to allow for improved contig alignments when there are abundant alternatively spliced transcripts and gene duplication events, by allowing reads to be split in two with each portion assigned to a different contig. We have developed a clustering method based on graph theory that uses this ‘historical’ information on how the reads were split and assigned to contigs (see Additional file 3 for full description). The underlying algorithm clusters contigs into network graphs where the contigs are the nodes and the split reads are the edges. These graph-clusters can contain components that represent 1) a single gene with divergent alleles, 2) a single gene with alternatively spliced transcripts, 3) closely related genes within a gene family (gene duplications), or 4) any combination of the three (Additional file 3). In the case of alternatively spliced transcripts, the nodes indicate exons, and the edges indicate the combinations of how these exons connect in the different transcripts (transcriptional paths). These exons could not be merged further based on similarity to each other.
In the case of duplicated genes, parts of the transcripts were highly similar and merged into the same contig, but reads covering the regions of the duplicated genes that have diverged were split into different contigs (nodes). These nodes representing the diverged regions of the duplicated genes were still quite similar (assuming > 80% but < 95% sequence identity). This similarity is used to distinguish whether a contig-graph represents an alternatively spliced gene or duplicated genes. Contig-graphs representing divergent alleles from the same gene are distinguished from duplicated genes by assuming that the alleles have > 95% sequence identity (see Additional file 3 for more details on the graph-clustering method).

For each component within a graph-cluster, the BlastX results for its contigs were summarized in order to classify the component into one of five categories: 1) none of the contigs had a homology hit; 2) some of the contigs had a homology hit, but to different files in NCBI; 3) some of the contigs had a homology hit, and they were to the identical file in NCBI; 4) all of the contigs had a homology hit, but to different files in NCBI; 5) all of the contigs had a homology hit, and they were all to the identical file in NCBI. These were summarized for each type of component.
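
The five-category summary above amounts to a small decision rule, sketched below. The function name `classify_component` and its input format (one hit identifier per contig, `None` for a contig without a hit) are assumptions for illustration; the sketch also assumes that a component in which only one contig has a hit counts as category 3, which the text does not specify.

```python
def classify_component(best_hits):
    """Assign a graph-cluster component to one of the five categories.

    best_hits: list with one entry per contig in the component; each entry
    is the identifier of the contig's BlastX hit, or None when the contig
    had no hit (hypothetical input format).
    """
    hits = [h for h in best_hits if h is not None]
    if not hits:
        return 1                       # no contig had a homology hit
    all_have_hits = len(hits) == len(best_hits)
    identical = len(set(hits)) == 1
    if not all_have_hits:
        return 3 if identical else 2   # some contigs hit: identical / different
    return 5 if identical else 4       # all contigs hit: identical / different
```

Tallying the returned category for every component, separately for merged, alternatively spliced and duplicated components, gives the kind of summary shown in Figure 4.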


Sequence variants

Sequence variants (SNPs and INDELs) were identified, and their probability of being a ‘true’ variant was estimated by Bayesian analysis, using the program GIGABAYES [58, 59]. Since NEWBLER aligns reads allowing gaps instead of substitutions, sequence variant callers cannot identify SNPs and INDELs accurately from its alignments; therefore MOSAIK [60] was used to realign reads against contigs. Because homopolymer errors are more prevalent with 454 sequences than ABI sequences, this has to be taken into account when calculating probabilities. INDELs called by GIGABAYES were filtered out if they were in homopolymer regions. To obtain high-confidence sequence variants, we filtered out SNPs and INDELs with read coverage < 5 or > 100, or with probability < 0.9.

The relationship between the number of variants and the length of the contig was tested using a regression analysis in R. Contigs in the 99th percentile of number of variants/contig length were considered highly variable. The ratio between transitions and transversions ((TS + 1) / (TV + 1)) was calculated for each contig containing SNPs. The one is added to allow the ratio to be calculated for contigs that had a zero for either TS or TV values. For each SNP-containing contig for which we had a predicted ORF, we calculated the ratio of non-synonymous (Ka) to synonymous (Ks) polymorphisms ((Ka + 1) / (Ks + 1)) (Python script available at http://eco.bcb.iastate.edu/).
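
The add-one ratios defined above can be sketched as follows. This is an independent illustration, not the script referenced in the text; the function names and the `(ref, alt)` input format are assumptions, and any substitution that is not between two purines or two pyrimidines (including ambiguous bases) is counted as a transversion.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """True for purine<->purine or pyrimidine<->pyrimidine substitutions."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt:
        raise ValueError("ref and alt must differ")
    return {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES


def smoothed_ratio(numerator, denominator):
    """(x + 1) / (y + 1), the add-one form given in the text."""
    return (numerator + 1) / (denominator + 1)


def ts_tv_ratio(snps):
    """snps: iterable of (ref, alt) base pairs for one contig."""
    snps = list(snps)
    ts = sum(is_transition(r, a) for r, a in snps)
    tv = len(snps) - ts
    return smoothed_ratio(ts, tv)
```

The same `smoothed_ratio` helper applies to Ka/Ks once each coding SNP has been labelled as non-synonymous or synonymous. Applied to the dataset-wide counts reported in the results (73,836 transitions and 21,459 transversions), it gives a ratio of about 3.44.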

Comparisons between female and male

The sex-of-origin of each read was tracked through the assembly so that the contigs could be classified as being a “female” contig (only containing female reads), a “male” contig (only containing male reads), or a “both” contig (containing both male and female reads) (Figure 1). To identify genes that are expressed by only one of the sexes, we compared the BlastX hits for each contig and singleton. For example, if a BlastX hit was found in only female contigs and/or female singletons, and had no homology to male contigs or male singletons at e-values down to 1e-50, then it was classified as female-specific. The same process was used to identify male-specific genes. These 190 genes were compared to RefSeq for all vertebrates. Their GO Slim assignments for Biological Processes were statistically compared in Blast2GO to test for enrichment of particular GO terms using Fisher’s Exact Test.
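
The presence/absence classification described above can be sketched as follows. The function names, the "F"/"M" read labels and the dictionary inputs are hypothetical, and the sketch leaves out the e-value thresholds and the GO enrichment test.

```python
def classify_contig(read_sexes):
    """Label a contig by the sex of origin of its reads.

    read_sexes: iterable of "F" / "M" labels, one per read in the contig
    (hypothetical encoding).
    """
    sexes = set(read_sexes)
    if not sexes or not sexes <= {"F", "M"}:
        raise ValueError("expected a non-empty collection of 'F'/'M' labels")
    if sexes == {"F"}:
        return "female"
    if sexes == {"M"}:
        return "male"
    return "both"


def sex_specific_genes(hits_by_sequence, labels):
    """Return genes hit only by female sequences or only by male sequences.

    hits_by_sequence: dict mapping sequence id -> gene identifier of its hit
    labels: dict mapping sequence id -> "female", "male" or "both"
    """
    seen = {}
    for seq_id, gene in hits_by_sequence.items():
        seen.setdefault(gene, set()).add(labels[seq_id])
    return {
        gene: next(iter(found))
        for gene, found in seen.items()
        if found in ({"female"}, {"male"})
    }
```

A gene is reported only when every sequence matching it carries the same single-sex label; a gene matched by one female contig and one male contig, or by any "both" contig, is excluded.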

AUTHORS' CONTRIBUTIONS

TSS, SP, and AMB conceived and designed the research plan. AMB collected animals. TSS and AMB dissected tissues and isolated the RNA. KM optimized the normalization and library preparation for the Titanium platform and conducted the sequencing. J-HC designed the bioinformatic analysis strategy and improved the assembly. HT analyzed homology-based annotation and ORF prediction. YIY analyzed the graph-clustering and sequence variants. JLV and TSS conducted Ka/Ks analyses, additional annotation, and metagenomic analysis. TSS drafted the manuscript. All authors contributed to the interpretation of the results and the content of the final manuscript.

ACKNOWLEDGEMENTS

We would like to acknowledge Steve Arnold for help collecting animals; Megan Manes and Courtney Wolken for laboratory assistance; Suzanne McGaugh, Lex Flagel, Autum Pairett, Dan Warner, and the Bronikowski Lab for comments; and three anonymous reviewers for their comments on improving this manuscript. Computer support was provided by the University Information Technology Services (UITS) and by The Center for Genomics and Bioinformatics computing group at IU. This research was funded by the National Science Foundation: IGERT in Computational Molecular Biology [0504304] (TSS), [IOS-0922528] (AMB); the National Research Foundation of Korea Grant funded by the Korean Government [NRF-2009-352-D00275] (HT), the Laurence H. Baker Center (SRP), and the ISU Center for Integrative Animal Genomics (AMB).

REFERENCES 584 1. Benton M, Donoghue P: Paleontological evidence to date the tree of life. Molecular Biology 585 and Evolution 2007, 24:26-53. 586 2. Schwartz TS, Bronikowski AM: Molecular stress pathways and the evolution of life histories 587 in reptiles. In: Molecular Mechanisms of Life History Evolution. Edited by Flatt T, Heyland A. 588 Oxford: Oxford Press; 2011. 589 3. Schwartz TS, Murray S, Seebacher F: Novel reptilian uncoupling proteins: molecular 590 evolution and gene expression during cold acclimation. Proceedings of the Royal Society of 591 London Series B 2008, 275:979-985. 592 4. Storey KB: Reptile freeze tolerance: Metabolism and gene expression. Cryobiology 2006, 593 52(1):1-16. 594 5. Secor S: Specific dynamic action: a review of the postprandial metabolic response. Journal 595 of Comparative Physiology B: Biochemical, Systemic, and Environmental Physiology 2009, 596 179(1):1-56. 597 6. Shine R: Life-history evolution in reptiles. Annual Review of Ecology, Evolution, and 598 Systematics 2005, 36(1):23-46. 599 7. National Institutes of Health: The Large-Scale Genome Sequencing Program 600 [http://www.genome.gov/10002154] 601 8. The Broad Institute: [https://www.broadinstitute.org/models/anole] 602

55

9. BGI: The 1000 plant and animal reference genome project 603 [http://www.ldl.genomics.cn/page/ldl/animal.jsp] 604 10. Casewell N, Harrison R, Wuster W, Wagstaff S: Comparative venom gland transcriptome 605 surveys of the saw-scaled vipers (Viperidae: Echis) reveal substantial intra-family gene 606 diversity and novel venom transcripts. BMC Genomics 2009, 10(1):564. 607 11. Miles L, Isberg S, Glenn T, Lance S, Dalzell P, Thomson P, Moran C: A genetic linkage map 608 for the saltwater crocodile (Crocodylus porosus). BMC Genomics 2009, 10(1):339. 609 12. Kumazawa Y, Endo H: Mitochondrial genome of the Komodo dragon: Efficient sequencing 610 method with reptile-oriented primers and novel gene rearrangements. DNA Research 2004, 611 11(2):115-125. 612 13. Organ CL, Moreno RG, Edwards SV: Three tiers of genome evolution in reptiles. Integr Comp 613 Biol 2008, 48(4):494-504. 614 14. Kumazawa Y, Ota H, Nishida M, Ozawa T: Gene rearrangements in snake mitochondrial 615 genomes: Highly concerted evolution of control-region-like sequences duplicated and 616 inserted into a tRNA gene cluster. Mol Biol Evol 1996, 13(9):1242-1254. 617 15. Castoe TA, Jiang ZJ, Gu W, Wang ZO, Pollock DD: Adaptive evolution and functional 618 redesign of core metabolic proteins in snakes. PLoS ONE 2008, 3(5):e2201. 619 16. Bronikowski AM: The evolution of aging phenotypes in snakes: a review and synthesis with 620 new data. AGE 2008, 30(2-3):169-176. 621 17. Bronikowski AM: Experimental evidence for the adaptive evolution of growth rate in the 622 garter snake Thamnophis elegans. Evolution 2000, 54(5):1760-1767. 623 18. Bronikowski AM, Arnold SJ: The evolutionary ecology of life history variation in the garter 624 snake Thamnophis elegans. Ecology 1999, 80(7):2314-2325. 625 19. Sparkman AM, Arnold SJ, Bronikowski AM, Herriges MJ: Evolutionarily divergent patterns 626 of age-specific reproduction in the garter snake Thamnophis elegans. Integrative and 627 Comparative Biology 2005, 45(6):1196-1196. 
628 20. Sparkman AM, Vleck CM, Bronikowski AM: Evolutionary ecology of endocrine-mediated 629 life-history variation in the garter snake Thamnophis elegans. Ecology 2009, 90(3):720-728. 630 21. Manier MK, Arnold SJ: Ecological correlates of population genetic structure: a comparative 631 approach using a vertebrate metacommunity. Proceedings of the Royal Society of London 632 Series B 2006, 273:3001-3009. 633

56

22. Manier MK, Seyler CM, Arnold SJ: Adaptive divergence within and between ecotypes of the 634 terrestrial garter snake, Thamnophis elegans, assessed with FST-QST comparisons. Journal 635 of Evolutionary Biology 2007, 20(5):1705-1719. 636 23. Rice WR, Chippindale AK: Intersexual ontogenetic conflict. Journal of Evolutionary Biology 637 2001, 14(5):685-693. 638 24. Wicker T, Schlagenhauf E, Graner A, Close T, Keller B, Stein N: 454 sequencing put to the test 639 using the complex genome of barley. BMC Genomics 2006, 7(1):275. 640 25. Meyer E, Aglyamova G, Wang S, Buchanan-Carter J, Abrego D, Colbourne J, Willis B, Matz M: 641 Sequencing and de novo analysis of a coral larval transcriptome using 454 GS-FLX. BMC 642 Genomics 2009, 10(1):219. 643 26. Toth AL, Varala K, Newman TC, Miguez FE, Hutchison SK, Willoughby DA, Simons JF, 644 Egholm M, Hunt JH, Hudson ME et al: Wasp gene expression supports an evolutionary link 645 between maternal behavior and eusociality. Science 2007, 318(5849):441-444. 646 27. Vera JC, Wheat CW, Fescemyer HW, Frilander MJ, Crawford DL, Hanski I, Marden JH: Rapid 647 transcriptome characterization for a nonmodel organism using 454 pyrosequencing. Mol 648 Ecol 2008, 17(7):1636-1647. 649 28. Hale M, McCormick C, Jackson J, DeWoody JA: Next-generation pyrosequencing of gonad 650 transcriptomes in the polyploid lake sturgeon (Acipenser fulvescens): the relative merits of 651 normalization and rarefaction in gene discovery. BMC Genomics 2009, 10(1):203. 652 29. Schwarz D, Robertson H, Feder J, Varala K, Hudson M, Ragland G, Hahn D, Berlocher S: 653 Sympatric ecological speciation meets pyrosequencing: sampling the transcriptome of the 654 apple maggot Rhagoletis pomonella. BMC Genomics 2009, 10(1):633. 655 30. Vidal N, Hedges SB: The phylogeny of squamate reptiles (lizards, snakes, and 656 amphisbaenians) inferred from nine nuclear protein-coding genes. Comptes Rendus Biologies 657 2005, 328(10-11):1000-1008. 658 31. 
Rest JS, Ast JC, Austin CC, Waddell PJ, Tibbetts EA, Hay JM, Mindell DP: Molecular 659 systematics of primary reptilian lineages and the tuatara mitochondrial genome. Molecular 660 Phylogenetics and Evolution 2003, 29(2):289-297. 661 32. Chevreux B, Pfisterer T, Drescher B, Driesel AJ, Müller WEG, Wetter T, Suhai S: Using the 662 miraEST assembler for reliable and automated mRNA transcript assembly and SNP 663 detection in sequenced ESTs. Genome Research 2004, 14(6):1147-1159. 664

57

33. Wang W, Wang Y, Zhang Q, Qi Y, Guo D: Global characterization of Artemisia annua 665 glandular trichome transcriptome using 454 pyrosequencing. BMC Genomics 2009, 666 10(1):465. 667 34. Kim E, Magen A, Ast G: Different levels of alternative splicing among eukaryotes. Nucleic 668 Acids Research 2006:gkl924. 669 35. Margulies M, Egholm M, Altman WE, Attiya S, Bader JS, Bemben LA, Berka J, Braverman MS, 670 Chen Y-J, Chen Z et al: Genome sequencing in microfabricated high-density picolitre 671 reactors. Nature 2005, 437(7057):376-380. 672 36. Lagesen K, Hallin P, Andreas Rodland E, Staerfeldt H-H, Rognes T, Ussery DW: RNAmmer: 673 consistent and rapid annotation of ribosomal RNA genes. Nucleic Acids Research 2007, 674 35(1):3100-3108. 675 37. Lowe TM, Eddy SR: tRNAscan-SE: a program for improved detection of transfer RNA 676 genes in genomic sequence. Nucleic Acids Res 1997, 25:955-964. 677 38. Huson D, Auch A, Qi J, Schuster SC: MEGAN analysis of metagenomic data. Genomce 678 Research 2007, 17:377-386. 679 39. Scheffler IE: Mitochondria: Second edition. Hoboken, New Jersey: John Wiley & Sons, Inc; 680 2008. 681 40. Lynch M: The origins of genome architecture. Sunderland, MA: Sinauer Associates, Inc.; 2007. 682 41. Klein J: Natural history of the major histocompatibility complex. New York: John Wiley & 683 Sons Inc; 1986. 684 42. Kimura M: A simple method for estimating evolutionary rates of base substitutions through 685 comparative studies of nucleotide sequences. J Mol Evol 1980, 16(2):111-120. 686 43. Ellegren H, Parsch J: The evolution of sex-biased genes and sex-biased gene expression. Nat 687 Rev Genet 2007, 8(9):689-698. 688 44. Mank J, Hultin-Rosenberg L, Webster M, Ellegren H: The unique genomic properties of sex- 689 biased genes: Insights from avian microarray data. BMC Genomics 2008, 9(1):148. 690 45. 
Chen B, Zhang D, Pollard JW: Progesterone regulation of the mammalian ortholog of 691 methylcitrate dehydratase (Immune Response Gene 1) in the uterine epithelium during 692 implantation through the protein kinase C pathway. Molecular Endocrinology 2003, 693 17(11):2340-2354. 694 46. Cheon Y-P, Xu X, Bagchi MK, Bagchi IC: Immune-Responsive Gene 1 is a novel target of 695 progesterone receptor and plays a critical role during implantation in the mouse. 696 Endocrinology 2003, 144(12):5623-5630. 697

58

47. Millar RP, Lu Z-L, Pawson AJ, Flanagan CA, Morgan K, Maudsley SR: Gonadotropin- 698 releasing hormone receptors. Endocrine Reviews 2004, 25(2):235-275. 699 48. Quill TA, Ren D, Clapham DE, Garbers DL: A voltage-gated ion channel expressed 700 specifically in spermatozoa. P Natl Acad Sci USA 2001, 98(22):12527-12531. 701 49. Ren D, Navarro B, Perez G, Jackson AC, Hsu S, Shi Q, Tilly JL, Clapham DE: A sperm ion 702 channel required for sperm motility and male fertility. Nature 2001, 413(6856):603-609. 703 50. Qi H, Moran MM, Navarro B, Chong JA, Krapivinsky G, Krapivinsky L, Kirichok Y, Ramsey IS, 704 Quill TA, Clapham DE: All four CatSper ion channel proteins are required for male fertility 705 and sperm cell hyperactivated motility. Proceedings of the National Academy of Sciences 706 2007, 104(4):1219-1223. 707 51. Avenarius MR, Hildebrand MS, Zhang Y, Meyer NC, Smith LLH, Kahrizi K, Najmabadi H, 708 Smith RJH: Human male infertility caused by mutations in the CATSPER1 channel protein. 709 The American Journal of Human Genetics 2009, 84(4):505-510. 710 52. Wang H, Liu J, Cho K-H, Ren D: A novel, single, transmembrane protein CATSPERG is 711 associated with CATSPER1 channel protein. Biology of Reproduction 2009, 81(3):539-544. 712 53. Garner TWJ, Larsen KW: Multiple paternity in the western terrestial garter snake, 713 Thamnophis elegans. Canadian Journal of Zoology 2005, 83:656-663. 714 54. Graves JAM, Shetty S: Sex from W to Z: Evolution of vertebrate sex chromosomes and sex 715 determining genes. Journal of Experimental Zoology 2001, 290(5):449-462. 716 55. Robert K, Vleck C, Bronikowski A: The effects of maternal corticosterone levels on offspring 717 behavior in fast- and slow-growth garter snakes (Thamnophis elegans). Hormones and 718 Behavior 2009, 55(1):24-32. 719 56. Pontius JU, Wagner L, Schuler GD: UniGene: a unified view of the transcriptome. In: The 720 NCBI Handbook. Bethesda, MD: National Center for Biotechnology Information; 2003. 721 57. 
Min XJ, Butler G, Storms R, Tsang A: OrfPredictor: predicting protein-coding regions in 722 EST-derived sequences. Nucleic Acids Research 2005, 33(suppl_2):W677-680. 723 58. Marth GT, Korf I, Yandell MD, Yeh RT, Gu Z, Zakeri H, Stitziel NO, Hillier L, Kwok P-Y, Gish 724 WR: A general approach to single-nucleotide polymorphism discovery. Nat Genet 1999, 725 23(4):452-456. 726 59. Quinlan AR, Stewart DA, Stromberg MP, Marth GT: Pyrobayes: an improved base caller for 727 SNP discovery in pyrosequences. Nat Meth 2008, 5(2):179-181. 728 60. MOSAIK Assembler 1.0: [http://bioinformatics.bc.edu/marthlab/Mosaik] 729 730

59

Table 3-1. Summary statistics for 454 sequencing and de novo assembly

                                        Female     Male       Total
Total number of raw reads               538,706    753,163    1,291,869
Number of reads after cleaning          507,466    730,814    1,238,280
Amount of cleaned data                  207 Mbp    246 Mbp    453 Mbp
Average length of cleaned reads (bp)    408        337        366
NEWBLER contigs (large >500 bp)         NA         NA         82,134 (30,520)
NEWBLER contig bp total                 NA         NA         48 Mbp
NEWBLER contig length (bp)              NA         NA         90-10,680 (avg 581)
Average coverage (NEWBLER)              NA         NA         7.6X
Reads placed (NEWBLER)                  NA         NA         1,035,854
Singletons (after NEWBLER)              NA         NA         134,971
MIRA contigs                            NA         NA         14,245
Reads placed (MIRA)                     NA         NA         33,093
Singletons mapped by BLAST to contigs   NA         NA         5,846
Reads discarded (1)                     NA         NA         70,926 (5.7%)
Remaining singletons                    NA         NA         92,561 (7.5%)
Total reads placed                      NA         NA         1,074,793 (86.8%)
Total number of contigs                 NA         NA         96,379
SNPs                                    NA         NA         95,295
INDELs                                  NA         NA         31,651

(1) Reads were discarded either due to redundancy (3,461) or for quality control by NEWBLER or MIRA during the assembly process (67,455 and 10, respectively).
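The totals in Table 3-1 follow from one another, and the bookkeeping can be checked with simple arithmetic. The sketch below uses only values printed in the table and its footnote; the variable names are illustrative, and the assumption that every cleaned read ends up either placed, discarded, or as a remaining singleton is ours.

```python
# Values copied from Table 3-1 (totals across both sexes).
cleaned_reads = 1_238_280
placed_newbler = 1_035_854
placed_mira = 33_093
singletons_mapped_by_blast = 5_846
discarded = 3_461 + 67_455 + 10  # redundancy + NEWBLER QC + MIRA QC
newbler_contigs = 82_134
mira_contigs = 14_245

total_placed = placed_newbler + placed_mira + singletons_mapped_by_blast
# Assumption: each cleaned read is placed, discarded, or left as a singleton.
remaining_singletons = cleaned_reads - total_placed - discarded
total_contigs = newbler_contigs + mira_contigs


def percent_of_cleaned(count):
    """Express a read count as a percentage of all cleaned reads."""
    return round(100 * count / cleaned_reads, 1)


print(total_placed, percent_of_cleaned(total_placed))
print(discarded, percent_of_cleaned(discarded))
print(remaining_singletons, percent_of_cleaned(remaining_singletons))
print(total_contigs)
```

Under that assumption the computed counts and percentages reproduce the table rows for total reads placed, reads discarded, remaining singletons, and total number of contigs.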


Table 3-2. The number of unique homologues in each database identified through BlastX using four e-value cut-offs. Databases as of January 2010.

Database                          1e-5      1e-10     1e-20     1e-50
NCBI-NR                           26,232    23,710    19,485    12,197
UniGene (Chicken)                 12,953    12,317    11,348    9,012
Anolis Ensembl annotated genes    13,923    13,533    12,806    10,515
HomoloGene                        13,346    12,876    11,803    8,771

[Figure 3-1: flow diagram. The female RNA pool (17 individuals, 7 tissue types, 7 populations, both ecotypes) and the male RNA pool (18 individuals, 7 tissue types, 6 populations, both ecotypes) were each prepared as a normalized library and sequenced with GS-FLX Titanium. Subsequent steps: NEWBLER assembly, Blast mapping, MIRA assembly, Blast mapping, homology clustering (HomoloGene, Anolis gene models), and graph clusters.]

Figure 3-1. Flow-diagram of the assembly procedure. Female and male normalized pools of cDNA were sequenced separately using Roche 454 GS-FLX Titanium chemistries. NEWBLER was initially used to assemble the cleaned reads into contigs that could be classified into three categories based on the sex-of-origin of the reads: Female Contigs (FC, contigs made only from female reads); Male Contigs (MC, contigs made only from male reads); Both Contigs (BC, contigs made from both male and female reads). The remaining male and female singletons went through additional assembly using MIRA. After each assembly process an attempt was made to map the remaining singletons to the contigs using Blast. The contigs and singletons were clustered based on homology in the HomoloGene database or to the Ensembl annotated gene models from the Anolis lizard draft genome (AnoCar1.0). The NEWBLER contigs were put into graph-clusters based on how reads were split and assigned to different contigs during the NEWBLER assembly process (see Additional file 3 for full description). In the graph-clusters the contigs (nodes) are linked by reads (edges) that were split between the contigs.

[Figure 3-2: four-way Venn diagram (UniGene chicken, NCBI-NR, HomoloGene, and Anole anoCar1.0 Ensembl annotation), with counts given as contigs [singletons]. Total: 96,379 [92,561]. Values listed by cut-off: e-5, 63,455 [79,650]; e-10, 67,535 [83,784]; e-20, 73,038 [87,513]. The counts for the individual Venn regions cannot be assigned from the extracted text.]

Figure 3-2. Venn diagram of BlastX results. The number of contigs (both NEWBLER and MIRA) and singletons that found a homologue when Blasted against the UniGene (chicken), HomoloGene, and NCBI-NR databases and the draft lizard (Anolis carolinensis) genome and transcriptome (anoCar1.0) at three different e-value cut-offs.

[Figure 3-3: panels A and B. Legend classes: contigs - both sexes; contigs - females only; contigs - males only; singletons - females; singletons - males.]

Figure 3-3. Taxonomic assignments of sequences. Assignment of each class of sequences to the least common ancestor of their BlastX hits (e-value = 1e-5) using MEGAN. A) Distribution of assignments on the NCBI taxonomy tree. The pie graphs are proportional to the number of sequences assigned to that node, whereas the numbers are the accumulative sum of sequences assigned within that subclade. B) Expansion of the Tetrapoda node. The yellow arrow indicates the placement of the garter snake on the taxonomic tree.

[Figure 3-4: stacked bar chart (y-axis 0% to 100%) with the data table below.]

                                         Merged     Alternative spliced   Duplicate genes
                                         (n=554)    (n=1293)              (n=158)
No contigs had hits                      215        197                   93
Some contigs had hits: different hits    0          35                    0
Some contigs had hits: identical hits    21         630                   28
All contigs had hits: different hits     1          21                    7
All contigs had hits: identical hits     317        410                   30

Figure 3-4. BlastX results for the components identified in the graph-clusters. Classification of the results of the BlastX NCBI-NR hits for each contig in each graph-cluster component, presented as a proportion of each type of component. “Merged” are components whose nodes (i.e., contigs) were merged due to sequence similarity. “Duplicate genes” are components that consist of pairs of contigs from duplicated genes. “Alternative spliced” are components that consist of nodes that are predicted to contain alternatively spliced exons from the same gene (after merging contigs and collapsing duplicated contig pairs).
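Figure 3-4 presents each BlastX category as a proportion of its component type, and those proportions can be recomputed from the counts in the figure's data table. The sketch below is illustrative only: the counts 0/35/0 and 1/21/7 are our reading of digits that are fused in the extracted text ("0350" and "1217"), an assumption that is accepted here only because it reproduces the reported n for every column.

```python
# Row order follows the data table under Figure 3-4.
categories = [
    "No contigs had hits",
    "Some contigs had hits: different hits",
    "Some contigs had hits: identical hits",
    "All contigs had hits: different hits",
    "All contigs had hits: identical hits",
]
# Assumption: 0/35/0 and 1/21/7 are inferred splits of fused digits.
counts = {
    "Merged": [215, 0, 21, 1, 317],
    "Alternative spliced": [197, 35, 630, 21, 410],
    "Duplicate genes": [93, 0, 28, 7, 30],
}
reported_n = {"Merged": 554, "Alternative spliced": 1293, "Duplicate genes": 158}

proportions = {}
for component, values in counts.items():
    total = sum(values)
    # The inferred split is only trusted if it reproduces the reported n.
    assert total == reported_n[component], component
    proportions[component] = [value / total for value in values]

for component, shares in proportions.items():
    for category, share in zip(categories, shares):
        print(f"{component:20s} {category:40s} {share:6.1%}")
```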


[Figure 3-5: panels A and B.]

Figure 3-5. MHC class I graph-cluster. The MHC class I graph-cluster (cluster 05625) contains allelic variation, alternatively spliced transcripts, and potential gene duplication(s). A) The nodes of the network cluster represent contigs, solid lines (edges) represent reads split between the two contigs during assembly by NEWBLER, and the dotted lines connect homologous contigs representing divergent alleles from either the same or duplicated genes. The orange contigs were identified as homologues of MHC class I in squamates (lizards and snakes); the light blue contigs had no hits in NCBI-NR. B) Alignment of the contigs to their BlastX hit in the NCBI-NR database, the MHC class I antigen from the iguanid lizard Conolophus subcristatus. Again, dotted lines connect homologous contigs representing divergent alleles from either the same or duplicated genes.

[Figure 3-6: panels A and B.]

Figure 3-6. Variants. A) Regression of the number of variants on contig length for each contig with at least one variable site. B) Histograms of the log10 distribution of the TS/TV ratios for contigs containing SNPs, and the log10 distribution of Ka/Ks ratios for the contigs containing SNPs in predicted ORFs. Contigs with TS/TV < 1 (0 on log10 scale) and Ka/Ks > 1 (0 on log10 scale) are suggested to be under diversifying selection in these populations of garter snakes.
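The thresholds in the Figure 3-6 caption translate directly to the log10 scale: a ratio below 1 has a negative log10 and a ratio above 1 has a positive log10. The following sketch shows that filter. The function name, the tuple layout, and the contig names and ratio values are hypothetical, and requiring both criteria at once is our reading of the caption rather than a documented rule.

```python
import math


def flag_diversifying(contigs):
    """Return names of contigs with log10(TS/TV) < 0 and log10(Ka/Ks) > 0.

    contigs: iterable of (name, ts_tv, ka_ks) tuples (hypothetical layout).
    Ratios that are missing, zero, or negative cannot be log-transformed
    and are skipped.
    """
    flagged = []
    for name, ts_tv, ka_ks in contigs:
        if not ts_tv or not ka_ks or ts_tv <= 0 or ka_ks <= 0:
            continue
        if math.log10(ts_tv) < 0 and math.log10(ka_ks) > 0:
            flagged.append(name)
    return flagged


# Made-up contigs for illustration only.
example = [
    ("contigA", 2.1, 0.3),  # TS/TV > 1 and Ka/Ks < 1: not flagged
    ("contigB", 0.5, 1.8),  # both criteria met: flagged
    ("contigC", 0.8, 0.9),  # low TS/TV only: not flagged
    ("contigD", 0.0, 2.0),  # log10(TS/TV) undefined: skipped
]
flagged = flag_diversifying(example)
print(flagged)
```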


ADDITIONAL FILES
Additional file 1 - Table containing details of the samples used for the sex-specific RNA pools. Tissue samples of the same type were pooled across individuals (either laboratory or field born animals) for total RNA extraction. Extracted pools of RNA were quantified and the quality checked on the Bioanalyzer. Equal amounts of RNA from each tissue type were pooled by sex.
Additional file 2 - Graphs illustrating the size distribution of the reads for each sex. Length (bp) distribution of reads obtained with the 454 GS-FLX Titanium sequencing. Read number (N) and length (L) in base pairs. A) Female run. B) Male runs.
Additional file 3 - Description of NEWBLER assembly and graph-clustering procedure.
Additional file 4 - Pie graphs of GO assignments. GO slim (level 1, Biological Processes) assignments for all the sequences with annotation, broken down by class of sequences: male singletons, male contigs, both contigs (containing male and female reads), female contigs, female singletons.
Additional file 5 - Snake transcripts mapped to coding and non-coding regions of the Anolis lizard draft genome (AnoCar1.0). A 2007 Excel file (.xlsx) providing details of where the snake transcripts mapped to the Anolis draft genome. See the ReadME tab for description of columns.
Additional file 6 - Clustering based on homology and contig-graphs. A) Distribution of the number of contigs in a HomoloGene accession, and B) the number of HomoloGene accessions that a contig is assigned to, both at e-value = 1e-20. C) Distribution of the number of contigs belonging to a graph-cluster.
Additional file 7 - Details of variants. A 2007 Excel file (.xlsx) providing details for the variants (SNPs and INDELs). See the ReadME tab for description of columns.
Additional file 8 - Contigs of interest. A 2007 Excel file (.xlsx) containing the sequences of interest, including those that are sex-specific, those that have homology to the chicken Z chromosome, those in the 1st percentile of TS/TV ratios, those in the top 99th percentile of Ka/Ks ratios, and those in the top 99th percentile of variability (number of variants per bp). See the ReadME tab for description of columns.
Additional file 9 - Sex-specific enrichment of GO terms (level 2, Biological Processes) assigned to the 190 sex-specific sequences. The * indicates the significant over-enrichment of sequences involved in biosynthetic processes in the female-specific sequences (Fisher’s Exact Test, FDR < 0.006, p-value < 0.0002).


CHAPTER 4. DISSECTING MOLECULAR STRESS NETWORKS: IDENTIFYING NODES OF DIVERGENCE BETWEEN LIFE-HISTORY PHENOTYPES

This is a published manuscript in the peer-reviewed scientific journal Molecular Ecology

Tonia S. Schwartz and Anne M. Bronikowski

Authors' contributions: Both T.S.S. and A.M.B. contributed to the design of the experiment and the assays and performed the experiment. T.S.S. analyzed the data and wrote the manuscript. A.M.B. advised the statistical analyses and edited the manuscript.

ABSTRACT
The complex molecular network that underlies physiological stress response is composed of nodes (proteins, metabolites, mRNAs, etc.) whose connections span cells, tissues, and organs. Variable nodes are points in the network upon which natural selection may act. Thus, identifying variable nodes will reveal how this molecular stress network may evolve among populations in different habitats, and how it might impact life-history evolution. Here we use physiological and genetic assays to test whether lab-born juveniles from natural populations of garter snakes (Thamnophis elegans), which have diverged in their life-history phenotypes, vary concomitantly at candidate nodes of the stress response network, 1) under unstressed conditions, and 2) in response to an induced stress. We found that two common measures of stress (plasma corticosterone and liver gene expression of heat shock proteins) increased under stress in both life-history phenotypes. In contrast, the phenotypes diverged at four nodes both under unstressed conditions and in response to stress: circulating levels of reactive oxygen species (superoxide, H2O2); liver gene expression of GPX1; and erythrocyte DNA damage. Additionally, allele frequencies for SOD2 diverge from neutral markers, suggesting diversifying selection on SOD2 alleles. This study supports the hypothesis that these life-history phenotypes have diverged at the molecular level in how they respond to stress, particularly in nodes regulating oxidative stress. Furthermore, the differences between the life-history phenotypes were more pronounced in females. We discuss the responses to stress in the context of the associated life-history phenotype, and the evolutionary pressures thought to be responsible for divergence between the phenotypes.

INTRODUCTION
To survive, all organisms must respond appropriately to stress. Physiological response to stress (“stress response” hereafter) is a complex trait that can be characterized at the molecular level as the neurological sensing of the environment, the production and circulation of molecular signals (e.g., hormones), tissue/organ-specific cellular signaling, and tissue-specific regulation of gene expression. As such, stress response comprises an intricate network whose molecular nodes (e.g., proteins, metabolites, transcripts, DNA) are interconnected across cells, tissues, and organs (see Figure 1). Nodes that contain variation (e.g., differential gene expression or enzymatic activity) among individuals, populations and species are points in the network upon which selection may act, leading to local adaptation of populations to their specific environmental stresses. Identifying which nodes vary, and the extent to which this variation is due to genetic variation, will facilitate our understanding of how the complex molecular stress response can evolve. Furthermore, we can then apply this mechanistic understanding of the molecular network to frame the study of the pleiotropic effects of stress on life history traits such as reproduction and longevity.

Research on diverse laboratory model systems (e.g., mice, nematodes, fruit flies) indicates that stress response and metabolism are integrated into a molecular network that has pleiotropic effects on longevity and reproduction (Lithgow & Miller 2008; McElwee et al. 2007). This has been demonstrated by the discovery of gene mutations that confer both increased resistance to stress (e.g., oxidative) and increased longevity, and that often have negative effects on reproduction (Finkel & Holbrook 2000; Friedman & Johnson 1988; Ghazi et al. 2009; Johnson et al. 2002; Kenyon et al. 1993; Lithgow et al. 1995; Norry & Loeschcke 2002; Walker et al. 2000). Recent studies have begun to dissect the phenotypic correlation of these traits by targeting specific nodes in the network using specific stress manipulations (Grandison et al. 2009). From these empirical studies on laboratory organisms, these molecular networks are proposed to be a mechanistic basis for the evolution of life-history trade-offs (Flatt & Heyland 2011), specifically the negative correlation between cellular and somatic maintenance/longevity versus growth/reproduction (Leroi et al. 2005; Monaghan et al. 2009).

It is largely unknown how findings from laboratory studies on model organisms apply to natural populations. However, homologous genes (and thus proteins) at dominant nodes indicate that the core structure of this molecular network is largely evolutionarily conserved across diverse model organisms (Holzenberger et al. 2003; Kenyon 2001; Piper et al. 2008; Russell & Kahn 2007a, b; Tatar et al. 2003). Therefore, we expect this network to be likewise conserved in both presence and overall function in non-model species. For instance, we predict that natural selection acting on the (genetic) variation at nodes of the stress response network should have a pleiotropic effect on life history traits in natural populations, as is seen in model organisms. Such selection could, in turn, drive populations along a pace-of-life continuum with slower life-histories at one end (e.g., slow growth, delayed maturation) and faster life-histories at the other (Ricklefs & Wikelski 2002). Recent studies on natural populations of invertebrates, Drosophila (Norry et al. 2006) and butterflies (Isabell & Klaus 2009), provide support for this hypothesis, demonstrating local adaptation to thermal stress, correlated evolution of longevity, and a corresponding trade-off with fecundity (Norry et al. 2006).

Within the intricate stress response network, much empirical work has focused on pathways and nodes associated with oxidative stress and the regulation of reactive oxygen species (ROS) (Dowling & Simmons 2009; Finkel & Holbrook 2000). ROS are important signaling molecules throughout the cell, but if they are not tightly regulated they can cause extensive oxidative damage to DNA, proteins and lipids. ROS dynamics are regulated by their production as a by-product of oxidative phosphorylation in the mitochondria, and their degradation via antioxidants. The cellular consequences of ROS are mitigated by both protective and repair mechanisms (Monaghan et al. 2009). The Oxidative Stress Theory of Aging predicts that the imbalance between the production of and protection from ROS results in oxidative damage to cell components, which causes cellular aging and ultimately determines lifespan (Finkel & Holbrook 2000). Relevant to this study, reports on natural populations of vertebrates, including birds (Bize et al. 2008; Wiersma et al. 2004), lizards (Isaksson et al. 2011), and rodents (Garratt et al. 2011; Holzenberger et al. 2003), have identified interactions between aspects of oxidative stress and reproduction and/or survival, although not always in the predicted direction. Thus, we sought to provide additional data on how nodes within the stress response network regulate and respond to oxidative stress, in relation to life-history trade-offs in an ecological model system.

Here, we present results from a stress response study of naturally divergent populations of garter snakes (Thamnophis elegans). These populations exhibit either a fast-living phenotype (fast growth, high reproduction, short lifespan) or a slow-living phenotype (slow growth, smaller reproductive output, longer lifespan) on the pace-of-life continuum. Furthermore, these pace-of-life phenotypes are predictably found within two different habitat types in our study area, and are thus best understood as ecotypes that may be diverging along a number of life-history, morphological, and behavioral continuums (see Table S1 for more details). We test the hypothesis that these fast and slow life-history ecotypes differ at nodes in the stress response network under control (unstressed) conditions, and in how their molecular stress networks respond when induced by an immediate physiological stress (stress-induced). Based on life-history theory, the Oxidative Stress Theory of Aging, and empirical studies on model organisms, we predict that the slow-living ecotype would have a more protective stress response in terms of cellular maintenance (decreased ROS levels, increased antioxidants and HSP gene expression, decreased DNA damage) in both the unstressed and stress-induced conditions, due to the dominant ecological pressures in their habitats of food restriction and colder temperatures relative to the fast-living ecotype. We focus on candidate nodes in the stress response network, including two well-known measures of stress (plasma corticosterone and gene expression of heat shock proteins), and various components of oxidative stress: level of superoxide in blood cells; circulating levels of hydrogen peroxide (H2O2); production of H2O2 by the liver mitochondria; DNA damage; and liver gene expression and allele frequencies of the antioxidants catalase, superoxide dismutases and glutathione peroxidases. For each of these nodes we determine 1) whether the unstressed measurement differs between ecotypes (i.e., an estimate of genetic divergence), and 2) whether the plastic response to an induced stress differs between the ecotypes (i.e., the extent of genetic divergence in response to environment). We then address the potential of stress response networks to underpin life history evolution by incorporating an ecological context into our interpretations (see Schwartz & Bronikowski 2011).

METHODS

Study populations
Populations of the western terrestrial garter snake (Thamnophis elegans) in the Sierra Nevada Mountains of California provide a unique opportunity to test whether populations that have diverged in their life-history phenotypes vary concomitantly in their molecular responses to an induced stress and, if so, to identify which nodes in their stress response networks have diverged. Populations of this species, found across the 10-km landscape around Eagle Lake (Lassen County), lie at opposite ends of the pace-of-life continuum (Table S1) (Schwartz & Bronikowski 2011). Populations living at lower-elevation lakeshore sites are denoted as fast-living (L-fast); they have earlier maturation at larger body size, higher reproductive rate, and decreased longevity, relative to the slow-living populations (M-slow) that live in higher-elevation mountain meadows (Bronikowski & Arnold 1999; Sparkman et al. 2007). The habitats (lakeshore versus mountain meadow) vary in the types and strengths of selective pressures, including temperature, elevation, food abundance/frequency, and predation (see Table S1 for details) (Bronikowski & Arnold 1999; Miller et al. 2011). Specifically, higher-elevation habitats, with seasonal and annual variation in food and water availability, are predicted to have more physiological/metabolic stress than the constant and plentiful resource environment of lakeshore sites, whereas the lakeshore sites are predicted to have increased predation stress that would likely affect behavior and morphology (Manier et al. 2007). These different types of stresses are predicted to drive the ecotypes in opposite directions along the life-history continuum: the increased physiological stress towards cellular maintenance and longevity, and the increased predation stress towards fast growth and reproduction.

Previous population genetic studies across this landscape indicate restricted gene flow within and among ecotype populations (average FST ~ 0.02) (M. Manes unpublished data; Manier & Arnold 2005), which could facilitate local adaptation to the specific habitats. In this study we focus on two replicate populations of the L-fast ecotype (ELF, MER) and two replicate populations of the M-slow ecotype (PAP, MAH). We estimate genetic distance between these specific populations using microsatellites as described in Supplemental Information.

Animals

In May and June of 2007, gravid females were collected from these four focal populations. Females were brought back to Iowa State University and gave birth in September 2007. Offspring were raised in a common laboratory environment for 1.2 years until utilized in this experiment in October/November 2008. The animals were individually housed in a common laboratory environment (single room). Each tank had a thermal gradient that ranged from 24 °C to 32 °C (12-h light), which enabled the animals to behaviorally regulate their body temperature, and 24 °C (12-h dark). Animals were fed pinky mice once per week until satiation. Our experiments were conducted 5 days post-feeding.

Experimental design

A total of 105 animals that had been born and raised in the lab for ~1.2 years (sexually immature juveniles) were the subjects of this experiment. We chose to activate the molecular stress network using heat as a generic stressor. Siblings were split between “Control” and “Heat Stress” temperature treatment groups, and sexes were balanced across populations and treatments. Twenty hours before the beginning of each experiment day, all animals were moved to a 27 °C incubator. On the day of the experiment, control animals were moved to a 27 °C incubator and treatment animals were moved to a 37 °C incubator, where they were maintained for 2 h before being euthanized via decapitation. The 2-h treatment time was chosen to target the early but well-established molecular changes due to the induction of a generic stress response. The heat stress temperature of 37 °C was chosen as this is above the preferred body temperature of T. elegans, including both these ecotypes (26-32 °C), but well below the critical thermal maximum (~43 °C) (Arnold & Peterson 2002; Huey et al. 1989; Stevenson et al. 1985). Based on behavior and locomotive performance, T. elegans become thermally stressed above 35 °C (Stevenson et al. 1985), and O2 consumption in Thamnophis species increases with body temperature with Q10 = 2.45 (Bronikowski & Vleck 2010; Peterson et al. 1998). Thus we reasoned that 37 °C for 2 h would be sufficient to initiate a generalized stress response.

Due to logistical constraints of assaying live tissues and cells, seven experiment days were spread over a 21-day period, with 6-19 animals from both thermal treatments and ecotypes being processed on any one day. The order of processing was random with respect to ecotype, and alternated between control and heat stress animals. Upon decapitation and exsanguination, all organs were dissected and snap-frozen in liquid nitrogen within 10 min, except for one-half of the liver that was put into ice-cold STE buffer (250 mM sucrose, 5 mM Tris, 2 mM EGTA, pH 7.4) for live mitochondrial extraction. Liver was chosen because of its high metabolic activity and because it is the largest organ, allowing for both mitochondrial extraction and RNA isolation to be performed on the same tissue. From the fresh blood sample, 4 µL were put into HBSS buffer (Hank’s Balanced Salt Solution without Calcium, Magnesium, or Phenol Red: Lonza Cat No. 04-315Q) for a DNA damage assay conducted that same day (see below). The rest of the blood sample was centrifuged at 1000 g, at 4 °C, for 10 min to separate the plasma from blood cells. Five microliters of the compacted blood cells were diluted into 120 µL of HBSS for flow cytometry conducted that same day. The plasma was stored at -80 °C and at a later date used to measure concentrations of circulating corticosterone and hydrogen peroxide (H2O2).
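The Q10 reasoning behind the choice of 37 °C can be made concrete with a short calculation. The sketch below is illustrative only: it assumes that the reported Q10 of 2.45 holds constant across the 27-37 °C range, and the function name `q10_rate_ratio` is a hypothetical helper, not part of any analysis in this study.

```python
def q10_rate_ratio(q10: float, t_low: float, t_high: float) -> float:
    """Factor by which a rate changes between two temperatures,
    assuming a constant Q10 (rate ratio per 10 degree increase)."""
    return q10 ** ((t_high - t_low) / 10.0)


# Control (27 C) versus heat stress (37 C) with the Q10 cited in the text.
control_to_heat = q10_rate_ratio(2.45, 27.0, 37.0)
print(f"Expected change in O2 consumption: {control_to_heat:.2f}-fold")
```

Under that assumption, the 10 °C difference between the control and heat stress incubators corresponds to roughly a 2.45-fold increase in oxygen consumption.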

Assays

Circulating hormone: corticosterone

Corticosterone is the primary glucocorticoid hormone in reptiles, and it has been well established that it increases in response to acute physiological stress (Greenberg & Wingfield 1987). Plasma corticosterone concentration was measured using the ImmunChem Double Antibody Corticosterone I-125 RIA kit (MP Biomedical, Orangeburg, NY) with a protocol optimized for snakes (Robert et al. 2009). Plasma samples (n=91) were diluted 1:160 and were run in duplicate using 5 µL of plasma. RIA readings were corrected for background noise and converted to ng/mL (ng of corticosterone per mL of plasma) based on a series of standards run alongside the samples.



Circulating ROS: within blood cells (superoxide within the mitochondria)

MitoSOX (Invitrogen Molecular Probes) is a red fluorescent molecule that binds to superoxide in the mitochondria of live cells. For 47 individuals, 5 µL of fresh concentrated blood cells (we did not separate red and white blood cells) were washed with HBSS and then diluted 1:20 with HBSS and kept on ice until mixed with the MitoSOX. Reactions contained a 1:120 dilution of blood cells in HBSS, at a total volume of 300 µL, with a final concentration of 5 µM of MitoSOX. Samples were incubated at 27 °C for 30 minutes and then analyzed on a FACS Canto (BD Biosciences) at the Flow Cytometry Facility at Iowa State University, with data collected on at least 10,000 events for each sample using excitation at 488 nm and fluorescence emitted at 575 nm. Statistics were conducted on the percent of cells that had a fluorescence level over the background level of HBSS.

Circulating ROS: hydrogen peroxide (H2O2) in plasma

Plasma H2O2 concentration was measured using the Amplex Red Hydrogen Peroxide/Peroxidase Assay Kit (Invitrogen, Molecular Probes) for a 96-well microplate assay according to manufacturer’s instructions. Samples (n=93) were run on a single day in duplicate 50 µL reactions, using 5 µL of plasma that had been stored at -80 °C. Samples were spread across three 96-well round-bottom microtiter plates, and a standard series of 0, 1, 2, 3, 4, 5, and 6 µM H2O2 was run on each plate. Fluorescence was measured after a 20-min incubation at room temperature on a Synergy HT microplate reader (Bio-Tek, Vermont, USA) with 530 nm excitation and 590 nm emission. Among the three plates, there were no significant differences among the slopes or the intercepts of the standard curves. Therefore, the standards from each plate were treated as replicates (i.e., 6 replicates for each standard) to construct an overall standard curve from which H2O2 concentration was calculated for each sample using the average fluorescence from the replicates.
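The pooled standard curve described above amounts to an ordinary least-squares line through all standard readings, which is then inverted for each sample. The following is a minimal sketch of that idea, not the software used in this study: the fluorescence values are invented for illustration, the helper names are hypothetical, and the result is the in-well concentration with no dilution correction applied.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_fluorescence(replicate_readings, slope, intercept):
    """Invert the standard curve using the mean of a sample's replicates."""
    mean_reading = sum(replicate_readings) / len(replicate_readings)
    return (mean_reading - intercept) / slope


# Hypothetical standards: 0-6 uM, six replicates each (duplicates on 3 plates).
standards_um = [c for c in range(7) for _ in range(6)]
# Invented, perfectly linear fluorescence readings (arbitrary units).
fluorescence = [100.0 + 50.0 * c for c in standards_um]

slope, intercept = fit_line(standards_um, fluorescence)
sample_um = concentration_from_fluorescence([245.0, 255.0], slope, intercept)
print(f"slope={slope:.1f}, intercept={intercept:.1f}, sample={sample_um:.2f} uM")
```

Pooling the standards in this way is only appropriate when, as reported here, the plate-specific slopes and intercepts do not differ.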

ROS (H2O2) produced by live liver mitochondria

Live mitochondria were isolated from the portion of the liver transferred to ice-cold STE buffer after euthanasia. To obtain enough mitochondria, we pooled 1-3 livers by population (litter when possible) and treatment, providing 34 pooled samples. The mitochondria were isolated from the liver using a homogenization protocol (Robert et al. 2007) and separated from the other cellular debris with differential centrifugation (Palloti & Lenaz 2001). The protein concentration of the mitochondrial isolations was determined with a Bradford Assay (Bio-Rad) and used to normalize the assays by the amount of mitochondrial protein.

The rate of H2O2 production by liver mitochondria was measured as described above for plasma, except that kinetic readings were taken over 35 min at 22 °C to calculate the amount of H2O2 produced by a 0.5 mg suspension of mitochondria. We used the increase in fluorescence from 5 min to 35 min as a measure of H2O2 production by the mitochondria. The increase in fluorescence was converted to µg of H2O2 using the standard curve from the reading at 35 min. We then converted this to a rate of production per mg of mitochondrial protein by dividing the amount of H2O2 produced over 30 min by 0.5 mg of mitochondrial protein.
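The conversion from a fluorescence increase to a production rate can be written out step by step. This is a sketch under stated assumptions: the fluorescence readings and the standard-curve slope are invented, the function name is hypothetical, and the division by elapsed minutes reflects the units in which rates are reported (µg/min per mg) rather than an explicit statement in the protocol.

```python
def h2o2_production_rate(fluor_start, fluor_end, slope_per_ug,
                         elapsed_min=30.0, protein_mg=0.5):
    """Rate of H2O2 production in ug per min per mg mitochondrial protein.

    slope_per_ug is the standard-curve slope (fluorescence units per ug
    H2O2); the intercept cancels because only the increase is used.
    """
    delta_fluorescence = fluor_end - fluor_start
    h2o2_ug = delta_fluorescence / slope_per_ug
    return h2o2_ug / elapsed_min / protein_mg


# Hypothetical readings at 5 min and 35 min, and a hypothetical slope.
rate = h2o2_production_rate(fluor_start=200.0, fluor_end=800.0,
                            slope_per_ug=40.0)
print(f"{rate:.2f} ug/min per mg mitochondrial protein")
```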

Candidate gene analysis

Both gene expression and nucleotide sequence data for a set of candidate genes were extracted from a transcriptome-level quantitative gene expression dataset produced by Illumina RNA-seq on livers from 13 individual females. Twelve of these (three from each treatment/ecotype group) were from the heat stress experiment and contributed to the gene expression data. The 13th individual (not from the heat stress experiment) was included for genetic diversity, so that we had sequence data from 10 unrelated individuals. In brief, RNA was isolated from 12 to 19 mg of snap-frozen liver using the Qiagen RNeasy kit (Qiagen). For each individual, one lane of 72- or 80-bp single-pass sequencing using RNA-seq on the Illumina GAIIx platform was performed at the ISU DNA facility. Reads were cleaned using a workflow in Galaxy (Goecks et al. 2010) to trim the poor-quality ends of the reads and then filter out reads less than 20 bp in length (workflow available via link in “Data Accessibility”). See Table S2 for summary metrics.

We evaluated gene expression and sequence variation of candidate genes involved in heat shock and oxidative stress: heat shock proteins (HSP70 and two isoforms of HSP40); catalase (CAT); glutathione peroxidases (GPX1, GPX3, and GPX4); and superoxide dismutases (SOD1, SOD2, and SOD3). Increased expression of heat shock proteins is a canonical heat stress response; they protect cellular protein structure and function under stress (Sun & MacRae 2005). Of the antioxidant enzymes, CAT and GPXs degrade H2O2 (Mates & Sanchez-Jimenez 1999). CAT and GPX1 work within the cytoplasm, whereas GPX3 works extracellularly and in the plasma (based on avian systems) (Surai 2002). GPX4 catalyzes the reduction of organic hydroperoxide, lipid peroxides, and H2O2 within the cytoplasm and mitochondria (Surai 2002). SOD catalyzes the reduction of superoxides: SOD1 (Cu,Zn-SOD) in the cytoplasm; SOD2 (Mn-SOD) in mitochondria; and SOD3 extracellularly (Mates & Sanchez-Jimenez 1999).

Quantitative gene expression at candidate genes

A garter snake reference sequence for each gene of interest was identified via BLAST searches of the chicken and/or human homolog against the reference garter snake transcriptome (Schwartz et al. 2010) and against a reference transcriptome assembled in TRINITY (Grabherr et al. 2011) using the RNA-seq data from this study. The reads from each individual were mapped to the reference gene sequences and counted using a Galaxy workflow (available via link in “Data Accessibility”). To determine the number of mappable reads for each individual, reads were also mapped to the reference RNA-seq transcriptome. The number of reads mapped to each gene of interest was normalized for quality and quantity of reads across individuals by dividing by the number of reads (in millions) for each individual that mapped to the reference transcriptome, and for candidate gene mRNA length by dividing by the size (kb) of the reference sequence. The result is a normalized count of transcripts for each gene of interest that were present in the liver cells of control and heat-stress animals. To verify gene expression based on the RNA-seq data, two stress genes, HSP70A1 and Catalase, were assayed with Q-PCR for 68 individuals, using EEF1A and B-actin as reference genes (Supplemental Information, Table S3).
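The two-step normalization described above (by millions of mappable reads, then by reference length in kb) is equivalent in form to a reads-per-kilobase-per-million calculation. A minimal sketch follows; the read counts and gene length are invented, and the function name is a hypothetical label rather than part of the Galaxy workflow.

```python
def normalized_transcript_count(gene_reads, mapped_reads_total, gene_length_bp):
    """Reads mapped to a gene, scaled by library size and gene length.

    gene_reads         -- reads mapped to the candidate gene reference
    mapped_reads_total -- reads from the same individual that mapped to
                          the reference transcriptome
    gene_length_bp     -- length of the candidate gene reference (bp)
    """
    per_million = gene_reads / (mapped_reads_total / 1_000_000)
    return per_million / (gene_length_bp / 1_000)


# Hypothetical individual: 500 reads on a 2,000-bp reference,
# with 10 million reads mapping to the transcriptome.
value = normalized_transcript_count(500, 10_000_000, 2_000)
print(value)
```

Scaling by mappable reads rather than by raw read count means that individuals with lower-quality lanes are not penalized for reads that could not have mapped anywhere.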

Sequence variation of candidate genes

We used Geneious (Drummond et al. 2009) to identify single nucleotide polymorphisms (SNPs) from the mapped data (described above) on the 10 unrelated individuals for which we had RNA-seq data. Within an individual, called SNPs had a minimum coverage of 5X and a minimum frequency of 0.25. To define an allele, a SNP had to be found in more than one individual. From the SNP data, alleles were constructed for each individual, and allele frequencies for each gene were tested for statistical differences between the populations by calculating FST values using GenAlEx (Peakall & Smouse 2006). Alleles were translated to amino acid sequences to categorize the SNPs as synonymous or non-synonymous, and dN/dS ratios were calculated to check for strong signatures of positive selection (i.e., dN/dS > 1).

To identify alternatively spliced isoforms from our seven antioxidant genes, transcriptomes for each of the 10 individuals were assembled using TRINITY, which reconstructs alternatively spliced transcripts (Grabherr et al. 2011). The reference sequence for each gene was compared to each individual’s transcriptome using BLASTN. Hits were aligned within and across individuals for comparison, and against a human or chicken reference that contained annotated exon boundaries to assist in defining alternatively spliced transcripts.
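The SNP-calling thresholds given above (minimum 5X coverage, minimum frequency 0.25 within an individual, and presence in more than one individual) can be expressed as a simple two-stage filter. The sketch below is not the Geneious procedure itself; the data structure, positions, and function names are hypothetical and serve only to show how the three criteria combine.

```python
from collections import Counter

MIN_COVERAGE = 5       # minimum read depth at the site within an individual
MIN_FREQUENCY = 0.25   # minimum variant frequency within an individual


def call_snps(sites):
    """Positions passing the within-individual thresholds.

    sites maps position -> (coverage, variant_frequency).
    """
    return {pos for pos, (coverage, freq) in sites.items()
            if coverage >= MIN_COVERAGE and freq >= MIN_FREQUENCY}


def shared_snps(sites_by_individual):
    """Positions called as SNPs in more than one individual."""
    counts = Counter()
    for sites in sites_by_individual.values():
        counts.update(call_snps(sites))
    return sorted(pos for pos, n in counts.items() if n > 1)


# Invented example data for three individuals.
example = {
    "ind1": {101: (12, 0.50), 250: (4, 0.50), 300: (20, 0.30)},
    "ind2": {101: (8, 1.00), 300: (9, 0.10)},
    "ind3": {101: (6, 0.25), 250: (10, 0.40)},
}
print(shared_snps(example))
```

In this invented example only position 101 is retained: position 250 fails the coverage threshold in one individual and position 300 fails the frequency threshold in another, so each is called in a single individual only.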

DNA damage within blood cells

We measured the baseline and inducible DNA damage within erythrocytes using a cellular-level indicator of DNA damage, the Single Cell Gel Electrophoresis (aka “comet”) assay (Singh et al. 1998). This assay detects damage due to single and double strand breaks. We performed this assay as described in Bronikowski (2008) on fresh cells (details in Supplemental Information). We used two levels of stress to characterize the effect of oxidative stress on DNA damage (Figure 2). The first level was the whole-organism temperature treatment (i.e., control vs. heat stress animals) as described above. The second level was a specific oxidative stress treatment (H2O2) applied directly to the blood cells. A comparison of DNA damage from the heat stress and control animals assesses the effect of organismal temperature on baseline DNA damage; we refer to this as the Background damage (n=80; Figure 2). For a subset of the experimental animals (n=50 of 80), we performed the second level of stress experimentation by exposing the blood cells to buffers with and without H2O2 (Figure 2). The purpose was to assess the resistance of blood cells to damage from a specific oxidizing agent (H2O2).

Statistical analyses

Sample sizes by group for each analysis are reported in Table S4. The data were checked for outliers (defined as more than two standard deviations from the mean, i.e., outside the central ~95% of the distribution) and normality. When necessary, data were transformed with an appropriate transformation (below). All analyses were run in SAS/STAT v9.22 interfaced with Enterprise Guide (version 4.3). We used mixed model ANOVAs on univariate dependent variables (PROC MIXED) using a Residual Maximum Likelihood (REML) estimator and type 3 SS to account for the unbalanced dataset. The following full model was used to test for the effects of temperature (control vs. heat stress), ecotype (M-slow vs. L-fast), sex, and experiment day on the following dependent variables: (1) circulating corticosterone (ng/mL) (log transformed), (2) plasma H2O2 (µM), (3) percent of blood cells with superoxide (arcsin transformed), (4) Background DNA damage due to heat stress, and (5) DNA damage due to H2O2 cell treatment.

Response Variable = µ + (ln Mass) + (Experiment Day) + (Temperature Treatment) + (Sex) + (Litter(Pop(Ecotype))) + (T*E) + (T*S) + (E*S) + (T*E*S) + ε

The main effects are T (pre-euthanasia temperature: 27 °C vs. 37 °C), E (ecotype), and S (sex). Litter nested within population within ecotype is a random effect, experiment day is the blocking effect of 7 experiment days, and the natural log of body mass is a covariate. Factors with P-values ≥0.3 were removed from the model, and models were rerun. The AIC values for the different models were compared and the model with the lowest AIC value was chosen.
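Two of the screening steps described above, the two-standard-deviation outlier rule and the choice of the model with the lowest AIC, are simple enough to sketch directly. The analyses themselves were run in SAS; the Python below is only an illustration, with invented values, hypothetical function names, and the assumption that the sample (n - 1) standard deviation is the one intended.

```python
import statistics


def flag_outliers(values, n_sd=2.0):
    """True for each value more than n_sd sample SDs from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD; an assumption of this sketch
    return [abs(v - mean) > n_sd * sd for v in values]


def best_model_by_aic(aic_by_model):
    """Name of the candidate model with the lowest AIC."""
    return min(aic_by_model, key=aic_by_model.get)


# Invented measurements with one extreme value.
measurements = [10, 11, 9, 10, 12, 10, 11, 9, 10, 40]
flags = flag_outliers(measurements)
print([v for v, is_out in zip(measurements, flags) if is_out])

# Invented AIC values for a full model and a reduced model.
chosen = best_model_by_aic({"full": 210.4, "reduced": 212.9})
print(chosen)
```

Because an extreme value inflates both the mean and the standard deviation, a rule of this kind is conservative in small samples and tends to flag only the most extreme observations.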

We analyzed the two repeated measures of blood cell DNA damage (Background and H2O2 Treatment) using repeated measures ANOVA with the previously described model, but including Blood Cell Treatment as a factor, repeated for individuals, along with all its interactions.

Response Variable = µ + (ln Mass) + (Experiment Day) + (Blood Cell Treatment) + (Temperature Treatment) + (Sex) + (Litter(Pop(Ecotype))) + (T*E) + (T*S) + (E*S) + (B*T) + (B*E) + (B*S) + (T*E*S) + (B*T*E) + (B*S*E) + (B*T*S) + (B*T*E*S) + ε

Here T, E, and S are as described above, and B is the fixed effect of Blood Cell Treatment (with or without H2O2 in the cell buffer).

Mitochondrial H2O2 production was measured on pooled liver mitochondria isolates. Therefore, the effects of sex and litter are not applicable. We used the natural log of the average mass of the individuals in the pool as a covariate.

Response Variable = µ + (ln Avg_Mass) + (Experiment Day) + (Temperature Treatment) + (Pop(Ecotype)) + (T*E) + ε

We analyzed gene expression for the HSPs and the seven antioxidant genes, using the RNA-seq data on 12 females, to test for the effects of treatment, ecotype, and their interaction.

Response Variable = µ + (Temperature Treatment) + (Ecotype) + (T*E) + ε

We also used this model for the Q-PCR data to verify the RNA-seq data. We tested for a correlation between gene expression of catalase from the liver (RNA-seq and Q-PCR data) and plasma levels of H2O2 using Pearson’s correlation. Finally, we conducted a PCA on all seven antioxidant genes to evaluate their coregulation and to gain inference on the evolutionary trajectories of each ecotype.

RESULTS

Corticosterone increases with heat stress in both ecotypes

The plasma corticosterone concentration ranged from 8.35 to 633 ng/mL after the removal of one extreme outlier (912 ng/mL). The data were log10 transformed, and the full model had the lowest AIC. The heat stress treatment caused a significant 3-fold increase in corticosterone concentration relative to the control animals (Table 1, Supplemental Figure 1) (back-transformed least-squares mean ± SE: control = 56.8 ± 6.5 ng/mL; heat stress = 283.9 ± 297 ng/mL). Additionally, heavier animals had higher corticosterone concentrations, and L-fast had higher levels than M-slow, although this ecotype difference was only marginally significant (P=0.08). Neither the effect of sex nor any interactions were significant.

Ecotypes differ in levels of circulating ROS in unstressed condition and in response to stress

Superoxide in blood cells

MitoSOX fluoresces upon contact with superoxide within the mitochondria of the blood cells. We analyzed the percent of cells with MitoSOX fluorescence above the background noise, which ranged across individuals from 15.8 to 97.2% with a mean of 68.9%. We arcsin transformed the percentages since they were bimodally distributed, and the full model had the lowest AIC. The three-way interaction among treatment, ecotype, and sex significantly affected the percent of cells containing superoxide (Table 1). Heat-stress M-slow females had a lower percentage of cells with superoxide relative to control M-slow females, whereas heat-stress L-fast females had a higher percentage of cells with superoxide relative to control L-fast females (Figure 3A). Interestingly, males of both ecotypes did not differ in the percent of cells with superoxide under control and heat stress temperatures. Back-transformed LS means ± SE for Females: M-slow Control = 81.7 ± 3.8% and M-slow Stress = 58.2 ± 8.1; L-fast Control = 40.6 ± 9.1 and L-fast Stress = 76.6 ± 4.2. For Males: M-slow Control = 76.7 ± 4.8 and M-slow Stress = 74.6 ± 5.6; L-fast Control = 73.0 ± 4.3 and L-fast Stress = 71.6 ± 4.1.

H2O2 in plasma

The concentration of H2O2 in blood plasma ranged from 15.5 to 17.8 µM and averaged 16.7 µM of H2O2, after the removal of three outliers at the low end of the range. Treatment and ecotype interacted significantly to affect the amount of H2O2 in the same pattern as seen with the superoxide above (Figure 3B). M-slow individuals under heat stress had a lower concentration of H2O2 in blood plasma relative to M-slow individuals in the control treatment, whereas L-fast heat stress individuals had higher concentrations of H2O2 relative to their controls (LS means ± SE: M-slow control 16.80 ± 0.13 µM; M-slow stress 16.55 ± 0.13 µM; L-fast control 16.49 ± 0.12 µM; L-fast stress 16.70 ± 0.10 µM).

H2O2 production by live liver mitochondria

Because of the necessity to pool the livers to obtain enough mitochondria, our sample size for this analysis was 34 mitochondrial measurements. The production of H2O2 by mitochondria ranged from 0.221 to 7.45 µg/min per mg mitochondria, with an average of 2.867. There were no significant differences between the ecotypes or the temperature treatments in mitochondrial H2O2 production, although the pattern was in agreement with our other ROS variables (Figure 3C). A post-hoc power analysis indicates we had very little power (0.3-0.4) to detect an ecotype by treatment interaction with our reduced sample size.

Genetic data

Microsatellites

Evaluation of microsatellite loci as neutral markers of evolution indicates small but significant differences between these four populations, with FST values ranging from 0.01 to 0.07 (Supplemental Table 5).


Gene expression

There was considerable variation in the levels of expression of the genes of interest, with HSP70 having the overall highest expression and GPX1 having the lowest (Supplemental Figure 2). As expected, the effect of temperature on HSP70A1, HSP40A4, and HSP40A1 was significant, with heat stress treatment causing a 20-, 20- and 10-fold increase in mean expression of each gene or isoform, respectively (P < 0.01 in all three cases: Supplemental Figure 2). Q-PCR data on HSP70A1 confirmed this effect. Additionally, two of the antioxidant genes were significantly altered by temperature and/or ecotype. GPX1 was significantly affected by the interaction of ecotype and temperature, with L-fast animals having increased expression under heat stress, whereas M-slow animals showed no change with heat stress (LS means ± SE of normalized transcript number: M-slow control 7.31 ± 1.33; M-slow heat stress 7.34 ± 1.33; L-fast control 4.93 ± 1.33; L-fast heat stress 14.36 ± 1.33; P=0.05, Figure 4A). SOD2 expression was significantly lower with heat stress, with no effect of ecotype (LS means ± SE: M-slow control 43.2 ± 4.9; M-slow heat stress 27.7 ± 4.9; L-fast control 46.0 ± 4.9; L-fast heat stress 38.0 ± 4.9; P<0.05, Figure 4B).

Catalase catalyzes the degradation of H2O2. We found a negative correlation between catalase gene expression in the liver and plasma levels of H2O2 (Figure 4C). Although this was not significant with the limited dataset for which we had both RNA-seq data and H2O2 plasma levels (n=9; P=0.16, R2 = 0.25), it proved to be significant in our qPCR validation dataset with a higher sample size (n=66, P<0.01, R2 = 0.17) (Supplemental Figure 3A). Both the RNA-seq and the qPCR showed that the interaction between ecotype and treatment was not significant, although the pattern of expression matches expectations if catalase expression from the liver was impacting H2O2 levels in the plasma. That is, as catalase expression decreased with heat stress in the L-fast ecotype, plasma H2O2 increased (Supplemental Figure 3B-C). The MDS plot from the PCA analyses on the seven antioxidant genes showed no clear patterns among the four groups (Supplemental Figure 4), and neither ecotype nor treatment (nor their interaction) explained the variation in the first two components.

Genetic (mRNA) variation

Two mRNA isoforms were detected for both catalase and SOD2 (Table 2). While all individuals expressed the first isoform of both genes, the second catalase isoform was found in four individuals and was characterized by a different first exon (and thus start codon), resulting in a protein that was 14 amino acids shorter, with the next nine amino acids differing between the isoforms (Supplemental Figure 5). The second SOD2 isoform was detected in two individuals (Table 2), and had a different first codon that resulted in a 50 amino acid shorter protein that is missing the signaling peptide (Supplemental Figure 6). Although the frequency of the second isoforms varied between the ecotypes (Table 2), this ecotype difference between alternatively spliced isoform frequencies was not significant (Fisher’s Exact Test, n=10, Catalase P=0.52, SOD2 P=0.44).

All of the antioxidant mRNA sequences showed genetic variation except for SOD1, and four of the six genes had changes to the amino acid sequence, although none of the genes had a dN/dS over 1 (Table 2). Only SOD2 showed a marginally significant FST between the two ecotypes (populations), with FST = 0.160 (P<0.1) compared to the FST over all candidate genes, FST = -0.004 (P>0.1). Considering the clear distinction in allele frequencies of SOD2 between populations (Supplemental Figure 7), the marginal significance is likely due to the low sample size (n=10).

DNA Damage

Ecotypes differ in level of DNA damage in unstressed condition and in response to stress

To assess the amount of DNA damage due to the whole organism heat stress, we compared the Heat Stress-Background and Control-Background groups (Figure 2). The percent of DNA in the comet tail (damaged DNA) ranged from 10.98 to 52.25% with a mean of 22.73%. Data were square root transformed prior to analyses. Organismal heat stress treatment and ecotype interacted such that control M-slow individuals had lower damage relative to control L-fast, but heat stress M-slow individuals had higher damage than heat stress L-fast (back-transformed LS means ± SE: M-slow control 20.0 ± 1.3; M-slow heat stress 25.3 ± 1.6; L-fast control 24.0 ± 1.5; L-fast heat stress 22.9 ± 1.2; Figure 5A).

Whole organism heat stress causes a hormetic effect by limiting DNA damage due to H2O2

To assess the amount of damage due to H2O2 treatment of blood cells subsequent to the whole organism heat treatment (Figure 2), we subtracted the Background damage from the H2O2 blood cell treatment and analyzed this difference for the H2O2 treatment. In both ecotypes the H2O2 exposure of blood cells increased the amount of damaged DNA in these cells. However, the whole organism heat stress treatment significantly affected the amount of inducible damage by H2O2. Specifically, animals that were subjected to heat stress prior to erythrocyte treatment with H2O2 had lower levels of erythrocyte DNA damage than animals that had been at the control temperature prior to erythrocyte treatment with H2O2 (LS means ± SE: control 17.0 ± 2.5; heat stress 8.8 ± 2.1; Figure 5B). This effect was equivalent between the ecotypes.

Repeated measures

In analyzing the whole DNA damage dataset with repeated measures, DNA damage was significantly affected by the 3-way interaction among ecotype, sex, and H2O2 treatment. Females of both ecotypes had the most extreme responses: H2O2 caused significantly less DNA damage in the M-slow females than in either males or L-fast females (LS means ± SE of % DNA damaged for Females: M-slow Background = 22.7 ± 2.2 and M-slow H2O2 Treatment = 28.8 ± 2.7; L-fast Background = 22.6 ± 1.9 and L-fast H2O2 Treatment = 39.6 ± 2.7. For Males: M-slow Background = 23.8 ± 1.6 and M-slow H2O2 Treatment = 37.8 ± 2.4; L-fast Background = 25.2 ± 1.5 and L-fast H2O2 Treatment = 35.4 ± 1.9; Figure 5C).

DISCUSSION

Ecotypes differ at specific nodes of the stress response network in unstressed condition and in response to stress

We conducted this experiment on lab-reared juveniles from two “fast-living” lakeshore populations (L-fast) and two “slow-living” meadow populations (M-slow) that have diverged in life-history phenotypes (Table S1). Our goal was to determine if these naturally evolved life-history ecotypes vary in accordance with life-history theory at candidate nodes in the molecular stress network under unstressed conditions and in response to an induced stress. Divergence between the ecotypes in unstressed measures indicates the potential for segregating genetic variation between the ecotypes. Significant interaction between ecotype and stress treatment, and the directionality of these interactions, addresses whether adaptive evolution may have occurred in correlation with life-history traits. Specifically, relative to the L-fast ecotype, the M-slow ecotype persists in a more physiologically stressful environment with highly variable and ephemeral food availability, shortened growth season, and overall cooler temperatures. We reasoned that if M-slow individuals are locally adapted to restricted food and thermal regimes, this would be reflected at the cellular level as measurable differences in direction and magnitude of the plastic response of specific nodes within the molecular stress network, relative to the L-fast ecotype. If unstressed and stress-induced measures at the candidate nodes in the M-slow animals show more protective phenotypes, promoting cell maintenance by decreasing ROS levels and DNA damage, we interpret this as supportive evidence for adaptive correlated evolution in stress response and life-history traits.
We identified four specific nodes for which ecotypes differed: levels of circulating ROS (unstressed and in response to stress); DNA damage in blood cells (unstressed and in response to stress); liver gene expression of GPX1 in response to stress; and allele frequencies at SOD2. In general, ROS levels can be driven by the rate of ROS production by the mitochondria and/or by the rate of degradation of the ROS by antioxidants. Although not significant, the H2O2 production by the liver mitochondria mirrored the pattern of ROS levels in the blood (compare Figure 3C to 3A&B). A recent study by Robert and Bronikowski (2010) on these same garter snake ecotypes, which used a chronic 10-day in utero stress on neonates (via corticosterone elevation within mothers), showed similar results, with higher levels of H2O2 production by the mitochondria in L-fast individuals in response to stress but no effect of in utero stress on the M-slow ecotype’s H2O2 production. Together, the data from these two studies suggest that the ecotypes have evolved differences in mitochondrial production of ROS in response to stress.

Regarding the possibility that differences in the rate of ROS degradation by antioxidants account for the ROS findings, our results suggest this is not the case. Although the expression of liver catalase correlated negatively with the level of H2O2 in the plasma, explaining between 17% and 25% of the variation, there were no significant interactions between ecotype and treatment on antioxidant gene expression that would explain the levels of circulating ROS. Only gene expression of GPX1 and SOD2 were significantly affected by the experiment, but the patterns were in opposition to that predicted by ROS levels. Thus, based on these data, we hypothesize that divergence in mitochondrial energetics rather than antioxidant levels may explain the differences in ROS levels. Future studies will further discern among gene expression levels, antioxidant activity, ROS production, and oxidative damage.

The ecotypes also differed in the extent of erythrocyte DNA damage in the unstressed condition, in response to whole organism temperature treatment, and in response to the H2O2 treatment directly to blood cells (Figure 5).
Both heat stress (this study) and UV exposure (Robert & Bronikowski 2010) induced more damage in M-slow than in L-fast erythrocytes, whereas additional exposure to H2O2 caused relatively more DNA damage in the L-fast ecotype and the M-slow males than in the M-slow females, which had the least damage. Robert & Bronikowski (2010) also showed that cells from both ecotypes were able to repair the UV damage, with the M-slow ecotype being more efficient; we were unable to test for repair of H2O2 damage since it is not possible to remove the H2O2 from the live cells. These differences are intriguing considering the different types of DNA damage that UV, heat, and H2O2 create. Exogenous stressors, such as the UVA and heat used in our studies, induce damage by both indirect (secondary generation of free radicals leading to double- and single-strand DNA breaks) and direct (e.g., depurination, pyrimidine dimers) mechanisms (Bruskov et al. 2002; Jiang et al. 2009). In contrast, the application of H2O2 to the cells causes catastrophic and direct DNA damage: single- and double-strand breaks via oxidation (Rochette et al. 2003). The comet assay measures general DNA damage including single-strand breaks, some DNA lesions that become hydrolyzed by the alkali conditions, and intermediates of repair (Collins 2004). Although we are
unable to quantify the different types of damage with the method used here, together the data from these two studies suggest that the M-slow ecotype (particularly females) can better protect its DNA from oxidative damage due to H2O2 (but not heat or UV), and can repair UV-induced DNA damage faster than the L-fast ecotype.

Sexes respond differently to heat stress.
For many individual-based measurements for which we could test for differences between sexes (corticosterone, circulating ROS, DNA damage), we found either a sex effect or an interaction between ecotype and sex. Typically, females drove the ecotype-by-heat stress interactions by having stronger opposing responses, i.e., the heat-stress females differed more strongly from controls than did the heat-stress males. Sex differences in stress response have been reported in many studies, including the seminal study of Maynard Smith (1958), which demonstrated that heat stress increases longevity in female Drosophila but not in males. Specifically, components of oxidative stress have been shown to differ between sexes in many taxa (Ali et al. 2006; Costantini et al. 2009; Isaksson et al. 2011; Olsson et al. 2009; Wiersma et al. 2004). The viviparous garter snake species used in this study is sexually dimorphic, with females being larger and, because of their viviparity, likely investing more in reproduction than do males. Thus, the fitness costs of mismatched regulation of stress responses (i.e., either over- or under-compensating) are predicted to be higher for females if stress networks have pleiotropic effects on longevity and reproduction. It is intriguing that these differences are already present in the juveniles used in this study, well before they have reached sexual maturation.

Evolutionary inferences of physiological response
This study was conducted with the goal of discovering genetic and genetic-by-environment effects on stress biomarkers to test for local adaptation between the ecotypes. However, we did not address the intriguing possibility of transgenerational epigenetic effects and/or hormesis as causative agents in our measures of stress response.
In fact, a large body of evidence suggests that stress may elicit responses transgenerationally (e.g., non-genetic maternal effects in this study system; Robert and Bronikowski 2010) or within individuals hormetically (Crews et al. 2012; Rando & Verstrepen 2007; Vaiserman et al. 2004). Hormesis is a reproducible biological phenomenon in which exposure to stress (toxins, temperature, nutritional deprivation, radiation, etc.) at low doses can have a beneficial effect on the organism through enhanced longevity or future stress resistance (Calabrese & Baldwin 2002). Interestingly, in our study we have identified a hormetic effect of the heat treatment on the resistance of DNA to damage from a subsequent H2O2 treatment. As such, the heat stress appears to have initiated protective mechanisms that prevent further damage to the DNA (see Figure 5B). Intriguingly, Robert and Bronikowski (2010) found that in utero corticosterone treatment improved DNA repair efficiency
for UV-induced DNA damage in this same study system. Thus, it seems that both a chronic in utero stress and an immediate heat stress can have hormetic effects on DNA integrity, either through protection or repair, in both ecotypes.

Some proportion of the differences between the ecotypes in how they respond to stress could be a transgenerational epigenetic phenomenon. The parents of our study subjects inhabited different environments in the wild, which can also affect how offspring respond to a stress (Galloway & Etterson 2007; Ho & Burggren 2010; Nadeau 2009). This experiment used 1.2-year-old garter snakes that were conceived and partially gestated in the field, but born and raised in a common laboratory environment. The maternal environment and body condition prior to conception can affect the quality and quantity of resources (yolk, metabolites, mRNAs) allocated to offspring, with lasting effects on morphology, growth, and fitness (Badyaev & Uller 2009; Mousseau & Fox 1998; Noguera et al. 2011; Warner et al. 2007). There have been few studies exploring the extent of transgenerational epigenetic effects on specific molecular physiological phenotypes in offspring, particularly in vertebrates (Crews et al. 2012; Ho & Burggren 2010). A recent study by Kaneko et al. (2011) has shown that environmentally induced longevity and enhanced oxidative stress resistance can be inherited by second-generation rotifers, such that offspring from calorie-restricted mothers had higher catalase and MnSOD gene expression and protein levels.
Given that our subjects were 1.2 years of age, if their differences in stress response were due to epigenetic modification of the genome, this would imply that relatively stable epigenetic changes in the genome are regulating gene expression and ultimately affecting our measurements, such as through epigenetic marks, histone modification, and non-coding RNAs (Jablonka & Lamb 2011). This is a hypothesis to be tested in future studies.

We suggest that the majority of the differences between the ecotypes at the nodes in the stress network have a genetic basis and are under divergent selection in the different habitats, leading to local adaptation. Gene knock-out studies on laboratory models have established that stress response can have a strong genetic basis (Kenyon 2005; Luca et al. 2009). Selection experiments on diverse laboratory model systems (nematodes, Drosophila, and mice) have shown longevity and reproduction to have sufficient heritability for rapid responses to selection (Klebanov et al. 2000; Luckinbill & Clare 1985; Walker et al. 2000). Furthermore, it is possible for physiological differences to evolve rapidly (within 3 generations) in vertebrates, as shown by an experimental evolution study on cold tolerance in stickleback (Barrett et al. 2011). Thus, the nodes that differ between these garter snake ecotypes could be evolved regulatory differences in the molecular network. Neutral nuclear markers (microsatellites) show restricted gene flow among the populations used in this study, and the
differences between the two habitat types could facilitate the divergence of traits under strong selection. Although we had very low power for detection with our limited sample size for the RNA-seq data, we identified one of the six genes examined, mitochondrial (Mn)SOD2, that shows allele frequency differences between the ecotypes greater than the average for the candidate genes and for the microsatellites. This suggests that it may be a target of diversifying selection for different alleles in different populations. Although the allele-defining SNPs did not change the protein sequence, SNPs regulating this gene's expression and splicing may be the targets of selection. These findings warrant future analyses with an increased sample size on the functional genetic differentiation between these ecotypes.
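
The comparison of allele frequency differences at SOD2 against the averages for the candidate genes and microsatellites rests on a measure of population differentiation such as F_ST. As a minimal sketch, assuming a single biallelic SNP and two equally weighted populations, Wright's F_ST can be computed from expected heterozygosities as shown below. The function name and the allele frequencies are hypothetical illustrations, not values or code from this study, and the estimator actually used here may differ (for example, by correcting for sample size).

```python
def fst_biallelic(p1: float, p2: float) -> float:
    """Wright's F_ST for one biallelic locus in two equally weighted populations.

    p1 and p2 are the frequencies of the same allele in each population
    (hypothetical inputs; no sample-size correction is applied).
    """
    p_bar = (p1 + p2) / 2.0
    # Expected heterozygosity if both populations were pooled into one.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    # Mean expected heterozygosity within each population.
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0
    if h_total == 0.0:
        # The locus is fixed for the same allele everywhere: no differentiation.
        return 0.0
    return (h_total - h_within) / h_total


# Hypothetical frequencies: an allele at 0.2 in one ecotype and 0.8 in the other.
print(round(fst_biallelic(0.2, 0.8), 2))  # 0.36
```

A locus would stand out as a candidate for divergent selection when its value clearly exceeds that of presumably neutral markers; with only five individuals per population, such a comparison should be treated as exploratory.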

The evolutionary consequences of divergence in stress response for life-history traits.
The fast- and slow-living ecotypes used in this study have evolved under different selection pressures in their respective habitats (Table S1). Although limited to specific nodes of the entire stress network, this study supports the hypothesis that these life-history ecotypes have also diverged at the molecular level in how they respond to stress. Due to the integrated nature of stress networks with life-history trade-offs, this allows us to further propose that selection may be acting specifically via this physiological network to drive life-history divergence in one or both of these ecotypes. This and our previous studies have shown that M-slow animals have a stress response that is consistent with promoting cellular maintenance, which could lead to increased longevity. This is consistent with a mechanistic hypothesis of aging that posits eventual catastrophic damage accumulation over the adult lifespan (e.g., the Oxidative Stress Theory of Ageing).

The differences in the unstressed levels of ROS between these ecotypes are intriguing as they contradict expectations based on the Oxidative Stress Theory of Ageing, with M-slow individuals having higher levels of ROS than the L-fast ecotype under control lab conditions. ROS are involved in many biological processes, although their negative effects have been the focus of aging studies, and the imbalance between ROS production (pro-oxidant) and ROS degradation (antioxidant) that would result in increased oxidative stress and accumulation of damage is thought to be more important than the baseline levels of ROS themselves (Costantini & Verhulst 2009; Monaghan et al. 2009). Our results (e.g., higher ROS and lower DNA damage in unstressed M-slow animals) are in line with the theory of mitohormesis (Tapia 2006).
This theory predicts that moderately high levels of ROS are beneficial, such that activation of the mitochondria and the increased production of ROS as a by-product could have a hormetic (mitohormetic) effect via signaling through the cell to maintain higher levels of cellular protection. Indeed, many studies have challenged the idea that baseline ROS levels cause aging via oxidative damage to the cell (Lapointe & Hekimi 2010; Ristow & Zarse 2010; Tapia 2006).

Comparing across studies in this system, it is interesting that both ROS levels and DNA damage differed between the unstressed ecotypes at 1.2 years (this study), but these differences were not present at 0.2 years of age (Robert & Bronikowski 2010). From this, we hypothesize that as the neonates of both ecotypes grow and develop (in this case in a common lab environment) they diverge physiologically. Further studies conducted over extended periods of time under field and controlled conditions will be necessary to fully elucidate the role of ROS regulation in driving life histories in this system.

In summary, in this experiment we activated the stress response network and assayed candidate nodes predicted from studies on laboratory model animals to be important to the evolution of life-history trade-offs. We found that corticosterone and expression of HSP increased in parallel in both ecotypes, with no divergence in the response at these nodes. In contrast, the ecotypes diverged in many aspects related to oxidative stress: levels of ROS, DNA damage, antioxidant gene expression (GPX1) and allele frequencies (SOD2). Therefore, this study supports the hypothesis that these life-history ecotypes have also diverged at the molecular level in how they respond to stress, particularly in nodes that regulate oxidative stress. Furthermore, these data allow us to propose the hypothesis that selection on nodes in the stress network may be responsible for driving the evolutionary differences between the ecotypes.

ACKNOWLEDGEMENTS
Considerable thanks to M. G. Palacios for performing the corticosterone assay; S. Rigby and the ISU Flow Cytometry Facility for assistance with the flow cytometer assay; M. Manes, J. Chamberlain, and A. Sparkman for assistance on experiment days; D. Warner for statistical and graphical assistance. We appreciate the useful comments on the manuscript provided by members of the Bronikowski and Vleck Labs, three anonymous reviewers and C. Eizaguirre, the ME subject editor. Funding sources include NSF grant [IOS 0922528] to AMB; ISU Center for Integrated Animal Genomics to AMB and TSS; Iowa Academy of Science to AMB and TSS; NSF-IGERT and NSF-GK-12 fellowships to TSS. IACUC number 3-2-5125-J. Thanks to A. Dixon for Fig. 1 artwork.

REFERENCES
Ali SS, Xiong C, Lucero J, Behrens MM, Dugan L, Quick KL (2006) Gender differences in free radical homeostasis during aging: shorter-lived female C57BL6 mice have increased oxidative stress. Aging Cell 5, 565-574.
Arnold SJ, Peterson CR (2002) A model for optimal reaction norms: The case of the pregnant garter snake and her temperature-sensitive embryos. American Naturalist 160, 306-316.
Badyaev AV, Uller T (2009) Parental effects in ecology and evolution: mechanisms, processes and implications. Philosophical Transactions of the Royal Society B: Biological Sciences 364, 1169-1177.
Barrett RDH, Paccard A, Healy TM, et al. (2011) Rapid evolution of cold tolerance in stickleback. Proceedings of the Royal Society B: Biological Sciences 278, 233-238.
Bize P, Devevey G, Monaghan P, Doligez B, Christe P (2008) Fecundity and Survival in Relation to Resistance to Oxidative Stress in a Free-Living Bird. Ecology 89, 2584-2593.
Bronikowski AM (2008) The evolution of aging phenotypes in snakes: a review and synthesis with new data. AGE 30, 169-176.
Bronikowski AM, Arnold SJ (1999) The evolutionary ecology of life history variation in the garter snake Thamnophis elegans. Ecology 80, 2314-2325.
Bronikowski AM, Vleck D (2010) Evolutionarily divergent physiology and life span: A case study in the garter snake (Thamnophis elegans). Integrative and Comparative Biology 50, 880-887.
Bruskov VI, Malakhova LV, Masalimov ZK, Chernikov AV (2002) Heat-induced formation of reactive oxygen species and 8-oxoguanine, a biomarker of damage to DNA. Nucleic Acids Research 30, 1354-1363.
Calabrese EJ, Baldwin LA (2002) Defining hormesis. Human and Experimental Toxicology 21, 91-97.
Collins A (2004) The comet assay for DNA damage and repair. Molecular Biotechnology 26, 249-261.
Costantini D, Dell'Omo G, Paola de Filippis S, et al. (2009) Temporal and spatial covariation of gender and oxidative stress in the Galápagos land iguana Conolophus subcristatus. Physiological and Biochemical Zoology 82, 430-437.
Costantini D, Verhulst S (2009) Does high antioxidant capacity indicate low oxidative stress? Functional Ecology 23, 506-509.
Crews D, Gillette R, Scarpino SV, et al. (2012) Epigenetic transgenerational inheritance of altered stress responses. Proceedings of the National Academy of Sciences.
Dowling DK, Simmons LW (2009) Reactive oxygen species as universal constraints in life-history evolution. Proceedings of the Royal Society B: Biological Sciences 276, 1737-1745.
Drummond AJ, Ashton B, Cheung M, et al. (2009) Geneious v4.7, Available from http://www.geneious.com/.
Finkel T, Holbrook NJ (2000) Oxidants, oxidative stress and the biology of ageing. Nature 408, 239-247.
Flatt T, Heyland A (2011) Integrating mechanisms into life history evolution. In: Mechanisms of life history evolution: The genetics and physiology of life history traits and trade-offs (eds. Flatt T, Heyland A), pp. 3-10. Oxford University Press, Oxford.
Friedman DB, Johnson TE (1988) A mutation in the age-1 gene in Caenorhabditis elegans lengthens life and reduces hermaphrodite fertility. Genetics 118, 75-86.
Galloway LF, Etterson JR (2007) Transgenerational Plasticity Is Adaptive in the Wild. Science 318, 1134-1136.
Garratt M, Vasilaki A, Stockley P, et al. (2011) Is oxidative stress a physiological cost of reproduction? An experimental test in house mice. Proceedings of the Royal Society B: Biological Sciences 278, 1098-1106.
Ghazi A, Henis-Korenblit S, Kenyon C (2009) A transcription elongation factor that links signals from the reproductive system to lifespan extension in Caenorhabditis elegans. PLoS Genetics 5, e1000639.
Goecks J, Nekrutenko A, Taylor J, Team TG (2010) Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences. Genome Biology 11, R86.
Grabherr MG, Haas BJ, Yassour M, et al. (2011) Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotech 29, 644-652.
Grandison RC, Piper MDW, Partridge L (2009) Amino-acid imbalance explains extension of lifespan by dietary restriction in Drosophila. Nature 462, 1061-1064.
Greenberg N, Wingfield J (1987) Stress and reproduction: reciprocal relationships. In: Hormones and Reproduction in Fishes, Amphibians, and Reptiles (eds. Norris DO, Jones RE), pp. 461-503. Plenum Press, New York.
Ho DH, Burggren WW (2010) Epigenetics and transgenerational transfer: a physiological perspective. Journal of Experimental Biology 213, 3-16.
Holzenberger M, Dupont J, Ducos B, et al. (2003) IGF-1 receptor regulates lifespan and resistance to oxidative stress in mice. Nature 421, 182-187.
Huey RB, Peterson CR, Arnold SJ, Porter WP (1989) Hot rocks and not-so-hot rocks - retreat-site selection by garter snakes and its thermal consequences. Ecology 70, 931-944.
Isabell K, Klaus F (2009) Altitudinal and environmental variation in lifespan in the Copper butterfly Lycaena tityrus. Functional Ecology 23, 1132-1138.
Isaksson C, While GM, Olsson M, Komdeur J, Wapstra E (2011) Oxidative stress physiology in relation to life history traits of a free-living vertebrate: the spotted snow skink, Niveoscincus ocellatus. Integrative Zoology 6, 140-149.
Jablonka E, Lamb MJ (2011) Transgenerational epigenetic inheritance. In: Evolution: The extended synthesis (eds. Pigliucci M, Muller GB), pp. 137-174. The MIT Press, Cambridge.
Jiang Y, Rabbi M, Kim M, et al. (2009) UVA Generates Pyrimidine Dimers in DNA Directly. Biophysical Journal 96, 1151-1158.
Johnson TE, Henderson S, Murakami S, et al. (2002) Longevity genes in the nematode Caenorhabditis elegans also mediate increased resistance to stress and prevent disease. Journal of Inherited Metabolic Disease 25, 197-206.
Kaneko G, Yoshinaga T, Yanagawa Y, et al. (2011) Calorie restriction-induced maternal longevity is transmitted to their daughters in a rotifer. Functional Ecology 25, 209-216.
Kenyon C (2001) A conserved regulatory system for aging. Cell 105, 165-168.
Kenyon C (2005) The plasticity of aging: Insights from long-lived mutants. Cell 120, 449-460.
Kenyon C, Chang J, Gensch E, Rudner A, Tabtiang R (1993) A C. elegans mutant that lives twice as long as wild type. Nature 366.
Klebanov S, Flurkey K, Roderick TH, et al. (2000) Heritability of life span in mice and its implication for direct and indirect selection for longevity. Genetica 110, 209-218.
Lapointe J, Hekimi S (2010) When a theory of aging ages badly. Cellular and Molecular Life Sciences 67, 1-8.
Leroi AM, Bartke A, Benedictis GD, et al. (2005) What evidence is there for the existence of individual genes with antagonistic pleiotropic effects? Mechanisms of Ageing and Development 126, 421-429.
Lithgow G, White T, Melov S, Johnson T (1995) Thermotolerance and extended life-span conferred by single-gene mutations and induced by thermal stress. Proceedings of the National Academy of Sciences 92, 7540-7544. doi:10.1073/pnas.92.16.7540.
Lithgow GJ, Miller RA (2008) Determination of aging rate by coordinated resistance to multiple forms of stress. In: Molecular biology of aging (eds. Guarente LP, Partridge L, Wallace DC), pp. 427-482. Cold Spring Harbor Laboratory Press, Cold Spring Harbor.
Luca F, Kashyap S, Southard C, et al. (2009) Adaptive variation regulates the expression of the human SGK1 gene in response to stress. PLoS Genet 5, e1000489.
Luckinbill LS, Clare MJ (1985) Selection for life span in Drosophila melanogaster. Heredity 55, 9-18.
Manier MK, Arnold SJ (2005) Population genetic analysis identifies source-sink dynamics for two sympatric garter snake species (Thamnophis elegans and Thamnophis sirtalis). Molecular Ecology 14, 3965-3976.
Manier MK, Seyler CM, Arnold SJ (2007) Adaptive divergence within and between ecotypes of the terrestrial garter snake, Thamnophis elegans, assessed with F-ST-Q(ST) comparisons. Journal of Evolutionary Biology 20, 1705-1719.
Mates JM, Sanchez-Jimenez F (1999) Antioxidant enzymes and their implications in pathophysiologic processes. Frontiers in Bioscience 4, d339-345.
Maynard Smith J (1958) The effects of temperature and of egg-laying on the longevity of Drosophila subobscura. Journal of Experimental Biology 35, 832-843.
McElwee JJ, Schuster E, Blanc E, et al. (2007) Evolutionary conservation of regulated longevity assurance mechanisms. Genome Biology 8, R132.
Miller DA, Clark WR, Arnold SJ, Bronikowski AM (2011) Stochastic population dynamics in populations of western terrestrial garter snakes with divergent life histories. Ecology 92, 1658-1671.
Monaghan P, Metcalfe NB, Torres R (2009) Oxidative stress as a mediator of life history trade-offs: mechanisms, measurements and interpretation. Ecology Letters 12, 75-92.
Mousseau TA, Fox CW (1998) The adaptive significance of maternal effects. Trends in Ecology & Evolution 13, 403-407.
Nadeau JH (2009) Transgenerational genetic effects on phenotypic variation and disease risk. Human Molecular Genetics 18, R202-R210.
Noguera JC, Alonso-Alvarez C, Kim S-Y, Morales J, Velando A (2011) Yolk testosterone reduces oxidative damages during postnatal development. Biology Letters 7, 93-95.
Norry F, Sambucetti P, Scannapieco A, Loeschcke V (2006) Altitudinal patterns for longevity, fecundity and senescence in Drosophila buzzatii. Genetica 128, 81-93.
Norry FM, Loeschcke VR (2002) Longevity and resistance to cold stress in cold-stress selected lines and their controls in Drosophila melanogaster. Journal of Evolutionary Biology 15, 775-783.
Olsson M, Wilson M, Uller T, Mott B, Isaksson C (2009) Variation in levels of reactive oxygen species is explained by maternal identity, sex and body-size-corrected clutch size in a lizard. Naturwissenschaften 96, 25-29.
Palloti F, Lenaz G (2001) Isolation and subfractionation of mitochondria from animal cells and tissue culture lines. In: Mitochondria (eds. Pon LA, Schon EA), pp. 1-35. Academic Press, San Diego, CA.
Peakall R, Smouse PE (2006) GENALEX6: genetic analyses in Excel. Population genetic software for teaching and research. Molecular Ecology Notes 6, 288-295.
Peterson CC, Walton BM, Bennett AF (1998) Intrapopulation variation in ecological energetics of the garter snake Thamnophis sirtalis, with analysis of the precision of doubly labeled water measurements. Physiological Zoology 71, 333-349.
Piper MDW, Selman C, McElwee JJ, Partridge L (2008) Separating cause from effect: how does insulin/IGF signalling control lifespan in worms, flies and mice? Journal of Internal Medicine 263, 179-191.
Rando OJ, Verstrepen KJ (2007) Timescales of genetic and epigenetic inheritance. Cell 128, 655-668.
Ricklefs RE, Wikelski M (2002) The physiology/life-history nexus. Trends in Ecology & Evolution 17, 462-468.
Ristow M, Zarse K (2010) How increased oxidative stress promotes longevity and metabolic health: The concept of mitochondrial hormesis (mitohormesis). Experimental Gerontology 45, 410-418.
Robert K, Vleck C, Bronikowski A (2009) The effects of maternal corticosterone levels on offspring behavior in fast- and slow-growth garter snakes (Thamnophis elegans). Hormones and Behavior 55, 24-32.
Robert KA, Bronikowski AM (2010) Evolution of senescence in nature: physiological evolution in populations of garter snake with divergent life histories. The American Naturalist 175, 147-159.
Robert KA, Brunet-Rossinni A, Bronikowski AM (2007) Testing the 'free radical theory of aging' hypothesis: physiological differences in long-lived and short-lived colubrid snakes. Aging Cell 6, 395-404.
Rochette PJ, Therrien JP, Drouin R, et al. (2003) UVA-induced cyclobutane pyrimidine dimers form predominantly at thymine–thymine dipyrimidines and correlate with the mutation spectrum in rodent cells. Nucleic Acids Research 31, 2786-2794.
Russell SJ, Kahn CR (2007a) Endocrine regulation of ageing. Nat Rev Mol Cell Biol 8, 681-691.
Russell SJ, Kahn CR (2007b) Endocrine regulation of ageing. Nature Reviews in Molecular Cell Biology 8, 681-691.
Schwartz TS, Bronikowski AM (2011) Molecular stress pathways and the evolution of life histories in reptiles. In: Molecular Mechanisms of Life History Evolution: The Genetics and Physiology of Life History Traits and Trade-Offs (eds. Flatt T, Heyland A). Oxford Press, Oxford.
Schwartz TS, Tae H, Yang Y, et al. (2010) A garter snake transcriptome: pyrosequencing, de novo assembly, and sex-specific differences. BMC Genomics 11, 694.
Singh NP, McCoy MT, Tice RR, Schneider EL (1998) A simple technique for quantitation of low levels of DNA damage in individual cells. Experimental Cell Research 175, 184-191.
Sparkman AM, Arnold SJ, Bronikowski AM (2007) An empirical test of evolutionary theories for reproductive senescence and reproductive effort in the garter snake Thamnophis elegans. Proceedings of the Royal Society B-Biological Sciences 274, 943-950.
Stevenson RD, Peterson CR, Tsuji JS (1985) The thermal dependence of locomotion, tongue flicking, digestion, and oxygen consumption in the wandering garter snake. Physiological Zoology 58, 46-57.
Sun Y, MacRae T (2005) Small heat shock proteins: molecular structure and chaperone function. Cellular and Molecular Life Sciences 62, 2460-2476.
Surai PF (2002) Natural antioxidants in avian nutrition and reproduction. Nottingham University Press, Thrumpton.
Tapia PC (2006) Sublethal mitochondrial stress with an attendant stoichiometric augmentation of reactive oxygen species may precipitate many of the beneficial alterations in cellular physiology produced by caloric restriction, intermittent fasting, exercise and dietary phytonutrients: "Mitohormesis" for health and vitality. Medical Hypotheses 66, 832-843.
Tatar M, Bartke A, Antebi A (2003) The endocrine regulation of aging by insulin-like signals. Science 299, 1346-1351.
Vaiserman AM, Koshel NM, Mechova LV, Voitenko VP (2004) Cross-life stage and cross-generational effects of gamma irradiations at the egg stage on Drosophila melanogaster life histories. Biogerontology 5, 327-337.
Walker DW, McColl G, Jenkins NL, Harris J, Lithgow GJ (2000) Natural selection: Evolution of lifespan in C. elegans. Nature 405, 296-297.
Warner D, Lovern M, Shine R (2007) Maternal nutrition affects reproductive output and sex allocation in a lizard with environmental sex determination. Proceedings of the Royal Society of London, Biological sciences 274, 883-890.
Wiersma P, Selman C, Speakman JR, Verhulst S (2004) Birds sacrifice oxidative protection for reproduction. Proceedings of the Royal Society of London. Series B: Biological Sciences 271, S360-S363.

DATA ACCESSIBILITY

DNA sequences
GenBank accessions: RNA-seq data in Sequence Read Archive, SRA052923; Catalase Isoform 1, JX291960; Catalase Isoform 2, JX291961; Glutathione Peroxidase 1, JX291962; Glutathione Peroxidase 3, JX291963; Glutathione Peroxidase 4, JX291964; HSP40A1, JX291965; HSP40A4, JX291966; HSP70A1, JX291967; Superoxide dismutase 1, JX291968; Superoxide dismutase 2 Isoform 1, JX291969; Superoxide dismutase 2 Isoform 2, JX291970; Superoxide dismutase 3, JX291971

Physiological data
DRYAD Provisional DOI doi:10.5061/dryad.sb30r

Galaxy workflows.
Cleaning Illumina Reads: http://main.g2.bx.psu.edu/u/ts-ecogen/w/groomillumina-solexatrimq20filter20bp
Mapping Illumina reads to candidate genes: http://main.g2.bx.psu.edu/u/ts-ecogen/w/map-illumina-indiv-to-candidate-transcripts-and-count

Table 4-1. Statistical results from ANOVAs, F(df1,df2), on physiological traits. NS indicates non-significant results; a dash "-" indicates that variable was not included in the final model. + p-value ≤ 0.1; * p-value ≤ 0.05; ** p-value ≤ 0.01; *** p-value ≤ 0.001.
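
The significance symbols defined in the Table 4-1 legend amount to a threshold lookup. The sketch below restates those thresholds as a small function; the function name is a hypothetical illustration rather than part of the original analysis, and the dash used for variables dropped from the final model is not a p-value category, so it is not handled here.

```python
def significance_symbol(p_value: float) -> str:
    """Map a p-value to the symbol used in the Table 4-1 legend.

    Thresholds follow the legend: *** p <= 0.001, ** p <= 0.01,
    * p <= 0.05, + p <= 0.1, otherwise NS (non-significant).
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie between 0 and 1")
    # Check the strictest threshold first so each value gets one symbol.
    for threshold, symbol in ((0.001, "***"), (0.01, "**"), (0.05, "*"), (0.1, "+")):
        if p_value <= threshold:
            return symbol
    return "NS"


# Hypothetical p-values, one per legend category.
for p in (0.0004, 0.008, 0.03, 0.07, 0.4):
    print(p, significance_symbol(p))
```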

Table 4-2. Genetic variation based on the RNA-seq data for the antioxidant genes, and microsatellite loci. Frequency of the second isoform (Isoform2) in each ecotype; SNPs, number of SNPs across both ecotypes; CDS SNPs, number of SNPs in the coding region of the mRNA as opposed to the 5' or 3' UTRs; AA changes, number of amino acid changes; dN/dS, the ratio of non-synonymous to synonymous changes between alleles; F_ST between the two populations representing the ecotypes. Sample size for the RNA-seq data is n=10 (5 per population) and n=104 for the microsatellites (Table S5). + p<0.1, * p<0.05

Table 4-3. Summary of differences in this study testing for the main effects of heat stress, sex, ecotype.

Figure 4-1. Example of a simplified molecular stress network with nodes indicated by thick dark circles, connected by arrows. Modified from Schwartz and Bronikowski (2011). Abbreviations. Extracellular: cort (corticosterone); CR (corticosterone receptor); IGF-1 (insulin-like growth factor-1); IGFR-1 (insulin-like growth factor-1 receptor). Cytoplasmic: HSF-1 (heat shock factor-1); P-FOXO (phosphorylated Forkhead transcription factor, homologous to DAF-16 in C. elegans); P53 (protein 53); H2O2 (hydrogen peroxide); GPX (glutathione peroxidase-1); Fe- (iron); OH- (hydroxyl radical). Nucleus: HSP genes (genes encoding heat shock proteins); GRE genes (genes with glucocorticoid response elements); PGC-1α (peroxisome proliferator-activated receptor γ coactivator 1-α). Mitochondria: UCP2 (uncoupling protein 2); I (ETC complex I, NADH dehydrogenase); CoQ (co-enzyme Q10); II (ETC complex II, succinate dehydrogenase); III (ETC complex III, bc1 complex); CytC (cytochrome c); IV (ETC complex IV, cytochrome c oxidase); V (ETC complex V, ATP synthase); e- (electrons); O2·- (superoxide); SOD (superoxide dismutase); H2O2 (hydrogen peroxide).

[Figure 4-2 diagram. Level 1, whole organism treatment: control temperature or heat stress temperature. Level 2, blood cell treatment: Background (cell buffer only) or H2O2 (cell buffer with H2O2 added). Groups for comparison of DNA damage: Control Background (n=52); Control H2O2 (n=22); Heat Stress Background (n=53); Heat Stress H2O2 (n=28).]

Figure 4-2. Design of comet assay to assess DNA damage. Level 1 is the whole organism heat stress treatment, with animals split between control temperature (27°C) and heat stress temperature (37°C). After euthanasia, the captured blood cells from each animal were split between a Background treatment with normal cell buffer and, for a subset of individuals, a H2O2 treatment with 200 µM of H2O2 added to the cell buffer (Level 2). Analyses were conducted on the four groups in the black box.

[Figure 4-3 plots. Panel A: % of blood cells with superoxide (L-fast female, L-fast male, M-slow female, M-slow male). Panel B: plasma H2O2 (μM). Panel C: μg H2O2 produced over 30 min by 2 mg of mitochondria (L-fast, M-slow). X-axis for all panels: Control (27C), Heat Stress (37C).]

Figure 4-3. Interaction plots of the ecotypes' differential response to stress in levels of reactive oxygen species (ROS). Y-axes are the LSM (backtransformed when appropriate) ± the SE. Dark lines are L-fast ecotype, light colored lines are M-slow ecotype; if differentiated by sex, dashed lines are males, solid lines are females. A) Significant interaction between treatment, ecotype and sex in the percent of blood cells with MitoSOX fluorescence (indicator of superoxide in the mitochondria). B) Significant interaction between treatment and ecotype in the concentration of H2O2 in blood plasma in response to heat stress. C) Non-significant pattern of interaction between treatment and ecotype in the amount of H2O2 produced by the live liver mitochondria over a 30 minute period.

[Figure 4-4 plots. Panel A: GPX1 gene expression (L-fast, M-slow); x-axis: Control (27C), Heat Stress (37C). Panel B: SOD2 gene expression (L-fast, M-slow); x-axis: Control (27C), Heat Stress (37C). Panel C: plasma H2O2 (μM) against catalase gene expression from RNA-seq.]

Figure 4-4. Gene expression in the liver based on RNA-seq data. Y axes are the LSM ± the SE of the normalized number of transcripts. Dark lines are L-fast ecotype, light colored lines are M-slow ecotype. A) Significant interaction between ecotypes and treatment on the expression of GPX1. B) Significant effect of treatment on SOD2 gene expression. C) Negative correlation between CAT gene expression (RNA-seq) and plasma levels of H2O2.

[Figure 4-5 plots. Panel A: % DNA damage due to heat stress (L-fast, M-slow); x-axis: Control Background, Heat Stress Background. Panel B: % DNA damage due to H2O2 (L-fast, M-slow); x-axis: Control: H2O2 - Background, Heat Stress: H2O2 - Background. Panel C: % DNA damage (L-fast female, L-fast male, M-slow female, M-slow male); x-axis: Background, H2O2.]

Figure 4-5. Interaction plots of DNA damage measured by comet assay. Dark lines are L-fast ecotype, light colored lines are M-slow ecotype; if differentiated by sex, dashed lines are males, solid lines are females. Y-axes are the LSM (backtransformed when appropriate) ± the SE. Please refer to Figure 4-2 for clarification on the X-axis. A) Ecotypes have significantly different levels of background DNA damage in their blood cells in response to whole organism heat stress. B) Individuals subjected to whole organism heat stress accrue less DNA damage due to subsequent H2O2 treatment of the blood cells. X-axis has the Background damage subtracted from the damage due to H2O2 treatment on the blood cells. C) Significant interaction between ecotype, treatment and sex in response to the H2O2 treatment on the blood cells (control and heat stress are averaged here).


SUPPLEMENTAL INFORMATION

Supplemental methods

Microsatellite genotyping
DNA was extracted from blood samples taken from field-caught animals and genotyped at seven microsatellite loci reported in previous publications: NSU10, NSU2, NSU3, TS10, TS2, TelCa18, and TelCa2 (Garner et al. 2004; Manier & Arnold 2005; McCracken et al. 1999; Prosser et al. 1999). These loci were verified to be in HWE and linkage equilibrium in this system (M. Manes unpublished data). Loci were amplified in three multiplex, 7 µL reactions using the Type-it Microsatellite PCR kit (QIAGEN) with 25 ng DNA, 0.03 to 0.65 ng of each locus-specific reverse and forward primer, 0.5 to 0.6 ng of the M13 primer labeled with a fluorescent dye, and 3 mM MgCl2 (in Type-it Multiplex PCR Master Mix). Amplification was performed with the following thermal cycle: an initial 95 °C for 5 min; 30 cycles of 95 °C for 30 sec, 57 °C for 1 min 30 sec, and 72 °C for 30 sec; and a final extension of 30 min at 60 °C. For each sample, 0.5 µL of each PCR multiplex (1.5 µL total PCR product per sample) was mixed with 8.5 µL H2O to create a 1:20 dilution of each multiplex. For each of these diluted samples, 1.5 µL was electrophoresed at the Iowa State University DNA Facility using the ABI 3100 Genetic Analyzer (Applied Biosystems), with dye set G5 and the internal size standard LIZ. To minimize scoring biases and errors, we used GENEMAPPER (Applied Biosystems) to define allelic bins. The electropherograms were scored automatically by GENEMAPPER and subsequently verified or corrected by eye. ARLEQUIN (Excoffier et al. 2005) was used to calculate pairwise FST values to estimate genetic distance among the populations.

Quantitative PCR
To verify gene expression results based on the RNA-seq data, two antioxidant genes, HSP70A1 and catalase, were assayed using Q-PCR on 68 individuals, with EEF1A and B-actin as reference genes (see Supplemental Table 4-3 for primer information). cDNA was synthesized using 1 µg of RNA, 200 units of SuperScript II reverse transcriptase (Invitrogen), 2 mM dNTPs, and 0.5 µg oligo dTs (Invitrogen and Integrated DNA Technologies). From a pool of cDNA from all the samples, a 10-point standard curve was created using 1:4 dilutions. Quantitative PCR reactions were run in duplicate in 20 µL volumes containing 0.3 µM of each primer (Supplemental Table 4-3) and Q-PCR Supermix (Promega), and were conducted on an Eppendorf MasterPlex Cycler using the SybrGreen setting. The cycle consisted of 95 °C for 2 min; 40 cycles of 95 °C for 20 sec, 58 °C for 20 sec, and 68 °C for 20 sec; and a final melt curve analysis from 68 to 95 °C for 20 min. A standard curve was run in duplicate on every plate. For every plate, the standard curve was used to convert Ct values to relative copy number of the transcript for that particular gene. Individuals that had a Ct standard deviation among replicates greater than 0.25 were rerun. To normalize the samples, the relative copy number of each target gene was divided by the relative copy number of the reference gene, and then divided by the average of the L-fast controls to make the data set relative. Data were log transformed to approximate a normal distribution.
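The conversion and normalization steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the analysis code used in the study: it assumes a log-linear standard curve fitted by ordinary least squares and a natural-log transform (the text does not give the base), and every function name and example number below is hypothetical.

```python
import math
from statistics import mean, stdev

def fit_standard_curve(relative_quantities, ct_values):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in relative_quantities]
    x_bar, y_bar = mean(xs), mean(ct_values)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def ct_to_relative_copies(ct, slope, intercept):
    """Invert the plate's standard curve to get a relative copy number."""
    return 10 ** ((ct - intercept) / slope)

def needs_rerun(replicate_cts, max_sd=0.25):
    """Flag a sample whose replicate Ct values have an SD above 0.25."""
    return stdev(replicate_cts) > max_sd

def normalized_expression(target_copies, reference_copies, lfast_control_mean_ratio):
    """Target/reference ratio, scaled to the L-fast control mean, then logged.

    The natural log is an assumption; the text only says "log transformed".
    """
    ratio = target_copies / reference_copies
    return math.log(ratio / lfast_control_mean_ratio)

# Hypothetical 10-point standard curve built from 1:4 serial dilutions.
dilution_series = [0.25 ** i for i in range(10)]
example_cts = [18.0 - 3.3 * math.log10(q) for q in dilution_series]  # made-up Cts
slope, intercept = fit_standard_curve(dilution_series, example_cts)
```

Fitting the curve separately for each plate, as the text describes, means plate-to-plate differences in amplification efficiency are absorbed into `slope` and `intercept` before samples are compared.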

DNA damage within blood cells
We measured the baseline and inducible DNA damage within erythrocytes using a cellular-level indicator of DNA damage, the Single Cell Gel Electrophoresis (or "comet") assay (Singh et al. 1998). This assay detects damage due to single and double strand breaks. In general, we performed this assay on fresh cells as described in Bronikowski (2008); details are given below. We used two levels of stress to characterize the effect of oxidative stress on DNA damage (Figure 4-2). The first level was the temperature treatment (i.e., control vs. heat stress animals) as described above. The second level was a specific oxidative stress treatment (H2O2) applied directly to the blood cells. A comparison of DNA damage from the heat stress and control animals assesses the effect of organismal temperature on baseline DNA damage; we refer to this as the Background Damage (Figure 4-2). For a subset of the experimental animals (n=50 of 80), we performed the second level of stress experimentation by exposing the blood cells to a buffer with and without H2O2. The purpose was to assess the resistance of blood cells to damage from a specific oxidizing agent (H2O2). For the Background measurements, blood was diluted in HBSS cell buffer, whereas for the H2O2 Treatment the blood was diluted in HBSS cell buffer containing 200 µM of H2O2. The cells were incubated in their respective buffers for 1 h, and then were washed twice by centrifuging at 1000 g for 30 sec at 4 °C and replacing the HBSS with PBS. Next, cells were added to low melt agarose (final concentration of 0.75% agarose), applied to a microscope slide, and spread by the pressure of a cover slip. The agarose was allowed to solidify for 10 min, the cover slip was removed, and then the cells were lysed by putting the slides into a cold lysis buffer (2.5 M NaCl, 100 mM EDTA, 10 mM Tris-Base, 1% Triton-X, 10% DMSO, pH 10). After lysis the slides were put in a neutralization buffer (0.4 M Tris-Base, pH 7.5) and then electrophoresed (buffer: 300 mM NaOH, 1 mM EDTA, pH 10) for 40 min at 25 V/300 mA. The electrophoresis caused the DNA to migrate out from the lysed cell membrane, with the smaller broken fragments of DNA migrating faster and further, thus creating a comet-like appearance. Cells were fixed in 100% ethanol for 10 min, and the DNA was stained with SYBR Green I (Invitrogen Molecular Probes, Cat No S7563; 1:10,000 dilution in TE buffer: 10 mM Tris-HCl, 1 mM EDTA, pH 7.5). After 5 min of staining, slides were rinsed and allowed to air dry while protected from light. Cells were visualized and photographed using fluorescent microscopy with 464 nm excitation and 521 nm emission. For each slide, 20 to 50 cells were digitally photographed and the percent of DNA in the comet tail (broken DNA) relative to the DNA in the head of the comet (largely intact DNA) was calculated for each cell using the program Viscomet (Impuls Imaging, Gilching, Germany). To compare the amount of DNA damage specifically from the H2O2 treatment, we subtracted the Background DNA damage (from organismal temperature treatment) from the DNA damage resulting from H2O2 treatment.
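The background subtraction in the last step can be illustrated with a short Python sketch. It is a hedged example rather than the study's procedure: it assumes that each slide is summarized by the mean of its per-cell percent tail DNA values, which the text does not state, and the function names and numbers are hypothetical.

```python
from statistics import mean

def slide_damage(per_cell_percent_tail_dna):
    """Summarize one slide as the mean percent of DNA in the comet tail.

    Using the mean across the photographed cells is an assumption.
    """
    return mean(per_cell_percent_tail_dna)

def h2o2_specific_damage(background_cells, h2o2_cells):
    """Damage attributable to H2O2: H2O2-treated slide minus Background slide.

    Both slides come from the same animal, so the organismal temperature
    treatment is common to both terms and cancels in the difference.
    """
    return slide_damage(h2o2_cells) - slide_damage(background_cells)

# Hypothetical per-cell values for one animal (a real slide has 20 to 50 cells).
background = [10.0, 20.0]
treated = [30.0, 40.0]
difference = h2o2_specific_damage(background, treated)
```

Because the result is a difference, it can be negative when the H2O2-treated cells show less damage than the Background cells from the same animal.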

Supplemental Table 4-1. Table of characteristics between slow-living and fast-living ecotypes of garter snakes (Thamnophis elegans) from Lassen Co., California. Modified from Schwartz and Bronikowski (2011). 1 (Bronikowski & Arnold 1999); 2 (Miller et al. 2011); 3 (Kephart & Arnold 1982); 4 Sparkman & Bronikowski, pers. comm.; 5 (Manier et al. 2007); 6 (Sparkman et al. 2007); 7 (Bronikowski 2000); 8 (Robert & Bronikowski 2010); 9 (Bronikowski & Vleck 2010); 10 (Robert et al. 2009); 11 (Palacios et al. 2012).


Supplemental Table 4-2. RNA-seq data. Metrics for Illumina reads (sequences) in millions as they are cleaned and mapped to the Illumina reference transcriptome, using Galaxy workflows.

Individual  Ecotype  Treatment  Total Reads  Total Mapped  Proportion Mapped  Mean Read Length
37          L-fast   Control    31,044,140   22,919,100    0.738              81.44
38          L-fast   Control    13,493,633   11,368,928    0.843              67.56
46          L-fast   Stress     11,717,961   9,270,494     0.791              70.39
61          M-slow   Control    13,275,144   11,238,896    0.847              68.67
62          M-slow   Control    5,570,075    4,061,760     0.729              67.75
64          L-fast   Stress     12,353,495   10,305,033    0.834              73.25
65          L-fast   Stress     12,974,002   10,980,294    0.846              63.63
66          M-slow   Stress     13,557,853   11,367,876    0.838              69.49
67          M-slow   Stress     14,541,526   11,645,601    0.801              62.03
87          M-slow   Control    10,246,198   7,410,149     0.723              56.32
90          L-fast   Control    11,949,353   10,568,233    0.884              62.49
91          L-fast   NA         14,220,732   11,216,959    0.789              63.75
93          M-slow   Stress     15,033,500   13,215,622    0.879              64.29
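The mapped fraction reported for each library is the number of mapped reads divided by the total number of reads. As a small check, the Python sketch below recomputes it for three individuals using the counts in the table; the function and variable names are illustrative only.

```python
def proportion_mapped(total_reads, mapped_reads):
    """Fraction of a library's reads that mapped to the reference."""
    return mapped_reads / total_reads

# (total reads, total mapped) for three individuals, copied from the table.
libraries = {
    37: (31_044_140, 22_919_100),
    62: (5_570_075, 4_061_760),
    90: (11_949_353, 10_568_233),
}

# Round to three decimals, matching the precision used in the table.
mapped_fraction = {
    individual: round(proportion_mapped(total, mapped), 3)
    for individual, (total, mapped) in libraries.items()
}
```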

Supplemental Table 4-3. Quantitative Real-time PCR primers. Primers designed to optimize: GC content of 50%; GC clamp but no more than 3 G or C in last 5-bp on 3' end; 60 °C annealing, amplicon length 90 to 150-bp and over exon-exon boundary to inhibit amplification of contaminating DNA.


Supplemental Table 4-4. Sample sizes for all analyses.

Assay columns give the sample size (N) for each assay: Cort, corticosterone; Plasma, H2O2 plasma; SO, SO blood cells; DNA-bg, DNA damage - background; DNA-H2O2, DNA damage - H2O2 treatment; Mito, H2O2 - mitochondria (both sexes combined, listed on the female row).

Treatment    Ecotype  Sex     N Individuals  N Litters  Cort  Plasma  SO  DNA-bg  DNA-H2O2  RNA-seq  Q-PCR  Mito
Control      L-Fast   Female  10             7          8     8       7   3       3         3        7      10
Control      L-Fast   Male    20             11         18    20      8   10      7         0        7
Control      M-slow   Female  9              8          7     6       6   9       6         3        8      6
Control      M-slow   Male    13             10         11    11      5   11      7         0        8
Heat Stress  L-Fast   Female  11             9          10    10      7   11      7         3        7      10
Heat Stress  L-Fast   Male    22             13         20    21      11  16      13        0        7
Heat Stress  M-slow   Female  9              7          7     7       3   5       3         3        8      8
Heat Stress  M-slow   Male    11             9          10    10      4   11      5         0        7


Supplemental Table 4-5. Pairwise FST values among the four populations used in this study; all are significant at p<0.05. The two populations used for RNA-seq are indicated in Bold (ELFS and Papoose).

            L-Fast              M-Slow
            ELFS     Merril     Mahogany   Papoose
            (n=41)   (n=11)     (n=50)     (n=63)
ELFS        -
Merril      0.04     -
Mahogany    0.07     0.02       -
Papoose     0.07     0.02       0.01       -

[Supplemental Figure 4-1 plot. Y-axis: plasma corticosterone (ng/ml); x-axis: temperature treatment, Control (27C) and Heat Stress (37C); lines: L-fast, M-slow.]

Supplemental Figure 4-1. Interaction plot of back-transformed LSM ± the SE of the plasma corticosterone concentration in response to heat stress treatment. Dark line represents L-fast ecotype; light colored line is M-slow ecotype. Both ecotypes significantly increased concentration of corticosterone in their plasma.

Supplemental Figure 4-2. Range of gene expression in the liver for the HSP and antioxidant genes in response to heat stress, based on RNA-seq data. Number of reads were normalized for each library by the total number of reads (millions) and the size of the transcript (kilobases), and then averaged by treatment. The inset shows full scale of gene expression range for HSP70 and GPX3.
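The normalization described in this caption (reads per kilobase of transcript per million reads, then a mean per treatment) can be sketched as follows. This is an illustrative Python example, not the pipeline used in the study; the function names and all numbers are hypothetical.

```python
from statistics import mean

def normalize_reads(read_count, total_reads, transcript_length_bp):
    """Reads per kilobase of transcript per million reads in the library."""
    per_million = read_count / (total_reads / 1e6)
    return per_million / (transcript_length_bp / 1e3)

def average_by_treatment(records):
    """Mean normalized expression per treatment.

    `records` is a list of (treatment, read_count, total_reads, length_bp).
    """
    by_treatment = {}
    for treatment, count, total, length in records:
        by_treatment.setdefault(treatment, []).append(
            normalize_reads(count, total, length)
        )
    return {t: mean(values) for t, values in by_treatment.items()}

# Hypothetical counts for one 2,000 bp transcript in four libraries.
example = [
    ("Control", 1000, 10_000_000, 2000),
    ("Control", 3000, 10_000_000, 2000),
    ("Stress", 4000, 20_000_000, 2000),
    ("Stress", 8000, 20_000_000, 2000),
]
treatment_means = average_by_treatment(example)
```

Dividing by library size and transcript length before averaging keeps a deeply sequenced library or a long transcript from dominating the treatment mean.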

[Supplemental Figure 4-3 plots. Panel A: plasma H2O2 (µM) against catalase gene expression normalized by EEF-1. Panel B: Q-PCR catalase gene expression (L-fast, M-slow); x-axis: Control (27C), Heat Stress (37C). Panel C: RNA-seq catalase gene expression (L-fast, M-slow); x-axis: Control (27C), Heat Stress (37C).]

Supplemental Figure 4-3. Catalase gene expression from the liver. A) Negative correlation between catalase gene expression using quantitative PCR, normalized by EEF-1, and plasma H2O2 (raw data prior to transformation). B) Non-significant ecotype by treatment interaction on catalase gene expression using quantitative PCR data (n=67, F1,63=2.81, p=0.18). C) Non-significant ecotype by treatment interaction on catalase gene expression using RNA-seq (N=12, F1,8=0.69, p=0.43).

[Supplemental Figure 4-4 plot. Three-dimensional plot with axes PC 1, PC 2, and PC 3; groups: L-fast control, L-fast stress, M-slow control, M-slow stress.]

Supplemental Figure 4-4. MDS plot of antioxidant gene expression.

Supplemental Figure 4-5. Catalase nucleotide alignment of 10 unrelated individuals used in RNA-seq. This alignment illustrates the two isoforms and genetic variation. The SNPs on the consensus sequence represent the SNPs found in multiple individuals; SNPs on each transcript indicate heterozygous individuals. Dark blocks indicate missing sequence data, dark vertical lines represent sequence differences, and a thin dark horizontal line represents a gap. CDS represents the coding sequence for each transcript.

Supplemental Figure 4-6. SOD2 nucleotide alignment of 10 unrelated individuals used in RNA-seq. This alignment illustrates the two isoforms and genetic variation. The SNPs on the consensus sequence represent the SNPs found in multiple individuals; SNPs on each transcript indicate heterozygous individuals. Dark blocks indicate missing sequence data, dark vertical lines represent sequence differences, and a thin dark horizontal line represents a gap. CDS represents the coding sequence for each transcript.

[Supplemental Figure 4-7 pie charts. Allele frequency at SOD2 for L-fast (ELF) (n=5): allele 1, 70%; allele 2, 20%; allele 3, 10%; allele 4, 0%. Allele frequency at SOD2 for M-slow (PAP) (n=5): allele 1, 30%; allele 2, 60%; allele 3, 0%; allele 4, 10%.]

Supplemental Figure 4-7. Allele frequency differences among populations for SOD2 based on the 10 unrelated individuals with RNA-seq data.

Supplemental References
Bronikowski AM (2000) Experimental evidence for the adaptive evolution of growth rate in the garter snake Thamnophis elegans. Evolution 54, 1760-1767.
Bronikowski AM (2008) The evolution of aging phenotypes in snakes: a review and synthesis with new data. AGE 30, 169-176.
Bronikowski AM, Arnold SJ (1999) The evolutionary ecology of life history variation in the garter snake Thamnophis elegans. Ecology 80, 2314-2325.
Bronikowski AM, Vleck D (2010) Evolutionarily divergent physiology and life span: A case study in the garter snake (Thamnophis elegans). Integrative and Comparative Biology 50, 880-887.
Excoffier L, Laval G, Schneider S (2005) Arlequin ver. 3.0: An integrated software package for population genetics data analysis. Evolutionary Bioinformatics Online 1, 47-50.
Garner TWJ, Pearman PB, Gregory PT, et al. (2004) Microsatellite markers developed from Thamnophis elegans and Thamnophis sirtalis and their utility in three species of garter snakes. Molecular Ecology Notes 4, 369-371.
Kephart DG, Arnold SJ (1982) Garter snake diets in a fluctuating environment - a 7-year study. Ecology 63, 1232-1236.
Manier MK, Arnold SJ (2005) Population genetic analysis identifies source-sink dynamics for two sympatric garter snake species (Thamnophis elegans and Thamnophis sirtalis). Molecular Ecology 14, 3965-3976.
Manier MK, Seyler CM, Arnold SJ (2007) Adaptive divergence within and between ecotypes of the terrestrial garter snake, Thamnophis elegans, assessed with F-ST-Q(ST) comparisons. Journal of Evolutionary Biology 20, 1705-1719.
McCracken GF, Burghardt GM, Houts SE (1999) Microsatellite markers and multiple paternity in the garter snake Thamnophis sirtalis. Molecular Ecology 8, 1475-1479.
Miller DA, Clark WR, Arnold SJ, Bronikowski AM (2011) Stochastic population dynamics and life-history evolution in the western terrestrial garter snake. Ecology In Press.
Palacios MG, Sparkman AM, Bronikowski AM (2012) Corticosterone and pace of life in two life-history ecotypes of the garter snake Thamnophis elegans. General and Comparative Endocrinology 175, 443-448.
Prosser MR, Gibbs HL, Weatherhead PJ (1999) Microgeographic population genetic structure in the northern water snake, Nerodia sipedon sipedon detected using microsatellite DNA loci. Molecular Ecology 8, 329-333.
Robert K, Vleck C, Bronikowski A (2009) The effects of maternal corticosterone levels on offspring behavior in fast- and slow-growth garter snakes (Thamnophis elegans). Hormones and Behavior 55, 24-32.
Robert KA, Bronikowski AM (2010) Evolution of senescence in nature: physiological evolution in populations of garter snake with divergent life histories. The American Naturalist 175, 147-159.
Schwartz TS, Bronikowski AM (2011) Molecular stress pathways and the evolution of life histories in reptiles. In: Molecular Mechanisms of Life History Evolution: The Genetics and Physiology of Life History Traits and Trade-Offs (eds. Flatt T, Heyland A). Oxford Press, Oxford.
Singh NP, McCoy MT, Tice RR, Schneider EL (1998) A simple technique for quantitation of low levels of DNA damage in individual cells. Experimental Cell Research 175, 184-191.
Sparkman AM, Arnold SJ, Bronikowski AM (2007) An empirical test of evolutionary theories for reproductive senescence and reproductive effort in the garter snake Thamnophis elegans. Proceedings of the Royal Society B-Biological Sciences 274, 943-950.

CHAPTER 5. MITOCHONDRIAL RESPONSES TO HEAT STRESS IN TWO LIFE-HISTORY PHENOTYPES: RESPIRATION, TRANSCRIPTION AND HAPLOTYPE

Manuscript prepared for submission to the peer-reviewed scientific journal, Functional Ecology

Tonia S. Schwartz, Zebulun W. Arendsee, and Anne M. Bronikowski

Author contributions: Both T.S.S. and A.M.B. contributed to the design of the experiment and the assays, and performed the experiment. Z.W.A. contributed to the reconstruction of the mitochondrial genomes/transcriptomes from the RNA-seq data. T.S.S. analyzed the data and drafted the manuscript. A.M.B. advised on the statistical analyses. All authors edited the final manuscript.

ABSTRACT
Mitochondrial function and plasticity are predicted targets of natural selection. However, studies that seek to link mitochondrial function to environmental variation and/or genetic variation in the mitochondrial genome have yielded equivocal results. This study focuses on closely related natural populations of garter snakes (Thamnophis elegans) adapted to two disparate habitats. Animals in these habitats have divergent life-history phenotypes in terms of growth, maturation, reproduction, and longevity. These two life-history phenotypes are predicted to have underlying physiological and metabolic adaptations that support this organismal divergence. Our aim here was to test whether the mitochondria of these life-history phenotypes have also diverged, using an acute whole organism heat stress. Specifically, we test for differences between the life-history phenotypes in mitochondrial oxygen consumption and quantitative expression of the mitochondrial genome under control and heat stress conditions, and in genetic variation in the mitochondrial genome. We reconstructed the T. elegans mitochondrial genome sequence and confirmed the presence of the duplicated control regions found in other snake species. We found that a 2-hour heat stress on garter snakes (i) increases the rate of State III respiration (production rate of ATP), and (ii) increases the gene expression of mitochondrial rRNAs, but not mRNAs. These gene expression results align with control region placement, thus providing the first evidence that control region duplication in snakes may provide a mechanism for transcriptional plasticity. Moreover, we identified several mitochondrial phenotypes that diverged between the life-history phenotypes. (i) Under control conditions the slow-living phenotype had higher State IV respiration (resting). (ii) Under control conditions the slow-living phenotype had higher expression of the mitochondrial genome. (iii) In response to heat stress the fast-living ecotype increased the expression of the mitochondrial genome, whereas the slow-living ecotype had no change. Additionally, by reconstructing individual mitochondrial haplotypes from the RNA-seq data, we identified distinct mitochondrial haplotypes with non-synonymous changes belonging to each ecotype. These life-history phenotype x environment interactions are consistent with local adaptation of mitochondria.

INTRODUCTION
Mitochondria, specifically the oxidative phosphorylation system, produce the majority of energy used by aerobic metazoans (Attardi & Schatz 1988). It is well established that mitochondrial function is responsive to environmental conditions, particularly temperature changes (Blier & Guderley 1993; Chamberlin 2004; Fangue et al. 2009; Fontanillas et al. 2005; Pichaud et al. 2011). This plasticity in mitochondrial bioenergetics contributes to an organism's ability to respond (acutely) and acclimate to different environmental conditions over geographic space and time (Fangue et al. 2009; Schulte et al. 2011; Seebacher et al. 2010). As such, mitochondrial function and plasticity are predicted to be targets of natural selection (Blier et al. 2001). Studies attempting to link mitochondrial function to environmental change and genetic variation, particularly in the mitochondrial genome (mtDNA), have given mixed results. For example, studies on whitefish demonstrate that mitochondrial transcription regulation contributes to hypoxia tolerance, but there is no association with mitochondrial haplotype among closely related species exposed to different levels of oxygen in their natural populations (Flight et al. 2011). Genetic variation in ancestral human mitochondrial haplotypes is associated with latitude and is predicted to affect the efficiency of the electron transport chain (ETC) and thus the relative amounts of energy allocated to heat versus ATP production. Increased heat production is predicted to be a benefit in colder climates (Ruiz-Pesini et al. 2004), but in vitro studies of these mitochondrial haplotypes in a common genetic background failed to meet these predictions at the cellular level (Amo & Brand 2007). In contrast, variation in the mitochondrial genome has been associated with variation in heat production in shrews (Fontanillas et al. 2005).
Arguably the most detailed evidence for mitochondrial evolution and adaptation to environmental variation has come from studies on Drosophila, where the effect of mtDNA variants on fitness has been reported to be temperature dependent (Doi et al. 1999). Additionally, studies on D. simulans demonstrate the effect of temperature change on particular components of the electron transport chain and have linked metabolic thermosensitivity to genetic variation in the mtDNA (Pichaud et al. 2010, 2011, 2012). Concerning the evolvability of mitochondrial function, a recent study on natural populations of rats identified genotype-by-environment interactions in mitochondrial function in response to changes in thermal environment, emphasizing the importance of genetic constraints in the evolution of metabolic plasticity (Glanville et al. 2012). Despite these advances, we do not yet have a comprehensive understanding of the evolution of, and plasticity in, mitochondrial energetics across a landscape, nor of how such putative adaptive evolution relates to genetic variation in functional genes, particularly those of the mitochondrial genome.

This study focuses on closely related populations of garter snakes (Thamnophis elegans) with divergent life-history phenotypes that are adapted to disparate habitats. Comparisons among populations can reveal the first stages in evolutionary divergence and facilitate our understanding of micro-evolutionary processes. This study addresses whether mitochondria have diverged in respiration, transcriptional regulation and mtDNA haplotype among replicate populations of the western terrestrial garter snake in the Sierra Nevada Mountains of California. These populations have diverged in habitat association as well as across a suite of traits related to life-history, morphology, and physiology (Table 1; also see Box 1 in Schwartz & Bronikowski 2011).
Thus the animals in these populations have been characterized as a fast-lived ecotype found in populations around Eagle Lake that have a constant food source of fish but higher levels of predation (L-fast ecotype), and a slow-lived ecotype from populations at higher elevation marshy meadows, with cooler average temperature and ephemeral food sources (M-slow ecotype) (Table 1). These ecotypes have diverged in growth rate (i.e., fast vs. slow), reproductive output (high vs. low), and longevity (short vs. long) (Bronikowski 2000; Bronikowski & Arnold 1999; Miller et al. 2011; Sparkman et al. 2005, 2007). Recent studies suggest that mitochondrial function has also diverged among these ecotypes (Table 1). For example, the L-fast ecotype has a higher mass-specific resting metabolic rate over a wide range of temperatures, relative to the M-slow ecotype (Bronikowski & Vleck 2010). Additionally, the ecotypes differ in the production of reactive oxygen species (ROS), which are produced by electrons being lost from the ETC complexes (predominantly complexes I and III). ROS production differed between the ecotypes under normal laboratory conditions and in response to physiological stresses (Robert & Bronikowski 2010; Schwartz & Bronikowski In Press). Our aim here was to test more directly whether the mitochondria of these life-history ecotypes have diverged, using an acute whole organism heat stress.

Studies on reptiles have proven significant for understanding the evolution of physiological plasticity in metabolic rate and in the response to abiotic stresses (reviewed in Schwartz & Bronikowski 2011). Although one reptilian lineage (birds) has had an independent origin of endothermy, the majority of reptiles are ectothermic, with body temperatures determined by the environmental temperature and thermoregulatory behavior. Thus, changes in the external environment have a relatively direct effect on reptilian mitochondrial function in the sense that environmentally determined body temperature is translated to the cellular environment. Studies on ectotherms that measure respiration of isolated mitochondria show increased oxygen consumption with increasing temperature across the biologically relevant range. This is reflective of the passive increase in enzymatic rate with increasing temperature (i.e., increase in average molecular velocity and thus rate of chemical reactions) (Chamberlin 2004; Schulte et al. 2011). Here we target "active plasticity" of the mitochondria to an acute whole organism heat stress by isolating the mitochondria from the heat stressed animals and then measuring oxygen consumption at a standard body temperature. We specifically ask if heat stress induces changes to the mitochondria reflected in respiration and transcription that represent potential "active plasticity", as opposed to measuring the "passive" effect of temperature on enzymes and reaction rates.

Analyses of reptile mitochondrial sequences suggest rapid evolution of the mitochondrial genome and specifically remodeling of the ETC; this is particularly evident in snakes (Castoe et al. 2008; Dong & Kumazawa 2005; Jiang et al. 2007). These studies suggest that the remodeling of the ETC may be related to changes in physiological and metabolic flexibility and efficiency.
Additionally, reptiles have experienced changes in mitochondrial genomic architecture, rRNA gene rearrangements in many species, and most notably the duplicated control region found in tuatara (Rest et al. 2003), most snakes, (Dong & Kumazawa 2005; Kumazawa et al. 1996, 1998), and some lizards and birds (Eberhard et al. 2001; Moritz & Brown 1986) (Figure 1). Jiang et al. (2007) hypothesized that these modifications may be adaptive for energy conservation and increased plasticity in mitochondrial function and proliferation. In general, the vertebrate mitochondrial genome has polycistronic transcription, such that from the opening of the D-loop in the control region multiple genes are transcribed sequentially, and subsequently (or concurrently) the transcript is spliced into individual genes (or groups of genes) prior to translation (Rorbach & Minczuk 2012; Scheffler 2008). Having a mitochondrial genome with two control regions presents a possible mechanism for these reptiles to dissociate the transcription of the two segments of the


mitochondrial genome. This would allow each segment to be transcribed at a different rate in response to environmental changes. Here we test this hypothesis using quantitative RNA-seq data from a heat stress experiment, mapped to a reconstructed T. elegans mitochondrial genome. In this study we delve fully into the divergent phenotypes and genotypes of mitochondria from individuals of these fast- and slow-pace-of-life ecotypes. Furthermore, we quantify mitochondrial phenotypic plasticity in heat stress response. We hypothesize that the observed organismal evolutionary divergence in life-history traits and physiology of these garter snake ecotypes is underlain by divergence in mitochondrial energetics and mtDNA sequence, and thus that life-history evolution is correlated with mitochondrial evolution. We further hypothesize that the stress responsivity of mitochondria may also be divergent in the degree of plasticity within ecotypes. We address our hypotheses using mitochondrial respirometry, quantitative RNA-seq gene expression data, and nucleotide sequence analysis.

MATERIALS AND METHODS

Samples

Gravid female western terrestrial garter snakes (Thamnophis elegans) were collected from four populations in the vicinity of Eagle Lake (Lassen County), California in June of 2007. Two populations were from lakeshore habitat with L-fast individuals and two populations were from higher elevation, meadow habitat with individuals of the M-slow ecotype. Females were transported to Iowa State University for completion of gestation. Parturition occurred on a single day for each female in August 2007. Offspring (n=105) were raised in this common laboratory environment for 1.2 years until subjected to this heat stress experiment. Standard animal rearing conditions included individual housing in 12 x 6 x 8" plastic boxes with paper substrate, continuous water availability, once-weekly feeding of thawed frozen mice, a 12:12 hr L:D cycle, and a thermal gradient from 24 to 32 °C during the 12 hours of light. At 15 months of age (±3 weeks), litters of siblings were randomly divided between two thermal treatments, 27 °C or 37 °C for two hours. 37 °C is below their critical thermal maximum of ~43 °C (Arnold & Peterson 2002; Huey et al. 1989; Stevenson et al. 1985), but elicits a stress response based on behavior (Stevenson et al. 1985). After two hours at 27 °C or 37 °C, animals were immediately decapitated, exsanguinated and dissected (for details see Chapter 5, Schwartz & Bronikowski In Press).


Physiology: mitochondrial respiration

Upon dissection, half the liver was snap-frozen for RNA analyses, and half the liver was transferred to ice-cold mitochondrial isolation buffer (250 mM sucrose, 5 mM Tris, 2 mM EGTA, pH 7.4). To obtain sufficient mitochondria for physiological measurements, 2 to 3 liver halves were pooled by population (litter when possible), within treatment, for a total of 34 pools of isolated mitochondria. The live mitochondria were isolated from the liver by homogenization and differential centrifugation (Palloti & Lenaz 2001; Robert et al. 2007). The protein concentration of the mitochondrial isolates was determined with a Bradford Assay (Bio-Rad) and used to normalize the assays by the amount of mitochondrial protein. Mitochondrial respiration was determined by measuring oxygen consumption of the liver mitochondria in an airtight chamber with a Clark-type oxygen electrode (Hansatech, Norfolk, UK) based on established procedures (Brand et al. 1993) optimized for reptiles as described in Robert et al. (2007). Respiration for all mitochondrial isolates (regardless of organismal temperature treatment) was measured at 28 °C, the normal preferred organismal temperature (Arnold & Peterson 2002), with a final concentration of 2 mg/mL mitochondrial protein. Succinate was used as a substrate for ETC complex II (i.e., succinate dehydrogenase) and rotenone was used to inhibit complex I (i.e., NADH dehydrogenase), preventing the backward flow of electrons. The rate of oxygen consumption was measured before the addition of ADP (State IV) and after the addition of ADP (State III).

Analyses of the respirometry data were run in SAS/STAT version 9.22 interfaced with Enterprise Guide (version 4.3). We used mixed model ANOVAs (PROC MIXED) using a Residual Maximum Likelihood Estimator (REML) and type 3 SS to account for unbalanced sample sizes. The main effects were Temperature (control 27 °C vs. heat stress 37 °C) and Ecotype (M-slow vs. L-fast). We used the natural log of the average mass of the individuals in the pool as a covariate, and experimental day as a blocking effect. Population nested within Ecotype was a random effect.

Response Variable = µ + (ln Avg_Mass) + (Experiment Day) + (Temperature) + (Ecotype) + (Pop(Ecotype)) + (Temperature * Ecotype) + ε

The response variables included State IV and State III respiration.

Sequence data

Three types of sequence data were used in this study: a Roche 454 normalized RNA-seq dataset; an Illumina quantitative RNA-seq dataset; and Sanger sequences. The 454 RNA-seq dataset derives from sex-specific normalized pools of RNA from seven tissue types and 36 individuals of western terrestrial garter snake (Thamnophis elegans) that were sequenced using pyrosequencing on the Roche 454 Titanium platform (GenBank Sequence Read Archive: SRS007367, SRS007366). See Schwartz et al. (2010) for details on library construction, sequencing, and de novo assembly of the 454 garter snake transcriptome. These data were used to reconstruct the Thamnophis elegans mitochondrial reference sequence (see below).

The Illumina RNA-seq data set consisted of RNA from 13 individual female snake livers: 6 individuals from a single L-fast population (3 control, 3 heat-stress), and 6 individuals from a single M-slow population (3 control, 3 heat-stress). Additionally, a 13th individual, from the same cohort of snakes from the L-fast ecotype (not included in the heat stress experimental design), was also sequenced (GenBank Sequence Read Archive: SRA052923). When possible, unrelated individuals were used (n=10 of 13). Total RNA was isolated from the snap-frozen liver using a slightly modified protocol for RNeasy columns (Qiagen) including the DNA digestion step on the membrane. RNA quality and quantity were determined with a Bioanalyzer RNA Nanochip, and all samples had a RIN greater than 8. For each individual, 10 ng of total RNA were submitted to the Iowa State University DNA Facility for RNA-seq library preparation and 72- or 80-cycle (single read) sequencing on the Illumina GXII platform. Reads were cleaned using a Galaxy workflow to trim by quality (score = 20) and to filter out short reads (≤20 bases) (http://main.g2.bx.psu.edu/u/ts-ecogen/w/groomillumina-solexatrimq20filter20bp). Cleaned reads were assembled using Trinity (Grabherr et al. 2011). A composite reference transcriptome using the Illumina data from all 13 individuals (i.e., 138 million reads) was assembled and used to facilitate the reconstruction of the Thamnophis elegans mitochondrial reference sequence (see below). Additionally, transcriptomes for each of the 13 individuals were assembled for reconstruction of the mitochondrial sequence for each individual (see below).
Sanger sequencing of T. elegans mitochondrial DNA was used to verify four regions of the mitochondrial genome for the reference sequence: control region I, control region II, 16s rRNA, and COX1 (subunit 1 of complex IV: cytochrome c oxidase). DNA was extracted from blood using a salt extraction (Sunnucks & Hales 1996). These four regions were PCR amplified using primers designed from T. elegans flanking regions. Table S2 lists primer sequences and PCR conditions. PCR products were cleaned of excess primers and dNTPs using ExoSAP-IT (USB/Affymetrix, CA) and sent for Sanger sequencing in both directions at the Iowa State University DNA Facility. Additionally, a 2171 bp region spanning from the middle of ND5 to the 3' end of CytB was amplified from DNA and sequenced to verify key nonsynonymous SNPs designating the mitochondrial haplotypes unique to each population.
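
The read-cleaning step described for the Illumina data (trim by quality score 20, then discard reads of 20 bases or fewer) can be illustrated with a minimal Python sketch. This is not the Galaxy workflow itself, whose trimming algorithm is not specified here; the sketch assumes simple trimming from the 3' end, and the function name `trim_and_filter` and its parameters are hypothetical.

```python
def trim_and_filter(reads, min_quality=20, min_length=21):
    """Trim low-quality 3' ends, then drop reads that are too short.

    reads: iterable of (sequence, qualities) pairs, where qualities is a
    list of integer Phred scores, one per base.
    Assumption: bases are trimmed from the 3' end only, until the first
    base with quality >= min_quality is reached.
    """
    cleaned = []
    for seq, quals in reads:
        end = len(seq)
        # Walk back from the 3' end while the base quality is below threshold.
        while end > 0 and quals[end - 1] < min_quality:
            end -= 1
        trimmed = seq[:end]
        # The text filters out reads of <= 20 bases, so keep 21 or more.
        if len(trimmed) >= min_length:
            cleaned.append((trimmed, quals[:end]))
    return cleaned


# Hypothetical reads: the second is trimmed to exactly 20 bases and dropped.
example_reads = [
    ("A" * 25, [30] * 25),
    ("C" * 30, [30] * 20 + [10] * 10),
    ("G" * 30, [30] * 25 + [10] * 5),
]
kept = trim_and_filter(example_reads)
```

Trimming before filtering matters: a read that passes the length filter untrimmed may fall at or below 20 bases once its low-quality tail is removed.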


Reconstructing the Thamnophis elegans mitochondrial reference sequence

We first used the RNA sequence data to reconstruct a reference T. elegans mitochondrial sequence. To identify the mitochondrial sequences from the two RNA-seq datasets, we created a consensus mitochondrial genome from the four colubrid snakes most closely related to T. elegans with mitochondrial genomes available in GenBank at the time of analyses: Elaphe poryphyracea (NC_012770) (Jiang et al. 2007); Thermophis zhaoermii (NC_012816) (He et al. 2010); Nerodia sipedon (NC_015793); and Achalinus meiguensis (NC_011576). The colubrid snake mitochondrial genomes were aligned in the program GENEIOUS (Drummond et al. 2009) to create a reference colubrid mitochondrial genome sequence. In an attempt to capture as many mitochondrial DNA sequences as possible, we used this colubrid consensus and the mitochondrial genome from the most closely related species, Nerodia, to search (blastn) the 454 and Illumina reference T. elegans transcriptomes for mitochondrial sequences that were then mapped back to the colubrid consensus and Nerodia mitochondrial genomes. Using this procedure we were able to create a draft of a T. elegans mitochondrial reference sequence which was missing only the two control regions, 16s rRNA, and COX1; therefore, we used the Sanger sequences to fill in these gaps (described above). The final T. elegans mitochondrial consensus sequence was verified by remapping back to the two RNA-seq transcriptomes (the 454 and the composite Illumina) and through comparison to the T. elegans sequences for the five mitochondrial genes that were available in GenBank: 12s rRNA (AF402642) (Alfaro & Arnold 2001), CYTB (AF420113), ND1 (AF420114), ND2 (AF420115), and ND4 (AF420116) (de Queiroz et al. 2002). The reconstructed T. elegans mitochondrial sequence was annotated based on the four colubrid mitochondrial genomes. Annotation was verified using DOGMA (Wyman et al. 2004) and tRNA-scan (Schattner et al. 2005). Annotation for each protein-coding gene was verified with each start and stop codon.
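
As an illustration of how a consensus can be derived from aligned genomes such as the four colubrid sequences, the following Python sketch takes a simple per-column majority vote. It is only a sketch under stated assumptions: the consensus rule actually applied by GENEIOUS is not described here, and the function name `majority_consensus`, the tie-handling rule (ties become N), and the toy sequences are all hypothetical.

```python
from collections import Counter


def majority_consensus(aligned, gap="-", unknown="N"):
    """Return a majority-rule consensus of equal-length aligned sequences.

    Assumptions (not from the source): gaps are ignored when voting,
    a column of only gaps stays a gap, and ties are reported as 'N'.
    """
    if not aligned:
        return ""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")

    consensus = []
    for column in zip(*aligned):
        bases = [base.upper() for base in column if base != gap]
        if not bases:
            consensus.append(gap)
            continue
        counts = Counter(bases).most_common()
        top_base, top_count = counts[0]
        # A tie between the two most common bases is left unresolved.
        if len(counts) > 1 and counts[1][1] == top_count:
            consensus.append(unknown)
        else:
            consensus.append(top_base)
    return "".join(consensus)


# Toy alignment of four hypothetical sequences.
toy_consensus = majority_consensus(["ACGT", "ACGA", "ACCT", "A-GT"])
```

Leaving ties as N rather than picking a base arbitrarily keeps ambiguous positions visible when the consensus is later used as a search query.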

Reconstructing individual mitochondrial transcriptomes from Illumina RNA-seq data

We used GENEIOUS to search (blastn) for the T. elegans mitochondrial reference sequence in each of the 10 unrelated individual RNA-seq transcriptomes. The identified sequences were assembled against the T. elegans mitochondrial reference to create a consensus mitochondrial sequence for each individual that typically contained all the protein-coding and rRNA genes, but often lacked many of the tRNAs and large portions of the control regions (Figure S1).


Evaluating genetic diversity

The mitochondrial transcriptomes from these ten unrelated individuals were aligned using GENEIOUS. The two control regions and many of the tRNAs had a considerable amount of missing data among individuals, so these regions were removed from the multiple sequence alignment; the final dataset included the two rRNAs and the 13 protein coding genes. SNPs were identified and characterized as either transitions or transversions, and either synonymous or nonsynonymous. These mitochondrial haplotypes were used to estimate genetic divergence between the two populations (one from each ecotype) based on Tamura and Nei’s model for φST, in ARLEQUIN (Excoffier et al. 2005). A haplotype network depicting the relatedness of the haplotypes was created using NETWORK (Bandelt et al. 1999).
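
The SNP characterization described above (transition vs. transversion, synonymous vs. nonsynonymous) can be sketched in Python as follows. This is an illustrative sketch, not the procedure used in GENEIOUS; it assumes the vertebrate mitochondrial genetic code (NCBI translation table 2), and the function names and example codons are hypothetical.

```python
# Vertebrate mitochondrial genetic code (NCBI translation table 2),
# with codon positions ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
PURINES = {"A", "G"}


def substitution_type(ref_base, alt_base):
    """Classify a single-base change as a transition or a transversion."""
    if ref_base == alt_base:
        raise ValueError("reference and alternate bases are identical")
    # Purine<->purine or pyrimidine<->pyrimidine changes are transitions.
    same_class = (ref_base in PURINES) == (alt_base in PURINES)
    return "transition" if same_class else "transversion"


def coding_effect(ref_codon, alt_codon):
    """Classify a codon change as synonymous or nonsynonymous."""
    if CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]:
        return "synonymous"
    return "nonsynonymous"


# Hypothetical example: ATT (Ile) -> ACT (Thr) is a T->C transition
# at the second codon position and changes the amino acid.
example = (substitution_type("T", "C"), coding_effect("ATT", "ACT"))
```

Using the mitochondrial rather than the standard code matters for a few codons: under table 2, TGA encodes tryptophan, ATA encodes methionine, and AGA/AGG are stop codons.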

Quantitative gene expression

To address whether the two control regions allow for plasticity in transcriptional regulation of the T. elegans mitochondrial genome, the T. elegans consensus mitochondrial sequence was divided into the two segments separated by the duplicated control regions (Figure 1). For each of the 12 individuals from the heat-stress experiment for which we had quantitative Illumina RNA-seq data, the cleaned reads were mapped to (1) the T. elegans reference transcriptome, (2) the two segments of the mitochondrial genome, and (3) each mitochondrial gene individually, using Bowtie (Langmead et al. 2009) within a Galaxy workflow (Goecks et al. 2010): http://main.g2.bx.psu.edu/u/ts-ecogen/w/map-illumina-indiv-to-candidate-transcripts-and-count. The total number of reads mapped to each of the two mitochondrial segments was normalized by dividing by the total number (millions) of reads that were able to be mapped to the whole transcriptome (mappable reads) for that individual, and then dividing by the length (kilobases) of the mitochondrial segment. The normalized value is the number of reads mapped to a segment, per million mappable reads for that individual, per kilobase of the mitochondrial segment. To compare the relative expression of each part of the mitochondrial genome, a ratio of normalized expression of Segment 1: Segment 2 was calculated. We then used mixed model ANOVAs as described above with the following model:

Response Variable = µ + (Temperature Treatment) + (Ecotype) + (Temperature Treatment * Ecotype) + ε

For gene-by-gene analysis of mitochondrial genome expression, we used the R statistical package edgeR (Robinson et al. 2010; Robinson & Smyth 2007) within Bioconductor and utilized the GLM method in detecting differentially expressed genes (McCarthy et al. 2012) between ecotypes (L-fast vs. M-slow), temperatures (27 vs. 37 °C), and their interaction. A false discovery rate of 0.1 was used to assess significance.

To gain insight into transcription initiation, termination, and splicing, the control regions were searched for transcription initiation and termination sites homologous to mammals and chickens using blastn. Additionally, in an attempt to visualize where transcripts may be initiating (i.e., from one or both control regions), and if transcription extends across the control regions and gene boundaries, the 454 reads and the reads from one Illumina dataset were mapped to reference sequences representing potential transcripts from the mitochondria using the workflow described above: (1) Segment 1 with Control Region I, and (2) the beginning of Segment 2 with Control Region II. The BAM files containing the reads aligned to the references were visually inspected in GENEIOUS for reads that extended over gene-gene boundaries and gene-control region boundaries.
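
The segment normalization described in the quantitative gene expression methods (reads per million mappable reads, per kilobase of segment) and the Segment 1: Segment 2 ratio can be written out as a short Python sketch. The function names are hypothetical and every number in the example is invented purely to show the arithmetic; none is taken from the data.

```python
def normalized_expression(segment_reads, mappable_reads, segment_length_bp):
    """Reads mapped to a segment, per million mappable reads, per kilobase.

    segment_reads: reads mapped to the mitochondrial segment
    mappable_reads: reads mapped to the whole transcriptome for the individual
    segment_length_bp: length of the mitochondrial segment in base pairs
    """
    millions_of_mappable_reads = mappable_reads / 1_000_000
    segment_kilobases = segment_length_bp / 1_000
    return segment_reads / millions_of_mappable_reads / segment_kilobases


def segment_ratio(segment1_expression, segment2_expression):
    """Relative expression of Segment 1 to Segment 2."""
    return segment1_expression / segment2_expression


# Invented counts and lengths for one hypothetical individual.
seg1 = normalized_expression(50_000, 10_000_000, 2_500)
seg2 = normalized_expression(400_000, 10_000_000, 10_000)
ratio = segment_ratio(seg1, seg2)
```

Because both segments are divided by the same number of mappable reads for a given individual, sequencing depth cancels out of the ratio, which is why the ratio is a convenient within-individual measure of relative expression.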

RESULTS

Mitochondrial respiration

The rate of oxygen consumption during State IV respiration ranged from 0.5–15 nmol O2 min⁻¹ mg⁻¹ (average = 7.6). State IV respiration showed no overall effects of treatment, ecotype, or their interaction. A post-hoc contrast comparison indicated that at 27 °C (i.e., the non-stressful temperature) M-slow mitochondria had significantly higher rates of oxygen consumption than L-fast mitochondria (LS Mean ± SE nmol O2 min⁻¹ mg⁻¹ at 27 °C: M-slow Control = 9.8±0.8, L-fast Control = 7.7±1.2) (Figure 2A, Table 2). The rate of oxygen consumption during ADP to ATP conversion (State III respiration) ranged from 10.0–63.0 nmol O2 min⁻¹ mg⁻¹ (average = 26.6). The 37 °C heat-stress treatment caused an increase in the rate of oxygen consumption in State III respiration (measured at 28 °C) for both ecotypes (LS Mean ± SE nmol O2 min⁻¹ mg⁻¹: 27 °C = 21.5±2.2, 37 °C = 28.3±2.5) (Figure 2B, Table 1). That this significant difference occurred at our measurement temperature of 28 °C indicates that heat stress caused a change in the metabolic capacity of the mitochondria and not simply an increase in enzyme activity with increasing temperature.

Sequence data

The consensus T. elegans mitochondrial reference is represented in Figure 1. Sanger sequencing confirmed the presence of a second control region (CRII) between ND1 and ND2, with 96% pairwise identity to CRI, suggesting it is under concerted evolution as has been found in most other snakes (Figure 1) (Kumazawa et al. 1996). Interestingly, the variation between the two control regions was focused at either end, which is where transcription regulatory sites are located in mammals: the first 150 bp at the 5' end, which contains a transcription termination site, and the last 21 bp at the 3' end, which contains the transcription initiation site (Figure S2). For each of the 13 individual Illumina datasets, 0.21-0.67% of the Illumina reads mapped to the mitochondrial reference sequence, which represented ~76% of the mitochondrial genome; typically the missing regions were parts of the tRNAs and/or the control regions. The processed tRNAs were likely eliminated from the sequence pool during the size selection step as part of the sequence library preparation. The missing control region sequences were due to the low abundance of transcripts across the region(s).

Gene expression

Mitochondrial gene expression shows increased Segment 1 expression (rRNAs) with heat stress.

The quantitative expression data indicate that the expression of the two segments of the mitochondrial genome is regulated independently in response to thermal environment. The ratio of Segment 1 (containing the rRNA genes) to Segment 2 (containing all the protein genes except ND1) was nearly doubled under heat stress (LS Mean of Segment 1: Segment 2 ± SE: 27 °C = 0.7 ± 0.08; 37 °C = 1.3 ± 0.08) (Figure 3A, Table 2). Under heat stress, Segment 1 of the mitochondrial genome, containing the ribosomal RNAs, significantly increased in expression (LS Mean ± SE: 27 °C = 3812 ± 334, 37 °C = 4252 ± 334), whereas expression of Segment 2 decreased, although not significantly (Figure 3A, Table 2). This result was consistent across both ecotypes. Analyses of each gene within the mitochondrial genome showed that the expression of the two ribosomal genes significantly increased in response to heat stress. Yet ND1, also on Segment 1, did not change (12s rRNA: 2.25 log fold change, p<0.001, q = 0.059; 16s rRNA: 2.0 log fold change, p<0.005, q = 0.1; ND1: 0.8 log fold change, p=0.20, q=0.34). No other mtDNA genes significantly changed in expression level with heat stress. Further evaluation of our alignments showed that reads continued across the 16s rRNA / ND1 boundary, which suggests that, in contrast to mammals, these snakes do not have a termination sequence at the end of the 16s rRNA gene (Figure S3). Rather, the transcripts appear to terminate after ND1, although a termination sequence similar to mammals or chickens was not found in this area either. These results suggest that the entire Segment 1 is expressed as a single transcript, and the differential expression of the rRNAs versus ND1 is due to subsequent degradation of the ND1 mRNA. This hypothesis will need to be tested in vitro.


Ecotypes differ in expression level of mtDNA.

Intriguingly, ecotype and temperature interacted significantly to influence the expression of the mitochondrial genome overall. At 27 °C, a normal body temperature for these snakes, the liver of the L-fast animals had significantly lower expression relative to M-slow animals. Under heat stress (37 °C), L-fast animals increased mitochondrial transcriptome expression whereas M-slow remained the same (LS Mean ± SE of counts: L-fast 27 °C = 2836 ± 354, L-fast 37 °C = 4397 ± 354, M-slow 27 °C = 4417 ± 354, M-slow 37 °C = 3874 ± 354) (Figure 3B, Table 2). A relatively small number of reads mapped to the control regions, suggesting they are transcribed at a low level. Due to the high sequence similarity we were unable to assign most of the reads to a particular control region. However, by evaluating only the ends of the control regions where they are distinguishable, we determined that in the one (heat stressed) individual we evaluated, 2446 and 709 reads could be uniquely assigned to CRI and CRII respectively, providing additional support that Segment 1 (starting from CRI) is expressed at a higher level under stress. Interestingly, the last 250 bp region on the 3' end of both control regions showed a higher level of expression, suggesting that the transcription initiation site may be located ~250-300 bp from the 3' end of the control regions (Figure S3). These results provide further support that both control regions are functional in initiating transcription, although this too will need to be confirmed in vitro.

Thamnophis elegans mitochondrial haplotypes are population specific

Using the 13 protein coding genes and the two rRNA genes for which we had complete coverage for all 10 unrelated individuals, we identified eight variable sites that characterized four mitochondrial haplotypes, two in each population (Figure 4, Figure S1). These sites were verified with Sanger sequencing of four individuals, each representing a haplotype. Three of the seven sites that were in protein coding regions were nonsynonymous. These were in the ND5 and CYTB proteins: Isoleucine -> Threonine at ND5 amino acid site 470; Serine -> Asparagine at ND5 amino acid site 510; and Isoleucine -> Threonine at CytB amino acid site 325 (Table 3). These three nonsynonymous mutations were confirmed with Sanger sequencing of the DNA. Using the mitochondrial haplotype frequencies, we calculated divergence between the populations based on Tamura and Nei’s model: φST was 0.429 (p-value <0.05). The haplotype network in Figure 4 represents the relationship among the mitochondrial haplotypes.

DISCUSSION

Mitochondrial function is centrally important to all life processes and is likely a physiological target of natural selection. Thus the evolution, evolvability, and the plasticity of mitochondrial energetics may determine whether animal populations will adapt to and survive in this changing world (Visser 2008). Our goal with this study was to undertake a comprehensive approach that integrates physiology, ecology, and evolutionary genetics to advance the understanding of mitochondrial regulation and energetics, and to test hypotheses specifically related to mitochondrial energetics and response to thermal stress.

Active response of mitochondrial energetics to heat stress

Heat stress increased the rate of oxygen consumption during ATP production from ADP, and the expression of rRNA from the mitochondrial genome. Because we isolated mitochondria from the liver subsequent to the whole organism heat stress and conducted our respirometry assays at 28 °C, we interpret the increase in State III respiration to be an active response as opposed to a passive response of increased temperature on the thermal dynamics of enzymatic reactions. Hence, our data suggest that exposure to heat stress functionally changed the mitochondria to consume oxygen faster, similar to what has been seen in chicken mitochondria post heat-stress (Mujahid et al. 2009). Suggested mechanisms from other species of such functional changes include: rapid fatty acid compositional changes of the lipid membrane, which affect the activity of membrane-bound proteins (Hulbert & Else 1999); protein post-translational modification (e.g., phosphorylation, acetylation) (Koc & Koc 2012; Pagliarini & Dixon 2006); and changes in protein composition of the mitochondria (Christian & Spremulli 2012). For example, phospholipid composition of salmon kidney plasma membranes can change within hours of a temperature change (Hazel & Landrey 1988). Protein phosphorylation to regulate protein activity in response to cold air exposure has been documented to occur within 5-10 minutes in human lung cells (Park & Jang 2011). Likewise, the Arctic clam, M. mercenaria, shows increased phosphorylation of mitochondrial proteins within the first 4 hours of heat induction (Ulrich & Marsh 2009). Changes in protein composition of the mitochondrial membranes, matrix, or intermembrane space can occur through rapid import of proteins from the cytoplasm, protein degradation, and increased translation within the mitochondria (Christian & Spremulli 2012). 
Our finding that heat stress increased the transcription of rRNAs relative to mRNAs of the mitochondrial genome lends further support to the importance of increased translation of one or all of the 13 proteins encoded by the mitochondrial genome (Blomain & McMahon 2012). This result of increased mitochondrial rRNA expression in response to stress is in stark contrast to what has been documented to occur in the mammalian nucleus and cytoplasm, where rRNA transcription and overall translation are inhibited in response to stress (Brostrom & Brostrom 1997; Mayer et al. 2005). Multiple mechanisms of transcriptional regulation in response to cellular changes, such as metabolic stress, have been documented in mammalian systems (Blomain & McMahon 2012;


Scarpulla 2012). One of these signals is ATP concentration (Amiott & Jaehning 2006), which suggests that the increased rRNA expression could be in response to the increased respiration in State III, and thus a sudden increase in ATP concentration within mitochondria under heat stress. Additionally, in mammals, regulation of mitochondrial transcription responds to levels of steroid and thyroid hormones. Glucocorticoid response elements (GRE) have been identified on the mitochondrial genome, indicating these hormones may directly regulate mitochondrial transcription (Demonacos et al. 1995; Mutvei et al. 1989). Hence, the increase in corticosterone with heat stress that has been documented in these garter snakes (Schwartz & Bronikowski In Press) is another potential mechanism of translating the cellular stress to mitochondrial transcriptional regulation. Having a duplicated control region raises the intriguing possibility that the two copies function independently to allow differential expression of the mitochondrial segments corresponding to the rRNA genes and the protein-coding genes. Jiang et al. (2007) proposed that decoupling rRNA and protein-coding genes by differential regulation of the duplicated control regions may be a method to offset the thermodynamic effects of temperature on enzymatic rates, in this case up-regulating translation via an increase in rRNA but not necessarily transcription of the protein-coding genes. Gene-by-gene analyses indicate the two rRNA genes are up-regulated but the ND1 gene is not. In mammals, transcription initiates at three sites within the control region, referred to as H1, H2, and L1. The H strand initiation sites have differential activity (Bonawitz et al. 2006), where H1 creates a transcript containing the ribosomal RNAs that terminates after 16S rRNA, while transcription initiation at the H2 site produces a full-length transcript of the mitochondrial genome (see Figure 1). 
However, evaluation of the RNA-seq reads that mapped to this region shows that transcription does not terminate after 16s rRNA in the garter snake, nor did a blastn search identify a termination site here. This suggests that Segment 1 as a whole is up-regulated, with ND1 mRNA being subsequently degraded. The question then turns to whether CRI and CRII are both functionally active and can differentially regulate transcription in response to stress. The high sequence conservation of the duplicated control regions in T. elegans suggests that both are functional, and the diversity at the ends of the regions (where transcription initiation, inhibition, or termination factors are likely to bind) provides the potential for differential regulation of transcription from the two control regions. Functional studies of transcription binding factors and in vitro transcription initiation are necessary to more fully resolve this issue.


Mitochondrial divergence between life-history ecotypes

Co-evolved life-history/physiology syndromes are an ecological model for understanding how populations diverge and adapt to environmental differences in their local habitats. In this study, we used garter snake ecotypes that have previously been documented to diverge in multiple aspects of their reproduction, longevity, and metabolic physiology. We have found that the garter snake ecotypes also differed in their mitochondrial phenotype. Considering ecotype differences under normal laboratory temperature, M-slow mitochondria had higher resting respiration and higher mitochondrial gene expression relative to L-fast mitochondria. Elevated levels of mitochondrial transcripts are thought to represent increased ETC complexes and thus increased metabolic capacity (Fangue et al. 2009). In addition, L-fast mitochondria exhibited increased gene expression after exposure to a thermal stress. These results complement findings from previous studies on these ecotypes demonstrating differences in metabolic physiology. Schwartz and Bronikowski (In Press) demonstrated that individuals of the M-slow ecotype have higher levels of the circulating reactive oxygen species (H2O2 and superoxide) at 27 °C than L-fast individuals. Furthermore, free radicals were lower in M-slow individuals, but higher in L-fast individuals, under heat stress relative to control. Given that State IV respiration is thought to have the highest production of these oxygen species (reviewed in: Scheffler 2008), our result that M-slow animals at normal temperature have both increased mitochondrial respiration at State IV and increased levels of ROS production is consistent, because an increased rate of oxygen consumption that is not producing ATP could result in increased ROS production. Additionally, Robert and Bronikowski (2010) showed that neonates of the M-slow ecotype have more efficient ATP production by the mitochondria. Also, Bronikowski and Vleck (2010) demonstrated that adults of the M-slow ecotype have a lower mass-specific metabolic rate across normal activity temperatures (15-32 °C) relative to the L-fast ecotype. Together, these studies reinforce the conclusion of functional mitochondrial divergence between the ecotypes. While our focus has been on mitochondrial dynamics and gene regulation, we also document population genetic divergence in mitochondrial haplotypes, such that two haplotypes are specific to each ecotype. This results in high estimates of genetic divergence between these populations (φST was 0.429, p-value <0.05). This is dramatic considering that previous population genetic studies across this landscape, using nuclear microsatellites, indicate small levels of gene flow within and among ecotype populations (average FST ~ 0.02 among populations across the landscape, and 0.07 between these specific populations) (Manes et al. In Revision with Encouragement to Resubmit; Manier & Arnold 2005; Manier & Arnold 2006; Manier et al. 2007). These results suggest potential selection for mitochondrial genomes in the specific populations/habitats despite low levels of nuclear gene


flow. The nonsynonymous mutations that define the haplotypes are found in ND5 and CYTB. ND5 is part of Complex I (NADH dehydrogenase), which is one initiation point for the ETC. CytB is part of Complex III (cytochrome c reductase), which transfers electrons from ubiquinol to cytochrome c. These complexes are known to produce the majority of reactive oxygen species in the mitochondria (Scheffler 2008), which is interesting in light of the differences in the levels of reactive oxygen species documented between these ecotypes. It is important to note that our measurement of mitochondrial respiration initiates at complex II of the ETC (succinate dehydrogenase); thus this measurement does not reflect any functional differences in complex I (NADH dehydrogenase), which possessed two of the three nonsynonymous mutations among the haplotypes. Although our haplotype data are from only two populations, one of each ecotype, these results are striking and warrant further investigation to determine if these haplotypes are fixed between these populations, and potentially the ecotypes, and if the mutations are adaptive and under divergent selection in the particular habitats, as has been documented in Drosophila. Mitochondrial haplotypes in Drosophila, distinguishable by just a few variants, are also correlated with different life-history trade-offs: egg size and fecundity, thermal tolerance, rates of H2O2 production, and lifespan (Aw et al. 2011; Ballard et al. 2007; Clancy 2008). Furthermore, the Drosophila haplotypes provide different advantages across environmental gradients and ultimately affect fitness (Ballard et al. 2007; James & Ballard 2003). Alternatives to diversifying selection for explaining the patterns of haplotype divergence between the garter snake ecotypes are sex-specific migration among the ecotype populations (Roberts et al. 2004; Yannic et al. 2012) and drift resulting from weakened strength of selection on the mitochondrial genome (Chong & Mueller 2012). 
More extensive sequencing of mitochondrial haplotypes, as well as the nuclear components of the ETC complexes, within and across populations from each ecotype is ongoing to properly test for evidence of selection for specific haplotypes in particular habitats.

Conclusion
In summary, we have demonstrated that a whole-organism 2-hour heat stress on garter snakes increases the gene expression of mitochondrial rRNAs and causes functional changes to the mitochondria that affect the rate of oxygen consumption in State III respiration. We provide the first evidence that differential transcriptional regulation via the duplicated control regions in the snake mitochondrial genome may provide a mechanism for transcriptional plasticity in response to environmental stress. Finally, we provide additional evidence for mitochondrial and metabolic


divergence between these garter snake life history ecotypes at three levels of biological complexity: mitochondrial respiration, mitochondrial transcription, and the mitochondrial genome sequence.

ACKNOWLEDGEMENTS
Funding was provided by the National Science Foundation: an IGERT Fellowship in Computational Biology to TSS, a GK-12 Fellowship to TSS, and NSF IOS–0922528 to AMB. Funding and support were also provided through the ISU Baker Center Research Experience for Undergraduates in Computational and Systems Biology Fellowship to ZWA, the ISU Center for Integrated Animal Genomics to AMB, and the Iowa Academy of Sciences to TSS and AMB.

REFERENCES
Amiott EA, Jaehning JA (2006) Mitochondrial Transcription Is Regulated via an ATP "Sensing" Mechanism that Couples RNA Abundance to Respiration. Molecular Cell 22, 329-338. Amo T, Brand MD (2007) Were inefficient mitochondrial haplogroups selected during migrations of modern humans? A test using modular kinetic analysis of coupling in mitochondria from cybrid cell lines. Biochemical Journal 404, 345-351. Arnold SJ, Peterson CR (2002) A model for optimal reaction norms: The case of the pregnant garter snake and her temperature-sensitive embryos. American Naturalist 160, 306-316. Attardi G, Schatz G (1988) Biogenesis of Mitochondria. Annual Review of Cell Biology 4, 289-331. Aw WC, Correa CC, Clancy DJ, Ballard JWO (2011) Mitochondrial DNA variants in Drosophila melanogaster are expressed at the level of the organismal phenotype. Mitochondrion 11, 756-763. Ballard JWO, Melvin RG, Katewa SD, Maas K, Nachman M (2007) Mitochondrial DNA variation is associated with measurable differences in life-history traits and mitochondrial metabolism in Drosophila simulans. Evolution 61, 1735-1747. Bandelt HJ, Forster P, Röhl A (1999) Median-joining networks for inferring intraspecific phylogenies. Molecular Biology and Evolution 16, 37-48. Blier PU, Dufresne F, Burton RS (2001) Natural selection and the evolution of mtDNA-encoded peptides: evidence for intergenomic co-adaptation. Trends in Genetics 17, 400-406. Blier PU, Guderley HE (1993) Mitochondrial activity in rainbow trout red muscle: the effect of temperature on the ADP-dependence of ATP synthesis. Journal of Experimental Biology 176, 145-158. Blomain ES, McMahon SB (2012) Dynamic regulation of mitochondrial transcription as a mechanism of cellular adaptation. Biochimica et Biophysica Acta (BBA) - Gene Regulatory Mechanisms 1819, 1075-1079.


Bonawitz ND, Clayton DA, Shadel GS (2006) Initiation and Beyond: Multiple Functions of the Human Mitochondrial Transcription Machinery. Molecular Cell 24, 813-825. Brand MD, Harper M-E, Taylor HC (1993) Control of the effective P/O ratio of oxidative phosphorylation in liver mitochondria and hepatocytes. Biochemical Journal 291, 739-748. Bronikowski AM (2000) Experimental evidence for the adaptive evolution of growth rate in the garter snake Thamnophis elegans. Evolution 54, 1760-1767. Bronikowski AM, Arnold SJ (1999) The evolutionary ecology of life history variation in the garter snake Thamnophis elegans. Ecology 80, 2314-2325. Bronikowski AM, Vleck D (2010) Evolutionarily divergent physiology and life span: A case study in the garter snake (Thamnophis elegans). Integrative and Comparative Biology 50, 880-887. Brostrom CO, Brostrom MA (1997) Regulation of Translational Initiation during Cellular Responses to Stress. In: Progress in Nucleic Acid Research and Molecular Biology (ed. Kivie M), pp. 79-125. Academic Press. Castoe T, Jiang Z, Gu W, Wang Z, Pollock D (2008) Adaptive evolution and functional redesign of core metabolic proteins in snakes. PLoS ONE 3, e2201. Chamberlin ME (2004) Top-down control analysis of the effect of temperature on ectotherm oxidative phosphorylation. American Journal of Physiology - Regulatory, Integrative and Comparative Physiology 287, R794-R800. Chong RA, Mueller RL (2012) Low metabolic rates in salamanders are correlated with weak selective constraints on mitochondrial genes. Evolution, In press. Christian BE, Spremulli LL (2012) Mechanism of protein biosynthesis in mammalian mitochondria. Biochimica et Biophysica Acta (BBA) - Gene Regulatory Mechanisms 1819, 1035-1054. Clancy DJ (2008) Variation in mitochondrial genotype has substantial lifespan effects which may be modulated by nuclear background. Aging Cell 7, 795-804.
de Queiroz A, Lawson R, Lemos-Espinal JA (2002) Phylogenetic Relationships of North American Garter Snakes (Thamnophis) Based on Four Mitochondrial Genes: How Much DNA Sequence Is Enough? Molecular Phylogenetics and Evolution 22, 315-329. Demonacos C, Djordjevic-Markovic R, Tsawdaroglou N, Sekeris CE (1995) The mitochondrion as a primary site of action of glucocorticoids: the interaction of the glucocorticoid receptor with mitochondrial DNA sequences showing partial similarity to the nuclear glucocorticoid responsive elements. The Journal of Steroid Biochemistry and Molecular Biology 55, 43-55. Doi A, Suzuki H, Matsuura ET (1999) Genetic analysis of temperature-dependent transmission of mitochondrial DNA in Drosophila. Heredity 82, 555-560.


Dong S, Kumazawa Y (2005) Complete mitochondrial DNA sequences of six snakes: Phylogenetic relationships and molecular evolution of genomic features. J Mol Evol 61, 12-22. Drummond AJ, Ashton B, Cheung M, et al. (2009) Geneious v4.7, Available from http://www.geneious.com/. Eberhard JR, Wright TF, Bermingham E (2001) Duplication and concerted evolution of the mitochondrial control region in the parrot Amazona. Molecular Biology and Evolution 18, 1330-1342. Excoffier L, Laval G, Schneider S (2005) Arlequin ver. 3.0: An integrated software package for population genetics data analysis. Evolutionary Bioinformatics Online 1, 47-50. Fangue NA, Richards JG, Schulte PM (2009) Do mitochondrial properties explain intraspecific variation in thermal tolerance? Journal of Experimental Biology 212, 514-522. Flight PA, Nacci D, Champlin D, Whitehead A, Rand DM (2011) The effects of mitochondrial genotype on hypoxic survival and gene expression in a hybrid population of the killifish, Fundulus heteroclitus. Molecular Ecology 20, 4503-4520. Fontanillas P, Dépraz A, Giorgi MS, Perrin N (2005) Nonshivering thermogenesis capacity associated to mitochondrial DNA haplotypes and gender in the greater white-toothed shrew, Crocidura russula. Molecular Ecology 14, 661-670. Glanville EJ, Murray SA, Seebacher F (2012) Thermal adaptation in endotherms: climate and phylogeny interact to determine population-level responses in a wild rat. Functional Ecology 26, 390-398. Goecks J, Nekrutenko A, Taylor J, Team TG (2010) Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences. Genome Biology 11, R86. Grabherr MG, Haas BJ, Yassour M, et al. (2011) Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotech 29, 644-652. Hazel JR, Landrey SR (1988) Time course of thermal adaptation in plasma membranes of trout kidney. I. Headgroup composition.
American Journal of Physiology - Regulatory, Integrative and Comparative Physiology 255, R622-R627. He M, Feng J, Zhao E (2010) The complete mitochondrial genome of the Sichuan hot-spring keel-back (Thermophis zhaoermii; Serpentes: ) and a mitogenomic phylogeny of the snakes. Mitochondrial DNA 21, 8-18. Huey RB, Peterson CR, Arnold SJ, Porter WP (1989) Hot rocks and not-so-hot rocks - retreat-site selection by garter snakes and its thermal consequences. Ecology 70, 931-944.


Hulbert AJ, Else PL (1999) Membranes as Possible Pacemakers of Metabolism. Journal of Theoretical Biology 199, 257-274. James AC, Ballard JWO (2003) Mitochondrial Genotype Affects Fitness in Drosophila simulans. Genetics 164, 187-194. Jiang Z, Castoe T, Austin C, et al. (2007) Comparative mitochondrial genomics of snakes: extraordinary substitution rate dynamics and functionality of the duplicate control region. BMC Evolutionary Biology 7, 123. Kephart DG, Arnold SJ (1982) Garter snake diets in a fluctuating environment - a 7-year study. Ecology 63, 1232-1236. Koc EC, Koc H (2012) Regulation of mammalian mitochondrial translation by post-translational modifications. Biochimica et Biophysica Acta (BBA) - Gene Regulatory Mechanisms 1819, 1055-1066. Kumazawa Y, Ota H, Nishida M, Ozawa T (1996) Gene rearrangements in snake mitochondrial genomes: Highly concerted evolution of control-region-like sequences duplicated and inserted into a tRNA gene cluster. Mol Biol Evol 13, 1242-1254. Kumazawa Y, Ota H, Nishida M, Ozawa T (1998) The complete nucleotide sequence of a snake (Dinodon semicarinatus) mitochondrial genome with two identical control regions. Genetics 150, 313-329. Langmead B, Trapnell C, Pop M, Salzberg S (2009) Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biology 10, R25. Manes M, Schwartz TS, Robert KA, Bronikowski AM (In Revision with Encouragement to Resubmit) Covariation in levels of multiple paternity and life-history strategy in closely related populations of garter snakes: implications for mating system evolution. Molecular Ecology. Manier MK, Arnold SJ (2005) Population genetic analysis identifies source-sink dynamics for two sympatric garter snake species (Thamnophis elegans and Thamnophis sirtalis). Molecular Ecology 14, 3965-3976. Manier MK, Arnold SJ (2006) Ecological correlates of population genetic structure: a comparative approach using a vertebrate metacommunity.
Proceedings of the Royal Society of London Series B 273, 3001-3009. Manier MK, Seyler CM, Arnold SJ (2007) Adaptive divergence within and between ecotypes of the terrestrial garter snake, Thamnophis elegans, assessed with F-ST-Q(ST) comparisons. Journal of Evolutionary Biology 20, 1705-1719.


Mayer C, Bierhoff H, Grummt I (2005) The nucleolus as a stress sensor: JNK2 inactivates the transcription factor TIF-IA and down-regulates rRNA synthesis. Genes & Development 19, 933-941. McCarthy DJ, Chen Y, Smyth GK (2012) Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation. Nucleic Acids Research. Miller DA, Clark WR, Arnold SJ, Bronikowski AM (2011) Stochastic population dynamics and life-history evolution in the western terrestrial garter snake. Ecology In Press. Moritz C, Brown W (1986) Tandem duplication of D-loop and ribosomal RNA sequences in lizard mitochondrial DNA. Science 233, 1425-1427. Mujahid A, Akiba Y, Toyomizu M (2009) Olive oil-supplemented diet alleviates acute heat stress-induced mitochondrial ROS production in chicken skeletal muscle. American Journal of Physiology - Regulatory, Integrative and Comparative Physiology 297, R690-R698. Mutvei A, Kuzela S, Nelson BD (1989) Control of mitochondrial transcription by thyroid hormone. European Journal of Biochemistry 180, 235-240. Pagliarini DJ, Dixon JE (2006) Mitochondrial modulation: reversible phosphorylation takes center stage? Trends in Biochemical Sciences 31, 26-34. Palacios MG, Sparkman AM, Bronikowski AM (2012) Corticosterone and pace of life in two life-history ecotypes of the garter snake Thamnophis elegans. General and Comparative Endocrinology 175, 443-448. Palloti F, Lenaz G (2001) Isolation and subfractionation of mitochondria from animal cells and tissue culture lines. In: Mitochondria (eds. Pon LA, Schon EA), pp. 1-35. Academic Press, San Diego, CA. Park S, Jang M (2011) Phosphoproteome profiling for cold temperature perception. Journal of Cellular Biochemistry 112, 633-642. Pichaud N, Ballard JWO, Tanguay RM, Blier PU (2011) Thermal sensitivity of mitochondrial functions in permeabilized muscle fibers from two populations of Drosophila simulans with divergent mitotypes.
American Journal of Physiology - Regulatory, Integrative and Comparative Physiology 301, R48-R59. Pichaud N, Ballard JWO, Tanguay RM, Blier PU (2012) Naturally occurring mitochondrial DNA haplotypes exhibit metabolic differences: insight into functional properties of mitochondria. Evolution, no-no.


Pichaud N, Chatelain EH, Ballard JWO, et al. (2010) Thermal sensitivity of mitochondrial metabolism in two distinct mitotypes of Drosophila simulans: evaluation of mitochondrial plasticity. The Journal of Experimental Biology 213, 1665-1675. Rest JS, Ast JC, Austin CC, et al. (2003) Molecular systematics of primary reptilian lineages and the tuatara mitochondrial genome. Molecular Phylogenetics and Evolution 29, 289-297. Robert K, Vleck C, Bronikowski A (2009) The effects of maternal corticosterone levels on offspring behavior in fast- and slow-growth garter snakes (Thamnophis elegans). Hormones and Behavior 55, 24-32. Robert KA, Bronikowski AM (2010) Evolution of senescence in nature: physiological evolution in populations of garter snake with divergent life histories. The American Naturalist 175, 147-159. Robert KA, Brunet-Rossinni A, Bronikowski AM (2007) Testing the 'free radical theory of aging' hypothesis: physiological differences in long-lived and short-lived colubrid snakes. Aging Cell 6, 395-404. Roberts MA, Schwartz TS, Karl SA (2004) Global population genetic structure and male-mediated gene flow in the green sea turtle (Chelonia mydas): analysis of microsatellite loci. Genetics 166, 1857-1870. Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139-140. Robinson MD, Smyth GK (2007) Moderated statistical tests for assessing differences in tag abundance. Bioinformatics 23, 2881-2887. Rorbach J, Minczuk M (2012) The post-transcriptional life of mammalian mitochondrial RNA. Biochemical Journal 444, 357-373. Ruiz-Pesini E, Mishmar D, Brandon M, Procaccio V, Wallace DC (2004) Effects of Purifying and Adaptive Selection on Regional Variation in Human mtDNA. Science 303, 223-226. Scarpulla RC (2012) Nucleus-encoded regulators of mitochondrial function: Integration of respiratory chain expression, nutrient sensing and metabolic stress. 
Biochimica et Biophysica Acta (BBA) - Gene Regulatory Mechanisms 1819, 1088-1097. Schattner P, Brooks AN, Lowe TM (2005) The tRNAscan-SE, snoscan and snoGPS web servers for the detection of tRNAs and snoRNAs. Nucleic Acids Research 33, W686-W689. Scheffler IE (2008) Mitochondria: Second edition John Wiley & Sons, Inc, Hoboken, New Jersey. Schulte PM, Healy TM, Fangue NA (2011) Thermal Performance Curves, Phenotypic Plasticity, and the Time Scales of Temperature Exposure. Integrative and Comparative Biology.


Schwartz TS, Bronikowski AM (2011) Molecular stress pathways and the evolution of life histories in reptiles. In: Molecular Mechanisms of Life History Evolution: The Genetics and Physiology of Life History Traits and Trade-Offs (eds. Flatt T, Heyland A). Oxford Press, Oxford. Schwartz TS, Bronikowski AM (In Press) Dissecting molecular stress networks: identifying nodes of divergence between life-history phenotypes. Molecular Ecology. Schwartz TS, Tae H, Yang Y, et al. (2010) A garter snake transcriptome: pyrosequencing, de novo assembly, and sex-specific differences. BMC Genomics 11, 694. Seebacher F, Brand MD, Else PL, et al. (2010) Plasticity of Oxidative Metabolism in Variable Climates: Molecular Mechanisms. Physiological and Biochemical Zoology 83, 721-732. Sparkman AM, Arnold SJ, Bronikowski AM (2007) An empirical test of evolutionary theories for reproductive senescence and reproductive effort in the garter snake Thamnophis elegans. Proceedings of the Royal Society B-Biological Sciences 274, 943-950. Sparkman AM, Arnold SJ, Bronikowski AM, Herriges MJ (2005) Evolutionarily divergent patterns of age-specific reproduction in the garter snake Thamnophis elegans. Integrative and Comparative Biology 45, 1196-1196. Stevenson RD, Peterson CR, Tsuji JS (1985) The thermal dependence of locomotion, tongue flicking, digestion, and oxygen consumption in the wandering garter snake. Physiological Zoology 58, 46-57. Sunnucks P, Hales DF (1996) Numerous transposed sequences of mitochondrial cytochrome oxidase I-II in aphids of the genus Sitobion (Hemiptera: Aphididae). Molecular Biology and Evolution 13, 510-524. Ulrich P, Marsh A (2009) Thermal Sensitivity of Mitochondrial Respiration Efficiency and Protein Phosphorylation in the Clam Mercenaria mercenaria. Marine Biotechnology 11, 608-618. Visser ME (2008) Keeping up with a warming world; assessing the rate of adaptation to climate change. Proceedings of the Royal Society B: Biological Sciences 275, 649-659.
Wyman SK, Jansen RK, Boore JL (2004) Automatic annotation of organellar genomes with DOGMA. Bioinformatics 20, 3252-3255. Yannic G, Basset P, Büchi L, Hausser J, Broquet T (2012) Scale-specific sex-biased dispersal in the Valais shrew unveiled by genetic variation on the Y chromosome, autosomes, and mitochondrial DNA. Evolution 66, 1737-1750.


Table 5-1. Differences between ecotypes of Thamnophis elegans that are relevant to this study, adapted from Schwartz & Bronikowski (2011; In Press). [Table body not recoverable from the source. Sources legible in the table footnotes include Kephart & Arnold 1982; Bronikowski & Arnold 1999; Bronikowski 2000; Bronikowski & Vleck 2010; Robert & Bronikowski 2010; Sparkman & Bronikowski pers. comm.; and this study.]


Table 5-2. Statistical results from ANOVAs, F(df1, df2) results, on mitochondrial respiration and gene expression. NS indicates non-significant results; a dash “-” indicates that the variable was not included in the final model.


Table 5-3. Diversity among the 10 unrelated individuals in the protein coding genes. The number and type of variable sites identified in each mitochondrial gene found to have variation. TS = transition, TV = transversion, dN = nonsynonymous mutation, dS = synonymous mutation.

Gene | Complex | Nucleotide Changes | TS or TV | Amino Acid Changes | dN or dS | No. Indiv. | Population
l-rRNA | | 2507 A->G | TS | NA | NA | 1 | L-fast: ELF
ND4 | I | 453 T->C | TS | - | dS | 1 | L-fast: ELF
ND5 | I | 1409 T->C | TS | 470 I -> T | dN | 1 | M-slow: PAP
ND5 | I | 1529 G->A | TS | 510 S -> N | dN | 1 | L-fast: ELF
CYTB | III | 974 T->C | TS | 325 I -> T | dN | 4 | M-slow: PAP
COX3 | IV | 27 C->T | TS | - | dS | 1 | L-fast: ELF
ATP6 | V | 498 C->T | TS | - | dS | 1 | L-fast: ELF
ATP6 | V | 579 G->A | TS | - | dS | 4 | M-slow: PAP
Total | | 8 | 7/1 | 4 | 3/4 | |


Figure 5-1. Thamnophis elegans mitochondrial sequence.


[Figure 5-2 panels. A) Rate of oxygen consumption in State IV (nmol O2 min-1 mg-1). B) Rate of oxygen consumption in State III (nmol O2 min-1 mg-1). X-axis: temperature treatment, Control (27 °C) and Heat Stress (37 °C). Series: L-fast and M-slow.]

Figure 5-2. Mitochondrial oxygen consumption. A) State IV and B) State III respiration in mitochondria isolated from the two garter snake ecotypes (L-fast and M-slow) and measured at 28 °C after whole-organism exposure to either 27 °C or 37 °C for two hours. Values are LS means ± 1 SE.


[Figure 5-3 panels. A) Gene expression under Control and Stress for Ratio I:II, MTI, and MTII. B) Normalized number of transcripts from the mitochondrial genome for L-fast and M-slow. X-axis: temperature treatment, Control (27 °C) and Heat Stress (37 °C).]

Figure 5-3. Normalized quantitative expression of the mitochondrial genome at control temperature (i.e., standard body temperature) and in response to heat stress (37 °C). A) Ratio I:II is the ratio of normalized transcription of Segment 1 to Segment 2 of the mitochondria under control and thermal stress. MTI is the normalized expression of Segment 1 of the mitochondria divided by 3000 to scale for presentation; MTII is the normalized expression of Segment 2 of the mitochondria divided by 3000 to scale for presentation. B) Normalized number of transcripts that mapped to the mitochondrial genome (excluding control regions) for each ecotype.


[Figure 5-4 labels. H1: Lakeshore; H2: Lakeshore; H3: Meadow; H4: Meadow.]

Figure 5-4. Haplotype network illustrating the relationships among the four mitochondrial haplotypes identified in this study. The size of each circle represents the relative frequency of that haplotype. The gene codes on the lines indicate the locations (genes) of the variants between those haplotypes. Blue circles correspond to the population representing the M-slow ecotype, and red circles correspond to the population representing the L-fast ecotype.


Supplemental Figure 5-1. Individual mitochondrial transcriptomes for all protein coding genes and the rRNAs for 10 unrelated garter snakes from two populations: ELF (L-fast ecotype) and PAP (M-slow ecotype). Black vertical lines represent sequence variants. The representative mitochondrial haplotypes are indicated by h1 to h4.


Supplemental Figure 5-2. Nucleotide alignment of the two control regions of the Thamnophis elegans mitochondrial genome. Differences (SNPs) between the two regions are highlighted. A) 5’ end of the control regions. B) 3’ end of the control regions.


Supplemental Figure 5-3. Mapping of sequence reads to potential mitochondrial transcripts. For each panel, the green histogram below the gene diagram represents read abundance (the number of reads that mapped to that region of the gene). A) Individual 66 (Illumina data) mapped to Segment 1, B) 454 data mapped to Segment 1, C) Individual 66 (Illumina data) mapped to beginning of Segment 2, D) 454 data mapped to beginning of Segment 2.


CHAPTER 6. FUNCTIONAL ECOLOGICAL TRANSCRIPTOMICS:

USING QUANTITATIVE RNA-SEQ TO STUDY HEAT STRESS IN A

NON-MODEL VERTEBRATE

Manuscript prepared for submission to the peer-reviewed scientific journal, Physiological Genomics

Tonia S. Schwartz and Anne M. Bronikowski

Authors’ contributions: Both T.S.S. and A.M.B. contributed to the design of the experiment and the assays and performed the experiment. T.S.S. analyzed the data and drafted the manuscript. Both authors edited the manuscript.

ABSTRACT
The emerging field of conservation physiology requires experimental data to address critical questions on the mechanistic basis for an organism’s ability to persist in a habitat. For example, what molecular networks are activated under physiological stress and what traits are affected by those networks? Also, to what extent does plasticity in these molecular networks allow for physiological acclimation to a changing environment? Because both the mean and variance of environmental temperature are predicted to change across large landscapes, and because temperature has a strong influence on molecular reactions, thereby modulating the degree to which organisms can respond to changing thermal conditions, the focus of our experimental work is on transcriptomic responses to temperature. Here, we present the first large-scale transcriptome study of heat stress in a poikilothermic reptile, the garter snake Thamnophis elegans. Our study populations have been models of ecological and evolutionary genetics for the past quarter century. We bring this long-term understanding of a natural ecological system to bear on the question of how animals respond to physiological stress. We characterize the plastic responses of liver transcriptomes to heat stress using Illumina RNA-seq, and identify molecular networks or pathways that are affected. We have identified 417 transcripts (of 14,558: 2.9%), and 686 genes (of 10,930: 6.7%) that were differentially expressed in response to heat stress, with twice as many genes up-regulated as down-regulated. Genes were clustered by annotation into five main categories: heat shock proteins and protein folding;


transcription regulation; muscle and cytoskeletal proteins; extracellular and matrix proteins for secretion and cell signaling; and development and growth. Five KEGG pathways were significantly over-represented in the differentially expressed genes: extracellular matrix interaction; focal adhesion; arginine and proline metabolism; MAPK signaling pathway; and glycosphingolipid biosynthesis - ganglio series. Additionally, a number of immune genes were up-regulated, and complementary qPCR data indicate that the insulin/insulin-like signaling pathway is also affected by heat stress. Thus, we have identified functional categories of genes and pathways that are affected by heat stress, providing hypotheses for future studies on the pleiotropic effects of extreme thermal events (natural and human-induced) that activate stress response networks in natural populations of reptiles. Finally, this study provides a proof-of-concept for the applicability of RNA-seq to the study of large-scale transcriptomic changes in response to climate variation in a non-model vertebrate.

INTRODUCTION
Temperature is a fundamental determinant of animal physiology and a driver of acclimation and evolutionary adaptation. Natural populations are exposed to extreme acute temperature fluctuations, which have been increasing (and are predicted to continue increasing) with human-induced climate change (Hansen et al. 2012). The ability of an organism to respond appropriately to physiological stresses such as temperature spikes can determine an individual’s ability to survive, and a population’s ability to persist in a particular habitat or geographic location (Huey et al. 2009; Somero 2010; Tomanek 2010). Recent studies emphasize that high temperatures (as opposed to low temperatures) are detrimental to ectothermic organisms (Kearney et al. 2009). Although behavioral regulation can buffer against thermal change to a degree (Huey 1974; Kearney et al. 2009), further buffering requires the activation of physiological networks, such as the heat shock system, that provide cellular protection (Lindquist 1986; Richter et al. 2010). An ectotherm’s modulation of cellular biochemistry in response to thermal stress initiates with protein modifications and gene expression. Understanding the functional gene groups and pathways that are affected under a particular stress will lead to a broader understanding of the effect on the whole organism, including the pleiotropic consequences for life history traits beyond cellular protection from the immediate stress. This has recently been demonstrated in sea urchins, where thermal stress affected developmental gene networks (Runcie et al. 2012). The emerging field of conservation physiology requires experimental data to elucidate the molecular networks that determine both plastic and invariant responses to changes in the environment (Seebacher & Franklin 2012). For this reason, how ectothermic reptiles respond to temperature has


been a focus of modeling the effects of climate change. Understanding the reptilian ability to respond to elevated temperature at the molecular level will provide much needed data towards an understanding of how organisms will respond to variation in habitat and climate. Towards these ultimate goals, we take an essential initial step to provide the first whole transcriptome study of heat stress on an ectothermic reptile, the Western terrestrial garter snake (Thamnophis elegans). Populations of this species in the Sierra Nevada Mountains have been models of ecological and evolutionary genetics for the past quarter century. We bring this long-term understanding of a natural ecological system to bear on the question of how these animals respond to heat stress. These populations have diverged in habitat association as well as across a suite of traits related to life-history, morphology, and physiology (Schwartz & Bronikowski 2011). Here we sample across these populations to gain a broad understanding of the generalized heat stress response that supersedes the differences among the populations. Previously we identified a number of physiological changes that occur with heat stress in this species: an increase in corticosterone, increased DNA damage, and increased gene expression of HSP70 and mitochondrial rRNAs (Chapter 5; Schwartz et al. In Press). Here our aim is to characterize the plastic response of the transcriptome to heat stress with the objective of identifying genetic pathways and networks that are affected. We focus on a primary organ for metabolic processes, the liver, and quantify its gene expression under control and heat stress conditions using Illumina RNA-seq. While we have candidate stress pathways that we specifically test for an effect of heat stress using quantitative PCR (qPCR), the whole transcriptome approach using RNA-seq allows for the discovery of novel genes and pathways that are responsive to heat stress.
Both of these approaches provide new insight into the effect of heat stress on transcriptional regulation in this reptile.

METHODS

Animals
In May and June of 2007, gravid T. elegans females were collected from four genetically divergent populations around Eagle Lake in the Sierra Nevada Mountains of California (significant pairwise FST ranges from 0.02 to 0.07 based on microsatellites; pairwise φST = 0.429 based on mitochondrial haplotypes between the two most divergent populations in this study) (Chapter 5; Manes et al. In Revision with Encouragement to Resubmit; Manier & Arnold 2005; Manier & Arnold 2006; Manier et al. 2007). Females were brought back to Iowa State University and gave birth in September 2007. Offspring were raised in a common laboratory environment for 1.2 years until used in this experiment in October/November 2008. Each animal tank had a thermal gradient that


ranged from 24 °C to 32 °C (12-h light), which enabled the animals to behaviorally regulate their body temperature, and 24 °C (12-h dark). Animals were fed pinky mice to satiation once per week, and were last fed 5 days prior to their experimental day.

Experimental design
Animals were split between a standard body temperature (27 °C) and a heat stress treatment (37 °C). The heat stress temperature of 37 °C was chosen to increase their metabolic and oxidative stress (Schwartz & Bronikowski In Press); 37 °C is above the preferred body temperature of T. elegans, including both of these ecotypes (26 to 32 °C), but well below the critical thermal maximum (~43 °C) (Arnold & Peterson 2002; Huey et al. 1989; Stevenson et al. 1985). Based on behavior and locomotor performance, T. elegans become thermally stressed above 35 °C (Stevenson et al. 1985), and O2 consumption in Thamnophis species increases with body temperature with Q10 = 2.45 (Bronikowski & Vleck 2010; Peterson et al. 1998). All animals were placed in a 27 °C incubator for twenty hours to acclimate, and then put into incubators at their respective temperature treatments (27 °C for control and 37 °C for heat stress) for two hours, after which they were euthanized via decapitation. The liver was dissected out within 10 minutes of death, and half of the liver was snap frozen in liquid nitrogen for RNA isolation. Liver was chosen because of its high metabolic activity. RNA was isolated from 12-19 mg of snap-frozen liver from 51 individuals, representing both populations (24 control, 27 heat stress), using the Qiagen RNeasy kit (Qiagen), with a DNase digestion on the membrane.
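To make the Q10 value cited above concrete, the following minimal sketch applies the standard Q10 relation, rate2 = rate1 × Q10^((T2 − T1)/10), to the two treatment temperatures. It assumes that Q10 is constant between 27 °C and 37 °C, which the text does not state; the function name `q10_scaled_rate` is hypothetical and used only for illustration.

```python
def q10_scaled_rate(rate_at_t1: float, t1_c: float, t2_c: float,
                    q10: float = 2.45) -> float:
    """Scale a physiological rate from t1_c to t2_c (degrees C).

    Uses the standard Q10 relation and assumes Q10 is constant over the
    temperature interval (an assumption made for this illustration).
    """
    return rate_at_t1 * q10 ** ((t2_c - t1_c) / 10.0)


# Relative O2 consumption expected at the heat stress temperature (37 C),
# taking the rate at the control temperature (27 C) as 1.0.
fold_change = q10_scaled_rate(1.0, 27.0, 37.0)
print(f"Expected fold change in O2 consumption: {fold_change:.2f}")  # 2.45
```

Under this assumption, the 10 °C difference between treatments corresponds to roughly a 2.45-fold increase in whole-animal oxygen consumption.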

Quantitative expression of the liver transcriptome using Illumina RNA-seq

To evaluate the effects of heat/metabolic stress on gene expression, we sequenced the mRNA from the livers of 12 females from two populations using Illumina RNA-seq (Table 1) (Wang et al. 2009). The quality and quantity of the total RNA were determined with an RNA Nanochip on a Bioanalyzer, and all samples had an RNA integrity number (RIN) greater than 8. For each individual, 10 ng of total RNA was submitted to the Iowa State University DNA Facility for RNA-seq library preparation and 72- or 80-cycle (single read) sequencing on the Illumina GXII platform in Summer/Fall of 2009, except for one liver that was prepped and sequenced in the Fall of 2010 (Individual 37: Table 1).

A reference T. elegans liver transcriptome was assembled using reads from the twelve individuals in the experiment plus a 13th individual from the same cohort that was not included in this experiment, for a total of 138 million reads. Reads were first cleaned using a workflow in Galaxy (Goecks et al. 2010) that trims the poor-quality ends of the reads (quality score < 20) and then filters out reads shorter than 20 bp (http://main.g2.bx.psu.edu/u/ts-ecogen/w/groomillumina-solexatrimq20filter20bp). The transcriptome was assembled from the cleaned reads in TRINITY v.2011-08-02 (Grabherr et al. 2011), which is designed for de novo RNA-seq transcriptome assembly and creates transcripts for alternatively spliced isoforms that belong to groups representing genes. The transcriptome was annotated with a local installation of BLAST 2.2.27+ (Camacho et al. 2008 [Updated 2012 Sep 4]), using tblastx against the ENSEMBL chicken cDNAs. For each individual, the cleaned reads were mapped to the reference transcriptome using BOWTIE (Langmead et al. 2009), and expression abundance was estimated using RSEM (Li & Dewey 2011); both programs were run within the TRINITY package. We also used scripts within the TRINITY package to compute fragments per kilobase of transcript per million fragments mapped (FPKM) values and to produce a table of read counts per gene per individual (Trinity-Developers-Group 2012).

To identify specific transcripts and genes that were differentially regulated in the liver in response to heat stress, we used the R/Bioconductor package edgeR, which uses an empirical Bayes method to estimate gene-specific biological variation (Robinson et al. 2010; Robinson & Smyth 2007). This program uses a negative binomial distribution (Robinson & Smyth 2008) and a trimmed mean of M-values (TMM) normalization that corrects for differences in the quantity of reads per individual (Robinson & Oshlack 2010). We used the total number of reads mapped per individual for the TMM normalization across libraries (individuals) to control for the quantity and quality of reads mapped. We analyzed the data at the transcript level and at the gene level. Because counting variance is higher for transcripts/genes with low expression, we ran the program both including and excluding transcripts/genes with an average of fewer than 50 reads per individual. Thus, we ran the program four times using (i) counts per transcript with all transcripts, (ii) counts per transcript after filtering out transcripts with low read abundance, (iii) counts per gene with all genes, and (iv) counts per gene after filtering out genes with low read abundance. The Benjamini-Hochberg method (Benjamini & Hochberg 1995) was used to control the false discovery rate (FDR).

To test for enrichment of pathways or functionally related gene groups affected by heat stress, we used the program DAVID (Huang et al. 2008). This program compares a list of differentially expressed gene IDs to the total population of gene IDs. We applied the Gallus gallus (chicken) ENSEMBL cDNA annotation of our T. elegans reference transcriptome to our list of differentially expressed genes/transcripts with an FDR cut-off ≤ 0.05 and to the background list of transcripts/genes that were tested for significance (see i-iv above). The R/Bioconductor program GoSeq (Young et al. 2010) was also used to test for enrichment of KEGG pathways among the differentially expressed genes. This program uses gene length to account for the bias toward detecting differential expression in longer genes, which accumulate more mapped reads. For these analyses, chicken gene lengths and annotation were used.
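The low-abundance filter and the false discovery rate control described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the edgeR implementation used in the study: the function names, the dictionary layout of the count table, and the example values are all hypothetical, and the p-values would in practice come from edgeR's tests.

```python
def filter_low_abundance(counts, min_mean_reads=50):
    """Keep features whose mean read count across individuals is >= min_mean_reads.

    counts maps a feature ID to a list of read counts, one per individual.
    """
    return {
        feature: reads
        for feature, reads in counts.items()
        if sum(reads) / len(reads) >= min_mean_reads
    }


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical count table: feature ID -> reads per individual.
example_counts = {"geneA": [120, 80, 95], "geneB": [3, 0, 7], "geneC": [55, 60, 48]}
kept = filter_low_abundance(example_counts)
print(sorted(kept))

# Hypothetical raw p-values for four features.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Features whose adjusted value is at or below 0.05 would be called differentially expressed at an FDR of 5%.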

Quantitative real-time PCR of candidate genes

To compare with and complement the RNA-seq data, we assayed six candidate genes and two reference genes using quantitative real-time PCR (qPCR) on 51 individuals. One candidate gene, heat shock protein 70 (HSP70), was chosen to verify that 37 °C was perceived as a heat stress. The other five candidate genes were chosen based on prior experiments on oxidative stress (catalase) and on our research group's specific interest in key genes encoding insulin-like growth factor (IGF) ligands and receptors in the insulin/insulin-like signaling pathway (IGF1 ligand, IGF2 ligand, IGF1 receptor, and IGF2 receptor). Reference genes were eukaryotic translation elongation factor 1α (EEF1α) and β-actin. cDNA was synthesized using 1 µg of total RNA, 200 units of SuperScript II reverse transcriptase (Invitrogen), 2 mM dNTPs, and 0.5 µg oligo dTs (Invitrogen and Integrated DNA Technologies). Using a pool of cDNA from all the samples, a 10-point standard curve was created using serial 1:4 dilutions. qPCR reactions were run in duplicate in 20 µL volumes containing 0.3 µM of each primer and Q-PCR SybrGreen Supermix (Promega), and were conducted on an Eppendorf MasterPlex Cycler using the SybrGreen setting (see Supplemental Table 6-1 for primer information). The thermal cycle consisted of 95 °C for 2 min; 40 cycles of 95 °C for 20 s, 58 °C for 20 s, and 68 °C for 20 s; and a final melt curve analysis from 68 to 95 °C over 20 min. Individuals whose duplicate reactions had a CT standard deviation greater than 0.25 were rerun. A standard curve was run in duplicate on every plate.
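To make the role of the standard curve concrete, the following sketch fits CT against log10 relative quantity for a 10-point 1:4 dilution series, derives the amplification efficiency from the slope, and converts a CT value back to a relative quantity. The CT values are simulated under the assumption of perfect doubling per cycle (a 1:4 dilution then shifts CT by exactly 2 cycles); they are not data from this study, and the function names are hypothetical.

```python
import math


def fit_standard_curve(relative_quantities, ct_values):
    """Least-squares fit of CT = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in relative_quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amplification_efficiency(slope):
    """Efficiency as a fraction; 1.0 means the product doubles every cycle."""
    return 10 ** (-1.0 / slope) - 1.0


def ct_to_relative_quantity(ct, slope, intercept):
    """Invert the standard curve to obtain a relative quantity from a CT."""
    return 10 ** ((ct - intercept) / slope)


# Simulated 10-point curve: serial 1:4 dilutions of a cDNA pool (assumed values).
dilutions = [4.0 ** -i for i in range(10)]
simulated_cts = [18.0 + 2.0 * i for i in range(10)]

slope, intercept = fit_standard_curve(dilutions, simulated_cts)
print(f"slope = {slope:.3f}, efficiency = {amplification_efficiency(slope):.2%}")
print(f"CT 20 -> relative quantity {ct_to_relative_quantity(20.0, slope, intercept):.3f}")
```

With real plates the fitted slope will deviate from the ideal value, which is why a per-gene efficiency estimate is carried into the downstream comparison.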

For every plate, the relevant range of the standard curve was used to convert CT values to relative copy number of the transcript for that particular gene. To test for expression differences between the control and heat stress groups for each candidate gene, we used the REST 2009 Software (Qiagen) (Pfaffl et al. 2002). The method employed by the REST Software takes into account the PCR efficiency for each gene, normalizes expression relative to the reference genes (EEF1α and β-actin), and calculates P-values with a randomization technique that randomly reallocates control and heat stress samples between the groups. We used 10,000 randomizations. To compare the RNA-seq and qPCR data, we correlated expression levels between RNA-seq FPKM from the RSEM output and CT values from the qPCR data for each candidate gene, and compared the estimates of fold-change and the significance of the effect of heat on expression obtained with each technology.
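The idea behind a randomization test of this kind can be sketched as follows. This is a simplified stand-in, not the algorithm implemented in the REST software: it ignores PCR efficiency and reference-gene normalization and simply asks how often a random reallocation of samples between groups yields a difference in group means at least as large as the observed one. The expression values, function name, and seed are hypothetical.

```python
import random


def randomization_test(control, treated, n_randomizations=10_000, seed=0):
    """Two-sided randomization test on the difference in group means.

    Returns (observed_difference, p_value), where the difference is
    mean(treated) - mean(control).
    """
    rng = random.Random(seed)
    n_control = len(control)

    def mean(values):
        return sum(values) / len(values)

    observed = mean(treated) - mean(control)
    pooled = list(control) + list(treated)
    as_extreme = 0
    for _ in range(n_randomizations):
        # Randomly reallocate samples between the two groups.
        rng.shuffle(pooled)
        diff = mean(pooled[n_control:]) - mean(pooled[:n_control])
        if abs(diff) >= abs(observed) - 1e-12:
            as_extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    p_value = (as_extreme + 1) / (n_randomizations + 1)
    return observed, p_value


# Hypothetical log2 normalized expression values for one gene.
control_expr = [0.1, -0.2, 0.0, 0.3, -0.1, 0.2]
heat_expr = [1.4, 1.1, 1.6, 0.9, 1.3, 1.2]
difference, p = randomization_test(control_expr, heat_expr)
print(f"log2 difference = {difference:.2f}, p = {p:.4f}")
```

A randomization approach makes no distributional assumption about the expression values, which is convenient for ratios of relative quantities.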

RESULTS

Sequence data

The number of reads per Illumina library ranged from 5 to 31 million, and the average quality score per library ranged from 21 to 36 (Table 1). This variation is indicative of the rapid improvement in sequencing technology over the span of one year (mid 2009 – mid 2010). The assembled transcriptome had 132,343 transcripts representing 97,612 gene groups. On average, 79.3 percent of reads from each individual mapped to the reference (Table 1). Using the chicken ENSEMBL cDNAs, we were able to annotate 30,136 transcripts (23%). Of these, 24,866 matched an annotated cDNA; the rest matched ab initio gene predictions based on genome scans for reading frames. The annotated transcripts corresponded to 14,406 gene groups.
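The average mapping rate can be recomputed from the per-library values reported in Table 6-1. The short sketch below does this arithmetic; the dictionary layout and variable names are assumptions for illustration, while the read totals are the rounded values (in millions) from the table.

```python
# Sample ID -> (millions of reads, millions of reads mapped), from Table 6-1.
libraries = {
    37: (31.128, 23.360), 38: (13.723, 11.188), 61: (13.540, 11.016),
    62: (5.630, 4.052), 87: (10.420, 7.275), 90: (12.066, 10.430),
    46: (11.807, 9.153), 64: (12.434, 10.247), 65: (13.174, 10.821),
    66: (13.715, 11.251), 67: (14.825, 11.373), 93: (15.275, 12.961),
}

# Percent of reads mapped for each library.
percent_mapped = {
    sample: 100.0 * mapped / total for sample, (total, mapped) in libraries.items()
}

# Unweighted mean across libraries, i.e. the average per individual.
mean_percent = sum(percent_mapped.values()) / len(percent_mapped)
print(f"Mean percent of reads mapped: {mean_percent:.1f}")
```

The mean here is unweighted across libraries, which matches an average taken per individual rather than per read.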

RNA-seq differential gene expression

Using an FDR of 5% with no filtering, we identified 1261 transcripts (of 132,343: 0.95%) and 1938 genes (of 97,612: 1.99%) that were differentially expressed in the T. elegans liver in response to heat stress (Table 2). With the low-abundance reads filtered out, we identified 417 transcripts (of 14,558: 2.9%) and 686 genes (of 10,930: 6.7%) that were differentially expressed (Supplemental Table 2). In all runs, approximately twice as many genes were up-regulated as down-regulated (Figures 2 and 3). Based on the annotation to the chicken, the list of differentially expressed genes includes the expected heat shock proteins, including our candidate gene HSP70, validating 37 °C as a heat stress for these garter snakes (Table 3). Functional annotation clustering was based only on the differentially expressed genes that could be annotated with the ENSEMBL chicken cDNAs and their associated GO terms and KEGG pathways; these represented less than half of the differentially expressed genes/transcripts. The program DAVID identified between 28 and 48 annotation clusters for the four runs, with between 6 and 17 clusters having an enrichment score greater than 1.3 (equivalent to p = 0.05 on a non-log scale) (Table 4). These clusters could be grouped into five main categories: heat shock proteins and protein folding; transcription regulation; muscle and cytoskeletal proteins; extracellular and matrix proteins for secretion and cell signaling; and development and growth. GoSeq analyses highlighted five KEGG pathways that were significantly over-represented in the differentially expressed genes: extracellular matrix (ECM)-receptor interaction; focal adhesion; arginine and proline metabolism; MAPK signaling pathway; and glycosphingolipid biosynthesis – ganglio series (Table 5).
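The correspondence between an enrichment score of 1.3 and p = 0.05 follows from the score being expressed on a −log10 scale. The sketch below shows the conversion in both directions, along with a cluster-level score computed as the −log10 of the geometric mean of member p-values, which is our understanding of how DAVID summarizes an annotation cluster. The function names and example p-values are hypothetical.

```python
import math


def p_to_enrichment_score(p):
    """Express a p-value on the -log10 scale used for enrichment scores."""
    return -math.log10(p)


def enrichment_score_to_p(score):
    """Convert a -log10 enrichment score back to the p-value scale."""
    return 10 ** (-score)


def cluster_enrichment_score(member_pvalues):
    """-log10 of the geometric mean of the p-values of a cluster's terms."""
    mean_log10 = sum(math.log10(p) for p in member_pvalues) / len(member_pvalues)
    return -mean_log10


print(f"p = 0.05 -> score {p_to_enrichment_score(0.05):.3f}")
print(f"score 1.3 -> p = {enrichment_score_to_p(1.3):.4f}")

# Hypothetical p-values for three annotation terms in one cluster.
print(f"cluster score = {cluster_enrichment_score([0.001, 0.01, 0.2]):.2f}")
```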

Quantitative PCR of IGFs and comparison to RNA-seq

As expected, both the RNA-seq and the qPCR data indicated that HSP70 was dramatically up-regulated with heat stress (Table 3). Neither catalase nor IGF1 was significantly affected by heat stress in either the RNA-seq or the qPCR assay. In our qPCR assay, IGF2 was slightly down-regulated (0.5X), and both IGF receptors were modestly up-regulated with heat stress: IGF1 receptor 1.4X and IGF2 receptor 1.5X (Table 3). Among the candidate genes assayed by qPCR, only HSP70, catalase, and the reference genes had a sufficient abundance of RNA-seq reads to be tested for differential expression, and all of these genes showed the same result with both technologies. The abundance of reads for the four IGF genes fell below the cut-off for low-abundance reads (an average of at least 50 reads per individual) in the RNA-seq data. The low mRNA abundance of these four genes is also evident in the qPCR data, which required low cDNA dilution factors for these genes. In this way, both technologies were consistent in indicating the low expression levels of these genes. We conclude from our qPCR approach that RNA-seq will not detect small, but potentially biologically important, expression changes in genes with low overall expression.

DISCUSSION

Understanding the ability of poikilotherms to transduce environmental variation into changes at the molecular level will be informative for understanding how such organisms will respond to variation in habitat and climate. Here we provide the first whole-transcriptome study of heat stress in a reptile, utilizing Illumina RNA-seq on 12 garter snake (Thamnophis elegans) females. This sequencing dataset has considerable variation among individuals in both the quality of the sequences (ultimately affecting their length when trimmed for quality) and the quantity of reads. Additionally, both the heat stress and the control groups had an outlier individual that could be due to technological variation, biological variation, or both. Despite these limitations, we propose that our analyses of these data provide useful insight into the effects of heat stress on the liver transcriptome and serve as a proof-of-concept for utilizing high-throughput sequencing data for gene expression analyses to understand the molecular effects of naturally occurring complexity. First, a PCA across all the transcripts clearly grouped the individuals into their treatment groups (Figure 1). Second, although RNA-seq could not be used to detect differential expression of genes with low overall expression (e.g., the candidate IGF genes), as demonstrated by qPCR, it did correctly categorize HSP70, catalase, and the two reference genes, which were more highly expressed. Thus these data are useful for testing the differential expression of highly expressed genes and for identifying enriched pathways and functions among the differentially expressed genes with high expression. Future studies capitalizing on the rapid advances in sequencing technology, along with increased sample sizes, will have considerably more statistical power to look beyond the highly expressed genes.

Effects of stress on gene expression

Temperature is known to be an important causal agent of variation in organismal physiology. Here, we have identified over 600 genes that are differentially regulated by heat stress, based on an FDR of 5%. Exploring the data further suggests that many more genes can be clearly grouped by treatment (Figure 4) and would likely be identified as significant with a larger sample size, a cleaner dataset, or a targeted approach such as qPCR. Of the 600+ genes we detected as differentially expressed, less than half could be annotated, indicating that many important genes involved in the stress response have yet to be characterized. Additionally, genes that are unique to snakes or that have diverged in sequence such that they cannot be annotated could not be included in our function or pathway analyses, but should be the targets of further detailed studies. The functional groups indicated that HSPs and protective cellular processes are up-regulated in response to heat stress, as expected since this response is highly conserved (Fangue et al. 2006; Healy et al. 2010; Mayer & Bukau 2005). More interesting are the functional groups and pathways of extracellular matrix, focal adhesion, and the MAPK signaling pathway, which are all involved in processing information from the environment to the cells, and within and among the cells. Increased activation of the MAPK signaling pathway emphasizes the importance of protein modification in relaying signals from the environment through the cells. These genes and pathways would be a starting point for investigating the similarities and differences in how ectothermic reptiles sense and respond to their environment compared to endotherms. Interestingly, a number of immune genes were also up-regulated with heat stress (MHC class II, interleukin, and immunoglobulin J). Heat shock proteins have been implicated in stimulating the immune response in multiple organisms (Collier et al. 2008; Pockley 2003; Sorensen et al. 2005), providing a mechanism by which abiotic environmental change can affect immune function. This poses interesting questions as to the effect of combined thermal and immune stresses in a natural environment and suggests a testable hypothesis: that short-term thermal stresses may actually facilitate an ectothermic reptile’s response to a pathogen.

The insulin/insulin-like growth factor signaling (IIS) pathway has been implicated as a potential network modulating cellular protection versus growth and reproduction. Studies on nematodes have demonstrated that decreasing IIS signaling results in increased tolerance to heat stress (Cypser et al. 2006; Muñoz 2003). Previous studies on these garter snakes have demonstrated a correlation between environmental stress in the form of resource availability and decreased levels of circulating IGF-1 (Sparkman et al. 2009). Here, the qPCR data indicate that the IIS pathway in garter snakes is also affected by heat stress, with IGF2 being down-regulated but both receptors up-regulated in the liver. The increase in both receptors is intriguing since the IGF1 receptor promotes IIS signaling and growth/reproduction, whereas the IGF2 receptor is known to bind and inactivate IGF ligands (preferentially IGF2), thereby decreasing IIS signaling. The up-regulation of both the oxidative stress growth inhibitor 1 gene and the insulin-induced gene 1 provides further evidence that the IIS pathway is affected by stress. Overall, these results suggest the potential for pleiotropic effects of extreme thermal events on growth and reproduction in natural populations of ectotherms via effects on the IIS pathway. Ongoing studies looking at circulating levels of these growth factors along with transcription levels across tissues will provide additional insight into the role of the IIS pathway in reptiles.

Conclusions

This is the first large-scale transcriptomic study of the effect of heat stress on reptile gene expression. We identified functional categories of genes and pathways that are affected by heat stress, providing hypotheses for future studies on the effect of extreme thermal events (natural or human-induced) on natural populations of reptiles. Similar to Meyer et al. (2011), we demonstrate that a sequenced genome is not necessary for quantitative expression analysis using RNA-seq in a non-model organism. This study provides proof-of-concept for the use of RNA-seq to study large-scale transcriptomes in a non-model vertebrate.

ACKNOWLEDGEMENTS

We would like to acknowledge and thank K. Roberts and M. Manes for the collection and care of animals; M. Manes and A. Sparkman for assistance on experiment days; M. Baker and the ISU DNA Sequencing Facility; S. McGaugh and A. Severin for assistance with sequence data analyses; and the following sources for funding: Iowa Academy of Science (TSS), NSF-IGERT fellowship (TSS), NSF-DDIG (TSS), and NSF Grant [IOS 0922528] (AMB).

REFERENCES

Arnold SJ, Peterson CR (2002) A model for optimal reaction norms: The case of the pregnant garter snake and her temperature-sensitive embryos. American Naturalist 160, 306-316.
Benjamini Y, Hochberg Y (1995) Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. Journal of the Royal Statistical Society. Series B (Methodological) 57, 289-300.
Bronikowski AM, Vleck D (2010) Evolutionarily divergent physiology and life span: A case study in the garter snake (Thamnophis elegans). Integrative and Comparative Biology 50, 880-887.
Camacho C, Madden T, Coulouris G, et al. (2008 [Updated 2012 Sep 4]) BLAST Command Line Applications User Manual. In: BLAST® Help [Internet]. National Center for Biotechnology Information (US), Bethesda (MD).
Collier RJ, Collier JL, Rhoads RP, Baumgard LH (2008) Invited Review: Genes Involved in the Bovine Heat Stress Response. Journal of Dairy Science 91, 445-454.
Cypser JR, Tedesco P, Johnson TE (2006) Hormesis and aging in Caenorhabditis elegans. Experimental Gerontology 41, 935-939.
Fangue NA, Hofmeister M, Schulte PM (2006) Intraspecific variation in thermal tolerance and heat shock protein gene expression in common killifish, Fundulus heteroclitus. Journal of Experimental Biology 209, 2859-2872.
Goecks J, Nekrutenko A, Taylor J, Team TG (2010) Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences. Genome Biology 11, R86.
Grabherr MG, Haas BJ, Yassour M, et al. (2011) Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotech 29, 644-652.
Hansen J, Sato M, Ruedy R (2012) Perception of climate change. Proceedings of the National Academy of Sciences 109, E2415-E2423.
Healy TM, Tymchuk WE, Osborne EJ, Schulte PM (2010) Heat shock response of killifish (Fundulus heteroclitus): candidate gene and heterologous microarray approaches. Physiological Genomics 41, 171-184.
Huang DW, Sherman BT, Lempicki RA (2008) Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protocols 4, 44-57.
Huey RB (1974) Behavioral Thermoregulation in Lizards: Importance of Associated Costs. Science 184, 1001-1003.
Huey RB, Deutsch CA, Tewksbury JJ, et al. (2009) Why tropical forest lizards are vulnerable to climate warming. Proceedings of the Royal Society B: Biological Sciences 276, 1939-1948.
Huey RB, Peterson CR, Arnold SJ, Porter WP (1989) Hot rocks and not-so-hot rocks - retreat-site selection by garter snakes and its thermal consequences. Ecology 70, 931-944.
Kearney M, Shine R, Porter WP (2009) The potential for behavioral thermoregulation to buffer “cold-blooded” animals against climate warming. Proceedings of the National Academy of Sciences 106, 3835-3840.
Langmead B, Trapnell C, Pop M, Salzberg S (2009) Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biology 10, R25.
Li B, Dewey C (2011) RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinformatics 12, 323.
Lindquist S (1986) The Heat-Shock Response. Annual Review of Biochemistry 55, 1151-1191.
Manes M, Schwartz TS, Robert KA, Bronikowski AM (In Revision with Encouragement to Resubmit) Covariation in levels of multiple paternity and life-history strategy in closely related populations of garter snakes: implications for mating system evolution. Molecular Ecology.
Manier MK, Arnold SJ (2005) Population genetic analysis identifies source-sink dynamics for two sympatric garter snake species (Thamnophis elegans and Thamnophis sirtalis). Molecular Ecology 14, 3965-3976.
Manier MK, Arnold SJ (2006) Ecological correlates of population genetic structure: a comparative approach using a vertebrate metacommunity. Proceedings of the Royal Society of London Series B 273, 3001-3009.
Manier MK, Seyler CM, Arnold SJ (2007) Adaptive divergence within and between ecotypes of the terrestrial garter snake, Thamnophis elegans, assessed with F-ST-Q(ST) comparisons. Journal of Evolutionary Biology 20, 1705-1719.
Mayer M, Bukau B (2005) Hsp70 chaperones: Cellular functions and molecular mechanism. Cellular and Molecular Life Sciences 62, 670-684.
Meyer E, Aglyamova GV, Matz MV (2011) Profiling gene expression responses of coral larvae (Acropora millepora) to elevated temperature and settlement inducers using a novel RNA-Seq procedure. Molecular Ecology 20, 3599-3616.
Muñoz MJ (2003) Longevity and heat stress regulation in Caenorhabditis elegans. Mechanisms of Ageing and Development 124, 43-48.
Peterson CC, Walton BM, Bennett AF (1998) Intrapopulation variation in ecological energetics of the garter snake Thamnophis sirtalis, with analysis of the precision of doubly labeled water measurements. Physiological Zoology 71, 333-349.
Pfaffl MW, Horgan GW, Dempfle L (2002) Relative expression software tool (REST) for group-wise comparison and statistical analysis of relative expression results in real-time PCR. Nucleic Acids Research 30, e36-e45.
Pockley AG (2003) Heat shock proteins as regulators of the immune response. The Lancet 362, 469-476.
Richter K, Haslbeck M, Buchner J (2010) The Heat Shock Response: Life on the Verge of Death. Molecular Cell 40, 253-266.
Robinson M, Oshlack A (2010) A scaling normalization method for differential expression analysis of RNA-seq data. Genome Biology.
Robinson M, Smyth G (2008) Small-sample estimation of negative binomial dispersion, with applications to SAGE data. Biostatistics 9, 321-332.
Robinson MD, McCarthy DJ, Smyth GK (2010) edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139-140.
Robinson MD, Smyth GK (2007) Moderated statistical tests for assessing differences in tag abundance. Bioinformatics 23, 2881-2887.
Runcie DE, Garfield DA, Babbitt CC, et al. (2012) Genetics of gene expression responses to temperature stress in a sea urchin gene network. Molecular Ecology 21, 4547-4562.
Schwartz TS, Bronikowski AM (2011) Molecular stress pathways and the evolution of life histories in reptiles. In: Molecular Mechanisms of Life History Evolution: The Genetics and Physiology of Life History Traits and Trade-Offs (eds. Flatt T, Heyland A). Oxford Press, Oxford.
Schwartz TS, Bronikowski AM (In Press) Dissecting molecular stress networks: identifying nodes of divergence between life-history phenotypes. Molecular Ecology.
Seebacher F, Franklin CE (2012) Determining environmental causes of biological effects: the need for a mechanistic physiological dimension in conservation biology. Philosophical Transactions of the Royal Society B: Biological Sciences 367, 1607-1614.
Somero GN (2010) The physiology of climate change: how potentials for acclimatization and genetic adaptation will determine 'winners' and 'losers'. Journal of Experimental Biology 213, 912-920.
Sorensen JG, Nielsen MM, Kruhoffer M, Justesen J, Loeschcke V (2005) Full genome gene expression analysis of the heat stress response in Drosophila melanogaster. Cell Stress & Chaperones 10, 312-328.
Sparkman AM, Vleck CM, Bronikowski AM (2009) Evolutionary ecology of endocrine-mediated life-history variation in the garter snake Thamnophis elegans. Ecology 90, 720-728.
Stevenson RD, Peterson CR, Tsuji JS (1985) The thermal dependence of locomotion, tongue flicking, digestion, and oxygen consumption in the wandering garter snake. Physiological Zoology 58, 46-57.
Tomanek L (2010) Variation in the heat shock response and its implication for predicting the effect of global climate change on species' biogeographical distribution ranges and metabolic costs. Journal of Experimental Biology 213, 971-979.
Trinity-Developers-Group (2012) Trinity: Supported Processes for Read Alignment, Visualization, and Abundance Estimation.
Wang Z, Gerstein M, Snyder M (2009) RNA-seq: a revolutionary tool for transcriptomics. Nature Reviews Genetics 10, 57-63.
Young M, Wakefield M, Smyth G, Oshlack A (2010) Gene ontology analysis for RNA-seq: accounting for selection bias. Genome Biology 11, R14.

Table 6-1. Summary of Illumina RNA-seq data from 12 female livers. For the control treatment, individuals remained at 27 °C, whereas in the heat stress treatment individuals were put at 37 °C for 2 hours prior to euthanization. RIN is the quality score of the RNA from the Bioanalyzer.

Sample ID  Treatment  RIN     Read Length  Average Quality Score  Millions of Reads  Millions of Reads Mapped  Percent Reads Mapped
37         Control    8.6     80           35.91                  31.128             23.360                    75.04
38         Control    9.3     80           24.06                  13.723             11.188                    81.53
61         Control    N/A(A)  80           24.59                  13.540             11.016                    81.35
62         Control    9.3     73           26.17                  5.630              4.052                     71.98
87         Control    9.4     73           21.12                  10.420             7.275                     69.82
90         Control    9.3     73           23.97                  12.066             10.430                    86.44
46         Stress     8.9     73           27.47                  11.807             9.153                     77.52
64         Stress     9.2     80           26.92                  12.434             10.247                    82.41
65         Stress     9.1     73           23.53                  13.174             10.821                    82.14
66         Stress     9.5     80           24.96                  13.715             11.251                    82.04
67         Stress     9.6     73           22.57                  14.825             11.373                    76.72
93         Stress     9.3     73           23.22                  15.275             12.961                    84.85

(A) The RIN calculation failed for this sample, but the graph looks very similar to that of Indiv 66; although the 18S:28S ratio is lower, both peaks are clear and distinct.

Table 6-2. The number of genes and transcripts that were differentially regulated in the T. elegans liver in response to heat stress at FDR ≤ 0.05. Either all the transcripts and genes were used (No Filter) or only those transcripts and genes that had more than an average of 50 reads per sample (> 50 rps).

Effect of heat stress    Transcripts    Transcripts    Genes        Genes
                         No Filter      > 50 rps       No Filter    > 50 rps
Up regulated             797            323            1257         540
No Change                131,082        14,141         95,681       10,244
Down regulated           464            94             681          146
Percent of genes D.E.    0.95           2.94           1.99         6.7

Table 6-3. Effect of heat stress on the expression of candidate genes based on qPCR and RNA- seq. NA indicates levels were below the threshold for performing an exact test. The correlation is between the expression levels for RNA-seq (FPKM) from the RSEM output and CT values from the qPCR data.

Gene            qPCR Fold-change   qPCR dilution   RNA-seq Fold-change   Avg No. of reads   Correlation R2
HSP70           274 ***            1:5000          139.65 ***            39,378             0.68
Catalase        -                  1:500           0.76                  2341               0.11
IGF1            -                  1:15            -0.64                 5.25               0.55
IGF2            -0.469 **          1:50            1.06                  8.33               0.18
IGF1 Receptor   1.367 *            1:15            1.39                  33.8               0.08
IGF2 Receptor   1.538 ***          1:50            -1.25                 4.5                0.04
β-actin         NA                 1:500           1.13                  67,607             0.182
EEF1α           NA                 1:5000          1.86                  35,249             0.383

p-value: * <0.05, ** <0.01, *** <0.001

Table 6-4. Annotation clusters with enrichment scores (ES) greater than one for the differentially expressed transcripts and genes. Clusters with enrichment scores greater than 1.3 are significant at p<0.05 and are in bold. Clusters are color coded to facilitate comparing clusters across analyses. Orange represents typical heat stress response including Heat Shock Proteins (HSP). Purple indicates clusters on transcription regulation. Clusters on muscle proteins are in brown. Clusters on extracellular matrix, structure and collagen organization are in blue. Finally, yellow indicates clusters on growth and development.

Table 6-5. KEGG pathways that are over-represented in the differentially expressed genes.

Pathway   p-value    Name                                               Class
04512     3.79E-09   ECM-receptor interaction                           Environmental Information Processing; Signaling Molecules and Interaction
04510     1.41E-06   Focal adhesion                                     Cellular Processes; Cell Communication
00330     0.028      Arginine and proline metabolism                    Metabolism; Amino Acid Metabolism
04010     0.031      MAPK signaling pathway                             Environmental Information Processing; Signal Transduction
00604     0.04       Glycosphingolipid biosynthesis - ganglio series    Metabolism; Glycan Biosynthesis and Metabolism

Figure 6-1. PCA plot of 12 individuals based on the FPKM of all transcripts. This plot suggests that individuals 87 and 67 are outliers. Heat stress animals are in yellow and control animals are in blue.

Figure 6-2. Log-fold-changes against log-counts per million reads for A) transcripts and B) genes with more than an average of 50 reads per sample. Differentially expressed transcripts or genes are highlighted in red. The horizontal blue lines mark 4-fold changes in expression.

Figure 6-3. Heat map of differentially expressed transcripts (q-value = 0.05, variance greater than 0.05). Up-regulated transcripts are in red and down-regulated transcripts are in green. Heat stress treatment animals are in yellow; control animals are in blue across the top.

Figure 6-4. PCA plots of FPKM of the transcripts that have variance greater than 0.05 and A) q-value 0.05, and B) p-value 0.05. Heat bar indicates q-values. The two clusters in each figure represent transcripts in the control (lower left cluster) and heat stress (top right cluster) treatments, indicating clear distinction between the groups for many genes even without using the false discovery rate.

Supplemental Table 6-1. Quantitative PCR primers. Primers designed to optimize: GC content of 50%; GC clamp but no more than 3 G or C in last 5-bp on 3’ end; 60 °C annealing, amplicon length 90 to 150-bp, and over exon-exon boundary to inhibit amplification of contaminating DNA.

Supplemental Table 6-2. List of genes differentially regulated that could be annotated to a chicken ENSEMBL gene (328 out of 686). logFC is the log fold change in expression, FDR is the false discovery rate.

T. elegans gene  logFC  P-Value  FDR  ENSEMBL  Gene Name
comp3039_c0  3.696  2.55E-22  6.97E-19  ENSGALT00000027851  cysteine and histidine-rich domain (CHORD)-containing 1
comp2048_c0  3.566  2.21E-21  4.83E-18  ENSGALT00000037566  DnaJ (Hsp40) homolog, subfamily A, member 1
comp437_c0  4.473  9.65E-21  1.17E-17  ENSGALT00000015358  BCL2-associated athanogene 3
comp462_c0  4.423  8.92E-21  1.17E-17  ENSGALT00000015358  BCL2-associated athanogene 3
comp1081_c0  3.413  5.20E-19  4.37E-16  ENSGALT00000026117  similar to qin-induced kinase; salt-inducible kinase 1
comp4070_c0  3.598  3.25E-18  2.54E-15  ENSGALT00000000280  growth arrest and DNA-damage-inducible, beta
comp627_c0  5.564  1.18E-17  8.03E-15  ENSGALT00000010524  heat shock 70kDa protein 8
comp2822_c0  3.345  2.97E-17  1.80E-14  ENSGALT00000021965  methylenetetrahydrofolate dehydrogenase (NADP+ dependent) 2, methenyltetrahydrofolate cyclohydrolase
comp398_c0  4.774  3.47E-17  2.00E-14  ENSGALT00000014497  DnaJ (Hsp40) homolog, subfamily B, member 4
comp64_c0  4.944  6.00E-17  3.12E-14  ENSGALT00000010524  heat shock 70kDa protein 8
comp9091_c0  5.163  5.98E-17  3.12E-14  ENSGALT00000020478  nuclear receptor subfamily 4, group A, member 2
comp2987_c0  3.202  7.28E-17  3.39E-14  ENSGALT00000017033  AHA1, activator of heat shock 90kDa protein ATPase homolog 1 (yeast)
comp1286_c0  3.159  7.44E-17  3.39E-14  ENSGALT00000026117  similar to qin-induced kinase; salt-inducible kinase 1
comp2218_c0  2.759  1.17E-15  4.11E-13  ENSGALT00000001009  zinc finger homeobox 3
comp1811_c0  3.819  2.00E-15  6.43E-13  ENSGALT00000005887  dual specificity phosphatase 1
comp2275_c0  5.079  2.47E-15  7.72E-13  ENSGALT00000020478  nuclear receptor subfamily 4, group A, member 2
comp6643_c0  3.184  6.56E-15  1.89E-12  ENSGALT00000041155  hypothetical LOC430080
comp1694_c0  -2.398  8.63E-15  2.36E-12  ENSGALT00000012270  death effector domain containing
comp95_c0  5.465  6.45E-14  1.60E-11  ENSGALT00000034109  crystallin, alpha B
comp542_c0  5.056  7.63E-14  1.85E-11  ENSGALT00000010524  heat shock 70kDa protein 8
comp3275_c0  2.768  9.62E-14  2.29E-11  ENSGALT00000018287  serpin peptidase inhibitor, clade H (heat shock protein 47), member 1, (collagen binding protein 1)
comp7149_c0  5.005  1.90E-13  4.41E-11  ENSGALT00000016303  FOS-like antigen 2
comp2525_c1  3.848  6.87E-13  1.53E-10  ENSGALT00000027586  heat shock 105kDa/110kDa protein 1
comp110_c0  5.667  1.31E-12  2.75E-10  ENSGALT00000019144  heat shock protein 70
comp180_c0  2.971  1.64E-12  3.39E-10  ENSGALT00000039502  ubiquitin C; ubiquitin B
comp2587_c0  3.101  2.03E-12  4.11E-10  ENSGALT00000037948  dual specificity phosphatase 4

comp36_c3  5.510  9.92E-12  1.87E-09  ENSGALT00000005141  DnaJ (Hsp40) homolog, subfamily A, member 4
comp9336_c0  3.142  1.18E-11  2.19E-09  ENSGALT00000002784  jumonji domain containing 6
comp7674_c0  2.464  2.59E-11  4.63E-09  ENSGALT00000010481  angiomotin like 2
comp1728_c0  2.901  3.68E-11  6.28E-09  ENSGALT00000007768  CCAAT/enhancer binding protein (C/EBP), alpha
comp6213_c0  3.917  4.48E-11  7.54E-09  ENSGALT00000014106  cysteine-rich, angiogenic inducer, 61
comp737_c0  1.990  3.87E-10  5.36E-08  ENSGALT00000011938  heat shock 22kDa protein 8
comp722_c1  2.211  4.28E-10  5.78E-08  ENSGALT00000024477  hypothetical LOC427259
comp9522_c0  3.241  5.54E-10  7.30E-08  ENSGALT00000036134  claudin 3
comp10425_c0  2.278  6.43E-10  8.37E-08  ENSGALT00000014497  DnaJ (Hsp40) homolog, subfamily B, member 4
comp6788_c0  3.416  8.44E-10  1.04E-07  ENSGALT00000017670  jun oncogene
comp102_c0  4.811  2.31E-09  2.55E-07  ENSGALT00000016542  heat shock 90kDa protein 1, beta
comp10457_c0  4.635  2.61E-09  2.85E-07  ENSGALT00000037491  TNF receptor-associated factor 3
comp2629_c0  -1.832  3.00E-09  3.23E-07  ENSGALT00000012270  death effector domain containing
comp19374_c0  3.592  7.16E-09  7.46E-07  ENSGALT00000015966  activating transcription factor 3
comp9896_c0  2.206  1.39E-08  1.37E-06  ENSGALT00000007041  tumor necrosis factor receptor superfamily, member 1B
comp4880_c0  2.301  1.82E-08  1.70E-06  ENSGALT00000011472  ICER protein
comp5191_c1  1.741  2.17E-08  1.99E-06  ENSGALT00000021876  ENSGALG00000013411
comp1099_c0  1.908  2.40E-08  2.19E-06  ENSGALT00000013006  CCAAT/enhancer binding protein (C/EBP), beta
comp1396_c0  1.729  2.42E-08  2.19E-06  ENSGALT00000026453  inhibitor of DNA binding 2, dominant negative helix-loop-helix protein
comp6432_c0  2.333  2.67E-08  2.39E-06  ENSGALT00000024766  nuclear factor of kappa light polypeptide gene enhancer in B-cells inhibitor, zeta
comp6488_c0  3.371  3.10E-08  2.73E-06  ENSGALT00000014248  lysosomal-associated membrane protein 3
comp779_c0  2.591  3.44E-08  2.96E-06  ENSGALT00000006471  chromosome 5 open reading frame 3
comp2355_c0  3.030  4.58E-08  3.81E-06  ENSGALT00000013057  hypothetical protein LOC777126; coenzyme Q10 homolog B (S. cerevisiae)
comp5224_c0  2.528  4.60E-08  3.81E-06  ENSGALT00000038039  jun oncogene
comp1115_c0  1.645  5.38E-08  4.33E-06  ENSGALT00000026453  inhibitor of DNA binding 2, dominant negative helix-loop-helix protein
comp15507_c0  2.875  5.55E-08  4.43E-06  ENSGALT00000000873  E74-like factor 3 (ets domain transcription factor, epithelial-specific)
comp4211_c0  2.736  5.72E-08  4.53E-06  ENSGALT00000031308  HIG1 hypoxia inducible domain family, member 1A
comp5262_c0  1.466  7.32E-08  5.76E-06  ENSGALT00000009924  phosphatidylinositol 4-kinase type 2 alpha
comp5512_c0  2.119  7.98E-08  6.18E-06  ENSGALT00000025878  Kruppel-like factor 10
comp5167_c0  3.245  1.47E-07  1.06E-05  ENSGALT00000016667  otokeratin

comp8682_c0  2.325  1.47E-07  1.06E-05  ENSGALT00000016761  solute carrier family 33 (acetyl-CoA transporter), member 1
comp4017_c0  2.401  1.84E-07  1.30E-05  ENSGALT00000038039  jun oncogene
comp1429_c0  1.756  2.61E-07  1.79E-05  ENSGALT00000019502  transducer of ERBB2, 2
comp600_c1  1.914  2.72E-07  1.85E-05  ENSGALT00000016518  solute carrier family 20 (phosphate transporter), member 2
comp718_c0  2.408  3.63E-07  2.43E-05  ENSGALT00000016005  actin, beta-like 2; actin, alpha, cardiac muscle 1; actin, alpha 1, skeletal muscle
comp1898_c0  2.327  3.78E-07  2.50E-05  ENSGALT00000010535  myosin, heavy chain 10, non-muscle; myosin, heavy chain 11, smooth muscle; similar to Myosin-11 (Myosin heavy chain, gizzard smooth muscle); similar to myosin, heavy chain 9, non-muscle
comp9867_c0  2.208  3.88E-07  2.56E-05  ENSGALT00000010797  prolyl 4-hydroxylase, alpha polypeptide II
comp12943_c0  3.781  4.20E-07  2.73E-05  ENSGALT00000014322  solute carrier family 1 (glutamate/neutral amino acid transporter), member 4
comp4219_c0  1.673  4.44E-07  2.87E-05  ENSGALT00000017391  low density lipoprotein receptor-related protein 8, apolipoprotein e receptor
comp5059_c0  3.077  4.99E-07  3.19E-05  ENSGALT00000010025  inhibitor of DNA binding 1, dominant negative helix-loop-helix protein
comp15271_c0  4.716  5.18E-07  3.27E-05  ENSGALT00000009017  coagulation factor III (thromboplastin, tissue factor)
comp3940_c1  2.147  6.08E-07  3.72E-05  ENSGALT00000016303  FOS-like antigen 2
comp3619_c0  -2.522  7.01E-07  4.23E-05  ENSGALT00000040327  complement component (3b/4b) receptor 1-like
comp6433_c0  -1.866  7.47E-07  4.46E-05  ENSGALT00000002667  rabaptin, RAB GTPase binding effector protein 1
comp5452_c0  2.762  7.54E-07  4.46E-05  ENSGALT00000008979  calponin 3, acidic
comp873_c0  2.526  7.54E-07  4.46E-05  ENSGALT00000018394  similar to sulfotransferase
comp5652_c0  1.643  8.57E-07  4.98E-05  ENSGALT00000004744  ENSGALG00000003000
comp13476_c0  3.356  9.09E-07  5.26E-05  ENSGALT00000019248  G protein-coupled receptor, family C, group 5, member A
comp17891_c0  3.675  1.13E-06  6.44E-05  ENSGALT00000012431  early growth response 1
comp9307_c0  1.719  1.21E-06  6.83E-05  ENSGALT00000017012  latent transforming growth factor beta binding protein 1
comp9990_c0  2.049  1.31E-06  7.28E-05  ENSGALT00000018765  creatine kinase, brain
comp4575_c0  2.391  2.40E-06  0.000128135  ENSGALT00000033605  myosin, heavy chain 10, non-muscle; myosin, heavy chain 11, smooth muscle; similar to Myosin-11 (Myosin heavy chain, gizzard smooth muscle); similar to myosin, heavy chain 9, non-muscle
comp3701_c0  1.774  2.42E-06  0.000128271  ENSGALT00000020388  Rho family GTPase 3
comp9384_c0  2.223  2.57E-06  0.000135878  ENSGALT00000027421  chromosome 13 open reading frame 31

comp7358_c0  -1.434  2.63E-06  0.000138023  ENSGALT00000015470  non-SMC element 4 homolog A (S. cerevisiae)
comp500_c0  1.453  2.83E-06  0.000147382  ENSGALT00000000739  ERBB receptor feedback inhibitor 1
comp11528_c0  2.402  2.85E-06  0.000147794  ENSGALT00000038821  poly (ADP-ribose) polymerase family, member 1
comp8255_c0  1.392  3.01E-06  0.000155142  ENSGALT00000013221  collagen, type IV, alpha 5 (Alport syndrome)
comp15708_c0  2.754  3.04E-06  0.000156243  ENSGALT00000014008  solute carrier family 39 (zinc transporter), member 12
comp8256_c0  1.854  3.11E-06  0.00015884  ENSGALT00000003792  neurochondrin
comp24221_c0  2.863  3.26E-06  0.000165626  ENSGALT00000016555  heat shock 70kDa protein 4-like
comp3424_c0  1.628  3.37E-06  0.00016987  ENSGALT00000018038  ras homolog gene family, member U
comp601_c0  1.563  3.44E-06  0.000172456  ENSGALT00000013137  similar to similar to 60 kDa heat shock protein, mitochondrial precursor (Hsp60) (60 kDa chaperonin) (CPN60) (Heat shock protein 60) (HSP-60) (Mitochondrial matrix protein P1)
comp9333_c0  1.770  3.74E-06  0.000184235  ENSGALT00000014738  limb bud and heart development homolog (mouse)
comp16695_c0  2.808  4.17E-06  0.000201592  ENSGALT00000037814  EPH receptor A5
comp8321_c0  2.212  4.22E-06  0.000203409  ENSGALT00000027351  immunoresponsive 1 homolog (mouse)
comp7603_c0  1.722  4.35E-06  0.00020781  ENSGALT00000025300  zinc finger, FYVE domain containing 28
comp2321_c0  -2.136  4.76E-06  0.000224122  ENSGALT00000001781  complement component 4 binding protein, alpha
comp339_c0  1.584  4.82E-06  0.00022628  ENSGALT00000007216  procollagen-lysine 1, 2-oxoglutarate 5-dioxygenase 1
comp1300_c0  2.345  5.17E-06  0.000240387  ENSGALT00000019425  transgelin
comp19218_c0  1.889  5.52E-06  0.00025575  ENSGALT00000014020  PTK7 protein tyrosine kinase 7
comp7485_c0  1.388  6.02E-06  0.000274138  ENSGALT00000026723  selenoprotein I
comp8733_c0  3.453  6.05E-06  0.000274582  ENSGALT00000022087  nuclear receptor subfamily 4, group A, member 3
comp8920_c0  -1.770  6.20E-06  0.000279896  ENSGALT00000026827  zinc finger protein 395
comp2124_c0  1.763  7.85E-06  0.000348827  ENSGALT00000005981  annexin A2
comp12528_c0  2.373  8.86E-06  0.000387551  ENSGALT00000018315  CDP-diacylglycerol synthase (phosphatidate cytidylyltransferase) 1
comp5379_c0  3.365  8.91E-06  0.000388011  ENSGALT00000038525  ENSGALG00000021139
comp9559_c0  2.356  9.03E-06  0.000391668  ENSGALT00000010975  interferon regulatory factor 1
comp5804_c0  2.459  1.12E-05  0.000474073  ENSGALT00000003969  solute carrier family 25, member 33
comp6580_c0  -1.487  1.17E-05  0.000488292  ENSGALT00000032890  solute carrier family 39 (zinc transporter), member 3
comp3012_c0  3.373  1.24E-05  0.000512033  ENSGALT00000010225  similar to 6-phosphofructo-2-kinase/fructose-2,6-biphosphatase 3 (6PF-2-K/Fru-2,6-P2ASE brain/placenta-type isozyme); 6-phosphofructo-2-kinase/fructose-2,6-biphosphatase 3

comp7467_c0  1.459  1.25E-05  0.000514711  ENSGALT00000026354  ring finger protein 139
comp10776_c0  1.970  1.29E-05  0.000527751  ENSGALT00000010538  hemopoietic cell kinase
comp11556_c0  1.306  1.31E-05  0.000534761  ENSGALT00000014128  synapse defective 1, Rho GTPase, homolog 2 (C. elegans)
comp3081_c0  1.151  1.42E-05  0.000573266  ENSGALT00000008128  membrane protein, palmitoylated 1, 55kDa
comp11606_c0  1.649  1.57E-05  0.000629635  ENSGALT00000001897  ENSGALG00000001250
comp10408_c0  6.383  1.61E-05  0.000643615  ENSGALT00000010816  ENSGALG00000006693
comp3337_c0  1.184  1.68E-05  0.000660727  ENSGALT00000005262  oxidative stress induced growth inhibitor 1
comp12390_c0  1.523  1.68E-05  0.000660727  ENSGALT00000007621  glycosyltransferase 25 domain containing 2
comp2312_c0  1.433  1.94E-05  0.000753094  ENSGALT00000015703  collagen, type I, alpha 2
comp13888_c0  1.779  2.38E-05  0.000913982  ENSGALT00000014712  neural proliferation, differentiation and control, 1
comp1931_c0  1.276  2.87E-05  0.001080105  ENSGALT00000031997  reticulon 4
comp13231_c0  2.122  2.91E-05  0.00109258  ENSGALT00000003054  ATP-binding cassette, sub-family A (ABC1), member 3
comp10740_c0  1.680  3.29E-05  0.001222601  ENSGALT00000038844  TMEM189-UBE2V1 readthrough transcript
comp14697_c0  -1.256  3.35E-05  0.001242222  ENSGALT00000011349  chromosome 10 open reading frame 33
comp7879_c0  2.328  3.43E-05  0.001268094  ENSGALT00000011642  suppressor of cytokine signaling 3
comp5071_c0  1.620  3.72E-05  0.001370721  ENSGALT00000023747  GC-rich promoter binding protein 1
comp7188_c0  1.206  3.88E-05  0.001407734  ENSGALT00000034231  RasGEF domain family, member 1A
comp10788_c0  1.476  3.90E-05  0.001409877  ENSGALT00000020742  myosin regulatory light chain interacting protein
comp26859_c0  2.861  4.07E-05  0.001462298  ENSGALT00000016472  polo-like kinase 3 (Drosophila)
comp7444_c0  1.178  4.08E-05  0.001462928  ENSGALT00000012832  laminin, beta 1
comp7082_c0  1.607  4.25E-05  0.001508398  ENSGALT00000030043  CD9 molecule
comp14074_c0  2.003  4.24E-05  0.001508398  ENSGALT00000040166  hairy and enhancer of split 1, (Drosophila)
comp14532_c0  1.937  4.50E-05  0.001560989  ENSGALT00000016555  heat shock 70kDa protein 4-like
comp2651_c0  1.619  4.64E-05  0.001601069  ENSGALT00000025873  v-ets erythroblastosis virus E26 oncogene homolog 2 (avian)
comp22977_c0  2.928  4.75E-05  0.001627905  ENSGALT00000015887  CCR4 carbon catabolite repression 4-like (S. cerevisiae)
comp744_c0  1.577  4.83E-05  0.001642909  ENSGALT00000004033  collagen, type III, alpha 1
comp12615_c0  1.256  4.91E-05  0.001663  ENSGALT00000010204  ENSGALG00000006311
comp1590_c0  1.971  4.99E-05  0.001677096  ENSGALT00000016005  actin, beta-like 2; actin, alpha, cardiac muscle 1; actin, alpha 1, skeletal muscle
comp7278_c0  1.458  4.98E-05  0.001677096  ENSGALT00000027197  collagen, type IV, alpha 1
comp8373_c0  1.692  6.65E-05  0.002190383  ENSGALT00000009704  AXIN1 up-regulated 1

comp13439_c0  1.285  6.85E-05  0.002247089  ENSGALT00000025838  sperm associated antigen 1; similar to infertility-related sperm protein
comp14750_c0  1.800  6.96E-05  0.002278038  ENSGALT00000008166  chromosome 17 open reading frame 71
comp4174_c0  2.085  7.01E-05  0.002287291  ENSGALT00000010534  myosin, heavy chain 10, non-muscle; myosin, heavy chain 11, smooth muscle; similar to Myosin-11 (Myosin heavy chain, gizzard smooth muscle); similar to myosin, heavy chain 9, non-muscle
comp19792_c0  1.543  7.10E-05  0.002309373  ENSGALT00000016682  TCDD-inducible poly(ADP-ribose) polymerase
comp12648_c0  1.850  7.50E-05  0.002424113  ENSGALT00000014784  EH-domain containing 3
comp32800_c0  2.996  7.52E-05  0.002425113  ENSGALT00000010134  sodium channel, nonvoltage-gated 1, gamma
comp15652_c0  2.358  7.79E-05  0.002503059  ENSGALT00000026447  SRY (sex determining region Y)-box 11
comp785_c0  1.364  7.90E-05  0.002525319  ENSGALT00000010382  pantothenate kinase 1
comp10771_c0  2.997  8.06E-05  0.002568639  ENSGALT00000004936  zinc finger, AN1-type domain 2A
comp4683_c0  1.155  8.44E-05  0.002674452  ENSGALT00000008132  family with sequence similarity 102, member A
comp5685_c0  1.745  8.69E-05  0.002737749  ENSGALT00000010386  ENSGALG00000006435
comp13510_c0  1.756  9.03E-05  0.002818975  ENSGALT00000020388  Rho family GTPase 3
comp21426_c0  4.216  0.000100473  0.003093441  ENSGALT00000010010  sodium channel, nonvoltage-gated 1, beta (Liddle syndrome)
comp4798_c0  1.559  0.00010228  0.003131423  ENSGALT00000009993  thioredoxin reductase 3
comp10527_c0  2.556  0.000102208  0.003131423  ENSGALT00000039200  similar to pim-3 protein
comp642_c0  1.003  0.000104442  0.003179807  ENSGALT00000005581  arrestin domain containing 2
comp10720_c0  4.357  0.000106665  0.003220889  ENSGALT00000008577  mesothelin
comp15985_c0  1.610  0.000106675  0.003220889  ENSGALT00000027598  solute carrier family 7 (cationic amino acid transporter, y+ system), member 1
comp13502_c0  1.616  0.000108309  0.00325981  ENSGALT00000022516  similar to Protein C18orf1
comp9379_c0  1.935  0.000117834  0.003509337  ENSGALT00000013695  interleukin-1 receptor-associated kinase 2
comp11655_c0  2.310  0.000120048  0.003565559  ENSGALT00000040248  heat shock 27kDa protein 1
comp3099_c0  1.567  0.000120795  0.003568356  ENSGALT00000013980  tubulin, beta 3; tubulin, beta 2C
comp9247_c0  1.370  0.000136025  0.003975286  ENSGALT00000006684  dipeptidyl-peptidase 9
comp18758_c0  4.205  0.00014117  0.004085052  ENSGALT00000003868  aquaporin 3 (Gill blood group)
comp18906_c0  2.077  0.000141276  0.004085052  ENSGALT00000039095  cadherin 13, H-cadherin (heart)
comp3688_c0  1.423  0.000143412  0.004135864  ENSGALT00000023747  GC-rich promoter binding protein 1
comp12347_c0  1.843  0.000144716  0.004162476  ENSGALT00000023900  thrombospondin 4
comp7253_c0  -1.124  0.000148174  0.004239628  ENSGALT00000024097  G elongation factor, mitochondrial 2
comp16946_c0  1.238  0.000148643  0.004241953  ENSGALT00000039291  jagged 1 (Alagille syndrome)
comp11837_c0  1.203  0.000164174  0.004624807  ENSGALT00000005663  fibronectin 1

comp10546_c0  1.346  0.000165985  0.004663796  ENSGALT00000007301  oligophrenin 1
comp8789_c1  1.430  0.000175218  0.004898028  ENSGALT00000029461  SMAD family member 7
comp15005_c0  1.619  0.000178539  0.004978144  ENSGALT00000001660  mex-3 homolog D (C. elegans)
comp15676_c0  2.248  0.000191889  0.005336767  ENSGALT00000009201  forkhead box F1
comp23361_c0  1.934  0.000218206  0.006022702  ENSGALT00000024449  Boc homolog (mouse)
comp24802_c0  1.681  0.000220371  0.006067148  ENSGALT00000017951  kelch-like 8 (Drosophila)
comp7552_c0  1.816  0.000227864  0.006250863  ENSGALT00000014902  ficolin (collagen/fibrinogen domain containing lectin) 2 (hucolin)
comp7860_c0  1.366  0.000235651  0.006423099  ENSGALT00000027073  LON peptidase N-terminal domain and ring finger 2
comp7154_c0  1.167  0.000252828  0.006823228  ENSGALT00000013746  neuroepithelial cell transforming gene 1
comp811_c1  1.176  0.000259085  0.006974867  ENSGALT00000000567  similar to AHNAK nucleoprotein
comp9947_c0  1.481  0.000290453  0.007705477  ENSGALT00000000667  phosphoglucomutase 2
comp7324_c0  1.231  0.000295531  0.007821188  ENSGALT00000038937  caveolin 1, caveolae protein, 22kDa
comp27160_c0  1.949  0.000314713  0.008232867  ENSGALT00000030209  ephrin-B2
comp2126_c0  3.561  0.000320102  0.008330267  ENSGALT00000018840  immunoglobulin J polypeptide, linker protein for immunoglobulin alpha and mu polypeptides
comp23718_c0  2.422  0.000321008  0.008334009  ENSGALT00000007915  solute carrier family 12 (sodium/potassium/chloride transporters), member 1; similar to Na/K/Cl cotransporter
comp5447_c0  1.182  0.000334866  0.008652676  ENSGALT00000025447  proline-rich nuclear receptor coactivator 1
comp5591_c0  -1.005  0.00034166  0.008807409  ENSGALT00000026884  ENSGALG00000016655
comp10456_c0  1.659  0.000358143  0.009151228  ENSGALT00000019358  ENSGALG00000011865
comp6640_c0  2.651  0.000363395  0.00925853  ENSGALT00000008979  calponin 3, acidic
comp9917_c0  1.429  0.000365687  0.00926907  ENSGALT00000006090  similar to Adenylate cyclase type 7 (Adenylate cyclase type VII) (ATP pyrophosphate-lyase 7) (Adenylyl cyclase 7); adenylate cyclase 7
comp17940_c0  4.637  0.000366353  0.00926907  ENSGALT00000013059  carboxypeptidase A2 (pancreatic)
comp14469_c0  1.276  0.000365313  0.00926907  ENSGALT00000038652  interleukin 16 (lymphocyte chemoattractant factor)
comp12331_c0  1.444  0.000385367  0.009727621  ENSGALT00000005163  polymerase I and transcript release factor
comp13209_c0  1.619  0.000410176  0.010284086  ENSGALT00000027208  collagen, type IV, alpha 2
comp13563_c0  1.180  0.000417287  0.010436948  ENSGALT00000014476  sperm specific antigen 2
comp1884_c0  5.197  0.00042498  0.010605094  ENSGALT00000014902  ficolin (collagen/fibrinogen domain containing lectin) 2 (hucolin)
comp8153_c0  1.384  0.00042802  0.01060829  ENSGALT00000028082  insulin induced gene 1
comp9474_c0  1.537  0.000433503  0.010719879  ENSGALT00000020165  SAM and SH3 domain containing 1

comp3427_c1  -1.284  0.000446432  0.010990998  ENSGALT00000026141  lysine (K)-specific demethylase 6A
comp1404_c0  2.004  0.000461457  0.011283504  ENSGALT00000039927  actin, alpha 2, smooth muscle, aorta; actin, gamma 2, smooth muscle, enteric
comp5190_c0  1.137  0.000471054  0.011416017  ENSGALT00000007502  similar to Moesin (Membrane-organizing extension spike protein)
comp5915_c0  1.042  0.000477747  0.011527097  ENSGALT00000027924  von Willebrand factor
comp17641_c0  2.116  0.000493807  0.011862227  ENSGALT00000021542  phosphatidylinositol transfer protein, membrane-associated 1
comp36012_c0  -3.239  0.000501557  0.0119715  ENSGALT00000009066  ENSGALG00000005652
comp11753_c0  1.095  0.000500764  0.0119715  ENSGALT00000033722  solute carrier family 5 (sodium/glucose cotransporter), member 11
comp14699_c0  1.253  0.000501642  0.0119715  ENSGALT00000037891  ENSGALG00000023153
comp3427_c0  1.010  0.000553588  0.013125194  ENSGALT00000014009  inositol 1,4,5-trisphosphate 3-kinase A
comp18011_c0  1.439  0.000561443  0.013243614  ENSGALT00000011035  kelch-like 25 (Drosophila)
comp3533_c0  1.143  0.000563429  0.013243614  ENSGALT00000015706  collagen, type I, alpha 2
comp7493_c0  1.535  0.000581856  0.013589065  ENSGALT00000004590  connective tissue growth factor
comp7407_c0  1.371  0.000597162  0.013916804  ENSGALT00000016845  transforming growth factor, beta 3
comp4274_c0  -0.871  0.000606453  0.014103265  ENSGALT00000006389  argininosuccinate synthetase 1; hypothetical LOC425164
comp8584_c0  1.059  0.00061263  0.014200563  ENSGALT00000014514  UDP-GlcNAc:betaGal beta-1,3-N-acetylglucosaminyltransferase 2
comp14289_c0  1.647  0.000630088  0.014508064  ENSGALT00000020523  TIMP metallopeptidase inhibitor 3 (Sorsby fundus dystrophy, pseudoinflammatory)
comp1362_c0  1.137  0.000651577  0.01489902  ENSGALT00000011784  heat shock 70kDa protein 4
comp4396_c0  1.075  0.000662912  0.015063676  ENSGALT00000010428  RNA binding motif protein, X-linked
comp1409_c1  -0.920  0.000682116  0.015403991  ENSGALT00000000735  splicing factor, arginine/serine-rich 3
comp7488_c0  1.141  0.000697072  0.015676948  ENSGALT00000011887  thioredoxin domain containing 14
comp7279_c0  1.207  0.000699526  0.015688794  ENSGALT00000012260  dihydropyrimidinase-like 3
comp16504_c0  1.142  0.00070047  0.015688794  ENSGALT00000027329  protocadherin 9
comp17977_c0  1.784  0.000735491  0.01637254  ENSGALT00000040952  Rho-related BTB domain containing 2
comp15843_c0  1.670  0.000739968  0.016438725  ENSGALT00000009624  SRY (sex determining region Y)-box 18
comp27485_c0  1.762  0.000744488  0.016505585  ENSGALT00000039200  similar to pim-3 protein
comp21191_c0  1.943  0.000757709  0.016730827  ENSGALT00000030447  dermatopontin
comp16199_c1  1.415  0.000761316  0.016776576  ENSGALT00000022138  F-box protein 5
comp19360_c0  1.397  0.000773334  0.017007115  ENSGALT00000016617  zinc finger, SWIM-type containing 5
comp3893_c0  1.041  0.000775425  0.017018864  ENSGALT00000004006  regulator of G-protein signaling 2, 24kDa

comp306_c4  0.871  0.000813858  0.017755415  ENSGALT00000002797  family with sequence similarity 65, member A
comp7475_c0  -1.047  0.000897492  0.019492021  ENSGALT00000010074  UDP-N-acetyl-alpha-D-galactosamine:polypeptide N-acetylgalactosaminyltransferase 11 (GalNAc-T11)
comp12801_c0  -1.251  0.000911676  0.019731918  ENSGALT00000019560  myozenin 2
comp8282_c0  1.555  0.000917492  0.019818555  ENSGALT00000020914  tubulin, beta 6; similar to Tubulin beta-5 chain (Beta-tubulin class-V); similar to tubulin, beta 3; tubulin, beta 2B
comp5517_c0  1.401  0.000948166  0.020400505  ENSGALT00000022476  tubulin, beta 3; tubulin, beta 2C
comp12927_c0  -1.236  0.000962594  0.020670233  ENSGALT00000020304  similar to Pyridoxal (pyridoxine, vitamin B6) phosphatase
comp5091_c0  1.144  0.000968099  0.020747696  ENSGALT00000037815  lectin, galactoside-binding, soluble, 1
comp22435_c0  1.729  0.000972961  0.020789232  ENSGALT00000024183  ENSGALG00000014995
comp31252_c0  1.837  0.000992264  0.020977658  ENSGALT00000036098  hexokinase 2
comp3007_c0  1.120  0.001003584  0.021176017  ENSGALT00000030506  ENSGALG00000019256
comp14815_c0  1.325  0.00104163  0.021927673  ENSGALT00000006994  SRY (sex determining region Y)-box 9
comp5762_c0  0.863  0.00104322  0.021927673  ENSGALT00000024811  ENSGALG00000015377
comp18357_c0  1.471  0.001054127  0.022048494  ENSGALT00000023113  solute carrier family 25 (mitochondrial carrier: glutamate), member 22
comp12831_c0  1.057  0.001055019  0.022048494  ENSGALT00000025747  spermine oxidase
comp16259_c0  1.815  0.001061297  0.022137361  ENSGALT00000001808  ENSGALG00000001192
comp995_c0  1.020  0.00106835  0.022217231  ENSGALT00000006265  syndecan 4
comp2391_c0  0.850  0.001076862  0.022334164  ENSGALT00000039821  spen homolog, transcriptional regulator (Drosophila)
comp12715_c0  -1.183  0.001092954  0.022539593  ENSGALT00000008587  zinc finger and BTB domain containing 5
comp15558_c0  1.457  0.001100002  0.022642223  ENSGALT00000002318  CTTNBP2 N-terminal like
comp4346_c0  0.924  0.001121598  0.023043364  ENSGALT00000020536  transducin-like enhancer of split 1 (E(sp1) homolog, Drosophila)
comp4184_c0  1.412  0.001153415  0.02356795  ENSGALT00000018395  decorin
comp17859_c0  1.475  0.0011536  0.02356795  ENSGALT00000019257  LOC417962
comp1963_c1  1.145  0.001158082  0.023615368  ENSGALT00000009107  filamin B, beta (actin binding protein 278)
comp2022_c0  1.325  0.00119194  0.024173311  ENSGALT00000000194  MHC class II antigen B-F minor heavy chain; MHC class II beta chain
comp7379_c0  1.049  0.001192078  0.024173311  ENSGALT00000004533  zinc finger and BTB domain containing 38
comp11425_c0  1.081  0.001195454  0.024186479  ENSGALT00000013683  hypothetical LOC429729; similar to GAC-1; ankyrin repeat domain 12
comp942_c0  1.293  0.001206326  0.024282025  ENSGALT00000006650  secreted protein, acidic, cysteine-rich (osteonectin)
comp7942_c0  1.142  0.00120604  0.024282025  ENSGALT00000030477  follistatin-like 1

comp5199_c0  1.317  0.001212958  0.024325931  ENSGALT00000020914  tubulin, beta 6; similar to Tubulin beta-5 chain (Beta-tubulin class-V); similar to tubulin, beta 3; tubulin, beta 2B
comp7900_c0  1.079  0.001212274  0.024325931  ENSGALT00000027550  spastic paraplegia 20 (Troyer syndrome)
comp7901_c0  1.113  0.001268037  0.025291328  ENSGALT00000021004  cAMP responsive element binding protein 3-like 2
comp25286_c0  1.430  0.001282156  0.025526349  ENSGALT00000022483  tumor necrosis factor, alpha-induced protein 3
comp11906_c0  1.188  0.001287124  0.025556861  ENSGALT00000013447  ENSGALG00000008266
comp11806_c0  1.081  0.001288365  0.025556861  ENSGALT00000015518  pleckstrin homology domain containing, family A (phosphoinositide binding specific) member 1
comp9547_c0  1.028  0.001302779  0.025702842  ENSGALT00000022686  erythrocyte membrane protein band 4.1-like 2
comp3469_c0  1.275  0.001308274  0.025764754  ENSGALT00000019173  matrix Gla protein
comp3537_c0  1.478  0.0013658  0.026801075  ENSGALT00000031766  serum/glucocorticoid regulated kinase 1
comp7445_c0  1.085  0.001403534  0.027442979  ENSGALT00000015940  Meis homeobox 2
comp151_c0  -1.090  0.00144738  0.028142536  ENSGALT00000002382  5'-nucleotidase domain containing 2
comp757_c0  1.913  0.001449151  0.028142536  ENSGALT00000038351  glycine dehydrogenase (decarboxylating)
comp6844_c0  1.443  0.001452186  0.028142536  ENSGALT00000040568  similar to Glutathione peroxidase 3 precursor (GSHPx-3) (GPx-3) (Plasma glutathione peroxidase) (GSHPx-P)
comp10749_c1  1.809  0.001505804  0.028925206  ENSGALT00000011354  cerebellar degeneration-related protein 2, 62kDa
comp15340_c0  1.394  0.001509786  0.028950799  ENSGALT00000039507  similar to phosphoinositide 3-kinase catalytic subunit; similar to Phosphatidylinositol 3-kinase, catalytic, alpha polypeptide; phosphoinositide-3-kinase, catalytic, alpha polypeptide
comp4931_c0  1.796  0.001586218  0.029995445  ENSGALT00000007969  TIMP metallopeptidase inhibitor 4
comp5630_c0  1.097  0.001606855  0.030280907  ENSGALT00000000233  Kruppel-like factor 3 (basic)
comp10647_c0  -1.014  0.001617293  0.030372867  ENSGALT00000026409  ATPase family, AAA domain containing 2
comp8776_c0  0.922  0.001685531  0.031067211  ENSGALT00000013447  ENSGALG00000008266
comp21645_c0  1.912  0.001710729  0.031478563  ENSGALT00000017274  glutaminyl-peptide cyclotransferase (glutaminyl cyclase)
comp13967_c0  3.404  0.001721018  0.031614673  ENSGALT00000010737  similar to fatty acid Coenzyme A ligase, long chain 6; acyl-CoA synthetase long-chain family member 6
comp5508_c0  -0.891  0.001752817  0.032090929  ENSGALT00000010812  abhydrolase domain containing 2
comp15108_c0  1.261  0.001756009  0.032095614  ENSGALT00000037297  hypothetical LOC430294; ets variant gene 6 (TEL oncogene)

comp107_c0  -1.198  0.001787814  0.032568018  ENSGALT00000012129  kynurenine 3-monooxygenase (kynurenine 3-hydroxylase)
comp12477_c0  1.267  0.00181505  0.033009143  ENSGALT00000025676  RNA binding motif protein 12B
comp19040_c0  1.021  0.001859242  0.033644899  ENSGALT00000002321  von Willebrand factor A domain containing 1
comp4019_c0  -0.919  0.001889656  0.034138743  ENSGALT00000020197  chromosome 6 open reading frame 72
comp14752_c0  0.959  0.001903713  0.034324373  ENSGALT00000017630  BTB (POZ) domain containing 7
comp10272_c0  -1.039  0.001906212  0.034324373  ENSGALT00000025419  GA binding protein transcription factor, alpha subunit 60kDa
comp20071_c0  -1.619  0.001914911  0.034376312  ENSGALT00000017505  F-box protein 8
comp13965_c0  1.045  0.001976957  0.035307426  ENSGALT00000001259  solute carrier family 7 (cationic amino acid transporter, y+ system), member 6
comp10160_c0  1.435  0.001981036  0.035322558  ENSGALT00000015236  SKI-like oncogene
comp1105_c0  3.643  0.001996838  0.035546327  ENSGALT00000008254  hemicentin 1
comp12449_c0  1.241  0.002027729  0.035979028  ENSGALT00000011898  B-cell CLL/lymphoma 6 (zinc finger protein 51)
comp8111_c0  0.860  0.002044913  0.036225123  ENSGALT00000028075  YY1 transcription factor
comp7228_c0  0.838  0.002048778  0.036234866  ENSGALT00000027884  PCF11, cleavage and polyadenylation factor subunit, homolog (S. cerevisiae); similar to Pre-mRNA cleavage complex 2 protein Pcf11 (Pre-mRNA cleavage complex II protein Pcf11)
comp17992_c0  1.151  0.002057153  0.036324198  ENSGALT00000012164  ATP-binding cassette, sub-family C (CFTR/MRP), member 3
comp12204_c0  1.056  0.002075157  0.036465383  ENSGALT00000013811  arrestin domain containing 1
comp12745_c0  0.897  0.002107579  0.036916415  ENSGALT00000005666  matrix metallopeptidase 2 (gelatinase A, 72kDa gelatinase, 72kDa type IV collagenase)
comp7593_c0  -1.186  0.002117473  0.037030375  ENSGALT00000023286  cystatin A (stefin A)
comp12166_c0  -1.682  0.002121594  0.037043175  ENSGALT00000017129  deiodinase, iodothyronine, type II
comp15238_c0  -1.201  0.002136882  0.037191282  ENSGALT00000020758  thioredoxin reductase 1
comp8206_c0  0.924  0.002177914  0.03780884  ENSGALT00000019909  FK506 binding protein 9, 63 kDa
comp15220_c0  1.473  0.002191481  0.03780884  ENSGALT00000020823  DNA-damage regulated autophagy modulator 1
comp7258_c0  2.069  0.002184019  0.03780884  ENSGALT00000038579  lysozyme (renal amyloidosis)
comp6649_c0  0.991  0.002198649  0.03784447  ENSGALT00000009887  collagen, type VI, alpha 2
comp7621_c0  1.083  0.002213458  0.038039452  ENSGALT00000007472  mitogen-activated protein kinase 6; similar to Mapk6 protein
comp16123_c0  1.540  0.002222549  0.038135724  ENSGALT00000015681  potassium channel tetramerisation domain containing 12
comp9735_c0  -0.974  0.002248903  0.038381127  ENSGALT00000027050  unc-50 homolog (C. elegans)

comp22885_c0  1.556  0.002260425  0.038483562  ENSGALT00000039866  core-binding factor, beta subunit
comp4032_c0  1.007  0.002294769  0.038946937  ENSGALT00000026595  ras homolog gene family, member B
comp6524_c0  1.023  0.002305336  0.039065618  ENSGALT00000014679  ST6 (alpha-N-acetyl-neuraminyl-2,3-beta-galactosyl-1,3)-N-acetylgalactosaminide alpha-2,6-sialyltransferase 5
comp10763_c0  1.267  0.002379476  0.040011805  ENSGALT00000008888  microtubule associated monoxygenase, calponin and LIM domain containing 2
comp11790_c0  -1.111  0.002387458  0.040084349  ENSGALT00000017008  KIAA0494
comp6183_c0  1.361  0.00240629  0.040338573  ENSGALT00000027873  frizzled homolog 4 (Drosophila)
comp18106_c0  1.630  0.002410497  0.040347218  ENSGALT00000023963  3-oxoacid CoA transferase 1
comp8472_c0  1.008  0.002465712  0.041042535  ENSGALT00000013396  protein kinase C and casein kinase substrate in neurons 3
comp5033_c0  0.877  0.002507304  0.041648681  ENSGALT00000021345  caldesmon 1
comp22340_c0  1.354  0.002525944  0.041894634  ENSGALT00000009128  cysteine-rich secretory protein LCCL domain containing 2
comp7120_c0  0.947  0.002531869  0.041929278  ENSGALT00000012918  TBC1 domain family, member 10A
comp15550_c0  1.027  0.002550303  0.042170669  ENSGALT00000004300  kinesin family member 1B
comp25813_c0  1.907  0.002566246  0.042370188  ENSGALT00000005686  ankyrin 3
comp5083_c0  1.228  0.00259347  0.042690703  ENSGALT00000032821  annexin A1
comp5865_c0  0.663  0.002591922  0.042690703  ENSGALT00000039610  Nedd4 binding protein 1
comp3890_c0  3.895  0.00266161  0.043680769  ENSGALT00000030078  vitelline membrane outer layer 1 homolog (chicken)
comp8788_c0  0.911  0.002765793  0.045187025  ENSGALT00000027924  von Willebrand factor
comp13049_c0  1.574  0.002770567  0.045197462  ENSGALT00000039870  ENSGALG00000013255
comp5716_c0  0.983  0.002783781  0.045345349  ENSGALT00000005471  zinc finger protein 207
comp10327_c1  1.023  0.002799144  0.045504652  ENSGALT00000009637  phosphodiesterase 8A
comp11630_c0  1.002  0.002810639  0.045579055  ENSGALT00000016941  ENSGALG00000010405
comp14015_c0  1.621  0.002841164  0.04600581  ENSGALT00000004380  tetraspanin 3
comp12506_c0  1.105  0.002921702  0.047239943  ENSGALT00000005513  cyclin D3
comp7831_c0  1.108  0.003007855  0.04847681  ENSGALT00000008364  aquaporin 1 (Colton blood group)
comp18703_c0  2.000  0.003011505  0.04847681  ENSGALT00000020801  neural precursor cell expressed, developmentally down-regulated 9
comp5701_c0  -0.898  0.003064736  0.049116656  ENSGALT00000037683  t-complex 11 (mouse)-like 2
comp12030_c0  1.200  0.003088337  0.04927813  ENSGALT00000033813  chemokine (C-X-C motif) receptor 7


CHAPTER 7. CONCLUSION

Animals living in different habitats must respond to different types of stress. The genes and molecular networks they use to respond to environmental stresses often affect other traits such as immune function, behavior, reproduction, and longevity. The goal of this body of research was to utilize a naturally evolved garter snake system to test the hypothesis that different environmentally derived physiological stresses may have caused the evolution of molecular networks underlying stress response; consequently, the evolution of these networks may drive the divergence in the life-history traits that we see in the garter snake ecotypes. To address this hypothesis, I used a heat stress experiment on lab-born juveniles derived from these two garter snake ecotypes. I employed a systems biology approach to characterize the immediate response to heat stress at multiple biological levels within the individual (genetic, transcriptomic, metabolic, and physiological) and beyond the individual (populations, ecotypes, generations) to understand how the molecular stress networks can evolve. I found that heat stress induces changes at multiple levels of biological complexity: (i) the transcriptional regulation of multiple functional classes of genes promoting cellular protection, cell communication and environmental sensing, including the up-regulation of the mitochondrial rRNAs; (ii) increased mitochondrial respiration; and (iii) increased plasma corticosterone. These data are useful for comparisons to other species, including endothermic organisms, to predict the larger effects of stress on organisms in general and reptiles specifically. Additionally, they provide mechanistic data on the effect of heat stress on natural populations for the new field of conservation physiology, specifically in the face of climate change.
Between the garter snake ecotypes, we have identified multiple nodes in the stress response molecular network that were variable either under control conditions or in how the ecotypes respond to stress. Once again these nodes were found at multiple levels of organization: mitochondrial DNA sequence; overall transcription of the mitochondrial genome; mitochondrial respiration; circulating levels of reactive oxygen species; liver gene expression of a key antioxidant; and erythrocyte DNA damage in response to heat. These results support the hypothesis that these evolutionarily divergent life-history ecotypes have also diverged in their molecular stress response networks. In addition, we have identified specific nodes involved in oxidative stress and mitochondrial function at which selection appears to be acting. More broadly, these results lend further support to the prediction of tightly integrated molecular interactions between stress networks and life-history traits, thus supporting findings from studies on laboratory model organisms. Finally, this research furthers our understanding of how diverging environmental stresses may drive the evolution of molecular networks and our ability to predict how environmental changes will affect species' ability to respond and survive in our dramatically changing world.

FUTURE DIRECTIONS

Multiple projects are ongoing that complement or follow up on the results from this body of research. An additional heat stress experiment was conducted on another set of animals in winter 2012, and 20 female livers were once again used for RNA-seq (paired-end 100 bp, Illumina HiSeq). In collaboration with Suzanne McGaugh, these data are being analyzed for quantitative variation in gene expression and sequence variation across the populations/ecotypes. This study increases the quality and quantity of the data presented in Chapter 6, and is expected to provide the statistical power to test for differences in transcription among the ecotypes.

An ecological genomics project funded by an NSF doctoral dissertation improvement grant is underway to compare the protein-coding sequences of 500 genes from specific genetic pathways across populations of the two ecotypes. This project uses sequence-capture technology to specifically target gene sequences from 96 individuals. These target genes include mitochondrial genomes, and genes involved in metabolism, oxidative stress and the insulin/insulin-like signaling (IIS) pathway, as well as genes of interest from other lab members. Data from this project, in combination with the previously described RNA-seq datasets, will provide sequence data for 120 genetically unique individuals from 6 populations across the Eagle Lake landscape. These data will be used to test for protein-level evolution of stress and metabolic pathways across populations and ecotypes. These sequence data will also contribute to a collaborative Bronikowski-lab project that is specifically testing for differences between the ecotypes in the insulin/insulin-like signaling pathway in (i) protein-coding sequences of genes in the IIS, (ii) transcriptional regulation of the IGFs and IGF receptors across tissues, and (iii) abundance of IGFs in the plasma. Additionally, these data will allow us to follow up on results from Chapter 5 that suggest divergence in mitochondrial haplotypes between the ecotypes.

In summary, these sequence data, in combination with the gene expression data and the physiological measurements, are expected to provide a comprehensive understanding of how the molecular stress networks are evolving among these life-history ecotypes.


APPENDIX A. CO-AUTHORED MANUSCRIPTS RELATED TO THIS DISSERTATION

MANUSCRIPT 1. EVOLUTIONARY RATES VARY IN VERTEBRATES FOR INSULIN-LIKE GROWTH FACTOR-1 (IGF-1), A PLEIOTROPIC LOCUS INVOLVED IN LIFE HISTORY TRAITS

This manuscript is in press in the peer-reviewed scientific journal, General and Comparative Endocrinology

Amanda M. Sparkman*, Tonia S. Schwartz*, Jill M. Madden, Scott Boyken, Neil Ford, Jeanne Serb, and Anne M. Bronikowski

*Authors contributed equally to this manuscript

Insulin-like growth factor-1 (IGF-1) is a member of the vertebrate insulin/insulin-like growth factor/relaxin gene family necessary for growth, reproduction, and survival at both the cellular and organismal level. Its sequence, protein structure, and function have been characterized in mammals, birds, and fish; however, a notable gap in our current knowledge of the function of IGF-1 and its molecular evolution is the absence of information on ectothermic reptiles. To address this disparity, we sequenced the coding region of IGF-1 in eleven reptile species: one crocodilian, three turtles, three lizards, and four snakes. Complete sequencing of the full mRNA transcript of a snake revealed the Ea isoform, the predominant isoform of IGF-1 also reported in other vertebrate groups. A gene tree of the IGF-1 protein-coding region that incorporated sequences from diverse vertebrate groups showed similarity to the species phylogeny, with the exception of the placement of Testudines as sister group to Aves, due to their high nucleotide sequence similarity. In contrast, long branch lengths indicate more rapid divergence in IGF-1 among lizards and snakes. Additionally, lepidosaurs (i.e., lizards and snakes) had higher rates of nonsynonymous:synonymous substitutions (dN/dS) relative to archosaurs (i.e., birds and crocodilians) and turtles. Tests for positive selection on specific codons within branches, together with evaluation of the changes in the amino acid properties, suggested positive selection in lepidosaurs on the C domain of IGF-1, which is involved in binding affinity to the IGF-1 receptor. Predicted structural changes suggest that major alterations in protein structure and function may have occurred in reptiles. These data offer new insights into the molecular co-evolution of IGF-1 and its receptors, and ultimately the evolution of IGF-1's role in regulating life-history traits across vertebrates.


MANUSCRIPT 2. MULTIPLE PATERNITY AND MALE REPRODUCTIVE SUCCESS IN A METAPOPULATION OF GARTER SNAKES WITH EVOLUTIONARILY DIVERGENT LIFE-HISTORIES

This manuscript is in review at the peer-reviewed scientific journal, Molecular Ecology

Megan Manes, Tonia S. Schwartz, Kylie A. Robert, and Anne M. Bronikowski

In animal species that can have multiple sires for each female's reproductive event, many questions remain equivocal in the literature, such as what environmental factors affect the frequency of multiple paternity, the optimum number of sires per litter, and male and female optimal reproductive strategies. Here, we take advantage of natural life-history divergence among populations of garter snakes to address these questions in a robust field setting. These populations have diverged along a slow-to-fast life-history continuum; at one end, individuals have fast growth to maturation, large reproductive effort, and short lifespan relative to individuals in slow-lived populations. We utilize molecular markers to determine the minimum number of sires in 56 litters and estimate male reproductive success from replicate populations of both life-history strategies. We found that despite dramatic differences in average annual female reproductive output: (1) females of both life-history strategies average the same number of sires per litter, although simulations based on a numerical odds argument predict that fast life-history females, with their larger litter sizes, should have twice the number of sires as slow life-history females; (2) the percentage of multiply sired litters did not differ between life-history strategies; and (3) males of the fast life-history strategy have higher reproductive success with more variance among males, relative to slow life-history males. Together, these results indicate strong intrasexual competition among males, particularly in the fast-paced life-history ecotype. We discuss these results in the context of competing hypotheses for multiple paternity related to bet-hedging, resource variability, and life-history strategies.


MANUSCRIPT 3. REPTILIAN TELOMERASE REMAINS ACTIVE IN ADULT TISSUES AND ITS ACTIVITY PEAKS UNDER HEAT STRESS

This manuscript is in preparation for submission to the peer-reviewed scientific journal, Biology Letters

Tonia S. Schwartz and Anne M. Bronikowski

Reptiles display dramatic flexibility in metabolism and resistance to physiological stress, and many species have negligible senescence physiologically and/or reproductively, thus making good comparative models for aging and stress response. Despite this, there is insufficient information on this group for molecular components of the aging process, including telomerase, a key enzyme in maintaining telomere length, repairing DNA damage, and cell longevity. Here we assay the activity of a reptilian (garter snake, Thamnophis elegans) telomerase across tissues, ages, and temperatures. We find evidence that 1) telomerase activity is maintained in adult reptile tissues, and 2) enzymatic activity is optimized to heat stress temperature as opposed to preferred body temperature. Our results support the hypothesis that telomerase activity is maintained in reptiles, and thus may contribute to negligible senescence through somatic maintenance during physiological stress. Intriguingly, telomerase activity peaked at heat stress temperature in this species, as opposed to preferred body temperature as is seen in mammals, setting up multiple hypotheses as to how telomerase activity evolves. The results of this research provide the first functional information on ectothermic reptilian telomerase, a fundamentally important enzyme in the aging and stress response process. Comparative analyses with telomerase functions in mammals will provide new insight into telomerase evolution and its role in the evolution of life span and stress response.


APPENDIX B. CODE FOR PROCESSING DATA AND STATISTICAL ANALYSES

GALAXY WORKFLOWS FOR PROCESSING ILLUMINA DATA

Cleaning Illumina reads: http://main.g2.bx.psu.edu/u/ts-ecogen/w/groomillumina-solexatrimq20filter20bp

Mapping Illumina reads to candidate genes: http://main.g2.bx.psu.edu/u/ts-ecogen/w/map-illumina-indiv-to-candidate-transcripts-and-count

SAS CODE FOR STATISTICAL ANALYSES

Individual physiological variables, full ANOVA model

/* -------------------------------------------------------------------
   Code generated by SAS Task

   Generated on: Monday, August 13, 2012 at 12:59:20 PM
   By task: Mixed Models ANOVAs

   Input Data: WORK.HEATSTRESSPHYSIOLOGY_DATA_DRYAD
   Server: Local
   ------------------------------------------------------------------- */
ODS GRAPHICS ON;

%_eg_conditional_dropds(WORK.SORTTempTableSorted);
/* -------------------------------------------------------------------
   Sort data set WORK.HEATSTRESSPHYSIOLOGY_DATA_DRYAD
   ------------------------------------------------------------------- */
PROC SQL;
    CREATE VIEW WORK.SORTTempTableSorted AS
        SELECT T.Cort_log10, T.ln_mass, T.treatment, T.expday, T.ecotype, T.pop, T.Sex, T.Litter
        FROM WORK.HEATSTRESSPHYSIOLOGY_DATA_DRYAD as T
    ;
QUIT;
TITLE;
TITLE1 "Mixed Models Analysis of: Cort_log10";
FOOTNOTE;
FOOTNOTE1 "Generated by the SAS System (&_SASSERVERNAME, &SYSSCPL) on %TRIM(%QSYSFUNC(DATE(), NLDATE20.)) at %TRIM(%SYSFUNC(TIME(), TIMEAMPM12.))";
PROC MIXED DATA = WORK.SORTTempTableSorted
    PLOTS(ONLY)=ALL
    METHOD=REML
    ;
    CLASS treatment expday ecotype pop Sex Litter;
    /* Fixed effects: body mass covariate, experiment day, and the full
       treatment x ecotype x Sex factorial */
    MODEL Cort_log10 = ln_mass expday treatment Sex ecotype
          treatment*ecotype treatment*Sex ecotype*Sex treatment*ecotype*Sex
        / HTYPE=3 SOLUTION CL ALPHA=0.05 INTERCEPT E3;
    /* Random effect: litter nested within population and ecotype */
    RANDOM Litter(pop ecotype) / TYPE=VC;
    LSMEANS treatment Sex ecotype
            treatment*ecotype treatment*Sex ecotype*Sex treatment*ecotype*Sex
        / PDIFF=ALL;
/* -------------------------------------------------------------------
   End of task code.
   ------------------------------------------------------------------- */
/*PROC MIXED DATA = WORK.SORTTempTableSorted
    PLOTS(ONLY)=ALL
    METHOD=REML
    ;
    CLASS expday treatment ecotype pop Sex Litter;
    MODEL "Cort_log10"n = ln_mass expday treatment Sex ecotype
          treatment*ecotype treatment*Sex ecotype*Sex treatment*ecotype*Sex
        / HTYPE=1 3 SOLUTION CL ALPHA=0.05 INTERCEPT E1 E3;
    RANDOM Litter(pop ecotype) / TYPE=VC;
    LSMEANS treatment Sex ecotype
            treatment*ecotype treatment*Sex ecotype*Sex treatment*ecotype*Sex
        / PDIFF=ALL;
*/
RUN; QUIT;
%_eg_conditional_dropds(WORK.SORTTempTableSorted);
TITLE; FOOTNOTE;
ODS GRAPHICS OFF;

Individual physiological variables, simple ANOVA model

/* -------------------------------------------------------------------
   Code generated by SAS Task

   Generated on: Monday, August 13, 2012 at 12:59:20 PM
   By task: Mixed Models ANOVAs

   Input Data: WORK.HEATSTRESSPHYSIOLOGY_DATA_DRYAD
   Server: Local
   ------------------------------------------------------------------- */
ODS GRAPHICS ON;

%_eg_conditional_dropds(WORK.SORTTempTableSorted);
/* -------------------------------------------------------------------
   Sort data set WORK.HEATSTRESSPHYSIOLOGY_DATA_DRYAD
   ------------------------------------------------------------------- */
PROC SQL;
    /* avgMass is added to the SELECT list here because the MODEL
       statement below uses it as a covariate */
    CREATE VIEW WORK.SORTTempTableSorted AS
        SELECT T.mt8H2O2prod_5to35min, T.avgMass, T.expday, T.treatment, T.ecotype
        FROM WORK.HEATSTRESSPHYSIOLOGY_DATA_DRYAD as T
    ;
QUIT;
TITLE;
TITLE1 "Mixed Models Analysis of: mt8H2O2prod_5to35min";
FOOTNOTE;
FOOTNOTE1 "Generated by the SAS System (&_SASSERVERNAME, &SYSSCPL) on %TRIM(%QSYSFUNC(DATE(), NLDATE20.)) at %TRIM(%SYSFUNC(TIME(), TIMEAMPM12.))";
PROC MIXED DATA = WORK.SORTTempTableSorted
    PLOTS(ONLY)=ALL
    METHOD=REML
    ;
    CLASS expday treatment ecotype;
    MODEL mt8H2O2prod_5to35min = avgMass expday treatment ecotype treatment*ecotype
        / HTYPE=3 SOLUTION CL ALPHA=0.05 INTERCEPT E3;
    LSMEANS treatment ecotype treatment*ecotype / PDIFF=ALL;
/* -------------------------------------------------------------------
   End of task code.
   ------------------------------------------------------------------- */
/*PROC MIXED DATA = WORK.SORTTempTableSorted
    PLOTS(ONLY)=ALL
    METHOD=REML
    ;
    CLASS expday treatment ecotype pop Sex Litter;
    MODEL "Cort_log10"n = ln_mass expday treatment Sex ecotype
          treatment*ecotype treatment*Sex ecotype*Sex treatment*ecotype*Sex
        / HTYPE=1 3 SOLUTION CL ALPHA=0.05 INTERCEPT E1 E3;
    RANDOM Litter(pop ecotype) / TYPE=VC;
    LSMEANS treatment Sex ecotype
            treatment*ecotype treatment*Sex ecotype*Sex treatment*ecotype*Sex
        / PDIFF=ALL;
*/
RUN; QUIT;
%_eg_conditional_dropds(WORK.SORTTempTableSorted);
TITLE; FOOTNOTE;
ODS GRAPHICS OFF;


Mitochondrial physiological variables, ANOVA model

/* -------------------------------------------------------------------
   Code generated by SAS Task

   Generated on: Monday, August 13, 2012 at 12:59:20 PM
   By task: Mixed Models ANOVAs

   Input Data: WORK.HEATSTRESSPHYSIOLOGY_DATA_DRYAD
   Server: Local
   ------------------------------------------------------------------- */
ODS GRAPHICS ON;

%_eg_conditional_dropds(WORK.SORTTempTableSorted);
/* -------------------------------------------------------------------
   Sort data set WORK.HEATSTRESSPHYSIOLOGY_DATA_DRYAD
   ------------------------------------------------------------------- */
PROC SQL;
    /* avgMass is added to the SELECT list here because the MODEL
       statement below uses it as a covariate */
    CREATE VIEW WORK.SORTTempTableSorted AS
        SELECT T.mt8H2O2prod_5to35min, T.avgMass, T.expday, T.treatment, T.ecotype
        FROM WORK.HEATSTRESSPHYSIOLOGY_DATA_DRYAD as T
    ;
QUIT;
TITLE;
TITLE1 "Mixed Models Analysis of: mt8H2O2prod_5to35min";
FOOTNOTE;
FOOTNOTE1 "Generated by the SAS System (&_SASSERVERNAME, &SYSSCPL) on %TRIM(%QSYSFUNC(DATE(), NLDATE20.)) at %TRIM(%SYSFUNC(TIME(), TIMEAMPM12.))";
PROC MIXED DATA = WORK.SORTTempTableSorted
    PLOTS(ONLY)=ALL
    METHOD=REML
    ;
    CLASS expday treatment ecotype;
    MODEL mt8H2O2prod_5to35min = avgMass expday treatment ecotype treatment*ecotype
        / HTYPE=3 SOLUTION CL ALPHA=0.05 INTERCEPT E3;
    LSMEANS treatment ecotype treatment*ecotype / PDIFF=ALL;
/* -------------------------------------------------------------------
   End of task code.
   ------------------------------------------------------------------- */
/*PROC MIXED DATA = WORK.SORTTempTableSorted
    PLOTS(ONLY)=ALL
    METHOD=REML
    ;
    CLASS expday treatment ecotype pop Sex Litter;
    MODEL "Cort_log10"n = ln_mass expday treatment Sex ecotype
          treatment*ecotype treatment*Sex ecotype*Sex treatment*ecotype*Sex
        / HTYPE=1 3 SOLUTION CL ALPHA=0.05 INTERCEPT E1 E3;
    RANDOM Litter(pop ecotype) / TYPE=VC;
    LSMEANS treatment Sex ecotype
            treatment*ecotype treatment*Sex ecotype*Sex treatment*ecotype*Sex
        / PDIFF=ALL;
*/
RUN; QUIT;
%_eg_conditional_dropds(WORK.SORTTempTableSorted);
TITLE; FOOTNOTE;
ODS GRAPHICS OFF;

Repeated measures ANOVA on comet data

/* -------------------------------------------------------------------
   Code generated by SAS Task

   Generated on: Tuesday, August 14, 2012 at 1:24:34 PM
   By task: Mixed Models3

   Input Data: WORK.HEATSTRESSPHYSIOLOGY_DATA_DRYAD
   Server: Local
   ------------------------------------------------------------------- */
ODS GRAPHICS ON;

%_eg_conditional_dropds(WORK.SORTTempTableSorted);
/* -------------------------------------------------------------------
   Sort data set WORK.HEATSTRESSPHYSIOLOGY_DATA_DRYAD
   ------------------------------------------------------------------- */
PROC SQL;
    CREATE VIEW WORK.SORTTempTableSorted AS
        SELECT T.Damage, T.ln_mass, T.DNAtrt, T.treatment, T.expday, T.ecotype, T.pop, T.Sex, T.TubeID, T.Litter
        FROM WORK.HEATSTRESSPHYSIOLOGY_DATA_DRYAD as T
    ;
QUIT;
TITLE;
TITLE1 "Mixed Models Analysis of: Damage";
FOOTNOTE;
FOOTNOTE1 "Generated by the SAS System (&_SASSERVERNAME, &SYSSCPL) on %TRIM(%QSYSFUNC(DATE(), NLDATE20.)) at %TRIM(%SYSFUNC(TIME(), TIMEAMPM12.))";
PROC MIXED DATA = WORK.SORTTempTableSorted
    PLOTS(ONLY)=ALL
    METHOD=REML
    ;
    CLASS DNAtrt treatment expday ecotype pop Sex TubeID Litter;
    MODEL Damage = ln_mass expday DNAtrt treatment Sex ecotype
          treatment*ecotype treatment*Sex ecotype*Sex
          DNAtrt*treatment DNAtrt*ecotype DNAtrt*Sex
          treatment*ecotype*Sex DNAtrt*treatment*ecotype DNAtrt*Sex*ecotype DNAtrt*treatment*Sex
          DNAtrt*treatment*ecotype*Sex
        / HTYPE=3 INTERCEPT E3;
    RANDOM Litter(pop ecotype) / TYPE=VC;
    LSMEANS DNAtrt treatment ecotype
            treatment*ecotype treatment*Sex ecotype*Sex
            DNAtrt*treatment DNAtrt*ecotype DNAtrt*Sex
            treatment*ecotype*Sex DNAtrt*treatment*ecotype DNAtrt*Sex*ecotype DNAtrt*treatment*Sex
            DNAtrt*treatment*ecotype*Sex
        / PDIFF=ALL;
/* -------------------------------------------------------------------
   End of task code.
   ------------------------------------------------------------------- */
RUN; QUIT;
%_eg_conditional_dropds(WORK.SORTTempTableSorted);
TITLE; FOOTNOTE;
ODS GRAPHICS OFF;


R-CODE FOR GENE EXPRESSION ANALYSES IN EDGER AND GOSEQ

# open libraries
library(edgeR)
library(limma)

# read the transcript-level count matrix
x <- read.delim("all.isoforms.counts.matrix.txt", row.names="Gene")
# or, for the gene-level analysis, read the gene-level count matrix instead
# x <- read.delim("all.gene.counts.matrix.txt", row.names="Gene")
head(x)
dim(x)

# two groups (classic one-factor design)
group <- factor(c(1,1,2,1,1,2,2,2,2,1,1,2))

# use if doing interaction (four groups)
# group <- factor(c(1,1,2,3,3,2,2,4,4,3,1,4))

# Use this to calculate the library size - total number of reads
#y <- DGEList(counts=x,group=group)
#y

# Use this if want to insert number of mapped reads in to library size
lib.size <- c(23359723, 11187840, 9153074, 11015692, 4051861, 10246588,
              10821495, 11251351, 11373349, 7275241, 10429865, 12960820)
y <- DGEList(counts=x, group=group, lib.size=lib.size)
dim(y)
# [1] 132343 12   Transcripts
# [1] 97619 12    Genes
y

# Use this if want to filter low number of reads.
#keep <- rowSums(cpm(y)>0.1) >= 2
# 0.1 counts per million. If 133.127 million reads then a total of 13.3 across
# the 12 individuals. 13.3/12 = average of 1.1 reads per indiv
#y <- y[keep,]
#dim(y)
#[1] 125865 12

# This keeps transcripts with 50 reads per individual: 600 reads total across all indivs
keep <- rowSums(cpm(y)>7.905) >= 2
y <- y[keep,]
dim(y)
# [1] 14558 12   Transcripts

# Gene-level run: this keeps genes with 50 reads per individual: 600 reads total across all indivs
# keep <- rowSums(cpm(y)>7.21) >= 2
# y <- y[keep,]
# dim(y)
# [1] 10930 12

y # see the data matrix
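The arithmetic behind the counts-per-million comments in the filtering step can be checked with a short sketch. This is an illustration in Python rather than part of the original R analysis; the library sizes are copied from the `lib.size` vector above, while the helper names (`cpm`, `reads_at_cpm`, `passes_filter`) are hypothetical and only mirror the logic of `rowSums(cpm(y) > threshold) >= 2` under the assumption that no normalization factors have been applied yet.

```python
# Library sizes (mapped reads per individual), copied from the lib.size vector above.
LIB_SIZES = [23359723, 11187840, 9153074, 11015692, 4051861, 10246588,
             10821495, 11251351, 11373349, 7275241, 10429865, 12960820]


def cpm(count, lib_size):
    """Counts per million for one transcript in one library."""
    return count / lib_size * 1e6


def reads_at_cpm(threshold, n_reads):
    """Raw read count that corresponds to a CPM threshold for n_reads reads."""
    return threshold * n_reads / 1e6


def passes_filter(counts, lib_sizes, threshold, min_samples=2):
    """Hypothetical mirror of `rowSums(cpm(y) > threshold) >= 2` for one row."""
    n_above = sum(cpm(c, n) > threshold for c, n in zip(counts, lib_sizes))
    return n_above >= min_samples


total_reads = sum(LIB_SIZES)
print(total_reads)                                     # total mapped reads
print(round(reads_at_cpm(0.1, total_reads), 1))        # reads at 0.1 CPM over all libraries
print(round(reads_at_cpm(0.1, total_reads) / len(LIB_SIZES), 1))  # average per individual
```

Because CPM is computed per library, the same threshold corresponds to a different raw count in each individual; the smallest library here (4,051,861 reads) reaches a given CPM with far fewer reads than the largest.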

#Apply TMM normalization:
y <- calcNormFactors(y)
y$samples

# Transcripts - no filter
#                          group lib.size  norm.factors
# RSEM.isoforms.results37  1     23359723  0.9149738
# RSEM.isoforms.results38  1     11187840  1.1122260
# RSEM.isoforms.results46  2     9153074   1.2223299
# RSEM.isoforms.results61  1     11015692  0.9347249
# RSEM.isoforms.results62  1     4051861   1.0029738
# RSEM.isoforms.results64  2     10246588  1.1796874
# RSEM.isoforms.results65  2     10821495  0.9070730
# RSEM.isoforms.results66  2     11251351  0.9540451
# RSEM.isoforms.results67  2     11373349  1.0606905
# RSEM.isoforms.results87  1     7275241   1.0606919
# RSEM.isoforms.results90  1     10429865  0.8823637
# RSEM.isoforms.results93  2     12960820  0.8461216

# MAPPABLE READS LIBRARY SIZES (Transcripts, filter < 50 reads/indiv)
#                          group lib.size  norm.factors
# RSEM.isoforms.results37  1     23359723  1.0949654
# RSEM.isoforms.results38  1     11187840  1.1620644
# RSEM.isoforms.results46  2     9153074   1.1322330
# RSEM.isoforms.results61  1     11015692  0.9400850
# RSEM.isoforms.results62  1     4051861   0.9248618
# RSEM.isoforms.results64  2     10246588  1.1610272
# RSEM.isoforms.results65  2     10821495  0.9811268
# RSEM.isoforms.results66  2     11251351  0.8813819
# RSEM.isoforms.results67  2     11373349  1.0589498
# RSEM.isoforms.results87  1     7275241   0.9514376
# RSEM.isoforms.results90  1     10429865  0.9190393
# RSEM.isoforms.results93  2     12960820  0.8587529

# MAPPABLE READS LIBRARY SIZES (Genes)
#                          group lib.size  norm.factors
# RSEM.genes.results.37    1     23359723  0.9354315
# RSEM.genes.results.38    1     11187840  1.1153366
# RSEM.genes.results.46    2     9153074   1.2459899
# RSEM.genes.results.61    1     11015692  0.9336936
# RSEM.genes.results.62    1     4051861   0.9766462
# RSEM.genes.results.64    2     10246588  1.1938690
# RSEM.genes.results.65    2     10821495  0.8946319
# RSEM.genes.results.66    2     11251351  0.9736755
# RSEM.genes.results.67    2     11373349  1.0530985
# RSEM.genes.results.87    1     7275241   1.0075460
# RSEM.genes.results.90    1     10429865  0.9010573
# RSEM.genes.results.93    2     12960820  0.8484463

# Genes filter >50 reads per indiv
#                          group lib.size  norm.factors
# RSEM.genes.results.37    1     23359723  1.1343143
# RSEM.genes.results.38    1     11187840  1.1452741
# RSEM.genes.results.46    2     9153074   1.1559454
# RSEM.genes.results.61    1     11015692  0.9195944
# RSEM.genes.results.62    1     4051861   0.9064663
# RSEM.genes.results.64    2     10246588  1.1648560
# RSEM.genes.results.65    2     10821495  0.9784048
# RSEM.genes.results.66    2     11251351  0.8896488
# RSEM.genes.results.67    2     11373349  1.0592516
# RSEM.genes.results.87    1     7275241   0.9332157
# RSEM.genes.results.90    1     10429865  0.9105247
# RSEM.genes.results.93    2     12960820  0.8753657

#Data Exploration
plotMDS(y)

############################ Classic one factor design
#For the DGEList object y, we estimate the dispersions using the following commands.
#To estimate common dispersion:
y <- estimateCommonDisp(y)
#To estimate tagwise dispersions:
y <- estimateTagwiseDisp(y)

#The testing can be done by using the function exactTest(), and the
#function allows both common dispersion and tagwise dispersion approaches. For example:
et <- exactTest(y)
topTags(et,n=20)

#total number of differentially expressed genes at FDR ~0.05 is
summary(de <- decideTestsDGE(et, p=0.05, adjust="BH"))

# Transcripts no filter
#   -1    464
#    0 131082
#    1    797

# Transcripts filter >50 reads/indiv
#   -1     94
#    0  14141
#    1    323

# Genes no filter
#   -1    681
#    0  95681
#    1   1257

# Genes filter >50 reads/ind
#   -1    146
#    0  10244
#    1    540

# write the full results table (transcript-level run)
write.table(topTags(et, n=500000), "Tscp_50Filter.csv", col.names=TRUE)
# write the full results table (gene-level run)
# write.table(topTags(et, n=500000), "Gene_50Filter.csv", col.names=TRUE)

#shows the counts per million for the tags that edgeR has identified as the most differentially expressed.
detags <- rownames(topTags(et, n=20))
cpm(y)[detags,]

# plotSmear generates a plot of the tagwise log-fold-changes against log-counts per million
detags <- rownames(y)[as.logical(de)]
plotSmear(et, de.tags=detags)
abline(h = c(-2, 2), col = "blue")
#The horizontal blue lines show 4-fold changes.
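The `decideTestsDGE(..., adjust="BH")` call above relies on the Benjamini-Hochberg adjustment to turn raw p-values into false discovery rates, and its summary codes each tag as -1, 0, or 1. The following is a minimal Python sketch of that adjustment and coding, not the edgeR implementation; the function names `bh_adjust` and `classify` and the four example p-values are hypothetical and chosen only to make the steps easy to follow by hand.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Visit p-values from largest to smallest so a running minimum
    # enforces monotonicity of the adjusted values.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, i in enumerate(order):
        rank = m - position  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify(log_fc, fdr, alpha=0.05):
    """Code a tag as -1 (down), 0 (not significant), or 1 (up)."""
    if fdr >= alpha:
        return 0
    return 1 if log_fc > 0 else -1


# Hypothetical p-values and log2 fold changes for four tags.
example_p = [0.01, 0.04, 0.03, 0.005]
example_logfc = [1.5, -0.8, 2.1, -1.2]

example_fdr = bh_adjust(example_p)
print([round(v, 4) for v in example_fdr])
print([classify(lfc, fdr) for lfc, fdr in zip(example_logfc, example_fdr)])

# logFC is on a log2 scale, so the lines drawn at -2 and 2 mark 4-fold changes.
print(2 ** 2)
```

The FDR column in the results tables of this dissertation is read the same way: a tag is called differentially expressed when its adjusted value falls below 0.05.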

################################### GLM method for multifactorial designs

Ecotype <- factor(c("L","L","L","M","M","L","L","M","M","M","L","M"))
Treatment <- factor(c("C","C","S","C","C","S","S","S","S","C","C","S"))
data.frame(Sample=colnames(y),Ecotype,Treatment)
#    Sample                   Ecotype Treatment
# 1  RSEM.isoforms.results37  L       C
# 2  RSEM.isoforms.results38  L       C
# 3  RSEM.isoforms.results46  L       S
# 4  RSEM.isoforms.results61  M       C
# 5  RSEM.isoforms.results62  M       C
# 6  RSEM.isoforms.results64  L       S
# 7  RSEM.isoforms.results65  L       S
# 8  RSEM.isoforms.results66  M       S
# 9  RSEM.isoforms.results67  M       S
# 10 RSEM.isoforms.results87  M       C
# 11 RSEM.isoforms.results90  L       C
# 12 RSEM.isoforms.results93  M       S

# Treatment and Ecotype are taken from the workspace (one value per sample)
design <- model.matrix(~Treatment*Ecotype)
design
y <- estimateGLMTrendedDisp(y,design)
y <- estimateGLMTagwiseDisp(y,design)
plotBCV(y)
fit <- glmFit(y,design)
colnames(design)
# [1] "(Intercept)" "TreatmentS" "EcotypeM" "TreatmentS:EcotypeM"

lrt_Treatment <- glmLRT(y,fit,coef=2)
topTags(lrt_Treatment, n=20)
write.table(topTags(lrt_Treatment, n=50000), "Treatment_LibCal.csv", col.names=TRUE)

lrt_Ecotype <- glmLRT(y,fit,coef=3)
topTags(lrt_Ecotype)
write.table(topTags(lrt_Ecotype, n=50000), "Ecotype_LibCal.csv", col.names=TRUE)

lrt_Interaction <- glmLRT(y,fit,coef=4)
topTags(lrt_Interaction, n=10)
write.table(topTags(lrt_Interaction, n=50000), "Interaction_LibCal.csv", col.names=TRUE)

#total number of differentially expressed genes at FDR ~0.05 is
summary(de <- decideTestsDGE(lrt_Treatment, p=0.05, adjust="BH"))
#   -1    53
#    0 43654
#    1   451
summary(de <- decideTestsDGE(lrt_Ecotype, p=0.05, adjust="BH"))
#   -1    24
#    0 44080
#    1    54
summary(de <- decideTestsDGE(lrt_Interaction, p=0.05, adjust="BH"))
#   -1     5
#    0 44153
#    1     0

#shows the counts per million for the tags that edgeR has identified as the most differentially expressed.
detags <- rownames(topTags(lrt_Treatment, n=20))
cpm(y)[detags,]
# plotSmear generates a plot of the tagwise log-fold-changes against log-counts per million
# (de is recomputed for each test so the highlighted tags match the plotted contrast)
de <- decideTestsDGE(lrt_Treatment, p=0.05, adjust="BH")
detags <- rownames(y)[as.logical(de)]
plotSmear(lrt_Treatment, de.tags=detags)
abline(h = c(-2, 2), col = "blue")
#The horizontal blue lines show 4-fold changes.

detags <- rownames(topTags(lrt_Ecotype, n=20))
cpm(y)[detags,]
# plotSmear generates a plot of the tagwise log-fold-changes against log-counts per million
de <- decideTestsDGE(lrt_Ecotype, p=0.05, adjust="BH")
detags <- rownames(y)[as.logical(de)]
plotSmear(lrt_Ecotype, de.tags=detags)
abline(h = c(-2, 2), col = "blue")
#The horizontal blue lines show 4-fold changes.

detags <- rownames(topTags(lrt_Interaction, n=20))
cpm(y)[detags,]
# plotSmear generates a plot of the tagwise log-fold-changes against log-counts per million
de <- decideTestsDGE(lrt_Interaction, p=0.05, adjust="BH")
detags <- rownames(y)[as.logical(de)]
plotSmear(lrt_Interaction, de.tags=detags)
abline(h = c(-2, 2), col = "blue")
#The horizontal blue lines show 4-fold changes.

############### End GLM methods #############
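To make the meaning of `coef=2`, `coef=3`, and `coef=4` in the GLM section more concrete, the sketch below rebuilds the four design-matrix columns by hand. It is written in Python for illustration and assumes R's default treatment contrasts with reference levels `C` and `L`, which is consistent with the column names reported by `colnames(design)`. The sample annotations are copied from the listing; the helper names `design_row` and `design_matrix` are hypothetical.

```python
# Sample annotations copied from the Ecotype and Treatment factors in the listing.
ECOTYPE = ["L", "L", "L", "M", "M", "L", "L", "M", "M", "M", "L", "M"]
TREATMENT = ["C", "C", "S", "C", "C", "S", "S", "S", "S", "C", "C", "S"]

COLUMNS = ["(Intercept)", "TreatmentS", "EcotypeM", "TreatmentS:EcotypeM"]


def design_row(treatment, ecotype):
    """One row of a treatment-contrast design with reference levels C and L."""
    t = 1 if treatment == "S" else 0
    e = 1 if ecotype == "M" else 0
    return [1, t, e, t * e]  # the interaction column is the product of the two indicators


def design_matrix(treatments, ecotypes):
    """Stack one design row per sample."""
    return [design_row(t, e) for t, e in zip(treatments, ecotypes)]


design = design_matrix(TREATMENT, ECOTYPE)
print(COLUMNS)
for row in design:
    print(row)

# Column sums give the number of samples contributing to each coefficient.
column_sums = [sum(row[j] for row in design) for j in range(len(COLUMNS))]
print(dict(zip(COLUMNS, column_sums)))
```

Under this parameterization, and with the interaction term in the model, the second coefficient describes the stress effect in the reference ecotype (L), the third describes the ecotype difference under control conditions, and the fourth describes how the stress response differs between ecotypes.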

################################### starting GoSeq
#function allows both common dispersion and tagwise dispersion approaches. For example:
et <- exactTest(y)
topTags(et,n=20)
#total number of differentially expressed genes at FDR ~0.05 is
summary(de <- decideTestsDGE(et, p=0.05, adjust="BH"))
topTags(et)

# read in annotation file
gene.anno.single <- read.table("Gene.anno.single.csv", header = TRUE, sep = ",")
head(gene.anno.single)
dim(gene.anno.single)

# NOTE: geneNOfilter, blast.IDs and gene50IDs are assumed to have been
# created earlier in the session; their construction is not shown here.

# merge again with gene results
Gene.anno.geneNOfilter <- merge(gene.anno.single, geneNOfilter, all=FALSE)
head(Gene.anno.geneNOfilter)
dim(Gene.anno.geneNOfilter)

# make file with only first two columns
blast.IDs <- blast.IDs[c("Gene", "ENSEMBLchick")]
dim(blast.IDs)

# Original data with repeats removed. These do the same:
Unique_gene50IDs <- unique(gene50IDs)
dim(Unique_gene50IDs)
# give a table of how many annos per each gene
Unique.Gene50filter_table <- table(Unique_gene50IDs$Gene)

# 0/1 vector of differentially expressed genes (BH-adjusted p < 0.05)
genes=as.integer(p.adjust(et$table$PValue[et$table$logFC!=0],method="BH")<.05)
names(genes)=row.names(et$table[et$table$logFC!=0,])
table(genes)

source("http://bioconductor.org/biocLite.R")
biocLite("geneLenDataBase")
biocLite("GenomicFeatures")
biocLite("KEGG.db")
biocLite("chicken.db0")
biocLite("GO.db")

library(goseq)
library(geneLenDataBase)
library(GenomicFeatures)
library(GO.db)
head(supportedGenomes())
head(supportedGeneIDs(),n=12)
pwf=nullp(genes,"galGal3","ensGene")
head(pwf)

### 6.5.2 Using the Wallenius approximation
GO.wall=goseq(pwf,"galGal3","ensGene")
head(GO.wall)

### 6.5.3 Using random sampling
GO.samp=goseq(pwf,"galGal3","ensGene",method="Sampling",repcnt=1000)
head(GO.samp)

## To try with KEGG see 6.5.5 Limiting
GO.MF=goseq(pwf,"galGal3","ensGene",test.cats=c("GO:MF"))
head(GO.MF)

### 6.6 KEGG pathway analysis
# org.Gg.eg.db provides the org.Gg.egENSEMBL2EG and org.Gg.egPATH mappings used below
library(org.Gg.eg.db)
#Get the mapping from ENSEMBL 2 Entrez
en2eg=as.list(org.Gg.egENSEMBL2EG)
#Get the mapping from Entrez 2 KEGG
eg2kegg=as.list(org.Gg.egPATH)
#Define a function which gets all unique KEGG IDs associated with a set of Entrez IDs
grepKEGG=function(id,mapkeys){unique(unlist(mapkeys[id],use.names=FALSE))}
#Apply this function to every entry in the mapping from ENSEMBL 2 Entrez to combine t
kegg=lapply(en2eg,grepKEGG,eg2kegg)
head(kegg)
pwf=nullp(genes,"galGal3","ensGene")
KEGG=goseq(pwf,gene2cat=kegg)
head(KEGG)
pwf=nullp(genes,'galGal3','ensGene')
KEGG=goseq(pwf,'galGal3','ensGene',test.cats="KEGG")
head(KEGG)

### 6.5.6 Making sense of the results
enriched.GO=GO.wall$category[p.adjust(GO.wall$over_represented_pvalue,method="BH")<.05]
head(enriched.GO)
library(GO.db)
for(go in enriched.GO[1:10]){
  print(GOTERM[[go]])
  cat("------\n")
}
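The category tests above ask whether differentially expressed genes are over-represented in a GO or KEGG category. The goseq default (the Wallenius approximation) additionally corrects for selection bias such as transcript length; the Python sketch below deliberately swaps in the simpler, uncorrected hypergeometric upper-tail test to show the underlying idea. It is not the goseq implementation, the function name `hypergeom_upper_tail` is hypothetical, and the numbers in the usage example are invented for illustration.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) for X ~ Hypergeometric.

    N: genes tested in total
    K: genes in the category
    n: differentially expressed genes
    k: differentially expressed genes that fall in the category
    """
    upper = min(K, n)
    # comb() returns 0 when more genes are requested than are available,
    # so impossible terms drop out of the sum automatically.
    numerator = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return numerator / comb(N, n)


# Hypothetical example: 20 genes tested, 5 in the category,
# 4 differentially expressed, 2 of those in the category.
p_value = hypergeom_upper_tail(k=2, K=5, n=4, N=20)
print(round(p_value, 4))
```

A small p-value indicates more category members among the differentially expressed genes than expected by chance; category-level p-values would then be adjusted for multiple testing, as done with `p.adjust(..., method="BH")` in the listing.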


APPENDIX C. DATA ACCESSIBILITY

GENBANK ACCESSIONS FOR DNA SEQUENCES

Next-generation sequencing data 1) 454 pyrosequencing data: GenBank Sequence Read Archive: SRS007367, SRS007366 2) Illumina RNA-seq data in Sequence Read Archive, SRA052923

Individual gene sequences Catalase Isoform 1, JX291960; Catalase Isoform 2, JX291961; Glutathione Peroxidase 1, JX291962; Glutathione Peroxidase 3, JX291963; Glutathione Peroxidase 4, JX291964; HSP40A1, JX291965; HSP40A4, JX291966; HSP70A1, JX291967; Superoxide dismutase 1, JX291968; Superoxide dismutase 2 Isoform 1, JX291969; Superoxide dismutase 2 Isoform 2, JX291970; Superoxide dismutase 3, JX291971

DRYAD

Physiological data: DRYAD provisional DOI doi:10.5061/dryad.sb30r


ACKNOWLEDGEMENTS

FAMILY

I am forever grateful for the love and support of my life-partner and best friend Daniel Warner. Our personal and professional lives have been entwined for the past 14 years; then and now our research adventures and discussions enrich my life, and my research about life. Thank you for sharing in our best adventure yet, our son Ethan. It is such a thrill to see the gene x environment interactions come to life. Thank you, my wonderful little Ethan, for putting my life and priorities into perspective.

I want to express my deep gratitude for my family. I am extraordinarily lucky to have been born into such a loving, generous, and supportive family. My parents, Wayne and Cathy Schwartz, thank you for providing a solid foundation of unwavering support from which I have been able to pursue my dreams and goals. Your love and support mean as much to me now as they did when I was five! I want to express my appreciation and respect for both of my brothers, Travis and Josh, who provide me with new perspectives and reality checks as to what is really important in life. Thank you, also, to my in-laws for their unconditional support and generosity: Ann Schwartz and Melanie Schwartz, Chuck and Mary Warner, Vicky and Greg Hayes, Vince Warner and Kelly Ferris, Dave and Sarah Warner, and of course all the wonderful nieces and nephews!

DISSERTATION COMMITTEE

Throughout this five-year journey I have many times been humbled by the support from many of the faculty at Iowa State University. I especially want to express sincere appreciation to my major advisors and dissertation committee members. I specifically want to say thank you to those who were especially supportive through my reproductive phase.

Anne – I appreciate you, your advising style, your mentorship, and the confidence that you had in my ideas and me, even when I did not always have that confidence myself. From the lab, to the field, to the desk you have provided guidance, knowledge, and support for this research. I appreciate the extensive amount of time and thought you put into discussing my ideas and research direction, and editing my manuscripts. I could not have asked for a better advisor and mentor over these past five years. Thank you.

Jo Anne – My association with your lab has provided me with a broader perspective of my field, and a deeper understanding of my research. Our weekly, bi-weekly, and monthly discussions were exceptional in providing insight, not only on my research projects, but also on the field of biology as a whole, academic careers, and life in general. Thank you.

Drena – I truly appreciate your perspective on my research direction and the career advice you have provided over the past few years. I am amazed, humbled, and deeply grateful for the unwavering support you have given me on many different levels. Thank you.

Chris – I appreciate the “genomic” knowledge you have provided me through the Genomics class, my committee meetings, and the occasional chat. My research is stronger because of your insight. Thank you.

Jonathan – Nearly every conversation we have had has caused me to pause and think (or rethink) about what I am doing and why, ultimately improving my understanding of my own research and career direction. You are a role model for every young scientist. Thank you.

Thank you Fred Janzen, Bonnie Bowen, and Carol Vleck for being instrumental in defining my career direction during my first forays into research as an undergraduate. My master’s advisor, Stephen Karl, and subsequent research mentors Luciano Beheregaray, Frank Seebacher, and Mats Olsson also provided foundations in scientific research and allowed me to explore and expand my research interests, for which I am grateful.

ADDITIONAL ACADEMIC SUPPORTERS

A deep thanks to the past and present members of the Bronikowski Laboratory and Powell-Coffman Laboratory who have provided a supportive and fun work environment, and many great discussions: Dawn Reding, Liz Addis, Eric Gangloff, Megan Manes, Shikha Parsai, Amanda Sparkman, Gaby Palacios, Jill Madden, HaLea Messersmith, Dingxia Feng, Jenifer Saldanha, Elizabeth Asque, Zhiyong Shao, Zhiwei Zhai, and Yi Zhang.

I would like to acknowledge my current and past collaborators for all their assistance, guidance, and insight: Suzanne McGaugh, Justin Choi, Hongseok Tae, Young-IK Yang, Keithanne Mockaitis, Stephen Proulx, John Van Hemert, Zebulun Arendsee, and Gary Morris.

Thank you to the “2nd floor corridor neighbors”, the Hofmockel, Serb, Valenzuela, and Vleck Labs, and to the Wendel Lab, for the use of equipment and for personal support. I would also like to extend my thanks to the ISU DNA/Genomics Facility (Mike Baker, Gary Polking, and Andrew Severin) and the Flow Cytometry Facility (Shawn Rigsby) for technical assistance. Thank you to the secretaries Linda Wilde, Trish Stauble, Janet McMahone, Debbie Hurley, and Jackie Hayes for their logistical assistance over the past five years. Thanks to all the EEOB, IG, and BCB graduate students and post-docs for providing a fun and supportive work environment.


FUNDING

I would like to acknowledge the following organizations for financial support towards salary and tuition: National Science Foundation (NSF) IGERT Fellowship in Computational Molecular Biology [0504304]; NSF GK-12 Promoting a Green Workforce [DGE-0947929]; Research Assistantships through NSF Grant [IOS 0922528] to Anne Bronikowski; and the Department for Ecology, Evolution and Organismal Biology.

I would like to acknowledge the following organizations for financial support of my research: NSF Grant [IOS 0922528] to Anne Bronikowski; NSF Doctoral Dissertation Improvement Grant [1011350] to Tonia Schwartz; Iowa Academy of Science to Anne Bronikowski and Tonia Schwartz; Sigma-Xi Grants in Aid of Research to Tonia Schwartz; the ISU Ecology, Evolution and Organismal Biology Department; the ISU Laurence H. Baker Center Grant to Stephen Proulx; and the ISU Center for Integrative Animal Genomics to Anne Bronikowski and Hui-Hsien Chou.


BIOGRAPHICAL SKETCH

Tonia Sue Schwartz was born April 13, 1976. She grew up in Houghton, Iowa and attended Marquette Elementary and Middle School, and Fort Madison Public Jr. and High School. She received a Bachelor of Science in Zoology with a minor in Genetics from Iowa State University in 1998. In 2003 she received a Master of Science in Zoology with an emphasis in conservation genetics and molecular ecology from the University of South Florida, advised by Dr. Stephen A. Karl. Her research was on the population structure of the gopher tortoise (Gopherus polyphemus) in Florida. During her Master’s she was a teaching assistant for Cell Biology, Microbiology, and Biology II. She also conducted genetic analyses for fisheries management at the Florida Marine Research Institute. In 2003 she married Daniel A. Warner and moved to Sydney, Australia, where Daniel began a PhD program. From 2003 to 2007 she worked as a research assistant and lab manager in three labs in the Sydney area. During this time she conducted research and published papers on multiple topics: hybridization and speciation in fishes and sharks in Dr. Luciano Beheregaray’s Molecular Ecology Laboratory (then at Macquarie University); the evolution of metabolic proteins in Dr. Frank Seebacher’s Evolutionary and Ecological Physiology Laboratory at University of Sydney; and the evolution of sexual selection and reproductive and aging traits in Dr. Mats Olsson’s Reptile Evolutionary and Behavioral Ecology Laboratory (then at Wollongong University). In 2007 she began her doctoral degree at Iowa State University, co-advised by Drs. Anne Bronikowski and Jo Anne Powell-Coffman. Her research focused on the molecular evolution of local adaptations, stress response, and life history trade-offs. While at Iowa State she participated in the National Science Foundation GK-12 program, working with a Des Moines middle school teacher to bring real science into the classroom.
During her graduate training she received multiple fellowships, grants, and awards: NSF-IGERT Fellowship in Computational Molecular Biology; Iowa Academy of Sciences award to study sex-specific gene expression involved in stress response; NSF Doctoral Dissertation Improvement Grant to study the genetic basis for complex life-history traits; and Best Young Investigator presentation in the Evolutionary Ecological Genomics Symposium at the 13th Congress for the European Society for Evolutionary Biology. In 2012 she gave birth to her son, Ethan Daniel Warner, and received her doctoral degree in Interdepartmental Genetics. She has been awarded a 2-year James S. McDonnell Foundation Post-doctoral Fellowship to study Complex Systems. To date, she has published 28 peer-reviewed manuscripts and one book chapter.