UNIVERSITY OF CALIFORNIA

Los Angeles

Novel Gene Expression Shifts in Response to

Complex Environmental Stimuli in a Wild Population

of Yellow-Bellied Marmots (Marmota flaviventer)

A dissertation submitted in partial satisfaction of the

requirements for the degree Doctor of Philosophy

in Biology

by

Tiffany Christine Armenta

2017

© Copyright by

Tiffany Christine Armenta

2017

ABSTRACT OF THE DISSERTATION

Novel Gene Expression Shifts in Response to

Complex Environmental Stimuli in a Wild Population

of Yellow-Bellied Marmots (Marmota flaviventer)

by

Tiffany Christine Armenta

Doctor of Philosophy in Biology

University of California, Los Angeles, 2017

Professor Daniel T. Blumstein, Co-Chair

Professor Robert Wayne, Co-Chair

Gene expression is an important mechanism that allows organisms to adapt to environmental stimuli. Recent advances in molecular techniques have enabled extensive research evaluating environmentally induced transcription rates across the genome in controlled laboratory conditions. However, relatively few studies have examined transcriptional responses to external changes in wild populations, where natural selection operates. For my dissertation, I aimed to evaluate how wild animals physiologically respond to various environmental stressors on the molecular level. In my first chapter, I provide an overview of my three empirical research chapters. In my second chapter, I evaluated gene expression changes in female yearling yellow- bellied marmots (Marmota flaviventer) as they prepared to disperse from the natal colony.

ii Dispersers exhibited significant increases in the expression of genes involved in metabolism, muscle function, and pathogen defense, providing support for somatic preparation for the upcoming risks involved with dispersal. In my third chapter, I focused on both male and female marmots to evaluate the impact of social status on gene expression variation in blood. Previous studies in controlled systems have identified a conserved transcriptional response to adversity

(CTRA) where socially stressed individuals often exhibit chronic inflammation and reduced antiviral responses. I found that affiliative and agonistic social interactions influenced the inflammatory transcriptional response, but not the antiviral response. This suggests that inflammation is an evolutionarily conserved trait when dealing with social stress, but that the viral response may depend on the social structure of the species. Finally, in my fourth chapter, I examined the transcriptional response to predator pressure in this species. I identified predator- induced differential expression in several genetic pathways including heat shock , metabolism, brain function, and glucocorticoid signaling. My dissertation demonstrated the ability to observe the molecular response to external stimuli in a wild mammal, which could prove to be a powerful method for studying the cellular stress response in natural populations.

iii The dissertation of Tiffany Christine Armenta is approved.

Daniel H. Geschwind

Pamela J. Yeh

Daniel T. Blumstein, Committee Co-Chair

Robert Wayne, Committee Co-Chair

University of California, Los Angeles

2017

iv

I dedicate this to my mother for showing me that I can do anything I set my mind to

and to my father for inspiring my love for all things wild.

v TABLE OF CONTENTS

LIST OF FIGURES ...... viii

LIST OF TABLES ...... ix

ACKNOWLEDGEMENTS ...... x

VITA ...... xiii

CHAPTER ONE: General introduction ...... 1

Bibliography ...... 11

CHAPTER TWO: Genes involved in musculature, metabolism, and antigen defense are up- regulated in yellow-bellied marmots prior to dispersal

Abstract ...... 17

Introduction ...... 18

Materials and methods ...... 24

Results ...... 30

Discussion ...... 34

Supplementary results ...... 48

Bibliography ...... 63

vi CHAPTER THREE: Wild rodents exhibit an inflammatory transcriptional response to adversity

Abstract ...... 78

Introduction ...... 79

Materials and methods ...... 83

Results ...... 89

Discussion ...... 92

Supplemental results ...... …106

Bibliography ...... 109

CHAPTER FOUR: Glucocorticoid signaling and genes involved in cellular repair and homeostasis are up-regulated in leukocytes of wild rodents exposed to chronic predation pressure

Abstract ...... 122

Introduction ...... 123

Materials and methods ...... 127

Results ...... 132

Discussion ...... 134

Supplementary results ...... 146

Bibliography ...... 155

vii LIST OF FIGURES

Figure 2-1 ...... 43

Figure 2-2 ...... 44

Figure 2-3 ...... 45

Figure 2-S1...... 48

Figure 3-1 ...... 99

Figure 3-2 ...... 100

Figure 3-3 ...... 101

Figure 3-S1...... 106

Figure 4-1 ...... 142

Figure 4-2 ...... 143

Figure 4-3 ...... 144

viii LIST OF TABLES

Table 2-1 ...... 46

Table 2-2 ...... 47

Table 2-S1 ...... 49

Table 2-S2 ...... 50

Table 2-S3 ...... 54

Table 2-S4 ...... 57

Table 2-S5 ...... 58

Table 2-S6 ...... 62

Table 3-1 ...... 102

Table 3-2 ...... 103

Table 3-3 ...... 104

Table 3-S1 ...... 107

Table 3-S2 ...... 108

Table 4-1 ...... 145

Table 4-S1 ...... 146

Table 4-S2 ...... 154

ix ACKNOWLEDGEMENTS

First and foremost, I’d like to thank my graduate advisors, Dr. Daniel Blumstein and Dr.

Robert Wayne, for their support and mentorship during this work. Dan, thank you for your endless encouragement, challenges and forceful pushes. I cannot express how grateful I am for your confidence in my abilities, especially during times when I didn’t have much. I feel extremely fortunate to have been your student and collaborator. As a teacher, mentor and all around philosopher, you inspire me. I would have never gotten this far without you. Bob, thank you for seeing my potential and welcoming me into this community back when I didn’t know a thing about science. I attribute so much of my academic success and critical thinking to your world class training. I would also like to thank my dissertation committee members, Dr. Dan

Geschwind and Dr. Pamela Yeh for their helpful guidance over the past six years.

Dr. Steve Cole, although you were not an official advisor, your knowledge and mentoring were essential to this dissertation. Thank you for all the assistance over the years and for teaching me that communicating difficult concepts doesn’t have to be complicated. To Tom Gillespie, Seth

Riley, and Erin Boydston: thank you so much for the early mentoring. You all inspired me to pursue and continue down this path and I whole heartedly appreciate all the effort you’ve put into my career. And of course, I have to give a big thank you to Jocelyn Yamadera and Tessa

Villaseñor. I relied on you two for 90% of my questions during the last six years. Thank you for always being so helpful, prompt, and positive.

The following chapters resulted from the efforts of so many and I can’t possibly thank everyone who contributed here, but I will try. The first chapter is the work of Tiffany Armenta, with helpful input from Daniel Blumstein. The second chapter is the work of Tiffany Armenta,

Steve Cole, Dan Geschwind, Robert Wayne, and Daniel Blumstein. This chapter was inspired and

x largely informed by SC’s work. TCA, DTB, and the marmot crew at the Rocky Mountain

Biological Laboratory collected data in the field. TCA and Jesusa Arevalo carried out genetic lab work. TCA performed statistical analyses and created figures. The manuscript was primarily written by TCA, with input from all authors. The third chapter is the work of Tiffany Armenta,

Steve Cole, Robert Wayne, and Daniel Blumstein. TCA, DTB, and marmoteers collected data in the field. TCA and Jesusa Arevalo carried out genetic lab work. TCA performed statistical analyses and created figures. The manuscript was written by TCA, with input from all authors.

The fourth chapter is the work of Tiffany Armenta, Steve Cole, Robert Wayne, and Daniel

Blumstein. TCA, DTB, and marmoteers collected data in the field. TCA and JM carried out genetic lab work. TCA and SC conducted analyses. The original manuscript was written by TCA, with input from all authors. Although I couldn’t include you all as authors, I want to give mad props to all the marmoteers who worked their butts off to help me collect these data. I know it was hard work (I’m still tired from those field seasons), but I sincerely appreciate all your efforts.

Thanks for being a great squad. I’d also like to thank the UCLA Statistical Consulting Group for answering all my questions and guiding me through tons of analyses, especially Andy Lin.

I appreciate all the sources of funding for this dissertation, particularly the National

Science Foundation, which provided me the Graduate Research Fellowship. This fellowship gave me the opportunity to spend half the year conducting fieldwork at RMBL, where I learned immensely about science (and myself). DTB also received funding from NSF (DEB-1119660,

DEB-1557130). Other funding sources are UCLA (Graduate Division and the Division of Life

Sciences to TCA) and the National Institutes of Health (P30 AG017265 to SC, S10 RR029668 and S10 ODO18174 to the Vincent J. Coates GSL at UC Berkeley). I thank the UCLA Animal

xi Care Committee and the Colorado Division of Wildlife for research approval and collection permits.

To the members of the Blumstein and Wayne labs, past and present: I could not have survived this without you. Your assistance and support in the lab were instrumental in acquiring the skills needed for this dissertation and our adventures outside of the lab forged some life-long friendships. Laurel Serieys, thank you so much for taking me under your wing. You are the reason I can call myself a wildlife biologist. FTM. FTB. FTP. FTW! Thank you Rachel Johnston,

Alice Mouton, Amanda Lea, Adriana Maldonado-Chaparro and Matt Petelle for sharing valuable coding and statistical knowledge with me. Gabriela M. Pinho, thanks for always coming to lab with that contagious smile on your face.

Finally, to my friends both in and out of graduate school, you made this insanely challenging journey bearable! Thanks to my graduate cohort especially Jacqueline Robinson,

Rachel Chock, Mairin Balisi, Sarah Ratay and Zac Schakner for being part of my development as a scientist, and at times, for providing shoulders to cry on. Devaughn Fraser, our countless scientific and non-scientific conversations under that tree in your backyard were the only things that kept me sane sometimes. I thank Lilah Hubbard for always being there for me day or night, field or lab, mountains or ocean. Thank you Ana Solis for your amazing hugs and encouraging notes. You always delivered them right when I needed them. I thank Lexi for forcing me to step outside and take a stroll even when I felt too stressed to do so. To Adam: I always thought this last year of graduate school would be the most stressful year of my life, but your endless support and positivity turned it into the best year.

xii VITA

TIFFANY CHRISTINE ARMENTA

EDUCATION

2008 Bachelor of Arts in Geography, University of California Los Angeles

FELLOWSHIPS

2017 UCLA Department of Ecology and Evolutionary Biology Graduate Research

Fellowship: $6,500

2014 Graduate Assistance in Areas of National Need Fellowship: $26,000 stipend +

$13,500 institutional allowance

2012 National Science Foundation Graduate Research Fellowship: three year award of

annual $30,000 stipend + $13,500 institutional allowance

2012 Gordon Research Conference Underrepresented Minority Fellowship: $1,500

2011 Eugene V. Cota-Robles Fellowship: two year award of annual $25,000 stipend +

$13,500 institutional allowance

GRANTS

2017 UCLA Department of Ecology and Evolutionary Biology Travel Grant, $500

2016 UCLA Graduate Division Conference Travel Grant, $1000

2014 UCLA Department of Ecology and Evolutionary Biology Research Grant, $1000

AWARDS

2017 Scherbaum Award for Distinguished Research by a Graduate Student, UCLA

Department of Ecology and Evolutionary: $200

2008 Carolyn Blackman Award for Outstanding GPA in UCLA Geography Department

xiii PUBLICATIONS

2017 Serieys LEK, Lea AJ, Epeldegui M, Armenta TC, Moriarty JG, VandeWoude S, Carver

S, Foley J, Wayne RK, Riley SPD, Uittenbogaart CH. 2017. Urbanization and

anticoagulant rat poisons promote immune dysfunction in bobcats. Proceedings of the

Royal Society B. In press.

2017 Johnson GC, Karajah MT, Mayo K, Armenta TC, Blumstein DT. 2017. The bigger they

are the better they taste: size predicts predation risk and anti-predator behavior in giant

clams. Journal of Zoology. 301(2017):102-107. doi:10.1111/jzo.12401

2017 Pezner AK, Lim AR, Kang JJ, Armenta TC, Blumstein DT. 2017. Hiding behavior in

Christmas tree worms on different time scales. Behavioral Ecology. 28(1):154-163.

2016 Sastre N, Francino O, Curti JN, Armenta TC, Fraser D, Kelly RM, Hunt E, Silbermayr

K, Zewe C, Sanchez A, Ferrer L. 2016. Detection, prevalence, and phylogenetic

relationships of Demodex spp. and further skin Prostigmata mites (Acari, Arachnida) in

wild and domestic mammals. PLoS ONE. 11(10):1-20.

2015 Serieys LEK, Armenta TC, Moriarty JG, Boydston EE, Lyren LM, Poppenga RH. 2017.

Anticoagulant rodenticides in urban bobcats: exposure, risk factors and potential effects

based on a 16-year study. Ecotoxicology. 24(4):844-862.

SELECT PRESENTATIONS

2017 Genes involved in musculature, metabolism and antigen defense are up-regulated in

yellow-bellied marmots prior to dispersal. Evolution Conference, Portland, Oregon.

2015 Do human demographics contribute to toxicant exposure in urban bobcats? Wildlife

Society Conference (poster), Winnipeg, Manitoba, Canada.

xiv

Chapter 1:

General introduction

1 Understanding how organisms respond to environmental stimuli is a central issue in biology and a focus of evolutionary studies. Individuals face a variety of external challenges and this drives a variety of adaptations on multiple levels, from population-wide responses down to the molecular level. A primary physiological mechanism that leads to functional change is the transcription of genes into necessary proteins. RNA transcription fluctuates widely and rapidly depending on the requirements of an organism at a precise moment in time. Over the past few decades, the increasing availability of molecular tools has enabled researchers to dive deep into this functional genetic mechanism. For example, earlier studies could typically only focus on a handful of suspected genes, but we can now sequence the entire transcriptome of an organism

(RNA-seq) relatively easily. This global approach allows us to identify previously unsuspected genes and proteins, understand how transcripts function as an integrated system, and clarify their specific involvement in the adaptive response to complex challenges. Thus, RNA-seq enables researchers to take a holistic view of how an organism’s internal machinery responds to stimuli.

Gene expression is time and tissue specific and is extremely sensitive to both internal and external biological cues. Internal cues include developmental stage (Arbeitman et al., 2002;

Graveley et al., 2011; Xue et al., 2013), reproductive status (Guo et al., 2009; Lemay, Neville,

Rudolph, Pollard, & German, 2007), and circadian or circannual rhythms (Boivin et al., 2003;

Johnston, Paxton, Moore, Wayne, & Smith, 2016; R. Zhang, Lahens, Ballance, Hughes, &

Hogenesch, 2014), whereas external variables include temperature (Chu, Miller, Kaluziak,

Trussell, & Vollmer, 2014; Colinet, Lee, & Hoffmann, 2010; Sørensen, Nielsen, Kruhøffer,

Justesen, & Loeschcke, 2005), confinement (Hazard et al., 2011; Kennerly et al., 2008), and toxicant exposure (Bowen et al., 2007; Hook, Skillman, Small, & Schultz, 2006). Even seemingly subtle differences such as a person’s lifestyle (Idaghdour et al., 2010) or social relationships

2 (Steven W. Cole, 2013, 2014) induce differential gene expression. Thus, organisms clearly alter their physiology to meet the demands of a wide variety of environmental challenges.

Although these sophisticated studies have revealed incredible insights into how organisms respond to environmental challenges on the molecular level, they have primarily been conducted in controlled laboratory settings, where the risk of natural selection is removed. The next big step in RNA-seq is to measure the transcriptional response to challenges in natural populations. Such an approach has the potential to provide insight into adaptation and evolutionary consequences.

In this dissertation, I focus on a wild system that is well-suited for transcriptional studies- the yellow-bellied marmot (Marmota flaviventer). In Chapter 1, I examine the transcriptional physiology underlying the preparation to disperse from the natal colony. Chapter 2 uses social network parameters to investigate whether an individual marmot’s social status affects gene expression, with a specific focus on immune defense genes. Finally, Chapter 3 analyzes the transcriptional response to the psychological stress associated with predator pressure. All work was done with the help of others and each chapter was written with coauthors.

Study system and general methods

Yellow-bellied marmots are semi-fossorial ground squirrels weighing 2-6 kg that are found throughout the sub-alpine regions of western North America (Frase & Hoffmann, 1980).

This species is facultatively social and lives in matrilineal groups. Social groups are composed of one or more adult females, their kin, and one adult male. Marmots experience three distinct life stages: juveniles, or young of the year, yearlings, or those that have survived their first winter, and adults, those individuals that have survived their second winter and are reproductive

(Armitage, 1991) and behaviors during these life stages are distinctly different.

3 My dissertation used data collected from a well-studied population of yellow-bellied marmots near Rocky Mountain Biological Laboratory (RMBL) in Gunnison County Colorado,

USA (38° 57′ North, 106° 59′ West). The behavior, physiology, and demographics of this population have been studied continuously since 1962, providing a great deal of knowledge about the direct and indirect effects stressors impose on marmots. I focused my dissertation chapters on gene expression in yearling marmots because they exhibit significant social variation and undergo natal dispersal during this important life history stage.

As part of this long-term study, our research team regularly live-trapped the population throughout their active season (mid-April through mid-September, after which marmots hibernate in underground burrows). Marmots were trapped bi-weekly using Tomahawk live traps

(Tomahawk Inc., Wisconsin) baited with horse oats (Omolene, Purina, Gray Summit, MO). We applied uniquely numbered ear tags for permanent identification and marked individuals with non-toxic fur dye to facilitate behavioral data collection from afar (Blumstein, 2013). Trapping is intensive and we trap nearly all individuals in the population at least once a year and most are trapped several times.

During trapping, we transferred marmots to cloth handling bags and collected 1 mL whole blood, preserved in 2.5 mL PAXgeneTM Blood RNA solution (PreAnalytiX, Qiagen,

Hombrechtikon, Switzerland). Many contemporary gene expression studies analyze function- specific tissues such as brain (Aubin-Horth, Landry, Letcher, & Hofmann, 2005; Johnston et al.,

2016; Sanogo, Hankison, Band, Obregon, & Bell, 2011), muscle (C. M. Jones et al., 2015; Postel,

Thompson, Barker, Viney, & Morris, 2010; Vellichirammal et al., 2014), and liver (Désert et al.,

2008; Harris, Munshi-South, Obergfell, & O’Neill, 2013; Song et al., 2014). However, peripheral blood cell RNA profiling has great potential for use, particularly in wild animal populations, for

4 two reasons: 1. Compared to other tissues, blood is relatively easy to obtain and sampling inflicts minimal harm and pain to the animal. This is a critical factor for field biologists because studies should be as minimally invasive as possible and should avoid affecting the behavior and function of the study population. 2. Blood interacts with every organ in the body and is involved in the innate and acquired immune systems. Specific populations of cells in blood express receptors for stress mediators such as hormones, neurotransmitters, and cytokines, making them an ideal tissue type for the evaluation of the physiological stress response (Mohr & Liew, 2007). Recent evidence suggests that gene expression profiling of peripheral blood cells should be considered as a general strategy for identifying biomarkers and measuring general ecosystem health in response to a diverse array of ecological stressors including toxicant exposure (Bowen et al., 2012), disease

(Baechler et al., 2007), and psychological stress (Morita et al., 2005).

Our research team observed all focal marmot colonies daily from 7:00-10:00 and 16:00-

19:00 hours (the peaks of marmot activity; Armitage, 1962) throughout the active season from a distance of 100 to 300 meters to not disrupt natural behavior. We recorded date, time, and location of a variety of social behaviors, including affiliative (e.g. allogrooming, sitting in body contact, play wrestling) and agonistic (e.g. slapping, chasing, biting). For this dissertation, I specifically used trapping and behavioral observation data collected from 2013 to 2015.

Physiology of dispersal

During their second year, nearly 100% of yearling males and approximately 50% of females disperse to new areas outside of their mother’s home range (Armitage, 2014; hereafter, dispersers; Van Vuren, 1990). The remaining 50% of female yearlings never leave their natal site, are recruited into the matriline, and breed in the group for the remainder of their lives (hereafter,

5 residents). Dispersers leave a familiar environment to potentially face a heightened risk of predation (Van Vuren & Armitage, 1994; Yoder, Marschall, & Swanson, 2004), physical attacks from unfamiliar conspecifics (Soulsbury, Baker, Iossa, & Harris, 2008), and exposure to novel pathogens in new environments (risks reviewed by Bonte et al., 2012; Srygley, Lorch, Simpson,

& Sword, 2009).

These risks that occur during dispersal are not trivial, so dispersers likely prepare physiologically for them. The few studies that have investigated transcription during or prior to dispersal have done so in a laboratory setting where the environment is controlled (Brisson,

Davis, & Stern, 2007; Vellichirammal et al., 2014; Wheat et al., 2011). To truly understand the molecular pathways and physiological adaptation to this major life history event, we must study how it develops in free-living animals.

In Chapter 1, I quantified genome-wide transcription levels in blood from female yearling yellow-bellied marmots that either subsequently dispersed or remained residents in their natal colony. I tested three predictions. First, I hypothesized genes and gene networks will be differentially expressed (DE) between dispersing and resident marmots since dispersal likely requires behavioral and physiological changes. Consequently, I expected DE genes to primarily be involved in metabolism, muscle building, circadian processes, and immune function. Finally, in dispersing individuals, I predicted DE genes would change as a function of sampling date relative to dispersal date.

I detected significant differential expression between dispersers and residents in 150 individual genes, including genes that play critical roles in lipid metabolism and muscle generation. I also discovered a dispersal-associated gene network that was enriched for extra- cellular immune function and major histocompatibility (MHC) class II pathways. Of the

6 dispersal-associated genes, 22 fluctuated as a function of days until dispersal, indicating that expression shifts to meet the physiological demands of dispersal, but that all genes do not initiate transcription on the same time scale. These results suggest that wild vertebrates require fundamental molecular changes for natal dispersal in vertebrates and that there exists an evolutionary conservation of metabolic pathways during this behavioral process.

The social environment

In social species, an individual’s relationship with conspecifics plays a major role in its life history decisions. Sociogenomics is an emerging field that analyzes the relationship between the social environment and genomic activity (Robinson, Grozinger, & Whitfield, 2005). This field aims to understand how genetic mechanisms regulate social behavior and development as well as how the genome responds to social cues, providing insights into the evolution and maintenance of sociality across taxa. Many sociogenomic studies have observed that the social environment can have profound impacts on gene expression. Specifically, socially isolated or bullied individuals typically increase expression of genes involved in inflammation and decrease expression of genes involved in antiviral responses and antibody synthesis in blood tissues (a pattern termed the conserved transcriptional response to adversity; CTRA) (Steven W. Cole,

2013, 2014). However, most existing studies have been conducted on humans and other highly social species in captivity.

In Chapter 3, I evaluated the immunological gene expression profiles associated with a marmot’s social network status. Based on prior sociogenomic findings in humans and non-human species, I predicted: 1) A marmot’s level of social connectedness or social stress would be associated with differential expression of immune system genes in blood cells; 2) Marmots that

7 are more social and that frequently engage in affiliative interactions with conspecifics would significantly up-regulate canonical antiviral genes compared to marmots that are socially isolated from conspecifics; and 3) Individuals that receive more agonistic interactions from conspecifics

(i.e., marmots that are bullied) will up-regulate canonical inflammatory genes compared to those that are less bullied.

I found a strong transcriptional inflammatory response to social interactions in my study system. Marmots that were socially isolated and those that were bullied by conspecifics exhibited significantly higher expression of pro-inflammatory genes, a trend that was consistent with our hypotheses. Although there is empirical support in the literature for an anti-viral response to social status, I found no such relationship despite having sufficient power to detect it. The absence of increased antiviral activity in highly affiliative marmots suggests that this facultatively social species does not gain all of the immunological benefits of strong social relationships that are typically seen in species with more fixed social life history strategies.

Predator-induced stress

Psychological stressors are known to induce many physiological responses via the sympathetic nervous system and the hypothalamo-pituitary-adrenal (HPA) axis. An increasing number of studies have shown that predator-induced psychological stress has complex effects on the physiology of prey species, including on the molecular level. Predator pressure influences transcription of a diverse set of functional pathways in laboratory settings, but we do not have a good understanding of how transcription responds to an increased risk of predation in the wild.

Marmots are prey to several mammalian and avian predators (Van Vuren, 2001) and marmot colonies at RMBL experience significantly different exposure to predators, making them an

8 excellent system in which to study these dynamics. Furthermore, glucocorticoid stress hormone levels have been found to be positively correlated with the degree of predator pressure in this species (Monclús, Tiulim, & Blumstein, 2011), indicating that marmots in high predation areas experience chronic stress compared to those that experience low predation.

In Chapter 3, I quantified predator pressure for each marmot colony and examined if and how individual gene expression in marmot leukocytes changed with chronic predator pressure. I also assessed whether these differentially expressed genes were statistically over-represented by functional categories and tested for transcription factor activity in promoters that may regulate the observed gene expression differences. I found that 349 genes significantly changed expression as a function of chronic exposure to predators and these genes were largely involved in the cellular response to stress, metabolism, and DNA repair. Transcription factor analysis indicated promoters of differentially expressed genes associated with predation pressure were statistically enriched for glucocorticoid receptor activity.

These findings support the canonical expectation that the stress associated with predation induces cells to mobilize the HPA axis, glucocorticoid signaling, and cellular homeostasis. Our results confirm that the physiological response to stress is complex, initiating multiple levels of transcriptional processes and pathways.

General conclusions and future directions

My dissertation research found that on multiple levels, genomic transcription responds to a variety of naturally occurring stressors. I found that young females prepare for dispersal by up- regulating networks of genes involved in metabolism and pathogen defense, that an adverse social environment leads to chronic inflammatory activity, and that the psychological stress

9 associated with predation risk leads to increased HPA axis and glucocorticoid activity on the molecular level. These results suggest that fundamental molecular changes are required for organisms to adapt to their external environment and similarities with previous studies suggest an evolutionary conservation of several genetic pathways during these behavioral processes.

Although the gene expression literature is extensive, my research is novel because it is one of the first to identify associations between gene transcription and environmental stressors in a largely uncontrolled setting. My approach demonstrates the power of using genomics to test specific physiological hypotheses and helps illustrate how wild organisms “perceive” and adapt to a constantly changing external environment. Free-living animals regularly encounter life- threatening challenges and understanding how they physiologically respond to such stressors is key to predicting and managing wild populations.

Future work should focus on evaluating the transcriptional response in additional wild systems to compare and confirm our findings. It would be especially interesting to study the

CTRA response in species with different social systems. We also need to improve our understanding of the proteomic and phenotypic results of such gene transcription. That is, do gene transcription shifts ultimately result in observable phenotypic differences? These types of studies would allow us to confirm the role of gene regulation in the diverse and complex phenotypes we observe in wild organisms.

10 Bibliography

Arbeitman MN, Furlong EM, Imam F et al. (2002) Gene expression during the life cycle of

Drosophila melanogaster. Science, 297, 2270–2276.

Armitage KB (1962) Social behaviour of a colony of the yellow-bellied marmot (Marmota

flaviventris). Animal Behaviour, 10, 319–331.

Armitage KB (1991) Social and population dynamics of yellow-bellied marmots: Results from

long-term research. Annual Review of Ecology, Evolution, and Systematics, 22, 379–407.

Armitage KB (2014) Marmot Biology: Sociality, Individual Fitness, and Population Dynamics.

Cambridge University Press, Cambridge.

Aubin-Horth N, Landry CR, Letcher BH, Hofmann HA (2005) Alternative life histories shape

brain gene expression profiles in males of the same population. Proceedings of the Royal

Society B: Biological Sciences, 272, 1655–1662.

Baechler E, Bauer JW, Slattery CA et al. (2007) An interferon signature in the peripheral blood

of Dermatomyositis patients is associated with disease activity. Molecular Medicine, 13, 59–

68.

Blumstein DT (2013) Yellow-bellied marmots: insights from an emergent view of sociality.

Philosophical Transactions of the Royal Society B, 368, 20120349.

Boivin DBDB, James FOFO, Wu A et al. (2003) Circadian clock genes oscillate in human

peripheral blood mononuclear cells. Biosystems, 102, 4143–4145.

Bonte D, Van Dyck H, Bullock JM et al. (2012) Costs of dispersal. Biological Reviews, 87, 290–

312.

Bowen L, Miles AK, Murray M et al. (2012) Gene transcription in sea otters (Enhydra lutris);

development of a diagnostic tool for sea otter and ecosystem health. Molecular Ecology

11 Resources, 12, 67–74.

Bowen L, Riva F, Mohr C et al. (2007) Differential gene expression induced by exposure of

captive mink to fuel oil: A model for the sea otter. EcoHealth, 4, 298–309.

Brisson JA, Davis GK, Stern DL (2007) Common genome-wide patterns of transcript

accumulation underlying the wing polyphenism and polymorphism in the pea aphid

(Acyrthosiphon pisum). Evolution and Development, 9, 338–346.

Chu ND, Miller LP, Kaluziak ST, Trussell GC, Vollmer S V. (2014) Thermal stress and

predation risk trigger distinct transcriptomic responses in the intertidal snail Nucella lapillus.

Molecular Ecology, 23, 6104–6113.

Cole SW (2013) Social regulation of human gene expression: Mechanisms and implications for

public health. American Journal of Public Health, 103, S84-92.

Cole SW (2014) Human social genomics. PLoS Genetics, 10, 4–10.

Colinet H, Lee SF, Hoffmann A (2010) Temporal expression of heat shock genes during cold

stress and recovery from chill coma in adult Drosophila melanogaster. FEBS Journal, 277,

174–185.

Désert C, Duclos MJ, Blavy P et al. (2008) Transcriptome profiling of the feeding-to-fasting

transition in chicken liver. BMC Genomics, 9, 611.

Frase BA, Hoffmann RS (1980) Marmota flaviventris. Mammalian Species, 135, 1–8.

Graveley BR, Brooks AN, Carlson JW et al. (2011) The developmental transcriptome of

Drosophila melanogaster. Nature, 471, 473–479.

Guo P, Baum M, Grando S et al. (2009) Differentially expressed genes between drought-tolerant

and drought-sensitive barley genotypes in response to drought stress during the reproductive

stage. Journal of Experimental Botany, 60, 3531–3544.

12 Harris SE, Munshi-South J, Obergfell C, O’Neill R (2013) Signatures of rapid evolution in urban

and rural transcriptomes of white-footed mice (Peromyscus leucopus) in the New York

Metropolitan Area. PLoS One, 8, e74938.

Hazard D, Fernandez X, Pinguet J et al. (2011) Functional genomics of the muscle response to

restraint and transport in chickens. Journal of Animal Science, 89, 2717–2730.

Hook SE, Skillman AD, Small JA, Schultz IR (2006) Gene expression patterns in rainbow trout,

Oncorhynchus mykiss, exposed to a suite of model toxicants. Aquatic Toxicology, 77, 372–

385.

Idaghdour Y, Czika W, Shianna K V et al. (2010) Geographical genomics of human leukocyte

gene expression variation in southern Morocco. Nature Genetics, 42, 62–7.

Johnston RA, Paxton KL, Moore FR, Wayne RK, Smith TB (2016) Seasonal gene expression in a

migratory songbird. Molecular Ecology, 25, 5680–5691.

Jones CM, Papanicolaou A, Mironidis GK et al. (2015) Genomewide transcriptional signatures of

migratory flight activity in a globally invasive insect pest. Molecular Ecology, 24, 4901–

4911.

Kennerly E, Ballmann A, Martin S et al. (2008) A gene expression signature of confinement in

peripheral blood of red wolves (Canis rufus). Molecular Ecology, 17, 2782–91.

Lemay DG, Neville MC, Rudolph MC, Pollard KS, German JB (2007) Gene regulatory networks

in lactation: identification of global principles using bioinformatics. BMC Systems Biology,

1, 56.

Mohr S, Liew CC (2007) The peripheral-blood transcriptome: new insights into disease and risk

assessment. Trends in Molecular Medicine, 13, 422–432.

Monclús R, Tiulim J, Blumstein DT (2011) Older mothers follow conservative strategies under

13 predator pressure: The adaptive role of maternal glucocorticoids in yellow-bellied marmots.

Hormones and Behavior, 60, 660–665.

Morita K, Saito T, Ohta M et al. (2005) Expression analysis of psychological stress-associated

genes in peripheral blood leukocytes. Neuroscience Letters, 381, 57–62.

Postel U, Thompson F, Barker G, Viney M, Morris S (2010) Migration-related changes in gene

expression in leg muscle of the Christmas Island red crab Gecarcoidea natalis: Seasonal

preparation for long-distance walking. The Journal of Experimental Biology, 213, 1740–

1750.

Robinson GE, Grozinger CM, Whitfield CW (2005) Sociogenomics: Social life in molecular

terms. Nature Reviews Genetics, 6, 257–271.

Sanogo YO, Hankison S, Band M, Obregon A, Bell AM (2011) Brain transcriptomic response of

threespine sticklebacks to cues of a predator. Brain, Behavior and Evolution, 77, 270–85.

Song L, Zhang J, Li C et al. (2014) Genome-wide identification of Hsp40 genes in channel

catfish and their regulated expression after bacterial infection. PLoS ONE, 9, 1–20.

Sørensen JG, Nielsen MM, Kruhøffer M, Justesen J, Loeschcke V (2005) Full genome gene

expression analysis of the heat stress response in Drosophila melanogaster. Cell Stress &

Chaperones, 10, 312–328.

Soulsbury CD, Baker PJ, Iossa G, Harris S (2008) Fitness costs of dispersal in red foxes (Vulpes

vulpes). Behavioral Ecology and Sociobiology, 62, 1289–1298.

Srygley RB, Lorch PD, Simpson SJ, Sword GA (2009) Immediate dietary effects on

movement and the generalised immunocompetence of migrating Mormon crickets Anabrus

simplex (Orthoptera: Tettigoniidae). Ecological Entomology, 34, 663–668.

Vellichirammal NN, Zera AJ, Schilder RJ et al. (2014) De novo transcriptome assembly from fat

14 body and flight muscles transcripts to identify morph-specific gene expression profiles in

Gryllus firmus. PLoS One, 9, e82129.

Van Vuren DH (1990) Dispersal of Yellow-Bellied Marmots. University of Kansas.

Van Vuren DH (2001) Predation on yellow-bellied marmots (Marmota flaviventris). The

American Midland Naturalist, 145, 94–100.

Van Vuren DH, Armitage KB (1994) Survival of dispersing and philopatric yellow-bellied

marmots: What is the cost of dispersal? Oikos, 69, 179–181.

Wheat CW, Fescemyer HW, Kvist J et al. (2011) Functional genomics of life history variation in

a butterfly metapopulation. Molecular Ecology, 20, 1813–1828.

Xue Z, Huang K, Cai C et al. (2013) Genetic programs in human and mouse early embryos

revealed by single-cell RNA sequencing. Nature, 500, 593–597.

Yoder JM, Marschall EA, Swanson DA (2004) The cost of dispersal: Predation as a function of

movement and site familiarity in ruffed grouse. Behavioral Ecology, 15, 469–476.

Zhang R, Lahens NF, Ballance HI, Hughes ME, Hogenesch JB (2014) A circadian gene

expression atlas in mammals: Implications for biology and medicine. Proceedings of the

National Academy of Sciences, 111, 16219–16224.

15

Chapter 2:

Genes involved in musculature, metabolism, and antigen defense are up-regulated in

yellow-bellied marmots prior to dispersal

16 Abstract

The causes, consequences, and risks of natal dispersal have been studied extensively in many taxa. However, the molecular mechanisms involved in dispersal have not yet been investigated in vertebrates. We used RNA-seq to quantify whole transcriptome gene expression in blood of wild yellow-bellied marmots (Marmota flaviventer) prior to either dispersing from or remaining philopatric to their natal colony site. We tested three predictions. First, we hypothesized genes and gene networks will be differentially expressed (DE) between dispersing and resident marmots since dispersal likely requires behavioral and physiological changes.

Consequently, we expected DE genes to primarily be involved in metabolism, muscle building, circadian processes, and immune function. Finally, in dispersing individuals, we predicted DE genes would change as a function of sampling date relative to dispersal date. We detected significant differential expression between dispersers and residents in 150 individual genes, including genes that play critical roles in lipid metabolism and muscle generation. Gene network analysis revealed a co-expressed module of 126 genes associated with dispersal. This module included many of the musculature function genes and was enriched for extra-cellular immune function and MHC class II pathways. Of these dispersal-associated genes, 22 fluctuated as a function of days until dispersal, suggesting that genes alter expression to meet the physiological demands of dispersal, but that all genes do not initiate transcription on the same time scale. Our results provide novel insights into the fundamental molecular changes required for natal dispersal in vertebrates and suggest evolutionary conservation of metabolic pathways in this behavioral process.

17 Introduction

Natal dispersal, the permanent movement of an individual from their birth site to a new site, is a complex trait influenced by morphology, physiology, and behavior (Clobert, Danchin,

Dhondt, & Nichols, 2001). In vertebrates, one sex is typically philopatric, but even for the less dispersing sex there is substantial variation in the likelihood of dispersal (Clutton-Brock & Lukas,

2012; Greenwood, 1980; Pusey, 1987) and a wide variety of both internal and external factors influence this variation. The primary internal predictors of dispersal include mass and body condition (Barbraud, Johnson, & Bertault, 2003; Meylan, Belliure, Clobert, & de Fraipont, 2002;

Nilsson & Smith, 1985), hormones such as testosterone and glucocorticoids (reviewed by Dufty

& Belthoff, 2001; Silverin, 1997; Woodroffe, Macdonald, & da Silva, 1993) behavioral traits

(Cote & Clobert, 2007; reviewed by Cote, Clobert, Brodin, Fogarty, & Sih, 2010; Dingemanse,

Both, van Noordwijk, Rutten, & Drent, 2003), and additive genetic variance (although heritability estimates vary widely; see review by Doligez & Pärt, 2008). The propensity to disperse is also largely driven by external forces, such as population density (Matthysen, 2005), breeding opportunities (Alberts & Altmann, 1995; Pope, 2000; Pruett-Jones & Lewis, 1990), habitat and food quality (Ekernas & Cords, 2007; Lin & Batzli, 2001; Lurz, Garson, & Wauters, 1997), and social relationships (Bekoff, 1977; Gese, Ruff, & Crabtree, 1996; Poirier, Bellisari, & Haines,

1978).

There are many important consequences of dispersal. On the population level, it can drive ecological and evolutionary dynamics through gene flow, genetic subdivision, population size, and the extent of geographic distribution (Chepko-Sade & Halpin, 1987; Clobert et al., 2001;

Duckworth & Badyaev, 2007; Hammond, Handley, Winney, Bruford, & Perrin, 2006; Pusey,

1987). On the individual level, leaving the natal colony is one of the most profound and

18 potentially costly events to occur (Smale et al. 1997) in an individual’s lifetime. Dispersers leave a familiar environment to potentially face a heightened risk of predation (Van Vuren & Armitage,

1994; Yoder et al., 2004), physical attacks from unfamiliar conspecifics (Soulsbury et al., 2008), and exposure to novel pathogens in new environments (risks reviewed by Bonte et al., 2012;

Srygley et al., 2009).

The many risks that occur during dispersal are not trivial, so dispersers are likely to prepare physiologically for them. Change in gene expression can provide one immediate adaptive response to challenging conditions and requirements. Consequently, comparative analyses of gene expression between dispersing and non-dispersing individuals may allow the unbiased, genome wide discovery of specific genes and gene networks whose products aid in successful dispersal. However, the only studies examining gene expression and animal dispersal, to our knowledge, have been done in winged insects. Insects can display dramatic dispersal polymorphisms in the context of alternative life history strategies, including winged dispersers and sedentary wingless morphs of the same species (reviewed by Zera & Brisson, 2012; Zera &

Denno, 1997). Transcriptomic comparisons of these morphs reveal underlying coordinated physiological strategies, where dispersers significantly up-regulate (and therefore allocate energy to) genes involved in cellular energy production and metabolism as well as muscle building

(Brisson et al., 2007; Vellichirammal et al., 2014; Wheat et al., 2011). However, considering the dramatic phenotypic difference between these two morphotypes, the observed transcriptomic differences may be a primary result of the development requirements for somatic transformation rather than an adaptation to dispersal per se.

Although studies examining gene expression differences during dispersal are limited, the molecular pathways involved with seasonal migration have received much more attention. Since

19 both dispersal and migration require endurance for long distance travel, occur at discrete times of year, and incur many of the same risks, physiological adaptation or preparation for these two life history events may be somewhat analogous (Liedvogel, Åkesson, & Bensch, 2011). Genes involved in metabolism and mobilization of cellular energy stores, especially lipids, are commonly differentially expressed (DE) between migratory phenotypes. Metabolic gene expression is significantly higher in migratory vs. non-migratory birds (Boss et al., 2016;

Fudickar et al., 2016) and butterflies (Zhu, Gegear, Casselman, Kanginakudru, & Reppert, 2009), aphids and crickets that have wings compared to those that do not (Brisson et al., 2007;

Vellichirammal et al., 2014), and moths that experimentally fly long vs. short distances (C. M.

Jones et al., 2015). Increasing lipid metabolism during migratory periods is not surprising since lipids are the primary source of energy during migration when individuals may go long periods without food (Blem, 1980; Jeffs, Willmott, & Wells, 1999; Jenni & Jenni-Eiermann, 1998).

Genes that transcribe muscular proteins also tend to be up-regulated during insect migration (C.

M. Jones et al., 2015; Vellichirammal et al., 2014; Zhan et al., 2014; Zhu et al., 2009), bird migration (Boss et al., 2016), and crustacean migration (Postel et al., 2010) and genes that dictate circadian processes are known to be important aspects of migration phenology for invertebrates

(Zhu et al., 2009) and vertebrates (Boss et al., 2016; Johnston et al., 2016).

Yellow-bellied marmots (Marmota flaviventer) are an excellent system in which to study the molecular and physiological pathways involved in vertebrate dispersal because the timing of dispersal is highly predictable (Armitage, 2014; Blumstein, Wey, & Tang, 2009) and there are established social and sex-biased correlates. Starting in late June of their second summer (as yearlings), nearly all males and approximately 50% of female marmots disperse to new areas outside of their mother’s home range (Armitage, 2014; hereafter called dispersers; Van Vuren,

20 1990). The remaining 50% of female yearlings never leave their natal site, are recruited into the matriline, and breed in the group for the remainder of their lives (hereafter, residents). A female’s affiliative social relationships with other females in her natal colony, particularly matriarchs (e.g., her mother, grandmother, older sister) explain variation in dispersal (Armitage, Van Vuren,

Ozgul, & Oli, 2011; Blumstein, Wey, et al., 2009). As predicted by the social cohesion hypothesis (Bekoff, 1977) females that engage in more affiliative behaviors with kin and those that are more embedded in the social group, are less likely to disperse (Blumstein, Wey, et al.,

2009).

Because dispersing female marmots are significantly less socially integrated than their resident counterparts, the social environment may also affect gene expression profiles.

Sociogenomics is an emerging field that identifies the genes that are subject to social- environmental regulation (Robinson et al., 2005). Generally, compared to those more socially integrated, socially isolated individuals up-regulate proinflammatory, anti-bacterial genes and down-regulate anti-viral, innate immune response genes (reviewed by Steven W. Cole, 2014;

Steven W. Cole et al., 2007; Miller et al., 2009). Most sociogenomic studies have been conducted in humans, but all social vertebrates studied thus far exhibit this conserved transcriptional response, including laboratory rodents (Avitsur, Hunzeker, & Sheridan, 2006) and captive primates (Steven W. Cole et al., 2012; Snyder-Mackler et al., 2016; Tung et al., 2012). Therefore, we expect an analogous up-regulation of anti-bacterial, proinflammatory genes in less socially integrated dispersers.

To evaluate the many physiological pathways that are likely associated with dispersal, we used RNA sequencing (RNA-seq) to quantify genome-wide transcription levels in blood from female yearling yellow-bellied marmots that either subsequently dispersed or remained residents

21 in their natal colony. We sampled blood because it can be obtained through a minimally invasive approach and it is a good tissue for exploring a variety of physiological functions. Blood is dominated by red blood cells which are non-nucleated, so white blood cells or leukocytes produce the dominant transcripts. Leukocytes are immune cells that defend the body against infectious disease and foreign antigens and thus, blood is an appropriate tissue to study the predicted response to pathogens. Furthermore, leukocytes share approximately 80% of mRNA with other tissues (Liew, Ma, Tang, Zheng, & Dempsey, 2006) and several studies have demonstrated that it can act as an ideal surrogate for multiple tissue types (Davies et al., 2009; Kohane & Valtchinov,

2012; Rudkowska et al., 2011; e.g., Sullivan, Fan, & Perou, 2006). For example, although liver is one of the most important metabolic organs (Rui, 2014), blood is frequently used to perform many basic metabolic tests in humans including quantifying levels of glucose, cholesterol, and hemoglobin A1c. Most genes involved in key metabolic pathways are expressed in blood (e.g.,

Ghosh et al., 2010; Rudkowska et al., 2011) and these signaling pathways are largely conserved across tissues. Gene expression profiles in blood have also been shown to be highly correlated with profiles in muscle tissues (skeletal muscle correlation = 0.84; Rudkowska et al., 2011; e.g., smooth muscle correlation = 0.61; Sullivan et al., 2006) and muscle-specific genes can also be expressed in non-muscle tissues (e.g., Chelly, Kaplan, Maire, Gautron, & Kahn, 1988; Mizuno et al., 2011; Rudkowska et al., 2011). Finally, although brain tissue is typically used to examine expression of genes involved in circadian processes, expression of these genes is relatively quite low compared to other tissues examined (R. Zhang et al., 2014). Unfortunately Zhang and colleagues did not assess circadian expression in blood, but other work has shown that many circadian genes are transcribed by leukocytes (Archer, Viola, Kyriakopoulou, von Schantz, &

Dijk, 2008; Boivin et al., 2003). Thus, although leukocyte RNA is not a perfect surrogate for an

22 assessment of metabolic, muscle-related, or circadian expression, it can provide meaningful information when more relevant tissues are inaccessible.

We used two methods to compare transcriptomes between the groups. First, we tested for differential expression of individual genes while controlling for other potentially confounding variables using a linear mixed model approach. Second, we performed a weighted gene co- expression network analysis (WGCNA), which identifies groups (“modules”) of genes that are highly co-expressed across samples (Oldham et al., 2008; B. Zhang & Horvath, 2005) and then tested gene modules for correlation with dispersal. Co-expression of genes suggests they are involved in similar biological processes (Parikshak, Gandal, & Geschwind, 2015; Weston,

Gunter, Rogers, & Wullschleger, 2008), facilitating an understanding of how genes coordinate to achieve a complex phenotype on a more inclusive level than individual genes. Based on our knowledge of the molecular profiles of dispersing and migratory phenotypes, we tested three hypotheses: 1) Dispersing and resident marmots differentially express genes and gene networks prior to the dispersal event. Although this hypothesis seems somewhat trivial, gene expression involved in dispersal behavior has yet to be explored in vertebrates, thus it may be the case that our null hypothesis is true (i.e., there are no gene expression differences between the two classes or the expression differences are not readily interpretable with regard to functional expectations);

2) Lists of dispersal-related DE genes are enriched for the biological processes of lipid metabolism, muscle growth, circadian processes, and extracellular pathogen defense; and 3)

Genes from these lists are likely beneficial in preparing for dispersal, so genes up-regulated in dispersers should increase expression as the date of dispersal approaches.

23 Materials and methods

Study system and observations

We studied free-living yellow-bellied marmots near the Rocky Mountain Biological

Laboratory in Gunnison County, Colorado from 2013-2015. Marmots emerge from hibernation in late April or early May (Blumstein, Im, Nicodemus, & Zugmeyer, 2004) and were live-trapped throughout the active season (approximately mid-May to mid-September). During all trap events, we recorded basic measurements (mass, foot length, etc.), affixed ear tags, and applied a unique dorsal fur mark for individual identification from afar (Blumstein, 2013). We observed eight colonies from 07:00–10:00 and 16:00–19:00 h (the peaks of marmot activity; Armitage 1962) and recorded date, time, location, and behaviors (ethogram and detailed observation methods in

Blumstein, Wey, et al., 2009). In general, each site was under continuous observation during periods of peak marmot activity throughout the active season and most individuals were observed daily.

Dispersers leave approximately when young of the year emerge from their natal burrows

(Armitage, 1991, 2014). Since pup emergence dates vary across years and colony sites, we defined dispersers as those that were observed or trapped earlier in the summer and were not seen later that year up to 10 days after the date of a colony site’s first pup emergence (as in Blumstein,

Wey, et al., 2009). We defined dispersal date as the last date an individual was observed or trapped at its natal colony. Animals that were known to have died during the year of sampling and those that returned to their natal colony after a prolonged absence were excluded from all analyses.

24 RNA sampling, library preparation, and sequencing

During each yearling capture, we collected 1 mL whole blood and preserved it in 2.5 mL

PAXgeneTM Blood RNA solution (PreAnalytiX, Qiagen, Hombrechtikon, Switzerland). Samples were incubated, frozen, and extracted according to PAXgeneTM Blood RNA kit manufacturers’ instructions. Only samples taken during the dispersal period (up to 10 days after the colony’s pup emergence) were included in subsequent steps. We removed globin transcripts using the

GLOBINclearTM kit for mouse/rat (Ambion, ThermoFisher Scientific, Waltham, MA) and RNA quality was assessed with an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA).

To preserve statistical power, we excluded any samples with RIN < 4 and globally corrected for

RNA degradation by regressing any effect of RIN (as suggested by Romero, Pai, Tung, & Gilad,

2014; details below). cDNA libraries were created using a TruSeq Library Prep Kit v2 (Illumina,

Madison, WI), quantified with the KAPA SYBR® Fast qPCR library quantification kit (Kapa

Biosystems Inc., Wilmington, MA), and pooled at 8-10 samples per lane. Single-end 100 bp sequencing was performed on Illumina HiSeq2000 (2013 samples) and HiSeq4000 (2014-2015) platforms at the Vincent J. Coates Genomics Sequencing Laboratory, UC Berkeley, USA.

Read mapping, expression quantification and outlier removal

We trimmed adapters and removed short (< 20 base pairs) and low quality reads (Phred score < 20) using Trim Galore! (Krueger, 2015). Resulting reads were mapped to the thirteen- lined ground squirrel (Ictidomys tridecemlineatus) genome (spetri2, GenBank Assembly ID

GCA_000236235.1) using TopHat2 v.2.1.0 (Kim et al., 2013; Trapnell, Pachter, & Salzberg,

2009a). These species are estimated to have diverged approximately 8.6 million years ago

(Bininda-Emonds et al., 2007; Soria-Carrasco & Castresana, 2012) and exhibit sequence

25 divergence of 13.2% (Thomas & Martin, 1993). To maximize uniquely mapped reads, we allowed eight mismatches, a 10- (bp) gap length, and a 20 bp edit distance between read sequences and the reference genome. We quantified expression levels of uniquely mapped reads per individual using HT-Seq’s ‘union’ mode (Anders, Pyl, & Huber, 2015) and the squirrel gene annotation file (Ictidomys_tridecemlineatus.spetri2.84.gtf). All statistical analysis hereafter was carried out using R version 3.3.1 (R Core Team, 2016). We filtered this dataset to only include protein-coding genes with at least 10 reads in 75% of libraries. We transformed these count data for linear modeling while normalizing according to sequencing depth, gene length, and mean variance across genes using the ‘voom’ function within the LIMMA package (Law, Chen, Shi, &

Smyth, 2014; Ritchie et al., 2015). We then built a Euclidean distance-based network of samples using the ‘adjacency’ function in WGCNA (Langfelder & Horvath, 2008). Samples were designated as outliers and removed from analyses if their connectivity was more than three standard deviations from the mean (as described by Steve Horvath, 2011).

Removal of technical variation

Non-biological technical sources of variation, or “batch effects”, are ubiquitous in high- throughput genetic studies and introduce systematic bias to gene expression, methylation, and variant calling (Leek et al., 2010). To protect against these spurious artifacts, we used principal component analysis (PCA) of the normalized, VOOM transformed expression data to assess the technically derived variance. Five batch effects significantly influenced the principal components

(PCs) of gene expression: sequence platform (HiSeq 2000 vs 4000; correlated with PC 1 [Pearson correlation r = -0.74, p = 7.435e-09]), sequencing lane (samples distributed across 10 lanes; correlated with PC 1 [r =-0.65, p = 1.494e-06] and PC 5 [r = -0.39, p = 0.0078]), RNA extraction

26 date (performed on six different days; correlated with PC 1 [r = -0.53, p = 0.0002] and PC 7 [r =

0.44, p = 0.0027]), the number of uniquely mapped reads (correlated with PC 1 [r = -0.51, p =

0.0004]) and RNA integrity number (RIN; correlated with PC 3 [r = -0.39, p = 0.0079]). To remove technical variation, we iteratively regressed out batch effects in descending order of correlation with PCs, beginning with the sequence platform. Because some batch effects were highly correlated with the sequence platform (lane r = 0.88, p = 3.164e-15; unique reads r = 0.64, p = 2.115e-06), these variables did not influence expression variance after regressing the sequence platform out; thus, we did not regress these variables. We subsequently regressed the effect of RIN (a technique suggested by Romero et al., 2014 for handling degraded RNA samples) and extraction date. Cellular composition is also known to influence gene expression in blood samples (Palmer, Diehn, Alizadeh, & Brown, 2006). We controlled for this effect by quantifying cell types in blood smears for all RNA samples (detailed methods described in Lopez,

Wey, & Blumstein, 2013) and regressed the effect of the cell type with the strongest signal to a

PC (neutrophils; correlated with PC 4 [r = -0.42, p = 0.0039]). We controlled for any lane effect by regressing out this variable. However, in retrospect, we acknowledge that balancing samples across lanes (as we did with several other variables such as sampling data, colony, disperser status etc.) would have been a more robust approach for removing potential bias. Nonetheless after regression, lane had no significant association with any PC of global gene expression

(highest correlation with a PC = PC 9 [r = -0.15, p = 0.35]).

Linear mixed effects models

To identify individual genes that were significantly affected by the dispersal phenotype, we created linear mixed models using the package EMMREML (Akdemir & Godfrey, 2015).

27 Models included dispersal status (disperser=1, resident=0), day of the year of blood sample collection (to account for seasonal variation), and time of day of sample collection (to account for circadian variation) as fixed independent effects, and kinship as a random effect (to control for heritability of gene expression; Tung, Zhou, Alberts, Stephens, & Gilad, 2015; Wright et al.,

2014). Kinship was calculated as pair-wise relatedness between individuals using the triadic maximum likelihood approach in COANCESTRY (J. Wang, 2011) based on genotypes obtained from 12 microsatellite loci (detailed methods described in Blumstein, Lea, Olson, & Martin,

2010). Dependent variables of mixed models were the residuals of the filtered, normalized gene expression counts after regressing batch effect variance. For each model, we extracted the p-value associated with dispersal and adjusted for multiple hypothesis testing using a false discovery rate approach (q; Storey & Tibshirani, 2003) implemented in QVALUE (Storey, Bass, Dabney, &

Robinson, 2015). A gene was considered significantly associated with dispersal if q was < 0.1.

We ran a follow-up mixed model to compare differential expression between the dispersers that were sampled within eight days of dispersal (n = 5) and residents that were matched to these dispersers according to sampling date and time (n = 5; Table S2-1). Model variables and significance thresholds were identical to the model above (gene expression ~ dispersal status + date + time + relatedness; q < 0.1).

Gene network analysis

We identified coordinated gene expression patterns in dispersers using WGCNA

(Langfelder & Horvath, 2008; B. Zhang & Horvath, 2005). We performed a signed network analysis on the gene expression residuals with a soft thresholding power of 14, cut height of 0.3 and minimum module size of 50. Within each co-expressed gene module, WGCNA assigns an

28 eigenvector, or module eigengene (ME), which is the representative expression profile for that module (Oldham et al., 2008). We tested each ME for a significant correlation with dispersal and considered a module significantly associated with dispersal if its ME correlated with the trait at p

< 0.05.

Gene ontology analysis

To identify which biological functions were significantly over-represented in genes associated with dispersal, we performed a gene ontology (GO) analysis on three separate gene lists resulting from the two analyses: 1) Genes identified as up-regulated in dispersers using mixed models; 2) genes found to be down-regulated in mixed models; and 3) genes in network modules significantly associated with dispersal. Using the ENSEMBL thirteen-lined ground squirrel genome as a reference, we acquired HGNC (HUGO Gene Nomenclature Committee; K.

A. Gray, Yates, Seal, Wright, & Bruford, 2014) gene symbol information for squirrel transcripts using BiomaRt (Smedley et al., 2015). We identified the enriched biological processes using gProfileR (Reimand et al., 2016). Queries were the three dispersal related gene lists, background lists included all protein-coding genes expressed at detectable levels (≥ 10 reads in 75% of libraries), and we set the minimum functional category and intersection sizes to five. We corrected for multiple testing using the optimal ‘gSCS’ method (Reimand, Kull, Peterson,

Hansen, & Vilo, 2007).

Temporal expression patterns in dispersing marmots

To further test how important individual genes are for dispersing, we assessed whether expression levels change as a function of days until dispersal using a mixed model approach.

29 Here, we focused only on dispersers (n = 16) and calculated the number of days between the date the sample was collected and the date the individual was last seen (dispersal date) in its natal colony. Observed days until dispersal ranged from 0 to 54 days; that is, some dispersers were sampled on the day they dispersed from the natal colony while others were sampled up to 54 days prior to dispersing. We then fitted linear mixed models in EMMREML with days until dispersal as a fixed predictor variable and kinship as a random predictor variable. Again, dependent variables were gene expression residuals and q was set to 0.1, but here we only fitted models for genes determined to be significantly associated with dispersal in either the previous mixed model analysis or gene network analysis.

Results

RNA-seq samples

Our final dataset consisted of cDNA libraries from 43 female yearlings with known dispersal status (n = 16 dispersers, n = 27 residents). Animals known to have died (n = 2) or that returned after a prolonged absence (n = 1) were excluded from all analyses, but we acknowledge that some disappearances could have been due to undetected mortality. Although lower than the average of 50%, this proportion of dispersers (37%) falls within the range of observed dispersal in this species. The annual proportions of female yearlings that dispersed during this study were

32%, 67%, and 11% in 2013, 2014, and 2015, respectively. Future dispersers were sampled between May 19 to July 3; residents were sampled from June 3 to July 10 across the three years.

On average, we generated 29 million reads per individual and 62% (18 million) uniquely mapped to the squirrel genome with sufficient quality and length. Of the 22,389 protein-coding genes in the squirrel genome, 11,381 (50.8%) were substantially expressed (≥ 10 reads in 75% of

30 libraries), protein-coding genes used in subsequent analyses. Clustering analysis revealed no outlier samples.

Linear mixed effects models

After controlling for sampling date, time, and relatedness across individuals, mixed models identified 150 differentially expressed dispersal-related genes, 119 (79%) of which had known HGNC homologs and descriptions (q < 0.1; Table S2-2). Eighty-one percent (122) had positive (log2) fold changes, indicating up-regulation (98 with known homologs; Figure 2-1) whereas 19% (28) had negative fold changes (down-regulated) in dispersers compared to residents (21 known; Figure 2-1). The mean log2 fold change of the 150 significant genes was

0.013 (± 0.27 SD). Genes with significantly large fold changes (greater than three standard deviations from the mean; -0.81 and 0.81) are highlighted in Figure 2-1. These highly up- regulated genes include muscle-related LMO7, metabolic genes CPNE7 and EHD4, and proinflammatory IL1RL1, and HVCN1.

Our analysis comparing the dispersers that were sampled very close to the actual dispersal date with matched resident samples (Table S2-1) identified an order of magnitude more differential expression between these groups. We found 1,403 genes DE (1,046 up-regulated and

357 down-regulated in the five dispersers) including 63 of the 150 significant genes found in the original mixed model and many additional immunological genes such as HLA-DOA, CD-79B,

CD-200, and HLA-DPB1.

31 Gene network analysis

We identified 18 distinct co-expression modules containing 50 to 2,784 genes per module.

Of these, one module eigengene (designated the “salmon” module by WGCNA) was significantly and positively correlated with dispersal (correlation = 0.36, p = 0.02), indicating up-regulation of module genes in dispersers compared to residents. This module was composed of 126 genes, of which 108 (86%) were annotated with HGNC descriptions (Table S2-3). Many of the salmon module genes are known to regulate inflammation and immune responses to antigen (e.g., cluster of differentiation [CD] and human leukocyte antigen genes [HLA]) and they show a clear trend of co-expression and up-regulation in dispersing marmots compared to residents (Figure 2-2).

Several characteristic markers of activated B lymphocytes are included on this list, including

CD19 (the canonical cell surface marker of B cells), CD74, CD79A/B, CD40, CD83, BLK, and

MHC class II molecules (HLA-D/DR). To further identify the most important genes within this network, we identified the highly-connected nodes, or “hub genes”. Intramodular hub genes are often highly associated with a trait of interest (Geschwind & Konopka, 2009; S Horvath et al.,

2006; B. Zhang & Horvath, 2005) and are generally considered to play important roles within a network (Langfelder, Mischel, & Horvath, 2013). We conservatively defined hub genes as those with a module membership > 0.8 and a correlation with dispersal > 0.3, resulting in 29 hub genes

(Table S2-4). Within the salmon gene network, hub genes are largely associated with immune system responses (e.g., MS4A1, HLA-DOB, CD79B; Figure 2-3) and are particularly characteristic of activated B lymphocytes.

32 Gene ontology analysis

GO analysis of the 122 genes up-regulated in mixed models revealed categorical enrichment of 33 biological processes (Table S2-5). These enriched categories were primarily composed of metabolic functions (e.g., “primary metabolic process”, “organic substance metabolic process”) and transcription regulatory processes (e.g., “RNA binding”, “regulation of gene expression”). There was no categorical enrichment of biological processes in the list of 28 down-regulated genes.

In the genes up-regulated by the five dispersers sampled within eight days of dispersal, these same metabolic and regulatory processes and more were statistically enriched. In total, gene ontology identified 176 biological processes, of which 44 contained the term “metabolic”. This up-regulated list was also enriched for interesting immunological categories including “response to stress”, “leukocyte activation”, and “immune system development”.

The network-derived salmon module was entirely enriched for immunological processes including “B cell receptor signaling pathway”, “antigen processing and presentation of peptide antigen”, and “MHC class II protein complex binding”, among other immune functions (Table 2-

1). Genes involved in these pathways included several orthologues of human leukocyte antigens

(HLA-DMA, -DOB, -DPB1, -DRA) and cluster of differentiation genes characterizing B lymphocytes (CD19, CD74, CD79A/B). The overwhelming enrichment of genes associated with antigen defense, and specifically MHC class II processes, confirmed our hypothesis that the less socially integrated dispersers up-regulate inflammatory and immunological activation genes that protect from intruding bacterial pathogens compared to residents.

33 Temporal expression patterns in dispersing marmots

One expectation is that if these genes are physiologically necessary for dispersal, their expression would be related to the actual dispersal event timing. Consequently, we tested whether the expression of genes associated with dispersal varied as a function of the number of days between sampling and dispersal. 150 genes were found to be associated with dispersal in the mixed model analysis and 126 genes were identified in the network analysis, with 16 genes identified by both approaches (Table 2-2). We then tested whether there was a significant expression fold change (q < 0.1) as a function of the number of days until dispersal in these 260 genes. Twenty dispersal associated genes significantly increased expression as the date of dispersal approached, whereas two genes decreased expression (Table S2-6). This correlation is influenced by the inclusion of two individuals sampled very early in the season (those sampled 54 and 43 days prior to dispersal; Figure 2-3). When we remove these two individuals, the temporal correlation does not persist and only 1 of the 22 temporally significant genes (ATP1B1) changes as a function of date until dispersal. However, we have no biological justification for removing these data. These are the earliest sampled dispersers, and as predicted, these individuals show significantly different gene expression compared to individuals that were sampled immediately prior to dispersal.

Discussion

Dispersal is a complex behavior that is influenced by multiple factors, including genotype, body condition and social and environmental pressures. Consequently, we expected the transcriptomic profiles distinguishing dispersing and resident individuals to be complex as well reflecting these diverse sources of variation. We found that prior to dispersal from their natal

34 colony, yellow-bellied marmots initiate gene expression changes in circulating blood cells across numerous pathways and somatic processes. Echoing much of what is known about the genetics of migration and invertebrate dispersal, these results suggest that many molecular processes are conserved across taxa. Specifically, mixed model analysis confirmed some of our a priori predictions derived from the animal migration literature. Genes important for muscle development and lipid and glucose metabolism proved to have some of the highest fold changes between dispersal phenotypes. We did not observe evidence of circadian or circannual processes being altered prior to dispersal; however, dispersal is not a seasonal or repeated behavior. This result suggests that the circadian physiological shifts that occur during migration may not apply for a single, undirected dispersal event. Furthermore, although many circadian rhythm genes are transcribed by peripheral blood cells (Archer et al., 2008; Boivin et al., 2003), expression values differ dramatically by tissue type (R. Zhang et al., 2014). Thus, sampling RNA from tissues directly regulating circadian rhythms may reveal the expected temporal patterns in expression. As predicted, GO analyses found that up-regulated genes were enriched for metabolic processes on numerous levels and we detected an unexpected enrichment of nucleic acid transcription and regulation (Table S2-5). These signals may stem in part from the marked transcriptional induction required for immunological activation (i.e., consistent with the activated B cell signature observed below). This over-representation of transcription processes suggests that protein synthesis shifts are also critical in the preparation to disperse.

Our network approach produced complimentary and overlapping results. Specifically, 16 individual genes were found to be significantly related to dispersal in both analyses (Table 2-2).

This list includes several genes identified as highly up-regulated in gene-specific linear model analyses, including CPNE7, LMO7, and HVCN1 (Figure 2-1), as well as modular hub genes

35 MS4A1, STAP1, and CNN3 (Figure S2-1). Consistent with the activated B cell signature seen in mixed model results, genes in the WGCNA module were categorically enriched for immune system processes, specifically those related to antigen processing (likely reflecting the marked

HLA-D/DR activation molecule signature). Taken together, the two methods illustrate how numerous processes are involved in preparing animals to disperse.

Muscle and tissue development

Dispersal related genes were not statistically enriched for muscle growth contrary to our prediction. This finding may be due in part to an absence of muscle tissue which was a limitation of our sampling protocol. However, many individual DE genes that were identified in the assayed blood samples are important for muscle development. LMO7 is one of the most highly up- regulated genes by dispersers (Figure 1). Among this gene’s many known functions, the produced protein binds emerin, which stimulates muscle development and an emerin deficiency can lead to muscle wasting and weakness in humans (Nagano et al., 1996). LMO7 also regulates several other genes important for muscle regeneration and dystrophy, such as CREBBP, NAP1L1, LAP2, and RBL2 (Holaska, Rais-Bahrami, & Wilson, 2006). Thus, LMO7 appears critical for building and maintaining healthy muscle mass. This gene was also up-regulated in terrestrial crabs during long-distance migration (Postel et al., 2010), which suggests this gene may be particularly important for acquiring the muscle mass needed during migration and dispersal. CNN3 and YBX3 also have roles in cytoskeletal protein and muscle development (Ayuso et al., 2016; R. Liu & Jin,

2016) and CNN3 has been implicated in growth and weight gain (Tang et al., 2014). Of note, dispersers also highly up-regulated GPATCH1, which is important for bone strength and density

(Mitchell et al., 2015; Zhou et al., 2016; Figure 2-1). It is particularly interesting that we found

36 these results in yearling marmots, which are not fully mature and hence still producing muscle.

Thus, the specific physical demands of dispersal appear to require an increased allocation of energy toward muscle mass and bone tissue in addition to that needed for juveniles to fully develop. Our data revealed modest fold changes by dispersers in these muscle-related genes in blood. However, the majority of the studies we cite assayed muscle tissue directly (Ayuso et al.,

2016; e.g., Postel et al., 2010; Tang et al., 2014). In blood cells, tissue-specific genes correlate with differential expression in other tissues, though they are expressed at lower rates (Liew et al.,

2006; Rudkowska et al., 2011). Therefore, we suggest our results mark changes in musculo- skeletal expression that may also be important for dispersing vertebrates. Future work is needed to confirm activity of these genes in muscle tissues during dispersal events.

Metabolism

The genes that dispersers up-regulated were categorically enriched for metabolic processes, as predicted (Table S2-5). Although blood is frequently used to detect metabolic conditions (such as in standard comprehensive blood panels), it is not a primary metabolic tissue

(Berg, Tymoczko, & Stryer, 2002), so this enrichment is correlative. It would be very interesting to evaluate metabolic gene expression in the liver or kidneys as a follow up analysis, as one would expect an even more pronounced signal in these tissues. One gene with the highest fold change in dispersers was CPNE7 (Figure 2-1), which is functionally annotated with lipid metabolism and is expressed at different rates before and after food deprivation in chickens

(Désert et al., 2008). Interestingly, a related gene, CPNE4, has also been found to be associated with migration in birds (S. Jones, Pfister-Genskow, Cirelli, & Benca, 2008; Ruegg, Anderson,

Boone, Pouls, & Smith, 2014), though this gene was hypothesized to be associated with

37 migratory restlessness in these studies, and not metabolism. EHD4, a highly-conserved gene known to induce endocytosis and ATP hydrolysis in many taxa (Naslavsky & Caplan, 2011; Pohl et al., 2000), is also strongly up-regulated by dispersers. The association of this gene with carbohydrate derivative binding and ATP suggests it is critical in the breakdown of sugar molecules and energy metabolism. GRK5 is implicated in many aspects of physiology, particularly fat uptake and weight gain and a GRK5 deficiency impairs lipid metabolism (F.

Wang, Wang, Shen, & Ma, 2012). Similarly, MAP3K signaling (including MAP3K4) is important for growth factors, lipid metabolism, and adipogenesis and has a vital role in energy homeostasis in many mammals (Sale, Atkinson, & Sale, 1995; Q. Zhang et al., 2016). Since dispersers (and animals that migrate) are moving long distances in unfamiliar territory, they likely experience unpredictable and unreliable food sources for extended time periods. Thus, transcriptional regulation of genes that control lipid metabolism, fat uptake, and weight gain during these life history phases likely helps conserve the energy required for both long distance travel and food deprivation.

Immune system response

Co-expression analysis revealed coordinated up-regulation of immunological genes in dispersers. Although this module included many muscle and metabolic genes, it was categorically over-represented by processes that protect the body from extracellular antigens and bacteria, specifically, major histocompatibility complex (MHC) class II genes (Table 2-1). Moreover, many of the DE genes identified by linear model analyses were specifically characteristic of activated B lymphocytes, including the canonical B cell marker CD19 and the activation of related molecules CD74, CD79A/B, CD40, CD83, BLK, and MHC class II (HLA-D/DR). MHC is

38 a cluster of genes found in vertebrates that encodes a complex of receptors expressed on the surface of all somatic cells (Klein, 1986). These cell surface receptors bind peptides derived from antigens and present them to immune cells to initiate the antigenic response. Whereas MHC class

I genes bind intracellular peptides, MHC II deals with extracellular foreign peptides such as bacteria and presents them to specialized immune cells that distinguish between harmful and neutral antigens (Todd et al., 1988). In conjunction with the up-regulation of multiple B cell activation markers, the observed up-regulation of MHC II molecules suggests that the adaptive immune system may selectively stimulate the antibody-producing B cell sub-population in advance of the potential microbial challenges associated with dispersal.

The B-cell inflammatory/immune activation response may not be entirely due to dispersal.

Rather, it may be stimulated in part by differences in the social environment. Dispersing female marmots engage in fewer affiliative interactions with conspecifics (e.g., grooming, greeting, sitting in body contact; Armitage et al., 2011) and are less embedded in their social network than residents (Blumstein, Wey, et al., 2009). Mounting evidence in human and lab-based sociogenomics suggests that the social environment alone can regulate major shifts in immunological gene expression profiles (Steven W. Cole, 2014; Robinson et al., 2005).

Therefore, we expected to see the canonical inflammatory repertoire in less social dispersers compared to more social residents. Figure 2-2 shows a remarkable pattern of up-regulation of antigen processing genes within less social dispersers (columns are generally warmer colors), whereas the more social residents down-regulate these genes (predominantly cooler colors).

However, certain individuals (shown in columns) clearly do not fit these patterns. Some dispersers appear to down-regulate these genes while some residents up-regulate them. Thus, it is also possible that this proinflammatory signal results directly from an individual’s exposure to

39 extracellular antigens, leading to increased MHC II activity. Future studies examining gene expression in vertebrates prior to and during dispersal may help clarify the immune system’s relationship to this phenotype.

Furthermore, we should interpret these immunological results with caution since they are based on blood samples. Circulating blood cell populations are composed of numerous cell types that have different functional roles and express genes at different rates (Janeway, Travers,

Walport, & Shlomchik, 2005). Of the cell types we measured, we found that the proportion of neutrophils explained significant variation in overall gene expression and so we regressed out this effect. Collecting samples at our isolated field station prevented us from assessing compositions of antigen presenting cells through flow cytometry (e.g., dendritic cells, macrophages, and B cells). Thus, these cell types may have increased with preparation for dispersal, leading to the observed increased expression of MHC genes. Nevertheless, whether these results are due to fluctuations in cellular composition or simply gene expression changes in the existing cell repertoire, this is the first time an immune response of this scale has been documented in dispersers.

Temporal analysis

Of the genes we found to be associated with the dispersal phenotype, 22 significantly changed expression as the timing of dispersal neared. Importantly, all genes that increased expression during this time were consistently up-regulated in dispersers compared to residents in the initial models and the two that showed decreased expression were down-regulated in dispersers. In addition, many of these genes are related to functions we predicted would be important for dispersal (Figure 2-3). ANAPC1 transcription has been associated with muscular

40 dystrophy (Sáenz et al., 2008). PDK3 regulates glucose metabolism and conservation (Sugden &

Holness, 2003), whereas GRK5 and MAP3K4 are important for adipogenesis and lipid metabolism (Sale et al., 1995; F. Wang et al., 2012; Q. Zhang et al., 2016). Many of the immunological genes found in the salmon module also significantly increased expression as dispersal neared, including CD19, FCRLB, and BC11A (Figure 2-3). Although not all genes in the salmon module had significant fold changes in the temporal analysis, there is a general pattern of increasing expression as a function of days until dispersal (Figure 2-2). The individuals sampled within one week of dispersing (columns on the far right) are largely up-regulated (warmer colors), whereas those sampled well before dispersing (the left most disperser columns) have cooler expression profiles that are more similar to resident profiles. All DE genes likely aid in successful dispersal; however, these temporal results suggest that the physiological preparation does not occur at a constant rate for all genes. Depending on their function, it may be important to ramp up expression of some proteins early while others may change expression immediately before dispersing. Although this time course with dispersal date appears to conflict with the idea that social factors predict gene expression, we believe these two results are not mutually exclusive. Under typical circumstances, an individual’s position in the social network may regulate gene transcription, but perhaps acute needs (such as preparing for dispersal) temporarily trump these long-standing factors.

Conclusions

Our RNA-seq analysis of female yellow-bellied marmots provides novel insights into molecular modifications prior to dispersal. To our knowledge, we are the first to document gene expression differences between dispersing and non-dispersing vertebrates. Our results are

41 consistent with dispersal studies in invertebrates, where dispersing individuals allocate more energy toward building muscles and metabolizing lipids than individuals that never permanently leave the natal colony. This suggests that these physiological mechanisms are evolutionarily conserved across highly diverged taxa that engage in similar behavior. We have also demonstrated a substantial immunological activation of the B cell component of the adaptive immune system that occurs prior to dispersal and may be regulated in part by the social environment. Moreover, some, but not all, dispersal-related genes fluctuate expression as a function of days until dispersal. Dispersal is generally considered to be risky and our findings support the idea that some of the associated costs can be reduced through physiological preparation (Clobert, Le Galliard, Cote, Meylan, & Massot, 2009). Indeed, although dispersing marmots experience only a negligible decrease in survival rates compared to residents (0.73 vs.

0.87; Van Vuren & Armitage, 1994), the observed gene regulatory profiles may conceivably represent an evolutionary acquired anticipatory defense mechanism to reduce these dispersal risks.

42 Figure 2-1. Volcano plot showing fold change in gene expression for dispersing marmots compared to residents for 11,381 genes. Horizontal dashed line indicates q-value of 0.1. Vertical dashed lines indicate log2 fold changes of -0.81 and 0.81. Gray dots represent genes not significantly associated with dispersal, black dots are significantly associated (q < 0.1). Genes with known homologs and large fold changes (< -0.81 or > 0.81) are highlighted in blue if significantly down-regulated in dispersers and red if up-regulated.

LMO7 GPATCH1

C15orf62 EHD4 PEX26

HVCN1 CPNE7 HAGHL IL1RL1

43 Figure 2-2. Heat map illustrating gene expression levels of antigen defense genes in module

designated the “salmon” module by WGCNA. Rows represent genes, columns represent individual marmots, grouped by dispersal phenotype. This module of genes is down-regulated in

residents and up-regulated in dispersers, particularly as the date of dispersal approaches.

Residents Dispersers

BLK CD180 CD19 CD40 CD72

CD74 CD79A CD79B CD83 CR2 FCRLB HLA-DMA HLA-DOB HLA-DPB1 HLA-DRA HVCN1 IRF8 ITGAX MS4A1 STAP1

54 27 15 8 0 Days until dispersal Residual gene expression

−2 −1 0 1 2

44 Figure 2-3. Examples of genes that significantly increase expression relative to the number of days until dispersal for dispersing marmots (n = 16). PDK3, GRK5 and MAP3K4 are associated with metabolism, and CD19, FCRLB, and BC11A are involved in antigen defense.

PDK3 GRK5 MAP3K4

CD19 FCRLB BC11A

Residual gene expression

Days until dispersal

45 Table 2-1. Gene ontology (GO) enrichment of genes in module which WGCNA calls the “salmon” module.

p-value Domain GO term Genes involved B cell receptor signaling 0.04060 BLK, CD79A, CD79B, STAP1, CD19 pathway antigen processing and presentation of peptide or HLA-DOB, HLA-DPB1, CD74, HLA- 0.00075 polysaccharide antigen via DMA, HLA-DRA Biological MHC class II processes antigen processing and HLA-DOB, HLA-DPB1, CD74, HLA- 0.01400 presentation of peptide DMA, HLA-DRA antigen antigen processing and HLA-DOB, HLA-DPB1, CD74, HLA- 0.00047 presentation of peptide DMA, HLA-DRA antigen via MHC class II ITGAX, HLA-DOB, CD79A, HLA- plasma membrane protein 0.00582 DPB1, CD79B, CD74, HLA-DMA, complex HLA-DRA HLA-DOB, HLA-DPB1, CD74, HLA- 0.00003 MHC protein complex DMA, HLA-DRA Cellular components MHC class II protein HLA-DOB, HLA-DPB1, CD74, HLA- 0.00001 complex DMA, HLA-DRA

FGFR1, ITGAX, CD79A, CD79B, 0.02440 receptor complex CD74, CR2, DIABLO, CD40, ERBB3

HLA-DOB, HLA-DPB1, MS4A1, 0.00001 antigen binding CD74, HLA-DMA, HLA-DRA, ATP1B1

Molecular MHC protein complex HLA-DOB, MS4A1, CD74, HLA- 0.00001 functions binding DMA, HLA-DRA, ATP1B1

MHC class II protein HLA-DOB, MS4A1, CD74, HLA- 0.00001 complex binding DMA, HLA-DRA, ATP1B1

46 Table 2-2. Genes found to be significantly associated with dispersal in both mixed model and gene network analyses.

Gene Log fold WGCNA Gene description 2 q-value p-value symbol change correlation CPNE7 copine 7 1.48 0.083 0.35 0.020 LMO7 LIM domain 7 1.14 0.026 0.45 0.002 HVCN1 hydrogen voltage gated channel 1 0.81 0.063 0.38 0.012 DOCK9 dedicator of cytokinesis 9 0.80 0.083 0.41 0.007 signal transducing adaptor family STAP1 0.74 0.043 0.43 0.004 member 1 MS4A1 membrane spanning 4-domains A1 0.72 0.098 0.39 0.009 FCRLB Fc receptor like B 0.70 0.065 0.39 0.009 CNN3 calponin 3 0.69 0.089 0.35 0.020 Ral GEF with PH domain and SH3 RALGPS2 0.69 0.049 0.41 0.007 binding motif 2 minichromosome maintenance MCM3 0.68 0.004 0.46 0.002 complex component 3 zinc finger and BTB domain ZBTB8A 0.62 0.049 0.41 0.006 containing 8A PARP1 poly(ADP-ribose) polymerase 1 0.55 0.044 0.40 0.008 PARN poly(A)-specific ribonuclease 0.52 0.017 0.42 0.005 NA NA 0.50 0.040 0.45 0.002 NA NA 0.37 0.088 0.34 0.024 GRK5 G protein-coupled receptor kinase 5 0.37 0.052 0.37 0.015

47 Supplementary results

Figure 2-S1. Network relationships of genes in the “salmon” module. Each node represents a gene, each line represents a positive expression correlation between two genes. Hub genes are indicated with bold nodes and labels.

48 Table 2-S1. Sample information for temporal analysis subset. Dispersers were sampled within eight days of dispersal (n=5) and residents were matched to these dispersers according to sampling date and time (n=5).

Dispersers Residents Sampling Sampling Days until Sampling Sampling Days until ID ID date time dispersal date time dispersal 7205_7492 16-Jun 10:00 8 7652_7619 12-Jun 10:07 NA 7027_6570 16-Jun 9:12 7 6127_7339 12-Jun 10:03 NA 7198_6820 6-Jun 18:25 6 2607_7075 4-Jun 18:46 NA 7284_7109 3-Jul 9:28 0 6532_6580 5-Jul 9:31 NA 7405_7444 20-Jun 10:01 0 7227_7239 24-Jun 9:49 NA

49 Table 2-S2. All genes that were differentially expressed (q<0.1) as a function of dispersal phenotype (n=150).

HGNC Disperse Disperse Ensembl ID HGNC description symbol fold change q-value ENSSTOG00000028709 NA NA 1.588 0.026 ENSSTOG00000019772 CPNE7 copine 7 1.476 0.083 ENSSTOG00000001282 NA NA 1.217 0.049 ENSSTOG00000013061 LMO7 LIM domain 7 1.141 0.026 ENSSTOG00000026089 PEX26 peroxisomal biogenesis factor 26 1.136 0.057 ENSSTOG00000027341 NA NA 1.086 0.056 ENSSTOG00000026414 NA NA 1.022 0.044 ENSSTOG00000000719 EHD4 EH domain containing 4 0.893 0.047 ENSSTOG00000009898 NA NA 0.889 0.064 ENSSTOG00000004162 IL1RL1 interleukin 1 receptor like 1 0.873 0.057 ENSSTOG00000009429 HVCN1 hydrogen voltage gated channel 1 0.815 0.063 ENSSTOG00000003720 GPATCH1 G-patch domain containing 1 0.813 0.037 ENSSTOG00000003129 DOCK9 dedicator of cytokinesis 9 0.804 0.083 ENSSTOG00000007740 MYEF2 myelin expression factor 2 0.799 0.030 ENSSTOG00000016568 NA NA 0.771 0.044 ENSSTOG00000005496 NA NA 0.746 0.032 signal transducing adaptor family 0.741 0.043 ENSSTOG00000013018 STAP1 member 1 ENSSTOG00000011946 CNR2 cannabinoid receptor 2 0.740 0.067 N(alpha)-acetyltransferase 40, NatD 0.727 0.067 ENSSTOG00000011363 NAA40 catalytic subunit ENSSTOG00000006747 MS4A1 membrane spanning 4-domains A1 0.719 0.098 ENSSTOG00000019373 CEP89 centrosomal protein 89 0.703 0.052 ENSSTOG00000004669 FCRLB Fc receptor like B 0.699 0.065 ENSSTOG00000014061 CNN3 calponin 3 0.693 0.089 Ral GEF with PH domain and SH3 0.692 0.049 ENSSTOG00000005052 RALGPS2 binding motif 2 minichromosome maintenance complex 0.684 0.004 ENSSTOG00000001358 MCM3 component 3 SWAP switching B-cell complex 0.679 0.035 ENSSTOG00000014938 SWAP70 70kDa subunit ENSSTOG00000006310 C9orf3 9 open reading frame 3 0.650 0.060 ENSSTOG00000007775 FOXJ3 forkhead box J3 0.632 0.098 ENSSTOG00000022858 ZDHHC9 zinc finger DHHC-type containing 9 0.627 0.098 zinc finger and BTB domain containing 0.620 0.049 ENSSTOG00000024103 ZBTB8A 8A microtubule associated protein RP/EB 0.617 0.070 ENSSTOG00000016184 MAPRE2 family member 2 ENSSTOG00000022266 NA NA 0.613 0.049 ENSSTOG00000000456 NUDCD3 NudC domain containing 3 0.606 0.049 ENSSTOG00000027287 DCP1B decapping mRNA 1B 0.600 0.000 ENSSTOG00000011221 NA NA 0.588 0.046 nicotinamide nucleotide 0.585 0.064 ENSSTOG00000011209 NNT transhydrogenase ENSSTOG00000000072 RFTN1 raftlin, lipid raft linker 1 0.550 0.044 ENSSTOG00000016000 PARP1 poly(ADP-ribose) polymerase 1 0.548 0.044 ENSSTOG00000006029 ACLY ATP citrate lyase 0.534 0.049

50 2-oxoglutarate and iron dependent 0.524 0.098 ENSSTOG00000005854 OGFOD1 oxygenase domain containing 1 ENSSTOG00000007519 PARN poly(A)-specific ribonuclease 0.520 0.017 eukaryotic translation initiation factor 3 0.520 0.093 ENSSTOG00000002512 EIF3A subunit A YTH N6-methyladenosine RNA 0.508 0.000 ENSSTOG00000025075 YTHDF2 binding protein 2 ENSSTOG00000004790 NA NA 0.504 0.040 ENSSTOG00000002101 NCBP3 nuclear cap binding subunit 3 0.502 0.088 CCR4-NOT transcription complex 0.496 0.026 ENSSTOG00000014954 CNOT9 subunit 9 ENSSTOG00000009128 NUFIP1 NUFIP1, FMR1 interacting protein 1 0.496 0.086 zinc finger and BTB domain containing 0.485 0.046 ENSSTOG00000028789 ZBTB9 9 ENSSTOG00000014894 PDK3 pyruvate dehydrogenase kinase 3 0.474 0.092 adaptor related protein complex 1 beta 0.465 0.093 ENSSTOG00000001533 AP1B1 1 subunit ENSSTOG00000014837 MAN2A2 mannosidase alpha class 2A member 2 0.465 0.060 SWI/SNF related, matrix associated, actin dependent regulator of chromatin 0.463 0.049 ENSSTOG00000008249 SMARCC1 subfamily c member 1 ENSSTOG00000027833 PRPF4 pre-mRNA processing factor 4 0.460 0.049 C2 calcium dependent domain 0.446 0.056 ENSSTOG00000004903 C2CD3 containing 3 ENSSTOG00000014925 MYBBP1A MYB binding protein 1a 0.443 0.049 ENSSTOG00000001563 NIPSNAP1 nipsnap homolog 1 (C. elegans) 0.439 0.052 ENSSTOG00000012505 HTATSF1 HIV-1 Tat specific factor 1 0.435 0.049 ENSSTOG00000024236 NA NA 0.430 0.096 ENSSTOG00000026887 NA NA 0.427 0.026 ENSSTOG00000005046 CASP2 caspase 2 0.423 0.092 ENSSTOG00000003938 ANAPC1 anaphase promoting complex subunit 1 0.423 0.026 ENSSTOG00000009130 ELAVL1 ELAV like RNA binding protein 1 0.419 0.067 ENSSTOG00000001750 NA NA 0.417 0.076 ENSSTOG00000014932 NA NA 0.416 0.088 ENSSTOG00000021873 NA NA 0.412 0.056 ubiquitin protein ligase E3 component 0.411 0.088 ENSSTOG00000007748 UBR7 n-recognin 7 (putative) ENSSTOG00000004450 SCAF4 SR-related CTD associated factor 4 0.408 0.003 ENSSTOG00000005459 EXOSC10 exosome component 10 0.400 0.069 ENSSTOG00000014125 NUP160 nucleoporin 160 0.397 0.089 ENSSTOG00000020152 USP11 ubiquitin specific peptidase 11 0.396 0.050 dehydrodolichyl diphosphate synthase 0.393 0.082 ENSSTOG00000004335 DHDDS subunit eukaryotic translation initiation factor 0.390 0.049 ENSSTOG00000014197 EIF4ENIF1 4E nuclear import factor 1 apurinic/apyrimidinic 0.389 0.060 ENSSTOG00000010096 APEX1 endodeoxyribonuclease 1 ENSSTOG00000007985 DDX31 DEAD-box helicase 31 0.387 0.067 SHQ1, H/ACA ribonucleoprotein 0.383 0.053 ENSSTOG00000010043 SHQ1 assembly factor YTH N6-methyladenosine RNA 0.379 0.088 ENSSTOG00000023001 YTHDF1 binding protein 1 ATP binding cassette subfamily F 0.377 0.058 ENSSTOG00000009626 ABCF2 member 2

51 ENSSTOG00000019989 ZFP64 ZFP64 zinc finger protein 0.376 0.065 ENSSTOG00000014581 YLPM1 YLP motif containing 1 0.371 0.047 ENSSTOG00000012596 PRKCE protein kinase C epsilon 0.370 0.065 ENSSTOG00000021692 NA NA 0.369 0.088 ENSSTOG00000002553 GRK5 G protein-coupled receptor kinase 5 0.369 0.052 ENSSTOG00000028892 SF3B2 splicing factor 3b subunit 2 0.365 0.082 ENSSTOG00000020987 NA NA 0.363 0.046 apoptosis antagonizing transcription 0.358 0.049 ENSSTOG00000002458 AATF factor RNA binding protein with serine rich 0.356 0.026 ENSSTOG00000013200 RNPS1 domain 1 ENSSTOG00000028652 C16orf62 chromosome 16 open reading frame 62 0.354 0.082 upstream binding transcription factor, 0.354 0.044 ENSSTOG00000029137 UBTF RNA polymerase I ENSSTOG00000002212 RABGAP1L RAB GTPase activating protein 1 like 0.353 0.032 ENSSTOG00000022116 SETD2 SET domain containing 2 0.352 0.060 cleavage and polyadenylation specific 0.343 0.087 ENSSTOG00000014133 CPSF6 factor 6 elongation factor Tu GTP binding 0.341 0.049 ENSSTOG00000012188 EFTUD2 domain containing 2 ENSSTOG00000025543 PRMT1 protein arginine methyltransferase 1 0.335 0.026 ENSSTOG00000016288 DHX9 DEAH-box helicase 9 0.331 0.040 ENSSTOG00000002302 BUD13 BUD13 homolog 0.331 0.065 URB2 ribosome biogenesis 2 homolog 0.329 0.057 ENSSTOG00000025323 URB2 (S. cerevisiae) mitogen-activated protein kinase kinase 0.320 0.052 ENSSTOG00000002821 MAP3K4 kinase 4 chromodomain helicase DNA binding 0.319 0.087 ENSSTOG00000001854 CHD8 protein 8 ENSSTOG00000004219 DDX23 DEAD-box helicase 23 0.310 0.067 ENSSTOG00000002864 NA NA 0.308 0.049 ENSSTOG00000010151 TEX10 testis expressed 10 0.301 0.069 synaptotagmin binding cytoplasmic 0.299 0.069 ENSSTOG00000009962 SYNCRIP RNA interacting protein factor interacting with PAPOLA and 0.298 0.056 ENSSTOG00000012166 FIP1L1 CPSF1 ENSSTOG00000024547 BRD7 bromodomain containing 7 0.297 0.070 ENSSTOG00000006984 NA NA 0.297 0.093 ENSSTOG00000025050 NCOA6 nuclear receptor coactivator 6 0.296 0.040 ENSSTOG00000014004 NA NA 0.289 0.063 ENSSTOG00000008657 RBM15B RNA binding motif protein 15B 0.289 0.011 AFG3 like matrix AAA peptidase 0.288 0.082 ENSSTOG00000004887 AFG3L2 subunit 2 ENSSTOG00000005490 NA NA 0.284 0.070 ENSSTOG00000008209 NA NA 0.282 0.039 ENSSTOG00000008878 NA NA 0.281 0.058 ENSSTOG00000007896 KPNB1 karyopherin subunit beta 1 0.278 0.086 ENSSTOG00000005933 CUL1 cullin 1 0.274 0.089 ENSSTOG00000005531 WDR33 WD repeat domain 33 0.273 0.041 ENSSTOG00000002108 YBX3 Y-box binding protein 3 0.269 0.049 ENSSTOG00000007753 ZC3H10 zinc finger CCCH-type containing 10 0.261 0.089 mitogen-activated protein kinase 0.261 0.057 ENSSTOG00000011275 MAPKAP1 associated protein 1 ENSSTOG00000006211 RBMX RNA binding motif protein, X-linked 0.247 0.062

52 ENSSTOG00000026587 SERBP1 SERPINE1 mRNA binding protein 1 0.238 0.049 SWI/SNF related, matrix associated, actin dependent regulator of chromatin, 0.209 0.049 ENSSTOG00000002175 SMARCE1 subfamily e, member 1 ENSSTOG00000029008 NA NA 0.177 0.098 ENSSTOG00000022920 NA NA -0.232 0.026 ENSSTOG00000006939 SERINC4 serine incorporator 4 -0.269 0.057 ST3 beta-galactoside alpha-2,3- -0.293 0.086 ENSSTOG00000020753 ST3GAL4 sialyltransferase 4 ENSSTOG00000000362 NA NA -0.297 0.090 ENSSTOG00000026739 ZNF2 zinc finger protein 2 -0.305 0.083 ENSSTOG00000020752 NFKBID NFKB inhibitor delta -0.319 0.000 ENSSTOG00000016565 NA NA -0.340 0.072 TGIF2- -0.385 0.071 ENSSTOG00000022280 C20orf24 TGIF2-C20orf24 readthrough ENSSTOG00000000574 SDF2 stromal cell derived factor 2 -0.396 0.057 ENSSTOG00000028433 NA NA -0.438 0.088 ENSSTOG00000016556 GLRX glutaredoxin -0.523 0.087 ENSSTOG00000010770 NA NA -0.534 0.067 ATP binding cassette subfamily B -0.560 0.049 ENSSTOG00000011066 ABCB6 member 6 (Langereis blood group) SEC31 homolog B, COPII coat -0.591 0.058 ENSSTOG00000009253 SEC31B complex component ENSSTOG00000007785 S100A6 S100 calcium binding protein A6 -0.598 0.089 ENSSTOG00000019835 CCDC116 coiled-coil domain containing 116 -0.603 0.026 ENSSTOG00000013029 ANKRD37 ankyrin repeat domain 37 -0.646 0.064 ENSSTOG00000000245 NA NA -0.674 0.093 ENSSTOG00000027314 MPC2 mitochondrial pyruvate carrier 2 -0.674 0.092 cholinergic receptor nicotinic alpha 2 -0.704 0.049 ENSSTOG00000016060 CHRNA2 subunit ENSSTOG00000022143 ZC3H12C zinc finger CCCH-type containing 12C -0.729 0.056 ENSSTOG00000021240 RSPH3 radial spoke 3 homolog -0.734 0.086 ENSSTOG00000000253 ARG2 arginase 2 -0.764 0.067 ENSSTOG00000009812 STEAP4 STEAP4 metalloreductase -0.787 0.095 ankyrin repeat and SOCS box -0.802 0.046 ENSSTOG00000022386 ASB12 containing 12 ENSSTOG00000000358 C15orf62 chromosome 15 open reading frame 62 -0.854 0.053 ENSSTOG00000004187 HAGHL hydroxyacylglutathione hydrolase-like -0.897 0.082 ENSSTOG00000000345 NA NA -2.767 0.065

53 Table 2-S3. Genes in the module which WGCNA calls the “salmon” module (n=126).

HGNC Disperse Disperse Ensembl ID HGNC description symbol correlation p-value ENSSTOG00000012711 BLNK B-cell linker 0.323 0.034 major histocompatibility complex, class II, ENSSTOG00000010426 HLA-DRA 0.298 0.052 DR alpha ENSSTOG00000006747 MS4A1 membrane spanning 4-domains A1 0.395 0.009 ENSSTOG00000006477 FCMR Fc fragment of IgM receptor 0.340 0.026 ENSSTOG00000009000 CD79B CD79b molecule 0.303 0.048 ENSSTOG00000026333 CD19 CD19 molecule 0.291 0.059 ENSSTOG00000009429 HVCN1 hydrogen voltage gated channel 1 0.381 0.012 ENSSTOG00000028460 CD72 CD72 molecule 0.361 0.017 ENSSTOG00000009017 CD74 CD74 molecule 0.331 0.030 ENSSTOG00000014061 CNN3 calponin 3 0.353 0.020 major histocompatibility complex, class II, ENSSTOG00000009246 HLA-DMA 0.287 0.062 DM alpha ENSSTOG00000009964 CD22 CD22 molecule 0.268 0.082 ENSSTOG00000004662 FCRLA Fc receptor like A 0.297 0.053 ENSSTOG00000003518 CD79A CD79a molecule 0.309 0.044 ENSSTOG00000019974 NA NA 0.254 0.100 ENSSTOG00000012191 CR2 complement component 3d receptor 2 0.316 0.039 ENSSTOG00000019821 NA NA 0.302 0.049 ENSSTOG00000002925 FLT3 fms related tyrosine kinase 3 0.387 0.010 ENSSTOG00000007652 CYB561A3 cytochrome b561 family member A3 0.349 0.022 TOPBP1 interacting checkpoint and ENSSTOG00000006086 TICRR 0.339 0.026 replication regulator ENSSTOG00000021692 NA NA 0.344 0.024 ENSSTOG00000013018 STAP1 signal transducing adaptor family member 1 0.430 0.004 ENSSTOG00000029251 CD83 CD83 molecule 0.225 0.146 ENSSTOG00000028078 NA NA 0.190 0.222 ENSSTOG00000007519 PARN poly(A)-specific ribonuclease 0.422 0.005 ENSSTOG00000000974 RAPGEF3 Rap guanine nucleotide exchange factor 3 0.359 0.018 minichromosome maintenance complex ENSSTOG00000001358 MCM3 0.465 0.002 component 3 major histocompatibility complex, class II, ENSSTOG00000003405 HLA-DOB 0.320 0.036 DO beta ENSSTOG00000013061 LMO7 LIM domain 7 0.451 0.002 ENSSTOG00000009423 TCTN1 tectonic family member 1 0.305 0.047 ENSSTOG00000012405 VPREB3 pre-B lymphocyte 3 0.277 0.073 ENSSTOG00000024952 NA NA 0.298 0.053 ENSSTOG00000025959 DUSP26 dual specificity phosphatase 26 (putative) 0.165 0.291 ENSSTOG00000004523 BCL11A B-cell CLL/lymphoma 11A 0.350 0.021 ENSSTOG00000026253 SNX8 sorting nexin 8 0.177 0.256 FYVE, RhoGEF and PH domain containing ENSSTOG00000016385 FGD6 0.178 0.253 6 ENSSTOG00000020975 CD180 CD180 molecule 0.376 0.013 ENSSTOG00000023621 TLR10 toll like receptor 10 0.311 0.043 ENSSTOG00000003831 FCHSD2 FCH and double SH3 domains 2 0.311 0.043 ENSSTOG00000004669 FCRLB Fc receptor like B 0.393 0.009 ENSSTOG00000026537 NA NA 0.344 0.024 ENSSTOG00000025994 CXXC5 CXXC finger protein 5 0.161 0.301 ENSSTOG00000010917 GGA2 golgi associated, gamma adaptin ear 0.326 0.033

54 containing, ARF binding protein 2 ENSSTOG00000023262 TMEM206 transmembrane protein 206 0.205 0.186 ADP-ribosyltransferase 4 (Dombrock blood ENSSTOG00000011244 ART4 0.200 0.199 group) BLK proto-oncogene, Src family tyrosine ENSSTOG00000000630 BLK 0.293 0.056 kinase ENSSTOG00000004790 NA NA 0.450 0.002 ENSSTOG00000016000 PARP1 poly(ADP-ribose) polymerase 1 0.400 0.008 ENSSTOG00000002553 GRK5 G protein-coupled receptor kinase 5 0.367 0.015 ENSSTOG00000003197 ITGAX integrin subunit alpha X 0.191 0.220 ENSSTOG00000000428 ZNF346 zinc finger protein 346 0.266 0.085 ENSSTOG00000004551 NA NA 0.126 0.422 ENSSTOG00000015212 SLC25A19 solute carrier family 25 member 19 0.202 0.194 ENSSTOG00000006407 PPIC peptidylprolyl isomerase C 0.352 0.021 ENSSTOG00000024354 PAX5 paired box 5 0.106 0.499 ENSSTOG00000013071 NA NA 0.401 0.008 ENSSTOG00000027236 JMJD7 jumonji domain containing 7 0.237 0.127 ENSSTOG00000009013 NA NA 0.270 0.080 ENSSTOG00000007668 CD200 CD200 molecule 0.332 0.030 ENSSTOG00000008793 HOMER2 homer scaffolding protein 2 0.333 0.029 ENSSTOG00000020966 RAB30 RAB30, member RAS oncogene family 0.374 0.013 ENSSTOG00000006546 C18orf8 chromosome 18 open reading frame 8 0.230 0.138 major histocompatibility complex, class II, ENSSTOG00000003567 HLA-DPB1 0.202 0.193 DP beta 1 ENSSTOG00000020603 PLD4 phospholipase D family member 4 0.271 0.078 ENSSTOG00000021193 NA NA 0.338 0.027 ENSSTOG00000003098 NKTR natural killer cell triggering receptor 0.367 0.016 ENSSTOG00000010620 SNX9 sorting nexin 9 0.389 0.010 ENSSTOG00000003129 DOCK9 dedicator of cytokinesis 9 0.408 0.007 ENSSTOG00000008446 SQLE squalene epoxidase 0.230 0.138 ENSSTOG00000006255 HHEX hematopoietically expressed homeobox 0.229 0.139 ENSSTOG00000022986 RBSN rabenosyn, RAB effector 0.185 0.234 hes related family bHLH transcription ENSSTOG00000022371 HEY2 0.214 0.168 factor with YRPW motif 2 ENSSTOG00000010515 NA NA 0.203 0.192 protein tyrosine phosphatase, receptor type ENSSTOG00000011996 PTPRO 0.310 0.043 O ENSSTOG00000009492 SLC9A7 solute carrier family 9 member A7 0.069 0.662 ENSSTOG00000007411 DENND5B DENN domain containing 5B 0.174 0.265 ENSSTOG00000028363 IRF8 interferon regulatory factor 8 0.292 0.057 ENSSTOG00000021562 CD40 CD40 molecule 0.181 0.245 family with sequence similarity 212 ENSSTOG00000013745 FAM212A 0.230 0.138 member A ENSSTOG00000021472 NA NA 0.340 0.026 Ral GEF with PH domain and SH3 binding ENSSTOG00000005052 RALGPS2 0.405 0.007 motif 2 ENSSTOG00000005134 TCF4 transcription factor 4 0.309 0.044 ENSSTOG00000019466 NA NA 0.165 0.290 ENSSTOG00000019772 CPNE7 copine 7 0.353 0.020 ENSSTOG00000011720 NAPSA napsin A aspartic peptidase 0.111 0.477 glutaminyl-tRNA synthase (glutamine- ENSSTOG00000009177 QRSL1 0.388 0.010 hydrolyzing)-like 1 ENSSTOG00000006518 ZNF608 zinc finger protein 608 0.224 0.149 ENSSTOG00000027272 JADE1 jade family PHD finger 1 0.262 0.090 55 ENSSTOG00000025953 B4GALT7 beta-1,4-galactosyltransferase 7 0.171 0.274 ENSSTOG00000001175 FGFR1 fibroblast growth factor receptor 1 0.228 0.141 ENSSTOG00000025852 NA NA 0.266 0.085 ENSSTOG00000021878 NA NA 0.096 0.539 high mobility group nucleosomal binding ENSSTOG00000014353 HMGN3 0.226 0.145 domain 3 StAR related lipid transfer domain ENSSTOG00000024994 STARD9 0.165 0.290 containing 9 ENSSTOG00000026707 PRCP prolylcarboxypeptidase 0.318 0.038 beta-1,3-N-acetylgalactosaminyltransferase ENSSTOG00000015026 B3GALNT1 0.348 0.022 1 (globoside blood group) ENSSTOG00000009343 SCPEP1 serine carboxypeptidase 1 0.333 0.029 ENSSTOG00000024297 ERBB3 erb-b2 receptor tyrosine kinase 3 0.143 0.360 ENSSTOG00000023685 HS6ST1 heparan sulfate 6-O-sulfotransferase 1 0.380 0.012 ENSSTOG00000003982 VPREB1 pre-B lymphocyte 1 0.237 0.125 ENSSTOG00000012982 NA NA 0.177 0.256 ENSSTOG00000012690 KMO kynurenine 3-monooxygenase 0.235 0.130 DnaJ heat shock protein family (Hsp40) ENSSTOG00000015166 DNAJB5 0.249 0.107 member B5 ENSSTOG00000001189 ARHGEF25 Rho guanine nucleotide exchange factor 25 0.367 0.016 ENSSTOG00000012181 ELL3 elongation factor for RNA polymerase II 3 0.178 0.254 ENSSTOG00000022259 C9orf152 open reading frame 152 0.232 0.134 ENSSTOG00000010471 RASGRP3 RAS guanyl releasing protein 3 0.258 0.095 ENSSTOG00000024103 ZBTB8A zinc finger and BTB domain containing 8A 0.410 0.006 ENSSTOG00000004904 PHYKPL 5-phosphohydroxy-L-lysine phospho-lyase 0.118 0.449 ENSSTOG00000027839 ZBTB3 zinc finger and BTB domain containing 3 0.263 0.089 ENSSTOG00000011746 MYCBPAP MYCBP associated protein 0.338 0.027 ENSSTOG00000027659 FUT10 fucosyltransferase 10 0.177 0.255 ENSSTOG00000008890 CLCN4 chloride voltage-gated channel 4 0.361 0.017 ENSSTOG00000014928 DIABLO diablo IAP-binding mitochondrial protein 0.236 0.127 ENSSTOG00000009709 TPD52L1 tumor protein D52-like 1 0.369 0.015 SEC22 homolog C, vesicle trafficking ENSSTOG00000024224 SEC22C -0.025 0.875 protein phosphatidylinositol glycan anchor ENSSTOG00000015902 PIGG -0.024 0.879 biosynthesis class G ENSSTOG00000011473 ATP1B1 ATPase Na+/K+ transporting subunit beta 1 0.232 0.134 ENSSTOG00000028711 NA NA 0.161 0.303 ENSSTOG00000007042 SLC6A7 solute carrier family 6 member 7 -0.044 0.780 ENSSTOG00000016304 SLC37A2 solute carrier family 37 member 2 0.105 0.505 ENSSTOG00000007202 SLAMF6 SLAM family member 6 -0.046 0.770 ENSSTOG00000005781 NEIL1 nei like DNA glycosylase 1 0.205 0.188 von Willebrand factor A domain containing ENSSTOG00000022384 VWA3B 0.135 0.388 3B ENSSTOG00000010681 POLH DNA polymerase eta 0.194 0.212 ENSSTOG00000026265 FBXL20 F-box and leucine rich repeat protein 20 0.059 0.705

56 Table 2-S4. Hub genes in the module which WGCNA calls the “salmon” module (n=29). Hub genes are defined as genes with module membership > 0.8 and correlation with dispersal > 0.3.

HGNC Disperse Disperse Module Ensembl ID HGNC description symbol correlation p-value membership

minichromosome ENSSTOG00000001358 MCM3 maintenance complex 0.465 0.002 0.838 component 3 ENSSTOG00000013061 LMO7 LIM domain 7 0.451 0.002 0.836 signal transducing adaptor ENSSTOG00000013018 STAP1 0.430 0.004 0.859 family member 1 poly(A)-specific ENSSTOG00000007519 PARN 0.422 0.005 0.841 ribonuclease membrane spanning 4- ENSSTOG00000006747 MS4A1 0.395 0.009 0.944 domains A1 ENSSTOG00000004669 FCRLB Fc receptor like B 0.393 0.009 0.815 ENSSTOG00000002925 FLT3 fms related tyrosine kinase 3 0.387 0.010 0.877 hydrogen voltage gated ENSSTOG00000009429 HVCN1 0.381 0.012 0.926 channel 1 ENSSTOG00000020975 CD180 CD180 molecule 0.376 0.013 0.819 ENSSTOG00000028460 CD72 CD72 molecule 0.361 0.017 0.926 Rap guanine nucleotide ENSSTOG00000000974 RAPGEF3 0.359 0.018 0.839 exchange factor 3 ENSSTOG00000014061 CNN3 calponin 3 0.353 0.020 0.920 ENSSTOG00000004523 BCL11A B-cell CLL/lymphoma 11A 0.350 0.021 0.823 cytochrome b561 family ENSSTOG00000007652 CYB561A3 0.349 0.022 0.868 member A3 ENSSTOG00000026537 NA NA 0.344 0.024 0.814 ENSSTOG00000021692 NA NA 0.344 0.024 0.867 ENSSTOG00000006477 FCMR Fc fragment of IgM receptor 0.340 0.026 0.944 TOPBP1 interacting ENSSTOG00000006086 TICRR checkpoint and replication 0.339 0.026 0.867 regulator ENSSTOG00000009017 CD74 CD74 molecule 0.331 0.030 0.924 golgi associated, gamma ENSSTOG00000010917 GGA2 adaptin ear containing, ARF 0.326 0.033 0.810 binding protein 2 ENSSTOG00000012711 BLNK B-cell linker 0.323 0.034 0.956 major histocompatibility ENSSTOG00000003405 HLA-DOB 0.320 0.036 0.838 complex, class II, DO beta complement component 3d ENSSTOG00000012191 CR2 0.316 0.039 0.880 receptor 2 ENSSTOG00000023621 TLR10 toll like receptor 10 0.311 0.043 0.818 FCH and double SH3 ENSSTOG00000003831 FCHSD2 0.311 0.043 0.816 domains 2 ENSSTOG00000003518 CD79A CD79a molecule 0.309 0.044 0.907 ENSSTOG00000009423 TCTN1 tectonic family member 1 0.305 0.047 0.830 ENSSTOG00000009000 CD79B CD79b molecule 0.303 0.048 0.934 ENSSTOG00000019821 NA NA 0.302 0.049 0.879 57 Table 2-S5. Gene ontology results for 150 up-regulated genes in full dataset.

Biological process Number GO term ID Genes involved p-value GO term of genes RNA localization GO:0006403 RFTN1, EXOSC10, PARN, SHQ1, NUP160, SETD2 6 0.0012 ribonucleoprotein AATF, EIF3A, EXOSC10, OGFOD1, DDX31, GO:0022613 7 0.0289 complex biogenesis NUFIP1, SHQ1 NUDCD3, MCM3, CHD8, SMARCE1, MAP3K4, AFG3L2, C2CD3, EXOSC10, OGFOD1, PARN, organelle GO:0006996 KPNB1, SMARCC1, APEX1, NAA40, PRKCE, STAP1, 24 0.00694 organization CNN3, SWAP70, PARP1, CEP89, SETD2, BRD7, PRMT1, PEX26 MCM3, CHD8, SMARCE1, MAP3K4, EXOSC10, chromosome GO:0051276 PARN, KPNB1, SMARCC1, APEX1, NAA40, SETD2, 13 0.00648 organization BRD7, PRMT1

RFTN1,NUDCD3,EHD4,MCM3,AP1B1,NIPSNAP1,C HD8,NCBP3,YBX3,SMARCE1,RABGAP1L,AATF,EIF3 A,GRK5,MAP3K4,DOCK9,GPATCH1,ANAPC1,IL1RL 1,DHDDS,FCRLB,AFG3L2,C2CD3,CASP2,RALGPS2, EXOSC10,OGFOD1,CUL1,ACLY,RBMX,C9ORF3,PA RN,MYEF2,KPNB1,DDX31,SMARCC1,RBM15B,NUFI biological process GO:0008150 75 0.00591 P1,ELAVL1,HVCN1,SYNCRIP,SHQ1,APEX1,NNT,MA PKAP1,NAA40,CNR2,HTATSF1,PRKCE,STAP1,LMO 7,RNPS1,CNN3,NUP160,CPSF6,EIF4ENIF1,MAN2A2 ,MYBBP1A,SWAP70,CNOT9,PARP1,DHX9,CEP89,C PNE7,USP11,SETD2,ZDHHC9,YTHDF1,BRD7,NCOA 6,YTHDF2,PRMT1,PEX26,DCP1B,C16ORF62

MCM3,CHD8,YBX3,SMARCE1,AATF,EIF3A,GRK5,M AP3K4,GPATCH1,ANAPC1,DHDDS,AFG3L2,C2CD3, CASP2,EXOSC10,OGFOD1,CUL1,ACLY,RBMX,C9O RF3,PARN,MYEF2,SMARCC1,RBM15B,NUFIP1,ELA metabolic process GO:0008152 VL1,SYNCRIP,SHQ1,APEX1,NNT,NAA40,HTATSF1,P 53 0.0127 RKCE,STAP1,RNPS1,NUP160,CPSF6,EIF4ENIF1,MA N2A2,MYBBP1A,SWAP70,CNOT9,PARP1,DHX9,USP 11,SETD2,ZDHHC9,YTHDF1,BRD7,NCOA6,YTHDF2 ,PRMT1,DCP1B

MCM3,CHD8,YBX3,SMARCE1,AATF,EIF3A,GRK5,M AP3K4,GPATCH1,ANAPC1,DHDDS,AFG3L2,C2CD3, CASP2,EXOSC10,OGFOD1,CUL1,ACLY,RBMX,C9O primary metabolic RF3,PARN,MYEF2,SMARCC1,RBM15B,NUFIP1,ELA GO:0044238 50 0.00411 process VL1,SYNCRIP,SHQ1,APEX1,NAA40,HTATSF1,PRKC E,STAP1,RNPS1,CPSF6,EIF4ENIF1,MAN2A2,MYBBP 1A,SWAP70,CNOT9,PARP1,USP11,SETD2,ZDHHC9, YTHDF1,BRD7,NCOA6,YTHDF2,PRMT1,DCP1B

58 MCM3,CHD8,YBX3,SMARCE1,AATF,EIF3A,GRK5,M AP3K4,GPATCH1,ANAPC1,DHDDS,AFG3L2,C2CD3, CASP2,EXOSC10,OGFOD1,CUL1,ACLY,RBMX,C9O RF3,PARN,MYEF2,SMARCC1,RBM15B,NUFIP1,ELA organic substance GO:0071704 VL1,SYNCRIP,SHQ1,APEX1,NAA40,HTATSF1,PRKC 52 0.00228 metabolic process E,STAP1,RNPS1,NUP160,CPSF6,EIF4ENIF1,MAN2A 2,MYBBP1A,SWAP70,CNOT9,PARP1,DHX9,USP11,S ETD2,ZDHHC9,YTHDF1,BRD7,NCOA6,YTHDF2,PR MT1,DCP1B

MCM3,CHD8,YBX3,SMARCE1,AATF,EIF3A,GRK5,M AP3K4,GPATCH1,ANAPC1,DHDDS,AFG3L2,C2CD3, CASP2,EXOSC10,OGFOD1,CUL1,RBMX,C9ORF3,P macromolecule ARN,MYEF2,SMARCC1,RBM15B,NUFIP1,ELAVL1,S 5.22E- GO:0043170 50 metabolic process YNCRIP,SHQ1,APEX1,NAA40,HTATSF1,PRKCE,STA 05 P1,RNPS1,NUP160,CPSF6,EIF4ENIF1,MYBBP1A,SW AP70,CNOT9,PARP1,DHX9,USP11,SETD2,ZDHHC9, YTHDF1,BRD7,NCOA6,YTHDF2,PRMT1,DCP1B

MCM3,CHD8,YBX3,SMARCE1,AATF,EIF3A,MAP3K4 macromolecule ,DHDDS,EXOSC10,OGFOD1,RBMX,PARN,MYEF2,S biosynthetic GO:0009059 MARCC1,NUFIP1,ELAVL1,SYNCRIP,APEX1,EIF4EN 28 0.0456 process IF1,MYBBP1A,CNOT9,PARP1,SETD2,ZDHHC9,YTH DF1,BRD7,NCOA6,YTHDF2

CHD8,YBX3,SMARCE1,AATF,EIF3A,MAP3K4,GPAT CH1,AFG3L2,C2CD3,CASP2,EXOSC10,OGFOD1,RB MX,PARN,MYEF2,SMARCC1,RBM15B,NUFIP1,ELA 3.74E- gene expression GO:0010467 37 VL1,SYNCRIP,SHQ1,APEX1,HTATSF1,STAP1,RNPS1 06 ,NUP160,CPSF6,EIF4ENIF1,MYBBP1A,CNOT9,DHX 9,SETD2,YTHDF1,BRD7,NCOA6,YTHDF2,DCP1B

MCM3,CHD8,YBX3,SMARCE1,AATF,EIF3A,GRK5,M AP3K4,GPATCH1,ANAPC1,DHDDS,AFG3L2,C2CD3, CASP2,EXOSC10,OGFOD1,CUL1,RBMX,C9ORF3,P nitrogen compound ARN,MYEF2,SMARCC1,RBM15B,NUFIP1,ELAVL1,S GO:0006807 48 0.00416 metabolic process YNCRIP,SHQ1,APEX1,NAA40,HTATSF1,PRKCE,STA P1,RNPS1,CPSF6,EIF4ENIF1,MYBBP1A,SWAP70,CN OT9,PARP1,USP11,SETD2,ZDHHC9,YTHDF1,BRD7, NCOA6,YTHDF2,PRMT1,DCP1B

RFTN1,NUDCD3,EHD4,MCM3,CHD8,SMARCE1,AA cellular component TF,EIF3A,MAP3K4,AFG3L2,C2CD3,EXOSC10,OGF organization or GO:0071840 OD1,PARN,KPNB1,DDX31,SMARCC1,NUFIP1,SHQ1 32 0.0169 biogenesis ,APEX1,NAA40,PRKCE,STAP1,CNN3,CPSF6,SWAP7 0,PARP1,CEP89,SETD2,BRD7,PRMT1,PEX26

MCM3,CHD8,YBX3,SMARCE1,AATF,EIF3A,GRK5,M AP3K4,GPATCH1,ANAPC1,DHDDS,AFG3L2,EXOSC 10,OGFOD1,CUL1,ACLY,RBMX,PARN,MYEF2,SMA cellular metabolic RCC1,RBM15B,NUFIP1,ELAVL1,SYNCRIP,SHQ1,AP GO:0044237 47 0.0449 process EX1,NNT,NAA40,HTATSF1,PRKCE,STAP1,RNPS1,C PSF6,EIF4ENIF1,MYBBP1A,SWAP70,CNOT9,PARP1 ,USP11,SETD2,ZDHHC9,YTHDF1,BRD7,NCOA6,YT HDF2,PRMT1,DCP1B

59 MCM3,CHD8,YBX3,SMARCE1,AATF,EIF3A,GRK5,M AP3K4,GPATCH1,ANAPC1,DHDDS,AFG3L2,EXOSC 10,OGFOD1,CUL1,RBMX,PARN,MYEF2,SMARCC1, cellular RBM15B,NUFIP1,ELAVL1,SYNCRIP,SHQ1,APEX1,N 0.00050 macromolecule GO:0044260 45 AA40,HTATSF1,PRKCE,STAP1,RNPS1,CPSF6,EIF4E 1 metabolic process NIF1,MYBBP1A,SWAP70,CNOT9,PARP1,USP11,SET D2,ZDHHC9,YTHDF1,BRD7,NCOA6,YTHDF2,PRMT 1,DCP1B

MCM3,CHD8,YBX3,SMARCE1,AATF,EIF3A,MAP3K4 ,GPATCH1,EXOSC10,OGFOD1,RBMX,PARN,MYEF2 cellular nitrogen ,SMARCC1,RBM15B,NUFIP1,ELAVL1,SYNCRIP,SHQ compound GO:0034641 34 0.0162 1,APEX1,HTATSF1,RNPS1,CPSF6,EIF4ENIF1,MYBB metabolic process P1A,SWAP70,CNOT9,PARP1,SETD2,YTHDF1,BRD7, NCOA6,YTHDF2,DCP1B

CHD8,YBX3,SMARCE1,AATF,EIF3A,MAP3K4,C2CD 3,EXOSC10,OGFOD1,RBMX,PARN,MYEF2,SMARCC regulation of 1,RBM15B,NUFIP1,ELAVL1,SYNCRIP,SHQ1,APEX1, GO:0019222 34 0.00975 metabolic process PRKCE,STAP1,RNPS1,EIF4ENIF1,MYBBP1A,SWAP7 0,CNOT9,PARP1,DHX9,SETD2,YTHDF1,BRD7,NCO A6,YTHDF2,DCP1B

CHD8,YBX3,SMARCE1,AATF,EIF3A,MAP3K4,EXOS regulation of C10,OGFOD1,RBMX,PARN,MYEF2,SMARCC1,NUFI biosynthetic GO:0009889 P1,ELAVL1,SYNCRIP,APEX1,EIF4ENIF1,MYBBP1A, 25 0.0457 process CNOT9,PARP1,SETD2,YTHDF1,BRD7,NCOA6,YTHD F2 CHD8,YBX3,SMARCE1,AATF,EIF3A,MAP3K4,EXOS regulation of C10,OGFOD1,RBMX,PARN,MYEF2,SMARCC1,NUFI cellular GO:0031326 P1,ELAVL1,SYNCRIP,APEX1,EIF4ENIF1,MYBBP1A, 25 0.0371 biosynthetic CNOT9,PARP1,SETD2,YTHDF1,BRD7,NCOA6,YTHD process F2

CHD8,YBX3,SMARCE1,AATF,EIF3A,MAP3K4,C2CD 3,EXOSC10,OGFOD1,RBMX,PARN,MYEF2,SMARCC regulation of 1,RBM15B,NUFIP1,ELAVL1,SYNCRIP,SHQ1,APEX1, macromolecule GO:0060255 33 0.00483 PRKCE,STAP1,RNPS1,EIF4ENIF1,MYBBP1A,SWAP7 metabolic process 0,CNOT9,DHX9,SETD2,YTHDF1,BRD7,NCOA6,YTH DF2,DCP1B

CHD8,YBX3,SMARCE1,AATF,EIF3A,MAP3K4,EXOS C10,OGFOD1,RBMX,PARN,MYEF2,SMARCC1,RBM1 regulation of gene 0.00014 GO:0010468 5B,NUFIP1,ELAVL1,SYNCRIP,SHQ1,APEX1,STAP1, 30 expression 8 RNPS1,EIF4ENIF1,MYBBP1A,CNOT9,DHX9,SETD2, YTHDF1,BRD7,NCOA6,YTHDF2,DCP1B posttranscriptional EIF3A,OGFOD1,PARN,ELAVL1,SYNCRIP,APEX1,EI 0.00014 regulation of gene GO:0010608 10 F4ENIF1,DHX9,YTHDF1,YTHDF2 7 expression

60 CHD8,YBX3,SMARCE1,AATF,EIF3A,MAP3K4,C2CD 3,EXOSC10,OGFOD1,RBMX,PARN,MYEF2,SMARCC regulation of 1,RBM15B,NUFIP1,ELAVL1,SYNCRIP,SHQ1,APEX1, primary metabolic GO:0080090 32 0.0119 PRKCE,STAP1,RNPS1,EIF4ENIF1,MYBBP1A,SWAP7 process 0,CNOT9,PARP1,SETD2,YTHDF1,BRD7,NCOA6,YTH DF2

CHD8,YBX3,SMARCE1,AATF,EIF3A,MAP3K4,C2CD 3,EXOSC10,OGFOD1,RBMX,PARN,MYEF2,SMARCC regulation of 1,RBM15B,NUFIP1,ELAVL1,SYNCRIP,SHQ1,APEX1, nitrogen compound GO:0051171 32 0.00799 PRKCE,STAP1,RNPS1,EIF4ENIF1,MYBBP1A,SWAP7 metabolic process 0,CNOT9,PARP1,SETD2,YTHDF1,BRD7,NCOA6,YTH DF2 CHD8,YBX3,SMARCE1,AATF,EXOSC10,PARN,SMAR negative regulation CC1,ELAVL1,SYNCRIP,SHQ1,PRKCE,STAP1,RNPS1, of metabolic GO:0009892 21 0.00349 EIF4ENIF1,MYBBP1A,SWAP70,CNOT9,PARP1,BRD process 7,YTHDF2,DCP1B CHD8,YBX3,SMARCE1,AATF,EXOSC10,PARN,SMAR negative regulation CC1,ELAVL1,SYNCRIP,SHQ1,PRKCE,RNPS1,EIF4E of macromolecule GO:0010605 19 0.0119 NIF1,MYBBP1A,SWAP70,CNOT9,BRD7,YTHDF2,DC metabolic process P1B CHD8,YBX3,SMARCE1,EXOSC10,PARN,ELAVL1,SY negative regulation GO:0010629 NCRIP,SHQ1,RNPS1,EIF4ENIF1,MYBBP1A,CNOT9, 15 0.0191 of gene expression BRD7,YTHDF2,DCP1B regulation of RNA 5.88E- GO:0043487 PARN,ELAVL1,SYNCRIP,APEX1,DHX9,YTHDF2 6 stability 05 regulation of GO:0043488 ELAVL1,SYNCRIP,APEX1,DHX9,YTHDF2 5 0.0012 mRNA stability mRNA metabolic GPATCH1,EXOSC10,RBMX,PARN,RBM15B,HTATSF 0.00043 GO:0016071 10 process 1,RNPS1,CPSF6,CNOT9,DCP1B 4 RNA catabolic GO:0006401 EXOSC10,PARN,RNPS1,CNOT9,DCP1B 5 0.0448 process mRNA catabolic GO:0006402 EXOSC10,PARN,RNPS1,CNOT9,DCP1B 5 0.0182 process regulation of EIF3A,OGFOD1,ELAVL1,SYNCRIP,EIF4ENIF1,YTH GO:0006417 7 0.0404 translation DF1,YTHDF2

61 Table 2-S6. Genes that were differentially expressed as a function of number of day until dispersal (n=22).

Days until Days until HGNC dispersal Ensembl ID HGNC description dispersal symbol Log 2 q-value fold change ATPase Na+/K+ transporting subunit ENSSTOG00000011473 ATP1B1 -0.043 0.028 beta 1 elongation factor for RNA ENSSTOG00000012181 ELL3 -0.029 0.069 polymerase II 3 ENSSTOG00000009423 TCTN1 tectonic family member 1 -0.027 0.056 ENSSTOG00000005781 NEIL1 nei like DNA glycosylase 1 -0.025 0.027 ENSSTOG00000026333 CD19 CD19 molecule -0.025 0.034 ENSSTOG00000004669 FCRLB Fc receptor like B -0.020 0.069 ENSSTOG00000027236 JMJD7 jumonji domain containing 7 -0.019 0.069 ENSSTOG00000019989 ZFP64 ZFP64 zinc finger protein -0.017 0.027 ENSSTOG00000008793 HOMER2 homer scaffolding protein 2 -0.017 0.063 ENSSTOG00000014894 PDK3 pyruvate dehydrogenase kinase 3 -0.016 0.063 ENSSTOG00000011221 NA NA -0.015 0.087 ENSSTOG00000002553 GRK5 G protein-coupled receptor kinase 5 -0.014 0.062 ENSSTOG00000006984 NA NA -0.014 0.028 mitogen-activated protein kinase ENSSTOG00000002821 MAP3K4 -0.013 0.034 kinase kinase 4 ENSSTOG00000007985 DDX31 DEAD-box helicase 31 -0.013 0.079 ENSSTOG00000012505 HTATSF1 HIV-1 Tat specific factor 1 -0.011 0.042 ENSSTOG00000004523 BCL11A B-cell CLL/lymphoma 11A -0.011 0.071 RAB GTPase activating protein 1 ENSSTOG00000002212 RABGAP1L -0.011 0.073 like anaphase promoting complex subunit ENSSTOG00000003938 ANAPC1 -0.011 0.069 1 upstream binding transcription factor, ENSSTOG00000029137 UBTF -0.010 0.063 RNA polymerase I ENSSTOG00000000362 NA NA 0.011 0.080 ENSSTOG00000016556 GLRX glutaredoxin 0.016 0.035

62 Bibliography

Akdemir D, Godfrey OU (2015) EMMREML: Fitting Mixed Models with Known Covariance

Structures.

Alberts SC, Altmann J (1995) Balancing costs and opportunities : Dispersal in male baboons.

The American Naturalist, 145, 279–306.

Anders S, Pyl PT, Huber W (2015) HTSeq-A Python framework to work with high-throughput

sequencing data. Bioinformatics, 31, 166–169.

Archer SN, Viola AU, Kyriakopoulou V, von Schantz M, Dijk D (2008) Inter-individual

differences in habitual sleep timing and entrained phase of endogenous circadian rhythms of

BMAL1, PER2 and PER3 mRNA in human leukocytes. Sleep, 31, 608–617.

Armitage KB (1991) Social and population dynamics of yellow-bellied marmots: Results from

long-term research. Annual Review of Ecology, Evolution, and Systematics, 22, 379–407.

Armitage KB (2014) Marmot Biology: Sociality, Individual Fitness, and Population Dynamics.

Cambridge University Press, Cambridge.

Armitage KB, Van Vuren DH, Ozgul A, Oli MK (2011) Proximate causes of natal dispersal in

female yellow-bellied marmots, Marmota flaviventris. Ecology, 92, 218–27.

Avitsur R, Hunzeker J, Sheridan JF (2006) Role of early stress in the individual differences in

host response to viral infection. Brain, Behavior, and Immunity, 20, 339–348.

Ayuso M, Fernández A, Núñez Y et al. (2016) Developmental stage, muscle and genetic type

modify muscle transcriptome in pigs: Effects on gene expression and regulatory factors

involved in growth and metabolism. Plos One, 11, e0167858.

Barbraud C, Johnson AR, Bertault G (2003) Phenotypic correlates of post-fledging dispersal in a

population of greater flamingos: The importance of body condition. Journal of Animal

63 Ecology, 72, 246–257.

Bekoff M (1977) Mammalian dispersal and the ontogeny of individual behavioral phenotypes.

The American Naturalist, 111, 715–732.

Berg J, Tymoczko J, Stryer L (2002) Each Organ Has a Unique Metabolic Profile. In:

Biochemistry, p. . W. H. Freeman.

Bininda-Emonds ORP, Cardillo M, Jones KE et al. (2007) The delayed rise of present-day

mammals. Nature, 446, 507–512.

Blem CR (1980) The Energetics of Migration. In: Animal Migration, Orientation, and

Navigation (ed Sidney A. Gauthreaux J), pp. 175–224. Academic Press, New York.

Blumstein DT (2013) Yellow-bellied marmots: insights from an emergent view of sociality.

Philosophical Transactions of the Royal Society B, 368, 20120349.

Blumstein DT, Im S, Nicodemus A, Zugmeyer C (2004) Yellow-bellied marmots (Marmota

flaviventris) hibernate socially. Journal of Mammalogy, 85, 25–29.

Blumstein DT, Lea AJ, Olson LE, Martin JGA (2010) Heritability of anti-predatory traits:

vigilance and locomotor performance in marmots. Journal of Evolutionary Biology, 23,

879–87.

Blumstein DT, Wey TW, Tang K (2009) A test of the social cohesion hypothesis: Interactive

female marmots remain at home. Proceedings of the Royal Society of London B: Biological

Sciences, 276, 3007–12.

Boivin DBDB, James FOFO, Wu A et al. (2003) Circadian clock genes oscillate in human

peripheral blood mononuclear cells. Biosystems, 102, 4143–4145.

Bonte D, Van Dyck H, Bullock JM et al. (2012) Costs of dispersal. Biological Reviews, 87, 290–

312.

64 Boss J, Liedvogel M, Lundberg M et al. (2016) Gene expression in the brain of a migratory

songbird during breeding and migration. Movement Ecology, 4, 4.

Brisson JA, Davis GK, Stern DL (2007) Common genome-wide patterns of transcript

accumulation underlying the wing polyphenism and polymorphism in the pea aphid

(Acyrthosiphon pisum). Evolution and Development, 9, 338–346.

Chelly J, Kaplan J-C, Maire P, Gautron S, Kahn A (1988) Transcription of the dystrophin gene

in human muscle and non-muscle tissues. Nature, 333, 858–860.

Chepko-Sade BD, Halpin ZT (1987) Mammalian Dispersal Patterns: The Effects of Social

Structure on Population Genetics. University of Chicago Press, Chicago.

Clobert J, Danchin E, Dhondt AA, Nichols JD (Eds.) (2001) Dispersal. Oxford University Press,

Oxford.

Clobert J, Le Galliard J-F, Cote J, Meylan S, Massot M (2009) Informed dispersal, heterogeneity

in animal dispersal syndromes and the dynamics of spatially structured populations.

Ecology Letters, 12, 197–209.

Clutton-Brock TH, Lukas D (2012) The evolution of social philopatry and dispersal in female

mammals. Molecular Ecology, 21, 472–492.

Cole SW (2014) Human social genomics. PLoS Genetics, 10, 4–10.

Cole SW, Conti G, Arevalo JMG et al. (2012) Transcriptional modulation of the developing

immune system by early life social adversity. Proceedings of the National Academy of

Sciences of the United States of America, 109, 20578–20583.

Cole SW, Hawkley LC, Arevalo JM et al. (2007) Social regulation of gene expression in human

leukocytes. Genome Biology, 8, R189.

Cote J, Clobert J (2007) Social personalities influence natal dispersal in a lizard. Proceedings of

65 the Royal Society of London B: Biological Sciences, 274, 383–390.

Cote J, Clobert J, Brodin T, Fogarty S, Sih A (2010) Personality-dependent dispersal:

Characterization, ontogeny and consequences for spatially structured populations.

Philosophical Transactions of the Royal Society B, 365, 4065–76.

Davies MN, Lawn S, Whatley S et al. (2009) To what extent is blood a reasonable surrogate for

brain in gene expression studies: Estimation from mouse hippocampus and spleen.

Frontiers in Neuroscience, 3, 1–6.

Désert C, Duclos MJ, Blavy P et al. (2008) Transcriptome profiling of the feeding-to-fasting

transition in chicken liver. BMC Genomics, 9, 611.

Dingemanse NJ, Both C, van Noordwijk AJ, Rutten AL, Drent PJ (2003) Natal dispersal and

personalities in great tits (Parus major). Proceedings of the Royal Society of London B:

Biological Sciences, 270, 741–7.

Doligez B, Pärt T (2008) Estimating fitness consequences of dispersal: a road to “know-where”?

Non-random dispersal and the underestimation of dispersers’ fitness. Journal of Animal

Ecology, 77, 1199–1211.

Duckworth RA, Badyaev A V (2007) Coupling of dispersal and aggression facilitates the rapid

range expansion of a passerine bird. Proceedings of the National Academy of Sciences of

the United States of America, 104, 15017–22.

Dufty AM, Belthoff JR (2001) Proximate mechanisms of natal dispersal: The role of body

condition and hormones. In: Dispersal (eds Clobert J, Danchin E, Dhondt AA, Nichols JD),

pp. 223–233. Oxford University Press, Oxford.

Edgar R, Domrachev M, Lash AE (2002) Gene Expression Omnibus: NCBI gene expression and

hybridization array data repository. Nucleic Acids Research, 30, 207–210.

66 Ekernas LS, Cords M (2007) Social and environmental factors influencing natal dispersal in blue

monkeys, Cercopithecus mitis stuhlmanni. Animal Behaviour, 73, 1009–1020.

Fudickar AM, Peterson MP, Greives TJ et al. (2016) Differential gene expression in seasonal

sympatry: mechanisms involved in diverging life histories. Biology Letters, 12, 20160069.

Geschwind DH, Konopka G (2009) Neuroscience in the era of functional genomics and systems

biology. Nature, 461, 908–15.

Gese EM, Ruff RL, Crabtree RL (1996) Social and nutritional factors influencing the dispersal of

resident coyotes. Animal Behaviour, 52, 1025–1043.

Ghosh S, Dent R, Harper M-E et al. (2010) Gene expression profiling in whole blood identifies

distinct biological pathways associated with obesity. BMC Medical Genomics, 3, 56.

Gray KA, Yates B, Seal RL, Wright MW, Bruford EA (2014) Genenames.org: The HGNC

resources in 2015. Nucleic Acids Research, 43, D1079–D1085.

Greenwood PJ (1980) Mating systems, philopatry and dispersal in birds and mammals. Animal

Behaviour, 28, 1140–1162.

Hammond RL, Handley LJL, Winney BJ, Bruford MW, Perrin N (2006) Genetic evidence for

female-biased dispersal and gene flow in a polygynous primate. Proceedings of the Royal

Society of London B: Biological Sciences, 273, 479–484.

Holaska JM, Rais-Bahrami S, Wilson KL (2006) Lmo7 is an emerin-binding protein that

regulates the transcription of emerin and many other muscle-relevant genes. Human

Molecular Genetics, 15, 3459–3472.

Horvath S (Ed.) (2011) Weighted Network Analysis: Applications in Genomics and Systems

Biology. Springer, New York.

Horvath S, Zhang B, Carlson M et al. (2006) Analysis of oncogenic signaling networks in

67 glioblastoma identifies ASPM as a molecular target. Proceedings of the National Academy

of Sciences of the United States of America, 103, 17402–17407.

Janeway CA, Travers P, Walport M, Shlomchik MJ (2005) Immunobiology: The Immune System

in Health and Disease. Garland Science Publishing, New York.

Jeffs AG, Willmott ME, Wells RMG (1999) The use of energy stores in the puerulus of the spiny

lobster Jasus edwardsii across the continental shelf of New Zealand. Comparative

Biochemistry and Physiology, 123, 351–357.

Jenni L, Jenni-Eiermann S (1998) Fuel supply and metabolic constraints in migrating birds.

Journal of Avian Biology, 29, 521–528.

Johnston RA, Paxton KL, Moore FR, Wayne RK, Smith TB (2016) Seasonal gene expression in

a migratory songbird. Molecular Ecology, 25, 5680–5691.

Jones CM, Papanicolaou A, Mironidis GK et al. (2015) Genomewide transcriptional signatures

of migratory flight activity in a globally invasive insect pest. Molecular Ecology, 24, 4901–

4911.

Jones S, Pfister-Genskow M, Cirelli C, Benca RM (2008) Changes in brain gene expression

during migration in the white-crowned sparrow. Brain Research Bulletin, 76, 536–544.

Kim D, Pertea G, Trapnell C et al. (2013) TopHat2: accurate alignment of transcriptomes in the

presence of insertions, deletions and gene fusions. Genome Biology, 14, R36.

Klein J (1986) Natural History of the Major Histocompatibility Complex. Wiley, New York.

Kohane IS, Valtchinov VI (2012) Quantifying the white blood cell transcriptome as an accessible

window to the multiorgan transcriptome. Bioinformatics, 28, 538–545.

Krueger F (2015) Trim Galore: A wrapper tool around Cutadapt and fastQC to consistently apply

quality and adapter trimming to FastQ files.

68 Langfelder P, Horvath S (2008) WGCNA: an R package for weighted correlation network

analysis. BMC Bioinformatics, 9, 559.

Langfelder P, Mischel PS, Horvath S (2013) When is hub gene selection better than standard

meta-analysis? PLoS One, 8.

Law CW, Chen Y, Shi W, Smyth GK (2014) voom: precision weights unlock linear model

analysis tools for RNA-seq read counts. Genome Biology, 15, R29.

Leek JT, Scharpf RB, Bravo HC et al. (2010) Tackling the widespread and critical impact of

batch effects in high-throughput data. Nature Reviews Genetics, 11, 733–739.

Liedvogel M, Åkesson S, Bensch S (2011) The genetics of migration on the move. Trends in

Ecology & Evolution, 26, 561–569.

Liew CC, Ma J, Tang HC, Zheng R, Dempsey AA (2006) The peripheral blood transcriptome

dynamically reflects system wide biology: A potential diagnostic tool. Journal of

Laboratory and Clinical Medicine, 147, 126–132.

Lin Y-TK, Batzli GO (2001) The influence of habitat quality on dispersal, demography, and

population dynamics of voles. Ecological Monographs, 71, 245–275.

Liu R, Jin JP (2016) Calponin isoforms CNN1, CNN2 and CNN3: Regulators for actin

cytoskeleton functions in smooth muscle and non-muscle cells. Gene, 585, 143–153.

Lopez J, Wey TW, Blumstein DT (2013) Patterns of parasite prevalence and individual infection

in yellow-bellied marmots. Journal of Zoology, 291, 296–303.

Lurz PWW, Garson PJ, Wauters LA (1997) Effects of temporal and spatial variation in habitat

quality on red squirrel dispersal behaviour. Animal Behaviour, 54, 427–435.

Matthysen E (2005) Density-dependent dispersal in birds and mammals. Ecography, 28, 403–

416.

69 Meylan S, Belliure J, Clobert J, de Fraipont M (2002) Stress and body condition as prenatal and

postnatal determinants of dispersal in the common lizard (Lacerta vivipara). Hormones and

Behavior, 42, 319–326.

Miller GE, Chen E, Fok AK et al. (2009) Low early-life social class leaves a biological residue

manifested by decreased glucocorticoid and increased proinflammatory signaling.

Proceedings of the National Academy of Sciences of the United States of America, 106,

14716–21.

Mitchell JA, Chesi A, Elci O et al. (2015) Genetics of bone mass in childhood and adolescence:

Effects of sex and maturation interactions. Journal of Bone and Mineral Research, 30,

1676–1683.

Mizuno H, Nakamura A, Aoki Y et al. (2011) Identification of muscle-specific MicroRNAs in

serum of muscular dystrophy animal models: promising novel blood-based markers for

muscular dystrophy. PLoS ONE, 6, 14–19.

Nagano A, Koga R, Ogawa M et al. (1996) Emerin deficiency at the nuclear membrane in

patients with Emery-Dreifuss muscular dystrophy. Nature Genetics, 12, 254–9.

Naslavsky N, Caplan S (2011) EHD proteins: Key conductors of endocytic transport. Trends in

Cell Biology, 21, 122–131.

Nilsson J-åke, Smith HG (1985) Early fledgling mortality and the timing of juvenile dispersal in

the marsh tit Parus palustris. Ornis Scandinavica, 16, 293–298.

Oldham MC, Konopka G, Iwamoto K et al. (2008) Functional organization of the transcriptome

in human brain. Nature Neuroscience, 11, 1271–1282.

Palmer C, Diehn M, Alizadeh AA, Brown PO (2006) Cell-type specific gene expression profiles

of leukocytes in human peripheral blood. BMC Genomics, 7, 115.

70 Parikshak NN, Gandal MJ, Geschwind DH (2015) Systems biology and gene networks in

neurodevelopmental and neurodegenerative disorders. Nature Reviews Genetics, 16, 441–

458.

Pohl U, Smith JS, Tachibana I et al. (2000) EHD2, EHD3, and EHD4 encode novel members of

a highly conserved family of EH domain-containing proteins. Genomics, 63, 255–62.

Poirier FE, Bellisari A, Haines L (1978) Functions of primate play behavior. In: Social Play in

Primates (ed Smith EO), pp. 143–168. Academic Press, New York.

Pope TR (2000) The evolution of male philopatry in neotropical monkeys. In: Primate Males:

Causes and Consequences of Variation in Group Composition (ed Kappeler PM), p. 219.

Cambridge University Press, Cambridge.

Postel U, Thompson F, Barker G, Viney M, Morris S (2010) Migration-related changes in gene

expression in leg muscle of the Christmas Island red crab Gecarcoidea natalis: Seasonal

preparation for long-distance walking. The Journal of Experimental Biology, 213, 1740–

1750.

Pruett-Jones SG, Lewis MJ (1990) Sex ratio and habitat limitation promote delayed dispersal in

superb fairy-wrens. Nature, 348, 541–542.

Pusey A (1987) Sex-biased dispersal and inbreeding avoidance in birds and mammals. Trends in

Ecology & Evolution, 2, 295–299.

R Core Team (2016) R: A language and environment for statistical computing.

Reimand J, Arak T, Adler P et al. (2016) g:Profiler- a web server for functional interpretation of

gene lists (2016 update). Nucleic Acids Research, 1–7.

Reimand J, Kull M, Peterson H, Hansen J, Vilo J (2007) G:Profiler-a web-based toolset for

functional profiling of gene lists from large-scale experiments. Nucleic Acids Research, 35,

71 193–200.

Ritchie ME, Phipson B, Wu D et al. (2015) limma powers differential expression analyses for

RNA-sequencing and microarray studies. Nucleic Acids Research, 43, e47.

Robinson GE, Grozinger CM, Whitfield CW (2005) Sociogenomics: Social life in molecular

terms. Nature Reviews Genetics, 6, 257–271.

Romero IG, Pai AA, Tung J, Gilad Y (2014) RNA-seq: impact of RNA degradation on transcript

quantification. BMC Biology, 12, 42.

Rudkowska I, Raymond C, Ponton A et al. (2011) Validation of the use of peripheral blood

mononuclear cells as surrogate model for skeletal muscle tissue in nutrigenomic studies.

OMICS: A Journal of Integrative Biology, 15, 1–7.

Ruegg K, Anderson EC, Boone J, Pouls J, Smith TB (2014) A role for migration-linked genes

and genomic islands in divergence of a songbird. Molecular Ecology, 23, 4757–4769.

Rui L (2014) Energy metabolism in the liver. Comprehensive Physiology, 4, 177–197.

Sáenz A, Azpitarte M, Armañanzas R et al. (2008) Gene Expression Profiling in Limb-Girdle

Muscular Dystrophy 2A. PLoS One, 3, e3750.

Sale EM, Atkinson PG, Sale GJ (1995) Requirement of MAP kinase for differentiation of

fibroblasts to adipocytes, for insulin activation of p90 S6 kinase and for insulin or serum

stimulation of DNA synthesis. The EMBO Journal, 14, 674–684.

Silverin B (1997) The stress response and autumn dispersal behaviour in willow tits. Animal

Behaviour, 53, 451–459.

Smale L, Nunes S, Holekamp KE (1997) Sexually dimorphic dispersal in mammals: Patterns,

causes, and consequences. Advances in the Study of Behavior, 26, 181–251.

Smedley D, Haider S, Durinck S et al. (2015) The BioMart community portal: An innovative

72 alternative to large, centralized data repositories. Nucleic Acids Research, 43, W589–W598.

Snyder-Mackler N, Sanz J, Kohn JN et al. (2016) Social status alters immune regulation and

response to infection in macaques. Science, 354, 1041–1045.

Soria-Carrasco V, Castresana J (2012) Diversification rates and the latitudinal gradient of

diversity in mammals. Proceedings of the Royal Society B: Biological Sciences, 279, 4148–

4155.

Soulsbury CD, Baker PJ, Iossa G, Harris S (2008) Fitness costs of dispersal in red foxes (Vulpes

vulpes). Behavioral Ecology and Sociobiology, 62, 1289–1298.

Srygley RB, Lorch PD, Simpson SJ, Sword GA (2009) Immediate protein dietary effects on

movement and the generalised immunocompetence of migrating Mormon crickets Anabrus

simplex (Orthoptera: Tettigoniidae). Ecological Entomology, 34, 663–668.

Storey JD, Bass AJ, Dabney A, Robinson D (2015) QVALUE: Q-value Estimation for False

Discovery Rate Control.

Storey JD, Tibshirani R (2003) Statistical significance for genomewide studies. Proceedings of

the National Academy of Sciences of the United States of America, 100, 9440–5.

Sugden MC, Holness MJ (2003) Recent advances in mechanisms regulating glucose oxidation at

the level of the pyruvate dehydrogenase complex by PDKs. American Journal of

Physiology-Endocrinology and Metabolism, 284, E855–E862.

Sullivan PF, Fan C, Perou CM (2006) Evaluating the comparability of gene expression in blood

and brain. American Journal of Medical Genetics - Neuropsychiatric Genetics, 141 B, 261–

268.

Tang Z, Liang R, Zhao S et al. (2014) CNN3 is regulated by microrna-1 during muscle

development in pigs. International Journal of Biological Sciences, 10, 377–385.

73 Thomas WK, Martin SL (1993) A recent origin of marmots. Molecular Phylogenetics and

Evolution, 2, 330–336.

Todd JA, Acha-Orbea H, Bell JI et al. (1988) A molecular basis for MHC class II- associated

autoimmunity. Science, 240, 1003–1009.

Trapnell C, Pachter L, Salzberg SL (2009) TopHat: Discovering splice junctions with RNA-Seq.

Bioinformatics, 25, 1105–1111.

Tung J, Barreiro LB, Johnson ZP et al. (2012) Social environment is associated with gene

regulatory variation in the rhesus macaque immune system. Proceedings of the National

Academy of Sciences of the United States of America, 109, 6490–5.

Tung J, Zhou X, Alberts SC, Stephens M, Gilad Y (2015) The genetic architecture of gene

expression levels in wild baboons. eLife, 4, 1–22.

Vellichirammal NN, Zera AJ, Schilder RJ et al. (2014) De novo transcriptome assembly from fat

body and flight muscles transcripts to identify morph-specific gene expression profiles in

Gryllus firmus. PLoS One, 9, e82129.

Van Vuren DH (1990) Dispersal of Yellow-Bellied Marmots. University of Kansas.

Van Vuren DH, Armitage KB (1994) Survival of dispersing and philopatric yellow-bellied

marmots: What is the cost of dispersal? Oikos, 69, 179–181.

Wang J (2011) Coancestry: A program for simulating, estimating and analysing relatedness and

inbreeding coefficients. Molecular Ecology Resources, 11, 141–145.

Wang F, Wang L, Shen M, Ma L (2012) GRK5 deficiency decreases diet-induced obesity and

adipogenesis. Biochemical and Biophysical Research Communications, 421, 312–317.

Weston DJ, Gunter LE, Rogers A, Wullschleger SD (2008) Connecting genes, coexpression

modules, and molecular signatures to environmental stress phenotypes in plants. BMC

74 Systems Biology, 2, 16.

Wheat CW, Fescemyer HW, Kvist J et al. (2011) Functional genomics of life history variation in

a butterfly metapopulation. Molecular Ecology, 20, 1813–1828.

Woodroffe R, Macdonald DW, da Silva J (1993) Dispersal and philopatry in the European

badger, Meles meles. Journal of Zoology, 237, 227–239.

Wright FA, Sullivan PF, Brooks AI et al. (2014) Heritability and genomics of gene expression in

peripheral blood. Nature Genetics, 46, 430–437.

Yoder JM, Marschall EA, Swanson DA (2004) The cost of dispersal: Predation as a function of

movement and site familiarity in ruffed grouse. Behavioral Ecology, 15, 469–476.

Zera AJ, Brisson JA (2012) Quantitative, physiological, and molecular genetics of dispersal and

migration. In: Dispersal Ecology and Evolution (ed Clobert J), pp. 63–82. Oxford

University Press, Oxford.

Zera AJ, Denno RF (1997) Physiology and ecology of dispersal polymorphism in insects. Annual

Review of Entomology, 42, 207–230.

Zhan S, Zhang W, Niitepõld K et al. (2014) The genetics of monarch butterfly migration and

warning colouration. Nature, 514, 317–21.

Zhang B, Horvath S (2005) A general framework for weighted gene co-expression network

analysis. Statistical Applications in Genetics and Molecular Biology, 4, Article17.

Zhang R, Lahens NF, Ballance HI, Hughes ME, Hogenesch JB (2014) A circadian gene

expression atlas in mammals: Implications for biology and medicine. Proceedings of the

National Academy of Sciences, 111, 16219–16224.

Zhang Q, Sun X, Xiao X et al. (2016) Effects of maternal chromium restriction on the long-term

programming in MAPK signaling pathway of lipid metabolism in mice. Nutrients, 8, 1–17.

75 Zhou H, Mori S, Ishizaki T et al. (2016) Genetic risk score based on the lifetime prevalence of

femoral fracture in 924 consecutive autopsies of Japanese males. Journal of Bone and

Mineral Metabolism, 34, 685–691.

Zhu H, Gegear RJ, Casselman A, Kanginakudru S, Reppert SM (2009) Defining behavioral and

molecular differences between summer and migratory monarch butterflies. BMC Biology, 7,

14.

76

Chapter 3:

Wild rodents exhibit an inflammatory transcriptional response to social adversity

77 Abstract

The social environment can have profound impacts on gene expression. Specifically, socially stressed individuals exhibit chronic inflammation and reduced antiviral responses.

However, most of the existing studies have been conducted on humans and other highly social species in captivity. Under natural conditions, factors such as pathogen exposure, nutrition, or other environmental factors also regulate inflammatory and antiviral response and the influence of social stressors relative to these other factors has not been well assessed in the wild. To address this shortcoming, we profiled global gene expression levels in whole blood from wild yellow-bellied marmots (Marmota flaviventer) using RNA-seq. This species is facultatively social and previous results have shown both costs and benefits associated with increasing sociality. We observed affiliative and agonistic social interactions between wild marmots, created social networks and quantified individual connectedness in networks. Overall, social position explained little variation in global gene expression in discovery-based analyses at the level of individual gene transcripts. However, we found that individuals that are less affiliative and those that are subjected to relatively more aggression by conspecifics up-regulate expression of an a priori-specified set of proinflammatory genes. We found no evidence of social influences on antiviral/antibody-related gene expression. Our results suggest that social isolation and agonistic/threat-related processes induce inflammation, but no viral response. Differences from effects observed in previous studies of humans and captive species may reflect the free-living, facultatively social structure of marmots and the effects that environmental factors, in addition to social environment, have on gene expression.

78 Introduction

Social stressors and social isolation can have important health consequences. In humans, social stress can lead to increased disease susceptibility (S. Cohen, Janicki-Deverts, & Miller,

2007), compromised pathogen defense (S. Cohen, Doyle, Skoner, Rabin, & Gwaltney, 1997), and reduced longevity and increased mortality rates (Holt-Lunstad, Smith, & Layton, 2010;

Steptoe, Shankar, Demakakos, & Wardle, 2013). Recent sociogenomic studies suggest that these health disparities reflect large-scale shifts in transcription of immune-related genes (Steven W.

Cole, 2013, 2014). Humans who experience chronic social stressors, such as having few close personal relationships (Steven W. Cole et al., 2007; Hawkley, Capitanio, & Hawkley, 2015), low socioeconomic status (Miller et al., 2009), or imminent bereavement (Miller et al., 2008), typically increase expression of genes involved in inflammation and decrease expression of genes involved in antiviral responses and antibody synthesis in blood tissues (a pattern termed the conserved transcriptional response to adversity; CTRA) (Steven W. Cole, 2013, 2014).

Less research has been done in non-humans, but CTRA-like transcriptome shifts have also been observed in the other social species studied thus far. For example, social aggression drives immunological gene expression profiles in rodent brain (Audet, Jacobson-Pick, Wann, &

Anisman, 2011; Powell et al., 2013), blood (Avitsur, Powell, Padgett, & Sheridan, 2009; Stark,

Avitsur, Hunzeker, Padgett, & Sheridan, 2002), and lung tissues (Avitsur et al., 2006), as well as in non-human primate blood cells (Steven W. Cole et al., 2012; Tung et al., 2012). Recently these immune responses were found not only conserved across mammals, but also in lower vertebrates. Specifically, maraena whitefish (Coregonus maraena) activate immune- and stress- related pathways when housed in crowded conditions (Korytár et al., 2016). Although these

79 adverse social circumstances qualitatively differ, they each induce a similar transcriptional response, including a chronic inflammatory state and reduced innate antiviral defenses.

Intuitively, the observed physiological response to social stress would appear to be deleterious. Chronic inflammation leads to many of the diseases that disproportionately afflict socially disadvantaged humans (e.g., cardiovascular disease, stroke, asthma; Barnes, Chung, &

Page, 1988; Jin, Yang, & Li, 2010; Libby, 2006; Lindsberg & Grau, 2003) and a weakened antiviral response can limit the ability to combat pathogens and infections (Avitsur et al., 2006;

Raposo et al., 2013). However, two hypotheses have been advanced to explain the potential adaptive value of the CTRA response.

The first hypothesis focuses on adaptation to pathogen exposures associated with differing social conditions. This “density-dependent” hypothesis notes that socially isolated and socially integrated individuals experience vastly different environments with potential differences in pathogen risk. Individuals experiencing an amicable social environment are at greater risk of encountering viral infections through close contact with conspecifics (Nunn &

Altizer, 2006). Thus, these individuals should up-regulate their antiviral defenses. In an isolated social environment, however, individuals are at greater risk of being physically attacked by conspecifics and predators. Proinflammatory and antiviral gene expression programs are reciprocally regulated by different transcription factor systems (Amit et al., 2009; Decker,

Müller, & Stockinger, 2005) and it is highly adaptive to shift from one regime to another in order to counter changing profiles of pathogen risk. Therefore, less social individuals should direct energy towards an inflammatory response that will aid in wound healing and bacterial infection, whereas highly social individuals might prepare to fight infections that result from close social contact.

80 A second adaptive hypothesis for the CTRA emphasizes the need to alter immune defense profiles during periods when individuals experience threats and consequently shift their molecular resources toward inflammatory responses to the specific infectious risks associated with wounding injuries (Steven W. Cole, 2014; Powell et al., 2013). Under this “threat- dependent” hypothesis, the proximal trigger for the CTRA profile is not the density of social contacts per se but rather threatening life experiences (whether social or non-social). Studies supporting this hypothesis have shown that blocking fight-or-flight neuroendocrine signaling is sufficient to block CTRA development (holding constant conspecific social density; Powell et al., 2013).

Despite the extensive support from studies of humans and nonhumans under in captive conditions, no studies have addressed how the transcriptome might respond to social conditions in a natural setting. Unlike controlled environments and human lifestyles, life in the wild is more heterogeneous and unpredictable. Free-living individuals regularly encounter a suite of unique challenges such as insect-vectored pathogens, predation, nutritional challenges, climactic forces, and a diverse array of other “non-social” ecological influences. Thus, the biological impact of perceived social adversity may be minor compared to the challenges created by environmental variation, the threats of competition, the challenges associated with reproduction, and the need to avoid predation. If so, the sociogenomic response is expected to be attenuated compared to more controlled systems where these risks are removed (provided social factors are relatively uncorrelated with other risk factors).

To test this paradigm in a wild population, we use data and samples from a long-term research program on yellow-bellied marmots (Marmota flaviventer) near the Rocky Mountain

Biological Laboratory (RMBL). This population has been continuously studied since 1962

81 leading to new insights about how social behavior interacts with group dynamics, demography and fitness (Armitage, 2014; Blumstein, 2013). Importantly, yellow-bellied marmots at the study site are facultatively social which enhances variation in sociality allowing more robust statistical testing. Females recruit daughters to form matrilines (Armitage, 2014) but although some geographic locations contain large and multiple matrilines, other, smaller populations contain only a breeding age female and her young of the year. Individuals may be completely socially isolated or may live in colony sites with up to 65 others (D.T. Blumstein unpublished data).

Furthermore, there is significant variation in how socially integrated individuals are, regardless of colony size. Formal analyses of social networks indicate that marmots can vary by a factor of

10-fold in the number of conspecifics they affiliate with, even within the same colony-year (D.T.

Blumstein unpublished data).

We investigated whether affiliative/density-related or agonistic/adversity-related social metrics can explain variation in the genomic profile of free-living individuals by profiling global gene expression in whole blood using RNA-seq. We adopted a hypothesis-driven approach based on gene expression studies in humans and non-human species and predicted: 1) An individual’s level of social connectedness or social stress will be associated with differential expression of immune system genes in blood cells; 2) Marmots that are more social and that frequently engage in affiliative interactions with conspecifics (hereafter called integrated) will significantly up- regulate canonical antiviral genes compared to marmots that are less affiliative with conspecifics

(hereafter, isolated); and 3) Individuals that receive more agonistic interactions from conspecifics

(hereafter, bullied) will up-regulate canonical inflammatory genes compared to those that are less bullied.

82 Materials and methods

Study subjects

We studied free-living yellow-bellied marmots near the Rocky Mountain Biological

Laboratory (RMBL) in Gunnison County, Colorado from 2013-2015. Marmots emerge from hibernation in late April or early May and begin hibernating in mid-September or early October, depending on weather (Armitage, 2014; Blumstein, 2009). Marmots were systematically live- trapped throughout the active season at seven geographically distinct colonies surrounding

RMBL. During each trap event, basic measurements were recorded (body mass, foot length, etc.), blood samples collected (blood, hair, feces), ear tags affixed (when needed), and a unique dorsal fur mark for individual identification from afar was applied (Blumstein 2013).

Social observations

Trained observers closely watched marmot colonies during times of peak activity (0700–

1000 h and 1600–1900 h) from a distance that did not interfere with typical behavior (20–150 m depending on habitat). Observers recorded date, time, location, and all social and non-social behaviors for observed individuals (ethogram and detailed observation methods in Blumstein,

Wey, et al., 2009). Social interactions were classified as affiliative (i.e., allogrooming, playing, sitting in body contact) or agonistic (physical aggression or displacement). Interactions that were neither affiliative nor agonistic (such as a rare mount which may or may not precede sex) were excluded from analyses. During each social interaction, we noted the initiator, recipient, and the winner (defined as the marmot that remained at the interaction location). In general, each site was observed throughout the active season and most individuals were observed daily.

83 Social measures

Using these social observations, we created separate affiliative and agonistic pairwise interaction matrices between resident yearlings and adults in each colony annually. We excluded interactions with juveniles and transient individuals observed fewer than five times. These matrices were used to create weighted, directed affiliative and agonistic social networks in

SOCPROG (Whitehead, 2009).

Because we had no strong predictions about which specific social measures would be associated with gene expression, we calculated six affiliative attributes that quantify the extent of an individual’s social integration: 1) degree; 2) strength; 3) embeddedness; 4) closeness centrality; 5) betweenness centrality; and 6) eigenvector centrality using the iGraph package

(Csárdi & Nepusz, 2006) in R (version 3.3.1; R Core Team, 2016). Briefly, degree measures the number of direct connections that a focal individual has with others in the network (Freeman,

1978), whereas strength measures the magnitude of these direct relationships (Barrat,

Barthelemy, Pastor-Satorras, & Vespignani, 2004). Embeddedness measures how deeply integrated an individual is in a network (Moody & White, 2003). Centrality measures reflect an individual’s importance to the structure of the network. Closeness represents the shortest path distance from one individual to all other group members, thus describing how independent an individual is (Freeman, 1978). Betweenness is the proportion of shortest paths between every other pair of animals on which a focal animal lies, indicating how important an individual is in connecting others (Freeman, 1978). Eigenvector is the sum of centralities of an individual’s neighbors (Bonacich, 1987).

We quantified the extent to which each individual was bullied by conspecifics using received agonistic interactions (as in Lea, Blumstein, Wey, & Martin, 2010). From our agonistic

84 social networks, we calculated: 1) indegree; 2) instrength; and 3) incloseness, where indegree equals the number of neighbors that initiate agonistic interactions toward a focal individual, instrength represents the number of times an individual receives agonistic behaviors, and incloseness measures the shortest distance from which a node can be reached from other nodes

(for detailed definitions of these and other social network measures, see Wey, Blumstein, Shen,

& Jordán, 2008). Each social measure was rank-transformed to meet distribution assumptions.

Reduction of social variables

Because social network measures are often highly correlated, we first conducted a principal component analysis (PCA) using R’s ‘prcomp’ function to identify correlated and uncorrelated social network measures (as in Wey & Blumstein, 2012; Yang, Maldonado-

Chaparro, & Blumstein, 2017). We calculated components for rank-transformed affiliative and agonistic social measures separately. For each social principal component with an eigenvalue > 1

(calculated as the standard deviation of the component, squared), we identified the social variable that explained the most variation in that component and used that rank-transformed social measure as a fixed factor in mixed effects models testing the transcriptional response to sociality.

RNA sampling, library preparation, and sequencing

Networks consisted of yearling and adult marmots. However, we focused on assessing the CTRA in yearlings because they exhibit more social variation than adults (Wey & Blumstein,

2012). For each trapped yearling, we collected 1 mL of whole blood preserved in 2.5 mL

PAXgeneTM Blood RNA solution (PreAnalytiX, Qiagen, Hombrechtikon, Switzerland). Samples

85 were incubated, frozen, and extracted according to the PAXgeneTM Blood RNA kit instructions.

We removed globin transcripts using the GLOBINclearTM kit for mouse/rat (ambion,

ThermoFisher Scientific, Waltham, MA) and assessed RNA quality using an Agilent 2100

Bioanalyzer (Agilent Technologies, Santa Clara, CA). We excluded any samples with RIN < 4 to preserve statistical power and corrected for degraded samples using validated regression methods

(Romero et al., 2014). cDNA libraries were created using a TruSeq Library Prep Kit v2

(Illumina, Madison, WI), quantified with the KAPA SYBR® Fast qPCR library quantification kit (Kapa Biosystems Inc., Wilmington, MA), and pooled at 8-10 samples per lane. Single-end

100 base pair (bp) sequencing was performed on Illumina HiSeq2000 (2013 samples) and

HiSeq4000 (2014-2015) platforms at the Vincent J. Coates Genomics Sequencing Laboratory,

UC Berkeley, USA.

Read mapping and expression quantification

We trimmed adapters and removed short (< 20 base pairs) and low quality reads (Phred score < 20) using Trim Galore! (Krueger, 2015). We mapped resulting reads to the thirteen-lined ground squirrel (Ictidomys tridecemlineatus) genome (spetri2, GenBank Assembly ID

GCA_000236235.1) using TopHat2 v.2.1.0 (Kim et al., 2013; Trapnell, Pachter, & Salzberg,

2009b). To maximize uniquely mapped reads, we allowed eight mismatches, a 10-base pair (bp) gap length, and a 20 bp edit distance between reads and the reference genome. We quantified expression of uniquely mapped reads per individual using HT-Seq’s ‘union’ mode (Anders et al.,

2015) with the squirrel gene annotation file (Ictidomys_tridecemlineatus.spetri2.84.gtf). We filtered this dataset to only include protein-coding genes with at least 10 reads in 75% of libraries. Count data were transformed for linear modeling and normalized according to

86 sequencing depth, gene length, and mean variance across genes using LIMMA’s ‘voom’ function

(Law et al., 2014; Ritchie et al., 2015). Using the ENSEMBL thirteen-lined ground squirrel genome as a reference, we acquired HGNC (HUGO Gene Nomenclature Committee; K. A. Gray et al., 2014) gene symbol information for squirrel transcripts using BiomaRt (Smedley et al.,

2015). We only included genes with HGNC homologs in downstream analyses.

Removal of technical variation and outliers

Non-biological technical sources of variation, or “batch effects”, are ubiquitous and introduce systematic bias to high-throughput genetic studies (Leek et al., 2010). To identify these spurious artifacts, we created a PCA of the normalized, VOOM transformed expression data and assessed the technically derived variance using ‘prcomp’. We regressed out the variance explained by three batch effects that significantly (p < 0.05) influenced any of the first 12 principal components (PCs) of global gene expression: 1) sequencer (HiSeq 2000 vs 4000; correlated with PC 1 [Pearson correlation r = 0.68, p = 1.65e-11]); 2) RNA extraction date

(performed on six different days; PC 1 [r = 0.48, p = 1.13e-5]); 3) concentration of RNA in each sample (PC 1 [r = 0.26, p = 0.02]). After regressing technical variables, we built a Euclidean distance-based network of the expression residuals using the ‘adjacency’ function in WGCNA

(Langfelder & Horvath, 2008). Samples were designated as outliers and removed from analyses if their connectivity was more than three standard deviations from the mean (as described by

Steve Horvath, 2011).

Cellular composition is also known to influence gene expression in peripheral blood

(Eady et al., 2005; Palmer et al., 2006). In particular, CTRA gene expression is largely driven by the relative abundance of different monocyte subsets (Steven W. Cole, Hawkley, Arevalo, &

87 Cacioppo, 2011; Powell et al., 2013). To test for any effect of major cell populations in our samples, we quantified monocyte counts (subset counts were not available) in blood smears for all RNA samples (detailed methods described in Lopez et al., 2013). Of these cell types, PCA indicated neutrophils explained the most variation in global gene expression (correlated with PC

1 after regressing technical effects above [r = 0.43, p = 0.0001]). Although we did not observe any significant difference in neutrophil counts across social phenotypes (Figure 3-S1), we regressed out the effect of this variable.

Linear mixed effects models

To allow for genome wide discovery of individual genes that are significantly associated with social phenotypes, we created linear mixed models using EMMREML (Akdemir &

Godfrey, 2015). Affiliative and agonistic social measures deemed important by PCA were included as fixed independent effects. We also included sex as a fixed effect as well as the number of adult residents in the colony to control for the number of potential social interactions available to each individual. A matrix of pairwise relatedness between individuals was included as a random effect to control for heritability of gene expression (Tung et al., 2015; Wright et al.,

2014). Relatedness was calculated using the triadic maximum likelihood approach in

COANCESTRY (J. Wang, 2011) based on genotypes obtained from 12 microsatellite loci (as described elsewhere; Blumstein et al., 2010; Lea et al., 2010). Dependent variables of mixed models were the residuals of the filtered, normalized gene expression residuals after regressing batch effect variance. For each gene model, we extracted the p-value associated with social phenotype and adjusted for multiple hypothesis testing using a false discovery rate approach (q;

88 Storey & Tibshirani 2003) implemented in R’s ‘p.adjust’ function. A gene was considered significantly associated with a social phenotype if q ≤ 0.1.

Canonical CTRA genes

In addition to identifying individual differentially expressed (DE) genes associated with the social environment throughout the genome, we investigated expression profiles of a set of 53 a priori-specified CTRA indicator genes. This list has been described and evaluated previously

(Antoni et al., 2016; e.g., Fredrickson et al., 2013; Knight et al., 2016) and includes genes involved in inflammatory (up-regulated in CTRA) and antiviral or antibody production pathways

(down-regulated in CTRA). We identified homologs of these genes in the squirrel genome and separated those involved in proinflammatory and antiviral/antibody pathways into two lists. We then z-score standardized gene expression residuals so that means were centered at zero and variance was equal across genes within the two lists (as in Antoni et al., 2016). We averaged the standardized expression values in each list separately to create one inflammatory and one antiviral/antibody summary score per individual. We then fitted general linear models (GLMs) to test whether social network measures explain variation in these two composite values. As in linear mixed models, fixed independent effects were the affiliative and agonistic social measures, sex, and the number of adult residents in the colony. Dependent variables were the standardized inflammatory and antiviral/antibody composite scores.

Results

Social measures

In total, we observed 338 individual marmots and recorded 16,725 affiliative and 1,693 agonistic interactions during the 2013-2015 field seasons. After removing juveniles and marmots

89 observed fewer than five times in the season, our social dataset consisted of 7,849 affiliative and

695 agonistic interactions among 137 individuals, of which 111 were yearlings. These observations were used to calculate affiliative and agonistic network statistics for 18 colony- years during the study period. PCA reduced our six affiliative and three agonistic social network measures to three uncorrelated social variables. Two affiliative/density-related social measures

(strength and closeness) and one agonistic/adversity-related measure (indegree) explained the variation in the three principal components with an eigenvalue > 1 (Tables 3-S1, S2).

RNA-seq samples

Of the 111 yearlings we observed engaging in social interactions, we obtained high quality RNA sequence data from 76 individuals. We generated an average of 32.9 million reads per individual (range = 17.6 - 54.4 million) and an average of 57% of reads (18.8 million) mapped uniquely to the squirrel genome with sufficient quality and length. Of the 22,389 protein-coding genes in the squirrel genome, we evaluated expression levels of 9,070 (41%) detectably expressed, protein-coding genes with known HGNC symbols. Clustering analysis of these 9,070 gene expression values identified three outlier samples that were excluded from downstream analyses, resulting in a total of 73 samples. We calculated affiliative measures for all 73 individuals. However, agonistic interactions are rare (less than 10% of all social interactions were agonistic in this dataset) and eight individuals were neither observed to initiate nor receive agonistic interactions. Thus, these eight marmots were not included in agonistic association matrices, resulting in 65 individuals with agonistic social measures.

90 Linear mixed effects models

We created two mixed effects models to evaluate the association between variation in inter-individual gene expression and the three uncorrelated social metrics: 1) an affiliative model analyzing strength and closeness (n = 73); and 2) an agonistic model to analyze the effect of indegree (n = 65). Because these social measures were not normally distributed, we used rank- transformed values in models. However, we note that rank-transformed values were highly correlated with the original social metrics (range = 0.60-0.99) and analyses using either of these metrics yielded the same results. After controlling for relatedness, sex, and the number of non- juvenile residents in the colony (which ranged from 6 to 37 individuals), no genes were significantly associated with affiliative strength, seven genes were associated with affiliative closeness and nine genes were differentially expressed as a function of agonistic indegree

(Tables 3-1 and 3-2; false discovery rate = 10%). Five of the seven closeness-associated genes were up-regulated in marmots with high affiliative closeness (PDZK1, ALPK2, ACTRT3,

DKKL1, TDRD3; Table 3-1; Figure 3-1), whereas two were down-regulated (ADAP2, MARC2;

Table 3-1; Figure 3-1). Seven of nine genes were up-regulated in individuals that received a high degree of agonistic interactions (NCAM1, ATP6V1G3, DNAJC15, OGT, AKTIP, AGK,

DNAJC13; Table 3-2; Figure 3-2) and two were down-regulated (GPX7, PHGDH; Table 3-2;

Figure 3-2). None of the CTRA indicator genes were differentially expressed as a function of any social metric.

Composites of canonical CTRA genes

Of the 53 CTRA genes, we identified 33 homologs in the squirrel genome and 29 of these were detectably expressed in our samples (³ 10 transcripts in 75% of samples; Table 3-3). After

91 separating this gene list into two categories (n = 16 antiviral genes, n = 13 proinflammatory genes; we identified no antibody producing CTRA homolog transcripts), we tested the correlations between the z-scaled inflammatory and antiviral composites against the three social network measures. Here, we also created one affiliative and one agonistic model for each composite score (resulting in 4 total GLMs).

As previously discussed, the density-related CTRA hypothesis predicts that socially integrated individuals (high affiliative strength or closeness) would down-regulate proinflammatory and up-regulate antiviral pathways. Conversely, the adversity-related CTRA hypothesis predicts that bullied individuals should increase a proinflammatory response and suppress antiviral genes. After controlling for covariates, both affiliative closeness and agonistic indegree were significantly associated with a CTRA composite value. Individuals that had low closeness and those that received agonistic interactions from more conspecifics exhibited significantly higher inflammatory composite values, a trend that was consistent with our hypotheses (p = 0.009 and 0.027 respectively; Figure 3-3). Affiliative strength was not associated with either composite score and the antiviral composite did not change as a function of any of the social metrics we tested (Figure 3-3).

Discussion

This study is the first to identify associations between CTRA gene transcription and social metrics in a wild mammal population. In genome-wide discovery analyses, few (n = 16) genes were differentially expressed as a function of the social network measures we examined

(Tables 3-1 and 3-2). Nevertheless, some noteworthy results emerged. Of the affiliative closeness-associated genes (n = 7), PDZK1 was up-regulated in more affiliative marmots. This

92 adaptor protein is involved in a wide range of biological functions, but has specifically been found to be down-regulated in mice experiencing chronic inflammation (Lenzen et al., 2012).

Although Lenzen and colleagues analyzed colonic mucosa expression and we analyzed expression in blood, our results are concordant with the pattern they observed: marmots that exhibited an increased inflammatory response (those with low closeness; Figure 3-3) also down- regulated expression of PDZK1 (Figure 3-1a). Interestingly, ADAP2, a gene that regulates interferon gene transcription and therefore modulates an antiviral defense (Bist et al., 2017; Shu,

Lennemann, Sarkar, Sadovsky, & Coyne, 2015), was found to be significantly down-regulated in marmots with high affiliative closeness. This gene is not included in the list of antiviral CTRA genes, but this finding contradicts the affiliative/density-dependent theory which predicts more socially affiliative individuals would up-regulate genes that defend against viral entry. Several genes with no clear relationship with the immune system were also associated with closeness including ALPK2, a hypothesized tumor suppressing gene (Nishi et al., 2017; Salim et al., 2013;

Yoshida et al., 2012), DKKL1, an acrosome-associated protein important for fertilization in mammals (Kohn, Sztein, Yagi, DePamphilis, & Kaneko, 2010; Zhuang et al., 2011), and two genes of unknown function: ACTRT3 (Tripathi et al., 2017) and TDRD3 (Goulet, Boisvenue,

Mokas, Mazroui, & Côté, 2008; Linder et al., 2008).

Mixed models identified differential expression of nine agonistic-associated genes (Table

3-2). Five of these have no known immune function: carcinoma marker ATP6V1G3 (Shinmura et al., 2015; Tajima et al., 2016), apoptosis regulator AKTIP (Remy & Michnick, 2004), lipid kinase AGK (Bektas et al., 2005), serine synthesizer PHGDH (Possemato et al., 2012), and glycosylating enzyme OGT (Vosseller, Sakabe, Wells, & Hart, 2002; Wells & Hart, 2003); whereas four bullying-associated genes contribute to immune defense. NCAM1 was significantly

93 up-regulated in marmots that were bullied by many conspecifics (Figure 3-2a). NCAM1 (also known as CD56) is a cardinal marker of natural killer (NK) cells and this relationship could reflect NK cell activation or mobilization, as previous research has found NK cells to be consistently mobilized by acute fight-or-flight responses during a variety of physical and psychological stressors (Dopp, Miller, Myers, & Fahey, 2000; Owen, Poulton, Hay, Mohamed-

Ali, & Steptoe, 2003; e.g., Schedlowski et al., 1996). Transient NK cell mobilization into blood occurs when increased blood flow flushes NK cells out of low-flow pools in the spleen and on blood vessel walls into the circulating blood pool. Thus, up-regulation of this gene in blood tissue (via increased NK cell prevalence) is expected when individuals experience more fight-or- flight inducing situations. Furthermore, when NCAM activity was measured in the amygdala of rats exposed to different social conditions, expression was notably higher in individuals reared in isolation (Gilabert-Juan, Moltó, & Nacher, 2012), confirming that this gene may play a role in the response to social stressors. Transcription of heat shock proteins (HSPs) is a highly conserved response to a plethora of stressors such as temperature, hypoxia, and diseases

(Morimoto, 1998). We found two immune-associated HSPs, DNAJC15 (aka MCJ) and

DNAJC13, to be up-regulated in bullied marmots (Figures 3-2c and 3-2g). DNAJC15 is a critical component in the macrophage response to bacterial infection. This protein regulates the production of tumor necrosis factor (TNF) in response to inflammatory stimuli and mice deficient in this gene exhibit resistance to inflammatory insults (Navasa et al., 2015). Similarly,

DNAJC13 expression significantly increases following experimental bacterial infection (Song et al., 2014). In contrast, GPX7, a gene that has been shown to sense and reduce oxidative stress

(Peng et al., 2012; Utomo et al., 2004) was more highly expressed in marmots that were not bullied by many conspecifics. GPX7 is not directly associated with an immune response, but it

94 appears to regulate the activation of several proinflammatory cytokines, chemokines, and interleukins via NF-kB signaling (including many inflammatory CTRA genes; Peng, Hu, Soutto,

Belkhiri, & El-Rifai, 2014) and GPX7-deficient mice can develop several immune disorders

(reviewed by Chen, Wei, Hsu, Su, & Lee, 2016). Taken together, these individual DE gene associations suggest that bullied marmots initiate a cascade of inflammatory pathways in response to physical threats and the potential bacterial infections that may follow an aggressive encounter.

Although none of the individual CTRA genes significantly changed expression as a function of sociality, this pattern is not uncommon. In studies that have detected significant associations between the averaged CTRA composites and social metrics, the individual CTRA transcripts rarely reach statistical significance (e.g., Fredrickson et al., 2015; Kitayama, Akutsu,

Uchida, & Cole, 2016) due to a poorer signal-to-noise ratio in individual gene expression values

(Novak, Sladek, & Hudson, 2002; Tu, Stolovitzky, & Klein, 2002). In contrast, composite scores summarizing the covariation of multiple genes are more likely to reveal true positive associations because they are variance-stabilized by smoothing (via the law of large numbers). Analysis of a priori-defined CTRA gene sets demonstrated partial evidence for both the density-dependent and adversity-dependent hypotheses. We found that an agonistic social environment was associated with increased activity in inflammatory pathways and that an affiliative social environment was associated with decreased inflammatory activity, as predicted. However, we did not find support for the antiviral aspects of our hypotheses. Whereas social isolation and social stress have often been found to associate with reduced basal expression of antiviral immune genes in humans and other controlled systems, none of our social metrics were associated with antiviral gene expression in leukocyte RNA profiles from wild yellow-bellied marmots.

95 It is notable that one affiliative measure we tested (closeness) was associated with both inflammatory CTRA composites and individual gene expression, but that the second affiliative measure we tested (strength) revealed no significant results. Strength reflects the absolute number of interactions between a focal individual and other conspecifics (Barrat et al., 2004), whereas closeness centrality measures the shortest path distance from the focal individual to all other group members (Freeman, 1978). This discrepancy suggests that a marmot’s position in the social network may exert more influence on its sense of social stability compared to the sheer number of affiliative interactions in which an individual engages. Recent work has shown that a marmot’s sense of ‘security’ may be influenced by such indirect social interactions (Mady &

Blumstein in press; Blumstein et al. 2017) and our results confirm that indirect social relationships may play an important role in social physiology as well.

The lack of antiviral activity associated with the social environment differs from patterns observed in humans and captive rodents and primates, but two key ecological differences may explain our results. First, the nature of a marmot’s relationship with conspecifics differs from previously studied systems. Humans and the other species that consistently exhibit improved immunity under positive social experiences (e.g., Steven W. Cole et al., 2007; Hawkley et al.,

2015; Tung et al., 2012) are obligately social, whereas yellow-bellied marmots are facultatively social. Resident males defend females but do not form strong pair bonds with them (Armitage

2014). Female marmots vary in the degree to which they recruit their daughters to their matriline, which is the mechanism underlying social variation in this species (Armitage 2014). Although group living typically provides benefits such as reduced predation risk via dilution (Vine, 1971) and detection (Pulliam, 1973), yellow-bellied marmots experience a number of costs from being amicable. Females with very strong affiliative social relationships are less likely to reproduce

96 (Wey & Blumstein 2012). More social individuals of either sex are more likely to die overwinter

(Yang et al., 2017) and they have significantly shorter life spans overall (Blumstein et al. in revision). Thus, the genomic response to social environmental conditions may be less sensitive in a species whose life history strategy is not continually dependent on strong social bonds.

Second, in contrast to previous research in controlled settings, we evaluated the transcriptional response to varying social circumstances in a natural population. Free-living animals regularly encounter life-threatening challenges, especially if they are a prey species like marmots. Several mammalian and avian carnivores prey on marmots and predation is the main cause of summer mortality (Van Vuren, 2001). Thus, individuals must allocate a proportion of their time to antipredator vigilance (Blumstein, Runyan, et al., 2004). Moreover, marmots with insufficient mass are more likely to die during hibernation (Lenihan & Van Vuren, 1996). It is therefore critical that they accumulate enough fat during the brief active season to maintain energy stores overwinter. Previous research examining the transcriptional response to the social environment occurred in controlled settings where the risks of predation, food acquisition, and mortality in general were minimal. With these threats removed, social relationships likely become a primary source of stress, so a muted physiological response to social stress in this system is unsurprising. In sum, life in the natural environment is heterogeneous and any psychological stress associated with the social environment may be relatively minor compared to the psychological or ecological factors that directly influence individual survival. Future studies in natural systems, where environmental factors influencing gene vary across gradients are required to properly estimate the specific effect of social factors on gene expression.

Results from our investigation of the two immunological pathways we expected to be associated with social phenotype (antiviral and proinflammatory pathways) support these ideas.

97 The absence of increased antiviral activity in highly affiliative individuals suggests that facultatively social marmots do not gain the immunological benefits from strong social relationships that are typically seen in species with more fixed social life history strategies.

Moreover, the antiviral composite scores we observed may be driven by immunological factors we did not record. It would be prudent for future research examining the CTRA response in wild systems to simultaneously measure other immunological parameters such as infection status and antibody prevalence to more thoroughly evaluate the effect of the social environment compared to these challenges. We did, however, observe a significant positive relationship between inflammation and physical attacks by conspecifics in both individual genes (Figure 3-2) and composite scores (Figure 3-3). This finding indicates an evolutionarily conserved inflammatory response to social adversity exists whereas the antiviral response may be more circumstance dependent.

Taken together, our results suggest that the strength of gene regulatory influence may be directed toward agonistic/threat-related inflammatory processes in facultatively social species rather than antiviral costs and benefits as found in humans and primates. Our approach demonstrates the power of using gene expression analysis to test specific hypotheses about the physiological response to stressors in a natural population. Future studies in additional systems are needed to clarify the relationship between these genetic pathways and the social environment in different ecological contexts.

98 Figure 3-1. Seven genes significantly changed expression as a function of affiliative closeness.

Genes in plots a through e were up-regulated in more socially integrated individuals, whereas f and g were down-regulated.

PDZK1 ALPK2 ACTRT3 DKKL1

TDRD3 ADAP2 MARC2

Residual gene expression

Affiliative closeness

99 Figure 3-2. Nine genes were significantly associated with being bullied. Individuals that received agonistic interactions from more individuals exhibited increased expression n of genes in plots a through g and decreased expression of genes in plots h and i.

NCAM1 ATP6V1G3 DNAJC15

OGT AKTIP AGK

Residual gene expression

DNAJC13 GPX7 PHGDH

Agonistic indegree

100 Figure 3-3. Rank-transformed social metrics against standardized expression composites of

CTRA inflammatory and antiviral genes. Individuals that had low affiliative closeness and those that received agonistic interactions from more conspecifics significantly up-regulated CTRA genes involved in inflammation (p < 0.05). Note that plots depict raw relationships between social metrics and CTRA composites, whereas statistics are calculated from the GLMs

1 1 1

0 0 0

−1 −1 −1 Inflammatory composite score beta = 0.004 beta = −0.016 beta = 0.007 p = 0.124 p = 0.009 p = 0.027 0 20 40 60 0 20 40 60 0 20 40 60 Affiliative strength Affiliative closeness Agonistic indegree

2 2 2

1 1 1

0 0 0 Antiviral composite score Antiviral

−1 beta = 0.004 −1 beta = −0.011 −1 beta = 0.0005 p = 0.249 p = 0.189 p = 0.915 0 20 40 60 0 20 40 60 0 20 40 60 Affiliative strength Affiliative closeness Agonistic indegree

101 Table 3-1. List of genes that are significantly differentially expressed (q £ 0.1) as a function of affiliative closeness.

Closeness HGNC Closeness Closeness HGNC description log2 fold symbol p-value q-value change PDZK1 PDZ domain containing 1 0.042 2.31E-05 0.070 ALPK2 alpha kinase 2 0.032 3.72E-05 0.077 ACTRT3 actin related protein T3 0.027 7.21E-05 0.099 DKKL1 dickkopf like acrosomal protein 1 0.026 1.40E-05 0.064 TDRD3 tudor domain containing 3 0.019 4.23E-05 0.077 ADAP2 ArfGAP with dual PH domains 2 -0.062 7.64E-05 0.099 MARC2 mitochondrial amidoxime reducing -0.073 5.83E-10 5.29E-06 component 2

102 Table 3-2. List of genes that are significantly differentially expressed (q £ 0.1) as a function of the degree of agonistic interactions received.

Indegree HGNC Indegree Indegree HGNC description log2 fold symbol p-value q-value change NCAM1 neural cell adhesion molecule 1 0.036 4.26E-05 0.070 ATP6V1G3 ATPase H+ transporting V1 subunit G3 0.022 5.35E-05 0.070 DnaJ heat shock protein family (Hsp40) DNAJC15 0.017 5.01E-05 0.070 member C15 O-linked N-acetylglucosamine OGT 0.015 7.11E-06 0.032 (GlcNAc) transferase AKTIP AKT interacting protein 0.014 7.36E-05 0.074 AGK acylglycerol kinase 0.010 1.12E-05 0.034 DnaJ heat shock protein family (Hsp40) DNAJC13 0.009 6.16E-05 0.070 member C13 GPX7 glutathione peroxidase 7 -0.005 5.68E-05 0.070 PHGDH phosphoglycerate dehydrogenase -0.015 5.12E-07 0.005

103 Table 3-3. Canonical genes associated with a conserved transcriptional response to social adversity (CTRA). We present the 33 CTRA genes that mapped to transcripts in the thirteen- lined ground squirrel genome, with HGNC symbols of those detectably expressed in our samples in bold.

HGNC Immune Evidence of association HGNC description Squirrel Ensembl ID symbol function with social adversity

CXCL8 C-X-C motif chemokine ENSSTOG00000003271 inflammatory Miller et al. 2009; ligand 8 Powell et al. 2013 FOS Fos proto-oncogene, AP-1 ENSSTOG00000003005 inflammatory Fredrickson et al. 2013; transcription factor subunit Powell et al. 2013

FOSB FosB proto-oncogene, AP-1 ENSSTOG00000028949 inflammatory Fredrickson et al. 2013; transcription factor subunit Powell et al. 2013 FOSL1 FOS like 1, AP-1 ENSSTOG00000003296 inflammatory Fredrickson et al. 2013; transcription factor subunit Powell et al. 2013 FOSL2 FOS like 2, AP-1 ENSSTOG00000022839 inflammatory Fredrickson et al. 2013; transcription factor subunit Powell et al. 2013 IFI27L2 interferon alpha inducible ENSSTOG00000005819 antiviral Fredrickson et al. 2013; protein 27 like 2 Cole 2014 IFI30 IFI30, lysosomal thiol ENSSTOG00000001939 antiviral Fredrickson et al. 2013; reductase Cole 2014 IFI35 interferon induced protein 35 ENSSTOG00000000491 antiviral Fredrickson et al. 2013; Cole 2014 IFI44 interferon induced protein 44 ENSSTOG00000013516 antiviral Fredrickson et al. 2013; Cole 2014 IFI44L interferon induced protein 44 ENSSTOG00000026205 antiviral Fredrickson et al. 2013; like Cole 2014 IFIH1 interferon induced with ENSSTOG00000004953 antiviral Fredrickson et al. 2013; helicase C domain 1 Cole 2014 IFIT2 interferon induced protein ENSSTOG00000022564 antiviral Fredrickson et al. 2013; with tetratricopeptide repeats Cole 2014 2 IFIT3 interferon induced protein ENSSTOG00000013100 antiviral Fredrickson et al. 2013; with tetratricopeptide repeats Cole 2014 3 IFIT5 interferon induced protein ENSSTOG00000028876 antiviral Fredrickson et al. 2013; with tetratricopeptide repeats Cole 2014 5 IFITM1 interferon induced ENSSTOG00000025599 antiviral Fredrickson et al. 2013; transmembrane protein 1 Cole 2014 IFITM5 interferon induced ENSSTOG00000022428 antiviral Fredrickson et al. 2013; transmembrane protein 5 Cole 2014

104 IL1A interleukin 1 alpha ENSSTOG00000002285 inflammatory Miller et al. 2009; Fredrickson et al. 2013; Powell et al. 2013 IL1B interleukin 1 beta ENSSTOG00000002292 inflammatory Cole et al. 2007; Fredrickson et al. 2013; Powell et al. 2013 IRF2 interferon regulatory factor 2 ENSSTOG00000003228 antiviral Fredrickson et al. 2013; Runcie et al. 2013 IRF7 interferon regulatory factor 7 ENSSTOG00000005711 antiviral Fredrickson et al. 2013; Snyder-Mackler et al. 2016 IRF8 interferon regulatory factor 8 ENSSTOG00000028363 antiviral Fredrickson et al. 2013; Powell et al. 2013 JUN Jun proto-oncogene, AP-1 ENSSTOG00000028549 inflammatory Fredrickson et al. 2013; transcription factor subunit Powell et al. 2013 MX1 MX dynamin like GTPase 1 ENSSTOG00000012846 antiviral Fredrickson et al. 2013; Powell et al. 2013; Cole 2014 NFKB1 nuclear factor kappa B ENSSTOG00000011445 inflammatory Cole et al. 2007; Miller subunit 1 et al. 2009; Cole et al. 2012; Powell et al. 2013 NFKB2 nuclear factor kappa B ENSSTOG00000002191 inflammatory Cole et al. 2007; Miller subunit 2 et al. 2009; Cole et al. 2012; Powell et al. 2013 OAS1 2'-5'-oligoadenylate ENSSTOG00000024010 antiviral Fredrickson et al. 2013; synthetase 1 Powell et al. 2013; Cole 2014 OAS3 2'-5'-oligoadenylate ENSSTOG00000003957 antiviral Fredrickson et al. 2013; synthetase 3 Powell et al. 2013; Cole 2014 PTGS1 prostaglandin-endoperoxide ENSSTOG00000001672 inflammatory Fredrickson et al. 2013; synthase 1 Powell et al. 2013 PTGS2 prostaglandin-endoperoxide ENSSTOG00000009613 inflammatory Cole et al. 2007; Cole et synthase 2 al. 2011; Tung et al. 2012; Fredrickson et al. 2013; Powell et al. 2013 REL REL proto-oncogene, NF-kB ENSSTOG00000005785 inflammatory Cole et al. 2007; subunit Fredrickson et al. 2013 RELA RELA proto-oncogene, NF- ENSSTOG00000027336 inflammatory Cole et al. 2007; kB subunit Fredrickson et al. 2013 RELB RELB proto-oncogene, NF- ENSSTOG00000009056 inflammatory Cole et al. 2007; kB subunit Fredrickson et al. 2013 TNF tumor necrosis factor ENSSTOG00000012884 inflammatory Audet et al. 2011; Fredrickson et al. 2013; Powell et al. 2013

105 Supplemental results

Figure 3-S1. Rank-transformed social network measures against neutrophil counts in blood. Cell counts were not significantly associated with social phenotypes.

80 80 80 beta=−0.09, p=0.44 beta=−0.16, p=0.15 beta=−0.02, p=0.89

60 60 60

40 40 40 Neutrophil count 20 20 20

0 0 0 0 20 40 60 0 20 40 60 20 40 60 Affiliative strength Agonistic indegree Affiliative closeness

80 80 80 beta=−0.09, p=0.44 beta=−0.16, p=0.15 beta=−0.02, p=0.89

60 60 60

40 40 40

Neutrophil count 20 20 20

0 0 0 0 20 40 60 0 20 40 60 20 40 60 Affiliative strength Affiliative closeness Agonistic indegree

106 Table 3-S1. Loadings resulting from principle component analysis of affiliative interactions within marmot colony-years. Bold values indicate the social attributes used in mixed effects and general linear models.

Social network measure Principal component 1 Principal component 2

Degree -0.554 0.282

Strength -0.564 -0.083

Closeness -0.195 -0.622

Betweenness 0.049 0.417

Embeddedness -0.541 0.259

Eigenvector centrality -0.202 -0.533

Variance explained (%) 40.8 34

PC eigenvalue 2.45 2.04

107 Table 3-S2. Loadings resulting from principle component analysis of agonistic interactions within marmot colony-years. Bold values indicate the social attributes used in mixed effects and general linear models.

Social network measure Principal component 1 Principal component 2

In-degree -0.657 0.231

In-strength -0.643 0.322

In-closeness -0.392 -0.918

Variance explained (%) 68.5 27.1

PC eigenvalue 2.06 0.81

108 Bibliography

Akdemir D, Godfrey OU (2015) EMMREML: Fitting Mixed Models with Known Covariance

Structures.

Amit I, Garber M, Chevrier N et al. (2009) Unbiased reconstruction of a mammalian

transcriptional network mediating pathogen responses. Science, 326, 257–263.

Anders S, Pyl PT, Huber W (2015) HTSeq-A Python framework to work with high-throughput

sequencing data. Bioinformatics, 31, 166–169.

Antoni MH, Bouchard LC, Jacobs JM et al. (2016) Stress management, leukocyte transcriptional

changes and breast cancer recurrence in a randomized trial: An exploratory analysis.

Psychoneuroendocrinology, 74, 269–277.

Armitage KB (2014) Marmot Biology: Sociality, Individual Fitness, and Population Dynamics.

Cambridge University Press, Cambridge.

Audet M-C, Jacobson-Pick S, Wann BP, Anisman H (2011) Social defeat promotes specific

cytokine variations within the prefrontal cortex upon subsequent aggressive or endotoxin

challenges. Brain, Behavior, and Immunity, 25, 1197–1205.

Avitsur R, Hunzeker J, Sheridan JF (2006) Role of early stress in the individual differences in

host response to viral infection. Brain, Behavior, and Immunity, 20, 339–348.

Avitsur R, Powell N, Padgett DA, Sheridan JF (2009) Social interactions, stress, and immunity.

Immunology and Allergy Clinics of North America, 29, 285–293.

Barnes PJ, Chung KF, Page CP (1988) Inflammatory mediators and asthma. Pharmacological

Reviews, 40, 49–84.

Barrat A, Barthelemy M, Pastor-Satorras R, Vespignani A (2004) The architecture of complex

weighted networks. Proceedings of the National Academy of Sciences of the United States

109 of America, 101, 3747–3752.

Bektas M, Payne SG, Liu H et al. (2005) A novel acylglycerol kinase that produces

lysophosphatidic acid modulates cross talk with EGFR in prostate cancer cells. Journal of

Cell Biology, 169, 801–811.

Bist P, Kim SS-Y, Pulloor NK et al. (2017) ArfGAP domain-containing protein 2 (ADAP2)

integrates upstream and downstream modules of RIG-I signaling and facilitates type I

interferon production. Molecular and Cellular Biology, 37, e00537-16.

Blumstein DT (2009) Social effects on emergence from hibernation in yellow-bellied marmots.

Journal of Mammalogy, 90, 1184–1187.

Blumstein DT (2013) Yellow-bellied marmots: insights from an emergent view of sociality.

Philosophical Transactions of the Royal Society B, 368, 20120349.

Blumstein DT, Fuong H, Palmer E (2017) Social security: social relationship strength and

connectedness influence how marmots respond to alarm calls. Behavioral Ecology and

Sociobiology, 71, 145.

Blumstein DT, Lea AJ, Olson LE, Martin JGA (2010) Heritability of anti-predatory traits:

vigilance and locomotor performance in marmots. Journal of Evolutionary Biology, 23,

879–87.

Blumstein DT, Runyan A, Seymour M et al. (2004) Locomotor ability and wariness in yellow-

bellied marmots. Ethology, 110, 615–634.

Blumstein DT, Wey TW, Tang K (2009) A test of the social cohesion hypothesis: Interactive

female marmots remain at home. Proceedings of the Royal Society of London B: Biological

Sciences, 276, 3007–12.

Bonacich P (1987) Power and centrality: A family of measures. American Journal of Sociology,

110 92, 1170–1182.

Chen YI, Wei PC, Hsu JL, Su FY, Lee WH (2016) NPGPx (GPx7): A novel oxidative stress

sensor/transmitter with multiple roles in redox homeostasis. American Journal of

Translational Research, 8, 1626–1640.

Cohen S, Doyle WJ, Skoner DP, Rabin BS, Gwaltney JM (1997) Social ties and susceptibility to

the common cold. JAMA, 277, 1940–1944.

Cohen S, Janicki-Deverts D, Miller GE (2007) Psychological stress and disease. JAMA, 298,

1685–1687.

Cole SW (2013) Social regulation of human gene expression: Mechanisms and implications for

public health. American Journal of Public Health, 103, S84-92.

Cole SW (2014) Human social genomics. PLoS Genetics, 10, 4–10.

Cole SW, Conti G, Arevalo JMG et al. (2012) Transcriptional modulation of the developing

immune system by early life social adversity. Proceedings of the National Academy of

Sciences of the United States of America, 109, 20578–20583.

Cole SW, Hawkley LC, Arevalo JM et al. (2007) Social regulation of gene expression in human

leukocytes. Genome Biology, 8, R189.

Cole SW, Hawkley LC, Arevalo JMG, Cacioppo JT (2011) Transcript origin analysis identifies

antigen-presenting cells as primary targets of socially regulated gene expression in

leukocytes. Proceedings of the National Academy of Sciences of the United States of

America, 108, 3080–3085.

Csárdi G, Nepusz T (2006) The igraph software package for complex network research.

InterJournal, Complex Systems, 1695, 1–9.

Decker T, Müller M, Stockinger S (2005) The Yin and Yang of type I interferon activity in

111 bacterial infection. Nature Reviews Immunology, 5, 675–687.

Dopp JM, Miller GE, Myers HF, Fahey JL (2000) Increased natural killer-cell mobilization and

cytotoxicity during marital conflict. Brain, Behavior, and Immunity, 14, 10–26.

Eady JJ, Wortley GM, Wormstone YM et al. (2005) Variation in gene expression profiles of

peripheral blood mononuclear cells from healthy volunteers. Physiological Genomics, 22,

402–411.

Fredrickson BL, Grewen KM, Algoe SB et al. (2015) Psychological well-being and the human

conserved transcriptional response to adversity. Plos One, 10, e0121839.

Fredrickson BL, Grewen KM, Coffey KA et al. (2013) A functional genomic perspective on

human well-being. Proceedings of the National Academy of Sciences of the United States of

America, 110, 13684–9.

Freeman LC (1978) Centrality in social networks. Social Networks, 1, 215–239.

Gilabert-Juan J, Moltó MD, Nacher J (2012) Post-weaning social isolation rearing influences the

expression of molecules related to inhibitory neurotransmission and structural plasticity in

the amygdala of adult rats. Brain Research, 1448, 129–136.

Goulet I, Boisvenue S, Mokas S, Mazroui R, Côté J (2008) TDRD3, a novel Tudor domain-

containing protein, localizes to cytoplasmic stress granules. Human Molecular Genetics, 17,

3055–3074.

Gray KA, Yates B, Seal RL, Wright MW, Bruford EA (2014) Genenames.org: The HGNC

resources in 2015. Nucleic Acids Research, 43, D1079–D1085.

Hawkley LC, Capitanio JP, Hawkley LC (2015) Perceived social isolation, evolutionary fitness

and health outcomes: a lifespan approach. Philosophical transactions of the Royal Society

of London. Series B, Biological sciences, 370, 20140114.

112 Holt-Lunstad J, Smith TB, Layton JB (2010) Social relationships and mortality risk: A meta-

analytic review. PLoS Medicine, 7.

Horvath S (Ed.) (2011) Weighted Network Analysis: Applications in Genomics and Systems

Biology. Springer, New York.

Jin R, Yang G, Li G (2010) Inflammatory mechanisms in ischemic stroke: Role of inflammatory

cells. Journal of Leukocyte Biology, 87, 779–789.

Kim D, Pertea G, Trapnell C et al. (2013) TopHat2: accurate alignment of transcriptomes in the

presence of insertions, deletions and gene fusions. Genome Biology, 14, R36.

Kitayama S, Akutsu S, Uchida Y, Cole SW (2016) Work, meaning, and gene regulation: findings

from a Japanese information technology firm. Psychoneuroendocrinology, 72, 175–181.

Knight JM, Rizzo JD, Logan BR et al. (2016) Low socioeconomic status, adverse gene

expression profiles, and clinical outcomes in hematopoietic stem cell transplant recipients.

Clinical Cancer Research, 74, 269–277.

Kohn MJ, Sztein J, Yagi R, DePamphilis ML, Kaneko KJ (2010) The acrosomal protein

Dickkopf-like 1 (DKKL1) facilitates sperm penetration of the zona pellucida. Fertility and

Sterility, 93, 1533–1537.

Korytár T, Nipkow M, Altmann S et al. (2016) Adverse husbandry of maraena whitefish directs

the immune system to increase mobilization of myeloid cells and proinflammatory

responses. Frontiers in Immunology, 7, 1–14.

Krueger F (2015) Trim Galore: A wrapper tool around Cutadapt and fastQC to consistently apply

quality and adapter trimming to FastQ files.

Langfelder P, Horvath S (2008) WGCNA: an R package for weighted correlation network

analysis. BMC Bioinformatics, 9, 559.

113 Law CW, Chen Y, Shi W, Smyth GK (2014) voom: precision weights unlock linear model

analysis tools for RNA-seq read counts. Genome Biology, 15, R29.

Lea AJ, Blumstein DT, Wey TW, Martin JGA (2010) Heritable victimization and the benefits of

agonistic relationships. Proceedings of the National Academy of Sciences of the United

States of America, 107, 21587–92.

Leek JT, Scharpf RB, Bravo HC et al. (2010) Tackling the widespread and critical impact of

batch effects in high-throughput data. Nature Reviews Genetics, 11, 733–739.

Lenihan C, Van Vuren D (1996) Growth and survival of juvenile yellow-bellied marmots

(Marmota flaviventris). Canadian Journal of Zoology, 74, 297–302.

Lenzen H, Lünnemann M, Bleich A et al. (2012) Downregulation of the NHE3-binding PDZ-

adaptor protein PDZK1 expression during cytokine-induced inflammation in interleukin-10-

deficient mice. PLoS ONE, 7, e40657.

Libby P (2006) Inflammation and cardiovascular disease mechanisms. American Journal of

Clinical Nutrition, 83, 456–460.

Linder B, Plöttner O, Kroiss M et al. (2008) Tdrd3 is a novel stress granule-associated protein

interacting with the Fragile-X syndrome protein FMRP. Human Molecular Genetics, 17,

3236–3246.

Lindsberg PJ, Grau AJ (2003) Inflammation and infections as risk factors for ischemic stroke.

Stroke, 34, 2518–2532.

Lopez J, Wey TW, Blumstein DT (2013) Patterns of parasite prevalence and individual infection

in yellow-bellied marmots. Journal of Zoology, 291, 296–303.

Mady RP, Blumstein DT Social security: are socially connected individuals less vigilant? Animal

Behaviour.

114 Miller GE, Chen E, Fok AK et al. (2009) Low early-life social class leaves a biological residue

manifested by decreased glucocorticoid and increased proinflammatory signaling.

Proceedings of the National Academy of Sciences of the United States of America, 106,

14716–21.

Miller GE, Chen E, Sze J et al. (2008) A Functional Genomic Fingerprint of Chronic Stress in

Humans: Blunted Glucocorticoid and Increased NF-κB Signaling. Biological Psychiatry,

64, 266–272.

Moody J, White DR (2003) Structural cohesion and embeddedness: A hierarchical concept of

social groups. American Sociological Review, 68, 103–127.

Morimoto RI (1998) Regulation of the heat-shock transcriptional response: cross talk between a

family of heat-shock factors, molecular chaperones, and negative regulators. Genes &

Development, 12, 3788–3796.

Navasa N, Martín I, Iglesias-Pedraz JM et al. (2015) Regulation of oxidative stress by

methylation-controlled J protein controls macrophage responses to inflammatory insults.

Journal of Infectious Diseases, 211, 135–145.

Nishi K, Luo H, Nakabayashi K et al. (2017) An alpha-kinase 2 gene variant disrupts

filamentous actin localization in the surface cells of colorectal cancer spheroids. Anticancer

Research, 37, 3855–3862.

Novak JP, Sladek R, Hudson TJ (2002) Characterization of variability in large-scale gene

expression data: implications for study design. Genomics, 79, 104–13.

Nunn CL, Altizer SM (2006) Infectious Diseases in Primates: Behavior, Ecology and Evolution.

Oxford University Press, Oxford.

Owen N, Poulton T, Hay FC, Mohamed-Ali V, Steptoe A (2003) Socioeconomic status, C-

115 reactive protein, immune factors, and responses to acute mental stress. Brain, Behavior, and

Immunity, 17, 286–295.

Palmer C, Diehn M, Alizadeh AA, Brown PO (2006) Cell-type specific gene expression profiles

of leukocytes in human peripheral blood. BMC Genomics, 7, 115.

Peng D, Belkhiri A, Hu T et al. (2012) Glutathione peroxidase 7 protects against oxidative DNA

damage in oesophageal cells. Gut, 61, 1250–60.

Peng DF, Hu TL, Soutto M, Belkhiri A, El-Rifai W (2014) Loss of glutathione peroxidase 7

promotes TNF-α-induced NF-κB activation in Barrett’s carcinogenesis. Carcinogenesis, 35,

1620–1628.

Possemato R, Marks KM, Shaul YD et al. (2012) Functional genomics reveals serine synthesis is

essential in PHGDH-amplified breast cancer. Nature, 476, 346–350.

Powell ND, Sloan EK, Bailey MT et al. (2013) Social stress up-regulates inflammatory gene

expression in the leukocyte transcriptome via β-adrenergic induction of myelopoiesis.

Proceedings of the National Academy of Sciences of the United States of America, 110,

16574–9.

Pulliam HR (1973) On the advantages of flocking. Journal of Theoretical Biology, 38, 419–422.

R Core Team (2016) R: A language and environment for statistical computing.

Raposo RAS, Abdel-Mohsen M, Bilska M et al. (2013) Effects of cellular activation on anti-

HIV-1 restriction factor expression profile in primary cells. Journal of Virology, 87, 11924–

11929.

Remy I, Michnick SW (2004) Regulation of apoptosis by the Ft1 protein, a new modulator of

protein kinase B/Akt. Molecular and Cellular Biology, 24, 1493–1504.

Ritchie ME, Phipson B, Wu D et al. (2015) limma powers differential expression analyses for

116 RNA-sequencing and microarray studies. Nucleic Acids Research, 43, e47.

Romero IG, Pai AA, Tung J, Gilad Y (2014) RNA-seq: impact of RNA degradation on transcript

quantification. BMC Biology, 12, 42.

Salim H, Arvanitis A, de Petris L et al. (2013) miRNA-214 is related to invasiveness of human

non-small cell lung cancer and directly regulates alpha protein kinase 2 expression. Genes,

Chromosomes & Cancer, 52, 895–911.

Schedlowski M, Hosch W, Oberbeck R et al. (1996) Catecholamines modulate human NK cell

circulation and function via spleen-independent beta 2-adrenergic mechanisms. Journal of

Immunology, 156, 93–99.

Shinmura K, Igarashi H, Kato H et al. (2015) BSND and ATP6V1G3: novel

immunohistochemical markers for chromophobe renal cell carcinoma. Medicine, 94, e989.

Shu Q, Lennemann NJ, Sarkar SN, Sadovsky Y, Coyne CB (2015) ADAP2 is an interferon

stimulated gene that restricts RNA virus entry. PLoS Pathogens, 11, 1–23.

Smedley D, Haider S, Durinck S et al. (2015) The BioMart community portal: An innovative

alternative to large, centralized data repositories. Nucleic Acids Research, 43, W589–W598.

Song L, Zhang J, Li C et al. (2014) Genome-wide identification of Hsp40 genes in channel

catfish and their regulated expression after bacterial infection. PLoS ONE, 9, 1–20.

Stark JL, Avitsur R, Hunzeker J, Padgett DA, Sheridan JF (2002) Interleukin-6 and the

development of social disruption-induced glucocorticoid resistance. Journal of

Neuroimmunology, 124, 9–15.

Steptoe A, Shankar A, Demakakos P, Wardle J (2013) Social isolation, loneliness, and all-cause

mortality in older men and women. Proceedings of the National Academy of Sciences of the

United States of America, 110, 5797–5801.

117 Storey JD, Tibshirani R (2003) Statistical significance for genomewide studies. Proceedings of

the National Academy of Sciences of the United States of America, 100, 9440–5.

Tajima S, Mori Y, Mukaiyama Y et al. (2016) Recently proposed oncocytic variant of

chromophobe renal cell carcinoma (RCC) expressing BSND and ATP6V1G3 : two new

immunohistochemical markers differentiating chromophobe RCC from other RCC

subtypes. International Journal of Clinical Experimental Pathology, 9, 2374–2381.

Trapnell C, Pachter L, Salzberg SL (2009) TopHat: Discovering splice junctions with RNA-Seq.

Bioinformatics, 25, 1105–11.

Tripathi SK, Chen Z, Larjo A et al. (2017) Genome-wide analysis of STAT3-mediated

transcription during early human Th17 cell differentiation. Cell Reports, 19, 1888–1901.

Tu Y, Stolovitzky G, Klein U (2002) Quantitative noise analysis for gene expression microarray

experiments. Proceedings of the National Academy of Sciences of the United States of

America, 99, 14031–6.

Tung J, Barreiro LB, Johnson ZP et al. (2012) Social environment is associated with gene

regulatory variation in the rhesus macaque immune system. Proceedings of the National

Academy of Sciences of the United States of America, 109, 6490–5.

Tung J, Zhou X, Alberts SC, Stephens M, Gilad Y (2015) The genetic architecture of gene

expression levels in wild baboons. eLife, 4, 1–22.

Utomo A, Jiang X, Furuta S et al. (2004) Identification of a novel putative non-selenocysteine

containing phospholipid hydroperoxide glutathione peroxidase (NPGPx) essential for

alleviating oxidative stress generated from polyunsaturated fatty acids in breast cancer cells.

Journal of Biological Chemistry, 279, 43522–43529.

Vine I (1971) Risk of visual detection and pursuit by a predator and the selective advantage of

118 flocking behaviour. Journal of Theoretical Biology, 30, 405–422.

Vosseller K, Sakabe K, Wells L, Hart GW (2002) Diverse regulation of protein function by O-

GlcNAc: a nuclear and cytoplasmic carbohydrate post-translational modification. Current

Opinion in Chemical Biology, 6, 851–857.

Van Vuren DH (2001) Predation on yellow-bellied marmots (Marmota flaviventris). The

American Midland Naturalist, 145, 94–100.

Wang J (2011) Coancestry: A program for simulating, estimating and analysing relatedness and

inbreeding coefficients. Molecular Ecology Resources, 11, 141–145.

Wells L, Hart GW (2003) O-GlcNAc turns twenty: functional implications for post-translational

modification of nuclear and cytosolic proteins with a sugar. FEBS Letters, 546, 154–158.

Wey TW, Blumstein DT (2012) Social attributes and associated performance measures in

marmots: Bigger male bullies and weakly affiliating females have higher annual

reproductive success. Behavioral Ecology and Sociobiology, 66, 1075–1085.

Wey TW, Blumstein DT, Shen W, Jordán F (2008) Social network analysis of animal behaviour:

A promising tool for the study of sociality. Animal Behaviour, 75, 333–344.

Whitehead H (2009) SOCPROG programs: analysing animal social structures. Behavioral

Ecology and Sociobiology, 63, 765–778.

Wright FA, Sullivan PF, Brooks AI et al. (2014) Heritability and genomics of gene expression in

peripheral blood. Nature Genetics, 46, 430–437.

Yang WJ, Maldonado-Chaparro AA, Blumstein DT (2017) A cost of being amicable in a

hibernating mammal. Behavioral Ecology, 28, 11–19.

Yoshida Y, Tsunoda T, Doi K et al. (2012) ALPK2 is Crucial for luminal apoptosis and DNA

repair-related gene expression in a three-dimensional colonic-crypt model. Anticancer

119 Research, 32, 2301–2308.

Zhuang XJ, Hou XJ, Liao SY et al. (2011) SLXL1, a novel acrosomal protein, interacts with

DKKL1 and is involved in fertilization in mice. PLoS ONE, 6, e20866.

120

Chapter 4:

Glucocorticoid signaling and genes involved in cellular repair and homeostasis are up-

regulated in leukocytes of wild rodents exposed to chronic predation pressure

121 Abstract

Psychological stressors induce many physiological responses via the sympathetic nervous system and the hypothalamo-pituitary-adrenal (HPA) axis. Recent studies have shown that the psychological stress induced by exposure to predators has complex effects on prey physiology, and may modify gene expression. Predator pressure influences transcription of a diverse set of functional pathways in laboratory settings, but transcriptional responses to an increased risk of predation in the wild are not well known. Here, we used RNA-sequencing to investigate the leukocyte transcriptome response to chronic predator pressure in a well-studied population of wild yellow-bellied marmots (Marmota flaviventer). We assessed the expression response to this stressor in three ways by: 1) identifying differentially expressed individual genes across the genome; 2) assessing whether differentially expressed genes were statistically over-represented by functional categories; and 3) testing for transcription factor activity that may mediate observed gene expression differences. We found 349 individual genes responded to chronic predator pressure. Gene ontology analysis revealed that these differentially expressed genes were enriched for the cellular response to stress, metabolism, and protein transport. Transcription factor analysis indicated marmots living in high predation areas significantly increased glucocorticoid signaling gene activity. These results support the canonical expectation that cells mobilize HPA signaling and cellular repair in response to predator-induced stress. Our results confirm that the physiological response to stress is complex, initiating multiple transcriptional processes and pathways, and that this response can be detected in samples collected in a natural environment.

122 Introduction

Psychological stressors can induce dramatic individual physiological and behavioral responses. Once a threatening stimulus is perceived by the sympathetic nervous system, it can redirect cognitive and physical attention toward the threat. Specifically, heart rate and mental awareness increase, and energy from glucose is mobilized from fat. The hypothalamo-pituitary- adrenal (HPA) axis also acts by releasing glucocorticoid hormones (GCs). These well-studied stressor-induced hormones are key to mobilizing and redirecting stored energy away from long- term investments such as growth and reproduction toward more immediate needs including energy metabolism and locomotor activity. Vertebrates elevate GCs in response to a wide variety of stressful situations such as food shortage (M. Clinchy, Zanette, Boonstra, Wingfield, & Smith,

2004; Kitaysky et al., 2010; Lynn, Breuner, & Wingfield, 2003), an inhospitable social environment (Blumstein, Keeley, & Smith, 2016; reviewed by Creel, Dantzer, Goymann, &

Rubenstein, 2013; Sands & Creel, 2004; Young et al., 2006), and increased predator exposure and/or abundance (Arlet & Isbell, 2009; M. Clinchy et al., 2004; Creel, Christianson, & Schuette,

2013; Van Meter et al., 2009).

These physiological responses may be largely mediated by changes in gene expression.

Stress puts cells at risk and transcriptional mechanisms are critical in preventing cellular damage and neutralizing the effects of the stressor to reestablish homeostasis. For example, genes involved in regulating the transcription of heat shock proteins alter expression in response to thermal stress in mussels (Gracey et al., 2008; Lockwood, Sanders, & Somero, 2010), flies

(Colinet et al., 2010), yeast (Causton et al., 2001), and guppies (Fraser, Weadick, Janowitz,

Rodd, & Hughes, 2011). Pro-inflammatory genes are frequently up-regulated during physical stressors such as electrical shock (Blandino et al., 2009; Nguyen et al., 1998; Yamano et al.,

123 2004), swim tests (Cullinan, Herman, Battaglia, Akil, & Watson, 1995), and exercise-induced stress (Goebel, Mills, Irwin, & Ziegler, 2000; Walsh et al., 2001). Restraint can have a variety of functional effects. Immobilization led to increased inflammatory activity in rat brains (Cullinan et al., 1995; Minami et al., 1991), transport activated genes involved in carbohydrate metabolism and cytoskeleton development in chicken muscle (Hazard et al., 2011), and actin and cytoskeleton-related genes altered expression during confinement of wild canids (Kennerly et al.,

2008). Transcriptional activity appears to be a major component in the adaptive response to psychological stress, but the functional categories of genes involved vary according to the tissue examined and the context of the stressor.

Predation is a powerful selective force and the risk of predation is a psychological stressor. It is becoming increasingly apparent that even without directly killing prey, the presence of predators affects population dynamics, behavior, and physiology of prey indirectly (Michael

Clinchy, Sheriff, & Zanette, 2013; Martin, 2011; reviewed by Preisser, Bolnick, & Benard,

2005). The physiological responses are evident on the cellular level as gene expression consistently responds to predator exposure under experimental conditions. Exposure to a predator (or merely its scent) induces activation of a wide range of molecular pathways, including heat shock proteins (Pauwels, Stoks, & De Meester, 2005; Pijanowska & Kloc, 2004), cytoskeleton organization (Pijanowska & Kloc, 2004), a pro-inflammatory response (Su, Xie,

Xin, Zhao, & Li, 2011), antigen processing (Sanogo et al., 2011), and visual perception (Sanogo et al., 2011). However, very few studies have assessed the transcriptomic response to predators in a natural setting. To our knowledge, Lavergne et al. (2014) conducted the only study examining gene expression changes associated with predators in the wild. The authors found that brain tissue taken from snowshoe hares (Lepus americanus) during periods of relatively high

124 predator abundance (Lynx canadensis) significantly altered expression of genes involved in metabolic processes, hormone responses, and immune function. This intriguing study revealed for the first time that the molecular response to predation risk can be observed in a natural setting. However, the genomic response to predation risk is likely complex and not uniform across species and tissues. Here we asked whether genes expressed in blood, arguably a less invasive measure than brain tissue and one that permits large sample sizes, are differentially expressed as a function of existence under different degrees of predator presence and threat.

Yellow-bellied marmots (Marmota flaviventer) in the vicinity of the Rocky Mountain

Biological Laboratory (RMBL) have been studied continuously since 1962, providing an ideal system in which to assess the molecular pathways involved in predation. Marmots are prey of several mammalian and avian predators and predation is a constant threat. In this population,

98% of summer mortality events are due to predation (Van Vuren, 2001) and colonies at RMBL experience different degrees of exposure to predators. This long-term study has led to significant insights into the direct and indirect effects of predators on this species. Ecologically, the persistence of a marmot colony is better predicted by predation-related attributes such as visibility and underground protection than food-related factors (Blumstein, Ozgul, Yovovich,

Van Vuren, & Armitage, 2006). Behaviorally, marmots have evolved a rich repertoire of anti- predator tactics to minimize risk. This species can identify potential predators by sight

(Blumstein, Ferando, & Stankowich, 2009), sound (Blumstein, Cooley, Winternitz, & Daniel,

2008), and smell (Blumstein, Barrow, & Luterra, 2008), and they communicate predation risk with others using alarm calls (Blumstein, 2007). Marmots allocate a proportion of their time to being vigilant and adjust this proportion depending on context (Blumstein, Runyan, et al., 2004).

Considering these dramatic responses to the risk of predation, it is not surprising that

125 glucocorticoid levels (GCs) are positively correlated with the degree of predator presence at a colony (Blumstein et al., 2016; Monclús et al., 2011). Thus, marmots in high predation areas appear to be under chronic stress whereas those that experience low predation pressure are not.

To better understand the many physiological pathways that are likely activated due to the psychological stress induced by predation risk, we quantified genome-wide transcription levels in blood from yellow-bellied marmots. Whole transcriptome profiling (or RNA-seq) is a valuable tool for assessing cellular physiology because this technique can identify a molecular response to environmental stimuli on many levels, including individual genes, coordinated gene networks, and activated regulatory pathways. We sequenced blood RNA because collection is minimally invasive and unlike more function-specific tissues, it can be used to explore a variety of physiological functions. Since red blood cells are non-nucleated, it is the white blood cells or leukocytes that produce the dominant transcripts. Leukocytes defend the body against foreign antigens and thus, they are excellent cells for studying immune function. In addition, blood shares approximately 80% of mRNA with other tissues (Liew et al., 2006) and it has been shown to be an ideal surrogate for multiple tissue types (Davies et al., 2009; Kohane & Valtchinov,

2012; Rudkowska et al., 2011; e.g., Sullivan et al., 2006). For example, although liver is ideal for studying metabolism (Rui, 2014), blood is frequently used to perform many basic metabolic tests and genes involved in key metabolic pathways are expressed in blood (e.g., Ghosh et al., 2010;

Rudkowska et al., 2011). Thus, blood, although not a perfect surrogate for evaluating gene expression across all functions, provides meaningful information and its collection is less invasive.

Our goal was to compare the transcriptomic response of leukocytes in marmots that experienced high predation pressure to those that experienced low predation pressure as

126 quantified by the frequency of observed predators. To accomplish our aim, we used three methods that represent different levels of cellular response. First, we tested for significant differential expression of individual genes using a linear mixed model approach. Second, we tested for enrichment of functional pathways in these genes using gene ontology. Third, we identified the upstream transcription control pathways mediating these changes in gene expression. Based on previous research examining the molecular responses to predation, we expected four pathways to be up-regulated by individuals experiencing chronic predation: glucocorticoid signaling, inflammation, metabolism, and heat shock proteins.

Materials and methods

Study subjects and trapping procedures

During the summers of 2013 to 2015, we studied free-living yellow-bellied marmots in and around the Rocky Mountain Biological Laboratory (RMBL) in Gothic, Colorado, U.S.A.

Yellow-bellied marmots are facultatively social, sciurid rodents that are active from approximately mid-May to mid-September and hibernate the remainder of the year (Blumstein,

Im, et al., 2004). Marmots were trapped bi-weekly throughout the active season using

Tomahawk live traps and affixed with numbered ear tags for permanent identification and unique fur marks to facilitate individual identification from afar (Blumstein, 2013).

Using predator abundance as a proxy for predator pressure

Predator presence was calculated using the frequency of daily predator sightings at a colony divided by the number of observation sessions at that colony for each year 2013 to 2015.

In other words, for every day that we observed a marmot colony, we applied a binary score (0 or

127 1) indicating whether a predator was seen. Species that predate marmots at RMBL include the red fox (Vulpes vulpes), coyote (Canis latrans), badger (Taxidea taxus), black bear (Ursus americanus), red-tailed hawk (Buteo jamaicensis), and golden eagle (Aquila chrysaetos).

Quantifying predator abundance in this way is a relatively conservative predation measure because it eliminates the possibility of inflating predator pressure if the same individual predator is observed multiple times in one day. We then divided the number of days predators were observed by the total number of observation days at that colony, for a proportional value ranging from 0 to 1 indicating predator abundance for each colony-year. We limited observations to the early season (mid-April through the end of June) because after this period, vegetation grows rapidly and terrestrial predators are harder to observe. Each year was analyzed separately. We then calculated the median predator index across all colony-years. Colony-years with values below the median were considered low predator pressure areas, whereas colony-years with values above the median were considered high predator pressure areas. All observers were trained to identify both aerial and terrestrial predators at study sites.

RNA sampling, library preparation, and sequencing

During trapping, we transferred marmots to cloth handling bags and collected 1 mL whole blood, preserved in 2.5 mL PAXgeneTM Blood RNA solution (PreAnalytiX, Qiagen,

Hombrechtikon, Switzerland). We incubated, froze, and extracted samples according to

PAXgeneTM manufacturers’ instructions. We removed globin transcripts using the rodent

GLOBINclearTM kit (Ambion, ThermoFisher Scientific, Waltham, MA) and assessed RNA quality with an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA). To preserve statistical power, we excluded samples with RIN < 4 and globally corrected for RNA

128 degradation by regressing any effect of RIN (as suggested by Romero et al., 2014; details below). We prepared cDNA libraries using a TruSeq Library Prep Kit v2 (Illumina, Madison,

WI), quantified cDNA with the KAPA SYBR® Fast qPCR library quantification kit (Kapa

Biosystems Inc., Wilmington, MA), and pooled 8-10 samples per lane. We used Illumina

HiSeq2000 (2013 samples) and HiSeq4000 (2014-2015) platforms at the Vincent J. Coates

Genomics Sequencing Laboratory, UC Berkeley, USA, to create single-end 100 base pair (bp) sequences. For this study, we focused on sampling and sequencing yearling marmots to control for any effect of age on gene expression.

Read mapping, expression quantification and outlier removal

We used Trim Galore! (Krueger, 2015) to remove adapters, short (< 20 bp), and low quality reads (Phred score < 20). Resulting reads were mapped to the thirteen-lined ground squirrel (Ictidomys tridecemlineatus) genome (spetri2, GenBank Assembly ID

GCA_000236235.1) using TopHat2 v.2.1.0 (Kim et al., 2013; Trapnell et al., 2009a). These species diverged approximately 8.6 million years ago (Bininda-Emonds et al., 2007; Soria-

Carrasco & Castresana, 2012) and exhibit sequence divergence of 13.2% (Thomas & Martin,

1993). To maximize read mapping, we allowed eight mismatches, a 10 bp gap length, and a 20 bp edit distance between reads and the reference genome. We used the ‘union’ mode of HT-Seq

(Anders et al., 2015) to quantify transcript abundance of uniquely mapped reads. Using the

ENSEMBL thirteen-lined ground squirrel genome as a reference, we acquired HGNC (HUGO

Gene Nomenclature Committee; K. A. Gray et al., 2014) gene symbol information for squirrel transcripts in BiomaRt (Smedley et al., 2015). All downstream analyses were carried out using R version 3.3.1 (R Core Team, 2016). We filtered the dataset to include protein-coding genes with

129 at least 10 reads in 75% of libraries and transformed count data for linear modeling. Values were normalized according to sequencing depth, gene length, and mean variance across genes using the ‘voom’ function of the LIMMA package (Law et al., 2014; Ritchie et al., 2015). To identify outliers, we built a Euclidean distance-based network of samples using the ‘adjacency’ function in WGCNA (Langfelder & Horvath, 2008). Samples were designated as outliers and removed if their connectivity was more than three standard deviations from the mean (as described by Steve

Horvath, 2011).

Removal of technical variation

Batch effects are ubiquitous in high-throughput gene expression, methylation, and variant calling studies (Leek et al., 2010). To control for this systematic bias, we used principal component analysis (PCA) of the normalized, transformed expression data to assess variance due to technical factors. We regressed the effect of technical variables that were correlated with any of the first 12 PCs using a significance threshold of 0.05.

Linear mixed effects models

To identify individual genes that were significantly affected by predation risk, we created linear mixed models using EMMREML (Akdemir & Godfrey, 2015). Fixed effects included predator pressure (a binary variable; low, high), day of the year of sample collection (to account for seasonal variation), time of day of sample collection (to account for circadian variation), and sex. To control for heritability of gene expression (Tung et al., 2015; Wright et al., 2014), kinship was included as a random effect. Kinship was calculated as pair-wise relatedness between individuals using the triadic maximum likelihood approach in COANCESTRY (J.

130 Wang, 2011) based on genotypes obtained from 12 microsatellite loci (for details, see Blumstein et al., 2010). Dependent variables of mixed models were the residuals of the filtered, normalized gene expression counts after regressing batch effect variance. For each model, we extracted the p-value associated with predator pressure and adjusted for multiple hypothesis testing using the false discovery rate (q; Storey & Tibshirani, 2003) in QVALUE (Storey et al., 2015). A gene was considered significantly associated with predation when q was < 0.1.

Functional enrichment analysis

To identify the biological processes that were statistically over-represented in the differentially expressed genes, we performed gene ontology (GO) analysis using gProfileR

(Reimand et al., 2016). We separated up- and down-regulated genes and used these two lists as queries in gProfileR. Background lists included all protein-coding genes expressed at detectable levels (≥ 10 reads in 75% of libraries). We set the minimum functional category and intersection sizes to five, used the ‘moderate’ hierarchical filter, corrected for multiple testing using the optimal ‘gSCS’ method (Reimand et al., 2007), and set the significance threshold to 0.05.

Quantifying transcription factor activity

To evaluate the role of glucocorticoid receptor (GR) signaling in mediating the observed expression differences, we scanned the promoters of all genes showing > 1.25-fold difference in expression between high- vs. low-predation colonies for glucocorticoid response elements using the TELiS database (Steve W. Cole, Yan, Galic, Arevalo, & Zack, 2005). GR response element prevalence was assessed using the TRANSFAC mat_sim statistic computed over the V$GR_Q6 position-specific weight matrix (Steve W. Cole et al., 2005). As in previous studies (Miller et al.,

131 2008, 2014), intensity of GR activation was inferred from the ratio of GR response element prevalence within promoters of up-regulated genes relative to down-regulated genes, with (log2) ratios averaged over nine parametric variations in promoter length (-300 nt, -600 nt, and -1000 nt to +200 nt relative to the RefSeq start site) and response element detection threshold (mat_sim >

.80, .90, and .95). Statistical significance of mean log ratios was assessed by bootstrap resampling of differentially expressed genes.

Results

Predation indexes

During the 2013 to 2015 summer active seasons, we observed marmots for 5039 hours and detected 300 predators in 27 colony-years. Of these, 184 predators were observed prior to

July 1. These sightings largely consisted of red foxes (n = 82), coyotes (n = 32), and various raptors (n = 51). Predation indexes (the number of days a predator was observed per colony-year divided by total number of observation days) ranged from 0.013 to 0.463, indicating that a predator was observed nearly every other day in some colonies. The median cut that separated low from high predation areas was 0.092 (low predator index: mean = 0.073 ± 0.036 SD; high predator index: mean = 0.219 ± 0.092 SD).

RNA-seq samples

We extracted high quality RNA sequences from 79 individual marmots. On average, we generated 32.5 million reads per individual, 18.8 million of which (58.8%) uniquely mapped to the squirrel genome. Of the 22,389 protein-coding genes in the squirrel genome, 11,440 (51.9%) were substantially expressed (≥ 10 reads in 75% of libraries). We used the 9,063 genes that were

132 annotated in subsequent analyses. Three batch effects significantly (p < 0.05) influenced one of the first 12 principal components (PCs) of gene expression: sequencing platform (HiSeq 2000 vs

4000), RNA extraction batch (samples were extracted in seven batches), and input RNA concentration. To control for RNA degradation, we also regressed the effect of RNA integrity number (RIN; a technique validated by Romero et al., 2014). Clustering analysis revealed one outlier sample, resulting in 78 total individuals for subsequent analyses. Predation indexes for this dataset ranged from 0.013 to 0.463 (median = 0.117). Using this median split, n = 40 RNA samples came from colonies that experienced low predation pressure, whereas n = 38 experienced high predation pressure.

Linear mixed effects models

After controlling for sampling date, time, sex, and relatedness, mixed models found 349 of the 9,063 (3.9%) annotated genes were differentially expressed as a function of predator index

(q < 0.1; Table 4-S1). Of these, 203 were significantly up-regulated (had positive log2 fold changes) in marmots that live in colonies with high predator indexes, whereas 146 had negative fold changes (down-regulated; Figure 4-1). Genes that were up-regulated included many heat shock proteins and genes that are important in the DNA replication and damage repair processes.

Genes that were down-regulated as a function of high predator index included many known to be involved in central nervous system development as well as genes that respond to various stressors, including anxiety and depression disorders, oxidative damage, and toxin exposure.

133 Enriched functional categories

Gene ontology analysis of the up-regulated genes revealed categorical enrichment of five biological processes (Table 4-1). In line with our predictions, results indicated that marmots in colonies with high predator indexes primarily up-regulated genes involved in metabolism, protein synthesis and transport, and the cellular response to stress. Down-regulated genes were statistically over-represented by one functional category (“metabolic process”; Table 4-1).

Predation transcription factor analysis

Genes up-regulated > 1.25-fold in association with predation pressure showed a significant enrichment of glucocortoid receptor-binding motifs with their promoters (mean ratio: 2.16-fold ±

0.73, p = .03). That is, marmots that lived in colonies where many predators were observed showed bioinformatic indications of increased glucocorticoid signaling, which is consistent with our hypothesis that these animals are chronically stressed by predators. Similar results emerged when analyses were run using different analytic parameters (e.g., identifying up-regulated genes based on > 1.20-fold differential expression instead of 1.25-fold).

Discussion

Even in the absence of direct mortality, the psychological stress induced by predators or predator cues can have complex and long-lasting effects on prey demography, reproduction, behavior, memory, and physiology. To our knowledge, this study is the first to reveal that leukocyte transcriptomes of wild animals reflect the diverse physiological changes that accompany the ecology of fear. We found that yellow-bellied marmots that experienced chronic predation pressure initiated gene expression changes in circulating blood cells across numerous pathways

134 and somatic processes. We observed many functional genetic changes that are consistent with previous physiological and condition-based studies of predator-mediated effects. Based on the extensive associative evidence between inflammation and psychological stress, we were surprised we did not detect an inflammatory transcriptional response to predator pressure.

However, we detected the predicted increased glucocorticoid signaling, heat shock protein response, and metabolic changes in marmots that frequently experienced the life-threatening risk of being predated, in addition to functions we did not predict.

Glucocorticoid signaling

The HPA axis modulates energetic reactions to various stressors and restores homeostasis by producing glucocorticoid hormones. GCs are among the most commonly used proxies for stress and measuring levels of these “stress hormones” can be very useful in evaluating the psychological effects of a stressor, especially in wild systems (Reeder & Kramer, 2005). In fact, previous research in this population of marmots has shown that individuals that live in colonies frequently visited by predators significantly increased fecal glucocorticoid levels after controlling for other variables (Monclús et al., 2011). We note, however, that this relationship was quite small (Cohen’s d = 0.05) and not statistically significant in this smaller dataset

(p=0.39; Figure 4-2).

We chose to build on our knowledge of GC hormones by evaluating the transcriptional response to the stress of predation. Based on existing knowledge of how the HPA axis mediates stress response dynamics (reviewed by Sapolsky, Romero, & Munck, 2000; S. M. Smith & Vale,

2006), we predicted that marmots exposed to chronic, high predator abundance would up- regulate glucocorticoid receptor genes. Transcription factor analysis supported this prediction.

135 We found GC receptor-binding motifs to be statistically enriched in the promoters of genes up- regulated during chronic predator exposure. This combination of higher GC hormones and increased GC receptor transcription activity appears to confirm our hypothesis that these animals are chronically stressed by predator pressure. This finding also suggests that the observed increases in glucocorticoid hormone levels may be mediated by transcription control pathways.

Heat shock proteins

Heat shock proteins (HSPs) were initially named when they were discovered during a severe heat stress experiment (Ritossa, 1962). Since then, HSPs have been shown to respond to an array of biotic and abiotic stressors including extreme cold (Matz, Blake, Tatelman, Lavoi, & Holbrook,

1995), desiccation (Hayward, Pinehart, & Denlinger, 2004), disease (Chai, Koppenhafer, Bonini,

& Paulson, 1999), and environmental toxicants (Richter et al., 2011; Wong, Bonakdar, Mautz, &

Kleinman, 1996). Heat shock proteins are now such a ubiquitous component of cellular stress, these proteins have become synonymous with the stress response. Stressful conditions often result in misfolded proteins and cell death, but HSPs help cells survive and maintain homeostasis. HSPs act as molecular chaperones by interacting with other proteins to ensure they are synthesizing, folding, and transporting critical proteins correctly during stressful times

(Gething & Sambrook, 1992).

Because of this long-standing associations between heat shock proteins and the stress response, it was no surprise that five genes that encode HSPs were associated with the chronic stress of predator pressure in this study. DNAJC2 (HSP40 member C2), DNAJC8 (HSP40 member C8), HSPA4 (HSP70 member 4), HSPA9 (HSP70 member 9), and HSP90B1 were all significantly up-regulated by marmots in high predation colonies (Figure 4-1; Figure 4-3). Thus,

136 the stress associated with predation is powerful enough to initiate the defense of these essential molecular chaperones in marmot leukocytes.

Cellular homeostasis and DNA repair

As noted above, heat shock proteins oversee protein assembly and folding and these proteins were activated in our samples’ response to predator stress. We also found that the 203 genes up- regulated by marmots in high predation areas were statistically enriched for specific homeostatic functions that are often managed by HSPs including the “cellular response to stress”,

“intracellular protein transport”, and “ribonucleoprotein complex assembly” (Table 4-S2).

Furthermore, many of these individual DE genes code for proteins that are important for cellular repair. RFC4, MLH1, RPA2, FANCD2, XRCC5, PRKDC, TRIP12, and RRM1 have gene ontology annotations in DNA repair and DNA damage control (S. Carbon et al., 2017; Figure 1;

Seth Carbon et al., 2009). Specifically, RFC4 is a DNA mismatch repair gene that is essential for proper genetic replication and DNA damage checkpoints (Kim & Brill). RPA2 is critical in binding and stabilizing single-stranded DNA intermediates that form during DNA replication or upon DNA stress (Wold, 1997; Zou & Elledge, 2003). MARCH2 is a central member of the ubiquitin system (Cheng & Guggino, 2013), which regulates cell homeostasis and appears to be important in the response to thermal and endoplasmic reticulum stress in animals (Verleih et al.,

2015; Xia et al., 2017). MLH1 is a DNA mismatch repair gene that is down-regulated during hypoxic stress (Mihaylova et al., 2003). Thus, as predicted, DNA replication and repair appear to be an important component in dealing with the stress associated with chronic exposure to predators in a wild setting as well.

137 Cerebral function

The brain is the key organ in evaluating stressful stimuli (McEwen & Gianaros, 2010). When the brain perceives a threat, neurons in the hypothalamus initiate a cascade of events in the HPA axis, resulting in the eventual release of circulating glucocorticoid hormones. Thus, it makes sense that psychological stressors can alter brain morphology, memory and learning, and a variety of anxiety-like behaviors (De Kloet, Joëls, & Holsboer, 2005).

Since we examined the transcriptional response of animals exposed to predators in leukocytes and not brain tissue, we did not expect to see differential expression of genes involved in cerebral function. However, we observed associations between predator abundance and several genes that are important for brain activity. For example, myelin regulatory factor

(MYRF) is critical for central nervous system development (Bujalka et al., 2013; Mitew et al.,

2014). Tumor protein 73 (TP73) regulates neuron survival and apoptosis, making it vital to central nervous system development and the cellular stress response (Jacobs, Kaplan, & Miller,

2006). We also detected differential expression of uromodulin like 1 (UMODL1), a gene that has been suggested to play a role in olfactory axon navigation to the brain (Di Schiavi, Riano, Heye,

Bazzicalupo, & Rugarli, 2005). This was a particularly interesting result as the olfactory system is critical in communicating external threats to the brain, especially for wild animals that rely heavily on scent.

Anxiety behaviors

Exposure to a predator is a universally stressful life threatening experience. Thus, it is one of the most commonly used stressors in modern studies evaluating anxiety and posttraumatic stress

138 disorders (H. Cohen, Matar, & Zohar, 2008). Most contemporary studies in this field use laboratory animal models to investigate the transcription associated with anxiety.

Prior to this study, we assumed high predator presence led to chronic stress and anxiety in this marmot population. Indeed, we found some similarities to human and laboratory models in the genomic response to anxiety. We found PTP4A3, which typically exhibits lower expression in humans with major depressive disorder (Pajer et al., 2012) and posttraumatic stress disorder

(Logue et al., 2015), was down-regulated in high predation areas. ALAD, a gene associated with social phobias and general anxiety (Donner et al., 2008), and FZR1, a gene associated with depression in humans (Tochigi et al., 2008), were also differentially expressed as a function of predator abundance. However, we note that ALAD and FZR1 were previously been found to be up-regulated by individuals with anxiety disorders, whereas in the marmot dataset, they were down-regulated by individuals that were more frequently exposed to predators.

We found it a bit surprising that the differentially expressed genes involved in the central nervous response and anxiety disorders (TP73, MYRF, UMODL1, ALAD, FZR1, PTP4A3) were largely down-regulated by marmots in colonies experiencing high predator pressure. The fold change in a few of these genes went in the same direction as previous studies (e.g., PTP4A3), but more genes exhibited the opposite response (e.g., ALAD, FZR1). We believe this may be due to the nature of the stressor and the setting in which we studied these dynamics. Importantly, marmots are able to compensate for increased stressors behaviorally while captive mice and rats may not be able to in the same ways. Indeed, acute stress is transient and usually provides a protective benefit. Excessive activation of the HPA axis due to chronic stress, however, can lead to pathologies such as sensitization to stressors, impaired hippocampal function, and prolonged anxiety like behaviors (J. Gray, Rubin, Hunter, & McEwen, 2014; McEwen & Gianaros, 2010).

139 Thus, it is possible that the differences in the timeframe at which these stressors are occurring in our study compared to previous controlled studies may have led to different transcriptional outcomes. Additional research is needed to be more conclusive regarding how these specific genes respond to different psychological stressors.

Metabolism and growth

The HPA stress response and GCs are known to directly mobilize stored energy for immediate needs and suppress long-term growth (Hawlena & Schmitz, 2010; Morita et al.,

2005). Therefore, we expected predator-induced differentially expressed genes to be associated with, and enriched for, processes involved in increased energy metabolism and gluconeogenesis of body proteins.

Our gene ontology results indicated an enrichment of activity in metabolic processes. Up- regulated genes were largely involved in “nucleobase-containing compound metabolic process” and “nucleotide metabolic process”, whereas down-regulated genes were enriched for simply

“metabolic process”. However, our investigation of these gene ontology terms revealed that genes in these groups do not encode for proteins that increase lipid or carbohydrate metabolism per se. To be more specific, these genes are functionally important in cellular and nucleotide metabolism (see discussion above).

We did, however, observe differential expression of a few genes consistent with the stress-induced energy metabolism and growth hypothesis. PDK1 encodes for pyruvate dehydrogenase kinase 1, one of the major enzymes responsible for the regulation of homeostasis of carbohydrate fuels in mammals (Mora, Komander, Van Aalten, & Alessi, 2004). DLD is also critical in energy metabolism, among its many functions (Dashty, 2013). As predicted, these

140 genes were up-regulated by marmots experiencing chronic predator stress. FADS6 has been implicated in the metabolism of lipids and fatty acids (Guillou, Zadravec, Martin, & Jacobsson,

2010; Q. Liu et al., 2012), but it was down-regulated by individuals in high predation risk.

Finally, PHOSPHO1 was significantly down-regulated by marmots experiencing high predation risk. This gene is involved in the mineralization of bone and cartilage (Houston, Stewart, &

Farquharson, 2004), thus supporting the hypothesis that the chronic stress of predator pressure might suppress skeletal growth.

Together, our results suggest that the psychological stress of predation alone is powerful enough to induce multiple aspects of the cellular stress response in a wild setting. We identified individual predator stress-associated genes that transcribe proteins that are critical in maintaining homeostasis and metabolism, found statistical enrichment for the cellular response to stress, and established that the promoters of up-regulated genes were highly enriched for glucocortoid receptor-binding motifs. This transcriptome-wide approach went beyond examining and validating a handful of candidate genes known to activate during cellular stress or looking for hormonal correlates with stressors. During this process, we confirmed that even in a largely uncontrolled wild population, cellular transcription of the various proteins and pathways that join forces to restore cellular homeostasis in response to a stressor can be observed.

141 Figure 4-1. Volcano plot showing fold change in individual gene expression as a function of predator index for 9,063 genes. Horizontal dashed line indicates q-value of 0.1. Genes in gray are not significantly associated with predation, black dots are significantly associated (n=349).

Genes highlighted in discussion are blue if significantly down-regulated and red if up-regulated.

2.5

2.0 FZR1

MXD4 DNAJC8 RRM1 PTP4A3 XRCC5 1.5 HSP90B1 TRIP12 MYRF FANCD2 DNAJA2 ALAD RFC4 q-value MARCH2 HSPA9 10 HSPA4

-log 1.0 GPX1 TP73 UMODL1

0.5

0.0

1.0 0.5 0 0.5 1.0 log2 fold change

142 Figure 4-2. Boxplot showing non-significant relationship between predator index and average fecal glucocorticoid measures across the active season (n=63). Note that fecal samples were not available for all individuals with RNA sequence data.

0.75

0.50 (ng/L) Average fecal cortisol fecal level Average

0.25

Cohen’s d=0.05, p=0.39

Low High Predator index

143 Figure 4-3. Expression levels of differentially expressed heat shock proteins, grouped by predator abundance index.

1.0 DNAJC2 DNAJC8 HSPA4

0.5 0.5 1

0.0 0.0

−0.5 0 Gene expression Gene expression Gene expression −0.5 −1.0

Low High Low High Low High Predator index Predator index Predator index

1.0 HSPA9 1.0 HSP90B1

0.5 0.5

0.0 0.0 Gene expression Gene expression −0.5 −0.5

Low High Low High Predator index Predator index

144 Table 4-1. Gene ontology results for genes that were up- and down-regulated as a function of predator pressure. For full gene ontology results including specific genes, see Table 4-S2.

Number Enriched biological process p-value of genes Cellular response to stress 28 9.65E-05 Up- Intracellular protein transport 19 0.00167 regulated gene list Nucleobase-containing compound metabolic process 85 7.87E-20 Nucleotide metabolic process 13 0.041 Ribonucleoprotein complex assembly 8 0.0167 Down- regulated Metabolic process 74 0.000231 gene list

145 Supplementary results

Table 4-S1. List of genes that were significantly differentially expressed (q < 0.1; n=349) as a function of predator pressure. Positive fold change indicates up-regulation by marmots in colonies with high predator indexes; negative fold change indicates down-regulation.

Log HGNC 2 Predation HGNC description predation symbol q-value fold change EFCAB3 EF-hand calcium binding domain 3 1.416 0.080 TTK TTK protein kinase 0.771 0.036 GRASP general receptor for phosphoinositides 1 associated scaffold protein 0.706 0.045 RFC4 replication factor C subunit 4 0.691 0.048 IL6ST interleukin 6 signal transducer 0.574 0.062 FANCD2 Fanconi anemia complementation group D2 0.547 0.050 TTC21A tetratricopeptide repeat domain 21A 0.494 0.045 COL15A1 collagen type XV alpha 1 chain 0.479 0.084 CRYGS crystallin gamma S 0.441 0.090 UTP15 UTP15, small subunit processome component 0.438 0.036 OSBPL1A oxysterol binding protein like 1A 0.396 0.064 BOD1L1 biorientation of in cell division 1 like 1 0.395 0.065 GLB1L galactosidase beta 1 like 0.390 0.045 PARP2 poly(ADP-ribose) polymerase 2 0.389 0.034 ZNF827 zinc finger protein 827 0.380 0.051 TTC4 tetratricopeptide repeat domain 4 0.375 0.091 EIF5B eukaryotic translation initiation factor 5B 0.366 0.051 PGM3 phosphoglucomutase 3 0.364 0.059 DLD dihydrolipoamide dehydrogenase 0.364 0.024 HIBCH 3-hydroxyisobutyryl-CoA hydrolase 0.363 0.034 NUFIP1 NUFIP1, FMR1 interacting protein 1 0.360 0.049 DDX18 DEAD-box helicase 18 0.358 0.092 MRPL44 mitochondrial ribosomal protein L44 0.356 0.039 PGM1 phosphoglucomutase 1 0.355 0.070 SCCPDH saccharopine dehydrogenase (putative) 0.355 0.024 ETFDH electron transfer flavoprotein dehydrogenase 0.354 0.040 EMC1 ER membrane protein complex subunit 1 0.351 0.034 HNRNPA3 heterogeneous nuclear ribonucleoprotein A3 0.348 0.098 RAD23B RAD23 homolog B, nucleotide excision repair protein 0.343 0.052 METTL16 methyltransferase like 16 0.342 0.045 MSH6 mutS homolog 6 0.342 0.034 FARSB phenylalanyl-tRNA synthetase beta subunit 0.341 0.032 PGBD1 piggyBac transposable element derived 1 0.341 0.046 STAMBPL1 STAM binding protein like 1 0.340 0.040 FASTKD2 FAST kinase domains 2 0.340 0.049 NAT10 N-acetyltransferase 10 0.336 0.051 C2CD3 C2 calcium dependent domain containing 3 0.335 0.034 RPA2 replication protein A2 0.334 0.010

146 DROSHA drosha ribonuclease III 0.332 0.048 ZNF326 zinc finger protein 326 0.330 0.036 CDC16 cell division cycle 16 0.329 0.017 SETD6 SET domain containing 6 0.329 0.046 ANAPC7 anaphase promoting complex subunit 7 0.329 0.023 SEC23B Sec23 homolog B, coat complex II component 0.325 0.020 PHF11 PHD finger protein 11 0.325 0.035 NUP205 nucleoporin 205 0.324 0.044 ELAC1 elaC ribonuclease Z 1 0.322 0.051 HNRNPR heterogeneous nuclear ribonucleoprotein R 0.318 0.083 TMLHE trimethyllysine hydroxylase, epsilon 0.317 0.096 POLR1E RNA polymerase I subunit E 0.317 0.040 HSP90B1 heat shock protein 90 beta family member 1 0.315 0.038 SF3B2 splicing factor 3b subunit 2 0.315 0.064 LONP2 lon peptidase 2, peroxisomal 0.315 0.004 TAF15 TATA-box binding protein associated factor 15 0.314 0.094 MLH1 mutL homolog 1 0.313 0.091 HSPA9 heat shock protein family A (Hsp70) member 9 0.312 0.066 ASH2L ASH2 like histone lysine methyltransferase complex subunit 0.310 0.051 METTL14 methyltransferase like 14 0.308 0.044 STT3A STT3A, catalytic subunit of the oligosaccharyltransferase complex 0.307 0.023 ZNF281 zinc finger protein 281 0.307 0.080 phosphoribosylglycinamide formyltransferase, phosphoribosylglycinamide synthetase, 0.307 0.038 GART phosphoribosylaminoimidazole synthetase SLC7A6 solute carrier family 7 member 6 0.307 0.077 SCFD2 sec1 family domain containing 2 0.304 0.091 HNRNPD heterogeneous nuclear ribonucleoprotein D 0.303 0.034 FAM35A family with sequence similarity 35 member A 0.303 0.064 YTHDF2 YTH N6-methyladenosine RNA binding protein 2 0.300 0.051 HNRNPU heterogeneous nuclear ribonucleoprotein U 0.299 0.032 PSMD1 proteasome 26S subunit, non-ATPase 1 0.297 0.024 XRCC5 X-ray repair cross complementing 5 0.295 0.034 SF3B3 splicing factor 3b subunit 3 0.294 0.046 HSPA4 heat shock protein family A (Hsp70) member 4 0.292 0.079 SRRM1 serine and arginine repetitive matrix 1 0.291 0.036 UTP4 UTP4, small subunit processome component 0.290 0.090 hydroxyacyl-CoA dehydrogenase/3-ketoacyl-CoA thiolase/enoyl- 0.289 0.077 HADHB CoA hydratase (trifunctional protein), beta subunit CALR calreticulin 0.288 0.075 GSPT1 G1 to S phase transition 1 0.287 0.034 FUS FUS RNA binding protein 0.285 0.099 AIG1 androgen induced 1 0.285 0.064 CNOT11 CCR4-NOT transcription complex subunit 11 0.285 0.078 PTCD2 pentatricopeptide repeat domain 2 0.285 0.047 NOP2 NOP2 nucleolar protein 0.284 0.048 HEATR1 HEAT repeat containing 1 0.284 0.051 VRK1 vaccinia related kinase 1 0.283 0.054 VCP valosin containing protein 0.283 0.049 RTF1 RTF1 homolog, Paf1/RNA polymerase II complex component 0.283 0.054 TDP1 tyrosyl-DNA phosphodiesterase 1 0.281 0.041 DNAJC8 DnaJ heat shock protein family (Hsp40) member C8 0.281 0.020 147 RMND1 required for meiotic nuclear division 1 homolog 0.281 0.082 NME7 NME/NM23 family member 7 0.280 0.086 RRM1 ribonucleotide reductase catalytic subunit M1 0.278 0.023 DICER1 dicer 1, ribonuclease III 0.276 0.051 UTP6 UTP6, small subunit processome component 0.274 0.034 FIP1L1 factor interacting with PAPOLA and CPSF1 0.273 0.066 WDR73 WD repeat domain 73 0.273 0.032 USP48 ubiquitin specific peptidase 48 0.271 0.017 RARS arginyl-tRNA synthetase 0.270 0.039 ZFYVE26 zinc finger FYVE-type containing 26 0.270 0.020 DHX32 DEAH-box helicase 32 (putative) 0.270 0.091 DHRS1 dehydrogenase/reductase 1 0.269 0.098 NDUFS1 NADH:ubiquinone oxidoreductase core subunit S1 0.268 0.026 GARS glycyl-tRNA synthetase 0.268 0.033 TNPO1 transportin 1 0.266 0.034 SSRP1 structure specific recognition protein 1 0.265 0.089 FLNB filamin B 0.263 0.054 MPHOSPH8 M-phase phosphoprotein 8 0.262 0.055 NDUFAF1 NADH:ubiquinone oxidoreductase complex assembly factor 1 0.262 0.092 YLPM1 YLP motif containing 1 0.262 0.084 RPN2 ribophorin II 0.262 0.071 HYOU1 hypoxia up-regulated 1 0.261 0.062 PRKDC protein kinase, DNA-activated, catalytic polypeptide 0.260 0.094 DDX24 DEAD-box helicase 24 0.260 0.064 TPR translocated promoter region, nuclear basket protein 0.257 0.082 AIFM1 apoptosis inducing factor, mitochondria associated 1 0.256 0.012 PREP prolyl endopeptidase 0.254 0.092 EIF2D eukaryotic translation initiation factor 2D 0.254 0.094 SMG8 SMG8, nonsense mediated mRNA decay factor 0.254 0.037 BMS1 BMS1, ribosome biogenesis factor 0.253 0.066 DDX47 DEAD-box helicase 47 0.252 0.076 MANSC1 MANSC domain containing 1 0.252 0.077 SEC23IP SEC23 interacting protein 0.252 0.094 UTP18 UTP18, small subunit processome component 0.251 0.092 SRP54 signal recognition particle 54 0.251 0.045 LDHB lactate dehydrogenase B 0.249 0.080 SCFD1 sec1 family domain containing 1 0.248 0.077 SLC38A1 solute carrier family 38 member 1 0.248 0.065 GEMIN4 gem nuclear organelle associated protein 4 0.247 0.019 ACSL5 acyl-CoA synthetase long-chain family member 5 0.247 0.037 HNRNPAB heterogeneous nuclear ribonucleoprotein A/B 0.246 0.044 CPSF3 cleavage and polyadenylation specific factor 3 0.245 0.042 COPG2 coatomer protein complex subunit gamma 2 0.244 0.048 PWP1 PWP1 homolog, endonuclein 0.243 0.080 CHD8 chromodomain helicase DNA binding protein 8 0.242 0.084 C17orf80 chromosome 17 open reading frame 80 0.241 0.076 COPB2 coatomer protein complex subunit beta 2 0.240 0.034 PAAF1 proteasomal ATPase associated factor 1 0.239 0.087 NFATC3 nuclear factor of activated T-cells 3 0.238 0.034 MUT methylmalonyl-CoA mutase 0.236 0.045 NCAPD2 non-SMC condensin I complex subunit D2 0.236 0.065

148 tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation 0.236 0.042 YWHAE protein epsilon TARS threonyl-tRNA synthetase 0.235 0.095 GMPR2 guanosine monophosphate reductase 2 0.235 0.024 LARP7 La ribonucleoprotein domain family member 7 0.235 0.091 DHX8 DEAH-box helicase 8 0.234 0.037 LARS leucyl-tRNA synthetase 0.232 0.065 GMPS guanine monophosphate synthase 0.231 0.045 SLC25A3 solute carrier family 25 member 3 0.231 0.079 RALGAPB Ral GTPase activating protein non-catalytic beta subunit 0.230 0.042 DALRD3 DALR anticodon binding domain containing 3 0.228 0.088 DENND2D DENN domain containing 2D 0.228 0.085 PDIA6 protein disulfide isomerase family A member 6 0.227 0.051 IK IK cytokine, down-regulator of HLA II 0.227 0.045 RBM6 RNA binding motif protein 6 0.227 0.040 PRKCB protein kinase C beta 0.227 0.098 SAM and HD domain containing deoxynucleoside triphosphate 0.227 0.071 SAMHD1 triphosphohydrolase 1 PAN2 PAN2 poly(A) specific ribonuclease subunit 0.226 0.091 KARS lysyl-tRNA synthetase 0.223 0.078 TRIP12 thyroid hormone receptor interactor 12 0.223 0.038 VPS33B VPS33B, late endosome and lysosome associated 0.222 0.079 SMU1 DNA replication regulator and spliceosomal factor 0.222 0.019 CRBN cereblon 0.220 0.044 KAT7 lysine acetyltransferase 7 0.220 0.084 PDIA4 protein disulfide isomerase family A member 4 0.219 0.082 CWF19L1 CWF19-like 1, cell cycle control (S. pombe) 0.219 0.100 USP10 ubiquitin specific peptidase 10 0.219 0.076 DNAJA2 DnaJ heat shock protein family (Hsp40) member A2 0.219 0.038 IWS1 IWS1, SUPT6H interacting protein 0.218 0.051 ARHGAP15 Rho GTPase activating protein 15 0.218 0.045 NUP88 nucleoporin 88 0.218 0.048 HARS histidyl-tRNA synthetase 0.217 0.045 IPO5 importin 5 0.217 0.092 TMEM248 transmembrane protein 248 0.217 0.028 ATP synthase, H+ transporting, mitochondrial F1 complex, alpha 0.213 0.032 ATP5A1 subunit 1, cardiac muscle PAPSS1 3'-phosphoadenosine 5'-phosphosulfate synthase 1 0.213 0.082 MACF1 microtubule-actin crosslinking factor 1 0.212 0.094 VDAC2 voltage dependent anion channel 2 0.211 0.067 SON SON DNA binding protein 0.209 0.089 HARS2 histidyl-tRNA synthetase 2, mitochondrial 0.207 0.077 DAP3 death associated protein 3 0.205 0.051 ODF2 outer dense fiber of sperm tails 2 0.204 0.100 SWI/SNF related, matrix associated, actin dependent regulator of 0.204 0.051 SMARCE1 chromatin, subfamily e, member 1 PPP4R3A protein phosphatase 4 regulatory subunit 3A 0.204 0.076 ZNF317 zinc finger protein 317 0.201 0.053 PDK1 pyruvate dehydrogenase kinase 1 0.201 0.089 CEP95 centrosomal protein 95 0.201 0.089 EIF3B eukaryotic translation initiation factor 3 subunit B 0.194 0.049 YIPF1 Yip1 domain family member 1 0.193 0.091 149 PDIA3 protein disulfide isomerase family A member 3 0.193 0.087 ZNF207 zinc finger protein 207 0.192 0.100 DYNC1I2 dynein cytoplasmic 1 intermediate chain 2 0.192 0.028 minichromosome maintenance complex component 3 associated 0.192 0.097 MCM3AP protein KAT14 lysine acetyltransferase 14 0.192 0.098 SYNCRIP synaptotagmin binding cytoplasmic RNA interacting protein 0.190 0.096 FOXP1 forkhead box P1 0.189 0.084 PSMD2 proteasome 26S subunit, non-ATPase 2 0.188 0.079 HNRNPK heterogeneous nuclear ribonucleoprotein K 0.185 0.045 CSTF1 cleavage stimulation factor subunit 1 0.184 0.063 CS citrate synthase 0.169 0.092 SNW1 SNW domain containing 1 0.168 0.084 HDAC3 histone deacetylase 3 0.168 0.100 SNX1 sorting nexin 1 0.166 0.084 HDAC1 histone deacetylase 1 0.161 0.092 NFYC nuclear transcription factor Y subunit gamma 0.160 0.092 CLP1 cleavage and polyadenylation factor I subunit 1 0.157 0.075 TMEM11 transmembrane protein 11 -0.209 0.092 DNAL4 dynein axonemal light chain 4 -0.258 0.086 MED11 mediator complex subunit 11 -0.263 0.075 TOMM20 translocase of outer mitochondrial membrane 20 -0.267 0.062 LEMD2 LEM domain containing 2 -0.273 0.066 NUDT22 nudix hydrolase 22 -0.278 0.092 RNF181 ring finger protein 181 -0.292 0.036 GTPBP2 GTP binding protein 2 -0.293 0.026 mannosyl (alpha-1,3-)-glycoprotein beta-1,4-N- -0.307 0.074 MGAT4B acetylglucosaminyltransferase, isozyme B ARFGAP1 ADP ribosylation factor GTPase activating protein 1 -0.310 0.045 SNX15 sorting nexin 15 -0.311 0.099 BCL7C BCL tumor suppressor 7C -0.331 0.100 FAM160B2 family with sequence similarity 160 member B2 -0.331 0.076 FAM104A family with sequence similarity 104 member A -0.356 0.059 NECAP2 NECAP endocytosis associated 2 -0.364 0.024 TOR3A torsin family 3 member A -0.375 0.050 MRPS25 mitochondrial ribosomal protein S25 -0.382 0.076 WDR13 WD repeat domain 13 -0.392 0.080 SF3B5 splicing factor 3b subunit 5 -0.392 0.065 ATG4D autophagy related 4D cysteine peptidase -0.395 0.063 ATP synthase, H+ transporting, mitochondrial Fo complex subunit -0.397 0.058 ATP5I E PTDSS2 phosphatidylserine synthase 2 -0.401 0.084 POLE3 DNA polymerase epsilon 3, accessory subunit -0.401 0.083 KLHDC4 kelch domain containing 4 -0.403 0.049 NTMT1 N-terminal Xaa-Pro-Lys N-methyltransferase 1 -0.412 0.054 POLR2E RNA polymerase II subunit E -0.413 0.034 PQLC1 PQ loop repeat containing 1 -0.413 0.084 PKIG protein kinase (cAMP-dependent, catalytic) inhibitor gamma -0.414 0.039 STK11 serine/threonine kinase 11 -0.419 0.033 YPEL3 yippee like 3 -0.423 0.065 UQCR10 ubiquinol-cytochrome c reductase, complex III subunit X -0.423 0.049 SMIM3 small integral membrane protein 3 -0.428 0.046 150 SGTA small glutamine rich tetratricopeptide repeat containing alpha -0.433 0.084 STARD10 StAR related lipid transfer domain containing 10 -0.439 0.092 RNF123 ring finger protein 123 -0.442 0.058 SERF2 small EDRK-rich factor 2 -0.447 0.044 MED9 mediator complex subunit 9 -0.450 0.084 MPND MPN domain containing -0.467 0.024 ASCC2 activating signal cointegrator 1 complex subunit 2 -0.469 0.076 WBSCR27 Williams Beuren syndrome chromosome region 27 -0.471 0.037 CHMP4B charged multivesicular body protein 4B -0.472 0.049 CARHSP1 calcium regulated heat stable protein 1 -0.475 0.064 NAA38 N(alpha)-acetyltransferase 38, NatC auxiliary subunit -0.480 0.037 AKT2 AKT serine/threonine kinase 2 -0.484 0.038 EPN1 epsin 1 -0.486 0.039 FOXO4 forkhead box O4 -0.490 0.091 ATXN7 ataxin 7 -0.498 0.063 ZMAT5 zinc finger matrin-type 5 -0.502 0.036 NPRL3 NPR3 like, GATOR1 complex subunit -0.504 0.062 HYAL2 hyaluronoglucosaminidase 2 -0.506 0.099 CYB5R3 cytochrome b5 reductase 3 -0.506 0.035 SH3GLB2 SH3 domain containing GRB2 like endophilin B2 -0.515 0.037 UBA52 ubiquitin A-52 residue ribosomal protein fusion product 1 -0.518 0.037 RPS6KB2 ribosomal protein S6 kinase B2 -0.520 0.039 FAM214B family with sequence similarity 214 member B -0.531 0.098 TESC tescalcin -0.537 0.092 ACSS1 acyl-CoA synthetase short-chain family member 1 -0.541 0.076 ACOT8 acyl-CoA thioesterase 8 -0.543 0.094 C17orf107 chromosome 17 open reading frame 107 -0.547 0.066 CNPPD1 cyclin Pas1/PHO80 domain containing 1 -0.550 0.046 NDUFB7 NADH:ubiquinone oxidoreductase subunit B7 -0.561 0.077 CDC34 cell division cycle 34 -0.561 0.063 FAM117A family with sequence similarity 117 member A -0.568 0.066 NINJ1 ninjurin 1 -0.569 0.038 hydroxy-delta-5-steroid dehydrogenase, 3 beta- and steroid delta- -0.583 0.035 HSD3B7 isomerase 7 PC pyruvate carboxylase -0.583 0.022 NDUFS7 NADH:ubiquinone oxidoreductase core subunit S7 -0.583 0.045 E2F2 E2F transcription factor 2 -0.584 0.030 AGO2 argonaute 2, RISC catalytic component -0.585 0.075 PGPEP1 pyroglutamyl-peptidase I -0.587 0.045 MAFG MAF bZIP transcription factor G -0.595 0.049 ELL elongation factor for RNA polymerase II -0.596 0.099 LTBP4 latent transforming growth factor beta binding protein 4 -0.601 0.076 GFI1B growth factor independent 1B transcriptional repressor -0.604 0.092 FAM171A2 family with sequence similarity 171 member A2 -0.611 0.094 PDE9A phosphodiesterase 9A -0.615 0.024 VASH1 vasohibin 1 -0.617 0.049 TXN2 thioredoxin 2 -0.637 0.098 CBR1 carbonyl reductase 1 -0.647 0.045 TRIM56 tripartite motif containing 56 -0.652 0.045 SDF4 stromal cell derived factor 4 -0.654 0.035 ODF3L1 outer dense fiber of sperm tails 3 like 1 -0.659 0.077

151 CCDC96 coiled-coil domain containing 96 -0.660 0.080 ANKRD9 ankyrin repeat domain 9 -0.660 0.051 WNT9A Wnt family member 9A -0.664 0.086 CAPN5 calpain 5 -0.671 0.066 LYL1 LYL1, basic helix-loop-helix family member -0.674 0.063 MFSD2B major facilitator superfamily domain containing 2B -0.676 0.094 CSTB cystatin B -0.680 0.034 JOSD2 Josephin domain containing 2 -0.696 0.054 PYCR1 pyrroline-5-carboxylate reductase 1 -0.703 0.026 UBAC1 UBA domain containing 1 -0.715 0.045 ZNF541 zinc finger protein 541 -0.716 0.089 PLCD3 phospholipase C delta 3 -0.716 0.031 SLC43A1 solute carrier family 43 member 1 -0.720 0.024 BOLA3 bolA family member 3 -0.720 0.059 UBXN6 UBX domain protein 6 -0.726 0.020 NFIX nuclear factor I X -0.727 0.041 SPC24 SPC24, NDC80 kinetochore complex component -0.727 0.045 PPP2R5B protein phosphatase 2 regulatory subunit B'beta -0.733 0.085 ASB1 ankyrin repeat and SOCS box containing 1 -0.734 0.054 UMODL1 uromodulin like 1 -0.742 0.089 HAGH hydroxyacylglutathione hydrolase -0.753 0.080 TUBB3 tubulin beta 3 class III -0.760 0.034 CCDC124 coiled-coil domain containing 124 -0.764 0.037 MARCH2 membrane associated ring-CH-type finger 2 -0.765 0.081 GSG1 germ cell associated 1 -0.769 0.045 MGLL monoglyceride lipase -0.774 0.045 ARHGDIG Rho GDP dissociation inhibitor gamma -0.775 0.092 DNAH17 dynein axonemal heavy chain 17 -0.775 0.066 SNN stannin -0.778 0.049 MYRF myelin regulatory factor -0.781 0.051 RANBP10 RAN binding protein 10 -0.787 0.038 PINK1 PTEN induced putative kinase 1 -0.795 0.024 GPX1 glutathione peroxidase 1 -0.802 0.092 FKBP8 FK506 binding protein 8 -0.809 0.017 PTP4A3 protein tyrosine phosphatase type IVA, member 3 -0.812 0.036 VAMP5 vesicle associated membrane protein 5 -0.813 0.058 SLC4A1 solute carrier family 4 member 1 (Diego blood group) -0.822 0.080 GALNT10 polypeptide N-acetylgalactosaminyltransferase 10 -0.825 0.020 EPOR erythropoietin receptor -0.837 0.056 ARHGAP32 Rho GTPase activating protein 32 -0.841 0.022 CD82 CD82 molecule -0.842 0.045 GUK1 guanylate kinase 1 -0.878 0.037 SPTB spectrin beta, erythrocytic -0.885 0.030 R3HDM4 R3H domain containing 4 -0.893 0.023 SNX22 sorting nexin 22 -0.902 0.096 FZR1 fizzy/cell division cycle 20 related 1 -0.918 0.010 ALAD aminolevulinate dehydratase -0.942 0.066 EVI5L ecotropic viral integration site 5 like -0.944 0.020 C1QTNF4 C1q and tumor necrosis factor related protein 4 -0.945 0.097 C7orf50 chromosome 7 open reading frame 50 -0.946 0.059 RUNDC3A RUN domain containing 3A -0.955 0.017

152 MXD4 MAX dimerization protein 4 -0.965 0.012 TGM2 transglutaminase 2 -0.970 0.064 FADS6 fatty acid desaturase 6 -0.993 0.045 TMCC2 transmembrane and coiled-coil domain family 2 -0.994 0.045 L3MBTL1 l(3)mbt-like 1 (Drosophila) -1.013 0.059 SMIM1 small integral membrane protein 1 (Vel blood group) -1.040 0.032 RPL41 ribosomal protein L41 -1.042 0.045 TP73 tumor protein p73 -1.055 0.094 NDUFA3 NADH:ubiquinone oxidoreductase subunit A3 -1.071 0.084 LYPD8 LY6/PLAUR domain containing 8 -1.084 0.084 ABTB1 ankyrin repeat and BTB domain containing 1 -1.139 0.034 PHOSPHO1 phosphoethanolamine/phosphocholine phosphatase -1.216 0.035 SNTA1 syntrophin alpha 1 -1.350 0.065

153 Table 4-S2. Gene ontology results for the 203 up-regulated and 146 down-regulated genes as a function of predator pressure.

Enriched Number GO term ID p-value Genes involved biological process of genes

RRM1, TARS, PGM3, RARS, DALRD3, VCP, UTP15, NAT10, CHD8, NFYC, HARS, LARS, HARS2, TAF15, SMARCE1, GMPR2, DDX47, SAMHD1, SNW1, ATP5A1, KARS, RPA2, CPSF3, TDP1, ZNF326, MLH1, SETD6, IWS1, XRCC5, PTCD2, ASH2L, HEATR1, DLD, SMG8, RTF1, AIFM1, SSRP1, GART, Nucleobase- CALR, NUFIP1, PAPSS1, PRKDC, PARP2, containing

GO:0006139 85 7.87E-20 MRPL44, MPHOSPH8, POLR1E, SYNCRIP, compound NDUFS1, ELAC1, NME7, LARP7, METTL16, metabolic process HDAC1, PRKCB, RFC4, FANCD2, TPR, HNRNPD, HNRNPK, SRRM1, CLP1, GMPS, METTL14, TRIP12, GARS, ZFYVE26, UTP6, UTP4, PAN2, FARSB, HDAC3, DROSHA, SON, KAT7, ZNF317, RAD23B, MSH6, regulated genelist regulated

- DICER1, NFATC3, ZNF281, YTHDF2, FOXP1, PGBD1, BOD1L1, GEMIN4 Up LARS, YWHAE, HSP90B1, SNX1, HSPA9, Intracellular protein CALR, LONP2, HSPA4, IPO5, TPR, COPB2, GO:0006886 19 0.00167 transport SRP54, HDAC3, VPS33B, COPG2, TNPO1,

SEC23B, IWS1

Ribonucleoprotein EIF2D, NUFIP1, PAN2, EIF3B, DICER1, GO:0022618 8 0.0167

complex assembly FASTKD2, GEMIN4, CLP1

LARS, KARS, RPA2, TDP1, YWHAE, HSP90B1, MLH1, PDK1, XRCC5, ASH2L, Cellular response to AIFM1, SSRP1, CALR, PRKDC, FANCD2, GO:0033554 28 9.65E-05 stress TPR, HNRNPK, HYOU1, TRIP12, VCP, USP10, ZFYVE26, HDAC3, KAT7, RAD23B,

MSH6, BOD1L1, PARP2

RRM1, GMPR2, SAMHD1, ATP5A1, KARS, Nucleotide GO:0009117 13 0.041 DLD, GART, PAPSS1, NDUFS1, NME7, metabolic process

VCP, GARS, GMPS

MYRF, ALAD, LEMD2, PDE9A, STK11, TMCC2, FZR1, POLR2E, AKT2, SF3B5,

PGPEP1, ELL, UBA52, CHMP4B, TP73, HSD3B7, UQCR10, ASCC2, MED9, LYL1, MXD4, ACSS1, RNF181, ASB1, ACOT8, CARHSP1, PHOSPHO1, NFIX, MARCH2, AGO2, WNT9A, GALNT10, HAGH, PC, Metabolic process GO:0008152 74 0.000231 MED11, CNPPD1, PLCD3, E2F2, FADS6, TESC, L3MBTL1, TGM2, ATXN7, SGTA, regulated genelist regulated

- LTBP4, NDUFB7, FKBP8, PYCR1, JOSD2, FOXO4, UBXN6, PKIG, PINK1, MGAT4B, HYAL2, PTP4A3, NDUFS7, SNTA1, ATP5I, Down PTDSS2, GFI1B, CAPN5, CDC34, ATG4D, MAFG, CYB5R3, GPX1, CSTB, RPS6KB2, GUK1, TXN2, RPL41, UMODL1, NTMT1

154 Bibliography

Akdemir D, Godfrey OU (2015) EMMREML: Fitting Mixed Models with Known Covariance

Structures.

Anders S, Pyl PT, Huber W (2015) HTSeq-A Python framework to work with high-throughput

sequencing data. Bioinformatics, 31, 166–169.

Arlet ME, Isbell LA (2009) Variation in behavioral and hormonal responses of adult male gray-

cheeked mangabeys (Lophocebus albigena) to crowned eagles (Stephanoaetus coronatus)

in Kibale National Park, Uganda. Behavioral Ecology and Sociobiology, 63, 491–499.

Bininda-Emonds ORP, Cardillo M, Jones KE et al. (2007) The delayed rise of present-day

mammals. Nature, 446, 507–512.

Blandino P, Barnum CJ, Solomon LG et al. (2009) Gene expression changes in the

hypothalamus provide evidence for regionally-selective changes in IL-1 and microglial

markers after acute stress. Brain, Behavior, and Immunity, 23, 958–968.

Blumstein DT (2007) The evolution, function, and meaning of marmot alarm call

communication. In: Advances in the Study of Behavior, pp. 371–401.

Blumstein DT (2013) Yellow-bellied marmots: insights from an emergent view of sociality.

Philosophical Transactions of the Royal Society B, 368, 20120349.

Blumstein DT, Barrow L, Luterra M (2008a) Olfactory predator discrimination in yellow-bellied

marmots. Ethology, 114, 1135–1143.

Blumstein DT, Cooley L, Winternitz J, Daniel JC (2008b) Do yellow-bellied marmots respond to

predator vocalizations? Behavioral Ecology and Sociobiology, 62, 457–468.

Blumstein DT, Ferando E, Stankowich T (2009) A test of the multipredator hypothesis: yellow-

bellied marmots respond fearfully to the sight of novel and extinct predators. Animal

155 Behaviour, 78, 873–878.

Blumstein DT, Im S, Nicodemus A, Zugmeyer C (2004a) Yellow-bellied marmots (Marmota

flaviventris) hibernate socially. Journal of Mammalogy, 85, 25–29.

Blumstein DT, Keeley KN, Smith JE (2016) Fitness and hormonal correlates of social and

ecological stressors of female yellow-bellied marmots. Animal Behaviour, 112, 1–11.

Blumstein DT, Lea AJ, Olson LE, Martin JGA (2010) Heritability of anti-predatory traits:

vigilance and locomotor performance in marmots. Journal of Evolutionary Biology, 23,

879–87.

Blumstein DT, Ozgul A, Yovovich V, Van Vuren DH, Armitage KB (2006) Effect of predation

risk on the presence and persistence of yellow-bellied marmot (Marmota flaviventris)

colonies. Journal of Zoology, 270, 132–138.

Blumstein DT, Runyan A, Seymour M et al. (2004b) Locomotor ability and wariness in yellow-

bellied marmots. Ethology, 110, 615–634.

Bujalka H, Koenning M, Jackson S et al. (2013) MYRF Is a Membrane-Associated Transcription

Factor That Autoproteolytically Cleaves to Directly Activate Myelin Genes. PLoS Biology,

11.

Carbon S, Dietze H, Lewis SE et al. (2017) Expansion of the gene ontology knowledgebase and

resources: The gene ontology consortium. Nucleic Acids Research, 45, D331–D338.

Carbon S, Ireland A, Mungall CJ et al. (2009) AmiGO: Online access to ontology and annotation

data. Bioinformatics, 25, 288–289.

Causton HC, Ren B, Koh SS et al. (2001) Remodeling of yeast genome expression in response to

environmental changes. Molecular Biology of the Cell, 12, 323–337.

Chai Y, Koppenhafer SL, Bonini NM, Paulson HL (1999) Analysis of the role of heat shock

156 protein (Hsp) molecular chaperones in polyglutamine disease. The Journal of Neuroscience,

19, 10338–10347.

Cheng J, Guggino W (2013) Ubiquitination and degradation of CFTR by the E3 ubiquitin ligase

MARCH2 through its association with adaptor proteins CAL and STX6. PLoS ONE, 8, 1–

13.

Clinchy M, Sheriff MJ, Zanette LY (2013) Predator-induced stress and the ecology of fear.

Functional Ecology, 27, 56–65.

Clinchy M, Zanette L, Boonstra R, Wingfield JC, Smith JNM (2004) Balancing food and

predator pressure induces chronic stress in songbirds. Proceedings of the Royal Society B:

Biological Sciences, 271, 2473–2479.

Cohen H, Matar MA, Zohar J (2008) Animal Models of Posttraumatic Stress Disorder. In:

Sourcebook of Models for Biomedical Research (ed Conn PM), pp. 591–601. Humana

Press, Totowa, NJ.

Cole SW, Yan W, Galic Z, Arevalo J, Zack JA (2005) Expression-based monitoring of

transcription factor activity: The TELiS database. Bioinformatics, 21, 803–810.

Colinet H, Lee SF, Hoffmann A (2010) Temporal expression of heat shock genes during cold

stress and recovery from chill coma in adult Drosophila melanogaster. FEBS Journal, 277,

174–185.

Creel S, Christianson D, Schuette P (2013a) Glucocorticoid stress responses of lions in

relationship to group composition, human land use, and proximity to people. Conservation

Physiology, 1, 1–9.

Creel S, Dantzer B, Goymann W, Rubenstein DR (2013b) The ecology of stress: effects of the

social environment. Functional Ecology, 27, 66–80.

157 Cullinan WE, Herman JP, Battaglia DF, Akil H, Watson SJ (1995) Pattern and time course of

immediate early gene expression in rat brain following acute stress. Neuroscience, 64, 477–

505.

Dashty M (2013) A quick look at biochemistry: Carbohydrate metabolism. Clinical

Biochemistry, 46, 1339–1352.

Davies MN, Lawn S, Whatley S et al. (2009) To what extent is blood a reasonable surrogate for

brain in gene expression studies: Estimation from mouse hippocampus and spleen.

Frontiers in Neuroscience, 3, 1–6.

Donner J, Pirkola S, Silander K et al. (2008) An association analysis of murine anxiety genes in

humans implicates novel candidate genes for anxiety disorders. Biological Psychiatry, 64,

672–680.

Fraser BA, Weadick CJ, Janowitz I, Rodd FH, Hughes KA (2011) Sequencing and

characterization of the guppy (Poecilia reticulata) transcriptome. BMC Genomics, 12, 202.

Gething M-J, Sambrook J (1992) Protein folding in the cell. Nature, 355, 33–45.

Ghosh S, Dent R, Harper M-E et al. (2010) Gene expression profiling in whole blood identifies

distinct biological pathways associated with obesity. BMC Medical Genomics, 3, 56.

Goebel MU, Mills PJ, Irwin MR, Ziegler MG (2000) Interleukin-6 and tumor necrosis factor-α

production after acute psychological stress, exercise, and infused isoproterenol: differential

effects and pathways. Psychosomatic Medicine, 62, 591–598.

Gracey AY, Chaney ML, Boomhower JP et al. (2008) Rhythms of gene expression in a

fluctuating intertidal environment. Current Biology, 18, 1501–1507.

Gray J, Rubin TG, Hunter R, McEwen BS (2014a) Hippocampal gene expression changes

underlying stress sensitization and recovery. Molecular Psychiatry, 19, 1171–1178.

158 Gray KA, Yates B, Seal RL, Wright MW, Bruford EA (2014b) Genenames.org: The HGNC

resources in 2015. Nucleic Acids Research, 43, D1079–D1085.

Guillou H, Zadravec D, Martin PGP, Jacobsson A (2010) The key roles of elongases and

desaturases in mammalian fatty acid metabolism: Insights from transgenic mice. Progress

in Lipid Research, 49, 186–199.

Hawlena D, Schmitz OJ (2010) Physiological stress as a fundamental mechanism linking

predation to ecosystem functioning. The American Naturalist, 176, 537–556.

Hayward SAL, Pinehart JP, Denlinger DL (2004) Desiccation and rehydration elicit distinct heat

shock protein transcript responses in flesh fly pupae. Journal of Experimental Biology, 207,

963–971.

Hazard D, Fernandez X, Pinguet J et al. (2011) Functional genomics of the muscle response to

restraint and transport in chickens. Journal of Animal Science, 89, 2717–2730.

Horvath S (Ed.) (2011) Weighted Network Analysis: Applications in Genomics and Systems

Biology. Springer, New York.

Houston B, Stewart AJ, Farquharson C (2004) PHOSPHO1- A novel phosphatase specifically

expressed at sites of mineralisation in bone and cartilage. Bone, 34, 629–637.

Jacobs WB, Kaplan DR, Miller FD (2006) The p53 family in nervous system development and

disease. Journal of Neurochemistry, 97, 1571–1584.

Kennerly E, Ballmann A, Martin S et al. (2008) A gene expression signature of confinement in

peripheral blood of red wolves (Canis rufus). Molecular Ecology, 17, 2782–91.

Kim D, Pertea G, Trapnell C et al. (2013) TopHat2: accurate alignment of transcriptomes in the

presence of insertions, deletions and gene fusions. Genome Biology, 14, R36.

Kitaysky AS, Piatt JF, Hatch SA et al. (2010) Food availability and population processes:

159 severity of nutritional stress during reproduction predicts survival of long-lived seabirds.

Functional Ecology, 24, 625–637.

De Kloet ER, Joëls M, Holsboer F (2005) Stress and the brain: From adaptation to disease.

Nature Reviews Neuroscience, 6, 463–475.

Kohane IS, Valtchinov VI (2012) Quantifying the white blood cell transcriptome as an accessible

window to the multiorgan transcriptome. Bioinformatics, 28, 538–545.

Krueger F (2015) Trim Galore: A wrapper tool around Cutadapt and fastQC to consistently apply

quality and adapter trimming to FastQ files.

Langfelder P, Horvath S (2008) WGCNA: an R package for weighted correlation network

analysis. BMC Bioinformatics, 9, 559.

Lavergne SG, McGowan PO, Krebs CJ, Boonstra R (2014) Impact of high predation risk on

genome-wide hippocampal gene expression in snowshoe hares. Oecologia, 176, 613–624.

Law CW, Chen Y, Shi W, Smyth GK (2014) voom: precision weights unlock linear model

analysis tools for RNA-seq read counts. Genome Biology, 15, R29.

Leek JT, Scharpf RB, Bravo HC et al. (2010) Tackling the widespread and critical impact of

batch effects in high-throughput data. Nature Reviews Genetics, 11, 733–739.

Liew CC, Ma J, Tang HC, Zheng R, Dempsey AA (2006) The peripheral blood transcriptome

dynamically reflects system wide biology: A potential diagnostic tool. Journal of

Laboratory and Clinical Medicine, 147, 126–132.

Liu Q, Yuan B, Alice K et al. (2012) Adiponectin regulates expression of hepatic genes critical

for glucose and lipid metabolism. Proceedings of the National Academy of Sciences of the

United States of America, 109, 14568–14573.

Lockwood BL, Sanders JG, Somero GN (2010) Transcriptomic responses to heat stress in

160 invasive and native blue mussels (genus Mytilus): molecular correlates of invasive success.

Journal of Experimental Biology, 213, 3548–3558.

Logue MW, Smith AK, Baldwin C et al. (2015) An analysis of gene expression in PTSD

implicates genes involved in the glucocorticoid receptor pathway and neural responses to

stress. Psychoneuroendocrinology, 57, 1–13.

Lynn SE, Breuner CW, Wingfield JC (2003) Short-term fasting affects locomotor activity,

corticosterone, and corticosterone binding globulin in a migratory songbird. Hormones and

Behavior, 43, 150–157.

Martin TE (2011) The cost of fear. Science, 334, 1353–1354.

Matz JM, Blake MJ, Tatelman HM, Lavoi KP, Holbrook NJ (1995) Characterization and

regulation of cold-induced heat shock protein expression in mouse brown adipose tissue.

The American Journal of Physiology, 269, R38–R47.

McEwen BS, Gianaros PJ (2010) Central role of the brain in stress and adaptation: Links to

socioeconomic status, health, and disease. Annals of the New York Academy of Sciences,

1186, 190–222.

Van Meter P, French J, Dionak S et al. (2009) Fecal glucocorticoids reflect socio-ecological and

anthropogenic stressors in the lives of wild spotted hyenas. Hormones and Behavior, 55,

329–337.

Mihaylova VT, Bindra RS, Yuan J et al. (2003) Decreased expression of the DNA mismatch

repair gene MLH1 under hypoxic stress in mammalian cells. Molecular and Cellular

Biology, 23, 3265–3273.

Miller GE, Chen E, Sze J et al. (2008) A Functional Genomic Fingerprint of Chronic Stress in

Humans: Blunted Glucocorticoid and Increased NF-κB Signaling. Biological Psychiatry,

161 64, 266–272.

Miller GE, Murphy MLM, Cashman R et al. (2014) Greater inflammatory activity and blunted

glucocorticoid signaling in monocytes of chronically stressed caregivers. Brain, Behavior,

and Immunity, 41, 191–199.

Minami M, Kuraishi Y, Yamaguchi T et al. (1991) Immobilization stress induces interleukin-1

beta mRNA in the rat hypothalamus. Neuroscience Letters, 123, 254–256.

Mitew S, Hay CM, Peckham H et al. (2014) Mechanisms regulating the development of

oligodendrocytes and central nervous system myelin. Neuroscience, 276, 29–47.

Monclús R, Tiulim J, Blumstein DT (2011) Older mothers follow conservative strategies under

predator pressure: The adaptive role of maternal glucocorticoids in yellow-bellied marmots.

Hormones and Behavior, 60, 660–665.

Mora A, Komander D, Van Aalten DMF, Alessi DR (2004) PDK1, the master regulator of AGC

kinase signal transduction. Seminars in Cell and Developmental Biology, 15, 161–170.

Morita K, Saito T, Ohta M et al. (2005) Expression analysis of psychological stress-associated

genes in peripheral blood leukocytes. Neuroscience Letters, 381, 57–62.

Nakamura N, Fukuda H, Kato A, Hirose S (2005) MARCH-II is a syntaxin-6 binding protein

involved in endosomal trafficking. Molecular Biology of the Cell, 16, 1696–1710.

Nguyen KT, Deak T, Owens SM et al. (1998) Exposure to acute stress induces brain interleukin-

1b protein in the rat. The Journal of Neuroscience, 18, 2239–2246.

Pajer K, Andrus BM, Gardner W et al. (2012) Discovery of blood transcriptomic markers for

depression in animal models and pilot validation in subjects with early-onset major

depression. Translational Psychiatry, 2, e101-10.

Pauwels K, Stoks R, De Meester L (2005) Coping with predator stress: Interclonal differences in

162 induction of heat-shock proteins in the water flea Daphnia magna. Journal of Evolutionary

Biology, 18, 867–872.

Pijanowska J, Kloc M (2004) Daphnia response to predation threat involves heat-shock proteins

and the actin and tubulin cytoskeleton. Genesis, 38, 81–86.

Preisser EL, Bolnick DI, Benard MF (2005) Scared to death? the effects of intimidation and

consumption in predator-prey interactions. Ecology, 86, 501–509.

R Core Team (2016) R: A language and environment for statistical computing.

Reeder DM, Kramer KM (2005) Stress in free-ranging mammals: integrating physiology,

ecology, and natural history. Journal of Mammalogy, 86, 225–235.

Reimand J, Arak T, Adler P et al. (2016) g:Profiler- a web server for functional interpretation of

gene lists (2016 update). Nucleic Acids Research, 1–7.

Reimand J, Kull M, Peterson H, Hansen J, Vilo J (2007) G:Profiler-a web-based toolset for

functional profiling of gene lists from large-scale experiments. Nucleic Acids Research, 35,

193–200.

Richter CA, Garcia-Reyero N, Martyniuk C et al. (2011) Gene expression changes in female

zebrafish (Danio rerio) brain in response to acute exposure to methylmercury.

Environmental Toxicology and Chemistry, 30, 301–308.

Ritchie ME, Phipson B, Wu D et al. (2015) limma powers differential expression analyses for

RNA-sequencing and microarray studies. Nucleic Acids Research, 43, e47.

Ritossa F (1962) A new puffing pattern induced by temperature shock and DNP in drosophila.

Experientia, 18, 571–573.

Romero IG, Pai AA, Tung J, Gilad Y (2014) RNA-seq: impact of RNA degradation on transcript

quantification. BMC Biology, 12, 42.

163 Rudkowska I, Raymond C, Ponton A et al. (2011) Validation of the use of peripheral blood

mononuclear cells as surrogate model for skeletal muscle tissue in nutrigenomic studies.

OMICS: A Journal of Integrative Biology, 15, 1–7.

Rui L (2014) Energy metabolism in the liver. Comprehensive Physiology, 4, 177–197.

Sands J, Creel S (2004) Social dominance, aggression and faecal glucocorticoid levels in a wild

population of wolves, Canis lupus. Animal Behaviour, 67, 387–396.

Sanogo YO, Hankison S, Band M, Obregon A, Bell AM (2011) Brain transcriptomic response of

threespine sticklebacks to cues of a predator. Brain, Behavior and Evolution, 77, 270–85.

Sapolsky RM, Romero M, Munck AU (2000) How do glucocorticoids influence stress

responses? Integrating permissive, stimulatory, and preparative actions. Endocrine Reviews,

21, 55–89.

Di Schiavi E, Riano E, Heye B, Bazzicalupo P, Rugarli EI (2005) UMODL1/Olfactorin is an

extracellular membrane-bound molecule with a restricted spatial expression in olfactory and

vomeronasal neurons. European Journal of Neuroscience, 21, 3291–3300.

Smedley D, Haider S, Durinck S et al. (2015) The BioMart community portal: An innovative

alternative to large, centralized data repositories. Nucleic Acids Research, 43, W589–W598.

Smith SM, Vale WW (2006) The role of the hypothalamic-pituitary-adrenal axis in

neuroendocrine responses to stress. Dialogues in Clinical Neuroscience, 8, 383–395.

Soria-Carrasco V, Castresana J (2012) Diversification rates and the latitudinal gradient of

diversity in mammals. Proceedings of the Royal Society B: Biological Sciences, 279, 4148–

4155.

Storey JD, Bass AJ, Dabney A, Robinson D (2015) QVALUE: Q-value Estimation for False

Discovery Rate Control.

164 Storey JD, Tibshirani R (2003) Statistical significance for genomewide studies. Proceedings of

the National Academy of Sciences of the United States of America, 100, 9440–5.

Su Y, Xie Z, Xin G, Zhao L, Li K (2011) Predator exposure-induced cerebral interleukins are

modulated heterogeneously by behavioral asymmetry. Immunology Letters, 135, 158–164.

Sullivan PF, Fan C, Perou CM (2006) Evaluating the comparability of gene expression in blood

and brain. American Journal of Medical Genetics - Neuropsychiatric Genetics, 141 B, 261–

268.

Thomas WK, Martin SL (1993) A recent origin of marmots. Molecular Phylogenetics and

Evolution, 2, 330–336.

Tochigi M, Iwamoto K, Bundo M et al. (2008) Gene expression profiling of major depression

and suicide in the prefrontal cortex of postmortem brains. Neuroscience Research, 60, 184–

191.

Trapnell C, Pachter L, Salzberg SL (2009) TopHat: Discovering splice junctions with RNA-Seq.

Bioinformatics, 25, 1105–1111.

Tung J, Zhou X, Alberts SC, Stephens M, Gilad Y (2015) The genetic architecture of gene

expression levels in wild baboons. eLife, 4, 1–22.

Verleih M, Borchel A, Krasnov A et al. (2015) Impact of thermal stress on kidney-specific gene

expression in farmed regional and imported rainbow trout. Marine Biotechnology, 17, 576–

592.

Van Vuren DH (2001) Predation on yellow-bellied marmots (Marmota flaviventris). The

American Midland Naturalist, 145, 94–100.

Walsh RC, Koukoulas I, Garnham A et al. (2001) Exercise increases serum Hsp72 in humans.

Cell Stress & Chaperones, 6, 386–393.

165 Wang J (2011) Coancestry: A program for simulating, estimating and analysing relatedness and

inbreeding coefficients. Molecular Ecology Resources, 11, 141–145.

Wold MS (1997) Replication Protein A: a heterotrimeric, single-stranded DNA-binding protein

required for eukaryotic DNA metabolism. Annual Review of Biochemistry, 66, 61–92.

Wong CG, Bonakdar M, Mautz WJ, Kleinman MT (1996) Chronic inhalation exposure to ozone

and nitric acid elevates stress-inducible heat shock protein 70 in the rat lung. Toxicology,

107, 111–119.

Wright FA, Sullivan PF, Brooks AI et al. (2014) Heritability and genomics of gene expression in

peripheral blood. Nature Genetics, 46, 430–437.

Xia D, Ji W, Xu C et al. (2017) Knockout of MARCH2 inhibits the growth of HCT116 colon

cancer cells by inducing endoplasmic reticulum stress. Cell Death and Disease, 8, e2957.

Yamano Y, Yoshioka M, Toda Y et al. (2004) Regulation of CRF, POMC and MC4R gene

expression after electrical foot shock stress in the rat amygdala and hypothalamus. The

Journal of Veterinary Medical Science, 66, 1323–1327.

Young AJ, Carlson A a, Monfort SL et al. (2006) Stress and the suppression of subordinate

reproduction in cooperatively breeding meerkats. Proceedings of the National Academy of

Sciences, 103, 12005–12010.

Zou L, Elledge SJ (2003) Sensing DNA damage through ATRIP recognition of RPA-ssDNA

complexes. Science, 300, 1542–1548.

166