University of New Hampshire University of New Hampshire Scholars' Repository

Doctoral Dissertations Student Scholarship

Spring 2017

Genomic evaluation of male reproductive adaptations and responses to dehydration in Peromyscus eremicus (Cactus mouse)

Lauren Kordonowy University of New Hampshire, Durham

Recommended Citation Kordonowy, Lauren, "Genomic evaluation of male reproductive adaptations and responses to dehydration in Peromyscus eremicus (Cactus mouse)" (2017). Doctoral Dissertations. 152. https://scholars.unh.edu/dissertation/152

This Dissertation is brought to you for free and open access by the Student Scholarship at University of New Hampshire Scholars' Repository. It has been accepted for inclusion in Doctoral Dissertations by an authorized administrator of University of New Hampshire Scholars' Repository.

Genomic evaluation of male reproductive adaptations and responses to dehydration in Peromyscus eremicus (Cactus mouse)

BY

Lauren Kordonowy B.A. in Biology (with Honors), Kenyon College, 2006

M.Sc. in Biological Sciences, Simon Fraser University, 2009

DISSERTATION

Submitted to the University of New Hampshire in Partial Fulfillment of the Requirements for the Degree of

Doctor of Philosophy in Genetics

May, 2017

Genomic evaluation of male reproductive adaptations and responses to dehydration in Peromyscus eremicus (Cactus mouse)

BY

Lauren Kordonowy

This dissertation has been examined and approved in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Genetics by:

Dissertation Advisor: Matthew MacManes, Ph.D., Assistant Professor of Molecular, Cellular, and Biomedical Sciences (UNH)

David Plachetzki, Ph.D., Assistant Professor of Molecular, Cellular, and Biomedical Sciences (UNH)

Philip Hatcher, Ph.D., Professor of Computer Science (UNH)

W. Kelley Thomas, Ph.D., Hubbard Professor of Genomics and Director for Department of Molecular, Cellular, and Biomedical Sciences (UNH)

David Townson, Ph.D., Department Chair and Professor, Department of Animal and Veterinary Sciences (University of Vermont)

4-28-2017

Date of Defense

Original approval signatures are on file with the University of New Hampshire Graduate School.

COPYRIGHT

Citation and Copyright for Chapter 1:

Kordonowy and MacManes (2016), Characterization of a male reproductive transcriptome for Peromyscus eremicus (Cactus mouse). PeerJ 4:e2617; DOI 10.7717/peerj.2617

Copyright 2016 Kordonowy and MacManes

Distributed under Creative Commons CC-BY 4.0

OPEN ACCESS

Citation for Chapter 2:

Kordonowy L, MacManes M. 2016. Transcriptomic Evidence for Reproductive Suppression in Male Peromyscus eremicus (Cactus Mouse) Subjected to Acute Dehydration. bioRxiv DOI: https://doi.org/10.1101/096057; BMC Genomics: in review

ACKNOWLEDGEMENTS

I deeply acknowledge Dr. Matt MacManes for serving as my Ph.D. advisor. He granted me the incredible opportunity to change course in my Ph.D. research at a critical juncture during my degree. Moreover, Matt allowed me the freedom to explore research questions that interested me, and he provided patient and continuous expertise in a novel research area in my education: bioinformatics and genomics. Furthermore, he has fully supported my mentorship of several undergraduates in his lab, which has provided me with invaluable experience. Finally, Matt has embraced my passion for outreach in the sciences, and he has been a valuable partner in our UNH Science Sleuths education program for local pre-school aged children.

I also want to acknowledge Philip Hatcher, Kelley Thomas, David Townson, and David Plachetzki for serving on my dissertation committee. Thank you for meeting with me frequently to aid my research progress. I have always felt like your doors have been open to me, and you have been instrumental in my educational success at UNH.

I further acknowledge my intrepid fellow MacManes lab graduate student, Andrew Lang. Andrew has been a skilled and supportive partner at the lab bench, in coursework, in writing, in teaching, and in education outreach. Thank you for being my best ever lab partner!

I finally acknowledge my husband, Jonathan Cummings, for moving to New Hampshire with me and for his tireless support during my entire Ph.D. Thank you for countless hours of compassion and intellectual questions and for your expertise in R!

I also acknowledge my entire family and all of my friends who have supported me throughout my graduate education and who have made this time enjoyable.

Last, but not least, thank you, Bailey!

TABLE OF CONTENTS

COMMITTEE PAGE

COPYRIGHT

ACKNOWLEDGEMENTS

TABLE OF CONTENTS

LIST OF TABLES

LIST OF FIGURES

ABSTRACT

CHAPTER

INTRODUCTION

I. Characterization of a male reproductive transcriptome for Peromyscus eremicus (Cactus mouse)

Abstract

Introduction

Methods

Results and Discussion

Conclusions

Dryad Data File List

Additional Information and Declarations

Tables and Figures

References

II. Characterizing the reproductive transcriptomic correlates of acute dehydration in males in the desert-adapted rodent, Peromyscus eremicus

Abstract

Background

Methods

Results

Discussion

Conclusion

DropBox Data File List

Declarations

Tables and Figures

References

III. Characterization of the Seminal Vesicle Proteome of Peromyscus eremicus (Cactus Mouse) in hydrated and dehydrated conditions

Abstract

Dataset Brief

Supplemental Materials

Figure

References

APPENDIX: IACUC Approval Letter

LIST OF TABLES

CHAPTER 1: Table 1

CHAPTER 1: Table 2

CHAPTER 1: Table 3

CHAPTER 2: Table 1

CHAPTER 2: Table 2

CHAPTER 2: Table 3

CHAPTER 2: Table 4

CHAPTER 2: Table 5

CHAPTER 2: Supplemental Table 1

CHAPTER 2: Supplemental Table 2

LIST OF FIGURES

CHAPTER 1: Figure 1

CHAPTER 1: Figure 2

CHAPTER 1: Figure 3

CHAPTER 1: Figure 4

CHAPTER 2: Figure 1

CHAPTER 2: Figure 2

CHAPTER 2: Figure 3

CHAPTER 2: Figure 4

CHAPTER 2: Figure 5

CHAPTER 2: Supplemental Figure 1

CHAPTER 2: Supplemental Figure 2

CHAPTER 2: Supplemental Figure 3

CHAPTER 3: Figure 1

ABSTRACT

Genomic evaluation of male reproductive adaptations and responses to dehydration in Peromyscus eremicus (Cactus mouse)

By

Lauren Kordonowy

University of New Hampshire, May, 2017

Research elucidating the genetic architecture of physiological mechanisms enabling survival and reproduction in extreme environments is becoming prominent in evolutionary biology. The desert, in particular, poses numerous challenges for its endemic species, and rodents (and often, ) have been the focus for survival adaptations pertaining to water-limitation. However, desert rodent adaptation research has focused predominantly on survival, while potential physiological reproductive adaptations to dehydration have received less attention, aside from research evaluating water as a reproductive cue. The fact that we do not know the physiological mechanisms enabling reproduction during dehydration is surprising, as desert rodents must possess adaptations to successfully reproduce in their water-limited habitats.

The cactus mouse (Peromyscus eremicus), a desert-specialist in the Southwest United States, is the focus of my genetic exploration of reproductive adaptations to dehydration. My dissertation describes three research studies that 1) characterize male cactus mouse reproductive tissue transcriptomes and find signatures of positive selection in these tissues relative to other rodent species, 2) describe differential expression of genes responding to water-limitation within testes, providing candidate genes for future studies exploring the impacts of acute and chronic drought on P. eremicus reproduction, and 3) generate a seminal vesicle proteome representative of proteins present in hydrated and dehydrated conditions experienced by the cactus mouse. These three studies contribute comprehensive genetic data critical to future research exploring the effects of water-limitation on reproduction as well as the genetic mechanisms for potential male reproductive adaptations in this desert-adapted rodent. Research on desert adaptations is particularly timely, as climate change will result in more frequent stochastic drought events and elevated temperatures. These increasing abiotic shifts will exacerbate clinical challenges for global health; thus, an enhanced understanding of mammalian desert-specialist adaptations may improve our ability to address these physiological demands in humans.

INTRODUCTION

Overview and Rationale

Evolutionary biology research increasingly explores the genetic underpinnings of physiological, behavioral, and ecological traits. These studies frequently use genomic and transcriptomic methods to identify genes under selection, to evaluate differential gene expression, and to describe gene ontologies (e.g. Atallah et al., 2013; Plachetzki et al., 2014; Guillen et al., 2015; Bedford and Hoekstra, 2015; Munshi-South and Richardson, 2016; Behringer et al., 2015; Chen et al., 2016; Suarez-Vega et al., 2016; Macrander et al., 2016; Marra et al., 2014; MacManes, 2017). However, this research is often exploratory, as it frequently generates candidate genes for future studies to evaluate. This is because the analyses required to attribute function to genes are best executed on model organisms, which already possess a wealth of genomic, transcriptomic, and physiological resources. Non-model organisms with environmental adaptations of interest have often been the focus of behavioral, ecological, or physiological research, but until recently, they rarely had complete, high-quality genomes or comprehensive transcriptomes or proteomes (Ellegren, 2014). Thus, it is incumbent upon evolutionary biologists to collect sufficient genetic data to bridge the informational gap between model and non-model species before they can pursue functional studies that test the adaptive role of genes of interest (Storz and Wheat, 2010). Such is the case with my dissertation study species, the cactus mouse (Peromyscus eremicus).

The cactus mouse has been the focus of intermittent biological research for decades (Dewey et al., 1966; summarized in: King, 1968; Veal and Caire, 1979; MacMillen and Garland, 1989); however, evolutionary biologists have only recently begun collecting genetic data on this species for adaptation research (MacManes and Eisen, 2014; Kordonowy and MacManes, 2016; Kordonowy and MacManes, 2017; Kordonowy et al., 2017a; MacManes, 2017). This desert-adapted rodent is endemic to the Southwestern United States, and this species is remarkable for its ability to live entirely without drinking water (Veal and Caire, 1979; Caire, 1999). Recent research has sought to elucidate cactus mouse renal adaptations for survival using genomic (MacManes and Eisen, 2014; MacManes, 2017) and physiological methods (Kordonowy et al., 2017b). However, previous research focused exclusively on cactus mouse survival adaptations and did not address potential physiological reproductive adaptations to combat water-limitation.

My thesis explores potential male reproductive adaptations in the cactus mouse using transcriptomic methodologies and bioinformatic analyses, as well as proteomics. The aims of my dissertation research are (1) to genetically characterize male cactus mouse reproductive tissues utilizing transcriptomics (testes, epididymis and vas deferens: Kordonowy and MacManes, 2016) and proteomics (seminal vesicles: Kordonowy et al., 2017a), (2) to determine cactus mouse testes responses to dehydration (Kordonowy and MacManes, 2017), and (3) to propose and explore our hypothesis for male cactus mouse reproductive physiological adaptations to water-limitation (Kordonowy and MacManes, 2016). In the last three years, I have generated genetic resources for previously uncharacterized male reproductive tissues in the cactus mouse, and I have also conducted comparative and descriptive analyses to determine male reproductive responses to dehydration and to find candidate genes for potential reproductive adaptations. This will allow future studies to evaluate functional causality between the genes demonstrating transcriptomic responses to dehydration and physiological adaptations for acute water-limitation in the cactus mouse: an ultimate goal in this evolutionary biology research.

The implications of desert adaptation research extend well beyond the realm of scientific inquiry within the field of experimental biology. Indeed, Johnson and colleagues (2016) asserted that studies on adaptations in desert-specialized mammals will be instrumental in our ability to address a growing health crisis due to the mounting effects of dehydration and heat-stress resulting from climate change.

Background

Desert mammals have evolved a variety of physiological and behavioral adaptations to live in hot, arid environments, including metabolic water production (MacMillen & Hinds, 1983; reviewed in Walsberg 2000), renal adaptations (Schmidt-Nielsen et al., 1948; Schmidt-Nielsen & Schmidt-Nielsen, 1952; Vimtup & Schmidt-Nielsen, 1952; Dantzler, 1982; Diaz, Ojeda & Rezende, 2006; Urity et al., 2012), large ear size (Schmidt-Nielsen, 1964; Hill & Veghte, 1976), burrowing (in Vorhies, 1945; Kelt, 2011), nocturnality (Stevens & Tello, 2009; Fuller et al., 2014), and seasonal torpor (Kalabukhov 1960; Geiser, 2010). Peromyscus eremicus possesses several of these adaptations; the cactus mouse is nocturnal and enters torpor during the hottest summer months, it uses burrows, and it produces metabolic water (reviewed in Veal and Caire, 1979; MacMillen and Garland, 1989). However, the cactus mouse does not possess the same kidney architecture as the kangaroo rat; namely, P. eremicus does not concentrate urine via elongated loops of Henle (Dewey et al., 1966; MacManes 2016, unpublished data). Thus, recent physiological and genetic studies have begun to pursue the currently unknown mechanisms for renal adaptations in the cactus mouse (MacManes and Eisen, 2014; Kordonowy et al., 2017b; MacManes, 2017). However, my dissertation research is the first to hypothesize and explore potential reproductive adaptations in this species (Kordonowy and MacManes, 2016).

3

The rationale for evaluating potential reproductive adaptations is that desert-adapted organisms are subject to considerable selective pressures for successful reproduction despite water-limitation. Reproductive proteins have been shown to undergo rapid rates of evolution (reviewed in Swanson and Vacquier, 2002; Ramm et al., 2014). For example, among rodents, this has been demonstrated in Mus musculus epididymis-specialized genes (Dean et al., 2008), in sperm proteins involved in motility and in sperm-egg interactions for various mouse strains (Vicens et al., 2014), and in Peromyscus testes proteins (Turner et al., 2008). Therefore, I predicted that selective habitat forces in deserts are capable of shaping reproductive proteins in response to dehydration in P. eremicus (Kordonowy and MacManes, 2016).

Several studies have explored the role of water as a cue for reproduction in desert rodents (but not the cactus mouse), as well as the suppressive reproductive effects of dehydration (reviewed in Schwimmer and Haim, 2009; Bales and Hostetler 2011; tested in Yahr and Kessler, 1975; Breed, 1975; Christian 1979; Breed and Leigh, 2011; Henry and Dubost, 2012; Sarli et al., 2015; Sarli et al., 2016). The physiological mechanisms responsible for these reproductive effects are currently unknown, though some researchers have proposed that arginine vasopressin, an antidiuretic hormone, is involved (reviewed in Schwimmer and Haim, 2009; tested in: Shanas and Haim, 2004; Wube et al., 2008; Bukovetzky et al., 2012a; 2012b).

However, prior to evaluating evidence for reproductive adaptations in the male cactus mouse, we needed to produce a strong foundational genetic base to pursue this hypothesis; specifically, we required comprehensive genetic data for the investigated tissues. I chose to generate genetic data for four male reproductive tissues involved in sperm development and movement: the testes, epididymis, vas deferens, and seminal vesicles. First, I sequenced and assembled a transcriptome for the testes, epididymis and vas deferens. I then generated a proteome for the seminal vesicles, because they are extraordinarily protein rich, which makes them ideal for building a proteome but poorly suited to sequencing a transcriptome. The transcriptomic and proteomic data for these four tissues have allowed us to begin to pursue our research question on potential male cactus mouse reproductive adaptations.

My research utilizes bioinformatics to evaluate genetic data from male cactus mouse reproductive tissues. Although the genetic mechanisms responsible for adaptive morphologies among species residing in extreme environments remain largely unknown, this is a rapidly expanding area of research (e.g. Cheviron and Brumfield, 2011; Bedford and Hoekstra, 2015; Herrera, Watanabe & Shank, 2015). This expansion is largely a result of genomic resources allowing researchers to elucidate the ultimate mechanisms of adaptive evolution with higher sensitivity (e.g. Guillen et al., 2015; MacManes and Eisen, 2014). Recent adaptation research in other Peromyscus species has successfully used genomic and transcriptomic approaches (reviewed in: Bedford and Hoekstra, 2015; Munshi-South and Richardson, 2016). Furthermore, several studies have evaluated differential gene expression and genes under positive selection in desert-rodent kidneys (Marra et al., 2012; Marra et al., 2014; MacManes & Eisen, 2014; MacManes, 2017). Differential expression analyses can identify genes with higher or lower expression in species occupying extreme environments, and these candidate genes serve as the focus for additional analyses to determine whether they play a functional role in adaptation. The manuscript (in review, BMC Genomics) in Chapter 2 uses this approach to address my dissertation research question.

Summary of Dissertation Research and Thesis Chapter Descriptions

Genomic, transcriptomic and proteomic resources utilized within the context of evolutionary biology provide incontrovertibly powerful approaches to explore the genetic mechanisms for organismal adaptations to extreme environments. My dissertation describes the harnessing of genetic resources to pursue the following evolutionary biology research question: What reproductive adaptations do desert-specialized male Peromyscus eremicus (cactus mouse) possess to combat dehydration? In order to address this line of inquiry, we have generated considerable genomic (MacManes, unpublished data), transcriptomic, and proteomic data for the relevant male reproductive tissues in this study species.

Chapter 1 is a published manuscript (Kordonowy and MacManes, 2016) characterizing a comprehensive three-tissue male reproductive transcriptome for P. eremicus. These transcriptomic data provide an essential genetic resource for reproductive tissue research in this species. Indeed, we leveraged this transcriptomic resource for our differential expression manuscript (described below). This Chapter 1 manuscript also includes a comparative analysis of reproductive tissues across three rodent species to identify genes with signatures of positive selection in the cactus mouse. These candidate genes should be of interest to future studies addressing our research question on male cactus mouse reproductive adaptations.

Chapter 2 is a manuscript currently in review at BMC Genomics (Kordonowy and MacManes, 2017), which describes a large-scale differential gene expression study, wherein we experimentally exposed 22 male cactus mice to either acutely dehydrated or hydrated treatment conditions. We performed a robust three-tiered statistical analysis for evaluating differential expression of genes in testes between treatments, and our results, surprisingly, suggested reproductive modulation in the testes of acutely dehydrated mice. Our work illustrates the need for more research on reproductive effects of chronic and acute dehydration in this species. The differentially expressed genes identified in this manuscript should be evaluated in future research to causatively link their altered expression levels with reproductive mitigation; nevertheless, our findings are unexpected in a species highly adapted for desert-habitation.

Chapter 3 is a manuscript that will be submitted imminently to Proteomics as a Dataset Brief (Kordonowy et al., 2017a) describing the seminal vesicle proteome for P. eremicus. This proteome was generated using three hydrated and three dehydrated cactus mice, because it is essential that it be representative of both hydration states that this species experiences. We describe the gene ontology for the proteome. Our intent is that future research will use this proteome to evaluate protein composition differences between experimentally manipulated, differentially hydrated laboratory mice and wild-caught cactus mice to explore reproductive protein modulation responses to drought. Such work would both contribute significantly to our limited understanding of dehydration effects on fertility and reproductive success in the cactus mouse and aid in evaluating our novel male reproductive adaptation hypothesis.

REFERENCES:

Atallah J, Plachetzki DC, Jasper WC, Johnson BR. 2013. The utility of shallow RNA-seq for documenting differential gene expression in genes with high and low levels of expression. PLoS One 8(12): e84160 (pp. 1-11). DOI: 10.1371/journal.pone.0084160.

Bales KL, Hostetler CM. 2011. Hormones and Reproductive Cycles in rodents. Pp. 215-240 in Norris DO, Lopez KH, eds. Hormones and Reproduction of Vertebrates: Volume 5 Mammals. London, UK: Academic Press (Elsevier).

Bedford NL, Hoekstra HE. 2015. The Natural History of Model Organisms: Peromyscus mice as a model for studying natural variation. eLife 4: e06813. DOI: 10.7554/eLife.06813.

Behringer D, Zimmermann H, Ziegenhagen B, Liepelt S. 2015. Differential gene expression reveals candidate genes for drought stress response in Abies alba (Pinaceae). PLoS One 10(4): e0124564 (pp. 1-18). DOI: 10.1371/journal.pone.0124564.

Breed WG. 1975. Environmental factors and reproduction in the female hopping mouse, Notomys alexis. The Journal of the Society for Reproduction and Fertility 45: 273-281. DOI: 10.1530/jrf.0.0450273.

Breed WG, Leigh CM. 2011. Reproductive biology of an old endemic murid rodent of Australia, the Spinifex hopping mouse, Notomys alexis: adaptations for life in the arid zone. Integrative Zoology 6: 321-333. DOI: 10.1111/j.1749-4877.2011.00264.x

Bukovetzky E, Fares F, Schwimmer H, Haim A. 2012a. Reproductive and metabolic responses of desert adapted common spiny male mice (Acomys cahirinus) to vasopressin treatment. Comparative Biochemistry and Physiology, Part A 162(4): 349-356. http://doi.org/10.1016/j.cbpa.2012.04.007

Bukovetzky E, Schwimmer H, Fares F, Haim A. 2012b. Photoperiodicity and increasing salinity as environmental cues for reproduction in desert adapted rodents. Hormones and Behavior 61(1): 84-90. http://doi.org/10.1016/j.yhbeh.2011.10.006

Caire W. 1999. Cactus mouse in The Smithsonian Book of North American Mammals. Wilson D, Ruff S, editors. Washington, D.C.: Smithsonian Institution Press; 567-568.

Chen W, Yao Q, Patil GB, Agarwal G, Deshmukh RK, Lin L, Wang B, Wang Y, Prince SJ, Xu D, An YC, Valliyodan B, Varshney RK, Nguyen HT. 2016. Identification and comparative analysis of differential gene expression in soybean leaf tissue under drought and flooding stress revealed by RNA-seq. Frontiers in Plant Science 7(Article 1044): (pp. 1-19). DOI: 10.3389/fpls.2016.01044.

Cheviron ZA, Brumfield RT. 2011. Genomic insights into adaptation to high-altitude environments. Heredity 108(4): 354-361. DOI: 10.1038/hdy.2011.85.

Christian DP. 1979. Comparative demography of three Namib desert rodents: Responses to the provision of supplementary water. Journal of Mammalogy 60(4): 679-690. https://doi.org/10.2307/1380185

Dantzler WH. 1982. Renal Adaptations of Desert Vertebrates. BioScience 32(2): 108-113. DOI: 10.2307/1308563

Dean MD, Good JM, Nachman MW. 2008. Adaptive evolution of proteins secreted during sperm maturation: an analysis of the mouse epididymal transcriptome. Molecular Biology and Evolution 25(2): 383-392. DOI: 10.1093/molbev/msm265.

Dewey GC, Elias H, Appel KR. 1966. Stereology of renal corpuscles of desert and swamp deer mice. Nephron 3(6): 352-365. DOI: 10.1159/000179552.

Diaz GB, Ojeda RA, Rezende EL. 2006. Renal morphology, phylogenetic history and desert adaptation of South American Hystricognath rodents. Functional Ecology 20: 609-620. DOI: 10.1111/j.1365-2435.2006.01144.x

Ellegren H. 2014. Genome sequencing and population genomics in non-model organisms. Trends in Ecology and Evolution 29(1): 1-13. DOI: 10.1016/j.tree.2013.09.008

Fuller A, Hetem RS, Maloney SK, Mitchell D. 2014. Adaptation to Heat and Water Shortage in Large, Arid-Zone Mammals. Physiology 29(3): 159-167. DOI: 10.1152/physiol.00049.2013

Geiser F. in Aestivation: Molecular and Physiological Aspects (Volume 49 in the series Progress in Molecular and Subcellular Biology, Springer, Berlin, 2010, Navas A, Carvalho C, Eduardo J.) 95-111. DOI: 10.1007/978-3-642-02421-4_5

Guillen Y, Rius N, Delprat A, Williford A, Muyas F, Puig M, Casillas S, Ramia M, Egea R, Negre B, Mir G, Camps J, Moncunill V, Ruiz-Ruano FJ, Cabrero J, De Lima LG, Dias GB, Ruiz JC, Kapusta A, Garcia-Mas J, Gut M, Gut IG, Torrents D, Camacho JP, Kuhn GCS, Feschotte C, Clark AG, Betran E, Barbadilla A, Ruiz A. 2015. Genomics of ecological adaptation in cactophilic Drosophila. Genome Biology and Evolution 7(1): 349-366. DOI: 10.1093/gbe/evu291

Henry O, Dubost G. 2012. Breeding periods of Gerbillus cheesmani (Rodentia, Muridae) in Saudi Arabia. Mammalia 76(4): 383-387. DOI: 10.1515/mammalia-2012-0017.

Herrera S, Watanabe H, Shank TM. 2015. Evolutionary and biogeographical patterns of barnacles from deep-sea hydrothermal vents. Molecular Ecology 24: 673-689. DOI: 10.1111/mec.13054

Hill RW, Veghte JH. 1976. Jackrabbit ears: surface temperatures and vascular responses. Science 194(4263): 436-438. DOI: 10.1126/science.982027

Johnson RJ, Stenvinkel P, Jensen T, Lanaspa MA, Roncal C, Song Z, Bankir L, Sanchez-Lozada LG. 2016. Metabolic and kidney diseases in the setting of climate change, water shortage, and survival factors. Journal of the American Society of Nephrology 27(8): 2247-2256. DOI: 10.1681/ASN.2015121314

Kalabukhov NI. 1960. Comparative ecology of hibernating mammals. Bulletin of the Museum of Comparative Zoology at Harvard 124: 45-74. DOI: N/A

Kelt DA. 2011. Comparative ecology of desert small mammals: a selective review of the past 30 years. Journal of Mammalogy 92(6): 1158-1178. DOI: 10.1644/10-MAMM-S-238.1

King JA, editor. 1968. Biology of Peromyscus (Rodentia). Special Publication No. 2, The American Society of Mammalogists, xiii + 594 pp. DOI: 10.2307/1378817

Kordonowy LK, MacManes MD. 2016. Characterization of a male reproductive transcriptome for Peromyscus eremicus (cactus mouse). PeerJ 4: e2617 (pp. 1-23). DOI: 10.7717/peerj.2617.

Kordonowy L, MacManes M. 2017. Characterizing the reproductive transcriptomic correlates of acute dehydration in males in the desert-adapted rodent, Peromyscus eremicus. BMC Genomics: in review. bioRxiv DOI: 10.1101/096057

Kordonowy L, Hogan D, Chu F, MacManes M. 2017a. Characterization of the Seminal Vesicle Proteome of Peromyscus eremicus (Cactus mouse) in hydrated and dehydrated conditions. Proteomics: Dataset Brief: pending submission.

Kordonowy L, Lombardo K, Green H, Dawson MD, LaCourse S, Bolton E, MacManes M. 2017b. Physiological and biochemical changes associated with experimental dehydration in the desert adapted cactus mouse, Peromyscus eremicus. Physiological Reports 5: (e13218). DOI: 10.14814/phy2.13218

MacManes MD, Eisen MB. 2014. Characterization of the transcriptome, nucleotide sequence polymorphism, and natural selection in the desert adapted mouse Peromyscus eremicus. PeerJ 2: e642. DOI: 10.7717/peerj.642.

MacManes MD. 2017. Severe acute dehydration in a desert rodent elicits a transcriptional response that effectively prevents kidney injury. American Journal of Physiology-Renal Physiology: in press (April 5, 2017). DOI: 10.1152/ajprenal.00067.2017

MacMillen RE, Garland T Jr. in Advances in the Study of Peromyscus (Rodentia) (Texas Tech University Press, Lubbock, 1989, Kirkland LG Jr, Layne JN) 143-168.

MacMillen RE, Hinds DS. 1983. Water regulatory efficiency in Heteromyid rodents: A model and its application. Ecology 64(1): 152-164. DOI: 10.2307/1937337

Macrander J, Broe M, Daly M. 2016. Tissue-specific venom composition and differential gene expression in sea anemones. Genome Biology and Evolution 8(8): 2358-2375. DOI: 10.1093/gbe/evw155

Marra NJ, Eo SH, Hale MC, Waser PM, DeWoody JA. 2012. A priori and a posteriori approaches for finding genes of evolutionary interest in non-model species: Osmoregulatory genes in the kidney transcriptome of the desert rodent Dipodomys spectabilis (banner-tailed kangaroo rat). Comparative Biochemistry and Physiology, Part D: Genomics Proteomics 7(4): 328-339. DOI: 10.1016/j.cbd.2012.07.001.

Marra NJ, Romero A, DeWoody A. 2014. Natural selection and the genetic basis of osmoregulation in Heteromyid rodents as revealed by RNA-seq. Molecular Ecology 23(11): 2699-2711. DOI: 10.1111/mec.12764.

Munshi-South J, Richardson JL. 2016. Peromyscus transcriptomics: understanding adaptation and gene expression plasticity within and between species of deer mice. Seminars in Cell & Developmental Biology: in press (pp. 1-9). DOI: 10.1016/j.semcdb.2016.08.011

Plachetzki DC, Pankey MS, Johnson BR, Ronne EJ, Kopp A, Grosberg RK. 2014. Gene co-expression modules underlying polymorphic and monomorphic zooids in the colonial Hydrozoan, Hydractinia symbiolongicarpus. Integrative and Comparative Biology 54(2): 276-283. DOI: 10.1093/icb/icu080

Ramm SA, Scharer L, Ehmcke J, Wistuba J. 2014. Sperm competition and the evolution of spermatogenesis. Molecular Human Reproduction 20(12): 1169-1179. DOI: 10.1093/molehr/gau070.

Sarli J, Lutermann H, Alagaili AN, Mohammed OB, Bennett NC. 2015. Reproductive patterns in the Baluchistan gerbil, Gerbillus nanus (Rodentia: Muridae), from western Saudi Arabia: The role of rainfall and temperature. Journal of Arid Environments 113: 87-94. http://dx.doi.org/10.1016/j.jaridenv.2014.09.007

Sarli J, Lutermann H, Alagaili AN, Mohammed OB, Bennett NC. 2016. Seasonal reproduction in the Arabian spiny mouse, Acomys dimidiatus (Rodentia: Muridae) from Saudi Arabia: The role of rainfall and temperature. Journal of Arid Environments 124: 352-359. http://dx.doi.org/10.1016/j.jaridenv.2015.09.008

Schmidt-Nielsen K. in Desert Animals: Physiological Problems of Heat and Water (Oxford

University Press, New York, 1964, Schmidt-Nielsen K.) 129-138.

Schmidt-Nielsen B, Schmidt-Nielsen K, Brokaw A, Schneiderman H. 1948. Water conservation in desert rodents. Journal of Cellular Physiology 32(3): 331-360.

DOI: 10.1002/jcp.1030320306.

Schmidt-Nielsen K, Schmidt-Nielsen B. 1952. Water metabolism of desert mammals 1.

Physiological Reviews 32(2): 135-166. PMID: 1492697.

Schwimmer H, Haim A. 2009. Physiological adaptations of small mammals to desert ecosystems. Integrative Zoology 4: 357-366. Doi: 10.1111/j.1749-4877.2009.00176.x

Shanas U, Haim A. 2004. Diet salinity and vasopressin as reproductive modulators in the desert-dwelling golden spiny mouse (Acomys russatus). Physiology and Behavior 81(4): 645-

650. Doi: 10.1016/j.physbeh.2004.03.002

Stevens RD, Tello JS. 2009. Micro- and macrohabitat associations in Mojave desert rodent communities. Journal of Mammalogy 90(2):388-403. DOI: 10.1644/08-MAMM-A-141.1

Storz JF, Wheat CW. 2010. Integrating evolutionary and functional approaches to infer adaptation at specific loci. Evolution 64(9): 2489-2509. Doi: 10.1111/j.1558-5646.2010.01044.x

Suarez-Vega A, Gutierrez-Gil B, Klopp C, Tosser-Klopp G, Arranz J-J. 2016. Comprehensive RNA-seq profiling to evaluate lactating sheep mammary gland transcriptome. Scientific Data 3(Article number:160051). DOI: 10.1038/sdata.2016.51

Swanson WJ, Vacquier VD. 2002. The rapid evolution of reproductive proteins. Nature Reviews

Genetics 3(2):137–144 DOI 10.1038/nrg733.

Turner LM, Chuong EB, Hoekstra HE. 2008. Comparative analysis of testis protein evolution in rodents. Genetics 179(4):2075–2089 DOI 10.1534/genetics.107.085902.

Urity VB, Issaian T, Braun EJ, Dantzler WH, Pannabecker TL. 2012. Architecture of kangaroo rat inner medulla: segmentation of descending thin limb of Henle’s loop. American

Journal of Physiology-Regulatory Integrative and Comparative Physiology 302(6): R720-R726.

DOI:10.1152/ajpregu.00549.2011.

Veal R, Caire W. 1979. Peromyscus eremicus. Mammalian Species 118:1-6. Available at http://www.science.smith.edu/msi/pdf/i0076-3519-118-01-0001.pdf.

Vicens A, Luke L, Roldan ERS. 2014. Proteins involved in motility and sperm-egg interaction evolve more rapidly in mouse spermatozoa. PLoS ONE 9(3):e91302. DOI 10.1371/journal.pone.0091302.

Vimtrup BJ, Schmidt-Nielsen B. 1952. The histology of the kidney of kangaroo rats. The

Anatomical Record 114(4): 515-528. DOI:10.1002/ar.1091140402.

Vorhies CT in Water Requirements of Desert Animals in the Southwest. (College of

Agriculture, University of Arizona, Tucson, Agricultural Experiment Station.) Technical Bulletin

No. 107: 487-525. http://hdl.handle.net/10150/190625

Walsberg GE. 2000. Small mammals in hot deserts: some generalizations revisited. BioScience

50(2): 109-120. doi: 10.1641/0006-3568(2000)050[0109:SMIHDS]2.3.C

Wube T, Fares F, Haim A. 2008. A differential response in the reproductive system and energy balance of spiny mice Acomys populations to vasopressin treatment. Comparative Biochemistry and Physiology. Part A, Molecular and Integrative Physiology 151: 499-504.

Doi: 10.1016/j.cbpa.2008.06.027

Yahr P, Kessler S. 1975. Suppression of reproduction in water-deprived Mongolian gerbils

(Meriones unguiculatus). Biology of Reproduction 12(2):249-254. Doi:10.1095/biolreprod12.2.2


CHAPTER 1

Characterization of a male reproductive transcriptome for Peromyscus eremicus

(Cactus mouse)

Lauren L. Kordonowy* and Matthew D. MacManes*

Department of Molecular, Cellular, and Biomedical Sciences, University of New

Hampshire, Durham, NH, United States

* These authors contributed equally to this work.

Submitted 20 April 2016

Accepted 27 September 2016

Published 27 October 2016

Corresponding author: Lauren L. Kordonowy, [email protected]

Academic editor: Ana Pavasovic

Additional Information and Declarations can be found on page 17

DOI 10.7717/peerj.2617

Copyright 2016 Kordonowy and MacManes

Distributed under Creative Commons CC-BY 4.0

OPEN ACCESS

How to cite this article: Kordonowy and MacManes (2016), Characterization of a male reproductive transcriptome for Peromyscus eremicus (Cactus mouse). PeerJ 4:e2617; DOI 10.7717/peerj.2617

ABSTRACT

Rodents of the genus Peromyscus have become increasingly utilized models for investigations into adaptive biology. This genus is particularly powerful for research linking genetics with adaptive physiology or behaviors, and recent research has capitalized on the unique opportunities afforded by the ecological diversity of these rodents. Well characterized genomic and transcriptomic data are intrinsic to explorations of the genetic architecture responsible for ecological adaptations. Therefore, this study characterizes the transcriptome of three male reproductive tissues (testes, epididymis and vas deferens) of Peromyscus eremicus (cactus mouse), a desert specialist. The transcriptome assembly process was optimized in order to produce a high quality and substantially complete annotated transcriptome. This composite transcriptome was generated to characterize the expressed transcripts in the male reproductive tract of P. eremicus, which will serve as a crucial resource for future research investigating our hypothesis that the male cactus mouse possesses an adaptive reproductive phenotype to mitigate water loss from ejaculate. This study reports genes under positive selection in the male cactus mouse reproductive transcriptome relative to transcriptomes from Peromyscus maniculatus (deer mouse) and Mus musculus. Thus, this study expands upon existing genetic research in this species, and we provide a high quality transcriptome to enable further explorations of our proposed hypothesis for male cactus mouse reproductive adaptations to minimize seminal fluid loss.

Subjects: Bioinformatics, Evolutionary Studies, Genetics, Genomics, Zoology

Keywords: Peromyscus eremicus, Transcriptome, Genomics, Bioinformatics, Adaptation,

Desert physiology, Cactus mouse, Reproduction


INTRODUCTION

The rapid infusion of novel bioinformatics approaches in the fields of genomics and transcriptomics has enabled the coalescence of the fields of genetics, physiology and ecology into innovative studies for adaptation in evolutionary biology. Indeed, studies on the biology of adaptation had previously been dominated by research painstakingly documenting morphological shifts associated with ecological gradients (e.g., in Peromyscus: Carleton,

1989; MacMillen & Garland, 1989). However, the discipline of bioinformatics has breathed new life into the field of adaptation biology. Specifically, while the morphological basis as well as the physiological mechanisms of adaptation have been explored for a variety of species in extreme environments, the genetic underpinnings of these adaptations have only recently become a larger area of research (Hoekstra et al., 2006; Cheviron & Brumfield, 2011;

Lorenzo et al., 2014; Guillen et al., 2015). High-throughput sequencing technology of model and non-model organisms (Ellegren, 2014) enables evolutionary biologists to conduct genome and transcriptome wide analyses and link patterns of gene selection with functional adaptations.

Studies on the genetic basis of adaptation have included a wide variety of taxa. For example, butterflies in the Heliconius genus have been a particularly effective study system for determining the genetic basis of pigmentation patterns, and there is evidence of interspecific introgression for genes enabling adaptive mimicry patterns (Hines et al., 2012; Heliconius

Genome Consortium, 2012). In addition, a population genomic study in three-spine sticklebacks has elucidated many loci responsible for divergent adaptations from marine to freshwater environments (Jones et al., 2012). Furthermore, researchers have developed a list of candidate genes that have evolved in multiple populations of freshwater adapted three-spine


sticklebacks (Hohenlohe et al., 2010). Another active area of adaptation genetics research focuses on species residing in extreme environments. High altitude adaptations to hemoglobin variants have been identified in multiple organisms, including humans (Lorenzo et al.,

2014), several species of Andean ducks (McCracken et al., 2009a; McCracken et al., 2009b), and deer mice, Peromyscus maniculatus (Storz et al., 2010; Natarajan et al., 2015). The genetic pathways responsible for physiological adaptations to desert habitats remain enigmatic; however, considerable progress has been made developing candidate gene sets for future analyses (e.g., Guillen et al., 2015; MacManes & Eisen, 2014; Marra et al., 2012; Marra,

Romero & DeWoody, 2014). Functional studies will stem from this foundational research aimed at identifying the genomic underpinnings of adaptations to extreme environments; yet, it is inherently challenging and critically important to demonstrate that specific loci are functionally responsible for adaptations (Storz & Wheat, 2010).

Rodents of the genus Peromyscus have been at the forefront of research elucidating the genetic basis for adaptation (reviewed in Bedford & Hoekstra, 2015). This diverse genus has served as an ideal platform for adaptation research spanning from the genetic basis of behavioral adaptations—such as complex burrowing in Peromyscus polionotus (Weber & Hoekstra, 2009;

Weber, Peterson & Hoekstra, 2013)—to the loci responsible for adaptive morphology—such as coat coloration in Peromyscus polionotus leucocephalus (Hoekstra et al., 2006), and including kidney desert adaptations in Peromyscus eremicus (MacManes & Eisen, 2014). We are currently using Peromyscus eremicus as a model species for investigating the genetic bases of desert adaptations. This paper describes a crucial component of this research aim.

Initial steps toward understanding the genetics of adaptation must include the genomic and transcriptomic characterization of target study species (MacManes & Eisen,


2014). To this end, we assembled and characterized a composite transcriptome for three male reproductive tissues in the desert specialist, P. eremicus. This species is an exceptional example of desert adaptation, as individuals may live entirely without water access (Veal & Caire, 2001). MacManes & Eisen (2014) assembled transcriptomes from kidney, hypothalamus, liver, and testes of this species, and they identified several candidate genes potentially underlying adaptive renal physiology. However, to our knowledge, potential physiological reproductive adaptations to water limitation have not been studied in this species or in other desert rodents. We hypothesize that male P. eremicus possess reproductive adaptations to mitigate water loss.

The adaptive kidney physiology in kangaroo rat species (genus: Dipodomys), which produce highly concentrated urine via a disproportionately long loop of Henle (Schmidt-Nielsen et al., 1948; Schmidt-Nielsen & Schmidt-Nielsen, 1952; Vimtrup & Schmidt-Nielsen, 1952; Urity et al., 2012), has been well established. We propose that there may also be reproductive adaptations to mitigate water loss in desert rodents. Recent findings pertaining to the genetic signatures of adaptive kidney function in desert rodents (Marra et al., 2012; Marra, Romero & DeWoody, 2014; MacManes & Eisen, 2014) suggest that such hypothesized reproductive adaptations, should they exist, may be detectable at the genetic level using similar comparative transcriptomic methods. We present this hypothesis in light of the mounting body of research documenting high rates of reproductive protein evolution (reviewed in Swanson & Vacquier, 2002; Ramm et al., 2014), which we propose indicates that reproductive tissues may possess a significant capacity for evolving in response to strong selective pressures.

Dewsbury (1982) made the assertion that producing ejaculates incurs a cost for male mammals, which produce relatively high sperm counts, and his analysis utilizes P.


maniculatus as a model. Ejaculation has also been demonstrated to be costly in Mus musculus domesticus; Ramm & Stockley (2007) found that mice are able to manipulate the quantity of sperm released in response to varying competition for females. Peromyscus and other rodents exhibit rapid evolution of testis-expressed proteins (Turner, Chuong & Hoekstra, 2008). In addition, the epididymal transcriptome of M. musculus shows evidence for positive selection among epididymis-specialized genes that are secreted, which the authors attribute to their putative evolutionary importance (Dean, Good & Nachman, 2008). Moreover, a recent analysis of sperm genes from multiple mouse strains found that sperm proteins involved in both motility and sperm-egg interactions show signatures of positive selection, potentially facilitating evolutionary mechanisms for sperm competition and sexual conflict (Vicens, Luke & Roldan, 2014). We therefore propose that the inherently costly nature of producing ejaculate, together with the rapid evolution of genes in murine testes, epididymis, and sperm, makes it plausible that desert rodents have evolved ejaculate adaptations to limited water availability.

In order to develop genomic resources that will allow us to begin to test our hypothesis related to reproductive water conservation, we developed a transcriptome comprising three male specific reproductive tissues in P. eremicus. Specifically, we assembled a composite reproductive transcriptome for the epididymis, testes, and vas deferens. These tissues were chosen because they perform numerous physiological roles in spermatogenesis and in the generation and transportation of seminal fluids. We posit that one or more of these reproductive tissues possess phenotypic characteristics to mitigate water loss from seminal fluids. We also propose that any molecular mechanisms responsible for such an adaptation should be elucidated through future differential gene expression and comparative transcriptomic studies. However, the initial assembly and characterization of a


comprehensive male reproductive tissue transcriptome in the current manuscript will be essential for research studies exploring this hypothesized male P. eremicus reproductive desert adaptation. For example, this transcriptome will be instrumental in further experimental studies investigating differential gene expression in these tissue types in response to variable hydration levels or in large-scale comparative transcriptomic studies spanning numerous desert and non-desert rodents.

Here, we characterize a male cactus mouse tissue-specific transcriptome by presenting a preliminary exploration of transcript abundance and putative homology. We also perform comparative transcriptomic analyses to identify evidence of positive selection in genes potentially related to the hypothesized reproductive desert adaptations. It is beyond the scope of this manuscript to evaluate the functionality of these candidate genes in the context of desert adaptation, much less male specific reproductive adaptations. However, the elucidation of candidate genes in the context of male reproductive tissues will be instrumental for future studies aimed at determining which genes are functionally responsible for the proposed reproductive adaptations to water limitation in the male cactus mouse.

METHODS

Tissue samples, RNA extraction, cDNA library preparation and sequencing

The Peromyscus eremicus male used for this study was captive born and descended from a population from the Peromyscus Genetic Stock Center (Columbia, South Carolina). This individual was housed in a facility at the University of New Hampshire designed to mimic desert conditions in the Southwestern United States. Specifically, the temperature increases gradually during the light hours until it peaks at 90 °F in the afternoon, and the temperature decreases during hours of darkness to 75 °F. Humidity levels are 10% during the daylight hours and 25% during darkness. The photoperiodic cycle in this desert chamber provides long days of photostimulation, with 16 h of light and 8 h of darkness. The colony includes males and females, which are housed within a single room, providing olfactory cues that stimulate reproductive maturity. The photoperiod and shared housing in this colony result in the attainment of reproductive maturity in both sexes. Males are deemed reproductively mature when they are fully scrotal. The males do not undergo seasonal testicular atrophy, as evidenced by their consistent scrotal condition and their year-round successful reproduction.

A single reproductively mature P. eremicus male was sacrificed via isoflurane overdose and decapitation. This was done in accordance with University of New Hampshire Animal Care and Use Committee guidelines (protocol number 130902) and guidelines established by the American Society of Mammalogists (Sikes et al., 2011). Testes, epididymis, and vas deferens were immediately harvested (within ten minutes of euthanasia), placed in RNAlater (Ambion Life Technologies) and stored at −80 °C until RNA extraction. We used a standard TRIzol, chloroform protocol for total RNA extraction (Ambion Life Technologies). We evaluated the quantity and quality of the RNA product with a Qubit 2.0 Fluorometer (Invitrogen) and a Tapestation 2200 (Agilent Technologies, Palo Alto, USA).

We used a TURBO DNase kit (Ambion) to eliminate DNA from the samples prior to the library preparation. Libraries were made with a TruSeq Stranded mRNA Sample Prep LS Kit (Illumina). Each of the three samples was labeled with a unique hexamer adapter for identification after multiplex single lane sequencing. Following library completion, we confirmed the quality and quantity of the DNA product with the Qubit and Tapestation. We submitted the multiplexed sample of the libraries for running on a single lane at the New York Genome Center Sequencing Facility (New York, NY). Paired end sequencing reads of length 125 bp were generated on an Illumina 2500 platform. Reads were parsed by tissue type according to their unique hexamer IDs in preparation for transcriptome assembly.

Reproductive transcriptome assembly

The composite reproductive transcriptome was assembled with reads from the testes, epididymis and vas deferens using the previously developed Oyster River Protocol, a de novo transcriptome assembly pipeline (MacManes, 2016). Briefly, the reads were error corrected with Rcorrector v1.0.1 (Song & Florea, 2015). We used the de novo transcriptome assembler Trinity v2.1.1 (Haas et al., 2013; Grabherr et al., 2011). Within the Trinity platform, we ran Trimmomatic (Bolger, Lohse & Usadel, 2014) to remove the adapters, and we also trimmed at PHRED < 2, as recommended by MacManes (2014).

Next we evaluated transcriptome assembly quality and completeness using BUSCO v1.1b1 and Transrate v1.0.1. BUSCO (Simão et al., 2015) reports the number of complete, fragmented, and missing orthologs in assembled genomes, transcriptomes, or gene sets relative to compiled ortholog databases. We ran BUSCO on the assembly using the ortholog database for vertebrates, which includes 3,023 genes. The assembly was also analyzed by Transrate using the Mus musculus peptide database from Ensembl (downloaded 2/24/16) as a reference. The Transrate score provided a metric of de novo transcriptome assembly quality, and the software also generated an improved assembly composed of highly supported contigs (Smith-Unna et al., 2016). Finally, we re-ran BUSCO on the improved assembly generated by Transrate to determine if this assembly had similar metric scores for completeness as the original assembly produced by Trinity. As alternatives to the original Trinity assembly and the optimized Transrate assembly, we proceeded with our optimization determinations by filtering out low abundance contigs from the original Trinity assembly.

First we calculated the relative abundance of the transcripts with Kallisto v0.42.4 and Salmon v0.5.1. Kallisto utilizes a pseudo-alignment algorithm to map RNA-seq data reads to targets for transcript abundance quantification (Bray et al., 2015). In contrast, Salmon employs a lightweight quasi-alignment method and a high speed streaming algorithm to quantify transcripts (Patro, Duggal & Kingsford, 2015). After determining transcript abundance in both Kallisto and Salmon, we removed contigs with transcripts per million (TPM) estimates of less than 0.5 and of less than 1.0 in two separate optimization trials (as per MacManes, 2016). Finally, we evaluated these two filtered assemblies with Transrate and BUSCO to determine the relative quality and completeness of both assemblies. We chose the optimal assembly version by comparing Transrate and BUSCO metrics and also through careful consideration of total contig numbers across all filtering and optimizing versions. The chosen assembly was the Transrate optimized TPM > 0.5 filtered assembly, and this assembly was used for all subsequent analyses.
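The abundance filter described above can be illustrated with a minimal Python sketch. It is not the script used in this study: the function name, the in-memory dictionaries, and the example values are hypothetical, and the rule that a contig is retained when either quantifier reaches the threshold is an assumption, since the text does not state how the Kallisto and Salmon estimates were combined.

```python
def filter_by_tpm(kallisto_tpm, salmon_tpm, threshold=0.5):
    """Return the sorted contig IDs whose TPM reaches the threshold.

    Assumption: a contig is kept when either quantifier reports a TPM
    at or above the threshold; contigs missing from one table count as 0.
    """
    contigs = set(kallisto_tpm) | set(salmon_tpm)
    kept = [
        c for c in contigs
        if max(kallisto_tpm.get(c, 0.0), salmon_tpm.get(c, 0.0)) >= threshold
    ]
    return sorted(kept)


# Hypothetical TPM estimates for four contigs.
kallisto = {"c1": 12.0, "c2": 0.4, "c3": 0.7, "c4": 0.0}
salmon = {"c1": 11.5, "c2": 0.6, "c3": 0.2, "c4": 0.1}

# The two optimization trials: TPM cutoffs of 0.5 and 1.0.
kept_05 = filter_by_tpm(kallisto, salmon, threshold=0.5)
kept_10 = filter_by_tpm(kallisto, salmon, threshold=1.0)
```

Raising the cutoff from 0.5 to 1.0 can only shrink the retained set, which is why the two trials trade contig number against completeness.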

Annotation, transcript abundance, and database searches

We used dammit v0.2.7.1 (Scott, 2016) to annotate the optimized transcriptome assembly (as per MacManes, 2016). Within the dammit platform, we predicted protein coding regions for each tissue with TransDecoder v2.0.1 (Haas et al., 2013), which was used to find open reading frames (ORFs). Furthermore, dammit utilizes multiple database searches for annotating transcriptomes. These database searches include searches in Rfam v12.0 to find non-coding RNAs (Nawrocki et al., 2015), searches for protein domains in Pfam-A v29.0 (Sonnhammer, Eddy & Durbin, 1997; Finn et al., 2016), the execution of a LAST search for known proteins in the UniRef90 database (Suzek et al., 2007; Suzek et al., 2015), ortholog matches in the BUSCO database, and orthology searches in OrthoDB (Kriventseva et al., 2015).

We used the assembly annotated by dammit to re-run Kallisto to determine transcript abundance within each of the three tissue types. We used TPM counts of expression for all three tissues to generate counts of transcripts specific to and shared across tissue types. We also downloaded the ncRNA database for Mus musculus from Ensembl (v 2/25/16), and we did a BLASTn (Altschul et al., 1990; Madden, 2002) search for these ncRNAs in our assembly.

This database has 16,274 sequences, and we determined the number of transcript ID matches and the number of unique ncRNA sequence matches for our assembly. We also counted how many transcript matches were present in each of the tissues, and we referenced the corresponding Kallisto derived TPM values to determine the number of unique and ubiquitous transcript matches for each tissue.
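The distinction between the number of transcript ID matches and the number of unique database sequence matches can be sketched as follows. This is an illustration only: it assumes tab-delimited BLAST output with the query ID in the first column and the subject ID in the second, which the text does not specify, and the function name and example hits are hypothetical.

```python
def count_matches(blast_lines):
    """Count distinct query transcripts and distinct subject sequences.

    Assumption: each line is tab-delimited with the query (transcript) ID
    in column 1 and the subject (database sequence) ID in column 2.
    """
    transcripts = set()
    subjects = set()
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        transcripts.add(fields[0])
        subjects.add(fields[1])
    return len(transcripts), len(subjects)


# Hypothetical hits: contig_1 matches two ncRNAs, and ncRNA_A is hit twice.
example_hits = [
    "contig_1\tncRNA_A\t98.5",
    "contig_1\tncRNA_B\t91.0",
    "contig_2\tncRNA_A\t95.2",
    "contig_3\tncRNA_C\t99.1",
]
n_transcripts, n_subjects = count_matches(example_hits)
```

Because several transcripts can match the same database sequence, the transcript count is often much larger than the count of unique database sequences.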

We searched the annotated assembly for transporter protein matches within the Transporter Classification Database (tcdb.org). This database has 13,846 sequences representing proteins in transmembrane molecular transport systems (Saier et al., 2014). We executed a BLASTx (Altschul et al., 1990; Madden, 2002) search to find the number of transcript ID matches and the number of unique transporter protein matches within the assembly. Next we determined how many transcript ID matches were found in each of the three tissues. As described above, we also cross-referenced these matches with the Kallisto derived TPM values to find the number of ubiquitous and unique transcript matches by tissue type.
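The tallying of ubiquitous and tissue-unique transcripts from per-tissue TPM values can be sketched with set operations. This is a minimal illustration under stated assumptions: a transcript is treated as present in a tissue when its TPM exceeds a cutoff of zero, which the text does not specify, and the function name, data layout, and example values are hypothetical.

```python
def partition_by_tissue(tpm_by_tissue, min_tpm=0.0):
    """Split transcripts into those shared by all tissues and those unique to one.

    tpm_by_tissue maps tissue name -> {transcript ID: TPM}.
    Assumption: a transcript is 'present' in a tissue when TPM > min_tpm.
    """
    expressed = {
        tissue: {t for t, tpm in tpms.items() if tpm > min_tpm}
        for tissue, tpms in tpm_by_tissue.items()
    }
    shared = set.intersection(*expressed.values())
    unique = {}
    for tissue, present in expressed.items():
        others = set().union(
            *(s for name, s in expressed.items() if name != tissue)
        )
        unique[tissue] = present - others
    return shared, unique


# Hypothetical TPM values for four transcripts in the three tissues.
tpm = {
    "testes": {"t1": 5.0, "t2": 1.2, "t3": 0.0, "t4": 2.0},
    "epididymis": {"t1": 3.0, "t2": 0.0, "t3": 4.1, "t4": 0.0},
    "vas_deferens": {"t1": 2.2, "t2": 0.0, "t3": 0.9, "t4": 0.0},
}
shared, unique = partition_by_tissue(tpm)
```

The sizes of `shared` and of each entry in `unique` correspond to the center and the outer regions of a three-set Venn diagram; transcripts present in exactly two tissues fall in neither.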

Comparative analysis for genes under positive selection

We performed a three-species comparative analysis to identify genes under positive selection in the male reproductive transcriptome for the P. eremicus lineage relative to M. musculus and P. maniculatus. The M. musculus nucleotide and protein sequences were downloaded (version GRCm38) from Ensembl (ensembl.org). The P. maniculatus nucleotide and protein files were downloaded (version GCF_000500345.1_Pman_1.9) from NCBI

(ncbi.nlm.nih.gov). The comparative analysis was conducted as described below. The corresponding nucleotide and protein files for all three species were modified so that the header names for the sequences in each species file pair were concordant. Next, we found orthologous groups of protein sequences using OrthoFinder v0.6.1 (Emms & Kelly, 2015).

The two OrthoFinder command scripts aligned sequences with MAFFT v7.123b (Katoh et al.,

2002), built trees with FastTreeMP v1 (Price, Dehal & Arkin, 2010), and generated orthogroups based on sequence similarity. Our next script selected the single copy orthologs

(SCOs) from among these orthogroups for analysis. Then we selected the cds file transcripts for all three species corresponding to the previously identified single copy orthologs.
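The selection of single copy orthologs can be sketched as a filter over orthogroups. This is an illustrative sketch, not the script used here: the in-memory representation of orthogroups, the species labels, and the sequence IDs are hypothetical and do not reproduce OrthoFinder's output format.

```python
def single_copy_orthogroups(orthogroups, species):
    """Keep orthogroups containing exactly one sequence from every species.

    orthogroups maps group ID -> {species: [sequence IDs]} (hypothetical layout).
    """
    selected = {}
    for group_id, members in orthogroups.items():
        counts = [len(members.get(sp, [])) for sp in species]
        if all(n == 1 for n in counts):
            selected[group_id] = {sp: members[sp][0] for sp in species}
    return selected


species = ["P_eremicus", "P_maniculatus", "M_musculus"]

# Hypothetical orthogroups: OG2 has a duplicate, OG3 lacks one species.
orthogroups = {
    "OG1": {"P_eremicus": ["pe_1"], "P_maniculatus": ["pm_1"], "M_musculus": ["mm_1"]},
    "OG2": {"P_eremicus": ["pe_2", "pe_3"], "P_maniculatus": ["pm_2"], "M_musculus": ["mm_2"]},
    "OG3": {"P_eremicus": ["pe_4"], "M_musculus": ["mm_3"]},
}
scos = single_copy_orthogroups(orthogroups, species)
```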

Finally, we ran a script which aligned the sequences with PRANK v150803 (Löytynoja, 2014) and performed the analyses for positive selection for the single copy orthologous sequences with codeml in PAML v4.8 (Yang, 1997; Yang, 2007). Specifically, we performed the M2a branch site test (Zhang, Nielsen & Yang, 2005) for positive selection after stipulating P. eremicus as the foreground lineage. Genes were deemed under positive selection if the omega values (ω) exceeded 1 within the M2a model and if they yielded statistically significant results for the likelihood ratio test (LRT) comparison between the null and alternative models. To perform the LRT, we determined if 2ΔlnL (two times the difference between the log likelihood values for the alternative model, M2a, and the null model, M1a) exceeded a chi-square value corresponding to a significant p-value (p < 0.05) (Yang, 1998) after applying the Benjamini-Hochberg correction for multiple comparisons (Benjamini & Hochberg, 1995). For each of the P. eremicus gene sequences demonstrated to be under positive selection according to the LRT, we selected the M. musculus cds sequence from the SCO group corresponding to the P. eremicus gene sequence. Then we performed a BLASTn search on the M. musculus sequence to find gene matches on NCBI to determine the gene identity for the P. eremicus sequence under positive selection.
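The likelihood ratio test and multiple-comparison correction described above can be sketched in Python using only the standard library. This is an illustration under stated assumptions rather than the analysis code: the degrees of freedom (2 by default) are an assumption because the text does not report them, and the gene names, log likelihoods, and omega values are hypothetical.

```python
import math


def lrt_p_value(lnl_null, lnl_alt, df=2):
    """P-value for the statistic 2 * (lnL_alt - lnL_null) under chi-square.

    Assumption: df is 1 or 2; closed-form chi-square tails are used for both.
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))  # chi-square tail, 1 df
    if df == 2:
        return math.exp(-stat / 2.0)  # chi-square tail, 2 df
    raise ValueError("this sketch only handles df = 1 or df = 2")


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical codeml results: gene -> (lnL null, lnL alternative, omega).
results = {
    "geneA": (-1000.0, -990.0, 3.2),
    "geneB": (-2000.0, -1999.5, 1.5),
    "geneC": (-1500.0, -1495.0, 0.8),
}
names = list(results)
p_raw = [lrt_p_value(results[g][0], results[g][1]) for g in names]
p_adj = benjamini_hochberg(p_raw)

# Both criteria from the text: adjusted p < 0.05 and omega > 1.
selected = [g for g, p in zip(names, p_adj) if p < 0.05 and results[g][2] > 1.0]
```

In this toy example geneC passes the corrected LRT but is excluded because its omega does not exceed 1, which shows why the two criteria are applied jointly.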

Of note, the code for performing all of the above analyses can be found at GitHub

(https://github.com/macmanes-lab/peer_reproductive/transcriptome). The data files are available on Dryad (doi: 10.5061/dryad.01c3t).

RESULTS AND DISCUSSION

Reproductive transcriptome assembly

There were 45–94 million paired reads produced for each of the three transcriptome datasets, yielding a total of 415,960,428 reads. The raw reads are available at the European

Nucleotide Archive under study accession number PRJEB13364.

We assembled a de novo composite reproductive transcriptome with reads from testes, epididymis and vas deferens. The evaluation of alternative optimized assemblies allowed us to generate a substantially complete transcriptome of high quality. The


alternative assemblies had raw Transrate scores ranging from 0.156–0.194 (Table 1).

However, the scores for the improved assemblies generated by Transrate, consisting of only highly supported contigs, ranged between 0.285 and 0.349, which is well above the threshold Transrate score of 0.22 for an acceptable assembly. The BUSCO results indicated that the assemblies were highly complete, with complete matches ranging from 73–90% of vertebrate orthologs (Table 2). These BUSCO benchmark values are consistent with the most complete reported assessments for transcriptomes from other vertebrate taxa

(busco.ezlab.org). Furthermore, our BUSCO values exceed that of the only available reported male reproductive tissue (from a coelacanth: Latimeria menadoensis testes), which was 71% complete (Simão et al., 2015). The assembly version with the highest quality in relation to the Transrate metrics was the Transrate optimized Trinity assembly.

Specifically, the optimized Transrate score was 0.3492, and the percent coverage of the reference assembly was also highest, with 45% of the mouse database represented. This assembly was highly competitive for completeness, as indicated by the fact that it contained

85% of vertebrate single copy orthologs. However, this assembly had an exorbitantly high number of contigs (657,952 contigs), which is nearly an order of magnitude more contigs than the next best performing assembly: the Transrate optimized TPM > 0.5 filtered assembly (78,424 contigs). In consideration of the dramatically more realistic contig number for the Transrate optimized TPM > 0.5 filtered assembly, and in light of its second best performance for Transrate score (0.3013), reasonable Transrate mouse reference assembly coverage (37%), and sufficiently high BUSCO completeness (73% orthologs found), we chose this assembly as our optimized transcriptome. Therefore, we proceeded with this optimized assembly version as our finalized transcriptome assembly for our analyses.


Annotation, transcript abundance and database searches

The reproductive transcriptome assembly annotations were produced by dammit, and they are available on Dryad in a gff3 file format. Furthermore, TransDecoder was used to predict coding regions in the assembly. TransDecoder predicted that 49.5% (38,342) of the transcripts (78,424 total) contained ORFs, of which 63.9% (24,808) had complete ORFs containing a start and stop codon. The predicted protein coding regions generated by

TransDecoder are reported in five file types, and they are available on Dryad. Furthermore, the Pfam results yielded 30.7% of transcripts (24,107) matching to the protein family database. In contrast, the LAST search found that 75.9% of transcripts (59,503) matched to the UniRef90 database. We have uploaded the homology search results generated by Pfam and UniRef90 matches onto Dryad. In addition, 1.04% (816) of transcripts matched to the

Rfam database for ncRNAs, and these results are posted in Dryad. Of note, 80.1% (62,835) of the transcripts were annotated using one or more of the above described methods (the dammit.gff3 file is posted in Dryad), and it is this final annotated assembly that was used for all subsequent analyses (this annotated transcriptome is available on Dryad).

The Kallisto generated TPM counts of expression (available on Dryad) were utilized to determine which transcripts were ubiquitous and which were specific to each of the three tissue types, which we have depicted in a Venn diagram format (Fig. 1). The assembly consisted of 78,424 different transcript IDs, of which 64,553 were shared across all three tissues. The numbers of unique transcripts were as follows: 3,563 in testes, 342 in epididymis, and 502 in vas deferens. The relatively large number of unique transcripts in the testes is consistent with previous findings which describe the testes as the tissue with the highest number of tissue-enriched genes in the human body (Djureinovic et al., 2014; Uhlén et al., 2015). However, because these expression data were generated from a single individual, we want to emphasize that these results have no statistical power. Rather, we view these numerical comparisons of unique and ubiquitous transcript counts as an exploratory evaluation of potential relationships between tissue types. Such comparisons across tissue types for a single individual of this species previously indicated that the kidney had relatively higher numbers of unique transcripts than testes (MacManes & Eisen, 2014). Future research with multiple individuals will be necessary to statistically evaluate the relative rates of transcript expression between these reproductive tissues, as well as their relationship with other non-reproductive tissue types.

In addition, we searched for Mus musculus ncRNA sequence matches within our assembly. There were 15,964 transcript matches, which correspond to 2,320 unique ncRNA matches, and they are posted on Dryad. The transcript matches by tissue type were found using the Kallisto TPM determinations, and they were as follows: testes, 15,260; epididymis, 15,552; and vas deferens, 15,558. A Venn diagram depicts unique and shared transcript matches by tissue type (Fig. 2). The majority of transcript matches were ubiquitous to all three tissues (14,724), and there were far fewer tissue specific matches. The testes had more unique transcript matches (185) than the epididymis (26) or the vas deferens (45). These findings are consistent with our results above regarding the relative numbers of total unique transcripts in the assembly by tissue type. However, these counts for relative transcript matches among tissue types were generated with transcripts from a single individual; therefore, the comparative results across tissue type were not statistically evaluated. It seems probable that the diversity of regulatory transcripts should be highest in tissues generating relatively diverse proteins, in this case, the testes, which did have the highest number of unique transcript matches. The role of ncRNA in reproductive tissues throughout multiple developmental stages has recently been reviewed in detail (Hale, Yang & Ross, 2014). In addition, ncRNAs have been found to be highly abundant in murine testes (Sun, Lin & Wu, 2013). Furthermore, sperm from humans and mice contain a significant number of ncRNAs (Krawetz et al., 2011; Kowano et al., 2012). However, we are unaware of any research investigating the involvement of ncRNA in desert adaptations; therefore, we cannot speculate on particular ncRNA matches within our dataset that may have potential desert adaptive roles.

Our search for transporter protein matches within the Transporter Classification Database yielded 7,521 different transcript matches, corresponding to 1,373 unique transporter protein matches, and they are posted on Dryad. The number of transcript matches was highly similar between the tissue types (testes: 7,025; epididymis: 7,115; vas deferens: 7,071). We generated a Venn Diagram to display the numbers of shared and unique transcript matches to the transporter protein sequences (Fig. 3). Most transcript matches were present in all three tissues (6,472), and there were relatively few unique matches in the three tissue types. However, the testes had the highest number of unique transcript matches (215) relative to the epididymis (19) and the vas deferens (37). These comparative results across tissue types represent exploratory findings for a single mouse, and the count data have not been statistically tested, but these preliminary results should be investigated in future studies. Furthermore, our BLASTx search of this transporter protein database yielded transcript matches for multiple solute carrier proteins. We are particularly interested in solute carrier proteins because previous research has found candidate genes in this protein family for desert adaptations in kidneys of the kangaroo rat (Marra et al., 2012; Marra, Romero & DeWoody, 2014) and the cactus mouse (MacManes & Eisen, 2014). In addition, we had multiple matches to aquaporins, which are water channels allowing transport across cellular membranes. One transcript matched specifically to Aquaporin 3, a sperm water channel found in mice and humans, which is essential to maintaining sperm cellular integrity in response to the hypotonic environment within the female reproductive tract (Chen et al., 2011).
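The transporter match counts reported above (per-tissue totals, tissue-specific counts, and the count shared by all three tissues) are enough to recover the three pairwise overlaps of the Venn diagram by simple arithmetic. The sketch below is an illustrative consistency check, not part of the original analysis; the function name is hypothetical, and it assumes each tissue total equals its unique count plus its two pairwise overlaps plus the three-way overlap.

```python
def pairwise_overlaps(totals, unique, shared_all):
    """Solve for the two-tissue-only regions of a three-set Venn diagram.

    Assumes, for each tissue: total = unique + two pairwise overlaps + shared_all.
    """
    tissues = list(totals)
    if len(tissues) != 3:
        raise ValueError("exactly three tissues are required")
    # Transcripts in each tissue that are shared with exactly one other tissue.
    rest = {t: totals[t] - unique[t] - shared_all for t in tissues}
    doubled = sum(rest.values())  # every pairwise overlap is counted twice
    if doubled % 2:
        raise ValueError("counts are inconsistent with a three-set diagram")
    half = doubled // 2
    a, b, c = tissues
    return {
        frozenset((a, b)): half - rest[c],
        frozenset((a, c)): half - rest[b],
        frozenset((b, c)): half - rest[a],
    }

# Counts taken from the transporter paragraph above.
totals = {"testes": 7025, "epididymis": 7115, "vas deferens": 7071}
unique = {"testes": 215, "epididymis": 19, "vas deferens": 37}
overlaps = pairwise_overlaps(totals, unique, shared_all=6472)
for pair, n in overlaps.items():
    print(" & ".join(sorted(pair)), n)
```

With these inputs the solved overlaps are 200 (testes and epididymis), 138 (testes and vas deferens), and 424 (epididymis and vas deferens), which are the same three values displayed in Fig. 3.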

Comparative analysis for genes under positive selection

To find evidence for genes undergoing positive selection in the male reproductive transcriptome of P. eremicus, we compared this species with two generalist rodents. We chose M. musculus as the non-desert-adapted outgroup because this species possesses transcriptomic resources that are exceptional in their annotation and completeness. The widely distributed (Carleton, 1989) habitat generalist deer mouse, P. maniculatus, was chosen because it has the most complete transcriptomic data available within the genus Peromyscus.

There were 3,731 panorthologous groups (single copy orthologs; SCOs) in our three-species comparison. The branch test was successfully implemented for SCOs when all three sequences aligned adequately with PRANK and when codeml produced both M1a and M2a output files for the LRT comparison (n = 2,820 in total). The M2a test indicated that 42 genes were evolving under a model of positive selection in the Cactus mouse (Table 3).
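For readers unfamiliar with the LRT comparison mentioned above, the following sketch shows how a likelihood ratio test between a null model (M1a) and an alternative model (M2a) is commonly evaluated, followed by a multiple-testing adjustment. It is a hedged illustration rather than the authors' script: it assumes the conventional two degrees of freedom for the M1a versus M2a comparison, it assumes a Benjamini-Hochberg adjustment (the reference list cites that method, but this section does not state which correction was applied), and the log-likelihood values are hypothetical.

```python
import math

def lrt_pvalue(lnl_null, lnl_alt):
    """Likelihood ratio test assuming 2 degrees of freedom.

    The chi-square survival function with df = 2 is exactly exp(-x / 2),
    so no statistics library is needed.
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return stat, math.exp(-stat / 2.0)

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical log-likelihoods for one orthogroup (not values from the study).
stat, p = lrt_pvalue(lnl_null=-1234.5, lnl_alt=-1228.0)
print(f"LRT statistic = {stat:.2f}, p = {p:.4g}")

# Hypothetical raw p-values for four orthogroups.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
print([round(q, 4) for q in adjusted])
```

In practice the two log-likelihoods would be read from the codeml output of each model for every orthogroup, and only orthogroups whose adjusted p-value falls below the chosen threshold would be reported.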

Therefore, we investigated whether previous research on either rodent reproductive tissues or on desert specialized rodents documented evidence of positive selection for any of these genes or gene families. Only one of these 42 genes matched an epididymis-specialized secreted gene undergoing positive selection in another rodent (C57/BL6 mice). Namely, Qsox2 is a match for Qscn6l1, a member of the sulfhydryl oxidase/quiescin-6 family, which is purportedly involved in neuroblastoma apoptosis (Dean, Good & Nachman, 2008). However, we cannot speculate regarding the function of this gene in male rodent reproductive tissue, or why it appears to be evolving under a model of positive selection in these two studies. Another of our 42 positively selected genes, Lrrc46, may have some similarity to Lrrc50, a gene under positive selection in testes of Peromyscus (Turner, Chuong & Hoekstra, 2008). Leucine-rich repeat containing (Lrrc) genes have diverse biological roles; therefore, we also will not speculate on any correspondence between these two genes. As expected given our experimental design, there was no concordance between our 42 genes and those found to be under positive selection in a recent study on mouse spermatozoa proteins (Vicens, Luke & Roldan, 2014).

Our search for gene matches from the current study with other desert rodent research revealed notable similarities. Two solute carrier proteins, Slc15a3 and Slc47a1, were found to be under positive selection in our analysis. This finding bears particular significance because another protein in this family, Slc2a9, shows signatures of positive selection in desert rodent kidney transcriptomes in Dipodomys spectabilis and Chaetodipus baileyi (Marra, Romero & DeWoody, 2014) and P. eremicus (MacManes & Eisen, 2014). Solute carriers are a large family of cell membrane proteins that are responsible for transporting solutes (reviewed in Hediger et al., 2004; Hediger et al., 2013; César-Razquin et al., 2015). Furthermore, Marra, Romero & DeWoody (2014) hypothesize that solute carriers are critical for osmoregulation in desert rodents, and they assert that these genes may be under evolutionary pressure in such rodents.


In response to the potential relevance of the two solute carriers under positive selection in our study to desert rodent osmoregulation, we generated STRING (Snel et al., 2000; Szklarczyk et al., 2015) diagrams for their protein-protein interactions (Fig. 4). These diagrams demonstrate multiple connections to other solute carriers for both proteins, thereby suggesting their potential functional roles.

This analysis of genes undergoing positive selection in P. eremicus relative to P. maniculatus and M. musculus provides candidate genes for desert specialization in the cactus mouse which can be the target of future studies focused on ascertaining which genes may be functionally responsible for our hypothesized male reproductive desert adaptation.

However, desert specialization is not the sole difference between P. eremicus and P. maniculatus, much less between P. eremicus and M. musculus. P. maniculatus is highly promiscuous, while P. eremicus is relatively socially monogamous (Wolff, 1989). Indeed, P. maniculatus has been the subject of considerable sperm competition research (Dewsbury, 1988; Fisher & Hoekstra, 2010). Differences in reproductive systems, and even potentially in sperm aggregation of P. maniculatus (Fisher & Hoekstra, 2010), may manifest themselves as evidence of selection patterns between these two Peromyscus species.

Therefore, we are not proposing that the 42 genes we found to be under positive selection in the male Cactus mouse are functionally responsible for adaptive desert physiology. Rather, we are proposing that they are interesting candidate genes for future studies investigating the genetic underpinnings of physiological desert adaptations, including our hypothesized male reproductive adaptation, on a functional level. Several of these genes, specifically those in the solute carrier family, seem particularly promising for such work because they are undergoing rapid evolution in multiple desert rodent species.


CONCLUSIONS

Although researchers have determined that renal adaptations are responsible for mitigating water loss in kangaroo rats via the genitourinary tract (Schmidt-Nielsen et al., 1948; Schmidt-Nielsen & Schmidt-Nielsen, 1952; Vimtrup & Schmidt-Nielsen, 1952; Urity et al., 2012), we present the novel hypothesis that there may also be male reproductive adaptations to arid environments that allow desert specialists like the cactus mouse to conserve water during reproduction. Previous efforts to elucidate the genomic basis of desert adaptations have described candidate genes for adaptive renal physiology in some desert specialized rodents (Marra et al., 2012; Marra, Romero & DeWoody, 2014), including the cactus mouse (MacManes & Eisen, 2014). In light of these findings, we propose that if the male cactus mouse possesses an adaptive reproductive phenotype to mitigate water loss via seminal fluids in response to limited water availability, such an adaptation will be detectable through transcriptomic analyses. The current study generates and characterizes a transcriptome for male reproductive tissues from the cactus mouse as an initial step towards future efforts to explore this hypothesized reproductive adaptation. This study describes a composite transcriptome from three male reproductive tissues in the desert specialist Peromyscus eremicus. Our analyses include quality and completeness assessments of this reproductive assembly, which we generated using reads from testes, epididymis, and vas deferens of a male cactus mouse. We generate annotations and search relevant databases for ncRNAs and transporter protein sequences. We also describe the degree to which transcripts are shared among the three tissues and identify transcripts unique to each tissue, using preliminary transcript differences between tissue types that are based on a single individual.

Furthermore, we find genes evolving under a model of positive selection in the P. eremicus male reproductive transcriptome relative to P. maniculatus and M. musculus in order to generate a list of candidate genes for future investigations in desert adaptation genetics. Our future research will investigate the hypothesized male reproductive physiological adaptation to water limitation in the cactus mouse through a differential gene expression study, and the characterization of this reproductive transcriptome will form the foundation of studies in this vein. Moreover, this research contributes transcriptomic materials to a larger body of work in the expanding field of adaptation genetics, which benefits tremendously from enhanced opportunities for comparative analyses.

DRYAD DATA FILE LIST

Final Annotated Reproductive Tissue Transcriptome: reproductive.annotated.fasta (127 MB)

Transdecoder (five files): transdecoder.gff3 (49 MB); transdecoder.pep (27 MB); transdecoder.cds (62 MB); transdecoder.mRNA (172 MB); transdecoder.bed (10 MB)

Pfam Annotation: reproductive.pfam.gff3 (32 MB)

Rfam Annotation: reproductive.rfam.gff3 (191 KB)

Dammit Annotation: reproductive.dammit.gff3 (98 MB)

UniRef90 Annotation: reproductive.uniref.gff3 (9.5 MB)

Kallisto Results for Annotated Transcriptome (three files): kallisto.testes.tsv (3.4 MB); kallisto.epi.tsv (3.4 MB); kallisto.vas.tsv (3.4 MB)

ncRNA Database Matches (three files): epi.tpm.plus.ncRNA.txt (4.3 MB); vas.tpm.plus.ncRNA.txt (4.3 MB); testes.tpm.plus.ncRNA.txt (4.3 MB)

tcdb Database Matches (three files): epi.tpm.plus.tcdb.txt (946 KB); vas.tpm.plus.tcdb.txt (946 KB); testes.tpm.plus.tcdb.txt (946 KB)

Comparative analysis for genes under positive selection (three files): PAMLresults.txt (354 KB); significant_orthogroups.txt (6 KB); 42SeqsM2aPosSel.txt (89 KB)

ADDITIONAL INFORMATION AND DECLARATIONS

Funding

The work was funded by startup funds from the University of New Hampshire. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Grant Disclosures

The following grant information was disclosed by the authors: University of New Hampshire.

Competing Interests

The authors declare there are no competing interests.


Author Contributions

Lauren L. Kordonowy conceived and designed the experiments, performed the experiments, analyzed the data, wrote the paper, prepared figures and/or tables, reviewed drafts of the paper.

Matthew D. MacManes conceived and designed the experiments, performed the experiments, analyzed the data, contributed reagents/materials/analysis tools, wrote the paper, reviewed drafts of the paper.

Animal Ethics

The following information was supplied relating to ethical approvals (i.e., approving body and any reference numbers):

University of New Hampshire Animal Care and Use Committee: (protocol number 130902).

DNA Deposition

The following information was supplied regarding the deposition of DNA sequences:

European Nucleotide Archive (PRJEB13364, ID: 318672).

Data Availability

The following information was supplied regarding data availability: Dryad DOI: 10.5061/dryad.01c3t; Github: https://github.com/macmanes-lab/peer_reproductivetranscriptome.


Table 1 Transrate results for the reproductive transcriptome assembly produced by different optimization methods.

Assembly          Transrate Score   Optimized Score*   # Read Pairs (fragments)   Contigs (n_seqs)   # Good Contigs   % Good Contigs
Trinity Original  0.1944            0.3492             207,980,214                856,711            657,952          0.77
Filter TPM<0.5    0.1672            0.3013             207,980,214                147,966            78,424           0.53
Filter TPM<1.0    0.156             0.2854             207,980,214                80,165             54,140           0.68

NOTES. * This is the score of the Transrate optimized assembly in Table 2.

Table 2 BUSCO metrics for the reproductive transcriptome assembly produced by different optimization methods.

Assembly              % Complete   % Duplicated   % Fragmented   % Missing
Trinity original      90           49             3.4            5.5
Transrate Optimized   85           44             4.3            9.7
Filter TPM<0.5        85           38             3.0            11
Transrate TPM<0.5     73           31             3.9            22
Filter TPM<1.0        80           28             2.8            16
Transrate TPM<1.0     74           25             3.4            21

Table 3 The 42 genes that reached statistical significance (p < 0.05), after correcting for multiple hypothesis testing, in the M2a branch-site test for positive selection in PAML in the male Cactus mouse reproduction transcriptome.

Orthogroup ID   p-value    BLASTn Description
OG0010592       1.18E-10   Q6 sulfhydryl oxidase 2 (Qsox2)
OG0010774       1.04E-03   adenylate kinase 6 (Ak6)
OG0010833       4.90E-05   plakophilin 2 (Pkp2)
OG0011177       3.15E-03   1-acylglycerol-3-phosphate O-acyltransferase 2 (Agpat2)
OG0011272       1.06E-04   cDNA sequence BC089491
OG0011374       2.83E-03   hepatoma derived growth factor-like 1 (Hdgfl1)
OG0011384       2.05E-03   chitinase, acidic 1 (Chia1)
OG0011551       7.05E-04   zinc finger protein 770 (Zfp770)
OG0011784       1.63E-04   zinc finger, DHHC domain containing 19 (Zdhhc19)
OG0011914       1.33E-09   fumarylacetoacetate hydrolase domain containing 2A (Fahd2a)
OG0012115       0.00E+00   NUT midline carcinoma, family member 1 (Nutm1)
OG0012232       2.06E-03   leucine rich repeat neuronal 2 (Lrrn2)
OG0012396       1.19E-06   leucine rich repeat containing 46 (Lrrc46)
OG0012449       2.79E-04   F-box and leucine-rich repeat protein 14 (Fbxl14)
OG0012511       3.27E-08   SMAD family member 6 (Smad6)
OG0012690       0.00E+00   solute carrier family 47, member 1 (Slc47a1)
OG0012869       4.59E-02   bromodomain and WD repeat domain containing 3 (Brwd3)
OG0013171       5.47E-06   ArfGAP with SH3 domain, ankyrin repeat and PH domain 3 (Asap3)
OG0013288       6.09E-03   phosphodiesterase 3B, cGMP-inhibited (Pde3b)
OG0013304       3.42E-02   persephin (Pspn)
OG0013342       4.11E-02   mitochondrial methionyl-tRNA formyltransferase (Mtfmt)
OG0013590       1.77E-02   matrix metallopeptidase 15 (Mmp15)
OG0013771       1.77E-02   neuronal pentraxin 2 (Nptx2)
OG0013841       4.34E-03   transformation related protein 63 regulated like (Tprgl)
OG0014000       1.25E-02   alkaline phosphatase, liver/bone/kidney (Alpl)
OG0014048       0.00E+00   BPI fold containing family A, member 5 (Bpifa5)
OG0014062       1.42E-02   SPARC related modular calcium binding 2 (Smoc2)
OG0014193       1.05E-06   carbonic anhydrase 11 (Car11)
OG0014286       9.65E-03   cDNA 4930550C14 gene
OG0014309       7.40E-03   chymotrypsin-like elastase family, member 1 (Cela1)
OG0014333       2.11E-02   minichromosome maintenance 9 homologous recombination repair factor (Mcm9)
OG0014414       1.60E-04   growth arrest specific 6 (Gas6)
OG0014634       2.62E-03   glutathione peroxidase 2 (Gpx2)
OG0014827       0.00E+00   secretory carrier membrane protein 5 (Scamp5)
OG0014913       4.34E-04   STIP1 homology and U-Box containing protein 1 (Stub1)
OG0015025       3.23E-08   pancreatic lipase-related protein 2 (Pnliprp2)
OG0015310       1.06E-03   claspin (Clspn)
OG0015466       7.62E-05   excision repair cross-complementing rodent repair deficiency complementation group 6 like (Ercc6l)
OG0015627       4.18E-03   solute carrier family 15, member 3 (Slc15a3)
OG0015662       1.52E-03   epithelial membrane protein 2 (Emp2)
OG0015704       3.96E-03   cubilin (intrinsic factor-cobalamin receptor) (Cubn)
OG0015713       4.97E-07   suppressor APC domain containing 1 (Sapcd1)

Figure 1 Venn Diagram of transcript expression differences and similarities between the three reproductive tissues for a single male mouse. The total number of transcripts is 78,424.

[Venn diagram. Region counts: testes only, 3,563; epididymis only, 342; vas deferens only, 502; all three tissues, 64,553; two-tissue overlaps, 1,968, 2,555, and 4,736.]

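Figure 2 Venn Diagram of transcript matches between the three reproductive tissues to ncRNA sequences in Mus musculus. The total number of transcript matches across the tissue types is 15,964.

[Venn diagram. Region counts: testes only, 185; epididymis only, 26; vas deferens only, 45; all three tissues, 14,724; testes and vas deferens, 169; testes and epididymis, 182; epididymis and vas deferens, 620.]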

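Figure 3 Venn Diagram of transcript matches between the three reproductive tissues to protein sequences in the Transporter Classification Database. The total number of transcript matches across the tissue types is 7,521.

[Venn diagram. Region counts: testes only, 215; epididymis only, 19; vas deferens only, 37; all three tissues, 6,472; testes and vas deferens, 138; testes and epididymis, 200; epididymis and vas deferens, 424.]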

Figure 4 STRING diagrams of protein interactions for two proteins evolving under a model of positive selection in P. eremicus: Slc15a3 and Slc47a1.


REFERENCES

Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990. Basic local alignment search tool. Journal of Molecular Biology 215(3):403–410 DOI 10.1016/S0022-2836(05)80360-2.

Bedford NL, Hoekstra HE. 2015. The natural history of model organisms: Peromyscus mice as a model for studying natural variation. eLife 4:e06813 DOI 10.7554/eLife.06813.

Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society: Series B (Methodological) 57(1):289–300.

Bolger AM, Lohse M, Usadel B. 2014. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30(15):2114–2120 DOI 10.1093/bioinformatics/btu170.

Bray N, Pimentel H, Melsted P, Pachter L. 2015. Near-optimal RNA-seq quantification. ArXiv preprint. arXiv:1505.02710.

Carleton M. 1989. Systematics and evolution. In: Kirkland Jr LG, Layne JN, eds. Advances in the study of Peromyscus (Rodentia). Lubbock: Texas Tech University Press, 7–142.

César-Razquin A, Snijder B, Frappier-Brinton T, Isserlin R, Gyimesi G, Bai X, Reithmeier RA, Hepworth D, Hediger MA, Edwards AM, Superti-Furga G. 2015. A call for systematic research on solute carriers. Cell 162(3):478–487 DOI 10.1016/j.cell.2015.07.022.

Chen Q, Peng H, Lei L, Zhang Y, Kuang H, Cao Y, Shi Q, Ma T, Duan T. 2011. Aquaporin 3 is a sperm water channel essential for postcopulatory sperm osmoadaptation and migration. Cell Research 21(6):922–933 DOI 10.1038/cr.2010.169.

Cheviron ZA, Brumfield RT. 2011. Genomic insights into adaptation to high-altitude environments. Heredity 108(4):354–361 DOI 10.1038/hdy.2011.85.

Dean MD, Good JM, Nachman MW. 2008. Adaptive evolution of proteins secreted during sperm maturation: an analysis of the mouse epididymal transcriptome. Molecular Biology and Evolution 25(2):383–392 DOI 10.1093/molbev/msm265.

Dewsbury DA. 1982. Ejaculate cost and male choice. The American Naturalist 119(5):601–610.

Dewsbury DA. 1988. A test of the role of copulatory plugs in sperm competition in deer mice (Peromyscus maniculatus). Journal of Mammalogy 69(4):854–857 DOI 10.2307/1381648.

Djureinovic D, Fagerberg L, Hallstrom B, Danielsson A, Lindskog C, Uhlen M, Ponten F. 2014. The human testes-specific proteome defined by transcriptomics and antibody-based profiling. Molecular Human Reproduction 20(6):476–488 DOI 10.1093/molehr/gau018.

Ellegren H. 2014. Genome sequencing and population genomics in non-model organisms. Trends in Ecology and Evolution 29(1):1–13 DOI 10.1016/j.tree.2013.09.008.

Emms DM, Kelly S. 2015. OrthoFinder: solving fundamental biases in whole genome comparisons dramatically improves orthogroup inference accuracy. Genome Biology 16:157 DOI 10.1186/s13059-015-0721-2.

Finn RD, Coggill P, Eberhardt RY, Eddy SR, Mistry J, Mitchell AL, Potter SC, Punta M, Qureshi M, Sangrador-Vegas A, Salazar GA, Tate J, Bateman A. 2016. The Pfam protein families database: towards a more sustainable future. Nucleic Acids Research 44:D279–D285 DOI 10.1093/nar/gkv1344.

Fisher HS, Hoekstra HE. 2010. Competition drives cooperation among closely related sperm of deer mice. Nature 463(7282):801–803 DOI 10.1038/nature08736.

Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, Adiconis X, Fan L, Raychowdhury R, Zeng Q, Chen Z, Mauceli E, Hacohen N, Gnirke A, Rhind N, Di Palma F, Birren BW, Nusbaum C, Lindblad-Toh K, Friedman N, Regev A. 2011. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nature Biotechnology 29:644–652 DOI 10.1038/nbt.1883.

Guillen Y, Rius N, Delprat A, Williford A, Muyas F, Puig M, Casillas S, Ramia M, Egea R, Negre B, Mir G, Camps J, Moncunill V, Ruiz-Ruano FJ, Cabrero J, De Lima LG, Dias GB, Ruiz JC, Kapusta A, Garcia-Mas J, Gut M, Gut IG, Torrents D, Camacho JP, Kuhn GCS, Feschotte C, Clark AG, Betran E, Barbadilla A, Ruiz A. 2015. Genomics of ecological adaptation in cactophilic Drosophila. Genome Biology and Evolution 7(1):349–366 DOI 10.1093/gbe/evu291.

Haas BJ, Papanicolaou A, Yassour M, Grabherr M, Blood PD, Bowden J, Couger MB, Eccles D, Li B, Lieber M, MacManes MD, Ott M, Orvis J, Pochet N, Strozzi F, Weeks N, Westerman R, William T, Dewey CN, Henschel R, Leduc RD, Friedman N, Regev A. 2013. De novo transcript sequence reconstruction from RNA-Seq: reference generation and analysis with Trinity. Nature Protocols 8(8):1494–1512 DOI 10.1038/nprot.2013.084.

Hale BJ, Yang CX, Ross JW. 2014. Small RNA regulation of reproductive function. Molecular Reproduction and Development 81(2):148–159 DOI 10.1002/mrd.22272.

Hediger MA, Clemencon B, Burrier RE, Bruford EA. 2013. The ABCs of membrane transporters in health and disease (SLC series): introduction. Molecular Aspects of Medicine 34(2–3):95–107 DOI 10.1016/j.mam.2012.12.009.

Hediger MA, Romero MF, Peng J-B, Rolfs A, Takanaga H, Bruford EA. 2004. The ABCs of solute carriers: physiological, pathological and therapeutic implications of human membrane transport proteins: introduction. Pflügers Archiv 447(5):465–548 DOI 10.1007/s00424-003-1192-y.

Heliconius Genome Consortium. 2012. Butterfly genome reveals promiscuous exchange of mimicry adaptations among species. Nature 487:94–98 DOI 10.1038/nature11041.

Hines HM, Papa R, Ruiz M, Papanicolaou A, Wang C, Nijhout HF, McMillan WO, Reed RD. 2012. Transcriptome analysis reveals novel patterning and pigmentation genes underlying Heliconius butterfly wing pattern variation. BMC Genomics 13:288 DOI 10.1186/1471-2164-13-288.

Hoekstra HE, Hirschmann RJ, Bundey RA, Insel PA, Crossland JP. 2006. A single amino acid mutation contributes to adaptive beach mouse color patterns. Science 313:101–104 DOI 10.1126/science.1126121.

Hohenlohe PA, Bassham S, Etter PD, Stiffler N, Johnson EA, Cresko WA. 2010. Population genomics of parallel adaptation in threespine stickleback using sequenced RAD Tags. PLoS Genetics 6(2):e1000862 DOI 10.1371/journal.pgen.1000862.

Jones FC, Grabherr MG, Chan YF, Russell P, Mauceli E, Johnson J, Swofford R, Pirun M, Zody MC, White S, Birney E, Searle S, Schmutz J, Grimwood J, Dickson MC, Myers RM, Miller CT, Summers BR, Knecht AK, Brady SD, Zhang H, Pollen AA, Howes T, Amemiya C, Broad Institute Genome Sequencing Platform & Whole Assembly Team, Lander ES, Di Palma F, Lindblad-Toh K, Kingsley DM. 2012. The genomic basis of adaptive evolution in threespine sticklebacks. Nature 484:55–61 DOI 10.1038/nature10944.

Katoh K, Misawa K, Kuma K, Miyata T. 2002. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Research 30(14):3059–3066 DOI 10.1093/nar/gkf436.

Kawano M, Kawaji H, Grandjean V, Kiani J, Rassoulzadegan M. 2012. Novel small noncoding RNAs in mouse spermatozoa, zygotes and early embryos. PLoS ONE 7(9):e44542 DOI 10.1371/journal.pone.0044542.

Krawetz SA, Kruger A, Lalancette C, Tagett R, Anton E, Draghici S, Diamond MP. 2011. A survey of small RNAs in human sperm. Human Reproduction 26(12):3401–3412 DOI 10.1093/humrep/der329.

Kriventseva EV, Tegenfeldt R, Petty TJ, Waterhouse RM, Simao FA, Pozdnyakov IA, Ioannidis P, Zdobnov EM. 2015. OrthoDBv8: update of the hierarchical catalog of orthologs and the underlying free software. Nucleic Acids Research 43:D250–D256 DOI 10.1093/nar/gku1220.

Lorenzo FR, Huff C, Myllymäki M, Olenchock B, Swierczek S, Tashi T, Gordeuk V, Wuren T, Ri-Li G, McClain DA, Khan TM, Koul PA, Guchhait P, Salama ME, Xing J, Semenza GL, Liberzon E, Wilson A, Simonson TS, Jorde LB, Kaelin Jr WG, Koivunen P, Prchal JT. 2014. A genetic mechanism for Tibetan high-altitude adaptation. Nature Genetics 46(9):951–956 DOI 10.1038/ng.3067.

Löytynoja A. 2014. Phylogeny-aware alignment with PRANK. Methods in Molecular Biology 1079:155–170 DOI 10.1007/978-1-62703-646-7_10.

MacManes MD. 2014. On the optimal trimming of high-throughput mRNA sequence data. Frontiers in Genetics 5:13 DOI 10.3389/fgene.2014.00013.

MacManes MD. 2016. Establishing evidenced-based best practice for the de novo assembly and evaluation of transcriptomes from non-model organisms. bioRxiv DOI 10.1101/035642.

MacManes MD, Eisen MB. 2014. Characterization of the transcriptome, nucleotide sequence polymorphism, and natural selection in the desert adapted mouse Peromyscus eremicus. PeerJ 2:e642 DOI 10.7717/peerj.642.

MacMillen RE, Garland Jr T. 1989. Kirkland Jr LG, Layne JN, eds. Advances in the study of Peromyscus (Rodentia). Lubbock: Texas Tech University Press, 143–168.

Madden T. 2002. The BLAST sequence analysis tool. In: McEntyre J, Ostell J, eds. The NCBI Handbook. Bethesda: National Center for Biotechnology Information, Chapter 16. Available at https://www.ncbi.nlm.nih.gov/books/NBK153387/.

Marra NJ, Eo SH, Hale MC, Waser PM, DeWoody JA. 2012. A priori and a posteriori approaches for finding genes of evolutionary interest in non-model species: osmoregulatory genes in the kidney transcriptome of the desert rodent Dipodomys spectabilis (banner-tailed kangaroo rat). Comparative Biochemistry and Physiology, Part D: Genomics Proteomics 7(4):328–339 DOI 10.1016/j.cbd.2012.07.001.

Marra NJ, Romero A, DeWoody A. 2014. Natural selection and the genetic basis of osmoregulation in heteromyid rodents as revealed by RNA-seq. Molecular Ecology 23(11):2699–2711 DOI 10.1111/mec.12764.

McCracken KG, Barger CP, Bulgarella M, Johnson KP, Kuhner MK, Moore AV, Peters JL, Trucco J, Valqui TH, Winker K, Wilson RE. 2009a. Signatures of high-altitude adaptation in the major hemoglobin of five species of Andean dabbling ducks. The American Naturalist 174(5):631–650 DOI 10.1086/606020.

McCracken KG, Barger CP, Bulgarella M, Johnson KP, Sonsthagen SA, Trucco J, Valqui TH, Wilson RE, Winker K, Sorenson MD. 2009b. Parallel evolution in the major haemoglobin genes of eight species of Andean waterfowl. Molecular Ecology 18(19):3992–4005 DOI 10.1111/j.1365-294X.2009.04352.x.

Natarajan C, Hoffman FG, Lanier HC, Wolf CJ, Cheviron ZA, Spangler ML, Weber RE, Fago A, Storz JF. 2015. Intraspecific polymorphism, interspecific divergence, and origins of function-altering mutation in deer mouse hemoglobin. Molecular Biology and Evolution 32(4):978–997 DOI 10.1093/molbev/msu403.

Nawrocki EP, Burge SW, Bateman A, Daub J, Eberhardt RY, Eddy SR, Floden EW, Gardner PP, Jones TA, Tate J, Finn RD. 2015. Rfam 12.0: updates to the RNA families database. Nucleic Acids Research 43:D130–D137 DOI 10.1093/nar/gku1063.

Patro R, Duggal G, Kingsford C. 2015. Salmon: accurate, versatile and ultrafast quantification from RNA-seq data using lightweight-alignment. bioRxiv DOI 10.1101/021592.

Price M, Dehal P, Arkin A. 2010. FastTree2-approximately maximum-likelihood trees for large alignments. PLoS ONE 5(3):e9490 DOI 10.1371/journal.pone.0009490.

Ramm SA, Scharer L, Ehmcke J, Wistuba J. 2014. Sperm competition and the evolution of spermatogenesis. Molecular Human Reproduction 20(12):1169–1179 DOI 10.1093/molehr/gau070.

Ramm SA, Stockley P. 2007. Ejaculate allocation under varying sperm competition risk in the house mouse, Mus musculus domesticus. Behavioral Ecology 18(2):491–495 DOI 10.1093/beheco/arm003.

Saier MH, Reddy VS, Tamang DG, Vastermark A. 2014. The transporter classification database. Nucleic Acids Research 42:D251–D258 DOI 10.1093/nar/gkt1097.

Schmidt-Nielsen K, Schmidt-Nielsen B. 1952. Water metabolism of desert mammals 1. Physiological Reviews 32(2):135–166.

Schmidt-Nielsen B, Schmidt-Nielsen K, Brokaw A, Schneiderman H. 1948. Water conservation in desert rodents. Journal of Cellular Physiology 32(3):331–360 DOI 10.1002/jcp.1030320306.

Scott C. 2016. dammit: an open and accessible de novo transcriptome annotator. Available at www.camillescott.org/dammit.

Sikes RS, Gannon WL, Animal Care and Use Committee of the American Society of Mammalogists. 2011. Guidelines of the American Society of Mammalogists for the use of wild mammals in research. Journal of Mammalogy 92(1):235–253 DOI 10.1644/10-MAMM-F-355.1.

Simão FA, Waterhouse RM, Ioannidis P, Kriventseva EV, Zdobnov EM. 2015. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31(19):3210–3212 DOI 10.1093/bioinformatics/btv351.

Smith-Unna R, Boursnell C, Patro R, Hibberd JM, Kelly S. 2016. TransRate: reference free quality assessment of de-novo transcriptome assemblies. Genome Research 26:1134–1144 DOI 10.1101/gr.196469.115.

Snel B, Lehmann G, Bork P, Huynen MA. 2000. STRING: a web-server to retrieve and display the repeatedly occurring neighbourhood of a gene. Nucleic Acids Research 28(18):3442–3444 DOI 10.1093/nar/28.18.3442.

Song L, Florea L. 2015. Rcorrector: efficient and accurate error correction for Illumina RNA-seq reads. Gigascience 4:48 DOI 10.1186/s13742-015-0089-y.

Sonnhammer ELL, Eddy SR, Durbin R. 1997. Pfam: a comprehensive database of protein families based on seed alignments. Proteins 28(3):405–420.

Storz JF, Runck AM, Moriyama H, Weber RE, Fago A. 2010. Genetic differences in hemoglobin function between highland and lowland deer mice. Journal of Experimental Biology 213(15):2565–2574 DOI 10.1242/jeb.042598.

Storz JF, Wheat CW. 2010. Integrating evolutionary and functional approaches to infer adaptation at specific loci. Evolution 64(9):2489–2509 DOI 10.1111/j.1558-5646.2010.01044.x.

Sun J, Lin Y, Wu J. 2013. Long non-coding RNA expression profiling of mouse testis during post-natal development. PLoS ONE 8(10):e75750 DOI 10.1371/journal.pone.0075750.

Suzek BE, Huang H, McGarvey PB, Mazumder R, Wu CH. 2007. UniRef: comprehensive and non-redundant UniProt reference clusters. Bioinformatics 23(10):1282–1288 DOI 10.1093/bioinformatics/btm098.

Suzek BE, Wang Y, Huang H, McGarvey PB, Wu CH. 2015. UniRef clusters: a comprehensive and scalable alternative for improving sequence similarity searches. Bioinformatics 31(6):926–

932 DOI 10.1093/bioinformatics/btu739.

Swanson WJ, Vacquier VD. 2002. The rapid evolution of reproductive proteins. Nature Reviews

Genetics 3(2):137–144 DOI 10.1038/nrg733.

Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic

M, Roth A, Santos A, Tsafou KP, Kuhn M, Bork P, Jensen LJ, Von Mering C. 2015.

STRINGv10: protein-protein interaction networks, integrated over the tree of life. Nucleic Acids

Research 43:D447–D452 DOI 10.1093/nar/gku1003.

Turner LM, Chuong EB, Hoekstra HE. 2008. Comparative analysis of testis protein evolution in rodents. Genetics 179(4):2075–2089 DOI 10.1534/genetics.107.085902.

Uhlén M, Fagerberg L, Hallström BM, Lindskog C, Oksvold P, Mardinoglu A, Sivertsson Å, Kampf C, Sjöstedt E, Asplund A, Olsson I, Edlund K, Lundberg E, Navani S, Szigyarto CA, Odeberg J, Djureinovic D, Takanen JO, Hober S, Alm T, Edqvist PH, Berling H, Tegel H, Mulder J, Rockberg J, Nilsson P, Schwenk JM, Hamsten M, Von Feilitzen K, Forsberg M, Persson L, Johansson F, Zwahlen M, Von Heijne G, Nielsen J, Pontén F. 2015. Tissue-based map of the human proteome. Science 347(6220):1260419 DOI 10.1126/science.1260419.

Urity VB, Issaian T, Braun EJ, Dantzler WH, Pannabecker TL. 2012. Architecture of kangaroo rat inner medulla: segmentation of descending thin limb of Henle’s loop. American Journal of Physiology 302(6):R720–R726 DOI 10.1152/ajpregu.00549.2011.

Veal R, Caire W. 2001. Peromyscus eremicus. Mammalian Species 118:1–6.

Vicens A, Luke L, Roldan ERS. 2014. Proteins involved in motility and sperm-egg interaction evolve more rapidly in mouse spermatozoa. PLoS ONE 9(3):e91302 DOI 10.1371/journal.pone.0091302.

Vimtrup BJ, Schmidt-Nielsen B. 1952. The histology of the kidney of kangaroo rats. The Anatomical Record 114(4):515–528 DOI 10.1002/ar.1091140402.

Weber JN, Hoekstra HE. 2009. The evolution of burrowing behavior in deer mice (genus Peromyscus). Animal Behaviour 77(3):603–609 DOI 10.1016/j.anbehav.2008.10.031.

Weber JN, Peterson BK, Hoekstra HE. 2013. Discrete genetic modules are responsible for complex burrow evolution in Peromyscus mice. Nature 493(7342):402–405 DOI 10.1038/nature11816.

Wolff J. 1989. Social behavior. In: Kirkland Jr LG, Layne JN, eds. Advances in the study of Peromyscus (Rodentia). Lubbock: Texas Tech University Press, 271–291.

Yang Z. 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. Computer Applications in BioSciences 13(5):555–556.

Yang Z. 1998. Likelihood ratio tests for detecting positive selection and application to primate lysozyme evolution. Molecular Biology and Evolution 15(5):568–573.

Yang Z. 2007. PAML4: phylogenetic analysis by maximum likelihood. Molecular Biology and Evolution 24(8):1586–1591 DOI 10.1093/molbev/msm088.

Zhang J, Nielsen R, Yang Z. 2005. Evaluation of an improved branch-site likelihood method for detecting positive selection at the molecular level. Molecular Biology and Evolution 22(12):2472–2479 DOI 10.1093/molbev/msi237.


CHAPTER 2

Characterizing the reproductive transcriptomic correlates of acute dehydration in males in the desert-adapted rodent, Peromyscus eremicus

Lauren Kordonowy 1* and Matthew MacManes 1+

1. University of New Hampshire, Department of Molecular, Cellular and Biomedical Sciences, Durham, NH, USA

* [email protected]

+ [email protected]

Corresponding Author:

Lauren Kordonowy, [email protected]

Citation:

Kordonowy L, MacManes M. 2017. Characterizing the reproductive transcriptomic correlates of acute dehydration in males in the desert-adapted rodent, Peromyscus eremicus. bioRxiv DOI: https://doi.org/10.1101/096057; BMC Genomics: in review


ABSTRACT

How organisms living in extreme environments survive and reproduce, and which genomic and physiological mechanisms underlie that success, remains an outstanding question for evolutionary and organismal biologists. One interesting example of adaptation is the survival of mammals in deserts, where extreme water limitation is common. Research on desert rodent adaptations has focused predominantly on adaptations related to surviving dehydration, while potential reproductive physiology adaptations for acute and chronic dehydration have been relatively neglected. This study explores the reproductive consequences of acute dehydration using RNAseq data from the desert-specialized cactus mouse (Peromyscus eremicus). Specifically, we exposed 22 male cactus mice to either acute dehydration or control (fully hydrated) treatment conditions, quasimapped testes-derived reads to a cactus mouse testes transcriptome, and then evaluated patterns of differential transcript and gene expression. Following statistical evaluation with multiple analytical pipelines, nine genes were consistently differentially expressed between the hydrated and dehydrated mice. We hypothesized that male cactus mice would exhibit minimal reproductive responses to dehydration; therefore, this low number of differentially expressed genes between treatments aligns with current perceptions of this species’ extreme desert specialization. However, these differentially expressed genes include Insulin-like 3 (Insl3), a regulator of male fertility and testes descent, as well as the solute carriers Slc45a3 and Slc38a5, which are membrane transport proteins that may facilitate osmoregulation. Together, these results suggest that in male cactus mice, acute dehydration may be linked to reproductive modulation via Insl3, but not through gene expression differences in the subset of other a priori tested reproductive hormones. Although water availability is a reproductive cue in desert rodents exposed to chronic drought, potential reproductive modification via Insl3 in response to acute water limitation is unexpected in an animal capable of surviving and successfully reproducing year-round without available external water sources. Indeed, this work highlights the critical need for integrative research that examines every facet of organismal adaptation, particularly in light of global climate change, which is predicted, among other things, to increase climate variability, thereby exposing desert animals more frequently to the acute drought conditions explored here.

KEYWORDS adaptation, testes, genetics, transcriptomics, differential expression, reproduction, physiology, dehydration, cactus mouse, Peromyscus eremicus


BACKGROUND

For decades, evolutionary biologists have successfully described examples where natural selection has resulted in an exquisite match between organism and environment (e.g. salinity adaptations in three-spine sticklebacks: Hohenlohe et al., 2010; Jones et al., 2012; high-altitude adaptations for hemoglobin in deer mice and humans: Storz et al., 2010; Lorenzo et al., 2015; and Peromyscus adaptations for multiple environments: Hoekstra et al., 2006; Bedford & Hoekstra, 2015; Munshi-South & Richardson, 2016). The match between organism and environment must be studied in the context of both components of fitness, survival and reproductive success, because both aspects of selection are critical to long-term persistence in a given environment. Habitat specialists must possess phenotypes enabling survival and successful reproduction; therefore, cases where environmental selective pressures reduce reproductive success (e.g. Martin & Wiebe, 2004; Bolger, Patten & Bostock, 2005; Evans et al., 2010; Wingfield, Kelley & Angelier, 2011), but not survival, demand attention. Species occupying extreme environments are likely more vulnerable to the bifurcation of these two components of fitness. Moreover, long-term events like global climate change are predicted to increase climate variability and may enhance the challenges faced by species living on the fringes of habitable environments (Martin & Wiebe, 2004; Somero, 2010; Wingfield, Kelley & Angelier, 2011; Wingfield, 2013; Asres & Amha, 2014).

Deserts present extraordinary environmental impediments for habitation, including extreme heat, aridity, and solar radiation. Examples of well-described desert behavioral adaptations are seasonal torpor (reviewed in Kalabukhov, 1960; Geiser, 2010), nocturnality (e.g. Stephens & Tello, 2009; Fuller et al., 2014) and burrowing (reviewed in Vorhies, 1945; Kelt, 2011) to avoid high temperatures and sun exposure. Desert mammals also exhibit a wide range of morphological and physiological adaptations, including large ears for effective heat dissipation (e.g. Schmidt-Nielsen, 1964; Hill & Veghte, 1976), metabolic water production (e.g. MacMillen & Hinds, 1983; reviewed in Walsberg, 2000), and renal adaptations to minimize water loss (e.g. Schmidt-Nielsen et al., 1948; Dantzler, 1982; Diaz, Ojeda & Rezende, 2006). Although desert rodents must possess adaptations conferring survival and reproductive benefits, researchers have focused on their physiological adaptations for survival. For example, renal adaptations in species of kangaroo rats (Dipodomys species) have been described and explored for over 60 years (Schmidt-Nielsen et al., 1948; Schmidt-Nielsen and Schmidt-Nielsen, 1952; Marra et al., 2012; Urity et al., 2012). While early research determined the renal physiology of kangaroo rats (Schmidt-Nielsen et al., 1948; Schmidt-Nielsen and Schmidt-Nielsen, 1952; Vimtrup and Schmidt-Nielsen, 1952), recent research has focused on the genetic underpinnings of this phenotype (Marra et al., 2012; Urity et al., 2012; Marra, Romero & DeWoody, 2014; Marra et al., 2014), which is indicative of a larger methodological shift in the approach for examining adaptation.

Research in another desert-adapted rodent, Peromyscus eremicus (cactus mouse), has followed a somewhat different trajectory; however, it too has only pursued survival-oriented physiological mechanisms (but see Kordonowy and MacManes, 2016; Kordonowy et al., 2017; MacManes, 2017). The ecology, physiology and behaviors of the cactus mouse in comparison with other Peromyscus species were summarized in 1968 (King, ed.), and the relationships between basal metabolic rate, body mass, and evaporative water loss were reviewed several decades later (MacMillen and Garland, 1989). Known desert adaptations for the cactus mouse include nocturnality and torpor (reviewed in Veal and Caire, 1979; Caire, 1999); however, the cactus mouse does not possess the same elaborate kidney structures responsible for renal adaptations in kangaroo rats (Dewey et al., 1966; MacManes 2016, unpublished data). The physiological renal adaptations in P. eremicus have not been described in detail, despite considerable explorations of other aspects of this species’ biology (reviewed in Veal and Caire, 1979; Caire, 1999). In order to initially characterize renal function of the cactus mouse, water consumption measurements and electrophysical dehydration effects for this species have also recently been documented (Kordonowy et al., 2017). Because the mechanisms for mitigating renal water loss in P. eremicus have not been determined, a comparative genetic approach may be instrumental for characterizing this species’ adaptive kidney phenotype. To this end, MacManes and Eisen (2014) conducted a comparative analysis to find genes expressed in the kidney tissue of cactus mouse that were under positive selection relative to other mammals. MacManes (2017) also recently conducted differential gene expression analyses on cactus mouse kidneys subjected to acute dehydration to explore transcriptomic renal responses. However, the transcriptomic resources available for this species extend considerably beyond renal tissue; transcripts from cactus mouse (as well as numerous other Peromyscus species) have been heavily utilized to pursue questions related to multiple aspects of evolutionary biology (reviewed in Bedford and Hoekstra, 2015; Munshi-South and Richardson, 2016). Current investigations into cactus mouse desert-adaptive renal physiology include transcriptomic analyses (MacManes, 2017); however, we extended this genetic approach by shifting the focus from adaptations for survival to include physiological adaptations for reproductive success (Kordonowy and MacManes, 2016). The cactus mouse is an ideal system for investigating dehydration effects on reproduction, as well as potential reproductive adaptations for drought, given decades of study of its reproductive biology, as well as more recent development of transcriptomic resources that include male reproductive tissues.

Substantial research has been done on the effects of various types of stress on reproduction (e.g. Wingfield & Sapolsky, 2003; Ahmed et al., 2015; Nargund, 2015; Wingfield, 2013); furthermore, the impacts of dehydration stress on reproduction in desert-specialized rodents have historically been explored by studies documenting the impacts of water availability as a reproductive cue (reviewed in Schwimmer and Heim, 2009; Bales and Hostetler, 2011). Specifically, some female desert rodents have shown evidence of reproductive attenuation due to water limitation (Mongolian gerbil: Yahr and Kessler, 1975; hopping mouse: Breed, 1975), and male Mongolian gerbils subject to dehydration had decreased reproductive tissue mass (Yahr and Kessler, 1975). In contrast, Shaw’s jird, an Egyptian desert rodent, did not show a perceivable reproductive response to water deprivation in either males or females (El-Bakry et al., 1999). Furthermore, water-supplementation studies among wild desert rodents resulted in prolonged breeding seasons in the hairy-footed gerbil and the four-striped grass mouse, but not in the Cape short-eared gerbil (Christian, 1979). Recent research has confirmed the importance of rainfall as a reproductive cue in the Arabian spiny mouse (Sarli et al., 2016), the Baluchistan gerbil (Sarli et al., 2015), Cheesman’s gerbil (Henry and Dubost, 2012) and the Spinifex hopping mouse (Breed and Leigh, 2011). The focus of this previous research was to investigate reproductive cues and consequences of water limitation in desert rodents, namely how species have adapted breeding onset and cessation patterns to respond to water availability. Our current study experimentally tests reproductive responses to acute dehydration using a differential gene expression approach in the cactus mouse, which has not been previously evaluated for reproductive impacts of dehydration.

In nature, wild cactus mice are subjected to both acute and chronic dehydration, and understanding the reproductive effects of dehydration stress is an initial step for fully characterizing the suite of phenotypes enabling their successful reproduction. Given that this species has evolved in southwestern United States deserts and breeds continuously throughout the year (Veal and Caire, 1979; Caire, 1999), we predict that neither acute nor chronic water stress, while physiologically demanding, would be associated with reproductive suppression. To evaluate gene expression responses to acute water stress in reproductive tissue, we leveraged previous research that characterized the transcriptome of male P. eremicus reproductive tissues from functional and comparative perspectives (Kordonowy and MacManes, 2016). We extend this work by performing an RNAseq experiment to identify differentially expressed genes in testes between male P. eremicus subjected to acute dehydration and control (fully hydrated) animals in order to determine the impacts, if any, on male reproduction.

We hypothesized that male cactus mice would exhibit minimal gene expression level reproductive responses to acute dehydration because they are highly desert-adapted and they breed year-round, including in times of chronic drought. Specifically, we predicted that genes linked to reproductive function would not be differentially expressed in the testes in response to acute dehydration. We pursued this line of research on the effects of dehydration on reproduction in cactus mouse in order to begin to address the need for additional studies focusing on physiological adaptations related to reproductive success in animals living in extreme, and changing, environments.

METHODS

Treatment Groups, Sample Preparation and mRNA Sequencing

The cactus mice used for this study include only captive-born individuals purchased from the Peromyscus Genetic Stock Center (Columbia, South Carolina). The animals at the stock center are descended from individuals originally collected from a hot-desert location in Arizona more than 30 years ago. The colony used in this study has been housed since 2013 at the University of New Hampshire in conditions that mimic temperature and humidity levels in southwestern US deserts, as described previously (Kordonowy & MacManes, 2016). Males and females are housed together, which provides olfactory cues to support reproductive maturation. Males do not undergo seasonal testicular atrophy, as indicated by successful reproduction throughout the year. The individuals used in this study were all of the same developmental stage (reproductively mature), which was assessed by observing that the testes had descended into the scrotum from the abdomen, making them visible.

Males that had free access to water prior to euthanasia are labeled as WET mice in our analyses. Mice that were water deprived, which we refer to as DRY mice, were weighed and then water deprived for ~72 hours directly prior to euthanasia. All mice were weighed prior to sacrifice, and DRY mice were evaluated for weight loss during dehydration. Individuals in the study were collected between September 2014 and April 2016.

Cactus mice were sacrificed via isoflurane overdose and decapitation in accordance with University of New Hampshire Animal Care and Use Committee guidelines (protocol number 130902) and guidelines established by the American Society of Mammalogists (Sikes et al., 2016). Trunk blood samples were collected following decapitation for serum electrolyte analyses with an Abaxis Vetscan VS2 using critical care cartridges (Abaxis). The complete methodology and results of the electrolyte study, as well as the reported measures of water consumption and weight loss due to dehydration, are described fully elsewhere (Kordonowy et al., 2017). The present study instead focused on differential gene expression between the testes of 11 WET and 11 DRY mice. Testes were harvested within ten minutes of euthanasia, placed in RNAlater (Ambion Life Technologies), flash-frozen in liquid nitrogen, and stored at -80 °C. A TRIzol-chloroform protocol was implemented for RNA extraction (Ambion Life Technologies). Finally, the quantity and quality of the RNA product were evaluated with both a Qubit 2.0 Fluorometer (Invitrogen) and a Tapestation 2200 (Agilent Technologies, Palo Alto, USA).

Libraries were made with a TruSeq Stranded mRNA Sample Prep LT Kit (Illumina), and the quality and quantity of the resultant sequencing libraries were confirmed with the Qubit and Tapestation. Each sample was ligated with a unique adapter for identification in multiplex single lane sequencing. We submitted the multiplexed samples of the libraries for processing on lanes at the New York Genome Center Sequencing Facility (New York, NY). Paired-end sequencing reads of length 125 bp were generated on an Illumina 2500 platform. Reads were parsed by individual samples according to their unique hexamer IDs in preparation for analysis.

Assembly of Testes Transcriptome

We assembled a testes transcriptome from a single reproductively mature male using the de novo transcriptome protocol described previously (MacManes, 2016). The testes transcripts were assembled with alternative methodologies utilizing several optimization procedures to produce a high-quality transcriptome; however, the permutations of this assembly process are described extensively elsewhere (MacManes, 2016; Kordonowy and MacManes, 2016). The testes transcriptome we selected was constructed as described below. The raw reads were error corrected using Rcorrector version 1.0.1 (Song and Florea, 2015), then subjected to quality trimming (using a threshold of PHRED < 2, as per MacManes, 2014) and adapter removal using Skewer version 0.1.127 (Jiang et al., 2014). These reads were then assembled in the de novo transcriptome assembler BinPacker version 1.0 (Liu et al., 2016). We also reduced sequence redundancy to improve the assembly using the sequence clustering software CD-HIT-EST version 4.6 (Li & Godzik, 2006; Fu et al., 2012). We further optimized the assembly with Transrate version 1.0.1 (Smith-Unna et al., 2015) by retaining only highly supported contigs (cutoff: 0.02847). We then evaluated the assembly’s structural integrity with Transrate and assessed completeness using the vertebrata database in BUSCO version 1.1b1 (Simão et al., 2015). We quasimapped the raw reads to the assembly with Salmon version 0.7.2 (Patro et al., 2015) to confirm that mapping rates were high. Finally, the assembly was also annotated in dammit version 0.3.2, which finds open reading frames with TransDecoder and uses five databases (Rfam, Pfam, OrthoDB, BUSCO, and Uniref90) to thoroughly annotate transcripts (https://github.com/camillescott/dammit).

Differential Gene and Transcript Expression Analyses

Several recent studies have critically evaluated alternative methodologies for differential transcript and gene expression to determine the relative merits of these approaches (Gierlinski et al., 2015; Schurch et al., 2016; Soneson et al., 2016; Froussios et al., 2016). Soneson and colleagues (2016) demonstrated that differential gene expression (DGE) analyses produce more accurate results than differential transcript expression (DTE) analyses. Furthermore, the differential gene expression approach is more appropriate than differential transcript expression for the scope of our research question, which is true of many evolutionary genomic studies (Soneson et al., 2016). However, because both DTE and DGE approaches are widespread in current literature, we deemed it important to confirm that these methodologies yielded concordant results in the current study.

We utilized edgeR (Robinson, McCarthy & Smyth, 2010; McCarthy, Chen & Smyth, 2012) as our primary statistical software because Schurch and colleagues (2016) rigorously tested various packages for analyzing DGE, and edgeR performed optimally within our sample size range. While edgeR is a widely used statistical package for evaluating differential expression, we also repeated the analysis with another popular package, DESeq2 (Love et al., 2014), in order to validate our findings.

We performed differential expression analyses with three alternative methodologies. Two analyses were conducted in R version 3.3.1 (R Core Team, 2016) using edgeR version 3.16.1, a Bioconductor package (release 3.4) that evaluates statistical differences in count data between treatment groups (Robinson et al., 2010; McCarthy et al., 2012). Our first method utilized tximport, an R package developed by Soneson and colleagues (2016), which incorporates transcriptome mapping-rate estimates with a gene count matrix to enable downstream DGE analysis. The authors assert that such transcriptome mapping can generate more accurate estimates of DGE than traditional pipelines (Soneson et al., 2016). While our first methodology evaluated differential gene expression, our second analysis used the transcriptome mapped read sets to perform differential transcript expression and identify the corresponding gene matches. The purpose of this second analysis was to evaluate whether the transcript expression results coincided with the gene expression results produced by the same program, edgeR. Finally, our third methodology determined differential gene expression with tximport in conjunction with DESeq2 version 1.14.0 (Love et al., 2014), a Bioconductor package (release 3.4) which also evaluates statistical differences in expression. We performed this alternative DGE analysis with DESeq2 in order to corroborate our DGE results from edgeR. Thus, the results for all three differential expression analyses were evaluated to determine the coincidence among the genes identified as significantly different between the WET and DRY groups. These alternative differential expression methods are described in detail below.


We quasimapped each of the 11 WET and 11 DRY sample read sets to the testes transcriptome with Salmon version 0.7.2 to generate transcript count data. To perform the gene-level analysis in edgeR, we constructed a gene ID to transcript ID mapping file, which was generated by a BLASTn (Altschul et al., 1990; Madden, 2002) search for matches in the Mus musculus transcriptome (Ensembl.org) version 7/11/16 release-85. We then imported the Salmon-generated count data and the gene ID to transcript ID mapping file into R using the tximport package (Soneson et al., 2016) to convert the transcript count data into gene counts. These gene count data were imported into edgeR for differential gene expression analysis (Robinson et al., 2010; McCarthy et al., 2012). We applied TMM normalization to the data, calculated common and tagwise dispersions, and performed exact tests (p < 0.05), adjusting for multiple comparisons with the Benjamini-Hochberg correction (Benjamini and Hochberg, 1995), to find differentially expressed genes, which we identified in Ensembl (ensembl.org).
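The two bookkeeping steps in this gene-level analysis, collapsing transcript counts to gene counts and adjusting p-values for multiple comparisons, can be illustrated with a short sketch. It is written in Python rather than the R used for the analysis, and it is a simplification under stated assumptions: gene counts are taken as plain sums of transcript counts (tximport can additionally carry transcript-length information, which is omitted here), and the function names, transcript IDs and gene IDs are hypothetical.

```python
from collections import defaultdict


def sum_to_gene_level(tx_counts, tx2gene):
    """Sum per-transcript counts into per-gene counts.

    tx_counts: dict of transcript ID -> estimated count
    tx2gene:   dict of transcript ID -> gene ID (e.g. from a BLASTn match)
    Transcripts without a gene match are skipped.
    """
    gene_counts = defaultdict(float)
    for tx_id, count in tx_counts.items():
        gene_id = tx2gene.get(tx_id)
        if gene_id is not None:
            gene_counts[gene_id] += count
    return dict(gene_counts)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical transcript IDs and counts, for illustration only.
    counts = {"tx1": 10, "tx2": 5, "tx3": 7}
    mapping = {"tx1": "geneA", "tx2": "geneA"}
    print(sum_to_gene_level(counts, mapping))
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Genes whose adjusted p-value falls below 0.05 would then be reported as differentially expressed; normalization and dispersion estimation are handled by edgeR and are not reproduced in this sketch.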

Next, we performed a transcript-level analysis using edgeR. To accomplish this, the Salmon-generated count data were imported into R and analyzed as described above for the gene-level analysis in edgeR. After determining which transcript IDs were differentially expressed, we identified the corresponding genes using the gene ID to transcript ID matrix described previously. The significantly differentially expressed transcripts without corresponding gene matches were selected for an additional BLASTn search in the NCBI non-redundant nucleotide database (http://blast.ncbi.nlm.nih.gov/Blast.cgi). However, these results were not subjected to any additional analyses, because these matches were not consistent across all three differential expression analyses. This list of BLASTn search matches is provided in supplementary materials (DTEno-matchBLASTnSequences.md).

The third analysis used DESeq2 to conduct an additional gene-level test, using the same methods as described for the previous gene-level analysis, with the exception that data were imported into an alternative software package. We determined the significantly differentially expressed genes (p < 0.05) based on normalized counts and using the Benjamini-Hochberg correction (Benjamini and Hochberg, 1995) for multiple comparisons. We only retained genes with a log2 fold change below -1 or above 1 (|log2 fold change| > 1) in order to filter genes at a conservative threshold for differential expression based on our sample size (Schurch et al., 2016). This filtering was not necessary for either of the edgeR analyses because log2 fold changes exceeded this threshold for the differentially expressed genes and transcripts (log2 fold change < -1.3 or > 1.4 in all cases).
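The combined significance and fold-change filter described above can be sketched as follows. This is an illustrative Python fragment, not the code used in the study; the function name, gene IDs and values are hypothetical, and the sketch deliberately does not assume which sign of the log2 fold change corresponds to the WET or DRY group.

```python
def filter_by_threshold(results, alpha=0.05, min_abs_lfc=1.0):
    """Keep genes with adjusted p < alpha and |log2 fold change| > min_abs_lfc.

    results: dict of gene ID -> (log2 fold change, adjusted p-value)
    Returns two sorted lists: genes with positive and with negative
    log2 fold change.
    """
    positive, negative = [], []
    for gene_id, (lfc, padj) in sorted(results.items()):
        if padj >= alpha or abs(lfc) <= min_abs_lfc:
            continue  # fails the significance or the effect-size cutoff
        if lfc > 0:
            positive.append(gene_id)
        else:
            negative.append(gene_id)
    return positive, negative


if __name__ == "__main__":
    # Hypothetical results table, for illustration only.
    table = {
        "g1": (1.5, 0.01),     # kept, positive
        "g2": (-2.0, 0.001),   # kept, negative
        "g3": (0.4, 0.0001),   # significant but small effect
        "g4": (3.0, 0.2),      # large effect but not significant
        "g5": (-1.0, 0.01),    # exactly on the threshold, not retained
    }
    print(filter_by_threshold(table))
```

Using a strict inequality means a gene sitting exactly at a log2 fold change of 1 or -1 is dropped, which matches the "below -1 or above 1" wording of the filter.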

We also compared the log2 fold change values (of treatment differences by mapped count) for each gene from the edgeR and DESeq2 gene-level analyses in a linear regression. This statistical test was performed in order to evaluate the degree of concordance between the two DGE analyses. Furthermore, we constructed a list of genes identified as differentially expressed by all three analyses, which were further evaluated for function as well as chromosomal location. These genes were also explored in STRING version 10.0 (string-db.org) to determine their protein-protein interactions (Snel et al., 2000; Szklarczyk et al., 2015).
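Both concordance checks in this paragraph are simple to express in code. The sketch below, in Python rather than R, fits an ordinary least-squares line to per-gene log2 fold changes from two analyses and intersects the significant gene lists from several analyses. It assumes a single-predictor regression with an intercept; the function names and gene IDs are hypothetical and the values are invented for illustration.

```python
def shared_genes(*gene_lists):
    """Return the set of gene IDs present in every supplied list."""
    return set.intersection(*(set(genes) for genes in gene_lists))


def regress_fold_changes(lfc_a, lfc_b):
    """Regress analysis B's log2 fold changes on analysis A's.

    lfc_a, lfc_b: dicts of gene ID -> log2 fold change.
    Only genes present in both dicts are used.
    Returns (slope, intercept, adjusted R squared).
    """
    genes = sorted(set(lfc_a) & set(lfc_b))
    n = len(genes)
    if n < 3:
        raise ValueError("need at least three shared genes")
    x = [lfc_a[g] for g in genes]
    y = [lfc_b[g] for g in genes]
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("fold changes must vary in both analyses")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = sxy ** 2 / (sxx * syy)
    # Adjusted R squared for one predictor plus an intercept.
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - 2)
    return slope, intercept, adj_r2


if __name__ == "__main__":
    # Hypothetical fold changes and gene lists, for illustration only.
    a = {"g1": 1.0, "g2": 2.0, "g3": 3.0, "g4": 4.0, "g9": 0.5}
    b = {"g1": 2.0, "g2": 4.0, "g3": 6.0, "g4": 8.0}
    print(regress_fold_changes(a, b))
    print(sorted(shared_genes(["g1", "g2", "g3"], ["g2", "g3"], ["g3", "g2", "g7"])))
```

Restricting the regression to genes reported by both analyses avoids treating a gene that is missing from one results table as a fold change of zero.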

Lastly, we performed an a priori test for DGE in edgeR on a small subset of nine genes encoding hormones and hormone receptors known to be involved in various aspects of reproductive functionality in male rodents. These genes are: steroidogenic acute regulatory protein (StAR), prolactin receptor (Prlr), luteinizing hormone/choriogonadotropin receptor (Lhcgr), inhibin (Inha), ghrelin (Ghrl), estrogen related receptor gamma (Esrrg), estrogen related receptor alpha (Esrra), androgen receptor (Ar), and activin receptor type-2A (Acvr2a). We retrieved the Mus musculus genomic sequences for these hormones and receptors from Ensembl (release 88: March 2017) and then executed BLASTn searches for the corresponding Peromyscus eremicus sequences in the testes transcriptome. The Ensembl gene identifiers (Mus musculus) corresponding to the P. eremicus transcripts were queried from the table of results produced by the edgeR DGE analysis to evaluate treatment differences in expression.

RESULTS

Data and Code Availability

The testes transcriptome was assembled from a 45.8 million paired read data set. Additionally, there were 9-20 million paired reads for each of the 22 testes data sets used for the differential expression analysis (Supplemental Table 1), yielding 304,466,486 reads total for this analysis. The raw reads are available at the European Nucleotide Archive under study accession number PRJEB18655. All data files, including the testes un-annotated transcriptome, the dammit annotated transcriptome, and the data generated by the differential gene expression analysis (described below) are available on DropBox: https://www.dropbox.com/sh/ffr9xrmjxj9md1m/AACpxjQNn-Jlf25qNdslfRSCa?dl=0. These files will be posted to Dryad upon manuscript acceptance. All code for these analyses is posted on GitHub (https://github.com/macmanes-lab/testesDGE).

Assembly of Testes Transcriptome

The performance of multiple transcriptome assemblies was evaluated thoroughly, and the selected optimized testes assembly met high quality and completeness standards, and it also contains relatively few contigs and has high read mapping rates (Table 1). Therefore, this transcriptome was used for our differential expression analyses. The transcriptome was also annotated, and the complete statistics for this dammit annotation are provided in Table 1.


Differential Gene and Transcript Expression Analyses

Salmon quasimapping rates of all read datasets to the assembly were sufficiently high (range: 81.46% - 87.02%; mean WET = 84.41%; mean DRY = 83.81%; Supplemental Table 1), indicating the successful generation of transcript count data for our differential expression analyses. The exact test performed for our gene-level analysis in edgeR indicated that fifteen genes reached statistical significance (after adjusting for multiple comparisons) for DGE between the WET and DRY treatment groups (Supplemental Figure 1). Specifically, seven genes were more highly expressed in WET individuals, and eight genes were more highly expressed in DRY individuals (Table 2).

We also performed an alternative transcript-level analysis using the transcriptome-mapped reads exclusively with edgeR. The exact test found 66 differentially expressed transcripts (Supplemental Figure 2), 45 of which were more highly expressed in the WET group, and 21 of which were more highly expressed in the DRY group (Table 3). Ten of these differentially expressed transcripts were consistent with differentially expressed genes from the edgeR DGE analysis. In addition, the significantly differentially expressed transcripts without an Ensembl ID match (nine WET and nine DRY) were retrieved for performing an nt all species BLASTn search (http://blast.ncbi.nlm.nih.gov/Blast.cgi), and these results are in the supplementary materials.

The gene-level analysis conducted in DESeq2 yielded 215 significantly differentially expressed genes (Supplemental Figure 3), 67 of which were more highly expressed in the WET group, while 148 were more highly expressed in the DRY group. However, only 20 of these genes remained when we retained only those with a log2 fold change below -1 or above 1, a conservative threshold for the difference between treatment groups. This list of 20 genes comprised 16 genes more highly expressed in WET mice and four genes more highly expressed in DRY mice (Table 4). Nine of these genes overlapped with those found to be significant in the previous two edgeR analyses.

To evaluate the correlation of log2 fold change results for each gene (Ensembl ID) from the two DGE analyses (edgeR and DESeq2), we performed a regression of these log values, and they were significantly correlated (Figure 1: Adj-R^2 = 0.6596; F(1,14214) = 2.754 x 10^4; p < 2.2 x 10^-16). This further demonstrates the concordance of the DGE analyses in these two software packages.
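As a hedged illustration of this regression, the sketch below fits an ordinary least squares line in plain Python. The full per-gene values are not reproduced in this chapter, so the example uses only the nine genes shared by Tables 2 and 4; it therefore does not reproduce the reported statistics. The helper `f_from_adjusted_r2` is a hypothetical name introduced here to show that the reported adjusted R^2 and F statistic are mutually consistent for a single-predictor model with 14,214 residual degrees of freedom.

```python
def ols_summary(x, y):
    """Fit y = intercept + slope * x by ordinary least squares (one predictor)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    df_resid = n - 2
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / df_resid
    f_stat = r2 / (1.0 - r2) * df_resid
    return {"slope": slope, "intercept": intercept, "r2": r2,
            "adj_r2": adj_r2, "f": f_stat, "df_resid": df_resid}


def f_from_adjusted_r2(adj_r2, df_resid):
    """Recover the F statistic of a one-predictor model from its adjusted R^2."""
    n = df_resid + 2
    r2 = 1.0 - (1.0 - adj_r2) * df_resid / (n - 1)
    return r2 / (1.0 - r2) * df_resid


# log2FC for the nine genes present in both Table 2 (edgeR) and Table 4 (DESeq2):
# Insl3, Ffar4, Slc45a3, Slc38a5, Itgal, Trf, Rin2, Igfbp3, Ctgf.
edger = [-4.354, -3.734, -2.448, -2.421, -2.180, -2.066, 3.086, 2.681, 2.314]
deseq2 = [-1.6454793, -2.2659204, -2.2184407, -1.6434261, -1.4358002,
          -1.3762549, 1.72433255, 1.56656207, 1.03331405]

fit = ols_summary(edger, deseq2)   # toy fit on nine genes only
print(fit)

# Consistency check on the values reported in the text.
print(f_from_adjusted_r2(0.6596, 14214))
```

The second call returns a value close to the reported F of 2.754 x 10^4, which is what one would expect if both statistics come from the same regression.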

To evaluate the degree to which the three analyses produced concordant results, we generated a list of genes that were found to be significantly differentially expressed by treatment across all three analyses (Supplemental Table 2). There were six genes that were consistently highly expressed in the WET group and three genes that were highly expressed in the DRY group. The six highly expressed WET genes are Insulin-like 3 (Insl3), Free-fatty acid receptor 4 (Ffar4), Solute carrier family 45 member 3 (Slc45a3), Solute carrier family 38 member 5 (Slc38a5), Integrin alpha L (Itgal), and Transferrin (Trf). The three highly expressed DRY genes are Ras and Rab Interactor 2 (Rin2), Insulin-like growth factor binding protein 3 (Igfbp3), and Connective tissue growth factor (Ctgf). Because the patterns of expression of these nine genes were corroborated by multiple methodologies, we are confident that they are differentially expressed between our treatments. Estimates of expression for these genes generated using the gene-level edgeR analysis are plotted in Figure 2.
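The consensus list can be cross-checked against Tables 2 through 4 with a short set intersection. The sketch below is illustrative only: the gene symbols are transcribed from those tables (transcripts without an Ensembl match are omitted), and the variable names are introduced here for the example rather than taken from the analysis code.

```python
# Gene symbols transcribed from Table 2 (edgeR, gene level).
edger_genes = {
    "Insl3", "Ffar4", "Slc45a3", "Slit1", "Slc38a5", "Itgal", "Trf",
    "Rin2", "Cyp2e1", "Igfbp3", "Ctgf", "Fmo2", "Tnfrsf21", "Cyp2f2", "Dennd2d",
}

# Gene symbols transcribed from Table 3 (edgeR, transcript level; matched only).
edger_transcript_genes = {
    "Ffar4", "Insl3", "Slc45a3", "Slc38a5", "Trf", "Cpz", "Tgfb3", "Slit1",
    "Ankrd2", "Ces1g", "Nvl", "Ogdhl", "Pfkfb4", "Slc33a1", "Anxa9", "Itgal",
    "Ddb2", "St3gal1", "Acsm5", "Cyp17a1", "Olfml2b", "Pf4", "Nptx2", "Dnah6",
    "Sbpl", "Mbp", "Adcy6", "Gm5424", "Fbxo2", "Mycl", "Eci1", "Col6a1",
    "Capn12", "Gpr55", "Igfbp3", "Ctgf", "Rin2", "Nedd1", "Wdr83", "Gpx4",
    "Asah1", "Adm", "Tmem108", "Nkx3-1", "Ybx1", "Arfgef2",
}

# Gene symbols transcribed from Table 4 (DESeq2, after the fold-change filter).
deseq2_genes = {
    "Ffar4", "Slc45a3", "Insl3", "Slc38a5", "Olfml2b", "Itgal", "Trf", "Tgfb3",
    "Acsm5", "Nptx2", "Gm5424", "Sbpl", "Ogdhl", "Anxa9", "Cpz", "Ankrd2",
    "Tmem108", "Ctgf", "Igfbp3", "Rin2",
}

# Genes listed as more highly expressed in WET mice in Table 2.
high_in_wet = {"Insl3", "Ffar4", "Slc45a3", "Slit1", "Slc38a5", "Itgal", "Trf"}

# Genes significant in all three analyses.
consensus = edger_genes & edger_transcript_genes & deseq2_genes
consensus_wet = sorted(consensus & high_in_wet)
consensus_dry = sorted(consensus - high_in_wet)

print(len(consensus), consensus_wet, consensus_dry)
```

The intersection recovers the six WET and three DRY genes named in the text; the same sets also give the ten genes shared between the two edgeR analyses.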

The significantly differentially expressed genes were evaluated for gene function and chromosomal location (Table 5). These genes occur throughout the genome; namely, they are located on different chromosomes. The diverse functions of each gene will be described below.

In addition, we generated STRING diagrams (string-db.org) to view the protein-protein interactions for each of these nine genes (Snel et al., 2000; Szklarczyk et al., 2015).

Slc38a5 and Slc45a3 are among the highly expressed genes in the WET group (they have lower expression in the DRY group); these two solute carriers are members of a large protein family that is responsible for cross-membrane solute transport (reviewed in Hediger et al., 2004; Hediger et al., 2013; Cesar-Razquin et al., 2015). Slc38a5 is involved in sodium-dependent amino-acid transport, while Slc45a3 is purported to transport sugars (Vitavska and Wieczorek, 2013; Schiöth et al., 2013; http://slc.bioparadigms.org/), thereby playing an important potential role in maintaining water balance via management of oncotic pressures. Slc38a5 (Figure 3a) has interactions with multiple additional solute carriers, including Slc1a5, Slc36a2, Slc36a3, and Slc36a4. Slc38a5 also has an interaction with disintegrin and metalloproteinase domain-containing 7 (Adam7), which is involved in sperm maturation and the acrosome reaction (Oh et al., 2005). In contrast, Slc45a3 (Figure 3b) does not have known protein interactions with other solute carriers; however, this protein does interact with steroidogenic acute regulatory protein (StAR), which is critical in steroidogenesis (Christenson and Strauss III, 2001). Notably, our a priori DGE analysis did not demonstrate treatment differences in expression for StAR.

Insl3 had lower expression in the DRY group, and this hormone purportedly regulates fertility in male and female mammals by preventing apoptosis of germ cells in the reproductive organs of both sexes (Kawamura et al., 2004; Bathgate et al., 2012; Bathgate et al., 2013). In male rodents, Insl3 is critical to development by facilitating testicular descent, and it is also present in the testes of adults, where it binds to relaxin family peptide receptor 2 (Rxfp2), also known as Lgr8 (Bathgate et al., 2012; Bathgate et al., 2013). Protein interaction data for Insl3 (Figure 3c) indicate that this hormone interacts with Rxfp2 and Rxfp1, as well as other proteins, including leptin (Lep), a pleiotropic hormone involved in reproduction, immunity, and metabolism (reviewed in Friedman, 2014).

Ffar4 was also down-regulated in the DRY group. Omega-3 fatty acid receptor 1 (O3Far1) is an alias of Ffar4, and it has roles in metabolism and inflammation (Moniri, 2016). This protein interacts with multiple other free fatty acid receptors and G-protein coupled receptors as well as Stanniocalcin 1 (Stc1) (Figure 3d). Stc1 is involved in phosphate and calcium transport (Wagner and Dimattia, 2006); however, this protein’s functional role in mice remains enigmatic (Chang et al., 2005).

Another of the genes with lower expression in the DRY group is Itgal (also known as CD11a), which has multifaceted roles in lymphocyte-mediated immune responses (Bose et al., 2014). Concordantly, the protein interactions with Itgal (Figure 3e) include numerous proteins integral to immunity, such as intercellular adhesion molecules (specifically, ICAM1, 2, and 4), which are expressed on the cell surface of immune cells and endothelial cells. Itgal is a receptor for these ICAM glycoproteins, which bind during immune system responses (reviewed in Albelda et al., 1994). However, an additional role of intercellular adhesion molecules has been proposed in spermatogenesis, whereby ICAMs may be integral to transporting non-mobile developing sperm cells through the seminiferous epithelium (Xiao et al., 2013).

The final gene with lower expression levels in the DRY treatment is Trf, which modulates the amount of free iron in circulation and binds to transferrin receptors on the surface of erythrocyte precursors to deliver iron (reviewed in Gkouvastos et al., 2012). Trf interacts with multiple proteins (Figure 3f) involved in iron transport and uptake, including Steap family member 3 (Steap3), hephaestin (Heph), ceruloplasmin (Cp), Solute carrier protein 40 member 1 (Slc40a1), and several H+ ATPases. Furthermore, Trf is linked to apolipoprotein A-1 (Apoa1), which interacts with immunoglobulin in a complex named sperm activating protein (Spap) to activate the motility of sperm when it inhabits the female genital tract (Akerlof et al., 1991; Leijonhufvud et al., 1997).

One of the highly expressed genes in the DRY group is Rin2, which is involved in endocytosis (reviewed in Doherty and McMahon, 2009) and membrane trafficking through its actions as an effector protein for the GTPases in the Rab family within the Ras superfamily (reviewed in Stenmark and Olkkonen, 2001). Rin2 protein-protein interactions (Figure 4a) include Ras related protein Rab5b and Rab5b, which are involved in vesicle transport as well as vasopressin-regulated water reabsorption. This mechanism for water reabsorption via Aquaporin 2 (Aqp2) in the kidney has been thoroughly reviewed by Boone and Deen (2008) and Kwon and colleagues (2013).

The second gene highly expressed in the DRY group is Igfbp3, which modulates the effects of insulin-like growth factors. Thus, the protein directly interacts (Figure 4b) with insulin-like growth factors 1 and 2 (Igf1, Igf2), which are responsible for increasing growth in most tissues (reviewed in le Roth, 1997; Jones and Clemmons, 2008). Ctgf was also highly expressed in the DRY group, and this protein is responsible for increased fibrosis and extracellular matrix formation (reviewed in Moussad and Brigstock, 2000). The protein interactions for Ctgf (Figure 4c) include many transcription activators in the Hippo signaling pathway, including multiple TEA domain transcription factors (Tead1, 2, 3 and 4), WW domain containing transcription regulator 1 (Wwtr1), as well as Yes-associated protein 1 (Yap1), which is responsible for both increasing apoptosis and preventing cell proliferation to mitigate tumor growth and control organ size (reviewed in Pan, 2010).

The a priori edgeR DGE analysis for the genes encoding nine reproductive hormones and hormone receptors did not reveal any statistically significant differences between the WET and DRY mice. The log fold change values and corresponding p-values for these genes are in the analysis posted on GitHub. The patterns for these genes by treatment are shown in Figure 5.

DISCUSSION

This is the first study to evaluate gene expression levels of a reproductive tissue (testes) in response to acute dehydration in a desert-specialized rodent, Peromyscus eremicus (cactus mouse). Our results demonstrate differential expression of Insl3, which is a gene linked to reproduction, but not for a small subset of other reproductive hormone (and hormone receptor) genes. We also found expression differences in two solute carrier proteins, which is consistent with previous findings asserting the importance of this protein family for osmoregulation in desert rodents. Our findings lead us to hypothesize that reproductive function may be modified via Insl3 in acutely dehydrated mice. Any transcriptomic indication of potential reproductive modification in response to acute dehydration is surprising, given that this is not consistent with our understanding of P. eremicus as a desert specialist capable of breeding year-round in the wild. However, future studies must determine the physiological effects of decreased Insl3 expression on acutely dehydrated cactus mice. While acute dehydration is less common than chronic dehydration for desert mammals, given their ecology, it is a selective force they must overcome. Indeed, throughout much of the described range of the cactus mouse, rainfall events may occur several times per year. Cactus mice, and many other rodents, are known to rehydrate during these rainfall events (MacManes, personal observation). Following rehydration, cactus mice experience acute dehydration, followed by a steady state of chronic dehydration. The

reproductive responses of cactus mice to these acute and chronic dehydration events are unknown; therefore, this study describes the transcriptomic effects of acute dehydration in testes.

Insl3, which is believed to be a hormonal regulator of fertility among mammals of both sexes, inhibits germ line apoptosis in the testes (Kawamura et al., 2004; Bathgate et al., 2012; Bathgate et al., 2013). Within adult rodent testes, luteinizing hormone (LH) stimulates expression of Insl3 in Leydig cells, and Insl3 binds to Lgr8 in seminiferous tubules, which results in inhibited apoptosis of germ-line cells, thus increasing their availability (Kawamura et al., 2004). In addition, a study using murine Leydig cells demonstrated that Insl3 administration increased testosterone production (Pathirana et al., 2012). The precise mechanistic role of Insl3 in modulating fertility is still being elucidated; however, researchers assert that this hormone is an important regulator of fertility in males and females (reviewed in Bathgate et al., 2012). Indeed, recent research has investigated the utility of Insl3 as an indicator of mammalian fertility (e.g. in humans: Kovac and Lipshultz, 2013; in bulls: Pitia et al., 2016). Insl3 is also critical for the first phase of testicular descent, the transabdominal phase, which occurs during fetal development in rodents; but Insl3 does not appear to be involved in the inguinoscrotal phase, which happens in sexually immature or inactive male rodents (reviewed in Hutson et al., 2015).

Lower Insl3 expression in the testes of acutely dehydrated mice leads us to suggest that fertility may be attenuated due to acute water deprivation. However, future work characterizing the functional consequences of Insl3 down-regulation, including direct measurements of sperm numbers and function, is needed to demonstrate a causal link to reproductive attenuation. Specifically, does the number or quality of sperm decrease, and does this decrease reduce the probability of successful fertilization? Moreover, what are the temporal dynamics of reproductive suppression? Logically, species with core reproductive functions that are suppressed by dehydration seem likely to be rapidly outcompeted by others lacking such limitations. Given this assertion, research characterizing the reproductive correlates of chronic dehydration is a logical extension of this work, although doing so is beyond the scope of this study.

Solute carrier proteins, specifically Slc45a3 and Slc38a5, are downregulated in acute dehydration. These genes are part of a large family essential for transferring solutes across membranes (reviewed in Hediger et al., 2004; Hediger et al., 2013; Cesar-Razquin et al., 2015). Another member of this family, Solute carrier family 2 member 9 (Slc2a9), has been found to be undergoing positive selection in studies on kidney transcriptomes of cactus mouse (MacManes & Eisen, 2014) and of other desert rodents (Marra et al., 2014). Our previous work with the male reproductive transcriptome of cactus mouse found evidence for positive selection in two additional solute carrier proteins: Slc15a3 and Slc47a1 (Kordonowy and MacManes, 2016). A recent differential gene expression study in cactus mouse kidneys found that Slc2a1 and Slc8a1 also showed responses to acute dehydration (MacManes, 2017). Therefore, our current finding that two solute carrier proteins have lower expression in the DRY treatment group is consistent with previous research in the kidney and male reproductive transcriptomes for this species. This leads us to further support the hypothesis originally proposed by Marra and colleagues (2014) that this protein family is intrinsic to osmoregulation in desert rodents. Indeed, the findings of MacManes and Eisen (2014), Kordonowy and MacManes (2016), and MacManes (2017) also lend support to the essential role of solute carrier proteins for maintaining homeostasis in the desert-specialized cactus mouse.

In addition to their well characterized role in the maintenance of water and electrolyte balance, the differential expression of solute carrier proteins may have important reproductive consequences, particularly as they relate to hormone secretion. Indeed, the interaction between Slc38a5 and Adam7 is relevant, because Adam7 is involved in sperm maturation and the acrosome reaction (Oh et al., 2005). Furthermore, the protein-protein interactions between Slc45a3 and StAR and between Insl3 and Lep are of particular interest because both StAR and Lep are integral to reproduction, as well as to homeostasis (reviewed in Christenson and Strauss III, 2001; Anuka et al., 2013; Friedman, 2014; Allison and Myers, 2014). However, our a priori DGE analysis evaluating StAR, and other reproductive hormones, did not show evidence of expression changes. Thus, the protein interactions with reproductive implications are not restricted to solute carrier proteins. The protein relationships between Itgal and intercellular adhesion molecules are also noteworthy with respect to research hypothesizing an integral role for ICAMs in spermatogenesis (Xiao et al., 2013). Furthermore, Trf is linked to Apoa1, which is a critical component of sperm activating protein (Akerlof et al., 1991; Leijonhufvud et al., 1997). While the relationships between these differentially expressed genes and the hormones involved in reproductive function are currently poorly characterized, our findings that genes integral to sperm development and activation interact with genes differentially expressed in acute dehydration may indicate that, contrary to our expectations, acute dehydration is linked to reproductive modulation in the cactus mouse. However, functional studies will be necessary to elucidate the connection between these genes and physiological responses to dehydration. This is particularly important because many hormones have pleiotropic effects, and further mechanisms of action unrelated to reproduction may be elucidated for these proteins in Peromyscus eremicus.

In contrast to genes that are down-regulated in dehydration, the genes that were up-regulated in the DRY group are known to be responsible for water homeostasis and cellular growth. The significance of Rin2 is notable, because this protein is an effector for Rab5, which is a GTPase involved in vasopressin-regulated water reabsorption, a critical homeostatic process mediated through the Aqp2 water channel in kidneys (Boone and Deen, 2008; Kwon et al., 2013). It is not surprising that genes in addition to solute carrier proteins, which are implicated in alternative processes for water homeostasis, are differentially expressed in response to water limitation. The other two genes that are up-regulated in the DRY treatment are indicative of modulated growth due to water limitation. Specifically, Igfbp3 interacts directly with insulin-like growth factors responsible for tissue growth (le Roth 1997; Jones and Clemmons, 2008), and Ctgf is linked with numerous transcription factors in the Hippo signaling pathway, which modulates apoptosis, proliferation, and organ size control (Pan, 2010).

To complement our male-centric research, future studies should evaluate dehydration-induced gene expression differences in female reproductive tissues, particularly in the uterus and ovaries during various reproductive stages. Indeed, given that the physiological demands of reproduction are purportedly greater in females, though this is controversial (Bateman’s Principle: proposed in Bateman, 1948; addressed in Trivers, 1972; reviewed in Knight, 2002; tested in Jones et al., 2002; 2005; Collet et al., 2014), we would expect to see a greater degree of reproductive suppression in females. While such work is beyond the scope of this manuscript, we hope that future research will evaluate female cactus mouse reproductive responses to dehydration.

Our findings are pertinent to physiological research in other desert rodents showing reproductive suppression in response to water limitation (reviewed in Bales and Hostetler, 2011), specifically, in male and female Mongolian gerbils (Yahr and Kessler, 1975) and female hopping mice (Breed, 1975). The integral role of water as a reproductive cue for desert rodents has also been demonstrated in water-supplementation studies (reviewed in Bales and Hostetler, 2011; Christian, 1979) as well as research on the effects of desert rainfall (Breed and Leigh, 2011; Henry and Dubost, 2012; Sarli et al., 2015; Sarli et al., 2016). Thus, Schwimmer and Haim (2009) asserted that reproductive timing is the most evolutionarily important adaptation for desert rodents. Furthermore, desert rodent research supporting a dehydration-driven reproductive suppressive pathway mediated by arginine vasopressin (reviewed in Schwimmer and Haim, 1999; tested in Tahri-Joutei and Pointis, 1988a; 1988b; Shanas and Haim, 2004; Wube et al., 2008; Bukovetzky et al., 2012a; Bukovetzky et al., 2012b) is somewhat analogous to our study linking decreased Insl3 expression in testes with dehydration, in that both findings represent non-traditional hormonal modulation of reproduction. We propose that future studies thoroughly explore physiological consequences for non-traditional hormonal pathways in response to dehydration in desert rodents, as well as well-established reproductive modulatory hormones in the hypothalamic-pituitary-gonadal axis.

Emerging from this work is a hypothesis related to the reproductive response to water stress in the cactus mouse, and perhaps other desert rodents. Specifically, we hypothesize that acute dehydration may be related to reproductive mitigation; however, we hypothesize that chronic dehydration is not. Indeed, it is virtually oxymoronic to suggest that chronic dehydration, which is the baseline condition in desert animals, has negative consequences for reproductive success. Moreover, desert rodents dynamically respond to water availability to initiate and cease reproductive function. Generating an integrative, systems-level understanding of the reproductive responses to both acute and chronic dehydration across desert-adapted rodents is required for testing our hypothesis. While understanding the renal response to dehydration is critical for making predictions about survival, understanding the reproductive correlates is perhaps even more relevant to evolutionary fitness. This study, to the best of our knowledge, is the first to describe the reproductive correlates of water limitation in the cactus mouse, and the first to use a differential gene expression approach to evaluate reproductive tissue responses to drought. Furthermore, this study contributes to a research aim to determine whether novel physiological reproductive adaptations are present in male cactus mouse (Kordonowy and MacManes, 2016). Developing a comprehensive understanding of reproductive responses to drought, and also the mechanisms underlying potential physiological adaptations, is necessary if we are to understand how increasing environmental variability due to climate change may modify the distribution of extant organisms.

CONCLUSIONS

The genetic mechanisms responsible for physiological adaptations for survival and reproduction in deserts remain enigmatic. Desert rodent research has focused primarily on physiological adaptations related to survival, specifically on renal adaptations to combat extreme water limitation. In contrast, while previous studies have investigated reproductive effects of water limitation in desert rodents, the underlying mechanisms for physiological adaptations for reproduction during acute and chronic dehydration are unknown. Furthermore, ours is the first study to evaluate reproductive transcriptomic responses to water limitation in a desert rodent, the cactus mouse. To this end, we characterized the reproductive correlates of acute dehydration in this desert-specialized rodent using a highly replicated RNAseq experiment. In contrast to expectations, we describe a potential signal of reproductive modulation in dehydrated male cactus mouse testes. Specifically, dehydrated mice demonstrated significantly lower expression of Insl3, which is a canonical regulator of fertility (and testes descent). Lower expression was also found in Slc45a3 and Slc38a5, lending further credence to the important role of solute carrier proteins for osmoregulation in the cactus mouse. While the low number of differentially expressed genes between acutely dehydrated and control mice might otherwise have suggested that this species is relatively unaffected by acute water limitation, the diminished expression of Insl3 in dehydrated mice leads us to propose that acute dehydration may compromise reproductive function via decreased fertility. Indeed, we hypothesize that non-traditional reproductive hormone pathways, such as those involving Insl3 or Avp (which has elicited suppressive reproductive responses in other desert rodent research), warrant further investigation in studies evaluating the reproductive effects of acute and chronic dehydration. Although future research must experimentally evaluate the potential functional relationship between Insl3 expression pattern and reproductive function and fertility, our finding that acute dehydration alters Insl3 expression may be concerning, particularly with respect to global climate change.

Climate change-driven increases in the variability of weather patterns may result in a greater frequency of acute water stress, which could result in reduced reproductive function for the cactus mouse. In addition, because global climate change is predicted to shift habitats toward extremes in temperature, salinity, and aridity, and to alter species ranges, an enhanced understanding of the reproductive consequences of these changes, and of the potential for organisms to rapidly adapt, may enable us to effectively conserve innumerable species facing dramatic habitat changes.

Supplemental DropBox Files (will be submitted to Dryad upon acceptance):

Optimized final un-annotated transcriptome (good.BINPACKER.cdhit.fasta)

Annotated transcriptome (good.BINPACKER.cdhit.fasta.dammit.fasta)

Dammit gff3 file of annotation (good.BINPACKER.cdhit.fasta.dammit.gff3)

Salmon folder including salmon quant outputs for 22 individuals (salmon)

Salmon merged quant file (NEWmergedcounts.txt)

Gene ID by Transcript ID matrix (NEWESTfinalMUS.txt)

Transcripts without matches from edgeR DTE analysis (DTEnomatchBLASTnSequences.md)

List of abbreviations

Differential Gene Expression (DGE)

Differential Transcript Expression (DTE)

Dehydrated Treatment Group (DRY)

Control Treatment Group (WET)

All gene identities are defined in the text, but we can include them upon request here as well.

Declarations

Ethics approval and consent to participate

All cactus mice were sacrificed via isoflurane overdose and decapitation in accordance with University of New Hampshire Animal Care and Use Committee guidelines (protocol number 130902) and guidelines established by the American Society of Mammalogists (Sikes et al., 2016).

Consent for publication

Not applicable.

Availability of data and materials:

The raw reads are available at the European Nucleotide Archive under study accession number PRJEB18655. All data files, including the testes un-annotated transcriptome, the dammit annotated transcriptome, and the data generated by the differential gene expression analysis (described below) are available on DropBox (https://www.dropbox.com/sh/ffr9xrmjxj9md1m/AACpxjQNn-Jlf25qNdslfRSCa?dl=0). These files will be posted to Dryad upon manuscript acceptance. All code for these analyses is posted on GitHub (https://github.com/macmanes-lab/testesDGE).

Competing Interests

The authors declare that they have no competing interests.

Funding

This work was supported by a National Science Foundation award to Dr. Matthew MacManes (NSF IOS 1455960).

Authors’ Contributions

LK and MM both contributed to the data collection, data generation, bioinformatics, analyses, interpretation, and writing of this manuscript. MM is the Ph.D. supervisor for LK.

Acknowledgments

We would like to acknowledge the MacManes laboratory, including graduate student Andrew Lang, and the undergraduate students in the laboratory. We also acknowledge Dr. Paul Tsang for advice on the reproductive endocrinology described in the discussion.

Table 1: Transcriptome assembly (BinPacker CD-hit-est Transrate Corrected) performance metrics for: contig number, TransRate score (Score), BUSCO indices: % single copy orthologs (% SCO), % duplicated copy orthologs (% DCO), % fragmented (% frag), and % missing (% miss), as well as Salmon mapping rates (% mapping) for the optimized testes assembly. Dammit transcriptome assembly annotation statistics, including searches in the program TransDecoder for open reading frames (ORFs) and searches for homologous sequences in five databases: Rfam, Pfam-A, Uniref90, OrthoDB, and BUSCO. Percentages were calculated from the count number of each parameter divided by the total number of contigs in the transcriptome (155,134). The only exception to this calculation is for complete ORFs, which were calculated as a percentage of the total ORFs (75,482). The BUSCO results for the annotated assembly are not shown here as they are identical to those for the un-annotated assembly.

Transcriptome Assembly Statistics

Contig #  Score  % SCO  % DCO  % frag  % miss  % mapping
155,134   0.335  77     27     5.9     16      92.14

Dammit Annotation Statistics

Search Type   Parameter                Count   Percentage
TransDecoder  Total ORFs               75,482  48.7%
TransDecoder  Complete ORFs            43,028  57.0%
Rfam          ncRNAs                   937     0.6%
Pfam-A        Protein Domains          25,675  16.6%
Uniref90      Proteins                 62,865  40.5%
OrthoDB       Orthologs                51,806  33.4%
Dammit        Total Annotated Contigs  77,915  50.2%
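The percentage rule stated in the Table 1 caption can be checked with a few lines of arithmetic. The sketch below is illustrative only; the counts are transcribed from Table 1, and the names `rows` and `percentage` are introduced here for the example.

```python
TOTAL_CONTIGS = 155_134  # contigs in the assembly (Table 1)
TOTAL_ORFS = 75_482      # TransDecoder total ORFs (Table 1)

# (parameter, count, denominator). Complete ORFs use total ORFs as the
# denominator; every other parameter uses the total number of contigs.
rows = [
    ("Total ORFs", 75_482, TOTAL_CONTIGS),
    ("Complete ORFs", 43_028, TOTAL_ORFS),
    ("ncRNAs", 937, TOTAL_CONTIGS),
    ("Protein Domains", 25_675, TOTAL_CONTIGS),
    ("Proteins", 62_865, TOTAL_CONTIGS),
    ("Orthologs", 51_806, TOTAL_CONTIGS),
    ("Total Annotated Contigs", 77_915, TOTAL_CONTIGS),
]


def percentage(count, denominator):
    """Percentage rounded to one decimal place, as reported in Table 1."""
    return round(100.0 * count / denominator, 1)


percentages = {name: percentage(count, denom) for name, count, denom in rows}
for name, value in percentages.items():
    print(f"{name}: {value}%")
```

Each computed value agrees with the corresponding entry in the Percentage column of the table.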

Table 2: EdgeR determined significantly differentially expressed genes by treatment group in P. eremicus testes. Of the 15 DGE, seven were significantly more highly expressed in WET mice (High in WET) and eight were more highly expressed in DRY mice (High in DRY).

Ensembl ID             log2FC  logCPM  FDR       Gene ID   HIGH
ENSMUSG00000079019.2   -4.354   1.650  5.82E-09  Insl3     WET
ENSMUSG00000054200.6   -3.734   0.619  1.82E-06  Ffar4     WET
ENSMUSG00000026435.15  -2.448   2.447  1.13E-03  Slc45a3   WET
ENSMUSG00000025020.11  -2.231   1.770  1.13E-03  Slit1     WET
ENSMUSG00000031170.14  -2.421   2.578  1.13E-03  Slc38a5   WET
ENSMUSG00000030830.18  -2.180   1.666  3.37E-02  Itgal     WET
ENSMUSG00000032554.15  -2.066   3.287  4.85E-02  Trf       WET
ENSMUSG00000001768.15   3.086   1.006  1.46E-07  Rin2      DRY
ENSMUSG00000025479.9    2.971   3.001  7.97E-05  Cyp2e1    DRY
ENSMUSG00000020427.11   2.681   3.887  1.13E-03  Igfbp3    DRY
ENSMUSG00000019997.11   2.314   3.235  1.13E-03  Ctgf      DRY
ENSMUSG00000040170.13   1.951   0.753  1.72E-03  Fmo2      DRY
ENSMUSG00000023915.4    1.534   1.290  2.02E-02  Tnfrsf21  DRY
ENSMUSG00000052974.8    2.077   0.647  2.26E-02  Cyp2f2    DRY
ENSMUSG00000027901.12   2.492  -0.620  4.78E-02  Dennd2d   DRY

Table 3: EdgeR determined significantly differentially expressed transcripts by treatment group in P. eremicus testes. Of the 66 total DTE, 45 were significantly more highly expressed in WET mice (High in WET) and 21 were more highly expressed in DRY mice (High in DRY). BLASTn matches to Ensembl IDs and corresponding Gene IDs.

HIGH: WET

Transcript ID       log2FC  logCPM  FDR       Ensembl ID             Gene
BINPACKER.15365.1   -3.703   0.047  5.31E-11  ENSMUSG00000054200.6   Ffar4
BINPACKER.2960.1    -4.268   1.147  2.06E-09  ENSMUSG00000079019.2   Insl3
BINPACKER.17981.2   -2.975   0.436  6.29E-08  ENSMUSG00000026435.15  Slc45a3
BINPACKER.9961.2    -2.426   1.998  7.50E-07  ENSMUSG00000031170.14  Slc38a5
BINPACKER.3452.1    -2.507  -0.140  3.56E-06  no match               -
BINPACKER.724.4     -2.162   2.667  8.32E-06  ENSMUSG00000032554.15  Trf
BINPACKER.9604.1    -2.582   0.547  7.87E-05  no match               -
BINPACKER.31087.1   -2.908  -0.858  9.74E-05  no match               -
BINPACKER.24398.1   -2.440  -0.689  9.74E-05  ENSMUSG00000036596.6   Cpz
BINPACKER.9726.1    -3.474  -0.107  2.38E-04  ENSMUSG00000026435.15  Slc45a3
BINPACKER.9218.3    -1.578   1.525  2.76E-04  ENSMUSG00000021253.6   Tgfb3
BINPACKER.18534.1   -2.332   1.346  4.85E-04  ENSMUSG00000025020.11  Slit1
BINPACKER.17022.3   -2.899  -0.561  1.00E-03  no match               -
BINPACKER.13806.1   -2.442  -0.381  1.13E-03  ENSMUSG00000025172.2   Ankrd2
BINPACKER.7740.1    -2.790   1.095  1.13E-03  ENSMUSG00000057074.6   Ces1g
BINPACKER.10034.2   -4.420   0.387  1.23E-03  ENSMUSG00000026516.8   Nvl
BINPACKER.11560.2   -1.465   2.050  1.66E-03  ENSMUSG00000021913.7   Ogdhl
BINPACKER.13701.1   -1.312   1.804  2.28E-03  ENSMUSG00000025648.17  Pfkfb4
BINPACKER.3510.3    -2.163   0.906  2.95E-03  ENSMUSG00000027822.16  Slc33a1
BINPACKER.15806.1   -1.700   1.062  3.39E-03  ENSMUSG00000015702.13  Anxa9
BINPACKER.17992.1   -2.542   0.653  3.39E-03  ENSMUSG00000030830.18  Itgal
BINPACKER.9726.2    -2.119   0.560  3.48E-03  ENSMUSG00000026435.15  Slc45a3
BINPACKER.6383.3    -2.093   1.270  4.16E-03  ENSMUSG00000002109.14  Ddb2
BINPACKER.20716.2   -4.204  -0.566  5.75E-03  ENSMUSG00000013846.9   St3gal1
BINPACKER.20114.1   -1.661   0.501  5.97E-03  ENSMUSG00000030972.6   Acsm5
BINPACKER.18622.1   -1.645   1.704  6.36E-03  no match               -
BINPACKER.24914.1   -2.211  -0.159  9.83E-03  ENSMUSG00000003555.7   Cyp17a1
BINPACKER.31815.1   -1.905  -0.770  9.83E-03  no match               -
BINPACKER.6740.3    -3.090  -0.434  1.04E-02  no match               -
BINPACKER.20530.1   -1.626   0.545  1.12E-02  ENSMUSG00000038463.8   Olfml2b
BINPACKER.20656.1   -1.910  -0.531  1.22E-02  ENSMUSG00000029373.7   Pf4
BINPACKER.4855.1    -1.340   4.025  1.23E-02  ENSMUSG00000059991.7   Nptx2
BINPACKER.1846.1    -3.280  -0.792  1.23E-02  no match               -
BINPACKER.6494.2    -3.363   0.029  1.26E-02  ENSMUSG00000052861.13  Dnah6
BINPACKER.1818.1    -1.713   3.289  2.03E-02  ENSMUSG00000024125.1   Sbpl
BINPACKER.10743.2   -1.915  -0.525  2.06E-02  ENSMUSG00000041607.16  Mbp
BINPACKER.13054.2   -1.147   2.697  2.06E-02  ENSMUSG00000022994.8   Adcy6
BINPACKER.6807.1    -1.330   2.106  2.13E-02  ENSMUSG00000046687.5   Gm5424
BINPACKER.14160.1   -2.051   0.603  2.86E-02  ENSMUSG00000041556.8   Fbxo2
BINPACKER.16191.1   -1.431   0.926  3.42E-02  ENSMUSG00000028654.13  Mycl
BINPACKER.10141.3   -3.283  -1.191  3.68E-02  ENSMUSG00000024132.5   Eci1
BINPACKER.23790.1   -1.756  -0.275  4.51E-02  ENSMUSG00000001119.7   Col6a1
BINPACKER.22521.1   -1.841  -0.056  4.52E-02  ENSMUSG00000054083.8   Capn12
BINPACKER.1061.6    -1.807   1.943  4.93E-02  no match               -
BINPACKER.17734.1   -1.660   2.109  4.94E-02  ENSMUSG00000049608.8   Gpr55

HIGH: DRY

Transcript ID       log2FC  logCPM  FDR       Ensembl ID             Gene
BINPACKER.21794.1    2.434   3.117  4.41E-08  ENSMUSG00000020427.11  Igfbp3
BINPACKER.28731.1    2.484   1.634  4.41E-08  no match               -
BINPACKER.5662.4     2.061   2.419  1.32E-07  ENSMUSG00000019997.11  Ctgf
BINPACKER.87639.1    2.682   0.345  1.96E-07  ENSMUSG00000001768.15  Rin2
BINPACKER.35470.1    2.367   1.786  1.89E-04  no match               -
BINPACKER.52106.1    2.096  -0.542  6.83E-04  no match               -
BINPACKER.3957.3     6.309   1.579  1.02E-03  ENSMUSG00000019988.6   Nedd1
BINPACKER.116235.1   2.212   0.301  3.94E-03  no match               -
BINPACKER.4449.4     3.428  -0.538  6.74E-03  ENSMUSG00000005150.16  Wdr83
BINPACKER.28.2       4.183   2.295  1.05E-02  ENSMUSG00000075706.10  Gpx4
BINPACKER.56553.1    1.472   0.172  1.46E-02  no match               -
BINPACKER.93518.1    1.711  -0.793  1.57E-02  no match               -
BINPACKER.11512.1    1.187   3.654  1.70E-02  ENSMUSG00000031591.14  Asah1
BINPACKER.66588.1    1.851  -0.347  1.71E-02  no match               -
BINPACKER.42718.1    1.542   0.507  2.06E-02  ENSMUSG00000030790.15  Adm
BINPACKER.49203.1    1.639  -0.035  2.44E-02  no match               -
BINPACKER.147548.1   1.744  -0.007  2.99E-02  ENSMUSG00000042757.15  Tmem108
BINPACKER.23756.2    1.265   3.468  3.01E-02  ENSMUSG00000022061.8   Nkx3-1
BINPACKER.12709.1    3.906   2.611  3.01E-02  ENSMUSG00000028639.14  Ybx1
BINPACKER.5280.2     3.874   0.257  3.76E-02  ENSMUSG00000074582.10  Arfgef2
BINPACKER.58702.1    1.780  -0.500  4.93E-02  no match               -

Table 4: DESeq2 determined significantly differentially expressed genes by treatment group in P. eremicus testes. Of the 20 differentially expressed genes with a log2 fold change < -1 or > 1, 16 were significantly more highly expressed in WET mice (High in WET) and four were more highly expressed in DRY mice (High in DRY).

Ensembl ID             baseMean    log2FC      p-adjusted  Gene ID  HIGH
ENSMUSG00000054200.6   8.77721485  -2.2659204  1.24E-27    Ffar4    WET
ENSMUSG00000026435.15  38.7630267  -2.2184407  1.16E-42    Slc45a3  WET
ENSMUSG00000079019.2   24.7158409  -1.6454793  4.55E-13    Insl3    WET
ENSMUSG00000031170.14  42.2322119  -1.6434261  6.64E-15    Slc38a5  WET
ENSMUSG00000038463.8   16.2605998  -1.4619721  3.55E-12    Olfml2b  WET
ENSMUSG00000030830.18  22.0478661  -1.4358002  3.41E-10    Itgal    WET
ENSMUSG00000032554.15  67.5197473  -1.3762549  7.26E-10    Trf      WET
ENSMUSG00000021253.6   31.2493344  -1.3551661  7.02E-14    Tgfb3    WET
ENSMUSG00000030972.6   13.8934534  -1.1709964  2.37E-07    Acsm5    WET
ENSMUSG00000059991.7   173.025492  -1.1528314  5.12E-11    Nptx2    WET
ENSMUSG00000046687.5   44.9527785  -1.0989949  8.31E-09    Gm5424   WET
ENSMUSG00000024125.1   101.5876    -1.0962074  9.77E-06    Sbpl     WET
ENSMUSG00000021913.7   46.5401886  -1.0876018  8.70E-07    Ogdhl    WET
ENSMUSG00000015702.13  27.7002506  -1.0603879  1.95E-05    Anxa9    WET
ENSMUSG00000036596.6   6.6698922   -1.0243046  9.04E-05    Cpz      WET
ENSMUSG00000025172.2   13.2622565  -1.0138171  0.00013318  Ankrd2   WET
ENSMUSG00000042757.15  14.5676529  1.00643936  0.00019556  Tmem108  DRY
ENSMUSG00000019997.11  64.49614    1.03331405  7.67E-05    Ctgf     DRY
ENSMUSG00000020427.11  92.3763518  1.56656207  4.55E-13    Igfbp3   DRY
ENSMUSG00000001768.15  12.3794312  1.72433255  8.16E-16    Rin2     DRY
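The fold-change criterion described in the Table 4 caption can be illustrated with a minimal Python sketch. It is not the analysis code used for this study (which was run in DESeq2); the function name `classify` and the five-row subset are chosen here for illustration only, and the sign convention (negative log2 fold change means higher expression in WET) follows the table.

```python
# Illustrative subset of (gene, log2FC) pairs copied from Table 4.
# Negative values: higher expression in WET; positive values: higher in DRY.
rows = [
    ("Ffar4", -2.2659204),
    ("Insl3", -1.6454793),
    ("Ankrd2", -1.0138171),
    ("Tmem108", 1.00643936),
    ("Rin2", 1.72433255),
]


def classify(log2fc, threshold=1.0):
    """Return 'WET', 'DRY', or None for a log2 fold change.

    Hypothetical helper: a gene passes only if log2FC < -threshold
    or log2FC > threshold, mirroring the caption's criterion.
    """
    if log2fc < -threshold:
        return "WET"
    if log2fc > threshold:
        return "DRY"
    return None


groups = {gene: classify(fc) for gene, fc in rows}
```

A gene with a log2 fold change between -1 and 1 returns `None` and would be excluded from the table even if its adjusted p-value were significant.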

Table 5: Functional information and chromosome (CHR) locations (Mus musculus) for the nine genes differentially expressed across all three analyses in P. eremicus testes by treatment group.

Gene Name                                     Gene ID  Gene Function                                   CHR  HIGH
Insulin-like 3                                Insl3    testicular function and testicular development  8    WET
Free-fatty acid receptor 4                    Ffar4    metabolism and inflammation                     19   WET
Solute carrier family 45 member 3             Slc45a3  sugar transport                                 1    WET
Solute carrier family 38 member 5             Slc38a5  sodium-dependent amino acid transport           X    WET
Integrin alpha L                              Itgal    lymphocyte-mediated immune responses            7    WET
Transferrin                                   Trf      iron transport and delivery to erythrocytes     9    WET
Ras and Rab Interactor 2                      Rin2     endocytosis and membrane trafficking            2    DRY
Insulin-like growth factor binding protein 3  Igfbp3   modulates effects of insulin growth factors     11   DRY
Connective tissue growth factor               Ctgf     fibrosis and extracellular matrix formation     10   DRY

Figure 1: Correlation of log2 fold change results for all Ensembl ID gene matches from DESeq2 and edgeR DGE analyses (Adj-R^2 = 0.6596; F(1,14214) = 2.754 x 10^4; p < 2.2 x 10^-16).
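The statistics in the Figure 1 caption are related by standard identities, and a short Python sketch can check that they are mutually consistent. It assumes an ordinary least-squares fit with an intercept and a single predictor, which the reported degrees of freedom (1, 14214) suggest but the caption does not state. The function names are hypothetical.

```python
def r2_from_adjusted(adj_r2, n, p=1):
    """Recover R^2 from adjusted R^2 for n observations and p predictors."""
    return 1.0 - (1.0 - adj_r2) * (n - p - 1) / (n - 1)


def f_statistic(r2, n, p=1):
    """F statistic on (p, n - p - 1) degrees of freedom for a linear model."""
    return (r2 / p) / ((1.0 - r2) / (n - p - 1))


# Values reported in the Figure 1 caption.
adj_r2 = 0.6596
residual_df = 14214

# Assumption: one slope and one intercept are estimated, so n = df + 2.
n_genes = residual_df + 2

r2 = r2_from_adjusted(adj_r2, n_genes)
f_value = f_statistic(r2, n_genes)
```

Under these assumptions the regression includes 14,216 gene matches, and `f_value` comes out on the order of 2.75 x 10^4, in line with the reported F statistic.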

Figure 2: Box plots of edgeR analyzed differences in gene expression by treatment for the nine genes significantly differentially expressed in all three analyses. Counts per million (cpms) for both treatments (WET and DRY) are indicated.
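For readers unfamiliar with the unit, counts per million rescales a raw read count by the sample's library size. The Python sketch below shows the plain definition only. It omits the normalization factors and prior counts that edgeR can apply, and the count and library size are hypothetical values, not data from this study.

```python
import math


def counts_per_million(count, library_size):
    """Plain CPM: reads assigned to a gene per million reads in the library."""
    return count / library_size * 1_000_000


# Hypothetical example: 25 reads assigned to a gene in a library of
# 12.5 million mapped reads.
cpm = counts_per_million(25, 12_500_000)

# On the log2 scale, a negative value corresponds to fewer than
# one read per million.
log_cpm = math.log2(cpm)
```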

Figure 3: STRING diagrams of protein-protein interactions for genes significantly more highly expressed in the WET treatment group. These six genes are (a) Slc38a5, (b) Slc45a3, (c) Insl3, (d) Ffar4 (also known as O3far1), (e) Itgal, and (f) Trf. Different colored circles indicate different proteins interacting with the target proteins; small circles are proteins with unknown 3D structure, while larger circles are proteins with some degree of known or predicted 3D structure. Different colors of connecting lines represent different types of interactions between proteins. For fully interactive diagrams of the genes, view the provided links to string-db in the GitHub repository (StringDBlinks.md).

Figure 4: STRING diagrams of protein-protein interactions for genes significantly more highly expressed in the DRY treatment group. These three genes are (a) Rin2, (b) Igfbp3, and (c) Ctgf. Different colored circles indicate different proteins interacting with the target proteins; small circles are proteins with unknown 3D structure, while larger circles are proteins with some degree of known or predicted 3D structure. Different colors of connecting lines represent different types of interactions between proteins. For fully interactive diagrams of the genes, view the provided links to string-db in the GitHub repository (StringDBlinks.md).

Figure 5: Box plots of edgeR analyzed differences in gene expression by treatment for the nine a priori tested reproductive hormone and hormone receptor genes. Counts per million (cpms) for both treatments (WET and DRY) are indicated.

Supplemental Table 1: Testes read data statistics, including sample identification (Mouse ID), number of reads (# Reads), percent reads mapped to transcriptome (% Mapping), and treatment group (TRT). Mouse ID 335T* is the dataset which was used to assemble the testes transcriptome; therefore, these reads were not used for the differential expression analysis.

Mouse ID  # Reads   % Mapping  TRT
335T*     45759114  85.46      Wet
3333T     15135923  82.56      Wet
2322T     12584407  82.37      Dry
382T      14305186  83.87      Dry
381T      14178847  83.23      Wet
376T      14588175  82.56      Dry
366T      13641731  82.95      Wet
349T      17289781  85.93      Wet
209T      11724617  84.02      Dry
265T      11536510  84.17      Dry
383T      13250034  81.46      Dry
384T      12152820  82.75      Dry
102T      11131941  84.84      Wet
400T      13259393  83.98      Wet
1357T     20603232  82.32      Wet
1358T     12240814  86.58      Wet
1359T     11144962  85.54      Wet
13T       11075885  83.55      Dry
343T      9423867   83.58      Dry
344T      17146134  85.36      Wet
355T      13948415  85.21      Wet
888T      18890387  86.52      Dry
999T      15213425  87.02      Dry

Supplemental Table 2: Significantly differentially expressed genes identified in the three analyses (DGE in edgeR, DTE in edgeR, and DGE in DESeq2) by treatment group in P. eremicus testes. Of the 34 different genes which were more highly expressed in WET mice, six were significant across all three analyses (Gene IDs are italicized). Of the 17 genes which were more highly expressed in DRY mice, three were significant across all three analyses (Gene IDs are italicized).

HIGH: WET

Gene ID   DGE edgeR  DTE edgeR  DGE DESeq2
Insl3     x          x          x
Ffar4     x          x          x
Slc45a3   x          x          x
Slc38a5   x          x          x
Itgal     x          x          x
Trf       x          x          x
Slit1     x          x          -
Cpz       -          x          x
Tgfb3     -          x          x
Ces1g     -          x          -
Ankrd2    -          x          x
Nvl       -          x          -
Ogdhl     -          x          x
Pfkfb4    -          x          -
Slc33a1   -          x          -
Anxa9     -          x          x
Ddb2      -          x          -
St3gal1   -          x          -
Acsm5     -          x          x
Cyp17a1   -          x          -
Olfml2b   -          x          x
Pf4       -          x          -
Nptx2     -          x          x
Dnah6     -          x          -
Sbpl      -          x          x
Adcy6     -          x          -
Gm5424    -          x          x
Mbp       -          x          -
Fbxo2     -          x          -
Mycl      -          x          -
Eci1      -          x          -
Capn12    -          x          -
Col6a1    -          x          -
Gpr55     -          x          -

HIGH: DRY

Gene ID   DGE edgeR  DTE edgeR  DGE DESeq2
Rin2      x          x          x
Igfbp3    x          x          x
Ctgf      x          x          x
Cyp2e1    x          -          -
Fmo2      x          -          -
Tnfrsf21  x          -          -
Cyp2f2    x          -          -
Dennd2d   x          -          -
Nedd1     -          x          -
Wdr83     -          x          -
Gpx4      -          x          -
Asah1     -          x          -
Adm       -          x          -
Tmem108   -          x          x
Nkx3-1    -          x          -
Ybx1      -          x          -
Arfgef2   -          x          -
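The overlap summarized in Supplemental Table 2 amounts to a set intersection across the three analyses. The Python sketch below illustrates that logic and is not the code used in the study. The edgeR DGE and DESeq2 sets are transcribed from Supplemental Table 2 and Table 4, whereas `dte_edger` is deliberately only a subset of the edgeR DTE gene list, and all variable names are hypothetical.

```python
# Genes significant in the edgeR gene-level (DGE) analysis.
dge_edger = {
    "Insl3", "Ffar4", "Slc45a3", "Slc38a5", "Itgal", "Trf", "Slit1",
    "Rin2", "Igfbp3", "Ctgf", "Cyp2e1", "Fmo2", "Tnfrsf21", "Cyp2f2",
    "Dennd2d",
}

# Illustrative SUBSET of genes significant in the edgeR transcript-level
# (DTE) analysis; the full list is longer.
dte_edger = {
    "Insl3", "Ffar4", "Slc45a3", "Slc38a5", "Itgal", "Trf", "Slit1",
    "Cpz", "Tgfb3", "Rin2", "Igfbp3", "Ctgf", "Tmem108", "Nedd1",
}

# Genes significant in the DESeq2 gene-level analysis (Table 4).
dge_deseq2 = {
    "Ffar4", "Slc45a3", "Insl3", "Slc38a5", "Olfml2b", "Itgal", "Trf",
    "Tgfb3", "Acsm5", "Nptx2", "Gm5424", "Sbpl", "Ogdhl", "Anxa9", "Cpz",
    "Ankrd2", "Tmem108", "Ctgf", "Igfbp3", "Rin2",
}

# Genes supported by all three analyses.
shared = dge_edger & dte_edger & dge_deseq2

# Genes supported by the two edgeR analyses but not by DESeq2.
edger_only = (dge_edger & dte_edger) - dge_deseq2
```

With these inputs, `shared` holds the nine genes described in Table 5, and `edger_only` contains Slit1.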

Supplemental Figure 1: Plot of edgeR determined differentially expressed genes. The 15 significant genes are in red, with positive values indicating increased expression in the DRY group, and negative values depicting increased expression in the WET group.

Supplemental Figure 2: Plot of edgeR determined differentially expressed transcripts. The 66 significant transcripts are in red, with positive values indicating increased expression in the DRY group, and negative values depicting increased expression in the WET group.

Supplemental Figure 3: Plot of DESeq2 determined differentially expressed transcripts. The 215 significant transcripts are in red, with positive values indicating increased expression in the DRY group, and negative values depicting increased expression in the WET group.

REFERENCES:

Ahmed A, Tiwari RJ, Mishra GK, Jena B, Dar MA, Bhat AA. 2015. Effect of environmental heat stress on reproductive performance of dairy cows- a review. International Journal of

Livestock Research 5(4): 10-18. DOI: 10.5455/ijlr.20150421122704.

Akerlof E, Jornvall H, Slotte H, Pousette A. 1991. Identification of apolipoprotein A1 and immunoglobulin as components of a serum complex that mediates activation of human sperm motility. Biochemistry 30: 8986-8990. DOI:10.1021/bi00101a011.

Albelda SM, Smith CW, Ward PA. 1994. Adhesion molecules and inflammatory injury. The

FASEB Journal 8(8): 504-512. DOI: N/A

Allison MB, Myers MG Jr. 2014. 20 years of leptin: connecting leptin signaling to biological function. Journal of Endocrinology 223(1): T25-T35. DOI: 10.1530/JOE-14-0404.

Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990. Basic local alignment search tool. Journal of Molecular Biology 215(3): 403-10. DOI: 10.1016/S0022-2836(05)80360-2.

Anuka E, Gal M, Stocco DM, Orly J. 2013. Expression and roles of steroidogenic acute regulatory (StAR) protein in 'non-classical', extra-adrenal and extra-gonadal cells and tissues.

Molecular and Cellular Endocrinology 371(1-2):47-61. DOI: 10.1016/j.mce.2013.02.003.

Asres A, Amha N. 2014. Physiological adaptation of animals to the change of environment: a review. Journal of Biology, Agriculture and Healthcare 4(25): 146-151. http://www.iiste.org/Journals/index.php/JBAH/article/view/17387

Bales KL, Hostetler CM. 2011. Hormones and Reproductive Cycles in rodents Pp. 215-240 in

Norris DO, Lopez KH, eds. Hormones and Reproduction of Vertebrates: Volume 5 Mammals.

London, UK: Academic Press (Elsevier).

Bateman AJ. 1948. Intra-sexual selection in Drosophila. Heredity 2: 349-368.

109

Bathgate RAD, Zhang S, Hughes RA, Rosengren KJ, Wade JD. 2012. The Structural

Determinants of Insulin-Like Peptide 3 Activity. Frontiers in Endocrinology 3:11 (pp.1-10).

DOI: 10.3389/fendo.2012.00011.

Bathgate RAD, Halls ML, van der Westhuizen ET, Callander GE, Kocan M, Summers RJ.

2013. Relaxin family peptides and their receptors. Physiological Reviews 93(1): 405-480.

DOI: 10.1152/physrev.00001.2012

Bedford NL, Hoekstra HE. 2015. The Natural History of Model Organisms: Peromyscus mice as a model for studying natural variation. eLife 4: e06813. DOI: 10.7554/eLife.06813.

Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society: Series B (Methodological)

57(1): 289-30. Stable URL: http://www.jstor.org/stable/2346101

Bolger DT, Patten MA, Bostock DC. 2005. Avian reproductive failure in response to an extreme climatic event. Oecologia 142(3):398-406. DOI: 10.1007/s0042-004-1734-9.

Boone M, Deen PMT. 2008. Physiology and pathophysiology of the vasopressin-regulated renal water reabsorption. Pflugers Archiv 456(6): 1005-1024. DOI: 10.1007/s00424-008-0498-1

Bose TO, Colpitts SL, Pham Q-M, Puddington L, Lefrancois L. 2014. CD11a is essential for normal development of hematopoietic intermediates. Journal of Immunology 193: 2863-2872.

DOI: 10.4049/jimmunol.1301820.

Breed WG. 1975. Environmental factors and reproduction in the female hopping mouse,

Notomys alexis. The Journal of the Society for Reproduction and Fertility 45: 273-281.

Doi: 10.1530/jrf.0.0450273.

Breed WG, Leigh CM. 2011. Reproductive biology of an old endemic murid rodent of

Australia, the Spinifex hopping mouse, Notomys alexis: adaptations for life in the arid zone.

110

Integrative Zoology 6: 321-333. Doi: 10.1111/j.1749-4877.2011.00264x

Bukovetzky E, Fares F, Schwimmer H, Haim A. 2012a. Reproductive and metabolic responses of desert adapted common spiny male mice (Acomys cahirinus) to vasopressin treatment. Comparative Biochemistry and Physiology, Part A 162(4): 349-356. http://doi.org/10.1016/j.cbpa.2012.04.007

Bukovetzky E, Schwimmer H, Fares F, Haim A. 2012b. Photoperiodicity and increasing salinity as environmental cues for reproduction in desert adapted rodents. Hormones and

Behavior 61(1): 84-90. http://doi.org/10.1016/j.yhbeh.2011.10.006

Caire W. 1999. Cactus mouse Pp. 567-568 in Wilson D, Ruff S, eds. The Smithsonian Book of

North American Mammals. Washington, D.C.: Smithsonian Institution Press.

César-Razquin A, Snijder B, Frappier-Brinton T, Isserlin R, Gyimesi G, Bai X, Reithmeier

RA, Hepworth D, Hediger MA, Edwards AM, Superti-Furga G. 2015. A call for systematic research on solute carriers. Cell 162 (3): 478-487. DOI: i:10.1016/j.cell.2015.07.022.

Chang ACM, Cha J, Koentgen F, Reddel RR. 2005. The murine stanniocalcin 1 gene is not essential for growth and development. Molecular and Cellular Biology 25(3): 10604-10610.

DOI: 10.1128/MCB.25.23.10604-10610.2005.

Christenson LK, Strauss JF III. 2001. Steroidogenic acute regulatory protein: an update on its regulation and mechanism of action. Archives of Medical Research 32(6):576-586. http://dx.doi.org/10.1016/S0188-4409(01)00338-1

Christian DP. 1979. Comparative demography of three Namib desert rodents: Responses to the provision of supplementary water. Journal of Mammalogy 60(4): 679-690. https://doi.org/10.2307/1380185

111

Collet JM, Dean RF, Worley K, Richardson DS, Pizzari T. 2014. The measure and significance of Bateman’s principles. Proceedings of the Royal Society B-Biological Sciences

281(1782): pp. 1-9. Doi: 10.1098/rspb.2013.2973

Dantzler WH. 1982. Renal Adaptations of Desert Vertebrates. BioScience 32(2):108-113.

DOI: 10.2307/1308563

Dewey GC, Elias H, Appel KR. 1966. Stereology of renal corpuscles of desert and swamp deermice. Nephron 3(6): 352-365. DOI: 10.1159/000179552.

Diaz GB, Ojeda RA, Rezende EL. 2006. Renal morphology, phylogenetic history and desert adaptation of South American hystricognath rodents. Functional Ecology 20: 609–620.

DOI: 10.1111/j.1365-2435.2006.01144.x

Doherty GJ, McMahon HT. 2009. Mechanisms of Endocytosis. Annual Review of

Biochemistry 78: 857-902. DOI: 10.1146/annurev.biochem.78.081307.110540

El-Bakry HA, Zahran WM, Bartness TJ. 1999. Control of reproductive and energetic status by environmental cues in a desert rodent, Shaw’s jird. Physiology and Behavior 66(4): 657-666. http://doi.org/10.1016/S0031-9384(98)00344-8

Evans MEK, Hearn DJ, Theiss KE, Cranston K, Holsinger KE, Donoghue MJ. 2010.

Extreme environments select for reproductive assurance: evidence from evening primroses

(Oenothera). New Phytologist 191:555-563. DOI: 10.1111/j.1469-8137.2011.03697.x.

Friedman J. 2014. 20 Years of leptin: lepttin at 20: an overview. Journal of Endocrinology

223:1 T1-T8. DOI: 10.1530/JOE-14-0405.

Froussios K, Schurch NJ, Mackinnon K, Gierlinski M, Duc C, Simpson GG, Barton GJ.

2016. How well do RNA-Seq differential gene expression tools perform in higher eukaryotes? bioRxiv: DOI: 10.1101/090753.

112

Fu L, Niu B, Zhu Z, Wu S, Li W. 2012. CD-HIT: accelerated for clustering the next generation sequencing data. Bioinformatics 28(23): 3150-3152. DOI: 10.1093/bioinformatics/bts565

Fuller A, Hetem RS, Maloney SK, Mitchell D. 2014. Adaptation to Heat and Water Shortage in

Large, Arid-Zone Mammals. Physiology 29(3): 159-167. DOI: 10.1152/physiol.00049.2013

Geiser F. in Aestivation: Molecular and Physiological Aspects (Volume 49 in the series Progress in Molecular and Subcellular Biology, Springer, Berlin, 2010, Navas A, Carvalho C, Eduardo J.)

95-111. DOI:10.1007/978-3-642-02421-4_5

Gierliński M, Cole C, Schofield P, Schurch NJ, Sherstnev A, Singh V, Wrobel N, Gharbi K,

Simpson G, OwenHughes T, Blaxter M, Barton GJ. 2015. Statistical models for RNA-seq data derived from a two-condition 48-replicate experiment. Bioinformatics 31(22): 3625-3630.

DOI: 10.1093/bioionformatics/btv425.

Gkouvatsos K, Papanikolaou G, Pantopoulos K. 2012. Regulation of iron transport and the role of transferrin. Biochemica Biophysica Acta (BBA) – General Subjects 1820(3): 188-202.

DOI:10.1016/j.bhagen.2011.10.013

Hediger MA, Romero MF, Peng J-B, Rolfs A, Takanaga H, Bruford EA. 2004. The ABCs of solute carriers: physiological, pathological and therapeutic implications of human membrane transport proteins: introduction. Pffugers Archiv: European Journal of Physiology 447(5): 465-

548. DOI:10.1007/s00424-003-1192-y.

Hediger MA, Clemencon B, Burrier RE, Bruford EA. 2013. The ABCs of membrane transporters in health and disease (SLC series): introduction. Molecular Aspects of Medicine

34(2-3): 95-107. DOI: 10.1016/j.mam.2012.12.009.

Henry O, Dubost G. 2012. Breeding periods of Gerbillus cheesmani (Rodentia, Muridae) in

Saudi Arabia. Mammalia 76(4): 383-387. Doi: 10.1515/mammalia-2012-0017.

113

Hill RW, Veghte JH. 1976. Jackrabbit ears: surface temperatures and vascular responses.

Science 194(4263):436-8. DOI:10.1126/science.982027

Hoekstra HE, Hirschmann RJ, Bundey RA, Insel PA, Crossland JP. 2006. A single amino acid mutation contributes to adaptive beach mouse color patterns. Science 313:101–104

DOI: 10.1126/science.1126121.

Hohenlohe PA, Bassham S, Etter PD, Stiffler N, Johnson EA, Cresko WA. 2010. Population genomics of parallel adaptation in threespine stickleback using sequenced RAD Tags. PLoS

Genetics 6(2):e1000862 DOI: 10.1371/journal.pgen.1000862.

Hutson JM, Li R, Southwell BR, Newgreen D, Cousinery M. 2015. Regulation of testicular descent. Pediatric Surgery International 31(4): 317-325. Doi: 10.1007/s00383-015-3673-4

Jiang H, Lei R, Ding S-W, Zhu S. 2014. Skewer: a fast and accurate adapter trimmer for next- generation sequencing paired-end reads. BMC Bioinformatics 15:182 (pp. 1-12).

DOI: 10.1186/1471-2015-15-182

Jones AG, Arguello JR, Arnold SJ. 2002. Validation of Bateman’s principles: A genetic study of sexual selection and mating patterns in the rough-skinned newt. Proceedings of the Royal

Society B-Biological Sciences 269(1509): 2533-2539. Doi: 10.1098/rspb.2002.2177.

Jones AG, Rosenqvist G, Berglund A, Avise JC. 2005. The measurement of sexual selection using Bateman’s principles: an experimental test in the sex-role reversed pipefish Syngnathus typhle. Integrative and Comparative Biology 45(5): 874-884. Doi: 10.1093/icb/45.5.874

Jones FC, Grabherr MG, Chan YF, Russell P, Mauceli E, Johnson J, Swofford R, Pirun M,

Zody MC, White S, Birney E, Searle S, Schmutz J, Grimwood J, Dickson MC, Myers RM,

Miller CT, Summers BR, Knecht AK, Brady SD, Zhang H, Pollen AA, Howes T, Amemiya

C, Broad Institute Genome Sequencing Platform & Whole Assembly Team, Lander ES, Di

114

Palma F, Lindblad-Toh K, Kingsley DM. 2012. The genomic basis of adaptive evolution in threespine sticklebacks. Nature 484:55–61. DOI: 10.1038/nature10944.

Jones JI, Clemmons DR. 2008. Insuline-like growth factors and their biding proteins: biological actions. Endocrine Reviews 16(1): DOI: 10.1210/edrv-16-1-3

Kalabukhov NI. 1960. Comparative ecology of hibernating animals. Bulletin of the Museum of

Comparative Zoology at Harvard 124: 45-74. DOI: N/A

Kawamura K, Kumagai J, Sudo S, Chun S-Y, Pisarska M, Morita H, Toppari J, Fu P,

Wade JD, Bathgate RAD, Hsueh AJW. 2004. Paracrine regulation of mammalian oocyte maturation and male germ cell survival. Proceedings of the National Academy of Sciences

101(19): 7323-7328. DOI: 10.1073.pnas.0307061101.

Kelt DA. 2011. Comparative ecology of desert small mammals: a selective review of the past 30 years. Journal of Mammalogy, 92(6):1158–1178. http://dx.doi.org/10.1644/10-MAMM-S-238.1

King, John A., (editor). Biology of Peromyscus (Rodentia). Special Publication No. 2, The

American Society of Mammologists, xiii + 594 pp, 1968. DOI: 10.2307/1378817

Knight J. 2002. Sexual Stereotypes. Nature 415: 254-256. doi:10.1038/415254a

Kordonowy LK, MacManes MD. 2016. Characterization of a male reproductive transcriptome for Peromyscus eremicus (cactus mouse). PeerJ 4:e2617(pp.1-23). DOI: 10.7717/peerj.2617.

Kordonowy L, Lombardo K, Green H, Dawson MD, LaCourse S, Bolton E, MacManes M.

2017. Physiological and biochemical changes associated with experimental dehydration in the desert adapted cactus mouse, Peromyscus eremicus. Physiological Reports: 5(e13218): pp. 1-8.

Doi: 10.14814/phy2.13218

Kovac JR, Lipshultz. 2013. The significance of insulin-like factor 3 as a marker of intratesticular testosterone. Fertility and Sterility 99(1): 66-67. Doi:

115

10.1016/j.fertnstert.2012.09.009.

Kwon T-H, Frokiaer J, Nielsen S. 2013. Regulation of aquaporin-2 in the kidney: A molecular mechanism of body-water homeostasis. Kidney Research Clinical Practice 32(3): 96-1023.

DOI: 10.1016/j.krcp.2013.07.005.

Leijonhufvud P, Akerlof E, Pousette A. 1997. Structure of sperm activating protein. Molecular

Human Reproduction 3(3): 249-253. DOI:10.1093/molehr/3.3.249.

Le Roith D. 1997. Insulin-like growth factors. The New England Journal of Medicine 336(9):

633-640. DOI: 10.1056/NEJM199702273360907

Li W, Godzik A. 2006. Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics 22(13):1658-1659. DOI: 10.1093/bioinformatics/btl158.

Liu J, Li G, Chang Z, Yu T, Liu B McMullen R, Chen P, Huang X. 2016. BinPacker: packing-based de novo transcriptome assembly from RNA-seq data. PLos Computational

Biology 12(2):e1004772 (pp.1-15). DOI:10.1371/journal.pcbi.1004772.

Lorenzo FR, Huff C, Myllymäki M, Olenchock B, Swierczek S, Tashi T, Gordeuk V,

Wuren T, Ri-Li G, McClain DA, Khan TM, Koul PA, Guchhait P, Salama ME, Xing J,

Semenza GL, Liberzon E, Wilson A, Simonson TS, Jorde LB, Kaelin Jr WG, Koivunen P,

Prchal JT. 2014. A genetic mechanism for Tibetan high-altitude adaptation. Nature Genetics

46(9):951–956 DOI: 10.1038/ng.3067.

Love MI, Huber W, and Anders S. 2014. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biology 15:550 (pp. 1-21).

DOI: 10.1186/s13059-014-0550-8

MacManes MD. 2014. On the optimal trimming of high-througput mRNA sequence data.

Frontiers in Genetics 5:13 (pp.1-7). doi.org/10.3389/fgene.2014.00013.

116

MacManes MD. 2016. Establishing evidenced-based best practice for the de novo assembly and evaluation of transcriptomes from non-model organisms. bioRxiv. DOI: 10.1101/035642

MacManes MD, Eisen MB. 2014. Characterization of the transcriptome, nucleotide sequence polymorphism, and natural selection in the desert adapted mouse Peromyscus eremicus. PeerJ 2: e642. doi.org/10.7717/peerj.642.

MacManes MD. 2017. Severe acute dehydration in a desert rodent elicits a transcriptional response that effectively prevents kidney injury. American Journal of Physiology-Renal

Physiology in press (April 5, 2017). Doi: 10.1152/ajprenal.00067.2017

MacMillen RE, Garland T Jr. in Advances in the Study of Peromyscus (Rodentia) (Texas Tech

University Press, Lubbock, 1989, Kirkland LG Jr, Layne JN) 143-168.

MacMillen RE, Hinds DS. 1983. Water regulatory efficiency in heteromyid rodents: A model and its application. Ecology 64(1): 152-164. DOI: 10.2307/1937337

Madden T. 2002 Oct 9 [Updated 2003 Aug 13]. The BLAST Sequence Analysis Tool. In:

McEntyre J, Ostell J, editors. The NCBI Handbook [Internet]. Bethesda (MD): National Center for Biotechnology Information (US); 2002-. Chapter 16. Available from: http://www.ncbi.nlm.nih.gov/books/NBK21097/

Marra NJ, Eo SH, Hale MC, Waser PM, DeWoody JA. 2012. A priori and a posteriori approaches for finding genes of evolutionary interest in non-model species: Osmoregulatory genes in the kidney transcriptome of the desert rodent Dipodomys spectabilis (banner-tailed kangaroo rat). Comparative Biochemistry and Physiology, Part D: Genomics Proteomics 7(4):

328-339. DOI: 10.1016/j.cbd.2012.07.001.

Marra NJ, Romero A, DeWoody A. 2014. Natural selection and the genetic basis of osmoregulation in heteromyid rodents as revealed by RNA-seq. Molecular Ecology 23(11):

117

2699-2711. DOI: 10.1111/mec.12764.

Martin K, Wiebe KL. 2004. Coping mechanisms of alpine and arctic breeding birds: extreme weather and limitations to reproductive resilience. Integrative and Comparative Biology

44(2):177-185. DOI: 10.1093/icb/44.2.177.

McCarthy DJ, Chen Y, Smyth GK. 2012. Differential expression analysis of multifactor RNA-

Seq experiments with respect to biological variation. Nucleic Acids Research 40(10): 4288-4297.

DOI: 10.1093/nar/gks042

Moniri NH. 2016. Free-fatty acid receptor-4 (GPR120): Cellular and molecular function and its role in metabolic disorders. Biochemical Pharmacology 110-111:1-15.

DOI:10.1016/j.bcp.2016.01.021

Moussad EE-DA, Brigstock DR. 2000. Connective Tissue Growth Factor: What’s in a Name?

Molecular Genetics and Metabolism 71: 276-292. DOI: 10.1006/mgme.2000.3059.

Munshi-South J, Richardson JL. 2016. Peromyscus transcriptomics: understanding adaptation and gene expression plasticity within and between species of deer mice. Seminars in Cell &

Developmental Biology in press (pp.1-9). DOI: 10.1016/j.semcdb.2016.08.011

Nargund VH. 2015. Effects of psychological stress on male fertility. Nature Reviews Urology

12: 373-382. DOI: 10.1038/nrurol.2015.112

Oh J, Woo JM, Choi E, Kim T, Cho BN, Park ZY, Kim YC, Kim DH, Cho C. 2005.

Molecular, biochemical, and cellular characterization of epididymal ADAMs, ADAM7 and

ADAM28. Biochemical and Biophysical Research Communications 331(4):1374-1383.

DOI:10.1016/j.bbrc.2005.04.067

Pan D. 2010. The hippo signaling pathway in development and cancer. Developmental Cell

19(4): 491-505. DOI: 10.1016/j.devcel.2010.09.011.

118

Pathirana IN, Kawte N, Bullesbach EE, Takahashi M, Hatoya S, Inaba T, Tamada H.

2012. Insulin-like peptide 3 stimulates testosterone secretion in mouse Leydig cells via cAMP pathway. Regulatory Peptides 178(1-3): 102-106. http://doi.org.libproxy.unh.edu/10.1016/j.regpep.2012.07.003

Patro R, Duggal G, Kingsford C. 2015. Salmon: Accurate, Versatile and Ultrafast

Quantification from RNA-seq Data using Lightweight-Alignment. bioRxiv 021592;

DOI: http://dx.doi.org/10.1101/021592.

Pitia AM, Uchiyama K, Sano Hiroaki, Kinukawa M, Minato Y Sasada H Kohsaka T. 2017.

Functional insulin-like factor 3 (INSL3) hormone-receptor system in the testes and spermatozoa of domestic ruminants and its potential as a predictor of sire fertility. Animal Science Journal

88(4): 678-690. Doi: 10.1111/asj.12694

R Core Team. 2016. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. Vienna, Austria. https://www.R-project.org

Robinson MD, McCarthy DJ, Smyth GK. 2010. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26(1): 139-140.

DOI: 10.1093/bioninformatics.btp616.

Sarli J, Lutermann H, Alagali AN, Mohammed OB, Bennett NC. 2015. Reproductive patterns in the Baluchistan gerbil, Gerbillus nanus (Rodentia: Muridae), from western Saudi

Arabia: The role of rainfall and temperature. Journal of Arid Environments 113: 87-94. http://dx.doi.org/10.1016/j.jaridenv.2014.09.007

Sarli J, Lutermann H, Alagali AN, Mohammed OB, Bennett NC. 2016. Seasonal reproduction in the Arabian spiny mouse, Acomys dimidiatus (Rodentia: Muridae) from Saudi

Arabia: The role of rainfall and temperature. Journal of Arid Environments 124:352-359.

119

http://dx.doi.org/10.1016/j.jaridenv.2015.09.008

Schiöth HB, Roshanbin S, Hägglund MG, Fredriksson R. 2013. Evolutionary origin of amino acid transporter families SLC32, SLC36 and SLC38 and physiological, pathological and therapeutic aspects. Molecular Aspects of Medicine 34(2-3): 571-585.

DOI:10.1016/j.mam.2012.07.012

Schmidt-Nielsen K. in Desert Animals: Physiological Problems of Heat and Water (Oxford

University Press, New York, 1964, Schmidt-Nielsen K.) 129-138.

Schmidt-Nielsen B, Schmidt-Nielsen K, Brokaw A, Schneiderman H. 1948. Water conservation in desert rodents. Journal of Cellular Physiology 32(3): 331-360.

DOI: 10.1002/jcp.1030320306.

Schmidt-Nielsen K, Schmidt-Nielsen B. 1952. Water metabolism of desert mammals 1.

Physiological Reviews 32(2): 135-166. PMID: 1492697.

Schurch NJ, Schofield P, Gierliński M, Cole C, Sherstnev A, Singh V, Wrobel N, Gharbi

K, Simpson GG, Owen-Hughes T, Blaxter M, Barton GJ. 2016. How many biological replicates are needed in an RNA-seq experiment and which differential expression tool should you use? RNA 22(6): 839-851. DOI: 10.1261/rna.053959.115

Schwimmer H, Haim A. 2009. Physiological adaptations of small mammals to desert ecosystems. Integrative Zoology 4: 357-366. Doi: 10.1111/j.1749-4877.2009.00176.x

Scott C. 2016. dammit: an open and accessible de novo transcriptome annotator. www.camillescott.org/dammit

Shanas U, Haim A. 2004. Diet salinity and vasopressin as reproductive modulators in the desert-dwelling golden spiny mouse (Acomys russatus). Physiology and Behavior 81(4): 645-

650. Doi: 10.1016/j.physbeh.2004.03.002

120

Sikes RS, Animal Care and Use Committee of the American Society of

Mammalogists. 2016. 2016 Guidelines of the American society of Mammalogists for the use of wild mammals in research and education. Journal of Mammalogy 97(3): 663-688.

DOI: 10.1093/jmammal/gyw078

Simão FA, Waterhouse RM, Ioannidis P, Kriventseva EV, Zdobnov EM. 2015. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs.

Bioinformatics 31(19): 3210-3212. DOI: 10.1093/bioinformatics/btv351.

Smith-Unna R, Boursnell C, Patro R, Hibberd JM, Kelley S. 2016. TransRate: reference free quality assessment of de-novo transcriptome assemblies. Genome Research 26: 1134-1144.

DOI: 10.1101/gr.196469.115

Snel B, Lehmann G, Bork P, Huynen MA. 2000. STRING: a web-server to retrieve and display the repeatedly occurring neighbourhood of a gene. Nucleic Acids Research: 28(18):3442-

3444. DOI: 10.1093/nar/28.18.3442.

Somero GN. 2010. The physiology of climate change: how potentials for acclimatization and genetic adaptation will determine ‘winners’ and losers’. Journal of Experimental Biology

213:912-920. DOI: 10/1242/jeb037473.

Soneson C, Love MI, Robinson MD. 2016. Differential analyses for RNA-seq: transcript-level estimates improve gene-level inferences. F1000Reserach 4:1521 (pp 1-19).

DOI: 10.12688/f1000research.7563.2

Song L. Florea L. 2015. Rcorrector: efficient and accurate error correction for Illumina RNA- seq reads. Gigascience 4:48 (pp.1-8) DOI: 10.1186/s13742-015-0089-y.

Stenmark H, Olkkonen VM. 2001. The Rab GTPase family. Genome Biology 2(5):pp.1-7.

DOI: 10.1186/gb-2001-2-5-reviews3007

121

Stevens RD, Tello JS. 2009. Micro- and macrohabitat associations in Mojave desert rodent communities. Journal of Mammalogy 90(2):388-403. DOI: 10.1644/08-MAMM-A-141.1

Storz JF, Runck AM, Moriyama H, Weber RE, Fago A. 2010. Genetic differences in hemoglobin function between highland and lowland deer mice. Journal of Experimental Biology

213(15):2565–2574 DOI: 10.1242/jeb.042598.

Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, Simonovic

M, Roth A, Santos A, Tsafou KP, Kuhn M, Bork P, Jensen LJ, von Mering C. 2015.

STRING v10: protein-protein interaction networks, integrated over the tree of life. Nucleic Acids

Research 43: D447-452. DOI: 10.1093/nar/gku1003.

Tahri-Joutei A, Pointis G. 1988a. Modulation of mouse Leydig cell steroidogenesis through a specific arginine-vasopressin receptor. Life Sciences 43(2): 177-185.

Doi: 10.1016/0024-3205(88)90295-0

Tahri-Joutei A, Pointis G. 1988b. Time-related effects of arginine vasopressin on steroidogenesis in cultured mouse Leydig cells. The Journal for the Society of Reproduction and

Fertility 82: 247-254. Doi: 10.1530/jrf.0.0820247

Trivers RL. 1972. In: Campbell B, editor. Sexual Selection and the Descent of Man. Chicago: Aldine, 136-177.

Urity VB, Issaian T, Braun EJ, Dantzler WH, Pannabecker TL. 2012. Architecture of kangaroo rat inner medulla: segmentation of descending thin limb of Henle’s loop. American Journal of Physiology-Regulatory Integrative and Comparative Physiology 302(6): R720-R726. DOI: 10.1152/ajpregu.00549.2011.

Veal R, Caire W. 1979. Peromyscus eremicus. Mammalian Species 118:1-6. Available at http://www.science.smith.edu/msi/pdf/i0076-3519-118-01-0001.pdf.


Vimtrup BJ, Schmidt-Nielsen B. 1952. The histology of the kidney of kangaroo rats. The Anatomical Record 114(4): 515-528. DOI: 10.1002/ar.1091140402.

Vitavska O, Wieczorek H. 2013. The SLC45 gene family of putative sugar transporters. Molecular Aspects of Medicine 34(2-3): 655-660. DOI: 10.1016/j.mam.2012.05.014

Vorhies CT. 1945 in Water Requirements of Desert Animals in the Southwest. (College of Agriculture, University of Arizona, Tucson, Agricultural Experiment Station.) Technical Bulletin No. 107: 487-525. http://hdl.handle.net/10150/190625

Wagner GF, Dimattia GE. 2006. The stanniocalcin family of proteins. Journal of Experimental Zoology Part A 305A(9): 769-780. DOI: 10.1002/jez.a.313

Walsberg GE. 2000. Small mammals in hot deserts: some generalizations revisited. BioScience 50(2): 109-120. DOI: 10.1641/0006-3568(2000)050[0109:SMIHDS]2.3.C

Wingfield JC. 2013. The ecology of stress: ecological processes and the ecology of stress: the impacts of abiotic environmental factors. Functional Ecology 27:37-44. DOI: 10.1111/1365-2435.12039

Wingfield JC, Kelley JP, Angelier F. 2011. What are extreme environmental conditions and how do organisms cope with them? Current Zoology 57(3):363-374. DOI: 10.1093/czoolo/57.3.363

Wingfield JC, Sapolsky RM. 2003. Reproduction and resistance to stress: when and how. Journal of Neuroendocrinology 15:711-724. DOI: 10.1046/j.1365-2826.2003.01033.x

Wube T, Fares F, Haim A. 2008. A differential response in the reproductive system and energy balance of spiny mice Acomys populations to vasopressin treatment. Comparative Biochemistry and Physiology. Part A, Molecular and Integrative Physiology 151: 499-504. DOI: 10.1016/j.cbpa.2008.06.027


Xiao X, Mruk DD, Cheng CY. 2013. Intercellular adhesion molecules (ICAMs) and spermatogenesis. Human Reproduction Update 19(2): 167-186. DOI:10.1093/humupd/dms049.

Yahr P, Kessler S. 1975. Suppression of reproduction in water-deprived Mongolian gerbils (Meriones unguiculatus). Biology of Reproduction 12(2): 249-254. DOI: 10.1095/biolreprod12.2.249


CHAPTER 3

Characterization of the Seminal Vesicle Proteome of Peromyscus eremicus (Cactus mouse) in hydrated and dehydrated conditions

Lauren Kordonowy 1*, Donna Hogan 1, Feixia Chu 1, Matthew MacManes 1+

1. University of New Hampshire Department of Molecular, Cellular and Biomedical Sciences, Durham, NH, USA

* [email protected]

+ [email protected]

Corresponding Author:

Lauren Kordonowy, [email protected]

Submission Pending for Proteomics as a Dataset Brief


ABSTRACT

The cactus mouse (Peromyscus eremicus) is a desert-specialized rodent that experiences both chronic and acute dehydration in the Southwestern United States. Our previous research has generated substantial transcriptomic data on P. eremicus kidneys, testes, epididymis, and vas deferens in individuals exposed to hydrated and dehydrated conditions; however, the study described here is the first to characterize a seminal vesicle proteome for this species. We have produced a seminal vesicle proteome from P. eremicus with free access to water and from mice that were acutely dehydrated, generating a dataset that covers both of the water-availability states experienced by this species. We have also provided gene ontologies for this proteome using PANTHER. This proteome will provide a crucial resource for future studies characterizing the genetic and proteomic responses of reproductive tissues to drought in this desert-specialized rodent. Furthermore, an enhanced understanding of survival and reproductive responses (and adaptations) to dehydration is particularly relevant to clinical work aimed at minimizing adverse human impacts, as climate change continues to increase the incidence of drought.


DATASET BRIEF

The cactus mouse (Peromyscus eremicus) is a desert-specialized rodent endemic to the Southwestern United States that experiences both acute and chronic dehydration and can live entirely without water (Veal and Caire, 1979; Caire, 1999). Previous research has characterized the physiological (kidney: Kordonowy et al., 2017) and transcriptomic (kidney: MacManes and Eisen, 2014; MacManes, 2017; testes: Kordonowy and MacManes, 2017) effects of water limitation, while other studies have characterized cactus mouse reproductive tissue transcriptomes (testes, vas deferens, and epididymis: Kordonowy and MacManes, 2016). The study described here contributes to this desert adaptation research focus by describing a seminal vesicle proteome from cactus mice with access to water and mice that were acutely dehydrated.

Reproductively mature male Peromyscus eremicus (described in Kordonowy and MacManes, 2016) were maintained in a laboratory colony at the University of New Hampshire in a desert chamber mimicking temperature and precipitation conditions in southwestern US deserts. Three mice were deprived of water for ~72 hours (DRY group) immediately prior to sacrifice, whereas another three mice were provided water ad libitum (WET group). Seminal vesicles were harvested from mice within ten minutes of euthanasia (via isoflurane overdose and subsequent decapitation). Proteins from the homogenized seminal vesicles were extracted with a Qproteome Mammalian Prep kit (Qiagen) and frozen in liquid nitrogen within one hour of sacrifice. All research was done in compliance with University of New Hampshire Animal Care and Use Committee guidelines (protocol number 130902) and in accordance with guidelines by the American Society of Mammalogists (Sikes, 2016).

Protein content of the samples was analyzed with a Qubit 2.0 Fluorometer (Invitrogen), and similar protein quantities from each sample were run in separate lanes on a 4-15% gradient SDS-PAGE gel (BioRad) and electrophoresed at a constant current of 0.02 A to separate proteins by molecular weight. Each sample lane was discretely sub-divided, and these gel fragments were digested with trypsin, in keeping with current procedures (reviewed in Deutsch et al., 2008; Cappadona et al., 2012).

The current methodology for mass spectrometry data generation for proteomic analysis has been thoroughly described (e.g. Deutsch et al., 2008; Cappadona et al., 2012; Schmidt et al., 2014). This study utilizes shotgun proteomics, wherein MS data are leveraged to identify proteins from their component peptides (reviewed in Duncan et al., 2010; Cappadona et al., 2012). All samples were analyzed by LC-MS/MS as described by Zeng-Elmore and colleagues (2014). For the HPLC, 1 µL aliquots of each digested protein sample were continuously injected by the UltiMate 3000 Autosampler (Dionex Corp) at a flow rate of 450 nL/min into the 150 µm x 10 cm reverse-phase capillary column containing C18 resin. The resulting peptides were introduced to the nanoelectrospray ionization source of the LTQ Orbitrap XL mass spectrometer (Thermo Scientific). All of the LC-MS/MS data were collected using an information-dependent mode setting, and MS scans in the Orbitrap (with an m/z range of 315-1800) were alternated with low-energy CID analysis for further characterization.
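To illustrate how shotgun proteomics relates a protein to its component peptides, the following minimal Python sketch performs an in-silico tryptic digest. It assumes the commonly used rule that trypsin cleaves after lysine (K) or arginine (R) unless the next residue is proline (P). The function names, the `max_missed` argument, and the example sequence are hypothetical illustrations; they are not part of the software pipeline used in this study.

```python
from typing import List


def cleavage_fragments(sequence: str) -> List[str]:
    """Split a protein sequence at every assumed tryptic site (after K/R, not before P)."""
    fragments: List[str] = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Keep the C-terminal remainder that does not end in K or R.
        fragments.append(sequence[start:])
    return fragments


def tryptic_peptides(sequence: str, max_missed: int = 1) -> List[str]:
    """List peptides that contain at most `max_missed` uncleaved internal sites."""
    fragments = cleavage_fragments(sequence)
    peptides: List[str] = []
    for start in range(len(fragments)):
        last = min(start + max_missed, len(fragments) - 1)
        for end in range(start, last + 1):
            peptides.append("".join(fragments[start:end + 1]))
    return peptides


# Hypothetical toy sequence: the R-P bond is not cleaved under the assumed rule.
print(tryptic_peptides("AKRPGK", max_missed=1))
```

Allowing one missed cleavage adds the full-length peptide "AKRPGK" to the two fully cleaved peptides, which is why a missed-cleavage allowance enlarges the set of candidate peptides a search engine must consider.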

PAVA (Guan et al., 2011) converted the raw data into peak lists, which were uploaded into Protein Prospector version 5.18.22 (Chalkley et al., 2008). We searched for protein matches with Protein Prospector for each biological sample using a database specific to Peromyscus eremicus, which contains 24,425 Maker-derived protein sequences (MAKER version 2.31.9:

Holt and Yandell, 2011). Within Batch-Tag Web, the custom database was searched using the following parameters: the was Mus musculus, 1 was the Max. Missed Cleavage,

Carbamidomethyl (C) was the constant modification, 0.7 Da was the parent tolerance, 20 ppm was the fragment tolerance, 2-5 was the precursor charge range, and the remainder of the parameters were the default settings. Specifically, this included a Max E Value for Proteins of 0.01 and for Peptides of 0.05, and a Min Score for Proteins of 22.0 and for Peptides of 15.0. The .txt results files from these Protein Prospector searches were used to compile the 1,142 contig ID matches for building the proteome (described below).
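The score and E-value cutoffs quoted above can be read as a simple acceptance filter, and the ppm tolerance as a mass window that scales with m/z. The Python sketch below illustrates both under stated assumptions: the `PeptideMatch` record, its field names, and the inclusive comparisons are hypothetical and do not reproduce Protein Prospector's internal logic or output format.

```python
from dataclasses import dataclass

# Cutoffs quoted in the text (Protein Prospector default settings).
MAX_PROTEIN_E = 0.01
MAX_PEPTIDE_E = 0.05
MIN_PROTEIN_SCORE = 22.0
MIN_PEPTIDE_SCORE = 15.0


@dataclass
class PeptideMatch:
    """Hypothetical record for one peptide match; not a Protein Prospector type."""
    contig_id: str
    peptide_score: float
    peptide_e: float
    protein_score: float
    protein_e: float


def passes_thresholds(match: PeptideMatch) -> bool:
    """Return True when the match meets all four cutoffs (inclusive bounds assumed)."""
    return (
        match.protein_e <= MAX_PROTEIN_E
        and match.peptide_e <= MAX_PEPTIDE_E
        and match.protein_score >= MIN_PROTEIN_SCORE
        and match.peptide_score >= MIN_PEPTIDE_SCORE
    )


def ppm_to_da(mz: float, ppm: float) -> float:
    """Convert a relative tolerance in ppm to an absolute window in Da at a given m/z."""
    return mz * ppm / 1e6


# Hypothetical example values for illustration only.
example = PeptideMatch("contig_1", peptide_score=20.0, peptide_e=0.01,
                       protein_score=30.0, protein_e=0.001)
print(passes_thresholds(example))
print(ppm_to_da(1000.0, 20.0))
```

At m/z 1000, a 20 ppm tolerance corresponds to a window of 0.02 Da, which shows how much narrower a ppm-based tolerance is than the fixed 0.7 Da tolerance.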

R. Chalkley also generated a random concatenated database (RCD) in Protein Prospector with the P. eremicus database in order to determine the false discovery rates (FDRs) of each sample. The proteins within the RCD are the same size as those in the original P. eremicus database, but the protein sequences are randomized (as per Knudsen and Chalkley, 2011). To execute the RCD search, the results generated by each of the previous searches with the P. eremicus database were re-submitted to Batch-Tag Web for a new search with the RCD P. eremicus database, after removing the accession number matches from the Pre-Search Parameters (using the same parameters previously described). The number and percent of matches within the RCD relative to the complete P. eremicus database were determined for each sample to calculate FDRs. The FDRs were <5.25% for every sample in our protein-level analysis.
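As a worked illustration of the FDR arithmetic, the Python sketch below expresses the number of matches to the randomized database as a percentage of the matches to the P. eremicus database. This ratio is an assumed reading of the description above, and the function name and the example counts are hypothetical; they are not values from this study.

```python
def fdr_percent(rcd_matches: int, target_matches: int) -> float:
    """Estimate FDR (%) as randomized-database matches relative to target-database matches."""
    if target_matches <= 0:
        raise ValueError("target_matches must be positive")
    if rcd_matches < 0:
        raise ValueError("rcd_matches cannot be negative")
    return 100.0 * rcd_matches / target_matches


# Hypothetical counts for one sample: 10 randomized matches against 400 target matches.
print(fdr_percent(10, 400))
```

With these illustrative counts the estimate is 2.5%, which would fall below the 5.25% ceiling reported for the samples.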

The proteome was generated by including P. eremicus protein sequence matches found in Protein Prospector (1,142 contig IDs identified within the P. eremicus database) that were supported by one or more unique peptides. All protein matches (matched contig IDs) were retrieved from the P. eremicus database file and compiled. This set of P. eremicus protein sequences comprises our un-annotated seminal vesicle proteome. This group of P. eremicus sequences was also used to execute a BLASTp (Altschul et al., 1990; Madden, 2002) search to find corresponding proteins in a downloaded (Ensembl.org) Mus musculus peptide database (version updated 11/24/16, release-87). This search yielded 1,084 different Mus protein matches, which constitute our Mus musculus annotated P. eremicus contig IDs.
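The two bookkeeping steps in this paragraph, retrieving matched sequences by contig ID and keeping one Mus musculus match per contig, can be sketched in Python as follows. This is an illustration under assumptions rather than the code used in the study: it assumes a plain FASTA database, BLAST results in the 12-column tabular layout (query ID, subject ID, ..., bit score in the last column), and selection of the best hit by bit score. All function names and identifiers are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA lines into {sequence ID: sequence}; the ID is the first header token."""
    records: Dict[str, str] = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:].split()[0]
            records[current] = ""
        elif current is not None:
            records[current] += line
    return records


def subset_by_ids(records: Dict[str, str], wanted: Set[str]) -> Dict[str, str]:
    """Keep only the sequences whose IDs were matched in the search results."""
    return {seq_id: seq for seq_id, seq in records.items() if seq_id in wanted}


def best_hits(blast_lines: Iterable[str]) -> Dict[str, Tuple[str, float]]:
    """Return the highest bit-score subject per query from tab-delimited BLAST output."""
    best: Dict[str, Tuple[str, float]] = {}
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best


# Hypothetical toy inputs for illustration only.
database = read_fasta([">contig_1 example", "MKT", "AAA", ">contig_2", "MPP"])
print(subset_by_ids(database, {"contig_2"}))
```

Counting the distinct subjects returned by `best_hits` would give the number of different Mus protein matches under these assumptions.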

All Mus protein accession number search matches were extracted from the search results and uploaded into PANTHER (pantherdb.org), a gene ontology search platform (Mi et al., 2013; 2017), where we conducted our GO search within Mus musculus. The PANTHER GO analysis resulted in 777 total gene matches, which were categorized into five viewable formats: Biological Process, Molecular Function, Cellular Component, Protein Class, and Pathway. In our supplementary materials we have included the GO results for this list of genes as well as the accession number file we uploaded to generate reproducible results for each of these five formats. Here we present only the Biological Process results (Figure 1) to display the relative number of GO results in each category within this format. For example, within Biological Process, five genes were grouped in the Reproduction GO. Among these five genes, three were placed within the gamete generation GO, one within the fertilization GO, and one was un-classified.
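The Reproduction example can be expressed as a simple tally of genes per GO sub-category. The short Python sketch below reproduces the counts stated in the text; the gene labels are hypothetical placeholders, because the individual gene identities are given only in the supplementary materials.

```python
from collections import Counter

# Placeholder gene labels (hypothetical) mapped to the sub-categories stated in the text.
reproduction_genes = {
    "gene_1": "gamete generation",
    "gene_2": "gamete generation",
    "gene_3": "gamete generation",
    "gene_4": "fertilization",
    "gene_5": "un-classified",
}

# Tally how many genes fall into each sub-category.
counts = Counter(reproduction_genes.values())
for category, n in counts.most_common():
    print(f"{category}: {n}")
```

The same tallying approach, applied to the full PANTHER results file, would yield the per-category counts plotted in Figure 1.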

In summary, we generated a seminal vesicle proteome from dehydrated and hydrated cactus mice. Moreover, the PANTHER gene ontologies provide a thorough characterization of the proteins found in this tissue. This study describes a new reproductive tissue proteome in a desert-specialized rodent, which adds to a growing body of research focused on drought tolerance in a wide range of taxa (e.g. plants: Behringer et al., 2015; Chen et al., 2016; insects: Guillen et al., 2015; mammals: Diaz et al., 2006; Fuller et al., 2014). Indeed, understanding organismal responses to water limitation is particularly relevant as current projections predict dramatic climate change, and we may garner clinically relevant information from desert species adapted to combat dehydration (Johnson et al., 2016).

All three of our data types (raw LC-MS, Protein Prospector protein identification results, and the proteome sequences) are publicly available in the appropriate repositories as per recommendations by Perez-Riverol et al. (2015) and are provided in Supplemental Materials.

Supplemental Materials are available on GitHub, unless otherwise stipulated:

SVproteomeMethods markdown file (containing all code and analyses)

Peromyscus eremicus seminal vesicle proteome: in UniProt (The UniProt Consortium, 2017)

Peromyscus eremicus amino acid sequence database file for Protein Prospector search

Peromyscus eremicus FDR database file for Protein Prospector search

Peromyscus eremicus proteome BLASTp matches to Mus musculus database (with scores)

Panther Gene Ontology Analysis: accession number file for upload and GO results file

Raw LC-MS data: available in PRIDE (Vizcaino et al., 2016)

Protein Prospector data: available in PRIDE

All MS/MS spectra annotated with masses observed as well as fragment assignments (SKYLINE)


Figure 1 Biological Process GO analysis produced by PANTHER (re-graphed in R)


REFERENCES:

Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. 1990. Basic local alignment search tool. Journal of Molecular Biology 215(3): 403-10. DOI: 10.1016/S0022-2836(05)80360-2.

Behringer D, Zimmermann H, Ziegenhagen B, Liepelt S. 2015. Differential gene expression reveals candidate genes for drought stress response in Abies alba (Pinaceae). PLoS One 10(4): e0124564 (pp. 1-18). DOI: 10.1371/journal.pone.0124564.

Caire W. 1999. Cactus mouse in The Smithsonian Book of North American Mammals. Wilson D, Ruff S, editors. Washington, D.C.: Smithsonian Institution Press, 567-568.

Cappadona S, Baker PR, Cutillas PR, Heck AJ, Breukelen B. 2012. Current challenges in software solutions for mass spectrometry-based quantitative proteomics. Amino Acids 43(3): 1087-1108. DOI: 10.1007/s00726-012-1289-8.

Chalkley RJ, Baker PR, Medzihradsky KF, Lynn AJ, Burlingame AL. 2008. In-depth analysis of tandem mass spectrometry data from disparate instrument types. Molecular and Cellular Proteomics 7(12): 2386-2398. DOI: 10.1074/mcp.M800021-MCP200.

Chen W, Yao Q, Patil GB, Agarwal G, Deshmukh RK, Lin L, Wang B, Wang Y, Prince SJ, Xu D, An YC, Valliyodan B, Varshney RK, Nguyen HT. 2016. Identification and comparative analysis of differential gene expression in soybean leaf tissue under drought and flooding stress revealed by RNA-seq. Frontiers in Plant Science 7(Article 1044): (pp. 1-19). DOI: 10.3389/fpls.2016.01044.

Deutsch EW, Lam H, Aebersold R. 2008. Data analysis and bioinformatics tools for tandem mass spectrometry in proteomics. Physiological Genomics 33(1): 18-25. DOI: 10.1152/physiolgenomics.00298.2007

Duncan MW, Aebersold R, Caprioli RM. 2010. The pros and cons of peptide-centric proteomics. Nature Biotechnology 28(7): 659-664. DOI: 10.1038/nbt0710-659

Fuller A, Hetem RS, Maloney SK, Mitchell D. 2014. Adaptation to Heat and Water Shortage in Large, Arid-Zone Mammals. Physiology 29(3): 159-167. DOI: 10.1152/physiol.00049.2013

Guan S, Price JC, Prusiner SB, Ghaemmaghami S, Burlingame AL. 2011. A data processing pipeline for mammalian proteome dynamics studies using stable isotope metabolic labeling. Molecular and Cellular Proteomics 10(12): M111.010728. DOI: 10.1074/mcp.M111.010728

Guillen Y, Rius N, Delprat A, Williford A, Muyas F, Puig M, Casillas S, Ramia M, Egea R, Negre B, Mir G, Camps J, Moncunill V, Ruiz-Ruano FJ, Cabrero J, De Lima LG, Dias GB, Ruiz JC, Kapusta A, Garcia-Mas J, Gut M, Gut IG, Torrents D, Camacho JP, Kuhn GCS, Feschotte C, Clark AG, Betran E, Barbadilla A, Ruiz A. 2015. Genomics of ecological adaptation in cactophilic Drosophila. Genome Biology and Evolution 7(1):349-366. DOI: 10.1093/gbe/evu291

Holt C, Yandell M. 2011. Maker2: an annotation pipeline and genome-database management tool for second-generation genome projects. BMC Bioinformatics 12:491. DOI: 10.1186/1471-2105-12-491

Johnson RJ, Stenvinkel P, Jensen T, Lanaspa MA, Roncal C, Song Z, Bankir L, Sanchez-Lozada LG. 2016. Metabolic and kidney diseases in the setting of climate change, water shortage, and survival factors. Journal of the American Society of Nephrology 27(8): 2247-56. DOI: 10.1681/ASN.2015121314

Knudsen GM, Chalkley RJ. 2011. The Effect of Using an Inappropriate Protein Database for Proteomic Data Analysis. PLoS ONE 6(6): e20873. DOI: 10.1371/journal.pone.0020873

Kordonowy LK, MacManes MD. 2016. Characterization of a male reproductive transcriptome for Peromyscus eremicus (cactus mouse). PeerJ 4:e2617(pp.1-23). DOI: 10.7717/peerj.2617.


Kordonowy L, MacManes M. 2017. Characterizing the reproductive transcriptomic correlates of acute dehydration in males in the desert-adapted rodent, Peromyscus eremicus. BMC Genomics: in review. bioRxiv. DOI: 10.1101/096057

Kordonowy L, Lombardo K, Green H, Dawson MD, LaCourse S, Bolton E, MacManes M. 2017. Physiological and biochemical changes associated with experimental dehydration in the desert adapted cactus mouse, Peromyscus eremicus. Physiological Reports 5 (e13218). DOI: 10.14814/phy2.13218

MacManes MD, Eisen MB. 2014. Characterization of the transcriptome, nucleotide sequence polymorphism, and natural selection in the desert adapted mouse Peromyscus eremicus. PeerJ 2: e642. DOI: 10.7717/peerj.642.

MacManes MD. 2017. Severe acute dehydration in a desert rodent elicits a transcriptional response that effectively prevents kidney injury. American Journal of Physiology-Renal Physiology in press (April 5, 2017). DOI: 10.1152/ajprenal.00067.2017

Madden T. 2002 Oct 9 [Updated 2003 Aug 13]. The BLAST Sequence Analysis Tool. In: McEntyre J, Ostell J, editors. The NCBI Handbook [Internet]. Bethesda (MD): National Center for Biotechnology Information (US); 2002-. http://www.ncbi.nlm.nih.gov/books/NBK21097/

Marra NJ, Eo SH, Hale MC, Waser PM, DeWoody JA. 2012. A priori and a posteriori approaches for finding genes of evolutionary interest in non-model species: Osmoregulatory genes in the kidney transcriptome of the desert rodent Dipodomys spectabilis (banner-tailed kangaroo rat). Comparative Biochemistry and Physiology, Part D: Genomics Proteomics 7(4): 328-339. DOI: 10.1016/j.cbd.2012.07.001.

Marra NJ, Romero A, DeWoody A. 2014. Natural selection and the genetic basis of osmoregulation in heteromyid rodents as revealed by RNA-seq. Molecular Ecology 23(11): 2699-2711. DOI: 10.1111/mec.12764.

Mi H, Muruganujan A, Casagrande JT, Thomas PD. 2013. Large-scale gene function analysis with the PANTHER classification system. Nature Protocols 8: 1551-1566. DOI: 10.1038/nprot.2013.092.

Mi H, Huang X, Muruganujan A, Tang H, Mills C, Kang D, Thomas PD. 2017. PANTHER version 11: expanded annotation data from Gene Ontology and Reactome pathways, and data analysis tool enhancements. Nucleic Acids Research 45(D1): D183-D189. DOI: 10.1093/nar/gkw1138

Perez-Riverol Y, Alpi E, Wang R, Hermjakob H, Vizcaino JA. 2015. Making proteomics data accessible and reusable: Current state of proteomics databases and repositories. Proteomics 15(5-6): 930-950. DOI: 10.1002/pmic.201400302

The UniProt Consortium. 2017. UniProt: the universal protein knowledgebase. Nucleic Acids Research 45(D1): D158-D169. DOI: 10.1093/nar/gkw1099.

Veal R, Caire W. 1979. Peromyscus eremicus. Mammalian Species 118:1-6. Available at http://www.science.smith.edu/msi/pdf/i0076-3519-118-01-0001.pdf.

Vizcaíno JA, Csordas A, del-Toro N, Dianes JA, Griss J, Lavidas I, Mayer G, Perez-Riverol Y, Reisinger F, Ternent T, Xu Q-W, Wang R, Hermjakob H. 2016. 2016 update of the PRIDE database and its related tools. Nucleic Acids Research 44(22):11033. DOI: 10.1093/nar/gkw880

Zeng-Elmore X, Gao X-Z, Pellarin R, Schneidman-Duhovny D, Zhang X-J, Kozacka KA, Tang Y, Sali A, Chalkley RJ, Cote RH, Chu F. 2014. Molecular architecture of photoreceptor phosphodiesterase elucidated by chemical cross-linking and integrative modeling. Journal of Molecular Biology 426(22): 3713-3728. DOI: 10.1016/j.jmb.2014.07.022


University of New Hampshire

Research Integrity Services, Service Building 51 College Road, Durham, NH 03824-3585 Fax: 603-862-3564 14-Oct-2016

MacManes, Matthew D Molecular, Cellular, & Biomedical Sciences Rudman Hall Durham, NH 03824-2618

IACUC #: 160904 Project: Using Genomics to Understand Adaptation to Desert Life in Rodents Approval Date: 15-Sep-2016

The Institutional Animal Care and Use Committee (IACUC) reviewed and approved the protocol submitted for this study under Category C on Page 5 of the Application for Review of Vertebrate Animal Use in Research or Instruction - the research potentially involves minor short-term pain, discomfort or distress which will be treated with appropriate anesthetics/analgesics or other assessments.

Approval is granted for a period of three years from the approval date above. Continued approval throughout the three year period is contingent upon completion of annual reports on the use of animals. At the end of the three year approval period you may submit a new application and request for extension to continue this project. Requests for extension must be filed prior to the expiration of the original approval.

Please Note:

1. All cage, pen, or other animal identification records must include your ACUC # listed above.

2. Use of animals in research and instruction is approved contingent upon participation in the UNH Occupational Health Program for persons handling animals. Participation is mandatory for all principal investigators and their affiliated personnel, employees of the University and students alike. Information about the program, including forms, is available at http://unh.edu/research/occupational-health-program-animal-handlers.

If you have any questions, please contact either Dean Elder at 862-4629 or Julie Simpson at 862-2003.

cc: File
