Orthogonal Approaches for Surveying Genetic Variation

ORTHOGONAL APPROACHES FOR SURVEYING GENETIC VARIATION AND ITS CONSEQUENCES A Dissertation Presented to the Faculty of the Graduate School of Cornell University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy by Maria Florencia Schlamp August 2019 © 2019 Maria Florencia Schlamp ORTHOGONAL APPROACHES FOR SURVEYING GENETIC VARIATION AND ITS CONSEQUENCES Maria Florencia Schlamp, Ph.D. Cornell University 2019 Our morphological traits, responses to stimuli, and the composition of our microbiomes are all phenotypic adaptations influenced by the genetic variation that defines us. Understanding this multimodal network of relationships requires the analysis of a multitude of orthogonal biological systems. Tailoring our approach to the individual biological outputs and systems allows us to reach a deeper understanding of the evolution, regulation, and interactions among biological processes. When available, we can use genomic data from large populations to establish links between genetic variation and phenotypic adaptation. For instance, positive selection can be inferred from variation computationally and statistically via evidence of selective sweeps. In Chapter 2, I evaluate eight selection scans to detect selective sweeps in domestic dogs, a population with well-documented selection pressures imposed by human preferences for specific morphologies and other traits. Pathogen-driven selective pressures modulate adaptation in the immune response, because hosts must keep up in the host-pathogen arms race. The high energetic cost of mounting an immune response reduces resource availability to other physiological processes. To explore these trade-offs, in Chapter 3 I profile the transcription dynamics of the Drosophila melanogaster innate immune response in a dense time course and I apply a broad range of statistical methods, including temporal clustering, gene set expression analysis, and Granger causality to construct putative gene interactions networks. The interaction of hosts with mutualistic symbionts can drive genetic adaptation in hosts through mutually-beneficial processes. In humans, the gut microbiome provides a wealth of symbiotic interactions. To address whether this mutualistic relationship drives host adaptation, in Chapter 4 I study the influence of host genetics on microbiome composition by performing high-resolution QTL mapping to identify genetic variation in Diversity Outbred mice significantly associated with specific bacterial abundances. This thesis presents three orthogonal approaches for surveying genetic variation and its consequences, using a combination of data collected through three sequencing methods: population genomic data using genotyping, global transcriptome dynamics using RNA-sequencing, and microbiome composition using 16S rRNA gene sequencing. BIOGRAPHICAL SKETCH Florencia Schlamp was born in a small town in Argentina in 1990. At the early age of 8, Florencia would already tell her siblings that her favorite subject was Natural Sciences. Back then, little did she know, that 20 years later, she would be completing her Doctoral studies in Genomics and Computational Biology at Cornell University. Her interest in Biology strengthened when she moved to Brazil as a teenager and was able to develop her love for SCUBA diving. Her love for science shaped while taking courses in High School, grew to the extent that she decided to pursue her college studies in Biology. In 2010 she joined the inaugural class of New York University Abu Dhabi, which would change her life both as a person and as a future scientist. Florencia continued to develop her research skills under the supervision of Dr. John Burt and Dr. Michael Purugganan. While in NYUAD, she met amazing faculty, staff, and students. She met lifelong friends and her life partner, Adam Dolan. Together they decided to pursue Graduate studies at Cornell University. For the following years, under the supervision of Dr. Andrew Clark, Florencia was fortunate to pursue multidisciplinary and diverse research projects, while also developing teaching and mentoring skills. Looking to the future, Florencia is passionate about research, and excited to try opportunities in academia, where she can make use of her passion for sharing knowledge and teaching science. iii To my family and friends, and everyone who believed in me… iv ACKNOWLEDGMENTS First and foremost, I want to thank my parents and siblings, whose support and love have built me up to who I am. My mom and dad have been an amazing influence on my life. Their examples are always with me and will continue to shape who I hope to be in the future. This achievement belongs to them, too. To my siblings, Sofia and Guillermo, thank you for your encouragement and love. I am so very proud of you both and I cannot wait to see what we accomplish in the future. I would like to thank Andy for being my PI and providing me with the most amazing group of people he managed to get together in his lab. I appreciate the patience and encouragement he has shown me and I will always remember the sailing trips we took. To Philipp I extend my gratitude for providing guidance, particularly during the early stages of my exploration into population genetics and computational biology. Mariana has been a source of great advice and I am thankful to have her on my committee. The collaboration I have had with Sumanta has been extraordinarily productive and I know that my time-course project would not have turned out as well without his involvement. I want to particularly thank him for his help, teachings, and advice. I want to thank the beautiful humans and scientists in the Clark lab for making it a joy to work there every day. I have found lasting friendships, sharp minds, and kind hearts there. I cannot mention everyone with descriptions of how much they mean to me, so an incomplete list of the great people from the Clark Lab will have to suffice: Manisha, Sri, Elissa, Andrea, Emily, Yasir, Arvid, Ian, Tram, Andrew, Amanda, Lori. Thank you to Cornell University and in particular the GGD Field, for providing me with an amazing community whose support and friendship have been vital these five years. Again, I am faced with the task of making an incomplete list, but I must mention a few: Roman, Dan, Zach, Mike, Ian, Afrah – Thank you for everything. Thank you to all my students, mentees, and undergraduate interns I have supervised, but in particular to David, whose hard work and continuous desire to learn served as a constant reminder of why I love science, research, and teaching so much. Thank you to my dearest lifelong friends I made at NYUAD, especially those who kept me company with DnD over the internet. I am so grateful to have you all as a continuing part of my life. Special thanks to Juan Felipe for all his help and support. I am so thankful for the privilege of having you around both at NYUAD and Cornell. And thank you Veronica for your unconditional friendship and support across time and distance. Above all, thank you to my amazing life partner, Adam. I am so proud of what we have accomplished, and I am excited to see what comes next. And thank you to my amazing cat, Numi, for all the cuddles. v TABLE OF CONTENTS Biographical sketch .................................................................................................................................... iii Dedication ................................................................................................................................................... iv Acknowledgments ........................................................................................................................................ v Table of contents ....................................................................................................................................... vi List of Figures ............................................................................................................................................ vii List of Tables ............................................................................................................................................ viii CHAPTER 1: INTRODUCTION ........................................................................................................... 1 Biological Motivations .................................................................................................................... 1 Inferring regions of positive selection in population genomic data ....................................... 5 Profiling transcription dynamics using RNA sequencing time series ................................... 10 Studying the influence of host genetics on gut microbiome composition .......................... 16 CHAPTER 2: EVALUATING THE PERFORMANCE OF SELECTION SCANS TO DETECT SELECTIVE SWEEPS IN DOMESTIC DOGS .......................................................... 23 Abstract ......................................................................................................................................... 23 Introduction .................................................................................................................................. 24 Material and Methods .................................................................................................................. 28 Results ............................................................................................................................................ 33 Discussion ...................................................................................................................................

Orthogonal Approaches for Surveying Genetic Variation

Soft Sweeps II—Molecular Population Genetics of Adaptation from Recurrent Mutation Or Migration

Soft Sweeps and Beyond: Understanding the Patterns and Probabilities of Selection Footprints Under Rapid Adaptation

On the Unfounded Enthusiasm for Soft Selective Sweeps III: the Supervised Machine Learning Algorithm That Isn’T

Recent Selective Sweeps in North American Drosophila Melanogaster Show Signatures of Soft Sweeps

Soft Selective Sweeps in Complex Demographic Scenarios

Loss and Recovery of Genetic Diversity in Adapting Populations of HIV

Spurious Signatures of Soft and Partial Selective Sweeps Result from Linked Hard Sweeps

Detection and Classification of Hard and Soft Sweeps from Unphased

Soft Sweeps and Beyond: Understanding the Patterns and Probabilities of Selection Footprints Under Rapid Adaptation

HIV-1 Evolutionary Dynamics Under Non-Suppressive Antiretroviral Therapy. Steven a Kemp1,3, Oscar Charles1, Anne Derache2, Colli

Detection and Classification of Hard and Soft Sweeps From

Identifying and Classifying Shared Selective Sweeps from Multilocus Data