Short-Sequence Approach to Uncovering Regulatory Mechanisms in the Human Immune System

Short-Sequence Approach to Uncovering Regulatory Mechanisms in the Human Immune System By Shaked Afik A dissertation submitted in partial satisfaction of the requirements for the degree of Doctor of Philosophy In Computational Biology in the Graduate Division of the University of California, Berkeley Committee in charge: Professor Nir Yosef, Chair Professor Lisa F. Barcellos Professor Lexin Li Professor Daniel A. Portnoy Spring 2020 Abstract Short-Sequence Approach to Uncovering Regulatory Mechanisms in the Human Immune System by Shaked Afik Doctor of Philosophy in Computational Biology University of California, Berkeley Professor Nir Yosef, Chair Short DNA sequences play an important role in the immune response to pathogens. As part of the non-coding regions of the genome, short DNA sequence motifs regulate cell activation and maturation by binding chromatin modifiers and transcription factors. They also determine the ability of each cell in the adaptive immune system to respond to a specific pathogen by forming the antigen-recognizing region of their receptors. This dissertation outlines computational tools I developed for utilizing and integrating high-throughput sequencing data to study the functions of short DNA sequences in the human immune system. I focus on two main aspects of short DNA sequences: (1) As components of the regulatory landscape that control the activation of dendritic cells (DCs) in response to lipopolysaccharide (LPS), and (2) as the determinants of the specificity of T cells and B cells. The first part of my dissertation investigates the regulatory landscape of DC activation following LPS stimulation. In chapter two I present a model which predicts gene induction based on sequence motif occurrences in the regulatory regions of each gene and show that this regulatory logic is conserved between human and mouse. Chapter three describes a supervised learning pipeline I devised to study the contribution of short sequence motifs to temporal epigenetic changes in human DCs. The second part of my dissertation describes my work on determining the specificity of T and B cells from single-cell RNA-sequencing data. Chapter four presents software I developed to reconstruct the full sequence of T cell receptors from short read single-cell RNA-sequencing. An application of the software links the length of the antigen-recognizing region of the receptor to the state of the cell, demonstrating the importance of such combined analysis in studying the immune response to viral infections. Chapter five describes an extension of the software to reconstruct B cells receptor sequences. 1 To my parents i Acknowledgements My greatest gratitude goes to my advisor, Nir Yosef, for his immense support and for not only teaching me how to think about science and how to be a scientist, but for doing so with kindness and patience. I am honored to have been part of the lab from (almost) its inception and to see it grow under his leadership into an incredible team. I also owe a great deal of gratitude to my former mentors and advisors. Nir Friedman and Sebastian Kadener at the Hebrew University introduced me to computational biology and to the exciting world of research. I feel lucky to have been mentored by Manuel Garber from UMass Medical School, my former advisor and current collaborator, who I enjoyed working with and learning from. If it was not for his guidance, I would never have thought about pursuing a PhD. I would also like to thank the members of my qualifying exam and dissertation committee for their valuable feedback: Lisa Barcellos, Lexin Li, Nick Ingolia and Dan Portnoy. I wish to thank the members of the Yosef lab for endless discussions and assistance. Special thanks to Jim Kaminski and Romain Lopaz (best office mates one could ask for), David DeTomaso, Anat Kreimer, Michael Cole, Alyssa Morrow and Chenling Xu. I would also like to thank my past and current co-first-authors and collaborators: Kathleen Yates, Kevin Bi and Nick Haining, Pranitha Vangala, Elisa Donnard and Manuel Garber, and Gabe Raulet. Their hard work and enthusiasm was an inspiration. I have the privilege of being a member of the second cohort of the UC Berkeley computational biology PhD program. Thank you to my fellow cohort member David DeTomaso and our “adoptive” cohort: Jim Kaminski, Brooke Rhead, Jeff Spence, Amy Ko and Rob Tunney. Their friendships have made my time here not only less stressful but also so enjoyable. Many Thanks to Kate Chase and Xuan Quach for their support to the program in general and to me personally. I would also like to thank all the great friends I gained here from the comp bio program, UC Berkeley and the Bay Area. You have made Berkeley feel like home and I will forever be grateful. Special thanks to my family, especially to my mom and dad. I could not have endured this time away without their never-ending love and support. This work is dedicated to my partner, Noam. For his spirit, thoughtfulness and nonstop encouragement that made me push myself further every day for the last 14 years, even when we were 10 time zones away. Thanks to him, my life is completely different than I could have ever imagined and I am so happy for it. As I am writing this, I am only a few weeks away from the birth of my first child. There are no words to describe how remarkably better the last two years have been just knowing I will get to meet you at the end of it. I cannot wait to discover what I will learn from you. ii Table of Contents Chapter 1 - Introduction ……………………………………………………………………….. 1 Transcriptional regulation …………………………………………………………………… 1 Uncovering the regulatory landscape of Dendritic cells following LPS stimulation ……….. 2 Heterogeneity and specificity of the adaptive immune system ……………………………... 3 References ………………………………………………………………………………….... 4 Chapter 2 - Comparative analysis of immune cells reveals a conserved regulatory lexicon 6 Summary ……………………………………………………………………………………. 6 Introduction …………………………………………………………………………………. 7 Results ………………………………………………………………………………………. 8 Discussion ………………………………………………………………………………….. 14 Figures …………………………………………………………………………………….... 17 References ………………………………………………………………………………….. 24 Supplementary information ……………………………………………………………….... 31 Chapter 3 - Uncovering the DNA sequence motifs which control the epigenetic landscape of Dendritic cells maturation ……………………………………………………………………. 49 Abstract …………………………………………………………………………………….. 49 Introduction ……………………………………………………………………………….... 50 Results …………………………………………………………………………………….... 51 Discussion ………………………………………………………………………………...... 56 Methods ……………………………………………………………………………………. 57 Figures …………………………………………………………………………………....... 65 References …………………………………………………………………………………. 68 Supplementary Figures …………………………………………………………………….. 74 Supplementary Tables …………………………………………………………………….... 79 Chapter 4 - Targeted reconstruction of T cell receptor sequence from single cell RNA-seq links CDR3 length to T cell differentiation state ……………………………………………. 82 Abstract …………………………………………………………………………………….. 83 Introduction ……………………………………………………………………………….... 83 Materials and Methods ……………………………………………………………………... 85 Results …………………………………………………………………………………….... 92 Discussion ………………………………………………………………………………….. 96 2iii Figures …………………………………………………………………………………….... 98 References ……………………………………………………………………………….... 102 Supplementary information ……………………………………………………………….. 107 Chapter 5 - Reconstructing B cell receptor sequences from short-read single cell RNA-sequencing with BRAPeS ……………………………………………………………... 118 Abstract …………………………………………………………………………………….118 Introduction ………………………………………………………………………………...119 Results ……………………………………………………………………………………...119 Discussion ………………………………………………………………………………….121 Materials and Methods ……………………………………………………………………. 122 Figures …………………………………………………………………………………….. 126 References ……………………………………………………………………………….... 129 Supplementary Information ……………………………………………………………….. 131 iv Chapter 1 - Introduction The role of the immune system is to defend against harmful pathogens such as viruses and bacteria. It can be broadly divided into two lines of defense. The first line is the innate immune system, which detects the pathogen and reacts with a rapid yet general response. Then, the adaptive immune system, namely T and B cells, provides a slower response which is tailored to the specific pathogen. Those processes include many cell types that undergo vast molecular changes as cells differentiate and mature in response to the pathogen. Profiling the molecular basis of the human immune response is of great importance, as it uncovers the mechanisms underlying the body’s response to vaccination, infections and other diseases, which in turn leads to developments of drugs and methods for cancer therapy (Jiang, 2017; Villani et al., 2018). Moreover, it allows us to gain a better understanding of basic principles in molecular biology (Pope and Medzhitov, 2018). In my dissertation, I was interested in the role that short DNA sequences play in various stages of the immune response to pathogens. I focused on two main functions of these sequences: As regulators of transcriptional changes in cells from the innate immune system, and as the major genomic component which determines the specificity of cells from the adaptive immune system. Transcriptional regulation Changes to cell state, e.g. in response to environmental stimuli, are controlled by changes in expression of thousands of genes. Those changes

Short-Sequence Approach to Uncovering Regulatory Mechanisms in the Human Immune System

IMGT-ONTOLOGY and IMGT Databases, Tools and Web

Nuclear Organization and the Epigenetic Landscape of the Mus Musculus X-Chromosome Alicia Liu University of Connecticut - Storrs, [email protected]

Multiple Deleted Regions on the Long Arm of Chromosome 6 in Astrocytic

Determining Region

The Role of the X Chromosome in Embryonic and Postnatal Growth

'Next- Generation' Sequencing Data Analysis

Candida Albicans

B Inhibition in a Mouse Model of Chronic Colitis1

Analysis of the Complex Interaction of Cdr1as‑Mirna‑Protein and Detection of Its Novel Role in Melanoma

Supplementary Data

Expandable and Reversible Copy Number Amplification Drives Rapid Adaptation to Antifungal Drugs Robert T Todd, Anna Selmecki*

This Item Is the Archived Peer-Reviewed Author-Version Of