A single-cell perspective on infection

by Nathan Scott Haseley

B.S. Rochester Institute of Technology (2009)

Submitted to the Harvard-MIT Program In Health Sciences and Technology in partial fulfillment of the requirements for the degree of

Doctor of Philosophy in Bioinformatics

at the

MASSACHUSETTS INSTITUTE OF TECHNOLOGY

February 2016

Massachusetts Institute of Technology 2016. All rights reserved.

Author ...... Signature redacted Harvard-MIT Program In Health Sciences and Technology September 29, 2015

Certified by ...... Signature redacted Deborah T. Hung MD, PhD, Associate Professor Thesis Supervisor

Accepted by ...... Signature redacted Emery N. Brown MD, PhD/Director, Harvard-MIT Program in Health Sciences and Technology/Professor of Computational Neuroscience and Health Sciences and Technology

A single-cell perspective on infection

by Nathan Scott Haseley

Submitted to the Harvard-MIT Program In Health Sciences and Technology on September 29, 2015, in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Bioinformatics

Abstract

The clinical course of infection is ultimately determined by a series of cellular interactions between invading pathogens and host immune cells. It has long been understood that these interactions, even when they occur in tissue culture models, give rise to a wide variety of different outcomes, some beneficial to the host, others to the pathogen. These cellular interactions, however, are typically studied at a bulk level, which masks this cell-to-cell variation, loses important information about the full range of possible host-pathogen interactions, and leaves the mechanistic basis for these different outcomes largely unexplored. Here, we present a system that combines single-cell RNA sequencing with fluorescent markers of infection outcome to directly correlate host transcription signatures with infection outcome at the single-cell level. Applying this system to the well-characterized model of Salmonella enterica infection of mouse macrophages, we found: 1) unique transcription signatures associated with bacterial exposure and bacterial infection, 2) sustained high levels of heterogeneity in immune pathways in infected macrophages, and 3) a novel subpopulation of macrophages characterized by high expression of the Type I Interferon response after infection. Upon further investigation, we found that this heterogeneity in the host Type I Interferon response was the result of heterogeneity in the population of infecting bacteria, namely in the extent of PhoPQ-mediated LPS modifications. This work highlights the importance of heterogeneity as a characteristic of bacterial populations that can influence the host immune response. It also demonstrates the benefits of examining infection with single-cell resolution.

Thesis Supervisor: Deborah T. Hung Title: MD, PhD, Associate Professor

Acknowledgements

This thesis is a product of all of the support, care, and instruction that I have received over many years. There are many people to thank who have been instrumental in my graduate career and in getting me to this point. I am sure that I did not mention everyone here, but nevertheless, I would like to say thank you. First, I would like to thank my wife, Psalm, for all her support and encouragement throughout graduate school and for making Boston into a home for both of us. I would like to thank my parents for their support and wisdom throughout the years. I would also like to thank the numerous friends that I've had in Boston. I owe a debt to the entire community at CityLife Presbyterian Church for their support and prayers, and particularly to my community group. Thanks as well to Sam, Whitney, Trevor, and Megan for numerous game nights and other times of hanging out that were essential for sanity. I would like to thank my adviser, Deb, for teaching me how to think like a scientist, and all current and past members of the Hung lab for helpful discussions, encouragement when things weren't working, and for making the Broad Institute a great place to work. I would also like to thank all of our collaborators in the Regev and Xavier labs. I would additionally like to thank my previous mentor, Dr. Ferran, for getting me started in science, and old members of the Ferran lab for their still-constant friendship. Finally, I would like to thank all the members of my committee, all of the HST staff for their help through this process, and my classmates, particularly all of my fellow HST BIG students, for feedback, friendship, and keeping me interested in science outside my own project.

Contents

1 Introduction
  1.1 Overview
  1.2 S. enterica pathophysiology
  1.3 Extracellular host-pathogen interactions
    1.3.1 The role of diverse microenvironments in promoting heterogeneity
    1.3.2 Heterogeneity as a mechanism to control the host inflammatory response
  1.4 Intracellular host-pathogen interactions: A battle between two highly adaptable cells
  1.5 Heterogeneity in macrophage and bacterial populations
  1.6 The need for a new approach to study infection

2 Examining S. enterica-Macrophage Interactions at the Single-Cell Level
  2.1 Attributions
  2.2 Introduction
  2.3 Results
    2.3.1 Heterogeneous outcomes of S. enterica-macrophage encounters
    2.3.2 Single-cell RNA-Seq accurately captures host transcriptional states after bacterial exposure
    2.3.3 Single-cell RNA-Seq identifies transcriptional changes associated with extracellular and intracellular bacterial detection
    2.3.4 Bimodal induction of Type I IFN response in infected macrophages
    2.3.5 Infected macrophages display high cell-to-cell variation in genes from immune response pathways
  2.4 Discussion
    2.4.1 A general approach to characterize the transcriptional underpinnings of phenotypic heterogeneity in host-pathogen encounters
    2.4.2 A single-cell resolution map of S. typhimurium-macrophage interactions
  2.5 Materials and Methods
    2.5.1 Mice, cell lines and bacterial strains
    2.5.2 Cell and bacterial cultures, single-cell sorting, and analysis
    2.5.3 Imaging assay protocol
    2.5.4 Image acquisition and analysis
    2.5.5 Single-cell expression profiling
    2.5.6 cDNA synthesis, amplification and library construction
    2.5.7 Transcript quantification
    2.5.8 Quality filters and statistics
    2.5.9 Comparison of methods for differential expression analysis
    2.5.10 Differential expression analysis
    2.5.11 PCA
    2.5.12 Identifying genes responding to extracellular or intracellular bacteria (Clusters I and II)
    2.5.13 Identification of co-regulated clusters (Clusters III, IV, and V)
    2.5.14 Correlation plots
    2.5.15 Heatmaps
    2.5.16 Single-molecule RNA-flow FISH
    2.5.17 Estimating gene variance scores
    2.5.18 Identification of low and high variance pathways in infected macrophages
    2.5.19 Comparison of variance scores between exposure- and infection-induced gene sets
    2.5.20 Cell expression density plots
    2.5.21 Datasets

3 The Mechanisms Behind Heterogeneity in the Type I IFN Response
  3.1 Attributions
  3.2 Introduction
  3.3 Results
    3.3.1 Intracellular TLR4 signaling through TRIF and IRF3 determines the expression of the Type I IFN response in infected macrophages
    3.3.2 Live bacteria, but not LPS-coated beads, elicit a variable Type I IFN response in infected macrophages
    3.3.3 Variation in the host Type I IFN response is driven by bimodal activity of the bacterial PhoPQ two-component system in infecting bacteria
    3.3.4 Intracellular recognition of PhoPQ-mediated LPS modifications results in induction of the Type I IFN response
    3.3.5 PhoPQ-mediated LPS modifications impact the in vivo Type I IFN response and infection outcome
  3.4 Discussion
    3.4.1 Heterogeneity of pathogen populations as a mechanism to shape the host immune response
    3.4.2 Studies of the immune response in the context of heterogeneous bacterial ligands
    3.4.3 Possible advantages of bimodal expression of bacterial factors during the course of in vivo infection
  3.5 Materials and Methods
    3.5.1 Mice, cell lines and bacterial strains
    3.5.2 Cell and bacteria cultures, single-cell sorting, library construction, and analysis
    3.5.3 RNAtag-Seq for simultaneous detection of host and intracellular bacterial transcripts
    3.5.4 Transcript quantification of RNAtag-Seq libraries
    3.5.5 Single-cell Real-time qPCR
    3.5.6 Cluster summaries for single-cell RNA-Seq libraries
    3.5.7 Analysis of Biomark HD qPCR data
    3.5.8 Single-cell summary plots of gene clusters for Biomark data
    3.5.9 TRIF/MYD88 gene analysis
    3.5.10 and DNA manipulations
    3.5.11 Identification of bacterial pathways enriched in ISRE positive and negative cells
    3.5.12 Heat-killed bacteria and LPS extractions
    3.5.13 Coating beads with LPS
    3.5.14 Mouse LPS stimulation

4 Conclusion
  4.1 Summary
  4.2 Future directions

A Appendix
  A.1 Supplemental Figures
  A.2 Supplemental Tables

List of Figures

1-1 Overview of S. enterica pathophysiology

2-1 Heterogeneous outcomes of BMM-S. typhimurium encounters

2-2 Single-cell RNA-Seq accurately captures transcriptional changes associated with bacterial exposure

2-3 Transcriptional signatures associated with subpopulations of exposed macrophages

2-4 The high variation of immune pathways in infected macrophages

3-1 Analysis of macrophage pathways regulating the bimodal induction of the Type I IFN response

3-2 Heterogeneity in the invading bacterial population shapes the heterogeneous host Type I IFN response

3-3 Bacterial variation in PhoPQ-mediated LPS modifications drives the bimodal induction of the host Type I IFN response

3-4 PhoPQ-mediated LPS modifications impact in vivo infection outcomes

S1 Confirmation of the behavior of gene clusters

S2 Infection with LPS-beads induces Cluster III in a significantly higher fraction of cells compared to infection with live bacteria

S3 Bacterial PhoPQ activity drives heterogeneity in the host Type I IFN response

S4 Stimulation with LPS extracted from PhoP mutant strains induces varying levels of Cluster III genes

S5 Stimulation of mice with LPS extracted from PhoP mutant strains

List of Tables

1.1 S. enterica type III secretion effectors

2.1 Enriched Ingenuity pathways in Clusters I, II, and III

2.2 Enriched Ingenuity pathways in Clusters IV and V

2.3 High and low variance pathways in infected macrophages

S1 Genes used in PCA in Fig. 2-3A

S2 Gene clusters I, II, and III identified from single-cell analysis

S3 Gene clusters IV and V identified from single-cell analysis

S4 Genes and probes used in Biomark experiments

Chapter 1

Introduction

1.1 Overview

Bacterial pathogens are a serious and growing public health concern, with many pathogens that were once thought to be defeated re-emerging in recent years (National Institutes of Health (US), 2007). The class of facultative intracellular bacterial pathogens is particularly troubling as it includes numerous public health threats including Mycobacterium tuberculosis (M. tuberculosis), Salmonella enterica (S. enterica), Salmonella typhi (S. typhi), Legionella pneumophila (L. pneumophila), and Neisseria gonorrhoeae (N. gonorrhoeae). Many of these organisms, individually, cause tens of millions of infections each year, and M. tuberculosis is the second greatest cause of mortality from a single infectious agent, causing 1.5 million deaths in 2013 (World Health Organization, 2015c,a,b). To make matters worse, antibiotic resistance is a particularly acute concern for this class of pathogens, with N. gonorrhoeae, M. tuberculosis, S. typhi, and non-typhoidal Salmonellae being identified as either urgent or serious threats by the most recent CDC report on antibiotic resistance (Centers for Disease Control and Prevention, 2014).

Many of the difficulties presented by these organisms can be attributed, at least in part, to their intracellular lifestyle. To cause disease, facultative intracellular pathogens must spend a portion of their lifecycle replicating inside host cells, usually host macrophages. This intracellular niche allows these pathogens to evade the humoral immune response (Kaufmann, 1993) and it allows for easy dissemination within the host using the reticuloendothelial system (Haraga et al., 2008). It also makes treatment challenging since many antibiotics have poor cellular permeability (Bonventre et al., 1967). To compound these issues, recent research suggests that bacteria exist in various physiological states in the intracellular environment, some of which are resistant to immune- and antibiotic-mediated killing (Helaine et al., 2014). These phenotypically tolerant states are not only thought to contribute to treatment failure, particularly for M. tuberculosis (Wallis et al., 1999), but this diversity also confounds our basic understanding of intracellular host-pathogen interactions, which is largely based on measurements of average behavior (Helaine et al., 2014). These shortcomings in our understanding of intracellular host-pathogen interactions hinder our ability to identify relevant disease targets and develop new treatments.

To address the public health threat posed by these organisms, a more nuanced understanding of intracellular host-pathogen interactions is required. We must develop methods to assess and quantify physiological diversity in the intracellular environment in order to understand the full range of potential host-bacterial interactions. This dissertation focuses on the construction of an experimental platform combining single-cell RNA sequencing (RNA-Seq) with fluorescent markers of infection outcomes to begin to address this need. We apply it to the well-established tissue culture model of S. enterica serovar Typhimurium (S. typhimurium) infection of mouse bone marrow-derived macrophages (BMMs), demonstrating the utility of our approach for uncovering new biological insights.

Here, because of the extensive body of literature already accumulated on S. enterica, we take this organism as a representative example of facultative intracellular pathogens, and present an overview of its pathophysiology. We then present a more detailed discussion of host-pathogen interactions, focusing on aspects of the infection process that are likely to promote cellular heterogeneity as well as on a few examples of heterogeneity that have been reported. Next, we review recent literature demonstrating that heterogeneity is a common attribute of both bacterial and macrophage populations, involved in a wide variety of processes. We conclude by suggesting that the role of heterogeneity during infection has been underestimated and requires further, more systematic, investigation.

1.2 S. enterica pathophysiology

S. enterica is one of the best studied facultative intracellular bacterial pathogens and is a public health threat in its own right, causing almost 100 million cases of gastroenteritis each year (Majowicz et al., 2010). Over 2000 serovars of S. enterica exist, but only a handful are pathogenic (Coburn et al., 2007). Depending on factors such as the infecting serovar and the immune competence of the host, the typical course of infection can range from self-limiting episodes of diarrhea to life-threatening high-grade fever and bacteremia. An asymptomatic carrier state is also possible (Hornick et al., 1970). Despite this wide range of clinical presentations, the pathophysiology of all these types of S. enterica infection is undergirded by complex cellular interactions that ultimately determine the course of infection.

[Figure 1-1 panel labels: Salmonella spp.; epithelial cell; M cell; macrophage; T cell; B cell; PMN influx; gastroenteritis; enteric fever: dissemination to lymph nodes, liver and spleen]

Figure 1-1: Overview of S. enterica pathophysiology. S. enterica enters hosts via the oral cavity, migrates to the intestine, traverses the epithelial barrier, takes up residence in host macrophages, and uses the intracellular niche to disseminate through the host. Adapted from (Haraga et al., 2008)

As depicted in fig. 1-1, S. enterica typically enters hosts via the oral cavity upon ingestion of contaminated food. It migrates through the digestive system into the small intestine, where it actively promotes a robust inflammatory response using a Type III secretion system (T3SS) encoded in Salmonella Pathogenicity Island 1 (SPI-1), a specialized virulence factor that allows bacterial proteins called effectors to be injected into host cells (Haraga et al., 2008). This leads to the recruitment of macrophages, polymorphonuclear leukocytes (PMNs), and other immune cells to the area. S. enterica then uses active and passive mechanisms to traverse the intestinal lining and access the lymphatic system. This passage is achieved via CD18-expressing phagocytes (Vazquez-Torres et al., 1999), direct disruption of tight junctions (Jepson et al., 1995), or host M-cells, a specialized antigen-sampling cell on Peyer's patches (Jones et al., 1994). This combination of inflammation and tight junction disruption causes significant edema and fluid leakage, contributing to the classic symptom of diarrhea (McGovern and Slavutin, 1979; Coburn et al., 2007). Once in the lymphatic system, bacteria are engulfed by host macrophages. Using a T3SS located on Salmonella Pathogenicity Island 2 (SPI-2) and other virulence programs, S. enterica attempts to subvert the natural bactericidal activity of host macrophages and convert the phagosomal compartment into a replication-permissive Salmonella-containing vacuole (SCV) (Alpuche-Aranda et al., 1994; Haraga et al., 2008). If successful, S. enterica replicates in host macrophages and, in some cases, can use its intracellular niche to disseminate throughout the host (Coburn et al., 2007). Meanwhile, the host attempts to activate a robust antibacterial response depending in part on interferon gamma (IFN-γ), which, among other activities, has been shown to be critical for enabling macrophages to kill intracellular bacteria (Kagaya et al., 1989). These cellular interactions between bacterial virulence programs and the host immune response eventually determine the infection outcome.

1.3 Extracellular host-pathogen interactions

1.3.1 The role of diverse microenvironments in promoting heterogeneity

Mammals have a number of innate defenses to prevent colonization with intestinal pathogens. One key aspect of the extracellular S. enterica life cycle is adaptation to these diverse stressors. For example, during the course of infection, S. enterica must adapt to the low pH of the stomach (Audia et al., 2001), bile and osmotic stress in the small intestine (van Velkinburgh and Gunn, 1999; Sleator and Hill, 2002), competition with the different communities of natural gut flora (Alvarez-Ordonez et al., 2011), and aspects of the host immune response such as anti-microbial peptides (Alvarez-Ordonez et al., 2011). The bacteria also must adjust to the largely anaerobic environment of the intestine (Rychlik and Barrow, 2005). The success with which S. enterica infects its hosts, with an infectious dose estimated to be as low as hundreds of bacteria (Waterman and Small, 1998), underscores its adaptability as a pathogen and the diversity of genetic programs it can activate.

For example, to adapt to the low pH environment of the stomach, S. enterica activates a complex transcriptional program involving regulators such as RpoS, PhoPQ, Fur, and OmpR/EnvZ (Audia et al., 2001). This response has wide-reaching physiological consequences including activating lysine and arginine decarboxylases to increase intracellular pH (Park et al., 1996); inducing acid shock proteins, like Ada, to reduce or repair molecular damage (Bearson et al., 1998); and modifying the bacterial membrane (Perez and Groisman, 2007). The response to osmotic shock, on the other hand, involves different adaptations including alterations in solute transport and synthesis of osmoprotectants such as proline and trehalose (Csonka, 1988; Howells et al., 2002). Finally, the response to bile and antimicrobial peptides integrates many of the above responses with some specific responses such as induction of the protease PgtE, which can cleave antimicrobial peptides (Guina et al., 2000), and the efflux pump acrB, which is able to efflux bile (Lacroix et al., 1996).

It is interesting to note that many of these bacterial adaptations can be either beneficial or harmful depending on the immediate environmental context in the host. PhoPQ-mediated LPS modifications, for example, are thought to increase bacterial tolerance to antimicrobial peptides (Gunn, 2001), but PhoP activation also decreases bacterial replication rates and the efficiency of epithelial transcytosis (Behlau and Miller, 1993), a required step during pathogenesis (Nunez-Hernandez et al., 2013). Similarly, anaerobic respiration, while required to survive in certain areas of the host, is suboptimal if oxygen is available (Aussel et al., 2014). These observations feed into a growing body of research suggesting that during infection, bacteria may be exposed to different stresses and thus behave quite differently from one another, despite being in close proximity. For example, it has been suggested using a RIVET-based expression assay that PhoP may only be activated in a subset of cells in the intestinal lumen, based on findings that, at the population level, both PhoP-activated and PhoP-repressed genes are induced (Merighi et al., 2005). Similarly, bacteria have been suggested to vary significantly based on proximity to different immune cells that produce differing amounts of various reactive oxygen and nitrogen species (to be discussed more below, Burton et al. 2014). The full extent of bacterial heterogeneity generated by host microenvironments is not well understood since the majority of methods used to study infection function at the bulk level and cannot measure these differences. Nevertheless, it seems apparent that even before entering the intracellular environment, extensive cell-to-cell variation likely exists in infecting bacteria. These differences may well influence the outcome of macrophage-bacterial encounters.

1.3.2 Heterogeneity as a mechanism to control the host inflammatory response

A second important aspect of extracellular host-pathogen interactions during S. enterica infection is the regulation of the host inflammatory response in the intestine. Upon detection of intracellular pathogen-associated molecular patterns (PAMPs) or of bacteria outside the intestinal lumen, a robust inflammatory response is initiated in the intestine involving IL-17, IL-22, and CXCR-mediated neutrophil recruitment; IFN-γ-mediated macrophage activation; and basophil- and mast cell-mediated increases in vascular permeability with associated exudate formation (Winter et al., 2010). These responses have been shown to play an important role in restricting intestinal pathogens including S. enterica, as demonstrated by the increased severity of disease in patients lacking functional IFN-γ (MacLennan et al., 2004) or with neutropenia (Noriega et al., 1994). This benefit is not without cost, however. Excessive inflammation can damage host tissue, and low-level inflammation can actually enable S. enterica colonization by removing natural gut flora (Stecher et al., 2007). Thus tight control of the type, extent, and timing of host inflammation is critical for both the host and the pathogen.

S. enterica plays a direct role in regulating host inflammation by injecting SPI-1 T3SS effectors directly into host epithelial cells. These effectors include both potent pro-inflammatory molecules such as flagellin, SipA, and SopE, as well as anti-inflammatory effectors such as AvrA and SpvC (Haraga et al., 2008). Heterogeneity has been observed both in the expression of immune-mediating effectors (Cummings et al., 2006; Schlumberger et al., 2005a) and in the expression of the SPI-1 T3SS itself (Hautefort et al., 2003). This heterogeneity could conceivably permit better spatial-temporal control over host responses. Alternatively, it has been proposed that this heterogeneity is a form of bet hedging designed to offset the fitness cost of directly stimulating host immunity (Ackermann et al., 2008). Regardless of the mechanism, bacterial heterogeneity has evolved in response to the need to tightly control host inflammation, suggesting, again, that prior to entering the intracellular environment individual bacteria may be in different cell states with different effects on host cells.

1.4 Intracellular host-pathogen interactions: A battle between two highly adaptable cells

During the intracellular stage of the S. enterica lifecycle, the macrophage becomes a new site of struggle between the host and pathogen. Macrophages possess a variety of antibacterial defenses that are activated upon the detection of conserved PAMPs on bacterial pathogens by host pattern recognition receptors (PRRs), including Toll-like receptors (TLRs) and Nod-like receptors (NLRs). The canonical means of detection of Gram-negative bacteria, such as S. enterica, is the recognition of bacterial lipopolysaccharide (LPS) by the macrophage TLR4 receptor (Mogensen, 2009). In fact, although other PAMPs including bacterial RNA, DNA, peptidoglycan, and flagellin can be recognized, it has been reported that the macrophage transcriptional response to live S. enterica can be almost entirely recapitulated by LPS stimulation (Rosenberger et al., 2000). Upon recognition, macrophages engulf invading bacteria, trapping them in a phagosomal compartment. In the absence of interference, this compartment undergoes a process known as maturation that involves vacuolar ATPase-mediated acidification. It also involves sequential fusions with early and late endosomes, as well as lysosomes, in processes that are mediated, at least in part, by Rab family GTPases (Canton, 2014). The mature phagosome has a pH of approximately 4.5-5.5, a number of hydrolytic enzymes, and reactive oxygen species produced by NADPH oxidase, all of which can be lethal to bacteria (Canton, 2014). It is also deficient in a number of factors necessary for bacterial growth, most notably metal ions such as Fe2+ and Mg2+ (Garcia-del Portillo et al., 1992).

Additionally, macrophages induce other anti-bacterial programs including the typical inflammatory response mediated by NF-κB and MAPK (Rosenberger et al., 2000). These pathways activate the expression of numerous proteins such as NOS2, which leads to the production of reactive nitrogen species and has been shown to be important in controlling intracellular S. enterica (Vazquez-Torres et al., 2000). Macrophages also express numerous chemokines including CCL3 and CCL4, attracting other immune cells to the area, and a variety of cytokines including IFN-γ (which is also secreted by other immune cells that macrophages recruit), TNF-α, and Type I interferon (Type I IFN) (Winter et al., 2010; Freudenberg et al., 2002). IFN-γ has an important role in potentiating macrophage activation. Indeed, macrophages pre-treated with IFN-γ have been shown to be able to more effectively eliminate intracellular S. enterica despite the bacterial defenses discussed below (Kagaya et al., 1989). TNF-α is an important pro-inflammatory cytokine, inducing effects such as fever and apoptosis. The effects of Type I IFN on bacterial infection are more equivocal. Type I IFN has been shown to be able to induce IFN-γ, a process generally considered beneficial for the host (Freudenberg et al., 2002); however, it is also thought to predispose macrophages to necroptosis, a form of cell death that is thought to be detrimental to the host (Robinson et al., 2012).

S. enterica uses a multi-pronged approach to overcome macrophage antibacterial responses. First, this involves sensing the transition from extracellular to intracellular environments. Interestingly, this process often depends on macrophage antibacterial responses, making these responses double-edged swords that activate specific bacterial virulence programs that can harm the host even as they inhibit bacterial replication. For example, S. enterica uses multiple two-component systems such as OmpR-EnvZ, SsrA-B, and PhoPQ to sense aspects of the phagosomal environment such as decreasing pH, decreasing Mg2+ and Fe2+ concentrations, and nutrient starvation (Haraga et al., 2008). This leads to somewhat unintuitive experimental results suggesting that inhibiting specific macrophage immune functions such as phagosome acidification actually decreases bacterial survival (Rathman et al., 1996). It also emphasizes the close connection that exists between specific macrophage and S. enterica responses such that altering macrophage activity is sufficient to elicit different bacterial responses.

A second aspect of the S. enterica response to macrophages involves activating virulence pathways that promote bacterial survival in harsh environments. To permit survival in the iron-starved environment of the SCV, for example, S. enterica produces two siderophores, salmochelin and enterobactin, that scavenge iron from the environment (Ibarra and Steele-Mortimer, 2009). S. enterica also encodes the gene feo, which represents an independent iron acquisition pathway necessary for virulence (Boyer et al., 2002). To overcome reactive oxygen and nitrogen species, S. enterica encodes numerous enzymes to either detoxify reactive species (SodCI, SodCII, KatG) (Sly et al., 2002; McLean et al., 2010) or repair the resulting damage (RecA, various elements of the SoxRS regulon) (Janssen et al., 2003). To overcome the low pH stress encountered in macrophages, S. enterica induces LPS modifications, changing both the structure of lipid A and the concentration of acidic glycerophospholipids in the outer membrane to form a more effective barrier (Gibbons et al., 2005; Dalebroux et al., 2014).

Third, S. enterica activates virulence programs that directly interfere with macrophage activity. This is primarily mediated by the SPI-2 T3SS, which is responsible for translocating at least 28 different bacterial effectors into the host cytosol (Figueira and Holden, 2012). Although bacterial effectors and their functions are still active areas of research, multiple studies demonstrate dramatic alterations in host-pathogen interactions if effector translocation is blocked. One notable example is SifA, which interferes with host trafficking and alters SCV membrane integrity. Translocation of SifA causes the formation of glycoprotein-containing membrane tubules known as Sifs around the phagosome, whose function remains unclear but may expand the SCV or aid in bacterial nutrient acquisition (Figueira and Holden, 2012). SifA also helps maintain SCV integrity, as shown by the phenotype of SifA-null mutant strains that are found to escape the phagosomal compartment and reside in the host cytoplasm (Beuzon et al., 2000). Other examples of effectors include SspH1, which has ligase activity and is thought to interfere with host inflammation by disrupting signaling through the NF-κB pathway (Haraga and Miller, 2003), and SpvC, which inhibits host inflammation by disrupting the MAPK pathway (Mazurkiewicz et al., 2008). Other effectors are listed in table 1.1 with their suspected functions. In addition to the SPI-2 T3SS, S. enterica also can interfere with macrophage responses by altering bacterial recognition by host PRRs. This is primarily thought to be accomplished through PhoP-mediated LPS modifications, which have been associated with a decreased host inflammatory response (Guo et al., 1997).

It should be apparent from the above discussion that the relationship between macrophages and intracellular S. enterica involves many different processes ranging from the control of available nutrients to the control of host inflammatory pathways. Furthermore, macrophage and bacterial responses are intimately linked, with each sensing and responding to the activities of the other. The complexities of this environment are likely to produce and even propagate cell-to-cell differences leading to extensive differences between individual infection events. Despite this, few studies have been able to catalog the heterogeneity of host and bacterial factors during intracellular infection or evaluate the functional consequences of this heterogeneity.

Table 1.1: S. enterica type III secretion effectors. Adapted from Haraga et al. (2008).

| Secretion system | Effector | Cellular function | Target |
|---|---|---|---|
| SPI-1 | AvrA | Inhibits NF-κB signalling and IL-8 production; also prevents ubiquitination of β-catenin | Unknown |
| SPI-1 | SipA or SspA | Decreases the critical concentration of G-actin and increases the stability of F-actin; also induces PMN transepithelial migration and disrupts tight junctions | F-actin, T-plastin |
| SPI-1 | SipB or SspB* | Binds and activates Caspase-1 and induces autophagy in macrophages | Caspase-1, cholesterol |
| SPI-1 | SipC or SspC* | Nucleates and bundles actin | F-actin, KRT-8, KRT-18 |
| SPI-1 | SopA | Stimulates PMN transmigration by HECT-like E3 ubiquitin ligase activity | Unknown |
| SPI-1 | SopB or SigD | Activates Cdc42, RhoG, AktA, and chloride secretion through its inositol phosphatase activity and disrupts tight junctions | Unknown |
| SPI-1 | SopD | Stimulates fluid accumulation in bovine ligated ileal loops and contributes to diarrhoea in calves and systemic disease in mice | Unknown |
| SPI-1 | SopE | Activates Cdc42, Rac1, and RhoG by its GEF activity and disrupts tight junctions | Cdc42, Rac1, Rab5 |
| SPI-1 | SopE2 | Activates Cdc42, Rac1 and RhoG by its GEF activity and disrupts tight junctions | Cdc42, Rac1 |
| SPI-1 | SptP | Inhibits Cdc42 and Rac1 by its GAP activity and MAPK signalling and IL-8 secretion through its tyrosine phosphatase activity | Rac1 |
| SPI-2 | GogB | Unknown | Unknown |
| SPI-2 | PipB | Unknown | Unknown |
| SPI-2 | PipB2 | Contributes to Sif formation | Kinesin-1 |
| SPI-2 | SifA | Induces Sif formation, maintains integrity of the SCV, and downregulates kinesin recruitment to the SCV | SKIP, Rab7 |
| SPI-2 | SifB | Unknown | Unknown |
| SPI-2 | SopD2 | Contributes to Sif formation | Unknown |
| SPI-2 | SpiC* | Interferes with endosomal trafficking | Hook3 |
| SPI-2 | SpvB** | Actin-specific ADP-ribosyltransferase and downregulates Sif formation | Actin |
| SPI-2 | SseF | Contributes to Sif formation and microtubule bundling | Unknown |
| SPI-2 | SseG | Contributes to Sif formation and microtubule bundling | Unknown |
| SPI-2 | SseI or SrfH | Contributes to host-cell dissemination | Filamin, TRIP6 |
| SPI-2 | SseJ | Maintains integrity of the SCV and has deacylase activity | Unknown |
| SPI-2 | SseK1 | Unknown | Unknown |
| SPI-2 | SseK2 | Unknown | Unknown |
| SPI-2 | SseL | Deubiquitinase | Ubiquitin |
| SPI-2 | SspH2 | Inhibits the rate of actin polymerization and contributes to virulence in calves | Filamin, profilin |
| SPI-2 | SteA | Unknown | Unknown |
| SPI-2 | SteB | Unknown | Unknown |
| SPI-2 | SteC | Unknown | Unknown |
| SPI-1 and SPI-2 | SlrP | Contributes to virulence in calves | Unknown |
| SPI-1 and SPI-2 | SspH1 | Inhibits NF-κB signalling and IL-8 secretion, contributes to virulence in calves, and has E3 ubiquitin ligase activity | PKN1 |

* Also a component of the secretion apparatus.
** Has not been definitively shown to be an SPI-2 T3SS effector.

1.5 Heterogeneity in macrophage and bacterial populations

In addition to the specific aspects of infection discussed above that are likely to promote bacterial heterogeneity, there is a growing body of literature suggesting that heterogeneity is a common and important feature of bacterial populations regardless of environmental context. At the transcriptional level, multiple studies have suggested that heterogeneity in the expression of single genes is not only unavoidable, but specifically selected for, particularly in genes with certain functional roles such as carbon metabolism and stress response (Silander et al., 2012). Furthermore, gene networks are often structured in such a way as to amplify this variation, to the extent of creating bistable populations (Rao et al., 2002). This transcriptional heterogeneity presumably contributes to the phenotypic diversity that has been observed even in artificially homogeneous lab environments, such as variations in cell division rates and antibiotic tolerance between isogenic cells in the same population (Balaban et al., 2004). From an evolutionary perspective, this heterogeneity is thought to be a form of "bet hedging" to ensure the survival of some fraction of the population should a certain stress such as starvation or antibiotic stress be encountered (Losick and Desplan, 2008). Alternatively, heterogeneity may be a mechanism to ensure that behaviors that may incur a fitness cost to individual cells but benefit the entire population can be performed at some frequency (Ackermann et al., 2008). These are not mutually exclusive explanations and, of course, are in addition to other causes of heterogeneity such as differences in microenvironments. These evolutionary arguments, combined with the abundance of elements during infection that could be expected to produce heterogeneity, provide a compelling argument for suspecting that the role bacterial heterogeneity plays during infection has been under-appreciated and requires further investigation.

Macrophages and other immune cells have also been suggested to display extensive cell-to-cell variation, which implies that host cell heterogeneity may also play an important role in determining infection outcome. For macrophages, this heterogeneity is due, at least in part, to their diverse physiological roles across different, complex environments in vivo. For example, macrophages residing in different tissues differ widely in characteristics such as cell surface markers, adhesion properties, and response to growth factors (Gordon and Taylor, 2005). These differences suggest that macrophages may respond differently to similar environmental stimuli based on their origin. Even when examining macrophages with similar origins, it must be noted that macrophages are capable of existing in multiple cell states to play roles in diverse functions such as wound repair, homeostatic maintenance, and pathogen response (Martinez et al., 2008). Since many of these roles are mediated by cell-to-cell interactions and cytokine signaling, macrophages in the same region could be expected to respond very differently based on differences in their microenvironment (e.g., changes in cytokine concentrations). Indeed, macrophages have been shown to have a variety of different activation states in vivo, falling between the two extremes of pro-inflammatory M1 "classical" activation and tissue-remodeling M2 "alternative" activation (Martinez et al., 2008), based on stimulation with different cytokines. Even when examining macrophages in artificially similar microenvironments, such as in simple tissue culture models, macrophages have long been suspected to display heterogeneity in their ability to phagocytize pathogens (Gog et al., 2012). Additionally, although direct evidence in macrophages is lacking, dendritic cells, a closely related immune cell type, have been shown to generate different transcriptional responses to identical immune ligands in tissue culture models (Shalek et al., 2014).

1.6 The need for a new approach to study infection

The evidence suggesting that transcriptional variation plays a significant, functional role in infection extends beyond characteristics of infection that are likely to promote heterogeneity and the prominent role that heterogeneity plays in macrophage and bacterial populations. Studies of tissue culture models have demonstrated that S. enterica infection gives rise to a diversity of phenotypes, all existing simultaneously in the same culture, as might be expected from the interactions of diverse populations. Across a wide range of MOIs, macrophages differ in intracellular bacterial burden from zero to many bacteria (Helaine and Holden, 2013). Intracellular bacteria can take on a variety of states including non-viable, quiescent, and rapidly dividing (Helaine and Holden, 2013; Claudi et al., 2014). Finally, macrophages have a variety of possible fates including survival, apoptosis, necrosis, and pyroptosis (Monack et al., 1996; Brennan and Cookson, 2000).

Despite evidence that transcriptional heterogeneity exists during infection, the mechanisms behind these different outcomes remain obscure. Understanding how this variability is generated could inform our basic biological understanding of how host cells integrate extracellular and intracellular pathogen sensing to determine cell fate and how bacteria regulate different virulence mechanisms in the intracellular environment. It could also highlight key interactions that drive the fate of cellular host-pathogen encounters, leading to new treatment options. Thus, the next chapter describes our efforts to build and apply a system designed to profile transcriptional heterogeneity in host cells during infection and correlate this with phenotypic markers of infection outcome.

Chapter 2

Examining S. enterica-Macrophage Interactions at the Single-Cell Level

2.1 Attributions

Significant portions of this chapter were published as a manuscript in Cell: Avraham et al. (2015) or adapted from this manuscript.

Wetlab experiments were designed and executed in collaboration with Roi Avraham.

2.2 Introduction

Interactions between a pathogen and its host involve both a complex virulence program executed by pathogens and activation of an orchestrated defense response by the host (Schwan et al., 2000).

These interactions are usually measured in populations of cells, masking cell-to-cell variation that may be important for infection outcome. One of the best-studied cellular models of the host-pathogen interaction is infection of macrophages with the enteric pathogen S. typhimurium. In a single population, both in vitro and in vivo, S. typhimurium, like other bacterial pathogens, has been shown to display significant cell-to-cell variation in attributes such as growth rate, expression of virulence factors, and sensitivity to antibiotics (Claudi et al., 2014). Similarly, macrophages and other innate immune cells have been observed to display extensive cell-to-cell variation upon exposure to even homogeneous ligands (Shalek et al., 2014).

The heterogeneous, stochastic, and dynamic nature of both macrophage and bacterial populations suggests that their interaction is likely to result in a variety of subpopulations with different, complex phenotypes (Helaine et al., 2010). Indeed, infection of macrophages with S. typhimurium generates well-documented diverse outcomes: some macrophages engulf the bacteria, while others remain uninfected (McIntrye et al., 1967); some macrophages lyse the ingested bacteria, while others are permissive to intracellular bacterial survival (McIntrye et al., 1967); some macrophages will undergo cell death with bacterial release (Monack et al., 1996), while others survive and allow bacteria to multiply or persist intracellularly (Helaine et al., 2010). Despite longstanding observations of these diverse outcomes, however, we currently lack an understanding of the underlying molecular mechanisms in either the host or pathogen.

Understanding how macrophages integrate signals from bacterial PAMPs to determine cell fate, and how bacteria regulate different virulence strategies to optimize pathogenicity in the host environment, is fundamental to infection biology and to finding novel treatment options for infectious disease. Understanding the basis and significance of heterogeneity could inform strategies that result in a more beneficial outcome to the host.

Until recently, the tools for investigating transcriptional heterogeneity in a high-throughput manner have been lacking. However, the advent of single-cell RNA-Seq presents an opportunity to gain such a high-resolution map of infection, albeit only for host macrophages at the moment, because bacterial messenger transcripts lack polyadenylation and are relatively low in abundance. To make full use of this transcriptional information, however, we need to be able to correlate it with infection phenotype. Thus, we first aimed to develop a fluorescent system for examining heterogeneity during bacterial exposure.

2.3 Results

2.3.1 Heterogeneous outcomes of S. enterica-macrophage encounters

To quantitatively characterize outcomes of individual S. typhimurium-macrophage interactions, we developed a fluorescent system using GFP-expressing bacteria stained with the red dye pHrodo (Experimental Procedures), which binds to the cell wall of bacteria and increases in fluorescence in the low pH environment of macrophage lysosomes. In the early stages after S. typhimurium challenge, there are three possible outcomes (figs. 2-1 A and 2-1 B): (1) no infection, (2) infection with intracellular survival of a bacterium, and (3) infection resulting in an intracellular dead bacterium. While live bacteria display both red and green fluorescence, dead bacteria fluoresce only red due to degradation of GFP. Exposed but uninfected macrophages do not fluoresce (fig. 2-1 B). Importantly, using the GFP and pHrodo reporters we could distinguish cells that had been initially infected but cleared the infecting bacterium (pHrodo+, GFP-) from those that had never been infected (pHrodo-, GFP-). We used this system to follow mouse BMMs exposed to pHrodo-stained, GFP-expressing S. typhimurium at a multiplicity of infection (MOI) of 1:1 for 24 hours. We used a low MOI to ensure that infected macrophages are generally infected with only one bacterium.
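The two-reporter logic described above can be sketched as a small gating function. This is only an illustration: the gate thresholds, the function name `classify_cell`, and the fluorescence readings are hypothetical, and real gates would be set on the instrument from unexposed control cells.

```python
from collections import Counter

# Hypothetical gate thresholds (arbitrary fluorescence units), assumed for
# illustration only; they are not taken from the thesis.
GFP_GATE = 1000.0
PHRODO_GATE = 1000.0


def classify_cell(gfp, phrodo, gfp_gate=GFP_GATE, phrodo_gate=PHRODO_GATE):
    """Map one cell's (GFP, pHrodo) signal to an infection phenotype."""
    gfp_pos = gfp >= gfp_gate
    phrodo_pos = phrodo >= phrodo_gate
    if phrodo_pos and gfp_pos:
        return "infected_live"   # pHrodo+, GFP+: live intracellular bacterium
    if phrodo_pos:
        return "infected_dead"   # pHrodo+, GFP-: bacterium killed, GFP degraded
    if gfp_pos:
        return "unclassified"    # GFP+, pHrodo-: not one of the three outcomes
    return "uninfected"          # pHrodo-, GFP-: never infected


# Made-up (GFP, pHrodo) readings for five cells.
cells = [(50, 40), (5000, 8000), (80, 6000), (30, 20), (7000, 9000)]
counts = Counter(classify_cell(g, p) for g, p in cells)
print(counts)
```

The `unclassified` branch is a design choice of this sketch: a GFP-positive, pHrodo-negative signal is not among the three outcomes described in the text, so it is flagged rather than forced into one of them.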

Microscopy and FACS revealed diverse phenotypes, including uninfected cells and cells infected with single or multiple, live (yellow) or dead (red) bacteria, as has been previously described (figs. 2-1 C to 2-1 E; McIntrye et al. 1967). This variability is neither simply a transient phenomenon nor a mere outcome of the specific MOI chosen, since it is sustained throughout the 24-hour time course (fig. 2-1 D) and with increasing MOI (fig. 2-1 F). To better quantify bacterial burden in single cells, we sorted macrophages according to fluorescence phenotype and enumerated the number of intracellular bacteria by plating for colony forming units (CFU) (Experimental Procedures). As expected, no viable bacteria were recovered from uninfected cells or from pHrodo+, GFP- cells (infected with dead bacteria). GFP+, pHrodo+ cells contained a range of bacteria (fig. 2-1 G), which correlated with GFP intensity (fig. 2-1 H), and showed a reduction in bacterial burden during the 24-hour time course, similar to other studies using primary cells infected at low MOIs (Monack et al., 1996; Schwan et al., 2000). Thus, individual host cells may vary widely in their ability to phagocytose bacteria and/or restrict bacterial growth after uptake.

2.3.2 Single-cell RNA-Seq accurately captures host transcriptional states after bacterial exposure

Having confirmed that our fluorescent system can accurately distinguish diverse phenotypes during bacterial exposure, we next sought to verify that single-cell RNA-Seq accurately reflects transcriptional changes after bacterial exposure. Using SMART-Seq (Trombetta et al., 2014) and our fluorescent labeling system detailed above, we prepared RNA-Seq libraries from single macrophages and from matching controls of 150 sorted cells at 3 time points following bacterial exposure. We also generated a similar time course from bulk samples (5x10^5 cells) using Illumina's TruSeq library construction method. Across all genes, we found good agreement in RSEM-generated gene expression estimates among all three datasets (figs. 2-2 A and 2-2 B), with some loss of low abundance transcripts in single-cell libraries, as has been reported previously.

Figure 2-1: Heterogeneous outcomes of BMM-S. typhimurium encounters. (A) Diagram indicating the nomenclature of macrophage infection states. (B) Schematic representation of the experimental model, using BMMs exposed to pHrodo-labeled, GFP-expressing S. typhimurium. (C) Representative images of mouse BMMs exposed to S. typhimurium reveal heterogeneity in infection phenotype, including uninfected macrophages and infected macrophages containing live (yellow) or dead (red) bacteria at early (4 hours; top) and late (24 hours; bottom) time points. (D) Images from (C) were analyzed to enumerate the number of bacteria per infected cell. (E) FACS analysis of BMMs exposed to fluorescently labeled S. enterica (unexposed, left; exposed for 4 hours, right). (F) BMMs were exposed to GFP-expressing S. typhimurium labeled with pHrodo at the indicated MOIs (x-axis). Percent of infected cells (GFP or pHrodo positive, y-axis) was analyzed by FACS. (G) CFU enumerated from individual fluorescently labeled macrophages. Unexposed, uninfected, and pHrodo+, GFP- cells had no or minimal surviving bacteria. GFP+ cells contain different numbers of bacteria over time (left y-axis). The red line indicates the percentage of pHrodo-only infected cells, demonstrating the increase in the number of dead bacteria over time (right y-axis). (H) BMMs were exposed to S. typhimurium at an MOI of 2 5, and, after 4 hours, single cells with low, medium, or high GFP fluorescence were sorted into 96-well plates. Cell lysates were plated on LB agar and intracellular bacteria were enumerated by CFU.

To focus our comparison more on transcriptional changes resulting from bacterial exposure, we first needed to identify a statistically rigorous method for calling differential expression in single-cell data¹. Single-cell RNA-Seq expression estimates are thought to follow a zero-inflated negative binomial distribution instead of the typical negative binomial distribution of bulk RNA-Seq experiments (Shalek et al., 2014; Kharchenko et al., 2014). This difference likely arises from the loss of transcripts during the initial RNA isolation procedure, which leaves many more transcripts with no mapped reads than would otherwise be found. This difference may confound some standard methods, which typically assume a traditional negative binomial distribution. To mitigate these effects, we first applied stringent filtering to single-cell libraries, excluding libraries with low absolute read counts and few detected genes, thus removing cells that would depart most significantly from a negative binomial distribution (see Materials and Methods). We also tested a number of established statistical approaches for calling differential expression to evaluate their behavior on single-cell data.
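The library filtering step can be sketched as follows. The cutoffs, function names, and toy counts are hypothetical (the actual thresholds are given in Materials and Methods, not here), and the sketch assumes a library is kept only if it passes both criteria.

```python
def passes_qc(counts, min_reads=100_000, min_genes=2_000):
    """Return True if a single-cell library has enough reads and detected genes.

    counts: dict mapping gene name -> mapped read count for one cell.
    min_reads, min_genes: hypothetical cutoffs, not the values used in the thesis.
    """
    total_reads = sum(counts.values())
    detected_genes = sum(1 for c in counts.values() if c > 0)
    return total_reads >= min_reads and detected_genes >= min_genes


def filter_libraries(libraries, **cutoffs):
    """Keep only the cells whose libraries pass both QC cutoffs."""
    return {cell: counts for cell, counts in libraries.items()
            if passes_qc(counts, **cutoffs)}


# Toy example with tiny cutoffs so the numbers are easy to follow.
libraries = {
    "cell_1": {"Tnf": 40, "Nos2": 25, "Actb": 60},  # 125 reads, 3 genes detected
    "cell_2": {"Tnf": 0, "Nos2": 0, "Actb": 9},     # 9 reads, 1 gene detected
    "cell_3": {"Tnf": 0, "Nos2": 0, "Actb": 200},   # 200 reads, 1 gene detected
}
kept = filter_libraries(libraries, min_reads=100, min_genes=2)
print(sorted(kept))
```

In this toy input, `cell_3` has plenty of reads but almost all genes at zero, which is the kind of library that departs most from a plain negative binomial model.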

We first identified a "gold standard" set of differentially expressed genes that were agreed upon by all tested methods in our bulk time course (see Materials and Methods). We then tested the ability of these algorithms to recover these same transcripts from our single-cell data without identifying spurious transcripts. The results are shown in fig. 2-2 C as a receiver operator characteristic (ROC) curve, which shows the true positive rate achieved at each possible false positive rate. As can be seen, DESeq (Anders and Huber, 2010) gives the best results, with an area under the curve (AUC, a metric that positively correlates with performance and takes a value between 0 and 1) of 0.766. To further demonstrate that DESeq would not give false positive results when working with single-cell data, as these errors would be particularly concerning, we randomly assigned single infected cells from our four hour time point into one of two categories and tested for differentially expressed genes between these two categories. DESeq correctly identified zero differentially expressed genes at a false discovery rate (FDR) of 0.05, demonstrating that this method is robust to the noise inherent in single-cell data and is thus a reasonable method for calling differentially expressed genes.

¹ Statistical analysis methods for single-cell RNA-Seq have been emerging recently (Kharchenko et al. 2014, for example) but were not available when this project was started. Thus we adapted statistical methods from traditional RNA-Seq, as explained, for our needs.

Figure 2-2: Single-cell RNA-Seq accurately captures transcriptional changes associated with bacterial exposure. (A) Scatter plot of expression estimates (log2 TPM) of mouse BMMs using bulk RNA-Seq libraries (mean of 2 replicates) prepared from 5x10^5 cells (x-axis) and libraries generated from 150 FACS-sorted cells (mean of 3 replicates, y-axis). (B) Scatter plot of expression estimates from sorted populations as described in A (x-axis) and the mean expression of 67 single FACS-sorted cells (y-axis). (C) Receiver operator characteristic curve for standard bulk methods of differential expression testing on single-cell data. Methods are assessed based on a gold standard derived from bulk experiments. Methods with better performance have curves that are shifted towards the upper left of the plot. DESeq displays the best sensitivity overall. (D) The number of significantly upregulated genes as detected by DESeq at 4 and 8 hours after bacterial exposure using the different conditions described in A and B. (E) Heatmap of genes identified as significantly upregulated in single cells after 4 hours of bacterial exposure in bulk, sorted, and single-cell libraries. Genes detected as significantly upregulated at the single-cell level show similar patterns in bulk and sorted samples.
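The AUC used to compare methods in fig. 2-2 C can be computed directly from per-gene scores and gold-standard labels. The sketch below uses the rank-sum identity (AUC equals the probability that a randomly chosen gold-standard gene outscores a randomly chosen non-gold-standard gene); the function name, scores, and labels are invented for illustration and are not the thesis data.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity.

    scores: higher means "more likely differentially expressed"
            (for example, -log10 of a p-value).
    labels: True if the gene is in the gold-standard set, else False.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative gene")
    # Count positive/negative pairs that are ranked correctly; ties count half.
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up scores for six genes; three are in a hypothetical gold standard.
scores = [9.1, 7.4, 0.3, 5.0, 1.2, 0.8]
labels = [True, True, False, False, True, False]
print(roc_auc(scores, labels))
```

Here eight of the nine positive/negative pairs are ordered correctly, so the AUC is 8/9. A method that ranked genes at random would score about 0.5.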

Using DESeq, we find good agreement in differentially expressed genes after bacterial exposure at the bulk, sorted, and single-cell levels. As shown in fig. 2-2 D, over 500 genes are detected at the single-cell level at both four and eight hours after bacterial exposure. While this represents a loss of sensitivity, particularly compared to bulk samples, the genes identified at the single-cell level behave similarly at the bulk and sorted levels (fig. 2-2 E), indicating that these genes are not false positives. Furthermore, this loss of sensitivity is offset by the additional information on gene variance and cell subpopulations that can be obtained from single-cell experiments. To summarize, single-cell RNA-Seq accurately captures both global transcriptional patterns and changes in expression following bacterial exposure in host macrophages.

2.3.3 Single-cell RNA-Seq identifies transcriptional changes associated with extracellular and intracellular bacterial detection

Single-cell transcriptional profiles clearly distinguished cells with different phenotypic states. We used a list of 535 genes that are upregulated in exposed macrophages in our single-cell libraries (DESeq p < 0.05 and fold change > 2, Experimental Procedures) to perform principal component analysis (PCA) on the single-cell expression data. PCA clearly distinguished both between exposed and unexposed macrophages (mostly on PC1) and between infected and uninfected macrophages (mostly on PC2, fig. 2-3 A). Taken together, the ability to distinguish these different phenotypes suggests that some pathways respond primarily to extracellular cues of bacterial presence, while others respond to intracellular cues.
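As a rough illustration of how a principal component separates groups of cells, the sketch below extracts only the first component by power iteration on mean-centered data. It is not the PCA implementation used for fig. 2-3 A; the function name, the toy expression matrix, and the choice of power iteration are all assumptions made to keep the example dependency-free.

```python
import math
import random


def first_principal_component(rows, iterations=200, seed=0):
    """Return (unit loading vector, per-cell scores) for PC1.

    rows: list of cells, each a list of expression values (one per gene).
    Power iteration repeatedly applies X^T X to a start vector, which
    converges to the direction of greatest variance.
    """
    n_cells, n_genes = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n_cells for j in range(n_genes)]
    centered = [[r[j] - means[j] for j in range(n_genes)] for r in rows]

    rng = random.Random(seed)  # random start avoids symmetric dead ends
    vec = [rng.uniform(-1.0, 1.0) for _ in range(n_genes)]
    for _ in range(iterations):
        proj = [sum(c[j] * vec[j] for j in range(n_genes)) for c in centered]
        new = [sum(centered[i][j] * proj[i] for i in range(n_cells))
               for j in range(n_genes)]
        norm = math.sqrt(sum(v * v for v in new))
        if norm == 0.0:
            raise ValueError("data have no variance along the start vector")
        vec = [v / norm for v in new]
    scores = [sum(c[j] * vec[j] for j in range(n_genes)) for c in centered]
    return vec, scores


# Toy log-expression matrix: 3 unexposed cells, then 3 exposed cells.
# Genes 1 and 2 are induced on exposure; gene 3 is unchanged.
cells = [
    [1.0, 1.2, 5.0], [0.8, 1.0, 5.1], [1.1, 0.9, 4.9],
    [6.0, 5.8, 5.0], [6.2, 6.1, 5.2], [5.9, 6.0, 4.8],
]
loadings, scores = first_principal_component(cells)
unexposed_scores, exposed_scores = scores[:3], scores[3:]
print(loadings, scores)
```

The sign of a principal component is arbitrary, so what matters is that the two groups fall on opposite sides of zero along PC1 and that the unchanged gene contributes almost nothing to the loading vector.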

To better understand these distinct responses, we calculated a metric we term the "intracellular to extracellular response ratio", reflecting the magnitude of induction accounted for by bacterial infection versus extracellular bacterial exposure (fig. 2-3 B, Experimental Procedures). We then classified genes based on their mode of response: Cluster I contains genes that respond primarily to extracellular cues of bacterial presence, and Cluster II contains genes that respond primarily to intracellular cues. Supporting our classification, many Cluster I genes (responding to bacterial exposure, i.e., PC1 above) are known to be associated with the classic LPS response (e.g., Tnf and Nf-κb, table 2.1) and many Cluster II genes (responding to intracellular bacteria, i.e., PC2 above) are known to be associated with antibacterial defense (e.g., Nos2 and Il12b). While Cluster I was relatively stable across different time points, many genes in Cluster II were also found to be induced in uninfected cells at later time points (fig. S1 A). At these early time points (8 hours) we did not detect differences between pHrodo+, GFP+ and pHrodo+, GFP- infected cells, so these groups were merged for further analysis.

Figure 2-3: Transcriptional signatures associated with subpopulations of exposed macrophages. (A) Single macrophages have distinct transcriptional responses depending on infection phenotype. 89 single cells from fig. 2-1 E were analyzed by RNA-Seq and principal component analysis. Shown are the first two principal components (PC1 and PC2, 5 and 3 percent of the total variation, respectively). (B) Expression levels of genes (rows) in single BMMs (columns) were measured using single-cell RNA-Seq after exposure to S. typhimurium, grouped by their infection phenotype (unexposed (white, n = 23), uninfected (grey, n = 24), infected (green, n = 42)). Genes are categorized into two clusters as described. The number of genes in each cluster is denoted next to the heat map. Genes are arranged by the intracellular to extracellular ratio (IC/EC ratio); left bars indicate the distribution of scores for each cluster. (C) Correlation plot showing the Pearson correlation coefficient of expression values (log2 TPM) for significant clusters identified by WGCNA (see Experimental Procedures) across all single cells for each time point. (D) Heatmap showing the bimodal behavior of Cluster III genes across infected cells after four hours of bacterial exposure. Cells in (B) and (D) are sorted according to average expression of Cluster III.
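The exact definition of the intracellular to extracellular response ratio is given in Experimental Procedures and is not reproduced in this section. Purely as an illustration of the idea, the sketch below uses an assumed stand-in: the share of a gene's total induction (unexposed to infected) that appears only in infected cells. The function names, the 0.5 cutoff, and the expression values are all hypothetical.

```python
def intracellular_fraction(unexposed, uninfected, infected):
    """Share of a gene's total induction that appears only in infected cells.

    Arguments are mean log2 expression values in the three groups.
    This is an illustrative definition, NOT the metric used in the thesis.
    """
    total = infected - unexposed           # induction from unexposed to infected
    if total <= 0:
        raise ValueError("gene is not induced in infected cells")
    intracellular = infected - uninfected  # extra induction seen only on infection
    return intracellular / total


def assign_cluster(fraction, cutoff=0.5):
    """Cluster I: mostly extracellular response; Cluster II: mostly intracellular."""
    return "Cluster II" if fraction > cutoff else "Cluster I"


# Made-up mean log2 TPM values: (unexposed, exposed-uninfected, infected).
genes = {"geneA": (1.0, 6.0, 6.5), "geneB": (0.5, 1.5, 7.5)}
results = {}
for name, (u, e, i) in genes.items():
    frac = intracellular_fraction(u, e, i)
    results[name] = (frac, assign_cluster(frac))
    print(name, round(frac, 2), results[name][1])
```

In this toy input, `geneA` is already almost fully induced in exposed but uninfected cells, so it behaves like a Cluster I gene, whereas most of the induction of `geneB` requires an intracellular bacterium.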

2.3.4 Bimodal induction of Type I IFN response genes in infected macrophages

It has been previously suggested that immune networks may be structured to produce subpopulations of cells with distinct physiologies (Jin et al., 2014). Thus, we sought to use our single-cell data to identify novel subpopulations of macrophages displaying distinct transcriptional behaviors.

We used weighted gene correlation network analysis (WGCNA), an algorithm based on pairwise gene-gene correlations that is specifically designed to identify gene sets that change in a concerted fashion across multiple samples (Langfelder and Horvath 2008, Experimental Procedures). Because this analysis was not initially designed for single-cell data and may be unduly influenced by the noise inherent in these experiments, we made a few modifications to the typical workflow. First, we used a nonparametric measure of correlation (Spearman's ρ) when calculating initial correlation values.

Second, to remove false positive results, we analyzed each time point separately and filtered for gene clusters that exhibited co-regulation in at least two time points.
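The two modifications can be illustrated on a single gene pair. The sketch below is not WGCNA itself: it only computes Spearman's ρ (Pearson correlation of ranks, with ties given average ranks) and counts the time points at which a pair is co-regulated. The 0.8 cutoff, the function names, and the expression values are assumptions for illustration.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    if sx == 0.0 or sy == 0.0:
        return 0.0
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed values."""
    return pearson(average_ranks(x), average_ranks(y))


def coregulated_timepoints(gene_a, gene_b, cutoff=0.8):
    """Count time points at which two genes have Spearman's rho >= cutoff."""
    return sum(1 for t in gene_a if spearman(gene_a[t], gene_b[t]) >= cutoff)


# Made-up expression of two genes in five cells at each time point.
gene_a = {"2.5h": [0, 1, 5, 9, 2], "4h": [0, 0, 7, 8, 3], "8h": [2, 0, 6, 9, 4]}
gene_b = {"2.5h": [3, 1, 0, 2, 8], "4h": [1, 1, 9, 12, 4], "8h": [3, 1, 20, 15, 5]}
n_points = coregulated_timepoints(gene_a, gene_b)
print(n_points, n_points >= 2)
```

Because ranks ignore the magnitude of outlying values, a single cell with extreme expression (such as the value 20 at 8 hours) shifts the correlation far less than it would under Pearson correlation on the raw values.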

We identified three gene clusters (Clusters III, IV, and V) that met these criteria (figs. 2-3 C and S1 B and tables S2 and S3). Cluster III was particularly interesting as it was significantly enriched for the Type I IFN response (table 2.1), which has previously been shown to play a role in non-canonical inflammasome activation in response to S. typhimurium exposure (Rathinam et al., 2012). Cluster III is induced in approximately one third of infected macrophages beginning at 4 hours post exposure (fig. 2-3 D) and continues to show bimodal expression at 8 hours, suggesting that this induction is not a transient phenomenon (fig. S1 B). Notably, this cluster is also induced in uninfected cells at 8 hours. This may not be surprising, given that interferon is a secreted soluble factor that may result in a non-cell-autonomous induction of this cluster later in uninfected cells (Honda and Taniguchi, 2006). Cluster IV is enriched for cell-cycle genes (table 2.2), is bimodal in unexposed cells, and decreases in expression upon exposure. It does not differentiate between uninfected, pHrodo+,GFP- or pHrodo+,GFP+ cells at any time point (Mann-Whitney test, p > 0.05). Cluster V is highly expressed in all unexposed cells and has reduced expression in some cells upon exposure (becomes bimodal).

Table 2.1: Enriched Ingenuity pathways in Clusters I, II, and III

| Cluster | Pathway | -log10 p-value |
|---|---|---|
| I | TNFR2 Signaling | 11.80 |
| I | CD40 Signaling | 8.70 |
| I | Production of Nitric Oxide and Reactive Oxygen Species in Macrophages | 7.43 |
| I | IL-10 Signaling | 7.40 |
| I | iNOS Signaling | 7.07 |
| I | NF-κB Activation by Viruses | 7.07 |
| I | Toll-like Receptor Signaling | 7.00 |
| I | Death Receptor Signaling | 6.94 |
| I | NF-κB Signaling | 6.90 |
| I | Role of Macrophages, Fibroblasts and Endothelial Cells in Rheumatoid Arthritis | 6.80 |
| II | Altered T Cell and B Cell Signaling in Rheumatoid Arthritis | 11.20 |
| II | Communication between Innate and Adaptive Immune Cells | 9.71 |
| II | Granulocyte Adhesion and Diapedesis | 9.70 |
| II | Role of Hypercytokinemia/hyperchemokinemia in the Pathogenesis of Influenza | 8.57 |
| II | Role of Macrophages, Fibroblasts and Endothelial Cells in Rheumatoid Arthritis | 8.53 |
| II | TREM1 Signaling | 8.01 |
| II | Type I Diabetes Mellitus Signaling | 7.65 |
| II | Dendritic Cell Maturation | 7.58 |
| II | Role of IL-17F in Allergic Inflammatory Airway Diseases | 7.21 |
| II | Role of Pattern Recognition Receptors in Recognition of Bacteria and Viruses | 7.05 |
| III | Interferon Signaling | 10.80 |
| III | Activation of IRF by Cytosolic Pattern Recognition Receptors | 8.67 |
| III | Role of Pattern Recognition Receptors in Recognition of Bacteria and Viruses | 5.20 |
| III | Role of PKR in Interferon Induction and Antiviral Response | 4.18 |
| III | Role of RIG1-like Receptors in Antiviral Innate Immunity | 3.97 |
| III | Xanthine and Xanthosine Salvage | 2.26 |
| III | Role of JAK1, JAK2 and TYK2 in Interferon Signaling | 2.12 |
| III | Guanine and Guanosine Salvage I | 1.96 |
| III | Adenine and Adenosine Salvage I | 1.96 |
| III | UVA-Induced MAPK Signaling | 1.90 |

Table 2.2: Enriched Ingenuity pathways in Clusters IV and V

| Cluster | Pathway | -log10 p-value |
|---|---|---|
| IV | Cell Cycle: G2/M DNA Damage Checkpoint Regulation | 11.10 |
| IV | Role of BRCA1 in DNA Damage Response | 9.73 |
| IV | Hereditary Breast Cancer Signaling | 9.40 |
| IV | Mitotic Roles of Polo-Like Kinase | 9.25 |
| IV | Cell Cycle Control of Chromosomal Replication | 8.03 |
| IV | ATM Signaling | 7.79 |
| IV | Role of CHK Proteins in Cell Cycle Checkpoint Control | 6.11 |
| IV | Mismatch Repair in Eukaryotes | 5.86 |
| IV | RAN Signaling | 5.68 |
| IV | DNA Double-Strand Break Repair by Homologous Recombination | 4.84 |
| V | CTLA4 Signaling in Cytotoxic T Lymphocytes | 2.01 |
| V | Factors Promoting Cardiogenesis in Vertebrates | 1.93 |
| V | Retinoate Biosynthesis I | 1.90 |
| V | Cell Cycle Regulation by BTG Family Proteins | 1.83 |
| V | L-dopachrome Biosynthesis | 1.83 |
| V | Oleate Biosynthesis II (Animals) | 1.82 |
| V | nNOS Signaling in Skeletal Muscle Cells | 1.70 |
| V | Oxidative Ethanol Degradation III | 1.64 |
| V | Putrescine Degradation III | 1.59 |
| V | Tryptophan Degradation X (Mammalian, via Tryptamine) | 1.55 |

We verified representative expression patterns in Cluster II and III genes using single-molecule RNA fluorescence in situ hybridization (FISH). Using pHrodo to identify infected cells, we confirmed both the induction of Cluster II in all infected cells (e.g., Il1b, Il12b, and Nos2) and the bimodal induction of Cluster III (e.g., Irf7 and Ifit2) in infected cells (fig. S1 D). This method also allowed us to directly verify that the expression of Cluster III was not correlated with GFP fluorescence, indicating that the heterogeneity we observe is not merely due to differences in bacterial burden (fig. S1 C). It is important to note that single-cell analysis was required to identify the induction of Cluster III between infected and uninfected cells. Analyzing sorted populations (fig. S1 E) failed to identify these genes, as they are not highly induced when averaged over all cells.

2.3.5 Infected macrophages display high cell-to-cell variation in genes from immune response pathways

Motivated by the high variation of the Type I IFN response between infected cells, we next wanted to test whether immune-responsive pathways in general show high variation between infected cells. In RNA-Seq data, gene variance and average expression are highly correlated, such that using a simple variance calculation to select variable genes would be biased toward selecting abundant genes. We therefore developed a scoring system in which we use local linearized regression to estimate the expected mean-variance relationship in our samples. For each gene, we subtract the variance predicted from its mean expression from the gene's actual variance, thus removing the confounding effects of this correlation (Experimental Procedures, fig. 2-4 A).

We then tested pathways from the Kyoto Encyclopedia of Genes and Genomes (KEGG) (as annotated in MSigDB) for enrichment in variable genes using GSEA (Experimental Procedures), and identified those pathways that are consistently variable across multiple time points post exposure in infected macrophages. As expected from previous reports (Shalek et al., 2013), many pathways associated with housekeeping functions, such as ribosome function and oxidative phosphorylation, show consistently low variation. On the other hand, many pathways involved in the immune response, including Toll-like receptor signaling, cytokine-cytokine receptor interactions, and RIG-I-like receptor signaling, show consistently high variance up to at least 8 hours after bacterial infection (table 2.3). It should be noted that only infected cells were used in this analysis, so the differences in gene expression between infected and uninfected cells noted above did not contribute to this signal. Furthermore, in infected cells, at all time points evaluated, genes induced primarily by the intracellular bacterial signals of infection (Cluster II) were more variable than those induced by extracellular exposure cues (Cluster I; figs. 2-4 B and 2-4 C). This difference suggests that within a seemingly homogeneous population of infected cells there exists extensive cell-to-cell variation in the response to infection. This variation is characteristic of responses to intracellular cues of infection more than those to extracellular cues, possibly due to variability in intracellular bacterial state, bacterial burden, or bacterial clearance.

Table 2.3: High and low variance pathways in infected macrophages

High Variance
FWER p-value  Pathway
0             KEGG TOLL LIKE RECEPTOR SIGNALING PATHWAY
0             KEGG CYTOKINE CYTOKINE RECEPTOR INTERACTION
0             KEGG RIG I LIKE RECEPTOR SIGNALING PATHWAY
0             KEGG PPAR SIGNALING PATHWAY
0.01          KEGG LEISHMANIA INFECTION
0.01          KEGG STARCH AND SUCROSE METABOLISM
0.01          KEGG NOD LIKE RECEPTOR SIGNALING PATHWAY
0.02          KEGG TYPE II DIABETES MELLITUS
0.02          KEGG ABC TRANSPORTERS
0.04          KEGG GAP JUNCTION

Low Variance
FWER p-value  Pathway
0             KEGG RIBOSOME
0             KEGG OXIDATIVE PHOSPHORYLATION
0             KEGG PARKINSONS DISEASE
0             KEGG HUNTINGTONS DISEASE
0             KEGG CARDIAC MUSCLE CONTRACTION
0             KEGG ALZHEIMERS DISEASE
0             KEGG PROTEASOME

[Figure 2-4, panels A-C. (A) Per-gene and best-fit variance estimates plotted against log2 mean expression in infected cells. (B) Single-cell expression distributions in infected cells for Rpl35, Tnf (Cluster I), Nos2 (Cluster II), and Ifit2 (Cluster III). (C) Box plots of variance score for the exposure response (Cluster I) and the infection response (Cluster II) at 2.5, 4, and 8 hrs.]

Figure 2-4: The high variation of immune pathways in infected macrophages. (A) Highly variable genes in infected cells are enriched for immune response pathways (table 2.3). Localized regression was used to estimate the mean-variance relationship for genes in infected macrophages. Genes were assigned a variance score based on distance from the fitted relationship (solid line). (B) Representative examples of single-cell gene expression distributions in infected cells from housekeeping genes and genes from Clusters I, II and III. (C) Shown are box plots of variance score for either exposure- (Cluster I) or infection-response genes (Cluster II), at three time points following infection. Infection-response genes have significantly higher variance than exposure-response genes (p<0.01 by the Wilcoxon rank-sum test).

2.4 Discussion

2.4.1 A general approach to characterize the transcriptional underpinnings of phenotypic heterogeneity in host-pathogen encounters

Heterogeneity between individual cells is a common feature of dynamic cellular processes, including signaling, transcription, and cell fate (Elowitz et al., 2002). Phenotypic heterogeneity has similarly long been observed as an important feature of infection resulting from individual cellular encounters that involve highly dynamic, adaptable cells and bacteria. However, to date, tools for probing the variation in host-pathogen interactions have been limited and studies of host-pathogen interactions have relied on bulk, population-level measurements. Thus, the specific mechanisms underlying this phenotypic heterogeneity remain largely unknown.

Here, we present a generalizable approach to identify and characterize transcriptional heterogeneity in host cells that may underlie the phenotypic variation of infection by directly probing individual interactions between host cells and bacteria. By using fluorescent markers to map infection phenotypes to transcriptional states, we provide a mechanism to prioritize heterogeneity that may contribute to specific outcomes of interest. This is particularly important as biological networks have, in many cases, evolved to minimize the effects of transcriptional noise (Rao et al., 2002), implying that many forms of heterogeneity may not have obvious phenotypes. Importantly, this system could readily be expanded by applying both new fluorescent markers, possibly interrogating different aspects of infection such as host cell viability, and different tissue culture models of infection, greatly expanding our understanding of the basic biology surrounding infection.

2.4.2 A single-cell resolution map of S. typhimurium-macrophage interactions

We demonstrated the usefulness of understanding infection at a single-cell level by applying our system to individual S. typhimurium-macrophage encounters. In this model system, we were able to specifically delineate host transcriptional responses arising from extracellular and intracellular detection of bacteria. Although transcriptional differences have been reported previously between infected host cells and "bystander" cells in other systems (Kasper et al., 2010), we provide a comprehensive analysis of these differences across multiple time points. We were also able to show that a gene's variation across infected cells is tied to its regulatory mechanism (intracellular vs. extracellular bacterial detection). One could imagine many possible reasons for the increased variation in genes dependent on intracellular detection. These genes, for example, may be responding to infection phenotypes that we did not explicitly label, such as differences in host cell fate or maximal bacterial burden. Another intriguing possibility is that host cells may be responding to differences in bacterial behavior.

In addition to identifying transcriptional heterogeneity associated with fluorescently labeled populations, we were also able to identify novel subpopulations of cells defined by the expression of certain heterogeneous pathways. The most interesting of these was a subpopulation of infected macrophages expressing genes associated with the Type I IFN response. Heterogeneity in the activity of the Type I IFN response has been reported previously in LPS-stimulated dendritic cells (Shalek et al., 2014); however, to our knowledge, it has not previously been reported in the context of bacterial infection. Additionally, the root causes of this heterogeneity and its biological relevance remain poorly understood. We pursue these questions further in the next chapter.

Before proceeding, it should be noted that despite our success in differentiating the transcription signatures of infected and uninfected macrophages, we were unable to identify unique transcriptional signatures distinguishing macrophages infected with live (GFP+) vs. dead (GFP-) bacteria. Such genes would obviously be of interest for determining what permits host cells to successfully control pathogens. We suggest that our failure to find an obvious signature emphasizes the complex nature of cellular infection and likely indicates that many diverse attributes contribute to infection outcome. For example, one could imagine a scenario where a given host response could be beneficial or harmful to the host depending on the behavior of the invading bacteria. Furthermore, the timing of specific host and bacterial responses could easily play a significant role. These questions are beyond the scope of this work, but should be followed up in future studies.

2.5 Materials and Methods

2.5.1 Mice, cell lines and bacterial strains

C57BL/6 WT mice were obtained from Jackson Laboratory (Bar Harbor, Maine). All animals were housed and maintained in a conventional pathogen-free facility at the Massachusetts General Hospital in Boston, Massachusetts. All experiments were performed in accordance with the guidelines outlined by the MGH Committee on Animal Care (Boston, MA).

All experiments in this section used S. typhimurium strain SL1344 labeled with the GFP-containing pFPV25.1 (Addgene, Cambridge, MA).

2.5.2 Cell and bacterial cultures, single-cell sorting, and analysis

We prepared cultures of bone marrow derived macrophage cells (BMMs) from 6-8 week old female C57BL/6 mice as previously described (Falk et al., 1988). BMMs were maintained in DMEM (Life Technologies, Carlsbad, CA) supplemented with 20% FBS (Life Technologies, Carlsbad, CA), 5% L-glutamine (Life Technologies, Carlsbad, CA) and 25 ng/ml recombinant murine M-CSF (R&D Systems, Minneapolis, MN). At 5 days of in vitro culture, macrophages were seeded in 6-well non-tissue culture treated plates (5x10^5 cells per plate). Cultures of S. typhimurium strain SL1344 labeled with GFP (pFPV25.1; Addgene, Cambridge, MA) were grown in Luria-Bertani (LB) medium at 37°C with shaking at 250 rpm. Sixteen hours later, S. typhimurium were washed in PBS and incubated for 1 hour with pHrodo dye (Life Technologies, Carlsbad, CA) at room temperature in 100 mM sodium bicarbonate. S. typhimurium cells were then washed three times with HBSS and OD600 was measured. BMMs were infected at an MOI of 1:1 and spun down for 5 minutes at 250 x g. Thirty minutes later, cells were washed with media containing 15 µg/ml gentamicin to remove S. typhimurium that were not internalized.

2.5.3 Imaging assay protocol

7.5x10^4 macrophages were seeded into 96-well black clear-bottom plates. S. typhimurium constitutively expressing GFP was stained with pHrodo as above, and used to infect cells at an MOI of 10:1. At the indicated time points after infection, media was aspirated, cells were washed three times with PBS, and fixed with 4% paraformaldehyde with Triton X-100 and DAPI.

2.5.4 Image acquisition and analysis

Plates were imaged using an ImageXpress Micro high-throughput microscope (Molecular Devices). Images were taken with a 20x objective at sixteen sites per well. Images were then analyzed using CellProfiler open-source software (Lamprecht et al., 2007). The imaging-analysis pipeline included correction to homogenize illumination over each field, a filter to remove any large debris from the analysis, identification and quantitation of DAPI-stained nuclei, and identification, quantitation, and pixel intensity calculations for GFP- and pHrodo-expressing bacteria. The final output was calculated as (average GFP pixel intensity per bacterium across the field) x (number of bacteria identified in the field) / (number of nuclei per field). The sixteen image sites per well were averaged.
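The per-well readout described above can be sketched in a few lines. This is a minimal Python illustration, not the CellProfiler pipeline itself; the function names and the choice to score a field without bacteria as zero are assumptions made for the example.

```python
from statistics import mean


def field_score(gfp_intensities, n_nuclei):
    """Score one imaged field.

    gfp_intensities: GFP pixel intensity of each bacterium found in the field.
    n_nuclei: number of DAPI-stained nuclei found in the field.
    Returns (mean GFP intensity per bacterium) x (bacteria count) / (nuclei count).
    """
    if n_nuclei <= 0:
        raise ValueError("a field needs at least one nucleus to be scored")
    if not gfp_intensities:
        return 0.0  # assumption: a field without bacteria scores zero
    return mean(gfp_intensities) * len(gfp_intensities) / n_nuclei


def well_score(fields):
    """Average the field scores over all imaged sites of one well.

    fields: iterable of (gfp_intensities, n_nuclei) pairs, one per site.
    """
    return mean(field_score(gfp, nuclei) for gfp, nuclei in fields)
```

In the protocol above, `fields` would hold sixteen entries per well.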

2.5.5 Single-cell expression profiling

At the indicated time points, cells were lifted from plates, transferred to FACS tubes and analyzed by FACS, using a BD FACSAria II (Excitation/Emission parameters were as follows: GFP/CD11b-FITC: 488/500-530, pHrodo/RFP: 561/600-620, CFP: 445/462.5-477.5, F4/80-PE: 561/574.5-589.5). Single cells were sorted based on GFP or pHrodo signals into individual wells of a 96-well plate, each containing 5 µl TCL buffer supplemented with 1% β-mercaptoethanol (Qiagen, Valencia, CA). After centrifuging, we snap froze the plates and transferred them immediately to -80°C. The total time elapsed between removal from the incubator and lysis was less than 5 minutes. Right before cDNA synthesis, we thawed the cells on ice and purified RNA with 2.2x RNAClean SPRI beads (Beckman Coulter Genomics, Danvers, MA) without final elution. The beads with captured RNA were processed immediately for cDNA synthesis. We also prepared control wells with ensembles of 1,000 cells as population samples.

2.5.6 cDNA synthesis, amplification and library construction

We used the SMARTer Ultra Low RNA Kit (Clontech, Mountain View, CA) to prepare amplified cDNA (Trombetta et al., 2014). Specifically, we added 1 µl of 12 µM 3' SMART primer (5'-AAGCAGTGGTATCAACGCAGAGTACT(30)N-1N (N = A, C, G, or T; N-1 = A, G, or C)), 1 µl of H2O, and 2.5 µl of Reaction Buffer onto the RNA-capture beads. We mixed them well by pipetting, heated the mixture at 72°C for 3 minutes and placed the mixture on ice. First-strand cDNA was synthesized from this RNA primer mix by adding 2 µl of 5x first-strand buffer, 0.25 µl of 100 mM DTT, 1 µl of 10 mM dNTPs, 1 µl of 12 µM SMARTer II A Oligo (5'-AAGCAGTGGTATCAACGCAGAGTACXXXXX), 100 U SMARTScribe RT, and 10 U RNase Inhibitor in a total volume of 10 µl and incubating at 42°C for 90 minutes followed by 10 minutes at 70°C. We purified the first-strand cDNA by adding 25 µl of room temperature AMPure XP SPRI beads (Beckman Coulter Genomics, Danvers, MA), mixing well by pipetting, and incubating at room temperature for 8 minutes. We removed the supernatant from the beads after a good separation was established.

We carried out all of the above steps in a PCR product-free clean cabinet. We amplified the cDNA by adding 5 µl of 10x Advantage 2 PCR Buffer, 2 µl of 10 mM dNTPs, 2 µl of 12 µM IS PCR primer (5'-AAGCAGTGGTATCAACGCAGAGT), 2 µl of 50x Advantage 2 Polymerase Mix, and 39 µl H2O in a total volume of 50 µl. We performed the PCR at 95°C for 1 minute, followed by 21 cycles of 15 seconds at 95°C, 30 seconds at 65°C and 6 minutes at 68°C, followed by another 10 minutes at 72°C for final extension. We purified the amplified cDNA by adding 90 µl of AMPure XP SPRI beads and washing with 80% ethanol.

Amplified cDNA was analyzed by Bioanalyzer (Agilent, Santa Clara, CA) and quantified by Qubit, and 0.125 µg of cDNA was then used for fragmentation using the Nextera XT kit (Illumina, San Diego, CA) according to the manufacturer's instructions, with all quantities adjusted to a fraction of the recommended volumes. Libraries were then analyzed by Bioanalyzer and quantified by Qubit. Quality of libraries was verified on a MiSeq, and libraries were then sequenced on a HiSeq 2500.

2.5.7 Transcript quantification

Basic quality assessment of Illumina reads and sample demultiplexing were done with Picard version 1.107. Samples were aligned to the mouse transcriptome generated from the Dec. 2011 (GRCm38/mm10) build of the mouse genome and release 69 of the mm10 Ensembl gene annotations (Flicek et al., 2014) using RSEM v1.2.3 (Li and Dewey, 2011) with the default command-line options. Alignments and quantifications were also done using RSEM with the following command options: rsem-calculate-expression --paired-end --calc-ci --estimate-rspd. Gene abundance was estimated using Transcripts per Million (TPM) with RSEM.
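For readers unfamiliar with the TPM unit, the sketch below shows the standard definition in Python. It is only an illustration: RSEM reports TPM itself from its expected counts and effective lengths, and the function name and inputs here are assumptions made for the example.

```python
def counts_to_tpm(counts, effective_lengths):
    """Convert per-transcript read counts to Transcripts per Million.

    Each count is first divided by the transcript's effective length, and the
    resulting rates are rescaled so that they sum to one million.
    """
    if len(counts) != len(effective_lengths):
        raise ValueError("counts and effective_lengths must have equal length")
    rates = [c / l for c, l in zip(counts, effective_lengths)]
    total = sum(rates)
    if total == 0:
        return [0.0] * len(rates)
    return [r / total * 1e6 for r in rates]
```

Because TPM values always sum to one million within a library, they describe relative abundance within a cell; they do not give absolute transcript counts.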

2.5.8 Quality filters and statistics

Single cells with fewer than 50,000 mapped reads or fewer than 4,000 genes detected (TPM larger than 0) were discarded. We began with data from 288 individual macrophages, of which 11 were discarded as having too few mapped reads and 36 were discarded as having too few detected genes, leaving us with 241 (84%) cells passing all of our quality filters, with an average read depth of 1,069,025 reads per cell and an average of 6,299 genes detected per cell. Non-single-cell ("bulk") RNA-Seq libraries used for quality control contained an average of 14,477,197 reads per library with an average of 13,200 expressed genes.
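The two thresholds above amount to a simple per-cell filter. The following Python sketch is illustrative only; the function and constant names are hypothetical, and the thresholds are the ones stated in the text.

```python
MIN_MAPPED_READS = 50_000   # cells with fewer mapped reads are discarded
MIN_GENES_DETECTED = 4_000  # cells with fewer detected genes are discarded


def genes_detected(tpm_by_gene):
    """Count the genes with TPM larger than 0 in one cell."""
    return sum(1 for tpm in tpm_by_gene.values() if tpm > 0)


def passes_qc(mapped_reads, n_genes_detected):
    """Return True if a cell meets both quality thresholds."""
    return (mapped_reads >= MIN_MAPPED_READS
            and n_genes_detected >= MIN_GENES_DETECTED)
```

With the counts reported above, 288 - 11 - 36 = 241 cells pass, which is 84% of the starting cells after rounding.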

2.5.9 Comparison of methods for differential expression analysis

In R version 3.2, DESeq version 1.10.1, EBSeq version 1.8.0, DESeq2 version 1.8.1 and edgeR version 3.10.2 were evaluated for their ability to call differentially expressed genes in the context of single-cell libraries. First, a gold standard set of 2199 genes that were differentially expressed between unexposed macrophages and macrophages exposed to S. typhimurium for four hours was identified using bulk RNA-Seq libraries (5x10^5 cells) constructed using Illumina's TruSeq protocol. Each method was applied separately to these bulk samples and only genes identified by all methods were selected. Each method was then applied to our single-cell libraries, with each cell being identified as a biological replicate of either an unexposed sample or a sample that was exposed for four hours (combining uninfected and infected macrophages). For DESeq, the options described below were applied, as these were found to give better performance. An ROC curve was constructed to compare the results of each algorithm on single-cell data to our "gold standard" using the R package pROC (version 1.8). Algorithms were assessed using the area-under-the-curve metric.

To further establish that DESeq would not give false positive results when calling differentially expressed genes, infected macrophages (as determined by pHrodo signal) that had been exposed to S. typhimurium for four hours were randomly assigned to one of two groups. DESeq was used to try to identify differentially expressed genes between these two groups and correctly identified no differentially expressed genes.
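The area-under-the-curve metric used in this comparison can be illustrated without any plotting. The Python sketch below is not the pROC implementation; it uses the equivalent rank-based definition of AUC, and it assumes each gene has a score where higher means stronger evidence of differential expression (for example, one minus the adjusted p-value) plus a label saying whether the gene is in the gold standard set.

```python
def roc_auc(scores, in_gold_standard):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen gold-standard gene scores
    higher than a randomly chosen non-gold-standard gene (ties count half).
    Quadratic in the number of genes, so intended for illustration only.
    """
    pos = [s for s, y in zip(scores, in_gold_standard) if y]
    neg = [s for s, y in zip(scores, in_gold_standard) if not y]
    if not pos or not neg:
        raise ValueError("need at least one gene in each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

An AUC of 1.0 means every gold-standard gene outranks every other gene, while 0.5 is what random scoring would give.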

2.5.10 Differential expression analysis

All analyses run in R, unless otherwise noted, were run using R version 2.15.1. Differential expression analysis was done using DESeq version 1.10.1 (Anders and Huber, 2010). Read counts were enumerated using the rsem-generate-data-matrix function as provided in RSEM. For single-cell data, the following modifications were made to the standard DESeq workflow (as described in the DESeq vignettes: http://bioconductor.org/packages/release/bioc/html/DESeq.html):

1) Every cell in a given population was treated as a replicate.

2) Genes with positive TPM in fewer than ten percent of the cells being analyzed were filtered prior to analysis.

3) Dispersions were estimated using fitType="local", sharingMode="maximum", and method="pooled".

4) Genes were only considered differentially expressed if they were statistically significant at an FDR <0.05 and had an absolute mean log2 fold change of at least 1 between the samples being compared.

Differential expression analysis of populations was performed as above, with the exceptions that the parametric fit was used for dispersion estimates and prefiltering was applied to genes where the maximum expression in relevant datasets was less than 10.
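The pre-filter in step 2 and the calling rule in step 4 for single-cell data can be written as two small predicates. This Python sketch only illustrates those two rules; it does not reproduce DESeq's statistics, and the function names are hypothetical. The adjusted p-value and log2 fold change are assumed to come from the DESeq output.

```python
def keep_gene_for_testing(tpms, min_fraction=0.10):
    """Step 2: keep a gene only if at least 10% of the cells have TPM > 0."""
    expressed = sum(1 for tpm in tpms if tpm > 0)
    return expressed / len(tpms) >= min_fraction


def is_differentially_expressed(padj, log2_fold_change,
                                fdr=0.05, min_abs_lfc=1.0):
    """Step 4: significant at the FDR cutoff and at least a two-fold change."""
    return padj < fdr and abs(log2_fold_change) >= min_abs_lfc
```

Requiring both conditions keeps genes with a statistically significant but very small change out of the final gene lists.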

2.5.11 PCA

PCA was performed on log2 gene expression values (log2(TPM+1)), scaled to mean 0 and unit standard deviation (by gene), using the R function prcomp(). Principal components were visualized using the R package ggplot2 version 0.9.3.1.
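The transformation applied before PCA can be sketched as follows. This Python example covers only the per-gene log transform and scaling, not the PCA itself; the function name is hypothetical, and returning zeros for a gene with constant expression is an assumption made so the example never divides by zero.

```python
import math
from statistics import mean, stdev


def log_scale_gene(tpms):
    """Transform one gene's TPM values across cells for PCA input.

    Applies log2(TPM + 1), then centers to mean 0 and scales to unit
    sample standard deviation (n - 1 denominator).
    """
    logged = [math.log2(tpm + 1) for tpm in tpms]
    center, spread = mean(logged), stdev(logged)
    if spread == 0:
        return [0.0] * len(logged)  # assumption: constant genes map to zeros
    return [(value - center) / spread for value in logged]
```

Scaling by gene keeps a handful of highly expressed genes from dominating the leading principal components.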

2.5.12 Identifying genes responding to extracellular or intracellular bacteria (Clusters I and II)

All genes that were significantly overexpressed at 4 hrs in exposed or infected macrophages (both compared with unexposed macrophages) based on our single-cell experiments were placed into one of two gene clusters: Cluster I, for genes that respond primarily to extracellular bacteria, or Cluster II, for genes that respond primarily to intracellular bacteria.

To determine whether a gene would be placed in Cluster I or Cluster II we calculated a metric we termed the intracellular-to-extracellular ratio (IC:EC) as detailed below:

IC:EC = (infectedMeanExpression - uninfectedMeanExpression) - (uninfectedMeanExpression - unexposedMeanExpression)

Intuitively, the IC:EC ratio is simply the average increase in gene expression resulting from bacterial infection minus the average increase in gene expression resulting from bacterial exposure. Genes with IC:EC <0 were placed into Cluster I. Genes with IC:EC >0 were placed into Cluster II. Any genes that were later discovered to belong to Cluster III (described below) were withheld from this analysis.
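The IC:EC assignment can be expressed directly from its definition. The Python sketch below is an illustration with hypothetical function names; the text does not say how a gene with IC:EC exactly equal to 0 is handled, so the example leaves such a gene unassigned.

```python
def ic_ec(infected_mean, uninfected_mean, unexposed_mean):
    """Intracellular-to-extracellular score for one gene.

    (increase from uninfected to infected cells) minus
    (increase from unexposed to uninfected cells).
    """
    infection_effect = infected_mean - uninfected_mean
    exposure_effect = uninfected_mean - unexposed_mean
    return infection_effect - exposure_effect


def assign_cluster(infected_mean, uninfected_mean, unexposed_mean):
    """Return "I" (exposure-driven), "II" (infection-driven), or None."""
    score = ic_ec(infected_mean, uninfected_mean, unexposed_mean)
    if score < 0:
        return "I"
    if score > 0:
        return "II"
    return None  # assumption: ties are not covered by the text
```

For example, a gene that rises mostly between unexposed and uninfected cells gets a negative score and falls in Cluster I.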

2.5.13 Identification of co-regulated gene clusters (Clusters III, IV, and V)

To identify clusters of genes that behave in a coordinated fashion but may not be obviously related to our fluorescently labeled, phenotypic populations, we performed weighted gene correlation network analysis using the R package WGCNA version 1.27-1 running on R-3.0.2 (Langfelder and Horvath, 2008). All genes expressed in at least 10 cells (TPM >0) in our entire exposure time course were included in this analysis. Each time point was analyzed separately to find clusters of related genes. At each time point the basic workflow described in the WGCNA tutorials was followed (http://labs.genetics.ucla.edu/horvath/CoexpressionNetwork/Rpackages/WGCNA/Tutorials/) with the following exceptions:

1) Our similarity matrix was defined using Spearman's correlation coefficient (ρ) to minimize the effect of outliers.

2) A "signed hybrid" TOM matrix was constructed as this seemed to give the most reasonable clustering results by manual inspection.

3) The power argument for adjacency.fromSimilarity() was set to six for our unexposed sample and five for our 2.5 hour sample based on visual inspection as pickSoftThreshold() failed to find an optimal power. Both of these values were commonly selected for other single-cell samples by pickSoftThreshold()

4) cutreeDynamic() with average linkage and the default minimum cluster size was used to define our final clusters.
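The first three modifications can be illustrated for a single pair of genes. The Python sketch below is not WGCNA code; it shows Spearman's correlation computed as the Pearson correlation of ranks, followed by the signed hybrid soft-threshold as we understand it (positive correlations raised to the chosen power, all others set to zero). Function names are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranked values."""
    return pearson(average_ranks(x), average_ranks(y))


def signed_hybrid_adjacency(rho, power):
    """Soft-thresholded adjacency that ignores negative correlations."""
    return rho ** power if rho > 0 else 0.0
```

Working on ranks means a single cell with extreme expression of both genes cannot create a strong correlation on its own, which is the stated reason for choosing Spearman's coefficient.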

Due to the noise inherent in single-cell data, WGCNA identified on average 34 clusters per time point. Most of these were based on the behavior of only one or two cells and were not stable across multiple time points. Three clusters, however, were stable across more than one time point based on a hypergeometric test and a minimum gene overlap of 40 genes, which was chosen to minimize noise. Cluster III was chosen for further analysis and is shown in fig. 2-3 D. The genes in each cluster are shown in tables S2 and S3 and pathway enrichment using Ingenuity Pathway Analysis (IPA, QIAGEN, Redwood City, www.qiagen.com/ingenuity) for each cluster is shown in tables 2.1 and 2.2.
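The stability criterion can be sketched as an overlap test between a cluster found at one time point and a cluster found at another. The Python example below is an illustration only: the function names are hypothetical, and the significance cutoff `alpha` is an assumption, since the text states the 40-gene minimum overlap but not the p-value threshold.

```python
from math import comb


def hypergeom_sf(k, n_universe, size_a, size_b):
    """P(overlap >= k) when size_b genes are drawn at random from a universe
    of n_universe genes that contains size_a 'marked' genes."""
    upper = min(size_a, size_b)
    total = comb(n_universe, size_b)
    hits = sum(comb(size_a, i) * comb(n_universe - size_a, size_b - i)
               for i in range(k, upper + 1))
    return hits / total


def clusters_match(genes_a, genes_b, n_universe, alpha=0.05, min_overlap=40):
    """True if two clusters share enough genes and the overlap is unlikely
    by chance. alpha is a hypothetical cutoff chosen for this example."""
    a, b = set(genes_a), set(genes_b)
    overlap = len(a & b)
    if overlap < min_overlap:
        return False
    return hypergeom_sf(overlap, n_universe, len(a), len(b)) < alpha
```

The minimum-overlap rule matters because very small clusters can reach a tiny p-value with only a few shared genes.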

2.5.14 Correlation plots

Pearson gene-gene correlations were calculated from log2 transformed gene expression values (log2(TPM+1)) at each time point using all cells passing our basic quality thresholds described above. Correlation plots were created using the heatmap.2() function in the R package gplots version 2.11.0.

2.5.15 Heatmaps

Heatmaps were generated using the heatmap.2() function in the R package gplots (version 2.11.0). Gene expression values were transformed into log space (log2(TPM+1)) and scaled (by gene) to mean 0 and unit standard deviation prior to plotting. Rows and columns were ordered as described in each figure legend. The following parameters to heatmap.2() were used: Rowv=F, Colv=F, trace="none", dendrogram="none", col=colorpanel(120, 'blue', 'white', 'red'), breaks=seq(from=-2, to=2, length.out=121).

2.5.16 Single-molecule RNA-flow FISH

BMMs were infected with fluorescently labelled S. typhimurium strains. Four hours after infection, cells were fixed and analyzed by RNA-FISH using the Panomics QuantiGene ViewRNA ISH kit according to the manufacturer's instructions (Affymetrix, Santa Clara). Probes were directed against the host genes Il1b, Nos2, Il12b, Irf7, and Ifit2. Images were acquired using ImageStreamX Mark II (Amnis, Seattle), and analyzed with IDEAS software (Amnis, Seattle).

2.5.17 Estimating gene variance scores

The mean and variance of all genes (in TPM) across single cells were computed. These were both transformed into log space and the loess() function in R (with default settings) was used to fit a localized regression model. All genes not showing expression levels of at least 5 TPM in at least 10% of cells were excluded from this analysis. By performing modeling in log space and excluding genes of very low abundance, we minimized the effects of unequal dispersion on our analysis and were able to rank genes effectively despite changes in average expression levels. The residual between the actual log variance and the fitted value was used as our variance score. The resulting variance score was roughly normally distributed.
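The variance score can be sketched end to end in Python. This is a simplified stand-in, not R's loess(): it fits a local linear model with tricube weights over the nearest fraction `span` of genes, whereas R's default loess fit differs in its details. The function names, the natural-log transform, and the exclusion of zero-variance genes are assumptions made for the example.

```python
import math
from statistics import mean, variance


def local_linear_fit(xs, ys, x0, span=0.75):
    """Tricube-weighted local linear regression evaluated at x0."""
    n = len(xs)
    k = min(n, max(2, math.ceil(span * n)))
    h = sorted(abs(x - x0) for x in xs)[k - 1]  # distance to k-th neighbour
    weights = []
    for x in xs:
        d = abs(x - x0)
        u = d / h if h > 0 else (0.0 if d == 0 else 1.0)
        weights.append((1 - u ** 3) ** 3 if u < 1 else 0.0)
    sw = sum(weights)
    if sw == 0:
        return mean(ys)
    mx = sum(w * x for w, x in zip(weights, xs)) / sw
    my = sum(w * y for w, y in zip(weights, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(weights, xs))
    if sxx == 0:
        return my
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(weights, xs, ys))
    return my + (sxy / sxx) * (x0 - mx)


def variance_scores(tpm_by_gene, min_tpm=5.0, min_fraction=0.10, span=0.75):
    """Residual of log variance from the fitted mean-variance trend.

    Genes below min_tpm in more than (1 - min_fraction) of cells are dropped.
    """
    kept = {}
    for gene, tpms in tpm_by_gene.items():
        frac = sum(1 for t in tpms if t >= min_tpm) / len(tpms)
        var = variance(tpms)
        if frac >= min_fraction and var > 0:  # var > 0 is an added assumption
            kept[gene] = (math.log(mean(tpms)), math.log(var))
    xs = [m for m, _ in kept.values()]
    ys = [v for _, v in kept.values()]
    return {gene: log_var - local_linear_fit(xs, ys, log_mean, span)
            for gene, (log_mean, log_var) in kept.items()}
```

A positive score marks a gene that varies more between cells than other genes of similar abundance, and a negative score marks one that varies less.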

2.5.18 Identification of low and high variance pathways in infected macrophages

To identify pathways that show consistently high variation in infected cells, all genes were assigned variance scores as described above, using only infected cells, at each of three time points post infection: 150, 240, and 480 minutes. Because genes responding to bacterial infection or exposure may be expected to show transiently increased variation initially after bacterial stimulation, before reaching steady-state levels of induction, we used the minimum variance score among our three time points as our estimate of the overall variance for a given gene. We ranked each gene (according to the minimum variance score from our three time points) and used the GSEA Preranked tool in GSEA version 2.1.0 (Subramanian et al., 2005) to identify KEGG pathways (Kanehisa and Goto, 2000) that were enriched among our highly variable genes, using the default analysis settings and 100,000 permutations. KEGG pathway annotations were taken from MSigDB version 4.0 and mapped onto the mouse genome using the Complete List of Human and Mouse Homologs at the Jackson Laboratory website (downloaded on 8/26/2014). Low variance pathways were identified in an identical manner with the exception that the median variance score from all three time points was used to determine a gene's final ranking (instead of the minimum) so that a single time point showing low variation would not be sufficient to give a gene a low overall variance score.
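The difference between the two ranking rules is easiest to see on a toy example. In the Python sketch below, the function names are hypothetical and the gene labels and scores are made-up values used only to show the behavior; the resulting ranked list is what would then be passed to a preranked enrichment tool.

```python
from statistics import median


def overall_variance_score(scores_across_timepoints, mode):
    """Summarize one gene's variance scores across time points.

    mode="high": minimum score, so a gene must be variable at every time point.
    mode="low": median score, so one low time point alone is not enough.
    """
    if mode == "high":
        return min(scores_across_timepoints)
    if mode == "low":
        return median(scores_across_timepoints)
    raise ValueError("mode must be 'high' or 'low'")


def rank_genes(score_table, mode):
    """Order genes from highest to lowest overall variance score."""
    return sorted(score_table,
                  key=lambda g: overall_variance_score(score_table[g], mode),
                  reverse=True)
```

With toy scores {"geneA": [2.0, 0.1, 1.5], "geneB": [1.0, 1.2, 0.9]}, the minimum rule ranks geneB first because geneA is only transiently variable, whereas the median rule ranks geneA first.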

2.5.19 Comparison of variance scores between exposure- and infection-induced gene sets

At 150, 240, and 480 minutes post-exposure, exposure- and infection-induced gene sets were determined using the procedure described for the identification of Clusters I and II above. At each time point, the variance scores between the exposure-induced and infection-induced genes were compared (with any genes not being assigned a variance score due to low expression being omitted). Boxplots were made using ggplot2 (version 0.9.3.1). P-values were assigned to differences using the non-parametric Wilcoxon rank-sum test (R function wilcox.test(), default settings).

2.5.20 Cell expression density plots

Density estimates for plots displaying gene expression distributions across our time course were made with the R function density(), with density estimates being made at 1024 points spanning the total range of expression values for that gene during the time course using default kernel and bandwidth settings. Plots were made using the persp() function in R.

2.5.21 Datasets

Gene Expression data are accessible through GEO (http://ncbi.nlm.nih.gov/geo) under accession numbers GSE65528, GSE65529, GSE65530, and GSE65531.

Chapter 3

The Mechanisms Behind Heterogeneity in the Type I IFN Response

3.1 Attributions

Significant portions of this chapter were published as a manuscript in Cell: Avraham et al. (2015) or adapted from this manuscript

Wetlab experiments were designed and executed in collaboration with Roi Avraham

LPS extractions were done by Douglas Brown

Mouse experiments were done with the aid of Cristina Penaranda and Humberto B Jijon

3.2 Introduction

We demonstrated in the previous chapter that the macrophage transcriptional response to S. typhimurium exposure is composed of a homogeneous response mediated by extracellular bacterial detection (Cluster I), a response that is unique to infected cells, mediated by intracellular bacterial detection (Cluster II), and a heterogeneous response induced in a fraction of infected cells that is enriched for elements of the Type I IFN response (Cluster III). The heterogeneous activity of the Type I IFN response is interesting, particularly as this response has been tied to non-canonical inflammasome activation in response to S. typhimurium exposure (Rathinam et al., 2012) and to both positive and negative effects on infection in different infection models (Robinson et al., 2012; Freudenberg et al., 2002). We therefore wished to understand the mechanism underlying this heterogeneity and evaluate its impact on infection. These two questions are the main focus of this chapter.

3.3 Results

3.3.1 Intracellular TLR4 signaling through TRIF and IRF3 determines the expression of the Type I IFN response in infected macrophages

We first sought to understand the host pathways surrounding Type I IFN induction in infected macrophages. It has been previously suggested that LPS accounts for all the transcriptional responses to infection, including intracellular bacterial detection (Rosenberger et al., 2000). LPS is detected by TLR4, which signals through two different adaptor proteins, MYD88 or TRIF, depending on whether LPS is sensed at the cell membrane or at a phagosome, respectively (Kagan et al., 2008). Specifically, induction of the Type I IFN response was shown to be mediated by TRIF through the interferon regulatory factors IRF3 and IRF7 (Fitzgerald et al., 2003). We hypothesized that the differential activation of Cluster III in infected cells may depend on key components of TLR4/LPS signaling. Thus, we measured the transcriptional response of wild-type (WT), TLR4-/-, TRIF-/-, and MYD88-/- immortalized BMMs (iBMMs) (Experimental Procedures) to infection with S. typhimurium at the single-cell level by monitoring an expression signature of 96 genes representative of Clusters I, II, and III using qRT-PCR (fig. S2 A, table S4, Experimental Procedures). Compared to WT cells, we found ablated activation of all three clusters in TLR4-/- cells (fig. 3-1 A), suggesting that LPS and TLR4 sensing dominate the transcriptional responses to infection, as previously suggested (Rosenberger et al., 2000). Next, to analyze the transcriptional response to infection of MYD88-/- and TRIF-/- cells, we defined a "TRIF-MYD88 ratio" to assess the dependence of each gene's expression on TRIF versus MYD88 (fig. 3-1 A, Experimental Procedures). We found that in Cluster I and Cluster II, regulation is partitioned, with some genes being regulated by MYD88 and some by TRIF. Cluster III, on the other hand, is regulated almost entirely through TRIF (fig. 3-1 A). Interestingly, MYD88 knockout upregulated this cluster in both infected and uninfected cells, which may indicate MYD88-dependent negative feedback inhibition of Cluster III induction.

Figure 3-1: Analysis of macrophage pathways regulating the bimodal induction of the Type I IFN response. (A) Induction of Cluster III is solely dependent on TRIF signaling. iBMMs from WT, TLR4-/-, MYD88-/-, or TRIF-/- mice were infected with S. typhimurium, and the expression of single cells was analyzed. Genes are arranged by a score summarizing their MYD88 or TRIF dependence (MTR; left bars indicate the distribution of scores for each cluster). (B) BMMs from IRF3-/- and IRF7-/- mice were infected with pHrodo-stained, GFP-labeled S. typhimurium. Decreased induction of representative genes from Cluster III was evident in IRF3-/- cells, compared to increased induction levels in IRF7-/- cells. (C) BMMs were infected with pHrodo-labeled, GFP-labeled S. typhimurium in the presence of BI2536 or BX795. While BI2536 inhibited mostly Cluster III genes but also genes from Clusters I and II, BX795 specifically inhibited only the induction of Cluster III genes. (D) Schematic representation of the gene regulatory networks that control the response of macrophages to S. typhimurium exposure and infection. The induction of the Type I IFN response is due to activation of TBK1 and IRF3 in only a subset of infected cells. (E) Plots summarize the expression of each gene cluster in BMMs exposed to live bacteria (top panel) or LPS-coated beads (bottom panel) using a weighted average of scaled expression values (x-axis) versus the frequency of single cells (y-axis). In contrast to the bimodal activation of the Type I IFN response in cells infected with live bacteria, there was a much higher proportion of cells that activated Cluster III among the cells that had taken up LPS-coated beads.

Next, we infected BMMs from WT, IRF3-/- and IRF7-/- mice and found that knockout of IRF3 exclusively ablates the activity of Cluster III in infected cells, while IRF7 knockout enhances its activation (fig. 3-1 B).
This suggests that while TRIF has a role in the induction of all clusters, its activation of IRF3 is specific to Cluster III and occurs in only a subset of infected cells. Based on these results, we tested two known inhibitors of the Type I IFN response, BX795 (a TBK1 inhibitor; Lee et al., 2013) and BI2536 (a PLK inhibitor; Chevrier et al., 2011), and found that while BI2536 inhibited genes from all three clusters, BX795 specifically inhibited only Cluster III genes (fig. 3-1 C). Overall, these data are consistent with a model in which single-cell transcriptional responses of macrophages to S. typhimurium exposure include a homogeneous inflammatory response to bacterial exposure (Cluster I) and a more variable antibacterial response to intracellular invasion (Cluster II). Both responses are mediated by a combination of MYD88 and TRIF activity. A third response also occurs in a fraction of infected cells, involving intracellular LPS detection by TLR4, which signals through TRIF and IRF3 and results in a bimodal Type I IFN response (fig. 3-1 D).

3.3.2 Live bacteria, but not LPS-coated beads, elicit a variable Type I IFN response in infected macrophages

To study the molecular mechanisms that lead to activation of the Type I IFN response in only a subset of infected macrophages, we explored this variation over time and in different infection models. A recent study showed that in dendritic cells exposed to LPS, the Type I IFN response is initially bimodally expressed and then uniformly induced over the entire population by four hours due to paracrine signaling (Shalek et al., 2014). In contrast, we have observed that the Type I IFN response in macrophages infected with bacteria has sustained bimodal expression during the entire time course (8 hours). While we also observe additional non-cell autonomous effects of Type I IFN activation at late time points in uninfected cells (reminiscent of the induction pattern seen in dendritic cells exposed to LPS), these additional effects do not eliminate the bimodal response in cells infected with live bacteria. This discrepancy between a transient and a sustained bimodal Type I IFN response might be due to a difference between stimulation with soluble LPS versus LPS associated with an intact, infecting bacterium, other additional components of an intracellular bacterium, or a difference between host cell types. To examine these possibilities, we compared transcriptional responses between macrophages exposed to live S. typhimurium and macrophages exposed to fluorescently labeled latex beads coated with LPS extracted from S. typhimurium (Experimental Procedures). Macrophages exposed to LPS-coated beads indeed activated Clusters I, II and III (fig. S2 B).

To compare different subpopulations after treatment with LPS-coated beads or live bacteria, we summarized the expression of each cluster with a single "eigen-gene" and calculated the density of these values across single cells (Experimental Procedures). Interestingly, compared to cells infected with bacteria, a much higher proportion of cells activated Cluster III among the cells that had taken up LPS-coated beads (figs. 3-1 E, S2 B and S2 C). This difference in activation was not the result of different levels of LPS exposure, since there was a uniform but reduced induction of Clusters I and II in cells exposed to LPS-coated beads compared to live bacteria (figs. 3-1 E and S2 C). This result suggests that there may be a bacterial factor that varies (e.g., displays bimodal behavior) among individual invading bacteria and that accounts for the heterogeneous expression of the Type I IFN response upon bacterial uptake. However, on isolated LPS-coated beads this factor's heterogeneity is less pronounced. We also observe a stronger non-cell autonomous effect in uninfected cells exposed to LPS-coated beads that may be due to the release of more interferon from infected cells.
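The per-cluster summary described above can be sketched in a few lines. The figure legends describe the score as a weighted average of scaled expression values, but the exact scaling and weights belong to the Experimental Procedures and are not reproduced here. The z-score scaling, the equal default weights, and the names `scale`, `cluster_score` and `histogram` are therefore assumptions made only for illustration, and the toy input is invented rather than taken from this study.

```python
from statistics import mean, pstdev


def scale(values):
    """Z-score one gene across cells; a gene with no variation maps to zeros."""
    mu, sd = mean(values), pstdev(values)
    return [0.0 if sd == 0 else (v - mu) / sd for v in values]


def cluster_score(expr, weights=None):
    """Per-cell weighted average of scaled expression.

    expr maps gene name -> list of expression values, one per cell.
    weights maps gene name -> weight (assumed equal if omitted).
    """
    genes = sorted(expr)
    if weights is None:
        weights = {g: 1.0 for g in genes}
    scaled = {g: scale(expr[g]) for g in genes}
    total = sum(weights[g] for g in genes)
    n_cells = len(expr[genes[0]])
    return [
        sum(weights[g] * scaled[g][i] for g in genes) / total
        for i in range(n_cells)
    ]


def histogram(scores, n_bins=4):
    """Count cells per equal-width score bin, a crude stand-in for a density."""
    lo, hi = min(scores), max(scores)
    width = (hi - lo) / n_bins or 1.0
    counts = [0] * n_bins
    for s in scores:
        counts[min(int((s - lo) / width), n_bins - 1)] += 1
    return counts


if __name__ == "__main__":
    # Invented toy data: two genes, six cells, half of the cells "on".
    toy = {"gene_1": [0, 0, 0, 10, 10, 10], "gene_2": [1, 1, 1, 9, 9, 9]}
    scores = cluster_score(toy)
    print(scores)
    print(histogram(scores))
```

In such a summary, a bimodal cluster appears as two separated groups of occupied bins, whereas a uniformly induced cluster collapses into a single peak.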

3.3.3 Variation in the host Type I IFN response is driven by bimodal activity of the bacterial PhoPQ two-component system in infecting bacteria

Based on the hypothesis that the bimodal induction of the Type I IFN response may be due to heterogeneity in the infecting bacteria, we sought to identify bacterial factors that may influence Type I IFN expression. In the nucleus, IRF3 binds to the IFN-stimulated response element (ISRE; Honda and Taniguchi, 2006), a process that can be monitored at the single-cell level using a fluorescent reporter and FACS. We used iBMMs stably transduced with an ISRE fused to GFP as a reporter of the Type I IFN response in individual cells (iBMM-ISRE, fig. 3-2 A, Experimental Procedures). We infected iBMM-ISRE with RFP-expressing bacteria, sorted ISRE-positive and ISRE-negative infected populations, and used RNA-Seq to simultaneously profile host and bacterial transcripts in each population (Experimental Procedures). We confirmed that the Type I IFN response is indeed more strongly induced in the sorted ISRE-positive population compared to the sorted ISRE-negative population (induction>1.5 fold, pFWER<0.05, GSEA analysis, fig. 3-2 B). Comparing the expression of bacterial pathways in these two populations, we found that targets of the bacterial transcription factor PhoP were significantly upregulated in ISRE-positive cells compared to ISRE-negative cells (pFWER<0.05, GSEA analysis, figs. 3-2 B and S3 A). In fact, both phoP and the associated phoQ gene were in the top 50 differentially expressed genes between these two populations, while hilA, a gene known to be repressed by PhoP, was among the most downregulated (fig. 3-2 B). PhoP is the response regulator of a two-component system (with its cognate sensor kinase PhoQ) that is activated after a Salmonella bacterium is taken up by macrophages and induces the expression of genes important for intramacrophage survival (Groisman, 2001). We therefore hypothesized that variation in PhoP activity may underlie the variation in the Type I IFN response.

Figure 3-2: Heterogeneity in the invading bacterial population shapes the heterogeneous host Type I IFN response. (A) Schematic of iBMMs with a transcriptional reporter (6xISRE-GFP) of the activity of the Type I IFN gene cluster. (B) Shown is an MA plot of the induction levels of host and bacterial transcripts in ISRE-positive over ISRE-negative populations (y-axis) versus average absolute read counts (x-axis). Infected ISRE-positive cells expressing high levels of Cluster III genes (green dots) are infected with bacteria expressing higher levels of PhoPQ-regulated genes (red dots) compared with ISRE-negative cells. Inset indicates the enrichment of PhoPQ-regulated genes and Cluster III genes (GSEA analysis, p=0.007 and p=0.001 respectively). (C) Schematic of S. typhimurium with a transcriptional reporter of PhoP activity (phoP-GFP, top). PhoP displayed bimodal activity in infected macrophages, as analyzed by FACS (bottom; infected cells were identified by pHrodo). (D) Cells infected with bacteria expressing high levels of phoP-GFP show higher expression of Cluster III genes compared to cells infected with bacteria expressing low levels of phoP-GFP. (E) Plots summarize the expression of the Type I IFN response in BMMs infected with WT, PhoP-, or PhoPc strains of S. typhimurium with a score based on a weighted average of scaled expression values (x-axis), displayed versus the frequency of single cells (y-axis). Infection with PhoPc results in induction of the Type I IFN response in almost all infected cells, compared to cells infected with WT or PhoP- strains.
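The comparison of ISRE-positive and ISRE-negative populations is displayed as an MA plot (fig. 3-2 B): induction on one axis, average read count on the other. As a rough sketch of how such coordinates can be computed for one gene, the snippet below uses a log2 ratio with a pseudocount. The pseudocount, the function names `ma_values` and `induced_genes`, and the placeholder gene labels and counts are assumptions for illustration. Only the 1.5-fold threshold comes from the text.

```python
import math


def ma_values(count_pos, count_neg, pseudocount=1.0):
    """Return (M, A) for one gene.

    M is the log2 ratio of ISRE-positive over ISRE-negative counts and
    A is the average of the two counts. The pseudocount is an assumption
    that keeps genes with zero reads finite.
    """
    m = math.log2((count_pos + pseudocount) / (count_neg + pseudocount))
    a = (count_pos + count_neg) / 2
    return m, a


def induced_genes(counts, min_fold=1.5):
    """Names of genes whose induction exceeds min_fold (1.5-fold in the text)."""
    threshold = math.log2(min_fold)
    return sorted(
        gene for gene, (pos, neg) in counts.items()
        if ma_values(pos, neg)[0] > threshold
    )


if __name__ == "__main__":
    # Placeholder counts, not data from this study.
    toy_counts = {
        "gene_up": (299, 99),
        "gene_flat": (99, 99),
        "gene_down": (49, 199),
    }
    for gene, (pos, neg) in toy_counts.items():
        print(gene, ma_values(pos, neg))
    print(induced_genes(toy_counts))
```

Plotting M against A (usually on a log-scaled x-axis) makes it easy to see whether strongly induced genes are also well covered by reads, which is why this layout suits mixed host and bacterial transcript data.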

To test this hypothesis, we first assessed the variation in PhoP activity among intracellular bacteria using an engineered reporter with a PhoP-sensitive promoter upstream of GFP (phoP-GFP). We infected BMMs with pHrodo-labeled S. typhimurium carrying the phoP-GFP reporter. Consistent with our hypothesis, we found that PhoP indeed has bimodal activity in the population of infected cells (fig. 3-2 C). We then sorted GFP-high and GFP-low macrophage populations and confirmed the difference in the expression levels of phoP between GFP-high and GFP-low infected cells by real-time qPCR (fig. S3 B). We found increased expression of the Type I IFN response in the PhoP-high compared to PhoP-low infected cells (over 5-fold increase at 4 hours and over 3-fold increase at 8 hours, p<0.05 at both time points by a bootstrap analysis; figs. 3-2 D and S3 C). Importantly, this difference in Type I IFN expression was not observed when using a constitutive GFP reporter (fig. S3 D), implying that the host cell is not responding primarily to differences in bacterial burden but to unique properties of PhoP-low and PhoP-high bacteria. No significant difference was observed in the expression of Cluster I or Cluster II between PhoP-high and PhoP-low infected cells (fig. S3 E). Together, these results demonstrate a correlation between PhoPQ activity and the host Type I IFN response.
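The fold changes above are reported with p<0.05 "by a bootstrap analysis", but the procedure itself is not given in this section. The following is one common form of such a test, offered as a sketch under stated assumptions rather than as the analysis actually used. Measurements in each group are resampled with replacement, and the p-value is the fraction of resamples in which the PhoP-high mean fails to exceed the PhoP-low mean. The function name, the number of resamples, the seed and the toy values are all hypothetical.

```python
import random
from statistics import mean


def bootstrap_fold_change(high, low, n_boot=2000, seed=0):
    """Observed fold change (mean(high) / mean(low)) and a one-sided p-value.

    high and low are lists of positive, linear-scale expression values.
    The p-value estimates how often resampling erases the difference.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = mean(high) / mean(low)
    not_greater = 0
    for _ in range(n_boot):
        resampled_high = [rng.choice(high) for _ in high]
        resampled_low = [rng.choice(low) for _ in low]
        if mean(resampled_high) <= mean(resampled_low):
            not_greater += 1
    # Add-one correction avoids reporting a p-value of exactly zero.
    p_value = (not_greater + 1) / (n_boot + 1)
    return observed, p_value


if __name__ == "__main__":
    # Invented values, not measurements from this study.
    phop_high = [50, 60, 55, 70, 65]
    phop_low = [10, 12, 9, 11, 13]
    print(bootstrap_fold_change(phop_high, phop_low))
```

With very few replicates the smallest attainable p-value is bounded by the number of resamples, so `n_boot` should be chosen with the intended significance threshold in mind.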

To establish whether PhoP activity functionally determines Type I IFN expression in host cells, we infected macrophages with a phoP null mutant (PhoP-) and a strain with a single mutation in the phoQ gene that renders PhoP constitutively active (PhoPc) (Miller and Mekalanos, 1990). Analyzing sorted infected populations, we found that cells infected with PhoPc bacteria induce the Type I IFN response more strongly than WT infected cells, while cells infected with PhoP- bacteria induce a weaker response (fig. S3 F). Interestingly, at the single-cell level, we found that infection with PhoPc, like stimulation with LPS-coated beads, increased the fraction of cells inducing the Type I IFN response (fig. 3-2 E). PhoPc exposure also elicited a Type I IFN response in more uninfected cells than WT or PhoP- exposure, again implicating non-cell autonomous effects. Similar proportions of PhoP- and WT infected cells induced the Type I IFN response. Notably, no differences in the induction of Clusters I or II were observed between the phoP mutants (figs. S3 F and S3 G). These results indicate that the Type I IFN response is both correlated with and functionally the result of the activity of PhoPQ.

3.3.4 Intracellular recognition of PhoPQ-mediated LPS modifications results in induction of the Type I IFN response

PhoPQ is a global regulator of S. typhimurium virulence, involved in numerous cellular processes including activation of Type III secretion and cell wall alterations (Groisman, 2001). To test which of these processes might impact host Type I IFN expression, we treated BMMs with supernatants or heat-killed bacteria from PhoPc and PhoP- cultures. Culture supernatants failed to elicit a differential Type I IFN response, excluding the involvement of factors secreted by PhoP-regulated Type III secretion systems (fig. S4 A, bottom). Treatment with heat-killed cultures elicited a differential Type I IFN response, corresponding to infection with live mutants (fig. S4 A, top). This result is consistent with cell wall alterations playing a role in Type I IFN induction.

These results, together with reports implicating PhoPQ as a regulator of LPS modification (Guo et al., 1997), led us to hypothesize that PhoPQ may exert its influence on the Type I IFN response through LPS modification. To test this hypothesis, we extracted LPS from WT, PhoP- and PhoPc strains and used it to stimulate BMMs. We used a standard limulus amebocyte lysate (LAL) test to normalize LPS concentrations from the different extractions (Experimental Procedures). Similar to infection with live bacteria, PhoPc LPS induced higher levels of Type I IFN responsive genes compared to WT LPS (over 9-fold higher at 2 hours post exposure, p<0.05 by bootstrap analysis), while LPS from PhoP- induced lower levels (over 4-fold lower at 2 hours post exposure and over 40-fold lower at 4 hours post exposure, p<0.05 at each time point by bootstrap analysis, figs. 3-3 A and S4 B). Notably, stimulation of cells with commercially available LPS from S. typhimurium resulted in induction levels similar to LPS from WT (fig. S4 C), validating our extraction method and quantifications of LPS. These results demonstrate that PhoPQ-mediated LPS modifications are responsible for the induction of the Type I IFN response.

Figure 3-3: Bacterial variation in PhoPQ-mediated LPS modifications drives the bimodal induction of the host Type I IFN response. (A) Cells stimulated with LPS from the PhoPc strain induce higher levels of Type I IFN responsive genes compared to cells stimulated with LPS from the WT strain. Cells stimulated with LPS from the PhoP- strain showed less induction of this cluster. (B) BMMs were stimulated with a mixture of red and green fluorescent beads coated with LPS extracted from PhoPc or PhoP- strains respectively. 74% of cells containing beads coated with PhoPc LPS induce the Type I IFN response more highly than any unexposed cell (white) compared to only 26% of cells containing PhoP- LPS (p=0.003 using a two-population proportion z-test). (C) Schematic representation of the differences in the responses of BMMs to infection with live bacteria and to stimulation with LPS-coated beads. Live bacteria are more heterogeneous than LPS-coated beads.

We next sought to test whether variations in LPS on the surface of individual bacteria are sufficient to drive a bimodal Type I IFN response. As it is not currently technically possible to query LPS modifications at the single-cell level, we simulated a heterogeneous population of "bacteria" by coating red fluorescent beads with LPS from the PhoP- strain and green fluorescent beads with LPS from the PhoPc strain. We then treated macrophages with an equal mixture of red and green LPS-coated beads, sorted macrophages according to the color of beads they had taken up, and examined induction of genes at the single-cell level. We used a low MOI treatment to preclude the uptake of more than one bead in a given cell. We observed no difference in the induction of Cluster I or Cluster II between cells that took up beads with LPS from PhoP- or PhoPc strains (fig. 3-3 B, inset). In contrast, there was a clear shift in Cluster III induction, with induction of this cluster in a larger proportion of cells taking up beads coated with PhoPc LPS (74%) than in cells taking up beads coated with PhoP- LPS (26%) (fig. 3-3 B; p=0.003 using a two-population proportion z-test). Similar to exposure to live bacterial strains, a non-cell autonomous effect was also evident in uninfected cells exposed to beads coated with PhoPc LPS (fig. S4 D). As controls, induction levels of green or red beads coated with LPS extracted from WT S. typhimurium were similar, no induction was observed using beads not coated with LPS, and comparable results were obtained in bead-color swap experiments (fig. S4 D). Additionally, no significant expression changes were noted between our brightest and dimmest cells that had taken up beads coated with WT LPS, demonstrating that differences in LPS burden cannot explain our results (fig. S4 E). These results indicate that the bimodal Type I IFN response within a population of infected cells can be recapitulated by treating cells with a mixture of beads coated with LPS from PhoP- and PhoPc mutant strains. Thus, differences in the induction of the Type I IFN response are determined not only by the internal state of the host cells or non-cell autonomous effects between host cells, but also as a direct result of the state of the infecting bacterium, specifically, in this case, the extent of PhoPQ-mediated LPS modification (figs. 3-3 B and 3-3 C).
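The 74% versus 26% comparison above is reported with a two-population proportion z-test, and a minimal sketch of that test follows. The cell counts behind the percentages are not given in this section, so the counts used here (14 of 19 and 5 of 19) are hypothetical values chosen only because they round to the reported percentages. The function name is likewise an assumption.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for a difference between two proportions.

    x1 of n1 cells respond in group 1, x2 of n2 cells in group 2.
    Uses the pooled proportion for the standard error, as is usual
    under the null hypothesis of equal proportions.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability of a standard normal.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical counts that round to 74% and 26%.
    z, p = two_proportion_z_test(14, 19, 5, 19)
    print(f"z = {z:.2f}, p = {p:.4f}")
```

The normal approximation behind this test is only reliable when each group contains a reasonable number of responding and non-responding cells. For very small groups an exact test would be a safer choice.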

3.3.5 PhoPQ-mediated LPS modifications impact the in vivo Type I IFN response and infection outcome

To confirm bimodal induction of the Type I IFN response of infected macrophages that had been naturally differentiated in vivo, we collected all cells from the peritoneal cavity of mice (Experimental Procedures) and immediately infected them with GFP-labeled S. typhimurium. After two hours we sorted infected resident macrophages (fig. 3-4 A) and analyzed the induction of Clusters I, II, and III. We found similar expression patterns for all three clusters to those we observed in infected BMMs. Importantly, we found bimodal induction of the Type I IFN response (fig. 3-4 A), indicating that this pattern of response to infection is generalizable to macrophages from different tissues.

Figure 3-4: PhoPQ-mediated LPS modifications impact in vivo infection outcomes. (A) Like BMMs, peritoneal macrophages, when infected ex vivo with GFP-labeled S. typhimurium, show bimodal induction of Cluster III. (B) Activation of the Type I IFN response in vivo was enhanced after stimulation with PhoPc LPS and reduced after stimulation with PhoP- LPS, compared to WT S. typhimurium LPS. As a control, no induction of the Type I IFN response was measured in IRF3-/- mice. (C) Mice challenged with PhoPc LPS (blue, n=12) showed reduced survival compared to mice challenged with WT LPS (black, n=11). Inhibition of the Type I IFN response by co-administration of BX795 improved survival from PhoPc LPS challenge, restoring it to levels associated with WT LPS challenge (dotted blue, n=12). Mice challenged with PhoP- LPS (red, n=11) showed enhanced survival compared to mice challenged with WT LPS.

To determine the physiological importance of the relationship between bacterial PhoPQ activity, LPS variation, and the host Type I IFN response, we next sought to demonstrate the same correlation in mice using LPS stimulation. We injected mice intraperitoneally (i.p.) with sub-lethal, normalized doses of LPS extracted from WT, PhoP- and PhoPc strains (Experimental Procedures). After two hours we isolated peritoneal macrophages from treated mice (fig. S5 A) and analyzed the in vivo induction of Clusters I, II, and III. Notably, LPS from the PhoPc strain induced higher levels of Cluster III, while LPS from the PhoP- strain had the opposite effect, thereby mirroring the in vitro infection results (fig. 3-4 B). Minimal differences in Clusters I and II were observed in vivo, similar to what was observed in vitro. Additionally, we confirmed the IRF3 dependence of Cluster III by performing this same experiment in IRF3-/- mice (fig. 3-4 B). These results demonstrate a relationship between modified LPS and Type I IFN expression in mice and suggest that PhoPQ is an important regulator of the Type I IFN response in vivo.

Ifnar1-/- mice were previously shown to have prolonged survival after S. typhimurium challenge, demonstrating an important role for the Type I IFN response in determining infection outcome (Robinson et al., 2012). We thus sought to test whether bacterial PhoPQ activity, through its activation of the Type I IFN response, has an impact on infection outcome similar to ablating signaling downstream of Ifnar. Because PhoP- and PhoPc strains are both avirulent in mice (Miller and Mekalanos, 1990), we turned to a mouse model of LPS-induced septic shock. Septic shock, a systemic response to severe bacterial infection, is considered an important determinant of infection outcome, as it is often associated with high mortality (Morrison and Ryan, 1987). We induced septic shock in mice using high doses of LPS extracted from WT, PhoP- or PhoPc Salmonella strains and monitored survival. Mice injected with normalized amounts of PhoPc LPS had significantly higher mortality rates than mice injected with WT LPS (fig. 3-4 C, p=0.003, log-rank test). Meanwhile, mice challenged with PhoP- LPS had higher survival rates compared to WT-LPS challenged mice (p=0.003, log-rank test). We then co-administered LPS extracted from PhoPc with the small molecule BX795, which we had previously shown to be a specific inhibitor of the Type I IFN response (fig. 3-4 C), and found significantly improved survival rates of the PhoPc LPS challenged mice (p=0.031, log-rank test). We further verified that these effects were mirrored in the transcriptional responses of peritoneal macrophages. The co-administration of the BX795 inhibitor together with LPS extracted from the PhoPc strain abrogated the induction of the Type I IFN response, reducing it to levels similar to mice challenged with WT LPS (fig. S5 B). These results demonstrate that the extent of LPS modification by PhoPQ and its interaction with the cognate host Type I IFN response are important determinants of infection outcome in vivo.
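The survival comparisons above rely on the log-rank test. For readers unfamiliar with it, a minimal two-group version is sketched below. At each death time it compares the observed number of deaths in one group with the number expected if both groups shared the same hazard, then refers the summed statistic to a chi-square distribution with one degree of freedom. The function name and the survival times are invented for illustration and are not the data behind fig. 3-4 C.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test.

    times_*  : follow-up time for each animal
    events_* : 1 if the animal died at that time, 0 if censored
    Returns (chi_square, p_value) with one degree of freedom.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        # Animals still at risk just before time t, per group.
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)
        # Deaths occurring exactly at time t.
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        d_b = sum(1 for ti, e, g in data if ti == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        raise ValueError("log-rank test is undefined without informative events")
    chi_square = observed_minus_expected ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value


if __name__ == "__main__":
    # Invented follow-up times in hours; animals alive at 72 h are censored.
    group_a_times = [24, 24, 24, 30, 30, 36, 36, 48, 48, 48, 72, 72]
    group_a_events = [1] * 10 + [0, 0]
    group_b_times = [36, 48] + [72] * 9
    group_b_events = [1, 1] + [0] * 9
    print(logrank_test(group_a_times, group_a_events,
                       group_b_times, group_b_events))
```

Because the statistic depends only on the ordering of death times and on who is still at risk, it requires no assumption about the shape of the survival curves, which makes it a natural choice for small challenge experiments.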

3.4 Discussion

3.4.1 Heterogeneity of pathogen populations as a mechanism to shape the host immune response

In the previous chapter, we revealed specific genetic pathways that show unexpectedly large amounts of variation between what otherwise appear to be identically infected cells. One such pathway is the Type I IFN response, which was only fully induced in a fraction of infected macrophages. Here, we found that the level of Type I IFN induction in infected macrophages is determined by the level of PhoPQ activity in the invading bacterium (figs. 3-2 D and 3-2 E).

Heterogeneity of transcriptional responses has been reported and traditionally ascribed to stochastic variation or the intrinsic state of the cell. For example, a recent publication suggests that the induction of the antiviral response in dendritic cells in response to bacterial LPS stimulation is dependent on the existence of a relatively small fraction of "precocious" cells that initiate the response, which eventually spreads through the population via paracrine responses (Shalek et al., 2014). Our work highlights the fact that immune activation also depends on the state of the invading pathogen. This demonstrates an alternative source of host heterogeneity, whereby intrinsic variation in bacterial populations shapes the host immune response. The in vivo experiments indicate functional consequences during infection of the variable factors identified, and point to heterogeneity as a feature of pathogen populations that impacts infection.

3.4.2 Studies of the immune response in the context of heterogeneous bacterial ligands

Different types of LPS have been shown to produce dramatically different host responses, with diversity in LPS structures having been described between bacterial populations exposed to different environments (Paciello et al., 2013), different bacterial mutants (Guo et al., 1997), and different LPS variants resulting from different isolation procedures (Gutschow et al., 2013). We now show that heterogeneity also exists within a single population of wild-type bacteria. While this alone may not be altogether surprising, we demonstrate that this variability has functional consequences.

There is accumulating evidence that cell-to-cell variation exists in the expression of numerous bacterial factors in addition to LPS, including other PAMPs and virulence factors. For example, bacteria in the same culture can be in either a motile (flagella positive) or a non-motile (flagella negative) state (Cummings et al., 2006), or contain very different levels of effector proteins (Schlumberger et al., 2005b). Importantly, immunological studies of such molecules have often implicitly neglected pathogen variability by relying on measurements of host cell response to what is assumed to be a homogeneous ligand, ignoring the reality that such ligands actually result from a heterogeneous, diverse population. In this study, we show that coating beads with LPS isolated from a pooled, heterogeneous population of bacteria artificially limits heterogeneity by mixing modified and unmodified LPS stemming from different individual bacteria onto the same bead. This system thus fails to recapitulate the diversity of actual pathogens and the diversity of the cognate host response. The heterogeneity of the host response can be restored by reinstating the heterogeneity in the chemical stimulus (coating two sets of beads with LPS isolated from two different bacterial mutants, PhoP- and PhoPc, followed by mixing of the two sets of beads; fig. 3-3 C).

Importantly, although we show that bacterial heterogeneity in PhoPQ-mediated LPS modifications has a significant effect in mediating the host Type I IFN response, this is by no means the only determining factor, nor is it solely responsible for the heterogeneity we observe. It is well known that the Type I IFN response can be induced by non-cell autonomous effects such as paracrine signaling, given that interferon is a soluble secreted molecule (Honda and Taniguchi, 2006). Indeed, we also observe induction of this cluster in uninfected cells at later time points during infection (fig. S1 B). We also observe this paracrine signaling in a larger fraction of uninfected cells treated with PhoPc LPS, probably because a larger fraction of infected cells are inducing the Type I IFN response.

We also observe the induction of the Type I IFN response in a small population of cells infected with the PhoP- strain (fig. 3-2 E). This demonstrates that infection with PhoP mutant strains does not perfectly mirror the naturally occurring low and high PhoP populations that we observe during WT infection, as genetically altering the strains cannot provide the fine-tuned regulation and variation that occurs in WT bacteria. It is a relatively common phenomenon that a genetic knockout does not abolish an activity that is revealed by overexpression (Kitano, 2004); in fact, it has been demonstrated before that the PhoP- strain does not always show the opposite phenotype of the PhoPc strain (Strandberg et al., 2012). This is generally indicative of redundant pathways and suggests that PhoPQ levels do not fully account for the variability observed in the host response.

Other complementary bacterial pathways are also known to control LPS modifications, and it is likely that some of these play a similar role in modulating the Type I IFN response. One possible candidate is the bacterial PmrAB two-component system (Perez and Groisman, 2007), and understanding the role of such additional regulators merits further investigation.

3.4.3 Possible advantages of bimodal expression of bacterial factors during the course of in vivo infection

It has been previously reported that while virulence factors allow growth and survival of the pathogen within the host, their activity elicits changes that seem both beneficial and detrimental to the bacteria (Ackermann et al., 2008). For example, while PhoPQ activation plays a key role in permitting intracellular survival by making Salmonella more resistant to environmental stressors, it is also associated with decreased transcytosis by epithelial cells and decreased replication rates (Groisman, 2001). This suggests that the utility of these factors may be highly dependent on environmental context. In changing environments, bistability or diversification of bacterial populations has been shown to be beneficial (Kussell and Leibler, 2005). Recently, it has been shown that cooperation between virulent and avirulent subpopulations is essential for S. enterica pathogenicity (Diard et al., 2013). The effects of this cooperation were demonstrated using co-infection with genetically distinct mutant strains. Our work suggests that this strategy need not be restricted to mixed genetic subpopulations, but could occur between isogenic subpopulations during WT infection. For example, one could imagine a beneficial cooperation in which a population with high PhoPQ activity could induce a more robust immune response, as has previously been shown to be helpful in overcoming the commensal microflora (Lupp et al., 2007), paying a metabolic cost that benefits a population with low PhoPQ activity. In support of this, it is interesting that both the PhoP- and PhoPc mutants, which are unable to diversify PhoP activity, are attenuated (Miller and Mekalanos, 1990). Thus, in order to succeed in the complex host environments encountered throughout infection, S. typhimurium could tune the variation of factors such as PhoPQ to create distinct subpopulations that ensure that some pathogen subsets prevail in infection.

To conclude, this work establishes a mechanism by which transcriptional heterogeneity can have functional consequences for host-pathogen interactions, in this case through differences in pathogen detection. The ability of immune cells to respond to differences between individual pathogens implies that pathogen heterogeneity is a key feature of pathogen populations that impacts host response. This work suggests that further investigation of the role of bacterial heterogeneity as a mechanism to drive different host responses and the extent to which this strategy is employed by diverse pathogens is warranted to fully uncover its role in bacterial pathogenesis and ultimately, in determining infection outcome.

3.5 Materials and Methods

3.5.1 Mice, cell lines and bacterial strains

C57BL/6 WT mice were obtained from Jackson Laboratory (Bar Harbor, Maine). All animals were housed and maintained in a conventional pathogen-free facility at the Massachusetts General Hospital in Boston, Massachusetts. All experiments were performed in accordance with the guidelines outlined by the MGH Committee on Animal Care (Boston, MA). WT, TLR4-/-, TRIF-/- and MYD88-/- iBMMs were obtained from BEI Resources (Manassas, Virginia). IRF3-/- and IRF7-/- BMMs were a generous gift from Dr. Nir Hacohen (Broad Institute, Cambridge, Massachusetts). All S. typhimurium strains used in this study were derived from wild-type strain ATCC14028s or SL1344. The 14028 mutant strains PhoPc (pho-24) and PhoP- (phoP::Tn10d-Cam) (Miller and Mekalanos, 1990) were a generous gift from Dr. Sam Miller (University of Washington, Seattle, Washington).

3.5.2 Cell and bacteria cultures, single-cell sorting, library construction, and analysis

S. enterica infections, macrophage cell culture, RNA-Seq library construction, and analysis were performed as described in the Materials and Methods section in Chapter 2.

3.5.3 RNAtag-Seq for simultaneous detection of host and intracellular bacterial transcripts

We used the RNAtag protocol (Shishkin et al., 2015), with several modifications, to construct libraries of macrophage and bacterial transcripts. We first treated the RNA with TURBO DNase at 37°C for 10 minutes. We then dephosphorylated eluted RNA (including probe RNA) with FastAP Thermosensitive Alkaline Phosphatase (Thermo Scientific) at 37°C for 10 minutes, then added 25 mM EDTA and heated at 68°C for 2 minutes. We cleaned the reaction using RNA Clean & Concentrator 5 columns (Zymo Research). We combined dephosphorylated RNA with 20 pmol adaptors, denatured at 70°C for 2 minutes, and snap-cooled the mixture by transferring it to ice. We ligated the adaptor to the RNA using T4 RNA Ligase 1 (New England Biolabs) at 23°C for 75 minutes, shaking frequently. We cleaned the ligation reactions by adding 3x volume of Buffer RLT (Qiagen) and 0.58x volume of ethanol, using RNA Clean & Concentrator 5 columns (Zymo Research), and eluting in water. Following clean-up, we added 12 pmol of RT primer and synthesized first strand cDNA using AffinityScript Reverse Transcriptase (Agilent), incubating at 55°C for 50 minutes followed by a 4°C hold. Next, we pooled libraries together, removed host and bacterial rRNA (Epicentre, Illumina, San Diego, CA) and cleaned probe RNA and newly synthesized cDNA with AMPure XP beads (Beckman Coulter Genomics, Danvers, MA). For the remaining cDNA sample, we degraded RNA by adding 100 mM sodium hydroxide and incubating at 70°C for 10 minutes. We neutralized the base by adding 100 mM acetic acid and cleaned the reaction using AMPure XP beads (Beckman Coulter Genomics, Danvers, MA) as described above. We added a second adaptor to the cDNA by adding 40 pmol 3Tr3 DNA adaptor and ligating with T4 RNA Ligase (high concentration, New England Biolabs) at 23°C overnight. Following clean-up with SPRI beads, we amplified the cDNA library for 10 cycles using barcoded primers. We cleaned amplified libraries with AMPure XP beads (Beckman Coulter Genomics, Danvers, MA) and quantified library yield using an Agilent Bioanalyzer.

3.5.4 Transcript quantification of RNAtag-Seq libraries

Samples profiling only host genes were processed as described in chapter 2. Samples profiling both bacterial and host transcripts in infected macrophages were aligned to a composite "transcriptome" made by combining the mouse transcriptome described in chapter 2, the SL1344 genome downloaded from NCBI (NC_017718.1), and a collection of mouse rRNA sequences from the UCSC genome browser (Karolchik et al., 2004). RSEM was used to map and quantify host reads against this "transcriptome", as described in chapter 2. BWA mem (BWA version 0.7.10-r789, http://bio-bwa.sourceforge.net/) was used to align bacterial reads. An in-house script was used to count bacterial transcripts. Libraries had an average of 16,179 bacterial reads and 5,685,408 host reads (0.28% bacterial).

3.5.5 Single-cell Real-time qPCR

Cells were sorted as described in chapter 2 into 5 µl lysis buffer (4.5 µl TE (Clontech, Mountain View, CA) supplemented with 0.25 µl NP-40 and 0.05 µl RNasin (Life Technologies, Carlsbad, CA)). Cells were immediately frozen on dry ice and kept at -80°C. Right before cDNA synthesis we thawed the cells on ice, incubated them at 65°C for 90 seconds, and moved them to ice for 1 minute. First strand cDNA was synthesized by adding 1 µl of RT enzyme (Fluidigm, South San Francisco, CA) and incubating at 25°C for 5 minutes, 42°C for 30 minutes and 85°C for 5 minutes. cDNA was then amplified by adding 1 µl of a 500 nM pool of 96 gene-specific primers, 2 µl of preamp enzyme (Fluidigm, South San Francisco, CA) and 1 µl DDW, and incubating at 96°C for 30 seconds, followed by 18 cycles of 95°C for 5 seconds and 60°C for 4 minutes. Amplified cDNA was diluted 1:5 in TE and loaded on Dynamic arrays (Fluidigm, South San Francisco, CA), together with EvaGreen SYBR green for amplification and detection.

3.5.6 Cluster summaries for single-cell RNA-Seq libraries

To compare the expression levels of our different gene clusters between experimental conditions, we found it helpful to summarize the expression of each cluster in each cell with a single value and compare the distributions of these cluster summaries. Since differences in the quality of single-cell libraries can confound these estimates (Shalek et al., 2013), we used an approach similar to that of Shalek et al. (2013). Briefly, for each gene expressed in at least 5% of cells that was not included in Cluster I, II, or III, we estimated the mean expression level (mean of log2(TPM+1)) among expressing cells (those where log2(TPM+1) > 1) in each condition. In each experimental condition we placed genes into one of 25 bins, using the cut2() function in R, based on this estimate of mean expression, and fit a logistic regression equation to each cell, describing the probability that genes with a given expression level in each condition would be detected in this cell. We used this fit to compute a weighted average among the genes in each cluster. In short, genes that were detected in a given cell were given a weight of 1, while genes that were undetected were weighted according to the expected probability of detection in that given cell (Shalek et al., 2013). This minimizes the effect of differences in library quality between samples. Average cluster values were then divided by the median value of the corresponding unexposed sample to allow for better comparisons between samples. Plots of density estimates of these weighted values were made using the geom_density() function in ggplot2 version 0.9.3.1 with default parameters, with the exception that y was set to ..height.. (thus giving each population the same maximum height) for easier visualization.
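The weighting scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used in this work: it is written in Python rather than R, it fits the per-cell logistic regression directly on the mean expression estimates (the 25-bin step is omitted), and the function names fit_logistic, detection_probability and cluster_summary are hypothetical. Normalization to the unexposed median is also left out.

```python
import math


def fit_logistic(x, y, n_iter=25):
    """Fit P(y=1 | x) = 1 / (1 + exp(-(a + b*x))) by Newton's method."""
    a, b = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            z = max(-30.0, min(30.0, a + b * xi))  # clip to avoid overflow
            p = 1.0 / (1.0 + math.exp(-z))
            w = p * (1.0 - p)
            g0 += yi - p            # gradient w.r.t. intercept
            g1 += (yi - p) * xi     # gradient w.r.t. slope
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det < 1e-12:             # stop if the 2x2 system is degenerate
            break
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


def detection_probability(a, b, mean_expr):
    """Expected probability that a gene with this mean expression is detected."""
    z = max(-30.0, min(30.0, a + b * mean_expr))
    return 1.0 / (1.0 + math.exp(-z))


def cluster_summary(cluster_expr, cluster_mean, background_mean,
                    background_detected, threshold=1.0):
    """Weighted mean log2(TPM+1) of one gene cluster in one cell.

    cluster_expr: this cell's log2(TPM+1) for each cluster gene.
    cluster_mean: condition-level mean expression of each cluster gene.
    background_mean / background_detected: the same mean estimate and a 0/1
    detection flag for non-cluster genes, used to model this cell's quality.
    """
    a, b = fit_logistic(background_mean, background_detected)
    num = den = 0.0
    for expr, mean_expr in zip(cluster_expr, cluster_mean):
        # Detected genes get weight 1; undetected genes are weighted by the
        # probability that this cell would have detected them.
        if expr > threshold:
            w = 1.0
        else:
            w = detection_probability(a, b, mean_expr)
        num += w * expr
        den += w
    return num / den
```

In a low-quality cell, an undetected gene receives a small weight, so its zero value pulls the cluster summary down less than it would in a high-quality cell where the same gene was expected to be detected.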

3.5.7 Analysis of Biomark HD qPCR data

A set of 96 genes was chosen to be representative of Clusters I, II, and III. Genes were chosen based both on strength of membership in each cluster and on a literature review for genes that are thought to be important in the host immune response. Though this list changed slightly between experiments as bad probes were removed, a list of all genes used in Biomark studies can be seen in table S4. CT values were called for all samples using the standard real-time PCR analysis software from Fluidigm. CT values from technical duplicates were averaged prior to normalization to Gapdh. Samples with Gapdh values >12 were excluded from the analysis, as were genes that were detected in fewer than half of bulk samples or fewer than 20 out of 96 cells for single-cell experiments. Genes with undetected threshold values were assigned delta CT values of either -20 for tissue culture assays or -18 for in vivo studies, corresponding to the minimum delta CT typically observed in each assay type.
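The normalization and filtering rules above can be sketched as follows. This is a minimal illustration, not the analysis code used in this work. The function names are hypothetical, and two details are assumptions: delta CT is taken as Gapdh CT minus gene CT (so that the floor values of -20 and -18 are the lowest possible expression), and a technical duplicate with one undetected replicate is represented by the detected replicate alone.

```python
def mean(values):
    return sum(values) / len(values)


def delta_ct(gene_cts, gapdh_cts, floor=-20.0, max_gapdh_ct=12.0):
    """Delta CT for one gene in one sample, or None if the sample is excluded.

    gene_cts, gapdh_cts: CT values of the technical duplicates; None marks an
    undetected replicate. floor is -20 for tissue culture assays and -18 for
    in vivo studies.
    """
    gapdh_detected = [ct for ct in gapdh_cts if ct is not None]
    # Samples with no Gapdh signal or Gapdh CT > 12 are excluded.
    if not gapdh_detected or mean(gapdh_detected) > max_gapdh_ct:
        return None
    gene_detected = [ct for ct in gene_cts if ct is not None]
    # Undetected genes are assigned the minimum delta CT for the assay type.
    if not gene_detected:
        return floor
    # Assumed sign convention: higher delta CT means higher expression.
    return mean(gapdh_detected) - mean(gene_detected)


def gene_passes_single_cell_filter(detected_flags, min_cells=20):
    """Keep a gene only if it was detected in at least min_cells cells."""
    return sum(1 for flag in detected_flags if flag) >= min_cells
```

For example, delta_ct([20.0, 22.0], [10.0, 10.0]) averages the duplicates to 21 and 10 and returns -11.0, while passing floor=-18.0 applies the in vivo floor to undetected genes.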

3.5.8 Single-cell summary plots of gene clusters for Biomark data

As with RNA-Seq data, when working with Biomark data it is often helpful to summarize the expression of the entire cluster with a single value. To do so, PCA, as described in chapter 2, was applied to each gene cluster using only WT unexposed macrophages and WT macrophages exposed to WT SL1344. Any other samples were projected onto these components using the R predict() function. The first principal component was taken as a summary value for each cluster and negated as needed so that a positive correlation existed between the PC1 value and gene expression (as signs in principal components are arbitrary). Plots for different samples were scaled so that the median expression of the unexposed samples was equivalent. Density plots of summary scores were generated in ggplot2 using the function geom_density(). The heights of density plots were scaled to a maximum height of one using the option y = ..height.. to improve visualization. When needed, 95% confidence intervals were calculated on the cluster summary scores using bootstrapping, by sampling from the genes in each cluster with replacement prior to PCA calculations.
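The fit-then-project step described above can be sketched as follows. This is a minimal illustration, not the analysis code used in this work, which relied on PCA in R and the predict() function. Here the first principal component is found by power iteration on the covariance matrix, and the names fit_first_pc and project are hypothetical. The sketch assumes the reference data are not constant and omits the median scaling and bootstrapping.

```python
import math


def fit_first_pc(reference, n_iter=200):
    """Return (gene_means, loadings) of PC1 for a cells x genes matrix.

    reference holds only the cells used to define the component (here, WT
    unexposed and WT exposed macrophages).
    """
    n, g = len(reference), len(reference[0])
    means = [sum(row[j] for row in reference) / n for j in range(g)]
    centered = [[row[j] - means[j] for j in range(g)] for row in reference]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(g)]
           for i in range(g)]
    # Power iteration, started from the covariance row of the most variable gene.
    k = max(range(g), key=lambda j: cov[j][j])
    v = list(cov[k])
    for _ in range(n_iter):
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        v = [sum(cov[i][j] * v[j] for j in range(g)) for i in range(g)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    # The sign of a principal component is arbitrary. The covariance between
    # the PC1 score and gene j is proportional to loading j, so a negative
    # loading sum means the score falls as total expression rises: flip it.
    if sum(v) < 0:
        v = [-x for x in v]
    return means, v


def project(cells, means, loadings):
    """Score new cells on PC1 using the reference centering and loadings."""
    return [sum((row[j] - means[j]) * loadings[j] for j in range(len(means)))
            for row in cells]
```

Because project() reuses the gene means and loadings from the reference cells, samples from other genetic backgrounds or stimulations are scored on the same axis without influencing how that axis is defined.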

3.5.9 TRIF/MYD88 gene analysis

To classify genes according to their TRIF and MYD88 dependencies, WT, TLR4-/-, TRIF-/-, or MYD88-/- iBMMs were exposed to S. typhimurium for 4 hrs. The average log2 fold induction of a subset of genes from clusters I, II, and III, chosen as described above for Biomark analysis, was assessed in each background by comparing unexposed and infected (pHrodo positive) cells using single-cell qPCR on the Biomark HD system from Fluidigm (fig. 3-1 A and table S4).

To aid in summarizing our single-cell data, we developed a metric that we term the TRIF-MYD88 ratio, estimating each gene's dependence on TRIF or MYD88. We used an approach similar to that of Jaitin et al. (2014), modeling the fold inductions as a mixture of four different components: expression change due to TRIF (T), expression change due to MYD88 (M), expression changes not due to TRIF or MYD88 (I), and non-linear interactions of TRIF and MYD88 (TM). To fit this mixture model, we first computed the average fold induction after infection in each genetic background. Because genes in Clusters I, II, and III tend to be expressed at low levels in unexposed cells (and thus were often detected in only a few cells of our unexposed samples), apparent large variations in fold induction could be generated by small differences in the number of cells in which a gene was detected in these samples. To make our fold change measurements more robust to this noise, we used the WT unexposed sample when calculating fold induction for all genetic backgrounds, except in cases where a statistically significant difference in expression existed between unexposed samples in WT and the genetic background of interest (assessed by Wilcoxon test, p<0.05). After computing our fold inductions, we used the following system of equations:

WT fold change = T + M + TM + I

TLR4-/- fold change = I

MYD88-/- fold change = T + I

TRIF-/- fold change = M + I

giving a unique solution for each component. Genes were excluded from the analysis if they did not meet the following criteria:

1) Upregulated at least 1.5 fold after infection in WT iBMMs

2) Detected in at least one quarter of infected cells in WT and in either TRIF-/- or MYD88-/- iBMMs

3) Induction in TLR4-/- iBMMs was less than in WT iBMMs

4) The maximum of T and M is not less than -0.5

To assign each gene that passed the above criteria a single value describing its relative dependence on TRIF versus MYD88, we used the following equations, splitting the nonlinear interactions equally between TRIF and MYD88:

T_total = T + TM/2

M_total = M + TM/2

TRIF-MYD88 ratio = (T_total - M_total) / (abs(T_total) + abs(M_total))

The TRIF-MYD88 ratio falls on a scale between -1 (MYD88 dependent) and 1 (TRIF dependent) and has the useful characteristic that as the interaction term increases, the ratio tends towards 0, reflecting the gene's dependence on both pathways. The scores and expression patterns for all assessed genes can be seen in fig. 3-1 A.
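As a worked illustration of the system of equations and the ratio defined above, the following minimal sketch solves for the four components from the fold inductions in each genetic background and computes the TRIF-MYD88 ratio. It is not the analysis code used in this work. The function names are hypothetical, the inputs are assumed to be average log2 fold inductions, and returning None when both totals are zero is an assumption, since that case is not specified in the text.

```python
def solve_components(wt, tlr4_ko, myd88_ko, trif_ko):
    """Solve the four-equation system for (T, M, TM, I).

    wt       = T + M + TM + I
    tlr4_ko  = I
    myd88_ko = T + I
    trif_ko  = M + I
    """
    i = tlr4_ko
    t = myd88_ko - i
    m = trif_ko - i
    tm = wt - t - m - i
    return t, m, tm, i


def trif_myd88_ratio(wt, tlr4_ko, myd88_ko, trif_ko):
    """Ratio between -1 (MYD88 dependent) and 1 (TRIF dependent)."""
    t, m, tm, _ = solve_components(wt, tlr4_ko, myd88_ko, trif_ko)
    # The interaction term is split equally between the two adaptors.
    t_total = t + tm / 2.0
    m_total = m + tm / 2.0
    denom = abs(t_total) + abs(m_total)
    if denom == 0:
        return None  # assumption: ratio undefined when both totals are zero
    return (t_total - m_total) / denom
```

For example, a gene induced 4 (log2) in WT, 3 in MYD88-/-, 1 in TRIF-/- and 0 in TLR4-/- has T = 3, M = 1, TM = 0 and a ratio of 0.5. A gene induced only when both adaptors are present (2 in WT, 0 elsewhere) has T = M = 0, TM = 2 and a ratio of 0.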

3.5.10 Plasmids and DNA manipulations

The plasmid phoP-GFP, encoding the phoP promoter region fused to a promoterless GFP gene, was constructed by cloning a PCR fragment generated from the SL1344 genome using primers:

5'-ATATGCGGCCGCTCGCGCTGTGACTCTGGTCG-3' and

5'-ATATGAATTCTCCTCTACAACCAGTACGC-3'

into the plasmid TOPO-pGlow. All insertions were validated using standard Sanger sequencing.

The plasmid pGreenFire1-ISRE, an HIV-based transcription reporter that co-expresses destabilized copGFP, was purchased from System Biosciences (Mountain View, CA). 293T cells were seeded in 10 cm plates and transfected with the GFP-ISRE plasmid using Lipofectamine 2000 according to the manufacturer's instructions (Life Technologies, Carlsbad, CA). Thirty-six hours later, viral particles were obtained from the cell supernatant and used to spin-fect WT iBMMs for 90 minutes at 300g. Cells were then replenished with fresh media. Forty-eight hours later, media was replaced with selective media including 10 µg/ml puromycin for 5 days. Positive clones were verified for GFP induction after LPS stimulation.

3.5.11 Identification of bacterial pathways enriched in ISRE positive and negative cells

Putative targets of bacterial transcription factors were identified for S. enterica using publicly available binding site predictions. Data were downloaded from tractordb, version 2.0 (http://www.tractor.lncc.br/), CollecTF (http://collectf.umbc.edu), and RegPrecise (http://regprecise.lbl.gov/RegPrecise/). Both CollecTF and RegPrecise data were downloaded on 10/8/2014. Bacterial genes were ranked based on log2 fold-change in ISRE positive compared to ISRE negative cells, and the GSEA Pre-ranked tool was used to identify transcription factors whose targets were significantly enriched based on this list. No transcription factors were significantly enriched among genes downregulated in ISRE positive cells.

3.5.12 Heat-killed bacteria and LPS extractions

For stimulation with heat-killed bacteria and culture supernatants, overnight cultures of SL1344 were spun down, and supernatants and cell pellets were incubated at 80°C for 30 minutes, filtered with 0.2 µm filters, and plated to confirm that no viable bacteria were present.

Purification of LPS from each PhoP mutant was adapted from a traditional phenol-chloroform-petroleum ether extraction (Galanos et al., 1969). Eighteen liters of each mutant were cultured overnight (37°C, 220 rpm) and then centrifuged (7500g, 10 min, 4°C). Cells were washed in 700 mL dH2O, then 60 mL 100% ethanol, and finally twice with 60 mL acetone, after which the cell pellet was allowed to dry overnight in a fume hood and for 24 hrs in a bench top desiccator. A fresh phenol-chloroform-hexane (PCH) solution was created using 20 mL 90% (w/w) phenol in water : 50 mL chloroform : 80 mL hexane, and solid phenol was added while swirling the cloudy solution until it became completely transparent. The cell pellet was ground into a fine powder using a mortar and pestle, and then transferred to an Erlenmeyer flask where 80 mL of PCH solution was added and stirred for one hour at room temperature. The reaction was transferred to PCH-resistant Teflon tubes and centrifuged (4000g, 20 min, RT), and the supernatant was transferred to a 500 mL round-bottom flask. The remaining pellet was broken up using a spatula and transferred back into the Erlenmeyer flask, where the remaining 80 mL PCH solution was used for a second extraction (1 hr, RT). Following centrifugation, the supernatant was combined with the first extraction and organic solvents were removed via rotary evaporation (200 rpm, 37°C). The remaining yellow liquid was transferred to a Teflon tube, and endotoxin-free water was added drop-wise while swirling until the solution was opaque. The solution was then centrifuged (4000g, 10 min, RT), resulting in a yellow gel pellet, and the now-transparent supernatant was transferred to a new tube for an additional precipitation using more water. The pellets were then washed and combined using 10 mL 80% (w/w) phenol in water, followed by centrifugation (5000g, 10 min, 4°C) and removal of the supernatant. This wash was repeated a total of four times, after which the pellet was washed four times using 10 mL acetone. The pellet was dried overnight in a fume hood. A Limulus amebocyte lysate (LAL; Sigma) test was performed on each lot of endotoxin to calculate its concentration (Levin and Bang, 1968).

3.5.13 Coating beads with LPS

To coat latex beads, 30 µg of LPS from S. enterica (Sigma, St. Louis, MO) were incubated at 4°C for 12 hours with 1x10^8 of 1 µm yellow-green fluorescent polystyrene latex beads (Sigma, St. Louis, MO). Beads were then extensively washed to remove any free LPS, and were used to stimulate cells at an MOI of 1:1.

For experiments with LPS extracted from the PhoP strains, we wanted to minimize dissociation of LPS from the beads and exchange of LPS between fluorescent beads, and thus attached the LPS covalently to epoxy beads using the following protocol: 30 µg of extracted LPS were incubated for 12 hours with 1x10^8 epoxy-activated sepharose 6B beads (Polymicrospheres, Indianapolis, IN) in coupling buffer (100 mM sodium bicarbonate, pH 8.5). Excess LPS was removed by extensive washing in high and then low pH buffer solutions. To fluorescently label LPS beads, we incubated the beads in either pHrodo green or red dye (Life Technologies, Carlsbad, CA) at room temperature in 100 mM sodium bicarbonate. The remaining active groups were blocked with 1 M ethanolamine. Fluorescently labeled LPS-coupled beads were then washed extensively in PBS and then in culture medium.

3.5.14 Mouse LPS stimulation

For LPS stimulation, mice were injected intraperitoneally with a sub-lethal dose of LPS extracted from WT, PhoP- or PhoPc Salmonella strains (20 µg per mouse), and 2 hours later mice were sacrificed and intraperitoneal cells were harvested (Zhang et al., 2008). Cells were then analyzed and sorted by FACS based on known macrophage markers, F4/80 and Cd11b (eBioscience, San Diego, CA). For survival studies, mice were injected intraperitoneally with a single lethal dose (700 µg per mouse) of LPS extracted from WT, PhoP- or PhoPc cultures. BX795 inhibitor (2.5 mg/kg) was administered to the relevant groups in conjunction with LPS administration. Mouse morbidity was monitored for 5 days thereafter.

Chapter 4

Conclusion

4.1 Summary

Transcriptional heterogeneity is emerging as an important mechanism behind a wide variety of cellular processes. Although examples of transcriptional heterogeneity have been observed during infection, a systematic study in this area was lacking. Furthermore, although differences in infection outcome between individual infection events have been observed for a long time, the role of transcriptional heterogeneity in driving these different outcomes remained poorly understood. Here, we presented a generalizable system to study transcriptional heterogeneity during infection and link this heterogeneity to specific outcomes of interest by combining single-cell RNA-Seq with fluorescent markers of infection outcome.

By applying this system to the well-established model of S. typhimurium infection of mouse BMMs, we were able to gain numerous insights not apparent from bulk studies. First, we were able to improve our understanding of host genes induced during infection by segregating genes according to whether they are induced by extracellular (Cluster I) or intracellular (Cluster II) bacterial stimulation. We were also able to evaluate gene heterogeneity during bacterial infection. Finally, we were able to identify distinct subpopulations of macrophages that emerge during bacterial exposure, with one such subpopulation being identified by strong induction of the Type I IFN response.

By tracing the mechanism behind Type I IFN induction in a subset of infected macrophages, we were able to show that this pathway was activated by host TLR4 signaling through the adaptor protein TRIF in macrophages that had been infected with bacteria demonstrating high PhoPQ activity and thus high levels of PhoPQ-mediated LPS modifications. In short, heterogeneity in the host macrophage population was generated by heterogeneity in the infecting bacterial population. This emphasized the importance of heterogeneity as a characteristic of pathogen populations and demonstrated a functional effect of transcriptional heterogeneity in bacterial populations, namely that differences between isogenic bacteria raised in the same culture can be large enough to elicit different host immune responses.

4.2 Future directions

Although the above discoveries are significant, there are a number of improvements that could be made to our system and many biological questions that remain to be investigated. One of the first, and most obvious, drawbacks of our system is that we are currently able to query only host transcripts during infection at the single-cell level. This is the result of two main technical hurdles. First, the reverse transcription (RT) reaction following RNA isolation must be extremely efficient to give good technical reproducibility from the low RNA input expected from single cells. All current procedures have thus used polyA selection, which automatically excludes bacterial transcripts. This need not necessarily be the case if, for example, random hexamer (or even designed hexamer) priming could be made sufficiently efficient. The second technical hurdle is the comparatively low abundance of bacterial mRNA in an infected host. For example, when working with sorted populations containing both host and bacterial transcripts, on average only 0.28% of our reads aligned to bacterial mRNA. Overcoming this hurdle would likely require some form of enrichment for bacterial transcripts. These technical challenges are by no means trivial, but if they could be overcome, we would gain the ability to directly correlate bacterial and host responses at different time points. Such a study may enable us to capture differences between subpopulations that were previously hidden, such as differences between our GFP+ and GFP- infected populations.

Additionally, improvements could be made in fluorescent labeling, both in terms of quality and quantity. As figs. 2-1 G and 2-1 H show, our system for distinguishing macrophages containing live bacteria from macrophages containing dead bacteria is imperfect. A significant number of GFP+ macrophages (labeled as infected with live bacteria) showed no viable bacteria upon CFU enumeration. While it is possible that bacteria were viable but not culturable, it seems more reasonable to assume that there is a time delay between the loss of bacterial viability and the loss of membrane integrity needed to disrupt GFP fluorescence. Other fluorescent systems, such as inducible constructs, may be able to improve on this level of noise. Furthermore, one could imagine using fluorescent labels to track many other attributes of infection outcome, such as different forms of macrophage cell death, differences in bacterial replication rates, or the expression of specific bacterial virulence factors such as secretion systems. It would be interesting to see how specific transcriptional responses correlate with these other outcomes.

Finally, our work established functional importance for bacterial heterogeneity in the context of S. typhimurium infection, largely using a tissue culture model. Although Salmonella species do represent a public health threat, it would be interesting to apply this system to organisms such as M. tuberculosis, which represent a much larger health burden. Additionally, one could imagine applying this system to more realistic infection models (for example, isolating single cells directly from mouse tissue), where the diversity of microenvironments and different immune cells would be likely to increase heterogeneity between infection events. This study was an initial demonstration that transcriptional heterogeneity does play a role in determining the outcome of individual cellular infections. Many more experiments will be needed to fully define the extent and nature of this role.

Bibliography

M. Ackermann, B. Stecher, N. E. Freed, P. Songhet, W. D. Hardt, and M. Doebeli. Self-destructive

cooperation mediated by phenotypic noise. Nature, 454(7207):987-90, 2008.

C. M. Alpuche-Aranda, E. L. Racoosin, J. A. Swanson, and S. I. Miller. Salmonella stimulate

macrophage macropinocytosis and persist within spacious phagosomes. J Exp Med, 179(2):601-8, 1994.

A. Alvarez-Ordonez, M. Begley, M. Prieto, W. Messens, M. Lopez, A. Bernardo, and C. Hill. Salmonella spp. survival strategies within the host gastrointestinal tract. Microbiology, 157(Pt 12):3268-81, 2011.

S. Anders and W. Huber. Differential expression analysis for sequence count data. Genome Biol, 11(10):R106, 2010.

J. P. Audia, C. C. Webb, and J. W. Foster. Breaking through the acid barrier: an orchestrated response to proton stress by enteric bacteria. Int J Med Microbiol, 291(2):97-106, 2001.

L. Aussel, L. Loiseau, M. Hajj Chehade, B. Pocachard, M. Fontecave, F. Pierrel, and F. Barras. ubij, a new gene required for aerobic growth and proliferation in macrophage, is involved in coenzyme

q biosynthesis in escherichia coli and salmonella enterica serovar typhimurium. J Bacteriol, 196

(1):70-9, 2014.

R. Avraham, N. Haseley, D. Brown, C. Penaranda, H. B. Jijon, J. J. Trombetta, R. Satija, A. K. Shalek, R. J. Xavier, A. Regev, and D. T. Hung. Pathogen cell-to-cell variability drives heterogeneity in host immune responses. Cell, 2015.

N. Q. Balaban, J. Merrin, R. Chait, L. Kowalik, and S. Leibler. Bacterial persistence as a phenotypic switch. Science, 305(5690):1622-5, 2004.

B. L. Bearson, L. Wilson, and J. W. Foster. A low ph-inducible, phopq-dependent acid tolerance

response protects salmonella typhimurium against inorganic acid stress. J Bacteriol, 180(9):

2409-17, 1998.

I. Behlau and S. I. Miller. A phop-repressed gene promotes salmonella typhimurium invasion of

epithelial cells. J Bacteriol, 175(14):4475-84, 1993.

C. R. Beuzon, S. Meresse, K. E. Unsworth, J. Ruiz-Albert, S. Garvis, S. R. Waterman, T. A.

Ryder, E. Boucrot, and D. W. Holden. Salmonella maintains the integrity of its intracellular

vacuole through the action of sifa. EMBO J, 19(13):3235-49, 2000.

P. F. Bonventre, R. Hayes, and J. Imhoff. Autoradiographic evidence for the impermeability of

mouse peritoneal macrophages to tritiated streptomycin. J Bacteriol, 93(1):445-50, 1967.

E. Boyer, I. Bergevin, D. Malo, P. Gros, and M. F. Cellier. Acquisition of mn(ii) in addition to fe(ii) is required for full virulence of salmonella enterica serovar typhimurium. Infect Immun, 70(11):6032-42, 2002.

M. A. Brennan and B. T. Cookson. Salmonella induces macrophage death by caspase-1-dependent

necrosis. Mol Microbiol, 38(1):31-40, 2000.

N. A. Burton, N. Schurmann, O. Casse, A. K. Steeb, B. Claudi, J. Zankl, A. Schmidt, and D. Bumann. Disparate impact of oxidative host defenses determines the fate of salmonella during systemic infection in mice. Cell Host Microbe, 15(1):72-83, 2014.

J. Canton. Phagosome maturation in polarized macrophages. J Leukoc Biol, 96(5):729-38, 2014.

Centers for Disease Control and Prevention. Antibiotic resistance threats in the united states, 2013. Technical report, 2014.

N. Chevrier, P. Mertins, M. N. Artyomov, A. K. Shalek, M. Iannacone, M. F. Ciaccio, I. Gat-Viks, E. Tonti, M. M. DeGrace, K. R. Clauser, M. Garber, T. M. Eisenhaure, N. Yosef, J. Robinson, A. Sutton, M. S. Andersen, D. E. Root, U. von Andrian, R. B. Jones, H. Park, S. A. Carr, A. Regev, I. Amit, and N. Hacohen. Systematic discovery of tlr signaling components delineates viral-sensing circuits. Cell, 147(4):853-67, 2011.

B. Claudi, P. Sprote, A. Chirkova, N. Personnic, J. Zankl, N. Schurmann, A. Schmidt, and D. Bumann. Phenotypic variation of salmonella in host tissues delays eradication by antimicrobial chemotherapy. Cell, 158(4):722-33, 2014.

B. Coburn, G. A. Grassl, and B. B. Finlay. Salmonella, the host and disease: a brief review.

Immunol Cell Biol, 85(2):112-8, 2007.

L. N. Csonka. Regulation of cytoplasmic proline levels in salmonella typhimurium: effect of osmotic

stress on synthesis, degradation, and cellular retention of proline. J Bacteriol, 170(5):2374-8, 1988.

L. A. Cummings, W. D. Wilkerson, T. Bergsbaken, and B. T. Cookson. In vivo, flic expression by

salmonella enterica serovar typhimurium is heterogeneous, regulated by clpx, and anatomically

restricted. Mol Microbiol, 61(3):795-809, 2006.

Z. D. Dalebroux, S. Matamouros, D. Whittington, R. E. Bishop, and S. I. Miller. Phopq regulates

acidic glycerophospholipid content of the salmonella typhimurium outer membrane. Proc Natl

Acad Sci U S A, 111(5):1963-8, 2014.

M. Diard, V. Garcia, L. Maier, M. N. Remus-Emsermann, R. R. Regoes, M. Ackermann, and W. D.

Hardt. Stabilization of cooperative virulence by the expression of an avirulent phenotype. Nature,

494(7437):353-6, 2013.

M. B. Elowitz, A. J. Levine, E. D. Siggia, and P. S. Swain. Stochastic gene expression in a single

cell. Science, 297(5584):1183-6, 2002.

L. A. Falk, M. M. Hogan, and S. N. Vogel. Bone marrow progenitors cultured in the presence of

granulocyte-macrophage colony-stimulating factor versus macrophage colony-stimulating factor

differentiate into macrophages with distinct tumoricidal capacities. J Leukoc Biol, 43(5):471-6, 1988.

R. Figueira and D. W. Holden. Functions of the salmonella pathogenicity island 2 (spi-2) type iii

secretion system effectors. Microbiology, 158(Pt 5):1147-61, 2012.

K. A. Fitzgerald, D. C. Rowe, B. J. Barnes, D. R. Caffrey, A. Visintin, E. Latz, B. Monks, P. M. Pitha, and D. T. Golenbock. Lps-tlr4 signaling to irf-3/7 and nf-kappab involves the toll adapters tram and trif. J Exp Med, 198(7):1043-55, 2003.

P. Flicek, M. R. Amode, D. Barrell, K. Beal, K. Billis, S. Brent, D. Carvalho-Silva, P. Clapham,

G. Coates, S. Fitzgerald, L. Gil, C. G. Giron, L. Gordon, T. Hourlier, S. Hunt, N. Johnson,

T. Juettemann, A. K. Kahari, S. Keenan, E. Kulesha, F. J. Martin, T. Maurel, W. M. McLaren,

D. N. Murphy, R. Nag, B. Overduin, M. Pignatelli, B. Pritchard, E. Pritchard, H. S. Riat,

M. Ruffier, D. Sheppard, K. Taylor, A. Thormann, S. J. Trevanion, A. Vullo, S. P. Wilder, M. Wilson, A. Zadissa, B. L. Aken, E. Birney, F. Cunningham, J. Harrow, J. Herrero, T. J.

Hubbard, R. Kinsella, M. Muffato, A. Parker, G. Spudich, A. Yates, D. R. Zerbino, and S. M.

Searle. Ensembl 2014. Nucleic Acids Res, 42(Database issue):D749-55, 2014.

M. A. Freudenberg, T. Merlin, C. Kalis, Y. Chvatchko, H. Stubig, and C. Galanos. Cutting edge:

a murine, il-12-independent pathway of ifn-gamma induction by gram-negative bacteria based on

activation by type i ifn and il-18 signaling. J Immunol, 169(4):1665-8, 2002.

F. Garcia-del Portillo, J. W. Foster, M. E. Maguire, and B. B. Finlay. Characterization of the micro-environment of salmonella typhimurium-containing vacuoles within mdck epithelial cells. Mol Microbiol, 6(22):3289-97, 1992.

H. S. Gibbons, S. R. Kalb, R. J. Cotter, and C. R. Raetz. Role of mg2+ and ph in the modification

of salmonella lipid a after endocytosis by macrophage tumour cells. Mol Microbiol, 55(2):425-40,

2005.

J. R. Gog, A. Murcia, N. Osterman, O. Restif, T. J. McKinley, M. Sheppard, S. Achouri, B. Wei, P. Mastroeni, J. L. Wood, D. J. Maskell, P. Cicuta, and C. E. Bryant. Dynamics of salmonella infection of macrophages at the single cell level. J R Soc Interface, 9(75):2696-707, 2012.

S. Gordon and P. R. Taylor. Monocyte and macrophage heterogeneity. Nat Rev Immunol, 5(12):953-64, 2005.

E. A. Groisman. The pleiotropic two-component regulatory system phop-phoq. J Bacteriol, 183(6):

1835-42, 2001.

T. Guina, E. C. Yi, H. Wang, M. Hackett, and S. I. Miller. A phop-regulated outer membrane protease of salmonella enterica serovar typhimurium promotes resistance to alpha-helical antimicrobial peptides. J Bacteriol, 182(14):4077-86, 2000.

J. S. Gunn. Bacterial modification of lps and resistance to antimicrobial peptides. J Endotoxin Res,

7(1):57-62, 2001.

L. Guo, K. B. Lim, J. S. Gunn, B. Bainbridge, R. P. Darveau, M. Hackett, and S. I. Miller. Regulation of lipid a modifications by salmonella typhimurium virulence genes phop-phoq. Science, 276(5310):250-3, 1997.

M. V. Gutschow, J. J. Hughey, N. A. Ruggero, B. T. Bajar, S. D. Valle, and M. W. Covert. Single-cell

and population nf-kappab dynamic responses depend on lipopolysaccharide preparation. PLoS

One, 8(1):e53222, 2013.

A. Haraga and S. I. Miller. A salmonella enterica serovar typhimurium translocated leucine-rich repeat effector protein inhibits nf-kappa b-dependent gene expression. Infect Immun, 71(7):4052-8, 2003.

A. Haraga, M. B. Ohlson, and S. I. Miller. Salmonellae interplay with host cells. Nat Rev Microbiol, 6(1):53-66, 2008.

I. Hautefort, M. J. Proenca, and J. C. Hinton. Single-copy green fluorescent protein gene fusions allow accurate measurement of salmonella gene expression in vitro and during infection of mammalian cells. Appl Environ Microbiol, 69(12):7480-91, 2003.

S. Helaine and D. W. Holden. Heterogeneity of intracellular replication of bacterial pathogens. Curr

Opin Microbiol, 16(2):184-91, 2013.

S. Helaine, J. A. Thompson, K. G. Watson, M. Liu, C. Boyle, and D. W. Holden. Dynamics of intracellular bacterial replication at the single cell level. Proc Natl Acad Sci U S A, 107(8):3746-51, 2010.

S. Helaine, A. M. Cheverton, K. G. Watson, L. M. Faure, S. A. Matthews, and D. W. Holden. Internalization of salmonella by macrophages induces formation of nonreplicating persisters. Science, 343(6167):204-8, 2014.

K. Honda and T. Taniguchi. Irfs: master regulators of signalling by toll-like receptors and cytosolic pattern-recognition receptors. Nat Rev Immunol, 6(9):644-58, 2006.

R. B. Hornick, S. E. Greisman, T. E. Woodward, H. L. DuPont, A. T. Dawkins, and M. J. Snyder.

Typhoid fever: pathogenesis and immunologic control. N Engl J Med, 283(13):686-91, 1970.

A. M. Howells, H. L. Bullifent, K. Dhaliwal, K. Griffin, A. Garcia de Castro, G. Frith, A. Tunnacliffe, and R. W. Titball. Role of trehalose biosynthesis in environmental survival and virulence of

salmonella enterica serovar typhimurium. Res Microbiol, 153(5):281-7, 2002.

J. A. Ibarra and O. Steele-Mortimer. Salmonella-the ultimate insider. salmonella virulence factors that modulate intracellular survival. Cell Microbiol, 11(11):1579-86, 2009.

D. A. Jaitin, E. Kenigsberg, H. Keren-Shaul, N. Elefant, F. Paul, I. Zaretsky, A. Mildner, N. Cohen, S. Jung, A. Tanay, and I. Amit. Massively parallel single-cell rna-seq for marker-free decomposition of tissues into cell types. Science, 343(6172):776-9, 2014.

R. Janssen, T. van der Straaten, A. van Diepen, and J. T. van Dissel. Responses to reactive oxygen intermediates and virulence of salmonella typhimurium. Microbes Infect, 5(6):527-34, 2003.

M. A. Jepson, C. B. Collares-Buzato, M. A. Clark, B. H. Hirst, and N. L. Simmons. Rapid disruption of epithelial barrier function by salmonella typhimurium is associated with structural modification of intercellular junctions. Infect Immun, 63(1):356-9, 1995.

S. Jin, Y. Li, R. Pan, and X. Zou. Characterizing and controlling the inflammatory network during

influenza a virus infection. Sci Rep, 4:3799, 2014.

B. D. Jones, N. Ghori, and S. Falkow. Salmonella typhimurium initiates murine infection by penetrating and destroying the specialized epithelial m cells of the peyer's patches. J Exp Med, 180(1):15-23, 1994.

J. C. Kagan, T. Su, T. Horng, A. Chow, S. Akira, and R. Medzhitov. Tram couples endocytosis of

toll-like receptor 4 to the induction of interferon-beta. Nat Immunol, 9(4):361-8, 2008.

K. Kagaya, K. Watanabe, and Y. Fukazawa. Capacity of recombinant gamma interferon to activate

macrophages for salmonella-killing activity. Infect Immun, 57(2):609-15, 1989.

M. Kanehisa and S. Goto. Kegg: kyoto encyclopedia of genes and genomes. Nucleic Acids Res, 28(1):27-30, 2000.

D. Karolchik, A. S. Hinrichs, T. S. Furey, K. M. Roskin, C. W. Sugnet, D. Haussler, and W. J.

Kent. The ucsc table browser data retrieval tool. Nucleic Acids Res, 32(Database issue):D493-6, 2004.

C. A. Kasper, I. Sorg, C. Schmutz, T. Tschon, H. Wischnewski, M. L. Kim, and C. Arrieumerlou.

Cell-cell propagation of nf-kappab transcription factor and map kinase activation amplifies innate

immunity against bacterial infection. Immunity, 33(5):804-16, 2010.

S. H. Kaufmann. Immunity to intracellular bacteria. Annu Rev Immunol, 11:129-63, 1993.

P. V. Kharchenko, L. Silberstein, and D. T. Scadden. Bayesian approach to single-cell differential

expression analysis. Nat Methods, 11(7):740-2, 2014.

H. Kitano. Biological robustness. Nat Rev Genet, 5(11):826-37, 2004.

E. Kussell and S. Leibler. Phenotypic diversity, population growth, and information in fluctuating

environments. Science, 309(5743):2075-8, 2005.

F. J. Lacroix, A. Cloeckaert, O. Grepinet, C. Pinault, M. Y. Popoff, H. Waxin, and P. Pardon. Salmonella typhimurium acrb-like gene: identification and role in resistance to biliary salts and detergents and in murine infection. FEMS Microbiol Lett, 135(2-3):161-7, 1996.

M. R. Lamprecht, D. M. Sabatini, and A. E. Carpenter. Cellprofiler: free, versatile software for

automated biological image analysis. Biotechniques, 42(1):71-5, 2007.

P. Langfelder and S. Horvath. Wgcna: an r package for weighted correlation network analysis. BMC

Bioinformatics, 9:559, 2008.

M. N. Lee, M. Roy, S. E. Ong, P. Mertins, A. C. Villani, W. Li, F. Dotiwala, J. Sen, J. G.

Doench, M. H. Orzalli, I. Kramnik, D. M. Knipe, J. Lieberman, S. A. Carr, and N. Hacohen.

Identification of regulators of the innate immune response to cytosolic dna and retroviral infection

by an integrative approach. Nat Immunol, 14(2):179-85, 2013.

J. Levin and F. B. Bang. Clottable protein in limulus; its localization and kinetics of its coagulation by endotoxin. Thromb Diath Haemorrh, 19(1):186-97, 1968.

B. Li and C. N. Dewey. Rsem: accurate transcript quantification from rna-seq data with or without

a reference genome. BMC Bioinformatics, 12:323, 2011.

R. Losick and C. Desplan. Stochasticity and cell fate. Science, 320(5872):65-8, 2008.

C. Lupp, M. L. Robertson, M. E. Wickham, I. Sekirov, O. L. Champion, E. C. Gaynor, and B. B. Finlay. Host-mediated inflammation disrupts the intestinal microbiota and promotes the overgrowth of enterobacteriaceae. Cell Host Microbe, 2(3):204, 2007.

C. MacLennan, C. Fieschi, D. A. Lammas, C. Picard, S. E. Dorman, O. Sanal, J. M. MacLennan, S. M. Holland, T. H. Ottenhoff, J. L. Casanova, and D. S. Kumararatne. Interleukin (il)-12 and il-23 are key cytokines for immunity against salmonella in humans. J Infect Dis, 190(10):1755-7, 2004.

S. E. Majowicz, J. Musto, E. Scallan, F. J. Angulo, M. Kirk, S. J. O'Brien, T. F. Jones, A. Fazil, R. M. Hoekstra, and International Collaboration on Enteric Disease 'Burden of Illness' Studies. The global burden of nontyphoidal salmonella gastroenteritis. Clin Infect Dis, 50(6):882-9, 2010.

F. O. Martinez, A. Sica, A. Mantovani, and M. Locati. Macrophage activation and polarization. Front Biosci, 13:453-61, 2008.

P. Mazurkiewicz, J. Thomas, J. A. Thompson, M. Liu, L. Arbibe, P. Sansonetti, and D. W. Holden. Spvc is a salmonella effector with phosphothreonine lyase activity on host mitogen-activated protein kinases. Mol Microbiol, 67(6):1371-83, 2008.

V. J. McGovern and L. J. Slavutin. Pathology of salmonella colitis. Am J Surg Pathol, 3(6):483-90, 1979.

J. McIntyre, D. Rowley, and C. R. Jenkin. The functional heterogeneity of macrophages at the single cell level. Aust J Exp Biol Med Sci, 45(6):675-80, 1967.

S. McLean, L. A. Bowman, and R. K. Poole. Katg from salmonella typhimurium is a peroxynitritase.

FEBS Lett, 584(8):1628-32, 2010.

M. Merighi, C. D. Ellermeier, J. M. Slauch, and J. S. Gunn. Resolvase-in vivo expression technology analysis of the salmonella enterica serovar typhimurium phop and pmra regulons in balb/c mice. J Bacteriol, 187(21):7407-16, 2005.

S. I. Miller and J. J. Mekalanos. Constitutive expression of the phop regulon attenuates salmonella

virulence and survival within macrophages. J Bacteriol, 172(5):2485-90, 1990.

T. H. Mogensen. Pathogen recognition and inflammatory signaling in innate immune defenses. Clin

Microbiol Rev, 22(2):240-73, Table of Contents, 2009.

D. M. Monack, B. Raupach, A. E. Hromockyj, and S. Falkow. Salmonella typhimurium invasion

induces apoptosis in infected macrophages. Proc Natl Acad Sci U S A, 93(18):9833-8, 1996.

D. C. Morrison and J. L. Ryan. Endotoxins and disease mechanisms. Annu Rev Med, 38:417-32, 1987.

National Institutes of Health (US). Understanding emerging and re-emerging infectious disease, 2007.

L. M. Noriega, P. Van der Auwera, D. Daneau, F. Meunier, and M. Aoun. Salmonella infections in

a cancer center. Support Care Cancer, 2(2):116-22, 1994.

C. Nunez-Hernandez, A. Tierez, A. D. Ortega, M. G. Pucciarelli, M. Godoy, B. Eisman, J. Casadesus, and F. Garcia-del Portillo. Genome expression analysis of nonproliferating intracellular salmonella enterica serovar typhimurium unravels an acid ph-dependent phop-phoq response essential for dormancy. Infect Immun, 81(1):154-65, 2013.

I. Paciello, A. Silipo, L. Lembo-Fazio, L. Curcuru, A. Zumsteg, G. Noel, V. Ciancarella, L. Sturiale, A. Molinaro, and M. L. Bernardini. Intracellular shigella remodels its lps to dampen the innate immune recognition and evade inflammasome activation. Proc Natl Acad Sci U S A, 110(46):E4345-54, 2013.

Y. K. Park, B. Bearson, S. H. Bang, I. S. Bang, and J. W. Foster. Internal ph crisis, lysine decarboxylase and the acid tolerance response of salmonella typhimurium. Mol Microbiol, 20(3):605-11, 1996.

J. C. Perez and E. A. Groisman. Acid ph activation of the pmra/pmrb two-component regulatory system of salmonella enterica. Mol Microbiol, 63(1):283-93, 2007.

C. V. Rao, D. M. Wolf, and A. P. Arkin. Control, exploitation and tolerance of intracellular noise.

Nature, 420(6912):231-7, 2002.

V. A. Rathinam, S. K. Vanaja, L. Waggoner, A. Sokolovska, C. Becker, L. M. Stuart, J. M. Leong, and K. A. Fitzgerald. Trif licenses caspase-11-dependent nlrp3 inflammasome activation by gram-negative bacteria. Cell, 150(3):606-19, 2012.

M. Rathman, M. D. Sjaastad, and S. Falkow. Acidification of phagosomes containing salmonella

typhimurium in murine macrophages. Infect Immun, 64(7):2765-73, 1996.

N. Robinson, S. McComb, R. Mulligan, R. Dudani, L. Krishnan, and S. Sad. Type i interferon induces necroptosis in macrophages during infection with salmonella enterica serovar typhimurium. Nat Immunol, 13(10):954-62, 2012.

C. M. Rosenberger, M. G. Scott, M. R. Gold, R. E. Hancock, and B. B. Finlay. Salmonella typhimurium infection and lipopolysaccharide stimulation induce similar changes in macrophage gene expression. J Immunol, 164(11):5894-904, 2000.

I. Rychlik and P. A. Barrow. Salmonella stress management and its relevance to behaviour during

intestinal colonisation and infection. FEMS Microbiol Rev, 29(5):1021-40, 2005.

M. C. Schlumberger, A. J. Muller, K. Ehrbar, B. Winnen, I. Duss, B. Stecher, and W. D. Hardt.

Real-time imaging of type iii secretion: Salmonella sipa injection into host cells. Proc Natl Acad

Sci U S A, 102(35):12548-53, 2005a.

M. C. Schlumberger, A. J. Muller, K. Ehrbar, B. Winnen, I. Duss, B. Stecher, and W. D. Hardt.

Real-time imaging of type iii secretion: Salmonella sipa injection into host cells. Proc Natl Acad

Sci U S A, 102(35):12548-53, 2005b.

W. R. Schwan, X. Z. Huang, L. Hu, and D. J. Kopecko. Differential bacterial survival, replication, and apoptosis-inducing ability of salmonella serovars within human and murine macrophages.

Infect Immun, 68(3):1005-13, 2000.

A. K. Shalek, R. Satija, X. Adiconis, R. S. Gertner, J. T. Gaublomme, R. Raychowdhury, S. Schwartz, N. Yosef, C. Malboeuf, D. Lu, J. T. Trombetta, D. Gennert, A. Gnirke, A. Goren, N. Hacohen, J. Z. Levin, H. Park, and A. Regev. Single-cell transcriptomics reveals bimodality in expression and splicing in immune cells. Nature, 2013.

A. K. Shalek, R. Satija, J. Shuga, J. J. Trombetta, D. Gennert, D. Lu, P. Chen, R. S. Gertner,

J. T. Gaublomme, N. Yosef, S. Schwartz, B. Fowler, S. Weaver, J. Wang, X. Wang, R. Ding, R. Raychowdhury, N. Friedman, N. Hacohen, H. Park, A. P. May, and A. Regev. Single-cell

rna-seq reveals dynamic paracrine control of cellular variation. Nature, 510(7505):363-9, 2014.

A. A. Shishkin, G. Giannoukos, A. Kucukural, D. Ciulla, M. Busby, C. Surka, J. Chen, R. P. Bhattacharyya, R. F. Rudy, M. M. Patel, N. Novod, D. T. Hung, A. Gnirke, M. Garber, M. Guttman, and J. Livny. Simultaneous generation of many rna-seq libraries in a single reaction. Nat Methods, 12(4):323-5, 2015.

O. K. Silander, N. Nikolic, A. Zaslaver, A. Bren, I. Kikoin, U. Alon, and M. Ackermann. A genome-wide analysis of promoter-mediated phenotypic noise in escherichia coli. PLoS Genet, 8(1):e1002443, 2012.

R. D. Sleator and C. Hill. Bacterial osmoadaptation: the role of osmolytes in bacterial stress and

virulence. FEMS Microbiol Rev, 26(1):49-71, 2002.

L. M. Sly, D. G. Guiney, and N. E. Reiner. Salmonella enterica serovar typhimurium periplasmic

superoxide dismutases sodci and sodcii are required for protection against the phagocyte oxidative

burst. Infect Immun, 70(9):5312-5, 2002.

B. Stecher, R. Robbiani, A. W. Walker, A. M. Westendorf, M. Barthel, M. Kremer, S. Chaffron, A. J. Macpherson, J. Buer, J. Parkhill, G. Dougan, C. von Mering, and W. D. Hardt. Salmonella

enterica serovar typhimurium exploits inflammation to compete with the intestinal microbiota.

PLoS Biol, 5(10):2177-89, 2007.

K. L. Strandberg, S. M. Richards, and J. S. Gunn. Cathelicidin antimicrobial peptide expression is not induced or required for bacterial clearance during salmonella enterica infection of human monocyte-derived macrophages. Infect Immun, 80(11):3930-8, 2012.

A. Subramanian, P. Tamayo, V. K. Mootha, S. Mukherjee, B. L. Ebert, M. A. Gillette, A. Paulovich, S. L. Pomeroy, T. R. Golub, E. S. Lander, and J. P. Mesirov. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A, 102(43):15545-50, 2005.

J. J. Trombetta, D. Gennert, D. Lu, R. Satija, A. K. Shalek, and A. Regev. Preparation of single-cell rna-seq libraries for next generation sequencing. Curr Protoc Mol Biol, 107:4.22.1-4.22.17, 2014.

J. C. van Velkinburgh and J. S. Gunn. Phop-phoq-regulated loci are required for enhanced bile

resistance in salmonella spp. Infect Immun, 67(4):1614-22, 1999.

A. Vazquez-Torres, J. Jones-Carson, A. J. Baumler, S. Falkow, R. Valdivia, W. Brown, M. Le, R. Berggren, W. T. Parks, and F. C. Fang. Extraintestinal dissemination of salmonella by cd18-expressing phagocytes. Nature, 401(6755):804-8, 1999.

A. Vazquez-Torres, J. Jones-Carson, P. Mastroeni, H. Ischiropoulos, and F. C. Fang. Antimicrobial actions of the nadph phagocyte oxidase and inducible nitric oxide synthase in experimental salmonellosis. i. effects on microbial killing by activated peritoneal macrophages in vitro. J Exp Med, 192(2):227-36, 2000.

R. S. Wallis, S. Patil, S. H. Cheon, K. Edmonds, M. Phillips, M. D. Perkins, M. Joloba, A. Namale, J. L. Johnson, L. Teixeira, R. Dietze, S. Siddiqi, R. D. Mugerwa, K. Eisenach, and J. J. Ellner. Drug tolerance in mycobacterium tuberculosis. Antimicrob Agents Chemother, 43(11):2600-6, 1999.

S. R. Waterman and P. L. Small. Acid-sensitive enteric pathogens are protected from killing under

extremely acidic conditions of ph 2.5 when they are inoculated onto certain solid food sources.

Appl Environ Microbiol, 64(10):3882-6, 1998.

S. E. Winter, A. M. Keestra, R. M. Tsolis, and A. J. Baumler. The blessings and curses of intestinal

inflammation. Cell Host Microbe, 8(1):36-43, 2010.

World Health Organization. Salmonella (non-typhoidal), Aug, 2013 2015a.

World Health Organization. Tuberculosis, March 2015 2015b.

World Health Organization. Typhoid, April 13 2015 2015c.

Appendix A

Appendix

A.1 Supplemental Figures

[Figure S1 image, panels A to E. Recoverable labels: Single Cells; Sorted Populations; Unexposed, Uninfected, Infected; pHrodo, pHrodo+GFP; Time (hrs): 0, 2.5, 4, 8; Number of Ifit2 mRNA molecules per infected cell; Fluorescence (pHrodo); Cluster III, Cluster IV, Cluster V.]

Figure S1: Confirmation of the behavior of gene clusters. (A) Heatmaps showing the behavior of Clusters I and II in single cells across time points. (B) Heatmaps showing the behavior of Clusters III, IV, and V in single cells across time points. Cells are sorted according to average expression in each cluster. (C) BMMs were infected with pHrodo-labelled GFP-expressing S. typhimurium. Four hours after infection cells were fixed and analyzed by RNA flow-FISH with probes that detect Ifit2, a representative gene from Cluster III. Infected cells were identified using pHrodo. Plotting the number of Ifit2 molecules vs. bacterial GFP fluorescence in single infected cells indicated no correlation between bacterial load and Cluster III expression (R 0.003484). (D) BMMs were analyzed by RNA flow-FISH as in (C). Probes for the indicated genes (representatives of Clusters II and III) were used. Homogeneous induction in infected cells was evident for genes in Cluster II and bimodal induction was evident for genes in Cluster III. The housekeeping gene Actb served as control and showed similar expression across these conditions. (E) Comparison of Cluster III genes in sorted populations and single cells exposed to S. typhimurium. Although the difference between uninfected and infected cells is apparent using single-cell analysis, none of these genes pass significance thresholds (p < 0.05 and absolute log2 fold change of at least 1) in sorted populations.

[Figure S2 image, panels A to C. Recoverable labels: Single Cells; Unexposed, Uninfected, Infected; pHrodo, pHrodo+GFP; S. enterica, LPS-beads; Cluster I, Cluster II, Cluster III; Expression Score.]

Figure S2: Infection with LPS-beads induces Cluster III in a significantly higher fraction of cells compared to infection with live bacteria. (A) Heat maps showing the behavior of 96 genes representing Clusters I, II and III using single-cell real-time qPCR at four hours post S. typhimurium exposure. (B) BMMs were exposed to either live S. typhimurium or latex beads coated with S. typhimurium LPS for four hours. Gene expression values for Clusters I, II and III are shown. Cells are ordered according to average induction of Cluster III (white: unexposed, n_bacteria 23, n_LPS 20; grey: uninfected, n_bacteria 24, n_LPS 15; green: infected, n_bacteria 42, n_LPS 14). (C) Plots summarize the expression of each cluster from (B) with a score based on a weighted average of scaled expression values (x-axis, dots represent location of single cells) and display it versus the frequency of single cells (y-axis) (unexposed-black (top), uninfected-grey (middle), infected-green (bottom)) for BMMs infected with live bacteria (top panel) or with LPS-coated beads (bottom panel). Distributions are scaled to have the same maximum height.

[Figure S3 image, panels A to G. Recoverable labels: Sorted population; Cluster I; Cluster II; 4 hours, 8 hours; Uninfected, PhoP low, PhoP high; WT, PhoP-, PhoPc; infected.]

Figure S3: Bacterial PhoPQ activity drives heterogeneity in the host Type I IFN response. (A) iBMMs stably expressing ISRE-GFP were infected with iRBFP-labeled S. typhimurium. ISRE-negative and ISRE-positive populations were sorted and host and bacterial transcripts were quantified. Shown are bacterial pathways enriched by GSEA analysis in ISRE-positive compared to ISRE-negative cells. Targets of the transcription factor PhoP were significantly overrepresented (p < 0.05 after multiple hypothesis correction). (B) BMMs were exposed to phoP-GFP expressing S. typhimurium. Infected cells were sorted according to GFP intensity and bacterial transcripts were analyzed. As expected, GFP intensity correlates with phoP levels. (C) Cells were infected as in (B), and host transcripts were analyzed. Plots summarize the expression of each cluster in uninfected cells and cells infected with PhoP-low or PhoP-high bacteria. Confidence intervals were calculated by bootstrapping across genes (Experimental Procedures). (D) BMMs were exposed to S. typhimurium expressing either a PhoP reporter or a constitutive fluorescent reporter (driven by the rpsM promoter). Cells were sorted according to GFP fluorescence (non - uninfected cells), and Ifit2, a Cluster III representative transcript, was quantified by real-time qPCR (Gapdh was used for relative quantification). Cluster III induction correlates with GFP intensity in cells containing a PhoP reporter; no association was detected in cells containing the constitutive reporter. (E) Cells were infected as in (B), and host transcripts were analyzed. No differences were measured between PhoP-low and PhoP-high infected populations in Clusters I or II. (F) BMMs were exposed to WT, PhoP- or PhoPc S. typhimurium strains. Cells infected with PhoPc induce higher expression of the Type I IFN response compared to cells infected with WT, and cells infected with PhoP- induce lower levels of this cluster compared to cells infected with WT. (G) iBMMs were infected with WT, PhoP- or PhoPc S. typhimurium strains. Most single cells exposed to PhoPc showed induction of Cluster III genes. WT and PhoP- infected cells showed bimodal induction of Cluster III.

[Figure S4 image, panels A to E. Recoverable labels: Heat killed bacteria; Culture supernatant; Sorted populations; Single Cells; Cluster I, Cluster II, Cluster III; Ifit2 (Cluster III), Tnf (Cluster I); Time (hrs): 0, 2, 4, 8; WT, PhoP-, PhoPc; Commercial S. enterica LPS, WT extracted LPS; unexposed, uninfected, infected-red beads, infected-green beads.]

Figure S4: Stimulation with LPS extracted from PhoP mutant strains induces varying levels of Cluster III genes. (A) BMMs were stimulated with heat-killed S. typhimurium or culture supernatants from the different PhoP mutant strains, and the expression of representative genes from Cluster I (Tnf) and Cluster III (Ifit2) was analyzed. Only treatment with heat-killed bacteria produced a differential Type I IFN response (Ifit2). No differences were observed for stimulation with bacterial culture supernatants or a Cluster I representative gene (Tnf). (B) The expression of each cluster was summarized with a score based on weighted average expression after stimulation with LPS extracted from WT, PhoP- or PhoPc strains. Confidence intervals were calculated by bootstrapping across genes (Experimental Procedures). (C) BMMs were stimulated with either LPS extracted from WT S. typhimurium or commercial S. typhimurium LPS and analyzed by qRT-PCR. No differences in gene expression were evident between the two conditions. (D) BMMs were exposed to uncoated, WT (red or green), PhoP- (red), PhoPc (green) or a mixture of PhoP- and PhoPc LPS-coated beads. Shown is the single-cell expression of 96 signature genes. After exposure to a mixed population of beads, most cells containing beads coated with PhoPc LPS show higher induction of the Type I IFN response compared to cells containing beads coated with PhoP- LPS. (E) BMMs were exposed to beads coated with LPS extracted from WT S. typhimurium, and sorted according to bead burden as indicated by fluorescence. Induction of Cluster III was similar in cells infected with low and high bead burden.
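The bootstrapping across genes mentioned in the Figure S3 and Figure S4 captions can be sketched as follows. This is a minimal illustration only: the Experimental Procedures are not reproduced in this appendix, so the percentile method, the unweighted mean, the function name `bootstrap_ci`, and the example values are all assumptions rather than the procedure actually used.

```python
import random
import statistics


def bootstrap_ci(values, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`.

    `values` holds one hypothetical scaled expression value per gene in a
    cluster; genes are the resampling unit.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    means = []
    for _ in range(n_boot):
        # Resample genes with replacement and record the resampled mean.
        resample = [values[rng.randrange(n)] for _ in range(n)]
        means.append(statistics.fmean(resample))
    means.sort()
    # Take the alpha/2 and 1 - alpha/2 percentiles of the bootstrap means.
    lo_idx = int((alpha / 2) * n_boot)
    hi_idx = int((1 - alpha / 2) * n_boot) - 1
    return means[lo_idx], means[hi_idx]


# Hypothetical per-gene values for one cluster in one condition.
example_cluster = [0.8, 1.1, 0.9, 1.4, 1.0, 1.2, 0.7, 1.3]
lower, upper = bootstrap_ci(example_cluster)
```

Resampling genes rather than cells means the interval reflects how much the cluster score depends on which genes happen to be in the cluster, which is one plausible reading of "across genes".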


[Figure S5 image, panels A and B. Recoverable labels: Gating Strategy; FSC-A, SSC-H, FSC-H, eCFP (Sytox blue), F4/80; Sorted Population.]

Figure S5: Stimulation of mice with LPS extracted from PhoP mutant strains. (A) Gating strategy for the isolation of intraperitoneal macrophages from mice stimulated with LPS. Mice were injected i.p. with LPS extracted from the PhoP mutants, and 2 hours later peritoneal fluid was harvested and analyzed by FACS for macrophage-specific surface markers (F4/80 and CD11b). (B) Mice were challenged with LPS extracted from the PhoP strains. The activation of the Type I IFN response was enhanced in mice stimulated with PhoPc LPS compared to mice stimulated with WT LPS. This difference was abrogated when PhoPc LPS was co-administered with the BX795 inhibitor.

A.2 Supplemental Tables

Table S1: Genes used in PCA in Fig. 2-3A

IRG1 MTDH RAB20 PSMD10 ALAS1 BC006779
CLEC4E NCK1 IFI205A TMEM168 AKR1C12 RNF34
GPR84 IRAK2 AK4 4933426M11RIK STX6 ANAPC2
TRAF1 CXCL1 CISH PRKCD MMP13 ADHFE1
TNFRSF1B ARHGEF3 HIVEP3 HOOK3 DLD ADAMTS4
ICAM1 NINJ1 AEBP2 ZFP800 OASL2 H2-M2
TNFAIP2 PTAFR SBDS RPL13 TICAM1 JARID2
EHD1 DTX3L MTHFD2 IL15 HDC NAA25
TNF IFI203 SERPINE1 GPR31C STARD5 DHCR24
CD40 MNDA DNAJB6 AGRN EIF2C3 ITGAL
SOD2 MYD88 TET2 RBPJ CHMP2B PHF21A
NFKBIA C3 IL10RA AI504432 PDGFRB IGTP
PLEK NFE2L1 IL2RG AMPD3 ABCC5 DDT
TLR2 CDKN1A ADORA2A IL1RN TNFSF9 HVCN1
CCL5 SYK RABGEF1 PTPRC HERC6 RPS6KA3
NLRP3 PDGFB ZFP263 STX11 JUNB LEPRE1
NFKBIZ JAG1 MYADM NT5C3 RIN3 STAT2
CXCL2 CA13 TRIM13 KDM4A DAAM1 GTF2E1
CD14 EIF4EBP1 GBP7 SLC15A3 ACOT9 PHTF2
BIRC3 CLIC4 STK38L FAM20C DCUN1D3 SLC44A1
SDC4 DENR NCF1 SAMSN1 TAGAP ETS2
BCL2A1B AGPAT4 SKIL OPTN EZR EVI5L
TNFAIP3 PARP14 AA467197 AKNA DENND1A RGL1
RASSF4 3110043O21RIK PPFIBP2 MVP RRAGA IFIT3
CCL3 CASP4 SLC16A3 SLC39A14 LRRC16A INTS6
NFKBIE SHISA3 ZFP36 UBA3 MTMR14 CIAPIN1
IRAK3 IL12B PSMB10 DDX3Y GTF2B STXBP3
IGSF6 MAP3K8 MMP14 RNF14 UPP1 RNF114
SOCS3 MITF IFIH1 ZEB2 PALM2 MX2
ITGA5 SLC11A2 IL1B SH3BGRL2 IL6 MT2
GBP5 GPD2 HCK PLAT NRP2 CCNL1
SQSTM1 CXCL10 DST PHGDH MINA NCOA7
CCL9 DRAM1 CCDC86 BNIP2 SH2B2 EXT1
CD82 PION FKBPL CLEC5A SP140 RNF213
PDE4B RCL1 PELI1 MAML2 TLR6 ACVR2A
TNIP1 GRAP GAS7 CCDC50 NCF4 PRDM1
SLC2A6 DNAJA2 UBE2F TNIP3 CDC40 DPF2
GADD45B NIACR1 RASA4 LY9 RALGAPA2 VCAM1
CYBB MSANTD3 TBCD TRIP12 PEX16 CD200
FNBP1L CALCRL STARD7 TMEM176B CX3CL1 IFI35
CCRL2 MALT1 TPM4 THEM4 RHEB XAF1
ORAI2 TYK2 ARID5B RAB11FIP1 IFI204 HBP1
CD83 FPR1 NFE2L2 SLC25A37 GPR126 ETV6
SLC2A1 SLC7A11 TLR1 DSE STAP1 PBX3
PPFIBP1 PPP1R15A ICOSL CAV1 MYO10 NOTCH2
CDC42EP2 SAA3 TARM1 PTGES TCIRG1 TRAFD1
NCRNA00085 IFIT1 MECP2 TARS HIF1A FOP

CXCL3 PSTPIP2 ASS1 CFB CLEC4A1 PDCD10
CFLAR LILRB4 SBNO2 PVR ATP2B1 TMEM171
ADORA2B H2-D1 HCLS1 EDNRB PID1 LIPG
SLFN2 ATP2B4 ZNFX1 IKBKB AKAP2 EIF2S2
SLC31A2 GP49A RCSD1 MX1 USP18 VPS37C
CD274 NFKB2 LMO4 SLC4A7 DUSP2 SLC38A7
CCL4 IRF1 SLC31A1 SLCO3A1 DDX58 CCL7
GBP2 BCL2L11 AOAH PFN1 MYBBP1A GGCT
SWAP70 GM6377 RSAD2 OASL1 RCAN1 POFUT2
EBI3 LACC1 SPATA13 ARAP1 BCL11A TRIM30A
GPR18 MTPN RNF19B APPL1 DDX28 CXCL12
NUPR1 IL27 6330409N04RIK GDPD1 RGS16 SP100
TREX1 ZC3H12C ZMYND15 IARS MLLT6 PIM2
ST3GAL1 KPNA3 RILPL2 POLDIP3 FAM102B CD247
PIK3R5 SNX10 PLAGL2 OLR1 AGTRAP ZFP607
ISG15 PLAUR RAB10 MAPKAPK2 GSTT1 GCA
RASGEF1B PSMA6 CLCN7 BSG RRS1 OPA3
ACSL1 CPEB4 LPCAT2 MAPKBP1 HMGCR GPR132
NFKBIB GYK LCN2 ACP2 DYRK2 XYLT2
FAS ITGA4 ST7 TMA16 MOB3C KDM6B
NOS2 GEM PSMD5 MCOLN2 IFIT2 SLFN8
CXCL16 SAMD9L RTP4 AARS GCH1 SLC13A3
LZTFL1 CSF2RB ABI1 CPD EIF6 TNFRSF8
EGR2 IL4I1 THA1 SEMA4D ATPBD4 DYNC1LI1
FCGR2B FAM49A SLC25A3 TXNRD1 CUL1 TRIB3
CD69 NADK BTG2 KCNA3 DOCK4 MCEMP1
CERS6 FLNB CD47 SEL1L ADCK3 TMEM39A
TANK AW112010 CARS BTG1 SPNA2 SLC1A4
LPAR1 CYP4V3 ADAM17 FYB IL18BP ZBP1
IER3 NAMPT VASP GARS GM7535 IRGM1
MARCKSL1 PIK3R6 LIMS1 EDN1 GLCE SPTLC2
IKBKE PRDX5 MARCKS SPP1 ZFP719 DCAF13
GBP3 KLF7 PFKL H2-Q6 RBM7 SLC37A3
SRGN HSPA9 MNDAL PFKP IFT57 ETF1
ASNS REL EXOC3L4 MRPL52 ABCA1 SLC41A1
ARL5C H2-AB1 LPHN2 FPR2 SGMS1 TMEM68
PPP4R2 CD44 FICD SLC25A17 ITGAV ADAM9
PARP9 NOD2 SLC23A2 CCL2 PEX5 GOLGB1
HERPUD1 SLC7A2 NARS FCHSD2 TICAM2 PTGS2
GTPBP2 RAB32 RAP1B GM4902 LCP2 SLC7A7
3110003A17RIK MFSD7 CLTA TFEC LILRB3 RIPK2
CMPK2 OSTF1 RHBDF2 DUSP16 GRAMD1A SERTAD1
CDYL2

Table S2: Gene clusters I, II, and III identified from single-cell analysis

Cluster I Cluster II Cluster III

FTH1 TBCD PLAGL2 I830012O16RIK CD14 LRRC16A PDGFRB IFIT3 CCL7 ADAM9 ARID5B RSAD2

LILRB4 PION CLIC4 USP18 GPR84 VPS37C MECP2 IRF7 NFKBIA HBP1 NOTCH2 IFIT2 TNF TRIB3 OLR1 MX1 CLEC4E HVCN1 GPR126 PYHIN1 GP49A MTMR14 SLC38A1 IFIT1 ICAM1 TMEM39A FMNL2 CMPK2 CCL2 SBNO2 HOOK2 GVIN1 TLR2 MARCKS CTNND2 ISG20 BCL2A1B MITF RAB11FIP1 PHF11L SDC4 RAP1B EXT1 TRIM30C CCL9 ABCC5 TFEC TGTP CD83 ICOSL AKR1C12 MX2 NCF4 ST3GAL1 KCNA3 MS4A4C NINJ1 PSTPIP2 SLC37A3 IGTP CXCL1 PDGFB VHL GM4902 TNFAIP2 CUL1 FAS ZBP1 PLEK DYNC1LI1 ATP6V0A2 IRGM1 SLFN2 TICAM1 THA1 IFI47 HERPUD1 SOCS3 STX6 IFI205A CLEC5A KLF7 MAPKBP1 HERC6 TMEM176B TXNRD1 DPEP3 ISG15 IGSF6 TNFAIP3 MALT1 OASL2 TNFRSF1B MOB3C EXOC3L4 TRIM30A MRPL52 SPATA13 PCM1 OASL1 SLC31A2 PEX16 GLCE DDX60 CXCL2 VASP XKR6 IRGM2 ITGA5 PSMD5 AI504432 DDX58 CDKN1A MARCKSL1 IL7 RNF213 3110003A17RIK XYLT2 FGR STAT2 PLAUR NRP2 LIPG TRIM30D ADORA2B GM6377 AQP9 GM14446 LILRB3 JARID2 LAPTM4B XAF1 ASNS AOAH MS4A6C STAT1 FYB IKBKB DCP2 BC094916 CD82 TARM1 FBXO42 OAS3 IL2RG PRKCD LPCAT2 RTP4 SLC2A1 PIM2 ADCK3 MNDA RAB32 CHMP2B FAM72A IFI204 NARS PIK3R6 APPL1 GM4955 MTHFD2 CPD AKAP2 SLFN5 IL4I1 RNF19B AGRN IFI35 PIK3R5 CYP4V3 BIRC2 TRIM30B CXCL16 PHTF2 TRIM13 TAP1 RPL13 LEPRE1 GPR31C SAMHD1 NLRP3 ZFP719 RPS6KA3 PARP14 ASS1 UPP1 LPHN2 SLFN1 SOD2 SLC38A7 AK4 PYDC4 UBE2F CD44 NUB1 H2-T22 CDC42EP2 OPTN BCO2 DHX58

HCLS1 NFE2L1 2500003M10RIK SLFN8 CAV1 EIF2C3 NCOA7 GBP9 NUPR1 GTPBP2 EVI5L OAS2 NCF1 CA13 CX3CL1 TRAFD1 MINA GK MMP14 IFI203 DENR ETS2 PPM1K GM1966 SEMA4D HIF1A ITGAL SP140 EHD1 SLC2A6 TRPM4 SAMD9L CD69 ETF1 DDX28 SP100 TPM4 MAPKAPK2 AMPD3 PHF11 CARS 4933426M11RIK ZNFX1 OAS1G NADK MSANTD3 FBXO30 LGALS9 ACP2 RIPK2 RCL1 LGALS3BP PSMD10 CPEB4 AEBP2 MNDAL CYBB CDC40 DDT CXCL10 TANK ZC3H12C SLFN10-PS FAM26F LIMS1 RNF34 RNF25 IL18BP SWAP70 LPAR1 JAK2 AA467197 TRAF1 RAB10 INTS5 TRIM34A MYADM CASP4 BSG IFITM3 ZFP36 H2-Q7 FLNB MLKL NFE2L2 SLC23A2 TICAM2 SLC25A22 SNX10 3110043O21RIK CLTA OAS1A SERTAD1 GTF2E1 ADHFE1 EIF2AK2 HCK ANAPC2 TLR6 TREX1 PTGES ALAS1 NFKB2 PSMB10 PSMA6 CLDND1 ETV6 TLR3 ARL5C SAMSN1 ADORA2A GM5431 RASGEF1B SLC39A14 CLEC4A1 UBA7 ZFP263 DTX3L SPINT2 EPSTI1 LCP2 SLC13A3 MOB3B SP110 EIF4EBP1 THEM4 GSTT1 PML MVP EGR2 GEM TSPO RBM7 ARHGEF3 GM7535 BST2 PLAT MYO10 IDH2 THEMIS2 BIRC3 RIN3 PPP4R2 H2-T23 EBI3 GRAP STARD5 PNP RASSF4 PEX5 GADD45B GCA TNFSF9 MAP3K8 SLCO3A1 TRIM25 AGPAT4 SPTLC2 GM20459 PSME2 SLC7A7 BNIP2 TMEM93 RNF114 TARS LACC1 SH3BGRL2 SETDB2 CFLAR BTG1 MAML2 PTTG1 STAP1 CD47 TSHZ1 MOV10 IER3 NAA25 ZADH2 AA960436 NFKBIE ZFP607 SLC25A3 ZUFSP TNIP1 HSPA9 DUSP16 MTHFR PTPRC ATP2B4 MDFIC MITD1 FCGR2 CCNL1 CD247 BC006779 MYBBP1A SPRYD7 MCEMP1 GM4759

NFKBIZ SLC44A1 H2-AB1 TOR3A ABI1 NCK1 CCNG2 FCGR1 RCAN1 DUSP2 TCIRG1 FAM46A TNIP3 RRAGA PLSCR1 IL15RA SLC11A2 LZTFL1 ADAMTS4 LGALS8 AARS TRIP12 FPR2 LY6E ZEB2 ZMYND15 MAP2K1 EDN1 IKBKE PHB2 KIAA1551 PGAP2 MT2 TET2 GAS7 GCH1 SKIL DENND1A FPR1 PAPD7 GDPD1 ZFP800 CD274 TIFAB RILPL2 GPD2 CIAPIN1 PNPT1 GTF2B GRAMD1A LANCL2 SLC31A1 DNAJB6 TYK2 GPR18 AI607873 FNBP1L MTPN RHEB GNB4 HMGCR OSTF1 NAMPT ARHGAP25 ITGA4 GGCT GBP7 MSR1 PFKL H2-M2 TRIM12A CD200R4 TMA16 INTS6 SLC7A2 ADAP2 MYO1G KDM4A CASP1 AIM2 POLDIP3 SERPINE1 BATF TMEM171 PHGDH DCUN1D3 MYD88 TCF4 POFUT2 OSBPL3 HDC CPNE3 BTG2 TAPBP SLC7A11 RAI14 IARS DRAM1 PPP1R15A EVI2A ARAP1 CISH PROCR NQO2 H2-D1 PRDM1 IFIH1 PARP10 MCOLN2 NFKBIB GBP5 CLCN7 HCAR2 IL18 REL DST IRF1 ADAM17 AGTRAP CFB STARD7 LMO4 LCN2 ACSL1 PPFIBP2 PDIA3 PTAFR ST7 VCAM1 RAB20 DNAJA2 IL15 SYK RALGAPA2 CD200 MTDH PELI1 CCL3 RABGEF1 PBX3 DAXX IRAK2 GARS ATF3 TLR1 H2-Q6 GBP4 TMEM68 CDYL2 IRG1 SBDS JUNB PFN1 CCDC50 MFSD7 NT5C3 SLC25A17 SHISA3 NOS2 ATPBD4 DYRK2 ITM2B CSF2RB SLC1A4 CCRL2 FKBPL IFT57 SRGN LY9 STX11 IL1RN C3 RCSD1 MMP13 SLC16A3 ABCA1 PRDX5

RHBDF2 SLC15A3 IL6 ORAI2 FAM102B PTGS2 CCDC86 KPNA3 GBP2 ITGAV FCHSD2 IL27 SPNA2 NOD2 GM11428 DLD ANO6 IL1B SGMS1 TXLNB AW112010 RBPJ SH2B2 CD40 HOOK3 OPA3 CXCL3 EZR PVR IL12B PHF21A RRS1 SAA3 STXBP3A DAAM1 SPP1 DCAF13 CALCRL CCL4 FAM20C RGL1 CCL5 RASA4 DOCK4 RGS16 KDM6B BCL2L11 CXCL12 RNF14 AKNA PID1 GPR132 IL10RA SLC41A1 ACOT9 MLLT6 LASS6 SLC4A7 EDNRB IRAK3 SQSTM1 FICD DDX3Y SLC25A37 UBA3 PARP9 FAM49A TNFRSF8 PPFIBP1 TAGAP PDE4B NFKBIL1 TMEM168 ACVR2A STK38L DHCR24 NCRNA00085 BCL11A JAG1 9930022D16RIK TGIF1 DSE ATP2B1 FAM108C1

Table S3: Gene clusters IV and V identified from single-cell analysis
Cluster IV Cluster V
TOP2A 4930427A07RIK MTR GRK4 TCTE2 6430704M03RIK PBK FANCD2 PSIP1 FGD4 CCDC114 LRP5 KIF22 SMC4 CEP192 CPA6 DHTKD1 ARHGEF12 CENPE RBBP7 FANCG GM17615 9830107B12RIK CANO NUSAP1 HAUS1 C1QBP KIAA1107 NYX BTBD19 PRC1 SPAG5 ZFP961 4732465J04RIK ODZ1 GPC6 NUF2 NCAPG2 BRD8 HSD3B2 TIMELESS FAM158A BIRC5 TINF2 SRGAP3 CES2B APOL7B LEF1 AURKB KIF14 FADS1 VMN1R58 CCDC144B SH2D4A KIF11 GINS2 2700029M09RIK IFT80 SLC17A8 FAP BUB1B WDR67 RNF141 ZNF488 MPP3 GM10032

SPC25 SYCE2 CAML CYP4X1 EXPH5 TNS4 ANLN LIG1 GM16039 9930111J21RIK BGLAP-RS1 RCBTB1 CDCA8 CHTF18 LUC7L3 CUL4A WIZ EWSR1 NDC80 NASP UBE2S SIK2 TXNDC16 CAMK4 CCNB1 LSM3 MRPS17 CCDC34 APLF SPEER4A PLK1 RPA3 0610007P14RIK SSBP2 RDH16 RAB9B TPX2 G2E3 NRF1 GJC3 XPO7 MLLT4 CASC5 DNMT1 CKLF F830016B08RIK VWF PRDM4 CENPF BC055324 WDR54 4930512M02RIK AGBL3 ALDH1A3 BUB1 HAUS6 TRIM37 CAPS2 CBLN3 1300002K09RIK CCNF CCDC99 ILF3 PKIB GM3550 TERF2IP CDCA3 TERF1 BTF3L4 CES5A ABCB9 CHRNA5 SMC2 FAM111A C030039L03RIK ZFP300 DOK6 C030046E11RIK SKA1 EXOSC8 TMEM129 PRL8A1 ZMAT1 NOM1 FAM64A MLF1IP ZFP35 ACER2 MRGPRE MCC CKAP2L ORC6 TRAF7 PI15 ERMP1 AKAP11 NCAPG KNTC1 SKP2 1700128F08RIK CAPRIN2 ERLEC1 FBXO5 EZH2 DLG1 SKINT8 A930039A15RIK 1500012F01RIK PAF SNRPD1 TXNDC15 RDH1 MPLKIP RAB11FIP1 CDK1 BRCA1 CWF19L1 MDGA2 PDE6H NRBF2 SHCBP1 DBF4 NUP205 OLFR613 UGGT2 MRPS24 ECT2 SSRP1 SEC11C TNFRSF10B ZFP113 BTG2 AURKA ASPM PTGES3 ZDHHC21 GM10974 SLC7A11 MELK TOPBP1 TYW1 HEXDC CD8A KIF20B LIN9 THRAP3 LTBP1 PRRG1 KIF20A 9430015G10RIK SDHA PRDM11 ACAD9 C79407 ZWILCH ADAM10 SLC38A6 ZBTB8OS HMMR SRSF7 BC053749 NOX4 TSPAN15 KIF23 FANCA GMDS GRK1 GM10912 RRM2 TUBB5 PFKFB2 CCDC141 HSD17B13 NCAPH PPIL1 WHSC2 PAQR3 2010002M12RIK ASF1B BLM RABEP2 KCNQ5 D130062J21RIK CDCA2 SNRPG MARK2 SYNE1 SCD4 TUBA1B PPIH CBWD1 RECQL SRRM3 KIF2C SASS6 ACIN1 GM10125 FAM160B2 H2AFZ CKAP5 SRSF10 COL22A1 PPP2R2C FIGNL1 ANP32E APEX2 CCRN4L H2-Q6 NCAPD2 5930416I19RIK COQ4 STYK1 GM10521 SKA3 LMF2 GANAB RDH9 INTU OIP5 NUP133 S100A4 ACSS3 TMEM201 D17H6S56E-5 MAD1L1 ZFP180 ZFP71-RS1 ERCC6L INCENP RPA1 UBN1 1110021L09RIK WAPAL UBE2C PSMG2 PRPF4 E030030I06RIK KLHL20 SGOL1 HN1 MPHOSPH9 TRIM24 ALDH1L2 CEP55 RBBP4 MTHFD1 CPNE5 E230001N04RIK CENPA CDCA4 RBAK PARVA DYNLT1 HMGB1 H3F3A ZFP120 ELMOD2 KIF18A GTSE1 ANAPC11 LRRK1 SLC13A1 KBTBD11 TACC3 TDP1 IFT140 4930555F03RIK ARHGEF15 DLGAP5 TAGLN2 ZBTB48 CMAH RAB34

KIF4 ARL6IP1 EIF4E PRKCQ LHFPL4 CENPN SGOL2 CNOT3 E330020D12RIK AP1S3 RRM1 ANAPC5 NAA15 GM10576 GM5577 SPC24 RRP1B XPO1 GM10722 JUP CENPQ PHF17 ZNRF2 UBE1AY TCTN2 TMPO NSMCE1 ANKRD54 DTNB RPL9-PS6 POC1A CSE1L EIF4H GM5934 USP17L5 CIT NUP62 ENDOD 4930431F12RIK FGFR1 RAD51 NEK2 RNF26 PSG16 ASPA TK1 SLBP MAGOHB VMN1R65 WNT4 KIF15 NAA38 SERF1 GM10800 ST6GALNAC5 CENPI MDC1 FGD6 6030422M02RIK LRRC48 KIF18B HSPA14 CNBP GM14443 FAM211B STIL RANBP1 CARS CYB5R4 ZNF473 RAD51AP1 POLD1 1110059G10RIK CDH17 GM10699 HISTIHB NEDD1 MRPS12 FSTL1 KLHL32 DEPDC1A TMX2 TIPRL POT1B UGT3A1 HMGN2 POLD3 CSDA UPB1 ESR2 SKAP EIF1AD MRPL1 SLC12A8 ZFP940 IQGAP3 NUP43 SLC35A1 PAPOLG PCDHB5 UBE2T FANCI IPO11 ERMAP BC035947 CDC20 GCAT RASSF1 MRPL48 SCD3 STMN1 BUB3 SLC25A10 GM9964 DENND2C ERCC6L WDR62 COX7B EMID2 AGPHD1 C330027C09RIK NFIA CLPB PAK2 ASPHD1 BORA TADA2A TAF6 2310010M20RIK SPEER4E E130306D19RIK NMRAL1 DVL3 ME3 OPCML HMGB2 POLD2 LMO2 WDR31 CD3D ESPL1 NCAPH2 SGSM1 DNAJC1 NCKAP5 DNAJC9 RFC2 ZFP90 RPIA PFKFB2 TUBB5 SERINC3 S100A10 ATG16L1 MMD RACGAP1 HNRNPF ZFP933 GM16493 MAOA MAD2L1 PCNT YIPF4 GM17613 SPAG16 HIST1H1E TUBB6 BAP1 CD209C GPR157 PTMA SMC6 SNX25 TANC1 MCART6 GEN1 4930534B04RIK PKNOX1 RMI2 SLC9A7 CENPP RFC3 PATZ1 1110038D17RIK AUH CCDC21 YWHAQ NOL11 HSPB11 PNPLA8 RAD18 LRRC40 LEPREL2 TSPAN18 MMP12 CDCA5 HN1L POLR2I CNTN1 RPGRIP1 CCNA2 TUBG1 ISCA1 1700048O20RIK SUPT3H CENPK RBL1 FAM206A GPR26 CLRN1 CKS1B LARP7 TIMM21 GM10775 OLFR287 CDKN3 NUP155 MKKS ADAMTS2 SLC35E2 VRK1 BRD9 TUFT1 4930423O20RIK RIPPLY3 NUP85 IPO9 CNOT1 3110007F17RIK ZKSCAN3 MASTL TMEM194 EPRS 4932414N04RIK GM6351 GMNN ARPP19 COQ7 SKINT4 SLC26A3 TYMS PA2G4 RG9MTD2 GM20449 DMD TCF19 WHSC1 ZFP948 TMPRSS11BNL FAM81A

RFC4 TNFAIP8L1 EXOSC10 GM9924 ABCA8B MCM10 SNRPB ZFPL1 H2-GS10 BMP7 LSM2 DPY30 AHRR GM4297 CHRNB4 PMF1 ILF2 MTHFD1L FDXACB1 FAM119B CBX3 GTF2A2 CCDC22 NEIL3 DCBLD1 ESCO2 ANP32B ZDHHC2 4932425I24RIK ARMC2 TTK ANKRD32 TIMM17B BHMT2 USH2A TIPIN RDBP OSBPL9 PIGX RAB27B NSL1 PIH1D1 NUDT22 GIMAP8 RELN CBX5 RIF1 ORC3 1700052N19RIK HERC3 EME1 TFDP1 VBP1 EML1 TTBK2 FEN1 DHFR PPP1R16B GIMAP1 5730494M16RIK H2AFV PCK2 QPCTL D730040F13RIK CD200R3 MIS18A CDKN2C MMS19 TC2N STX1B CENPW TAF11 WDR77 4933427I04RIK SLC25A42 ANKLE1 SLC29A1 RAPGEF5 HHAT KAZN CLSPN WDHD1 ARAP2 CCDC30 ZFP933 CKS2 TTC32 ACO2 GM16223 OLFR536 HJURP CENPO RFX7 CXCR2 SYNE2 CCNB2 XRCC2 RRAGC MBNL3 A630010A05RIK ATAD2 SUPT16H UPF3A 2410004P03RIK DNAJC25 UHRF1 TONSL FOS CRX KCNMB1 RANGAP1 RMI1 GAB3 FADS6 5830462I19RIK NUP107 CDC123 ZFP68 PRSS35 CCHCR1 MKI67 UPF3B ZFP238 RDH12 PLA2G2D AAAS 2610318N02RIK RIC8A PSG21 FBXW20 LSM5 NDE1 CCDC71 ZFP277 CPNE9 TRAIP RTTN SKIV2L PLD5 GM6802 CDC25B MED18 GLUL GM4841 FAM65B SKA2 ALYREF MPHOSPH10 ZFP459 ZFP937 CENPM LNP CNOT6 BTBD16 TMC2 EFCAB11 TMEM19 POMGNT1 GM9978 KLF11 DIAP3 MAPRE1 RNPEPL1 EXD1 PRRG3

RAN DCTPP1 HNRNPC GLRX2 NUMBL HAUS5 ERI1 RYBP INVS MOSC2 RCL HDAC2 AMFR PAM RTBDN MCM5 NUDT1 N4BP2L2 AGBL2 CCDC8 CDC45 SSNA1 MTG1 SLC25A23 PTPN23 DSN1 PHF19 PARP11 SLC5A4B PRLR SAE1 DARS2 1200011M11RIK E2F7 PLA2R1 FOXM1 RAET1D TMEM65 GM9776 KATNAL1 NCAPD3 TH1L STX2 GPR98 SCLT1 MCM7 MAP4K1 CYP4F16 XPNPEP2 BTNL10 KPNA2 MED4 KLHL15 MYO1B FGFR2 NUP35 RPP30 GM885 D330041H03RIK CRB1 HAUS3 MRPL40 UBRh3 CD200R1L LONRF1 CKAP2 FAM110A CD99L2 NRK RAPGEF2 RPA2 MANBA PIP5K1C HECTD2 RPH3A TUBA1C SMCHD1 UBTF PCSK2 ENTPD1 USP1 RNF187 ADAT2 MYH8 FAM40B

CENPH FAAP24 GCLM 2610305D13RIK CDNF NUP37 KIAA0513 PIK3C2A TULP1 SH2D2A HMGB3 U2AF1 UTP20 DZANK1 CCBE1 MNS1 GM10094 ABHD11 DCBLD2 IOC0044D17RIK SUV39H1 TUBGCP6 SKINT7 EAF2 HACE1 TMEM48 MYG1 CDK2AP2 SGIP1 SCAI SLC43A3 TUBE1 SOS2 4930425F17RIK CCBL1 LBR ALG8 TNFRSF8 GM12789 ZC3HC1 UCHL5 BCKDK IREB2 ZFP11 TYR 4632434I11RIK 42069 TJP2 MEGF9 EFCAB4A RAD54L DPY19L1 4930433N12RIK RGS9 ZFP874B TEX30 RBMX SLC22A5 ZFP9 LRRC3 ARHGAP19 RUVBL2 ETNK1 SLCO1A6 NIPA1 CDC25C SEPHS1 EIF4G2 FOXP2 THSD4 CENPL 0610010K14RIK LCP2 ZFP37 LRRC2 PDS5B CENPJ MIIP BPIFC CCR2

NUCKS1 NFYB E2F8 4932438A13RIK RRM2B

Table S4: Genes and probes used in Biomark experiments Gene Outer F Outer R Inner F Inner R Abil CGAGCACATTGTC TCGTGGGAACTGT CGATCCCACGCAG CGTACAGGCTCTA GAGAACAAACC TGGAGGCTTA AAACCACCAA GGGTTTTGTAAGG G cl3 CGACGCCATATGG CGTAACGATGAAT CGACGCCATATGG CGTAACGATGAAT AGCTGACAC TGGCGTGGAA AGCTGACAC TGGCGTGGAA cl4 CGAGCACCAATGG CGTAAGCTGCCGG CGAGCACCAATGG CGTAAGCTGCCGG GCTCTGAC GAGGTG GCTCTGAC GAGGTG cMI5 CGATACCATGAAG CGTGGCTGCAGTG CGATACCATGAAG CGTGGCTGCAGTG ATCTCTGCAGCT AGGATGAT ATCTCTGCAGCT AGGATGAT ccl7 CGACCCCAAGAGG CGTCAGACTTCCA CGACCCCAAGAGG CGTCAGACTTCCA AATCTCAAGAG TGCCCTTCTT AATCTCAAGAG TGCCCTTCTT ccndl CATCCATGCGGAA CAGGCGGCTCTTC CATCCATGCGGAA CAGGCGGCTCTTC AATCG TTCAA AATCG TTCAA Ccr12 CGATCAAGCAACC CGTCTGTTGTCCA ACGTGTTTTGTCC CGTATCGTCCGGG TGCCTCAAAC GGTAGTCGTCTAA GGTGAGCAAGG GCCACT cd103 CTCAGGTGGCTTG CAGACATGCCGAA CTCAGGTGGCTTG CAGACATGCCGAA TATGACAGT GTAGTGG TATGACAGT GTAGTGG Cd14 CGAGCGGCAGATG CGTGCCGCTTTAA CGACGCAGCCTGG TCGGATAATATCA TGGAATTGTA GGACAGAGAC AATACCTTCTAA GTGAACTGCCCCA GA Cd40 CGAAACAGGGAGA CGTGTGCAGTGTT CGACCAGCACAGA CGTATTCTGCGGT TTCGCTGTCA GTCCTTCCTTAC CACTGTGAACC GCCCTCCT cd64 TTAAGCGCAGCCC TCCCACTGACAGA TTAAGCGCAGCCC TCCCACTGACAGA TGAGT TAAAACAGG TGAGT TAAAACAGG cd82 GAAGACGCGGGAC ATAGGACGGTGGG GAAAGGATTCTGT GTGTTCACAGGCC TTAAAGC GACATCA GAGGCTGA AATCCTC Cers6 ACGCTCACAGCTG CGTCAAGGTGGTG CGATATTACATCC CGTATGCCGAAGT ACCTTCACTAC CAGGAACATA TGGAGCTGTCATT CCTTCCTTTT T

112 Table S4 -continued Gene Outer F Outer R Inner F Inner R Clec4e CGAAAAACTCCCA CGTCTGTAAGTTC CGAATAGCCGGGG CGTTTGAGAGCTG AGTGCTCTCC TGCCCGGAAA CCTCCAT CGATATGTTACGA Clic4 CGAACCCGCCATT CGTACTCTGGGTG CGAAGCGAAGTCA CGTTTAGGTACTT TATAACCTTCAAC CTTTGGTGAA AGACGGATGTG GGGTGGGCACAA cmpk2 CGAGGAGGTCCAG CGTACTGCGTCAG CGAGGAGGTCCAG CGTACTGCGTCAG AAAGGGAAGTT TGTGGTCTTA AAAGGGAAGTT TGTGGTCTTA esflr CGAGGGAGACTCC GACTGGAGAAGCC CGAGGGAGACTCC GACTGGAGAAGCC AGCTACA ACTGTCC AGCTACA ACTGTCC cx3crl AAGTTCCCTTCCC CAAAATTCTCTAG AAGTTCCCTTCCC CAAAATTCTCTAG ATCTGCT ATCCAGTTCAGG ATCTGCT ATCCAGTTCAGG cxcl10 CGATGAGGGCCAT CGTTTTTCATCGT CGATGAGGGCCAT CGTTTTTCATCGT AGGGAAGCT GGCAATGATCTCA AGGGAAGCT GGCAATGATCTCA AC AC Cxcl2 CGAGCCAAGGGTT CGTGCTTCAGGGT CGAATCCAGAGCT CGTTTTTGACCGC GACTTCAAGAAC CAAGGCAAAC TGAGTGTGAC CCTTGAGA Cxcl3 CGAGTGCCTGAAC CGTGGGTTGAGGC ACGGGGTTGATTT CGTTCCTTGAGAG ACCCTACCAA AAACTTCTTGAC TGAGACCATCC TGGCTATGACT Ddx60 CGATGCTGAAAGA CGTGAGATTCACC CGAGGCGTGGTTG CGTAACAGTTGCT GAGCGATGAA AGCTGGCATA TGTATGTTG GCCACTTGA dusp2 ACGTGACCAGGGT CGTTGCAGGCCCT ACGTGACCAGGGT CGTTGCAGGCCCT GGTCCTGTG GAAGATCTGA GGTCCTGTG GAAGATCTGA CGAGGAGTGGCGG CGTTCGGATACGG CGAGGAGTGGCGG CGTTCGGATACGG GAGATG GAGATCCA GAGATG GAGATCCA Fas TCAAGGAGGCCCA CCCCCTGCAATTT CTGTCAACCATGC CCGTTTGGCTTCT TTTTGCT CCGTTTG CAACCT TTACCC fos ACGGGGAGGACCT CGTAGGCCAGATG ACGGGGAGGACCT CGTAGGCCAGATG TACCTGTTCG TGGATGCTT TACCTGTTCG TGGATGCTT Gapdh CGAAAGAGAGGCC CGTTGCAGCGAAC CGACCCCAACACT CGTCCCTAGGCCC CTATCCCAAC TTTATTGATGGTA GAGCATCTCC CTCCTGTTATTAT gbp5 CGATGACAGAACT TCGGATTAGAATC CGATGACAGAACT TCGGATTAGAATC GACAGACTTGCT AGAGGAGTTTCTT GACAGACTTGCT AGAGGAGTTTCTT GTCC GTCC Gtf2b CGATCAGCTGAGA CGTTGAAGTCTGA CGAAAAGAAATCG CGTGAGCCCGAGG AGCGAACACA AGGGAAGAGATCC GGGATATTGCTGG GTAGATCAGT T ier3 CGACCAGCTACCA TCGGAAAGAGGAC CGACCAGCTACCA TCGGAAAGAGGAC ACCGAGGAA CCTCTTGGCAA ACCGAGGAA CCTCTTGGCAA Ifi44 CGATCAGCTGAGA CGTTGAAGTCTGA CGAAAAGAAATCG CGTGAGCCCGAGG AGCGAACACA AGGGAAGAGATCC GGGATATTGCTGG GTAGATCAGT T Ifi47 CGATTCCTGAAGG 
CGTGAACTGATCC CGACCGTTGGTGG CGTGGGAAGCTCA AGGGCAGAC ATGGCAGTTACC TGGCTGTG GATCCAAGAAGTA GA Ifihi ACGGCCCACTTTG TCGGACTCATTCC CGACTTCTGATTA CGTAATCCGATTT GTGGACAAA CGCTGTTTCC ACGATGTCTTGGA CTGTCTTCGACTG CAC T ifit2 CGAGCAGACAGTT CGTTGGCATTTTA CGAGCAGACAGTT CGTTGGCATTTTA

ACACAGCAGTCA GCTGTCGCAGAT ACACAGCAGTCA GCTGTCGCAGAT ifnb1 ACGCACAGCCCTC TCGCATTTCCGAA ACGCACAGCCCTC TCGCATTTCCGAA

TCCATCAACTA TGTTCGTCCT TCCATCAACTA TGTTCGTCCT

113 Table S4 -continued Gene Outer F Outer R. Inner F Inner R Igtp CGACACGTCCCAT CGTAAGTGACTCA CGAAAGTTGCCAC TCGAGATTTAGAC GGATTTAGTCAC CCAGCAGTCA AAAATATCTGGAA CACGGGCTGAT GA 1112b CGAGAAGCACGGC CGTCCTCCACCTG CGATATGAGAACT CGTGGCTTCATCT AGCAGAATAAA TGAGTTCTTCAA ACAGCACCAGCTT GCAAGTTCTTGG C 1115 CGAAGGAATACAT CGTTTGGCCTCTG CGATTGTGTTTCC CGTCTACACTGAC CCATCTCGTGCTA TTTTAGGGAGAC TTCTAAACAGTCA ACAGCCCAAA C C illa ACGTTGGTTAAAT TCGGAGCGCTCAC ACGTTGGTTAAAT TCGGAGCGCTCAC GACCTGCAACA GAACAGTTG GACCTGCAACA GAACAGTTG Illm CGATGCCGCCCTT CGTATTTGGTCCT CGACTGCAAGATG CGTTGAGCTGGTT CTGGGAAAA TGTAAGTACCCAG CAAGCCTTC GTTTCTCAGG CAA i127 CGACCTCTCTGAC CGTTGTGGTAGCG CGACCTCTCTGAC CGTTGTGGTAGCG TCTGAGAGACTC AGGAAGCA TCTGAGAGACTC AGGAAGCA 114i1 CGAGCGAAACTAT CGTAAGGCCTTGA CGAGATGCCAGAA CGTGGCCTTGTTG GTGGTGGAGAA GGTCTTTGAA AAGCTGGGCTA AGTGCCATCT Irfi CGAAGACCCTGGC CGTGATGTCCCAG CGACAGGGCTGAT CGTGCAGCGTGCT TAGAGATGCA CCGTGCTTA CTGGATCAATA TCCATG Irf3 CGACAGCTGAGGT TCGGGCAGCCATG CGACAGCTGAGGT TCGGGCAGCCATG CTAGGTGCT CTGTGTTT CTAGGTGCT CTGTGTTT CGATCCGAGAACT CGTCTTGCCGCCT CGATCCGAGAACT CGTCTTGCCGCCT GGAGGAGTTTC CCGA GGAGGAGTTTC CCGA Irgi CGAGGGACGATTA CGTGTCAAGCTGA ACGGGAAGCCAAA CGTCGCAGCAGCA ATGCACTTCTCC GCCCCAGAA GACATACCAAAGA GCACTTC G irgin2 CGACGGGTTGGCT CGTCTTTCTGTTG CGACGGGTTGGCT CGTCTTTCTGTTG CTATAGGTTTTG TTTGCATCGACTG CTATAGGTTTTG TTTGCATCGACTG isg15 CGACAGTCTCTGA TCGGCTCTAGGTC CGACAGTCTCTGA TCGGCTCTAGGTC CTGTGAGAGCAA CCCTGGAA CTGTGAGAGCAA CCCTGGAA itga5 ACGCTACACCCCC CGTTTGTGTTCCT ACGCTACACCCCC CGTTTGTGTTCCT AACTCACAG GAGGCAGTAGAA AACTCACAG GAGGCAGTAGAA lcn2 ACGGCTACAATGT TCGCCTGCTCCTG ACGGCTACAATGT TCGCCTGGTCCTG CACCTCCATCC GTCCCT CACCTCCATCC GTCCCT Ly6e ACGTGTGCTTCTC CGTAGGGTGTAGC CGAACTGCCTGTG CGTCCCGCAGCGG ATGTACCGATCA CAAGGTTGAC GCCAGTTT CAGATA Lyz2 TGGGATCAATTGC CACCACCCTCTTT TGGGATCAATTGC CACCACCCTCTTT AGTGCT GCACATT AGTGCT GCACATT Mam12 CGATCAGCAGATG CGTGTTCATCTGA CGAACAGACCCTG CGTCATCTGCTGC ATGGGGAAGAA TCCTGAGGGGAA CAGAGACCAAC TGGAGGAGAAGT Marcks CGAGTGCCCAGTT 
CGTGCGTCCCCGT CGACGCAGCGAAG CGTGCCATTCTCC CTCCAAGAC TCACTTTTA GGAGAAG TGCCCATTT Mx2 CGATTGCCAGGGT CGTTGTAAAGGCA CGACATTATAAGA CGTAGTGACCGTG TTGTGAATTAC GCTCGTACAA AGACAAATCAAAA TGCAGCAT CCCTGGA naaa GCATCTGTGACTC GGCCACAATACTG GCATCTGTGACTC GGCCACAATACTG GCTCAAC GTGCAG GCTCAAC GTGCAG Nfkbie CGACACATTGCAG CGTGCGCTCCTGG CGAGCTCATAGAA CGTAGGTGCAGGG AGGAACCAACC GTTTCCA CTGTTGCTTCAGA CAGTCTTT

114 I Table S4 -continued Gene Outer F Outer R Inner F Inner R nfkbiz ACGTGGACAGAGT CGTCCAGATCTTG ACGTGGACAGAGT CGTCCAGATCTTG GCCTTTCAGG CACAATGAGGTG GCCTTTCAGG CACAATGAGGTG Nlrp3 CGACTCTGTGAGG CGTTCAGCTCAGG CGACTGCCTCCTG CGTGAGCGCTTCT TGCTGAAACA CTTTTCTTCC CAGAGCCTA AAGGCACGTTT Nos2 CGACTGTAGCACA CGTTCATGTTTGC CGATTCAGCACAT CGTCACAGTGATG GCACAGGAAA CGTCACTCC CTGCAGACACATA GCCGACCTGA CT oaslG CGACCTTTGATGT CGTCAGGGAGGTA CGACCTTTGATGT CGTCAGGGAGGTA CCTGGGTCATG CATTCCCAGAT CCTGGGTCATG CATTCCCAGAT Oas2 CGACAGTCAGATC CGTTCCTGGAACT CGAGGCTAAGGGT CGTGTCCATGAAG CCTGTGAAGGAA GTTGGAAGCA GGCTCCTAT AGAACAAGGGTAC oasl1 CGAGGTCATCGAG TCGTGGGTCCAGG CGAGGTCATCGAG TCGTGGGTCCAGG GCCTGTGT ATGATAGGC GCCTGTGT ATGATAGGC Oas12 CGACGGTTGGTGA CGTTGCTCTCTGT CGATGCAGTGCCT CGTGTAGATGGTC AGTTCTGGTAC ACCCATCTCC GAGACGTAA AGCAGCTCCA Parp14 ACGGTGGCCACTG CGTCACATTTTGT CGATCAAAGAAGT CGTAATTGCTTTG GGAACATCA CCTGCACCTTCC GGCAGATGTGATT GAGACCCCTGA G Pelil CGAAGTCTGCGTG CGTTGCAGTACGC CGACTCAGCAGAG CGTGGTTGCACCA AAACCAGATCA CACAGCAA AGGAAAGATGGT CAAAGGTCAA PMl CGAACAGGACTCT CGTCAGCGCAGAA CGAAGAGGAACCC CGTTCCTGTATGG GCCCTGAC ACTGAAATTCC TCCGAAGAC CTTGCTCTG Pnp2 CGAGTGTCACCAT CGTACTGCCACTT CGAAAGATTTGGG CGTAGTGTGTTGC '1 GGAGCAGAAA GAGGTCGATA CGCCTCTGTC AGAAGCCACTT ptges CGAGCTGCGGAAG TCGATCCTCGGGG CGAGCTGCGGAAG TCGATCCTCGGGG AAGGCTT TTGGCAA AAGGCTT TTGGCAA ptgs2 CAGTCAGGACTCT GGCCCTGGTGTAG CCACTTCAAGGGA TCTGGATGTCAGC GCTCACG TAGGAGA GTCTGGA ACATATTTCA Pvr CGAACTGGTATGT CGTGCGCTTAACA ACGACTTGACCCT TCGTCGTGCTCCA TGGCCTCACTA GAGTTGGGAA GACCTGTGA GTTATATCCAG Pyhini CGATGTCTTCAGG TCGTTTTGGAACC CGATGCCCAGGCA CGTCCTCAGAAGG AACAGCATCCA TTGCTGGTGAC CTAGGAG TTCTTTGGGTAC Rassf4 CGACCTATGGGTC CGTCGGTGTAGAG CGAGTCCGGGTTA CGTGGACCATCTT TGTGACCAAC TGCAAACTCA ACAGCACTATGAC CTACCCGGAACT Rpl101 CGAGCGAAGGTGG CGTTCCTTGCCAC CGACTGCGGCCAC CGTATGCGTGCGG ATGAGTTTCC AGCTCTTCA ATGGTGTC CCTCCA rsad2 ACGCCCTGGAGGA CGTTGTTTGAGCA ACGCCCTGGAGGA CGTTGTTTGAGCA GGCCAA GAAGCAGTCCT GGCCAA GAAGCAGTCCT Mavs CGAGGCTGGCTGA TCGAAATGCAGAG 
CGATCGAGTTTAT CGTGTAACTGCAG TCAAGTGAC GGTCCAGAAAC CAGAGCTACCTG TGGCTCTAGG Saa3 ACGGCCTGGGCTG CGTTTGGGGTCTT CGAGATGCCAGAG CGTGCCCCACTCA CTAAAGTCA TGCCACTCC AGGCTGTTCA TTGGCAAAC Samd9l CGAAAAGTCAGCC CGTAGCTTGTATG CGAAGAACTTTTC CGTAGTTCTCCTT GCTGTTTCA CGCTTCCTAC TGCCATTTAGATA TTGAAGTAACAGA CC TC Sdc4 CGAGGGGATGACA CGTACGGCAAAGA CGACCCAGGGCAG CGTACGCCGCCCA TGTCCAACAAA GGATGCCTA CAACATCTTTG CGATCA Shisa3 CGATGGAGCATCC CGTGGTCTCAAAC CGACGCGCAACCT CGTAGATAGCCAC TGGCATCA AGGTGCAACA GTCTATGTC TAAAGAGCCCAAA Slc7a2 CGAGGTGCTGCAA CGTAGCAGACAAG CGATGCCTTTGTG CGTTATTCCGATG CGTGCTTTTA TAAGGACGTCAC GGCTTTGACTG GGGATCGCCTTT

115 Table S4 -continued Gene Outer F Outer R Inner F Inner R socs3 ACCTTCAGCTCCA CAGGAACTCCCGA GTCGGGGACCAAG GGTCACTCTGCAG AAAGCGA ATGGGTC AACCTAC CGAAAA Sod2 CGAGGGCTGGCTT CGTGTGCTCCCAC CGAGGAGCAAGGT CGTGCAGCGGAAT GGCTTCAATAA ACGTCAATCC CGCTTACAGATT AAGGCCTGT Srgn CGATGGTCAGGAT CGTCAGCGGACCC CGAGTCGGCAGCA CGTCTGGCTCTCC GCAGGTTCC ACTGGTAC GGCTTGT GAGCAGGATA Tank CGAGACAGGCTGA CGTCATGTGGTTC CGATTCTCGTGGA CGTTATTCTTCCT AATCACAGCTAC ATCAAGGGTCAAA TTCTAGTCGAGAT TCTGTCACTGTCT A TC Tapi CGACTGAACCGGA CGTCTGGTTCCCA CGAAGCTGTGGCC CGTTCTGTGTCAT CTCCAACCA GTCTCACCTAC GTGGAGTC AGCCCTGAGGGAA Ticam2 CGACAGTGCGAGA CGTGCTCTGTGGC CGAAGCCTCGTGG TCGTCGGTGATTG GGAAGATCGAA ACCTTACCA TCAAGCA AGACGCCTTA Tlr4 CGAGTTCTTCTCC CGTAGGGGTTGAA ACGGGAAGCTTGA TCGGGTTGAAGAA TGCCTGACAC GCTCAGATCTA ATCCCTGCAT GGAATGTCATCAG G tnf CGAGCCTCTTCTC CGTTCATCCCTTT CGAGCCTCTTCTC CGTTCATCCCTTT ATTCCTGCTTGTG GGGGACCGATC ATTCCTGCTTGTG GGGGACCGATC Thfaip2 CGAAACATCATGG CGTATGGGCCTTG CGAACTGCCTGTT CGTTCCAGCAGGC CCAACATCAACA AGGTCTTTCA CTTCTGGACT GGTTCA Tnfrsflb CGAGGCTCAGATG CGTTGGTTCCAGA ACGAGTGTCCTCC CGTTGCTTGCCTC TGCTGTGCTA CCTGGGTATACA TGGCCAATA ACAGTCC Trafdl ACGGGATTCGCAG CGTAGTAGGTGGG CGACTTCAGCAGA CGTATTCCGGCTT CCAGAAAACA GCCTTTCAC GCTGTCCAG GACATCATCC Tubalc CGAAGTTTTCGCG CGTGATGCCATGT CGAAGGACTAAAT CGTGAGCTCCCAG GACCACTTCA TCCAGGCAGTA ATGCGTGAGTGCA CAGGCATT T Usp18 CGATGGTCATTAC CGTGCGGTATCTG CGAGGAATCCCGT CGTGGTACACTGG TGTGCCTACATCC TGGTTCCCATA GGATGGAAA ACATCCTTCC Zbpl CGATGGCAGAAGC TCGGGGCACTTGG CGATTGAGCACAG CGTAGCTGGCCAA TCCTGTTGAC CATTTCTTCAC GAGACAATCTG TCTTCACAG Zufsp CGAGAGGGGTGTA CGTTGTACAGACA CGATGTGAAAACA CGTTGGGACAGTC GTCAAGATATGGA AGCCCACACA AAGCATGCCAGTC ATAGAGTGGTTGA A
