bioRxiv preprint doi: https://doi.org/10.1101/775080; this version posted October 3, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

Functional single-cell genomics of human cytomegalovirus infection

Marco Y. Hein1 and Jonathan S. Weissman1

1Department of Cellular and Molecular Pharmacology & Howard Hughes Medical Institute, University of California San Francisco, San Francisco CA 94143, USA

Understanding how host factors and hundreds of viral genes orchestrate the complex life cycle of herpesviruses represents a fundamental problem in virology. Here, we use CRISPR/Cas9-based screening to scan at high-resolution for functional elements in the genome of human cytomegalovirus (HCMV), and to generate a genome-wide mapping of host dependency and restriction factors. Our data reveal an architecture of functional modules in the HCMV genome, and host factor pathways involved in virus adhesion and entry, membrane trafficking, and innate immune response. Single-cell analysis shows that the large majority of cells follow a stereotypical trajectory in viral expression space. Perturbation of host factors does not alter this trajectory, but can accelerate or stall progression. Conversely, perturbation of viral factors creates discrete alternate ‘dead-end’ trajectories. Our results reveal a fundamental dichotomy between the roles of host and viral factors in orchestrating viral replication and more generally provide a road map for high-resolution dissection of host-pathogen interactions.

Correspondence: [email protected], [email protected]

The betaherpesvirus HCMV is a pervasive pathogen that establishes lifelong infection in the majority of the human population. Activation of its lytic cycle triggers a characteristic cascade of events, starting with stereotypical waves of viral gene expression, continuing with the replication of the large, 235 kb dsDNA genome, and culminating in the budding of newly assembled virions. A number of systematic studies have described these phenomena on the level of the transcriptome, the set of translated messages, and the proteome in time and space (1–6). These studies have highlighted the complexity of the process and have raised the question of how hundreds of viral genes cooperate to manipulate the host and undermine its defense machinery. CRISPR technology provides us with tools to systematically measure the functional contribution of each viral gene and to identify the host factors involved in productive infection (7). Here, we present systematic screens for both host and viral factors affecting HCMV infection in primary human fibroblasts. To capture the complexity of the molecular events during infection, we recorded the transcriptomes of tens of thousands of single cells, monitoring how perturbation of critical host and viral factors alters the timing, course, and progression of infection. Our data paint an unprecedented picture of the HCMV life cycle and its vulnerabilities to antiviral intervention.

High resolution functional landscape of the HCMV genome

It was recently shown that targeting individual essential herpesvirus genes by CRISPR/Cas9 disrupts their expression directly and their function through errors introduced by the host DNA repair machinery (8). Cleavage of the viral DNA in non-essential regions has a moderate impact on genes proximal to the cut site, but minimal impact on HCMV replication and host cell viability, likely because DNA repair is fast relative to the kinetics of replication ((8) and our data below). Thus, Cas9 represents an effective tool for making targeted disruptions in the viral genome. To enable high-resolution scanning of viral elements for a comprehensive functional annotation of the HCMV genome, we designed a single-guide RNA (sgRNA) library that targets every protospacer-adjacent motif (PAM) for S. pyogenes Cas9 (NGG PAM sequence present roughly every 8 bp) along the genome of the clinical HCMV strain Merlin (Fig. 1A, Table S1). We delivered the library into primary human foreskin fibroblasts engineered to express Cas9, so that upon HCMV infection, each cell executes a cut at a defined position along the viral genome, collectively tiling its entirety.

We mapped the phenotypic landscape by quantifying the abundances of individual sgRNA cassettes in cells surviving infection relative to the initial population by deep sequencing (Fig. S1). We found that cutting phenotypes are relatively constant within individual genes, i.e. the determining factor is which gene is targeted by Cas9, more so than the relative position of the target site within the gene. Adjacent sets of genes also frequently had similar phenotypes. However, some gene boundaries were marked by abrupt phenotype changes, arguing that here, strong functional consequences of Cas9-induced double-strand breaks are limited to the immediate vicinity of the cut sites (Figs. 1B, S2, Table S1).

At a larger scale, changes in the direction and magnitude of the phenotypes defined six major genomic modules: Cuts in both distal regions of the genome, which lack genes essential for viral replication (9, 10), had minimal impact on host cell survival. As expected, targeting the two regions covering UL48A–UL73 and UL96–UL150, both of which contain essential genes involved in viral DNA replication, packaging and nuclear egress (9, 11), strongly protected infected cells from death. Surprisingly, in the two remaining regions of the genome, we found that disruption of genes required for viral replication did not necessarily protect the host from death. Cuts within the UL32–UL47 region, which contains essential genes, led to a strongly increased ability of the virus to kill cells. The most strongly sensitizing phenotypes mapped to the known viral apoptosis inhibitors UL36, UL37, and UL38 (12). While this behavior can be rationalized for virally

Hein & Weissman | bioRχiv | October 2, 2019 | 1–22
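As an illustration of the screen readout described above, the following Python sketch computes a per-sgRNA phenotype as log2(surviving/t0) and then averages phenotypes locally along the genome. It is not the analysis code used in this study: the read-fraction normalization, the pseudocount, the window size and the function names `log2_enrichment` and `local_average` are assumptions made for the example.

```python
import math


def log2_enrichment(surviving, t0, pseudocount=0.5):
    """Per-sgRNA log2(surviving / t0), computed on read fractions.

    Assumption: counts are scaled by library size and a pseudocount guards
    against zero counts; the source does not state its normalization.
    """
    n = len(t0)
    total_s = sum(surviving.get(g, 0) for g in t0) + pseudocount * n
    total_0 = sum(t0.values()) + pseudocount * n
    scores = {}
    for guide, count_0 in t0.items():
        frac_s = (surviving.get(guide, 0) + pseudocount) / total_s
        frac_0 = (count_0 + pseudocount) / total_0
        scores[guide] = math.log2(frac_s / frac_0)
    return scores


def local_average(sites, window):
    """Smooth phenotypes along the genome.

    sites: list of (cut_position, phenotype) pairs.
    Returns (cut_position, mean phenotype of all sites within +/- window/2),
    sorted by position. Quadratic time, which is acceptable for a sketch.
    """
    ordered = sorted(sites)
    half = window / 2
    smoothed = []
    for pos, _ in ordered:
        nearby = [v for p, v in ordered if abs(p - pos) <= half]
        smoothed.append((pos, sum(nearby) / len(nearby)))
    return smoothed
```

Positive scores correspond to sgRNAs enriched among surviving cells (disruption enhances host survival), negative scores to depleted sgRNAs. A real tiling library would call for a faster sliding-window implementation and a window chosen relative to gene size.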

[Figure 1. (A) Workflow: (d)Cas9-expressing primary fibroblasts receive the HCMV tiling library or a genome-wide library by lentiviral packaging; sgRNAs are deep-sequenced in the t0 population, the population surviving HCMV infection, and the uninfected population. (B) log2(surviving/t0) along the HCMV genome; positive values: disruption enhances host survival, negative values: disruption diminishes host survival. Modules: UL, distal; UL32–UL47; UL48A–UL73; UL75–UL88; UL96–UL150; US, distal. Inset: UL95–UL105 and RNA5.0. (C) log2(uninfected/t0) vs. log2(surviving/t0); labelled groups: PDGFRA, heparan sulfate, interferon response, host restriction factors, NEDDylation, Cullin-RING, COP9 signalosome, apoptosis, UNC50, RIC1/RGP1/RAB6A, COG complex, BORC complex, TRAPP complex III, ERAD, SRP/translocon, ER stress, Ragulator, NuA4 HAT complex, essential host genes.]

Fig. 1. Virus and host-directed CRISPR screens map the phenotypic landscape of HCMV and its host’s dependency and restriction factors. (A) Experimental design for pooled, virus and host-directed CRISPR screening. Our HCMV tiling library or genome-wide human sgRNA libraries were lentivirally delivered into primary human foreskin fibroblasts expressing the CRISPRi or CRISPRn machinery, followed by infection with HCMV. sgRNA cassettes were quantified by deep sequencing in the initial (t0) population, the surviving population, and the uninfected control population. (B) Phenotypic landscape of the HCMV genome obtained by locally averaging the phenotypes of individual sgRNAs. Strong changes in the magnitude of the phenotype coincide with gene-gene boundaries (inset). (C) Results of host-directed CRISPRi screen displayed as a scatter plot of average gene essentiality (i.e. infection-independent phenotype; y-axis) vs. protection/sensitization to death upon HCMV infection (i.e. infection-dependent phenotype; x-axis).

encoded anti-apoptotic proteins, it extended to many other virus-essential genes without known anti-apoptotic roles, including the DNA polymerase processivity factor UL44. Similarly, cuts in the central region spanning UL75–UL88 led to slightly enhanced host cell death upon infection. Many genes in this region encode essential structural components of the viral envelope, tegument, and capsid.

Targeting essential viral genes, by definition, undermines the production of viral offspring. The outcome for the host, however, is more nuanced and sometimes counterintuitive. It appears that disrupting essential genes involved in viral DNA replication mostly protects the host. However, interfering with the later steps of assembling new virions may not only be ineffective for protecting the host, but even place an additional burden, leading to enhanced host cell death.

Genome-wide screen for host factors of HCMV infection

Next, we carried out a screen for host factors modulating HCMV infection by systematically repressing expression of human genes by CRISPR interference (CRISPRi) (13, 14). Phenotypes were defined by enrichment or depletion of sgRNA cassettes in the surviving cell population over the initial population, as well as in an uninfected control population to account for host gene essentiality (Fig. 1C, Table S2).

Our screen revealed a range of diverse host genes required for multiple steps in the viral life cycle. Genes involved in the biosynthesis of heparan sulfate were among the strongest protective hits. Heparan sulfate proteoglycans on the cell surface enable the adhesion of HCMV prior to cell entry (15, 16). Additionally, we found a range of vesicle trafficking factors: RAB6A and its GEFs RIC1/KIAA1432 and RGP1, the conserved oligomeric Golgi (COG) complex, members of TRAPP complex III, and UNC50. These factors converge on the Golgi apparatus and mediate both retrograde and anterograde transport, implying that they act downstream of viral entry. Some had previously been implicated in the internalization of diverse bacterial and plant toxins, suggesting that HCMV and toxins exploit similar pathways for cell entry (13, 17–21). Other protective hits included members of the LAMTOR/Ragulator complex, Folliculin (FLCN), and the Lyspersin (C17orf59) subunit of the BORC complex, all linked to lysosome positioning and nutrient sensing (22–24). This supports and extends the recent observation that HCMV infection changes lysosome dynamics (6). Additionally, host cell death was reduced by knockdown of certain Cullin-RING E3 ligases, adaptor subunits, substrate receptors, regulators, and the associated neddylation and deneddylation machineries. Many viruses, including HCMV, hijack this pathway to degrade host restriction factors, which can be prevented by broadly-acting NEDD8-activating enzyme inhibitors (25, 26). Finally, we identified genes involved in tail-anchored protein insertion into the ER, as well as ER-associated degradation: AMFR, an E3 ligase, and the TRC40/GET pathway members BAG6 and ASNA1, which were shown to be required for insertion of membrane proteins of herpes simplex virus 1 which, however, lack HCMV orthologs (27).

Our screens also identified a number of host factors whose knockdown sensitizes cells to death upon infection rather than protecting them. Among these were known restriction factors such as PML and DAXX, as well as members of the interferon type I (IFN) pathway. We also found subunits of the NuA4 histone acetyltransferase complex, which was shown to counteract hepatitis B virus (28). Other sensitizing hits included members of the signal recognition particle, the translocon and associated factors, and genes involved in ER stress (29, 30). Finally, we found genes with anti-apoptotic function, whose removal likely increases the sensitivity to apoptosis triggered by HCMV infection.

To validate and extend the host factors identified in the CRISPRi screen, we conducted a knockout screen using an established CRISPR cutting (CRISPRn) library (31) (Fig. S3A, Table S2). The CRISPRi and CRISPRn screens were in general agreement (Fig. S3B). However, the phenotypes of hits involved in virus entry, as well as pro- and anti-apoptotic genes, were more pronounced in the knockout screen, consistent with the notion that selection pressure acts more strongly on cells with true null alleles compared to cells with residual expression of targeted genes. PDGFRA stood out as the top protective hit, supporting its described role as HCMV entry receptor in fibroblasts (32–35). Conversely, genes that are essential for host viability and factors with weaker phenotypes were often not enriched above background in the CRISPRn screen. One reason may be the toxicity of DNA cutting per se, especially in a cell type with an intact p53 response (36, 37). Our findings underscore the benefits of combining orthogonal modes of genetic screening (38).

The lytic cascade resolved by single-cell transcriptomics

Our pooled screens provide a genome-scale picture of the factors involved in lytic HCMV infection. To investigate the roles of critical host and viral factors in more detail, we used Perturb-seq, which combines CRISPR-based genetic perturbations with a rich transcriptional readout at the single-cell level (29, 39–41). Measuring tens of thousands of single-cell transcriptomes provides a massively parallel way of testing large numbers of genetic perturbations under controlled conditions with a high-dimensional readout. The single-cell nature of this approach makes it particularly well suited for studying virus infection, a process with great inherent variability from cell to cell (42–46).

To lay the groundwork for the Perturb-seq analysis, we explored the spectrum of cellular states in response to infection by recording single-cell transcriptomes from cells sampled from eight time points with two multiplicities of infection each (Fig. 2A). Instead of synchronizing cells experimentally, which has inherent limits to its resolution due to intrinsic heterogeneity in the timing at which individual cells are infected and the rate of progression of the infection, we staged cells computationally by their transcriptional signatures. The largest sources of variability between cells were the extent of IFN signaling and the viral load, i.e. the fraction of viral transcripts per cell, which reached levels of around 75 % and was positively correlated with total cellular RNA content (Fig. S4). Together, these properties define three main subpopulations of cells: a naïve population (uninfected and IFN-negative); a bystander population (uninfected and IFN-positive); and an infected population with varying amounts of viral transcripts (Fig. 2B). Of note, while activation of IFN-stimulated genes (ISGs) and early viral gene expression happen at the same experimental time points, they are almost entirely segregated to different populations of cells, and only a small cohort of cells with low viral loads also express ISGs. This phenomenon may present itself as an apparent correlation of some viral genes with ISGs in bulk measurements, but has been described in single-cell studies of herpes simplex virus 1 (4, 45). Together, this underscores the rapidity with which the virus actively suppresses IFN signaling (see Fig. S4B).

When looking at viral gene expression, we found that after an initial noisy phase, the majority of infected cells followed a dominant and highly stereotypical trajectory with increasing viral load. A subpopulation of around 2 % of cells, however, followed an alternate trajectory (Fig. 2C). Staging cells in silico for each trajectory enabled us to calculate high-resolution expression patterns of individual viral genes as a function of a ‘pseudo-temporal’ axis defined by viral load. Hierarchical clustering of individual viral transcripts along the dominant trajectory broadly recapitulated the established classes of ‘immediate-early’, ‘delayed-early’, ‘leaky-late’ and ‘true-late’ genes, but with a high degree of pseudo-temporal fine structure within each class (Figs. 2D, S5), thus providing a far higher resolution view of these waves. In agreement with the classification based on proteomics measurements (4), we defined an additional ‘intermediate’ kinetic pattern.

The alternate trajectory, followed by ∼2 % of cells, was characterized by a uniform increase of early genes, an almost complete absence of true-late transcripts and of the otherwise abundant major non-coding RNAs and generally low total cellular RNA content, indicating that this trajectory is abortive (Figs. 2E, S5). The pattern of expressed genes showed no enrichment of specific genomic loci, arguing against the hypothesis that this subpopulation might be infected by virions with defined long-range genomic defects (Fig. S5).

Focusing on the patterns of host gene expression as a function of viral load, we found that once interferon signaling

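The computational staging described above can be illustrated with a short Python sketch: it computes the viral load of a cell, assigns the cell to the naïve, bystander or infected population, and averages expression within viral-load bins to obtain a pseudo-temporal profile. This is a sketch under stated assumptions, not the pipeline used in this study. The 2.5 % infection cutoff and the 2 % bin width follow the Fig. 2 legend; the interferon-score cutoff, the data layout and all function names are hypothetical.

```python
import math


def viral_load(viral_umis, host_umis):
    """Fraction of a cell's transcripts that are viral (0.0 for an empty cell)."""
    total = viral_umis + host_umis
    return viral_umis / total if total else 0.0


def classify_cell(load, ifn_score, infected_cutoff=0.025, ifn_cutoff=0.5):
    """Assign a cell to the naive, bystander or infected population.

    Assumption: ifn_cutoff is an arbitrary threshold on an interferon score.
    """
    if load > infected_cutoff:
        return "infected"
    return "bystander" if ifn_score > ifn_cutoff else "naive"


def mean_expression_by_load_bin(cells, bin_width=0.02):
    """Average expression per viral-load bin (a pseudo-temporal axis).

    cells: list of (load, {gene: expression}) pairs.
    Returns {bin_index: {gene: mean expression}}; bin k covers
    [k * bin_width, (k + 1) * bin_width). Genes absent from a cell count as 0.
    """
    grouped = {}
    for load, expression in cells:
        # Small epsilon keeps loads on a bin edge from falling one bin low.
        index = math.floor(load / bin_width + 1e-9)
        grouped.setdefault(index, []).append(expression)
    means = {}
    for index, members in grouped.items():
        genes = set().union(*members)
        means[index] = {
            g: sum(m.get(g, 0.0) for m in members) / len(members) for g in genes
        }
    return means
```

Ordering the resulting bins by index gives expression as a function of viral load rather than of experimental time, which is the idea behind staging cells without experimental synchronization.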

[Figure 2. (A) Workflow: HCMV infection, sampling from 0 h to 120 h, emulsion (cells, beads, oil), single-cell transcriptomes. (B) t-SNE 1 vs. t-SNE 2 of 12,919 cells with naïve, bystander, early infected and late infected populations; color scales: experimental time (0, 6, 20, 28, 48, 72, 96, 120 h.p.i.), interferon score [AU], % viral RNA. (C) UMAP 1 vs. UMAP 2 of 5,330 infected cells with the dominant trajectory and the alternate trajectory (~2 % of cells). (D, E) Normalized expression vs. % viral RNA (0–70 %) for the gene classes immediate early, delayed early, intermediate, leaky late and true late.]

Fig. 2. Single-cell infection time-course defines the lytic cascade of expression events as a trajectory in viral gene expression space. (A) Cells were infected with a low or high MOI of HCMV, harvested after times ranging from 6–120 h.p.i., pooled and subjected to emulsion-based single-cell RNA-seq. (B) t-SNE projection of all 12,919 cells color-coded by experimental time point (left), interferon activation score (center), or viral load (right) defining the naïve, bystander, and infected subpopulations. (C) UMAP projection of the viral parts of the transcriptomes of 5,330 infected cells (>2.5 % viral load; >5 % for time points >72 h.p.i.), color-coded by experimental time (left) or viral load (center). Trajectories were determined by averaging the position of cells ranked by viral load. (D) Clustering of averaged viral gene expression in cells binned in increments of 2 % by viral load recapitulates the temporal waves of genes in the lytic cascade. (E) The same cluster order applied on the viral gene expression pattern along the alternate trajectory, binned in increments of 10 %. For higher resolution see Fig. S5.

is halted in early-infected cells, most host transcripts behave coherently as viral load increases (Fig. S6). HCMV, unlike other herpesviruses, is known to allow translation of host transcripts to proceed selectively (47). Some housekeeping transcripts (e.g. those of ribosomal proteins) decreased in abundance, suggesting that they are being actively suppressed (48, 49). A small but prominent set of host transcripts were upregulated, including CD55, which was previously seen at the protein level (4) and shown to be incorporated into budding virions to counteract the complement system (50).

Only cells in the G1 phase of the cell cycle are permissive to the progression of infection (51). Accordingly, we observed that G1 cells are gradually depleted from the population of uninfected cells, and a majority of infected cells adopt a G1-like state (Fig. S7A, B). Most cells later abruptly switched to a state most resembling S-phase. This transition co-occurs with the onset of ‘true-late’ gene expression, which marks the beginning of viral genome replication (see Fig. S5B) and likely reflects the pseudo-mitotic state described for cells in late-stage infection (52).

Host and virus-directed genetic perturbations lead to conceptually different outcomes

We next conducted a series of Perturb-seq experiments exploring the impact of targeting either host or viral factors on the viral replication cycle. In contrast to the pooled screen, where phenotypes emerge by enrichment or depletion of cells over multiple days, Perturb-seq provides a high-resolution view of the impact of targeting a viral or host gene over the ∼72 h course of a single viral replication cycle.

We first selected 52 host genes with protective or sensitizing phenotypes identified in the pooled screens (Fig. 3A, B, Table S3). We monitored how perturbing them by CRISPRi changes the host cell transcriptomes and the propensity of cells to get infected when challenged with HCMV at a low multiplicity of infection (MOI) (Fig. 3C, D). In uninfected cell populations, we observed the strongest transcriptional responses with knockdown of LAMTOR/Ragulator subunits and the neddylation machinery, as well as mild responses to the knockdowns of vesicle trafficking factors, and of genes associated with the translocon. The patterns of the transcriptional responses to the knockdowns organized host factors by biological pathway in a principled fashion (Fig. 3C). This required no prior knowledge and provided a layer of information that would have remained unresolved when ranking hits simply by their phenotypes in the pooled screens. The transcriptional patterns were mostly similar in naïve cells compared to the bystander population; however, knockdown of IFN pathway members prevented cells from mounting the response characteristic of bystander cells (Fig. S8).

[Figure 3. (A) Workflow: Perturb-seq library of host screen hits, low-MOI HCMV challenge, sampling at 24, 48 and 72 h. (B) Pooled-screen phenotypes (log2 infected/t0 and uninfected/t0; sensitizing to protective) and class (entry, trafficking, cullin-RING, LAMTOR, apoptosis, ER stress, other) of the host sgRNA targets: CSNK2A1, WDR81, SEC62, SEC61B, SEC63, BCL2L1, VTCN1, TRAF2, IFNAR2, IRF9, STAT2, control, LAMTOR1, LAMTOR4, LAMTOR5, LAMTOR3, LAMTOR2, AMFR, WDR26, KXD1, C17orf59, FAM126A, CUL3, NEDD8, NAE1, UBA3, RBX1, DDA1, DCAF4, CASP2, CASP9, CASP3, CYCS, HCCS, KIRREL, KIAA1432, RAB6A, RGP1, BAG6, ASNA1, TRAPPC8, TRAPPC12, TRAPPC11, COG2, COG8, COG5, UNC50, LARGE, B4GALT7, SLC35B2, EXT2, GLCE, HS6ST1. (C) Host gene expression in bystander cells, log2(ratio); responsive gene groups: ribosomal proteins, ER stress, ER/Golgi proteins, disulfide/lysosome/endosome, ISGs, cytoskeleton. (D) Relative proportion of uninfected cells at 24, 48 and 72 h.p.i.]

Fig. 3. Transcriptional responses to host factor knockdowns. (A) Host dependency and restriction factors were selected from the pooled screen, cloned into a Perturb-seq library, delivered into dCas9-expressing fibroblasts, which were challenged with a low MOI of HCMV for 24–72 h. (B) Selected host factors have a wide range of sensitizing to protective phenotypes, varying degrees of essentiality and cover different pathways. (C) Transcriptional response to host gene perturbations, shown as a set of the most responsive host genes, relative to control cells in the bystander population (here defined as cells with <1 % viral load from the 24–72 h.p.i. time points). See also Fig. S8. (D) Fraction of cells that remained uninfected (<1 % viral RNA) at 24, 48 and 72 h.p.i., displayed as ratio relative to the cell population expressing control sgRNAs.

The Perturb-seq data also revealed the proportion of cells that remained uninfected after HCMV challenge (Fig. 3D). The results correlated broadly with the low-resolution view obtained in the pooled screen; for example, knockdown of interferon pathway members led to a strong increase in the numbers of infected cells.

In order to follow changes in viral gene expression in infected cells, we next carried out a Perturb-seq experiment where we challenged cells with a high MOI of HCMV. We used CRISPRn, allowing us to target viral genes in addition to host factors. Based on their strongly protective or sensitizing phenotypes in the pooled, virus-directed screen, we selected 31 viral gene targets, in addition to a representative set of 21 host factors, for which we verified that knockdown and knockout trigger comparable host responses (Fig. 4A, B, Table S3, Fig. S9).

We found that the time course of progression of infection varied widely depending on the targeted gene (Fig. 4C). Control cells gradually transitioned from being predominantly early-stage to predominantly late-stage. Perturbation of host factors caused shifts in this bimodal distribution (Fig. 4C, left panel). For example, knockout of PDGFRA (the proposed viral receptor (33)) or HS6ST1 (involved in heparan sulfate biosynthesis) prevented any accumulation of viral transcripts. Similarly, no appreciable quantities of viral transcripts were found in cells lacking COG8 or UNC50, implicating these factors as novel components required for viral entry. Perturbation of Ragulator/LAMTOR and FLCN, as well as RIC1/KIAA1432 and RGP1, allowed infected cells to initially progress to early-stage, but then caused a stall. This suggests these factors act downstream of viral uncoating but prior to viral genome replication. Conversely, blocking the interferon pathway caused an acceleration of progression, with most cells reaching late-stage prematurely, at 48 h post-infection. Finally, knockout of pro/anti-apoptotic host genes, among others, barely changed the viral load distribution compared to the negative controls, suggesting that rather than impacting the course of infection itself, these interfere with the ability of the virus to kill the cells in a way that is not immediately apparent at the transcript level.

Compared to targeting host genes, targeting viral genes typically led to qualitatively different outcomes. The distributions of viral loads were no longer bimodal, but multimodal (Fig. 4C, right panel), prompting us to monitor the underlying patterns of individual viral genes, rather than looking at viral genes in aggregate by way of the viral load (Fig. 4D). In the space of viral gene expression, all trajectories defined by cells with perturbed host genes were nearly congruent with the control trajectories, indicating that host genes do not determine the patterns of viral gene expression at the different stages of infection (Fig. 4D, top). Rather, host-directed perturbations determine which stage of infection can be reached and how quickly. In marked contrast, when we targeted viral genes, infection progressed along radically distinct trajectories in expression space that diverge from the default and fall into a set of classes (Fig. 4D, bottom). Cells with

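The figure legends describe trajectories as averages over the positions of cells ranked by viral load. A minimal Python sketch of one way to do this is a sliding-window mean over the ranked cells in a two-dimensional embedding. The window size, the tuple layout and the function names `trajectory` and `trajectories_by_guide` are assumptions made for illustration; the source does not specify how the averaging was implemented.

```python
def trajectory(cells, window=50):
    """Average embedding position along increasing viral load.

    cells: list of (viral_load, x, y) tuples, one per cell, where x and y
    are coordinates in a 2D embedding such as UMAP.
    Returns a list of (mean_load, mean_x, mean_y) points, one per sliding
    window of `window` consecutive cells after ranking by viral load.
    """
    ranked = sorted(cells, key=lambda cell: cell[0])
    if window < 1 or len(ranked) < window:
        return []
    points = []
    for start in range(len(ranked) - window + 1):
        chunk = ranked[start:start + window]
        points.append(tuple(sum(c[k] for c in chunk) / window for k in range(3)))
    return points


def trajectories_by_guide(cells_by_guide, window=50):
    """Compute one trajectory per sgRNA from {guide: [(load, x, y), ...]}."""
    return {
        guide: trajectory(cells, window)
        for guide, cells in cells_by_guide.items()
    }
```

Comparing the point lists for a control guide and a targeting guide then shows whether a perturbation shifts cells along the same path or moves them onto a different one.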

A HCMV perturb-seq high MOI library

0 h 24 48 72 h host screen hits + HCMV screen hits

B sensitizing protective host sgRNA targets HCMV sgRNA targets (by genome position) CSNK2A1 BCL2L1 SEC62 IFNAR2 STAT2 control LAMTOR2 LAMTOR3 FLCN DDA1 CASP9 CYCS KIAA1432 RGP1 ASNA1 COG8 UNC50 EXT2 B4GALT7 SLC35B2 HS6ST1 PDGFRA control UL35 UL36 UL37 UL38 UL40 UL42 UL43 UL52 UL53 UL54 UL55 UL56 UL57 ORFL150C UL59 ORFL152C UL69 UL70 UL102 UL105 UL112 UL115 ORFL257C UL119 UL121 UL122 UL123 UL148 UL144 UL141 UL135

surviving/t0 inactive in uninfected cells uninfected/t0 category no replication defect entry LAMTOR log2(phenotype) essential for replication vesicle trafficking apoptosis moderate/severe replication defect no data (Dunn et al. 2003) cullin-RING ER stress when deleted interferon -2 -1 0 1 2

[Figure 4, panels C–F. Recoverable labels: % viral RNA at 24, 48 and 72 h.p.i. for each perturbed host or viral gene (C); UMAP of 17,786 infected cells, shown for cells with host-targeting and with HCMV-targeting sgRNAs (D); normalized expression along the trajectory groups sgHost, sgUL35–sgUL43, sgUL52–sgUL105, sgUL69, sgUL112, sgUL115–sgUL148 and safe-targeting sgHCMV (E); log2 screen phenotype along the HCMV genome, 0–200,000 bp (F).]

Fig. 4. Host- and virus-directed perturbations stall or accelerate progression, or create alternate trajectories in viral gene expression space. (A) Host and viral factors were selected from the pooled screens, cloned into a Perturb-seq library, delivered into Cas9-expressing fibroblasts, which were challenged with a high MOI of HCMV for 24–72 h. (B) Selected factors organized by their respective phenotypes in the pooled screens, essentiality for the host and the virus (9), respectively, and pathway membership. (C) Violin plots of viral loads as a function of time after infection and the perturbed factor (red, protective phenotype; blue, sensitizing phenotype; grey, control). (D) UMAP projection of the viral parts of the transcriptomes of infected cells, color-coded by viral load. Subset of cells with host-directed (top) or virus-directed (bottom) sgRNAs, color-coded by guide identity. Trajectories were determined by averaging the geometric position of cells with a given sgRNA, ranked by viral load. (E) Averaged viral gene expression profiles along the groups of trajectories, determined from cells binned by viral load. (F) Perturbed viral genes from each trajectory set, mapped back to their respective positions in the HCMV genome.

sgRNAs targeting non-essential regions of the viral genome followed the same trajectory as those with host-targeting controls, showing only mild transcriptional effects on genes in the vicinity of the cut sites (Figs. S10, S11).

Alternate trajectories were arranged in three main bundles (Fig. 4E, F): The first class of trajectories was established by targeting genes in the strongly sensitizing region (UL35–UL43). Here, the number of cells declined rapidly with increased viral load, in agreement with the higher susceptibility to cell death upon infection found in the pilot screen (Fig. S10). In addition to creating a pro-apoptotic state, a surprising feature of these alternate trajectories was the failure to shut down expression of immediate-early and delayed-early genes once later stage genes were expressed (Fig. 4F). A second bundle corresponded to perturbations of genes in the first strongly protective module (UL52–ORFL152C), as well as UL102 and UL105. These trajectories showed a marked delay and reduced induction of true-late genes, with few cells reaching high viral loads, indicating a failure to replicate the viral genome. A third bundle contained perturbations of genes located within the UL115–UL148 region. This bundle was most similar to the unperturbed trajectories. Of note, the kinetics of progression varied between the targeted genes within this group (Fig. 4C, right), with perturbation of the major immediate-early transactivator genes UL122 and UL123 causing the strongest delays. Finally, cells with perturbations of UL69 and UL112, genes with comparatively weak protective phenotypes, defined two trajectories distinct from the three main bundles, despite their genome positions adjacent to or within genes belonging to the main bundles (Fig. 4F). Targeting UL69 caused overexpression of RNA1.2, while targeting of UL112 triggered widespread deregulation of genes from all temporal classes (Figs. 4E, S10). Taken together, this shows that adherence to the normal viral trajectory is orchestrated by a diversity of viral genes, but not host factors.

Discussion

The waves of viral gene expression during lytic infection are a key signature of herpesvirus biology and its molecular players are subjects of intense investigation (53). Here, we introduce a functional single-cell genomics approach to study lytic HCMV infection with unprecedented temporal and gene expression resolution. We combine comprehensive pooled CRISPR screening, directed against both host and virus, with massively parallel single-cell transcriptional analysis of infection. Together, this strategy identifies genes critical for infection and pinpoints their specific roles by observing the changes in the course of infection when each factor is perturbed.

Our study redefines the lytic cascade at the single-cell level as a highly resolved continuum of cellular states, which would otherwise be obscured by differences in the kinetics of infection in bulk experiments. We find that the large majority of cells follow this stereotypical trajectory in viral gene expression space, while a small but prominent subpopulation takes an alternate, abortive trajectory.

Most importantly, we reveal a dichotomy in how the system reacts to host- versus virus-directed perturbations: Targeting critical host factors alters how fast or how far cells can progress along the lytic cascade. If the lack of a factor stalls progression, this directly reveals the phase at which this factor plays a critical role. Conversely, targeting viral factors can alter the course of infection by creating defined and specific alternate trajectories that do not necessarily culminate in successful replication or that drive cells into premature apoptosis. This dichotomy may be a general feature of virus-host systems. HCMV is entirely dependent on the transcriptional and translational machinery of its host. At the same time, our findings indicate that the lytic cascade is a deterministic program, which, once set in motion, is solely controlled by viral factors.

Our work also provides a roadmap for the design of effective antiviral combination therapies by selecting sets of targets that drive the virus into distinct, nonproductive pathways while sparing or inducing apoptosis in the host, depending on the desired outcome. Similarly, our data can inform the design of engineered attenuated viral strains for vaccine development purposes. More generally, we envision that our approach can serve as a blueprint for studying other viruses and to define their vulnerabilities to genetic or pharmacological interventions.

ACKNOWLEDGMENTS

We thank M. A. Horlbeck for designing the HCMV tiling library, L. A. Gilbert for help setting up pooled screens, T. M. Norman, M. A. Horlbeck, J. A. Hussmann and X. Qiu for help with data analysis. A. Xu, J. A. Villalta and R. A. Pak provided technical assistance. The UCOE sequence was a gift from G. Sienski. We thank T. Fair for help with Perturb-seq experiments. We thank N. Stern-Ginossar, M. J. Shurtleff, M. Jost, R. A. Saunders and all members of the Weissman lab for insightful discussions. J. Winkler and A. S. Puschnik provided helpful comments on the manuscript. Special thanks to O. Wueseke (impulse-science.org) for editorial help. Funding: J.S.W. is a Howard Hughes Medical Institute Investigator. M.Y.H. was supported by an EMBO long-term postdoctoral fellowship (EMBO ALTF 1193-2015, co-funded by the European Commission FP7, Marie Curie Actions, LTFCOFUND2013, GA-2013-609409). Author contributions: M.Y.H. and J.S.W. conceptualized the study, interpreted the experiments and wrote the manuscript. M.Y.H. designed and carried out the experiments. Competing Interests: J.S.W. and M.Y.H. have filed patent applications related to CRISPRi screening and Perturb-seq. J.S.W. consults for and holds equity in KSQ Therapeutics and Maze Therapeutics, and consults for 5AM Ventures. Data and materials availability: Plasmids and libraries will be made available via Addgene. Raw sequencing data will be deposited to GEO.

References

1. Derek Gatherer, Sepehr Seirafian, Charles Cunningham, Mary Holton, Derrick J Dargan, Katarina Baluchova, Ralph D Hector, Julie Galbraith, Pawel Herzyk, Gavin W G Wilkinson, and Andrew J Davison. High-resolution human cytomegalovirus transcriptome. Proc. Natl. Acad. Sci. U. S. A., 108(49):19755–19760, December 2011.
2. Lisa Marcinowski, Michael Lidschreiber, Lukas Windhager, Martina Rieder, Jens B Bosse, Bernd Rädle, Thomas Bonfert, Ildiko Györy, Miranda de Graaf, Olivia Prazeres da Costa, Philip Rosenstiel, Caroline C Friedel, Ralf Zimmer, Zsolt Ruzsics, and Lars Dölken. Real-time transcriptional profiling of cellular and viral gene expression during lytic cytomegalovirus infection. PLoS Pathog., 8(9):e1002908, September 2012.
3. Noam Stern-Ginossar, Ben Weisburd, Annette Michalski, Vu Thuy Khanh Le, Marco Y Hein, Sheng-Xiong Huang, Ming Ma, Ben Shen, Shu-Bing Qian, Hartmut Hengel, Matthias Mann, Nicholas T Ingolia, and Jonathan S Weissman. Decoding human cytomegalovirus. Science, 338(6110):1088–1093, November 2012.
4. Michael P Weekes, Peter Tomasec, Edward L Huttlin, Ceri A Fielding, David Nusinow, Richard J Stanton, Eddie C Y Wang, Rebecca Aicheler, Isa Murrell, Gavin W G Wilkinson, Paul J Lehner, and Steven P Gygi. Quantitative temporal viromics: an approach to investigate host-pathogen interaction. Cell, 157(6):1460–1472, June 2014.
5. Osnat Tirosh, Yifat Cohen, Alina Shitrit, Odem Shani, Vu Thuy Khanh Le-Trilling, Mirko Trilling, Gilgi Friedlander, Marvin Tanenbaum, and Noam Stern-Ginossar. The transcription and translation landscapes during human cytomegalovirus infection reveal novel Host-Pathogen interactions. PLoS Pathog., 11(11):e1005288, November 2015.
6. Pierre M Jean Beltran, Rommel A Mathias, and Ileana M Cristea. A portrait of the human organelle proteome in space and time during cytomegalovirus infection. Cell Syst, 3(4):361–373.e6, October 2016.


7. Andreas S Puschnik, Karim Majzoub, Yaw Shin Ooi, and Jan E Carette. A CRISPR toolbox to study virus-host interactions. Nat. Rev. Microbiol., 15(6):351–364, June 2017.
8. Ferdy R van Diemen, Elisabeth M Kruse, Marjolein J G Hooykaas, Carlijn E Bruggeling, Anita C Schürch, Petra M van Ham, Saskia M Imhof, Monique Nijhuis, Emmanuel J H J Wiertz, and Robert Jan Lebbink. CRISPR/Cas9-Mediated genome editing of herpesviruses limits productive and latent infections. PLoS Pathog., 12(6):e1005701, June 2016.
9. Walter Dunn, Cassie Chou, Hong Li, Rong Hai, David Patterson, Viktor Stolc, Hua Zhu, and Fenyong Liu. Functional profiling of a human cytomegalovirus genome. Proc. Natl. Acad. Sci. U. S. A., 100(24):14223–14228, November 2003.
10. Dong Yu, Maria C Silva, and Thomas Shenk. Functional map of human cytomegalovirus AD169 defined by global mutational analysis. Proc. Natl. Acad. Sci. U. S. A., 100(21):12396–12401, October 2003.
11. Ellen Van Damme and Marnix Van Loock. Functional annotation of human cytomegalovirus gene products: an update. Front. Microbiol., 5:218, May 2014.
12. A Louise McCormick, Anna Skaletskaya, Peter A Barry, Edward S Mocarski, and Victor S Goldmacher. Differential function and expression of the viral inhibitor of caspase 8-induced apoptosis (vICA) and the viral mitochondria-localized inhibitor of apoptosis (vMIA) cell death suppressors conserved in primate and rodent cytomegaloviruses. Virology, 316(2):221–233, November 2003.
13. Luke A Gilbert, Max A Horlbeck, Britt Adamson, Jacqueline E Villalta, Yuwen Chen, Evan H Whitehead, Carla Guimaraes, Barbara Panning, Hidde L Ploegh, Michael C Bassik, Lei S Qi, Martin Kampmann, and Jonathan S Weissman. Genome-Scale CRISPR-Mediated control of gene repression and activation. Cell, 159(3):647–661, October 2014.
14. Max A Horlbeck, Luke A Gilbert, Jacqueline E Villalta, Britt Adamson, Ryan A Pak, Yuwen Chen, Alexander P Fields, Chong Yon Park, Jacob E Corn, Martin Kampmann, and Jonathan S Weissman. Compact and highly active next-generation libraries for CRISPR-mediated gene repression and activation. Elife, 5, September 2016.
15. T Compton, D M Nowlin, and N R Cooper. Initiation of human cytomegalovirus infection requires initial interaction with cell surface heparan sulfate. Virology, 193(2):834–841, April 1993.
16. Stefanie Hetzenecker, Ari Helenius, and Magdalena Anna Krzyzaniak. HCMV induces macropinocytosis for host cell entry in fibroblasts. Traffic, 17(4):351–368, April 2016.
17. Ganesh V Pusapati, Giovanni Luchetti, and Suzanne R Pfeffer. Ric1-Rgp1 complex is a guanine nucleotide exchange factor for the late golgi Rab6A GTPase and an effector of the medial golgi Rab33B GTPase. Journal of Biological Chemistry, 287(50):42129–42137, 2012.
18. Richard D Smith, Rose Willett, Tetyana Kudlyk, Irina Pokrovskaya, Adrienne W Paton, James C Paton, and Vladimir V Lupashin. The COG complex, rab6 and COPI define a novel golgi retrograde trafficking pathway that is exploited by SubAB toxin. Traffic, 10(10):1502–1517, October 2009.
19. Sicen Liu, Monika Dominska-Ngowe, and Derek Michael Dykxhoorn. Target silencing of components of the conserved oligomeric golgi complex impairs HIV-1 replication. Virus Res., 192:92–102, November 2014.
20. Michael C Bassik, Martin Kampmann, Robert Jan Lebbink, Shuyi Wang, Marco Y Hein, Ina Poser, Jimena Weibezahn, Max A Horlbeck, Siyuan Chen, Matthias Mann, Anthony A Hyman, Emily M LeProust, Michael T McManus, and Jonathan S Weissman. A systematic mammalian genetic interaction map reveals pathways underlying ricin susceptibility. Cell, 152(4):909–922, 2013.
21. Andrey S Selyunin, Lakesla R Iles, Geoffrey Bartholomeusz, and Somshuvra Mukhopadhyay. Genome-wide siRNA screen identifies UNC50 as a regulator of shiga toxin 2 trafficking. J. Cell Biol., 216(10):3249–3262, October 2017.
22. Georgina P Starling, Yan Y Yip, Anneri Sanger, Penny E Morton, Emily R Eden, and Mark P Dodding. Folliculin directs the formation of a Rab34-RILP complex to control the nutrient-dependent dynamic distribution of lysosomes. EMBO Rep., 17(6):823–841, June 2016.
23. Przemyslaw A Filipek, Mariana E G de Araujo, Georg F Vogel, Cedric H De Smet, Daniela Eberharter, Manuele Rebsamen, Elena L Rudashevskaya, Leopold Kremser, Teodor Yordanov, Philipp Tschaikner, Barbara G Fürnrohr, Stefan Lechner, Theresia Dunzendorfer-Matt, Klaus Scheffzek, Keiryn L Bennett, Giulio Superti-Furga, Herbert H Lindner, Taras Stasyk, and Lukas A Huber. LAMTOR/Ragulator is a negative regulator of arl8b- and BORC-dependent late endosomal positioning. J. Cell Biol., 216(12):4199–4215, December 2017.
24. Jing Pu, Tal Keren-Kaplan, and Juan S Bonifacino. A Ragulator-BORC interaction controls lysosome positioning in response to amino acid availability. J. Cell Biol., 216(12):4183–4197, December 2017.
25. Tanja Becker, Vu Thuy Khanh Le-Trilling, and Mirko Trilling. Cellular cullin RING ubiquitin ligases: Druggable host dependency factors of cytomegaloviruses. Int. J. Mol. Sci., 20(7), April 2019.
26. Vu Thuy Khanh Le-Trilling, Dominik A Megger, Benjamin Katschinski, Christine D Landsberg, Meike U Rückborn, Sha Tao, Adalbert Krawczyk, Wibke Bayer, Ingo Drexler, Matthias Tenbusch, Barbara Sitek, and Mirko Trilling. Broad and potent antiviral activity of the NAE inhibitor MLN4924. Sci. Rep., 6:19977, February 2016.
27. Melanie Ott, Débora Marques, Christina Funk, and Susanne M Bailer. Asna1/TRC40 that mediates membrane insertion of tail-anchored proteins is required for efficient release of herpes simplex virus 1 virions. Virol. J., 13(1):175, October 2016.
28. Hironori Nishitsuji, Saneyuki Ujino, Keisuke Harada, and Kunitada Shimotohno. TIP60 complex inhibits hepatitis B virus transcription. J. Virol., 92(6), March 2018.
29. Britt Adamson, Thomas M Norman, Marco Jost, Min Y Cho, James K Nuñez, Yuwen Chen, Jacqueline E Villalta, Luke A Gilbert, Max A Horlbeck, Marco Y Hein, Ryan A Pak, Andrew N Gray, Carol A Gross, Atray Dixit, Oren Parnas, Aviv Regev, and Jonathan S Weissman. A multiplexed Single-Cell CRISPR screening platform enables systematic dissection of the unfolded protein response. Cell, 167(7):1867–1882.e21, December 2016.
30. Yifat Cohen, Márton Megyeri, Oscar C W Chen, Giuseppe Condomitti, Isabelle Riezman, Ursula Loizides-Mangold, Alaa Abdul-Sada, Nitzan Rimon, Howard Riezman, Frances M Platt, Anthony H Futerman, and Maya Schuldiner. The yeast p5 type ATPase, spf1, regulates manganese transport into the endoplasmic reticulum. PLoS One, 8(12):e85519, December 2013.
31. Konstantinos Tzelepis, Hiroko Koike-Yusa, Etienne De Braekeleer, Yilong Li, Emmanouil Metzakopian, Oliver M Dovey, Annalisa Mupo, Vera Grinkevich, Meng Li, Milena Mazan, Malgorzata Gozdecka, Shuhei Ohnishi, Jonathan Cooper, Miten Patel, Thomas McKerrell, Bin Chen, Ana Filipa Domingues, Paolo Gallipoli, Sarah Teichmann, Hannes Ponstingl, Ultan McDermott, Julio Saez-Rodriguez, Brian J P Huntly, Francesco Iorio, Cristina Pina, George S Vassiliou, and Kosuke Yusa. A CRISPR dropout screen identifies genetic vulnerabilities and therapeutic targets in acute myeloid leukemia. Cell Rep., 17(4):1193–1205, October 2016.
32. Liliana Soroceanu, Armin Akhavan, and Charles S Cobbs. Platelet-derived growth factor-α receptor activation is required for human cytomegalovirus infection. Nature, 455(7211):391–395, 2008.
33. Anna Kabanova, Jessica Marcandalli, Tongqing Zhou, Siro Bianchi, Ulrich Baxa, Yaroslav Tsybovsky, Daniele Lilleri, Chiara Silacci-Fregni, Mathilde Foglierini, Blanca Maria Fernandez-Rodriguez, Aliaksandr Druz, Baoshan Zhang, Roger Geiger, Massimiliano Pagani, Federica Sallusto, Peter D Kwong, Davide Corti, Antonio Lanzavecchia, and Laurent Perez. Platelet-derived growth factor-α receptor is the cellular receptor for human cytomegalovirus gHgLgO trimer. Nat Microbiol, 1(8):16082, June 2016.
34. Nadia Martinez-Martin, Jessica Marcandalli, Christine S Huang, Christopher P Arthur, Michela Perotti, Mathilde Foglierini, Hoangdung Ho, Annie M Dosey, Stephanie Shriver, Jian Payandeh, Alexander Leitner, Antonio Lanzavecchia, Laurent Perez, and Claudio Ciferri. An unbiased screen for human cytomegalovirus identifies neuropilin-2 as a central viral receptor. Cell, July 2018.
35. Xiaofei E, Paul Meraner, Ping Lu, Jill M Perreira, Aaron M Aker, William M McDougall, Ronghua Zhuge, Gary C Chan, Rachel M Gerstein, Patrizia Caposio, Andrew D Yurochko, Abraham L Brass, and Timothy F Kowalik. OR14I1 is a receptor for the human cytomegalovirus pentameric complex and defines viral epithelial cell tropism. Proc. Natl. Acad. Sci. U. S. A., 116(14):7043–7052, April 2019.
36. Tim Wang, Kıvanç Birsoy, Nicholas W Hughes, Kevin M Krupczak, Yorick Post, Jenny J Wei, Eric S Lander, and David M Sabatini. Identification and characterization of essential genes in the human genome. Science, 350(6264):1096–1101, November 2015.
37. Robert J Ihry, Kathleen A Worringer, Max R Salick, Elizabeth Frias, Daniel Ho, Kraig Theriault, Sravya Kommineni, Julie Chen, Marie Sondey, Chaoyang Ye, Ranjit Randhawa, Tripti Kulkarni, Zinger Yang, Gregory McAllister, Carsten Russ, John Reece-Hoyes, William Forrester, Gregory R Hoffman, Ricardo Dolmetsch, and Ajamete Kaykas. p53 inhibits CRISPR-Cas9 engineering in human pluripotent stem cells. Nat. Med., 24(7):939–946, July 2018.
38. David W Morgens, Richard M Deans, Amy Li, and Michael C Bassik. Systematic comparison of CRISPR/Cas9 and RNAi screens for essential genes. Nat. Biotechnol., 34(6):634–636, June 2016.
39. Paul Datlinger, André F Rendeiro, Christian Schmidl, Thomas Krausgruber, Peter Traxler, Johanna Klughammer, Linda C Schuster, Amelie Kuchler, Donat Alpar, and Christoph Bock. Pooled CRISPR screening with single-cell transcriptome readout. Nat. Methods, 14(3):297–301, March 2017.
40. Atray Dixit, Oren Parnas, Biyu Li, Jenny Chen, Charles P Fulco, Livnat Jerby-Arnon, Nemanja D Marjanovic, Danielle Dionne, Tyler Burks, Raktima Raychowdhury, Britt Adamson, Thomas M Norman, Eric S Lander, Jonathan S Weissman, Nir Friedman, and Aviv Regev. Perturb-Seq: Dissecting molecular circuits with scalable Single-Cell RNA profiling of pooled genetic screens. Cell, 167(7):1853–1866.e17, December 2016.
41. Diego Adhemar Jaitin, Assaf Weiner, Ido Yofe, David Lara-Astiaso, Hadas Keren-Shaul, Eyal David, Tomer Meir Salame, Amos Tanay, Alexander van Oudenaarden, and Ido Amit. Dissecting immune circuits by linking CRISPR-Pooled screens with Single-Cell RNA-Seq. Cell, 167(7):1883–1896.e15, December 2016.
42. Fabio Zanini, Szu-Yuan Pu, Elena Bekerman, Shirit Einav, and Stephen R Quake. Single-cell transcriptional dynamics of flavivirus infection. eLife, 7, 2018.
43. Alistair B Russell, Cole Trapnell, and Jesse D Bloom. Extreme heterogeneity of influenza virus infection in single cells. Elife, 7, February 2018.
44. Florian Erhard, Marisa A P Baptista, Tobias Krammer, Thomas Hennig, Marius Lange, Panagiota Arampatzi, Christopher S Jürges, Fabian J Theis, Antoine-Emmanuel Saliba, and Lars Dölken. scSLAM-seq reveals core features of transcription dynamics in single cells. Nature, 571(7765):419–423, July 2019.
45. Nir Drayman, Parthiv Patel, Luke Vistain, and Savaş Tay. HSV-1 single-cell analysis reveals the activation of anti-viral and developmental programs in distinct sub-populations. eLife, 8:e46339, 2019.
46. Emanuel Wyler, Vedran Franke, Jennifer Menegatti, Christine Kocks, Anastasiya Boltengagen, Samantha Praktiknjo, Barbara Walch-Rueckheim, Nikolaus Rajewsky, Friedrich Graesser, Altuna Akalin, and Markus Landthaler. Single-cell RNA-sequencing of Herpes simplex virus 1-infected cells identifies NRF2 activation as an antiviral program. bioRxiv, page 566992, 2019.
47. Caleb McKinney, Jiri Zavadil, Christopher Bianco, Lora Shiflett, Stuart Brown, and Ian Mohr. Global reprogramming of the cellular translational landscape facilitates cytomegalovirus replication. Cell Rep., 6(6):1175, March 2014.
48. Jesper B Andersen, Krystyna Mazan-Mamczarz, Ming Zhan, Myriam Gorospe, and Bret A Hassel. Ribosomal protein mRNAs are primary targets of regulation in RNase-L-induced senescence. RNA Biol., 6(3):305–315, July 2009.
49. Emma Abernathy and Britt Glaunsinger. Emerging roles for RNA degradation in viral replication and antiviral defense. Virology, 479-480:600–608, May 2015.
50. G T Spear, N S Lurain, C J Parker, M Ghassemi, G H Payne, and M Saifuddin. Host cell-derived complement control proteins CD55 and CD59 are incorporated into the virions of two unrelated enveloped viruses. human T cell leukemia/lymphoma virus type I (HTLV-I) and human cytomegalovirus (HCMV). J. Immunol., 155(9):4376–4381, November 1995.
51. Boris Bogdanow, Henry Weisbach, Jens von Einem, Sarah Straschewski, Sebastian Voigt, Michael Winkler, Christian Hagemeier, and Lüder Wiebusch. Human cytomegalovirus tegument protein pp150 acts as a cyclin A2-CDK-dependent sensor of the host cell cycle and differentiation state. Proc. Natl. Acad. Sci. U. S. A., 110(43):17510–17515, October 2013.
52. Laura Hertel and Edward S Mocarski. Global analysis of host cell gene expression late during cytomegalovirus infection reveals extensive dysregulation of cell cycle gene expression and induction of pseudomitosis independent of us28 function. Journal of virology, 78(21):11988–12011, 2004.
53. Edward S Mocarski, Jr, Thomas Shenk, Paul D Griffiths, and Robert F Pass. Cytomegaloviruses. In David M Knipe and Peter M Howley, editors, Fields Virology, pages 1960–2014. Lippincott Williams & Wilkins, June 2013.
54. Uta Müller-Kuller, Mania Ackermann, Stephan Kolodziej, Christian Brendel, Jessica Fritsch, Nico Lachmann, Hana Kunkel, Jörn Lausen, Axel Schambach, Thomas Moritz, and Manuel Grez. A minimal ubiquitous chromatin opening element (UCOE) effectively prevents silencing of juxtaposed heterologous promoters by epigenetic remodeling in multipotent and pluripotent stem cells. Nucleic Acids Res., 43(3):1577–1592, February 2015.
55. John G Doench, Nicolo Fusi, Meagan Sullender, Mudra Hegde, Emma W Vaimberg, Katherine F Donovan, Ian Smith, Zuzana Tothova, Craig Wilen, Robert Orchard, Herbert W Virgin, Jennifer Listgarten, and David E Root. Optimized sgRNA design to maximize activity and minimize off-target effects of CRISPR-Cas9. Nat. Biotechnol., 34(2):184–191, February 2016.
56. Eva Maria Borst, Gabriele Hahn, Ulrich H Koszinowski, and Martin Messerle. Cloning of the human cytomegalovirus (HCMV) genome as an infectious bacterial artificial chromosome in Escherichia coli: a new approach for construction of HCMV mutants. J. Virol., 73(10):8320–8329, October 1999.
57. Joshua G Dunn and Jonathan S Weissman. Plastid: nucleotide-resolution analysis of next-generation sequencing and genomics data. BMC Genomics, 17(1):958, November 2016.
58. L Van Der Maaten and G Hinton. Visualizing high-dimensional data using t-SNE. J. Mach. Learn. Res., 9:26, 2008.
59. Raul Garreta and Guillermo Moncecchi. Learning scikit-learn: Machine Learning in Python. Packt Publishing Ltd, November 2013.
60. Etienne Becht, Leland McInnes, John Healy, Charles-Antoine Dutertre, Immanuel W H Kwok, Lai Guan Ng, Florent Ginhoux, and Evan W Newell. Dimensionality reduction for visualizing single-cell data using UMAP. Nat. Biotechnol., December 2018.
61. Xiaojie Qiu, Qi Mao, Ying Tang, Li Wang, Raghav Chawla, Hannah A Pliner, and Cole Trapnell. Reversed graph embedding resolves complex single-cell trajectories. Nature methods, 14(10):979, 2017.
62. Jeffrey Heer, Nicholas Kong, and Maneesh Agrawala. Sizing the horizon: the effects of chart size and layering on the graphical perception of time series visualizations. In Proceedings of the 27th international conference on Human factors in computing systems - CHI 09, page 1303, New York, New York, USA, 2009. ACM Press.
63. Eain Murphy, Isidore Rigoutsos, Tetsuo Shibuya, and Thomas E Shenk. Reevaluation of human cytomegalovirus coding potential. Proc. Natl. Acad. Sci. U. S. A., 100(23):13585–13590, November 2003.


Materials and Methods

Cell and virus culture

Human foreskin fibroblasts (HFFs; CRL-1634) and HCMV (strain Merlin; VR-1590) were purchased from the American Type Culture Collection. HFFs were cultured in DMEM, supplemented with 10 % FBS and penicillin-streptomycin. HCMV stocks were expanded by two rounds of propagation on HFFs and titered by serial dilution. For stable expression of the CRISPRi/n machineries in HFFs, we modified established lentiviral (d)Cas9 expression vectors (13) by inserting a minimal ubiquitous chromatin opening element (UCOE) (54) upstream of the SFFV promoter, resulting in pMH0001 (UCOE-SFFV-dCas9-BFP-KRAB) and pMH0004 (UCOE-SFFV-Cas9-BFP). The UCOE prevented the epigenetic silencing that affected the original constructs.

Pooled CRISPR screening

The HCMV tiling library was designed to contain sgRNAs targeting every one of the 33,465 PAMs in the HCMV Merlin genome (NCBI NC_006273.2), as well as 533 non-targeting controls (Table S1). It was synthesized and cloned into a lentiviral vector (Addgene #84832) as previously described (13, 14). For targeting host genes, we used the human CRISPRi v2 library (Addgene #83969) (14) and the Yusa et al. human knockout CRISPR v1 library (Addgene #67989) (31), respectively. Libraries were packaged into lentiviruses and delivered into (d)Cas9-expressing HFFs at an MOI of 0.3–0.5, followed by puromycin selection. Pooled screens were carried out at 500–1,000 × coverage, i.e. ∼500–1,000 cells per library element per sample taken. A t0 sample was harvested and the remaining cells were either passaged normally or infected with HCMV at an MOI of 0.5–1.0 (for the HCMV tiling screens) or 0.1 (for the host-directed screens). Infected flasks were washed with PBS and given fresh media at days 3, 5 and 7 post infection to remove dead cells, and harvested at day 9–10. Genomic DNA was extracted and digested with MfeI (pCRISPRia v2-based libraries) or HindIII (Yusa et al. library) to release a fragment containing the sgRNA cassette, followed by gel-based extraction, PCR amplification and deep sequencing as described (14). Raw count data were normalized for read depth and a small constant was added to account for missing values. Phenotypes of individual sgRNAs were expressed as log2-transformed ratios of adjusted read counts between samples (Table S2). For host-directed screens, we calculated the mean of all sgRNAs specific to each host gene. For the HCMV tiling screen, we calculated a rolling average in a 200 bp window, with the average of all non-targeting sgRNAs defining the baseline.
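The phenotype calculation described above can be sketched as follows. This is a minimal illustration, not the analysis code used for the screens: the function names, the pseudocount of 1, the scaling to counts per million, and the choice to center the 200 bp window on each sgRNA position are all assumptions made for the example.

```python
import math
from statistics import mean

def normalize_counts(counts, pseudocount=1.0, scale=1e6):
    """Scale raw sgRNA read counts to a common depth, then add a small constant.

    Assumption: depth normalization to counts per million and a pseudocount
    of 1; the text only states 'read depth' and 'a small constant'.
    """
    total = sum(counts.values())
    return {sg: c / total * scale + pseudocount for sg, c in counts.items()}

def log2_phenotypes(before, after, pseudocount=1.0):
    """log2 ratio of adjusted read counts between two samples, per sgRNA."""
    b = normalize_counts(before, pseudocount)
    a = normalize_counts(after, pseudocount)
    return {sg: math.log2(a[sg] / b[sg]) for sg in b}

def gene_phenotypes(phenotypes, sgrna_to_gene):
    """Mean phenotype of all sgRNAs assigned to each host gene."""
    by_gene = {}
    for sg, value in phenotypes.items():
        gene = sgrna_to_gene.get(sg)
        if gene is None:  # e.g. non-targeting controls
            continue
        by_gene.setdefault(gene, []).append(value)
    return {gene: mean(values) for gene, values in by_gene.items()}

def rolling_average(positions, phenotypes, baseline, window=200):
    """Baseline-subtracted rolling mean along the viral genome.

    positions: sgRNA -> genome coordinate (bp). The window is centered on
    each sgRNA position (an assumption). baseline is the mean phenotype of
    the non-targeting sgRNAs.
    """
    items = sorted((positions[sg], phenotypes[sg]) for sg in positions)
    half = window / 2
    result, lo, hi, running = [], 0, 0, 0.0
    for pos, _ in items:
        while hi < len(items) and items[hi][0] <= pos + half:
            running += items[hi][1]
            hi += 1
        while items[lo][0] < pos - half:
            running -= items[lo][1]
            lo += 1
        result.append((pos, running / (hi - lo) - baseline))
    return result
```

The two-pointer sweep keeps the rolling average linear in the number of sgRNAs after sorting, which matters for a tiling library with tens of thousands of elements.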

Single-cell RNA-seq

For the single-cell infection time course, WT HFFs were lentivirally transduced with barcoded Perturb-seq vectors to encode the experimental condition (pBA571, Addgene #85968; Table S3, sheet 2), followed by puromycin selection. Cells were seeded at a density of 250,000 per well of a 12-well plate and infected with an MOI of 0.5 (low) or 5.0 (high). Infection times were staggered so that all time points for a given MOI were harvested in parallel, pooled, prepared for single-cell transcriptomics using one lane each of the Chromium Single Cell 3′ Gene Expression Solution v2 according to manufacturer’s instructions (10x Genomics), and sequenced on a NovaSeq platform (Illumina) at ∼100,000 reads/cell. Barcodes encoding the experimental condition were PCR-amplified from the final library and sequenced separately as described (29).

Perturb-seq

For the host-directed CRISPRi Perturb-seq experiment, we initially selected 53 candidate genes by their strong protective or sensitizing phenotypes in the pooled screen (one gene was later removed during analysis, see below). We manually picked the two best-performing sgRNAs for each candidate. Additionally, we added six non-targeting control constructs (targeting GFP, which is not present in our HFFs). For the host- and virus-directed CRISPRn Perturb-seq experiment, we selected a set of 21 host factors, of which 19 were already among the targets of the CRISPRi Perturb-seq experiment, had no strong essentiality knockout phenotypes, and had comparable protective or sensitizing phenotypes in both the pooled host-directed CRISPRi and CRISPRn screens (see Fig. S3). We further added PDGFRA and FLCN, both of which were strong hits in the pooled CRISPRn screen. For each host target, we manually picked the two best-performing sgRNAs. In addition, we selected 31 viral targets with strong protective or sensitizing phenotypes, corresponding to the three strongest modules identified in the HCMV tiling screen (see Fig. S2). From the tiling screen, we selected the two highest-ranking sgRNAs for each target gene based on the following scoring system: From the pool of unique sgRNAs falling within the gene boundaries and having a Doench score (55) of >0.5, we calculated the absolute average phenotype across replicates and subtracted a penalty defined as the difference between replicates plus the average absolute essentiality phenotypes (on a log2 scale). We designed a number of safe-targeting control sgRNAs targeting intergenic DNA in the US2–US12 region. This region was selected based on its near-neutral phenotypes in the tiling screen (Fig. S2), a lack of essential genes (9, 10), and its comparatively large spaces between consensus genes.
Further, in some bacterial artificial chromosome (BAC) constructs harboring HCMV genomes, this region has been replaced by the BAC backbone, underlining its non-essential nature during infection in tissue culture (56). We picked five sgRNAs based on their Doench scores from a pool of unique sgRNAs targeting the intergenic regions and having survival and essentiality phenotypes of <0.5 (log2 scale) in all replicates. In addition, we included four control sgRNAs directed against safe-harbor loci in the host genome that we repurposed from gene knock-in applications.
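The scoring system used to rank sgRNAs from the tiling screen can be written out as a short sketch. The record layout (`name`, `unique`, `doench`, `survival`, `essentiality`) and the function names are hypothetical, and taking the replicate spread as max minus min is an assumption that reduces to the absolute difference when there are two replicates.

```python
from statistics import mean

def guide_score(survival, essentiality):
    """Score = |mean survival phenotype| - penalty (all values on a log2 scale).

    penalty = spread between replicates + mean |essentiality phenotype|.
    Assumption: the spread is max - min across the survival replicates.
    """
    effect = abs(mean(survival))
    spread = max(survival) - min(survival)
    off_target_cost = mean([abs(x) for x in essentiality])
    return effect - (spread + off_target_cost)

def pick_guides(candidates, n=2, min_doench=0.5):
    """Return the names of the n highest-scoring eligible sgRNAs for one gene.

    candidates: list of dicts with hypothetical keys
    'name', 'unique', 'doench', 'survival', 'essentiality'.
    """
    eligible = [g for g in candidates if g["unique"] and g["doench"] > min_doench]
    ranked = sorted(
        eligible,
        key=lambda g: guide_score(g["survival"], g["essentiality"]),
        reverse=True,
    )
    return [g["name"] for g in ranked[:n]]
```

The penalty favors guides whose effect is reproducible across replicates and specific to infection, rather than guides that simply have the largest phenotype in one replicate.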


All sgRNAs were synthesized as individual oligo pairs (IDT) and cloned into a plasmid pool containing a barcode library (pBA571, Addgene #85968), thereby linking each sgRNA to a unique guide barcode contained within the 3′-UTR of the puromycin resistance gene (29). Barcodes were validated to not contain homo-oligomers or sequences resembling transcription termination signals. All sgRNA and barcode sequences are listed in Table S3. sgRNA vectors were packaged into lentiviruses individually, titered separately and pooled to ensure equal representation. We delivered the library into (d)Cas9-expressing HFFs at an MOI of 0.3 followed by puromycin selection. Cells were seeded at 250,000 per well of a 12-well plate and infected with HCMV at an MOI of 0.5 (for the host-directed CRISPRi experiment) or 5.0 (for the host- and virus-directed CRISPRn experiment). Cells were harvested in the uninfected state (0 h) and at 24, 48 and 72 h.p.i. We aimed at a representation of each sgRNA by ∼100 cells per time point. Cells were collected and prepared for single-cell RNA-seq using the 10x Chromium platform as described above for the single-cell infection time course. Libraries were sequenced on a HiSeq4000 or NovaSeq platform (Illumina) at ∼40,000 reads/cell.
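As a rough guide to the numbers above, the sketch below estimates what fraction of puromycin-selected cells carry exactly one sgRNA at a lentiviral MOI of 0.3, and how many cells a time point needs at ∼100 cells per sgRNA. It assumes Poisson-distributed integrations, which is a common simplification and not something stated in the text; the function names are hypothetical. The library size of 112 is simply 53 candidate genes × 2 sgRNAs + 6 controls from the CRISPRi design, before any sgRNAs were removed during analysis.

```python
import math


def single_copy_fraction(moi):
    """Fraction of transduced cells with exactly one integration (Poisson assumption)."""
    if moi <= 0:
        raise ValueError("moi must be positive")
    p0 = math.exp(-moi)   # probability of no integration
    p1 = moi * p0         # probability of exactly one integration
    return p1 / (1.0 - p0)


def cells_needed(n_sgrnas, cells_per_sgrna=100):
    """Target number of cells per time point for a given library size."""
    return n_sgrnas * cells_per_sgrna


fraction = single_copy_fraction(0.3)    # about 0.86
crispri_cells = cells_needed(53 * 2 + 6)  # 11200 cells per time point
```

Under this assumption roughly one in seven selected cells would carry more than one sgRNA, which is consistent with the later requirement that retained cells have one unique barcode assigned with high confidence.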

Single-cell data analysis Raw sequencing data were submitted to ‘cellranger’ (10x Genomics) according to the manufacturer's instructions. We compiled a reference transcriptome from the hg19 human genome and a custom assembly of HCMV coding transcripts based on our previous ribosome profiling dataset (3), as distributed with the ‘Plastid’ Python library demo dataset (57). We manually added four well-established lncRNA transcripts (RNA1.2, 2.7, 4.9, 5.0). Internal ORFs were removed as they would create ambiguous mappings, as were ORFs overlapping with the aforementioned lncRNAs. Cells retained in the final dataset had to cross the default cellranger quality thresholds, as well as have one unique lentiviral barcode assigned with high confidence (29). During data analysis of the Perturb-seq experiments, three CRISPRn sgRNAs targeting host genes were removed computationally because they were found to be inactive, as seen by lack of transcriptional responses and viral load patterns similar to control sgRNAs. One host gene, RBBP5, was similarly excluded from both the CRISPRi and CRISPRn datasets as it became apparent that its knockdown/knockout causes differentiation of cells and a strong transcriptional response, rather than true protection against infection. Viral loads were calculated as the fraction of total UMIs per cell mapping to viral genes. Gene expression was normalized first in each cell by the total unique molecular identifiers (UMIs) per cell and then z-scored on a per-gene basis across all cells. Human genes represented by <10,000 UMIs across all cells, as well as viral genes represented by <5,000 UMIs, were removed before hierarchical clustering. For heatmap representations of gene expression as a function of viral load, cells were binned by viral load and gene-level expression values averaged in each bin. Bin widths of 2 % or 10 % were selected depending on the available number of cells, with the lower bound indicated in the respective figures. 
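The viral load, normalization and binning steps can be illustrated with a small pure-Python sketch. It is only an illustration under stated assumptions: cells are represented as hypothetical dictionaries mapping gene names to UMI counts, the z-score uses the population standard deviation, and bins are identified by their index (lower bound = index × bin width). None of the function names come from the original analysis.

```python
import math
from statistics import mean, pstdev


def viral_load(umi_counts, viral_genes):
    """Fraction of a cell's UMIs that map to viral genes."""
    total = sum(umi_counts.values())
    viral = sum(n for gene, n in umi_counts.items() if gene in viral_genes)
    return viral / total if total else 0.0


def normalize_and_zscore(cells):
    """Scale each cell by its total UMIs, then z-score every gene across cells."""
    genes = sorted({gene for cell in cells for gene in cell})
    fractions = []
    for cell in cells:
        total = sum(cell.values())
        fractions.append({g: (cell.get(g, 0) / total if total else 0.0) for g in genes})
    z = [dict() for _ in cells]
    for g in genes:
        values = [f[g] for f in fractions]
        mu, sd = mean(values), pstdev(values)  # assumption: population SD
        for i, v in enumerate(values):
            z[i][g] = (v - mu) / sd if sd > 0 else 0.0
    return z


def bin_by_viral_load(loads, values, width=0.02):
    """Average one gene's values over cells grouped into viral load bins."""
    grouped = {}
    for load, value in zip(loads, values):
        index = math.floor(load / width + 1e-9)  # small epsilon guards float edges
        grouped.setdefault(index, []).append(value)
    return {index: mean(vals) for index, vals in sorted(grouped.items())}


cells = [{"GAPDH": 90, "UL123": 10}, {"GAPDH": 50, "UL123": 50}]
loads = [viral_load(c, {"UL123"}) for c in cells]
z = normalize_and_zscore(cells)
binned = bin_by_viral_load([0.01, 0.03, 0.035], [1.0, 2.0, 4.0], width=0.02)
```

With a bin width of 0.02 the example groups the cells at 3 % and 3.5 % viral load into the same 2–4 % bin, mirroring the 2 % increments used for the heatmaps.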
Cell cycle phases were scored based on marker genes as described (29). Using a similar approach, we calculated an IFN score by summing and subsequently z-scoring the relative expression values of the following set of robustly quantified ISGs (PSMB8, PSMB9, PSME1, PSME2, ISG15, ISG20, IRF7, MX1, MX2, GBP1, GBP2, GBP3, IFI6, IFI44, IFI35, IFI16, IFI27, IFIH1, IFI44L, IFIT1, IFIT2, IFIT3, IFIT5, IFITM1, IFITM2, IFITM3, EIF2AK2, OAS1, OAS2, OAS3, CNP, PLSCR1, BST2, BTN3A2, XAF1, CASP1, CASP4, CASP7, GSDMD). Dimensionality reduction was performed by t-SNE (58) implemented in scikit-learn (59) or UMAP (60) implemented in Monocle (61). Trajectories were based on UMAP projections. To determine trajectories, selected cells were ranked by viral load and the geometric position of cells averaged in a sliding window that was shifted in increments of 0.2 window sizes. Window sizes were selected based on the total number of available cells: Perturb-seq experiments: 100 cells; unperturbed cells, dominant trajectory: 500 cells, alternate trajectory: 50 cells.
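A minimal sketch of the IFN score and of the sliding-window trajectory calculation is given below. It assumes that relative expression values are available per cell as hypothetical dictionaries and that embedding coordinates are (x, y) tuples; the population standard deviation and the rounding of the window step are choices made for this example, not details taken from the text.

```python
from statistics import mean, pstdev


def zscore(values):
    """Z-score a list of numbers (population SD; all zeros if there is no spread)."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd if sd > 0 else 0.0 for v in values]


def ifn_score(relative_expression, isgs):
    """Sum relative expression of the ISG set per cell, then z-score across cells."""
    raw = [sum(cell.get(g, 0.0) for g in isgs) for cell in relative_expression]
    return zscore(raw)


def trajectory(loads, coords, window=100, step_fraction=0.2):
    """Rank cells by viral load and average their 2D positions in a sliding window."""
    order = sorted(range(len(loads)), key=lambda i: loads[i])
    step = max(1, round(window * step_fraction))  # shift by 0.2 window sizes
    points = []
    for start in range(0, len(order) - window + 1, step):
        members = order[start:start + window]
        x = mean(coords[i][0] for i in members)
        y = mean(coords[i][1] for i in members)
        points.append((x, y))
    return points


scores = ifn_score([{"MX1": 1.0, "ISG15": 1.0}, {"MX1": 3.0, "ISG15": 1.0}],
                   ["MX1", "ISG15"])
path = trajectory([0.3, 0.1, 0.2, 0.4, 0.0],
                  [(3, 3), (1, 1), (2, 2), (4, 4), (0, 0)],
                  window=2, step_fraction=0.5)
```

In the toy example the five cells lie on a diagonal in embedding space, so the averaged trajectory points advance along that diagonal in order of increasing viral load.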


[Figure S1 panels, in processing order: raw sgRNA ratios; 200 bp rolling average, with baseline = average of non-targeting sgRNAs; split into bands and mirror negative space; layer bands for final horizon graph.]

Fig. S1. Data processing for the HCMV tiling screen. We calculated log2 ratios of each individual sgRNA in the surviving over the t0 populations. Ratios were averaged in a sliding 200 bp window. The average of the ratios of the non-targeting sgRNA population was set as the baseline. The plot was then colored based on the sign of the average phenotype (reds: positive sign, protective phenotype; blues: negative sign, sensitizing phenotype) and layered in bands of decreasing lightness, one log2 unit wide. The negative space was mirrored on the baseline, and bands were stacked for the final horizon plot representation (62). For high resolution see Fig. S2.
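The processing steps in this caption can be sketched in a few lines of Python. The sketch makes several assumptions that are not in the caption: the 200 bp window is centered on each sgRNA position, three bands are drawn, and each band is described by a fill level between 0 and 1. Function names are hypothetical.

```python
from statistics import mean


def rolling_average(positions, ratios, window_bp=200):
    """Average log2 ratios of all sgRNAs within a centered window around each sgRNA."""
    half = window_bp / 2
    smoothed = []
    for center in positions:
        nearby = [r for p, r in zip(positions, ratios) if abs(p - center) <= half]
        smoothed.append(mean(nearby))
    return smoothed


def subtract_baseline(values, non_targeting_ratios):
    """Set the average of the non-targeting sgRNA ratios as the zero line."""
    baseline = mean(non_targeting_ratios)
    return [v - baseline for v in values]


def horizon_bands(value, band_width=1.0, n_bands=3):
    """Split a value into stacked bands, one log2 unit wide, with the sign kept apart.

    Returns (sign, fills): sign is +1 (protective, reds), -1 (sensitizing, blues)
    or 0; fills[i] is how much of band i is covered, from 0.0 to 1.0. Mirroring
    the negative space corresponds to using the absolute value here.
    """
    magnitude = abs(value)
    fills = [min(max(magnitude - i * band_width, 0.0), band_width) / band_width
             for i in range(n_bands)]
    sign = (value > 0) - (value < 0)
    return sign, fills


smoothed = rolling_average([0, 100, 150, 400], [1.0, 3.0, 2.0, -4.0])
centered = subtract_baseline(smoothed, [0.4, 0.6])
sign, fills = horizon_bands(-1.5)
```

A value of -1.5 fills the first blue band completely and the second band halfway, which is how stacking bands of decreasing lightness encodes magnitude in a compact vertical space.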


[Figure S2 legend: color scale, log2 enrichment from -2.0 (diminished host survival) to 2.0 (enhanced host survival); annotations: viral ORF essential for replication, viral ORF with impaired replication when deleted, lytic origin, repeat region (non-unique sgRNAs).]

Fig. S2. High-resolution phenotypic landscape of the HCMV genome. Horizon graph representation (62) of the phenotypic landscape with mirrored and stacked bands. Shades of blue denote sensitization to host cell death, shades of red denote protection from cell death upon HCMV genome cleavage. Major features of the HCMV genome are annotated. sgRNAs targeting internal and terminal repeat regions (hashed) typically have multiple target sites and likely result in higher-order fragmentation of the HCMV genome, exacerbating their respective phenotypes. Viral ORFs are classified by their essentiality for viral replication based on Dunn et al. (9). ORFL150C, ORFL151C (originally named UL59, but thought to not be expressed as a protein (63), causing it to be dropped from the consensus annotation), and ORFL152C were the only short ORFs with strong phenotypes in areas of the genome devoid of consensus genes. UL48 was the only gene that showed a substantial phenotype gradient within its gene body: cutting the N-terminal region caused mild sensitization to death upon infection, whereas cutting the C-terminus had the opposite effect.


[Figure S3 panels: (A) CRISPRn screen, log2(uninfected/t0) (y-axis) vs. log2(surviving/t0) (x-axis), with essential host genes and regions of diminished and enhanced host survival indicated; (B) CRISPRn log2(surviving/t0) (y-axis, stronger knockout phenotype) vs. CRISPRi log2(surviving/t0) (x-axis, stronger knockdown phenotype).]

Fig. S3. Host-directed CRISPRn screen results and CRISPRi/n comparison. (A) Results of host-directed CRISPRn screen displayed as a scatter plot of average gene essentiality (i.e. infection-independent phenotype; y-axis) vs. protection/sensitization to death upon HCMV infection (i.e. infection-dependent phenotypes; x-axis). Note that due to the experimental design of the screen, the apparent gene essentiality phenotypes underestimate the real essentiality because t0 refers to the beginning of HCMV infection, not lentiviral delivery of the sgRNA library. (B) Direct comparison of CRISPRi and CRISPRn phenotypes for host targets represented in both libraries. Hits involved in viral adhesion and entry, as well as host cell survival or apoptosis, are more pronounced in the CRISPRn screen. Cullin/RING pathway members and some vesicle trafficking factors were only resolved in the CRISPRi screen.

[Figure S4 panels A–C. Axis labels: IC 1, IC 2; interferon score [AU]; % viral RNA; #UMIs/cell. Time points: 0, 6, 20, 28, 48, 72, 96 and 120 h. Conditions: high MOI, low MOI.]

Fig. S4. Interferon activation and viral load are the main sources of variability between cells. (A) Independent component (IC) analysis shows that the largest sources of variability, i.e. components 1 and 2, can be explained by the extent of expression of interferon-stimulated genes (represented by the interferon score; see Materials and methods) and viral load (i.e. the fraction of viral RNA), respectively. (B) Interferon scores of individual cells plotted against their viral loads, broken down by experimental time point and multiplicity of infection (MOI). The majority of cells mount an interferon response as soon as a minority of cells become infected (at 6 h.p.i.), likely through paracrine signaling. In early infected cells, interferon activation is then rapidly suppressed (compare 20 and 28 h.p.i.) and returns almost to baseline in later-stage infected cells. Differences between low and high MOI are most apparent at 72 h.p.i., where almost all cells challenged with a high MOI are infected, whereas an uninfected subpopulation remains after low MOI infection. (C) Number of unique molecular identifiers (UMIs) per cell as a function of viral load, broken down by experimental time point. UMI counts can serve as a proxy for total RNA content per cell, and are positively correlated with viral load, indicating a higher cellular RNA content of infected cells.

[Figure S5 heatmaps: normalized expression (z-scored across cells) of viral transcripts, grouped into immediate early, delayed early, intermediate, leaky late and true late clusters, vs. % viral RNA (0–70 %), for all cells (left) and alternate trajectory cells (right). Annotation tracks: temporal class (1–5, n.d.; Weekes et al. 2014), log2(phenotype) (-2 to 2), genome position (UL, US).]

Fig. S5. Expression pattern of viral genes as a function of viral load. Cells were binned and averaged in viral load increments of 2 % (all cells, left) and 10 % (alternate trajectory cells, right) and viral transcripts clustered based on the pattern observed with all cells. This is a high-resolution version of Fig. 2D, E. Viral transcripts are annotated by their temporal class as described by Weekes et al. (4), by their phenotype in the HCMV tiling screen (see Fig. 1B and Fig. S2), and by their position in the viral genome (green, unique long (UL) branch; purple, unique short (US) branch; increasing saturation towards the terminal regions). Note the relationship between a gene’s temporal class and its phenotype in the pooled screen: True-late and leaky-late genes predominantly showed protective phenotypes, whereas earlier classes also contained sensitizing genes.


[Figure S6 heatmap: normalized expression of host transcripts vs. % viral RNA (0–70 %), with blown-up clusters showing individual gene labels.]

Fig. S6. Expression pattern of host genes as a function of viral load. Cells were binned and averaged in viral load increments of 2 % and host transcripts clustered hierarchically. The dominant pattern is, as expected, for host transcripts to decrease in abundance in proportion to the increase in viral load. A minority of transcripts stand out by either increasing with viral load (upper two blown-up clusters), peaking at medium viral load (middle cluster), or decreasing over-proportionally (bottom two clusters). Upregulated transcripts encode membrane proteins, trafficking factors and factors involved in lipid synthesis. Downregulated transcripts include highly abundant housekeeping transcripts such as those of ribosomal proteins, elongation factors, or GAPDH.


[Figure S7 panels: (A) % of cells in each cell cycle phase (M-G1, G1-S, S, G2-M, M) vs. % viral RNA (0–70 %); (B) fold enrichment of infected over uninfected cells in each cell cycle phase at 6, 20 and 28 h.p.i.]

Fig. S7. Shifts in the cell cycle distribution of infected cells. (A) Distribution of cells across cell cycle phases as a function of viral load. Infected cells assume a G1-like state and switch to an S-like state at ∼55 % of viral load, which also marks the onset of late-gene expression and viral genome replication (compare Figs. 2D, S5). (B) Ratio of cell cycle distributions of early infected cells (defined here as 2–10 % viral load) over uninfected cells (<1 % viral load) as a function of experimental time. Cells in M-G1 and G1-S are the ones permissive towards initial progression of infection (at 6 h.p.i.), and their progression towards G2/M phase is then suppressed (at 20–28 h.p.i.).

[Figure S8 heatmaps: host gene expression in naïve cells (left) and host gene expression in bystander cells (right), 98 host genes vs. sgRNA targets of the CRISPRi Perturb-seq experiment; color scale, log2(ratio) from -2 to 2.]

Fig. S8. Transcriptional response to host factor knockdown in naïve and bystander cells. Hierarchical clustering of the expression of a representative set of 98 host genes that show the strongest responses to any of the genetic perturbations in the CRISPRi Perturb-seq experiment, normalized by control cells. The left panel shows data from naïve cells (defined here as cells from the 0 h time point). The right panel shows data from bystander cells (defined here as cells with <1 % viral load from the 24–72 h.p.i. time points) and is a higher resolution version of Fig. 3C. The interferon response characteristic of bystander cells is suppressed by knockdown of IFN pathway members IFNAR2, IRF9 and STAT2. Knockdown of the susceptibility factors SEC61B and SEC62 shows mild suppression of ISGs as well. Knockdown of NEDD8, NAE1, UBA3 and RBX1 causes a transcriptional response that overlaps in part with the IFN-responsive genes, but with varying directionality.

[Figure S9 heatmaps: host gene expression in naïve cells (left) and host gene expression in bystander cells (right), the same 98 host genes vs. sgRNA targets of the CRISPRn Perturb-seq experiment; color scale, log2(ratio) from -2 to 2.]

Fig. S9. Transcriptome response to host factor knockout in naïve and bystander cells. Conceptually related to Fig. S8, showing the same clustering of 98 representative host genes in response to knockout of host factors in the CRISPRn Perturb-seq experiment. The left panel shows data from naïve cells; the right panel shows data from bystander cells, both defined as in Fig. S8. Patterns and directionality of the transcriptional responses are in agreement between host factor knockdown and knockout.


[Figure S10 heatmaps: normalized expression (z-scored across cells) of viral transcripts, grouped into immediate early, delayed early, intermediate, leaky late and true late clusters, vs. % viral RNA, for groups of trajectories labeled by sgRNA target(s) (sgUL35, sgUL52, sgUL69, sgUL112, sgUL115, sgUL43, sgUL105, sgUL148, and safe-targeting controls; sgHCMV and sgHost). Additional tracks: relative number of cells (0–100 %), log2(phenotype) (-2 to 2), genome position (UL, US).]

Fig. S10. Patterns of viral gene expression along the different groups of trajectories. Cells were binned and averaged in viral load increments of 10 % for all cells belonging to a given group of trajectories, and clustered based on the pattern observed in unperturbed cells to reflect their membership in the default temporal classes. This is a high-resolution version of Fig. 4E. Viral transcripts are annotated by their phenotype in the HCMV tiling screen (see Fig. 1B and Fig. S2), and by their position in the viral genome (green, unique long (UL) branch; purple, unique short (US) branch; increasing saturation towards the terminal regions).


[Figure S11 heatmaps: normalized expression of viral genes from ORFL300C (UL141) through ORFS371W (US34), in genome order, vs. % viral RNA, shown for each safe-targeting sgRNA and for the average of all cells with host-directed safe-targeting sgRNAs. Genome map: US2 to US12, genome positions 200,000–207,000.]

Fig. S11. Effect of safe-targeting sgRNAs on expression of genes adjacent to the cut sites. The clustering shows the viral gene expression patterns as a function of viral load, broken down by individual safe-targeting sgRNA. The sgRNAs were designed to target the comparatively large intergenic sequences in the non-essential US2–US12 region of the HCMV genome, and do not change the overall expression trajectories of infected cells. Viral genes are arranged in the order they are encoded in the genome. Red arrows denote which genes are adjacent to the cut sites in the expression heatmaps. Two of the five safe-targeting sgRNAs, one targeting a site immediately upstream of US6, and one between US11 and US12, show markedly reduced expression of an adjacent gene (US6 and US12, respectively). Note that another sgRNA targeting the US6–US7 intergenic space further away from the US6 N-terminus shows a much weaker effect on US6. All safe-targeting sgRNAs caused reduced expression of US6, US8 and US12, the most strongly expressed genes in a ∼5 kb window around the cut sites.
