The Pennsylvania State University

The Graduate School

College of Medicine

INTERACTION OF ONCORETROVIRAL GAG WITH HOST NUCLEAR FACTORS

A Dissertation in

Biomedical Sciences

by

Breanna Lynn Rice

© 2018 Breanna Lynn Rice

Submitted in Partial Fulfillment of the Requirements for the Degree of

Doctor of Philosophy

August 2018

The dissertation of Breanna Lynn Rice was reviewed and approved* by the following:

Leslie J. Parent
Vice Dean for Research and Graduate Studies
Professor of Medicine and Microbiology & Immunology
Co-Director MD/PhD Program
Physician, Department of Infectious Disease
Dissertation Advisor
Chair of Committee

Sarah Bronson
Associate Dean for Interdisciplinary Research
Associate Professor of Cellular & Molecular Physiology
Director, Research Development
Co-Director, Junior Faculty Development Program

Rebecca Craven
Professor of Microbiology and Immunology

Kristin Eckert
Professor of Pathology and Biochemistry & Molecular Biology
Associate Director of Education, Penn State Cancer Institute
Director, Office of Postdoctoral Affairs

Sergei Grigoryev
Professor of Biochemistry and Molecular Biology

Ralph Keil
Chair, Biomedical Sciences Graduate Program

*Signatures are on file in the Graduate School


Abstract

Retroviruses are obligate parasites that cause cancer and immunodeficiencies in humans and animals. Their RNA genomes undergo reverse transcription into a DNA intermediate that is integrated into the host genome. Following integration, the virus requires the host cell’s transcription machinery to produce the mRNAs needed for protein translation as well as the unspliced transcript that serves as the packaged genome. The structural protein of retroviruses, Gag, is responsible for binding the genome and packaging it into virions. The initial interaction between Gag and the genome was long thought to occur in the cytoplasm, where the mRNA is translated into the Gag protein. However, pioneering work in our laboratory has demonstrated that the Gag protein from Rous sarcoma virus (RSV) undergoes nuclear trafficking, which is necessary for efficient genomic RNA packaging. RSV Gag utilizes the nuclear import factors Importin α/β, Importin 11, and TNPO3 for nuclear entry, and the CRM1 pathway for nuclear egress. Since our discovery of the functional significance of Gag nuclear trafficking, the nuclear trafficking of other retroviral Gag proteins has begun to be studied more extensively.

RSV Gag forms discrete foci within the nucleus, and when Gag is trapped in the nucleus by blocking CRM1-dependent export, the number of foci increases significantly. These foci exhibit characteristics similar to those of host nuclear bodies. Gag proteins move in and out of the foci rapidly, with a half-time of recovery of approximately 8 seconds. The overall goal of the research performed for this dissertation is to characterize the nuclear foci of RSV Gag and determine what host protein(s) could be functioning as the tether for these foci. We also wanted to determine whether there are host factors that Gag may use within the nucleus to locate the genomic viral RNA. In this dissertation, I discuss my work demonstrating novel interactions that Gag may have with cellular nuclear factors.


Findings presented in this dissertation demonstrate that the RSV Gag nuclear foci display obstructed diffusion, indicating that the foci are tethered to a protein or RNA. To determine where in the nucleus Gag localizes, fractionation experiments were performed; these showed that RSV Gag and human immunodeficiency virus (HIV) Gag could both be extracted from euchromatin and heterochromatin protein fractions. To uncover host nuclear proteins with which Gag could be interacting, mass spectrometry experiments were performed, and it was found that RSV and HIV-1 Gag proteins may interact with transcription-related proteins, splicing factors, and chromatin remodeling proteins.

Colocalization experiments indicate that RSV Gag colocalizes with the splicing factors SC35 and SF2 to a high degree, and that when the expression of SC35 is increased in cells, the number of RSV Gag foci increases. We found that when expression of the nuclear import protein TNPO3 is increased, the amount of nuclear Gag increases. This is interesting because of the role TNPO3 has in transporting splicing factors to splicing speckles, suggesting that Gag may use TNPO3 to localize to splicing speckles.

Unpublished lab results show that the RSV and HIV Gag proteins associate with the viral RNA in the nucleus. While it has been published that certain Gag proteins or domains are able to interact with chromatin during integration, the work presented here, along with our unpublished results, leads us to hypothesize that Gag may be travelling to sites of transcription through interactions with cellular factors, which allows Gag to bind the viral RNA co-transcriptionally for packaging. This is a novel model that challenges the dogma of where retroviruses package their genome and presents new roles for Gag in the nucleus.


Table of Contents

List of Figures
List of Tables
Abbreviation List
Preface
Acknowledgements

Chapter 1: Literature Review
  1.1 Introduction
  1.2 Overview of Retrovirus Replication
  1.3 Nuclear Trafficking of Retroviral Gag Proteins
    1.3.1 Rous Sarcoma Virus
    1.3.2 Human Immunodeficiency Virus
    1.3.3 Other Nuclear Retroviral Gag Proteins
  1.4 Composition and Functions of the Nucleus
    1.4.1 Nuclear Bodies
    1.4.2 Movement within the Nucleus
    1.4.3 Nuclear Import and Export
  1.5 Identifying Host Nuclear Interaction Partners of Retroviral Gag
  1.6 Basics of Mass Spectrometry
  1.7 Overview

Chapter 2: Interplay between the alpharetroviral Gag protein and SR proteins SF2 and SC35 in the nucleus
  2.1 Abstract

Chapter 3: Identification of possible chromatin-associated protein interacting partners of the retroviral Gag protein
  3.1 Abstract
  3.2 Introduction
  3.3 Materials and Methods
  3.4 Results
    3.4.1 Subcellular localization
    3.4.2 Affinity purifications and proteomic analysis
    3.4.3 Characterization of putative Gag interaction partners
  3.5 Discussion
  3.6 Acknowledgements

Chapter 4: Rous sarcoma virus Gag utilizes Transportin-SR as a mechanism for nuclear entry
  4.1 Abstract
  4.2 Introduction
  4.3 Materials and Methods
  4.4 Results
  4.5 Discussion
  4.6 Acknowledgements

Chapter 5: Overall Discussion
  5.1 Characterization of RSV Gag Nuclear Foci
  5.2 Interaction of RSV Gag with Transportin SR/TNPO3
  5.3 Roles of Gag in the Nucleus
  5.4 Gag and RNA Interactions within Nuclear Foci
  5.5 Model and Future Directions
    5.5.1 Liquid-liquid phase separation: a model for Gag nuclear foci?
  5.6 Summary

Appendix A: Interplay between the alpharetroviral Gag protein and SR proteins SF2 and SC35 in the nucleus
  Abstract
  Introduction
  Materials and Methods
  Results
  Discussion
  Acknowledgements

Appendix B: Mass Spectrometry Tables of Chapter 3

Appendix C: Association of retroviral Gag proteins with unspliced viral RNA in the nucleus, Figure 6

Bibliography


List of Figures

Chapter 1
Figure 1.1: Retrovirus replication cycle
Figure 1.2: RSV and HIV-1 Gag proteins
Figure 1.3: Nuclear localization of HIV-1 Gag in published studies
Figure 1.4: Organization of the nucleus
Figure 1.5: Basic layout of TOF mass spectrometry

Chapter 2
See Appendix A

Chapter 3
Figure 3.1: Schematic of protein expression constructs
Figure 3.2: Subcellular fractionations
Figure 3.3: RSV Gag localization during mitosis
Figure 3.4: Gag.L219A localization in the nucleus
Figure 3.5: Biological processes enriched from the wildtype purified RSV Gag and Gag.ΔNC affinity tagged purifications
Figure 3.6: Gene Ontology (GO) hierarchy of biological processes identified from the wildtype purified RSV Gag and Gag.ΔNC affinity tagged purifications
Figure 3.7: Overlapping proteins found in the purified RSV and HIV-1 Gag mass spectrometry
Figure 3.8: Biological processes enriched from the purified RSV Gag using DF1 nuclear lysates
Figure 3.9: Gene Ontology (GO) hierarchy of biological processes identified from the purified RSV Gag using DF1 nuclear lysates
Figure 3.10: Biological processes enriched from the purified RSV Gag using HeLa nuclear lysates
Figure 3.11: Gene Ontology (GO) hierarchy of biological processes identified from the purified RSV Gag using HeLa nuclear lysates
Figure 3.12: Biological processes enriched from the purified HIV Gag using HeLa nuclear lysates
Figure 3.13: Gene Ontology (GO) hierarchy of biological processes identified from the purified HIV Gag using HeLa nuclear lysates
Figure 3.14: Biological processes enriched from the purified HIV Gag using DF1 nuclear lysates
Figure 3.15: Gene Ontology (GO) hierarchy of biological processes identified from the purified HIV Gag using DF1 nuclear lysates
Figure 3.16: Biological processes enriched from the Strep-tagged purifications
Figure 3.17: Gene Ontology (GO) hierarchy of biological processes identified from the Gag-strep purification
Figure 3.18: Gene Ontology (GO) hierarchy of biological processes identified from the L219A-strep purification
Figure 3.19: Nuclear biological processes enriched from the published HIV Gag purifications
Figure 3.20: Gene Ontology (GO) hierarchy of the nuclear biological processes identified from the Engeland et al., 2011 publication
Figure 3.21: Gene Ontology (GO) hierarchy of the nuclear biological processes identified from the Jäger et al., 2011 publication
Figure 3.22: Gene Ontology (GO) hierarchy of the nuclear biological processes identified from the Engeland et al., 2014 publication
Figure 3.23: Nuclear biological processes enriched from the published HIV Gag purifications
Figure 3.24: Gene Ontology (GO) hierarchy of the nuclear biological processes identified from the Ritchie et al., 2015 publication
Figure 3.25: Gene Ontology (GO) hierarchy of the nuclear biological processes identified from the Le Sage et al., 2015 publication
Figure 3.26: Gene Ontology (GO) hierarchy of the nuclear biological processes identified from the Li et al., 2016 publication
Figure 3.27: Analysis of the common proteins identified from the HIV Gag purifications
Figure 3.28: Effect of Camptothecin on Gag.L219A nuclear foci
Figure 3.29: Localization of Gag.L219A nuclear foci with endogenous proteins
Figure 3.30: Effect of cellular protein expression decrease on RSV particle formation

Chapter 4
Figure 4.1: Gag and TNPO3 constructs utilized in these studies
Figure 4.2: Effects of increased TNPO3 expression on Gag nuclear localization
Figure 4.3: In vitro affinity-tagged purifications of Gag and TNPO3 protein complexes
Figure 4.4: Effects of multiple import factors on Gag nuclear localization

Chapter 5
Figure 5.1: One Possible Model for Gag Nuclear Trafficking

Appendix A
Figure 1: Characterization of Gag.L219A nuclear foci
Figure 2: Localization of Gag.L219A with host nuclear body proteins in QT6 cells
Figure 3: Localization of Gag.L219A with host nuclear body proteins in HeLa cells
Figure 4: Gag.L219A localization with endogenous splicing speckles in HeLa cells
Figure 5: The effect of Clk1 on Gag.L219A localization in QT6 cells
Figure 6: Quantification of Gag.L219A nuclear foci number with coexpression of SC35, SF2, or PSP1

Appendix B
Figure 6: Colocalization of Gag.L219A with its own RNA is dependent on the presence of Ψ


List of Tables

Appendix B
Table 3.1: Top 10 DAVID biological processes for proteins isolated from purified RSV Gag and Gag.ΔNC affinity purifications from DF1 nuclear lysates
Table 3.2: Top 10 nuclear enriched DAVID biological processes for proteins isolated from purified RSV Gag and Gag.ΔNC affinity purifications from DF1 nuclear lysates
Table 3.3: Top 10 DAVID biological processes for proteins isolated from the second purified RSV Gag affinity purifications from DF1 nuclear lysates
Table 3.4: Top 10 nuclear enriched DAVID biological processes for proteins isolated from the second purified RSV Gag affinity purifications from DF1 nuclear lysates
Table 3.5: Top 10 DAVID biological processes for proteins isolated from the second purified RSV Gag affinity purifications from HeLa nuclear lysates
Table 3.6: Top 10 nuclear enriched DAVID biological processes for proteins isolated from the second purified RSV Gag affinity purifications from HeLa nuclear lysates
Table 3.7: Top 10 DAVID biological processes for proteins isolated from the purified HIV Gag affinity purifications from HeLa nuclear lysates
Table 3.8: Top 10 nuclear enriched DAVID biological processes for proteins isolated from the purified HIV Gag affinity purifications from HeLa nuclear lysates
Table 3.9: Top 10 DAVID biological processes for proteins isolated from the purified HIV Gag affinity purifications from DF1 nuclear lysates
Table 3.10: Top 10 nuclear enriched DAVID biological processes for proteins isolated from the purified HIV Gag affinity purifications from DF1 nuclear lysates
Table 3.11: Top 10 DAVID biological processes for proteins isolated from the Gag-Strep purification
Table 3.12: Top 10 DAVID biological processes for proteins isolated from the L219A-Strep purification
Table 3.13: Top 10 nuclear enriched DAVID biological processes for proteins isolated from the L219A-Strep purification
Table 3.14: Top 10 DAVID biological processes of nuclear proteins identified in Engeland et al 2011
Table 3.15: Top 10 DAVID biological processes of nuclear proteins identified in Jäger et al 2011
Table 3.16: Top 10 DAVID biological processes of nuclear proteins identified in Engeland et al 2014
Table 3.17: Top 10 DAVID biological processes of nuclear proteins identified in Ritchie et al 2015
Table 3.18: Top 10 DAVID biological processes of nuclear proteins identified in Le Sage et al 2015
Table 3.19: Top 10 DAVID biological processes of nuclear proteins identified in Li et al 2016
Table 3.20: Top 10 DAVID biological processes of the common proteins identified from published work and the HIV Gag results presented here


Abbreviation List

Ψ       psi
BSA     bovine serum albumin
CA      capsid
CFP     cyan fluorescent protein
CH      cysteine-histidine box
CPT     camptothecin
FISH    fluorescent in situ hybridization
FIV     feline immunodeficiency virus
FRAP    fluorescent recovery after photobleaching
FRET    fluorescence resonance energy transfer
FV      foamy virus
GFP     green fluorescent protein
GO      gene ontology
gRNA    genomic RNA
HIV     human immunodeficiency virus
IN      integrase
L       late domain
LMB     leptomycin B
M       membrane binding motif
MA      matrix
MBD     membrane binding domain
mg      milligrams
MHR     major homology region
MI      multimerization interface
MLV     murine leukemia virus
MMTV    mouse mammary tumor virus
MPMV    Mason-Pfizer monkey virus
mRNA    messenger RNA
MTOC    microtubule-organizing center
m/z     mass to charge ratio
NC      nucleocapsid
NES     nuclear export signal
NLS     nuclear localization signal
NoLS    nucleolar localization signal
NPC     nuclear pore complex
PIC     pre-integration complex
PML     promyelocytic leukemia
PMSF    phenylmethanesulfonyl fluoride
PR      protease
RNAi    RNA interference
RNP     ribonucleoprotein
RRE     Rev response element
RSV     Rous sarcoma virus
RT      reverse transcriptase
shRNA   short hairpin RNA
siRNA   short interfering RNA
snRNA   small nuclear RNA
SP      spacer peptide
TOF     time-of-flight
µg      microgram
µl      microliter
VLP     virus like particle
YFP     yellow fluorescent protein


Preface

For the work described in Chapter 2/Appendix A, the following individuals conducted the experiments:
Figure 1: Breanna L. Rice, Rebecca Maldonado, Timothy Lochmann
Figure 2: Breanna L. Rice
Figure 3: Breanna L. Rice and Rebecca Maldonado
Figure 4: Breanna L. Rice and Rebecca Maldonado
Figure 5: Breanna L. Rice, Rebecca Maldonado, and Matthew S. Stake
Figure 6: Matthew S. Stake
The manuscript was written by Breanna L. Rice, Rebecca Maldonado, Matthew Stake, and Leslie J. Parent.

For the work described in Chapter 4, the following individuals conducted the experiments:
Figure 2: Breanna L. Rice performed one replicate and the updated analysis; Matthew Stake performed the other replicates and the initial analysis. The images shown were acquired by Breanna L. Rice.
Figure 3: Breanna L. Rice
Figure 4: Matthew Stake performed the imaging and part of the analysis; Breanna L. Rice finished the analysis.

For the work on Figure 6 described in Appendix B, Breanna L. Rice performed the cloning and imaging of the Gag.L219A.C21S mutant and, along with Rebecca Maldonado, the categorization of the RNA pattern. The manuscript was written by Rebecca Maldonado, and Breanna L. Rice helped with editing.


Acknowledgements

I would like to thank my mentor, Dr. Leslie Parent, for giving me the opportunity to work in her laboratory and for providing the guidance and support I needed to complete my dissertation. I thank my thesis committee, who always pushed me to achieve my best and guided me through some of my more difficult experiments, and my lab mates, who made work fun and were always there to help me out.

I would also like to give tremendous thanks to my family and friends for listening to me when I was having a difficult time in graduate school and for always encouraging me to stick with it and do my best.


Chapter 1

Literature Review

1.1 Introduction

Retroviruses are pathogens that affect both humans and animals, causing immunodeficiencies and cancer. They were first discovered at the beginning of the 20th century through studies examining diseases in chickens. In 1908, a pair of Danish investigators, Vilhelm Ellermann and Oluf Bang, demonstrated that the chicken form of leukemia and lymphoma, called leukosis, could be transmitted through cell-free filtrates (reviewed in (203)). In 1910, Peyton Rous of the Rockefeller Institute showed that tumors would propagate in healthy chickens when transferred from diseased chickens (168). Then, in 1911, Rous extended Ellermann and Bang’s work to show that the tumors in the chickens were caused by a transmissible agent (167). This tumor-causing virus was later named Rous sarcoma virus (RSV) after its discoverer. It is a member of the broad group known as the avian sarcoma-leukosis viruses (ASLV) (203).

Retroviruses were named for their ability to undergo reverse transcription. This was initially discovered when scientists tried to isolate double-stranded RNA from infected cells, expecting that retroviruses replicated in a similar fashion to other RNA viruses that use an RNA-dependent RNA polymerase. However, they were unable to find double-stranded RNA, yet were able to find viral DNA in the cells. Replication was also sensitive to DNA synthesis inhibitors (8, 197). This suggested to Howard Temin that the infection cycle of these viruses represented a challenge to the Central Dogma of Molecular Biology (DNA to RNA to protein) and produced a DNA copy of the RNA genome, which Temin named the provirus (203). This hypothesis became widely accepted after the discovery of the reverse transcriptase protein in RSV- and murine leukemia virus (MLV)-infected cells by Temin and David Baltimore, respectively (9, 198).


The RSV src gene is responsible for causing tumors in RSV-infected chickens. Src is unique to RSV and is not present in any other ASLVs. Studies to determine how RSV obtained this oncogene found that there is a cellular version of src that functions as a protein kinase. It was concluded that the virus obtained a copy of src during replication and that the gene has remained part of the RSV genome (190). Since this discovery, other cellular oncogenes have been found.

As this history shows, the study of retroviruses has been, and still is, a crucial tool for learning more about viral and cellular biology. RSV is the predominant virus studied in this dissertation, and the research discussed below focuses on uncovering the possible roles of RSV Gag in host cell nuclei and whether these roles are conserved in other retroviruses.

1.2 Overview of Retrovirus Replication

Retroviruses are positive-sense, single-stranded RNA viruses whose hallmark is that, in order to replicate, they must reverse transcribe their RNA genomes into a DNA intermediate that is integrated into the host genome. All retroviruses contain four main coding regions: gag, which encodes the structural protein that packages the viral genomic RNA (gRNA) into virions and forms the virus particle, and which is composed of at least the matrix (MA), capsid (CA), and nucleocapsid (NC) domains; pro, which encodes the viral protease; pol, which encodes the reverse transcriptase and integrase proteins; and env, which encodes the glycoprotein present on the outside of the virus particle. The MA domain of Gag contains the membrane binding domain, which is responsible for targeting and binding to the plasma membrane. The CA domain facilitates Gag-Gag interactions for particle formation. The NC domain is responsible for binding to nucleic acids and also contributes to Gag-Gag binding (reviewed in (205)).


Retroviruses exhibit two different assembly types: the B- and D-type retrovirus assembly pathway, and the C-type retrovirus and lentivirus assembly pathway. For the B-type retroviruses (e.g., mouse mammary tumor virus) and D-type retroviruses (e.g., Mason-Pfizer monkey virus (MPMV)), the Gag and Gag-Pol proteins begin forming a stable particle-like structure in the cytoplasm, known as the A particle, that traffics to the plasma membrane. Once at the plasma membrane, the particle obtains a lipid membrane and the Env glycoprotein. After budding is complete, the viral protease cleaves Gag into its separate domains, forming the viral capsid and the mature virus particle. Foamy viruses also follow this assembly pathway, except that the Gag proteins are not cleaved by the viral protease into the different domains (122). C-type retroviruses (e.g., RSV and MLV) and lentiviruses (e.g., human immunodeficiency virus type-1 (HIV-1)) form their particles at the plasma membrane, where budding occurs (90, 194). This section and dissertation focus on the C-type/lentiviral assembly pathway.

The retrovirus replication cycle, outlined in Figure 1.1, begins when the virus particle binds to the plasma membrane through interactions between host cell receptors and the Env glycoprotein on the virion surface. After binding, the virus fuses its lipid membrane to the cellular plasma membrane during entry into the cell. This releases the viral core, which is composed of two copies of the gRNA coated in the NC protein, together with reverse transcriptase (RT) and integrase (IN), all surrounded by the CA protein. As the viral core traffics to the nucleus, most likely along microtubules, it undergoes uncoating and the gRNA is reverse transcribed by RT into a double-stranded DNA intermediate. This nucleoprotein complex is known as the pre-integration complex (PIC). After the PIC enters the nucleus, either by trafficking through the nuclear pore complex (NPC) with the help of cellular import proteins or during mitosis when the nuclear envelope has broken down, IN integrates the viral DNA intermediate into the host genome, where it is called a provirus.


Figure 1.1: Retrovirus replication cycle. Outlined here is the basic retrovirus cycle. 1) The virus particle attaches to the cellular plasma membrane through a virus-specific receptor using the Env glycoprotein. The particle membrane fuses with the plasma membrane and releases the PIC into the cytoplasm. 2) The viral RNA genome undergoes reverse transcription, by reverse transcriptase, into a double-stranded DNA intermediate. 3) The PIC undergoes nuclear trafficking, and then the DNA intermediate and integrase are released into the nucleoplasm. 4) The DNA intermediate is integrated into the host genome, becoming the provirus. The provirus undergoes transcription using the host RNA polymerase II transcription machinery. The viral RNA that is made has a few different fates: 5) the RNA can be spliced and exported from the nucleus, where it serves as the mRNA template for translation of viral proteins, including Env (green). 6) The viral RNA can also remain unspliced and be exported from the nucleus. 7) From here, the unspliced viral RNA can serve as the mRNA template for the translation of Gag and Gag-Pol, or 8) Gag can bind to the unspliced RNA and 9) package it into virus particles as a dimer. 10) After the genome is packaged and all the necessary Gag proteins are assembled, the particle buds from the plasma membrane. 11) The immature particle undergoes maturation through proteolytic cleavage of the Gag and Gag-Pol proteins by the viral protease.


Following integration, the provirus undergoes transcription by the host RNA polymerase II machinery. This produces an unspliced viral mRNA transcript. This mRNA is 5’-capped and 3’-polyadenylated and can undergo splicing, which occurs co-transcriptionally, to produce spliced mRNA transcripts that encode Env and virus-specific accessory proteins. However, there remains an unspliced version of the viral RNA that can serve as the template for translation of Gag and the Gag-Pol polyprotein. The unspliced RNA also serves as the gRNA needed for packaging by Gag.

Typically, unspliced mRNAs are not exported from the nucleus because they could be translated into aberrant proteins. Retroviruses have devised different mechanisms to bypass the cellular blocks on exporting unspliced mRNAs. Some retroviruses encode accessory proteins that are responsible for the export of the unspliced viral RNAs. An example is the Rev protein from HIV-1. Rev binds to the Rev response element, or RRE, that lies in the 3’ end of the unspliced RNA. Rev uses many cellular proteins, such as Rab/hRIP, RanBP1, Sam68, hnRNP A1, and the CRM1 export pathway, to export the viral RNA for translation of Gag and Gag-Pol. Retroviruses that do not encode accessory proteins use cellular proteins for nuclear export. The MPMV RNA contains an element that interacts with several export proteins, including the nuclear export Tap/NXT1 complex as well as Sam68 and RNA helicase A. For RSV, it is thought that the unspliced RNA is exported from the nucleus using Tap and Dbp5 through two direct repeats present in the viral RNA (reviewed in (67, 91, 187)). After nuclear export, the unspliced RNA serves as the template for the translation of Gag and Gag-Pol.

It was historically thought that Gag bound to the gRNA in the cytoplasm for packaging. Gag recognizes the viral RNA through direct interactions between regions within the NC domain of Gag and the highly structured sequence in the RNA called psi (Ψ), which is located in the 5’ end of the RNA. The cellular site at which Gag initially recognizes the viral RNA depends on the retrovirus. For HIV-1, it is not completely known where the initial interaction between Gag and the RNA occurs. Some studies have suggested that the interaction occurs at the microtubule-organizing center (MTOC). It is hypothesized that the host protein hnRNP A2 mediates the nuclear export of the viral RNA through hnRNP A2-response elements present in the gag sequence of the viral RNA. When these elements are mutated in the viral RNA, the RNA accumulates in the nucleus and RNA packaging decreases. Studies have shown that the viral RNA accumulates very rapidly at the MTOC and that gRNA egress within the cytoplasm is dependent on hnRNP A2 expression. HIV-1 Gag colocalizes with the viral RNA in the perinuclear region and with centrioles. The Gag:RNA complex subsequently traffics through the cytoplasm to the plasma membrane of the cell using the dynein motor (14, 116, 118, 159). Another study also detected HIV-1 gRNA in the perinuclear region; however, the authors were unable to detect HIV-1 Gag in these areas. They did detect gRNA accumulated with HIV-1 Gag in late endosomal foci (94). This indicates that HIV-1 Gag could also initiate assembly in endosomes. Depending on the cell line, HIV-1 Gag can be found at the plasma membrane and in endosomes, indicating that HIV-1 could have several sites for assembly. HIV-1 Gag initiates and completes assembly at the plasma membrane in T lymphocytes. However, in other cell lines, including T cells and macrophages, HIV-1 Gag can initiate assembly in endosomes and then traffic to the plasma membrane ((155, 180), reviewed in (137)). The presence of HIV-1 Gag in endosomes may serve several functions, such as the recruitment of ESCRT complexes (66, 180), gRNA packaging (12, 70, 94, 137), and envelope incorporation (19, 70).

The initial interaction between MLV Gag and the gRNA has been proposed to occur in the nucleus because a small amount of Gag is found there. It has been suggested that MLV Gag may have functions in the nucleus involved in the transport and/or splicing regulation of viral RNAs (147). Another study showed that MLV Gag and Env were present on lysosomes and transferrin-positive endosomes and recruited the viral RNA from the cytosol. This viral protein:RNA complex then traffics to the plasma membrane through vesicular trafficking pathways (12, 180).

For feline immunodeficiency virus (FIV), live-cell imaging studies suggest that the initial interaction between Gag and the viral RNA occurs on the cytoplasmic face of the nuclear envelope (94). The interaction site for MPMV Gag and the viral RNA is speculated to be the nuclear pore complex. MPMV Gag has been shown to interact and colocalize at the nuclear rim with the E2 SUMO conjugating enzyme Ubc9, which resides in the nuclear pore. It is hypothesized that, through its interactions with Ubc9, MPMV Gag localizes to the nuclear rim and that this may be the site of Gag-RNA interaction (211). For foamy virus (FV), Gag accumulates at the MTOC, where it begins capsid assembly. It can be assumed that this is the location at which FV Gag acquires the viral RNA. At the MTOC, the assembling capsids utilize the microtubule transport system to move around in the cell (218). For RSV, studies have shown that Gag nuclear trafficking is essential for efficient genome packaging (63). It can therefore be inferred that the initial interaction between RSV Gag and the gRNA occurs inside the nucleus. However, how RSV Gag is able to locate the unspliced viral RNA for packaging is still not understood.

Interactions among Gag proteins are crucial for the production of virus particles. Gag alone is capable of forming virus-like particles (VLPs) without the presence of other viral components (194). Various regions within the Gag protein are important for the proper interaction between Gag molecules. The CA domain is known to be the primary source of Gag-Gag interactions, in particular the C-terminal domain of CA, known as the CTD. Within the CTD, there is a stretch of amino acids that is conserved across retroviruses and is called the major homology region (MHR) (62). The MHR is important for assembly, particle maturation, and infectivity. How the MHR functions is still not completely understood, but it has been shown that the MHR residues promote stability and multimerization after membrane targeting of the Gag proteins (39, 41, 195). The multimerization interface region (MI) (Figure 1.2A) within the RSV Gag protein spans the p10 domain into the CA domain and is essential for immature particle formation (157). The interaction motif (I) (Figure 1.2) is another major region of Gag-Gag interactions and is present in the NC domain. Studies have shown that when the NC domain from various retroviral Gag proteins is deleted, budding is reduced. When assembly-competent Gags were added to the system, there was no rescue of the NC mutants into particles, indicating that the I domains are important for Gag-Gag interactions. A few NC deletion mutants are able to form particles at levels similar to wildtype, but the particles are much smaller in size. These findings demonstrate that while Gag-Gag interactions occur in regions other than NC, NC is needed for proper particle formation. Because the I domains contain high concentrations of basic residues, it is suggested that RNA acts as a scaffold that allows Gag proteins to pack tightly together ((21, 58), reviewed in (193)). The CA-NC fragment of RSV Gag has been shown to assemble into hollow cylinders in vitro (28, 146). When larger fragments of Gag were examined, assembly into regular structures required RNA, and fragments containing at least p10-CA-NC formed spherical particles resembling wildtype RSV particles. When p10 was missing from any of the fragments, the proteins assembled into cylindrical particles like those formed by CA-NC alone (27). These findings demonstrate that the MI and MHR motifs within p10 and CA, along with NC, are needed to form proper virus-like particles.

Most Gag proteins are directed to the plasma membrane through the MA domain using a motif consisting of basic residues and, for some Gag proteins, through modification of the N-terminus of MA with myristic acid. Interactions with membrane-associated factors are also important for Gag targeting to the plasma membrane; HIV-1 and FIV MA interact with the phospholipid phosphatidylinositol-4,5-bisphosphate [PI(4,5)P2]. Similarly, for RSV, when PI(4,5)P2 and phosphatidylinositol-(3,4,5)-trisphosphate [PI(3,4,5)P3] were depleted in cells, the localization of Gag to the plasma membrane decreased (144). Gag could also traffic to the plasma membrane through interactions with microtubule motor proteins such as KIF-4, which has been found to be important for HIV-1 (132). Once at the plasma membrane, the immature particle needs to undergo scission to be released from the cell. This generally occurs through interactions with components of the cellular endosomal sorting complexes required for transport (ESCRT), or proteins related to this machinery. The L motif in the Gag proteins is required for the interaction with the ESCRT machinery (reviewed in (67, 130)).

1.3 Nuclear Trafficking of Retroviral Gag Proteins

1.3.1 Rous Sarcoma Virus:

The retrovirus that serves as the main focus of the work presented here is RSV. RSV is known as a simple retrovirus in that it does not encode additional proteins besides the four main viral proteins that are needed for replication (204). Along with the four proteins that all retroviruses encode (Gag, Pro, Pol, Env), RSV also contains src, which encodes the oncoprotein that causes tumor formation in domesticated fowl but is dispensable for viral replication. RSV Gag consists of six domains (Figure 1.2A). The MA domain contains a plasma membrane targeting and binding motif, as well as a noncanonical nuclear localization signal (NLS). The p2 domain contains a late domain that is important for budding, and the p10 domain contains elements important for virion morphology, as well as a CRM1-dependent nuclear export signal (NES). The CA domain facilitates Gag-Gag interactions through its major homology region and multimerization interface, which are important for particle formation. The multimerization interface extends into p10, where it overlaps the NES (157). There is a spacer peptide (SP) that is important for proper virion formation. The NC domain mediates Gag-RNA interactions through two zinc finger domains contained in Cys-His boxes and is involved in Gag nuclear import through a classical NLS. The last domain of Gag is the PR domain, which serves as the viral protease that is involved in the maturation of virus particles (reviewed in (91)).

Historically, it was thought that Gag bound to the unspliced gRNA in the cytoplasm, since that is where Gag is produced. However, when the subcellular localization of the RSV Gag MA domain was examined, it was found that RSV Gag can traffic into the nucleus (172). It was initially found that when 10 codons from the v-src gene were added to the 5’ end of MA (Myr1E), virus particles were released at a level similar to wildtype Gag, but there was a loss in infectivity, and the RNA isolated from the Myr1E particles contained only monomers (153). When the localization of wildtype MA and the Myr1E mutant was examined, it was found that MA was cytoplasmic but had a strong nuclear presence as well, whereas Myr1E was very strongly targeted to the plasma membrane (64). These data suggested that the strong targeting to the plasma membrane may account for the packaging of monomeric RNA and that nuclear trafficking may be involved in packaging of dimeric genomes.

To begin to test whether RSV Gag underwent nuclear trafficking, the region(s) of Gag that are responsible for nuclear trafficking were mapped. As before, MA-GFP was found in both the cytoplasm and the nucleus but was more concentrated in the nucleus, compared to a GFP-only control, which was uniformly distributed throughout the cell. In contrast, Gag.GFP was mostly localized to the plasma membrane with some in the cytoplasm, suggesting that something in MA causes targeting into the nucleus. The sequence of MA was examined, and a classical NLS motif, consisting of clusters of basic residues, could not be identified. It was then examined whether there was a region in Gag that was required for trafficking out of the nucleus. Gag domains were gradually added back to the C-terminus of MA, and it was found that adding back the p2 and p10 regions, along with CA, resulted in a lack of nuclear localization, in contrast to MA alone or MA.p2. This exclusion could be the result of increased molecular weight or the presence of a NES in the p10 region of RSV Gag (172).

A common nuclear export pathway used in cells utilizes the CRM1 nuclear export protein. CRM1 is a member of the karyopherin-β superfamily of nuclear transport proteins and interacts with hydrophobic motifs of its cargo proteins (57, 88, 186). CRM1 is sensitive to the drug leptomycin B (LMB), which covalently binds to CRM1, preventing it from binding to its cargo (103, 104, 151, 192). To determine whether RSV Gag contained a NES that CRM1 binds, LMB was added to cells expressing MA.GFP or MA.p2.GFP, and there was no change in their localization. However, when cells expressing MA.p2.p10.GFP or MA.p2.p10.CA.GFP were treated with LMB, there was a drastic shift in the localization of these proteins to the nucleus, suggesting there is a NES in the p10 domain of Gag (172). Having determined that Gag nuclear export was dependent upon CRM1, the next step was to identify the NES in Gag by looking for hydrophobic regions. A potential region was identified in the C-terminal end of the p10 domain (Figure 1.2A). A series of mutations was made in the hydrophobic residues, and it was found that when any single residue was mutated to alanine, there was a dramatic relocalization into the nucleus. However, when mutations were made in hydrophobic residues further upstream in p10, the localization resembled that of wildtype Gag (174). Viruses that contained any of the NES mutations produced fewer particles compared to wildtype, and these particles had several major structural defects, indicating that the p10 domain is involved in Gag nuclear export as well as particle morphology (173).


Figure 1.2: RSV and HIV-1 Gag proteins. A) Schematic of the RSV Gag protein. Along with the three main domains that every retroviral Gag contains (MA, CA, and NC), RSV Gag also contains a p2 domain, which contains the late motif (L) that is important for budding. The p10 domain contains a CRM1-dependent NES (sequence shown) as well as the multimerization interface, which also extends into CA. The C-terminal domain of CA also contains the major homology region (MHR) that is important during the late stages of assembly. The NC domain contains the interaction motifs (I) that are important for Gag-Gag interactions. RSV Gag also contains a spacer peptide (SP) and the viral protease (PR) needed for particle maturation. There are two NLSs: one in MA that overlaps the membrane binding motif (M), and one in NC that sits between the two cysteine-histidine boxes (CH) that act as zinc fingers and interact with nucleic acids. The microscopy images show the cellular localization of wildtype RSV Gag, as well as the nuclear-restricted NES mutant, Gag.L219A. B) Schematic of the HIV-1 Gag protein. The MA domain of HIV-1 Gag contains two NLSs, and there is another NLS in the NC domain. HIV-1 Gag contains spacer peptide 1 (p2) and spacer peptide 2 (p1) as well as a p6 domain that is important for budding.


Because a NES had been mapped in Gag, it was next determined where a NLS is located in Gag; a putative, nonclassical NLS was found in the first half of MA, as shown in Figure 1.2A (172). Gag nuclear import proteins were identified using strains of the yeast Saccharomyces cerevisiae that contained conditional mutations or deletions in genes encoding both essential and nonessential proteins of the karyopherin-β nuclear import protein family. In yeast cells expressing GFP-tagged Gag domains, the CRM1 dependence of the p10 NES was further confirmed using a temperature-sensitive mutant of CRM1; at the higher temperature, p10.GFP remained nuclear. It was then determined that the NLS in MA was dependent upon Importin-11 and Transportin-SR. The NC domain of Gag was also examined for interactions with import factors because NC contains multiple basic residue clusters. It was found that the putative NLS in NC was dependent upon the classical Importin-α/β pathway (Figure 1.2A) (26). The interactions between the NC domain and Importin-α/β, and between the MA domain and Importin-11, were confirmed by affinity-tagged purifications from avian cells (72).

It was then asked whether both NLSs contribute equally to the nuclear import of Gag. A mutant in which the NC domain is deleted was utilized, as well as two mutants that have deletions in the MA domain. It was found that when the NLS in MA is deleted and cells are treated with LMB, the amount of Gag in the nucleus is comparable to the amount of wildtype Gag in the nucleus under treatment. However, when NC is deleted, there is nuclear Gag after LMB treatment, but more Gag remains cytoplasmic compared to wildtype Gag under treatment. These results indicated that the NLS in NC is stronger than the NLS in MA and may be the dominant signal for nuclear entry of Gag (26).

To characterize nuclear Gag, it was first examined whether Gag-Gag interactions could occur in the nucleus. Gag-Gag interactions are needed in order to form virus particles, and it has been shown through fluorescence resonance energy transfer (FRET) analysis that Gag multimerization occurs within the cytoplasm as well as at the plasma membrane (112). To study the intranuclear interactions of Gag, cells expressing wildtype Gag were treated with LMB. Complementation experiments showed that wildtype Gag could partially rescue the NES-mutated, nuclear-trapped Gag to form virus-like particles. Conversely, the nuclear export mutant could retain wildtype Gag in the nucleus. Furthermore, experiments using FRET and bimolecular fluorescence complementation demonstrated Gag-Gag interactions in the nucleus. These experiments also demonstrated that the mutation made in the NES of RSV Gag, which also overlaps the multimerization interface (MI) (Figure 1.2A), does not disrupt the interactions needed between Gag proteins (96).

It was discovered that when cells expressing Gag are treated with LMB, Gag displays two different patterns of distribution: one in which Gag is diffuse, and another in which Gag forms foci throughout the nucleus (Figure 1.2A). To determine whether LMB was having an effect on the pattern of nuclear Gag, a mutant in which the NES was mutated to all alanines was expressed in cells. The NES mutant also displayed focal or diffuse patterns in cell nuclei. When the NC-deleted mutant was examined under LMB treatment, however, the focal pattern of Gag disappeared; all of the cells expressing Gag had a diffuse appearance. To test whether the lack of foci in the nuclei was due to the loss of the protein-protein interaction function of NC or to the loss of its protein-RNA interaction ability, a mutant that contained the domain of the CREB binding protein that facilitates protein-protein binding was used. Again, after treatment with LMB, the distribution of nuclear Gag was diffuse. This implied that the formation of nuclear foci was dependent upon the RNA binding ability of the NC domain (96). Among the cells that formed foci during Gag nuclear accumulation, Gag concentrated in nucleoli in 70% of the cells. This localization was found to be due to a nucleolar localization signal (NoLS) present in NC. Gag molecules present in the nucleoli also rapidly exchanged with Gag molecules in the nucleoplasm, as measured by fluorescence recovery after photobleaching (FRAP). It is common for nucleolar proteins to share this characteristic of rapid exchange depending upon the availability of binding partners present in the nucleoplasm, indicating that nucleolar Gag behaves similarly to host nucleolar proteins. The purpose of Gag undergoing nucleolar trafficking is still not understood.

The reason for Gag to undergo nuclear trafficking has yet to be fully elucidated. It was hypothesized that Gag could help facilitate the export of gRNA. Using the Myr1E mutant, which is strongly targeted to the plasma membrane even under LMB treatment, it was found that there was about a 60% decrease in gRNA packaging compared to wildtype. When nuclear trafficking of Myr1E was restored through the addition of a classical NLS from the nucleoplasmin protein, there was an increase in nuclear localization as well as a restoration of gRNA packaging to near wildtype levels (63). To understand the mechanisms governing Gag and viral RNA interactions, and the intracellular trafficking that is involved in the packaging of the RNA, it was asked how the nuclear trafficking of Gag is regulated. It was found that when RNA, both viral and nonviral, is incubated with Gag, binding of Gag to the import factors importin α/β and importin-11 is inhibited. This was especially true when using viral RNA that contains the psi packaging signal (Ψ). This suggests that Gag cannot simultaneously bind to RNA and import factors. On the other hand, it was found that the interactions between Gag and the nuclear export protein CRM1 were enhanced when RNA, in particular Ψ-containing RNA, was present. These data suggest that a conformational change occurs in Gag when viral RNA is present, after nuclear import, that makes the Gag-RNA complex more competent for export by CRM1 (72). Furthermore, current data demonstrate that Gag colocalizes with viral RNA in the nucleus, and through live-cell imaging, the Gag-RNA complex appears to move towards the nuclear periphery (Maldonado R. et al., in progress).

An overall model to explain these data is that RSV Gag uses either the importin α/β pathway or the importin-11 pathway to gain entrance into the nucleus. Once inside, Gag is able to locate and bind to the gRNA, making it export competent through the CRM1 export pathway. When the Gag-RNA complex is exported, it can then traffic to the plasma membrane for packaging and particle assembly.

1.3.2 Human Immunodeficiency Virus:

The other retrovirus that will be discussed in this dissertation is human immunodeficiency virus type 1, or HIV-1. HIV-1 is a lentivirus and the causative agent of AIDS. HIV-1 is a complex retrovirus in that, in addition to the four main retroviral proteins needed for replication, it encodes accessory proteins that are needed for replication (54, 204). The HIV-1 Gag protein differs from RSV Gag in that instead of containing p2 and p10 domains that are needed for virion morphology and budding, HIV-1 Gag contains a p6 domain that is involved in budding (Figure 1.2B) (55). The coding region of HIV-1 Gag does not contain the protease coding region; instead, the pol gene contains the pro coding region.

Various studies have suggested the presence of HIV-1 Gag in the nucleus, but currently the nuclear trafficking of HIV-1 Gag is controversial. It is difficult to study because, unlike RSV Gag, HIV-1 Gag is not dependent upon the CRM1 export pathway (10, 69, 95). Currently there is no known method for trapping HIV-1 Gag in the nucleus, which makes the study of nuclear HIV-1 Gag difficult because imaging studies indicate that only a small subset of HIV-1 Gag traffics to the nucleus. One group stated that the MA domain of HIV-1 Gag contained a CRM1-dependent NES, but these findings have so far not been reproduced (45). It has been shown that HIV-1 Gag can accumulate at the nuclear periphery and colocalize with the viral RNA, in particular at the microtubule organizing center (MTOC) (118, 159).

The first study to suggest nuclear HIV-1 Gag was in 1991, when insect cells were infected with a baculovirus vector containing HIV-1 Gag and Gag was seen in the nucleus via electron microscopy. The authors also identified two possible NLSs in the MA domain (169). Since then, multiple imaging studies have shown that there is a low amount of nuclear signal of HIV-1 Gag (13, 69, 71, 73, 79, 138), and that HIV-1 Gag can be detected in the nucleus by cytoplasm-nucleoplasm fractionation (69, 116, 129, 138) (Figure 1.3).

Several groups have reported on the existence of two NLSs in the MA domain of HIV-1 Gag, and that the functions of the NLSs are important for the steps of integration in non-dividing cells. The first functional NLS was found in the N-terminal half of MA: when the basic residues of the NLS (KKKYKLK) were added to the bovine serum albumin (BSA) protein, BSA was imported into the nucleus (25, 145). Another basic residue motif (KSKKK) was found in the C-terminal half of MA, and when the motif was added to the BSA protein, the protein underwent nuclear import (145). When both NLSs are inactivated, the PIC is nuclear import deficient and unable to integrate in non-dividing macrophages (25, 61, 75, 206). One study suggests that the basic residue NLS in the N-terminal region of MA utilizes the Rch1 protein (a member of the karyopherin-α family) for nuclear import (60). Other studies confirmed this report and found that karyopherin-α family members bind to both NLSs in MA but have a higher affinity for the N-terminal NLS (75, 145).


Figure 1.3: Nuclear localization of HIV-1 Gag in published studies. The above images were taken from previously published studies from various laboratories. Within the publications, most of the authors do not discuss the nuclear localization of HIV-1 Gag. A) HeLa and 293T cells were transfected with a wildtype HIV-1 Gag construct. Immunofluorescence staining was performed to detect Gag, and cells were imaged using confocal microscopy (71). B) These images are z-stack slices taken from a live COS-7 cell transfected with wildtype HIV-1 Gag-GFP and imaged by confocal microscopy (79). A few Gag foci can be detected in the nucleus. C) In this study, HeLa cells were infected with recombinant vaccinia viruses that contained full-length Gag, the MA domain, or the CA domain. Gag was immunostained and imaged using confocal microscopy (73). The arrows point to cells containing nuclear Gag, and nuclear Gag can be seen in the CA-infected cells on the right. D) Cells expressing Gag.CFP through transfection (left) or from a stable cell line (right) were imaged through widefield deconvolution microscopy, and single z-slices are shown. The cells were also expressing viral RNA that contained MS2 stem loops in place of Pol for RNA imaging (not shown) (13). E) An HIV-1 viral construct was transfected into HeLa cells, and Gag was detected through immunofluorescence and imaged by confocal microscopy. The left image is 6 hours post-transfection and the right is 24 hours post-transfection (138).


When the NC domain of HIV-1 Gag is expressed in cells, it is strongly nucleolar (125, 217). In one study, when full-length HIV-1 Gag was examined, the researchers did see some nuclear Gag. To test whether the membrane targeting/binding ability inhibited nuclear trafficking, they mutated the myristoylation site in MA, which is involved in plasma membrane targeting, but they did not see much of an increase in nuclear staining. They next deleted all of MA and saw more HIV-1 Gag at the nuclear periphery, with a little more signal in the nucleus than with wildtype HIV-1 Gag. They then deleted CA and saw even more HIV-1 Gag in the nucleus; from there, deleting more of Gag did not increase nuclear signal. Furthermore, they expressed the CA protein by itself and found it to be cytoplasmic with strong localization near the nuclear membrane, and fractionation showed no CA in the nucleus. These results indicated that nuclear localization of HIV-1 Gag is negatively affected by the presence of the MA and CA proteins (217). After integration of the provirus, it appears that the NC protein from the virus particle is important for the expression of the early spliced viral mRNA. It was shown that at 18 hours post-infection, NC traffics into the nucleus. If mutations are made in the zinc fingers of the NC domain, the localization of NC becomes less nuclear and there is a delay in the expression of the early spliced mRNAs (224). Currently there are no known nuclear import factors that bind to HIV-1 NC, even though the NC domain can be detected in the nucleus.

Recently, data have suggested that HIV-1 Gag binds the unspliced viral RNA in the nucleus and that the Gag-RNA complexes colocalize with the Rev protein. Rev is responsible for the export of the unspliced HIV-1 RNA from the nucleus for the translation of Gag and Gag-Pol. It is hypothesized that HIV-1 Gag undergoes nuclear entry for gRNA binding and then either uses Rev for nuclear export or uses an alternative export method if Rev is unavailable (Tuffy K. et al., in progress).


1.3.3 Other nuclear retroviral Gag proteins:

The second Gag found to be present in the nucleus was MLV Gag (147). This group was interested in determining whether MLV Gag underwent nuclear localization because of unpublished work suggesting that MLV Gag may be involved in the transport and/or the splicing of viral mRNAs. They performed both biochemical fractionations and immunoelectron microscopy on infected cells and found that full-length MLV Gag is found in the nucleus. When they used Gag mutants that lacked various regions of CA, the nuclear localization disappeared, suggesting that there is a NLS in the CA domain (147). A subsequent study looked at the localization of the Gag proteolytic products MA, CA, and NC, as well as integrase (IN). They found through electron microscopy that after infection, only NC and IN were found in the nucleus, in particular in the nucleoli of non-dividing cells. These data suggested that there is a NLS in the NC domain (165). In another study examining the NC domain of a few retroviruses, it was confirmed that MLV NC is in fact nuclear and can be found in nucleoli (125). Lately, the main focus of studies of nuclear MLV Gag has been on the p12 domain. The function of the p12 domain of MLV Gag is still not fully understood. It was found that mutation of a PPPY motif in the middle of the protein causes defects in particle assembly and budding (220, 222). However, when either the N-terminus or C-terminus of p12 is mutated, there are severe defects during infection prior to integration (222). Further work showed that p12 is important during the trafficking of the PIC into the nucleus (221). It was also found that wildtype p12 proteins accumulate adjacent to mitotic chromatin and that p12 colocalized with CA from the PIC as well as the reverse-transcribed genome, implying that p12 is a component of the PIC and is responsible for anchoring the PIC to the host mitotic chromatin (48, 49, 160, 176). It was determined that the N-terminus of p12 is responsible for binding to the viral core, and the C-terminus binds to the mitotic chromatin (24, 212).


The Gag proteins from FV are unique among the retroviruses. FV Gag proteins do not undergo cleavage into the different domains of Gag for particle maturation (121, 140). They also do not contain the characteristic motifs that other Gag proteins contain, such as the membrane-binding domains (M domains) and the Cys-His zinc fingers needed for binding nucleic acids. Instead, FV Gags contain several distinct domains, such as the essential Gag-Env interaction domain and three glycine- and arginine-rich boxes (GR boxes) that are used to bind nucleic acids (reviewed in (140)). Natural hosts for FV include primates, felines, cattle, and horses, but interestingly, FV does not cause any noticeable disease in them. However, in tissue culture, FV infections are cytopathic, leading to cellular syncytium formation and vacuolization of the cells (foamy appearance) followed by death. It has been known for almost five decades that FV Gag enters the nucleus, and this characteristic was actually used as a diagnostic feature of FV infection in cell culture (83, 121, 140). Initially it was reported that the second GR box is responsible for the nuclear localization of Gag and may contain a NLS (142, 175, 219). However, it was later shown that the NLS is nonfunctional and that Gag requires the breakdown of the nuclear envelope during mitosis to gain nuclear localization (124, 141). Prior to nuclear entry, FV Gag, along with the PIC, accumulates at the MTOC, waiting for the cell to begin mitosis so that nuclear envelope breakdown allows entry into the nucleus (115, 156, 171). Once in the nucleus and near the host chromatin, the C-terminus of Gag interacts with the core histones H2A and H2B, causing the PIC to be anchored to the chromatin through Gag (117, 200). It was also shown that there is a CRM1-dependent NES in the N-terminus of FV Gag, and, similar to the nuclear-restricted RSV Gag.L219A, FV Gag is mostly nuclear when a residue in a leucine-rich motif is mutated (162). These studies have shown that the nuclear trafficking of FV Gag is important for integration of the provirus into dividing cells.


Lesser-studied Gag proteins that undergo some form of nuclear trafficking include the Gag protein from feline immunodeficiency virus (FIV). Imaging studies showed that gRNA and FIV Gag colocalized at the nuclear envelope facing out towards the cytoplasm, suggesting that FIV Gag may first bind the gRNA for packaging at the nuclear envelope (94). Later it was shown that if cells expressing FIV Gag are treated with the CRM1 inhibitor LMB, there is a dramatic shift of Gag into the nucleus. Furthermore, when CA-NC-p2 of Gag is expressed, this section of Gag is sensitive to LMB treatment, with MA being dispensable for trafficking (147). Currently, neither a NES nor a NLS has been mapped in FIV Gag. Mouse mammary tumor virus (MMTV) undergoes nucleolar trafficking through the NC domain (125). MMTV Gag was also found, through a yeast two-hybrid screen, to interact with ribosomal protein L9 via the CA protein, and this interaction turned out to be important for particle assembly (16). Mason-Pfizer monkey virus (MPMV) Gag has been found to localize to the pericentriolar region in the cytoplasm through microtubule motor proteins (178), possibly at or near nuclear pores (20). A yeast two-hybrid screen for possible cellular partners of MPMV Gag found that Gag interacts with hUbc9, a nuclear pore-associated E2 SUMO-conjugating enzyme. MPMV Gag colocalized with hUbc9 in discrete foci on the cytoplasmic face of the nuclear membrane. When hUbc9 is overexpressed in cells, a fraction of Gag colocalizes with hUbc9 in the nucleus (211). Taken together, the research on the nuclear trafficking of retroviral Gag has shown that nuclear Gag is not a rare phenomenon among retroviruses, and that it plays important roles either at the steps of PIC nuclear entry and/or integration, or during gRNA packaging in the late phases of replication.

1.4 Composition and Functions of the Nucleus

The nucleus is a highly complicated yet organized organelle (Figure 1.4) that is divided into multiple areas where numerous different processes occur, despite the lack of membranes to separate them. After , nuclear bodies make up most of the various components of the nucleus. DNA plays a role in the biogenesis of many nuclear bodies, including nucleoli, which form around ribosomal DNA (Figure 1.4A). Nuclear body positioning is partially dependent upon DNA because nuclear space is occupied by chromatin, and nuclear bodies occur in the chromatin-free regions. Interphase chromosomes occupy compact and mostly nonoverlapping areas called territories (Figure 1.4J), with regions that are depleted of chromatin called interchromatin domains (Figure 1.4I) (65, 184, 213). The focus of this section is on nuclear bodies and the movement of molecules through the nucleus.

1.4.1 Nuclear Bodies

Nuclear bodies are membraneless structures that are seeded on either RNAs or proteins. The following requirements must be met for a structure to be classified as a nuclear body: it must be microscopically visible during at least some periods of the cell cycle; specific proteins and RNAs must be concentrated within it; and its components must be constantly exchanged with the surrounding nucleoplasm (Figure 1.4) (189).

Nucleoli are nuclear bodies that were long thought to function primarily in ribosome assembly. However, proteomic and biochemical analyses uncovered roles for nucleoli in cell cycle regulation, DNA damage response, pre-mRNA and non-long coding RNA processing, telomere metabolism, and stress response. Nucleoli assemble during late telophase around areas of the chromosomes that contain repeats of ribosomal RNA genes. Assembly is initiated by RNA polymerase I-mediated transcription of the ribosomal DNA and recruitment of ribosome biogenesis factors (reviewed in (38, 109, 110)) (Figure 1.4A).

Cajal bodies are thought to be involved in the processing and modification of spliceosomal small nuclear RNAs (snRNAs), such as U1, U2, U4, U5, and U6. They contain high concentrations of splicing small nuclear ribonucleoproteins (snRNPs) made from snRNAs and associated proteins, as well as other RNA processing factors involved in the spliceosome. The spliceosome is a collection of snRNPs and splicing factors from splicing speckles that is responsible for pre-mRNA splicing. The association of snRNAs with Cajal bodies is thought to be through interactions with the major protein of Cajal bodies, coilin (152, 189) (Figure 1.4H).

Paraspeckles are seeded on the long non-coding RNA NEAT1 and contain over 40 different RNA-binding proteins (Figure 1.4F). Assembly of paraspeckles occurs during times of cellular stress. They function as regulators of gene expression through the retention of RNAs during cellular processes involved in differentiation, viral infection, and stress. Whereas NEAT1 functions as the structural component of paraspeckles, another RNA, Ctn, is also abundant within them. Ctn functions in the regulation of gene expression (53, 189).

PML bodies function in a variety of nuclear pathways, including DNA repair, senescence, stress response, and antiviral defense. The PML protein is the main organizer of the bodies and recruits various proteins. PML bodies may sequester, modify, or degrade proteins as part of their functions. They are regulated by various cellular stressors, and the transcription of PML and other PML body proteins is greatly enhanced by interferons (108, 189) (Figure 1.4G).

Transcription factories are sites within the nucleus that accommodate numerous replicons at once, allowing for more efficient transcription of multiple genes at the same time. All of the proteins needed for transcription can be found at these factory sites, with anywhere from 4 to 30 RNA polymerase complexes present (33) (Figure 1.4D). There is evidence that transcription factories help organize chromatin through the formation of chromatin loops and the clustering of active and co-regulated genes (164).

Splicing speckles function as storage, assembly, and modification compartments that supply splicing factors to active transcription sites. Speckles are enriched in splicing factors and are often found near active transcription sites (Figure 1.4E). Along with splicing factors, kinases and phosphatases are found at or near splicing speckles, and they are involved in the regulation and localization of splicing factors between the speckles and transcription sites (reviewed in (185)).

1.4.2 Movement within the Nucleus

The nucleus is extremely dynamic. Proteins constantly move in and out of various nuclear bodies, as well as the nucleus itself. The components of nuclear bodies constantly and rapidly exchange with other proteins and RNAs present in the nucleoplasm (189). Many proteins from nuclear bodies exhibit Brownian motion, which is defined as the random motion of particles suspended in a fluid that results from collisions with fast-moving molecules (32). Fluorescence recovery after photobleaching (FRAP) experiments are commonly performed to study the dynamics of proteins in cells. Briefly, a region of interest is bleached using a high-powered laser until the signal from a fluorophore-tagged protein is destroyed. The region is then imaged over time to see whether signal returns to that area, which indicates that unbleached protein is moving into the bleached region. How quickly this occurs reflects the mobility of the protein. FRAP studies have been performed on numerous nuclear body proteins. When Cajal bodies were examined, coilin, fibrillarin, and other Cajal body components were found to exchange rapidly between the body and the nucleoplasm, with half-times for recovery on the order of a few minutes (reviewed in (152)). For some splicing speckle proteins, the exchange rate between the speckle and the nucleoplasm is very rapid; complete photobleaching occurred within 30 seconds and the recovery half-time was ~3-5 seconds (185).


In FRAP experiments, the exchange dynamics of the PML protein in PML bodies ranged from a recovery half-time of 3 minutes to only 32% recovery after 20 minutes (209).
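
The recovery half-times and percent recoveries quoted above can be related to one another with a simple model. The following is a minimal sketch that assumes a single-exponential recovery curve, which is a common simplification and not necessarily the analysis used in the cited studies. The function names and all numbers are hypothetical and for illustration only.

```python
import math

def frap_recovery(t, f0, f_inf, k):
    """Single-exponential FRAP model: intensity at time t after bleaching.

    f0 is the intensity just after the bleach, f_inf is the plateau,
    and k is the recovery rate constant (assumed model, not fitted data).
    """
    return f0 + (f_inf - f0) * (1.0 - math.exp(-k * t))

def half_time(k):
    """Time at which half of the recoverable signal has returned."""
    return math.log(2) / k

def mobile_fraction(f_pre, f0, f_inf):
    """Fraction of the bleached signal that eventually recovers."""
    return (f_inf - f0) / (f_pre - f0)

# Illustrative numbers only (not data from the cited studies).
k = math.log(2) / 4.0  # rate constant that gives a 4 second half-time
print(half_time(k))
print(frap_recovery(4.0, f0=0.2, f_inf=0.9, k=k))  # halfway between 0.2 and 0.9
print(mobile_fraction(f_pre=1.0, f0=0.2, f_inf=0.9))
```

Under this model, a short half-time with a high plateau corresponds to rapid, nearly complete exchange, whereas a low plateau, such as 32% recovery, corresponds to a large immobile fraction.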

To examine whether nuclear bodies move, time course experiments are performed and the diffusion coefficient, which reflects how far a focus moves over time, is measured. One study determined that PML bodies can be classified into three groups based on their movement: stationary, constricted, and directed (143). Another group found that PML bodies exhibit four different types of motion: confined, obstructed, directed, and free diffusion. They found that Cajal bodies can also exhibit these four types of motion (68). There is speculation that changes in the type of motion exhibited by these nuclear bodies could depend upon the cell cycle (35).
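
One common way to distinguish such types of motion is to compute the mean squared displacement (MSD) of a tracked focus and ask how it scales with the time lag. The sketch below is a hedged illustration, not the method of the cited studies: it assumes two-dimensional tracks sampled at equal time intervals, and the helper names and the tolerance used for labeling are arbitrary choices made here.

```python
import math

def msd(track, lag):
    """Mean squared displacement of a 2D track at a given frame lag."""
    disp = [
        (track[i + lag][0] - track[i][0]) ** 2
        + (track[i + lag][1] - track[i][1]) ** 2
        for i in range(len(track) - lag)
    ]
    return sum(disp) / len(disp)

def anomalous_exponent(track, max_lag):
    """Least-squares slope of log(MSD) versus log(lag): alpha in MSD ~ lag**alpha."""
    xs = [math.log(lag) for lag in range(1, max_lag + 1)]
    ys = [math.log(msd(track, lag)) for lag in range(1, max_lag + 1)]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

def classify(alpha, tol=0.2):
    """Rough labels; the tolerance is an arbitrary illustrative threshold."""
    if alpha > 1 + tol:
        return "directed"
    if alpha < 1 - tol:
        return "obstructed or confined"
    return "free diffusion"

# A synthetic focus moving at constant velocity: MSD grows as lag**2.
directed_track = [(0.1 * i, 0.05 * i) for i in range(50)]
alpha = anomalous_exponent(directed_track, max_lag=10)
print(round(alpha, 3), classify(alpha))
```

For free diffusion the MSD grows linearly with the lag (alpha near 1), whereas slower-than-linear growth suggests obstructed or confined motion and faster-than-linear growth suggests directed movement.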

Macromolecules may also move within the nucleus by way of nuclear β-actin. Nuclear β-actin is a component of several chromatin remodeling complexes and is involved in mRNA processing, nuclear export, and nuclear envelope assembly. It is also required by all three classes of RNA polymerases. In addition, six different myosins and four different kinesins have been shown to localize to the nucleus. The nuclear myosin MYO1C has been shown to associate with all three classes of RNA polymerases, similarly to β-actin. A model has been proposed in which MYO1C bound to DNA and actin bound to RNA polymerase act together as a motor to drive transcription. The nuclear kinesins KIF4A and KID are also able to bind DNA, but whether they function as motors is not yet known (reviewed in (182)).


Figure 1.4: Organization of the nucleus. A) The nucleolus. Chromosomes containing ribosomal DNA sequences are sites of nucleolar formation. This is mediated by RNA polymerase I transcription of these sites (orange circles). The magenta lines are the ribosomal RNAs. B) Nuclear envelope. C) Nuclear pore complex. D) Transcription factories. Multiple loops of DNA can interact with these factories at one time. The factories contain numerous RNA polymerase II complexes and transcription factors. E) Splicing speckles store splicing factors. F) Paraspeckles contain multiple RNA-binding proteins and retained RNAs. G) PML bodies contain numerous proteins. H) Cajal bodies process and modify snRNAs. I) Interchromatin spaces hold the various nuclear bodies. J) Chromosome territories.


1.4.3 Nuclear Import and Export

Trafficking in and out of the nucleus is crucial for cell vitality. Macromolecules larger than 40-50 kDa need transport proteins in order to move through the nuclear pore complex (NPC). Most pathways use the karyopherin-β family of import/export factors. Karyopherins bind to their cargoes by recognizing a nuclear localization signal (NLS) or a nuclear export signal (NES). Different types of NLS and NES are recognized by the transport proteins either directly or indirectly via adapters. Transport receptors interact with Ran GTPase and with nucleoporins at the NPC. RNA is exported from the nucleus via a variety of mechanisms. Most mRNAs seem to be exported through the non-karyopherin TAP-dependent pathway. Other mRNAs and certain viral RNAs are exported through the CRM1 export pathway. CRM1 is not an RNA-binding protein, so it requires an adapter such as HuR or eIF4E, or a viral protein such as HIV-1 Rev (reviewed in (31, 81, 139, 154, 191)).

1.5 Identifying Host Nuclear Interaction Partners of Retroviral Gag

Retroviruses hijack many host proteins throughout infection and assembly. They utilize receptors on the surface of the plasma membrane to bind to the cell. The virus then uses the actin and/or microtubule transport systems to move the PIC to the nucleus during infection, and to traffic the Gag:RNA complex to the plasma membrane during assembly. Nuclear import pathways are needed to bring the PIC inside the nucleus in non-dividing cells, and RNA export pathways are needed to transport viral RNAs into the cytoplasm for translation and possibly packaging. The transcription, splicing, and translation machinery are also hijacked by retroviruses for replication. Finally, retroviruses use the host machinery for the budding and release of particles (reviewed in (67)). This section focuses on the techniques used to identify host proteins that could interact with the Gag protein.


Techniques that have been used to uncover binding partners of retroviral Gag include genome-wide RNA interference (RNAi) screens, yeast two-hybrid screens, and mass spectrometry on affinity-tagged purifications. RNAi screens typically use either small interfering RNAs (siRNAs) or small (or short) hairpin RNAs (shRNAs). siRNAs are RNA duplexes that range from 19 to 27 nucleotides and are introduced into cells through transient transfection. Commercially available siRNA libraries target the entire human or mouse genome, or focus on particular families of proteins such as kinases. shRNAs are RNAs roughly 65 nucleotides in length that contain complementary sequences at their 5’ and 3’ ends. They are typically cloned into viral vectors so that the shRNA can be stably integrated and provide long-term knockdown of the target mRNA. The Dicer complex recognizes the hairpin that forms in the shRNA and cleaves it into siRNAs (reviewed in (29)). Various shRNA or siRNA screens were performed to detect cellular proteins that are needed for HIV-1 replication (22, 47, 99, 100, 150, 161, 216, 225). A number of these screens identified host proteins that are involved in various nuclear processes such as transcription, DNA repair, nuclear-cytoplasmic transport, and RNA processing. However, in these experiments, the researchers did not examine the role of Gag in these nuclear processes.

Another technique used to identify possible binding partners of Gag is the yeast two-hybrid screening method (16, 30, 73, 78, 82, 97, 128, 207). The system works by splitting the yeast Gal4 protein into its N-terminal DNA-binding domain and its C-terminal transcriptional activation domain. For genome-wide screens, a cDNA library is generally used to generate a library of preys fused to one of the Gal4 domains, and the other domain is fused to the bait protein to uncover possible interactors (23). Through these studies, a few nuclear proteins were identified as possible binding partners of Gag. Nucleolin was found to be a binding partner of the NC domain of MLV Gag, and this was validated through in vitro binding assays and detection of nucleolin in particles (7). MLV Gag was also found to interact, through its CA domain, with Ubc9 and PIASy, which are involved in SUMOylation. SUMOylation is involved in a variety of cellular processes, including nuclear-cytoplasmic transport, transcriptional regulation, apoptosis, protein stability, stress response, and cell cycle progression (223). MPMV Gag was also found to interact with Ubc9 through the CA domain (211), and for HIV-1, the p6 domain was found to interact with Ubc9 and SUMO-1 (74). For MMTV Gag, ribosomal protein L9 was found to interact with Gag and to be important for particle formation (16). In a study looking at binding partners of MA, a putative nuclear shuttling protein, designated “virion-associated nuclear shuttling protein” (VAN), was discovered to interact with MA (73).

In the mass spectrometry experiments, multiple techniques were used to try to identify possible interaction partners of HIV-1 Gag. One group utilized five different and independent affinity-tagged purification protocols: 1) a tandem affinity purification (TAP) tag for a C-terminally tagged Gag, 2) GFP-TRAP A beads for Gag with GFP fused internally in the MA domain, 3) GFP-TRAP A beads for Gag with GFP fused to the C-terminus of Gag, 4) GFP microbeads for Gag with GFP fused internally in the MA domain, and 5) GFP microbeads for Gag with GFP fused to the C-terminus of Gag. In one set of experiments, using 293T cells transfected with HIV-1 Gag, 31 proteins were identified in at least three of the purification experiments (51). The group focused on one protein in particular, Lyric, because it was the only cellular protein identified across all five screens. Lyric acts as an activator in the NF-κB signaling pathway. The interaction between Lyric and Gag was confirmed through co-immunoprecipitations, and Lyric was also identified in HIV-1 virions. They examined whether Lyric interacted with the Gag proteins of other retroviruses and found that both equine infectious anemia virus and MLV Gag also co-immunoprecipitated Lyric. Through further studies, this group hypothesized that Lyric could be important for regulating HIV-1 infectivity. Other proteins that were discovered included topoisomerase I, multiple splicing factors, and other RNA processing proteins that were not further discussed (51).

In another round of affinity-tagged purifications followed by mass spectrometry, the same group, using the same techniques, identified 944 proteins as potential binding partners of HIV-1 Gag. The categories of nuclear proteins identified included RNA processing proteins, in particular splicing factors; nucleosome assembly proteins; nucleolar proteins; and chromosome-binding proteins. In these experiments, they did not validate any of the potential interactors (50).

Another group wanted to identify, in a systematic and quantitative way, host proteins that interacted with any of the HIV-1 proteins, including the Gag, Pol, and Env (Gp160) polyproteins and all of their processed proteins (MA, CA, NC, p6, PR, RT, IN, Gp120, and Gp41), as well as the accessory proteins (Vif, Vpr, Vpu, Nef, Tat, and Rev). They tagged each protein with two Strep tags and three FLAG tags at the C-terminal end, then expressed them in HEK293 cells, as well as in stably expressing Jurkat cell lines. They discovered 2,849 unique proteins, of which 1,134 were identified from purifications of full-length Gag or the proteolytic products of Gag (MA, CA, NC, and p6). A number of RNA processing proteins were identified from the Gag purifications (89).

Two other groups wanted to expand the list of possible novel protein interactors of HIV-1 Gag using a technique based on the Escherichia coli biotin ligase BirA*, which tags proteins in close proximity through biotinylation. One group inserted the BirA* coding region in MA and found 53 unique proteins, which included RNA processing proteins and nuclear transport proteins (166). The other group used the same technique except that they fused the BirA* tag to the N-terminus of Gag, and found 42 unique proteins (114). While the first group did not validate any of the proteins, Le Sage et al. (114) confirmed the interactions of Gag with DDX17, an RNA helicase, and RPS6, a ribosomal protein, through co-immunoprecipitations.


Li et al. 2016 (120) wanted to identify novel binding factors of HIV-1 MA. They inserted a strep tag at the C-terminus of MA in the context of the virus and then used this replication-competent virus to infect Jurkat cells. After 48 hours, the cells were lysed and the strep-tagged MA proteins were affinity purified. In this experiment they identified 97 proteins, and they validated the interactions of MA with nucleolin, a DNA and RNA metabolism regulator; YB-1, which is involved in many mRNA pathways; and Ku70 and Ku80, parts of the DNA nonhomologous end-joining pathway. While many exploratory experiments have focused on finding new host proteins that retroviruses use, few have validated the nuclear proteins that were identified or determined their functions with Gag; doing so is one of the aims of this dissertation.

1.6 Basics of Mass Spectrometry

Mass spectrometry has been a critical tool for analyzing protein sequence and structure, as well as for identifying proteins from peptides. The type of mass spectrometry analysis used in this dissertation is discussed below, and a basic diagram of the process is outlined in Figure 1.5.

Mass spectrometry works through the ionization of macromolecular compounds into a gaseous state. Here, ionization occurs through electrospray ionization, which creates ions by applying a high voltage to the liquid sample. This generates charged droplets in the form of a fine mist. The solvent is then evaporated, using either drying gas or heat, causing the droplets to shrink and resulting in the formation of desolvated ions. The ions produced by this procedure are highly positively charged but are not fragmented, and because more than one charge state is observed, the true molecular mass of an ion can be calculated. Mass-to-charge (m/z) ratios are next determined by measuring the time it takes for ions to move through a field-free region (called time of flight (TOF) measurement). Given a constant accelerating voltage, the flight time for an ion is related to its m/z ratio. The flight path for an ion can be increased, without increasing the size of the flight tube, by incorporating an ion mirror, or reflectron, at the end of the flight tube. Ion direction is reversed to send the ions back down the same vacuum chamber at a different angle so that the flight paths of the reflected ions do not cross those of the entering ions. More energetic ions arrive earlier at the reflectron compared to the less energetic ions. After reflection, the ions reach the detector, where the m/z is measured.
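
The relationship between flight time and m/z can be made concrete with an idealized linear TOF model. This sketch rests on stated assumptions: an ion of mass m and charge ze accelerated through a voltage V gains kinetic energy zeV = (1/2)mv^2 and then crosses a field-free tube of length L, so that t = L * sqrt(m / (2zeV)). Reflectron geometry and initial energy spread are ignored. The tube length and voltage are hypothetical values and do not describe the instrument used in this work.

```python
import math

DALTON_KG = 1.66053906660e-27        # kilograms per dalton
ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs

def flight_time(mz, length_m, voltage_v):
    """Time (s) for an ion with the given m/z (Da per charge) to cross the tube."""
    return length_m * math.sqrt(
        mz * DALTON_KG / (2.0 * ELEMENTARY_CHARGE * voltage_v)
    )

def mz_from_time(t_s, length_m, voltage_v):
    """Invert the relation above: recover m/z from a measured flight time."""
    return 2.0 * ELEMENTARY_CHARGE * voltage_v * (t_s / length_m) ** 2 / DALTON_KG

# Hypothetical instrument parameters, for illustration only.
LENGTH_M = 1.0
VOLTAGE_V = 20000.0

t_light = flight_time(1000.0, LENGTH_M, VOLTAGE_V)
t_heavy = flight_time(4000.0, LENGTH_M, VOLTAGE_V)
print(t_light, t_heavy)
print(t_heavy / t_light)  # flight time scales with the square root of m/z
print(mz_from_time(t_light, LENGTH_M, VOLTAGE_V))
```

Because flight time scales with the square root of m/z, an ion with four times the m/z takes twice as long to reach the detector in this model.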

Correlating the sample peptides’ m/z data from the detector with known sequences is used to identify the protein(s) present in the sample. Provided that a sufficient number of peptide ions are observed in the mass analysis step and the protein is not heavily modified, a match can usually be found (reviewed in (214)). The database search approach is the most popular method for protein identification. It predicts peptide mass values through theoretical digestion of the proteins in a given sequence database, and then compares the experimental spectra with the theoretical ones to find the closest matches. This is only effective if the proteins of interest are already known and the databases examined contain the correct protein sequences. This can be difficult because not all proteins and protein modifications are completely known. Therefore, only a portion of the identifications reported by a database search are correct (208). Protein identification also depends upon the completeness of the particular database against which the peptide sequences are compared.
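
The theoretical digestion and matching steps described above can be sketched in a few lines. This is a deliberately simplified illustration and not the search software used in this work: it assumes trypsin cleavage after K or R except before P, allows no missed cleavages or modifications, and matches on intact peptide mass only, whereas real search engines also score fragment spectra. The toy database and function names are hypothetical.

```python
import re

# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini

def tryptic_digest(protein):
    """Cleave after K or R unless followed by P; no missed cleavages."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]

def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER

def search(observed_mass, database, tol=0.01):
    """Return (protein, peptide) pairs whose theoretical mass is within tol."""
    hits = []
    for name, sequence in database.items():
        for pep in tryptic_digest(sequence):
            if abs(peptide_mass(pep) - observed_mass) <= tol:
                hits.append((name, pep))
    return hits

# Toy sequences invented for illustration; not real proteins.
toy_db = {"protA": "MKAGR", "protB": "GGKWW"}
print(tryptic_digest("AKRPGKC"))
print(search(302.170, toy_db))
```

The sketch also shows why database completeness matters: an observed mass can only be assigned to a protein whose sequence is present in the database being searched.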

1.7 Overview

As discussed above, it is known that RSV Gag undergoes nuclear trafficking and that this is needed for efficient genome packaging. While in the nucleus, RSV Gag is able to form foci. It is currently not known whether RSV Gag has other roles within the nucleus besides binding the genomic RNA, or what exactly Gag is doing within these nuclear foci. Based on the roles of cellular proteins in the nucleus, besides finding the gRNA for packaging, RSV Gag could also be promoting transcription at the proviral integration site. Gag could be inhibiting the antiviral response through chromatin remodeling and gene silencing. Gag could be affecting the cell cycle, regulating splicing of the viral transcripts to favor unspliced viral RNA, or recruiting factors needed for viral replication.

In this dissertation, I discuss studies that begin to answer these questions. The focus of this dissertation is to identify candidate nuclear binding partners of the retroviral Gag protein and to determine the role(s) of nuclear Gag. Chapter 2 focuses on the further characterization of the nuclear foci and asks which candidate host proteins are in these foci. The data presented demonstrate that Gag may be interacting with splicing factors. Because Gag localizes to the nucleus and forms foci, Chapter 3 explores the possible binding partners that Gag has in the nucleus and tries to uncover possible tethers of the nuclear foci. A few factors that were identified via mass spectrometry were tested for their possible roles in virus packaging/assembly and for colocalization with RSV Gag. Chapter 4 specifically investigates the interaction of Gag with the nuclear import protein Transportin-SR (TNPO3). What Gag is doing in nuclear foci and which cellular proteins Gag may be interacting with are currently unknown and are the focus of this dissertation.


Figure 1.5: Basic layout of TOF mass spectrometry. Time of flight (TOF) mass spectrometry determines the mass-to-charge (m/z) ratio by measuring the time it takes for ions to move through a field-free region (214). The protein sample is injected into the machine through the inlet, where it undergoes ionization by an electric field. The peptide ions are then accelerated through the drift region. When the ions reach the pusher, the pusher reaccelerates the ions in a single push so that ions that have the same m/z but had different initial energies will reach the detector at the same time. After the ions are pushed, they are reflected to the detector. The time it takes ions to travel from the pusher to the detector is measured, and this determines the m/z. The m/z spectra are then compared to protein databases of known peptide spectra to identify the proteins in the sample.


Chapter 2

Interplay between the alpharetroviral Gag protein and SR proteins SF2 and SC35 in the nucleus

Breanna L. Rice*, Rebecca J. Kaddis*, Matthew S. Stake*, Timothy L. Lochmann, and Leslie J. Parent *Authors contributed equally

Citation: Rice BL, Kaddis RJ, Stake MS, Lochmann TL, Parent LJ. Interplay between the alpharetroviral Gag protein and SR proteins SF2 and SC35 in the nucleus. Frontiers in Microbiology. 2015;6:925. doi:10.3389/fmicb.2015.00925.

© 2015 Rice, Kaddis, Stake, Lochmann and Parent

Manuscript Appended

2.1 Abstract

Retroviruses are positive-sense, single-stranded RNA viruses that reverse transcribe their RNA genomes into double-stranded DNA for integration into the host cell chromosome. The integrated provirus is used as a template for the transcription of viral RNA. The full-length viral RNA can be used for the translation of the Gag and Gag-Pol structural proteins or as the genomic RNA (gRNA) for encapsidation into new virions by the Gag protein. The mechanism by which Gag selectively incorporates unspliced gRNA into virus particles is poorly understood. Although Gag was previously thought to localize exclusively to the cytoplasm and plasma membrane where particles are released, we found that the Gag protein of Rous sarcoma virus, an alpharetrovirus, undergoes transient nuclear trafficking. When the nuclear export signal of RSV Gag is mutated (Gag.L219A), the protein accumulates in discrete subnuclear foci reminiscent of nuclear bodies such as splicing speckles, paraspeckles, and PML bodies. In this report, we observed that RSV Gag.L219A foci appeared to be tethered in the nucleus, partially co-localizing with the splicing speckle components SC35 and SF2. Overexpression of SC35 increased the number of Gag.L219A nucleoplasmic foci, suggesting that SC35 may facilitate the formation of Gag foci. We previously reported that RSV Gag nuclear trafficking is required for efficient gRNA packaging. Together with the data presented herein, our findings raise the intriguing hypothesis that RSV Gag may co-opt splicing factors to localize near transcription sites. Because splicing occurs co-transcriptionally, we speculate that this mechanism could allow Gag to associate with unspliced viral RNA shortly after its transcription initiation in the nucleus, before the viral RNA can be spliced or exported from the nucleus as an mRNA template.


Chapter 3

Identification of possible chromatin-associated protein

interacting partners of the retroviral Gag protein

3.1 Abstract

Retroviruses use a variety of host proteins in order to properly replicate and form infectious virions. Most studies that examined possible interaction partners of Gag focused on cytoplasmic or plasma membrane interactors. Little work has examined the host proteins that Gag may interact with inside the nucleus, even though numerous studies have demonstrated the presence of Gag in the nucleus. The nuclear function of the Gag protein has only been determined for a few retroviruses. The work discussed here focuses on uncovering the nuclear localization and possible binding partners of both RSV and HIV Gag through subcellular fractionations and affinity-tagged purifications followed by mass spectrometry, respectively. Here it is shown that both Gag proteins can be extracted from chromatin-associated protein fractions, suggesting a possible role in either chromatin binding or interactions with chromatin-bound proteins. Through mass spectrometry, numerous host proteins were identified that function during transcription and RNA processing, such as splicing, suggesting that Gag may be interacting with host proteins at transcription sites.

3.2 Introduction

Retrovirus replication is dependent upon the successful packaging of its RNA genome. The Gag structural protein is responsible for the encapsidation of the unspliced viral RNA. Historically it was thought that after translation, Gag remained in the cytoplasm where it bound to the unspliced RNA and trafficked to the plasma membrane for budding. However, it was demonstrated that the Gag protein from Rous sarcoma virus (RSV) did not remain strictly cytoplasmic but in fact trafficked to the nucleus.


Through immunofluorescence and live-cell imaging, Gag has been shown to transiently traffic through the nucleus in a CRM1 nuclear export-dependent manner (172). Sequence analysis and mutational studies were performed, and a nuclear export signal (NES) was found in the p10 domain of RSV Gag. To further study the nuclear trafficking of RSV Gag, a mutation was made within the NES; the resulting mutant, designated Gag.L219A, dramatically relocalizes to the nucleus and forms foci (173). The kinetics of the Gag.L219A foci were examined, and it was determined that the foci exhibit obstructed diffusion characteristic of being tethered to something, such as an RNA or protein (163) (Chapter 2/Appendix A).

Studies suggest that a link exists between RSV Gag nuclear trafficking and efficient genomic RNA (gRNA) packaging. To examine the effectiveness of RNA packaging when nuclear trafficking is disrupted, a Gag mutant was used that traffics strongly to the plasma membrane, bypassing the nucleus. This altered trafficking caused a 60% decrease in viral genome packaging compared to wildtype levels. When nuclear trafficking was restored, genome packaging returned to near wildtype levels, suggesting that nuclear trafficking is needed for efficient genome packaging (63). Recent data suggest that RSV Gag binds the viral RNA in the nucleus and traffics across the nuclear envelope into the cytoplasm. Furthermore, when cells expressing Gag.L219A are treated with the transcription inhibitor actinomycin D, the Gag nuclear foci disappear, implying that the formation of the Gag nuclear foci depends upon active transcription (Maldonado R. et al., in progress).

It has been found that the Gag proteins from foamy virus (FV), murine leukemia virus (MLV), feline immunodeficiency virus (FIV), Mason-Pfizer monkey virus (MPMV), mouse mammary tumor virus (MMTV), and human immunodeficiency virus type-1 (HIV-1) also undergo nuclear localization. FV Gag nuclear trafficking has been shown to be important for binding to mitotic chromatin through interactions with histones H2A and H2B for the purpose of provirus integration (117, 141, 162, 175, 200). MLV Gag has been shown to traffic into the nucleus (147); its NC domain can be seen in the nucleolus (87, 125, 165), and its p12 domain is important during integration (49, 176). MMTV Gag has also been seen in nucleoli (16), as has HIV-1 Gag (125). FIV, MPMV, and HIV-1 Gag proteins have been shown to be nuclear through imaging and biochemical approaches (20, 95, 211) (Tuffy, K. et al., in progress). This shows that nuclear trafficking of Gag is not unique to RSV but appears to be a common feature among most retroviruses, even though the reasons for the trafficking are not completely understood for many retroviruses.

There are many possible reasons why Gag could be trafficking into the nucleus. For RSV, it appears that Gag may be selecting the gRNA for packaging within the nucleus, but this does not mean that this is the only thing RSV Gag could be doing. Gag could also be altering cellular transcription to favor the site of viral integration, as well as regulating splicing of the viral mRNA to favor the unspliced viral RNA. Gag could also be inhibiting the antiviral response or affecting the cell cycle. Another possibility is that nuclear Gag recruits factors that are needed for viral replication. To determine which nuclear processes Gag could be interacting with, as well as to identify possible protein tethers of the nuclear Gag foci, multiple mass spectrometry experiments were performed and are discussed in this chapter. Other laboratories have performed studies to identify novel host proteins that interact with HIV-1 Gag using various techniques such as GFP-tagged Gag, tandem affinity purification-tagged Gag, and BirA*-tagged Gag (50, 51, 89, 114, 120, 166). As discussed below, many proteins overlap between the experiments performed here and the proteins identified by these other groups. This yields a useful list of cellular proteins to test further for roles in viral replication. The hypothesis for this work is that RSV Gag forms functional interactions in vivo with affinities sufficient for isolation of its partners in vitro.


3.3 Materials and Methods

Cells, Plasmids, and Purified Proteins

DF1 chicken embryo fibroblast cells and HeLa human cervical cancer cells were maintained as described (80, 125). QT6 quail fibroblast cells were maintained as described (40) and were transfected using the calcium phosphate method (56). HEK 293T cells were maintained using the same conditions as the DF1 cell line, except at 37°C, and were transfected using the calcium phosphate method.

The RSV Gag constructs pGagΔPR (referred to as RSV Gag) and pGag.L219A.ΔPR (referred to as RSV Gag.L219A), as well as Gag.L219A.YFP, Gag.L219A.GFP, and Gag.ΔNC, were previously described (96, 173, 174). The RSV viral construct pRC.V8 was previously described (40). The HIV Gag constructs HIV Gag.CFP (Rev independent), HIV Gag.CFP (Rev dependent), and Rev.YFP were previously described (94, 125).

The RSV Gag.tandemstrep construct was cloned using pGagΔPR and primers to introduce the tandem strep tag: 5’ – GGA GGA AGT GGA GGA AGT GCA TGG AGC CAC CCG CAG TTC GAA AAA TAA GCG GCC GCG ACT CTA G and 5’ – TCC ACT TCC TCC TCC TTT TTC GAA CTG CGG GTG GCT CCA TGC ACT CGA GAC GGC AGG TGG CTC, using the Q5 Site-Directed Mutagenesis protocol according to the manufacturer’s guidelines (New England Biolabs). The Gag.L219A.tandemstrep construct was cloned by excising the MA-p10 region from a Gag.L219A construct with the restriction enzymes SacI and ScaI and inserting it into Gag.tandemstrep digested with the same enzymes. mCherry.tandemstrep was made by performing Q5 Site-Directed Mutagenesis on the mCherry.N2 plasmid using the primers 5’ – AGG AAG TGG AGG AAG TGC ATG GAG CCA CCC ACA GTT CGA AAA ATA ATA AAG CGG CCG CGA CTC T and 5’ – CCT CCG CTA CCA CCA CCC TTC TCA AAT TGT GGA TGA CTC CAT GCA CTC TTG TAC AGC TCG TCC ATG C. GFP.N2 was obtained from Clontech, and Histone H2B-GFP was a generous gift from Dr. Zheng (98). The preparation of the proteins RSV H6.Gag.3h (encoded by pET28.TEV-Gag.3h), HIV WT.Gag.Δp6.H6 (encoded by pET28a.WT.HIV.Gag.Δp6), and RSV Gag.ΔSP.ΔNC (referred to as Gag.ΔNC) was previously described (15, 72, 170). RSV integrase protein was a generous gift from Dr. Katzman (PSU College of Medicine) (183).
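
As a quick sanity check on primer design, the first RSV Gag.tandemstrep primer listed above can be translated in silico. The sketch below assumes that the first base of the primer is the first base of a codon; the reading frame is not stated in the text, so this is an assumption made for illustration. Under that assumption the primer reads as a Gly/Ser linker followed by the Strep-tag II sequence (WSHPQFEK) and a stop codon. The helper name `translate` is hypothetical.

```python
# Build the standard genetic code from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(dna):
    """Translate from the first base, stopping after the first stop codon (*)."""
    dna = dna.replace(" ", "").upper()
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        protein.append(residue)
        if residue == "*":
            break
    return "".join(protein)

# First primer from the text; the frame is an assumption (see lead-in).
forward_primer = (
    "GGA GGA AGT GGA GGA AGT GCA TGG AGC CAC CCG CAG TTC GAA AAA TAA "
    "GCG GCC GCG ACT CTA G"
)
print(translate(forward_primer))  # GGSGGSAWSHPQFEK*
```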

Subcellular Fractionations

Chase

QT6 and 293T cells were transfected with either the tagless RSV Gag constructs or the CFP-tagged HIV Gag constructs with and without Rev.YFP (Figure 1C); 16 hours later the media was changed to fresh primary growth media (PGM) and the cells were allowed to recover for 24 hours. All steps and buffers used were performed on ice or at

4°C unless otherwise stated. Cells were fractionated according to (34). Cells were collected using trypsin and then washed in cold PBS. The cell pellet was resuspended in sucrose buffer (10 mM Hepes pH 7.9, 10 mM KCl, 2 mM magnesium acetate, 3 mM

CaCl2, 340 mM sucrose, 1 mM DTT, 100 μg/ml phenylmethanesulfonyl fluoride (PMSF),

1 μg/ml pepstatin, and Roche Complete Protease Inhibitor Cocktail) and incubated on ice for 10 minutes. IGEPAL Nonidet P-40 was added to the final concentration of 0.5% and cells were vortexed on high for 15 seconds, and then spun for 10 minutes at 3,500 x g at 4°C. The supernatant (cytoplasm fraction) was collected and the pelleted nuclei were resuspended in nucleoplasm extraction buffer (50 mM Hepes pH 7.9, 150 mM potassium acetate, 1.5 mM MgCl2, 0.1% IGEPAL Nonidet P-40, 1 mM DTT, 100 μg/ml

PMSF, 1 μg/ml pepstatin, and Roche Complete Protease Inhibitor Cocktail) and transferred to a dounce homogenizer and homogenized with 20 slow strokes. Nuclei were checked under a light microscope for lysis. The nuclear lysate was transferred to a new tube and rotated at 4°C for 20 minutes. The lysates were spun at 16,000 x g for 10 minutes at 4°C. The supernatant (nucleoplasm fraction) was collected and the chromatin

pellet was resuspended in nuclease incubation buffer (50 mM Hepes pH 7.9, 10 mM

NaCl, 1.5 mM MgCl2, 1 mM DTT, 100 μg/ml PMSF, 1 μg/ml pepstatin, and Roche

Complete Protease Inhibitor Cocktail) with 100U/ml of OmniCleave (Epicentre). The chromatin was digested for 10 minutes at 37°C. NaCl was added to 150 mM and incubated on ice for 20 minutes. The lysates were spun for 10 minutes at 16,000 x g at

4°C. The supernatant (low-salt chromatin fraction) was collected and the pellet was resuspended in chromatin extraction buffer (50 mM Hepes pH 7.9, 500 mM NaCl, 1.5 mM MgCl2, 0.1% Triton X-100, 1 mM DTT, 100 μg/ml PMSF, 1 μg/ml pepstatin, and

Roche Complete Protease Inhibitor Cocktail) and incubated for 20 minutes on ice.

The lysates were spun for 10 minutes at 16,000 x g at 4°C. The supernatant (high-salt chromatin fraction) was collected.
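Several steps above bring a buffer component up to a target final concentration (for example, NaCl from 10 mM to 150 mM). The arithmetic can be sketched as a simple mass balance, assuming a concentrated stock is spiked into the sample. The 5 M stock, the 500 μl sample volume, and the function name are illustrative assumptions, not values taken from the protocol.

```python
def spike_volume(c_current, c_target, c_stock, v_sample):
    """Volume of stock to add so that the mixture reaches c_target.

    Mass balance:
        c_current * v_sample + c_stock * v_add = c_target * (v_sample + v_add)
    Concentrations share one unit (here mM); volumes share one unit (here ul).
    """
    if not c_current <= c_target < c_stock:
        raise ValueError("need c_current <= c_target < c_stock")
    return v_sample * (c_target - c_current) / (c_stock - c_target)


# Hypothetical example: 500 ul of lysate in 10 mM NaCl, raised to 150 mM NaCl
# with an assumed 5 M (5000 mM) NaCl stock.
v_add = spike_volume(c_current=10, c_target=150, c_stock=5000, v_sample=500)
print(f"add {v_add:.1f} ul of stock")
```

Because the added volume also dilutes the sample, the denominator uses the difference between stock and target concentration rather than the stock concentration alone.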

Henikoff

QT6 and 293T cells were transfected with the tagless RSV Gag constructs

(Figure 1C); 16 hours later the media was changed to fresh PGM and the cells were allowed to recover for 24 hours. All steps and buffers used were performed on ice or at

4°C unless otherwise stated. Cells were fractionated according to (77). Cells were collected using trypsin and then washed in cold PBS. Cells were resuspended in TM2 buffer (10 mM Tris pH 7.4, 2 mM MgCl2, 100 μg/ml PMSF, 1 μg/ml pepstatin, and Roche

Complete Protease Inhibitor Cocktail) and incubated on ice for 1 minute. NP-40 was added to 1.5% and lysates were incubated on ice for 5 minutes. Lysates were spun at

500 x g for 10 minutes at 4°C. The supernatant was discarded and the pellet was resuspended in TM2 buffer supplemented with 1 mM CaCl2 and 1.6U/ml OmniCleave.

The lysates were incubated at 37°C for 10 minutes. The reaction was quenched with 2 mM EGTA on ice. Lysates were pelleted at 500 x g for 10 minutes at 4°C. The supernatant was discarded and the pellet was resuspended in 150 mM buffer (50 mM

Tris pH 7.4, 2 mM MgCl2, 150 mM NaCl, 2 mM EGTA, 0.1% Triton X-100, 100 μg/ml

PMSF, 1 μg/ml pepstatin, and Roche Complete Protease Inhibitor Cocktail) and rotated for 2 hours at 4°C. The supernatant (150 mM chromatin fraction) was collected and the chromatin pellet was resuspended in 600 mM buffer (50 mM Tris pH 7.4, 2 mM MgCl2,

600 mM NaCl, 2 mM EGTA, 0.1% Triton X-100, 100 μg/ml PMSF, 1 μg/ml pepstatin, and

Roche Complete Protease Inhibitor Cocktail) and rotated overnight at 4°C. The lysates were pelleted at 500 x g for 10 minutes at 4°C and the supernatant (600 mM chromatin fraction) was collected.

Western Blot Analysis

The subcellular fractionations were analyzed via SDS-PAGE. Aliquots of the fractions were heated to 90°C in 4X SDS-PAGE sample buffer (250 mM Tris-HCl, pH

6.8, 40% glycerol, 0.4% bromophenol blue, 8% SDS, and 8% β-mercaptoethanol) for 10 minutes prior to loading on a 10% SDS-PAGE gel and analyzed by Western blot.

Proteins were detected using antibodies against RSV Gag (210), GFP (Abcam ab290),

Calnexin (Enzo Life Sciences ADI-SPA-865), Crm1 (Santa Cruz sc-5595), Med4 (Abcam ab129170), RCC1 (Abcam ab54600), Histone H2B (Abcam ab52484), GAPDH (UBP Bio

Y1040), and HRP-conjugated secondary antibodies (Invitrogen).

For the RSV Gag Chase fractionations, the signal density of the protein bands on the antibody-stained membranes was analyzed using the Image Lab Software on the

ChemiDoc MP system. Rectangles were drawn around each band, as well as around a blank region, using the volume tools feature, which quantifies the signal intensity of the bands. The background subtraction method was set to local, and the blank region that was highlighted by a rectangle was labelled as the background volume. The volumes report table was exported to Microsoft Excel. For each Gag, the adjusted volumes for each fraction were added together to calculate the total adjusted volume. The percentage for each fraction was then calculated by dividing the fraction’s adjusted volume by the total adjusted volume. Averages and standard

deviations were calculated for each fraction for each Gag protein from three separate experiments.
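The densitometry summary described above can be sketched in a few lines. This is a minimal illustration, assuming each fraction's percentage is its adjusted volume expressed as a share of the summed adjusted volumes for that Gag construct. The fraction labels and all numeric values are hypothetical placeholders, not measured data.

```python
from statistics import mean, stdev

FRACTIONS = ["cytoplasm", "nucleoplasm", "chromatin_150mM", "chromatin_500mM"]


def fraction_percentages(adjusted_volumes):
    """Convert one experiment's adjusted volumes into percent of total signal."""
    total = sum(adjusted_volumes[f] for f in FRACTIONS)
    return {f: 100.0 * adjusted_volumes[f] / total for f in FRACTIONS}


# Hypothetical adjusted volumes (arbitrary units) for one Gag construct,
# one dict per independent experiment. These are not measured values.
experiments = [
    {"cytoplasm": 400, "nucleoplasm": 350, "chromatin_150mM": 150, "chromatin_500mM": 100},
    {"cytoplasm": 450, "nucleoplasm": 300, "chromatin_150mM": 150, "chromatin_500mM": 100},
    {"cytoplasm": 350, "nucleoplasm": 400, "chromatin_150mM": 150, "chromatin_500mM": 100},
]

per_experiment = [fraction_percentages(e) for e in experiments]

# Mean and sample standard deviation of each fraction across experiments.
summary = {
    f: (mean([p[f] for p in per_experiment]), stdev([p[f] for p in per_experiment]))
    for f in FRACTIONS
}

for f, (avg, sd) in summary.items():
    print(f"{f}: {avg:.1f}% +/- {sd:.1f}%")
```

Normalizing within each experiment before averaging keeps differences in total signal between blots from dominating the standard deviation.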

Mitotic Chromatin

Serum Starvation

Cells were transfected with Gag.L219A.GFP, GFP.N2, or H2B-GFP for 16 hours.

Cells were washed with warm standard buffer, and PGM was added for 8 hours to allow for recovery. After 8 hours, the cells were washed with standard buffer and serum-free media was added for 16 hours, after which the media was changed to PGM. Cells were allowed to recover and grow for 7 hours then fixed in 3.7% paraformaldehyde in PHEM buffer (120 mM PIPES, 55 mM HEPES, 20 mM EGTA, and 16.5 mM MgSO4 pH 7.0)

(133) and DAPI stained at 5 μg/ml for 30 seconds. The coverslips were mounted using

ProLong Diamond Antifade Mountant (ThermoFisher Scientific).

Imaging

Cells were imaged using the Leica SP8 TCS scanning confocal microscope equipped with a White Light Laser (WLL) using a 63X oil immersion objective.

Sequential scanning between frames was used to average four frames for each image.

DAPI was excited with the 405 nm UV laser at 10% laser power and emission detection window 410 – 451 nm using a PMT detector. GFP was imaged using the WLL excited with the 489 nm laser line and a hybrid detector window of 495 – 547 nm. YFP was imaged using the WLL with a laser line excitation of 514 nm and a hybrid detector window of 519–600 nm. All channels using the hybrid detectors had a time gating of 0.3 to 6.0 ns.

Purified RSV Gag, Gag.ΔNC, and RSV Integrase Pulldowns

This first RSV purified protein mass spectrometry experiment was performed in collaboration with Dr. Nikoloz Shkriabai and Dr. Mamuka Kvaratskhelia from The Ohio

State University. Cells were provided by me, and I performed the final analysis; the lysate preparation, affinity-tagged purification, and mass spectrometry were performed by our collaborators.

Lysate Preparation

DF1 cells were fractionated using the NE-PER Nuclear and Cytoplasmic

Extraction kit (ThermoFisher Scientific). All steps and buffers used were performed on ice or at 4°C unless otherwise stated. Cells were lysed in CERI buffer containing the

Complete Protease Inhibitor Cocktail (Roche). Cells were vortexed on the highest setting for 15 seconds and incubated on ice for 10 minutes. Ice-cold CERII buffer was added and cells were vortexed on high for 5 seconds then centrifuged for 5 minutes at 16,000 x g in a microcentrifuge. The supernatant was collected (cytoplasmic fraction), and the pelleted nuclei were resuspended in ice-cold NER buffer with protease inhibitor cocktail added. The nuclei were vortexed on high for 15 seconds and incubated on ice for 10 minutes, then vortexed for 15 seconds every 10 minutes for a total of 40 minutes. The lysed nuclei were centrifuged at 16,000 x g for 10 minutes. The supernatant (nuclear fraction) was diluted to 14 ml with Buffer A (25 mM Tris-HCl pH 8.0, 200 mM NaCl, 2 mM 2-mercaptoethanol (BME), and protease inhibitor cocktail). The nuclear fraction was concentrated to ~1 ml in a 3 kDa MWCO Amicon column and then was diluted to 14 ml and concentrated once more to ~1.2 ml.

Nickel Affinity Purifications

Nickel resin was washed three times in Buffer B (25 mM Tris-HCl pH 8.0, 200 mM NaCl, 30 mM imidazole, 0.2% v/v NP40, and 2 mM BME). Six reactions were performed using ~6 μg of RSV-IN, RSV H6.Gag.3h, or RSV Gag.ΔSP.ΔNC with either the prepped nuclear extract (divided among three reactions) or just Buffer B. The proteins were incubated with nickel beads rotating for 1 hour at 4°C. Beads were washed three times with Buffer B and then incubated with either nuclear extract or just

Buffer B rotating for 2 hours at 4°C. The beads were washed three times with Buffer B.

Proteins were eluted from the beads in 60% loading dye containing 50mM DTT and

50mM EDTA, heated for 7 minutes at 95°C. The samples were run on a 4-12% gradient

SDS-PAGE gel.

Sample Preparation for Mass Spectrometry

Seven bands were excised per lane and incubated in 50% acetonitrile (ACN) overnight while shaking. Tubes were spun briefly and the supernatant was removed and replaced with 0.7 ml water, then vortexed for 1 minute. This was repeated two more times. 100% ACN was added to each tube and vortexed for 10 minutes or until the gel slices were white and shrunken. The ACN was removed and the gel slices were dried in a speedvac at medium heat for ~15 minutes. Trypsin solution (0.2 μg/μl trypsin stock diluted 20-fold with 50 mM ammonium bicarbonate) was added to each tube and digested overnight shaking at 300 rpm at room temperature. The tubes were spun briefly and 100% ACN was added and vortexed for 10 minutes. The tubes were spun briefly and the supernatant was transferred to clean 0.5 ml tubes and dried completely in a speedvac on medium heat. Each sample was reconstituted in HPLC-grade water and 0.1% trifluoroacetic acid (TFA), and then sonicated for 15 minutes in a bath sonicator. The samples were run through ZipTip pipet tips that were first treated with methanol, then

80% ACN/ 0.1% TFA, and equilibrated with 0.1% TFA. After running the samples through the tips, the tips were washed with 0.1% TFA twice and then the sample was eluted into new tubes with 80% ACN/ 0.1% TFA two times. The samples were lyophilized through speedvacing and analyzed at Caltech.

Protein Identification and Analysis

Data were analyzed using Mascot (Matrix Science, London, UK; version Mascot).

Mascot was set up to search the ChickenRous.290413 database. Scaffold (version

Scaffold_4.5.1, Proteome Software INC., Portland, OR) was used to validate MS/MS based peptide and protein identifications. Peptide identifications were accepted if they

could be established at greater than 90.0% probability by the Peptide Prophet algorithm

(93). Protein identifications were accepted if they were established at greater than 90.0% probability and contained at least 1 peptide (148). Proteins that are listed as common laboratory contaminants or reversed decoys were removed.
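The acceptance criteria above amount to a simple filter over the identification list. The sketch below is only an illustration of that logic, not the Scaffold implementation. The `ProteinHit` record, its field names, and the example entries are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ProteinHit:
    # Hypothetical record; field names are illustrative, not a Scaffold export format.
    accession: str
    protein_probability: float                      # 0-1
    peptide_probabilities: list = field(default_factory=list)
    is_contaminant: bool = False
    is_decoy: bool = False


def accept(hit, peptide_min=0.90, protein_min=0.90, min_peptides=1):
    """Apply the thresholds described in the text to a single protein hit."""
    good_peptides = [p for p in hit.peptide_probabilities if p > peptide_min]
    return (
        not hit.is_contaminant
        and not hit.is_decoy
        and hit.protein_probability > protein_min
        and len(good_peptides) >= min_peptides
    )


# Hypothetical hits, for illustration only.
hits = [
    ProteinHit("protein_A", 0.99, [0.97, 0.95]),
    ProteinHit("protein_B", 0.85, [0.96]),                        # protein probability too low
    ProteinHit("protein_C", 0.95, [0.70]),                        # no accepted peptide
    ProteinHit("protein_D", 0.99, [0.99], is_contaminant=True),   # common contaminant
    ProteinHit("protein_E", 0.99, [0.99], is_decoy=True),         # reversed decoy
]

accepted = [h.accession for h in hits if accept(h)]
print(accepted)
```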

Purified RSV Gag and HIV Gag Pulldowns

Lysate Preparation

Nuclear lysates from DF1 and HeLa cells were prepared as described above using the NE-PER Nuclear and Cytoplasmic Extraction kit.

Nickel Affinity Purifications

The his-tagged protein pulldowns were performed as above with slight alterations. Three reactions were performed using 6 μg of RSV H6.Gag.3h and HIV

WT.Gag.Δp6.H6, and a no protein control for both DF1 and HeLa nuclear lysates. The proteins and no protein control were incubated on pre-washed nickel beads for 1 hour rotating at 4°C. The beads were then washed three times in Buffer B, followed by incubation with 500 μg of nuclear extract for 2 hours rotating at 4°C. The beads were washed again three times in Buffer B, and bound proteins were eluted from the beads using Buffer B supplemented up to 300 mM imidazole for 15 minutes rotating at 4°C. The eluates were buffer exchanged into water using Zeba Spin Desalting Columns

(ThermoFisher Scientific) and 20 μg of each sample was used for mass spectrometry analysis.

Sample Preparation for Mass Spectrometry

The samples were prepared and processed at the Mass Spectrometry and

Proteomics Core Research Facility at Penn State College of Medicine using the ABSciex

5600 TripleTOF. In a final volume of 100 μl, the samples were incubated in 50 mM

NH4HCO3, pH 8.0, 10% v/v acetonitrile, and 0.1 μg trypsin for at least 3 hours at 48°C.

To evaporate off the NH4HCO3 and acetonitrile, samples were dried down using a

SpeedVac, and then resuspended in 200 μl H2O with vortexing. The drying was repeated 3X total, but the final resuspension volume was 10 μl. To each sample, 1/9th volume of 1% formic acid was added.

Strep-tag Affinity Purifications

Lysate Preparation

Cytoplasmic/Nuclear fractionation was modified from (177). All steps and buffers used were performed on ice or at 4°C unless otherwise stated. QT6 cells were transfected with Gag.tandemstrep, Gag.L219A.tandemstrep, or mCherry.tandemstrep.

Media was changed 16 hours after transfection and 24 hours later, cells were harvested by scraping in PBS. Cells were pelleted at 800 x g for 5 min. The cell pellet was resuspended in lysis buffer (10 mM Hepes pH 7.9, 10 mM KCl, 0.1 mM EDTA, 0.4%

Nonidet P-40, 1 mM DTT, 100 μg/ml PMSF, 1 μg/ml pepstatin, and Roche Complete

Protease Inhibitor Cocktail) and incubated on ice for 5 minutes. The cells were spun at

4,000 x g for 5 minutes and the supernatant (cytoplasm) was transferred to a new tube.

The nuclear pellet was washed with lysis buffer for 5 minutes and spun again. The supernatant was added to the prior collection. The pellet was resuspended in nuclear extraction buffer (20 mM Hepes pH 7.9, 10 mM NaCl, 1 mM DTT, 100 μg/ml PMSF, 1

μg/ml pepstatin, and Roche Complete Protease Inhibitor Cocktail) and homogenized in a dounce homogenizer with 20 strokes. Lysates were transferred into new tubes and 100

U/ml OmniCleave (Epicentre) was added and incubated at 37°C for 10 minutes. On ice, the lysates were brought up to 150 mM NaCl and vortexed on high for 15 seconds and rotated at 4°C for 20 minutes. The lysates were spun at 16,000 x g for 10 minutes and the supernatant (nucleoplasm) was transferred to a new tube.

Affinity Purifications

Strep-Tactin Sepharose beads (IBA Lifesciences) were washed three times in wash buffer (20 mM Hepes pH 7.9, 150 mM NaCl, 1 mM EDTA, 100 μg/ml PMSF, 1

μg/ml pepstatin, and Roche Complete Protease Inhibitor Cocktail). 5 mg of nuclear lysate was added to the beads and rotated overnight at 4°C. Beads were washed five times in wash buffer. Proteins were eluted off the beads using elution buffer (2.5 mM D-Desthiobiotin, 20 mM Hepes pH 7.9, 150 mM NaCl, and 1 mM EDTA). Elutions were buffer exchanged into water using Zeba Spin Desalting Columns and 20 μg of protein was submitted to the PSU College of Medicine Mass Spectrometry Core for iTRAQ labelling.

iTRAQ Sample Preparation

The protein samples were labelled using the iTRAQ Multiplex (4-plex) kit (Applied

Biosystems). To each sample (almost dried completely) Dissolution Buffer (0.5 M triethylammonium bicarbonate (TEAB) at pH 8.5) was added followed by 2% SDS and then vortexed. Reducing Reagent (110 mM tris-(2-carboxyethyl) phosphine [TCEP]) was added to each sample to a final concentration of 5 mM TCEP. The samples were vortexed and spun down. Next they were incubated at 60°C for 1 hour and spun down again.

A freshly prepared 84 mM solution of iodoacetamide was added and the samples were vortexed and spun again. The tubes were incubated in the dark for 30 minutes at room temperature. To each sample 10 μl of 1 mg/ml trypsin solution (in 50 mM acetic acid) was added, then the samples were vortexed and spun. The samples were incubated at 48°C overnight and were spun the next day. The iTRAQ Reagents were brought up to room temperature and 70 μl of ethanol was added to each iTRAQ Reagent vial. Each vial was vortexed for 1 minute and then spun. Each iTRAQ reagent was added to a separate sample, then vortexed and spun. The samples with iTRAQ reagent were incubated at room temperature for 1 hour. Then 100 μl of Milli-Q water was added to each tube and incubated at room temperature for 30 minutes to quench the reaction. All of the sample tubes were combined into one tube and vortexed then spun. The tube was then dried, resuspended in 100 μl water, vortexed, spun, and completely dried. The

drying and resuspension of the sample was repeated two more times. The dried samples were then resuspended in Buffer A (10 mM ammonium formate, pH 2.7, in 20% acetonitrile/80% water).

Mass Spectrometry

2D-LC separations:

SCX separations were performed on a passivated Waters 600E HPLC system, using a 4.6 X 250 mm PolySULFOETHYL Aspartamide column (PolyLC, Columbia, MD) at a flow rate of 1 ml/min. The gradient was Buffer A (10 mM ammonium formate, pH

2.7, in 20% acetonitrile/80% water) at 100% (0-22 minutes following sample injection),

0%→40% Buffer B (666 mM ammonium formate, pH 2.7, in 20% acetonitrile/80% water)

(16-48 min), 40%→100% Buffer B (48-49 min), then isocratic 100% Buffer B (49-56 min), then at 56 min switched back to 100% Buffer A to re-equilibrate for the next injection. 1 ml fractions were collected and were dried down then resuspended in 9 µl of

2% (v/v) acetonitrile, 0.1% (v/v) formic acid and filtered prior to reverse phase C18 nanoflow-LC separation.

Mass Spectrometry analysis:

Each SCX fraction was analyzed following a calibration run using trypsin-digested β-Gal as a calibrant, then a blank run using the ABSciex 5600 TripleTOF. MS

Spectra were then acquired from each sample using the newly updated default calibration, using a 60 minute (120 minute for the iTRAQ samples) gradient from an

Eksigent NanoLC-Ultra-2D Plus and Eksigent cHiPLC Nanoflex through a 200 µm x 0.5 mm Chrom XP C18-CL 3 µm 120 Å Trap Column and elution through a 75 µm x 15 cm

Chrom XP C18-CL 3 µm 120 Å Nano cHIPLC Column.

Protein Identification and Analysis:

Protein identification and quantitation were performed using the Paragon algorithm as implemented in Protein Pilot 5.0 software (ProteinPilot 5.0, which contains

the Paragon Algorithm 5.0.0.0, build 4632 from ABI/MDS-Sciex) (181). Spectra were searched against Homo sapiens or Gallus gallus RefSeq subsets (plus 389 common contaminants) of the NCBInr database concatenated with a reversed "decoy" version of itself. For the ProteinPilot analyses, the preset Thorough (iTRAQ or Identification)

Search settings are used, and identifications must have a ProteinPilot Unused Score >

1.3 (>95% Confidence interval) in order to be accepted. In addition, the only protein IDs accepted must have a "Local False Discovery Rate" estimation of no higher than 5%, as calculated from the slope of the accumulated Decoy database hits by the PSPEP

(Proteomics System Performance Evaluation Pipeline) (196). Proteins that were labelled as contaminants or reversed were removed.
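The correspondence between an Unused Score of 1.3 and 95% confidence can be illustrated numerically. The sketch below assumes the commonly documented relationship ProtScore = -log10(1 - confidence), and the function names and example values are hypothetical. It illustrates the stated cutoffs and is not a reproduction of the ProteinPilot or PSPEP calculations.

```python
import math


def protscore_from_confidence(confidence):
    """Assumed relationship: ProtScore = -log10(1 - confidence)."""
    return -math.log10(1.0 - confidence)


def confidence_from_protscore(score):
    """Inverse of the assumed relationship above."""
    return 1.0 - 10.0 ** (-score)


def accept_identification(unused_score, local_fdr, min_score=1.3, max_local_fdr=0.05):
    """Both cutoffs from the text: Unused Score > 1.3 and local FDR no higher than 5%."""
    return unused_score > min_score and local_fdr <= max_local_fdr


print(f"95% confidence -> score {protscore_from_confidence(0.95):.3f}")
print(f"99% confidence -> score {protscore_from_confidence(0.99):.3f}")
print(accept_identification(unused_score=2.0, local_fdr=0.01))
```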

Analysis of Proteomics

The Database for Annotation, Visualization, and Integrated Discovery (DAVID, version 6.8) (85, 86) was used to assign each protein to their cellular compartment(s) and biological process categories. Proteins were organized by their gene name for entries into DAVID and the Homo sapiens species database was used. Data presented in the tables were generated using the Gene Ontology GOTERM_BP_ALL to categorize proteins by their biological function, and GOTERM_CC_ALL to first identify the proteins present in the nucleus. Categories with a p-value of ≤ 0.05, as determined by modified

Fisher’s Exact Test, were considered statistically overrepresented, and any redundant categories (same p-value and proteins) were removed.
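The category filtering described above can be sketched as follows. This is a minimal illustration that assumes the DAVID output has already been parsed into (term, p-value, gene set) tuples. The term names, p-values, and gene names are hypothetical placeholders, and the code does not call DAVID itself.

```python
def filter_categories(categories, alpha=0.05):
    """Keep categories with p <= alpha and drop redundant ones.

    Two categories are treated as redundant when they share both the same
    p-value and the same set of proteins; the first one encountered is kept.
    """
    kept, seen = [], set()
    for term, p_value, genes in categories:
        if p_value > alpha:
            continue
        key = (p_value, frozenset(genes))
        if key in seen:
            continue
        seen.add(key)
        kept.append((term, p_value, genes))
    return kept


# Hypothetical parsed rows, for illustration only.
rows = [
    ("term_1", 0.001, {"GENE_A", "GENE_B", "GENE_C"}),
    ("term_2", 0.001, {"GENE_C", "GENE_B", "GENE_A"}),  # redundant with term_1
    ("term_3", 0.030, {"GENE_A", "GENE_D"}),
    ("term_4", 0.200, {"GENE_E"}),                      # not significant
]

for term, p_value, genes in filter_categories(rows):
    print(term, p_value, sorted(genes))
```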

The BioVenn online comparison tool (87) was used to generate the Venn diagrams.

The gene ontology hierarchy flow charts were designed using the European

Bioinformatics Institute website www.ebi.ac.uk/QuickGO/ (18).

Camptothecin (Topoisomerase 1 inhibitor) treatment

QT6 cells were transfected with Gag.L219A.YFP for 16 hours. Media was changed and the cells were allowed to recover for 4 hours. Cells were then treated with

1 μM Camptothecin (VWR; gift from Dr. Moldovan, PSU College of Medicine) or DMSO for 2 hours then fixed with 3.7% paraformaldehyde in PHEM buffer for 15 minutes. Then the cells were permeabilized with 0.5% Triton X-100 for 10 minutes. The cells were blocked in 0.2% BSA in PBS for 1.5 hours at 37°C. Cells were then stained with Ms α Histone H2A.X (Millipore 05-636) at 1:500 in blocking buffer overnight at 4°C, followed by Dk α Mouse Alexa Fluor 647 (Invitrogen) at 1:1000 in blocking buffer for 1 hour at room temperature. Cells were then DAPI stained at 5 μg/ml for 30 seconds. The coverslips were mounted using ProLong Diamond Antifade Mountant.

Imaging

Cells were imaged using the Leica SP8 TCS scanning confocal microscope equipped with a White Light Laser (WLL) using a 63X oil immersion objective.

Sequential scanning between frames was used to average four frames for each image.

DAPI was excited with the 405 nm UV laser at 10% laser power and emission detection window 410 – 451 nm using a PMT detector. YFP was imaged using the WLL with a laser line excitation of 514 nm and a hybrid detector window of 519–542 nm. α-647 was imaged using the WLL excited with the 652 nm laser line and a hybrid detector window of 658 – 747 nm. All channels using the hybrid detectors had a time gating of 0.3 to 6.0 ns.

Gag.L219A localization with endogenous proteins

QT6 cells were transfected with Gag.L219A.GFP for 16 hours then stained for the endogenous proteins with the following conditions:

MAGOH: Cells were fixed in cold methanol for 10 minutes at -20°C then permeabilized in cold acetone for 10 minutes at -20°C. The cells were blocked in 10% goat serum,

300mM glycine, and 3% BSA for 1 hour. Then stained using Rb α MAGOH antibody

(Abcam ab38768) at 1:200 dilution in blocking buffer for 1 hour.

SKP1: Cells were fixed in 3.7% paraformaldehyde in PHEM for 15 minutes then permeabilized in cold methanol for 10 minutes at -20°C. Cells were blocked in 1% BSA for 1 hour. Then stained using α SKP1 antibody (Abcam ab119912) at 1:250 dilution in blocking buffer for 1 hour.

TOP1: Cells were fixed in 3.7% paraformaldehyde in PHEM for 15 minutes then permeabilized in cold methanol for 10 minutes at -20°C. The cells were blocked in 10% goat serum, 300mM glycine, and 3% BSA for 1 hour. Then stained using α TOP1 antibody (Abcam ab3825) at 1:100 dilution in blocking buffer for 1 hour.

All cells were then stained with Dk α Rabbit Alexa Fluor 647 at 1:1000 in blocking buffer for 1 hour. Then DAPI stained at 5 μg/ml for 30 seconds. The coverslips were mounted using ProLong Diamond Antifade Mountant.

Imaging

Cells were imaged using the Leica SP8 TCS scanning confocal microscope equipped with a White Light Laser (WLL) using a 63X oil immersion objective.

Sequential scanning between frames was used to average four frames for each image.

DAPI was excited with the 405 nm UV laser at 10% laser power and emission detection window 410 – 451 nm using a PMT detector. GFP was imaged using the WLL with a laser line excitation of 489 nm and a hybrid detector window of 494–564 nm. Dk α Rabbit

Alexa Fluor 647 was imaged using the WLL excited with the 650 nm laser line and a hybrid detector window of 655 – 751 nm. All channels using the hybrid detectors had a time gating of 0.3 to 6.0 ns.

siRNA and Virus Particle Analysis

siRNAs

Sense (s) and antisense (a) siRNAs were designed against the 3’ end of SF2

(SRSF1), SC35 (SRSF2), SKP1, MAGOH, and TOP1 by Sigma Aldrich. The siRNA sequences (5’ to 3’) for each of the proteins are listed. SF2: s –

GUAUCAGAAUGAAAGGGAA, a – UUCCCUUUCAUUCUGAUAC. SC35: s –

GUAAUGCUGGUUAGCUGUU, a – AACAGCUAACCAGCAUUAC. SKP1: s –

GAAUCCAGGUUUGACCAAA, a – UUUGGUCAAACCUGGAUUC. MAGOH: s –

UUCUUCAGGCCCGAAGCUU, a – AAGCUUCGGGCCUGAAGAA. And two sets of

TOP1: (201) s – GUCCUAUGGACAACUUAUU, a – AAUAAGUUGUCCAUAGGAC and

(240) s – AAUUCUAGCUGUCUGGUUU, a – AAACCAGACAGCUAGAAUU.
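Each antisense strand listed above is the reverse complement of its sense strand. A short consistency check over the sequences as given in the text is sketched below. The dictionary layout and function name are illustrative choices.

```python
# RNA base-pairing table used to build the complement strand.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence written 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]


# (sense, antisense) pairs copied from the list above, all written 5' to 3'.
SIRNAS = {
    "SF2": ("GUAUCAGAAUGAAAGGGAA", "UUCCCUUUCAUUCUGAUAC"),
    "SC35": ("GUAAUGCUGGUUAGCUGUU", "AACAGCUAACCAGCAUUAC"),
    "SKP1": ("GAAUCCAGGUUUGACCAAA", "UUUGGUCAAACCUGGAUUC"),
    "MAGOH": ("UUCUUCAGGCCCGAAGCUU", "AAGCUUCGGGCCUGAAGAA"),
    "TOP1 (201)": ("GUCCUAUGGACAACUUAUU", "AAUAAGUUGUCCAUAGGAC"),
    "TOP1 (240)": ("AAUUCUAGCUGUCUGGUUU", "AAACCAGACAGCUAGAAUU"),
}

# True for a target when the listed antisense matches the computed one.
checks = {
    name: reverse_complement(sense) == antisense
    for name, (sense, antisense) in SIRNAS.items()
}

for name, ok in checks.items():
    print(f"{name}: {'consistent' if ok else 'MISMATCH'}")
```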

Viral Particle and Lysate Collection and Western Blot Analysis

QT6 cells were transfected with the first round of siRNAs. Eight hours later, the second round of siRNAs was transfected along with pRC.V8. After 16 hours the media was changed and particles were collected for 24 hours. Following the collection period, the media was removed, filtered through a 0.2 μm filter, and particles were pelleted through a 25% sucrose cushion at 55,000 x g for 1 hour at 4°C. Virus pellets were resuspended in 4X SDS-PAGE sample buffer. Cells were lysed using RIPA buffer (50 mM Tris pH 7.5, 150 mM NaCl, 1% NP40, 0.5% deoxycholate, 0.1% SDS). 50 μg of the lysates were heated to 90°C in 4X SDS-PAGE sample buffer for 10 minutes prior to loading on a 10% SDS-PAGE gel and analyzed by Western blot using antibodies against RSV Gag (210), gamma tubulin (Abcam ab11316), and HRP-conjugated secondary antibodies (Invitrogen).

Figure 3.1: Schematic of protein expression constructs. A) The wildtype RSV Gag polyprotein is shown at the top and is tagged at the N-terminus of the MA domain with six histidine residues for purification using nickel resin. The ΔNC mutant of RSV Gag is also

N-terminally tagged with histidine, as well as RSV Integrase. The HIV Gag construct however was C-terminally tagged with six histidine residues. These four constructs were used for the protein purification followed by mass spectrometry experiments. B) Wildtype

RSV Gag is C-terminally tagged with a tandem strep tag. The Gag.L219A mutant, which contains a mutation in the NES of p10 (highlighted residue) also contains a C-terminal tandem strep tag, as well as the mCherry protein. These constructs were used for the ex vivo strep-tactin purification followed by mass spectrometry experiment. C) Untagged wildtype RSV Gag and Gag.L219A are shown, as well as the Gag.ΔNC mutant. These were used for the subcellular fractionation experiments. Fluorophore tagged derivatives of wildtype RSV Gag and Gag.L219A were also used in imaging experiments. The HIV

Gag.CFP and Rev.YFP were used for the subcellular fractionation studies, as well as a form of HIV Gag that is Rev independent through specific codon mutations.

3.4 Results

3.4.1 Subcellular localization

To determine biochemically to which nuclear subcompartments Gag is localized, subcellular fractionations were performed. The RSV Gag constructs in Figure 3.1C were transfected into QT6 cells. Cells were fractionated into cytoplasm, nucleoplasm, and two chromatin-associated protein fractions (Figure 3.2A). The first chromatin fraction is extracted using an NaCl concentration of 150 mM, which removes proteins that are loosely associated with chromatin. For the second chromatin fraction, 500 mM NaCl is used along with detergent to remove proteins that are more tightly bound to chromatin.

It can be seen that when RSV Gag is expressed in cells, it is present in the cytoplasm and the nucleoplasm as expected from previously published imaging work (96, 172).

Interestingly, Gag was also present in fractions that contain chromatin associated proteins. When the nuclear-restricted mutant Gag.L219A was used, there is an increase of Gag.L219A signal in the chromatin fractions compared to the wildtype Gag signal.

When the mutant that has the NC domain of Gag deleted, Gag.ΔNC, was expressed, there was more Gag present in the cytoplasm and approximately the same amount in the nucleoplasm compared to wildtype (41% vs 65.9% and 38.2% vs 33.6%, respectively). However, there was no detectable amount of Gag.ΔNC in either of the chromatin fractions. Cells were again transfected with RSV Gag and Gag.L219A, and fractionated using a protocol that was designed to extract chromatin-associated proteins for chromatin immunoprecipitations using two different NaCl concentrations (77). This particular fractionation protocol does not allow for the cytoplasm and the nucleoplasm to be separate fractions due to how the cellular membranes are lysed. Analysis of the fractions showed that Gag and

Gag.L219A are in fact isolated from chromatin fractions (Figure 3.2B).

Figure 3.2: Subcellular fractionations. A) A fractionation protocol was used that separates the cytoplasm and nucleoplasm from the chromatin, which is further fractionated using different NaCl concentrations (34). The chromatin 150 mM fraction contains proteins that are more easily extracted from chromatin and are generally associated with open chromatin. The chromatin 500 mM fraction contains proteins that are more tightly bound to chromatin. Wildtype RSV Gag and Gag.L219A are detected in all of the fractions. Gag.ΔNC is mostly detected in the cytoplasm and a little in the nucleoplasm. To the right are the signal averages calculated for each fraction for each

Gag construct. This experiment was replicated over four times and the standard error was calculated for the densitometry values. B) A protocol that harshly lyses cells to extract out chromatin was used (77). With the conditions used to lyse the plasma and nuclear membranes, there are not distinct fractions for the cytoplasm and nucleoplasm.

Proteins were extracted from chromatin using 150 mM NaCl and 600 mM NaCl. Under these conditions both wildtype RSV Gag and Gag.L219A are still present in the two chromatin fractions. This experiment was replicated three times. C) The protocol for (A) was used on cells expressing either a Rev-independent form of HIV Gag, or a wildtype

Rev dependent HIV Gag with and without Rev. Both HIV Gags are found in all four protein fractions. This experiment was replicated three times.

To check for the purity of the fractions, multiple cellular proteins were detected using antibodies. Calnexin is a cytoplasmic protein, and GAPDH is mostly cytoplasmic. Crm1 is a nuclear export protein and is both cytoplasmic and nucleoplasmic. Med4 is in the nucleus and bound to chromatin through transcription machinery. RCC1 and Histone

H2B are chromatin-associated proteins.

To determine whether HIV Gag behaves in a similar fashion and can be isolated from chromatin fractions, the fractionation was repeated using a mutant of HIV Gag whose expression is no longer dependent on Rev (101), as well as the wildtype Rev-dependent HIV Gag. With both HIV Gag constructs, Gag was present in the cytoplasm fraction, as expected, but also in the nucleoplasm and both chromatin fractions (Figure 3.2C).

Due to the presence of RSV Gag in chromatin-associated protein fractions, and that both MLV p12 and FV Gag bind to mitotic chromatin, it was next determined whether RSV Gag could also bind to mitotic chromatin (Figure 3.3). Cells that were transfected with Gag.L219A were serum starved for 16 hours to synchronize the cells.

Normal media was added to the cells, and the cells were fixed 7 hours later. It can be seen that in the first row, using a YFP only control, the YFP is diffuse throughout the cell and lacking in the regions of the chromatin. When using the Histone H2B.GFP control, there is overlap of the H2B signal with chromatin during mitosis, as expected. However, when RSV Gag is examined in mitotic cells, Gag, as well as Gag.L219A, adopts a diffuse localization devoid of the chromatin similar to the YFP only control. This demonstrates that unlike

PFV or MLV p12, RSV Gag does not bind to mitotic chromatin.

To examine RSV Gag.L219A localization with respect to chromatin in interphase cells, 3D renderings of confocal z slices were constructed. Figure 3.4 shows that

Gag.L219A (in red) appears to localize to the interchromatin space of the nucleus during interphase instead of interacting directly with chromatin (DAPI in blue). This can further be seen when the front half of the cell is cut away (Figure 3.4 right).

Figure 3.3: RSV Gag localization during mitosis. Cells transfected with either YFP.N2, H2B.GFP, RSV Gag.YFP, or Gag.L219A.YFP were serum starved for 16 hours, then allowed to recover for 7 hours. They were then fixed and imaged. In the top row, cells were transfected with YFP.N2, and the protein is distributed throughout the cell except where the chromatin is located. In the second row, cells were transfected with histone H2B.GFP, and the protein localizes to the chromatin as expected. In the bottom two rows, cells were transfected with wildtype RSV Gag or Gag.L219A, and both Gag proteins adopt a distribution similar to that of YFP. From three separate experiments, 27 Gag.L219A cells were imaged. Only three Gag cells were imaged. For H2B.GFP, 5 cells were imaged, and 7 cells were imaged for the YFP-only control.


Figure 3.4: Gag.L219A localization in the nucleus. Cells were transfected with Gag.L219A, fixed, then imaged. Confocal z slices were taken throughout the nucleus of the cell and rendered in three dimensions. Surfaces were then applied to the fluorescence signal from Gag.L219A (red) and DAPI (blue) using Imaris. The left image shows an intact nucleus, and the right image shows the same nucleus with the front half removed to reveal its interior.


3.4.2 Affinity purification and proteomic analysis

Previously, it has been shown that the nuclear foci of RSV Gag.L219A exhibit obstructed diffusion characteristics (Chapter 2/Appendix A (163)). This suggests that the foci are tethered to either a protein or an RNA. To determine which proteins Gag could be interacting with in the nucleus, possibly as a tether, affinity-tagged purifications were performed followed by mass spectrometry.

An affinity-tagged purification/mass spectrometry experiment was performed using histidine tagged wildtype RSV Gag, RSV Gag.ΔNC, and RSV integrase purified proteins (Figure 3.1), along with nuclear lysates from DF1 (chicken) cells. Initially, this experiment was performed to determine what factors the NC domain could be interacting with in the nucleus. The purified proteins were incubated with nuclear lysates and then the histidine tagged proteins were purified along with any bound cellular partners, using nickel beads. The eluates were then subjected to mass spectrometry analysis.

Originally, the plan was to subtract the proteins identified in the Gag.ΔNC purification from those identified in the wildtype Gag purification, in hopes of identifying proteins utilized by the NC domain. Proteins that were also identified in the integrase purification would be subtracted from the final list as well to remove any possible nonspecific binding partners. Ultimately, it was decided that analyzing the data in this way would subtract out potentially important proteins that Gag could utilize for viral replication.

It was then decided to examine the proteins that were identified in the Gag and Gag.ΔNC purifications, and to ignore the proteins identified only in the integrase purification. The peptides identified produced a total of 723 proteins that met the stringency criteria: proteins that met the 90.0% peptide and protein identification threshold and contained at least 1 peptide. Proteins labeled as common laboratory contaminants or reversed decoys were removed. In the Gag.ΔNC purification, 273 unique proteins were identified, and 122 were identified in the Gag purification.
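The stringency filter described above can be illustrated with a minimal sketch. It assumes each identification is exported as a record carrying protein and peptide probabilities, a peptide count, and contaminant/decoy flags; the field names, the helper functions, and the toy accessions are hypothetical and do not come from the actual analysis software or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Identification:
    """One protein identification; all field names are hypothetical."""
    accession: str
    sample: str            # e.g. "Gag" or "Gag.dNC"
    protein_prob: float    # protein identification probability (0-1)
    peptide_prob: float    # peptide identification probability (0-1)
    n_peptides: int
    contaminant: bool = False
    decoy: bool = False


def passes_criteria(hit, min_prob=0.90, min_peptides=1):
    """Apply the thresholds stated in the text and drop flagged entries."""
    return (
        hit.protein_prob >= min_prob
        and hit.peptide_prob >= min_prob
        and hit.n_peptides >= min_peptides
        and not hit.contaminant
        and not hit.decoy
    )


def unique_to_each(hits, sample_a, sample_b):
    """Return accessions found only in sample_a, only in sample_b, and in both."""
    kept = [h for h in hits if passes_criteria(h)]
    a = {h.accession for h in kept if h.sample == sample_a}
    b = {h.accession for h in kept if h.sample == sample_b}
    return a - b, b - a, a & b


# Toy data for illustration only; these are not results from the experiment.
toy_hits = [
    Identification("P1", "Gag", 0.99, 0.95, 3),
    Identification("P1", "Gag.dNC", 0.97, 0.93, 2),
    Identification("P2", "Gag", 0.95, 0.91, 1),
    Identification("P3", "Gag.dNC", 0.92, 0.90, 1),
    Identification("P4", "Gag", 0.80, 0.95, 4),                  # below threshold
    Identification("P5", "Gag.dNC", 0.99, 0.99, 5, decoy=True),  # reversed decoy
]
only_gag, only_dnc, shared = unique_to_each(toy_hits, "Gag", "Gag.dNC")
print(sorted(only_gag), sorted(only_dnc), sorted(shared))
```

Keeping the thresholds as function parameters makes it straightforward to check how sensitive the final protein count is to the chosen stringency.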


Figure 3.5: Biological processes enriched from the wildtype purified RSV Gag and Gag.ΔNC affinity tagged purifications. Proteins identified from the mass spectrometry experiments were analyzed using the DAVID analysis software that categorized the proteins according to biological functions. A) Shown are the top 10 GO terms for the biological processes identified, as well as the number of proteins in each category (in parentheses) and the p-value associated with each category showing how over-enriched each category is from the sample set. B) The nuclear proteins from the mass spectrometry were first isolated from the lists and then analyzed as above.


Figure 3.6: Gene Ontology (GO) hierarchy of biological processes identified from the wildtype purified RSV Gag and Gag.ΔNC affinity tagged purifications. The top 10 biological processes identified in Figure 3.5 are displayed here in their GO hierarchy. Boxes that are colored yellow indicate they are the processes shown in Figure 3.5A and the ones colored blue are the top 10 nuclear biological processes shown in Figure 3.5B. Green boxes indicate they are present in both top 10 lists. This hierarchy displays the relationships between the identified GO terms.

The black arrows indicate the direction of the ‘is_a’ child-parent relationship. The blue arrows indicate the direction of the ‘part_of’ child-parent relationship. The hierarchy is not all inclusive and is not showing non-relevant terms that fall between the displayed terms, as indicated by the red-dashed arrows.


The final list of proteins (combined list of Gag and Gag.ΔNC) was then analyzed through the online program DAVID Bioinformatics Resources v6.8. The DAVID analysis software analyzes gene lists generated by high-throughput genomic, proteomic, and bioinformatics experiments (85, 86). Through these analyses, enriched annotation terms can be identified in the list of proteins identified by the mass spectrometry. Figure 3.5A (Table 3.1/Appendix B) shows the top 10 gene ontology (GO) terms for biological functions as a pie graph, with the number of proteins identified under each term in parentheses. The p-value, which assesses the significance of the GO term enrichment using a modified Fisher’s exact test, is also stated. Next, the proteins were separated using DAVID to identify the proteins found in the nucleus. The nuclear proteins were then analyzed for their biological functions, and the top 10 are shown in Figure 3.5B (Table 3.2/Appendix B). From Figure 3.5, some of the top biological function categories of particular interest included RNA processing, gene expression, and RNA metabolism.
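As an illustration of the modified Fisher’s exact test mentioned above, the following sketch computes a one-sided Fisher’s exact p-value and a conservative variant in which one protein is removed from the list hits before testing, which is how the EASE score used by DAVID is commonly described. The function names and the example counts are hypothetical, the protein list is assumed to be a subset of the background population, and this is not a reimplementation of DAVID itself.

```python
from fractions import Fraction
from math import comb


def fisher_right_tail(a, b, c, d):
    """One-sided Fisher's exact p-value, P(X >= a), for the table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    total = comb(n, row1)
    tail = sum(
        comb(col1, x) * comb(n - col1, row1 - x)
        for x in range(a, min(row1, col1) + 1)
    )
    return float(Fraction(tail, total))


def enrichment_table(list_hits, list_total, pop_hits, pop_total):
    """Build the 2x2 table for one GO term (list assumed to be part of the population)."""
    a = list_hits                    # in list, annotated with the term
    b = list_total - list_hits       # in list, not annotated
    c = pop_hits - list_hits         # outside list, annotated
    d = pop_total - list_total - c   # outside list, not annotated
    return a, b, c, d


def fisher_p(list_hits, list_total, pop_hits, pop_total):
    return fisher_right_tail(*enrichment_table(list_hits, list_total, pop_hits, pop_total))


def ease_like_p(list_hits, list_total, pop_hits, pop_total):
    """Conservative variant: subtract one from the list hits before testing."""
    a, b, c, d = enrichment_table(list_hits, list_total, pop_hits, pop_total)
    if a == 0:
        return 1.0
    return fisher_right_tail(a - 1, b, c, d)


# Hypothetical counts: 3 of 300 listed proteins carry a term that annotates
# 40 of 30,000 background proteins.
print(fisher_p(3, 300, 40, 30000), ease_like_p(3, 300, 40, 30000))
```

Removing one hit penalizes terms supported by very few proteins, so a category with only a handful of members needs stronger evidence to appear enriched.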

The gene ontology (GO) defines the concepts relating to gene functions (‘GO terms’) and how these functions are related to each other (‘relations’), which together form acyclic graphs, or hierarchies, for macromolecules identified in -omics experiments. GO describes function with respect to three aspects: molecular function, cellular component, and biological process (5, 199). Figure 3.6 shows the relationship between the GO terms presented in Figure 3.5. As is shown, most of the GO terms have a parent-child relationship, demonstrating a connection between the proteins identified in the mass spectrometry.


Figure 3.7: Overlapping proteins found in the purified RSV and HIV-1 Gag mass spectrometry. Venn diagrams displaying the number of unique proteins identified and the number of overlapping proteins for: A) the RSV Gag purification, comparing proteins identified using DF1 and HeLa nuclear lysates, B) the HIV Gag purification, comparing proteins identified using HeLa and DF1 nuclear lysates, C) the RSV Gag purification with the DF1 nuclear lysate compared to the HIV Gag purification with the HeLa nuclear lysate. The overlapping proteins for each comparison are listed on the right.


To determine whether there is a difference in nuclear factors identified between cell types, the prior experiment was repeated using DF1 and HeLa nuclear extracts, for both RSV Gag and HIV Gag. Both Gags were utilized to also determine whether there are any overlapping proteins used by both viruses. To get the final list of proteins for each condition, a beads-only purification was performed using DF1 lysates and HeLa lysates incubated with nickel beads. Proteins identified from these two purifications were removed from the RSV and HIV affinity-tagged purifications using the same lysates. Proteins that had an unused score, as defined by Protein Pilot, of less than 1.3 were removed. Common contaminants were removed from the final list as well.
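The background subtraction described above can be sketched as follows. The sketch assumes that each purification is available as a mapping from accession to unused score, and it relies on the usual relationship between a ProtScore and percent confidence, in which a score of 1.3 corresponds to roughly 95%. The function names and toy accessions are hypothetical.

```python
def unused_score_to_confidence(score):
    """Convert a ProtScore-style value to a confidence (assumed: score = -log10(1 - confidence))."""
    return 1.0 - 10.0 ** (-score)


def subtract_background(sample_scores, beads_only, contaminants=frozenset(), min_unused=1.3):
    """Keep proteins at or above the score cutoff that are absent from the
    beads-only control and from the contaminant list."""
    return {
        accession
        for accession, score in sample_scores.items()
        if score >= min_unused
        and accession not in beads_only
        and accession not in contaminants
    }


# Toy data for illustration only; these are not results from the experiment.
gag_with_lysate = {"A": 4.2, "B": 1.1, "C": 2.0, "D": 6.5, "E": 3.3}
beads_only_control = {"C"}
common_contaminants = {"E"}

final_list = subtract_background(gag_with_lysate, beads_only_control, common_contaminants)
print(sorted(final_list), round(unused_score_to_confidence(1.3), 3))
```

Because the control is matched by lysate, the same function would be called once per condition with the beads-only list obtained from the corresponding DF1 or HeLa lysate.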

When comparing the proteins identified from the DF1 lysates and HeLa lysates for the RSV Gag purifications, 19 proteins, which function in vesicle formation and transport and cell component assembly, overlapped out of 112 and 350 total proteins, respectively (Figure 3.7A). When examining the proteins identified from the DF1 and HeLa lysates from the HIV Gag purifications, 14 proteins involved in splicing, transcription termination, and nuclear export overlapped out of 116 and 284 total proteins, respectively (Figure 3.7B). Looking at the biologically relevant cell species with their respective Gag (DF1 and RSV, HeLa and HIV), 10 proteins involved in organelle organization and RNA transport overlapped out of 112 total proteins from the RSV/DF1 purification and 284 total proteins from the HIV/HeLa purification (Figure 3.7C). Figure 3.8A (Table 3.3/Appendix B) shows the top 10 biological process GO terms obtained by running the DAVID analysis software on the list of proteins from the RSV Gag purification using DF1 nuclear lysates.
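The overlap counts reported for each Venn diagram amount to a set intersection between two protein lists. The sketch below assumes that proteins from chicken and human lysates have already been mapped to a shared identifier such as a gene symbol; the normalization step, the function names, and the toy symbols are hypothetical.

```python
def normalize(symbol):
    """Hypothetical normalization so the same gene symbol matches across lists."""
    return symbol.strip().upper()


def venn_counts(list_a, list_b):
    """Return the sizes of the two lists and the proteins they share."""
    a = {normalize(s) for s in list_a}
    b = {normalize(s) for s in list_b}
    shared = a & b
    return {
        "total_a": len(a),
        "total_b": len(b),
        "only_a": len(a - shared),
        "only_b": len(b - shared),
        "shared": sorted(shared),
    }


# Toy symbols for illustration only; these are not results from the experiment.
df1_list = ["gene1", "GENE2", "Gene3", "GENE4"]
hela_list = ["GENE2", "GENE3", "GENE5"]
print(venn_counts(df1_list, hela_list))
```

In a cross-species comparison the identifier mapping is the step most likely to change the overlap, so it is worth recording which mapping was used alongside the counts.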

To increase the number of nuclear-related biological processes, only nuclear proteins, as determined by DAVID, were then analyzed, and Figure 3.8B (Table 3.4/Appendix B) shows the top 10 biological processes. In this analysis, the 2nd and 3rd top GO terms were gene expression and RNA metabolic process, respectively; the top term was cellular nitrogen compound metabolic process. Figure 3.9 shows the relationships between the biological processes identified for Figure 3.8. When the experiment was repeated, this time using HeLa nuclear lysates, the GO terms RNA processing and gene expression remained in the top 20 identified biological processes (Figure 3.10A [Table 3.5/Appendix B]). When only the nuclear proteins were examined, the top 10 biological processes included various metabolic processes and RNA processing (Figure 3.10B [Table 3.6/Appendix B]). Figure 3.11 shows the relationships between the biological processes identified for Figure 3.10.


Figure 3.8: Biological processes enriched from the purified RSV Gag using DF1 nuclear lysates. Proteins identified from the mass spectrometry experiments were analyzed using the DAVID analysis software that categorized the proteins according to biological functions. A) Shown are the top 10 GO terms for the biological processes identified, as well as the number of proteins in each category (in parentheses) and the p-value associated with each category showing how over-enriched each category is from the sample set. B) The nuclear proteins from the mass spectrometry were first isolated from the lists and then analyzed as above.


Figure 3.9: Gene Ontology (GO) hierarchy of biological processes identified from the purified RSV Gag using DF1 nuclear lysates. The top 10 biological processes identified in Figure 3.8 are displayed here in their GO hierarchy. Boxes that are colored yellow indicate they are the processes shown in Figure 3.8A and the ones colored blue are the top 10 nuclear biological processes shown in Figure 3.8B. Green boxes indicate they are present in both top 10 lists. This hierarchy displays the relationships between the identified GO terms.

The black arrows indicate the direction of the ‘is_a’ child-parent relationship. The hierarchy is not all inclusive and is not showing non-relevant terms that fall between the displayed terms, as indicated by the red-dashed arrows.


Figure 3.10: Biological processes enriched from the purified RSV Gag using HeLa nuclear lysates. Proteins identified from the mass spectrometry experiments were analyzed using the DAVID analysis software that categorized the proteins according to biological functions. A) Shown are the top 10 GO terms for the biological processes identified, as well as the number of proteins in each category (in parentheses) and the p-value associated with each category showing how over-enriched each category is from the sample set. B) The nuclear proteins from the mass spectrometry were first isolated from the lists and then analyzed as above.


Figure 3.11: Gene Ontology (GO) hierarchy of biological processes identified from the purified RSV Gag using HeLa nuclear lysates. The top 10 biological processes identified in Figure 3.10 are displayed here in their GO hierarchy. Boxes that are colored yellow indicate they are the processes shown in Figure 3.10A and the ones colored blue are the top 10 nuclear biological processes shown in Figure 3.10B. Green boxes indicate they are present in both top 10 lists. This hierarchy displays the relationships between the identified GO terms.

The black arrows indicate the direction of the ‘is_a’ child-parent relationship. The blue arrows indicate the direction of the ‘part_of’ child-parent relationship. The hierarchy is not all inclusive and is not showing non-relevant terms that fall between the displayed terms, as indicated by the red-dashed arrows.


Next, the potential binding partners of HIV Gag were examined. When HeLa nuclear lysates were used for the Gag affinity-tagged purifications, RNA processing and gene expression were among the top 10 GO biological process terms identified after DAVID analysis (Figure 3.12A [Table 3.7/Appendix B]). Also within the top 10, RNA splicing and RNA localization were enriched in the data set. Looking at only nuclear proteins, the top 10 GO terms again include gene expression and RNA processing (Figure 3.12B [Table 3.8/Appendix B]). Figure 3.13 shows the relationships between the GO terms identified in Figure 3.12. When HIV Gag protein was incubated with DF1 nuclear lysate, the top 10 GO terms were those shown in Figure 3.14A (Table 3.9/Appendix B). The list includes different processes involved in cellular organization, metabolism, biogenesis, gene expression, and cellular localization. Figure 3.14B (Table 3.10/Appendix B) shows the top 10 GO terms for the nuclear proteins from this data set. The processes present on this list include gene expression and various RNA processing functions. Figure 3.15 demonstrates the relationships between the various GO terms from Figure 3.14.


Figure 3.12: Biological processes enriched from the purified HIV Gag using HeLa nuclear lysates. Proteins identified from the mass spectrometry experiments were analyzed using the DAVID analysis software that categorized the proteins according to biological functions. A) Shown are the top 10 GO terms for the biological processes identified, as well as the number of proteins in each category (in parentheses) and the p-value associated with each category showing how over-enriched each category is from the sample set. B) The nuclear proteins from the mass spectrometry were first isolated from the lists and then analyzed as above.


Figure 3.13: Gene Ontology (GO) hierarchy of biological processes identified from the purified HIV Gag using HeLa nuclear lysates. The top 10 biological processes identified in Figure 3.12 are displayed here in their GO hierarchy. Boxes that are colored yellow indicate they are the processes shown in Figure 3.12A and the ones colored blue are the top 10 nuclear biological processes shown in Figure 3.12B. Green boxes indicate they are present in both top 10 lists. This hierarchy displays the relationships between the identified GO terms.

The black arrows indicate the direction of the ‘is_a’ child-parent relationship. The blue arrows indicate the direction of the ‘part_of’ child-parent relationship. The hierarchy is not all inclusive and is not showing non-relevant terms that fall between the displayed terms, as indicated by the red-dashed arrows.


Figure 3.14: Biological processes enriched from the purified HIV Gag using DF1 nuclear lysates. Proteins identified from the mass spectrometry experiments were analyzed using the DAVID analysis software that categorized the proteins according to biological functions. A) Shown are the top 10 GO terms for the biological processes identified, as well as the number of proteins in each category (in parentheses) and the p-value associated with each category showing how over-enriched each category is from the sample set. B) The nuclear proteins from the mass spectrometry were first isolated from the lists and then analyzed as above.


Figure 3.15: Gene Ontology (GO) hierarchy of biological processes identified from the purified HIV Gag using DF1 nuclear lysates. The top 10 biological processes identified in Figure 3.14 are displayed here in their GO hierarchy. Boxes that are colored yellow indicate they are the processes shown in Figure 3.14A and the ones colored blue are the top 10 nuclear biological processes shown in Figure 3.14B. Green boxes indicate they are present in both top 10 lists. This hierarchy displays the relationships between the identified GO terms.

The black arrows indicate the direction of the ‘is_a’ child-parent relationship. The hierarchy is not all inclusive and is not showing non-relevant terms that fall between the displayed terms, as indicated by the red-dashed arrows.


To examine possible interaction partners of RSV Gag ex vivo, Gag was tagged with a tandem strep tag and transfected into QT6 cells. Gag.L219A was utilized as well because nuclear trafficking of wildtype Gag is transient, whereas the Gag.L219A mutant is trapped within the nucleus, thereby increasing the chances of nuclear protein interaction. As a control for the strep-tag purifications, the tandem strep tag was added to the C terminus of the mCherry fluorescent protein. mCherry was chosen because it does traffic into the nucleus, and it was used as a nonspecific binding control. Cells were harvested 48 hours after transfection and fractionated to extract the nucleoplasm. The nuclear extracts were subjected to strep-tactin purifications, and the proteins eluted from the strep beads were trypsin digested. To quantitatively identify possible nuclear binding partners of RSV Gag, the samples were labelled via iTRAQ Multiplex labelling. After trypsin digestion, each sample is incubated with a different iTRAQ label that covalently bonds to the N-terminus and side chain amines of the peptides. The samples were mixed together and then analyzed via mass spectrometry, which allows the amount of each identified peptide to be compared quantitatively between samples. The final protein list was obtained by removing proteins that had an unused score below 1.3, according to ProteinPilot guidelines (>95% confidence), were labelled as a contaminant or reversed protein, or had a Local False Discovery Rate higher than 5%. Final proteins were also at least 2-fold more enriched in either the Gag or Gag.L219A samples over the mCherry control. This threshold retains proteins that Gag or Gag.L219A could be interacting with, even if they also interact nonspecifically with the mCherry protein. The proteins that remained for the Gag pulldown were further analyzed using the DAVID program, and the top 10 GO biological process terms are listed in Figure 3.16A (Table 3.11/Appendix B). A majority of the processes are involved in mitochondrial functions, apoptosis, and other non-nuclear related processes. To try to concentrate nuclear functions, the protein list was analyzed by DAVID to find nuclear proteins, but there were no proteins classified as nuclear. It appears that either there was not enough Gag in the nucleus to obtain binding proteins, or the interactions are so transient that they were destroyed during the fractionation and/or purification procedure. Figure 3.17 shows the relationships between the GO terms identified for the Gag-strep purifications shown in Figure 3.16A. When the Gag.L219A protein list was analyzed by DAVID, the top 10 GO terms included mitochondrial and apoptosis-related processes (Figure 3.16B [Table 3.12/Appendix B]). When the nuclear proteins were examined, the top 10 GO terms were those shown in Figure 3.16C (Table 3.13/Appendix B). The top terms still included apoptotic processes and NAD/NADH metabolism. Figure 3.18 shows the various relationships between the GO terms identified in the top 10 and top 10 nuclear lists for the L219A-strep purifications. Based on the results of the DAVID analysis, it was determined that the fractionation protocol used for this experiment was not optimal because of the amount of mitochondrial protein contamination.
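The selection criteria applied to the iTRAQ data earlier in this section (unused score, contaminant or reversed status, local false discovery rate, and fold enrichment over the mCherry control) can be summarized in a short sketch. The record layout, the field names, and the toy values are hypothetical and are not taken from the ProteinPilot export.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuantifiedProtein:
    """One quantified protein; all field names are hypothetical."""
    accession: str
    unused_score: float
    local_fdr: float          # as a fraction, so 0.05 means 5%
    ratio_vs_control: float   # bait reporter signal divided by mCherry reporter signal
    flagged: bool = False     # contaminant or reversed entry


def is_candidate(p, min_unused=1.3, max_local_fdr=0.05, min_fold=2.0):
    """Apply the cutoffs stated in the text to a single protein."""
    return (
        p.unused_score >= min_unused
        and not p.flagged
        and p.local_fdr <= max_local_fdr
        and p.ratio_vs_control >= min_fold
    )


# Toy data for illustration only; these are not results from the experiment.
toy = [
    QuantifiedProtein("A", 2.0, 0.01, 3.5),
    QuantifiedProtein("B", 2.0, 0.01, 1.4),                # below 2-fold enrichment
    QuantifiedProtein("C", 1.0, 0.01, 5.0),                # unused score too low
    QuantifiedProtein("D", 2.5, 0.10, 4.0),                # local FDR too high
    QuantifiedProtein("E", 3.0, 0.01, 6.0, flagged=True),  # contaminant or reversed
    QuantifiedProtein("F", 1.3, 0.05, 2.0),                # exactly at every cutoff
]
candidates = [p.accession for p in toy if is_candidate(p)]
print(candidates)
```

Using a ratio cutoff rather than outright subtraction of control proteins is what allows a protein that binds mCherry weakly to remain on the list when it is substantially enriched with the bait.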


Figure 3.16: Biological processes enriched from the Strep-tagged purifications. Proteins identified from the mass spectrometry experiments were analyzed using the DAVID analysis software that categorized the proteins according to biological functions. A) Shown are the top 10 GO terms for the biological processes identified from the wildtype RSV Gag-Strep purifications, as well as the number of proteins in each category (in parentheses) and the p-value associated with each category showing how over-enriched each category is from the sample set. B) Shown are the top 10 GO terms for the biological processes identified from the RSV Gag.L219A-Strep, as well as the number of proteins in each category (in parentheses) and the p-value associated with each category showing how over-enriched each category is from the sample set. C) The nuclear proteins from the RSV Gag.L219A-Strep mass spectrometry were first isolated from the lists and then analyzed as above.


Figure 3.17: Gene Ontology (GO) hierarchy of biological processes identified from the Gag-strep purification. The top 10 biological processes identified in Figure 3.16 are displayed here in their GO hierarchy. Boxes that are colored yellow indicate they are the processes shown in Figure 3.16A. This hierarchy displays the relationships between the identified GO terms.

The black arrows indicate the direction of the ‘is_a’ child-parent relationship. The orange arrows indicate the direction of regulation between two GO terms. The hierarchy is not all inclusive and is not showing non-relevant terms that fall between the displayed terms, as indicated by the red-dashed arrows.


Figure 3.18: Gene Ontology (GO) hierarchy of biological processes identified from the L219A-strep purification. The top 10 biological processes identified in Figure 3.16 are displayed here in their GO hierarchy. Boxes that are colored yellow indicate they are the processes shown in Figure 3.16B and the ones colored blue are the top 10 nuclear biological processes shown in Figure 3.16C. Green boxes indicate they are present in both top 10 lists. This hierarchy displays the relationships between the identified GO terms.

The black arrows indicate the direction of the ‘is_a’ child-parent relationship. The orange arrows indicate the direction of regulation between two GO terms. The solid red arrow indicates a negative regulation between the two terms. The hierarchy is not all inclusive and is not showing non-relevant terms that fall between the displayed terms, as indicated by the red-dashed arrows.


Mass spectrometry experiments have been previously done by other laboratories using HIV-1 Gag; however, these investigators did not focus on the various nuclear proteins that were identified in these experiments. Engeland et al 2011 (51) performed five independent affinity-tagged purification experiments to identify cellular proteins that interact with HIV-1 Gag. The techniques used consisted of a tandem affinity purification (TAP) tag for a C-terminally tagged Gag, and GFP-TRAP A beads and GFP microbeads for Gags with GFP fused either internally to the MA domain or to the C-terminus of Gag. Each of these Gag constructs was transfected into 293T cells. They found 31 proteins that were identified in at least 3 of the experiments, and out of these 31 proteins, 24 are nuclear. When these nuclear proteins are analyzed by DAVID, RNA processing and gene expression are the top two biological functions listed (Figure 3.19A [Table 3.14/Appendix B]). Jäger et al 2012 (89) set out to identify host proteins that interact with all HIV-1 polyproteins, processed proteins, and accessory proteins in a systematic and quantitative way. They utilized a purification tag consisting of two strep tags and 3 flag tags at the C-terminal end of the proteins, which were then expressed in cells. Their raw data show that 1,134 unique proteins were identified from the purifications of full-length Gag and the proteolytic products of Gag (MA, CA, NC, and p6), and out of these proteins, 180 were identified as being nuclear by DAVID. Upon further examination of these nuclear proteins by DAVID, RNA processing was the top biological function term identified, followed by protein targeting functions and RNA metabolism (Figure 3.19B [Table 3.15/Appendix B]). Engeland et al 2014 (50) set out to examine the HIV-1 Gag interactome as a whole, instead of focusing on Gag partner characterization as in their previous publication, using the same techniques (51). They found that 944 proteins met their restriction criteria. Out of these 944 proteins, DAVID found 186 to be nuclear. They performed their own GO enrichment analysis using DAVID to determine the most enriched biological processes, and found nuclear processes including RNA processing, RNA splicing, and nucleosome assembly. These results are similar to the GO analysis that was performed here looking at the 186 nuclear proteins (Figure 3.19C [Table 3.16/Appendix B]). Figures 3.20-3.22 demonstrate the relationships of the top 10 nuclear GO terms identified from Engeland 2011, Jäger 2011, and Engeland 2014, respectively.

Ritchie et al 2015 (166) set out to expand on the possible protein interactors of HIV-1 Gag using a technique that utilizes the Escherichia coli biotin ligase BirA*, which tags proteins in close proximity through biotinylation. The BirA* tag coding region was inserted within the MA domain of Gag, which was then transfected into cells. They found 53 proteins after their exclusion criteria were met, and from these 53 proteins, 17 were found to be nuclear by DAVID analysis. DAVID analysis of these 17 proteins shows that the top biological categories included cell adhesion, nucleic acid metabolism, and posttranscriptional regulation of gene expression (Figure 3.23A [Table 3.17/Appendix B]). Le Sage et al 2015 (114) were also interested in identifying potentially novel host factors that interacted with HIV-1 Gag. Similar to Ritchie et al, they used the BirA* tagging system, except the tag was at the N-terminus of Gag, which was expressed in cells. They found a total of 42 proteins, of which 19 were nuclear. These 19 nuclear proteins were analyzed by DAVID, and the top hits included protein targeting, and RNA processing and metabolism (Figure 3.23B [Table 3.18/Appendix B]). Li et al 2016 (120) were interested in examining binding partners of the MA domain of HIV-1 Gag by inserting a strep tag at the C-terminus of MA and collecting MA complexes after infection. There were 97 proteins that were identified over their lysate-only control and met the exclusion criteria, and out of these proteins, 63 were nuclear. When only the nuclear proteins are further analyzed by DAVID, the top categories were ER targeting and viral transcription/gene expression (Figure 3.23C [Table 3.19/Appendix B]). When the proteins identified from each of these six publications are examined, there are 388 proteins that are present in at least two of the publications; this is from a total of 2,305 proteins identified as potential binding partners to HIV-1 Gag. Figures 3.24-3.26 demonstrate the relationships of the top 10 nuclear GO terms identified from Ritchie 2015, Le Sage 2015, and Li 2016, respectively.
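Counting the proteins that appear in at least two of the published lists, as described above, can be done by tallying each protein once per study. This is a minimal sketch with hypothetical study labels and toy identifiers; it assumes the lists have already been mapped to a common identifier.

```python
from collections import Counter


def proteins_in_at_least(lists_by_study, k=2):
    """Return proteins present in k or more studies, counting each study once."""
    counts = Counter()
    for proteins in lists_by_study.values():
        counts.update(set(proteins))  # set() guards against duplicates within a study
    return {protein for protein, n in counts.items() if n >= k}


# Toy lists for illustration only; these are not the published data.
published = {
    "study_1": ["P1", "P2", "P3", "P3"],
    "study_2": ["P2", "P4"],
    "study_3": ["P2", "P3", "P5"],
}
recurrent = proteins_in_at_least(published, k=2)
total_entries = sum(len(set(v)) for v in published.values())
unique_proteins = set().union(*published.values())
print(sorted(recurrent), total_entries, len(unique_proteins))
```

Reporting both the summed list sizes and the number of unique proteins makes it clear which denominator a recurrence count refers to.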

Next, the proteins that were identified from the other laboratories were compared to the proteins that were identified from the HIV Gag purifications using HeLa and DF1 lysates discussed earlier. Out of the 1,826 unique proteins identified from the 6 publications discussed, and 285 unique proteins identified from the combined lists of the two HIV Gag affinity purifications presented above, there was a total of 59 proteins in common (Figure 3.27A). These 59 proteins were analyzed by the DAVID online software, and the top 10 biological functions are shown in Figure 3.27B (Table 3.20/Appendix B). Some of the top hits include protein localization, gene expression, and various metabolic processes.


Figure 3.19: Nuclear biological processes enriched from the published HIV Gag purifications. Nuclear proteins identified from these mass spectrometry experiments were analyzed using the DAVID analysis software that categorized the proteins according to biological functions. Shown are the top 10 GO terms of the identified biological processes A) Proteins identified from Engeland et al., 2011 (51). B) Jäger et al., 2011 (89). C) Engeland et al., 2014 (50).


Figure 3.20: Gene Ontology (GO) hierarchy of the nuclear biological processes identified from the Engeland et al., 2011 publication. The top 10 nuclear biological processes (blue boxes) identified in Figure 3.19A are displayed here in their GO hierarchy. This hierarchy displays the relationships between the identified GO terms.

The black arrows indicate the direction of the ‘is_a’ child-parent relationship. The blue arrows indicate the direction of the ‘part_of’ child-parent relationship. The hierarchy is not all inclusive and is not showing non-relevant terms that fall between the displayed terms, as indicated by the red-dashed arrows.


Figure 3.21: Gene Ontology (GO) hierarchy of the nuclear biological processes identified from the Jäger et al., 2011 publication. The top 10 nuclear biological processes (blue boxes) identified in Figure 3.19B are displayed here in their GO hierarchy. This hierarchy displays the relationships between the identified GO terms.

The black arrows indicate the direction of the ‘is_a’ child-parent relationship. The hierarchy is not all inclusive and is not showing non-relevant terms that fall between the displayed terms, as indicated by the red-dashed arrows.


Figure 3.22: Gene Ontology (GO) hierarchy of the nuclear biological processes identified from the Engeland et al., 2014 publication. The top 10 nuclear biological processes (blue boxes) identified in Figure 3.19C are displayed here in their GO hierarchy. This hierarchy displays the relationships between the identified GO terms.

The black arrows indicate the direction of the ‘is_a’ child-parent relationship. The hierarchy is not all inclusive and is not showing non-relevant terms that fall between the displayed terms, as indicated by the red-dashed arrows.


Figure 3.23: Nuclear biological processes enriched from the published HIV Gag purifications. Nuclear proteins identified from these mass spectrometry experiments were analyzed using the DAVID analysis software that categorized the proteins according to biological functions. Shown are the top 10 GO terms of the identified biological processes A) Ritchie et al., 2015 (166). B) Le Sage et al., 2015 (114). C) Li et al., 2016 (120).


Figure 3.24: Gene Ontology (GO) hierarchy of the nuclear biological processes identified from the Ritchie et al., 2015 publication. The top 10 nuclear biological processes (blue boxes) identified in Figure 3.23A are displayed here in their GO hierarchy. This hierarchy displays the relationships between the identified GO terms.

The black arrows indicate the direction of the ‘is_a’ child-parent relationship. The hierarchy is not all inclusive and is not showing non-relevant terms that fall between the displayed terms, as indicated by the red-dashed arrows.


Figure 3.25: Gene Ontology (GO) hierarchy of the nuclear biological processes identified from the Le Sage et al., 2015 publication. The top 10 nuclear biological processes (blue boxes) identified in Figure 3.23B are displayed here in their GO hierarchy. This hierarchy displays the relationships between the identified GO terms.

The black arrows indicate the direction of the ‘is_a’ child-parent relationship. The hierarchy is not all inclusive and is not showing non-relevant terms that fall between the displayed terms, as indicated by the red-dashed arrows.


Figure 3.26: Gene Ontology (GO) hierarchy of the nuclear biological processes identified from the Li et al., 2016 publication. The top 10 nuclear biological processes (blue boxes) identified in Figure 3.23C are displayed here in their GO hierarchy. This hierarchy displays the relationships between the identified GO terms.

The black arrows indicate the direction of the ‘is_a’ child-parent relationship. The blue arrows indicate the direction of the ‘part_of’ child-parent relationship. The hierarchy is not all inclusive and is not showing non-relevant terms that fall between the displayed terms, as indicated by the red-dashed arrows.


Figure 3.27: Analysis of the common proteins identified from the HIV Gag purifications. A) Venn diagram showing the number of proteins that overlapped between the HIV Gag purifications presented in these studies and the six published HIV Gag lists discussed above. B) Top 10 biological process GO terms identified from the 59 proteins that were in common among the HIV Gag purifications.


3.4.3 Characterization of putative Gag interaction partners

When all of the protein lists from the mass spectrometry experiments presented here were compared with the published HIV-1 Gag purifications, one protein that was consistently identified was TOP1. It was identified in three of the published HIV-1 Gag purifications and in the Gag.L219A-tandemstrep tag purification, the HIV Gag purification with HeLa nuclear lysate, and the first purified RSV Gag purification discussed above. TOP1 became an interesting target due to the work demonstrating the importance of active transcription for RSV Gag nuclear foci stabilization (Maldonado R. et al., in progress). TOP1, or topoisomerase I, functions to relax DNA supercoiling by nicking a strand of DNA and then ligating the pieces together once the tension is eased. This process is crucial for DNA replication, transcription, recombination, and repair. To determine whether TOP1 function plays a role in RSV Gag.L219A foci formation or stabilization, cells were treated with the drug camptothecin (CPT), which targets TOP1 within minutes of treatment. CPT works by reversibly binding to the cleavage complexes that TOP1 creates when it nicks DNA, preventing religation of the DNA (158). Figure 3.28C demonstrates that treating cells expressing Gag.L219A with CPT did not appear to affect the presence of Gag.L219A foci. Cells were stained with α-Histone H2A.X to mark the cells that were experiencing DNA damage due to CPT treatment (105). Furthermore, when Gag.L219A foci were examined for colocalization with endogenous TOP1, there did not appear to be any colocalization (Figure 3.29, top row). These images are 2-dimensional pictures of the cells; when the cells are examined in 3 dimensions, any Gag.L219A foci that appear to be colocalized with TOP1 foci in 2D lose any overlap they appeared to have (data not shown). Next, it was tested whether reduced expression of TOP1 through the use of two different siRNAs would have an effect on particle production. Figure 3.30A demonstrates that when the siRNAs for TOP1 are used (designated 201 and 240), there appears to be a deficit in Gag production when cells are transfected with an RSV viral construct, as well as a loss in production of virus particles (Figure 3.30B).

The next two proteins that were examined for interactions with RSV Gag were MAGOH and SKP1. These proteins were chosen because they were identified in the first RSV Gag purification experiment and in the Gag.L219A-strep experiment. Furthermore, MAGOH is a core component of the exon junction complex that is deposited at the splice junctions present in mRNAs. There is particular interest in examining proteins that are involved in splicing and their possible interactions with RSV Gag because it has been previously shown that there is a high degree of colocalization between Gag.L219A and the overexpressed splicing factors SC35 and SF2 (163) (Chapter 2/Appendix A). SKP1 is a component of the SCF complexes, which regulate the ubiquitination of specific protein substrates and target them for degradation by the proteasome. It has also been characterized as an RNA polymerase II elongation factor. SKP1 has also been purified in affinity-tagged purifications using HIV-1 VPU as the bait protein (89). When these two proteins are examined for possible colocalization with Gag.L219A, a result similar to that for TOP1 is seen: there does not appear to be much, if any, colocalization of SKP1 or MAGOH with Gag.L219A in cells in 2D (Figure 3.29). When cells are examined in 3D, any colocalization that was present in the 2D images is lost (data not shown). However, when the expression of SKP1 was decreased using siRNAs, Gag production from virus-transfected cells was noticeably decreased, as was particle production (Figure 3.30). In contrast, when MAGOH expression was decreased using siRNAs, there did not appear to be an effect on Gag or particle production (Figure 3.30).


Figure 3.28: Effect of Camptothecin on Gag.L219A nuclear foci. A) Untreated control cells expressing Gag.L219A.YFP or stained for Histone H2A.X, which labels double-stranded DNA breaks. In untreated cells, there should be little, if any, H2A.X signal. B) Cells expressing Gag.L219A.YFP were treated with DMSO vehicle control to examine whether there were effects on foci formation. Untransfected cells were treated with 1 μM Camptothecin (CPT) and stained for H2A.X to demonstrate that DNA damage is occurring, signifying that TOP1 function is inhibited. C) Two examples are shown of cells expressing Gag.L219A.YFP that were treated with 1 μM CPT. In one experiment, 14 cells expressing Gag.L219A under CPT treatment were imaged.


Figure 3.29: Localization of Gag.L219A nuclear foci with endogenous proteins. Cells were imaged to examine whether there was colocalization between Gag.L219A.GFP nuclear foci and the endogenous proteins TOP1, MAGOH, and SKP1, detected by antibody. A total of 13 cells expressing Gag.L219A and stained for TOP1 were imaged across two separate experiments. There were 10 cells with MAGOH staining and 14 cells with SKP1 staining across two separate experiments.


As stated above, it has been shown that Gag.L219A has a great deal of colocalization with SF2 and SC35 (163) (Chapter 2/Appendix A), so siRNAs for both of those proteins were transfected into cells along with an RSV viral construct. When SC35 expression is decreased, there is no detectable Gag present in the lysates and there are no detectable particles (Figure 3.30). However, when SF2 expression is decreased, Gag is still produced and particles are still made, albeit apparently in decreased amounts (Figure 3.30).

3.5 Discussion

It has been established that RSV requires the Gag protein to undergo nuclear trafficking in order to have efficient packaging of the viral genome (63, 172). Furthermore, imaging studies have been performed that visualize RSV Gag and RNA colocalization in the nucleus (Maldonado R. et al., in progress). Nuclear Gag is not uncommon among retroviruses; numerous retroviruses have been shown to undergo nuclear trafficking for various reasons. Both MLV Gag and FV Gag undergo nuclear trafficking during replication (141, 147, 162, 175). The p12 domain of MLV and full-length FV Gag bind to mitotic chromatin during integration (49, 117, 141, 176, 200). MPMV Gag has been shown to localize to the nuclear pore complex and to be present at low levels within the nucleus (20, 211). HIV Gag has also been shown to be present in the nucleus, as well as the nucleolus, along with FIV, RSV, MMTV, and MLV NC domain (16, 95, 125, 165). There are even data showing HIV Gag colocalization with unspliced viral RNA in the nucleus (Tuffy, K. et al., in progress). Together, these findings suggest that entering the nucleus is a common feature of retroviral Gag proteins. RSV Gag was examined in mitotic cells to determine whether RSV Gag can also bind directly to chromatin, similarly to FV Gag and MLV p12 (Figure 3.3). RSV Gag did not localize to mitotic chromatin, suggesting that RSV Gag may not bind chromatin directly but may instead bind cellular proteins that interact with chromatin.


Figure 3.30: Effect of cellular protein expression decrease on RSV particle formation. QT6 cells were transfected with siRNAs targeting the cellular proteins listed (two different siRNAs for TOP1), a scrambled control siRNA, or no siRNA, for two rounds. In the second round, the construct RC.V8, which encodes RSV, was transfected along with the siRNAs. After 16 hours, the media was changed. 24 hours later, media and cell lysates were collected. A) Western blots of the cell lysates are shown. Each of the cellular protein siRNAs was compared to the scrambled siRNA control and the RC.V8-only control. The blots on the left were probed for Gag using an α-RSV antibody. The blots on the right were probed for gamma tubulin as a loading control. B) Virus particles were collected from the media by ultracentrifugation through a sucrose cushion, and SDS-PAGE followed by Western blot was performed. The blot was probed using the α-RSV antibody to detect the presence of capsid in the particles. This experiment under these conditions was performed once.


While it is known that nuclear trafficking of RSV Gag is important for efficient packaging and that Gag colocalizes with the unspliced viral RNA in the nucleus, several questions remain. How does Gag locate the viral RNA in the nucleus? Does Gag have any other role in the nucleus besides binding the viral RNA, possibly for packaging? To begin to answer these questions, it was first determined to which parts of the nucleus Gag localizes. Subcellular fractionations demonstrated that Gag can be extracted from chromatin fractions (Figure 3.2), implying that Gag may be interacting with chromatin in some manner. Furthermore, when examining Gag localization in the interchromatin regions of the nucleus, which contain various protein and ribonucleoprotein (RNP) complexes, it was found that RSV Gag highly colocalized with the splicing factors SF2 and SC35, which are contained within splicing speckles (163). Splicing speckles are protein deposits that hold splicing factors until they are hyperphosphorylated, causing them to travel to sites of transcription and splice mRNA co-transcriptionally (134, 135, 185, 215). It has also been found that RSV Gag.L219A nuclear foci are destabilized and the protein exhibits a diffuse appearance when cells are treated with Actinomycin D, which inhibits RNA polymerase II transcription. These data imply that Gag could be traveling to sites of transcription in order to find and obtain the viral RNA. Gag could be interacting with splicing factors as a means to get to sites of transcription or to prevent the splicing of the viral RNA, since Gag preferentially packages the unspliced viral RNA (11).

It has also been shown that the Gag.L219A nuclear foci appear to be tethered to either an RNA or a protein. Mass spectrometry experiments were performed to try to determine what the possible tether for the RSV Gag nuclear foci could be, as well as to decipher the possible roles that Gag could have in the nucleus. The results from the mass spectrometry experiments suggest that there is not a single protein that functions as the tether for the Gag foci. To narrow the potential Gag tethers, a mass spectrometry experiment isolating Gag from chromatin fractions can be performed. Due to the presence of Gag in chromatin fractions as well as the identification of many chromatin-bound proteins, it can be inferred that Gag may be binding to specific sites within the host chromatin. To test this idea, chromatin immunoprecipitations (ChIP) will need to be performed. This would demonstrate whether Gag is binding to specific regions of the chromatin, such as the integrated provirus in infected cells, or whether Gag is just binding at random sites.

Several proteins have been identified across various mass spectrometry experiments and these are important to examine further. When comparisons were made looking for overlapping factors identified between different cell types for RSV and HIV Gag, 19 and 14 proteins, respectively, were identified in both DF1 and HeLa nuclear lysates. Furthermore, when comparing the biologically relevant cell type for each respective Gag, 10 proteins were found to be in common (Figure 3.5). When the proteins that were identified and published by other laboratories were compared to the proteins identified with HIV Gag presented here, 59 proteins were in common. These proteins that were identified across multiple mass spectrometry experiments are important to examine in the future because they have been identified through different cell lysis and protein purification protocols as well as varying exclusion analyses performed on the mass spectrometry output. Depending on how experiments are performed and due to the limitations of mass spectrometry (as explained in Chapter 1), each experiment can yield a different list of identified proteins. This means that proteins identified across multiple techniques are more likely to be true binding partners of the bait protein used. As shown through the gene ontology hierarchies for each mass spectrometry experiment, many of the proteins identified fall under GO terms that have a relationship with other GO terms. Whether it is a parent-child relationship or a regulatory relationship, most of the proteins identified have related biological functions. This indicates that the proteins RSV or HIV Gag could be interacting with belong to certain classifications of biological function. While this demonstrates the complexity of mass spectrometry experiments, in that many different proteins are identified between different experiments with not necessarily much overlap, functionally speaking, many of the proteins identified are related in some manner. This could imply that these proteins or types of proteins may be important for retroviral replication.
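The list comparisons described above amount to set operations on the protein identifiers from each purification. A minimal Python sketch of that bookkeeping follows. The list names and protein identifiers are hypothetical placeholders, not the data from these experiments, and the helper functions are illustrative rather than the analysis pipeline actually used.

```python
from itertools import combinations

# Hypothetical hit lists: placeholder identifiers only, NOT the real data.
hit_lists = {
    "purification_A": {"protein_1", "protein_2", "protein_3", "protein_4"},
    "purification_B": {"protein_1", "protein_3", "protein_5"},
    "purification_C": {"protein_1", "protein_2", "protein_5"},
}


def shared_by_all(lists):
    """Return the proteins present in every hit list."""
    return set.intersection(*lists.values())


def recurrence(lists):
    """Map each protein to the number of hit lists in which it appears."""
    counts = {}
    for proteins in lists.values():
        for protein in proteins:
            counts[protein] = counts.get(protein, 0) + 1
    return counts


def pairwise_overlap(lists):
    """Return the shared proteins for every pair of hit lists."""
    return {
        (a, b): lists[a] & lists[b]
        for a, b in combinations(sorted(lists), 2)
    }


common = shared_by_all(hit_lists)
counts = recurrence(hit_lists)
pairs = pairwise_overlap(hit_lists)
```

Ranking candidates by the recurrence count, rather than requiring presence in every list, is one way to prioritize proteins recovered by several independent protocols when complete overlap is rare.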

A few proteins were examined here to determine whether they have a role in RSV replication. TOP1 was a prospective binding partner of both RSV and HIV Gag because it was identified in multiple mass spectrometry experiments with both Gag proteins. The function of TOP1 was tested to determine whether there is an effect on Gag.L219A foci, similar to the Actinomycin D experiments (Maldonado R. et al., in progress). However, when TOP1 function was inhibited through the use of CPT, there did not appear to be an effect on the Gag foci, suggesting that TOP1 function is not important for Gag foci stabilization. Through imaging studies, TOP1 and two other identified proteins, SKP1 and MAGOH, were examined for colocalization with Gag.L219A. Figure 3.29 shows that there does not appear to be any colocalization between these factors and Gag.L219A. One reason why there could be a lack of visualized interaction between Gag.L219A and these three factors is that the interaction could be very transient, making it difficult to capture in a snapshot. Another reason could be that there is no viral genome in the system. These cells were transfected with a Gag expression construct, which does not contain the entire RSV genome, nor is it integrated. If Gag is looking for the site of integration and using any of these proteins as a way to find or be near the site of integration, the lack of an integrated genome could reduce the interactions detected. Another limitation of these experiments is that if few molecules of Gag or of the cellular proteins are involved in the interaction, it could be below the limit of detection of the microscope, so the colocalization cannot be visualized. To address the transient interaction issue, cells can undergo chemical crosslinking, which will link together proteins that are near one another. From here, immunoprecipitations can be performed to determine whether the proteins of interest can be purified with Gag. Also, colocalization can be examined in infected cells to determine whether the integrated provirus is important for Gag-protein interactions. Another way to circumvent the transient interaction issue is to utilize the E. coli biotin ligase (BirA*) technique, which enzymatically biotinylates proteins that are near a protein of interest carrying the BirA* tag (52).

When knockdown of TOP1, MAGOH, SKP1, SF2, and SC35 was examined for effects on virus particle production, there was a noticeable decrease in Gag expression in cells where TOP1, SC35, and SKP1 were knocked down, which led to a lack of detectable virus particles as well. However, there did not appear to be much of an effect, or the effect was of a lesser degree, when SF2 and MAGOH were knocked down (Figure 3.30). When the expression of TOP1, SC35, and SKP1 was decreased, there did not appear to be a global effect on transcription, as shown by the loading control gamma tubulin. However, depending on the half-life of gamma tubulin mRNA and/or protein, the effect of reduced global transcription might not be detected by probing for this protein.

The work discussed here suggests that there may be a common feature among retroviruses of interacting with chromatin or chromatin-associated proteins, although the reason why may vary. The analysis of the mass spectrometry experiments lists many potential interaction partners of both RSV and HIV Gag, including proteins that are involved in gene expression and RNA processing, as these were some of the top biological function categories identified. The data presented in this chapter supply further evidence to support the hypothesis that Gag may be using chromatin-associated proteins as a means to be near sites of transcription in order to obtain the unspliced viral RNA for packaging. The protein families identified by the mass spectrometry, such as transcription-related proteins, splicing factors, and other chromatin-associated proteins, could contain the protein(s) that Gag uses to find transcription sites.

3.6 Acknowledgments

I would like to thank the following scientists for their generosity in supplying reagents: Dr. Zheng (Johns Hopkins University) and Drs. Katzman and Moldovan (PSU College of Medicine). I would like to thank Dr. Nikoloz Shkriabai and Dr. Mamuka Kvaratskhelia (OSU) for the collaboration on the purified RSV Gag and Gag.ΔNC affinity-tagged purifications and mass spectrometry. I would like to acknowledge the Microscopy Imaging Core Facility at PSU College of Medicine for use of the confocal microscope and the Imaris imaging analysis software, as well as the Mass Spectrometry and Proteomics Core. This project was funded in part by NIH P50 CRNA (LJP), NIH R01 CA076534 (LJP), T32 CA60395 (BLR), and F31 CA196292 (BLR).


Chapter 4

Rous sarcoma virus Gag utilizes Transportin-SR as a mechanism for nuclear entry.

4.1 Abstract

Nuclear trafficking of RSV Gag is a very transient event. Understanding the nuclear import and export pathways that Gag hijacks is crucial to understanding the role of Gag in the nucleus. It is known that RSV Gag uses the CRM1 export pathway to leave the nucleus. It has also been shown that RSV Gag utilizes the Importin α/β nuclear import pathway through the NC domain of Gag, and the Importin 11 pathway through MA. It was demonstrated that RSV Gag may also utilize the TNPO3 import pathway through the same domain that interacts with Importin 11. Here it is shown through in vitro studies that there is direct binding between Gag and TNPO3. To further examine this interaction in cells, constructs encoding both proteins were transfected into cells, resulting in an increase of nuclear Gag localization with increased TNPO3 expression. The nuclear localization of Gag was further increased when TNPO3 was overexpressed along with Importin β, but not Importin 11. This suggests that there is an interaction between RSV Gag and TNPO3 in cells and that this interaction occurs through the MA domain.

4.2 Introduction

It has been found that the Gag proteins from various retroviruses undergo nuclear trafficking (reviewed in (187)). While the mechanisms of Gag nuclear trafficking have been mostly studied using RSV, they are still not completely understood. The nuclear export of RSV Gag was first discovered to be dependent on the CRM1 nuclear export pathway through a nuclear export signal (NES) mapped to the p10 domain (172, 174). The MA and NC domains were found to be involved in the nuclear import of Gag. Studies utilizing Saccharomyces cerevisiae mutants deficient in members of the Importin-β protein superfamily found that the NC domain undergoes nuclear entry through the Kap60p/Kap95p (mammalian Importin-α/β) pathway, while MA uses either Kap120p (mammalian Importin 11) or Mtr10p (mammalian Transportin-SR) (26). Furthermore, the interactions between the NC domain and the Importin-α/β complex, as well as between the MA domain and Importin 11, were confirmed via affinity-tagged purifications (72).

Transportin-SR, or TNPO3, consists of three domains: an N-terminal RanGTP binding domain, a nuclear pore complex (NPC) interaction domain, and a C-terminal cargo binding domain (Figure 4.1B) (106, 126). TNPO3 functions as a nuclear import factor for proteins that typically contain arginine/serine (RS)-rich domains, such as the SR splicing factors (92). After transport into the nucleus, TNPO3 brings the splicing factors to nuclear speckles, which are subnuclear compartments that store splicing factors before they move to sites of transcription for splicing (106, 111).

Additionally, TNPO3 has been shown to be important for HIV-1 replication. TNPO3 was identified in three siRNA screens that demonstrated that knockdown of TNPO3 expression impairs HIV-1 replication (22, 37, 100). Multiple groups have shown that TNPO3 appears to be involved in the nuclear entry of the HIV-1 pre-integration complex (PIC) (44, 179, 202). There is also evidence to suggest that TNPO3 could be interacting with the CA domain of HIV-1 Gag (17, 43, 84, 102, 131, 202, 226) as well as with HIV-1 integrase (4, 37, 42, 59, 113, 119, 127, 201). Recently, a group demonstrated that TNPO3 is important for the nuclear import of the foamy virus PIC as well (2). While it appears that TNPO3 is important for a few retroviruses, the interaction of RSV Gag with TNPO3 has yet to be investigated and is the subject of this work.

4.3 Materials and Methods

Methods were modified from (188).

Cells and Plasmids


QT6 quail fibroblast cells were maintained as described in (40) and were transfected with the calcium phosphate precipitation method (56).

MA-GFP was described in (172) and Gag-GFP was described in (125). Gag.ΔMA5-86.GFP and Gag.ΔMA5-148.GFP were described previously (26). GFP-TNPO3 and GFP-TNPO3.ΔCargo (126) were gifts of Nathaniel Landau (NYU Langone Medical Center). mCherry-TNPO3 was created by PCR amplifying the TNPO3 coding sequence from GFP-TNPO3 with flanking XhoI and SalI restriction sites and inserting it into mCherry-N2 digested with those same enzymes. HA-TNPO3 was created by amplifying the TNPO3 coding sequence with flanking restriction sites and inserting it into pKH3 digested at the same restriction sites. pKH3-Importin 11 and pKH3-Importin β, encoding HA-Importin 11 and HA-Importin β, respectively, were described in (72).

pET28.TEV-Gag.3h, encoding RSV Gag with an N-terminal six-histidine tag (H6.Gag.ΔPR), is described in (72). H6.Gag.ΔNC.ΔSP was created by inserting two consecutive, in-frame stop codons into pET28.TEV-Gag.3h preceding the SP coding sequence. Gag.ΔMBD.ΔPR was created by PCR amplification of the Gag coding region starting at amino acid 83 and terminating at the N-terminus of NC with the insertion of stop codons. This product was ligated into appropriately digested pET24a+. pGEX6P3-hTNPO3 (102), encoding GST-TNPO3 for bacterial expression, was a gift from Alan Engelman (Dana Farber Cancer Institute). GST-TNPO3.ΔCargo1-501 was created from pGEX6P3-hTNPO3 by inserting three consecutive, in-frame stop codons following the sequence encoding amino acid 501 of TNPO3 by QuikChange PCR mutagenesis (123). GST-TNPO3.NPC was created by deleting most of the Ran binding domain using the primers 5’ – GGA GAA AAC CTT TAC TTC CAG GG and 5’ – AAT GGA TCC CAG GGG CCC with the Q5 Site-Directed Mutagenesis protocol according to the manufacturer’s guidelines (New England Biolabs). Michael Malim (King’s College London) kindly provided the bacterial expression construct encoding GST-Importin β.


Cell fixation and immunofluorescence

QT6 cells were grown on glass coverslips and fixed with 2% paraformaldehyde in phosphate buffered saline (PBS) (supplemented with 5 mM EGTA and 4 mM MgCl2, and adjusted to pH 7.2-7.4 with HCl) for 15 minutes at room temperature. Cells requiring antibody staining were then permeabilized with ice-cold 100% methanol for 2 minutes on ice and blocked in 5% goat serum (Rockland) diluted in PBS for at least 2 hours at room temperature. Cells were stained with mouse anti-HA antibody (Genscript) diluted 1:500 in PBS supplemented with 0.5% goat serum and 0.01% Tween-20 (Sigma) for at least one hour in a humidified chamber. They were then incubated with goat anti-mouse antibody conjugated to Cy5 (Molecular Probes) diluted 1:500 in PBS for 30 minutes at room temperature. Cells were stained with DAPI at 5 μg/ml and mounted with Slow-Fade mounting medium (Molecular Probes).

Microscopy

Most images were captured on a DeltaVision DV Elite (Applied Precision) wide-field deconvolution microscope using a 60x oil immersion objective. Optical slices of 0.2 μm encompassing the entire cell were captured and deconvolved with softWoRx 5.0 (Applied Precision). From the deconvolved image stack, a single slice encompassing the widest section of the nucleus was exported for each channel as an uncompressed TIFF file for subsequent display and analysis. The images of the Gag mutants with TNPO3 were acquired using the Leica SP8 TCS scanning confocal microscope equipped with a White Light Laser (WLL) using a 63X oil immersion objective. Sequential scanning between frames was used to average four frames for each image. DAPI was excited with the 405 nm UV laser at 10% laser power, with an emission detection window of 410 – 466 nm using a PMT detector. GFP was imaged using the WLL with a laser line excitation of 489 nm and a hybrid detector window of 495 – 559 nm. YFP was imaged using the WLL with a laser line excitation of 514 nm and a hybrid detector window of 519–583 nm. mCherry was imaged using the WLL with a laser line excitation of 587 nm and a hybrid detector window of 594–698 nm. All channels using the hybrid detectors had a time gating of 0.3 to 6.0 ns. The intensity of images displayed in figures was adjusted uniformly with CorelDRAW X3 (version 13; Corel Corp).

Quantitation of nuclear localization

ImageJ software version 1.46m was used to analyze cells for the amount of Gag present in the nucleus, as determined by fluorescence signal (1). The sum of all the pixel intensities in the Gag channel, expressed in ImageJ as the “Integrated Density,” was measured for the nucleus and for a region encompassing the entire cell; the integrated density of the nucleus was divided by that of the entire cell to calculate the percentage of the total cellular Gag pool residing in the nucleus. Transfections and analysis were blinded with respect to the presence of nuclear import factor overexpression to eliminate bias in analysis. Outliers with a p value less than 0.05 as determined by Grubbs’ test (GraphPad Software Inc.) were removed from subsequent analyses. GraphPad Prism 5 (GraphPad Software, Inc.) was used to create all graphs, perform linear regression analysis, calculate the mean, and determine p values. p values were calculated by unpaired t-test, as only pairs (i.e., no more than two) of conditions were analyzed simultaneously.
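The arithmetic behind this quantitation can be sketched in a few lines of Python. This is an illustrative sketch only: the analysis itself was performed in ImageJ and GraphPad Prism, the function names are hypothetical, the numbers are invented example values rather than measurements, and the t statistic shown assumes equal variances in the two groups. Outlier removal by Grubbs’ test and conversion of the t statistic to a p value are not reproduced here.

```python
import math
from statistics import mean, variance


def percent_nuclear(nuclear_density, cell_density):
    """Percentage of the total cellular signal that lies in the nucleus.

    Both arguments are integrated densities (summed pixel intensities)
    measured in the same channel of the same image.
    """
    if cell_density <= 0:
        raise ValueError("whole-cell integrated density must be positive")
    return 100.0 * nuclear_density / cell_density


def unpaired_t(group_a, group_b):
    """Student's unpaired t statistic and degrees of freedom.

    Assumes equal variances in the two groups (pooled variance).
    """
    n_a, n_b = len(group_a), len(group_b)
    dof = n_a + n_b - 2
    pooled = ((n_a - 1) * variance(group_a)
              + (n_b - 1) * variance(group_b)) / dof
    t = (mean(group_a) - mean(group_b)) / math.sqrt(
        pooled * (1.0 / n_a + 1.0 / n_b))
    return t, dof


# Invented example values (percent nuclear signal per cell), not real data.
example_control = [20.0, 22.0, 24.0]
example_treated = [30.0, 32.0, 34.0]

one_cell = percent_nuclear(nuclear_density=250.0, cell_density=1000.0)
t_stat, dof = unpaired_t(example_control, example_treated)
```

Because each cell is normalized to its own whole-cell signal, the percentage is insensitive to cell-to-cell differences in expression level, which is why it can be compared across transfection conditions.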

Expression and purification of recombinant His-tagged RSV Gag proteins

All constructs for protein expression and purification were transformed into BL21 Gold DE3 pRIL E. coli. The purity of all protein preps was verified with Coomassie staining following SDS-PAGE and/or western blot analysis with appropriate antibodies. All sonications were performed on ice with an S-4000 sonicator (Misonix, Inc.) using a ½” tip. The expression and purification of His-tagged Gag.3h (H6.Gag.ΔPR) are described in (72). H6.Gag.ΔSP.ΔNC was purified using Ni2+ column affinity chromatography and subsequent size exclusion chromatography. Gag.ΔMBD.ΔPR, Gag.CA.NC, MA, and MA.p2.p10.NTD were expressed in ZYP-5052 autoinduction medium and lysed in BugBuster Primary Amine Free Protein Extraction Reagent (Novagen) supplemented with recombinant lysozyme. PEI was added to a final concentration of 0.15% and the lysate was centrifuged at 21,000 RCF for 30 minutes to remove cell debris. The protein remaining in the soluble fraction was precipitated for 30 minutes at room temperature with concentrated ammonium sulfate. The pellet containing the Gag protein was resuspended and clarified by centrifugation prior to chromatographic separation and elution with a sulfopropyl cation exchange column. Peak eluted fractions were dialyzed against 25 mM HEPES pH 7.5, 500 mM NaCl, 0.1 mM EDTA, 0.1 mM TCEP, and 0.01 mM ZnSO4 prior to concentration, aliquoting, and storage at -80 °C.

Expression and purification of recombinant GST-tagged proteins

Purified GST protein was a gift from John Flanagan (Penn State College of Medicine). Cells expressing GST-TNPO3 were grown in two 250 ml cultures of ZYP-5052 supplemented with ampicillin and incubated at 37 °C for 19.5 hours. Cell pellets were harvested by centrifugation and stored at -20 °C prior to purification. A batch purification protocol adapted from (76) was used to purify GST-TNPO3. Cell pellets were thawed on ice and homogenized into PBS containing Roche Complete EDTA-free protease inhibitors (Roche). Ready-lyse lysozyme and Omnicleave nuclease (Epicentre) were added and the mixture was rocked on ice for 15 minutes. The homogenate was then sonicated three times at 80% power, with a one-minute recovery between each sonication. Lysate was clarified for 30 minutes at 21,000 RCF at 4 °C and passed through a 0.45 μm filter. The soluble portion was incubated with gentle end-over-end mixing for 3 hours at 4 °C with Glutathione Sepharose 4 Fast Flow Beads (GE) prewashed with PBS. Following binding, the beads were washed three times with PBS to remove unbound proteins. Bound proteins were eluted with a 30-minute incubation at 4 °C with Elution Buffer (50 mM Tris pH 8.0, 40 mM reduced glutathione, and Roche Complete protease inhibitors). At the end of the incubation, beads were pelleted and the supernatant removed to a prechilled tube. Two additional elution steps were performed. All elutions were pooled and dialyzed against TNPO3 storage buffer (50 mM HEPES pH 7.4, 150 mM NaCl, 10% Glycerol, and 2 mM DTT). Following dialysis, purified GST-TNPO3 was concentrated with an Amicon Ultra Centrifugal Filter Device (Millipore), aliquoted, and stored at -80 °C prior to use. GST-TNPO3.ΔCargo and GST-TNPO3.NPC were expressed at 30 °C, but otherwise expressed and purified identically to GST-TNPO3.

In vitro GST affinity purification protein-protein interactions assays

The protocol for GST affinity purification assays was adapted from (102). All proteins used in the purification assays were added at an equimolar concentration of 185 nM in 540 μl of pull-down buffer (150 mM NaCl, 5 mM MgCl2, 5 mM DTT, 0.1% NP-40, 25 mM Tris-Cl pH 7.4). For the input gel, 40 μl was removed. Proteins were incubated for 1 hour at room temperature with gentle end-over-end rotation. Then, 60 μl of a 50% slurry of glutathione beads (Glutathione Sepharose 4 Fast Flow, GE), prewashed four times in pull-down buffer, was added to the complexes and incubated for 2 hours at room temperature with gentle end-over-end rotation. The beads were pelleted by centrifugation at 800 x g for 2 minutes. The supernatant containing unbound proteins was removed, and the beads containing the bound protein complexes were washed four times with 10 packed bead volumes of pull-down buffer. Following the final wash, one packed bead volume of elution buffer (25 mM Tris-Cl and 40 mM reduced glutathione, pH 8) was added, mixed with the beads, and placed on ice for 10 minutes to elute the bound complexes. Following elution, beads were pelleted by centrifugation at 800 x g for 2 minutes at 4°C.

The supernatant containing bound complexes was then removed to a clean microcentrifuge tube and 4x SDS-PAGE loading buffer (250 mM Tris-HCl, pH 6.8, 40% glycerol, 0.4% bromophenol blue, 8% SDS, and 8% β-mercaptoethanol) was added to 1x final concentration. The samples were then heated for 5 minutes at 85°C, separated by SDS-PAGE, and analyzed by Western blot using rabbit anti-GST antibody (Genscript), rabbit α-RSV CA, rabbit α-RSV MA.p2, rabbit α-RSV MA.MDB, and HRP-conjugated secondary antibodies (Invitrogen). RSV antibodies were generous gifts from Rebecca Craven (Penn State College of Medicine). The input gels were visualized with Acquastain dye (Bulldog Bio).

Figure 4.1: Gag and TNPO3 constructs utilized in these studies. A) Schematic representation of RSV Gag constructs used in this study. Gag is composed of the following domains: MA (matrix), p2, p10, CA (capsid), NC (nucleocapsid), and PR (protease). The locations of the nuclear localization signals (NLSs) and nuclear export signal (NES) are highlighted. Gag.ΔPR contains the entire coding region of Gag through the first seven amino acids of the PR domain. All Gag truncation mutants are depicted in line with the full-length construct (top). SP stands for the spacer peptide in Gag. MBD is the membrane binding domain of MA. NTD is the N-terminal domain of CA. B) Schematic representation of TNPO3 constructs used in this study. TNPO3 has a three-domain structure consisting of an N-terminal RanGTP interaction domain, a nuclear pore complex (NPC) binding domain, and a C-terminal cargo-binding domain. The full-length TNPO3 protein is shown on the top, followed by a ΔCargo mutant that has the entirety of the cargo domain and a small portion of the NPC domain deleted. The TNPO3.NPC mutant contains most of the NPC domain and a portion of the C-terminal end of the RanGTP binding domain.
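The quantities stated in the pull-down protocol (185 nM of each protein in 540 μl, and 4x loading buffer brought to 1x) imply some simple bench arithmetic, which the following minimal sketch illustrates. The function names are illustrative only, and the molecular weight and eluate volume in the usage lines are hypothetical placeholders rather than values from this chapter.

```python
# Illustrative bench arithmetic for the pull-down assay described above.
# Function names are hypothetical; the molecular weight and eluate volume
# used in the example calls are placeholders, not values from the text.

def picomoles(conc_nM: float, volume_ul: float) -> float:
    """Amount of protein (pmol) present at conc_nM in volume_ul.

    nM x microliters gives femtomoles, so divide by 1000 for picomoles.
    """
    return conc_nM * volume_ul / 1000.0


def micrograms(pmol: float, mw_g_per_mol: float) -> float:
    """Mass (micrograms) corresponding to pmol of a protein of given MW.

    pmol x (g/mol) gives picograms, so multiply by 1e-6 for micrograms.
    """
    return pmol * mw_g_per_mol * 1e-6


def loading_buffer_volume(sample_ul: float, stock_x: float = 4.0) -> float:
    """Volume of stock_x loading buffer to add so the mixture is 1x.

    Solves stock_x * v = 1 * (sample_ul + v) for v.
    """
    return sample_ul / (stock_x - 1.0)


if __name__ == "__main__":
    amount = picomoles(185, 540)  # about 99.9 pmol of each protein
    print(f"Each protein: {amount:.1f} pmol per reaction")
    # Hypothetical 100 kDa protein, used only to show the unit conversion.
    print(f"Mass at an assumed 100 kDa: {micrograms(amount, 100_000):.2f} ug")
    # Hypothetical 30 ul eluate.
    print(f"4x buffer for 30 ul eluate: {loading_buffer_volume(30.0):.1f} ul")
```

Working in picomoles rather than micrograms keeps the comparison between Gag constructs of different sizes equimolar, which is the condition the protocol specifies.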

4.4 Results

The goal of these experiments was to test a hypothesis generated by studies utilizing yeast deficient in karyopherin-β family members (26). To first determine whether TNPO3 is involved in the nuclear import of RSV Gag, cells were transfected with Gag.GFP with and without mCherry.TNPO3. Figure 4.2A shows the localization of Gag.GFP in cells without exogenous TNPO3; when TNPO3 was added to the system, the amount of Gag in the nucleus increased (Figure 4.2B). The graph in Figure 4.2C illustrates a significant increase in the amount of nuclear Gag, from 18.5% to 25%, with increased TNPO3 expression (p < 0.0001). Furthermore, when the Gag mutant with the NC domain deleted, Gag.ΔNC (Figure 4.1A), was utilized, a similar trend was observed: 17% of Gag.ΔNC was nuclear, and when TNPO3 was added, the amount of nuclear Gag.ΔNC rose to 22% (p < 0.0001) (Figure 4.2). These results were expected because previous work has shown that the NC domain of RSV Gag binds to the Importin α/β complex (26, 72). To confirm the interaction of TNPO3 with the MA domain of Gag, two mutants containing deletions within MA were utilized (Figure 4.1A). One deletion mutant is missing the N-terminal portion of MA containing the NLS (Gag.ΔMA5-86), and the other has most of MA deleted (Gag.ΔMA5-148). When nuclear localization of the two MA mutants, Gag.ΔMA5-86 and Gag.ΔMA5-148, was examined, 15.5% and 12% were nuclear, respectively. When exogenous TNPO3 was added, these percentages became 16.5% and 13% for Gag.ΔMA5-86 and Gag.ΔMA5-148, respectively, demonstrating no significant increase in Gag nuclear localization. These results indicate that the interaction between Gag and TNPO3 occurs through the MA domain of Gag.

Figure 4.2: Effects of increased TNPO3 expression on Gag nuclear localization. A) The localization of wildtype Gag is shown in the top panel, followed by the localization of Gag.ΔNC, Gag.ΔMA5-86, and Gag.ΔMA5-148. The bottom panel shows the localization of TNPO3 when overexpressed in cells. B) Cells expressing both the Gag from the right and TNPO3 are shown. C) A graph plotting the percentage of nuclear Gag for each Gag protein shown, with and without exogenous TNPO3. At least 60 cells from three independent experiments were analyzed for each condition, with the standard error of the mean represented by the error bars. A “ * ” signifies statistical significance (p < 0.0001) between the compared groups. Gags with and without TNPO3 were compared using an unpaired Student’s t-test.
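For readers unfamiliar with the statistics reported for Figure 4.2C, the following is a minimal sketch of how a standard error of the mean and an unpaired (pooled-variance) Student's t statistic are computed from per-cell measurements. It uses only the Python standard library. The per-cell values in the usage example are made-up placeholders and are not data from this study, and the software actually used for the analysis is not stated in this section.

```python
# Sketch of the SEM and unpaired Student's t statistic (pooled variance).
# The example numbers at the bottom are PLACEHOLDERS, not measured data.
from math import sqrt
from statistics import mean, stdev


def sem(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return stdev(values) / sqrt(len(values))


def unpaired_t(a, b):
    """Return (t, degrees_of_freedom) for two independent samples.

    Assumes equal variances in the two groups, as in the classical
    Student's t-test.
    """
    na, nb = len(a), len(b)
    var_a, var_b = stdev(a) ** 2, stdev(b) ** 2
    pooled = ((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


if __name__ == "__main__":
    # Hypothetical percent-nuclear values for a handful of cells per group.
    without_factor = [10.0, 12.0, 11.0, 13.0, 9.0]
    with_factor = [15.0, 17.0, 16.0, 14.0, 18.0]
    t, df = unpaired_t(with_factor, without_factor)
    print(f"SEM without: {sem(without_factor):.2f}, with: {sem(with_factor):.2f}")
    print(f"t = {t:.2f} with {df} degrees of freedom")
```

The p-value follows from the t distribution with the returned degrees of freedom; in practice a library routine such as `scipy.stats.ttest_ind` returns it directly.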

Next, the interaction between Gag and TNPO3 was further examined through in vitro affinity-tagged purification. The purified proteins that were used in this study are shown in Figure 4.1. Purified wildtype Gag was incubated with purified GST-TNPO3, and the protein complexes were then purified using glutathione beads and separated by SDS-PAGE. Figure 4.3 shows that Gag is strongly purified with GST-TNPO3 and not with GST-only protein. To further test whether TNPO3 interacts with Gag through the MA domain, various Gag mutants were utilized that were missing either a portion of MA (ΔMBD.Gag.ΔPR) or all of MA (CA.NC). Other mutants were used to map the Gag-TNPO3 interaction sites: a MA-CA Gag protein (Gag.ΔSPΔNC), a Gag protein missing the C-terminal domain of CA and all of NC (MAp2p10NTD), and a MA-only construct. Surprisingly, each of these Gag mutants could be purified with GST-TNPO3, albeit to a much lesser extent than wildtype Gag (Figure 4.3). Next, it was determined which domain of TNPO3 was important for binding to Gag. Two mutants of TNPO3 were utilized, one that has the cargo-binding domain deleted (TNPO3.ΔCargo) and one that contains only the NPC binding domain of TNPO3 (TNPO3.NPC) (Figure 4.1B). When these mutants were used to purify Gag, Gag bound very strongly to both (Figure 4.3). Typically, TNPO3’s cargos bind to the cargo-binding domain of TNPO3, so it was unexpected to see an interaction with TNPO3.ΔCargo. Next, the Gag mutants were incubated with the TNPO3 mutants to determine whether an interaction could occur. A very weak interaction was detected between TNPO3.ΔCargo and each of ΔMBD.Gag.ΔPR, CA.NC, Gag.ΔSPΔNC, and MAp2p10NTD. When just the MA domain of Gag was incubated with TNPO3 or the two mutants, MA appeared to bind similarly to wildtype TNPO3 and to the NPC domain of TNPO3. Very little, if any, of the MA protein was purified with the TNPO3.ΔCargo mutant (Figure 4.3).

Figure 4.3: In vitro affinity-tagged purifications of Gag and TNPO3 protein complexes. GST affinity purifications were performed by incubating Gag with GST-TNPO3. A) On the left, part of each protein mixture was removed before the incubation step and run on an SDS-PAGE gel that was stained with Acquastain to visualize the proteins added to the affinity purification. On the right are gels of the elutions run on SDS-PAGE and analyzed by Western blot using α-RSV antibodies. B) Same procedure as above with the addition of the GST-TNPO3.NPC domain and the MA domain of Gag.

It was next determined whether a cooperative effect on Gag nuclear localization could be seen by adding importin β or importin 11 in the presence of exogenous TNPO3. Figure 4.4A and C show that when any of these three nuclear import proteins is expressed with Gag, the amount of nuclear Gag significantly increases compared to cells expressing Gag alone, although importin 11 did not produce as large an increase in nuclear Gag localization. When TNPO3 was added, nuclear Gag increased from 15% to 21% (p < 0.0001); similarly, when importin β was added, nuclear Gag increased from 15% to 25% (p < 0.0001). When importin 11 was added, the increase was only to 19% (p = 0.0234). When TNPO3 and importin 11 were both added to cells expressing Gag, the amount of nuclear Gag was 21%, the same as when only TNPO3 was added, showing that increasing importin 11 provides no additional benefit to nuclear Gag when TNPO3 is also present. However, when TNPO3 and importin β were both added to Gag-expressing cells, nuclear Gag increased from the 21% seen with TNPO3 only to 28% (p < 0.0001).

4.5 Discussion

The purpose of these studies was to understand the relationship between RSV Gag and TNPO3. A previous study demonstrated that the nuclear localization of the MA domain of Gag was dependent upon Importin 11 and TNPO3 in yeast (26). The interaction between MA and Importin 11 was confirmed through an affinity-tagged purification, but there was no follow-up on TNPO3 (72). Here, it has been shown that there is a correlation between the amount of TNPO3 in cells and the amount of Gag in the nucleus (Figure 4.2). These imaging studies showed that the interaction appears to occur through the MA domain, because when portions of MA are deleted, the increase in Gag nuclear localization is no longer seen with an increase in TNPO3 (Figure 4.2). Furthermore, a direct interaction can be detected when TNPO3 and Gag are incubated together in vitro (Figure 4.3). To verify the interaction between TNPO3 and Gag, affinity-tagged purifications need to be performed using lysates from cells expressing both proteins.

Based on the results from the yeast studies, it was expected that TNPO3 would bind to Gag through the MA domain. However, in the in vitro purifications, some Gag was found in the elutions when using the Gag mutants that do not contain MA (Figure 4.3). It was also expected that Gag would bind through the cargo domain of TNPO3, but when that domain was removed, Gag was still able to bind. Gag could even bind to the NPC domain of TNPO3 alone (Figure 4.3). These data suggest that the interaction between Gag and TNPO3 could involve multiple sites on each protein. A recent study demonstrated that when the N-terminus of TNPO3 (all of the RanGTP binding domain and part of the NPC domain) is incubated with the catalytic core domain and C-terminal domain of HIV-1 integrase, there is an interaction in vitro, suggesting that other cargo proteins of TNPO3 may bind to sites other than the cargo-binding domain (201). Another possibility is that the interactions detected in the affinity-tagged purifications using the Gag mutants are an artifact of the purified proteins in the binding solution used, because the imaging studies in Figure 4.2 show an increase in Gag nuclear localization only when MA is intact. To determine whether Gag and TNPO3 have multiple sites of interaction, the Gag mutants should be tested in cells to see whether there is an increase in nuclear localization with exogenous TNPO3, and the TNPO3 mutants should be tested as well.

A cooperative effect was seen when increased levels of both TNPO3 and Importin β caused a greater increase in nuclear Gag than either import factor alone (Figure 4.4). There was no increase in nuclear Gag when both TNPO3 and Importin 11 levels were increased, probably because they compete for the same binding region of Gag (Figure 4.4B). This further suggests that TNPO3 interacts through MA. The cooperative effect with Importin β likely arises because Importin β interacts with Gag through the NC domain, whereas Importin 11 interacts through MA and therefore competes with TNPO3 (26, 72).

One question that remains is why RSV Gag would utilize three different nuclear import pathways. One reason could be to reach a certain area in the nucleus. Importin β transports a variety of proteins into the nucleus, including gene regulators, cell cycle regulators, histones, and ribosomal proteins. Importin 11 transports ribosomal proteins as well as E2-ubiquitin-conjugating enzymes (reviewed in (36)). The known transport cargos of TNPO3 include SR splicing factors, which TNPO3 traffics to splicing speckles. Once in the speckles, the splicing factors can be activated through phosphorylation, which triggers them to move to sites of active transcription to splice newly transcribed RNAs (134, 149). It has been shown previously that RSV Gag has a high degree of colocalization with the SR proteins SF2 and SC35. Furthermore, when the levels of SC35 are increased in cells, the number of Gag.L219A nuclear foci increases. This is not seen with SF2 or another nuclear body protein, PSP1 (163) (Chapter 2/Appendix A). Also, in the previous chapter, it was shown that when SC35 expression is decreased using siRNAs, the levels of Gag produced in the cells fall below the level detectable by Western blot. Gag could be using TNPO3 as a means to be transported to splicing speckles, where Gag can hitch a ride to active transcription sites in order to find the transcribing provirus and obtain the viral RNA before it is spliced so it can then be packaged. Further work is needed to test this hypothesis. Imaging experiments are planned to examine fluorophore-tagged Gag, fluorescent in situ hybridization (FISH)-labelled viral RNA, and antibody-stained SC35 and RNA polymerase II to determine whether there is colocalization among Gag, viral RNA, and SC35 at sites of transcription (RNA polymerase II). These experiments will test a novel hypothesis that would map out the pathway of nuclear trafficking of a Gag protein to find and obtain the viral RNA for packaging.

Figure 4.4: Effects of multiple import factors on Gag nuclear localization. A) Visualization of cellular localization of wildtype Gag (green) with exogenous individual import factors: TNPO3 (red; top row), Importin 11 (magenta; middle row), and Importin β (magenta; bottom row). B) Visualization of cellular localization of wildtype Gag with exogenous TNPO3 and Importin 11 (top row), and TNPO3 and Importin β (bottom row). C) A graph displaying the percentage of nuclear Gag with no exogenous import factors; with TNPO3, Importin 11, or Importin β individually; and with TNPO3 plus either Importin 11 or Importin β. At least 27 cells from two independent experiments were analyzed for each condition, except for Gag with Importin 11, for which only 10 cells were analyzed. The standard error of the mean is represented by the error bars. A “ * ” signifies statistical significance (p < 0.0001) between the compared groups. A “ # ” signifies a statistical significance of p = 0.0234. Gag with TNPO3, with Importin 11, and with Importin β were each compared to the Gag alone group. Gag with TNPO3 and Importin 11, and Gag with TNPO3 and Importin β, were each compared to the Gag and TNPO3 group. Group comparisons were analyzed using unpaired Student’s t-tests.

4.6 Acknowledgments

I would like to thank the following scientists for their generosity in supplying reagents: Dr. Nathaniel Landan (NYU Langone Medical Center), Dr. Alan Engelman (Dana Farber Cancer Institute), Dr. Michael Malim (King’s College London), and Kathleen Griffin and Dr. Rebecca Craven (PSU College of Medicine). I would like to acknowledge the Microscopy Imaging Core Facility at PSU College of Medicine for use of the confocal and deconvolution microscopes and the Imaris imaging analysis software. This project was funded in part by NIH R01 CA076534 (LJP) and F31 CA196292 (BLR).


Chapter 5

Overall Discussion

Previous work in our laboratory has demonstrated that nuclear trafficking of the RSV Gag protein is necessary for efficient genomic packaging (63). Furthermore, we have recently shown that RSV Gag and the unspliced viral RNA colocalize in the nucleus, as do HIV Gag and its unspliced RNA (Maldonado, R. et al., in progress). This is the first time Gag and viral RNA interactions have been visualized in the nucleus. Our laboratory has been at the forefront of studying the roles of nuclear Gag. While we were not the first to identify retroviral Gag in the nucleus, we were the first to identify a role for nuclear Gag during assembly. Since then, other retroviral Gag proteins have been examined for nuclear localization, and many studies have shown that certain Gag domains (24, 48, 49, 160, 176, 212, 220-222) or, in the case of foamy virus, the full-length Gag protein (115, 124, 141, 156, 171) are important for integration. However, our laboratory has identified roles of nuclear Gag during the later stages of infection, during particle assembly. We have also shown that when RSV Gag is in the nucleus, foci form, and when Gag is concentrated in the nucleus, either through LMB treatment or the use of the Gag.L219A mutant, the number of foci increases (96). What Gag is doing in the foci, and what components other than Gag are present, are currently unknown and are the focus of this dissertation.

In this dissertation, I characterized the nuclear foci formed by RSV Gag and examined Gag nuclear import through the Transportin SR (TNPO3) pathway. The goal of my research was to uncover possible host nuclear interaction partners of RSV and HIV Gag. We hypothesized that Gag could be using cellular proteins as a means to find the integrated provirus in order to bind the viral genome at the site of transcription. If this hypothesis proves to be true, it would open a new window into the biology of the replication cycle of RSV and related viruses. No other retrovirus has been studied in terms of Gag binding the gRNA in the nucleus for packaging. This work could provide possible targets for treatment of retrovirus-caused diseases; if Gag cannot package the viral genome, then the virus cannot spread to new cells or areas in the host.

5.1 Characterization of RSV Gag nuclear foci

When RSV Gag undergoes nuclear trafficking, the formation of nuclear foci can be seen. To understand more about these foci, we tracked them over a period of 10 minutes to determine the type of motion they exhibit (Chapter 2/Appendix A). We determined that the Gag nuclear foci exhibited obstructed diffusion, meaning that the foci appear to be tethered to something. Known host nuclear bodies exhibit a similar motion pattern and are known to be tethered to RNAs or proteins, while the proteins that make up the bodies move in and out of the bodies fairly rapidly. FRAP was performed on the Gag nuclear foci, and we found that Gag moves rapidly in and out of the foci, with a half-time of recovery of around 8 seconds (unpublished data – Tim Lochmann). Because the Gag foci exhibited characteristics similar to those of nuclear bodies, we next examined whether Gag could be present in these bodies. As described in Chapter 2/Appendix A, we found that the Gag foci did not overlap with PML bodies or paraspeckles, but there was a high degree of colocalization with the SR splicing factors SC35 and SF2. Furthermore, when the expression of SC35 was increased, the number of Gag foci increased.
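As background for the classification of motion mentioned above, particle-tracking studies commonly compare the mean squared displacement (MSD) of a track against the lag time: an MSD that grows more slowly than linearly (exponent below 1 on a log-log plot) is generally read as obstructed or confined diffusion, an exponent near 1 as free diffusion, and an exponent above 1 as directed motion. The sketch below shows that generic calculation only. It is an assumption for illustration and not necessarily the analysis used in Chapter 2/Appendix A, and the example track is synthetic.

```python
# Generic MSD analysis for a 2D particle track (illustrative sketch).
# This is NOT claimed to be the tracking analysis used in the dissertation.
from math import log


def msd(track, lag):
    """Time-averaged mean squared displacement at a given frame lag.

    track is a list of (x, y) positions sampled at a constant interval.
    """
    pairs = zip(track, track[lag:])
    sq = [(x2 - x1) ** 2 + (y2 - y1) ** 2 for (x1, y1), (x2, y2) in pairs]
    return sum(sq) / len(sq)


def anomalous_exponent(track, max_lag):
    """Least-squares slope of log(MSD) versus log(lag) for lags 1..max_lag.

    A slope below 1 suggests obstructed diffusion, near 1 free diffusion,
    and above 1 directed motion.
    """
    xs = [log(lag) for lag in range(1, max_lag + 1)]
    ys = [log(msd(track, lag)) for lag in range(1, max_lag + 1)]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


if __name__ == "__main__":
    # Synthetic track moving at constant velocity (purely directed motion).
    directed = [(float(i), 0.0) for i in range(50)]
    print(f"exponent for directed motion: {anomalous_exponent(directed, 10):.2f}")
```

A tethered focus would be expected to give an exponent below 1 and an MSD curve that plateaus at long lags, which is the behavior the term obstructed diffusion describes.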

In Chapter 3, through subcellular fractionations, I demonstrated that RSV Gag and HIV Gag proteins can be isolated from chromatin-associated protein fractions (Figure 3.2). Because FV Gag and MLV p12 proteins bind to mitotic chromatin during integration, I wanted to determine whether RSV Gag behaved similarly during late stages of assembly. I found that RSV Gag did not bind to mitotic chromatin (Figure 3.3); instead, the foci of Gag.L219A were distributed through the interchromatin regions of the interphase nucleus (Figure 3.4). These results further demonstrate that the Gag foci appear to form their own nuclear bodies, tethered to either an RNA or a protein. To test whether the tether could be a protein, proteomic experiments were performed to identify host proteins with which Gag could be interacting in the nucleus (Chapter 3). The classes of proteins identified included proteins involved in transcription (mediator proteins and transcription factors), splicing factors, histone proteins, and other chromatin-associated proteins. Another purpose of the proteomic studies was to identify proteins that were purified with different retroviral Gag proteins. I performed experiments to examine common proteins identified between RSV Gag and HIV Gag and found that a few of the proteins in common between the two are involved in RNA transport (Figure 3.7). Studies in Chapter 2/Appendix A demonstrate that there appears to be a relationship between RSV Gag and SC35; however, SC35 was not identified in any of the RSV Gag mass spectrometry experiments. This shows that there are limitations to protein identification by mass spectrometry. One reason why SC35 was not identified could be that the techniques used to isolate protein complexes disrupted these complexes and broke any interaction between Gag and SC35. Another reason could be that the interaction between RSV Gag and SC35 needs to occur in cells; in the experiments using purified RSV Gag and nuclear lysates, the conditions might not have been optimal for the SC35-Gag interaction.

In recent years, a number of proteomic studies have been performed by other laboratories examining the possible interaction partners of HIV-1 Gag (50, 51, 89, 114, 120, 166). In these studies, a number of the proteins identified were nuclear proteins. While these proteins were not necessarily of particular interest to the other researchers, I wanted to examine them more closely to determine whether any proteins overlap across multiple studies. RNA processing proteins and transcription-related proteins were consistently among the top categories of nuclear proteins identified (Chapter 3). A caveat to using mass spectrometry to identify possible binding partners of a protein of interest is the variability in the proteins that are identified. Generally, only stably associated protein complexes survive the purification procedures; transient interactors are usually lost during sample collection. Modern mass spectrometers are highly sensitive, which increases the detection of non-specific interactors with the protein of interest or the purification medium (3). Other limitations of protein identification are discussed in Chapter 1. Because multiple laboratories using various techniques have identified a number of proteins related to transcription and splicing, one can predict that a common feature among Gag proteins is a role in these processes. The exact role of Gag with these proteins and in these particular nuclear processes remains to be determined.

5.2 Interaction of RSV Gag with Transportin SR

In Chapter 4, the interaction that was identified between RSV Gag and Transportin SR (TNPO3) (72) was further explored. Through in vitro affinity-tagged purification experiments and ex vivo overexpression experiments, we have shown that there is indeed an interaction between RSV Gag and TNPO3. This interaction is of particular interest because TNPO3 is responsible for the nuclear import of SR proteins, such as SC35 and SF2. In Chapter 2/Appendix A, we showed that there is a great deal of colocalization between Gag and these two splicing factors. The in vitro affinity-tagged purification experiments demonstrated that Gag is able to interact with the NPC domain of TNPO3, not the cargo-binding domain as one might expect. In results not shown in Chapter 4, it was found that when TNPO3 is overexpressed in cells along with the MA domain of the RSV Gag protein, TNPO3 is relocalized to the periphery of the nucleus. One possibility for this relocalization is that the MA domain binds to the NPC domain of TNPO3, and that excess MA bound to the NPC domain prevents TNPO3 from interacting with the nuclear pore complex in the nuclear membrane.

5.3 Roles of Gag in the Nucleus

Besides the proposed role of Gag binding the viral RNA for packaging in the nucleus, there may be other roles of Gag in the nucleus. The variety of proteins identified in the mass spectrometry experiments suggests a few possible roles for RSV Gag in the nucleus. Given the number of transcription-related proteins that were identified, besides using these factors to find the provirus, Gag could be altering transcription itself. Gag could either recruit these factors to the provirus for transcription or prevent transcription of cellular genes as a way of preventing an immune response. Unlike HIV, RSV does not undergo latency, so RSV Gag may be altering transcription or chromatin structure as a way to prevent RSV from going latent. To determine whether Gag alters transcription to prevent an immune response, increase viral transcription, or silence other cellular genes, a transcription analysis needs to be performed.

In Chapter 3, a few proteins that were identified from the mass spectrometry experiments were tested for possible roles in RSV assembly: TOP1, SKP1, MAGOH, SF2, and SC35. TOP1 was chosen because it has been identified across numerous published HIV Gag pulldowns, as well as in a few of the mass spectrometry experiments that I performed. SKP1 and MAGOH were chosen because they were present in the Gag.L219A-tandemstrep list and the first purified RSV Gag and Gag.ΔNC list. SF2 and SC35 were chosen because we previously published evidence of a possible interaction between them and Gag (Chapter 2). I started by determining whether knocking down these proteins using siRNAs affected particle production or genomic packaging. Figure 3.30 demonstrates that when SKP1, TOP1, and SC35 siRNAs were used, Gag expression was undetectable in the cell lysates, and subsequently no particles were detected via Western blotting. This experiment was performed only once, and my ultimate goal is to isolate the RNA from virus particles to determine whether there is an effect on packaging. Of course, this can only be done if I can obtain particles. Imaging experiments need to be performed to determine whether there is a change in Gag localization or expression in these knockdown cells. In this initial experiment, the siRNAs were transfected before the viral construct. Decreasing the expression of the cellular proteins may affect the transcription of the Gag plasmid and subsequently the translation of Gag, which would explain why Gag was not detected on the Western blot.

5.4 Gag and RNA interactions within nuclear foci

From work by Maldonado R. et al. (manuscript in progress), we know that there is not 100% colocalization between Gag and the viral RNA in the nucleus. This implies that Gag has roles in the nucleus other than binding to the gRNA for packaging. As stated earlier, the mass spectrometry data also imply this. We also know that, for nuclear bodies, the tether can be either a protein or an RNA. We know that Gag is able to bind nonspecifically to non-viral RNAs and that Gag nuclear foci fall apart under Actinomycin D (transcription inhibitor) treatment, illustrating that the formation of Gag nuclear foci is dependent upon ongoing RNA production. In these experiments, the Gag foci could form using cellular RNAs at transcription sites for scaffolding, or Gag might be binding to newly transcribed viral RNA expressed from the Gag expression plasmid. To differentiate between these two possibilities, two different Gag plasmids were used: one containing the psi domain (Ψ) (RU5.Gag.L219A) and one lacking Ψ (Gag.L219A) (Appendix C). The Ψ signal in the viral RNA is what Gag binds to when bringing the RNA to particles for packaging. It is a highly structured region in the 5’ end of the viral RNA, and Gag displays strong affinity for this signal (reviewed in (91)).


If the Gag.L219A foci used a cellular RNA as a tether, then foci formation would be independent of the Ψ signal. We found that Gag.L219A formed foci whether or not the RNA contained Ψ. However, the RNA FISH signal was not consistently focal; the cells observed expressed RNA that appeared either diffuse or focal. The gag mRNA expressed from Gag.L219A, which lacks Ψ, displayed a diffuse pattern in the majority of nuclei (Appendix C, panel B). But when Ψ was present (RU5.Gag.L219A), the focal and diffuse patterns of the RNA occurred in nearly equal amounts (Appendix C, panel C). When the colocalization of Gag and the Ψ-containing RNA was examined, 40% of Gag colocalized with the RNA and 62% of RNA colocalized with Gag, but without Ψ, the colocalization decreased to 4% and 10%, respectively. This is an expected result because Gag preferentially binds to Ψ-containing viral RNAs over RNAs lacking Ψ. These data imply that the ability of gag mRNA to condense into foci depends upon the presence of Ψ and could be due to the binding of Gag.
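The two colocalization percentages reported above differ (Gag with RNA versus RNA with Gag) because each is normalized to a different channel. One common way to express such directional overlap is with Manders-style coefficients, sketched below on a toy pair of intensity lists. This is an illustrative assumption: the method and software actually used to obtain the reported values are not described in this section, and the pixel values in the example are invented.

```python
# Manders-style directional colocalization on two intensity channels.
# Illustrative sketch only; the pixel values below are invented and the
# dissertation does not state that this exact measure was used.

def manders(chan_a, chan_b, thresh_a=0.0, thresh_b=0.0):
    """Return (M1, M2) for two equal-length lists of pixel intensities.

    M1: fraction of channel A signal located in pixels where B is present.
    M2: fraction of channel B signal located in pixels where A is present.
    """
    total_a = sum(a for a in chan_a if a > thresh_a)
    total_b = sum(b for b in chan_b if b > thresh_b)
    both = [(a, b) for a, b in zip(chan_a, chan_b) if a > thresh_a and b > thresh_b]
    m1 = sum(a for a, _ in both) / total_a
    m2 = sum(b for _, b in both) / total_b
    return m1, m2


if __name__ == "__main__":
    # Toy example: channel A is spread over more pixels than channel B,
    # so less of A overlaps B than the other way around.
    gag = [4.0, 4.0, 2.0, 0.0]
    rna = [0.0, 3.0, 3.0, 0.0]
    m1, m2 = manders(gag, rna)
    print(f"fraction of Gag signal overlapping RNA: {m1:.2f}")
    print(f"fraction of RNA signal overlapping Gag: {m2:.2f}")
```

An asymmetry of this kind is consistent with one component occupying additional locations where the other is absent.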

The interaction between Gag and the viral RNA occurs through zinc fingers present in the cysteine-histidine (Cys-His) boxes of the NC domain, which bind to the stem-loop of Ψ (reviewed in (91)). We mutated the first cysteine residue in the proximal Cys-His motif of NC to serine (C21S), a mutation that has previously been shown to cause a deficiency in gRNA packaging (46). We hypothesized that mutating the Cys-His box should prevent Gag from binding to Ψ-containing RNAs. When RU5.Gag.L219A.C21S was expressed in cells (Appendix C, panel D), the nuclear foci were fewer in number and appeared much larger than the foci normally formed by Gag.L219A, and the RNA pattern was mostly diffuse. Given the significant reduction in the ability of the Gag.L219A.C21S RNA to adopt a focal pattern, these data suggest that even in the presence of Ψ, the RNA does not condense properly when the Gag protein does not contain a functional NC RNA-binding domain. Thus, it appears that nuclear RNA condensation is dependent upon both Ψ and a functional Gag protein.


5.5 Model and Future Directions

The findings described in this dissertation lead to a model in which Gag undergoes nuclear trafficking using TNPO3 (Figure 5.1). Through TNPO3, Gag is targeted to splicing speckles, where SC35 and SF2 are stored until they are needed for co-transcriptional splicing. From there, Gag can use splicing factors to reach sites of active transcription to search for the provirus and bind to the unspliced viral RNA. In support of the model, we have shown that when cells expressing RSV Gag.L219A are treated with the transcription inhibitor Actinomycin D, the nuclear foci dissociate and become diffuse, demonstrating that the foci are dependent upon active transcription (Maldonado R. et al., in progress). Binding of Gag to the viral RNA causes a conformational change in Gag that exposes the NES (72) and allows the Gag-RNA complex to be exported from the nucleus using the CRM1 nuclear export pathway. Once in the cytoplasm, the Gag-RNA complex can traffic to the plasma membrane for packaging and particle assembly.

One way to test this model is to use DNA fluorescence in situ hybridization (FISH) to label the provirus and RNA FISH to label the unspliced viral RNA, and to image both along with Gag.L219A (to increase the concentration of Gag in the nucleus) and SC35. SC35 is of particular interest because, as described in Chapter 2/Appendix A, Gag colocalizes strongly with SC35, and this colocalization increases (although not significantly) in the presence of the Clk1 kinase, which hyperphosphorylates SR proteins, causing them to leave speckles and move to sites of transcription. Also, when the amount of SC35 in cells is increased, the number of Gag.L219A nuclear foci increases, implying an interaction between Gag and SC35. If colocalization is observed among all four components, this could imply that Gag is at sites of active transcription and may be using SC35 to find these sites. Live-cell imaging can also be performed using tagged wild-type Gag, TNPO3, and SC35 to determine whether Gag and TNPO3 traffic into the nucleus together and whether TNPO3 brings Gag to splicing speckles (the location of SC35). We could also examine the localization of Gag and Gag-RNA complexes after siRNA knockdown of TNPO3 or SC35 to determine whether Gag-RNA colocalization decreases or whether the distribution of Gag in the nucleus changes.

Since Gag can use other nuclear import factors, such as the Importin α/β complex and Importin 11, the subnuclear localization of Gag could depend upon where each import factor normally delivers its cargo. Importin β is known to transport a variety of cargos into the nucleus, including gene and cell cycle regulators, histones, and ribosomal proteins. Importin 11 is known to transport ribosomal proteins as well, along with E2 ubiquitin-conjugating enzymes (reviewed in (36)). It is hypothesized that TNPO3 brings splicing factors directly to splicing speckles (106, 107). Depending on the pathway Gag uses to enter the nucleus, the nuclear function of Gag could differ. This would need to be tested through live-cell experiments tracking Gag nuclear import through each of these pathways, as well as by knocking down expression of the import proteins to determine whether there is a difference in Gag subnuclear localization.

Due to the high degree of colocalization of RSV Gag with the splicing factors SC35 and SF2, as well as the increase in RSV Gag.L219A nuclear foci with increased SC35 expression, Gag may have a role in regulating the splicing of viral transcripts. Gag preferentially packages unspliced viral RNA, so within the nucleus, RSV Gag may interact with splicing factors in order to prevent the splicing of viral transcripts. One way to test this is to use a viral construct that contains two different RNA-labelling stem loops, which allow the RNAs to be visualized through tagged coat proteins that bind to the stem loops. One set of stem loops would be placed in the Gag coding region, and another set would replace the src coding region. Both sets of stem loops will be present on the unspliced transcripts, and only the set in the src coding region will be present on the spliced transcripts. By increasing the amount of Gag in cells and visualizing the RNA, it can be measured whether there is an increase in unspliced viral transcripts and a decrease in spliced viral transcripts.

Other roles that Gag may have in the nucleus could include altering transcription to favor provirus transcription or to prevent the transcription of cellular genes. One way to test this would be to perform a transcriptome analysis of uninfected versus infected cells to determine which cellular genes are upregulated or downregulated.

To determine whether Gag interacts in vivo with any of the proteins identified by mass spectrometry, co-immunoprecipitations need to be performed. To test whether the interactions of Gag with these proteins are RNA-dependent, co-immunoprecipitations can be performed on samples treated with RNase. Given the number of proteins identified in the various mass spectrometry experiments, one way to narrow the list is to perform mass spectrometry on chromatin fractions. This would hopefully narrow the list of candidate proteins important for nuclear Gag function with regard to transcription. Another way to narrow the potential hits is to perform mass spectrometry experiments using Gag proteins from various retroviruses. Proteins identified across multiple retroviruses are more likely to be important for retroviral replication.

Figure 5.1: One Possible Model for Gag Nuclear Trafficking. 1) After translation in the cytoplasm, RSV Gag enters the nucleus using the nuclear import factor TNPO3 (green). 2) This brings Gag to splicing speckles (purple) because SR splicing factors are the common cargo of TNPO3. 3) Since splicing occurs co-transcriptionally, Gag traffics to sites of active transcription along with the splicing factor SC35. When Gag finds the site of proviral transcription, it can bind the unspliced viral RNA co-transcriptionally. 4) After Gag binds the viral RNA, it undergoes a conformational change that exposes the NES in p10 so the Gag-RNA complex can be exported from the nucleus using the CRM1 export pathway (yellow). 5) Once in the cytoplasm, the Gag-RNA complex can traffic to the plasma membrane for packaging.

5.5.1 Liquid-liquid phase separation: a model for Gag nuclear foci?

Nuclear bodies are dynamic structures that concentrate many proteins and RNAs without the use of membranes. It is hypothesized that they form by liquid-liquid phase separation from the surrounding area, brought about by macromolecular crowding (6). Liquid-liquid phase separation occurs through protein-protein or protein-nucleic acid interactions and is governed by surface tension and the critical concentration of the components in the structure. The threshold for the critical concentration depends upon many characteristics of the surrounding system, including temperature, ionic strength, and whether the proteins carry any post-translational modifications. Many of the proteins involved in the formation of membrane-less organelles contain motifs enriched in charged and/or aromatic residues that would participate in the electrostatic and hydrophobic interactions needed for phase separation to occur. Some of these proteins also contain structured domains and/or low-complexity or disordered segments that are involved in forming multivalent interactions. Disordered segments are dynamic regions within the protein that do not adopt stable secondary or tertiary structures. Phase separation occurs when the interacting proteins and nucleic acids become a better solvent for all of the components than the surrounding environment. One advantage of membrane-less organelles for cells is that components can rapidly cycle between the organelles and the surrounding environment. This becomes important during times of stress, when sequestered proteins or nucleic acids can be released, or other macromolecules can be sequestered, changing the environment and allowing a cellular response to the stress (reviewed in (136)).

We hypothesize that the foci RSV Gag forms in the nucleus could be liquid-liquid phase-separated assemblies. In Chapter 2/Appendix A, we demonstrated that RSV Gag nuclear foci share characteristics with nuclear bodies. Also, based on observations made during confocal imaging, cells expressing more Gag.L219A appear to have more and larger foci, which fits the hypothesis that Gag foci are in a liquid phase-separated state. The Gag protein sequence needs to be analyzed to determine whether Gag contains any of the characteristic motifs listed above that are known to promote phase separation. One way to examine whether RSV Gag forms phase-separated structures is to disrupt the Gag-Gag and Gag-RNA interactions. If RSV Gag can no longer interact with other Gag proteins and RNA, then Gag should not be able to form the phase-separated structures.

5.6 Summary

The work presented in this dissertation demonstrates unique findings not previously shown for other retroviral Gag proteins. In conjunction with work performed by others in the Parent laboratory, we are demonstrating a novel function for the Gag protein. No other retroviral Gag protein has been shown to undergo nuclear trafficking for the purpose of obtaining the viral gRNA for packaging. We know that RSV Gag undergoes nuclear trafficking and that, if this is prevented, the efficiency of gRNA packaging decreases (63). We have demonstrated that RSV Gag, as well as HIV-1 Gag, colocalizes with the unspliced viral RNA in the nucleus (Maldonado R. et al., in progress; Tuffy, K. et al., in progress). When cells are treated with a transcription inhibitor, the nuclear foci formed by RSV Gag.L219A dissociate. With the work presented here, I have shown that RSV and HIV-1 Gag can be isolated from chromatin fractions and that, in proteomic experiments, numerous chromatin-associated host proteins are recovered from affinity-tagged Gag purifications. We have also confirmed the interaction of the nuclear import protein TNPO3 with RSV Gag and have discovered an apparent interaction between Gag and the splicing factor SC35. This again leads to an innovative model in which RSV Gag undergoes nuclear trafficking using TNPO3, which brings Gag to splicing speckles, from which Gag can then traffic to active transcription sites to look for the viral RNA for packaging.

Appendix A

Interplay between the alpharetroviral Gag protein and SR proteins SF2 and SC35 in the nucleus

Breanna L. Rice*, Rebecca J. Kaddis*, Matthew S. Stake*, Timothy L. Lochmann, and Leslie J. Parent *Authors contributed equally

Citation: Rice BL, Kaddis RJ, Stake MS, Lochmann TL, Parent LJ. Interplay between the alpharetroviral Gag protein and SR proteins SF2 and SC35 in the nucleus. Frontiers in Microbiology. 2015;6:925. doi:10.3389/fmicb.2015.00925.

© 2015 Rice, Kaddis, Stake, Lochmann and Parent

Appendix B

Mass Spectrometry Tables

Table 3.1. Top 10 DAVID biological processes for proteins isolated from purified RSV Gag and Gag.ΔNC affinity purifications from DF1 nuclear lysates.

Gene Ontology Term (protein count) | Fold Enrichment (p-value) | Gene names for proteins isolated

GO:0006396 RNA processing (193) | 5.5 (6.68E-93) | EIF6, RPL17, CHERP, RPL19, PRPF4B, RPL14, RPL13, SNRPD3, RBM15B, RPL15, SNRPD1, SYNCRIP, RPLP2, RBM7, INTS2, WTAP, PNN, NONO, DDX27, KDM1A, WDR75, IMP3, DHX38, MAK16, RPLP0, RPL26L1, U2AF1, LUC7L2, RPL10, RPL12, DHX30, RPS27A, LUC7L3, RPL35A, SNRPA1, RPUSD4, CHTOP, GTPBP4, MAGOH, EFTUD2, HNRNPA2B1, DDX39B, MTPAP, HNRNPU, MRM3, MRTO4, RSL1D1, RCL1, NOP2, RPS16, RPS17, RPS14, RPS15, SNRPB, RPS13, RPS10, RPS11, RNF20, SNRPE, HSD17B10, FIP1L1, SNRPB2, PABPC4, RPS25, HNRNPM, DDX47, HNRNPK, RPS29, RPL7, RPL6, ZNF326, RPL9, RPL8, RPL3, NPM3, NAT10, RPS20, RPL4, RPL7A, PABPC1, RPL10A, RPS23, RPS24, PRPF40A, RPSA, TSR1, DDX1, RPL23A, DDX5, RPS6, RBMX, RPF2, HNRNPA0, RPS8, CTR9, RPS7, HNRNPH3, RPL18A, RPL37A, UTP20, HNRNPH1, NCOR2, GAR1, UTP15, SKIV2L2, UTP11, RPS2, YBX1, RPS3, DCAF13, DKC1, DGCR8, PLRG1, RPS3A, CDK12, FTSJ3, WDR33, CDK13, ZCCHC8, NOL6, MRPL1, EXOSC7, EXOSC3, CDK9, SPEN, RPS4X, GTF2H2, TRMT61B, TTF2, PRPF6, EIF4A3, C1QBP, SNRNP200, CPSF7, CPSF6, CPSF4, CPSF3, NHP2, CPSF2, WDR43, KIAA1429, PRPF38A, UTP4, TRA2B, UTP6, TRA2A, NOB1, RPL27A, RPL35, RPS15A, RPL36, SF3B6, RPL38, SF3B4, SF3B3, DIMT1, PRPF19, DROSHA, EXOSC10, SF3B1, RPL30, CHD7, RPL31, PRPF8, NUDT21, KIAA0391, RBM25, HSPA8, RTCB, BCAS2, PDCD11, RRP15, ALYREF, RPL27, ELAVL1, RPL24, AFF2, HNRNPDL, PPP1R9B, CSNK1D, RPL23, CSNK1E, RPL22, SFPQ, RPL21, POLDIP3, THRAP3, RPL7L1, PSPC1, NOP58, NOP56, CWC22, RBM17

GO:0010467 Gene expression (430) | 2.1 (5.30E-78) | RPL17, RPL19, RPL14, RPL13, RPL15, RPLP2, SYNCRIP, MED23, INTS2, REST, MED22, WTAP, MED20, WDR75, BRPF1, MAK16, DHX38, RPLP0, CLK4, ILK, MED27, LUC7L2, RPL10, RPL12, OGT, DHX30, LUC7L3, MCRS1, PCID2, RREB1, RXRA, RCOR1, TADA2B, MTPAP, MECOM, MYH9, MRM3, JUP, MAPK1, NME2, PTRF, RPS16, RPS17, EIF2S1, RPS14, JUN, RPS15, ZNF384, RPS13, RPS10, VGLL3, RPS11, MYBBP1A, HSD17B10, PABPC4, CHCHD3, AHCTF1, NIFK, SERPINH1, RPS25, RPS29, NAT10, RPS20, PABPC1, VEZF1, RPS23, RPS24, PRPF40A, RPSA, YEATS4, KLF13, PHB, RPS6, RPS8, RPS7, WDR61, SMURF2, RERE, GAR1, YBX1, ZNF148, WDR33, MRPL1, MRPL3, ACTA2, EXOSC7, ERLIN2, EXOSC3, ZNF143, TRMT61B, GTF2H2, TTF2, TAF11, EIF4A3, TAF12, EIF4A2, NUP205, FOXC2, RUVBL1, WDR43, KIAA1429, TUFM, ZNF276, C5, RPL35, NOB1, RPL36, RPL38, SF3B6, SF3B4, SF3B3, SF3B1, RPL30, EZR, RPL31, BRMS1L, ACTC1, PPHLN1, LMNA, ELAVL1, RPL27, RPL24, CSNK1D, RPL23, RPL22, CSNK1E, RPL21, SFPQ, POLDIP3, TCEB3, ZNF462, CWC22, EIF6, CHERP, PRPF4B, SNRPD3, RBM15B, SNRPD1, INO80, RBM7, TBP, DDX27, KDM1A, IMP3, CSNK2A1, SMARCD2, SMARCD1, RPS27A, RPL35A, SNRPA1, HNRNPA2B1, EEF2, ARID1B, HNRNPU, MRTO4, RCL1, SMARCE1, SMARCA5, TFAP2D, COL1A1, SMARCA2, MEAF6, SNRPB2, HNRNPM, DDX47, HNRNPK, RPL7, ZNF326, SMARCB1, RPL6, RPL9, RPL8, RPL3, HIST1H4C, TAF9, RPL7A, RPL4, RPL10A, TAF2, TAF1, TAF4, TAF3, EHMT1, CCPG1, MYO1C, TAF5, SEC11A, TAF8, TAF7, RPL23A, RBMX, SAFB2, HNRNPH3, HDAC2, SMARCC1, RPL37A, HNRNPH1, COA3, HP1BP3, YLPM1, PRDX4, SKIV2L2, RPS2, RPS3, MAX, DCAF13, OSR2, PLRG1, DKC1, GTF2A1, RPS3A, GTF2A2, CDK5RAP3, RHOG, INO80C, ZCCHC8, INO80B, CDK1, TBL1XR1, ZFX, CDK9, RPS4X, NCL, PRPF6, SNRNP200, UCHL5, CPSF7, CPSF6, CPSF4, CPSF3, CPSF2, DPF2, VIM, RPS15A, NFYA, NDC1, DIMT1, HIC2, CHD7, TMED2, C1QTNF3, PRPF8, CEBPZ, CHD2, CHD1, HSPA5, CHD4, HSPA8, RTCB, COL4A2, PDCD11, NFRKB, MRGBP, PWP1, PPP1R9B, MED30, MRPL40, NONO, SIN3A, PICALM, H2AFV, U2AF1, DYNC2H1, RPL26L1, MRPL37, H2AFY, MRPL39, MRPL34, TWIST2, RPUSD4, GTPBP4, MRPL51, CHTOP, RAN, YY1, NOP2, BAZ1A, BAZ1B, SERBP1, MRPL48, MRPL46, FIP1L1, TCF20, KRAS, PBRM1, NUP54, HNRNPAB, IKZF5, KLF5, IKZF2, SMAD5, DDX1, DDX5, RPF2, HNRNPA0, DDX6, SAP130, RPL18A, DLX6, UTP20, NCOR2, KLF3, UTP15, ZEB2, DMAP1, ZEB1, UTP11, SLC25A3, ACTR8, TOP2A, CTBP1, RBBP4, SLC25A4, SLC25A6, SPEN, MED6, MED4, MED9, ASH1L, CDK11B, FARSA, NHP2, PRPF38A, MED1, YWHAZ, LIMS1, RPL27A, ZBTB17, PRPF19, MRPL13, MRPL17, LANCL2, ACTL6A, KIAA0391, ETV6, THBS1, RBM25, BCAS2, RRP15, AFF2, HNRNPDL, MRPL30, SNAI1, MRPL23, MRPL22, MRPL21, ILF2, MRPL28, RPL7L1, NOP58, PBX1, NOP56, RBM17, MRPS35, LGMN, CBX3, MRPS30, PNN, TOP1, SLC25A24, SLC25A22, NUP133, MAGOH, EFTUD2, DDX39B, CHP1, PPP1CB, RSL1D1, SLTM, SNRPB, RNF20, SNRPE, MRPS17, CAV1, MRPS10, TRRAP, ZMYND8, NPM1, NPM3, UBAP2L, THAP11, FN1, EEF1A1, TSR1, MRPS23, WDR5, CTR9, PHF3, ITGA6, PHF6, UQCRC2, LARP1, DGCR8, CDK12, FTSJ3, CDK13, NOL6, MRPS5, LEF1, C1QBP, SGF29, MATR3, UTP4, ING3, ING2, TNC, TRA2B, UTP6, TRA2A, PRKDC, EXOSC10, RACK1, DROSHA, SAFB, NUDT21, GAPDH, ENO1, ERG, ALYREF, SAP30BP, NUP155, ATXN2, UQCC1, RPRD2, PHB2, SP3, THRAP3, PSPC1, NFIC, NFIA

GO:0016071 mRNA metabolic process (153) | 5.9 (1.28E-76) | RPL17, PRPF4B, RPL19, RPL14, RPL13, RBM15B, SNRPD3, RPL15, SNRPD1, SYNCRIP, RPLP2, RBM7, WTAP, PNN, NONO, KDM1A, DHX38, DNAJB11, RPLP0, RPL26L1, U2AF1, RPL10, LUC7L2, RPL12, RPS27A, LUC7L3, SNRPA1, RPL35A, CHTOP, PCID2, MAGOH, EFTUD2, HNRNPA2B1, DDX39B, MTPAP, HNRNPU, MRTO4, SLTM, RPS16, RPS17, RPS14, RPS15, SNRPB, RPS13, RPS10, RPS11, RNF20, SNRPE, FIP1L1, SNRPB2, RPS25, HNRNPM, DDX47, HNRNPK, RPS29, RPL7, ZNF326, RPL6, RPL9, RPL8, RPL3, RPS20, RPL4, PABPC1, RPL10A, RPL7A, RPS23, RPS24, PRPF40A, RPSA, DDX1, RPL23A, DDX5, RPS6, RBMX, HNRNPA0, RPS8, SAFB2, CTR9, DDX6, RPS7, HNRNPH3, RPL18A, RPL37A, HNRNPH1, EDC4, SKIV2L2, RPS2, YBX1, RPS3, PLRG1, DKC1, RPS3A, CDK12, WDR33, CDK13, ZCCHC8, EXOSC7, EXOSC3, CDK9, SPEN, RPS4X, TTF2, GTF2H2, PRPF6, EIF4A3, C1QBP, EIF4A2, SNRNP200, CPSF7, CPSF6, CDK11B, CPSF4, CPSF3, CPSF2, KIAA1429, PRPF38A, TRA2B, TRA2A, RPL27A, RPL35, RPS15A, RPL36, SF3B6, RPL38, SF3B4, SF3B3, PRPF19, EXOSC10, SF3B1, RPL30, RPL31, SAFB, PRPF8, NUDT21, RBM25, HSPA8, BCAS2, PDCD11, ALYREF, RPL27, ELAVL1, RPL24, AFF2, RPL23, RPL22, SFPQ, POLDIP3, RPL21, THRAP3, PSPC1, CWC22, RBM17

GO:0016072 rRNA metabolic process (102) | 9.7 (1.77E-72) | EIF6, RPL17, RPL19, RPL14, RPL13, RPL15, RPLP2, DDX27, WDR75, IMP3, MAK16, RPLP0, RPL26L1, RPL10, RPL12, RPS27A, RPL35A, GTPBP4, MRTO4, MRM3, RSL1D1, RCL1, NOP2, RPS16, RPS17, RPS14, RPS15, RPS13, RPS10, RPS11, NIFK, RPS25, DDX47, RPL7, RPS29, RPL6, RPL9, RPL8, RPL3, NPM3, NAT10, RPL7A, RPL10A, RPL4, RPS20, RPS23, RPS24, RPSA, TSR1, RPL23A, RPS6, RPF2, RPS8, RPS7, RPL18A, RPL37A, UTP20, GAR1, UTP15, SKIV2L2, RPS2, UTP11, RPS3, DCAF13, DKC1, RPS3A, FTSJ3, NOL6, MRPL1, EXOSC7, EXOSC3, RPS4X, TRMT61B, EIF4A3, NHP2, WDR43, UTP4, UTP6, RPL35, RPL27A, NOB1, RPS15A, RPL36, RPL38, DROSHA, EXOSC10, DIMT1, RPL30, CHD7, RPL31, PDCD11, RRP15, RPL27, RPL24, CSNK1D, RPL23, CSNK1E, RPL22, RPL21, RPL7L1, NOP58, NOP56

GO:0006364 rRNA processing (100) | 9.7 (2.69E-71) | EIF6, RPL17, RPL19, RPL14, RPL13, RPL15, RPLP2, DDX27, WDR75, IMP3, MAK16, RPLP0, RPL26L1, RPL10, RPL12, RPS27A, RPL35A, GTPBP4, MRTO4, MRM3, RSL1D1, RCL1, NOP2, RPS16, RPS17, RPS14, RPS15, RPS13, RPS10, RPS11, RPS25, DDX47, RPL7, RPS29, RPL6, RPL9, RPL8, RPL3, NPM3, NAT10, RPL7A, RPL10A, RPL4, RPS20, RPS23, RPS24, RPSA, TSR1, RPL23A, RPS6, RPF2, RPS8, RPS7, RPL18A, RPL37A, UTP20, GAR1, UTP15, SKIV2L2, RPS2, UTP11, RPS3, DCAF13, DKC1, RPS3A, FTSJ3, NOL6, MRPL1, EXOSC7, EXOSC3, RPS4X, TRMT61B, EIF4A3, NHP2, WDR43, UTP4, UTP6, RPL35, RPL27A, NOB1, RPL36, RPS15A, RPL38, EXOSC10, DIMT1, RPL30, CHD7, RPL31, PDCD11, RRP15, RPL27, RPL24, CSNK1D, RPL23, CSNK1E, RPL22, RPL21, RPL7L1, NOP58, NOP56

GO:0022613 Ribonucleoprotein complex biogenesis (126) | 6.9 (4.40E-70) | EIF6, RPL17, RPL19, RPL14, RPL13, SNRPD3, RPL15, SNRPD1, RPLP2, DDX27, WDR75, IMP3, MAK16, RPLP0, RPL26L1, RPL10, LUC7L2, RPL12, DHX30, RPS27A, LUC7L3, RPL35A, GTPBP4, RAN, DDX39B, MRM3, MRTO4, RSL1D1, RCL1, NOP2, RPS16, RPS17, RPS14, RPS15, SNRPB, RPS13, RPS10, RPS11, SNRPE, RPS25, DDX47, RPS29, RPL7, RPL6, RPL9, NPM1, BRIX1, RPL8, RPL3, NPM3, TAF9, NAT10, RPS20, RPL4, RPL10A, RPL7A, RPS23, RPS24, RPSA, TSR1, DDX1, RPL23A, RPS6, RBMX, RPF2, RPS8, RPS7, DDX6, RPL18A, NOP16, RPL37A, UTP20, GAR1, UTP15, SKIV2L2, UTP11, RPS2, RPS3, DCAF13, DKC1, RPS3A, FTSJ3, NOL6, MRPL1, EXOSC7, NIP7, EXOSC3, RPS4X, TRMT61B, PRPF6, EIF4A3, C1QBP, SNRNP200, RUVBL1, WDR43, NHP2, UTP4, UTP6, NOB1, RPL27A, RPL35, RPS15A, RPL36, RPL38, DIMT1, EXOSC10, DROSHA, PRPF19, SF3B1, RPL30, CHD7, RPL31, PRPF8, PDCD11, RRP15, RPL27, RPL24, ATXN2, CSNK1D, RPL23, CSNK1E, RPL22, RPL21, RPL7L1, NOP58, NOP56

GO:0042254 Ribosome biogenesis (108) | 8.4 (9.04E-70) | EIF6, RPL17, RPL19, RPL14, RPL13, RPL15, RPLP2, DDX27, WDR75, IMP3, MAK16, RPLP0, RPL26L1, RPL10, RPL12, DHX30, RPS27A, RPL35A, GTPBP4, RAN, MRTO4, MRM3, RSL1D1, RCL1, NOP2, RPS16, RPS17, RPS14, RPS15, RPS13, RPS10, RPS11, RPS25, DDX47, RPS29, RPL7, RPL6, RPL9, BRIX1, NPM1, RPL8, RPL3, NPM3, NAT10, RPL10A, RPL7A, RPL4, RPS20, RPS23, RPS24, RPSA, TSR1, RPL23A, RPS6, RPF2, RPS8, RPS7, RPL18A, NOP16, RPL37A, UTP20, GAR1, UTP15, SKIV2L2, RPS2, UTP11, RPS3, DCAF13, DKC1, RPS3A, FTSJ3, NOL6, MRPL1, EXOSC7, NIP7, EXOSC3, RPS4X, TRMT61B, EIF4A3, C1QBP, WDR43, NHP2, UTP4, UTP6, RPL35, RPL27A, NOB1, RPS15A, RPL36, RPL38, DROSHA, EXOSC10, DIMT1, RPL30, CHD7, RPL31, PDCD11, RRP15, RPL27, RPL24, CSNK1D, RPL23, CSNK1E, RPL22, RPL21, RPL7L1, NOP58, NOP56

GO:0034641 Cellular nitrogen compound metabolic process (462) | 1.9 (2.74E-67) | LDHB, RPL17, LDHA, RPL19, RPL14, RPL13, RPL15, RPLP2, SYNCRIP, MED23, INTS2, REST, MED22, WTAP, MED20, WDR75, BRPF1, MAK16, DHX38, RPLP0, CLK4, ILK, MED27, LUC7L2, RPL10, RPL12, OGT, DHX30, LUC7L3, MCRS1, PCID2, RREB1, RCOR1, RXRA, TADA2B, MTPAP, MECOM, MRM3, JUP, MAPK1, NME2, NME3, PTRF, RPS16, RPS17, EIF2S1, RPS14, JUN, RPS15, ZNF384, RPS13, RPS10, VGLL3, RPS11, MYBBP1A, MYO18A, HSD17B10, GNAI2, PABPC4, CHCHD3, AHCTF1, NIFK, RPS25, RPS29, NAT10, RPS20, PABPC1, VEZF1, RPS23, PRPF40A, RPS24, RPSA, YEATS4, MKI67, KLF13, TAOK1, PHB, RPS6, RPS8, RPS7, WDR61, SMURF2, RERE, GAR1, YBX1, ZNF148, WDR33, MRPL1, NDUFB10, MRPL3, EXOSC7, ERLIN2, EXOSC3, ZNF143, NDUFA10, TRMT61B, GTF2H2, TTF2, TAF11, EIF4A3, TAF12, EIF4A2, NUP205, FOXC2, RUVBL1, WDR43, KIAA1429, TUFM, ZNF276, RPL35, NOB1, RPL36, RPL38, SF3B6, SF3B4, SF3B3, SF3B1, RPL30, EZR, RPL31, BRMS1L, SHMT1, PPHLN1, LMNA, ELAVL1, RPL27, RPL24, CSNK1D, RPL23, RPL22, CSNK1E, RPL21, SFPQ, POLDIP3, TCEB3, ZNF462, CWC22, EIF6, CHERP, PRPF4B, SNRPD3, RBM15B, SNRPD1, INO80, TBP, RBM7, DDX27, KDM1A, IMP3, CSNK2A1, SMARCD2, SMARCD1, RPS27A, RPL35A, SNRPA1, HNRNPA2B1, ACTN1, EEF2, ARID1B, HNRNPU, MRTO4, RCL1, SMARCE1, SMARCA5, TFAP2D, COL1A1, SMARCA2, ABCA7, MEAF6, SNRPB2, HNRNPM, DDX47, HNRNPK, RPL7, ZNF326, SMARCB1, RPL6, RPL9, RPL8, RPL3, TAF9, HIST1H4C, RPL10A, RPL7A, RPL4, DDX42, TAF2, TAF1, TAF4, TAF3, EHMT1, CCPG1, TAF5, SEC11A, TAF8, TAF7, RPL23A, RBMX, SAFB2, HNRNPH3, HDAC2, VCP, SMARCC1, DDX50, RPL37A, HNRNPH1, COA3, HP1BP3, YLPM1, EDC4, SKIV2L2, RPS2, RPS3, DCAF13, MAX, OSR2, PLRG1, DKC1, GTF2A1, RPS3A, GTF2A2, CDK5RAP3, NDUFS2, RHOG, INO80C, ZCCHC8, INO80B, CDK1, TBL1XR1, SSBP1, ZFX, CDK9, RPS4X, NCL, PRPF6, SNRNP200, UCHL5, CPSF7, CPSF6, CPSF4, CPSF3, CPSF2, DPF2, RPS15A, NFYA, SMUG1, NDC1, DIMT1, HIC2, CHD7, PRPF8, CEBPZ, CHD2, CHD1, HSPA5, CHD4, HSPA8, RTCB, COL4A2, PDCD11, NFRKB, ATP5F1, MRGBP, SMC3, PWP1, PPP1R9B, MED30, MRPL40, NONO, ATP2B2, SIN3A, DDX18, PICALM, H2AFV, DNAJB11, U2AF1, RPL26L1, MRPL37, H2AFY, MRPL39, MRPL34, TWIST2, CHTOP, RPUSD4, GTPBP4, MRPL51, RAN, YY1, RFC3, NOP2, BAZ1A, RFC4, BAZ1B, MRPL48, MRPL46, MDH2, FIP1L1, TALDO1, AKAP12, ACAT1, TCF20, KRAS, TMED10, PBRM1, NUP54, HNRNPAB, NAXD, IKZF5, KLF5, IKZF2, SMAD5, DDX1, DDX5, RPF2, EPHA2, HNRNPA0, DDX6, CCT4, SAP130, RPL18A, DLX6, CCT8, UTP20, NCOR2, KLF3, ATP5B, UTP15, ZEB2, DMAP1, ZEB1, UTP11, GBAS, SLC25A3, ATP5L, ATP5O, ACTR8, TOP2B, TOP2A, CTBP1, RBBP4, SLC25A4, SLC25A6, SPEN, MED6, MED4, MED9, ASH1L, CDK11B, FARSA, NHP2, MED1, PRPF38A, LIMS1, RPL27A, ZBTB17, PRPF19, MRPL13, MRPL17, LANCL2, ACTL6A, KIAA0391, ETV6, THBS1, RBM25, BCAS2, PDS5B, RRP15, AFF2, HNRNPDL, MRPL30, SNAI1, MRPL23, MRPL22, MRPL21, MRPL28, ILF2, RPL7L1, NOP58, PBX1, NOP56, ATP5A1, RBM17, MRPS35, CBX3, MRPS30, PNN, TOP1, SLC25A24, SLC25A22, NUP133, MAGOH, EFTUD2, DDX39B, CHP1, RSL1D1, SLTM, SNRPB, RNF20, SNRPE, MRPS17, CAV1, MRPS10, TRRAP, ZMYND8, NPM1, NPM3, THAP11, EEF1A1, TSR1, MRPS23, WDR5, CTR9, PHF3, ITGA6, PHF6, UQCRC2, HSP90AB1, PRKAG2, LARP1, DGCR8, CDK12, KPNB1, FTSJ3, CDK13, NOL6, HSP90AA1, MRPS5, LEF1, C1QBP, SGF29, KPNA2, UTP4, ING3, ING2, TRA2B, TRA2A, UTP6, PRKDC, EXOSC10, RACK1, DROSHA, SAFB, NUDT21, GAPDH, ENO1, NDUFA4, ERG, ERH, ALYREF, SAP30BP, NUP155, MPG, ATXN2, UQCC1, RPRD2, SP3, PHB2, THRAP3, PSPC1, NFIC, NFIA, MGST1

GO:0016070 RNA metabolic process (380) | 2.1 (1.05E-64) | RPL17, RPL19, RPL14, RPL13, RPL15, RPLP2, SYNCRIP, INTS2, MED23, MED22, REST, WTAP, MED20, NONO, WDR75, BRPF1, SIN3A, PICALM, DDX18, H2AFV, DNAJB11, MAK16, DHX38, RPLP0, CLK4, ILK, MED27, U2AF1, RPL26L1, H2AFY, RPL10, LUC7L2, OGT, RPL12, MRPL39, DHX30, TWIST2, LUC7L3, RPUSD4, GTPBP4, CHTOP, MCRS1, RREB1, PCID2, RAN, RXRA, RCOR1, YY1, TADA2B, MTPAP, MECOM, MRM3, JUP, MAPK1, NME2, NOP2, BAZ1A, RPS16, BAZ1B, PTRF, RPS17, RPS14, JUN, RPS15, ZNF384, RPS13, RPS10, VGLL3, RPS11, MYBBP1A, HSD17B10, FIP1L1, PABPC4, CHCHD3, AHCTF1, NIFK, RPS25, TCF20, KRAS, RPS29, PBRM1, NAT10, RPS20, PABPC1, NUP54, VEZF1, RPS23, RPS24, PRPF40A, HNRNPAB, IKZF5, KLF5, RPSA, YEATS4, IKZF2, KLF13, PHB, SMAD5, DDX1, RPS6, DDX5, RPF2, RPS8, HNRNPA0, RPS7, DDX6, SAP130, WDR61, RPL18A, DLX6, SMURF2, UTP20, RERE, NCOR2, KLF3, GAR1, UTP15, ZEB2, ZEB1, DMAP1, UTP11, YBX1, ZNF148, ACTR8, TOP2A, WDR33, MRPL1, CTBP1, RBBP4, EXOSC7, ERLIN2, EXOSC3, ZNF143, SPEN, TTF2, GTF2H2, TRMT61B, TAF11, MED6, EIF4A3, MED4, TAF12, EIF4A2, NUP205, MED9, ASH1L, FOXC2, CDK11B, RUVBL1, FARSA, WDR43, NHP2, KIAA1429, PRPF38A, MED1, ZNF276, LIMS1, RPL27A, RPL35, NOB1, RPL36, RPL38, SF3B6, ZBTB17, SF3B4, SF3B3, PRPF19, SF3B1, RPL30, EZR, RPL31, LANCL2, ACTL6A, KIAA0391, ETV6, BRMS1L, RBM25, BCAS2, PPHLN1, RRP15, LMNA, RPL27, ELAVL1, RPL24, AFF2, HNRNPDL, SNAI1, CSNK1D, RPL23, ILF2, CSNK1E, RPL22, RPL21, POLDIP3, SFPQ, RPL7L1, NOP58, TCEB3, PBX1, ZNF462, NOP56, CWC22, RBM17, EIF6, CHERP, PRPF4B, RBM15B, SNRPD3, SNRPD1, INO80, CBX3, RBM7, TBP, PNN, DDX27, KDM1A, IMP3, CSNK2A1, SMARCD2, SMARCD1, RPS27A, NUP133, RPL35A, SNRPA1, MAGOH, EFTUD2, DDX39B, HNRNPA2B1, ACTN1, CHP1, ARID1B, HNRNPU, MRTO4, RSL1D1, SLTM, RCL1, SMARCE1, SMARCA5, SNRPB, TFAP2D, COL1A1, RNF20, SMARCA2, SNRPE, MEAF6, CAV1, SNRPB2, TRRAP, ZMYND8, HNRNPM, DDX47, HNRNPK, RPL7, ZNF326, SMARCB1, RPL6, RPL9, NPM1, RPL8, RPL3, NPM3, TAF9, HIST1H4C, RPL10A, RPL7A, RPL4, THAP11, DDX42, TAF2, TAF1, EEF1A1, TAF4, TAF3, EHMT1, TSR1, CCPG1, TAF5, WDR5, TAF8, TAF7, RPL23A, RBMX, SAFB2, CTR9, PHF3, HNRNPH3, HDAC2, ITGA6, SMARCC1, DDX50, RPL37A, HNRNPH1, PHF6, HP1BP3, YLPM1, EDC4, SKIV2L2, RPS2, RPS3, DCAF13, MAX, OSR2, PLRG1, GTF2A1, DGCR8, DKC1, RPS3A, GTF2A2, CDK12, CDK5RAP3, RHOG, INO80C, FTSJ3, CDK13, INO80B, ZCCHC8, NOL6, CDK1, TBL1XR1, ZFX, LEF1, CDK9, RPS4X, NCL, PRPF6, C1QBP, SNRNP200, UCHL5, CPSF7, CPSF6, SGF29, CPSF4, CPSF3, CPSF2, DPF2, UTP4, ING3, ING2, TRA2B, TRA2A, UTP6, RPS15A, PRKDC, NFYA, NDC1, EXOSC10, DROSHA, DIMT1, HIC2, CHD7, PRPF8, SAFB, NUDT21, CEBPZ, CHD2, CHD1, HSPA5, CHD4, HSPA8, ENO1, RTCB, COL4A2, ERG, PDCD11, NFRKB, ALYREF, SAP30BP, NUP155, MRGBP, PWP1, PPP1R9B, ATXN2, MED30, RPRD2, SP3, PHB2, THRAP3, PSPC1, NFIC, NFIA

GO:0006807 Nitrogen compound metabolic process (474) | 1.8 (2.62E-64) | LDHB, RPL17, LDHA, RPL19, RPL14, RPL13, RPL15, RPLP2, SYNCRIP, INTS2, MED23, REST, MED22, WTAP, MED20, WDR75, BRPF1, MAK16, DHX38, RPLP0, CLK4, ILK, MED27, LUC7L2, RPL10, RPL12, OGT, DHX30, LUC7L3, MCRS1, PCID2, RREB1, RCOR1, RXRA, TADA2B, MTPAP, MECOM, LPCAT2, MRM3, JUP, MAPK1, NME2, NME3, PTRF, RPS16, RPS17, EIF2S1, RPS14, JUN, RPS15, ZNF384, RPS13, RPS10, VGLL3, RPS11, MYBBP1A, MYO18A, HSD17B10, GNAI2, PABPC4, CHCHD3, AHCTF1, NIFK, RPS25, RPS29, NAT10, RPS20, PABPC1, VEZF1, RPS23, PRPF40A, RPS24, RPSA, YEATS4, MKI67, KLF13, TAOK1, PHB, RPS6, RPS8, RPS7, WDR61, SMURF2, RERE, GAR1, YBX1, AUH, ZNF148, WDR33, MRPL1, NDUFB10, MRPL3, EXOSC7, ERLIN2, EXOSC3, ZNF143, NDUFA10, TTF2, TRMT61B, GTF2H2, TAF11, EIF4A3, TAF12, EIF4A2, NUP205, FOXC2, RUVBL1, WDR43, KIAA1429, TUFM, ZNF276, ALDH18A1, RPL35, NOB1, RPL36, RPL38, SF3B6, SF3B4, SF3B3, SF3B1, RPL30, EZR, RPL31, BRMS1L, SHMT1, PPHLN1, LMNA, ELAVL1, RPL27, RPL24, CSNK1D, RPL23, RPL22, CSNK1E, RPL21, SFPQ, POLDIP3, TCEB3, ZNF462, CWC22, EIF6, CHERP, PRPF4B, RBM15B, SNRPD3, SNRPD1, INO80, TBP, RBM7, DDX27, KDM1A, IMP3, CSNK2A1, SMARCD2, SMARCD1, RPS27A, RPL35A, SNRPA1, HNRNPA2B1, ACTN1, EEF2, ARID1B, HNRNPU, MRTO4, RCL1, SMARCE1, SMARCA5, TFAP2D, COL1A1, SMARCA2, ABCA7, MEAF6, HACD3, GLUD1, SNRPB2, HNRNPM, DDX47, HNRNPK, RPL7, ZNF326, SMARCB1, RPL6, RPL9, RPL8, RPL3, TAF9, HIST1H4C, RPL10A, RPL7A, RPL4, DDX42, TAF2, TAF1, TAF4, TAF3, EHMT1, CCPG1, TAF5, SEC11A, TAF8, TAF7, RPL23A, RBMX, SAFB2, HNRNPH3, HDAC2, VCP, SMARCC1, DDX50, RPL37A, HNRNPH1, COA3, HP1BP3, YLPM1, PRDX4, EDC4, SKIV2L2, RPS2, RPS3, DCAF13, MAX, OSR2, PLRG1, DKC1, GTF2A1, RPS3A, GTF2A2, CDK5RAP3, NDUFS2, RHOG, INO80C, ZCCHC8, INO80B, CDK1, TBL1XR1, SSBP1, ZFX, CDK9, RPS4X, NCL, PRPF6, SNRNP200, UCHL5, CPSF7, CPSF6, CPSF4, CPSF3, CPSF2, DPF2, RPS15A, NFYA, SMUG1, NDC1, DIMT1, HIC2, CHD7, PRPF8, CEBPZ, CHD2, CHD1, HSPA5, CHD4, HSPA8, RTCB, COL4A2, PDCD11, NFRKB, ATP5F1, MRGBP, SMC3, PWP1, PPP1R9B, DBT, MED30, MRPL40, NONO, ATP2B2, SIN3A, DDX18, PICALM, H2AFV, DNAJB11, U2AF1, RPL26L1, MRPL37, H2AFY, MRPL39, MRPL34, TWIST2, CHTOP, RPUSD4, GTPBP4, MRPL51, RAN, YY1, RFC3, NOP2, BAZ1A, RFC4, BAZ1B, MRPL48, MRPL46, MDH2, FIP1L1, TALDO1, AKAP12, ACAT1, TCF20, KRAS, TMED10, PBRM1, NUP54, HNRNPAB, NAXD, IKZF5, KLF5, IKZF2, SMAD5, DDX1, DDX5, RPF2, EPHA2, HNRNPA0, DDX6, CCT4, SAP130, RPL18A, DLX6, CCT8, UTP20, NCOR2, KLF3, ATP5B, UTP15, ZEB2, DMAP1, ZEB1, UTP11, GBAS, SLC25A3, ATP5L, ATP5O, ACTR8, TOP2B, TOP2A, CTBP1, RBBP4, SLC25A4, SLC25A6, SPEN, MED6, MED4, MED9, ASH1L, CDK11B, FARSA, NHP2, MED1, PRPF38A, LIMS1, RPL27A, ZBTB17, PRPF19, MRPL13, MRPL17, LANCL2, ACTL6A, KIAA0391, ETV6, THBS1, RBM25, BCAS2, PDS5B, RRP15, HSPG2, AFF2, HNRNPDL, MRPL30, SNAI1, MRPL23, MRPL22, MRPL21, MRPL28, ILF2, RPL7L1, NOP58, PBX1, NOP56, ATP5A1, RBM17, MRPS35, CBX3, MRPS30, PNN, TOP1, SLC25A24, SLC25A22, NUP133, MAGOH, EFTUD2, DDX39B, CHP1, RSL1D1, SLTM, PYCR2, SNRPB, RNF20, SNRPE, MRPS17, CAV1, MRPS10, TRRAP, ZMYND8, NPM1, NPM3, THAP11, P4HB, EEF1A1, TSR1, MRPS23, WDR5, CTR9, PHF3, ITGA6, PHF6, UQCRC2, HSP90AB1, PRKAG2, CLTC, LARP1, DGCR8, CDK12, ABHD12, KPNB1, FTSJ3, CDK13, NOL6, HSP90AA1, MRPS5, LEF1, C1QBP, SGF29, KPNA2, UTP4, ING3, ING2, TRA2B, TRA2A, UTP6, PRKDC, EXOSC10, RACK1, DROSHA, SAFB, NUDT21, GAPDH, ENO1, NDUFA4, ERG, ERH, ALYREF, SAP30BP, NUP155, MPG, ATXN2, UQCC1, RPRD2, SP3, PHB2, THRAP3, PSPC1, NFIC, NFIA, MGST1

The gene ontology (GO) term, biological function, and number of proteins found under the term (in parentheses) are listed in the first column. The second column shows the fold enrichment, which measures the magnitude of enrichment of the GO term by comparing the number of genes under this term in the sample list to the number of genes under this term in the background gene list. The p-value, which assesses the significance of the GO term enrichment using a modified Fisher’s exact test, is also given.
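The two statistics named in the note to Table 3.1 can be illustrated with a short calculation. The sketch below is not the DAVID implementation; it assumes the commonly described definitions, in which fold enrichment is the ratio of the sample-list proportion to the background proportion, and the modified Fisher's exact test (the EASE score) is a one-sided Fisher's exact test computed after removing one gene from the sample-list hits. The function names and the counts in the usage lines are hypothetical and are not taken from the data in this dissertation.

```python
from math import comb


def fold_enrichment(list_hits, list_total, pop_hits, pop_total):
    """Ratio of the sample-list proportion to the background proportion."""
    return (list_hits / list_total) / (pop_hits / pop_total)


def fisher_one_sided(a, b, c, d):
    """One-sided (enrichment) Fisher's exact test for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins whose top-left cell is at least a.
    """
    row1 = a + b          # sample-list genes
    col1 = a + c          # genes annotated with the term
    n = a + b + c + d     # all genes in the table
    upper = min(row1, col1)
    numerator = sum(
        comb(col1, x) * comb(n - col1, row1 - x) for x in range(a, upper + 1)
    )
    return numerator / comb(n, row1)


def ease_score(list_hits, list_total, pop_hits, pop_total):
    """Assumed EASE-style p-value: Fisher's test with one list hit removed."""
    a = max(list_hits - 1, 0)        # penalized count of list genes in the term
    b = list_total - list_hits       # list genes not in the term
    c = pop_hits                     # background genes in the term
    d = pop_total - pop_hits         # background genes not in the term
    return fisher_one_sided(a, b, c, d)


# Hypothetical counts: 3 of 300 list genes and 40 of 30000 background genes
# carry the term of interest.
fe = fold_enrichment(3, 300, 40, 30000)
p_exact = fisher_one_sided(3, 297, 40, 29960)
p_ease = ease_score(3, 300, 40, 30000)
print(fe, p_exact, p_ease)
```

Removing one hit before testing makes the p-value more conservative, which mainly penalizes terms supported by only a few genes; terms with hundreds of hits, as in Table 3.1, are barely affected.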

Table 3.2. Top 10 nuclear enriched DAVID biological processes for proteins isolated from purified RSV Gag and Gag.ΔNC affinity purifications from DF1 nuclear lysates.

Gene Ontology Term Fold Enrichment Gene names for proteins isolated (protein count) (p-value) RPL17, RPL19, RPL13, RPL15, SYNCRIP, INTS2, MED23, MED22, REST, WTAP, MED20, NONO, WDR75, BRPF1, GO:0090304 2.6 SIN3A, PICALM, DDX18, H2AFV, DNAJB11, MAK16, DHX38, RPLP0, CLK4, ILK, MED27, U2AF1, H2AFY, LUC7L2, (2.25E-105) OGT, RPL12, DHX30, TWIST2, LUC7L3, GTPBP4, CHTOP, MCRS1, RREB1, PCID2, RAN, RXRA, RCOR1, YY1, Nucleic acid metabolic TADA2B, MECOM, JUP, MAPK1, NME2, RFC3, NOP2, BAZ1A, RFC4, RPS16, BAZ1B, PTRF, RPS17, RPS14, JUN, process RPS15, ZNF384, RPS13, RPS10, VGLL3, RPS11, MYBBP1A, MYO18A, FIP1L1, PABPC4, CHCHD3, AHCTF1, NIFK, (364) RPS25, TCF20, RPS29, PBRM1, NAT10, RPS20, PABPC1, NUP54, VEZF1, RPS23, RPS24, PRPF40A, HNRNPAB, IKZF5, KLF5, RPSA, YEATS4, IKZF2, MKI67, KLF13, PHB, SMAD5, DDX1, RPS6, DDX5, RPF2, RPS8, HNRNPA0, RPS7, DDX6, CCT4, SAP130, WDR61, DLX6, CCT8, SMURF2, UTP20, RERE, NCOR2, KLF3, GAR1, UTP15, ZEB2, ZEB1, DMAP1, UTP11, YBX1, ZNF148, ACTR8, TOP2B, TOP2A, WDR33, CTBP1, RBBP4, EXOSC7, EXOSC3, ZNF143, SPEN, TTF2, GTF2H2, TAF11, MED6, EIF4A3, MED4, TAF12, NUP205, MED9, ASH1L, FOXC2, CDK11B, RUVBL1, WDR43, NHP2, KIAA1429, PRPF38A, MED1, ZNF276, RPL35, NOB1, RPL36, SF3B6, ZBTB17, SF3B4, SF3B3, PRPF19, SF3B1, RPL30, EZR, LANCL2, ACTL6A, ETV6, BRMS1L, RBM25, BCAS2, PDS5B, PPHLN1, RRP15, LMNA, RPL27, ELAVL1, AFF2, HNRNPDL, SNAI1, CSNK1D, RPL23, ILF2, CSNK1E, RPL22, RPL21, POLDIP3, SFPQ, RPL7L1, NOP58, TCEB3, PBX1, ZNF462, NOP56, CWC22, RBM17, EIF6, PRPF4B, RBM15B, SNRPD3, SNRPD1, INO80, CBX3, RBM7, TBP, PNN, DDX27, TOP1, KDM1A, IMP3, CSNK2A1, SMARCD2, SMARCD1, RPS27A, NUP133, SNRPA1, MAGOH, EFTUD2, DDX39B, HNRNPA2B1, CHP1, ARID1B, HNRNPU, MRTO4, RSL1D1, SLTM, RCL1, SMARCE1, SMARCA5, SNRPB, TFAP2D, RNF20, SMARCA2, SNRPE, MEAF6, SNRPB2, TRRAP, ZMYND8, HNRNPM, DDX47, HNRNPK, RPL7, ZNF326, SMARCB1, RPL6, RPL9, NPM1, RPL8, RPL3, NPM3, TAF9, HIST1H4C, RPL10A, RPL7A, RPL4, THAP11, DDX42, TAF2, TAF1, EEF1A1, TAF4, TAF3, EHMT1, TSR1, TAF5, WDR5, TAF8, TAF7, 
RPL23A, RBMX, SAFB2, CTR9, HNRNPH3, HDAC2, VCP, SMARCC1, DDX50, RPL37A, HNRNPH1, PHF6, HP1BP3, YLPM1, EDC4, SKIV2L2, RPS2, RPS3, DCAF13, MAX, OSR2, PLRG1, DGCR8, DKC1, GTF2A1, RPS3A, GTF2A2, CDK12, CDK5RAP3, KPNB1, INO80C, FTSJ3, CDK13, INO80B, ZCCHC8, NOL6, CDK1, TBL1XR1, SSBP1, ZFX, LEF1, CDK9, RPS4X, NCL, PRPF6, C1QBP, SNRNP200, UCHL5, CPSF7, CPSF6, SGF29, CPSF4, CPSF3, CPSF2, KPNA2, DPF2, UTP4, ING3, ING2, TRA2B, TRA2A, UTP6, RPS15A, PRKDC, NFYA, SMUG1, NDC1, EXOSC10, DROSHA, DIMT1, HIC2, CHD7, PRPF8, SAFB, NUDT21, CEBPZ, CHD2, CHD1, HSPA5, CHD4, HSPA8, ENO1, RTCB, ERG, PDCD11, NFRKB, ALYREF, SAP30BP, NUP155, MRGBP, SMC3, PWP1, MPG, PPP1R9B, ATXN2, MED30, RPRD2, SP3, PHB2, THRAP3, PSPC1, NFIC, NFIA RPL17, RPL19, RPL13, RPL15, SYNCRIP, INTS2, MED23, REST, MED22, WTAP, MED20, NONO, WDR75, BRPF1, GO:0016070 2.7 SIN3A, PICALM, DDX18, H2AFV, DNAJB11, MAK16, DHX38, RPLP0, CLK4, ILK, MED27, U2AF1, H2AFY, LUC7L2, (7.48E-105) OGT, RPL12, DHX30, TWIST2, LUC7L3, GTPBP4, CHTOP, MCRS1, RREB1, PCID2, RAN, RXRA, RCOR1, YY1, RNA metabolic process TADA2B, MECOM, JUP, MAPK1, NME2, NOP2, BAZ1A, RPS16, BAZ1B, PTRF, RPS17, RPS14, JUN, RPS15, ZNF384, (348) RPS13, RPS10, VGLL3, RPS11, MYBBP1A, FIP1L1, PABPC4, CHCHD3, AHCTF1, NIFK, RPS25, TCF20, RPS29, PBRM1, NAT10, RPS20, NUP54, PABPC1, VEZF1, RPS23, RPS24, PRPF40A, HNRNPAB, IKZF5, KLF5, RPSA, YEATS4, IKZF2, KLF13, PHB, SMAD5, DDX1, RPS6, DDX5, RPF2, RPS8, HNRNPA0, RPS7, DDX6, SAP130, WDR61,


DLX6, SMURF2, UTP20, RERE, NCOR2, KLF3, GAR1, UTP15, ZEB2, ZEB1, DMAP1, UTP11, YBX1, ZNF148, ACTR8, TOP2A, WDR33, CTBP1, RBBP4, EXOSC7, EXOSC3, ZNF143, SPEN, TTF2, GTF2H2, TAF11, MED6, EIF4A3, MED4, TAF12, NUP205, MED9, ASH1L, FOXC2, CDK11B, RUVBL1, WDR43, NHP2, KIAA1429, PRPF38A, MED1, ZNF276, NOB1, RPL35, RPL36, SF3B6, ZBTB17, SF3B4, SF3B3, PRPF19, SF3B1, RPL30, EZR, LANCL2, ACTL6A, ETV6, BRMS1L, RBM25, BCAS2, PPHLN1, RRP15, LMNA, RPL27, ELAVL1, AFF2, HNRNPDL, SNAI1, CSNK1D, RPL23, ILF2, CSNK1E, RPL22, RPL21, POLDIP3, SFPQ, RPL7L1, NOP58, TCEB3, PBX1, ZNF462, NOP56, CWC22, RBM17, EIF6, PRPF4B, RBM15B, SNRPD3, SNRPD1, INO80, CBX3, RBM7, TBP, PNN, DDX27, KDM1A, IMP3, CSNK2A1, SMARCD2, SMARCD1, RPS27A, NUP133, SNRPA1, EFTUD2, MAGOH, DDX39B, HNRNPA2B1, CHP1, ARID1B, HNRNPU, MRTO4, RSL1D1, SLTM, RCL1, SMARCE1, SMARCA5, SNRPB, TFAP2D, RNF20, SMARCA2, SNRPE, MEAF6, SNRPB2, TRRAP, ZMYND8, HNRNPM, DDX47, HNRNPK, RPL7, ZNF326, RPL6, SMARCB1, RPL9, NPM1, RPL8, RPL3, NPM3, TAF9, HIST1H4C, RPL10A, RPL7A, RPL4, THAP11, DDX42, TAF2, TAF1, EEF1A1, TAF4, TAF3, EHMT1, TSR1, TAF5, WDR5, TAF8, TAF7, RPL23A, RBMX, SAFB2, CTR9, HNRNPH3, HDAC2, SMARCC1, DDX50, RPL37A, HNRNPH1, PHF6, HP1BP3, YLPM1, EDC4, SKIV2L2, RPS2, RPS3, DCAF13, MAX, OSR2, PLRG1, DGCR8, DKC1, GTF2A1, RPS3A, GTF2A2, CDK12, CDK5RAP3, INO80C, FTSJ3, CDK13, INO80B, ZCCHC8, NOL6, CDK1, TBL1XR1, ZFX, LEF1, CDK9, RPS4X, NCL, PRPF6, C1QBP, SNRNP200, UCHL5, CPSF7, CPSF6, SGF29, CPSF4, CPSF3, CPSF2, DPF2, UTP4, ING3, ING2, TRA2B, TRA2A, UTP6, RPS15A, PRKDC, NFYA, NDC1, EXOSC10, DROSHA, DIMT1, HIC2, CHD7, PRPF8, SAFB, NUDT21, CEBPZ, CHD2, CHD1, HSPA5, HSPA8, CHD4, ENO1, RTCB, ERG, PDCD11, NFRKB, ALYREF, SAP30BP, NUP155, MRGBP, PWP1, PPP1R9B, ATXN2, MED30, RPRD2, SP3, PHB2, THRAP3, PSPC1, NFIC, NFIA EIF6, RPL17, RPL19, PRPF4B, RPL13, SNRPD3, RBM15B, RPL15, SNRPD1, SYNCRIP, RBM7, INTS2, WTAP, PNN, GO:0006396 7.1 NONO, DDX27, KDM1A, WDR75, IMP3, DHX38, MAK16, RPLP0, U2AF1, LUC7L2, RPL12, DHX30, RPS27A, LUC7L3, (1.76E-103) SNRPA1, 
CHTOP, GTPBP4, MAGOH, EFTUD2, HNRNPA2B1, DDX39B, HNRNPU, MRTO4, RSL1D1, RCL1, NOP2, RNA processing RPS16, RPS17, RPS14, RPS15, SNRPB, RPS13, RPS10, RPS11, RNF20, SNRPE, FIP1L1, SNRPB2, PABPC4, (175) RPS25, HNRNPM, DDX47, HNRNPK, RPS29, RPL7, ZNF326, RPL6, RPL9, RPL8, RPL3, NPM3, NAT10, RPS20, RPL4, PABPC1, RPL10A, RPL7A, RPS23, RPS24, PRPF40A, RPSA, TSR1, DDX1, RPL23A, DDX5, RPS6, RBMX, RPF2, HNRNPA0, RPS8, CTR9, RPS7, HNRNPH3, RPL37A, UTP20, HNRNPH1, NCOR2, GAR1, UTP15, SKIV2L2, UTP11, RPS2, YBX1, RPS3, DCAF13, DKC1, DGCR8, PLRG1, RPS3A, CDK12, FTSJ3, WDR33, CDK13, ZCCHC8, NOL6, EXOSC7, EXOSC3, CDK9, SPEN, RPS4X, TTF2, GTF2H2, PRPF6, EIF4A3, C1QBP, SNRNP200, CPSF7, CPSF6, CPSF4, CPSF3, CPSF2, NHP2, WDR43, KIAA1429, PRPF38A, UTP4, TRA2B, UTP6, TRA2A, NOB1, RPL35, RPS15A, RPL36, SF3B6, SF3B4, SF3B3, DIMT1, PRPF19, DROSHA, EXOSC10, SF3B1, RPL30, CHD7, PRPF8, NUDT21, RBM25, HSPA8, RTCB, BCAS2, PDCD11, RRP15, ALYREF, RPL27, ELAVL1, AFF2, HNRNPDL, PPP1R9B, CSNK1D, RPL23, CSNK1E, RPL22, SFPQ, RPL21, POLDIP3, THRAP3, RPL7L1, PSPC1, NOP58, NOP56, CWC22, RBM17 MRPL40, RPL17, RPL19, RPL13, RPL15, SYNCRIP, INTS2, MED23, MED22, REST, WTAP, MED20, NONO, WDR75, GO:0010467 2.5 BRPF1, SIN3A, PICALM, H2AFV, MAK16, DHX38, RPLP0, CLK4, ILK, MED27, U2AF1, H2AFY, LUC7L2, OGT, RPL12, (3.48E-103) DHX30, TWIST2, LUC7L3, GTPBP4, CHTOP, MCRS1, RREB1, PCID2, RAN, RXRA, RCOR1, YY1, TADA2B, MECOM, Gene expression MYH9, JUP, MAPK1, NME2, NOP2, BAZ1A, RPS16, BAZ1B, PTRF, RPS17, RPS14, EIF2S1, JUN, SERBP1, RPS15, (366) ZNF384, RPS13, RPS10, VGLL3, RPS11, MYBBP1A, MRPL46, FIP1L1, PABPC4, CHCHD3, AHCTF1, NIFK, RPS25, TCF20, RPS29, PBRM1, NAT10, RPS20, PABPC1, NUP54, VEZF1, RPS23, RPS24, PRPF40A, HNRNPAB, IKZF5, KLF5, RPSA, YEATS4, IKZF2, KLF13, PHB, SMAD5, DDX1, RPS6, DDX5, RPF2, RPS8, HNRNPA0, RPS7, DDX6, SAP130, WDR61, DLX6, SMURF2, UTP20, RERE, NCOR2, KLF3, GAR1, UTP15, ZEB2, ZEB1, DMAP1, UTP11, YBX1,


ZNF148, SLC25A3, ACTR8, TOP2A, WDR33, CTBP1, RBBP4, SLC25A4, EXOSC7, SLC25A6, EXOSC3, ZNF143, SPEN, TTF2, GTF2H2, TAF11, MED6, EIF4A3, MED4, TAF12, NUP205, MED9, ASH1L, FOXC2, CDK11B, RUVBL1, WDR43, NHP2, KIAA1429, PRPF38A, MED1, ZNF276, YWHAZ, RPL35, NOB1, RPL36, SF3B6, ZBTB17, SF3B4, SF3B3, PRPF19, SF3B1, RPL30, EZR, LANCL2, ACTL6A, ETV6, BRMS1L, RBM25, BCAS2, PPHLN1, RRP15, LMNA, RPL27, ELAVL1, AFF2, HNRNPDL, SNAI1, MRPL23, CSNK1D, RPL23, ILF2, CSNK1E, RPL22, RPL21, POLDIP3, SFPQ, RPL7L1, NOP58, TCEB3, PBX1, ZNF462, NOP56, CWC22, RBM17, EIF6, PRPF4B, RBM15B, SNRPD3, SNRPD1, INO80, CBX3, RBM7, TBP, PNN, DDX27, TOP1, KDM1A, IMP3, CSNK2A1, SMARCD2, SLC25A22, SMARCD1, RPS27A, NUP133, SNRPA1, MAGOH, EFTUD2, DDX39B, HNRNPA2B1, CHP1, EEF2, ARID1B, PPP1CB, HNRNPU, MRTO4, RSL1D1, SLTM, RCL1, SMARCE1, SMARCA5, SNRPB, TFAP2D, RNF20, SMARCA2, SNRPE, MEAF6, SNRPB2, TRRAP, ZMYND8, HNRNPM, DDX47, HNRNPK, RPL7, ZNF326, SMARCB1, RPL6, RPL9, NPM1, RPL8, RPL3, NPM3, TAF9, UBAP2L, HIST1H4C, RPL10A, RPL7A, RPL4, THAP11, TAF2, TAF1, EEF1A1, TAF4, TAF3, EHMT1, TSR1, MYO1C, MRPS23, TAF5, WDR5, TAF8, TAF7, RPL23A, RBMX, SAFB2, CTR9, HNRNPH3, HDAC2, SMARCC1, RPL37A, HNRNPH1, PHF6, UQCRC2, HP1BP3, YLPM1, PRDX4, SKIV2L2, RPS2, RPS3, LARP1, DCAF13, MAX, OSR2, PLRG1, DGCR8, DKC1, GTF2A1, RPS3A, GTF2A2, CDK12, CDK5RAP3, INO80C, FTSJ3, CDK13, INO80B, ZCCHC8, NOL6, CDK1, TBL1XR1, ZFX, LEF1, CDK9, RPS4X, NCL, PRPF6, C1QBP, SNRNP200, UCHL5, CPSF7, CPSF6, SGF29, CPSF4, CPSF3, CPSF2, MATR3, DPF2, UTP4, ING3, ING2, TRA2B, TRA2A, UTP6, RPS15A, PRKDC, NFYA, NDC1, EXOSC10, RACK1, DROSHA, DIMT1, HIC2, CHD7, PRPF8, SAFB, NUDT21, CEBPZ, CHD2, CHD1, HSPA5, GAPDH, HSPA8, CHD4, ENO1, RTCB, ERG, PDCD11, NFRKB, ALYREF, SAP30BP, NUP155, MRGBP, PWP1, PPP1R9B, ATXN2, MED30, RPRD2, SP3, PHB2, THRAP3, PSPC1, NFIC, NFIA RPL17, LDHA, RPL19, RPL13, RPL15, SYNCRIP, INTS2, MED23, MED22, REST, WTAP, MED20, NONO, WDR75, GO:0006139 2.4 BRPF1, SIN3A, PICALM, DDX18, H2AFV, DNAJB11, MAK16, DHX38, RPLP0, CLK4, ILK, MED27, 
U2AF1, H2AFY, (9.63E-102) LUC7L2, OGT, RPL12, DHX30, TWIST2, LUC7L3, GTPBP4, CHTOP, MCRS1, RREB1, PCID2, RAN, RXRA, RCOR1, Nucleobase-containing YY1, TADA2B, MECOM, JUP, MAPK1, NME2, RFC3, NOP2, BAZ1A, RFC4, RPS16, BAZ1B, PTRF, RPS17, RPS14, compound metabolic JUN, RPS15, ZNF384, RPS13, RPS10, VGLL3, RPS11, MYBBP1A, MYO18A, MDH2, FIP1L1, TALDO1, GNAI2, process PABPC4, CHCHD3, AHCTF1, NIFK, RPS25, TCF20, RPS29, PBRM1, NAT10, RPS20, PABPC1, NUP54, VEZF1, (378) RPS23, RPS24, PRPF40A, HNRNPAB, IKZF5, KLF5, RPSA, YEATS4, IKZF2, MKI67, KLF13, PHB, SMAD5, DDX1, RPS6, DDX5, RPF2, RPS8, HNRNPA0, RPS7, DDX6, CCT4, SAP130, WDR61, DLX6, CCT8, SMURF2, UTP20, RERE, NCOR2, KLF3, GAR1, ATP5B, UTP15, ZEB2, ZEB1, DMAP1, UTP11, YBX1, ZNF148, ATP5O, ACTR8, TOP2B, TOP2A, WDR33, CTBP1, RBBP4, EXOSC7, EXOSC3, ZNF143, SPEN, TTF2, GTF2H2, TAF11, MED6, EIF4A3, MED4, TAF12, NUP205, MED9, ASH1L, FOXC2, CDK11B, RUVBL1, WDR43, NHP2, KIAA1429, PRPF38A, MED1, ZNF276, RPL35, NOB1, RPL36, SF3B6, ZBTB17, SF3B4, SF3B3, PRPF19, SF3B1, RPL30, EZR, LANCL2, ACTL6A, ETV6, BRMS1L, RBM25, BCAS2, SHMT1, PDS5B, PPHLN1, RRP15, LMNA, RPL27, ELAVL1, AFF2, HNRNPDL, SNAI1, CSNK1D, RPL23, ILF2, CSNK1E, RPL22, RPL21, POLDIP3, SFPQ, RPL7L1, TCEB3, NOP58, PBX1, ZNF462, ATP5A1, NOP56, CWC22, RBM17, EIF6, PRPF4B, RBM15B, SNRPD3, SNRPD1, INO80, CBX3, RBM7, TBP, PNN, DDX27, TOP1, KDM1A, IMP3, CSNK2A1, SMARCD2, SMARCD1, RPS27A, NUP133, SNRPA1, MAGOH, EFTUD2, DDX39B, HNRNPA2B1, CHP1, ARID1B, HNRNPU, MRTO4, RSL1D1, SLTM, RCL1, SMARCE1, SMARCA5, SNRPB, TFAP2D, RNF20, SMARCA2, SNRPE, MEAF6, SNRPB2, TRRAP, ZMYND8, HNRNPM, DDX47, HNRNPK, RPL7, ZNF326, SMARCB1, RPL6, RPL9, NPM1, RPL8, RPL3, NPM3, TAF9, HIST1H4C, RPL10A, RPL7A, RPL4, THAP11, DDX42, TAF2, TAF1, EEF1A1, TAF4, TAF3, EHMT1, TSR1, TAF5, WDR5, TAF8, TAF7, RPL23A, RBMX, SAFB2, CTR9, HNRNPH3, HDAC2, VCP, SMARCC1, DDX50, RPL37A, HNRNPH1, PHF6, UQCRC2, HP1BP3, PRKAG2, YLPM1, EDC4, SKIV2L2, RPS2, RPS3, DCAF13, MAX, OSR2, PLRG1, GTF2A1, DGCR8, DKC1, RPS3A, GTF2A2, CDK12,


CDK5RAP3, NDUFS2, KPNB1, INO80C, FTSJ3, CDK13, INO80B, ZCCHC8, NOL6, CDK1, TBL1XR1, SSBP1, ZFX, LEF1, CDK9, RPS4X, NCL, PRPF6, C1QBP, SNRNP200, UCHL5, CPSF7, CPSF6, SGF29, CPSF4, CPSF3, CPSF2, KPNA2, DPF2, UTP4, ING3, ING2, TRA2B, TRA2A, UTP6, RPS15A, PRKDC, NFYA, SMUG1, NDC1, EXOSC10, DROSHA, RACK1, DIMT1, HIC2, CHD7, PRPF8, SAFB, NUDT21, CEBPZ, CHD2, CHD1, HSPA5, GAPDH, CHD4, HSPA8, ENO1, RTCB, ERG, PDCD11, NFRKB, ALYREF, ATP5F1, SAP30BP, NUP155, MRGBP, SMC3, PWP1, MPG, ATXN2, PPP1R9B, MED30, RPRD2, SP3, PHB2, THRAP3, PSPC1, NFIC, NFIA RPL17, LDHA, RPL19, RPL13, RPL15, SYNCRIP, INTS2, MED23, MED22, REST, WTAP, MED20, NONO, WDR75, GO:0046483 2.3 BRPF1, SIN3A, PICALM, DDX18, H2AFV, DNAJB11, MAK16, DHX38, RPLP0, CLK4, ILK, MED27, U2AF1, H2AFY, (1.57E-98) LUC7L2, OGT, RPL12, DHX30, TWIST2, LUC7L3, GTPBP4, CHTOP, MCRS1, RREB1, PCID2, RAN, RXRA, RCOR1, Heterocycle metabolic YY1, TADA2B, MECOM, JUP, MAPK1, NME2, RFC3, NOP2, BAZ1A, RFC4, RPS16, BAZ1B, PTRF, RPS17, RPS14, process JUN, RPS15, ZNF384, RPS13, RPS10, VGLL3, RPS11, MYBBP1A, MYO18A, MDH2, FIP1L1, TALDO1, GNAI2, (378) PABPC4, CHCHD3, AHCTF1, NIFK, RPS25, TCF20, RPS29, PBRM1, NAT10, RPS20, PABPC1, NUP54, VEZF1, RPS23, RPS24, PRPF40A, HNRNPAB, IKZF5, KLF5, RPSA, YEATS4, IKZF2, MKI67, KLF13, PHB, SMAD5, DDX1, RPS6, DDX5, RPF2, RPS8, HNRNPA0, RPS7, DDX6, CCT4, SAP130, WDR61, DLX6, CCT8, SMURF2, UTP20, RERE, NCOR2, KLF3, GAR1, ATP5B, UTP15, ZEB2, ZEB1, DMAP1, UTP11, YBX1, ZNF148, ATP5O, ACTR8, TOP2B, TOP2A, WDR33, CTBP1, RBBP4, EXOSC7, EXOSC3, ZNF143, SPEN, TTF2, GTF2H2, TAF11, MED6, EIF4A3, MED4, TAF12, NUP205, MED9, ASH1L, FOXC2, CDK11B, RUVBL1, WDR43, NHP2, KIAA1429, PRPF38A, MED1, ZNF276, RPL35, NOB1, RPL36, SF3B6, ZBTB17, SF3B4, SF3B3, PRPF19, SF3B1, RPL30, EZR, LANCL2, ACTL6A, ETV6, BRMS1L, RBM25, BCAS2, SHMT1, PDS5B, PPHLN1, RRP15, LMNA, RPL27, ELAVL1, AFF2, HNRNPDL, SNAI1, CSNK1D, RPL23, ILF2, CSNK1E, RPL22, RPL21, POLDIP3, SFPQ, RPL7L1, TCEB3, NOP58, PBX1, ZNF462, ATP5A1, NOP56, CWC22, RBM17, EIF6, PRPF4B, 
RBM15B, SNRPD3, SNRPD1, INO80, CBX3, RBM7, TBP, PNN, DDX27, TOP1, KDM1A, IMP3, CSNK2A1, SMARCD2, SMARCD1, RPS27A, NUP133, SNRPA1, MAGOH, EFTUD2, DDX39B, HNRNPA2B1, CHP1, ARID1B, HNRNPU, MRTO4, RSL1D1, SLTM, RCL1, SMARCE1, SMARCA5, SNRPB, TFAP2D, RNF20, SMARCA2, SNRPE, MEAF6, SNRPB2, TRRAP, ZMYND8, HNRNPM, DDX47, HNRNPK, RPL7, ZNF326, SMARCB1, RPL6, RPL9, NPM1, RPL8, RPL3, NPM3, TAF9, HIST1H4C, RPL10A, RPL7A, RPL4, THAP11, DDX42, TAF2, TAF1, EEF1A1, TAF4, TAF3, EHMT1, TSR1, TAF5, WDR5, TAF8, TAF7, RPL23A, RBMX, SAFB2, CTR9, HNRNPH3, HDAC2, VCP, SMARCC1, DDX50, RPL37A, HNRNPH1, PHF6, UQCRC2, HP1BP3, PRKAG2, YLPM1, EDC4, SKIV2L2, RPS2, RPS3, DCAF13, MAX, OSR2, PLRG1, GTF2A1, DGCR8, DKC1, RPS3A, GTF2A2, CDK12, CDK5RAP3, NDUFS2, KPNB1, INO80C, FTSJ3, CDK13, INO80B, ZCCHC8, NOL6, CDK1, TBL1XR1, SSBP1, ZFX, LEF1, CDK9, RPS4X, NCL, PRPF6, C1QBP, SNRNP200, UCHL5, CPSF7, CPSF6, SGF29, CPSF4, CPSF3, CPSF2, KPNA2, DPF2, UTP4, ING3, ING2, TRA2B, TRA2A, UTP6, RPS15A, PRKDC, NFYA, SMUG1, NDC1, EXOSC10, DROSHA, RACK1, DIMT1, HIC2, CHD7, PRPF8, SAFB, NUDT21, CEBPZ, CHD2, CHD1, HSPA5, GAPDH, CHD4, HSPA8, ENO1, RTCB, ERG, PDCD11, NFRKB, ALYREF, ATP5F1, SAP30BP, NUP155, MRGBP, SMC3, PWP1, MPG, ATXN2, PPP1R9B, MED30, RPRD2, SP3, PHB2, THRAP3, PSPC1, NFIC, NFIA RPL17, LDHA, RPL19, RPL13, RPL15, SYNCRIP, MED23, INTS2, REST, MED22, WTAP, MED20, WDR75, BRPF1, GO:0034641 2.2 MAK16, DHX38, RPLP0, CLK4, ILK, MED27, LUC7L2, RPL12, OGT, DHX30, LUC7L3, MCRS1, RREB1, PCID2, RXRA, (1.07E-97) RCOR1, TADA2B, MECOM, JUP, MAPK1, NME2, RPS16, PTRF, RPS17, RPS14, JUN, EIF2S1, RPS15, ZNF384, Cellular nitrogen RPS13, RPS10, VGLL3, RPS11, MYO18A, MYBBP1A, GNAI2, PABPC4, CHCHD3, NIFK, AHCTF1, RPS25, RPS29, compound metabolic NAT10, RPS20, PABPC1, VEZF1, RPS23, RPS24, PRPF40A, YEATS4, RPSA, MKI67, KLF13, PHB, RPS6, RPS8, process RPS7, WDR61, SMURF2, RERE, GAR1, YBX1, ZNF148, WDR33, EXOSC7, EXOSC3, ZNF143, GTF2H2, TTF2, (392) TAF11, EIF4A3, TAF12, NUP205, FOXC2, RUVBL1, WDR43, KIAA1429, ZNF276, RPL35, NOB1, 
RPL36, SF3B6,


SF3B4, SF3B3, SF3B1, RPL30, EZR, BRMS1L, SHMT1, PPHLN1, LMNA, RPL27, ELAVL1, CSNK1D, RPL23, CSNK1E, RPL22, RPL21, SFPQ, POLDIP3, TCEB3, ZNF462, CWC22, EIF6, PRPF4B, SNRPD3, RBM15B, SNRPD1, INO80, RBM7, TBP, DDX27, KDM1A, IMP3, CSNK2A1, SMARCD2, SMARCD1, RPS27A, SNRPA1, HNRNPA2B1, EEF2, ARID1B, HNRNPU, MRTO4, RCL1, SMARCE1, SMARCA5, TFAP2D, SMARCA2, MEAF6, SNRPB2, HNRNPM, DDX47, HNRNPK, RPL7, ZNF326, SMARCB1, RPL6, RPL9, RPL8, RPL3, HIST1H4C, TAF9, RPL7A, RPL4, RPL10A, DDX42, TAF2, TAF1, TAF4, TAF3, EHMT1, TAF5, TAF8, TAF7, RPL23A, RBMX, SAFB2, HNRNPH3, HDAC2, VCP, SMARCC1, DDX50, RPL37A, HNRNPH1, HP1BP3, YLPM1, EDC4, SKIV2L2, RPS2, RPS3, MAX, DCAF13, OSR2, PLRG1, DKC1, GTF2A1, RPS3A, GTF2A2, CDK5RAP3, NDUFS2, INO80C, ZCCHC8, INO80B, CDK1, TBL1XR1, SSBP1, ZFX, CDK9, RPS4X, NCL, PRPF6, SNRNP200, UCHL5, CPSF7, CPSF6, CPSF4, CPSF3, CPSF2, DPF2, RPS15A, NFYA, SMUG1, NDC1, DIMT1, HIC2, CHD7, PRPF8, CEBPZ, CHD2, CHD1, HSPA5, HSPA8, CHD4, RTCB, PDCD11, NFRKB, ATP5F1, MRGBP, SMC3, PWP1, PPP1R9B, MED30, MRPL40, NONO, DDX18, PICALM, SIN3A, H2AFV, DNAJB11, U2AF1, H2AFY, TWIST2, GTPBP4, CHTOP, RAN, YY1, RFC3, NOP2, BAZ1A, RFC4, BAZ1B, MRPL46, MDH2, FIP1L1, TALDO1, TCF20, PBRM1, NUP54, HNRNPAB, IKZF5, KLF5, IKZF2, SMAD5, DDX1, DDX5, RPF2, HNRNPA0, DDX6, CCT4, SAP130, DLX6, CCT8, UTP20, NCOR2, KLF3, ATP5B, UTP15, ZEB2, DMAP1, ZEB1, UTP11, SLC25A3, ATP5O, ACTR8, TOP2B, TOP2A, CTBP1, RBBP4, SLC25A4, SLC25A6, SPEN, MED6, MED4, ASH1L, MED9, CDK11B, NHP2, PRPF38A, MED1, ZBTB17, PRPF19, LANCL2, ACTL6A, ETV6, RBM25, BCAS2, PDS5B, RRP15, AFF2, HNRNPDL, SNAI1, MRPL23, ILF2, RPL7L1, NOP58, PBX1, ATP5A1, NOP56, RBM17, CBX3, PNN, TOP1, SLC25A22, NUP133, MAGOH, EFTUD2, DDX39B, CHP1, RSL1D1, SLTM, SNRPB, SNRPE, RNF20, TRRAP, ZMYND8, NPM1, NPM3, THAP11, EEF1A1, TSR1, MRPS23, WDR5, CTR9, PHF6, HSP90AB1, UQCRC2, PRKAG2, LARP1, DGCR8, CDK12, KPNB1, FTSJ3, CDK13, NOL6, HSP90AA1, LEF1, C1QBP, SGF29, KPNA2, UTP4, ING3, ING2, TRA2B, UTP6, TRA2A, PRKDC, EXOSC10, RACK1, DROSHA, SAFB, NUDT21, GAPDH, 
ENO1, ERG, ALYREF, SAP30BP, NUP155, MPG, ATXN2, RPRD2, PHB2, SP3, THRAP3, PSPC1, NFIC, MGST1, NFIA RPL17, LDHA, RPL19, RPL13, RPL15, SYNCRIP, INTS2, MED23, MED22, REST, WTAP, MED20, NONO, WDR75, GO:0006725 2.3 BRPF1, SIN3A, PICALM, DDX18, H2AFV, DNAJB11, MAK16, DHX38, RPLP0, CLK4, ILK, MED27, U2AF1, H2AFY, (2.28E-97) LUC7L2, OGT, RPL12, DHX30, TWIST2, LUC7L3, GTPBP4, CHTOP, MCRS1, RREB1, PCID2, RAN, RXRA, RCOR1, Cellular aromatic YY1, TADA2B, MECOM, JUP, MAPK1, NME2, RFC3, NOP2, BAZ1A, RFC4, RPS16, BAZ1B, PTRF, RPS17, RPS14, compound metabolic JUN, RPS15, ZNF384, RPS13, RPS10, VGLL3, RPS11, MYBBP1A, MYO18A, MDH2, FIP1L1, TALDO1, GNAI2, process PABPC4, CHCHD3, AHCTF1, NIFK, RPS25, TCF20, RPS29, PBRM1, NAT10, RPS20, PABPC1, NUP54, VEZF1, (378) RPS23, RPS24, PRPF40A, HNRNPAB, IKZF5, KLF5, RPSA, YEATS4, IKZF2, MKI67, KLF13, PHB, SMAD5, DDX1, RPS6, DDX5, RPF2, RPS8, HNRNPA0, RPS7, DDX6, CCT4, SAP130, WDR61, DLX6, CCT8, SMURF2, UTP20, RERE, NCOR2, KLF3, GAR1, ATP5B, UTP15, ZEB2, ZEB1, DMAP1, UTP11, YBX1, ZNF148, ATP5O, ACTR8, TOP2B, TOP2A, WDR33, CTBP1, RBBP4, EXOSC7, EXOSC3, ZNF143, SPEN, TTF2, GTF2H2, TAF11, MED6, EIF4A3, MED4, TAF12, NUP205, MED9, ASH1L, FOXC2, CDK11B, RUVBL1, WDR43, NHP2, KIAA1429, PRPF38A, MED1, ZNF276, RPL35, NOB1, RPL36, SF3B6, ZBTB17, SF3B4, SF3B3, PRPF19, SF3B1, RPL30, EZR, LANCL2, ACTL6A, ETV6, BRMS1L, RBM25, BCAS2, SHMT1, PDS5B, PPHLN1, RRP15, LMNA, RPL27, ELAVL1, AFF2, HNRNPDL, SNAI1, CSNK1D, RPL23, ILF2, CSNK1E, RPL22, RPL21, POLDIP3, SFPQ, RPL7L1, TCEB3, NOP58, PBX1, ZNF462, ATP5A1, NOP56, CWC22, RBM17, EIF6, PRPF4B, RBM15B, SNRPD3, SNRPD1, INO80, CBX3, RBM7, TBP, PNN, DDX27, TOP1, KDM1A, IMP3, CSNK2A1, SMARCD2, SMARCD1, RPS27A, NUP133, SNRPA1, MAGOH, EFTUD2, DDX39B, HNRNPA2B1, CHP1, ARID1B, HNRNPU, MRTO4, RSL1D1, SLTM, RCL1, SMARCE1, SMARCA5, SNRPB, TFAP2D, RNF20, SMARCA2, SNRPE, MEAF6, SNRPB2, TRRAP, ZMYND8, HNRNPM, DDX47, HNRNPK, RPL7, ZNF326, SMARCB1, RPL6, RPL9, NPM1, RPL8, RPL3, NPM3, TAF9, HIST1H4C, RPL10A, RPL7A, RPL4, THAP11, DDX42,


TAF2, TAF1, EEF1A1, TAF4, TAF3, EHMT1, TSR1, TAF5, WDR5, TAF8, TAF7, RPL23A, RBMX, SAFB2, CTR9, HNRNPH3, HDAC2, VCP, SMARCC1, DDX50, RPL37A, HNRNPH1, PHF6, UQCRC2, HP1BP3, PRKAG2, YLPM1, EDC4, SKIV2L2, RPS2, RPS3, DCAF13, MAX, OSR2, PLRG1, GTF2A1, DGCR8, DKC1, RPS3A, GTF2A2, CDK12, CDK5RAP3, NDUFS2, KPNB1, INO80C, FTSJ3, CDK13, INO80B, ZCCHC8, NOL6, CDK1, TBL1XR1, SSBP1, ZFX, LEF1, CDK9, RPS4X, NCL, PRPF6, C1QBP, SNRNP200, UCHL5, CPSF7, CPSF6, SGF29, CPSF4, CPSF3, CPSF2, KPNA2, DPF2, UTP4, ING3, ING2, TRA2B, TRA2A, UTP6, RPS15A, PRKDC, NFYA, SMUG1, NDC1, EXOSC10, DROSHA, RACK1, DIMT1, HIC2, CHD7, PRPF8, SAFB, NUDT21, CEBPZ, CHD2, CHD1, HSPA5, GAPDH, CHD4, HSPA8, ENO1, RTCB, ERG, PDCD11, NFRKB, ALYREF, ATP5F1, SAP30BP, NUP155, MRGBP, SMC3, PWP1, MPG, ATXN2, PPP1R9B, MED30, RPRD2, SP3, PHB2, THRAP3, PSPC1, NFIC, NFIA RPL17, LDHA, RPL19, RPL13, RPL15, SYNCRIP, INTS2, MED23, MED22, REST, WTAP, MED20, NONO, WDR75, GO:1901360 2.3 BRPF1, SIN3A, PICALM, DDX18, H2AFV, DNAJB11, MAK16, DHX38, RPLP0, CLK4, ILK, MED27, U2AF1, H2AFY, (9.10E-94) LUC7L2, OGT, RPL12, DHX30, TWIST2, LUC7L3, GTPBP4, CHTOP, MCRS1, RREB1, PCID2, RAN, RXRA, RCOR1, Organic cyclic YY1, TADA2B, MECOM, JUP, MAPK1, NME2, RFC3, NOP2, BAZ1A, RFC4, RPS16, BAZ1B, PTRF, RPS17, RPS14, compound metabolic JUN, RPS15, ZNF384, RPS13, RPS10, VGLL3, RPS11, MYBBP1A, MYO18A, MDH2, FIP1L1, TALDO1, GNAI2, process PABPC4, CHCHD3, AHCTF1, NIFK, RPS25, TCF20, RPS29, PBRM1, NAT10, RPS20, PABPC1, NUP54, VEZF1, (379) RPS23, RPS24, PRPF40A, HNRNPAB, IKZF5, KLF5, RPSA, YEATS4, IKZF2, MKI67, KLF13, PHB, SMAD5, DDX1, RPS6, DDX5, RPF2, RPS8, HNRNPA0, RPS7, DDX6, CCT4, SAP130, WDR61, DLX6, CCT8, SMURF2, UTP20, RERE, NCOR2, KLF3, GAR1, ATP5B, UTP15, ZEB2, ZEB1, DMAP1, UTP11, YBX1, ZNF148, ATP5O, ACTR8, TOP2B, TOP2A, WDR33, CTBP1, RBBP4, EXOSC7, EXOSC3, ZNF143, SPEN, TTF2, GTF2H2, TAF11, MED6, EIF4A3, MED4, TAF12, NUP205, MED9, ASH1L, FOXC2, CDK11B, RUVBL1, WDR43, NHP2, KIAA1429, PRPF38A, MED1, ZNF276, RPL35, NOB1, RPL36, SF3B6, ZBTB17, 
SF3B4, SF3B3, PRPF19, SF3B1, RPL30, EZR, LANCL2, ACTL6A, ETV6, BRMS1L, RBM25, BCAS2, SHMT1, PDS5B, PPHLN1, RRP15, LMNA, RPL27, ELAVL1, AFF2, HNRNPDL, SNAI1, CSNK1D, RPL23, ILF2, CSNK1E, RPL22, RPL21, POLDIP3, SFPQ, RPL7L1, TCEB3, NOP58, PBX1, ZNF462, ATP5A1, NOP56, CWC22, RBM17, EIF6, PRPF4B, RBM15B, SNRPD3, SNRPD1, INO80, CBX3, RBM7, TBP, PNN, DDX27, TOP1, KDM1A, IMP3, CSNK2A1, SMARCD2, SMARCD1, RPS27A, NUP133, SNRPA1, MAGOH, EFTUD2, DDX39B, HNRNPA2B1, CHP1, ARID1B, HNRNPU, MRTO4, RSL1D1, SLTM, RCL1, SMARCE1, SMARCA5, SNRPB, TFAP2D, RNF20, SMARCA2, SNRPE, MEAF6, SNRPB2, TRRAP, ZMYND8, HNRNPM, DDX47, HNRNPK, RPL7, ZNF326, SMARCB1, RPL6, RPL9, NPM1, RPL8, RPL3, NPM3, TAF9, HIST1H4C, RPL10A, RPL7A, RPL4, THAP11, DDX42, TAF2, TAF1, EEF1A1, TAF4, TAF3, EHMT1, TSR1, TAF5, WDR5, TAF8, TAF7, RPL23A, RBMX, SAFB2, CTR9, HNRNPH3, HDAC2, VCP, SMARCC1, DDX50, RPL37A, HNRNPH1, PHF6, UQCRC2, HP1BP3, PRKAG2, YLPM1, EDC4, SKIV2L2, RPS2, RPS3, DCAF13, MAX, OSR2, PLRG1, GTF2A1, DGCR8, DKC1, RPS3A, GTF2A2, CDK12, CDK5RAP3, LBR, NDUFS2, KPNB1, INO80C, FTSJ3, CDK13, INO80B, ZCCHC8, NOL6, CDK1, TBL1XR1, SSBP1, ZFX, LEF1, CDK9, RPS4X, NCL, PRPF6, C1QBP, SNRNP200, UCHL5, CPSF7, CPSF6, SGF29, CPSF4, CPSF3, CPSF2, KPNA2, DPF2, UTP4, ING3, ING2, TRA2B, TRA2A, UTP6, RPS15A, PRKDC, NFYA, SMUG1, NDC1, EXOSC10, DROSHA, RACK1, DIMT1, HIC2, CHD7, PRPF8, SAFB, NUDT21, CEBPZ, CHD2, CHD1, HSPA5, GAPDH, CHD4, HSPA8, ENO1, RTCB, ERG, PDCD11, NFRKB, ALYREF, ATP5F1, SAP30BP, NUP155, MRGBP, SMC3, PWP1, MPG, ATXN2, PPP1R9B, MED30, RPRD2, SP3, PHB2, THRAP3, PSPC1, NFIC, NFIA RPL17, LDHA, RPL19, RPL13, RPL15, SYNCRIP, MED23, INTS2, REST, MED22, WTAP, MED20, WDR75, BRPF1, GO:0006807 MAK16, DHX38, RPLP0, CLK4, ILK, MED27, LUC7L2, RPL12, OGT, DHX30, LUC7L3, MCRS1, RREB1, PCID2, RXRA, 2.1 RCOR1, TADA2B, MECOM, JUP, MAPK1, NME2, RPS16, PTRF, RPS17, RPS14, JUN, EIF2S1, RPS15, ZNF384, Nitrogen compound (2.42E-89) RPS13, RPS10, VGLL3, RPS11, MYO18A, MYBBP1A, GNAI2, PABPC4, CHCHD3, NIFK, AHCTF1, RPS25, RPS29,

169 metabolic process NAT10, RPS20, PABPC1, VEZF1, RPS23, RPS24, PRPF40A, YEATS4, RPSA, MKI67, KLF13, PHB, RPS6, RPS8, (394) RPS7, WDR61, SMURF2, RERE, GAR1, YBX1, ZNF148, WDR33, EXOSC7, EXOSC3, ZNF143, GTF2H2, TTF2, TAF11, EIF4A3, TAF12, NUP205, FOXC2, RUVBL1, WDR43, KIAA1429, ZNF276, RPL35, NOB1, RPL36, SF3B6, SF3B4, SF3B3, SF3B1, RPL30, EZR, BRMS1L, SHMT1, PPHLN1, LMNA, RPL27, ELAVL1, CSNK1D, RPL23, CSNK1E, RPL22, RPL21, SFPQ, POLDIP3, TCEB3, ZNF462, CWC22, EIF6, PRPF4B, SNRPD3, RBM15B, SNRPD1, INO80, RBM7, TBP, DDX27, KDM1A, IMP3, CSNK2A1, SMARCD2, SMARCD1, RPS27A, SNRPA1, HNRNPA2B1, EEF2, ARID1B, HNRNPU, MRTO4, RCL1, SMARCE1, SMARCA5, TFAP2D, SMARCA2, MEAF6, HACD3, SNRPB2, HNRNPM, DDX47, HNRNPK, RPL7, ZNF326, SMARCB1, RPL6, RPL9, RPL8, RPL3, HIST1H4C, TAF9, RPL7A, RPL4, RPL10A, DDX42, TAF2, TAF1, TAF4, TAF3, EHMT1, TAF5, TAF8, TAF7, RPL23A, RBMX, SAFB2, HNRNPH3, HDAC2, VCP, SMARCC1, DDX50, RPL37A, HNRNPH1, HP1BP3, YLPM1, PRDX4, EDC4, SKIV2L2, RPS2, RPS3, MAX, DCAF13, OSR2, PLRG1, DKC1, GTF2A1, RPS3A, GTF2A2, CDK5RAP3, NDUFS2, INO80C, ZCCHC8, INO80B, CDK1, TBL1XR1, SSBP1, ZFX, CDK9, RPS4X, NCL, PRPF6, SNRNP200, UCHL5, CPSF7, CPSF6, CPSF4, CPSF3, CPSF2, DPF2, RPS15A, NFYA, SMUG1, NDC1, DIMT1, HIC2, CHD7, PRPF8, CEBPZ, CHD2, CHD1, HSPA5, HSPA8, CHD4, RTCB, PDCD11, NFRKB, ATP5F1, MRGBP, SMC3, PWP1, PPP1R9B, MED30, MRPL40, NONO, DDX18, PICALM, SIN3A, H2AFV, DNAJB11, U2AF1, H2AFY, TWIST2, GTPBP4, CHTOP, RAN, YY1, RFC3, NOP2, BAZ1A, RFC4, BAZ1B, MRPL46, MDH2, FIP1L1, TALDO1, TCF20, PBRM1, NUP54, HNRNPAB, IKZF5, KLF5, IKZF2, SMAD5, DDX1, DDX5, RPF2, HNRNPA0, DDX6, CCT4, SAP130, DLX6, CCT8, UTP20, NCOR2, KLF3, ATP5B, UTP15, ZEB2, DMAP1, ZEB1, UTP11, SLC25A3, ATP5O, ACTR8, TOP2B, TOP2A, CTBP1, RBBP4, SLC25A4, SLC25A6, SPEN, MED6, MED4, ASH1L, MED9, CDK11B, NHP2, PRPF38A, MED1, ZBTB17, PRPF19, LANCL2, ACTL6A, ETV6, RBM25, BCAS2, PDS5B, RRP15, AFF2, HNRNPDL, SNAI1, MRPL23, ILF2, RPL7L1, NOP58, PBX1, ATP5A1, NOP56, RBM17, CBX3, PNN, TOP1, SLC25A22, NUP133, MAGOH, EFTUD2, 
DDX39B, CHP1, RSL1D1, SLTM, SNRPB, SNRPE, RNF20, TRRAP, ZMYND8, NPM1, NPM3, THAP11, EEF1A1, TSR1, MRPS23, WDR5, CTR9, PHF6, UQCRC2, HSP90AB1, PRKAG2, LARP1, DGCR8, CDK12, KPNB1, FTSJ3, CDK13, NOL6, HSP90AA1, LEF1, C1QBP, SGF29, KPNA2, UTP4, ING3, ING2, TRA2B, UTP6, TRA2A, PRKDC, EXOSC10, RACK1, DROSHA, SAFB, NUDT21, GAPDH, ENO1, ERG, ALYREF, SAP30BP, NUP155, MPG, ATXN2, RPRD2, PHB2, SP3, THRAP3, PSPC1, NFIC, MGST1, NFIA The gene ontology (GO) term, biological function, and the number of proteins found under the terms in parentheses are listed in the first column. The second column shows the fold enrichment which measures the magnitude of enrichment of the GO term when comparing the number of genes present under this term from the sample list to the number of this GO term genes present in the human genome. The p-value is also stated which examines the significance of the GO term enrichment using a modified Fisher’s exact test.
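The two statistics described in this legend can be illustrated with a short calculation. The following is a minimal sketch, assuming DAVID's usual conventions: fold enrichment is taken as the proportion of list genes annotated to a term divided by the proportion of background genes annotated to that term, and the modified Fisher's exact test (the EASE score) is taken as a one-sided Fisher's exact p-value computed after removing one annotated gene from the list. The function names and the example counts are hypothetical and are not taken from the tables.

```python
from math import comb


def fold_enrichment(list_hits, list_total, pop_hits, pop_total):
    """Frequency of the term in the sample list divided by its frequency in the background."""
    return (list_hits / list_total) / (pop_hits / pop_total)


def hypergeom_upper_tail(k, draws, successes, population):
    """P(X >= k) for X ~ Hypergeometric(population, successes, draws)."""
    k = max(k, 0)
    upper = min(draws, successes)
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return favourable / comb(population, draws)


def ease_score(list_hits, list_total, pop_hits, pop_total):
    """Modified Fisher's exact p-value: one list hit is removed before testing (assumed EASE convention)."""
    return hypergeom_upper_tail(list_hits - 1, list_total, pop_hits, pop_total)


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not values from the tables).
    hits, list_size, term_size, background = 3, 300, 40, 30000
    print("fold enrichment:", fold_enrichment(hits, list_size, term_size, background))
    print("Fisher exact p :", hypergeom_upper_tail(hits, list_size, term_size, background))
    print("EASE score     :", ease_score(hits, list_size, term_size, background))
```

Because one hit is removed before testing, the EASE score is never smaller than the unmodified one-sided Fisher's exact p-value, which makes it more conservative for terms supported by only a few proteins.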


Table 3.3. Top 10 DAVID biological processes for proteins isolated from the second purified RSV Gag affinity purifications from DF1 nuclear lysates.
Gene Ontology Term (protein count) | Fold Enrichment (p-value) | Gene names for proteins isolated

GO:1901360 Organic cyclic compound metabolic process (64) | 1.6 (8.96E-07) | XPO1, ARID4A, PRKAG2, ASCC1, MED24, INTS3, MED22, WTAP, CNOT7, MTHFD1L, NDUFS4, NDUFS8, NT5C2, SMARCD1, U2AF1, OGT, BCL7A, RPRD1B, NDUFS2, ATP5H, TWIST2, LUC7L3, CCNK, SSBP1, STRN3, PRKCI, EXOSC1, MBD3, ASCC2, SNRNP200, UCHL5, CNOT11, SNRNP40, ATXN1L, C1D, NDUFB3, ING2, ZBTB10, NDUFB9, NOB1, ZMYND8, RPS25, LANCL2, SAFB, PRPF8, CAMK2D, IDH1, MYCBP, RPL10A, PRPF40A, HIP1, NDUFA5, MSMO1, TSR1, NDUFA9, FDPS, ATP1A1, SF3A3, PPP1R9B, SUGP1, EIF4E, MED30, RPL23, PTPN1

GO:0006120 Mitochondrial electron transport, NADH to ubiquinone (7) | 20.9 (9.36E-07) | NDUFB3, NDUFA5, NDUFS4, NDUFA9, NDUFB9, NDUFS8, NDUFS2

GO:0006807 Nitrogen compound metabolic process (69) | 1.6 (1.06E-06) | XPO1, ARID4A, PRKAG2, ASCC1, MED24, INTS3, MED22, WTAP, CNOT7, MTHFD1L, NDUFS4, P4HA1, NDUFS8, NT5C2, SMARCD1, U2AF1, SLC25A3, OGT, BCL7A, RPRD1B, NDUFS2, ATP5H, TWIST2, LUC7L3, CCNK, SSBP1, STRN3, CRTAP, PRKCI, EXOSC1, MBD3, ASCC2, SNRNP200, UCHL5, CNOT11, SNRNP40, ATXN1L, C1D, BCAT1, NDUFB3, ING2, ZBTB10, NDUFB9, NOB1, IGF2BP3, ZMYND8, RPS25, LANCL2, SAFB, PRPF8, CAMK2D, EIF3K, TMED10, IDH1, MYCBP, RPL10A, PRPF40A, HIP1, NDUFA5, TSR1, NDUFA9, SF3A3, PPP1R9B, SUGP1, EIF4E, MED30, RPL23, PTPN1, MGST1

GO:0034641 Cellular nitrogen compound metabolic process (66) | 1.6 (1.24E-06) | XPO1, ARID4A, PRKAG2, ASCC1, MED24, INTS3, MED22, WTAP, CNOT7, MTHFD1L, NDUFS4, NDUFS8, NT5C2, SMARCD1, U2AF1, SLC25A3, OGT, BCL7A, RPRD1B, NDUFS2, ATP5H, TWIST2, LUC7L3, CCNK, SSBP1, STRN3, PRKCI, EXOSC1, MBD3, ASCC2, SNRNP200, UCHL5, CNOT11, SNRNP40, ATXN1L, C1D, NDUFB3, ING2, ZBTB10, NDUFB9, NOB1, IGF2BP3, ZMYND8, RPS25, LANCL2, SAFB, PRPF8, CAMK2D, EIF3K, TMED10, IDH1, MYCBP, RPL10A, PRPF40A, HIP1, NDUFA5, TSR1, NDUFA9, SF3A3, PPP1R9B, SUGP1, EIF4E, MED30, RPL23, PTPN1, MGST1

GO:0032981 Mitochondrial respiratory chain complex I assembly (7) | 17.6 (2.63E-06) | NDUFB3, NDUFA5, NDUFS4, NDUFA9, NDUFB9, NDUFS8, NDUFS2

GO:0046483 Heterocycle metabolic process (61) | 1.6 (3.61E-06) | XPO1, ARID4A, PRKAG2, ASCC1, MED24, INTS3, MED22, WTAP, CNOT7, MTHFD1L, NDUFS4, NDUFS8, NT5C2, SMARCD1, U2AF1, OGT, BCL7A, RPRD1B, NDUFS2, ATP5H, TWIST2, LUC7L3, CCNK, SSBP1, STRN3, PRKCI, EXOSC1, MBD3, ASCC2, SNRNP200, UCHL5, CNOT11, SNRNP40, ATXN1L, C1D, NDUFB3, ING2, ZBTB10, NDUFB9, NOB1, ZMYND8, RPS25, LANCL2, SAFB, PRPF8, CAMK2D, IDH1, MYCBP, RPL10A, PRPF40A, HIP1, NDUFA5, TSR1, NDUFA9, SF3A3, PPP1R9B, SUGP1, EIF4E, MED30, RPL23, PTPN1

GO:0016071 mRNA metabolic process (17) | 4.0 (4.11E-06) | EXOSC1, CNOT7, WTAP, SF3A3, RPS25, EIF4E, SUGP1, RPL23, PRPF8, SAFB, SNRNP200, U2AF1, CNOT11, SNRNP40, RPL10A, LUC7L3, PRPF40A

GO:0006139 Nucleobase-containing compound metabolic process (60) | 1.6 (4.17E-06) | XPO1, ARID4A, PRKAG2, ASCC1, MED24, INTS3, MED22, WTAP, CNOT7, NDUFS4, NDUFS8, SMARCD1, NT5C2, U2AF1, OGT, BCL7A, RPRD1B, NDUFS2, ATP5H, TWIST2, LUC7L3, CCNK, SSBP1, STRN3, PRKCI, EXOSC1, MBD3, ASCC2, SNRNP200, UCHL5, CNOT11, SNRNP40, ATXN1L, C1D, NDUFB3, ING2, ZBTB10, NDUFB9, NOB1, ZMYND8, RPS25, LANCL2, SAFB, PRPF8, CAMK2D, IDH1, MYCBP, RPL10A, PRPF40A, HIP1, NDUFA5, TSR1, NDUFA9, SF3A3, PPP1R9B, SUGP1, EIF4E, MED30, RPL23, PTPN1

GO:0006725 Cellular aromatic compound metabolic process (61) | 1.6 (4.79E-06) | XPO1, ARID4A, PRKAG2, ASCC1, MED24, INTS3, MED22, WTAP, CNOT7, MTHFD1L, NDUFS4, NDUFS8, NT5C2, SMARCD1, U2AF1, OGT, BCL7A, RPRD1B, NDUFS2, ATP5H, TWIST2, LUC7L3, CCNK, SSBP1, STRN3, PRKCI, EXOSC1, MBD3, ASCC2, SNRNP200, UCHL5, CNOT11, SNRNP40, ATXN1L, C1D, NDUFB3, ING2, ZBTB10, NDUFB9, NOB1, ZMYND8, RPS25, LANCL2, SAFB, PRPF8, CAMK2D, IDH1, MYCBP, RPL10A, PRPF40A, HIP1, NDUFA5, TSR1, NDUFA9, SF3A3, PPP1R9B, SUGP1, EIF4E, MED30, RPL23, PTPN1

GO:0022613 Ribonucleoprotein complex biogenesis (14) | 4.6 (9.20E-06) | XPO1, TSR1, NOB1, EXOSC1, CNOT7, SF3A3, RPS25, RPL23, PRPF8, SNRNP200, EIF3K, RPL10A, LUC7L3, C1D

The first column lists the gene ontology (GO) term, its biological function, and, in parentheses, the number of proteins found under the term. The second column shows the fold enrichment, which measures the magnitude of enrichment of the GO term by comparing the number of genes present under this term in the sample list to the number of genes under this GO term present in the human genome. The p-value, stated in parentheses, examines the significance of the GO term enrichment using a modified Fisher’s exact test.


Table 3.4. Top 10 nuclear enriched DAVID biological processes for proteins isolated from the second purified RSV Gag affinity purifications from DF1 nuclear lysates.
Gene Ontology Term (protein count) | Fold Enrichment (p-value) | Gene names for proteins isolated

GO:0034641 Cellular nitrogen compound metabolic process (54) | 2.1 (5.57E-12) | XPO1, ARID4A, PRKAG2, ASCC1, MED24, INTS3, MED22, CNOT7, WTAP, SMARCD1, U2AF1, SLC25A3, OGT, RPRD1B, ATP5H, NDUFS2, TWIST2, LUC7L3, CCNK, SSBP1, STRN3, PRKCI, EXOSC1, MBD3, ASCC2, SNRNP200, UCHL5, CNOT11, SNRNP40, ATXN1L, C1D, ING2, ZBTB10, NOB1, IGF2BP3, ZMYND8, RPS25, LANCL2, SAFB, PRPF8, EIF3K, CAMK2D, MYCBP, RPL10A, PRPF40A, HIP1, TSR1, NDUFA9, SF3A3, PPP1R9B, SUGP1, MED30, RPL23, MGST1

GO:0010467 Gene expression (48) | 2.3 (4.31E-11) | XPO1, ING2, ARID4A, ZBTB10, ASCC1, NOB1, MED24, INTS3, MED22, IGF2BP3, WTAP, CNOT7, ZMYND8, RPS25, LANCL2, SAFB, PRPF8, SMARCD1, U2AF1, EIF3K, SLC25A3, CAMK2D, MYCBP, RPL10A, OGT, RPRD1B, TWIST2, LUC7L3, PRPF40A, HIP1, CCNK, TSR1, STRN3, PRKCI, EXOSC1, MBD3, SF3A3, PPP1R9B, MED30, SUGP1, ASCC2, RPL23, SNRNP200, UCHL5, CNOT11, SNRNP40, ATXN1L, C1D

GO:0016070 RNA metabolic process (45) | 2.4 (4.63E-11) | XPO1, ING2, ARID4A, ZBTB10, ASCC1, NOB1, MED24, INTS3, MED22, WTAP, CNOT7, ZMYND8, RPS25, LANCL2, SAFB, PRPF8, SMARCD1, U2AF1, CAMK2D, MYCBP, RPL10A, OGT, RPRD1B, TWIST2, LUC7L3, PRPF40A, HIP1, CCNK, TSR1, STRN3, PRKCI, EXOSC1, MBD3, SF3A3, PPP1R9B, MED30, SUGP1, ASCC2, RPL23, SNRNP200, UCHL5, CNOT11, SNRNP40, ATXN1L, C1D

GO:0006139 Nucleobase-containing compound metabolic process (50) | 2.2 (4.76E-11) | XPO1, ARID4A, PRKAG2, ASCC1, MED24, MED22, INTS3, CNOT7, WTAP, SMARCD1, U2AF1, OGT, RPRD1B, ATP5H, NDUFS2, LUC7L3, TWIST2, CCNK, SSBP1, STRN3, PRKCI, EXOSC1, MBD3, ASCC2, SNRNP200, UCHL5, CNOT11, SNRNP40, ATXN1L, C1D, ING2, ZBTB10, NOB1, ZMYND8, RPS25, SAFB, PRPF8, LANCL2, CAMK2D, MYCBP, RPL10A, PRPF40A, HIP1, TSR1, NDUFA9, SF3A3, PPP1R9B, SUGP1, MED30, RPL23

GO:1901360 Organic cyclic compound metabolic process (51) | 2.1 (9.93E-11) | XPO1, ARID4A, PRKAG2, ASCC1, MED24, MED22, INTS3, CNOT7, WTAP, SMARCD1, U2AF1, OGT, RPRD1B, ATP5H, NDUFS2, LUC7L3, TWIST2, CCNK, SSBP1, STRN3, PRKCI, EXOSC1, MBD3, ASCC2, SNRNP200, UCHL5, CNOT11, SNRNP40, ATXN1L, C1D, ING2, ZBTB10, NOB1, ZMYND8, RPS25, SAFB, PRPF8, LANCL2, CAMK2D, MYCBP, RPL10A, PRPF40A, HIP1, TSR1, NDUFA9, FDPS, SF3A3, PPP1R9B, SUGP1, MED30, RPL23


GO:0006807 Nitrogen compound metabolic process (54) | 1.9 (1.07E-10) | XPO1, ARID4A, PRKAG2, ASCC1, MED24, INTS3, MED22, CNOT7, WTAP, SMARCD1, U2AF1, SLC25A3, OGT, RPRD1B, ATP5H, NDUFS2, TWIST2, LUC7L3, CCNK, SSBP1, STRN3, PRKCI, EXOSC1, MBD3, ASCC2, SNRNP200, UCHL5, CNOT11, SNRNP40, ATXN1L, C1D, ING2, ZBTB10, NOB1, IGF2BP3, ZMYND8, RPS25, LANCL2, SAFB, PRPF8, EIF3K, CAMK2D, MYCBP, RPL10A, PRPF40A, HIP1, TSR1, NDUFA9, SF3A3, PPP1R9B, SUGP1, MED30, RPL23, MGST1

GO:0046483 Heterocycle metabolic process (50) | 2.1 (1.13E-10) | XPO1, ARID4A, PRKAG2, ASCC1, MED24, MED22, INTS3, CNOT7, WTAP, SMARCD1, U2AF1, OGT, RPRD1B, ATP5H, NDUFS2, LUC7L3, TWIST2, CCNK, SSBP1, STRN3, PRKCI, EXOSC1, MBD3, ASCC2, SNRNP200, UCHL5, CNOT11, SNRNP40, ATXN1L, C1D, ING2, ZBTB10, NOB1, ZMYND8, RPS25, SAFB, PRPF8, LANCL2, CAMK2D, MYCBP, RPL10A, PRPF40A, HIP1, TSR1, NDUFA9, SF3A3, PPP1R9B, SUGP1, MED30, RPL23

GO:0006725 Cellular aromatic compound metabolic process (50) | 2.1 (1.54E-10) | XPO1, ARID4A, PRKAG2, ASCC1, MED24, MED22, INTS3, CNOT7, WTAP, SMARCD1, U2AF1, OGT, RPRD1B, ATP5H, NDUFS2, LUC7L3, TWIST2, CCNK, SSBP1, STRN3, PRKCI, EXOSC1, MBD3, ASCC2, SNRNP200, UCHL5, CNOT11, SNRNP40, ATXN1L, C1D, ING2, ZBTB10, NOB1, ZMYND8, RPS25, SAFB, PRPF8, LANCL2, CAMK2D, MYCBP, RPL10A, PRPF40A, HIP1, TSR1, NDUFA9, SF3A3, PPP1R9B, SUGP1, MED30, RPL23

GO:0090304 Nucleic acid metabolic process (46) | 2.2 (3.93E-10) | XPO1, ING2, ARID4A, ZBTB10, ASCC1, NOB1, MED24, INTS3, MED22, WTAP, CNOT7, ZMYND8, RPS25, LANCL2, SAFB, PRPF8, SMARCD1, U2AF1, CAMK2D, MYCBP, RPL10A, OGT, RPRD1B, TWIST2, LUC7L3, PRPF40A, HIP1, CCNK, TSR1, SSBP1, STRN3, PRKCI, EXOSC1, MBD3, SF3A3, PPP1R9B, MED30, SUGP1, ASCC2, RPL23, SNRNP200, UCHL5, CNOT11, SNRNP40, ATXN1L, C1D

GO:0022613 Ribonucleoprotein complex biogenesis (14) | 7.3 (3.72E-08) | XPO1, TSR1, NOB1, EXOSC1, CNOT7, SF3A3, RPS25, RPL23, PRPF8, SNRNP200, EIF3K, RPL10A, LUC7L3, C1D

The first column lists the gene ontology (GO) term, its biological function, and, in parentheses, the number of proteins found under the term. The second column shows the fold enrichment, which measures the magnitude of enrichment of the GO term by comparing the number of genes present under this term in the sample list to the number of genes under this GO term present in the human genome. The p-value, stated in parentheses, examines the significance of the GO term enrichment using a modified Fisher’s exact test.

Table 3.5. Top 10 DAVID biological processes for proteins isolated from the second purified RSV Gag affinity purifications from HeLa nuclear lysates. Gene Ontology Term Fold Enrichment Gene names for proteins isolated (protein count) (p-value) EIF6, RNMT, CHERP, PRPF4B, AAR2, GAR1, U2SURP, RPL15, SNRPD2, RPS27L, CMTR1, INTS3, DDX27, CDKN2A, GO:0006396 4.3 PLRG1, PCBP1, RPLP1, DHX34, U2AF1, RPP30, DDX21, FTSJ3, LUC7L3, CDK13, PABPN1, SYMPK, GTPBP4, (1.44E-28) EXOSC8, SNRPN, EFTUD2, EXOSC4, EXOSC2, GTF2H4, EXOSC3, GTF2H3, PRPF3, PRPF4, GTF2H2, MRTO4, RNA processing TUT1, NOP2, C1QBP, CPSF4, CPSF3, SNRPE, PRPF38B, KIAA1429, FUS, PPIL1, SNRPB2, POLR2B, DIMT1, (78) HNRNPM, DDX47, LEO1, NAT10, RPS24, GEMIN5, RBM20, RPL26, GRSF1, SNW1, HEATR1, SMAD2, INTS10, HNRNPDL, FBL, DIS3, HNRNPH3, SRSF5, HNRNPH2, SRSF7, HNRNPUL1, POLDIP3, DHX40, MPHOSPH6, UTP20, PES1

CNOT9, RNMT, PRPF4B, ZC3HAV1, AAR2, RPL15, CNOT3, MLH1, EDC4, SNRPD2, CMTR1, CNOT7, PLRG1, GO:0016071 4.8 PCBP1, RPLP1, DHX34, U2AF1, LUC7L3, CDK13, PABPN1, SYMPK, EXOSC8, SNRPN, EFTUD2, EXOSC4, EXOSC2, (1.12E-25) GTF2H4, EXOSC3, GTF2H3, PRPF3, PRPF4, GTF2H2, MRTO4, TUT1, C1QBP, CDK11B, CPSF4, CPSF3, SNRPE, mRNA metabolic PRPF38B, KIAA1429, FUS, PPIL1, SKIV2L, SNRPB2, POLR2B, HNRNPM, DDX47, EIF3E, LEO1, GEMIN5, RPS24, process RBM20, RPL26, GRSF1, SNW1, TNKS1BP1, DIS3, HNRNPH3, SRSF5, HNRNPH2, SRSF7, HNRNPUL1, POLDIP3 (64) AAR2, PNKD, RPL15, ASCC1, CMTR1, INTS3, ILK, RPLP1, DHX34, U2AF1, RPP30, OGT, LUC7L3, ATP6, GTPBP4, GO:0034641 1.7 MED12, MAPK1, RFC3, PPP1CA, NME2, NOP2, AAAS, ASCC2, MED15, PTRF, ZNF384, ADSL, EEFSEC, FUS, (1.16E-21) ZNF131, IGF2BP1, IGF2BP3, DIDO1, ATN1, RAC1, LEO1, TMED10, PBRM1, NAT10, AGRN, MEPCE, RPS24, NKRF, Cellular nitrogen INIP, CREBBP, EPRS, SMAD2, FXR2, TNKS1BP1, CCT7, SRSF5, CCT4, SRSF7, WDR61, PTCD3, ARF4, SUPT16H, compound metabolic DHX40, UTP20, ABCF1, NARS, GAR1, U2SURP, MLH1, GNL3L, QARS, TOP2A, SYMPK, EXOSC8, RBBP5, CTBP1, process CTBP2, EXOSC4, SLC25A6, EXOSC2, GTF2H4, EXOSC3, GTF2H3, CCT6A, GTF2H2, FARSB, CDK11B, PRPF38B, (214) SRP9, KIAA1429, MED1, MTDH, SKIV2L, WRNIP1, CIC, POLR2B, TRIM11, MRPL10, SQSTM1, FEN1, TRIP12, PDS5B, RBM20, RPL26, HNRNPDL, TFRC, HNRNPUL1, POLDIP3, PBX1, TCEB1, MPHOSPH6, PES1, TJP2, EIF6, CNOT9, CHERP, PRPF4B, RNMT, ZC3HAV1, CNOT3, SNRPD2, RPS27L, CNOT7, DDX27, UQCR10, CDKN2A, SLC25A24, SMARCD3, PUM1, ORC5, DDX21, ASPH, ORC3, PABPN1, ACTN4, SNRPN, EFTUD2, ACTN1, CHP1, EEF2, MRTO4, SNRPE, PPIL1, SNRPB2, TRRAP, HNRNPM, DDX47, SMARCB1, EIF3E, GSTK1, EIF3F, DNAJA3, HELLS, TAF4, TAF6, MRPS22, WDR5, SNW1, HEATR1, EHMT2, CPS1, INTS10, FOXP4, FOXP1, SLC25A11, HDAC3, HNRNPH3, HNRNPH2, SLC25A13, PSMC3, EBF3, SMARCC2, SPCS2, SMC1A, PHF6, DAP3, ARID4B, EDC4, UQCRQ, FUBP1, TRIM5, PLRG1, PCBP1, SETMAR, PSMD2, NDUFS3, FTSJ3, AHNAK, CDK13, RING1, PRPF3, PRPF4, MCM4, TUT1, EIF4G2, C1QBP, CPSF4, CPSF3, NUP98, NUP93, 
PAXBP1, NFYA, IARS, DIMT1, MINA, GAPDH, GEMIN5, TCP1, GRSF1, TAB1, FBL, DIS3, ATXN2, GPI, MPG, RPRD2, NFIC, MGST1, NFIA AAR2, PNKD, RPL15, ASCC1, CMTR1, INTS3, ILK, RPLP1, DHX34, U2AF1, RPP30, OGT, LUC7L3, ATP6, GTPBP4, GO:0006807 1.6 MED12, MAPK1, NME2, RFC3, PPP1CA, NOP2, AAAS, ASCC2, MED15, PTRF, ZNF384, ADSL, EEFSEC, FUS, (1.91E-19) ZNF131, IGF2BP1, IGF2BP3, DIDO1, ATN1, RAC1, LEO1, TMED10, PBRM1, NAT10, AGRN, MEPCE, RPS24, NKRF, Nitrogen compound INIP, CREBBP, EPRS, SMAD2, FXR2, TNKS1BP1, CCT7, SRSF5, CCT4, SRSF7, WDR61, PTCD3, ARF4, SUPT16H,

175 metabolic process DHX40, UTP20, ABCF1, NARS, GAR1, U2SURP, MLH1, GNL3L, QARS, TOP2A, SYMPK, EXOSC8, RBBP5, CTBP1, (218) CTBP2, EXOSC4, SLC25A6, EXOSC2, GTF2H4, EXOSC3, GTF2H3, CCT6A, GTF2H2, FARSB, CDK11B, SRP9, PRPF38B, KIAA1429, MED1, MTDH, SKIV2L, WRNIP1, CIC, POLR2B, HMMR, TRIM11, MRPL10, SQSTM1, ACSL3, FEN1, TRIP12, PDS5B, RBM20, RPL26, HNRNPDL, TFRC, HNRNPUL1, POLDIP3, PBX1, TCEB1, MPHOSPH6, PES1, TJP2, EIF6, CNOT9, CHERP, PRPF4B, RNMT, ZC3HAV1, CNOT3, SNRPD2, RPS27L, CNOT7, DDX27, UQCR10, CDKN2A, SLC25A24, SMARCD3, PUM1, ORC5, DDX21, ASPH, ORC3, PABPN1, ACTN4, SNRPN, EFTUD2, ACTN1, CHP1, EEF2, MRTO4, PYCR1, SNRPE, PPIL1, SNRPB2, TRRAP, HNRNPM, DDX47, SMARCB1, EIF3E, GSTK1, EIF3F, DNAJA3, HELLS, TAF4, TAF6, MRPS22, WDR5, SNW1, HEATR1, EHMT2, CPS1, INTS10, FOXP4, FOXP1, SLC25A11, HDAC3, HNRNPH3, HNRNPH2, SLC25A13, PSMC3, EBF3, SMARCC2, SPCS2, SMC1A, PHF6, DAP3, ARID4B, EDC4, UQCRQ, FUBP1, TRIM5, PLRG1, PCBP1, SETMAR, PSMD2, NDUFS3, FTSJ3, AHNAK, CDK13, RING1, PRPF3, PRPF4, MCM4, TUT1, EIF4G2, C1QBP, CPSF4, CPSF3, NUP98, NUP93, PAXBP1, NFYA, IARS, DIMT1, MINA, GAPDH, GEMIN5, TCP1, GRSF1, TAB1, FBL, DIS3, MPG, DBT, ATXN2, GPI, RPRD2, NFIC, MGST1, NFIA EIF6, CNOT9, CHERP, PRPF4B, RNMT, AAR2, ZC3HAV1, RPL15, CNOT3, ASCC1, SNRPD2, RPS27L, CMTR1, GO:0090304 1.8 INTS3, CNOT7, DDX27, CDKN2A, SMARCD3, ILK, RPLP1, DHX34, U2AF1, RPP30, ORC5, DDX21, ASPH, OGT, (2.86E-19) LUC7L3, ORC3, PABPN1, GTPBP4, ACTN4, SNRPN, EFTUD2, MED12, CHP1, ACTN1, MRTO4, MAPK1, NME2, Nucleic acid metabolic RFC3, NOP2, AAAS, ASCC2, PTRF, MED15, ZNF384, SNRPE, FUS, PPIL1, ZNF131, SNRPB2, TRRAP, DIDO1, process HNRNPM, DDX47, ATN1, SMARCB1, EIF3E, RAC1, LEO1, PBRM1, NAT10, AGRN, DNAJA3, MEPCE, HELLS, RPS24, (180) NKRF, TAF4, TAF6, WDR5, INIP, CREBBP, EPRS, SNW1, SMAD2, HEATR1, EHMT2, INTS10, FOXP4, FOXP1, TNKS1BP1, CCT7, HDAC3, SRSF5, HNRNPH3, HNRNPH2, CCT4, WDR61, SRSF7, PSMC3, EBF3, SMARCC2, ARF4, SUPT16H, DHX40, UTP20, SMC1A, PHF6, NARS, GAR1, U2SURP, ARID4B, MLH1, EDC4, GNL3L, QARS, FUBP1, 
TRIM5, PLRG1, PCBP1, SETMAR, TOP2A, FTSJ3, AHNAK, CDK13, SYMPK, EXOSC8, CTBP1, RBBP5, CTBP2, EXOSC4, RING1, GTF2H4, EXOSC2, EXOSC3, GTF2H3, PRPF3, CCT6A, PRPF4, MCM4, GTF2H2, TUT1, C1QBP, FARSB, CDK11B, CPSF4, CPSF3, PRPF38B, KIAA1429, MED1, NUP98, MTDH, SKIV2L, WRNIP1, NUP93, NFYA, PAXBP1, CIC, POLR2B, TRIM11, IARS, DIMT1, MINA, SQSTM1, FEN1, TRIP12, GEMIN5, TCP1, PDS5B, RBM20, GRSF1, RPL26, HNRNPDL, TAB1, FBL, DIS3, MPG, ATXN2, TFRC, RPRD2, HNRNPUL1, POLDIP3, PBX1, TCEB1, MPHOSPH6, PES1, NFIC, NFIA EIF6, CNOT9, CHERP, PRPF4B, RNMT, AAR2, RPL15, CNOT3, ASCC1, SNRPD2, RPS27L, CMTR1, INTS3, CNOT7, GO:0010467 1.7 DDX27, CDKN2A, SLC25A24, SMARCD3, ILK, RPLP1, DHX34, PUM1, U2AF1, RPP30, DDX21, ASPH, OGT, LUC7L3, (3.74E-19) PABPN1, GTPBP4, SNRPN, EFTUD2, MED12, CHP1, EEF2, PRKCDBP, MRTO4, MAPK1, NME2, PPP1CA, NOP2, Gene expression AAAS, ASCC2, MED15, PTRF, SERBP1, ZNF384, EEFSEC, SNRPE, FUS, PPIL1, ZNF131, SNRPB2, IGF2BP1, (183) TRRAP, MAPKAPK2, IGF2BP3, DIDO1, HNRNPM, DDX47, ATN1, SMARCB1, EIF3E, EIF3F, LEO1, PBRM1, NAT10, AGRN, DNAJA3, MEPCE, HELLS, RPS24, NKRF, TAF4, TAF6, MRPS22, WDR5, CREBBP, EPRS, SNW1, SMAD2, HEATR1, FXR2, AFG3L2, EHMT2, INTS10, FOXP4, FOXP1, SLC25A11, HDAC3, SRSF5, HNRNPH3, HNRNPH2, SLC25A13, WDR61, SRSF7, PTCD3, PSMC3, EBF3, SMARCC2, ARF4, SUPT16H, DHX40, SPCS2, UTP20, PHF6, DAP3, ABCF1, NARS, GAR1, U2SURP, ARID4B, QARS, FUBP1, TRIM5, PLRG1, PCBP1, PSMD2, TOP2A, FTSJ3, AHNAK, CDK13, SYMPK, EXOSC8, CTBP1, RBBP5, CTBP2, SLC25A6, EXOSC4, GTF2H4, RING1, EXOSC2, EXOSC3, GTF2H3, PRPF3, PRPF4, GTF2H2, TUT1, EIF4G2, C1QBP, FARSB, CDK11B, CPSF4, CPSF3, SRP9, PRPF38B, KIAA1429, MED1, NUP98, MTDH, LMF2, NUP93, NFYA, PAXBP1, CIC, POLR2B, TRIM11, IARS, DIMT1, MINA, MRPL10, SQSTM1, TNPO1, GAPDH, GEMIN5, ACTC1, RBM20, GRSF1, RPL26, HNRNPDL, TAB1, FBL, DIS3, ATXN2, RPRD2, HNRNPUL1, POLDIP3, PBX1, TCEB1, MPHOSPH6, PES1, NFIC, NFIA

GO:0022613 4.7 EIF6, FASTKD2, AAR2, GAR1, RPL15, SNRPD2, GNL3L, RPS27L, CNOT7, MINA, DIMT1, DDX27, MRPL10, DDX47, (1.11E-17) CDKN2A, EIF3E, EIF3F, RPLP1, RPP30, DDX21, NAT10, FTSJ3, LUC7L3, GEMIN5, RPS24, GTPBP4, EXOSC8, Ribonucleoprotein EXOSC4, RPL26, EXOSC2, EXOSC3, HEATR1, PRPF3, FBL, MRTO4, DIS3, ATXN2, SRSF5, NOP2, C1QBP, NOP16, complex biogenesis UTP20, PES1, MPHOSPH6, SNRPE (45) EIF6, CNOT9, CHERP, PRPF4B, RNMT, AAR2, ZC3HAV1, RPL15, CNOT3, ASCC1, SNRPD2, RPS27L, CMTR1, GO:0006139 1.7 INTS3, CNOT7, DDX27, UQCR10, CDKN2A, SMARCD3, ILK, RPLP1, DHX34, U2AF1, RPP30, ORC5, DDX21, ASPH, (1.18E-17) OGT, LUC7L3, ATP6, ORC3, PABPN1, GTPBP4, ACTN4, SNRPN, EFTUD2, MED12, CHP1, ACTN1, MRTO4, MAPK1, Nucleobase-containing NME2, RFC3, NOP2, AAAS, ASCC2, PTRF, MED15, ZNF384, ADSL, SNRPE, FUS, PPIL1, ZNF131, SNRPB2, TRRAP, compound metabolic DIDO1, HNRNPM, DDX47, ATN1, SMARCB1, EIF3E, RAC1, LEO1, PBRM1, NAT10, AGRN, DNAJA3, MEPCE, HELLS, process RPS24, NKRF, TAF4, TAF6, WDR5, INIP, CREBBP, EPRS, SNW1, SMAD2, HEATR1, CPS1, EHMT2, INTS10, FOXP4, (190) FOXP1, CCT7, TNKS1BP1, HDAC3, SRSF5, HNRNPH3, HNRNPH2, CCT4, SLC25A13, WDR61, SRSF7, PSMC3, EBF3, SMARCC2, ARF4, SUPT16H, DHX40, UTP20, SMC1A, PHF6, NARS, GAR1, U2SURP, ARID4B, MLH1, EDC4, GNL3L, QARS, UQCRQ, FUBP1, TRIM5, PLRG1, PCBP1, SETMAR, NDUFS3, TOP2A, FTSJ3, AHNAK, CDK13, SYMPK, EXOSC8, CTBP1, RBBP5, CTBP2, EXOSC4, GTF2H4, RING1, EXOSC2, EXOSC3, GTF2H3, PRPF3, CCT6A, PRPF4, MCM4, GTF2H2, TUT1, C1QBP, FARSB, CDK11B, CPSF4, CPSF3, PRPF38B, KIAA1429, MED1, NUP98, MTDH, SKIV2L, WRNIP1, NUP93, NFYA, PAXBP1, CIC, POLR2B, TRIM11, IARS, DIMT1, MINA, SQSTM1, GAPDH, FEN1, TRIP12, GEMIN5, TCP1, PDS5B, RBM20, GRSF1, RPL26, HNRNPDL, TAB1, FBL, DIS3, MPG, GPI, ATXN2, TFRC, RPRD2, HNRNPUL1, POLDIP3, PBX1, TCEB1, MPHOSPH6, PES1, NFIC, TJP2, NFIA

FUS, RNMT, PRPF4B, AAR2, PPIL1, SNRPB2, SNRPD2, CMTR1, POLR2B, HNRNPM, DDX47, PLRG1, PCBP1, GO:0006397 4.6 U2AF1, LEO1, LUC7L3, CDK13, GEMIN5, PABPN1, SYMPK, SNRPN, EFTUD2, RBM20, GTF2H4, GRSF1, GTF2H3, (1.72E-16) SNW1, PRPF3, PRPF4, GTF2H2, TUT1, SRSF5, HNRNPH3, HNRNPH2, SRSF7, C1QBP, HNRNPUL1, POLDIP3, mRNA processing CPSF4, CPSF3, SNRPE, PRPF38B, KIAA1429 (43) EIF6, CNOT9, CHERP, PRPF4B, RNMT, AAR2, ZC3HAV1, PNKD, RPL15, CNOT3, ASCC1, SNRPD2, RPS27L, GO:0006725 1.6 CMTR1, INTS3, CNOT7, DDX27, UQCR10, CDKN2A, SMARCD3, ILK, RPLP1, DHX34, U2AF1, RPP30, ORC5, DDX21, (2.47E-16) ASPH, OGT, LUC7L3, ATP6, ORC3, PABPN1, GTPBP4, ACTN4, SNRPN, EFTUD2, MED12, CHP1, ACTN1, MRTO4, Cellular aromatic MAPK1, NME2, RFC3, NOP2, AAAS, ASCC2, PTRF, MED15, ZNF384, ADSL, SNRPE, FUS, PPIL1, ZNF131, SNRPB2, compound metabolic TRRAP, DIDO1, HNRNPM, DDX47, ATN1, SMARCB1, EIF3E, RAC1, LEO1, PBRM1, NAT10, AGRN, DNAJA3, MEPCE, process HELLS, RPS24, NKRF, TAF4, TAF6, WDR5, INIP, CREBBP, EPRS, SNW1, SMAD2, HEATR1, CPS1, EHMT2, INTS10, (191) FOXP4, FOXP1, CCT7, TNKS1BP1, HDAC3, SRSF5, HNRNPH3, HNRNPH2, CCT4, SLC25A13, WDR61, SRSF7, PSMC3, EBF3, SMARCC2, ARF4, SUPT16H, DHX40, UTP20, SMC1A, PHF6, NARS, GAR1, U2SURP, ARID4B, MLH1, EDC4, GNL3L, QARS, UQCRQ, FUBP1, TRIM5, PLRG1, PCBP1, SETMAR, NDUFS3, TOP2A, FTSJ3, AHNAK, CDK13, SYMPK, EXOSC8, CTBP1, RBBP5, CTBP2, EXOSC4, GTF2H4, RING1, EXOSC2, EXOSC3, GTF2H3, PRPF3, CCT6A, PRPF4, MCM4, GTF2H2, TUT1, C1QBP, FARSB, CDK11B, CPSF4, CPSF3, PRPF38B, KIAA1429, MED1, NUP98, MTDH, SKIV2L, WRNIP1, NUP93, NFYA, PAXBP1, CIC, POLR2B, TRIM11, IARS, DIMT1, MINA, SQSTM1, GAPDH, FEN1, TRIP12, GEMIN5, TCP1, PDS5B, RBM20, GRSF1, RPL26, HNRNPDL, TAB1, FBL, DIS3, MPG, GPI, ATXN2, TFRC, RPRD2, HNRNPUL1, POLDIP3, PBX1, TCEB1, MPHOSPH6, PES1, NFIC, TJP2, NFIA

The first column lists the gene ontology (GO) term, its biological function, and, in parentheses, the number of proteins found under the term. The second column shows the fold enrichment, which measures the magnitude of enrichment of the GO term by comparing the number of genes under this term in the sample list with the number of genes under this term in the human genome. The p-value, which assesses the significance of the GO term enrichment using a modified Fisher’s exact test, is also stated.
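The two statistics described in the table note can be illustrated with a minimal Python sketch. It is not the DAVID implementation: it assumes that the sample list is a subset of the background population, that fold enrichment is the ratio of the two annotated proportions, and that the "modified" Fisher's exact test is the one-sided test computed after removing one annotated protein from the sample list (the conservative adjustment commonly described for DAVID's EASE score). All function and argument names are hypothetical.

```python
from math import comb


def fold_enrichment(list_hits, list_total, pop_hits, pop_total):
    """Ratio of the annotated proportion in the sample list to that in the background."""
    return (list_hits / list_total) / (pop_hits / pop_total)


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Returns the probability of observing a or more counts in the top-left cell
    with all margins held fixed (hypergeometric upper tail).
    """
    row1 = a + b
    col1 = a + c
    total = a + b + c + d
    upper = min(row1, col1)
    numerator = sum(
        comb(col1, x) * comb(total - col1, row1 - x) for x in range(a, upper + 1)
    )
    return numerator / comb(total, row1)


def modified_fisher(list_hits, list_total, pop_hits, pop_total):
    """Assumed EASE-style p-value: one annotated protein is removed before testing."""
    a = list_hits                    # in list, annotated with the term
    b = list_total - list_hits       # in list, not annotated
    c = pop_hits - list_hits         # background only, annotated
    d = pop_total - list_total - c   # background only, not annotated
    return fisher_one_sided(max(a - 1, 0), b, c, d)


if __name__ == "__main__":
    # Hypothetical counts chosen only for illustration; they are not taken from the tables.
    print(fold_enrichment(10, 100, 50, 1000))
    print(modified_fisher(10, 100, 50, 1000))
```

Because one hit is removed before testing, a term supported by a single protein cannot reach significance under this adjustment, which is the usual motivation given for preferring it to the unmodified test on small lists.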

Table 3.6. Top 10 nuclear enriched DAVID biological processes for proteins isolated from the second purified RSV Gag affinity purifications from HeLa nuclear lysates. Gene Ontology Term Fold Enrichment Gene names for proteins isolated (protein count) (p-value) EIF6, CNOT9, PRPF4B, RNMT, ZC3HAV1, PNKD, RPL15, CNOT3, ASCC1, SNRPD2, RPS27L, CMTR1, INTS3, GO:0034641 2.1 CNOT7, DDX27, CDKN2A, SMARCD3, ILK, DHX34, U2AF1, RPP30, ORC5, DDX21, OGT, LUC7L3, ORC3, PABPN1, (5.00E-37) GTPBP4, ACTN4, SNRPN, EFTUD2, MED12, CHP1, EEF2, MRTO4, MAPK1, NME2, PPP1CA, RFC3, NOP2, AAAS, Cellular nitrogen ASCC2, MED15, PTRF, ZNF384, EEFSEC, SNRPE, FUS, PPIL1, ZNF131, SNRPB2, IGF2BP1, TRRAP, IGF2BP3, compound metabolic DIDO1, HNRNPM, DDX47, ATN1, SMARCB1, EIF3E, RAC1, LEO1, PBRM1, NAT10, DNAJA3, HELLS, RPS24, NKRF, process TAF4, TAF6, WDR5, INIP, CREBBP, SNW1, SMAD2, HEATR1, FXR2, CPS1, EHMT2, INTS10, FOXP4, FOXP1, (182) TNKS1BP1, SLC25A11, HDAC3, SRSF5, HNRNPH3, HNRNPH2, CCT4, WDR61, SRSF7, PSMC3, EBF3, SMARCC2, SUPT16H, DHX40, UTP20, SMC1A, PHF6, DAP3, ABCF1, GAR1, U2SURP, ARID4B, MLH1, EDC4, GNL3L, FUBP1, TRIM5, PLRG1, PCBP1, SETMAR, PSMD2, NDUFS3, TOP2A, FTSJ3, AHNAK, CDK13, SYMPK, EXOSC8, CTBP1, RBBP5, CTBP2, SLC25A6, EXOSC4, GTF2H4, RING1, EXOSC2, EXOSC3, GTF2H3, PRPF3, PRPF4, MCM4, GTF2H2, TUT1, C1QBP, CDK11B, CPSF4, CPSF3, PRPF38B, KIAA1429, MED1, NUP98, MTDH, SKIV2L, WRNIP1, NUP93, NFYA, PAXBP1, CIC, POLR2B, TRIM11, IARS, DIMT1, MINA, MRPL10, SQSTM1, GAPDH, FEN1, TRIP12, GEMIN5, TCP1, PDS5B, RBM20, HNRNPDL, TAB1, FBL, DIS3, MPG, GPI, ATXN2, RPRD2, HNRNPUL1, POLDIP3, PBX1, TCEB1, MPHOSPH6, PES1, NFIC, TJP2, NFIA, MGST1 EIF6, CNOT9, PRPF4B, RNMT, ZC3HAV1, RPL15, CNOT3, ASCC1, SNRPD2, CMTR1, RPS27L, INTS3, CNOT7, GO:0090304 2.3 DDX27, CDKN2A, SMARCD3, ILK, DHX34, U2AF1, RPP30, ORC5, DDX21, OGT, LUC7L3, ORC3, PABPN1, GTPBP4, (7.57E-37) ACTN4, SNRPN, EFTUD2, MED12, CHP1, MRTO4, MAPK1, NME2, RFC3, NOP2, AAAS, ASCC2, PTRF, MED15, Nucleic acid metabolic ZNF384, SNRPE, FUS, PPIL1, ZNF131, 
SNRPB2, TRRAP, DIDO1, HNRNPM, DDX47, ATN1, SMARCB1, EIF3E, RAC1, process LEO1, PBRM1, NAT10, DNAJA3, HELLS, RPS24, NKRF, TAF4, TAF6, WDR5, INIP, CREBBP, SNW1, SMAD2, (163) HEATR1, EHMT2, INTS10, FOXP4, FOXP1, TNKS1BP1, HDAC3, SRSF5, HNRNPH3, HNRNPH2, CCT4, WDR61, SRSF7, PSMC3, EBF3, SMARCC2, SUPT16H, DHX40, UTP20, SMC1A, PHF6, GAR1, ARID4B, U2SURP, MLH1, EDC4, GNL3L, FUBP1, TRIM5, PLRG1, PCBP1, SETMAR, TOP2A, FTSJ3, AHNAK, CDK13, SYMPK, EXOSC8, CTBP1, RBBP5, CTBP2, EXOSC4, GTF2H4, RING1, EXOSC2, EXOSC3, GTF2H3, PRPF3, PRPF4, MCM4, GTF2H2, TUT1, C1QBP, CDK11B, CPSF4, CPSF3, PRPF38B, KIAA1429, MED1, NUP98, MTDH, SKIV2L, WRNIP1, NUP93, NFYA, PAXBP1, CIC, POLR2B, TRIM11, IARS, DIMT1, MINA, SQSTM1, FEN1, TRIP12, GEMIN5, TCP1, PDS5B, RBM20, HNRNPDL, TAB1, FBL, DIS3, MPG, ATXN2, RPRD2, HNRNPUL1, POLDIP3, PBX1, TCEB1, MPHOSPH6, PES1, NFIC, NFIA EIF6, RNMT, PRPF4B, GAR1, U2SURP, RPL15, SNRPD2, RPS27L, CMTR1, INTS3, DDX27, CDKN2A, PLRG1, GO:0006396 5.9 PCBP1, DHX34, U2AF1, RPP30, DDX21, FTSJ3, LUC7L3, CDK13, PABPN1, SYMPK, GTPBP4, EXOSC8, SNRPN, (3.33E-36) EFTUD2, EXOSC4, EXOSC2, GTF2H4, EXOSC3, GTF2H3, PRPF3, PRPF4, GTF2H2, MRTO4, TUT1, NOP2, C1QBP, RNA processing CPSF4, CPSF3, SNRPE, PRPF38B, KIAA1429, FUS, PPIL1, SNRPB2, POLR2B, DIMT1, HNRNPM, DDX47, LEO1, (73) NAT10, RPS24, GEMIN5, RBM20, SNW1, HEATR1, SMAD2, INTS10, HNRNPDL, FBL, DIS3, HNRNPH3, SRSF5, HNRNPH2, SRSF7, HNRNPUL1, POLDIP3, DHX40, MPHOSPH6, UTP20, PES1 EIF6, CNOT9, PRPF4B, RNMT, ZC3HAV1, RPL15, CNOT3, ASCC1, SNRPD2, CMTR1, RPS27L, INTS3, CNOT7, GO:0006139 2.1 DDX27, CDKN2A, SMARCD3, ILK, DHX34, U2AF1, RPP30, ORC5, DDX21, OGT, LUC7L3, ORC3, PABPN1, GTPBP4, (1.14E-33) ACTN4, SNRPN, EFTUD2, MED12, CHP1, MRTO4, MAPK1, NME2, RFC3, NOP2, AAAS, ASCC2, PTRF, MED15,

Nucleobase-containing ZNF384, SNRPE, FUS, PPIL1, ZNF131, SNRPB2, TRRAP, DIDO1, HNRNPM, DDX47, ATN1, SMARCB1, EIF3E, RAC1, compound metabolic LEO1, PBRM1, NAT10, DNAJA3, HELLS, RPS24, NKRF, TAF4, TAF6, WDR5, INIP, CREBBP, SNW1, SMAD2, process HEATR1, CPS1, EHMT2, INTS10, FOXP4, FOXP1, TNKS1BP1, HDAC3, SRSF5, HNRNPH3, HNRNPH2, CCT4, (168) WDR61, SRSF7, PSMC3, EBF3, SMARCC2, SUPT16H, DHX40, UTP20, SMC1A, PHF6, GAR1, U2SURP, ARID4B, MLH1, EDC4, GNL3L, FUBP1, TRIM5, PLRG1, PCBP1, SETMAR, NDUFS3, TOP2A, FTSJ3, AHNAK, CDK13, SYMPK, EXOSC8, CTBP1, RBBP5, CTBP2, EXOSC4, GTF2H4, RING1, EXOSC2, EXOSC3, GTF2H3, PRPF3, PRPF4, MCM4, GTF2H2, TUT1, C1QBP, CDK11B, CPSF4, CPSF3, PRPF38B, KIAA1429, MED1, NUP98, MTDH, SKIV2L, WRNIP1, NUP93, NFYA, PAXBP1, CIC, POLR2B, TRIM11, IARS, DIMT1, MINA, SQSTM1, GAPDH, FEN1, TRIP12, GEMIN5, TCP1, PDS5B, RBM20, HNRNPDL, TAB1, FBL, DIS3, MPG, GPI, ATXN2, RPRD2, HNRNPUL1, POLDIP3, PBX1, TCEB1, MPHOSPH6, PES1, NFIC, TJP2, NFIA EIF6, CNOT9, PRPF4B, RNMT, ZC3HAV1, PNKD, RPL15, CNOT3, ASCC1, SNRPD2, RPS27L, CMTR1, INTS3, GO:0006807 1.9 CNOT7, DDX27, CDKN2A, SMARCD3, ILK, DHX34, U2AF1, RPP30, ORC5, DDX21, OGT, LUC7L3, ORC3, PABPN1, (1.17E-32) GTPBP4, ACTN4, SNRPN, EFTUD2, MED12, CHP1, EEF2, MRTO4, MAPK1, NME2, PPP1CA, RFC3, NOP2, AAAS, Nitrogen compound ASCC2, MED15, PTRF, ZNF384, EEFSEC, SNRPE, FUS, PPIL1, ZNF131, SNRPB2, IGF2BP1, TRRAP, IGF2BP3, metabolic process DIDO1, HNRNPM, DDX47, ATN1, SMARCB1, EIF3E, RAC1, LEO1, PBRM1, NAT10, DNAJA3, HELLS, RPS24, NKRF, (182) TAF4, TAF6, WDR5, INIP, CREBBP, SNW1, SMAD2, HEATR1, FXR2, CPS1, EHMT2, INTS10, FOXP4, FOXP1, TNKS1BP1, SLC25A11, HDAC3, SRSF5, HNRNPH3, HNRNPH2, CCT4, WDR61, SRSF7, PSMC3, EBF3, SMARCC2, SUPT16H, DHX40, UTP20, SMC1A, PHF6, DAP3, ABCF1, GAR1, U2SURP, ARID4B, MLH1, EDC4, GNL3L, FUBP1, TRIM5, PLRG1, PCBP1, SETMAR, PSMD2, NDUFS3, TOP2A, FTSJ3, AHNAK, CDK13, SYMPK, EXOSC8, CTBP1, RBBP5, CTBP2, SLC25A6, EXOSC4, GTF2H4, RING1, EXOSC2, EXOSC3, GTF2H3, PRPF3, PRPF4, MCM4, GTF2H2, TUT1, C1QBP, 
CDK11B, CPSF4, CPSF3, PRPF38B, KIAA1429, MED1, NUP98, MTDH, SKIV2L, WRNIP1, NUP93, NFYA, PAXBP1, CIC, POLR2B, TRIM11, IARS, DIMT1, MINA, MRPL10, SQSTM1, GAPDH, FEN1, TRIP12, GEMIN5, TCP1, PDS5B, RBM20, HNRNPDL, TAB1, FBL, DIS3, MPG, GPI, ATXN2, RPRD2, HNRNPUL1, POLDIP3, PBX1, TCEB1, MPHOSPH6, PES1, NFIC, TJP2, NFIA, MGST1 EIF6, CNOT9, PRPF4B, RNMT, ZC3HAV1, PNKD, RPL15, CNOT3, ASCC1, SNRPD2, CMTR1, RPS27L, INTS3, GO:0006725 2.1 CNOT7, DDX27, CDKN2A, SMARCD3, ILK, DHX34, U2AF1, RPP30, ORC5, DDX21, OGT, LUC7L3, ORC3, PABPN1, (1.18E-32) GTPBP4, ACTN4, SNRPN, EFTUD2, MED12, CHP1, MRTO4, MAPK1, NME2, RFC3, NOP2, AAAS, ASCC2, PTRF, Cellular aromatic MED15, ZNF384, SNRPE, FUS, PPIL1, ZNF131, SNRPB2, TRRAP, DIDO1, HNRNPM, DDX47, ATN1, SMARCB1, compound metabolic EIF3E, RAC1, LEO1, PBRM1, NAT10, DNAJA3, HELLS, RPS24, NKRF, TAF4, TAF6, WDR5, INIP, CREBBP, SNW1, process SMAD2, HEATR1, CPS1, EHMT2, INTS10, FOXP4, FOXP1, TNKS1BP1, HDAC3, SRSF5, HNRNPH3, HNRNPH2, (169) CCT4, WDR61, SRSF7, PSMC3, EBF3, SMARCC2, SUPT16H, DHX40, UTP20, SMC1A, PHF6, GAR1, U2SURP, ARID4B, MLH1, EDC4, GNL3L, FUBP1, TRIM5, PLRG1, PCBP1, SETMAR, NDUFS3, TOP2A, FTSJ3, AHNAK, CDK13, SYMPK, EXOSC8, CTBP1, RBBP5, CTBP2, EXOSC4, GTF2H4, RING1, EXOSC2, EXOSC3, GTF2H3, PRPF3, PRPF4, MCM4, GTF2H2, TUT1, C1QBP, CDK11B, CPSF4, CPSF3, PRPF38B, KIAA1429, MED1, NUP98, MTDH, SKIV2L, WRNIP1, NUP93, NFYA, PAXBP1, CIC, POLR2B, TRIM11, IARS, DIMT1, MINA, SQSTM1, GAPDH, FEN1, TRIP12, GEMIN5, TCP1, PDS5B, RBM20, HNRNPDL, TAB1, FBL, DIS3, MPG, GPI, ATXN2, RPRD2, HNRNPUL1, POLDIP3, PBX1, TCEB1, MPHOSPH6, PES1, NFIC, TJP2, NFIA EIF6, CNOT9, PRPF4B, RNMT, ZC3HAV1, RPL15, CNOT3, ASCC1, SNRPD2, CMTR1, RPS27L, INTS3, CNOT7, GO:0046483 2.1 DDX27, CDKN2A, SMARCD3, ILK, DHX34, U2AF1, RPP30, ORC5, DDX21, OGT, LUC7L3, ORC3, PABPN1, GTPBP4, (2.14E-32) ACTN4, SNRPN, EFTUD2, MED12, CHP1, MRTO4, MAPK1, NME2, RFC3, NOP2, AAAS, ASCC2, PTRF, MED15, Heterocycle metabolic ZNF384, SNRPE, FUS, PPIL1, ZNF131, SNRPB2, TRRAP, DIDO1, HNRNPM, DDX47, 
ATN1, SMARCB1, EIF3E, RAC1,

process LEO1, PBRM1, NAT10, DNAJA3, HELLS, RPS24, NKRF, TAF4, TAF6, WDR5, INIP, CREBBP, SNW1, SMAD2, (168) HEATR1, CPS1, EHMT2, INTS10, FOXP4, FOXP1, TNKS1BP1, HDAC3, SRSF5, HNRNPH3, HNRNPH2, CCT4, WDR61, SRSF7, PSMC3, EBF3, SMARCC2, SUPT16H, DHX40, UTP20, SMC1A, PHF6, GAR1, U2SURP, ARID4B, MLH1, EDC4, GNL3L, FUBP1, TRIM5, PLRG1, PCBP1, SETMAR, NDUFS3, TOP2A, FTSJ3, AHNAK, CDK13, SYMPK, EXOSC8, CTBP1, RBBP5, CTBP2, EXOSC4, GTF2H4, RING1, EXOSC2, EXOSC3, GTF2H3, PRPF3, PRPF4, MCM4, GTF2H2, TUT1, C1QBP, CDK11B, CPSF4, CPSF3, PRPF38B, KIAA1429, MED1, NUP98, MTDH, SKIV2L, WRNIP1, NUP93, NFYA, PAXBP1, CIC, POLR2B, TRIM11, IARS, DIMT1, MINA, SQSTM1, GAPDH, FEN1, TRIP12, GEMIN5, TCP1, PDS5B, RBM20, HNRNPDL, TAB1, FBL, DIS3, MPG, GPI, ATXN2, RPRD2, HNRNPUL1, POLDIP3, PBX1, TCEB1, MPHOSPH6, PES1, NFIC, TJP2, NFIA

CNOT9, RNMT, PRPF4B, ZC3HAV1, RPL15, CNOT3, EDC4, SNRPD2, MLH1, CMTR1, CNOT7, PLRG1, PCBP1, GO:0016071 6.6 DHX34, U2AF1, LUC7L3, CDK13, PABPN1, SYMPK, EXOSC8, SNRPN, EFTUD2, EXOSC4, GTF2H4, EXOSC2, (9.28E-32) GTF2H3, EXOSC3, PRPF3, PRPF4, GTF2H2, MRTO4, TUT1, C1QBP, CDK11B, CPSF4, CPSF3, SNRPE, PRPF38B, mRNA metabolic KIAA1429, FUS, PPIL1, SKIV2L, SNRPB2, POLR2B, HNRNPM, DDX47, EIF3E, LEO1, GEMIN5, RPS24, RBM20, process SNW1, TNKS1BP1, DIS3, HNRNPH3, SRSF5, HNRNPH2, SRSF7, HNRNPUL1, POLDIP3 (60) EIF6, CNOT9, RNMT, PRPF4B, ZC3HAV1, RPL15, CNOT3, ASCC1, SNRPD2, CMTR1, RPS27L, INTS3, CNOT7, GO:0016070 2.3 DDX27, CDKN2A, SMARCD3, ILK, DHX34, U2AF1, RPP30, DDX21, OGT, LUC7L3, PABPN1, GTPBP4, ACTN4, (9.93E-32) SNRPN, EFTUD2, MED12, CHP1, MRTO4, MAPK1, NME2, NOP2, AAAS, ASCC2, PTRF, MED15, ZNF384, SNRPE, RNA metabolic process FUS, PPIL1, ZNF131, SNRPB2, TRRAP, DIDO1, HNRNPM, DDX47, ATN1, SMARCB1, EIF3E, LEO1, PBRM1, NAT10, (148) DNAJA3, HELLS, RPS24, NKRF, TAF4, TAF6, WDR5, CREBBP, SNW1, SMAD2, HEATR1, EHMT2, INTS10, FOXP4, FOXP1, TNKS1BP1, HDAC3, SRSF5, HNRNPH3, HNRNPH2, WDR61, SRSF7, PSMC3, EBF3, SMARCC2, SUPT16H, DHX40, UTP20, PHF6, GAR1, ARID4B, U2SURP, MLH1, EDC4, FUBP1, TRIM5, PLRG1, PCBP1, TOP2A, FTSJ3, AHNAK, CDK13, SYMPK, EXOSC8, RBBP5, CTBP1, CTBP2, EXOSC4, RING1, GTF2H4, EXOSC2, EXOSC3, GTF2H3, PRPF3, PRPF4, GTF2H2, TUT1, C1QBP, CDK11B, CPSF4, CPSF3, PRPF38B, KIAA1429, MED1, NUP98, MTDH, SKIV2L, NUP93, NFYA, PAXBP1, CIC, POLR2B, TRIM11, IARS, DIMT1, MINA, SQSTM1, FEN1, GEMIN5, RBM20, HNRNPDL, TAB1, FBL, DIS3, ATXN2, RPRD2, HNRNPUL1, POLDIP3, PBX1, TCEB1, MPHOSPH6, PES1, NFIC, NFIA EIF6, CNOT9, PRPF4B, RNMT, ZC3HAV1, PNKD, RPL15, CNOT3, ASCC1, SNRPD2, CMTR1, RPS27L, INTS3, GO:1901360 2.0 CNOT7, DDX27, CDKN2A, SMARCD3, ILK, DHX34, U2AF1, RPP30, ORC5, DDX21, OGT, LUC7L3, ORC3, PABPN1, (1.50E-31) GTPBP4, ACTN4, SNRPN, EFTUD2, MED12, CHP1, MRTO4, MAPK1, NME2, RFC3, NOP2, AAAS, ASCC2, PTRF, Organic cyclic MED15, ZNF384, SNRPE, FUS, PPIL1, ZNF131, SNRPB2, TRRAP, 
DIDO1, HNRNPM, DDX47, ATN1, SMARCB1, compound metabolic EIF3E, RAC1, LEO1, PBRM1, NAT10, DNAJA3, HELLS, RPS24, NKRF, TAF4, TAF6, WDR5, INIP, CREBBP, SNW1, process SMAD2, HEATR1, CPS1, EHMT2, INTS10, FOXP4, FOXP1, TNKS1BP1, HDAC3, SRSF5, HNRNPH3, HNRNPH2, (170) CCT4, WDR61, SRSF7, PSMC3, EBF3, SMARCC2, SUPT16H, DHX40, UTP20, SMC1A, PHF6, GAR1, U2SURP, ARID4B, MLH1, EDC4, GNL3L, FUBP1, TRIM5, PLRG1, PCBP1, SETMAR, NDUFS3, TOP2A, FTSJ3, AHNAK, CDK13, SYMPK, EXOSC8, CTBP1, RBBP5, CTBP2, EXOSC4, GTF2H4, RING1, EXOSC2, EXOSC3, GTF2H3, PRPF3, PRPF4, MCM4, GTF2H2, TUT1, C1QBP, CDK11B, CPSF4, CPSF3, PRPF38B, KIAA1429, MED1, NUP98, MTDH, SKIV2L, WRNIP1, NUP93, NFYA, PAXBP1, CIC, POLR2B, TRIM11, IARS, DIMT1, MINA, SQSTM1, DHCR7, GAPDH, FEN1, TRIP12, GEMIN5, TCP1, PDS5B, RBM20, HNRNPDL, TAB1, FBL, DIS3, MPG, GPI, ATXN2, RPRD2, HNRNPUL1, POLDIP3, PBX1, TCEB1, MPHOSPH6, PES1, NFIC, TJP2, NFIA The gene ontology (GO) term, biological function, and the number of proteins found under the terms in parentheses are listed in the first column. The second column

shows the fold enrichment, which measures the magnitude of enrichment of the GO term by comparing the number of genes under this term in the sample list with the number of genes under this term in the human genome. The p-value, which assesses the significance of the GO term enrichment using a modified Fisher’s exact test, is also stated.

Table 3.7. Top 10 DAVID biological processes for proteins isolated from the purified HIV Gag affinity purifications from HeLa nuclear lysates. Gene Ontology Term Fold Enrichment Gene names for proteins isolated (protein count) (p-value) RNMT, AAR2, RPL15, SNRPD2, RPS27L, INTS9, DDX27, CDKN2A, DKC1, PLRG1, RAVER2, PCBP1, U2AF1, CDK12, GO:0006396 4.5 SRRM1, LSM3, RBM10, MDN1, CCAR2, PABPN1, SYMPK, EXOSC4, PTBP1, EXOSC2, GTF2H4, GTF2H3, PRPF3, (7.83E-26) CDC5L, PRPF4, SMN2, GTF2H1, NOP2, C1QBP, MRPS9, LARP7, SNRPA, CPSF4, CPSF3, THOC3, NHP2, PRPF38B, RNA processing CPSF3L, POLR2E, FAM98B, DIMT1, HNRNPM, DDX47, CNOT6L, LEO1, NAT10, RBM26, NOC4L, ELAVL1, SMAD2, (67) HNRNPDL, FBL, SRSF3, HNRNPH3, PPIH, SRSF7, HNRNPUL1, POLDIP3, WDR3, DHX40, UTP20, PES1, PUF60 CNOT9, MRPS34, RNMT, AAR2, RPL15, SNRPD2, RPS27L, MED21, MRPS31, CTNNB1, INTS9, CSNK2A2, DDX27, GO:0010467 1.9 TOP1, RAD21, CDKN2A, RAVER2, PUM1, U2AF1, SRRM1, LSM3, ASPH, OGT, RBM10, CCAR2, PABPN1, PTBP1, (1.37E-21) MED13, PPP1CA, NOP2, KDM2A, BAZ1B, RFC1, PTRF, MED15, TRIM33, EIF2S1, SERBP1, HSPB1, SNRPA, Gene expression TGFB1I1, ERC1, XRN1, CPSF3L, MRPS14, CLU, IGF2BP1, TRRAP, IGF2BP3, HNRNPM, MOV10, DDX47, CNOT6L, (162) SMARCB1, EIF3F, LEO1, PBRM1, NAT10, UBAP2L, DNAJA3, MEPCE, HELLS, MRPS27, TRIP4, NOC4L, MRPS23, MYO1C, MRPS22, WDR5, CREBBP, SMAD2, AFG3L2, FOXP1, FXR1, SRSF3, SLC25A11, HDAC3, HNRNPH3, PPIH, SLC25A13, WDR61, SRSF7, PTCD3, PSMC3, SMARCC2, DNMT1, WDR3, CARS2, TMPO, DHX40, UTP20, PUF60, DAP3, ABCF1, NARS, HP1BP3, ARID4B, NUP188, QARS, DKC1, VWA9, PLRG1, PCBP1, CDK12, PSMD2, PSMD5, MDN1, AHNAK, SYMPK, SLC25A4, EXOSC4, GTF2H4, RING1, EXOSC2, GTF2H3, PRPF3, CDC5L, PRPF4, SMN2, PURA, GTF2H1, CD3EAP, EIF4G2, C1QBP, MRPS9, LARP7, MED9, CDK11B, CPSF4, CPSF3, NHP2, THOC3, SRP9, PRPF38B, BTAF1, NUP98, POLR2E, ASUN, FAM98B, CIC, IARS, DIMT1, MINA, MRPL10, SQSTM1, GATAD2A, GAPDH, RBM26, ELAVL1, ILF3, ETF1, HNRNPDL, TAB1, FBL, ATXN2, BRMS1, HNRNPUL1, POLDIP3, PBX1, TCEB1, PES1, DNM2

GO:0016071 4.9 CNOT9, RNMT, ZC3HAV1, AAR2, RPL15, SNRPD2, DKC1, PLRG1, RAVER2, PCBP1, U2AF1, CDK12, SRRM1, LSM3, (1.56E-21) RBM10, CCAR2, PABPN1, SYMPK, EXOSC4, PTBP1, GTF2H4, EXOSC2, GTF2H3, PRPF3, CDC5L, PRPF4, SMN2, mRNA metabolic GTF2H1, C1QBP, SNRPA, CDK11B, CPSF4, CPSF3, THOC3, XRN1, PRPF38B, POLR2E, SKIV2L, HNRNPM, DDX47, process MOV10, CNOT6L, LEO1, RBM26, ELAVL1, ETF1, SRSF3, PPIH, HNRNPH3, SRSF7, HNRNPUL1, POLDIP3, PUF60 (53) CNOT9, MRPS34, RNMT, AAR2, ZC3HAV1, RPL15, SNRPD2, RPS27L, MED21, MRPS31, CTNNB1, INTS9, GO:0034641 1.7 CSNK2A2, DDX27, TOP1, RAD21, CDKN2A, RAVER2, DDX24, PUM1, U2AF1, SRRM1, ORC5, LSM3, ASPH, OGT, (1.04E-20) RBM10, CCAR2, PABPN1, ACTN4, PTBP1, MED13, PPP1CA, NOP2, NNT, KDM2A, TRIM33, BAZ1B, RFC1, PTRF, Cellular nitrogen HUWE1, MED15, EIF2S1, ADSL, HSPB1, SNRPA, TGFB1I1, ERC1, XRN1, CPSF3L, MRPS14, CLU, IGF2BP1, TRRAP, compound metabolic IGF2BP3, HNRNPM, MOV10, DDX47, CNOT6L, SMARCB1, EIF3F, LEO1, PBRM1, NAT10, DNAJA3, MEPCE, HELLS, process MRPS27, TRIP4, NOC4L, MKI67, MRPS23, MRPS22, WDR5, INIP, CREBBP, SMAD2, FOXP1, FXR1, SRSF3, CCT7, (181) SLC25A11, HDAC3, HNRNPH3, PPIH, CCT4, SLC25A13, WDR61, SRSF7, PTCD3, PSMC3, CCT8, SMARCC2, DNMT1, WDR3, CARS2, TMPO, DHX40, UTP20, SMC1A, PUF60, DAP3, ABCF1, NARS, HP1BP3, ARID4B, NUP188, QARS, UQCRQ, ANKRD17, DKC1, PLRG1, VWA9, FANCI, PCBP1, CDK12, PSMD2, PSMD5, NDUFS3, MDN1, AHNAK, SYMPK, SLC25A4, EXOSC4, GTF2H4, RING1, EXOSC2, GTF2H3, PRPF3, CDC5L, CCT6A, PRPF4, SMN2, MCM4, PURA, GTF2H1, CD3EAP, EIF4G2, C1QBP, MRPS9, LARP7, MED9, CDK11B, CPSF4, CPSF3, NHP2, THOC3,

SRP9, PRPF38B, BTAF1, NUP98, ASUN, POLR2E, FAM98B, SKIV2L, CIC, IARS, DIMT1, MINA, MRPL10, SQSTM1, GATAD2A, GAPDH, TRIP12, RBM26, TCP1, ELAVL1, ILF3, ETF1, HNRNPDL, TAB1, FBL, ATXN2, BRMS1, TFRC, HNRNPUL1, POLDIP3, PBX1, TCEB1, PES1, DNM2

RNMT, POLR2E, AAR2, SNRPD2, HNRNPM, DDX47, PLRG1, CNOT6L, RAVER2, PCBP1, U2AF1, CDK12, LEO1, GO:0006397 5.5 SRRM1, LSM3, RBM10, CCAR2, RBM26, PABPN1, SYMPK, PTBP1, GTF2H4, GTF2H3, ELAVL1, PRPF3, CDC5L, (6.82E-19) PRPF4, SMN2, GTF2H1, SRSF3, PPIH, HNRNPH3, SRSF7, C1QBP, HNRNPUL1, POLDIP3, SNRPA, CPSF4, CPSF3, mRNA processing THOC3, PRPF38B, PUF60 (42) CNOT9, MRPS34, RNMT, AAR2, ZC3HAV1, RPL15, SNRPD2, RPS27L, MED21, MRPS31, CTNNB1, INTS9, GO:0006807 1.6 CSNK2A2, DDX27, TOP1, RAD21, CDKN2A, RAVER2, DDX24, PUM1, U2AF1, SRRM1, ORC5, LSM3, ASPH, OGT, (3.68E-18) RBM10, CCAR2, PABPN1, ACTN4, PTBP1, MED13, PYCR1, PPP1CA, NOP2, NNT, KDM2A, TRIM33, BAZ1B, RFC1, Nitrogen compound PTRF, HUWE1, MED15, EIF2S1, ADSL, HSPB1, SNRPA, TGFB1I1, ERC1, XRN1, CPSF3L, MRPS14, CLU, IGF2BP1, metabolic process TRRAP, IGF2BP3, HNRNPM, MOV10, DDX47, CNOT6L, SMARCB1, EIF3F, LEO1, PBRM1, NAT10, DNAJA3, MEPCE, (183) HELLS, MRPS27, TRIP4, NOC4L, MKI67, MRPS23, MRPS22, WDR5, INIP, CREBBP, SMAD2, FOXP1, FXR1, SRSF3, CCT7, SLC25A11, HDAC3, HNRNPH3, PPIH, CCT4, SLC25A13, WDR61, SRSF7, PTCD3, PSMC3, CCT8, SMARCC2, DNMT1, WDR3, CARS2, TMPO, DHX40, UTP20, SMC1A, PUF60, DAP3, ABCF1, NARS, HP1BP3, ARID4B, NUP188, QARS, UQCRQ, ANKRD17, DKC1, PLRG1, VWA9, FANCI, PCBP1, CDK12, PSMD2, PSMD5, PTDSS1, NDUFS3, MDN1, AHNAK, SYMPK, SLC25A4, EXOSC4, GTF2H4, RING1, EXOSC2, GTF2H3, PRPF3, CDC5L, CCT6A, PRPF4, SMN2, MCM4, PURA, GTF2H1, CD3EAP, EIF4G2, C1QBP, MRPS9, LARP7, MED9, CDK11B, CPSF4, CPSF3, NHP2, THOC3, SRP9, PRPF38B, BTAF1, NUP98, ASUN, POLR2E, FAM98B, SKIV2L, CIC, IARS, DIMT1, MINA, MRPL10, SQSTM1, GATAD2A, GAPDH, TRIP12, RBM26, TCP1, ELAVL1, ILF3, ETF1, HNRNPDL, TAB1, FBL, ATXN2, BRMS1, TFRC, HNRNPUL1, POLDIP3, PBX1, TCEB1, PES1, DNM2 CNOT9, RNMT, AAR2, ZC3HAV1, RPL15, SNRPD2, RPS27L, MED21, CTNNB1, INTS9, CSNK2A2, DDX27, TOP1, GO:0090304 1.8 RAD21, CDKN2A, RAVER2, DDX24, U2AF1, SRRM1, ORC5, LSM3, ASPH, OGT, RBM10, CCAR2, PABPN1, ACTN4, (1.39E-17) PTBP1, MED13, NOP2, KDM2A, RFC1, BAZ1B, PTRF, HUWE1, MED15, 
TRIM33, SNRPA, TGFB1I1, ERC1, XRN1, Nucleic acid metabolic CPSF3L, CLU, TRRAP, HNRNPM, MOV10, DDX47, CNOT6L, SMARCB1, LEO1, PBRM1, NAT10, DNAJA3, MEPCE, process HELLS, TRIP4, NOC4L, MKI67, WDR5, INIP, CREBBP, SMAD2, FOXP1, SRSF3, CCT7, HDAC3, HNRNPH3, PPIH, (151) CCT4, WDR61, SRSF7, PSMC3, CCT8, SMARCC2, WDR3, DNMT1, CARS2, TMPO, DHX40, UTP20, SMC1A, PUF60, NARS, HP1BP3, ARID4B, NUP188, QARS, ANKRD17, DKC1, VWA9, PLRG1, FANCI, PCBP1, CDK12, MDN1, AHNAK, SYMPK, EXOSC4, RING1, GTF2H4, EXOSC2, GTF2H3, PRPF3, CDC5L, CCT6A, PRPF4, SMN2, MCM4, PURA, GTF2H1, CD3EAP, C1QBP, MRPS9, LARP7, MED9, CDK11B, CPSF4, CPSF3, NHP2, THOC3, PRPF38B, BTAF1, NUP98, POLR2E, ASUN, FAM98B, SKIV2L, CIC, IARS, DIMT1, MINA, SQSTM1, GATAD2A, TRIP12, RBM26, TCP1, ELAVL1, ILF3, ETF1, HNRNPDL, TAB1, FBL, ATXN2, BRMS1, TFRC, HNRNPUL1, POLDIP3, PBX1, TCEB1, PES1, DNM2 AAR2, RPL15, MED21, CTNNB1, INTS9, RAVER2, PLOD2, U2AF1, LSM3, OGT, CCAR2, PTBP1, MED13, PPP1CA, GO:0044260 1.5 NOP2, MED15, RFC1, PTRF, BAZ1B, HUWE1, EIF2S1, TGFB1I1, CPSF3L, IGF2BP1, IGF2BP3, MOV10, LEO1, (5.98E-16) PBRM1, NAT10, MEPCE, NOC4L, MKI67, INIP, CREBBP, TRIO, SMAD2, AFG3L2, FXR1, SRSF3, CCT7, CCT4, Cellular macromolecule SRSF7, WDR61, PTCD3, CCT8, DHX40, UTP20, CDC42BPB, ABCF1, KIAA0368, NARS, QARS, ANKRD17, VWA9, metabolic process FANCI, SYMPK, SLC25A4, EXOSC4, TPX2, EXOSC2, GTF2H4, GTF2H3, CCT6A, SMN2, GTF2H1, MED9, CDK11B, (201) NHP2, THOC3, PRPF38B, SRP9, POLR2E, ASUN, FAM98B, SKIV2L, CBLL1, CIC, MRPL10, STT3A, SQSTM1, GATAD2A, TRIP12, BUB3, RBM26, ELAVL1, ILF3, ETF1, HNRNPDL, ERP44, BRMS1, TFRC, HNRNPUL1, POLDIP3,

PBX1, TCEB1, PES1, CIT, DNM2, MRPS34, CNOT9, RNMT, ZC3HAV1, SNRPD2, RPS27L, MRPS31, DDX27, CSNK2A2, TOP1, CDKN2A, RAD21, DDX24, TGFBI, PUM1, SRRM1, ORC5, ASPH, RBM10, PABPN1, ACTN4, KDM2A, TRIM33, HSPB1, SNRPA, ERC1, XRN1, MRPS14, CLU, TRRAP, HNRNPM, DDX47, CNOT6L, SMARCB1, EIF3F, DNAJA3, HELLS, MRPS27, TRIP4, MRPS23, MRPS22, WDR5, FOXP1, CORO1C, SLC25A11, HDAC3, HNRNPH3, PPIH, SLC25A13, PPIB, PSMC3, SMARCC2, DNMT1, WDR3, CARS2, TMPO, SMC1A, PUF60, DAP3, HP1BP3, ARID4B, NUP188, PIGK, DKC1, PLRG1, PCBP1, PSMD2, CDK12, RHOA, PSMD5, MDN1, AHNAK, KIF14, RING1, MINK1, PIGS, PRPF3, CDC5L, MCM4, PRPF4, PURA, CD3EAP, EIF4G2, C1QBP, MRPS9, LARP7, CPSF4, CPSF3, OSTC, BTAF1, GALNT1, NUP98, IARS, DIMT1, MINA, TGM2, GAPDH, EHD4, TCP1, TAB1, FBL, ATXN2, SDF2L1

ABCF1, MRPS34, CNOT9, NARS, MRPS14, RPL15, IGF2BP1, QARS, RPS27L, IGF2BP3, MRPS31, IARS, MRPL10, GO:0006412 4.2 MOV10, CNOT6L, EIF3F, PUM1, GAPDH, MRPS27, SLC25A4, MRPS23, MRPS22, PTBP1, GTF2H3, ELAVL1, ILF3, (1.58E-15) SMAD2, ETF1, PURA, FXR1, SLC25A11, EIF4G2, PPP1CA, SLC25A13, C1QBP, MRPS9, PTCD3, POLDIP3, EIF2S1, Translation HSPB1, CARS2, NHP2, SRP9, DAP3 (44) CNOT9, RNMT, AAR2, ZC3HAV1, RPL15, SNRPD2, RPS27L, MED21, CTNNB1, INTS9, CSNK2A2, DDX27, TOP1, GO:0006139 1.7 RAD21, CDKN2A, RAVER2, DDX24, U2AF1, SRRM1, ORC5, LSM3, ASPH, OGT, RBM10, CCAR2, PABPN1, ACTN4, (4.71E-15) PTBP1, MED13, NOP2, NNT, KDM2A, RFC1, BAZ1B, PTRF, HUWE1, MED15, TRIM33, SNRPA, ADSL, TGFB1I1, Nucleobase-containing ERC1, XRN1, CPSF3L, CLU, TRRAP, HNRNPM, MOV10, DDX47, CNOT6L, SMARCB1, LEO1, PBRM1, NAT10, compound metabolic DNAJA3, MEPCE, HELLS, TRIP4, NOC4L, MKI67, WDR5, INIP, CREBBP, SMAD2, FOXP1, SRSF3, CCT7, HDAC3, process HNRNPH3, PPIH, SLC25A13, CCT4, WDR61, SRSF7, PSMC3, CCT8, SMARCC2, WDR3, DNMT1, CARS2, TMPO, (157) DHX40, UTP20, SMC1A, PUF60, NARS, HP1BP3, ARID4B, NUP188, QARS, UQCRQ, ANKRD17, DKC1, VWA9, PLRG1, FANCI, PCBP1, CDK12, NDUFS3, MDN1, AHNAK, SYMPK, EXOSC4, RING1, GTF2H4, EXOSC2, GTF2H3, PRPF3, CDC5L, CCT6A, PRPF4, SMN2, MCM4, PURA, GTF2H1, CD3EAP, C1QBP, MRPS9, LARP7, MED9, CDK11B, CPSF4, CPSF3, NHP2, THOC3, PRPF38B, BTAF1, NUP98, POLR2E, ASUN, FAM98B, SKIV2L, CIC, IARS, DIMT1, MINA, SQSTM1, GATAD2A, GAPDH, TRIP12, RBM26, TCP1, ELAVL1, ILF3, ETF1, HNRNPDL, TAB1, FBL, ATXN2, BRMS1, TFRC, HNRNPUL1, POLDIP3, PBX1, TCEB1, PES1, DNM2 The gene ontology (GO) term, biological function, and the number of proteins found under the terms in parentheses are listed in the first column. The second column shows the fold enrichment which measures the magnitude of enrichment of the GO term when comparing the number of genes present under this term from the sample list to the number of this GO term genes present in the human genome. 
The p-value, which examines the significance of the GO term enrichment using a modified Fisher’s exact test, is also stated.
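The two statistics described in this legend can be illustrated with a short sketch. It assumes the definitions commonly given for DAVID: fold enrichment is the proportion of sample-list genes annotated with a term divided by the proportion of background genes annotated with that term, and the modified Fisher's exact test (the EASE score) is a one-tailed hypergeometric test computed after subtracting one from the number of sample-list hits. The function names and the example counts below are hypothetical and are not taken from the tables.

```python
from math import comb


def fold_enrichment(list_hits, list_total, pop_hits, pop_total):
    """Ratio of the term's frequency in the sample list to its frequency in the background."""
    return (list_hits / list_total) / (pop_hits / pop_total)


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn from N background genes, K of which carry the term."""
    total = comb(N, n)
    upper = min(n, K)
    # math.comb returns 0 when the lower argument exceeds the upper one,
    # so impossible draws contribute nothing to the sum.
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(max(k, 0), upper + 1))
    return favourable / total


def ease_score(list_hits, list_total, pop_hits, pop_total):
    """One-tailed Fisher's exact test with one hit removed (assumed EASE modification)."""
    return hypergeom_upper_tail(list_hits - 1, list_total, pop_hits, pop_total)


# Hypothetical counts: 10 of 100 sample proteins and 200 of 20000 background genes carry the term.
print(fold_enrichment(10, 100, 200, 20000))
print(ease_score(10, 100, 200, 20000))
```

Removing one hit before computing the tail probability makes the test more conservative than the standard Fisher's exact test, which penalizes terms supported by only a few proteins.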


Table 3.8. Top 10 nuclear enriched DAVID biological processes for proteins isolated from the purified HIV Gag affinity purifications from HeLa nuclear lysates. Gene Ontology Term Fold Enrichment Gene names for proteins isolated (protein count) (p-value) CNOT9, RNMT, ZC3HAV1, RPL15, SNRPD2, RPS27L, MED21, MRPS31, CTNNB1, INTS9, CSNK2A2, DDX27, TOP1, GO:0034641 2.2 RAD21, CDKN2A, RAVER2, DDX24, U2AF1, SRRM1, ORC5, LSM3, OGT, RBM10, CCAR2, PABPN1, ACTN4, PTBP1, (1.38E-36) MED13, PPP1CA, NOP2, KDM2A, BAZ1B, RFC1, PTRF, HUWE1, MED15, TRIM33, EIF2S1, HSPB1, SNRPA, Cellular nitrogen TGFB1I1, XRN1, CPSF3L, MRPS14, CLU, IGF2BP1, TRRAP, IGF2BP3, HNRNPM, DDX47, CNOT6L, SMARCB1, compound metabolic LEO1, PBRM1, NAT10, DNAJA3, HELLS, TRIP4, NOC4L, MKI67, MRPS23, WDR5, INIP, CREBBP, SMAD2, FOXP1, process FXR1, SRSF3, SLC25A11, HDAC3, HNRNPH3, PPIH, CCT4, WDR61, SRSF7, PSMC3, CCT8, SMARCC2, DNMT1, (157) WDR3, TMPO, DHX40, UTP20, SMC1A, PUF60, DAP3, ABCF1, HP1BP3, ARID4B, NUP188, ANKRD17, DKC1, VWA9, PLRG1, FANCI, PCBP1, CDK12, PSMD2, PSMD5, NDUFS3, MDN1, AHNAK, SYMPK, SLC25A4, EXOSC4, GTF2H4, RING1, EXOSC2, GTF2H3, PRPF3, CDC5L, PRPF4, SMN2, MCM4, PURA, GTF2H1, CD3EAP, C1QBP, MRPS9, LARP7, MED9, CDK11B, CPSF4, CPSF3, NHP2, THOC3, PRPF38B, BTAF1, NUP98, POLR2E, ASUN, FAM98B, SKIV2L, CIC, IARS, DIMT1, MINA, MRPL10, SQSTM1, GATAD2A, GAPDH, TRIP12, TCP1, ELAVL1, ILF3, ETF1, HNRNPDL, TAB1, FBL, ATXN2, BRMS1, HNRNPUL1, POLDIP3, PBX1, TCEB1, PES1, DNM2 CNOT9, RNMT, RPL15, SNRPD2, RPS27L, MED21, MRPS31, CTNNB1, INTS9, CSNK2A2, DDX27, TOP1, RAD21, GO:0010467 2.4 CDKN2A, RAVER2, U2AF1, SRRM1, LSM3, OGT, RBM10, CCAR2, PABPN1, PTBP1, MED13, PPP1CA, NOP2, (5.83E-36) KDM2A, BAZ1B, RFC1, PTRF, MED15, TRIM33, EIF2S1, SERBP1, HSPB1, SNRPA, TGFB1I1, XRN1, CPSF3L, Gene expression MRPS14, CLU, IGF2BP1, TRRAP, IGF2BP3, HNRNPM, DDX47, CNOT6L, SMARCB1, LEO1, PBRM1, NAT10, (143) UBAP2L, DNAJA3, HELLS, TRIP4, NOC4L, MRPS23, MYO1C, WDR5, CREBBP, SMAD2, FOXP1, FXR1, SRSF3, SLC25A11, HDAC3, HNRNPH3, PPIH, 
WDR61, SRSF7, PSMC3, SMARCC2, WDR3, DNMT1, TMPO, DHX40, UTP20, PUF60, DAP3, ABCF1, HP1BP3, ARID4B, NUP188, VWA9, PLRG1, DKC1, PCBP1, CDK12, PSMD2, PSMD5, MDN1, AHNAK, SYMPK, SLC25A4, EXOSC4, GTF2H4, RING1, EXOSC2, GTF2H3, PRPF3, CDC5L, PRPF4, SMN2, PURA, GTF2H1, CD3EAP, C1QBP, MRPS9, LARP7, MED9, CDK11B, CPSF4, CPSF3, NHP2, THOC3, PRPF38B, BTAF1, NUP98, POLR2E, ASUN, FAM98B, CIC, IARS, DIMT1, MINA, MRPL10, SQSTM1, GATAD2A, GAPDH, ELAVL1, ILF3, HNRNPDL, TAB1, ETF1, FBL, ATXN2, BRMS1, HNRNPUL1, POLDIP3, PBX1, TCEB1, PES1, DNM2 RNMT, RPL15, SNRPD2, RPS27L, INTS9, DDX27, CDKN2A, DKC1, PLRG1, RAVER2, PCBP1, U2AF1, CDK12, GO:0006396 6.4 SRRM1, LSM3, RBM10, MDN1, CCAR2, PABPN1, SYMPK, EXOSC4, PTBP1, GTF2H4, EXOSC2, GTF2H3, PRPF3, (1.68E-34) CDC5L, PRPF4, SMN2, GTF2H1, NOP2, C1QBP, MRPS9, LARP7, SNRPA, CPSF4, CPSF3, THOC3, NHP2, PRPF38B, RNA processing CPSF3L, POLR2E, FAM98B, DIMT1, HNRNPM, DDX47, CNOT6L, LEO1, NAT10, NOC4L, ELAVL1, SMAD2, (65) HNRNPDL, FBL, SRSF3, HNRNPH3, PPIH, SRSF7, HNRNPUL1, POLDIP3, WDR3, DHX40, UTP20, PES1, PUF60 CNOT9, RNMT, ZC3HAV1, RPL15, SNRPD2, RPS27L, MED21, CTNNB1, INTS9, CSNK2A2, DDX27, TOP1, RAD21, GO:0090304 2.4 CDKN2A, RAVER2, DDX24, U2AF1, SRRM1, ORC5, LSM3, OGT, RBM10, CCAR2, PABPN1, ACTN4, PTBP1, MED13, (3.85E-34) NOP2, KDM2A, PTRF, RFC1, BAZ1B, HUWE1, MED15, TRIM33, SNRPA, TGFB1I1, XRN1, CPSF3L, CLU, TRRAP, Nucleic acid metabolic HNRNPM, DDX47, CNOT6L, SMARCB1, LEO1, PBRM1, NAT10, DNAJA3, HELLS, TRIP4, NOC4L, MKI67, WDR5, process INIP, CREBBP, SMAD2, FOXP1, SRSF3, HDAC3, PPIH, HNRNPH3, CCT4, WDR61, SRSF7, PSMC3, CCT8, (139) SMARCC2, WDR3, DNMT1, TMPO, DHX40, UTP20, SMC1A, PUF60, HP1BP3, ARID4B, NUP188, ANKRD17, VWA9, PLRG1, DKC1, FANCI, PCBP1, CDK12, MDN1, AHNAK, SYMPK, EXOSC4, RING1, GTF2H4, EXOSC2, GTF2H3, PRPF3, CDC5L, PRPF4, SMN2, MCM4, PURA, GTF2H1, CD3EAP, C1QBP, MRPS9, LARP7, MED9, CDK11B, CPSF4,


CPSF3, NHP2, THOC3, PRPF38B, BTAF1, NUP98, POLR2E, ASUN, FAM98B, SKIV2L, CIC, IARS, DIMT1, MINA, SQSTM1, GATAD2A, TRIP12, TCP1, ELAVL1, ILF3, HNRNPDL, TAB1, ETF1, FBL, ATXN2, BRMS1, HNRNPUL1, POLDIP3, PBX1, TCEB1, PES1, DNM2 CNOT9, RNMT, ZC3HAV1, RPL15, SNRPD2, RPS27L, MED21, MRPS31, CTNNB1, INTS9, CSNK2A2, DDX27, TOP1, GO:0006807 2.0 RAD21, CDKN2A, RAVER2, DDX24, U2AF1, SRRM1, ORC5, LSM3, OGT, RBM10, CCAR2, PABPN1, ACTN4, PTBP1, (1.15E-32) MED13, PPP1CA, NOP2, KDM2A, BAZ1B, RFC1, PTRF, HUWE1, MED15, TRIM33, EIF2S1, HSPB1, SNRPA, Nitrogen compound TGFB1I1, XRN1, CPSF3L, MRPS14, CLU, IGF2BP1, TRRAP, IGF2BP3, HNRNPM, DDX47, CNOT6L, SMARCB1, metabolic process LEO1, PBRM1, NAT10, DNAJA3, HELLS, TRIP4, NOC4L, MKI67, MRPS23, WDR5, INIP, CREBBP, SMAD2, FOXP1, (157) FXR1, SRSF3, SLC25A11, HDAC3, HNRNPH3, PPIH, CCT4, WDR61, SRSF7, PSMC3, CCT8, SMARCC2, DNMT1, WDR3, TMPO, DHX40, UTP20, SMC1A, PUF60, DAP3, ABCF1, HP1BP3, ARID4B, NUP188, ANKRD17, DKC1, VWA9, PLRG1, FANCI, PCBP1, CDK12, PSMD2, PSMD5, NDUFS3, MDN1, AHNAK, SYMPK, SLC25A4, EXOSC4, GTF2H4, RING1, EXOSC2, GTF2H3, PRPF3, CDC5L, PRPF4, SMN2, MCM4, PURA, GTF2H1, CD3EAP, C1QBP, MRPS9, LARP7, MED9, CDK11B, CPSF4, CPSF3, NHP2, THOC3, PRPF38B, BTAF1, NUP98, POLR2E, ASUN, FAM98B, SKIV2L, CIC, IARS, DIMT1, MINA, MRPL10, SQSTM1, GATAD2A, GAPDH, TRIP12, TCP1, ELAVL1, ILF3, ETF1, HNRNPDL, TAB1, FBL, ATXN2, BRMS1, HNRNPUL1, POLDIP3, PBX1, TCEB1, PES1, DNM2 CNOT9, RNMT, ZC3HAV1, RPL15, SNRPD2, RPS27L, MED21, CTNNB1, INTS9, CSNK2A2, DDX27, TOP1, RAD21, GO:0006139 2.2 CDKN2A, RAVER2, DDX24, U2AF1, SRRM1, ORC5, LSM3, OGT, RBM10, CCAR2, PABPN1, ACTN4, PTBP1, MED13, (8.97E-30) NOP2, KDM2A, PTRF, RFC1, BAZ1B, HUWE1, MED15, TRIM33, SNRPA, TGFB1I1, XRN1, CPSF3L, CLU, TRRAP, Nucleobase-containing HNRNPM, DDX47, CNOT6L, SMARCB1, LEO1, PBRM1, NAT10, DNAJA3, HELLS, TRIP4, NOC4L, MKI67, WDR5, compound metabolic INIP, CREBBP, SMAD2, FOXP1, SRSF3, HDAC3, PPIH, HNRNPH3, CCT4, WDR61, SRSF7, PSMC3, CCT8, process SMARCC2, WDR3, DNMT1, TMPO, DHX40, 
UTP20, SMC1A, PUF60, HP1BP3, ARID4B, NUP188, ANKRD17, VWA9, (141) PLRG1, DKC1, FANCI, PCBP1, CDK12, NDUFS3, MDN1, AHNAK, SYMPK, EXOSC4, RING1, GTF2H4, EXOSC2, GTF2H3, PRPF3, CDC5L, PRPF4, SMN2, MCM4, PURA, GTF2H1, CD3EAP, C1QBP, MRPS9, LARP7, MED9, CDK11B, CPSF4, CPSF3, NHP2, THOC3, PRPF38B, BTAF1, NUP98, POLR2E, ASUN, FAM98B, SKIV2L, CIC, IARS, DIMT1, MINA, SQSTM1, GATAD2A, GAPDH, TRIP12, TCP1, ELAVL1, ILF3, HNRNPDL, TAB1, ETF1, FBL, ATXN2, BRMS1, HNRNPUL1, POLDIP3, PBX1, TCEB1, PES1, DNM2 CNOT9, RNMT, ZC3HAV1, RPL15, SNRPD2, RPS27L, MED21, CTNNB1, INTS9, CSNK2A2, DDX27, CDKN2A, RAD21, GO:0016070 2.4 RAVER2, DDX24, U2AF1, SRRM1, LSM3, OGT, RBM10, CCAR2, PABPN1, ACTN4, PTBP1, MED13, NOP2, KDM2A, (3.81E-29) PTRF, RFC1, BAZ1B, MED15, TRIM33, SNRPA, TGFB1I1, XRN1, CPSF3L, CLU, TRRAP, HNRNPM, DDX47, CNOT6L, RNA metabolic process SMARCB1, LEO1, PBRM1, NAT10, DNAJA3, HELLS, TRIP4, NOC4L, WDR5, CREBBP, SMAD2, FOXP1, SRSF3, (126) HDAC3, PPIH, HNRNPH3, WDR61, SRSF7, PSMC3, SMARCC2, WDR3, DNMT1, TMPO, DHX40, UTP20, PUF60, HP1BP3, ARID4B, NUP188, VWA9, PLRG1, DKC1, PCBP1, CDK12, MDN1, AHNAK, SYMPK, EXOSC4, RING1, GTF2H4, EXOSC2, GTF2H3, PRPF3, CDC5L, PRPF4, SMN2, PURA, GTF2H1, CD3EAP, C1QBP, MRPS9, LARP7, MED9, CDK11B, CPSF4, CPSF3, NHP2, THOC3, PRPF38B, BTAF1, NUP98, POLR2E, ASUN, FAM98B, SKIV2L, CIC, IARS, DIMT1, MINA, SQSTM1, GATAD2A, ELAVL1, ILF3, HNRNPDL, TAB1, ETF1, FBL, ATXN2, BRMS1, HNRNPUL1, POLDIP3, PBX1, TCEB1, PES1, DNM2 CNOT9, RNMT, ZC3HAV1, RPL15, SNRPD2, RPS27L, MED21, CTNNB1, INTS9, CSNK2A2, DDX27, TOP1, RAD21, GO:0046483 2.1 CDKN2A, RAVER2, DDX24, U2AF1, SRRM1, ORC5, LSM3, OGT, RBM10, CCAR2, PABPN1, ACTN4, PTBP1, MED13, (1.08E-28) NOP2, KDM2A, PTRF, RFC1, BAZ1B, HUWE1, MED15, TRIM33, SNRPA, TGFB1I1, XRN1, CPSF3L, CLU, TRRAP, Heterocycle metabolic HNRNPM, DDX47, CNOT6L, SMARCB1, LEO1, PBRM1, NAT10, DNAJA3, HELLS, TRIP4, NOC4L, MKI67, WDR5, process INIP, CREBBP, SMAD2, FOXP1, SRSF3, HDAC3, PPIH, HNRNPH3, CCT4, WDR61, SRSF7, PSMC3, CCT8,


(141) SMARCC2, WDR3, DNMT1, TMPO, DHX40, UTP20, SMC1A, PUF60, HP1BP3, ARID4B, NUP188, ANKRD17, VWA9, PLRG1, DKC1, FANCI, PCBP1, CDK12, NDUFS3, MDN1, AHNAK, SYMPK, EXOSC4, RING1, GTF2H4, EXOSC2, GTF2H3, PRPF3, CDC5L, PRPF4, SMN2, MCM4, PURA, GTF2H1, CD3EAP, C1QBP, MRPS9, LARP7, MED9, CDK11B, CPSF4, CPSF3, NHP2, THOC3, PRPF38B, BTAF1, NUP98, POLR2E, ASUN, FAM98B, SKIV2L, CIC, IARS, DIMT1, MINA, SQSTM1, GATAD2A, GAPDH, TRIP12, TCP1, ELAVL1, ILF3, HNRNPDL, TAB1, ETF1, FBL, ATXN2, BRMS1, HNRNPUL1, POLDIP3, PBX1, TCEB1, PES1, DNM2 CNOT9, RNMT, ZC3HAV1, RPL15, SNRPD2, RPS27L, MED21, CTNNB1, INTS9, CSNK2A2, DDX27, TOP1, RAD21, GO:0006725 2.1 CDKN2A, RAVER2, DDX24, U2AF1, SRRM1, ORC5, LSM3, OGT, RBM10, CCAR2, PABPN1, ACTN4, PTBP1, MED13, (2.64E-28) NOP2, KDM2A, PTRF, RFC1, BAZ1B, HUWE1, MED15, TRIM33, SNRPA, TGFB1I1, XRN1, CPSF3L, CLU, TRRAP, Cellular aromatic HNRNPM, DDX47, CNOT6L, SMARCB1, LEO1, PBRM1, NAT10, DNAJA3, HELLS, TRIP4, NOC4L, MKI67, WDR5, compound metabolic INIP, CREBBP, SMAD2, FOXP1, SRSF3, HDAC3, PPIH, HNRNPH3, CCT4, WDR61, SRSF7, PSMC3, CCT8, process SMARCC2, WDR3, DNMT1, TMPO, DHX40, UTP20, SMC1A, PUF60, HP1BP3, ARID4B, NUP188, ANKRD17, VWA9, (141) PLRG1, DKC1, FANCI, PCBP1, CDK12, NDUFS3, MDN1, AHNAK, SYMPK, EXOSC4, RING1, GTF2H4, EXOSC2, GTF2H3, PRPF3, CDC5L, PRPF4, SMN2, MCM4, PURA, GTF2H1, CD3EAP, C1QBP, MRPS9, LARP7, MED9, CDK11B, CPSF4, CPSF3, NHP2, THOC3, PRPF38B, BTAF1, NUP98, POLR2E, ASUN, FAM98B, SKIV2L, CIC, IARS, DIMT1, MINA, SQSTM1, GATAD2A, GAPDH, TRIP12, TCP1, ELAVL1, ILF3, HNRNPDL, TAB1, ETF1, FBL, ATXN2, BRMS1, HNRNPUL1, POLDIP3, PBX1, TCEB1, PES1, DNM2 CNOT9, RNMT, ZC3HAV1, RPL15, SNRPD2, RPS27L, MED21, CTNNB1, INTS9, CSNK2A2, DDX27, TOP1, RAD21, GO:1901360 2.1 CDKN2A, RAVER2, DDX24, U2AF1, SRRM1, ORC5, LSM3, OGT, RBM10, CCAR2, PABPN1, ACTN4, PTBP1, MED13, (1.65E-27) NOP2, KDM2A, BAZ1B, RFC1, PTRF, HUWE1, MED15, TRIM33, SNRPA, TGFB1I1, XRN1, CPSF3L, CLU, TRRAP, Organic cyclic HNRNPM, DDX47, CNOT6L, SMARCB1, LEO1, PBRM1, NAT10, DNAJA3, 
HELLS, TRIP4, NOC4L, MKI67, WDR5, compound metabolic INIP, CREBBP, SMAD2, FOXP1, SRSF3, HDAC3, HNRNPH3, PPIH, CCT4, WDR61, SRSF7, PSMC3, CCT8, process SMARCC2, WDR3, DNMT1, TMPO, DHX40, UTP20, SMC1A, PUF60, HP1BP3, ARID4B, NUP188, ANKRD17, VWA9, (142) PLRG1, DKC1, FANCI, PCBP1, CDK12, NDUFS3, MDN1, AHNAK, SYMPK, EXOSC4, RING1, GTF2H4, EXOSC2, GTF2H3, PRPF3, CDC5L, PRPF4, SMN2, MCM4, PURA, GTF2H1, CD3EAP, C1QBP, MRPS9, LARP7, MED9, CDK11B, CPSF4, CPSF3, NHP2, THOC3, PRPF38B, BTAF1, NUP98, POLR2E, ASUN, FAM98B, SKIV2L, CIC, IARS, DIMT1, MINA, SQSTM1, GATAD2A, GAPDH, TRIP12, TCP1, ELAVL1, ILF3, ACLY, HNRNPDL, TAB1, ETF1, FBL, ATXN2, BRMS1, HNRNPUL1, POLDIP3, PBX1, TCEB1, PES1, DNM2 The gene ontology (GO) term, biological function, and the number of proteins found under the terms in parentheses are listed in the first column. The second column shows the fold enrichment which measures the magnitude of enrichment of the GO term when comparing the number of genes present under this term from the sample list to the number of this GO term genes present in the human genome. The p-value is also stated which examines the significance of the GO term enrichment using a modified Fisher’s exact test.


Table 3.9. Top 10 DAVID biological processes for proteins isolated from the purified HIV Gag affinity purifications from DF1 nuclear lysates.

Gene Ontology Term (protein count) | Fold Enrichment (p-value) | Gene names for proteins isolated
GO:0016043 Cellular component organization (65) | 1.7 (1.52E-07) | ARSB, XPO1, ARID4A, COA3, PDLIM7, VAPB, EIF2A, PIP5K1A, GBF1, P4HA1, MIER1, SMARCD1, U2AF1, MSN, NDUFS2, STAG1, RPL35A, CCNK, TNIK, KIF5B, PDXP, TLE4, DYNLT1, PPP1CC, ARPC1A, MAP4K4, SEC61B, KDM2A, RFC1, SPAG5, RIPK1, IPO5, SNRNP200, NDUFB3, GULP1, NDUFB9, PPFIA1, RDX, CDC73, TPM2, CAPZB, ZMYND8, RPS25, VRK1, MACF1, PRPF8, CAMK2D, SKA3, TMED10, WAC, RPL10A, ETV6, BRMS1L, NDUFA5, ACTC1, TAF4, CREBBP, SACS, CSNK1D, PSMD11, RAB35, MAP2, LUC7L, MGST1, PIP4K2B
GO:0071840 Cellular component organization or biogenesis (66) | 1.7 (1.55E-07) | ARSB, XPO1, ARID4A, COA3, PDLIM7, VAPB, EIF2A, PIP5K1A, GBF1, RRP1B, P4HA1, MIER1, SMARCD1, U2AF1, MSN, NDUFS2, STAG1, RPL35A, CCNK, TNIK, KIF5B, PDXP, TLE4, DYNLT1, PPP1CC, ARPC1A, MAP4K4, SEC61B, KDM2A, RFC1, SPAG5, RIPK1, IPO5, SNRNP200, NDUFB3, GULP1, NDUFB9, PPFIA1, RDX, CDC73, TPM2, CAPZB, ZMYND8, RPS25, VRK1, MACF1, PRPF8, CAMK2D, SKA3, TMED10, WAC, RPL10A, ETV6, BRMS1L, NDUFA5, ACTC1, TAF4, CREBBP, SACS, CSNK1D, PSMD11, RAB35, MAP2, LUC7L, MGST1, PIP4K2B
GO:0010467 Gene expression (58) | 1.7 (7.67E-07) | UQCRC2, XPO1, ARID4A, COA3, EIF2A, MED23, FUBP1, TRMT1L, RRP1B, PLRG1, MIER1, PCBP2, SMARCD1, U2AF1, SLC25A3, MSN, BCL7A, RPRD1B, TWIST2, STAG1, RPL35A, CCNK, STRN3, TLE4, PPP1CC, KDM2A, RFC1, RIPK1, SNRNP200, VGLL4, ZBTB10, NFIX, RDX, CDC73, NFYA, ZMYND8, RPS25, LANCL2, PRPF8, CAMK2D, CC2D1B, MYCBP, WAC, RPL10A, ETV6, BRMS1L, ACTC1, TAF4, SEC11A, CREBBP, TAB2, SRSF3, CSNK1D, SRSF7, PSMD11, THRAP3, ATF7, LUC7L
GO:0006996 Organelle organization (46) | 2.0 (1.54E-06) | NDUFB3, ARSB, XPO1, ARID4A, COA3, PDLIM7, VAPB, NDUFB9, PPFIA1, EIF2A, CDC73, RDX, PIP5K1A, TPM2, CAPZB, ZMYND8, VRK1, GBF1, MACF1, MIER1, SMARCD1, U2AF1, CAMK2D, SKA3, TMED10, WAC, MSN, NDUFS2, BRMS1L, STAG1, NDUFA5, ACTC1, CCNK, TNIK, KIF5B, CREBBP, PDXP, DYNLT1, PPP1CC, ARPC1A, CSNK1D, KDM2A, RFC1, SPAG5, MAP2, PIP4K2B
GO:0044085 Cellular component biogenesis (39) | 2.1 (2.20E-06) | NDUFB3, XPO1, COA3, NDUFB9, PPFIA1, EIF2A, CDC73, RDX, PIP5K1A, CAPZB, ZMYND8, RPS25, RRP1B, MACF1, PRPF8, CAMK2D, TMED10, MSN, RPL10A, NDUFS2, NDUFA5, RPL35A, TAF4, ACTC1, TNIK, CREBBP, SACS, PDXP, TLE4, ARPC1A, MAP4K4, CSNK1D, PSMD11, RIPK1, IPO5, SNRNP200, LUC7L, MGST1, PIP4K2B
GO:0043933 Macromolecular complex subunit organization (35) | 2.3 (2.71E-06) | NDUFB3, ARID4A, COA3, NDUFB9, RDX, CDC73, EIF2A, CAPZB, ZMYND8, VRK1, MIER1, P4HA1, PRPF8, SMARCD1, CAMK2D, TMED10, WAC, MSN, NDUFS2, BRMS1L, NDUFA5, TAF4, KIF5B, CREBBP, PDXP, TLE4, ARPC1A, CSNK1D, KDM2A, PSMD11, RIPK1, SNRNP200, IPO5, LUC7L, MGST1
GO:0034641 Cellular nitrogen compound metabolic process (63) | 1.5 (1.25E-05) | UQCRC2, XPO1, ARID4A, COA3, VAPB, EIF2A, MED23, MTHFD1L, FUBP1, TRMT1L, RRP1B, PLRG1, MIER1, PCBP2, SMARCD1, U2AF1, SLC25A3, BCL7A, RPRD1B, NDUFS2, TWIST2, STAG1, RPL35A, CCNK, STRN3, TLE4, KDM2A, RFC1, RIPK1, SNRNP200, VGLL4, NDUFB3, ZBTB10, NDUFB9, NFIX, CDC73, NFYA, ZMYND8, RPS25, LANCL2, PRPF8, CAMK2D, TMED10, CC2D1B, MYCBP, WAC, RPL10A, ETV6, BRMS1L, NDUFA5, TAF4, SEC11A, CREBBP, TAB2, SRSF3, CSNK1D, SRSF7, PSMD11, ATF7, THRAP3, ATP5A1, LUC7L, MGST1
GO:0016070 RNA metabolic process (50) | 1.7 (1.79E-05) | XPO1, ARID4A, VAPB, EIF2A, MED23, FUBP1, TRMT1L, RRP1B, PLRG1, MIER1, PCBP2, SMARCD1, U2AF1, BCL7A, RPRD1B, TWIST2, STAG1, RPL35A, CCNK, STRN3, TLE4, RFC1, KDM2A, RIPK1, SNRNP200, VGLL4, ZBTB10, NFIX, CDC73, NFYA, ZMYND8, RPS25, PRPF8, LANCL2, CAMK2D, CC2D1B, MYCBP, WAC, RPL10A, ETV6, BRMS1L, TAF4, CREBBP, TAB2, SRSF3, SRSF7, CSNK1D, THRAP3, ATF7, LUC7L
GO:0006807 Nitrogen compound metabolic process (65) | 1.5 (2.56E-05) | UQCRC2, ARSB, XPO1, ARID4A, COA3, VAPB, EIF2A, MED23, MTHFD1L, FUBP1, TRMT1L, PLRG1, RRP1B, MIER1, P4HA1, PCBP2, SMARCD1, U2AF1, SLC25A3, BCL7A, RPRD1B, NDUFS2, TWIST2, STAG1, RPL35A, CCNK, STRN3, TLE4, KDM2A, RFC1, RIPK1, SNRNP200, VGLL4, NDUFB3, ZBTB10, NDUFB9, NFIX, CDC73, NFYA, ZMYND8, RPS25, LANCL2, PRPF8, CAMK2D, TMED10, CC2D1B, MYCBP, WAC, RPL10A, ETV6, BRMS1L, NDUFA5, TAF4, SEC11A, CREBBP, TAB2, SRSF3, CSNK1D, SRSF7, PSMD11, ATF7, THRAP3, ATP5A1, LUC7L, MGST1
GO:0022607 Cellular component assembly (34) | 2.1 (2.89E-05) | NDUFB3, COA3, NDUFB9, PPFIA1, EIF2A, CDC73, RDX, PIP5K1A, CAPZB, ZMYND8, MACF1, PRPF8, CAMK2D, TMED10, MSN, NDUFS2, NDUFA5, TAF4, ACTC1, TNIK, CREBBP, SACS, PDXP, TLE4, ARPC1A, MAP4K4, CSNK1D, PSMD11, RIPK1, IPO5, SNRNP200, LUC7L, MGST1, PIP4K2B

The first column lists the gene ontology (GO) term, its biological function, and, in parentheses, the number of proteins found under the term. The second column shows the fold enrichment, which measures the magnitude of enrichment of the GO term by comparing the number of genes under this term in the sample list with the number of genes under this term in the human genome. The p-value, which examines the significance of the GO term enrichment using a modified Fisher’s exact test, is stated in parentheses.


Table 3.10. Top 10 nuclear enriched DAVID biological processes for proteins isolated from the purified HIV Gag affinity purifications from DF1 nuclear lysates.

Gene Ontology Term (protein count) | Fold Enrichment (p-value) | Gene names for proteins isolated
GO:0010467 Gene expression (49) | 2.4 (1.14E-12) | UQCRC2, XPO1, ARID4A, MED23, FUBP1, TRMT1L, RRP1B, PLRG1, MIER1, PCBP2, SMARCD1, SLC25A3, U2AF1, MSN, RPRD1B, TWIST2, STAG1, CCNK, STRN3, TLE4, PPP1CC, RFC1, KDM2A, SNRNP200, VGLL4, ZBTB10, NFIX, CDC73, NFYA, ZMYND8, RPS25, PRPF8, LANCL2, CAMK2D, CC2D1B, MYCBP, WAC, RPL10A, ETV6, BRMS1L, TAF4, CREBBP, SRSF3, SRSF7, CSNK1D, PSMD11, THRAP3, ATF7, LUC7L
GO:0016070 RNA metabolic process (44) | 2.4 (5.21E-11) | XPO1, ARID4A, ZBTB10, NFIX, CDC73, MED23, NFYA, ZMYND8, FUBP1, RPS25, TRMT1L, RRP1B, PLRG1, MIER1, LANCL2, PRPF8, PCBP2, SMARCD1, U2AF1, CAMK2D, CC2D1B, MYCBP, WAC, RPL10A, RPRD1B, ETV6, TWIST2, BRMS1L, STAG1, TAF4, CCNK, STRN3, CREBBP, TLE4, SRSF3, CSNK1D, KDM2A, RFC1, SRSF7, ATF7, SNRNP200, THRAP3, VGLL4, LUC7L
GO:0034641 Cellular nitrogen compound metabolic process (50) | 2.0 (7.72E-10) | UQCRC2, XPO1, ARID4A, MED23, FUBP1, TRMT1L, RRP1B, PLRG1, MIER1, PCBP2, SMARCD1, SLC25A3, U2AF1, RPRD1B, NDUFS2, TWIST2, STAG1, CCNK, STRN3, TLE4, RFC1, KDM2A, SNRNP200, VGLL4, ZBTB10, NFIX, CDC73, NFYA, ZMYND8, RPS25, PRPF8, LANCL2, CAMK2D, CC2D1B, MYCBP, WAC, RPL10A, ETV6, BRMS1L, TAF4, CREBBP, SRSF3, CSNK1D, SRSF7, PSMD11, THRAP3, ATF7, ATP5A1, LUC7L, MGST1
GO:0006139 Nucleobase-containing compound metabolic process (47) | 2.1 (1.13E-09) | UQCRC2, XPO1, ARID4A, ZBTB10, NFIX, CDC73, MED23, NFYA, ZMYND8, FUBP1, RPS25, TRMT1L, RRP1B, PLRG1, MIER1, LANCL2, PRPF8, PCBP2, SMARCD1, U2AF1, CAMK2D, CC2D1B, MYCBP, WAC, RPL10A, RPRD1B, ETV6, NDUFS2, TWIST2, BRMS1L, STAG1, TAF4, CCNK, STRN3, CREBBP, TLE4, SRSF3, CSNK1D, KDM2A, RFC1, SRSF7, ATF7, SNRNP200, THRAP3, ATP5A1, VGLL4, LUC7L
GO:0090304 Nucleic acid metabolic process (44) | 2.2 (1.96E-09) | XPO1, ARID4A, ZBTB10, NFIX, CDC73, MED23, NFYA, ZMYND8, FUBP1, RPS25, TRMT1L, RRP1B, PLRG1, MIER1, LANCL2, PRPF8, PCBP2, SMARCD1, U2AF1, CAMK2D, CC2D1B, MYCBP, WAC, RPL10A, RPRD1B, ETV6, TWIST2, BRMS1L, STAG1, TAF4, CCNK, STRN3, CREBBP, TLE4, SRSF3, CSNK1D, KDM2A, RFC1, SRSF7, ATF7, SNRNP200, THRAP3, VGLL4, LUC7L
GO:0046483 Heterocycle metabolic process (47) | 2.1 (2.49E-09) | UQCRC2, XPO1, ARID4A, ZBTB10, NFIX, CDC73, MED23, NFYA, ZMYND8, FUBP1, RPS25, TRMT1L, RRP1B, PLRG1, MIER1, LANCL2, PRPF8, PCBP2, SMARCD1, U2AF1, CAMK2D, CC2D1B, MYCBP, WAC, RPL10A, RPRD1B, ETV6, NDUFS2, TWIST2, BRMS1L, STAG1, TAF4, CCNK, STRN3, CREBBP, TLE4, SRSF3, CSNK1D, KDM2A, RFC1, SRSF7, ATF7, SNRNP200, THRAP3, ATP5A1, VGLL4, LUC7L
GO:0006725 Cellular aromatic compound metabolic process (47) | 2.0 (3.31E-09) | UQCRC2, XPO1, ARID4A, ZBTB10, NFIX, CDC73, MED23, NFYA, ZMYND8, FUBP1, RPS25, TRMT1L, RRP1B, PLRG1, MIER1, LANCL2, PRPF8, PCBP2, SMARCD1, U2AF1, CAMK2D, CC2D1B, MYCBP, WAC, RPL10A, RPRD1B, ETV6, NDUFS2, TWIST2, BRMS1L, STAG1, TAF4, CCNK, STRN3, CREBBP, TLE4, SRSF3, CSNK1D, KDM2A, RFC1, SRSF7, ATF7, SNRNP200, THRAP3, ATP5A1, VGLL4, LUC7L
GO:1901360 Organic cyclic compound metabolic process (47) | 2.0 (1.00E-08) | UQCRC2, XPO1, ARID4A, ZBTB10, NFIX, CDC73, MED23, NFYA, ZMYND8, FUBP1, RPS25, TRMT1L, RRP1B, PLRG1, MIER1, LANCL2, PRPF8, PCBP2, SMARCD1, U2AF1, CAMK2D, CC2D1B, MYCBP, WAC, RPL10A, RPRD1B, ETV6, NDUFS2, TWIST2, BRMS1L, STAG1, TAF4, CCNK, STRN3, CREBBP, TLE4, SRSF3, CSNK1D, KDM2A, RFC1, SRSF7, ATF7, SNRNP200, THRAP3, ATP5A1, VGLL4, LUC7L
GO:0006807 Nitrogen compound metabolic process (50) | 1.9 (1.04E-08) | UQCRC2, XPO1, ARID4A, MED23, FUBP1, TRMT1L, RRP1B, PLRG1, MIER1, PCBP2, SMARCD1, SLC25A3, U2AF1, RPRD1B, NDUFS2, TWIST2, STAG1, CCNK, STRN3, TLE4, RFC1, KDM2A, SNRNP200, VGLL4, ZBTB10, NFIX, CDC73, NFYA, ZMYND8, RPS25, PRPF8, LANCL2, CAMK2D, CC2D1B, MYCBP, WAC, RPL10A, ETV6, BRMS1L, TAF4, CREBBP, SRSF3, CSNK1D, SRSF7, PSMD11, THRAP3, ATF7, ATP5A1, LUC7L, MGST1
GO:0032774 RNA biosynthetic process (36) | 2.4 (5.34E-08) | XPO1, ARID4A, ZBTB10, NFIX, CDC73, MED23, NFYA, ZMYND8, FUBP1, RPS25, MIER1, LANCL2, SMARCD1, U2AF1, CAMK2D, CC2D1B, MYCBP, WAC, RPL10A, RPRD1B, ETV6, BRMS1L, TWIST2, STAG1, CCNK, TAF4, STRN3, CREBBP, TLE4, SRSF3, KDM2A, RFC1, SRSF7, THRAP3, ATF7, VGLL4

The first column lists the gene ontology (GO) term, its biological function, and, in parentheses, the number of proteins found under the term. The second column shows the fold enrichment, which measures the magnitude of enrichment of the GO term by comparing the number of genes under this term in the sample list with the number of genes under this term in the human genome. The p-value, which examines the significance of the GO term enrichment using a modified Fisher’s exact test, is stated in parentheses.


Table 3.11. Top 10 DAVID biological processes for proteins isolated from the Gag-Strep purification.

Gene Ontology Term (protein count) | Fold Enrichment (p-value) | Gene names for proteins isolated
GO:0012501 Programmed cell death (19) | 4.0 (1.40E-07) | P4HB, ACAA2, LDHA, ACTN4, PDIA3, DDB1, ERP29, ACTN1, POLB, MIF, ANXA6, PKM, TXNDC12, CTTN, YWHAG, PA2G4, VCP, HSPA5, HSPA9
GO:0008219 Cell death (19) | 3.8 (3.27E-07) | P4HB, ACAA2, LDHA, ACTN4, PDIA3, DDB1, ERP29, ACTN1, POLB, MIF, ANXA6, PKM, TXNDC12, CTTN, YWHAG, PA2G4, VCP, HSPA5, HSPA9
GO:0019674 NAD metabolic process (6) | 39.7 (3.47E-07) | PKM, LDHA, VCP, PGK1, MDH2, ENO1
GO:0065008 Regulation of biological quality (25) | 2.8 (3.67E-07) | PDIA3, PDIA4, ARPC5, CAPZB, MIF, ANXA6, ACTR2, CTTN, ARPC2, HSPA5, HSPA8, SPP1, HSPA9, ACAA2, P4HB, DDB1, ACTN1, POLB, FLNB, ANXA2, TXNDC12, YWHAG, VCP, CAPG, DSP
GO:0042981 Regulation of apoptotic process (16) | 4.5 (6.90E-07) | P4HB, ACAA2, LDHA, ACTN4, PDIA3, DDB1, ERP29, ACTN1, MIF, TXNDC12, YWHAG, PA2G4, CTTN, VCP, HSPA5, HSPA9
GO:0043067 Regulation of programmed cell death (16) | 4.4 (7.77E-07) | P4HB, ACAA2, LDHA, ACTN4, PDIA3, DDB1, ERP29, ACTN1, MIF, TXNDC12, YWHAG, PA2G4, CTTN, VCP, HSPA5, HSPA9
GO:0006457 Protein folding (8) | 13.6 (1.44E-06) | PDIA3, PPIA, ERP29, HSPA5, PDIA4, HSPA8, PPWD1, HSPA9
GO:0007155 Cell adhesion (17) | 3.9 (1.45E-06) | RPSA, LDHA, ACTN4, ACTN1, FLNB, CAPZB, ANXA2, PKM, CTTN, ARPC2, CAPG, DSP, HSPA5, PAICS, HSPA8, SPP1, ENO1
GO:0022610 Biological adhesion (17) | 3.9 (1.52E-06) | RPSA, LDHA, ACTN4, ACTN1, FLNB, CAPZB, ANXA2, PKM, CTTN, ARPC2, CAPG, DSP, HSPA5, PAICS, HSPA8, SPP1, ENO1
GO:0010941 Regulation of cell death (16) | 4.1 (1.79E-06) | P4HB, ACAA2, LDHA, ACTN4, PDIA3, DDB1, ERP29, ACTN1, MIF, TXNDC12, YWHAG, PA2G4, CTTN, VCP, HSPA5, HSPA9

The first column lists the gene ontology (GO) term, its biological function, and, in parentheses, the number of proteins found under the term. The second column shows the fold enrichment, which measures the magnitude of enrichment of the GO term by comparing the number of genes under this term in the sample list with the number of genes under this term in the human genome. The p-value, which examines the significance of the GO term enrichment using a modified Fisher’s exact test, is stated in parentheses.


Table 3.12. Top 10 DAVID biological processes for proteins isolated from the L219A-Strep purification.

Gene Ontology Term (protein count) | Fold Enrichment (p-value) | Gene names for proteins isolated
GO:0006734 NADH metabolic process (9) | 56.0 (2.68E-12) | PKM, TPI1, VCP, PGAM1, ENO3, PGK1, MDH2, MDH1, ENO1
GO:0019674 NAD metabolic process (10) | 39.0 (3.09E-12) | PKM, LDHA, TPI1, VCP, PGAM1, ENO3, PGK1, MDH2, MDH1, ENO1
GO:0065008 Regulation of biological quality (41) | 2.7 (1.60E-10) | PDIA3, STRAP, GLUD1, PDIA5, UBE2V2, ARPC4, PDIA4, ARPC5, CAPZB, MIF, RPA1, ANXA6, RACK1, ACTG1, SET, GOT1, ARPC2, HSPA5, HSPA8, SPP1, HSPA9, P4HB, ACAA2, DDB1, ANXA1, ACTN1, POLB, SKP1, FLNB, PARK7, ERP44, ARPC1B, YWHAG, HDAC2, PPIB, VCP, TXNDC5, PSMA3, PCNA, DSP, WDR1
GO:0006082 Organic acid metabolic process (21) | 5.3 (7.81E-10) | P4HB, ACAA2, LDHA, ACO2, CNDP2, GLUD1, CS, ANXA1, PGAM1, ACLY, PARK7, MIF, HAGH, PKM, TPI1, GOT1, ENO3, PGK1, MDH2, MDH1, ENO1
GO:0044283 Small molecule biosynthetic process (16) | 7.5 (1.81E-09) | ACAA2, IMPA1, GLUD1, ANXA1, PGAM1, ACLY, PARK7, MIF, PKM, TPI1, GOT1, ENO3, PGK1, MDH2, MDH1, ENO1
GO:0046496 Nicotinamide nucleotide metabolic process (10) | 18.7 (2.61E-09) | PKM, LDHA, TPI1, VCP, PGAM1, ENO3, PGK1, MDH2, MDH1, ENO1
GO:0072524 Pyridine-containing compound metabolic process (10) | 17.6 (4.56E-09) | PKM, LDHA, TPI1, VCP, PGAM1, ENO3, PGK1, MDH2, MDH1, ENO1
GO:0006733 Oxidoreduction coenzyme metabolic process (10) | 16.5 (7.69E-09) | PKM, LDHA, TPI1, VCP, PGAM1, ENO3, PGK1, MDH2, MDH1, ENO1
GO:0019752 Carboxylic acid metabolic process (19) | 5.3 (7.97E-09) | ACAA2, LDHA, ACO2, GLUD1, CS, ANXA1, PGAM1, ACLY, PARK7, MIF, HAGH, PKM, TPI1, GOT1, ENO3, PGK1, MDH2, MDH1, ENO1
GO:0043436 Oxoacid metabolic process (19) | 5.3 (8.75E-09) | ACAA2, LDHA, ACO2, GLUD1, CS, ANXA1, PGAM1, ACLY, PARK7, MIF, HAGH, PKM, TPI1, GOT1, ENO3, PGK1, MDH2, MDH1, ENO1

The first column lists the gene ontology (GO) term, its biological function, and, in parentheses, the number of proteins found under the term. The second column shows the fold enrichment, which measures the magnitude of enrichment of the GO term by comparing the number of genes under this term in the sample list with the number of genes under this term in the human genome. The p-value, which examines the significance of the GO term enrichment using a modified Fisher’s exact test, is stated in parentheses.


Table 3.13. Top 10 nuclear enriched DAVID biological processes for proteins isolated from the L219A-Strep purification.

Gene Ontology Term (protein count) | Fold Enrichment (p-value) | Gene names for proteins isolated
GO:0012501 Programmed cell death (21) | 4.1 (2.10E-08) | LDHA, ACTN4, PDIA3, ANP32B, DDB1, ANXA1, UBE2V2, POLB, ANXA4, NAA38, PARK7, MIF, RACK1, PKM, TOP1, PA2G4, HDAC2, SET, VCP, HSPA5, HSPA9
GO:0008219 Cell death (21) | 3.8 (5.41E-08) | LDHA, ACTN4, PDIA3, ANP32B, DDB1, ANXA1, UBE2V2, POLB, ANXA4, NAA38, PARK7, MIF, RACK1, PKM, TOP1, PA2G4, HDAC2, SET, VCP, HSPA5, HSPA9
GO:0042981 Regulation of apoptotic process (18) | 4.6 (6.98E-08) | LDHA, ACTN4, PDIA3, ANP32B, DDB1, ANXA1, UBE2V2, ANXA4, NAA38, PARK7, MIF, RACK1, PA2G4, HDAC2, SET, VCP, HSPA5, HSPA9
GO:0043067 Regulation of programmed cell death (18) | 4.6 (7.99E-08) | LDHA, ACTN4, PDIA3, ANP32B, DDB1, ANXA1, UBE2V2, ANXA4, NAA38, PARK7, MIF, RACK1, PA2G4, HDAC2, SET, VCP, HSPA5, HSPA9
GO:0010941 Regulation of cell death (18) | 4.3 (2.08E-07) | LDHA, ACTN4, PDIA3, ANP32B, DDB1, ANXA1, UBE2V2, ANXA4, NAA38, PARK7, MIF, RACK1, PA2G4, HDAC2, SET, VCP, HSPA5, HSPA9
GO:0019674 NAD metabolic process (6) | 36.3 (5.53E-07) | PKM, LDHA, TPI1, VCP, MDH2, ENO1
GO:0006915 Apoptotic process (18) | 3.7 (1.66E-06) | LDHA, PDIA3, ANP32B, DDB1, ANXA1, UBE2V2, POLB, ANXA4, NAA38, PARK7, MIF, RACK1, PA2G4, HDAC2, SET, VCP, HSPA5, HSPA9
GO:0006734 NADH metabolic process (5) | 48.3 (3.05E-06) | PKM, TPI1, VCP, MDH2, ENO1
GO:0060548 Negative regulation of cell death (13) | 5.2 (3.62E-06) | DDB1, ANXA1, UBE2V2, NAA38, ANXA4, PARK7, MIF, RACK1, PA2G4, SET, HDAC2, HSPA5, HSPA9
GO:0006082 Organic acid metabolic process (13) | 5.1 (4.06E-06) | PKM, TPI1, LDHA, GOT1, ACO2, CNDP2, CS, ANXA1, ACLY, MDH2, PARK7, ENO1, MIF

The first column lists the gene ontology (GO) term, its biological function, and, in parentheses, the number of proteins found under the term. The second column shows the fold enrichment, which measures the magnitude of enrichment of the GO term by comparing the number of genes under this term in the sample list with the number of genes under this term in the human genome. The p-value, which examines the significance of the GO term enrichment using a modified Fisher’s exact test, is stated in parentheses.


Table 3.14. Top 10 DAVID biological processes of nuclear proteins identified in Engeland et al 2011.

Gene Ontology Term (protein count) | Fold Enrichment (p-value) | Gene names for proteins isolated
GO:0006396 RNA processing (14) | 10.8 (2.20E-11) | GTPBP4, EXOSC5, PRPF3, CDC5L, PRPF4, SRPK1, SRSF3, EXOSC10, SRSF2, TRMT1L, SRSF7, SRSF9, POP1, NAT10
GO:0010467 Gene expression (22) | 2.9 (3.75E-09) | ABCF1, GTPBP4, MTDH, EXOSC5, PRPF3, CDC5L, PRPF4, SRPK1, LARP1, SRSF3, EXOSC10, SRSF2, TOP1, TROVE2, TRMT1L, SRSF7, SRSF9, POP1, LARS, NAT10, RBM14, MYBBP1A
GO:0016071 mRNA metabolic process (11) | 11.5 (7.03E-09) | SRSF3, EXOSC10, SRSF2, ZC3HAV1, SRSF7, SRSF9, EXOSC5, PRPF3, CDC5L, PRPF4, SRPK1
GO:0034641 Cellular nitrogen compound metabolic process (23) | 2.5 (1.23E-08) | ABCF1, GTPBP4, MTDH, ZC3HAV1, EXOSC5, PRPF3, CDC5L, PRPF4, SRPK1, LARP1, SRSF3, EXOSC10, SRSF2, TOP1, TROVE2, TRMT1L, SRSF7, SRSF9, POP1, LARS, NAT10, RBM14, MYBBP1A
GO:0090304 Nucleic acid metabolic process (21) | 2.8 (3.43E-08) | GTPBP4, MTDH, ZC3HAV1, EXOSC5, PRPF3, CDC5L, PRPF4, SRPK1, SRSF3, EXOSC10, SRSF2, TOP1, TROVE2, TRMT1L, SRSF7, SRSF9, POP1, LARS, NAT10, RBM14, MYBBP1A
GO:0006807 Nitrogen compound metabolic process (23) | 2.3 (5.05E-08) | ABCF1, GTPBP4, MTDH, ZC3HAV1, EXOSC5, PRPF3, CDC5L, PRPF4, SRPK1, LARP1, SRSF3, EXOSC10, SRSF2, TOP1, TROVE2, TRMT1L, SRSF7, SRSF9, POP1, LARS, NAT10, RBM14, MYBBP1A
GO:0016070 RNA metabolic process (20) | 3.0 (6.45E-08) | GTPBP4, MTDH, ZC3HAV1, EXOSC5, PRPF3, CDC5L, PRPF4, SRPK1, SRSF3, EXOSC10, SRSF2, TROVE2, TRMT1L, SRSF7, SRSF9, POP1, LARS, NAT10, RBM14, MYBBP1A
GO:0006139 Nucleobase-containing compound metabolic process (21) | 2.5 (3.02E-07) | GTPBP4, MTDH, ZC3HAV1, EXOSC5, PRPF3, CDC5L, PRPF4, SRPK1, SRSF3, EXOSC10, SRSF2, TOP1, TROVE2, TRMT1L, SRSF7, SRSF9, POP1, LARS, NAT10, RBM14, MYBBP1A
GO:0046483 Heterocycle metabolic process (21) | 2.5 (4.48E-07) | GTPBP4, MTDH, ZC3HAV1, EXOSC5, PRPF3, CDC5L, PRPF4, SRPK1, SRSF3, EXOSC10, SRSF2, TOP1, TROVE2, TRMT1L, SRSF7, SRSF9, POP1, LARS, NAT10, RBM14, MYBBP1A
GO:0006725 Cellular aromatic compound metabolic process (21) | 2.5 (5.17E-07) | GTPBP4, MTDH, ZC3HAV1, EXOSC5, PRPF3, CDC5L, PRPF4, SRPK1, SRSF3, EXOSC10, SRSF2, TOP1, TROVE2, TRMT1L, SRSF7, SRSF9, POP1, LARS, NAT10, RBM14, MYBBP1A

The first column lists the gene ontology (GO) term, its biological function, and, in parentheses, the number of proteins found under the term. The second column shows the fold enrichment, which measures the magnitude of enrichment of the GO term by comparing the number of genes under this term in the sample list with the number of genes under this term in the human genome. The p-value, which examines the significance of the GO term enrichment using a modified Fisher’s exact test, is stated in parentheses.


Table 3.15. Top 10 DAVID biological processes of nuclear proteins identified in Jäger et al 2011.

Gene Ontology Term (protein count) | Fold Enrichment (p-value) | Gene names for proteins isolated

GO:0006396 RNA processing (119) | 12.2 (5.22E-106) | RPL18, RPL17, RPL19, RPL13, RBM4, RPL15, SYNCRIP, NONO, DDX27, DDX17, DHX37, SRRM2, RPLP0, DDX21, RPL11, RPL12, RBM10, DHX30, RPS27A, SNRPA1, CHTOP, RRP1, HNRNPA2B1, HNRNPR, HNRNPU, MRTO4, RPS18, RPS19, NOP2, RPS16, RPS17, RPS14, RPS12, SNRPB, RPS13, SNRPA, RPS10, RPS11, PABPC4, LIN28B, RPS25, HNRNPA3, HNRNPL, RPS26, HNRNPM, DDX47, RPS28, HNRNPK, RPL7, RPL6, RPL9, HNRNPF, RPL8, HNRNPD, RPL3, RPL5, RPS20, RPL10A, PABPC1, RPL7A, RPL4, RPS23, RPS24, RPS9, RPL23A, RPS6, DDX5, RPS5, HNRNPA1, RPF2, RPS8, HNRNPA0, RPS7, SRSF3, SRSF2, HNRNPH3, SRSF7, SRSF6, WDR3, RRS1, RPL37A, HNRNPH1, ADAR, DICER1, RPS2, YBX1, RPS3, WDR36, PLRG1, RRP1B, RPS3A, SRPK2, EXOSC5, PRPF3, CDC5L, RPS4X, PRPF4, SRPK1, KHSRP, RPL35, RPS15A, RPL36, BMS1, SF3B3, DIMT1, RPL30, NSUN2, PDCD11, ALYREF, PNO1, ELAVL1, RPL27, PWP2, FBL, RPL23, RPL13A, HNRNPUL1, RPL21, SFPQ

GO:0016071 mRNA metabolic process (102) | 14.2 (2.28E-94) | RPL18, RPL17, RPL19, ZC3HAV1, RPL13, RBM4, RPL15, SYNCRIP, NONO, RPLP0, SRRM2, RPL11, RPL12, RBM10, RPS27A, SNRPA1, CHTOP, HNRNPA2B1, HNRNPR, HNRNPU, MRTO4, RPS18, RPS19, RPS16, RPS17, RPS14, RPS12, SNRPB, RPS13, SNRPA, RPS10, RPS11, XRN1, HNRNPA3, HNRNPL, RPS25, HNRNPM, RPS26, DDX47, HNRNPK, RPS28, RPL7, RPL6, RPL9, HNRNPF, RPL8, HNRNPD, RPL3, RPL5, RPL7A, RPL10A, PABPC1, RPL4, RPS20, RPS23, RPS24, RPS9, RPL23A, DDX5, RPS6, RPS5, HNRNPA1, RPS8, HNRNPA0, RPS7, SRSF3, SRSF2, HNRNPH3, SRSF7, SRSF6, RPL37A, HNRNPH1, ADAR, RPS2, YBX1, RPS3, PLRG1, RPS3A, SRPK2, EXOSC5, PRPF3, CDC5L, RPS4X, PRPF4, SRPK1, EIF4G1, KHSRP, RPL35, RPS15A, RPL36, SF3B3, RPL30, PDCD11, UPF1, ALYREF, ELAVL1, RPL27, RPL23, HNRNPUL1, RPL13A, RPL21, SFPQ

GO:0042254 Ribosome biogenesis (78) | 21.7 (1.83E-84) | RPL18, RPL17, RPL19, RPL13, SURF6, RPL15, RPS2, RPS3, DDX27, WDR36, RRP1B, DHX37, RPS3A, RPLP0, DDX21, RPL11, RPL12, GNL2, DHX30, RPS27A, GNL3, RRP1, EXOSC5, RPS4X, MRTO4, RPS18, RPS19, NOP2, RPS16, RPS17, RPS14, RPS12, RPS13, RPS10, RPS11, RPL35, RPS15A, RPL36, BMS1, DIMT1, RPS25, RPS26, RPL30, DDX47, RPS28, RPL7, RPL6, RPL9, NPM1, RPL8, RPL3, RPL5, RPS20, RPL4, RSL24D1, RPL10A, RPL7A, RPS23, RPS24, PDCD11, PNO1, RPL27, RPS9, RPL23A, RPS6, RPS5, RPF2, PWP2, RPS8, FBL, RPS7, RPL23, RPL13A, NOP16, RPL21, WDR3, RRS1, RPL37A

GO:0022613 Ribonucleoprotein complex biogenesis (85) | 16.5 (3.73E-82) | RPL18, RPL17, RPL19, RPL13, SURF6, DICER1, RPL15, RPS2, RPS3, DDX27, WDR36, RRP1B, DHX37, RPS3A, RPLP0, DDX21, RPL11, RPL12, GNL2, DHX30, RPS27A, GNL3, SRPK2, RRP1, EXOSC5, PRPF3, RPS4X, MRTO4, RPS18, RPS19, NOP2, RPS16, RPS17, RPS14, SNRPB, RPS12, RPS13, RPS10, RPS11, RPL35, RPS15A, RPL36, BMS1, DIMT1, RPS25, RPS26, RPL30, DDX47, RPS28, ATXN2L, RPL7, RPL6, RPL9, NPM1, RPL8, RPL3, RPL5, RPS20, RPL4, RSL24D1, RPL10A, RPL7A, RPS23, RPS24, PDCD11, PNO1, RPL27, RPS9, RPL23A, RPS6, RPS5, RPF2, PWP2, FBL, RPS8, RPS7, RPL23, RPL13A, SRSF6, NOP16, RPL21, WDR3, RRS1, RPL37A, ADAR

GO:0016072 rRNA metabolic process (72) | 24.3 (3.23E-81) | RPL18, RPL17, RPL19, RPL13, RPL15, RPS2, RPS3, DDX27, WDR36, RRP1B, DHX37, RPS3A, RPLP0, DDX21, RPL11, RPL12, RPS27A, RRP1, EXOSC5, RPS4X, MRTO4, RPS18, RPS19, NOP2, RPS16, RPS17, RPS14, RPS12, RPS13, RPS10, RPS11, XRN1, RPL35, RPS15A, RPL36, BMS1, DIMT1, RPS25, RPS26, RPL30, DDX47, RPS28, RPL7, RPL6, RPL9, RPL8, RPL3, RPL5, RPS20, RPL7A, RPL10A, RPL4, RPS23, RPS24, PDCD11, PNO1, RPS9, RPL27, RPL23A, RPS6, RPS5, RPF2, PWP2, RPS8, FBL, RPS7, RPL23, RPL13A, RPL21, WDR3, RRS1, RPL37A

GO:0006364 rRNA processing (71) | 24.6 (2.20E-80) | RPL18, RPL17, RPL19, RPL13, RPL15, RPS2, RPS3, DDX27, WDR36, RRP1B, DHX37, RPS3A, RPLP0, DDX21, RPL11, RPL12, RPS27A, RRP1, EXOSC5, RPS4X, MRTO4, RPS18, RPS19, NOP2, RPS16, RPS17, RPS14, RPS12, RPS13, RPS10, RPS11, RPL35, RPS15A, RPL36, BMS1, DIMT1, RPS25, RPS26, RPL30, DDX47, RPS28, RPL7, RPL6, RPL9, RPL8, RPL3, RPL5, RPS20, RPL7A, RPL10A, RPL4, RPS23, RPS24, PDCD11, PNO1, RPS9, RPL27, RPL23A, RPS6, RPS5, RPF2, PWP2, RPS8, FBL, RPS7, RPL23, RPL13A, RPL21, WDR3, RRS1, RPL37A

GO:0006614 SRP-dependent cotranslational protein targeting to membrane (53) | 51.5 (2.61E-79) | RPL18, RPL17, SRP14, RPL19, RPL13, RPL15, RPS2, RPS3, RPS3A, RPLP0, RPL11, RPL12, RPS27A, RPS4X, RPS18, RPS19, RPS16, RPS17, RPS14, RPS12, RPS13, RPS10, RPS11, RPL35, RPS15A, RPL36, RPS25, RPS26, RPL30, RPS28, RPL7, RPL6, RPL9, RPL8, RPL3, RPL5, RPL7A, RPL10A, RPL4, RPS20, RPS23, RPS24, RPS9, RPL27, RPL23A, RPS6, RPS5, RPS8, RPS7, RPL23, RPL13A, RPL21, RPL37A

GO:0006613 Cotranslational protein targeting to membrane (53) | 48.0 (4.24E-77) | RPL18, RPL17, SRP14, RPL19, RPL13, RPL15, RPS2, RPS3, RPS3A, RPLP0, RPL11, RPL12, RPS27A, RPS4X, RPS18, RPS19, RPS16, RPS17, RPS14, RPS12, RPS13, RPS10, RPS11, RPL35, RPS15A, RPL36, RPS25, RPS26, RPL30, RPS28, RPL7, RPL6, RPL9, RPL8, RPL3, RPL5, RPL7A, RPL10A, RPL4, RPS20, RPS23, RPS24, RPS9, RPL27, RPL23A, RPS6, RPS5, RPS8, RPS7, RPL23, RPL13A, RPL21, RPL37A

GO:0045047 Protein targeting to ER (53) | 47.5 (8.41E-77) | RPL18, RPL17, SRP14, RPL19, RPL13, RPL15, RPS2, RPS3, RPS3A, RPLP0, RPL11, RPL12, RPS27A, RPS4X, RPS18, RPS19, RPS16, RPS17, RPS14, RPS12, RPS13, RPS10, RPS11, RPL35, RPS15A, RPL36, RPS25, RPS26, RPL30, RPS28, RPL7, RPL6, RPL9, RPL8, RPL3, RPL5, RPL7A, RPL10A, RPL4, RPS20, RPS23, RPS24, RPS9, RPL27, RPL23A, RPS6, RPS5, RPS8, RPS7, RPL23, RPL13A, RPL21, RPL37A

GO:0000184 Nuclear-transcribed mRNA catabolic process, nonsense-mediated decay (55) | 41.7 (8.26E-76) | RPL18, RPL17, RPL19, RPL13, RPL15, RPS2, RPS3, RPS3A, RPLP0, RPL11, RPL12, RPS27A, RPS4X, EIF4G1, RPS18, RPS19, RPS16, RPS17, RPS14, RPS12, RPS13, RPS10, RPS11, RPL35, RPS15A, RPL36, RPS25, RPS26, RPL30, RPS28, RPL7, RPL6, RPL9, RPL8, RPL3, RPL5, PABPC1, RPL7A, RPL10A, RPL4, RPS20, RPS23, RPS24, UPF1, RPS9, RPL27, RPL23A, RPS6, RPS5, RPS8, RPS7, RPL23, RPL13A, RPL21, RPL37A

The first column lists the gene ontology (GO) term, its biological function, and, in parentheses, the number of proteins found under that term. The second column shows the fold enrichment, which measures the magnitude of enrichment by comparing the number of genes under the term in the sample list with the number of genes under that term in the human genome. The p-value, in parentheses, gives the significance of the GO term enrichment as determined by a modified Fisher’s exact test.
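The two statistics described in the table legend can be illustrated with a minimal Python sketch. It assumes the usual definition of fold enrichment (the proportion of sample-list genes annotated with a term divided by the proportion of background genes annotated with it) and approximates the modified Fisher's exact test by removing one hit from the sample list before computing a one-sided hypergeometric tail probability. This is one common reading of the EASE score; DAVID's own implementation may differ in detail. The function names and the example counts are hypothetical and are not taken from the datasets analyzed here.

```python
from math import comb


def fold_enrichment(list_hits, list_total, pop_hits, pop_total):
    """Ratio of the annotated fraction in the sample list to that in the background."""
    return (list_hits / list_total) / (pop_hits / pop_total)


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when drawing n genes from N, of which K carry the annotation."""
    upper = min(n, K)
    numerator = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return numerator / comb(N, n)


def ease_score(list_hits, list_total, pop_hits, pop_total):
    """Conservative p-value: one hit is removed from the list before testing.

    Assumption: this mirrors the idea behind the modified Fisher's exact test,
    not necessarily the exact table DAVID constructs internally.
    """
    penalized_hits = max(list_hits - 1, 0)
    return hypergeom_upper_tail(penalized_hits, list_total, pop_hits, pop_total)


if __name__ == "__main__":
    # Illustrative counts only: 3 of 300 list genes versus 40 of 30000 background genes.
    print("fold enrichment:", fold_enrichment(3, 300, 40, 30000))
    print("one-sided Fisher p:", hypergeom_upper_tail(3, 300, 40, 30000))
    print("penalized (EASE-like) p:", ease_score(3, 300, 40, 30000))
```

Because the penalized test counts one fewer hit, its p-value is never smaller than the unmodified one-sided value, which makes terms supported by very few genes less likely to appear significant.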


Table 3.16. Top 10 DAVID biological processes of nuclear proteins identified in Engeland et al 2014.

Gene Ontology Term (protein count) | Fold Enrichment (p-value) | Gene names for proteins isolated

GO:0006396 RNA processing (125) | 12.4 (6.91E-113) | RPL18, RPL17, RPL19, RPL13, RBM4, RPL15, SYNCRIP, NONO, DDX27, DDX17, TRMT1L, DHX37, RPLP0, DDX21, RPL11, RPL12, RBM10, DHX30, RPS27A, SNRPA1, GTPBP4, CHTOP, RRP1, HNRNPA2B1, HNRNPR, HNRNPU, MRTO4, RPS18, RPS19, NOP2, RPS16, RPS17, RPS14, RPS12, SNRPB, RPS13, SNRPA, RPS10, RPS11, PABPC4, LIN28B, RPS25, HNRNPA3, HNRNPL, RPS26, HNRNPM, DDX47, RPS28, HNRNPK, RPL7, FRG1, RPL6, RPL9, HNRNPF, RPL8, HNRNPD, RPL3, NAT10, RPL5, RPS20, RPL10A, PABPC1, RPL7A, RPL4, RPS23, RPS24, RPS9, RPL23A, RPS6, DDX5, RPS5, HNRNPA1, RPF2, RPS8, HNRNPA0, RPS7, SRSF3, SRSF2, HNRNPH3, SRSF7, SRSF6, SRSF9, POP1, WDR3, RRS1, RPL37A, HNRNPH1, ADAR, DICER1, RPS2, YBX1, RPS3, WDR36, PLRG1, RRP1B, RPS3A, SRPK2, PRPF3, CDC5L, RPS4X, PRPF4, SRPK1, KHSRP, RPL35, RPS15A, RPL36, BMS1, SF3B3, DIMT1, EXOSC10, RPL30, RPL34, NSUN2, PDCD11, ALYREF, PNO1, ELAVL1, RPL27, PWP2, FBL, RPL23, RPL13A, HNRNPUL1, RPL21, SFPQ

GO:0016071 mRNA metabolic process (103) | 13.8 (6.83E-94) | RPL18, RPL17, RPL19, ZC3HAV1, RPL13, RBM4, RPL15, SYNCRIP, NONO, RPLP0, RPL11, RPL12, RBM10, RPS27A, SNRPA1, CHTOP, HNRNPA2B1, HNRNPR, HNRNPU, MRTO4, RPS18, RPS19, RPS16, RPS17, RPS14, RPS12, SNRPB, RPS13, SNRPA, RPS10, RPS11, HNRNPA3, HNRNPL, RPS25, HNRNPM, RPS26, DDX47, HNRNPK, RPS28, RPL7, RPL6, FRG1, RPL9, HNRNPF, RPL8, HNRNPD, RPL3, RPL5, RPL7A, RPL10A, PABPC1, RPL4, RPS20, RPS23, RPS24, RPS9, RPL23A, DDX5, RPS6, RPS5, HNRNPA1, RPS8, HNRNPA0, RPS7, SRSF3, SRSF2, HNRNPH3, SRSF7, SRSF6, SRSF9, RPL37A, HNRNPH1, ADAR, RPS2, YBX1, RPS3, PLRG1, RPS3A, SRPK2, PRPF3, CDC5L, RPS4X, PRPF4, SRPK1, EIF4G1, KHSRP, RPL35, RPS15A, RPL36, SF3B3, EXOSC10, RPL30, RPL34, PDCD11, UPF1, ALYREF, ELAVL1, RPL27, RPL23, HNRNPUL1, RPL13A, RPL21, SFPQ

GO:0042254 Ribosome biogenesis (83) | 22.4 (1.72E-91) | RPL18, RPL17, RPL19, RPL13, SURF6, RPL15, RPS2, RPS3, DDX27, WDR36, RRP1B, DHX37, RPS3A, RPLP0, DDX21, RPL11, RPL12, GNL2, DHX30, RPS27A, GNL3, GTPBP4, RRP1, RPS4X, MRTO4, RPS18, RPS19, NOP2, RPS16, RPS17, RPS14, RPS12, RPS13, RPS10, RPS11, RPL35, RPS15A, RPL36, BMS1, EXOSC10, DIMT1, RPS25, RPS26, RPL30, DDX47, RPS28, RPL7, DDX3X, FRG1, RPL6, RPL34, RPL9, NPM1, RPL8, RPL3, NAT10, RPL5, RPS20, RPL4, RPL10A, RSL24D1, RPL7A, RPS23, RPS24, PDCD11, PNO1, RPL27, RPS9, RPL23A, RPS6, RPS5, RPF2, PWP2, FBL, RPS8, RPS7, RPL23, RPL13A, NOP16, RPL21, WDR3, RRS1, RPL37A

GO:0022613 Ribonucleoprotein complex biogenesis (91) | 17.1 (5.44E-90) | RPL18, RPL17, RPL19, RPL13, SURF6, DICER1, RPL15, RPS2, RPS3, DDX27, WDR36, RRP1B, DHX37, RPS3A, RPLP0, DDX21, RPL11, RPL12, GNL2, DHX30, RPS27A, GNL3, SRPK2, GTPBP4, RRP1, PRPF3, RPS4X, MRTO4, RPS18, RPS19, NOP2, RPS16, RPS17, RPS14, SNRPB, RPS12, RPS13, RPS10, RPS11, RPL35, RPS15A, RPL36, BMS1, EXOSC10, DIMT1, RPS25, RPS26, RPL30, DDX47, RPS28, ATXN2L, DDX3X, RPL7, FRG1, RPL6, RPL34, RPL9, NPM1, RPL8, RPL3, NAT10, RPL5, RPS20, RPL4, RPL10A, RSL24D1, RPL7A, RPS23, RPS24, PDCD11, PNO1, RPL27, RPS9, RPL23A, RPS6, RPS5, RPF2, PWP2, FBL, RPS8, RPS7, RPL23, RPL13A, SRSF6, NOP16, RPL21, SRSF9, WDR3, RRS1, RPL37A, ADAR

GO:0006364 rRNA processing (75) | 25.2 (4.71E-86) | RPL18, RPL17, RPL19, RPL13, RPL15, RPS2, RPS3, DDX27, WDR36, RRP1B, DHX37, RPS3A, RPLP0, DDX21, RPL11, RPL12, RPS27A, GTPBP4, RRP1, RPS4X, MRTO4, RPS18, RPS19, NOP2, RPS16, RPS17, RPS14, RPS12, RPS13, RPS10, RPS11, RPL35, RPS15A, RPL36, BMS1, EXOSC10, DIMT1, RPS25, RPS26, RPL30, DDX47, RPS28, RPL7, RPL6, FRG1, RPL34, RPL9, RPL8, RPL3, NAT10, RPL5, RPS20, RPL7A, RPL10A, RPL4, RPS23, RPS24, PDCD11, PNO1, RPS9, RPL27, RPL23A, RPS6, RPS5, RPF2, PWP2, RPS8, FBL, RPS7, RPL23, RPL13A, RPL21, WDR3, RRS1, RPL37A

GO:0016072 rRNA metabolic process (75) | 24.5 (4.11E-85) | RPL18, RPL17, RPL19, RPL13, RPL15, RPS2, RPS3, DDX27, WDR36, RRP1B, DHX37, RPS3A, RPLP0, DDX21, RPL11, RPL12, RPS27A, GTPBP4, RRP1, RPS4X, MRTO4, RPS18, RPS19, NOP2, RPS16, RPS17, RPS14, RPS12, RPS13, RPS10, RPS11, RPL35, RPS15A, RPL36, BMS1, EXOSC10, DIMT1, RPS25, RPS26, RPL30, DDX47, RPS28, RPL7, RPL6, FRG1, RPL34, RPL9, RPL8, RPL3, NAT10, RPL5, RPS20, RPL7A, RPL10A, RPL4, RPS23, RPS24, PDCD11, PNO1, RPS9, RPL27, RPL23A, RPS6, RPS5, RPF2, PWP2, RPS8, FBL, RPS7, RPL23, RPL13A, RPL21, WDR3, RRS1, RPL37A

GO:0034470 ncRNA processing (82) | 18.0 (6.84E-82) | RPL18, RPL17, RPL19, RPL13, DICER1, RPL15, RPS2, RPS3, DDX27, WDR36, TRMT1L, RRP1B, DHX37, RPS3A, RPLP0, DDX21, RPL11, RPL12, RPS27A, GTPBP4, RRP1, HNRNPA2B1, RPS4X, MRTO4, RPS18, RPS19, NOP2, RPS16, RPS17, RPS14, RPS12, RPS13, RPS10, RPS11, RPL35, RPS15A, RPL36, BMS1, LIN28B, EXOSC10, DIMT1, RPS25, RPS26, RPL30, DDX47, RPS28, RPL7, FRG1, RPL6, RPL34, RPL9, RPL8, RPL3, NAT10, RPL5, RPS20, RPL4, RPL7A, RPL10A, NSUN2, RPS23, RPS24, PDCD11, PNO1, RPL27, RPS9, RPL23A, RPS6, RPS5, RPF2, PWP2, RPS8, FBL, RPS7, RPL23, RPL13A, RPL21, POP1, WDR3, RRS1, RPL37A, ADAR

GO:0006614 SRP-dependent cotranslational protein targeting to membrane (54) | 50.8 (1.28E-80) | RPL18, RPL17, SRP14, RPL19, RPL13, RPL15, RPS2, RPS3, RPS3A, RPLP0, RPL11, RPL12, RPS27A, RPS4X, RPS18, RPS19, RPS16, RPS17, RPS14, RPS12, RPS13, RPS10, RPS11, RPL35, RPS15A, RPL36, RPS25, RPS26, RPL30, RPS28, RPL7, RPL6, RPL34, RPL9, RPL8, RPL3, RPL5, RPL7A, RPL10A, RPL4, RPS20, RPS23, RPS24, RPS9, RPL27, RPL23A, RPS6, RPS5, RPS8, RPS7, RPL23, RPL13A, RPL21, RPL37A

GO:0000184 Nuclear-transcribed mRNA catabolic process, nonsense-mediated decay (57) | 41.8 (6.27E-79) | RPL18, RPL17, RPL19, RPL13, RPL15, RPS2, RPS3, RPS3A, RPLP0, RPL11, RPL12, RPS27A, RPS4X, EIF4G1, RPS18, RPS19, RPS16, RPS17, RPS14, RPS12, RPS13, RPS10, RPS11, RPL35, RPS15A, RPL36, RPS25, EXOSC10, RPS26, RPL30, RPS28, RPL7, RPL6, RPL9, RPL34, RPL8, RPL3, RPL5, RPL7A, RPL10A, PABPC1, RPL4, RPS20, RPS23, RPS24, UPF1, RPS9, RPL27, RPL23A, RPS6, RPS5, RPS8, RPS7, RPL23, RPL13A, RPL21, RPL37A

GO:0006613 Cotranslational protein targeting to membrane (54) | 47.3 (2.41E-78) | RPL18, RPL17, SRP14, RPL19, RPL13, RPL15, RPS2, RPS3, RPS3A, RPLP0, RPL11, RPL12, RPS27A, RPS4X, RPS18, RPS19, RPS16, RPS17, RPS14, RPS12, RPS13, RPS10, RPS11, RPL35, RPS15A, RPL36, RPS25, RPS26, RPL30, RPS28, RPL7, RPL6, RPL34, RPL9, RPL8, RPL3, RPL5, RPL7A, RPL10A, RPL4, RPS20, RPS23, RPS24, RPS9, RPL27, RPL23A, RPS6, RPS5, RPS8, RPS7, RPL23, RPL13A, RPL21, RPL37A

The first column lists the gene ontology (GO) term, its biological function, and, in parentheses, the number of proteins found under that term. The second column shows the fold enrichment, which measures the magnitude of enrichment by comparing the number of genes under the term in the sample list with the number of genes under that term in the human genome. The p-value, in parentheses, gives the significance of the GO term enrichment as determined by a modified Fisher’s exact test.


Table 3.17. Top 10 DAVID biological processes of nuclear proteins identified in Ritchie et al 2015.

Gene Ontology Term (protein count) | Fold Enrichment (p-value) | Gene names for proteins isolated

GO:0098609 Cell-cell adhesion (10) | 8.5 (2.63E-07) | EIF4G1, ATXN2L, DDX3X, ZC3HAV1, SERBP1, CCT8, PRKDC, TMPO, FLNA, LARP1

GO:0007155 Cell adhesion (10) | 5.8 (6.43E-06) | EIF4G1, ATXN2L, DDX3X, ZC3HAV1, SERBP1, CCT8, PRKDC, TMPO, FLNA, LARP1

GO:0022610 Biological adhesion (10) | 5.8 (6.62E-06) | EIF4G1, ATXN2L, DDX3X, ZC3HAV1, SERBP1, CCT8, PRKDC, TMPO, FLNA, LARP1

GO:0090304 Nucleic acid metabolic process (13) | 2.5 (3.38E-04) | EIF4G1, HNRNPM, EEF1A1, MTDH, DDX3X, MKI67, ZC3HAV1, HIST1H1C, CCT8, PRKDC, TMPO, XRN1, FLNA

GO:0034641 Cellular nitrogen compound metabolic process (14) | 2.1 (6.50E-04) | EEF1A1, MTDH, MKI67, HIST1H1C, ZC3HAV1, PRKDC, FLNA, LARP1, EIF4G1, HNRNPM, DDX3X, CCT8, TMPO, XRN1

GO:0010608 Posttranscriptional regulation of gene expression (5) | 10.5 (8.57E-04) | EIF4G1, DDX3X, SERBP1, XRN1, LARP1

GO:0006139 Nucleobase-containing compound metabolic process (13) | 2.2 (1.12E-03) | EIF4G1, HNRNPM, EEF1A1, MTDH, DDX3X, MKI67, ZC3HAV1, HIST1H1C, CCT8, PRKDC, TMPO, XRN1, FLNA

GO:0060255 Regulation of macromolecule metabolic process (13) | 2.2 (1.16E-03) | EIF4G1, EEF1A1, MTDH, DDX3X, ZC3HAV1, HIST1H1C, SERBP1, CCT8, PRKDC, TMPO, XRN1, FLNA, LARP1

GO:0006807 Nitrogen compound metabolic process (14) | 2.0 (1.36E-03) | EEF1A1, MTDH, MKI67, HIST1H1C, ZC3HAV1, PRKDC, FLNA, LARP1, EIF4G1, HNRNPM, DDX3X, CCT8, TMPO, XRN1

GO:0046483 Heterocycle metabolic process (13) | 2.2 (1.39E-03) | EIF4G1, HNRNPM, EEF1A1, MTDH, DDX3X, MKI67, ZC3HAV1, HIST1H1C, CCT8, PRKDC, TMPO, XRN1, FLNA

The first column lists the gene ontology (GO) term, its biological function, and, in parentheses, the number of proteins found under that term. The second column shows the fold enrichment, which measures the magnitude of enrichment by comparing the number of genes under the term in the sample list with the number of genes under that term in the human genome. The p-value, in parentheses, gives the significance of the GO term enrichment as determined by a modified Fisher’s exact test.


Table 3.18. Top 10 DAVID biological processes of nuclear proteins identified in Le Sage et al 2015.

Gene Ontology Term (protein count) | Fold Enrichment (p-value) | Gene names for proteins isolated

GO:0006614 SRP-dependent cotranslational protein targeting to membrane (10) | 92.1 (2.06E-16) | RPS16, RPS3A, RPL34, RPL35, RPL37A, RPL11, RPS2, RPS6, RPS5, RPS3

GO:0006613 Cotranslational protein targeting to membrane (10) | 85.8 (3.97E-16) | RPS16, RPS3A, RPL34, RPL35, RPL37A, RPL11, RPS2, RPS6, RPS5, RPS3

GO:0006364 rRNA processing (12) | 39.4 (4.15E-16) | RPS16, RPS3A, FRG1, RPL34, RPL35, RPL37A, RPL11, NAT10, RPS2, RPS6, RPS5, RPS3

GO:0045047 Protein targeting to ER (10) | 85.0 (4.35E-16) | RPS16, RPS3A, RPL34, RPL35, RPL37A, RPL11, RPS2, RPS6, RPS5, RPS3

GO:0016072 rRNA metabolic process (12) | 38.4 (5.52E-16) | RPS16, RPS3A, FRG1, RPL34, RPL35, RPL37A, RPL11, NAT10, RPS2, RPS6, RPS5, RPS3

GO:0072599 Establishment of protein localization to endoplasmic reticulum (10) | 81.8 (6.18E-16) | RPS16, RPS3A, RPL34, RPL35, RPL37A, RPL11, RPS2, RPS6, RPS5, RPS3

GO:0006413 Translational initiation (11) | 51.4 (1.03E-15) | RPS16, RPS3A, RPL34, EIF5B, RPL35, RPL37A, RPL11, RPS2, RPS6, RPS5, RPS3

GO:0000184 Nuclear-transcribed mRNA catabolic process, nonsense-mediated decay (10) | 71.9 (2.06E-15) | RPS16, RPS3A, RPL34, RPL35, RPL37A, RPL11, RPS2, RPS6, RPS5, RPS3

GO:0070972 Protein localization to endoplasmic reticulum (10) | 69.0 (2.98E-15) | RPS16, RPS3A, RPL34, RPL35, RPL37A, RPL11, RPS2, RPS6, RPS5, RPS3

GO:0006396 RNA processing (15) | 14.6 (4.32E-15) | RPL35, DDX5, RPS6, RPS2, RPS5, RPS3, DDX17, RPS16, RPS3A, FRG1, RPL34, SRRM2, NAT10, RPL11, RPL37A

The first column lists the gene ontology (GO) term, its biological function, and, in parentheses, the number of proteins found under that term. The second column shows the fold enrichment, which measures the magnitude of enrichment by comparing the number of genes under the term in the sample list with the number of genes under that term in the human genome. The p-value, in parentheses, gives the significance of the GO term enrichment as determined by a modified Fisher’s exact test.


Table 3.19. Top 10 DAVID biological processes of nuclear proteins identified in Li et al 2016.

Gene Ontology Term (protein count) | Fold Enrichment (p-value) | Gene names for proteins isolated

GO:0006614 SRP-dependent cotranslational protein targeting to membrane (15) | 41.6 (3.40E-19) | RPL17, RPS9, RPS15A, RPS2, RPS7, RPS25, RPS18, RPS16, RPL23, RPS17, RPS3A, RPS14, RPS12, RPS13, RPS10

GO:0006613 Cotranslational protein targeting to membrane (15) | 38.8 (9.58E-19) | RPL17, RPS9, RPS15A, RPS2, RPS7, RPS25, RPS18, RPS16, RPL23, RPS17, RPS3A, RPS14, RPS12, RPS13, RPS10

GO:0045047 Protein targeting to ER (15) | 38.4 (1.10E-18) | RPL17, RPS9, RPS15A, RPS2, RPS7, RPS25, RPS18, RPS16, RPL23, RPS17, RPS3A, RPS14, RPS12, RPS13, RPS10

GO:0072599 Establishment of protein localization to endoplasmic reticulum (15) | 37.0 (1.92E-18) | RPL17, RPS9, RPS15A, RPS2, RPS7, RPS25, RPS18, RPS16, RPL23, RPS17, RPS3A, RPS14, RPS12, RPS13, RPS10

GO:0000184 Nuclear-transcribed mRNA catabolic process, nonsense-mediated decay (15) | 32.5 (1.27E-17) | RPL17, RPS9, RPS15A, RPS2, RPS7, RPS25, RPS18, RPS16, RPL23, RPS17, RPS3A, RPS14, RPS12, RPS13, RPS10

GO:0070972 Protein localization to endoplasmic reticulum (15) | 31.2 (2.25E-17) | RPL17, RPS9, RPS15A, RPS2, RPS7, RPS25, RPS18, RPS16, RPL23, RPS17, RPS3A, RPS14, RPS12, RPS13, RPS10

GO:0090304 Nucleic acid metabolic process (52) | 2.7 (5.60E-17) | XRCC5, RPL17, FOSL2, XRCC6, LEMD3, DEK, RBM7, NFKB2, RPS2, YBX1, SART1, ZNF207, INTS9, EBNA1BP2, DAB2, PLRG1, RPS3A, BRD7, HIST3H2A, LUC7L3, CTBP1, YY1, MTA2, HNRNPA2B1, PRKCB, MED6, NVL, RPS18, SENP1, RPS16, RPS17, RPS14, RPS12, RPS13, RPS10, MED1, RPS15A, RPS25, TCF20, EZR, DNAJA3, NUP58, MAFF, NACC1, S100A11, RPS9, ZNF668, RPS7, RPL23, UBTF, CCT8, RBM15

GO:0019083 Viral transcription (16) | 23.6 (9.95E-17) | RPL17, RPS9, RPS15A, RPS2, RPS7, RPS25, RPS18, RPS16, RPL23, RPS17, RPS3A, RPS14, RPS12, RPS13, RPS10, NUP58

GO:0006413 Translational initiation (16) | 22.6 (1.91E-16) | RPL17, RPS9, RPS15A, RPS2, RPS7, RPS25, RPS18, RPS16, RPL23, RPS17, RPS3A, RPS14, EIF2S2, RPS12, RPS13, RPS10

GO:0019080 Viral gene expression (16) | 22.2 (2.42E-16) | RPL17, RPS9, RPS15A, RPS2, RPS7, RPS25, RPS18, RPS16, RPL23, RPS17, RPS3A, RPS14, RPS12, RPS13, RPS10, NUP58

The first column lists the gene ontology (GO) term, its biological function, and, in parentheses, the number of proteins found under that term. The second column shows the fold enrichment, which measures the magnitude of enrichment by comparing the number of genes under the term in the sample list with the number of genes under that term in the human genome. The p-value, in parentheses, gives the significance of the GO term enrichment as determined by a modified Fisher’s exact test.


Table 3.20. Top 10 DAVID biological processes of the common proteins identified from published work and the HIV Gag results presented here.

Gene Ontology Term (protein count) | Fold Enrichment (p-value) | Gene names for proteins isolated

GO:1904871 Positive regulation of protein localization to Cajal body (5) | 184.1 (7.69E-09) | CCT7, TCP1, CCT4, CCT8, CCT6A

GO:1990173 Protein localization to nucleoplasm (5) | 163.7 (1.38E-08) | CCT7, TCP1, CCT4, CCT8, CCT6A

GO:0043170 Macromolecule metabolic process (51) | 1.6 (1.90E-08) | SNRPD2, FUBP1, INTS9, ANKRD17, BAG2, CDK12, SLC25A3, PUM1, U2AF1, PABPN1, SLC25A4, EXOSC4, CCT6A, PPP1CC, CD3EAP, MAP4K4, PTRF, C1QBP, BAZ1B, LARP7, SRP9, MRPS14, DIMT1, MINA, MRPL10, PBRM1, MYCBP, BUB3, MEPCE, TRIP12, ACTC1, TAF4, TCP1, NOC4L, MKI67, WDR5, AFG3L2, TAB1, TAB2, CORO1C, CCT7, SLC25A11, PPIH, CCT4, SDF2L1, THRAP3, CCT8, DNMT1, TCEB1, LUC7L, UTP20

GO:0044260 Cellular macromolecule metabolic process (49) | 1.7 (2.14E-08) | SNRPD2, FUBP1, INTS9, ANKRD17, CDK12, SLC25A3, PUM1, U2AF1, PABPN1, SLC25A4, EXOSC4, CCT6A, PPP1CC, CD3EAP, MAP4K4, PTRF, C1QBP, BAZ1B, LARP7, SRP9, MRPS14, DIMT1, MINA, MRPL10, PBRM1, MYCBP, BUB3, MEPCE, TRIP12, TAF4, TCP1, MKI67, NOC4L, WDR5, AFG3L2, TAB1, TAB2, CCT7, CORO1C, SLC25A11, PPIH, CCT4, SDF2L1, THRAP3, CCT8, DNMT1, TCEB1, LUC7L, UTP20

GO:0070203 Regulation of establishment of protein localization to telomere (5) | 147.3 (2.29E-08) | CCT7, TCP1, CCT4, CCT8, CCT6A

GO:0070202 Regulation of establishment of protein localization to chromosome (5) | 133.9 (3.60E-08) | CCT7, TCP1, CCT4, CCT8, CCT6A

GO:0034641 Cellular nitrogen compound metabolic process (43) | 1.9 (3.65E-08) | MRPS14, SNRPD2, MINA, INTS9, DIMT1, FUBP1, MRPL10, ANKRD17, U2AF1, PUM1, CDK12, SLC25A3, PBRM1, MYCBP, TRIP12, MEPCE, PABPN1, TCP1, TAF4, NOC4L, SLC25A4, MKI67, WDR5, EXOSC4, CCT6A, TAB1, TAB2, CCT7, CD3EAP, SLC25A11, PPIH, CCT4, BAZ1B, C1QBP, PTRF, LARP7, THRAP3, CCT8, DNMT1, TCEB1, UTP20, LUC7L, SRP9

GO:1904816 Positive regulation of protein localization to chromosome, telomeric region (5) | 122.7 (5.38E-08) | CCT7, TCP1, CCT4, CCT8, CCT6A

GO:1904814 Regulation of protein localization to chromosome, telomeric region (5) | 105.2 (1.08E-07) | CCT7, TCP1, CCT4, CCT8, CCT6A

GO:0010467 Gene expression (38) | 2.1 (1.18E-07) | MRPS14, SNRPD2, MINA, INTS9, FUBP1, DIMT1, MRPL10, U2AF1, PUM1, CDK12, SLC25A3, PBRM1, MYCBP, MEPCE, PABPN1, TAF4, ACTC1, NOC4L, SLC25A4, WDR5, EXOSC4, TAB1, AFG3L2, PPP1CC, TAB2, CD3EAP, SLC25A11, PPIH, BAZ1B, C1QBP, PTRF, LARP7, THRAP3, DNMT1, TCEB1, UTP20, LUC7L, SRP9

The first column lists the gene ontology (GO) term, its biological function, and, in parentheses, the number of proteins found under that term. The second column shows the fold enrichment, which measures the magnitude of enrichment by comparing the number of genes under the term in the sample list with the number of genes under that term in the human genome. The p-value, in parentheses, gives the significance of the GO term enrichment as determined by a modified Fisher’s exact test.


Appendix C

Association of retroviral Gag proteins with unspliced viral RNA in the nucleus

[Running title: Retroviral Gag-viral RNA colocalization in the nucleus.]

Rebecca J. Kaddis Maldonado1, Kevin M. Tuffy1, Breanna Rice1, Estelle F. Chiari-Fort1, Kelly M. Fahrbach3, Thomas J. Hope3, Alan Cochrane4, and Leslie J. Parent1,2,*

Departments of Medicine1 and Microbiology & Immunology2, Penn State College of Medicine, 500 University Drive, Hershey, PA 17033; Department of Cell and Molecular Biology3, Northwestern University, 303 E. Superior, Chicago, IL 60611; Department of Molecular Genetics4, University of Toronto, 1 King’s College Circle, Toronto, Ontario, Canada M5S-1A8

*manuscript in progress


Figure 6: Colocalization of Gag.L219A with its own RNA is dependent on the presence of Ψ (taken from the manuscript). A) Transcripts produced from plasmids expressing either Gag.L219A-GFP, RU5.Gag.L219A-YFP, or RU5.Gag.L219A.C21S-YFP were labeled via RNA FISH. The majority of cells expressing Gag.L219A-GFP in the absence of the Ψ packaging signal had a diffuse nuclear gag mRNA pattern, B) while the split between cells with diffuse versus focal RNA was fairly even among cells expressing RU5.Gag.L219A-YFP. C) In Gag.L219A-GFP-expressing cells that contained nuclear RNA foci, only 10% of the RNA foci colocalized with Gag.L219A protein foci in the nucleus, compared to 62% (p<0.0001****) of RU5.Gag.L219A-YFP nuclear RNA foci that colocalized with Gag.L219A. Furthermore, only 4% of Gag.L219A foci colocalized with Gag.L219A-GFP nuclear RNA foci, compared to 40% (p=0.0003***) of Gag foci that colocalized with RU5.Gag.L219A-YFP RNA foci. D) However, a Gag.L219A mutant that contains a mutation in the Cys-His box of NC, which has been shown to inhibit genome packaging, failed to colocalize with gag mRNA even in the presence of Ψ. E) The percentage of Gag.L219A-GFP-expressing cells with a diffuse nuclear RNA signal differed significantly from the percentage of RU5.Gag.L219A-YFP (p<0.0001****) and RU5.Gag.L219A.C21S-YFP (p<0.0001****) cells with diffuse nuclear RNA. Nuclear foci were the predominant RNA pattern for RU5.Gag.L219A-YFP, and their frequency differed significantly from that in Gag.L219A-GFP (p<0.0001****) and RU5.Gag.L219A.C21S-YFP (p=0.0049**) cells.


Bibliography

1. Abramoff, M. D., P. J. Magalhaes, and S. J. Ram. 2004. Image Processing with ImageJ. Biophotonics International 11:36-42. 2. Ali, M. K., J. Kim, F. B. Hamid, and C.-G. Shin. 2015. Knockdown of the host cellular protein transportin 3 attenuates prototype foamy virus infection. Bioscience, Biotechnology, and Biochemistry 79:943-951. 3. Angel, T. E., U. K. Aryal, S. M. Hengel, E. S. Baker, R. T. Kelly, E. W. Robinson, and R. D. Smith. 2012. Mass spectrometry based proteomics: existing capabilities and future directions. Chemical Society Reviews 41:3912-3928. 4. Ao, Z., K. R. Fowke, É. A. Cohen, and X. Yao. 2005. Contribution of the C-terminal tri- lysine regions of human immunodeficiency virus type 1 integrase for efficient reverse transcription and viral DNA nuclear import. Retrovirology 2:62. 5. Ashburner, M., C. A. Ball, J. A. Blake, D. Botstein, H. Butler, J. M. Cherry, A. P. Davis, K. Dolinski, S. S. Dwight, J. T. Eppig, M. A. Harris, D. P. Hill, L. Issel-Tarver, A. Kasarskis, S. Lewis, J. C. Matese, J. E. Richardson, M. Ringwald, G. M. Rubin, and G. Sherlock. 2000. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet 25:25-29. 6. Aumiller, W. M., Jr., B. W. Davis, and C. D. Keating. 2014. Phase separation as a possible means of nuclear compartmentalization. Int Rev Cell Mol Biol 307:109-149. 7. Bacharach, E., J. Gonsky, K. Alin, M. Orlova, and S. P. Goff. 2000. The Carboxy-Terminal Fragment of Nucleolin Interacts with the Nucleocapsid Domain of Retroviral Gag Proteins and Inhibits Virion Assembly. J Virol 74:11027-11039. 8. Bader, J. P. 1965. The Requirement for DNA Synthesis in the Growth of Rous Sarcoma and Rous-Associated Viruses. Virology 26:253-261. 9. Baltimore, D. 1970. RNA-dependent DNA polymerase in virions of RNA tumour viruses. Nature 226:1209-1211. 10. Baluyot, M. F., S. A. Grosse, T. D. Lyddon, S. K. Janaka, and M. C. Johnson. 2012. CRM1- Dependent Trafficking of Retroviral Gag Proteins Revisited. 
J Virol 86:4696-4700. 11. Banks, J. D., B. O. Kealoha, and M. L. Linial. 1999. An MΨ-Containing Heterologous RNA, but Notenv mRNA, Is Efficiently Packaged into Avian Retroviral Particles. Journal of Virology 73:8926-8933. 12. Basyuk, E., T. Galli, M. Mougel, J.-M. Blanchard, M. Sitbon, and E. Bertrand. 2003. Retroviral Genomic RNAs Are Transported to the Plasma Membrane by Endosomal Vesicles. Developmental Cell 5:161-174. 13. Becker, J. T., and N. M. Sherer. 2017. Subcellular Localization of HIV-1 gag-pol mRNAs Regulates Sites of Virion Assembly. J Virol 91. 14. Bériault, V., J.-F. Clément, K. Lévesque, C. LeBel, X. Yong, B. Chabot, É. A. Cohen, A. W. Cochrane, W. F. C. Rigby, and A. J. Mouland. 2004. A Late Role for the Association of hnRNP A2 with the HIV-1 hnRNP A2 Response Elements in Genomic RNA, Gag, and Vpr Localization. Journal of Biological Chemistry 279:44141-44153. 15. Bewley, M. C., L. Reinhart, M. S. Stake, S. Nadaraia-Hoke, L. J. Parent, and J. M. Flanagan. 2017. A non-cleavable hexahistidine affinity tag at the carboxyl-terminus of the HIV-1 Pr55Gag polyprotein alters nucleic acid binding properties. Protein Expression and Purification 130:137-145.

216

16. Beyer, A. R., D. V. Bann, B. Rice, I. S. Pultz, M. Kane, S. P. Goff, T. V. Golovkina, and L. J. Parent. 2013. Nucleolar Trafficking of the Mouse Mammary Tumor Virus Gag Protein Induced by Interaction with Ribosomal Protein L9. Journal of Virology 87:1069-1082. 17. Bhattacharya, A., S. L. Alam, T. Fricke, K. Zadrozny, J. Sedzicki, A. B. Taylor, B. Demeler, O. Pornillos, B. K. Ganser-Pornillos, F. Diaz-Griffero, D. N. Ivanov, and M. Yeager. 2014. Structural basis of HIV-1 capsid recognition by PF74 and CPSF6. Proceedings of the National Academy of Sciences 111:18625-18630. 18. Binns, D., E. Dimmer, R. Huntley, D. Barrell, C. O'Donovan, and R. Apweiler. 2009. QuickGO: a web-based tool for Gene Ontology searching. Bioinformatics 25:3045-3046. 19. Blot, G., K. Janvier, S. Le Panse, R. Benarous, and C. Berlioz-Torrent. 2003. Targeting of the human immunodeficiency virus type 1 envelope to the trans-Golgi network through binding to TIP47 is required for env incorporation into virions and infectivity. J Virol 77:6931-6945. 20. Bohl, C. R., S. M. Brown, and R. A. Weldon. 2005. The pp24 phosphoprotein of Mason- Pfizer monkey virus contributes to viral genome packaging. Retrovirology 2:68. 21. Bowzard, J. B., R. P. Bennett, N. K. Krishna, S. M. Ernst, A. Rein, and J. W. Wills. 1998. Importance of Basic Residues in the Nucleocapsid Sequence for Retrovirus Gag Assembly and Complementation Rescue. Journal of Virology 72:9034-9044. 22. Brass, A. L., D. M. Dykxhoorn, Y. Benita, N. Yan, A. Engelman, R. J. Xavier, J. Lieberman, and S. J. Elledge. 2008. Identification of Host Proteins Required for HIV Infection Through a Functional Genomic Screen. Science 319:921-926. 23. Brückner, A., C. Polge, N. Lentze, D. Auerbach, and U. Schlattner. 2009. Yeast Two- Hybrid, a Powerful Tool for Systems Biology. International Journal of Molecular Sciences 10:2763-2788. 24. Brzezinski, J. D., A. Modi, M. Liu, and M. J. Roth. 2016. 
Repression of the Chromatin- Tethering Domain of Murine Leukemia Virus p12. J Virol 90:11197-11207. 25. Bukrinsky, M. I., S. Haggerty, M. P. Dempsey, N. Sharova, A. Adzhubei, L. Spitz, P. Lewis, D. Goldfarb, M. Emerman, and M. Stevenson. 1993. A nuclear localization signal within HIV-1 matrix protein that governs infection of non-dividing cells. Nature 365:666. 26. Butterfield-Gerson, K. L., L. Z. Scheifele, E. P. Ryan, A. K. Hopper, and L. J. Parent. 2006. Importin-β Family Members Mediate Alpharetrovirus Gag Nuclear Entry via Interactions with Matrix and Nucleocapsid. J Virol 80:1798-1806. 27. Campbell, S., and V. M. Vogt. 1997. In vitro assembly of virus-like particles with Rous sarcoma virus Gag deletion mutants: identification of the p10 domain as a morphological determinant in the formation of spherical particles. J Virol 71:4425-4435. 28. Campbell, S., and V. M. Vogt. 1995. Self-assembly in vitro of purified CA-NC proteins from Rous sarcoma virus and human immunodeficiency virus type 1. J Virol 69:6487- 6497. 29. Campeau, E., and S. Gobeil. 2011. RNA interference in mammals: behind the screen. Briefings in Functional Genomics 10:215-226. 30. Camus, G., C. Segura-Morales, D. Molle, S. Lopez-Vergès, C. Begon-Pescia, C. Cazevieille, P. Schu, E. Bertrand, C. Berlioz-Torrent, and E. Basyuk. 2007. The Clathrin Adaptor Complex AP-1 Binds HIV-1 and MLV Gag and Facilitates Their Budding. Molecular Biology of the Cell 18:3193-3203. 31. Carmody, S. R., and S. R. Wente. 2009. mRNA nuclear export at a glance. Journal of Cell Science 122:1933-1937.

217

32. Cecconi, F., M. Cencini, M. Falcioni, and A. Vulpiani. 2005. Brownian motion and diffusion: From stochastic processes to chaos and beyond. Chaos: An Interdisciplinary Journal of Nonlinear Science 15:026102.

33. Chakalova, L., and P. Fraser. 2010. Organization of Transcription. Cold Spring Harbor Perspectives in Biology 2.

34. Chase, G. P., M.-A. Rameix-Welti, A. Zvirbliene, G. Zvirblis, V. Götz, T. Wolff, N. Naffakh, and M. Schwemmle. 2011. Influenza Virus Ribonucleoprotein Complexes Gain Preferential Access to Cellular Export Machinery through Chromatin Targeting. PLOS Pathogens 7:e1002187.

35. Chen, Y.-C. M., C. Kappel, J. Beaudouin, R. Eils, and D. L. Spector. 2008. Live Cell Dynamics of Promyelocytic Leukemia Nuclear Bodies upon Entry into and Exit from Mitosis. Molecular Biology of the Cell 19:3147-3162.

36. Chook, Y. M., and K. E. Süel. 2011. Nuclear import by karyopherin-βs: Recognition and inhibition. Biochimica et Biophysica Acta (BBA) - Molecular Cell Research 1813:1593-1606.

37. Christ, F., W. Thys, J. De Rijck, R. Gijsbers, A. Albanese, D. Arosio, S. Emiliani, J.-C. Rain, R. Benarous, A. Cereseto, and Z. Debyser. 2008. Transportin-SR2 Imports HIV into the Nucleus. Current Biology 18:1192-1202.

38. Cooper, G. 2000. The Nucleolus, The Cell: A Molecular Approach. 2nd Edition. Sinauer Associates, Sunderland, MA.

39. Craven, R. C., A. E. Leure-duPree, R. A. Weldon, Jr., and J. W. Wills. 1995. Genetic analysis of the major homology region of the Rous sarcoma virus Gag protein. J Virol 69:4213-4227.

40. Craven, R. C., A. E. Leure-duPree, R. A. Weldon, and J. W. Wills. 1995. Genetic analysis of the major homology region of the Rous sarcoma virus Gag protein. Journal of Virology 69:4213-4227.

41. Dalessio, P. M., R. C. Craven, P. M. Lokhandwala, and I. J. Ropson. 2013. Lethal mutations in the major homology region and their suppressors act by modulating the dimerization of the rous sarcoma virus capsid protein C-terminal domain. Proteins 81:316-325.

42. De Houwer, S., J. Demeulemeester, W. Thys, O. Taltynov, K. Zmajkovicova, F. Christ, and Z. Debyser. 2012. Identification of Residues in the C-terminal Domain of HIV-1 Integrase That Mediate Binding to the Transportin-SR2 Protein. Journal of Biological Chemistry 287:34059-34068.

43. De Iaco, A., and J. Luban. 2011. Inhibition of HIV-1 infection by TNPO3 depletion is determined by capsid and detectable after viral cDNA enters the nucleus. Retrovirology 8:98.

44. De Iaco, A., F. Santoni, A. Vannier, M. Guipponi, S. Antonarakis, and J. Luban. 2013. TNPO3 protects HIV-1 replication from CPSF6-mediated capsid stabilization in the host cell cytoplasm. Retrovirology 10:20.

45. Dupont, S., N. Sharova, C. DéHoratius, C.-M. A. Virbasius, X. Zhu, A. G. Bukrinskaya, M. Stevenson, and M. R. Green. 1999. A novel nuclear export activity in HIV-1 matrix protein required for viral replication. Nature 402:681.

46. Dupraz, P., S. Oertle, C. Meric, P. Damay, and P. F. Spahr. 1990. Point mutations in the proximal Cys-His box of Rous sarcoma virus nucleocapsid protein. Journal of Virology 64:4978-4987.

47. Dziuba, N., M. R. Ferguson, W. A. O'Brien, A. Sanchez, A. J. Prussia, N. J. McDonald, B. M. Friedrich, G. Li, M. W. Shaw, J. Sheng, T. W. Hodge, D. H. Rubin, and J. L. Murray. 2012. Identification of Cellular Proteins Required for Replication of Human Immunodeficiency Virus Type 1. AIDS Research and Human Retroviruses 28:1329-1339.

48. Elis, E., M. Ehrlich, and E. Bacharach. 2015. Dynamics and restriction of murine leukemia virus cores in mitotic and interphase cells. Retrovirology 12:015-0220.

49. Elis, E., M. Ehrlich, A. Prizan-Ravid, N. Laham-Karam, and E. Bacharach. 2012. p12 Tethers the Murine Leukemia Virus Pre-integration Complex to Mitotic Chromosomes. PLOS Pathogens 8:e1003103.

50. Engeland, C. E., N. P. Brown, K. Börner, M. Schümann, E. Krause, L. Kaderali, G. A. Müller, and H.-G. Kräusslich. 2014. Proteome analysis of the HIV-1 Gag interactome. Virology 460-461:194-206.

51. Engeland, C. E., H. Oberwinkler, M. Schümann, E. Krause, G. A. Müller, and H.-G. Kräusslich. 2011. The Cellular Protein Lyric Interacts with HIV-1 Gag. Journal of Virology 85:13322-13332.

52. Fairhead, M., and M. Howarth. 2015. Site-specific biotinylation of purified proteins using BirA. Methods in molecular biology (Clifton, N.J.) 1266:171-184.

53. Fox, A. H., and A. I. Lamond. 2010. Paraspeckles. Cold Spring Harbor Perspectives in Biology 2.

54. Frankel, A. D., and J. A. T. Young. 1998. HIV-1: Fifteen Proteins and an RNA. Annual Review of Biochemistry 67:1-25.

55. Freed, E. O. 1998. HIV-1 Gag Proteins: Diverse Functions in the Virus Life Cycle. Virology 251:1-15.

56. Fujiwara, T., K. Oda, S. Yokota, A. Takatsuki, and Y. Ikehara. 1988. Brefeldin A causes disassembly of the Golgi complex and accumulation of secretory proteins in the endoplasmic reticulum. Journal of Biological Chemistry 263:18545-18552.

57. Fukuda, M., S. Asano, T. Nakamura, M. Adachi, M. Yoshida, M. Yanagida, and E. Nishida. 1997. CRM1 is responsible for intracellular transport mediated by the nuclear export signal. Nature 390:308.

58. Fuzik, T., R. Pichalova, F. K. M. Schur, K. Strohalmova, I. Krizova, R. Hadravova, M. Rumlova, J. A. G. Briggs, P. Ulbrich, and T. Ruml. 2016. Nucleic Acid Binding by Mason-Pfizer Monkey Virus CA Promotes Virus Assembly and Genome Packaging. J Virol 90:4593-4603.

59. Gallay, P., T. Hope, D. Chin, and D. Trono. 1997. HIV-1 infection of nondividing cells through the recognition of integrase by the importin/karyopherin pathway. Proceedings of the National Academy of Sciences 94:9825-9830.

60. Gallay, P., V. Stitt, C. Mundy, M. Oettinger, and D. Trono. 1996. Role of the karyopherin pathway in human immunodeficiency virus type 1 nuclear import. J Virol 70:1027-1032.

61. Gallay, P., S. Swingler, J. Song, F. Bushman, and D. Trono. 1995. HIV nuclear import is governed by the phosphotyrosine-mediated binding of matrix to the core domain of integrase. Cell 83:569-576.

62. Gamble, T. R., S. Yoo, F. F. Vajdos, U. K. von Schwedler, D. K. Worthylake, H. Wang, J. P. McCutcheon, W. I. Sundquist, and C. P. Hill. 1997. Structure of the carboxyl-terminal dimerization domain of the HIV-1 capsid protein. Science 278:849-853.

63. Garbitt-Hirst, R., S. P. Kenney, and L. J. Parent. 2009. Genetic evidence for a connection between Rous sarcoma virus gag nuclear trafficking and genomic RNA packaging. J Virol 83:6790-6797.

64. Garbitt, R. A., J. A. Albert, M. D. Kessler, and L. J. Parent. 2001. trans-Acting Inhibition of Genomic RNA Dimerization by Rous Sarcoma Virus Matrix Mutants. J Virol 75:260-268.

65. Gavrilov, A. A., and S. V. Razin. 2015. Compartmentalization of the cell nucleus and spatial organization of the genome. Molecular Biology 49:21-39.

66. Goff, A., L. S. Ehrlich, S. N. Cohen, and C. A. Carter. 2003. Tsg101 control of human immunodeficiency virus type 1 Gag trafficking and release. J Virol 77:9173-9182.

67. Goff, S. P. 2007. Host factors exploited by retroviruses. Nature Reviews Microbiology 5:253.

68. Görisch, S. M., M. Wachsmuth, C. Ittrich, C. P. Bacher, K. Rippe, and P. Lichter. 2004. Nuclear body movement is determined by chromatin accessibility and dynamics. Proceedings of the National Academy of Sciences of the United States of America 101:13221-13226.

69. Grewe, B., B. Hoffmann, I. Ohs, M. Blissenbach, S. Brandt, B. Tippler, T. Grunwald, and K. Überla. 2012. Cytoplasmic Utilization of Human Immunodeficiency Virus Type 1 Genomic RNA Is Not Dependent on a Nuclear Interaction with Gag. J Virol 86:2990-3002.

70. Grigorov, B., F. Arcanger, P. Roingeard, J. L. Darlix, and D. Muriaux. 2006. Assembly of infectious HIV-1 in human epithelial and T-lymphoblastic cell lines. J Mol Biol 359:848-862.

71. Grigorov, B., D. Décimo, F. Smagulova, C. Péchoux, M. Mougel, D. Muriaux, and J.-L. Darlix. 2007. Intracellular HIV-1 Gag localization is impaired by mutations in the nucleocapsid zinc fingers. Retrovirology 4:54-54.

72. Gudleski, N., J. M. Flanagan, E. P. Ryan, M. C. Bewley, and L. J. Parent. 2010. Directionality of nucleocytoplasmic transport of the retroviral gag protein depends on sequential binding of karyopherins and viral RNA. Proceedings of the National Academy of Sciences 107:9358-9363.

73. Gupta, K., D. Ott, T. J. Hope, R. F. Siliciano, and J. D. Boeke. 2000. A Human Nuclear Shuttling Protein That Interacts with Human Immunodeficiency Virus Type 1 Matrix Is Packaged into Virions. J Virol 74:11811-11824.

74. Gurer, C., L. Berthoux, and J. Luban. 2005. Covalent Modification of Human Immunodeficiency Virus Type 1 p6 by SUMO-1. J Virol 79:910-917.

75. Haffar, O. K., S. Popov, L. Dubrovsky, I. Agostini, H. Tang, T. Pushkarsky, S. G. Nadler, and M. Bukrinsky. 2000. Two nuclear localization signals in the HIV-1 matrix protein regulate nuclear import of the HIV-1 pre-integration complex. Journal of Molecular Biology 299:359-368.

76. Harper, S., and D. W. Speicher. 2002. Expression and Purification of GST Fusion Proteins, vol. 1. John Wiley & Sons, Inc.

77. Henikoff, S., J. G. Henikoff, A. Sakai, G. B. Loeb, and K. Ahmad. 2009. Genome-wide profiling of salt fractions maps physical properties of chromatin. Genome Research 19:460-469.

78. Henning, M. S., S. G. Morham, S. P. Goff, and M. H. Naghavi. 2010. PDZD8 Is a Novel Gag-Interacting Factor That Promotes Retroviral Infection. J Virol 84:8990-8995.

79. Hermida-Matsumoto, L., and M. D. Resh. 2000. Localization of Human Immunodeficiency Virus Type 1 Gag and Env at the Plasma Membrane by Confocal Imaging. J Virol 74:8670-8679.

80. Himly, M., D. N. Foster, I. Bottoli, J. S. Iacovoni, and P. K. Vogt. 1998. The DF-1 Chicken Fibroblast Cell Line: Transformation Induced by Diverse Oncogenes and Cell Death Resulting from Infection by Avian Leukosis Viruses. Virology 248:295-304.

81. Hocine, S., R. H. Singer, and D. Grünwald. 2010. RNA Processing and Export. Cold Spring Harbor Perspectives in Biology.

82. Hong, S., G. Choi, S. Park, A.-S. Chung, E. Hunter, and S. S. Rhee. 2001. Type D Retrovirus Gag Polyprotein Interacts with the Cytosolic Chaperonin TRiC. J Virol 75:2526-2534.

83. Hooks, J. J., and C. J. Gibbs. 1975. The foamy viruses. Bacteriological Reviews 39:169-185.

84. Hori, T., H. Takeuchi, H. Saito, R. Sakuma, Y. Inagaki, and S. Yamaoka. 2013. A Carboxy-Terminally Truncated Human CPSF6 Lacking Residues Encoded by Exon 6 Inhibits HIV-1 cDNA Synthesis and Promotes Capsid Disassembly. J Virol 87:7726-7736.

85. Huang, D. W., B. T. Sherman, and R. A. Lempicki. 2009. Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists. Nucleic Acids Research 37:1-13.

86. Huang, D. W., B. T. Sherman, and R. A. Lempicki. 2008. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nature Protocols 4:44.

87. Hulsen, T., J. de Vlieg, and W. Alkema. 2008. BioVenn – a web application for the comparison and visualization of biological lists using area-proportional Venn diagrams. BMC Genomics 9:488.

88. Hutten, S., and R. H. Kehlenbach. 2007. CRM1-mediated nuclear export: to the pore and beyond. Trends in Cell Biology 17:193-201.

89. Jäger, S., P. Cimermancic, N. Gulbahce, J. R. Johnson, K. E. McGovern, S. C. Clarke, M. Shales, G. Mercenne, L. Pache, K. Li, H. Hernandez, G. M. Jang, S. L. Roth, E. Akiva, J. Marlett, M. Stephens, I. D’Orso, J. Fernandes, M. Fahey, C. Mahon, A. J. O’Donoghue, A. Todorovic, J. H. Morris, D. A. Maltby, T. Alber, G. Cagney, F. D. Bushman, J. A. Young, S. K. Chanda, W. I. Sundquist, T. Kortemme, R. D. Hernandez, C. S. Craik, A. Burlingame, A. Sali, A. D. Frankel, and N. J. Krogan. 2011. Global landscape of HIV–human protein complexes. Nature 481:365.

90. Jouvenet, N., S. Lainé, L. P. Vivares, and M. Mougel. 2011. Cell biology of retroviral RNA packaging. RNA Biology 8:572-580.

91. Kaddis Maldonado, R., and L. Parent. 2016. Orchestrating the Selection and Packaging of Genomic RNA by Retroviruses: An Ensemble of Viral and Host Factors. Viruses 8:257.

92. Kataoka, N., J. L. Bachorik, and G. Dreyfuss. 1999. Transportin-SR, a Nuclear Import Receptor for SR Proteins. The Journal of Cell Biology 145:1145-1152.

93. Keller, A., A. I. Nesvizhskii, E. Kolker, and R. Aebersold. 2002. Empirical Statistical Model To Estimate the Accuracy of Peptide Identifications Made by MS/MS and Database Search. Analytical Chemistry 74:5383-5392.

94. Kemler, I., A. Meehan, and E. M. Poeschla. 2010. Live-Cell Coimaging of the Genomic RNAs and Gag Proteins of Two Lentiviruses. Journal of Virology 84:6352-6366.

95. Kemler, I., D. Saenz, and E. Poeschla. 2012. Feline Immunodeficiency Virus Gag Is a Nuclear Shuttling Protein. Journal of Virology 86:8402-8411.

96. Kenney, S. P., T. L. Lochmann, C. L. Schmid, and L. J. Parent. 2008. Intermolecular Interactions between Retroviral Gag Proteins in the Nucleus. Journal of Virology 82:683-691.

97. Kim, W., Y. Tang, Y. Okada, T. A. Torrey, S. K. Chattopadhyay, M. Pfleiderer, F. G. Falkner, F. Dorner, W. Choi, N. Hirokawa, and H. C. Morse. 1998. Binding of Murine Leukemia Virus Gag Polyproteins to KIF4, a Microtubule-Based Motor Protein. J Virol 72:6898-6901.

98. Kim, Y., A. A. Sharov, K. McDole, M. Cheng, H. Hao, C.-M. Fan, N. Gaiano, M. S. H. Ko, and Y. Zheng. 2011. Mouse B-Type Lamins Are Required for Proper Organogenesis But Not by Embryonic Stem Cells. Science (New York, N.Y.) 334:1706-1710.

99. Kok, K.-H., T. Lei, and D.-Y. Jin. 2009. siRNA and shRNA screens advance key understanding of host factors required for HIV-1 replication. Retrovirology 6:78.

100. König, R., Y. Zhou, D. Elleder, T. L. Diamond, G. M. C. Bonamy, J. T. Irelan, C.-y. Chiang, B. P. Tu, P. D. De Jesus, C. E. Lilley, S. Seidel, A. M. Opaluch, J. S. Caldwell, M. D. Weitzman, K. L. Kuhen, S. Bandyopadhyay, T. Ideker, A. P. Orth, L. J. Miraglia, F. D. Bushman, J. A. Young, and S. K. Chanda. 2008. Global Analysis of Host-Pathogen Interactions that Regulate Early-Stage HIV-1 Replication. Cell 135:49-60.

101. Kotsopoulou, E., V. N. Kim, A. J. Kingsman, S. M. Kingsman, and K. A. Mitrophanous. 2000. A Rev-Independent Human Immunodeficiency Virus Type 1 (HIV-1)-Based Vector That Exploits a Codon-Optimized HIV-1gag-pol Gene. Journal of Virology 74:4839-4852.

102. Krishnan, L., K. A. Matreyek, I. Oztop, K. Lee, C. H. Tipper, X. Li, M. J. Dar, V. N. KewalRamani, and A. Engelman. 2009. The Requirement for Cellular Transportin 3 (TNPO3 or TRN-SR2) during Infection Maps to Human Immunodeficiency Virus Type 1 Capsid and Not Integrase. Journal of Virology 84:397-406.

103. Kudo, N., N. Matsumori, H. Taoka, D. Fujiwara, E. P. Schreiner, B. Wolff, M. Yoshida, and S. Horinouchi. 1999. Leptomycin B inactivates CRM1/exportin 1 by covalent modification at a cysteine residue in the central conserved region. Proceedings of the National Academy of Sciences 96:9112-9117.

104. Kudo, N., B. Wolff, T. Sekimoto, E. P. Schreiner, Y. Yoneda, M. Yanagida, S. Horinouchi, and M. Yoshida. 1998. Leptomycin B Inhibition of Signal-Mediated Nuclear Export by Direct Binding to CRM1. Experimental Cell Research 242:540-547.

105. Kuo, L. J., and L.-X. Yang. 2008. γ-H2AX - A Novel Biomarker for DNA Double-strand Breaks. In Vivo 22:305-309.

106. Lai, M.-C., R.-I. Lin, and W.-Y. Tarn. 2001. Transportin-SR2 mediates nuclear import of phosphorylated SR proteins. Proceedings of the National Academy of Sciences 98:10154-10159.

107. Lai, M. C., R. I. Lin, S. Y. Huang, C. W. Tsai, and W. Y. Tarn. 2000. A human importin-beta family protein, transportin-SR2, interacts with the phosphorylated RS domain of SR proteins. J Biol Chem 275:7950-7957.

108. Lallemand-Breitenbach, V., and H. de Thé. 2010. PML Nuclear Bodies. Cold Spring Harbor Perspectives in Biology 2.

109. Lam, Y. W., and L. Trinkle-Mulcahy. 2015. New insights into nucleolar structure and function. F1000Prime Reports 7:48.

110. Lam, Y. W., L. Trinkle-Mulcahy, and A. I. Lamond. 2005. The nucleolus. Journal of Cell Science 118:1335-1337.

111. Lamond, A. I., and D. L. Spector. 2003. Nuclear speckles: a model for nuclear organelles. Nature Reviews Molecular Cell Biology 4:605.

112. Larson, D. R., Y. M. Ma, V. M. Vogt, and W. W. Webb. 2003. Direct measurement of Gag–Gag interaction during retrovirus assembly with FRET and fluorescence correlation spectroscopy. The Journal of Cell Biology 162:1233-1244.

113. Larue, R., K. Gupta, C. Wuensch, N. Shkriabai, J. J. Kessl, E. Danhart, L. Feng, O. Taltynov, F. Christ, G. D. Van Duyne, Z. Debyser, M. P. Foster, and M. Kvaratskhelia. 2012. Interaction of the HIV-1 Intasome with Transportin 3 Protein (TNPO3 or TRN-SR2). Journal of Biological Chemistry 287:34044-34058.

114. Le Sage, V., A. Cinti, F. Valiente-Echeverría, and A. J. Mouland. 2015. Proteomic analysis of HIV-1 Gag interacting partners using proximity-dependent biotinylation. Virology Journal 12:138.

115. Lehmann-Che, J., N. Renault, M. L. Giron, P. Roingeard, E. Clave, J. Tobaly-Tapiero, P. Bittoun, A. Toubert, H. de Thé, and A. Saïb. 2007. Centrosomal Latency of Incoming Foamy Viruses in Resting Cells. PLOS Pathogens 3:e74.

116. Lehmann, M., M. P. Milev, L. Abrahamyan, X.-J. Yao, N. Pante, and A. J. Mouland. 2009. Intracellular Transport of Human Immunodeficiency Virus Type 1 Genomic RNA and Viral Production Are Dependent on Dynein Motor Function and Late Endosome Positioning. The Journal of biological chemistry 284:14572-14585.

117. Lesbats, P., E. Serrao, D. P. Maskell, V. E. Pye, N. O’Reilly, D. Lindemann, A. N. Engelman, and P. Cherepanov. 2017. Structural basis for spumavirus GAG tethering to chromatin. Proceedings of the National Academy of Sciences 114:5509.

118. Lévesque, K., M. Halvorsen, L. Abrahamyan, L. Chatel-Chaix, V. Poupon, H. Gordon, L. DesGroseillers, A. Gatignol, and A. J. Mouland. 2006. Trafficking of HIV-1 RNA is Mediated by Heterogeneous Nuclear Ribonucleoprotein A2 Expression and Impacts on Viral Assembly. Traffic 7:1177-1193.

119. Levin, A., Z. Hayouka, A. Friedler, and A. Loyter. 2010. Transportin 3 and importin α are required for effective nuclear import of HIV-1 integrase in virus-infected cells. Nucleus 1:422-431.

120. Li, Y., K. M. Frederick, N. A. Haverland, P. Ciborowski, and M. Belshan. 2016. Investigation of the HIV-1 matrix interactome during virus replication. PROTEOMICS – Clinical Applications 10:156-163.

121. Linial, M. L. 1999. Foamy Viruses Are Unconventional Retroviruses. J Virol 73:1747-1755.

122. Linial, M. L. 1999. Foamy Viruses Are Unconventional Retroviruses. Journal of Virology 73:1747-1755.

123. Liu, H., and J. H. Naismith. 2008. An efficient one-step site-directed deletion, insertion, single and multiple-site plasmid mutagenesis protocol. BMC Biotechnology 8:91.

124. Lo, Y. T., T. Tian, P. E. Nadeau, J. Park, and A. Mergia. 2010. The foamy virus genome remains unintegrated in the nuclei of G1/S phase-arrested cells, and integrase is critical for preintegration complex transport into the nucleus. J Virol 84:2832-2842.

125. Lochmann, T. L., D. V. Bann, E. P. Ryan, A. R. Beyer, A. Mao, A. Cochrane, and L. J. Parent. 2013. NC-Mediated Nucleolar Localization of Retroviral Gag Proteins. Virus research 171:304-318.

126. Logue, E. C., K. T. Taylor, P. H. Goff, and N. R. Landau. 2011. The Cargo-Binding Domain of Transportin 3 Is Required for Lentivirus Nuclear Import. J Virol 85:12950-12961.

127. Luban, J. 2008. HIV-1 Infection: Going Nuclear with TNPO3/Transportin-SR2 and Integrase. Current Biology 18:R710-R713.

128. Luban, J., K. L. Bossolt, E. K. Franke, G. V. Kalpana, and S. P. Goff. 1993. Human immunodeficiency virus type 1 Gag protein binds to cyclophilins A and B. Cell 73:1067-1078.

129. Lund, N., M. P. Milev, R. Wong, T. Sanmuganantham, K. Woolaway, B. Chabot, S. Abou Elela, A. J. Mouland, and A. Cochrane. 2012. Differential effects of hnRNP D/AUF1 isoforms on HIV-1 gene expression. Nucleic Acids Research 40:3663-3675.

130. Luttge, B. G., and E. O. Freed. 2010. FIV Gag: Virus Assembly and Host-cell Interactions. Veterinary immunology and immunopathology 134:3.

131. Maertens, G. N., N. J. Cook, W. Wang, S. Hare, S. S. Gupta, I. Öztop, K. Lee, V. E. Pye, O. Cosnefroy, A. P. Snijders, V. N. KewalRamani, A. Fassati, A. Engelman, and P. Cherepanov. 2014. Structural basis for nuclear import of splicing factors by human Transportin 3. Proceedings of the National Academy of Sciences 111:2728-2733.

132. Martinez, N. W., X. Xue, R. G. Berro, G. Kreitzer, and M. D. Resh. 2008. Kinesin KIF4 Regulates Intracellular Trafficking and Stability of the Human Immunodeficiency Virus Type 1 Gag Polyprotein. Journal of Virology 82:9937-9950.

133. Matic, I., M. van Hagen, J. Schimmel, B. Macek, S. C. Ogg, M. H. Tatham, R. T. Hay, A. I. Lamond, M. Mann, and A. C. O. Vertegaal. 2008. In Vivo Identification of Human Small Ubiquitin-like Modifier Polymerization Sites by High Accuracy Mass Spectrometry and an in Vitro to in Vivo Strategy. Molecular & Cellular Proteomics 7:132-144.

134. Misteli, T., J. F. Cáceres, J. Q. Clement, A. R. Krainer, M. F. Wilkinson, and D. L. Spector. 1998. Serine Phosphorylation of SR Proteins Is Required for Their Recruitment to Sites of Transcription In Vivo. The Journal of Cell Biology 143:297.

135. Misteli, T., J. F. Cáceres, and D. L. Spector. 1997. The dynamics of a pre-mRNA splicing factor in living cells. Nature 387:523.

136. Mitrea, D. M., and R. W. Kriwacki. 2016. Phase separation in biology; functional organization of a higher order. Cell Communication and Signaling 14:1.

137. Molle, D., C. Segura-Morales, G. Camus, C. Berlioz-Torrent, J. Kjems, E. Basyuk, and E. Bertrand. 2009. Endosomal Trafficking of HIV-1 Gag and Genomic RNAs Regulates Viral Egress. Journal of Biological Chemistry 284:19727-19743.

138. Monette, A., L. Ajamian, M. López-Lastra, and A. J. Mouland. 2009. Human Immunodeficiency Virus Type 1 (HIV-1) Induces the Cytoplasmic Retention of Heterogeneous Nuclear Ribonucleoprotein A1 by Disrupting Nuclear Import: IMPLICATIONS FOR HIV-1 GENE EXPRESSION. Journal of Biological Chemistry 284:31350-31362.

139. Moroianu, J. 1999. Nuclear import and export pathways. J Cell Biochem 33:76-83.

140. Müllers, E. 2013. The Foamy Virus Gag Proteins: What Makes Them Different? Viruses 5.

141. Müllers, E., K. Stirnnagel, S. Kaulfuss, and D. Lindemann. 2011. Prototype Foamy Virus Gag Nuclear Localization: a Novel Pathway among Retroviruses. Journal of Virology 85:9276-9285.

142. Müllers, E., T. Uhlig, K. Stirnnagel, U. Fiebig, H. Zentgraf, and D. Lindemann. 2011. Novel Functions of Prototype Foamy Virus Gag Glycine-Arginine-Rich Boxes in Reverse Transcription and Particle Morphogenesis. J Virol 85:1452-1463.

143. Muratani, M., D. Gerlich, S. M. Janicki, M. Gebhard, R. Eils, and D. L. Spector. 2001. Metabolic-energy-dependent movement of PML bodies within the mammalian cell nucleus. Nature Cell Biology 4:106.

144. Nadaraia-Hoke, S., D. V. Bann, T. L. Lochmann, N. Gudleski-O'Regan, and L. J. Parent. 2013. Alterations in the MA and NC Domains Modulate Phosphoinositide-Dependent Plasma Membrane Localization of the Rous Sarcoma Virus Gag Protein. Journal of Virology 87:3609-3615.

145. Nadler, S. G., D. Tritschler, O. K. Haffar, J. Blake, A. G. Bruce, and J. S. Cleaveland. 1997. Differential Expression and Sequence-specific Interaction of Karyopherin α with Nuclear Localization Sequences. Journal of Biological Chemistry 272:4310-4315.

146. Nandhagopal, N., A. A. Simpson, M. C. Johnson, A. B. Francisco, G. W. Schatz, M. G. Rossmann, and V. M. Vogt. 2004. Dimeric rous sarcoma virus capsid protein structure relevant to immature Gag assembly. J Mol Biol 335:275-282.

147. Nash, M. A., M. K. Meyer, G. L. Decker, and R. B. Arlinghaus. 1993. A subset of Pr65gag is nucleus associated in murine leukemia virus-infected cells. Journal of Virology 67:1350-1356.

148. Nesvizhskii, A. I., A. Keller, E. Kolker, and R. Aebersold. 2003. A Statistical Model for Identifying Proteins by Tandem Mass Spectrometry. Analytical Chemistry 75:4646-4658.

149. Ngo, J. C. K., S. Chakrabarti, J.-H. Ding, A. Velazquez-Dones, B. Nolen, B. E. Aubol, J. A. Adams, X.-D. Fu, and G. Ghosh. 2005. Interplay between SRPK and Clk/Sty Kinases in Phosphorylation of the Splicing Factor ASF/SF2 Is Regulated by a Docking Motif in ASF/SF2. Molecular Cell 20:77-89.

150. Nguyen, D. G., K. C. Wolff, H. Yin, J. S. Caldwell, and K. L. Kuhen. 2006. “UnPAKing” Human Immunodeficiency Virus (HIV) Replication: Using Small Interfering RNA Screening To Identify Novel Cofactors and Elucidate the Role of Group I PAKs in HIV Infection. J Virol 80:130-137.

151. Nishi, K., M. Yoshida, D. Fujiwara, M. Nishikawa, S. Horinouchi, and T. Beppu. 1994. Leptomycin B targets a regulatory cascade of crm1, a fission yeast nuclear protein, involved in control of higher order chromosome structure and gene expression. Journal of Biological Chemistry 269:6320-6324.

152. Nizami, Z., S. Deryusheva, and J. G. Gall. 2010. The Cajal Body and Histone Locus Body. Cold Spring Harbor Perspectives in Biology 2.

153. Parent, L. J., T. M. Cairns, J. A. Albert, C. B. Wilson, J. W. Wills, and R. C. Craven. 2000. RNA Dimerization Defect in a Rous Sarcoma Virus Matrix Mutant. J Virol 74:164-172.

154. Pemberton, L. F., and B. M. Paschal. 2005. Mechanisms of Receptor-Mediated Nuclear Import and Nuclear Export. Traffic 6:187-198.

155. Perlman, M., and M. D. Resh. 2006. Identification of an intracellular trafficking and assembly pathway for HIV-1 gag. Traffic 7:731-745.

156. Petit, C., M.-L. Giron, J. Tobaly-Tapiero, P. Bittoun, E. Real, Y. Jacob, N. Tordo, H. de Thé, and A. Saïb. 2003. Targeting of incoming retroviral Gag to the centrosome involves a direct interaction with the dynein light chain 8. Journal of Cell Science 116:3433-3442.

157. Phillips, J. M., P. S. Murray, D. Murray, and V. M. Vogt. 2008. A molecular switch required for retrovirus assembly participates in the hexagonal immature lattice. Embo J 27:1411-1420.

158. Pommier, Y. 2006. Topoisomerase I inhibitors: camptothecins and beyond. Nature Reviews Cancer 6:789.

159. Poole, E., P. Strappe, H. P. Mok, R. Hicks, and A. M. L. Lever. 2005. HIV-1 Gag–RNA Interaction Occurs at a Perinuclear/Centrosomal Site; Analysis by Confocal Microscopy and FRET. Traffic 6:741-755.

160. Prizan-Ravid, A., E. Elis, N. Laham-Karam, S. Selig, M. Ehrlich, and E. Bacharach. 2010. The Gag cleavage product, p12, is a functional constituent of the murine leukemia virus pre-integration complex. PLoS Pathog 6:1001183.

161. Rato, S., S. Maia, P. M. Brito, L. Resende, C. F. Pereira, C. Moita, R. P. Freitas, J. Moniz-Pereira, N. Hacohen, L. F. Moita, and J. Goncalves. 2010. Novel HIV-1 Knockdown Targets Identified by an Enriched Kinases/Phosphatases shRNA Library Using a Long-Term Iterative Screen in Jurkat T-Cells. PLOS ONE 5:e9276.

162. Renault, N., J. Tobaly-Tapiero, J. Paris, M.-L. Giron, A. Coiffic, P. Roingeard, and A. Saïb. 2011. A nuclear export signal within the structural Gag protein is required for prototype foamy virus replication. Retrovirology 8:6.

163. Rice, B., R. Kaddis, M. Stake, T. Lochmann, and L. Parent. 2015. Interplay between the alpharetroviral Gag protein and SR Proteins SF2 and SC35 in the nucleus. Frontiers in Microbiology 6.

164. Rieder, D., Z. Trajanoski, and J. G. McNally. 2012. Transcription factories. Frontiers in Genetics 3:221.

165. Risco, C., L. Menendez-Arias, T. D. Copeland, P. Pinto da Silva, and S. Oroszlan. 1995. Intracellular transport of the murine leukemia virus during acute infection of NIH 3T3 cells: nuclear import of nucleocapsid protein and integrase. Journal of Cell Science 108:3039.

166. Ritchie, C., I. Cylinder, E. J. Platt, and E. Barklis. 2015. Analysis of HIV-1 Gag Protein Interactions via Biotin Ligase Tagging. Journal of Virology 89:3988-4001.

167. Rous, P. 1911. A SARCOMA OF THE FOWL TRANSMISSIBLE BY AN AGENT SEPARABLE FROM THE TUMOR CELLS. The Journal of Experimental Medicine 13:397-411.

168. Rous, P. 1910. A TRANSMISSIBLE AVIAN NEOPLASM. (SARCOMA OF THE COMMON FOWL.). The Journal of Experimental Medicine 12:696-705.

169. Royer, M., M. Cerutti, B. Gay, S.-S. Hong, G. Devauchelle, and P. Boulanger. 1991. Functional domains of HIV-1gag-polyprotein expressed in baculovirus-infected cells. Virology 184:417-422.

170. Rye-McCurdy, T. D., S. Nadaraia-Hoke, N. Gudleski-O'Regan, J. M. Flanagan, L. J. Parent, and K. Musier-Forsyth. 2014. Mechanistic Differences between Nucleic Acid Chaperone Activities of the Gag Proteins of Rous Sarcoma Virus and Human Immunodeficiency Virus Type 1 Are Attributed to the MA Domain. Journal of Virology 88:7852-7861.

171. Saïb, A., F. Puvion-Dutilleul, M. Schmid, J. Périès, and H. de Thé. 1997. Nuclear targeting of incoming human foamy virus Gag proteins involves a centriolar step. J Virol 71:1155-1161.

172. Scheifele, L. Z., R. A. Garbitt, J. D. Rhoads, and L. J. Parent. 2002. Nuclear entry and CRM1-dependent nuclear export of the Rous sarcoma virus Gag polyprotein. Proceedings of the National Academy of Sciences 99:3944-3949.

173. Scheifele, L. Z., S. P. Kenney, T. M. Cairns, R. C. Craven, and L. J. Parent. 2007. Overlapping roles of the Rous sarcoma virus Gag p10 domain in nuclear export and virion core morphology. J Virol 81:10718-10728.

174. Scheifele, L. Z., E. P. Ryan, and L. J. Parent. 2005. Detailed Mapping of the Nuclear Export Signal in the Rous Sarcoma Virus Gag Protein. Journal of Virology 79:8732-8741.

175. Schliephake, A. W., and A. Rethwilm. 1994. Nuclear localization of foamy virus Gag precursor protein. Journal of Virology 68:4946-4954.

176. Schneider, W. M., J. D. Brzezinski, S. Aiyer, N. Malani, M. Gyuricza, F. D. Bushman, and M. J. Roth. 2013. Viral DNA tethering domains complement replication-defective mutations in the p12 protein of MuLV Gag. Proceedings of the National Academy of Sciences 110:9487-9492.

177. Schreiber, E., P. Matthias, M. M. Müller, and W. Schaffner. 1989. Rapid detection of octamer binding proteins with 'mini-extracts', prepared from a small number of cells. Nucleic Acids Research 17:6419.

178. Sfakianos, J. N., R. A. LaCasse, and E. Hunter. 2003. The M-PMV cytoplasmic targeting-retention signal directs nascent Gag polypeptides to a pericentriolar region of the cell. Traffic 4:660-670.

179. Shah, V. B., J. Shi, D. R. Hout, I. Oztop, L. Krishnan, J. Ahn, M. S. Shotwell, A. Engelman, and C. Aiken. 2013. The Host Proteins Transportin SR2/TNPO3 and Cyclophilin A Exert Opposing Effects on HIV-1 Uncoating. J Virol 87:422-432.

180. Sherer, N. M., M. J. Lehmann, L. F. Jimenez-Soto, A. Ingmundson, S. M. Horner, G. Cicchetti, P. G. Allen, M. Pypaert, J. M. Cunningham, and W. Mothes. 2003. Visualization of retroviral replication in living cells reveals budding into multivesicular bodies. Traffic 4:785-801.

181. Shilov, I. V., S. L. Seymour, A. A. Patel, A. Loboda, W. H. Tang, S. P. Keating, C. L. Hunter, L. M. Nuwaysir, and D. A. Schaeffer. 2007. The Paragon Algorithm, a Next Generation Search Engine That Uses Sequence Temperature Values and Feature Probabilities to Identify Peptides from Tandem Mass Spectra. Molecular & Cellular Proteomics 6:1638-1655.

182. Simon, D. N., and K. L. Wilson. 2011. The nucleoskeleton as a genome-associated dynamic 'network of networks'. Nature Reviews Molecular Cell Biology 12:695.

183. Skinner, L. M., M. Sudol, A. L. Harper, and M. Katzman. 2001. Nucleophile Selection for the Endonuclease Activities of Human, Ovine, and Avian Retroviral Integrases. Journal of Biological Chemistry 276:114-124.

184. Spector, D. L. 2003. The Dynamics of Chromosome Organization and Gene Regulation. Annual Review of Biochemistry 72:573-608.

185. Spector, D. L., and A. I. Lamond. 2011. Nuclear Speckles. Cold Spring Harbor Perspectives in Biology 3.

186. Stade, K., C. S. Ford, C. Guthrie, and K. Weis. Exportin 1 (Crm1p) Is an Essential Nuclear Export Factor. Cell 90:1041-1050.

187. Stake, M., D. Bann, R. Kaddis, and L. Parent. 2013. Nuclear Trafficking of Retroviral RNAs and Gag Proteins during Late Steps of Replication. Viruses 5:2767.

188. Stake, M. S. 2013. The Role of Karyopherin Transportin 3 in Rous Sarcoma Virus Assembly. Pennsylvania State University.

189. Staněk, D., and A. H. Fox. 2017. Nuclear bodies: news insights into structure and function. Current Opinion in Cell Biology 46:94-101.

190. Stehelin, D., H. E. Varmus, J. M. Bishop, and P. K. Vogt. 1976. DNA related to the transforming gene(s) of avian sarcoma viruses is present in normal avian DNA. Nature 260:170-173.

191. Stewart, M. 2007. Molecular mechanism of the nuclear protein import cycle. Nature Reviews Molecular Cell Biology 8:195.

192. Sun, Q., Y. P. Carrasco, Y. Hu, X. Guo, H. Mirzaei, J. MacMillan, and Y. M. Chook. 2013. Nuclear export inhibition through covalent conjugation and hydrolysis of Leptomycin B by CRM1. Proceedings of the National Academy of Sciences of the United States of America 110:1303-1308.

193. Swanstrom, R., and J. W. Wills. 1997. Principles of Particle Assembly. In J. M. Coffin, S. H. Hughes, and H. E. Varmus (ed.), Retroviruses. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.

194. Swanstrom, R., and J. W. Wills. 1997. Overview of Retroviral Assembly. In J. M. Coffin, S. H. Hughes, and H. E. Varmus (ed.), Retroviruses. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.

195. Tanaka, M., B. A. Robinson, K. Chutiraka, C. D. Geary, J. C. Reed, and J. R. Lingappa. 2015. Mutations of Conserved Residues in the Major Homology Region Arrest Assembling HIV-1 Gag as a Membrane-Targeted Intermediate Containing Genomic RNA and Cellular Proteins. J Virol 90:1944-1963.

196. Tang, W. H., I. V. Shilov, and S. L. Seymour. 2008. Nonlinear Fitting Method for Determining Local False Discovery Rates from Decoy Database Searches. Journal of Proteome Research 7:3661-3667.

197. Temin, H. M. 1963. The effects of actinomycin D on growth of Rous sarcoma virus in vitro. Virology 20:577-582.

198. Temin, H. M., and S. Mizutani. 1970. RNA-dependent DNA polymerase in virions of Rous sarcoma virus. Nature 226:1211-1213.

199. The Gene Ontology Consortium. 2017. Expansion of the Gene Ontology knowledgebase and resources. Nucleic Acids Research 45:D331-D338.

200. Tobaly‐Tapiero, J., P. Bittoun, J. Lehmann‐Che, O. Delelis, M. L. Giron, H. D. Thé, and A. Saïb. 2008. Chromatin Tethering of Incoming Foamy Virus by the Structural Gag Protein. Traffic 9:1717-1727.
201. Tsirkone, V. G., J. Blokken, F. De Wit, J. Breemans, S. De Houwer, Z. Debyser, F. Christ, and S. V. Strelkov. 2017. N-terminal half of transportin SR2 interacts with HIV integrase. Journal of Biological Chemistry 292:9699-9710.
202. Valle-Casuso, J. C., F. Di Nunzio, Y. Yang, N. Reszka, M. Lienlaf, N. Arhel, P. Perez, A. L. Brass, and F. Diaz-Griffero. 2012. TNPO3 Is Required for HIV-1 Replication after Nuclear Import but prior to Integration and Binds the HIV-1 Core. J Virol 86:5931-5936.
203. Vogt, P. K. 1997. A Brief Chronicle of Retrovirology. In J. M. Coffin, S. H. Hughes, and H. E. Varmus (ed.), Retroviruses. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
204. Vogt, P. K. 1997. The Place of Retroviruses in Biology. In J. M. Coffin, S. H. Hughes, and H. E. Varmus (ed.), Retroviruses. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
205. Vogt, P. K. 1997. Virion Proteins. In J. M. Coffin, S. H. Hughes, and H. E. Varmus (ed.), Retroviruses. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
206. von Schwedler, U., R. S. Kornbluth, and D. Trono. 1994. The nuclear localization signal of the matrix protein of human immunodeficiency virus type 1 allows the establishment of infection in macrophages and quiescent T lymphocytes. Proc Natl Acad Sci U S A 91:6992-6996.
207. Wang, M. Q., W. Kim, G. Gao, T. A. Torrey, H. C. Morse, P. De Camilli, and S. P. Goff. 2003. Endophilins interact with Moloney murine leukemia virus Gag and modulate virion production. Journal of Biology 3:4.
208. Wang, P., and S. R. Wilson. 2013. Mass spectrometry-based protein identification by integrating de novo sequencing with database searching. BMC Bioinformatics 14:S24-S24.
209. Weidtkamp-Peters, S., T. Lenser, D. Negorev, N. Gerstner, T. G. Hofmann, G. Schwanitz, C. Hoischen, G. Maul, P. Dittrich, and P. Hemmerich. 2008. Dynamics of component exchange at PML nuclear bodies. Journal of Cell Science 121:2731-2743.
210. Weldon, R. A., C. R. Erdie, M. G. Oliver, and J. W. Wills. 1990. Incorporation of chimeric gag protein into retroviral particles. Journal of Virology 64:4169-4179.
211. Weldon, R. A., P. Sarkar, S. M. Brown, and S. K. Weldon. 2003. Mason–Pfizer monkey virus Gag proteins interact with the human sumo conjugating enzyme, hUbc9. Virology 314:62-73.
212. Wight, D. J., V. C. Boucherit, M. Nader, D. J. Allen, I. A. Taylor, and K. N. Bishop. 2012. The Gammaretroviral p12 protein has multiple domains that function during the early stages of replication. Retrovirology 9:83.
213. Woodcock, C. L., and R. P. Ghosh. 2010. Chromatin Higher-order Structure and Dynamics. Cold Spring Harbor Perspectives in Biology 2.
214. Yates, J. R. 1998. Mass spectrometry and the age of the proteome. Journal of Mass Spectrometry 33:1-19.
215. Yeakley, J. M., H. Tronchère, J. Olesen, J. A. Dyck, H.-Y. Wang, and X.-D. Fu. 1999. Phosphorylation Regulates In Vivo Interaction and Molecular Targeting of Serine/Arginine-rich Pre-mRNA Splicing Factors. The Journal of Cell Biology 145:447.
216. Yeung, M. L., L. Houzet, V. S. R. K. Yedavalli, and K.-T. Jeang. 2009. A Genome-wide Short Hairpin RNA Screening of Jurkat T-cells for Human Proteins Contributing to Productive HIV-1 Replication. Journal of Biological Chemistry 284:19463-19473.
217. Yu, K. L., S. H. Lee, E. S. Lee, and J. C. You. 2016. HIV-1 nucleocapsid protein localizes efficiently to the nucleus and nucleolus. Virology 492:204-212.

218. Yu, S. F., S. W. Eastman, and M. L. Linial. 2006. Foamy Virus Capsid Assembly Occurs at a Pericentriolar Region Through a Cytoplasmic Targeting/Retention Signal in Gag. Traffic 7:966-977.
219. Yu, S. F., K. Edelmann, R. K. Strong, A. Moebes, A. Rethwilm, and M. L. Linial. 1996. The carboxyl terminus of the human foamy virus Gag protein contains separable nucleic acid binding and nuclear transport domains. J Virol 70:8255-8262.
220. Yuan, B., S. Campbell, E. Bacharach, A. Rein, and S. P. Goff. 2000. Infectivity of Moloney Murine Leukemia Virus Defective in Late Assembly Events Is Restored by Late Assembly Domains of Other Retroviruses. J Virol 74:7250-7260.
221. Yuan, B., A. Fassati, A. Yueh, and S. P. Goff. 2002. Characterization of Moloney Murine Leukemia Virus p12 Mutants Blocked during Early Events of Infection. J Virol 76:10801-10810.
222. Yuan, B., X. Li, and S. P. Goff. 1999. Mutations altering the Moloney murine leukemia virus p12 Gag protein affect virion production and early events of the virus life cycle. The EMBO Journal 18:4700-4710.
223. Yueh, A., J. Leung, S. Bhattacharyya, L. A. Perrone, K. de los Santos, S.-y. Pu, and S. P. Goff. 2006. Interaction of Moloney Murine Leukemia Virus Capsid with Ubc9 and PIASy Mediates SUMO-1 Addition Required Early in Infection. J Virol 80:342-352.
224. Zhang, J., and C. S. Crumpacker. 2002. Human Immunodeficiency Virus Type 1 Nucleocapsid Protein Nuclear Localization Mediates Early Viral mRNA Expression. J Virol 76:10444-10454.
225. Zhou, H., M. Xu, Q. Huang, A. T. Gates, X. D. Zhang, J. C. Castle, E. Stec, M. Ferrer, B. Strulovici, D. J. Hazuda, and A. S. Espeseth. 2008. Genome-Scale RNAi Screen for Host Factors Required for HIV Replication. Cell Host & Microbe 4:495-504.
226. Zhou, L., E. Sokolskaja, C. Jolly, W. James, S. A. Cowley, and A. Fassati. 2011. Transportin 3 Promotes a Nuclear Maturation Step Required for Efficient HIV-1 Integration. PLOS Pathogens 7:e1002194.

Vita
Breanna Lynn Rice

Education
Ph.D. in Biomedical Sciences, 2011 - 2018
The Pennsylvania State University, College of Medicine (Hershey, PA)
B.S. in Biology, Magna Cum Laude, 2007 - 2011
Central Michigan University (Mt. Pleasant, MI)

Research Grants
Ruth L. Kirschstein Predoctoral Individual National Research Service Award, 2015 - 2018
Ruth L. Kirschstein NRSA Institutional Research Training Grant (T32), 2013 - 2015
Undergraduate Research and Creative Endeavors Grant, 2010

Selected Abstracts
Rice B., Tuffy K., Maldonado R., and Parent L. Chromatin-Associated Binding Partners of Retroviral Gag Proteins. Retroviruses Meeting. May 2017. Cold Spring Harbor Laboratory, Cold Spring Harbor, NY.

Rice B., Lochmann T., Stake M., Kaddis R., and Parent L. Association of Retroviral Gag Proteins with Host Chromatin: Implications for Viral Genome Packaging near Sites of Transcription. Nuclear Organization & Function Meeting. May 2016. Cold Spring Harbor Laboratory, Cold Spring Harbor, NY.

Rice B., Lochmann T., Stake M., Kaddis R., Shkriabai N., Kvaratskhelia M., and Parent L. The Association of Retroviral Gag Proteins with Host Chromatin Proteins. Retroviruses Meeting. May 2015. Cold Spring Harbor Laboratory, Cold Spring Harbor, NY.

Rice B., Lochmann T., Stake M., Shkriabai N., Kvaratskhelia M., and Parent L. Retroviral Gag Interactions with Chromatin. ASM Viral Manipulations of the Nucleus. October 2014. Washington, D.C.

Selected Publications
Maldonado RK, Tuffy KM, Rice BL, Chiari-Fort EF, Fahrbach KM, Hope TJ, Cochrane A, and Parent LJ. Association of retroviral Gag proteins with unspliced viral RNA in the nucleus. In preparation.

Rice BL*, Kaddis RJ*, Stake MS*, Lochmann TL, and Parent LJ (2015). Interplay between the alpharetroviral Gag protein and SR proteins SF2 and SC35 in the nucleus. Front. Microbiol. 6:925. *Authors contributed equally.

Beyer AR, Bann DV, Rice BL, Pultz IS, Kane M, Goff SP, Golovkina T, and Parent LJ (2013). Nucleolar Trafficking of the Mouse Mammary Tumor Virus Gag Protein Induced by Interaction with Ribosomal Protein L9. J. Virol. 87(2):1069-82.