Systematic Discovery of Endogenous Human Ribonucleoprotein Complexes
bioRxiv preprint doi: https://doi.org/10.1101/480061; this version posted November 27, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.

Systematic discovery of endogenous human ribonucleoprotein complexes

Anna L. Mallam1,2,3,§,*, Wisath Sae-Lee1,2,3, Jeffrey M. Schaub1,2,3, Fan Tu1,2,3, Anna Battenhouse1,2,3, Yu Jin Jang1, Jonghwan Kim1, Ilya J. Finkelstein1,2,3, Edward M. Marcotte1,2,3,§, Kevin Drew1,2,3,§,*

1 Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78712, USA
2 Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, TX 78712, USA
3 Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX 78712, USA
§ Correspondence: [email protected] (A.L.M.), [email protected] (E.M.M.), [email protected] (K.D.)
* These authors contributed equally to this work

Short title: A resource of human ribonucleoprotein complexes

Summary: An exploration of human protein complexes in the presence and absence of RNA reveals endogenous ribonucleoprotein complexes

Abstract

Ribonucleoprotein (RNP) complexes are important for many cellular functions but their prevalence has not been systematically investigated. We developed a proteome-wide fractionation-mass-spectrometry strategy called differential fractionation (DIF-FRAC) to discover RNP complexes by their sensitivity to RNase A treatment.
Applying this to human cells reveals a set of 115 highly stable endogenous RNPs, and a further 1,428 protein complexes whose subunits associate with RNA, thus indicating over 20% of all complexes are RNPs. We show RNP complexes either dissociate, change composition, or form stable protein-only complexes upon RNase A treatment, uncovering the biochemical role of RNA in complex formation. We combine these data into a resource, rna.MAP (rna.proteincomplexes.org), which demonstrates that well-studied complexes such as replication factor C (RFC) and centralspindlin exist as RNP complexes, providing new insight into their cellular functions. We apply our method to red blood cells and mouse embryonic stem cells to demonstrate its ability to identify cell-type specific roles for RNP complexes in diverse systems. Thus the methodology has the potential to uncover RNP complexes in different human tissues, disease states and throughout all domains of life.

Keywords: ribonucleoprotein complex, RNP, RNA binding protein, RBP, proteomics, DIF-FRAC, protein complexes, biochemical fractionation, mass spectrometry, interactome

Introduction

Large macromolecular complexes are crucial to many essential biochemical functions and their full characterization is necessary for a complete understanding of the cell.
A worldwide effort is underway to systematically identify complexes using high-throughput mass spectrometry techniques across many cell types, tissues and species (Havugimana et al., 2012; Hein et al., 2015; Wan et al., 2015; Drew et al., 2017; Huttlin et al., 2017), but these techniques currently consider only the protein subunits of complexes, ignoring other constituent biomolecules. Ribonucleoprotein (RNP) complexes, a specific subclass of complexes consisting of RNA and protein (Castello et al., 2013; Gerstberger et al., 2014; Hentze et al., 2018), are particularly important to study due to their indispensable role in cellular functions such as translation (ribosome), splicing (spliceosome), and RNA degradation (exosome), as well as their critical role in human diseases including amyotrophic lateral sclerosis (ALS) (Scotter et al., 2015), spinocerebellar ataxia (Yue et al., 2001), and autism (Voineagu et al., 2011). Unfortunately, we currently lack a full account of all RNP complexes in the cell, which limits our understanding of vital biological processes. Recent advances in methodology have identified many new RNA-associated proteins, highlighting the importance of protein-RNA interactions throughout the proteome (Baltz et al., 2012; Castello et al., 2012; Brannan et al., 2016; Castello et al., 2016; He et al., 2016; Treiber et al., 2017; Bao et al., 2018; Huang et al., 2018; Queiroz et al., 2018; Trendel et al., 2018), yet none of these methods can directly explore the prevalence of protein-RNA interactions in the context of macromolecular complexes. Moreover, these techniques rely on crosslinking, modified-nucleotide incorporation, the use of specific RNA baits, and/or poly(A) RNA capture, all of which bias their identifications. To address these limitations and discover the pervasiveness of RNA in macromolecular complexes, we developed an unbiased strategy to systematically discover endogenous RNP complexes.
Our method, ‘differential fractionation for interaction analysis’ (DIF-FRAC), measures the sensitivity of protein complexes to RNase A treatment using native size-exclusion chromatography followed by mass spectrometry. DIF-FRAC is based on a high-throughput co-fractionation mass spectrometry (CF-MS) approach that has been applied to a diverse set of tissues and cell types en route to generating human and metazoan protein complex maps (Havugimana et al., 2012; Wan et al., 2015; Drew et al., 2017). DIF-FRAC builds upon CF-MS by comparing chromatographic separations of cellular lysate under control and RNA-degrading conditions (Figure 1A). DIF-FRAC then discovers RNP complexes by identifying concurrent shifts of known protein complex subunits upon RNA degradation. Using DIF-FRAC we discover a set of 115 highly stable RNP complexes and further generate a system-wide resource of 1,428 protein complexes where the majority of subunits bind RNA, representing 20% of known human protein complexes. In-depth analysis of DIF-FRAC data further characterizes RNP complexes and distinguishes several roles for RNA within them: complexes whose composition is RNA-dependent, complexes in which RNA is peripheral to complex formation, and complexes in which RNA is a structural component responsible for the stability of the complex. A distinct advantage of the DIF-FRAC method is that it does not rely on crosslinking, nucleotide incorporation, genetic manipulation or poly(A) RNA capture efficiency, and it can therefore be used to investigate a wide variety of cell types, tissues and species.
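The comparison described above (elution profiles under control and RNase A-treated conditions, followed by a check for concurrent shifts among the subunits of a known complex) can be illustrated with a minimal Python sketch. This is not the authors' scoring procedure, which is not given in this section. The distance measure (half the L1 distance between sum-normalized profiles), the 0.5 threshold, and the function names elution_shift and fraction_of_subunits_shifted are all assumptions chosen for illustration.

```python
from typing import Dict, List, Sequence


def normalize(profile: Sequence[float]) -> List[float]:
    """Scale a per-fraction abundance profile so that it sums to 1."""
    total = float(sum(profile))
    if total == 0.0:
        return [0.0] * len(profile)
    return [x / total for x in profile]


def elution_shift(control: Sequence[float], treated: Sequence[float]) -> float:
    """Illustrative shift score (an assumption, not the published metric).

    Half the L1 distance between normalized profiles:
    0.0 means identical elution, 1.0 means no overlapping fractions.
    """
    if len(control) != len(treated):
        raise ValueError("profiles must cover the same fractions")
    c, t = normalize(control), normalize(treated)
    return 0.5 * sum(abs(a - b) for a, b in zip(c, t))


def fraction_of_subunits_shifted(
    subunits: Sequence[str],
    control: Dict[str, Sequence[float]],
    treated: Dict[str, Sequence[float]],
    threshold: float = 0.5,  # hypothetical cutoff
) -> float:
    """Fraction of a complex's subunits whose elution shifts past the threshold.

    Only subunits with nonzero signal in both conditions are scored.
    """
    scored = [
        p for p in subunits
        if p in control and p in treated
        and sum(control[p]) > 0 and sum(treated[p]) > 0
    ]
    if not scored:
        return 0.0
    shifted = sum(
        1 for p in scored if elution_shift(control[p], treated[p]) > threshold
    )
    return shifted / len(scored)


# Toy profiles over six fractions (invented numbers, not data from the paper).
control_profiles = {"A": [0, 0, 10, 5, 0, 0], "B": [0, 0, 8, 4, 0, 0], "C": [0, 0, 0, 0, 6, 6]}
treated_profiles = {"A": [0, 0, 0, 0, 5, 10], "B": [0, 0, 0, 1, 4, 7], "C": [0, 0, 0, 0, 6, 6]}
print(fraction_of_subunits_shifted(["A", "B"], control_profiles, treated_profiles))
```

In this toy example, proteins A and B both move to later fractions after treatment, so a hypothetical complex made of A and B would be flagged as shifting concurrently, whereas protein C is unchanged. A real analysis would also need to handle replicate variability and subunits lost entirely after treatment, which this sketch ignores.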
We apply DIF-FRAC to mouse embryonic stem cells (mESCs) and human erythrocytes (red blood cells; RBCs) to show the method is highly adaptable and can be extended to discover RNP complexes in diverse samples, tissue types, and species. Finally, we provide our resource, rna.MAP, to the community as a fully searchable web database at rna.proteincomplexes.org.

Results and Discussion

Differential fractionation (DIF-FRAC) identifies RNP complexes

The DIF-FRAC strategy detects RNP complexes by identifying changes in the elution of a protein complex’s subunits upon degradation of RNA (Figure 1). We applied DIF-FRAC to human HEK 293T cell lysate using size-exclusion chromatography (SEC) to separate the cellular proteins in a control and an RNase A-treated sample into 50 fractions (Figure 1A). Upon degradation of RNA, we see a change in the bulk chromatography absorbance signal consistent with higher-molecular-weight species (>1000 kDa) becoming lower-molecular-weight species (Figure 1B). The loss of absorbance signal in the high-molecular-weight region in the absence of RNA suggests this peak corresponds to RNA and RNP complexes. The distribution of cellular RNA in these fractions measured using RNA-seq confirmed we are accessing a diverse RNA landscape of mRNAs, small RNAs, and lncRNAs (Figure S1). As a negative control, we applied DIF-FRAC to human erythrocytes, which have substantially lower amounts of RNA due to the loss of their nucleus and ribosomes upon maturation (Keerthivasan et al., 2011). For this reason, we expect limited change in the proteome’s elution profiles upon RNase A treatment.
Accordingly, the absorbance chromatography signal of erythrocyte lysate shows negligible difference in a DIF-FRAC experiment (Figure 1C). Together these data establish that DIF-FRAC is