Wholegenome Screening Identifies Proteins Localized to Distinct Nuclear Bodies
Total Page:16
File Type:pdf, Size:1020Kb
JCB: Tools Whole-genome screening identifies proteins localized to distinct nuclear bodies Ka-wing Fong,1 Yujing Li,2,3 Wenqi Wang,1 Wenbin Ma,2,3 Kunpeng Li,2 Robert Z. Qi,4 Dan Liu,5 Zhou Songyang,2,3,5 and Junjie Chen1 1Department of Experimental Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030 2Key Laboratory of Gene Engineering of Ministry of Education and 3State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou 510275, China 4State Key Laboratory of Molecular Neuroscience, Division of Life Science, The Hong Kong University of Science and Technology, Hong Kong, China 5The Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, TX 77030 he nucleus is a unique organelle that contains essen- 325 proteins localized to distinct nuclear bodies, including tial genetic materials in chromosome territories. The nucleoli (148), promyelocytic leukemia nuclear bodies (38), Tinterchromatin space is composed of nuclear sub- nuclear speckles (27), paraspeckles (24), Cajal bodies compartments, which are defined by several distinctive (17), Sam68 nuclear bodies (5), Polycomb bodies (2), nuclear bodies believed to be factories of DNA or RNA and uncharacterized nuclear bodies (64). Functional vali- processing and sites of transcriptional and/or posttranscrip- dation revealed several proteins potentially involved in tional regulation. In this paper, we performed a genome- the assembly of Cajal bodies and paraspeckles. Together, wide microscopy-based screening for proteins that form these data establish the first atlas of human proteins in nuclear foci and characterized their localizations using different nuclear bodies and provide key information for markers of known nuclear bodies. In total, we identified research on nuclear bodies. Complete screen data http://jcb-dataviewer.rupress.org/jcb/browse/6852/S152 (a) Nucleoli are sites of ribosomal DNA transcription, preribosomal RNA processing, and preribosomal assembly. (b) Nuclear speck- Introduction les may serve as storage and/or modification sites for splicing factors and sites for pre-mRNA splicing. In fact, nuclear speck- The nucleus is enclosed by a double-membrane structure termed les are often in close proximity to many active genes, suggest- the nuclear envelope, which serves as a physical barrier to sepa- ing that transcription and RNA splicing are coupled in the cell. THE JOURNAL OF CELL BIOLOGY rate nuclear contents from the cytoplasm. Numerous nuclear (c) Cajal bodies are involved in the assembly and maturation of pores exist as large protein complexes across the nuclear enve- small nuclear RNPs (snRNPs; Spector, 2006). Recently, telom- lope, which allow the transport of water-soluble molecules. In- erase RNA and telomerase reverse transcription were also shown terphase chromosomes occupy distinct subnuclear territories. to localize to Cajal bodies (Zhu et al., 2004; Tomlinson et al., The interchromatin space is also well organized and harbors 2008). (d) PML bodies engage in a multitude of cellular events, multiple nuclear bodies that can be visualized as distinct nuclear including apoptosis, DNA repair, and transcription control, by foci at the microscopic level. To date, nuclear bodies that have sequestering, modifying, and degrading many partner proteins been studied extensively are nucleoli, promyelocytic leukemia (Lallemand-Breitenbach and de Thé, 2010). (e) Paraspeckles (PML) bodies, nuclear speckles, Cajal bodies, paraspeckles, and are involved in nuclear retention of some A-to-I hyperedited Polycomb bodies (Spector, 2006). mRNAs, and such retention is altered upon environmental Tremendous effort has been made and allowed us to stress, which provides a control mechanism for gene expression understand the distinct functions of several nuclear bodies: Correspondence to Junjie Chen: [email protected] © 2013 Fong et al. This article is distributed under the terms of an Attribution– Noncommercial–Share Alike–No Mirror Sites license for the first six months after the pub Abbreviations used in this paper: FBL, fibrillarin; GO, Gene Ontology; IP, immuno lication date (see http://www.rupress.org/terms). After six months it is available under a precipitation; IR, ionizing radiation; PML, promyelocytic leukemia; qRT-PCR, quan Creative Commons License (Attribution–Noncommercial–Share Alike 3.0 Unported license, titative RT-PCR; SMN, survival of motor neuron; snRNP, small nuclear RNP. as described at http://creativecommons.org/licenses/by-nc-sa/3.0/). The Rockefeller University Press $30.00 J. Cell Biol. Vol. 203 No. 1 149–164 www.jcb.org/cgi/doi/10.1083/jcb.201303145 JCB 149 Figure 1. Identification and characterization of proteins in various nuclear subcompartments. (A) Overall schematic flow of this protein localization screen. (B) Representative images of proteins that show colocalization with various nuclear body markers. Bar, 10 µm. (C) The pie chart shows the distribution of proteins in various nuclear subcompartments (148 nucleolar proteins were not included). A total of 325 proteins that formed nuclear foci were identified in this screen, including 148 in the nucleolus, 38 in PML bodies, 27 in nuclear speckles, 24 in paraspeckles, 17 in Cajal bodies, 5 in Sam68 nuclear bodies, 2 in Polycomb bodies, and 64 proteins in uncharacterized nuclear subcompartments (please also see Table S1). (Prasanth et al., 2005). (f) Two classes of complexes designated Results as PRC1 and PRC2 (Polycomb repressive complexes 1 and 2) have been found in Polycomb bodies, which are believed to col- Description and validation of the nuclear laborate to repress gene transcription through epigenetic si- foci screen lencing (Spector, 2006). However, despite the importance of To generate a proteome of nuclear subcompartments, we subcloned these nuclear bodies, their compositions and regulations are still the Human ORFeome v5.1 Library into a Gateway-compatible largely unknown. destination vector. Individual plasmid DNA was transfected There are previous attempts in identifying mammalian into HeLa cells in a 96-well format followed by immunofluor- proteins localized to nuclear subcompartments (Sutherland et al., escence staining of the tagged proteins. Fluorescent images were 2001), which also include proteomic analysis of the nucleolus captured by an automated fluorescence microscope, subcellular (Andersen et al., 2002; Scherl et al., 2002) as well as nuclear localization of each ORF was reviewed with use of MetaXpress speckles (Saitoh et al., 2004). However, an ORFeome-scale sys- software (Molecular Devices), and proteins forming nuclear tematic approach has yet to be conducted. This is especially im- foci were selected for further characterization (Fig. 1 A). portant for the studies of nuclear bodies because these nuclear To estimate the accuracy of our study, we randomly selected bodies have no membrane and are difficult to isolate using tra- 36 proteins in the ORFeome library for which the antibodies ditional biochemical methods. In this study, we took advantage recognizing endogenous proteins are available. By comparing of the available 15,483 ORFs in the Human ORFeome Library the fluorescence intensities in the transfected and untransfected and performed whole-genome screening for proteins localized cells, we estimated that the mean level of overexpression is to distinct nuclear bodies. This study allowed us to expand the 2.35-fold of that of endogenous protein (Fig. S1). Moreover, inventory of components in various nuclear bodies and to con- 34/36 proteins displayed subcellular localization identical to struct the first nuclear body landscape. that of endogenous protein (Fig. S1). These results suggest that 150 JCB • VOLUME 203 • NUMBER 1 • 2013 Figure 2. Comparison of our study with datasets on nuclear subcompartments. Datasets were created for each nuclear subcompartment based on online databases or recent review articles. NOPdb (Ahmad et al., 2009) was used to represent nucleolar proteins. A proteomic analysis of interchromatin granule clusters (Saitoh et al., 2004) was used to represent the nuclear speckles dataset. A PML body interactome analysis (Van Damme et al., 2010) was used to represent proteins in PML bodies. A list of Cajal body proteins from a recent review paper (Machyna et al., 2013) was used to represent proteins in Cajal bodies. A list of paraspeckle proteins from a review (Bond and Fox, 2009) was used to represent proteins in paraspeckles. Venn graphs were used to show the extent of overlapping between an available dataset (green) and our study (blue). A group of proteins that are uniquely identified in our study and have been reported by others in the literature were presented in dark blue. The other groups of proteins that are confirmed by our shRNA screen to be involved in the assembly of Cajal bodies or paraspeckles were presented in yellow. Please also see Table S2 and Table S3. the tagged proteins are only moderately expressed, and most of suggests that our screening complements previous biochemical them exhibit proper localization as their endogenous counterparts. isolation of the nucleolus (Leung et al., 2006; Ahmad et al., 2009) To validate our screening results, we characterized the lo- and allows us to identify novel nucleolar proteins. For nuclear calization of these proteins that display nuclear foci by costain- speckles, 29.6% (8/27) proteins on our list overlapped with those ing with various nuclear foci or nuclear body markers