Yeom Et Al SM V.5.15.B
Total Page:16
File Type:pdf, Size:1020Kb
Supplementary Materials for Tracking pre-mRNA maturation across subcellular compartments identifies developmental gene regulation through intron retention and nuclear anchoring. Kyu-Hyeon Yeom1,†, Zhicheng Pan2,3,†, Chia-Ho Lin1, Han Young Lim1,4, Wen Xiao1, Yi Xing3,5,*, Douglas L. Black1,*. Department of Microbiology, Immunology, and Molecular Genetics, University of California, Los Angeles, Los Angeles, CA 90095 Bioinformatics Interdepartmental Graduate Program, University of California, Los Angeles, CA 90095 Center for Computational and Genomic Medicine, The Children’s Hospital of Philadelphia, Philadelphia, PA 19104 Molecular Biology Interdepartmental Doctoral Program, University of California, Los Angeles, Los Angeles, CA 90095 Department of Pathology and Laboratory Medicine, University of Pennsylvania, Philadelphia, PA 19104 This PDF file includes: Supplemental Figures S1 – S6 List of Supplemental Tables S1 – S9 Supplemental Methods References for Supplemental Materials Supplemental Fig S1 (Related Fig 1) A B mESC mNPC mCtx Log2 (TPM + 1) 1 2 Biological 15 3 : Replicate 10 5 Cyto Nuc Chr Cyto Nuc Chr Cyto Nuc Chr Lin28a 80 0 60 anti-SNRNP70 Nanog Pou5f1 50 anti-TUBA1A Alpl _E14 mESC 20 anti-Histone H3.1 Srsf1 ESC 80 Ptbp1 60 anti-SNRNP70 Nxf1 Nxt1 50 anti-TUBA1A 40 Nes _C2+ anti-GAPDH mNPC 30 Msi1 20 anti-Histone H3.1 NPC Notch1 80 Gfap anti-SNRNP70 60 Stmn1 Tubb3 50 anti-TUBA1A 40 Ptbp2 mCtx 30 anti-GAPDH Dlg4 Neuron _E15 DIV5 20 anti-Histone H3.1 Map2 Syp Cyto : Cytoplasmic Rbfox3 Nuc : Soluble Nucleoplasmic Chr :Chromatin-associated C Poly(A)+ Total mESC mNPC mCtx Chromatin-associated Soluble Nucleoplasmic Cytoplasmic Rep_1 Rep_2 Rep_3 Supplemental Figure S1. (Related to Figure 1) Validation of subcellular fractionation, cell type gene expression, and library consistency. A. Confirmation of subcellular fractionation. Immunoblot analysis of diagnostic proteins in subcellular fractions. SNRNP70 for soluble nucleoplasm (Nuc), TUBA1A and GAPDH for cytoplasm (Cyto), and Histone H3.1 for chromatin pellet (Chr). Gel images include 3 biological replicates of mouse embryonic stem cells (line E14), mouse neuronal progenitor cell line C2+, and mouse cortical neurons after 5 days in vitro culture (E15DIV5; mCtx). Note that the immunoblot results of the third replicate of mESC_E14 are reprinted from Yeom et al. (Yeom et al. 2018). B. Confirmation of cell type specific gene expression. Heatmap presents the cytoplasmic expression as measured by kallisto for the indicated mRNAs in each cell type and replicate. C. Confirmation of library similarity. Heatmap displays similarity of gene expression between pairwise comparisons of all cell types, fractions, and replicates. Color codes are indicated on right. - 2 - Supplemental Fig S2 (Related Figs 1, 6) A 10 kb B 10 kb Chr19: 5,835,000 Chr12: 109,555,000 Neat1 Meg3 Mir-1906-1 Mir-770 59.2 - 80.7 - Chr Chr 59.2 - 80.7 - Nuc Nuc 80.7 - 59.2 - mESC mESC Cyto Cyto 1.5 - 0.2 - Chr Chr 1.5 - 0.2 - Nuc Nuc mNPC 1.5 - mNPC 0.2 - Cyto Cyto Poly(A)+, -Strand 1.9 - Poly(A)+, +Strand 334.2 - Chr Chr 1.9 - 334.2 - Nuc Nuc 1.9 - mCtx 334.2 - mCtx Cyto Cyto 5.1 - 9.3 - Chr Chr 5.1 - 9.3 - Nuc Nuc 5.1 - mESC 9.3 - mESC Cyto Cyto 0.6 - 0.1 - Chr Chr 0.6 - 0.1 - Nuc Nuc 0.6 - mNPC 0.1 - mNPC Cyto Cyto Total, -Strand 0.4 - Total, +Strand 47.3 - Chr Chr 0.4 - 47.3 - Nuc Nuc 0.4 - mCtx 47.3 - mCtx Cyto Cyto C D 5 kb 5 kb Chr16: 20,710,000 Chr2: 11,784,000 Ankrd16 Clcn2 (Chr/Cyto) (Chr/Cyto) 2 2 9.2 - : log 5.0 - : log Chr Chr 9.2 - 5.0 - Nuc Nuc 4.3 4.6 9.2 - mESC 5.0 - mESC Cyto Cyto 4.0 - 13.5 - Chr Chr 4.0 - 13.5 - Nuc Nuc 3.8 2.7 4.0 - mNPC 13.5 - mNPC Poly(A)+, -Strand Cyto Poly(A)+, +Strand Cyto 7.6 - 12.5 - Chr Chr 7.6 - 12.5 - 2.3 Nuc 3.7 Nuc mCtx mCtx 7.6 - 12.5 - Cyto Cyto - 3 - Supplemental Figure S2. (Related to Figures 1 and 6) Example genome browser tracks of non-coding and coding RNAs. A. Neat1 expression in mESC, mNPC, and mCtx. Genome browser tracks of the Neat1 locus for poly(A)+ and total libraries. Y axis shows RPM scaled to the highest value in the Chromatin-associated fraction. B. Meg3 expression in mESC, mNPC, and mCtx. Genome browser tracks of the Meg3 locus in poly(A)+ and total libraries. C. Genome browser tracks of the Clcn2 locus in mESC, mNPC, and mCtx. Transcripts are enriched in the chromatin fraction and exhibit unspliced introns in poly(A)+ RNA. The partition index of Clcn2 in each cell type is indicated on the right. D. Genome browser tracks of the Ankrd16 locus in mESC, mNPC, and mCtx. Transcripts are enriched in the chromatin fraction and exhibit unspliced introns in poly(A)+ RNA. Partition index of Ankrd16 in each cell type is indicated on the right. - 4 - Supplemental Fig S3 (Related Fig 2) 500 kb Chr16: 6,900,000 Rbfox1 8.2 - Total mCtx +Strand Chr 5.8 - Poly(A)+ +Strand Chr Supplemental Figure S3. (Related to Figure 2) Very long introns exhibit declining reads 5’ to 3’ to create a sawtooth pattern. Genome browser tracks of the Rbfox1 locus for poly(A)+ and total libraries. Y axis shows RPM in each library. - 5 - Supplemental Fig S4 (Related Figs 2, 3) A FI (Fraction of intron Inclusion) B Intron types EI IE 149,333 U introns 16,989 E introns 46,964 I introns 63,991 EI introns in 28,733 genes in 9,883 genes in 16,210 genes in 16,929 genes have overlap with overlap with overlap with EE unique annotation exon segments other introns segments of EI + IE both exons and other introns 2 FI = EI + IE + EE 2 C Initial filters FI in Total FI in Poly(A)+ Chromatin expression TPM ≥ median Poly(A)+_Chromatin ≥ 0.1 :Post-Tx Total_Chromatin ≥ 0.1 & 3’ unbiased transcripts Poly(A)+_Chromatin < 0.1:Co-Tx D E F Post- Co- U Introns transcriptional transcriptional Any First Total U intron Middle Introns adjacent cassette exon Poly(A)+ Last Intron location Chromatin 60 40 20 0 20 40 60 (%) Intron Intron Co-Tx Post-Tx upstream of downstream of cassette exon cassette exon G Initial filters Select genes Define gene group Gene group A has group A introns only Chromatin expression TPM ≥ median Genes have Gene group B has group B or A introns & 3’ unbiased transcripts only U introns Gene group C has group C or B or A introns :13,036 genes : 1,502 genes Gene group D has group D or C or B or A introns : 751 genes U intron Examples: Gene group A A A A na A Gene group B A B A na A Gene group C A B C na A Gene group D H D B C na A 3 0 (Chr/Cyto) mESC 2 −3 Log A B C D : Gene group 280 301 150 20 : Genes with only U introns - 6 - Supplemental Figure S4. (Related to Figures 2 and 3) Computational definition of introns and splicing. A. Determination of FI value using read numbers for the exon-intron junction (EI), intron- exon junction (IE), and exon-exon junction (EE). B. Introns were categorized as one of four types based on their Ensembl v91 annotation. Introns that are not partly overlapped with either exons or other introns are classified as U type introns. Introns that partly overlap with exons but not with other introns are classified E type introns. Introns that overlap with other annotated introns but not exons are called I type introns. EI type introns overlap with both exons and introns of other annotated isoforms. C. Determination of cotranscriptional and posttranscriptional splicing. FI values were determined for all U introns from total and poly(A)+ chromatin associated RNA. Genes with overall expression above the median (2.13 TPM) were analyzed. Genes showing a bias for reads in the 3’ end in the poly(A)+ RNA, and introns exhibiting FI values in total RNA below 0.1 were removed. A posttranscriptional splicing event was then defined as an intron having an FI value in poly(A)+ RNA greater than or equal to 0.1 (Post-tx). Cotranscriptional splicing of an intron generates an FI of less than 0.1 in the poly(A)+ RNA (Co-tx). D. Illustration of post and cotranscriptional splicing. Introns with high read numbers on chromatin in both the total and poly(A)+ libraries were defined as posttranscriptionally spliced. Cotranscriptional splicing events exhibited reads in the total but not the poly(A)+ RNA. E. Diagrams of constitutive U introns and I introns adjacent to simple cassette exons that were assessed for co- and posttranscription splicing as presented in Figure 2C. F. The proportions of co- and posttranscriptional splicing for all U introns and for first, middle and last introns in a transcript. G. Transcripts with unspliced introns are enriched in the chromatin fraction. Genes having only U introns were selected from those whose overall expression was above the median (2.13 TPM). The gene group was then defined by the highest intron group within the gene (751 genes), where D > C > B > A. Introns marked ‘na’ indicate they were filtered by SIRI during X-means clustering. H. Violin plots showing the distribution of chromatin partition indices (Log2(Chr/Cyto)) of transcripts from different gene groups defined above. The number of genes in each gene group is indicated at the bottom.