
Use of the Fluidigm C1 platform for RNA sequencing of single mouse pancreatic islet cells

Yurong Xin^a, Jinrang Kim^a, Min Ni^a, Yi Wei^a, Haruka Okamoto^a, Joseph Lee^a, Christina Adler^a, Katie Cavino^a, Andrew J. Murphy^a, George D. Yancopoulos^a,1, Hsin Chieh Lin^a, and Jesper Gromada^a,1

^aRegeneron Pharmaceuticals, Tarrytown, NY 10591

Contributed by George D. Yancopoulos, February 11, 2016 (sent for review January 21, 2016; reviewed by Philipp Sherer and Lori Sussel)

CELL BIOLOGY

This study provides an assessment of the Fluidigm C1 platform for RNA sequencing of single mouse pancreatic islet cells. The system combines microfluidic technology and nanoliter-scale reactions. We sequenced 622 cells, allowing identification of 341 islet cells with high-quality expression profiles. The cells clustered into populations of α-cells (5%), β-cells (92%), δ-cells (1%), and pancreatic polypeptide cells (2%). We identified cell-type–specific transcription factors and pathways primarily involved in nutrient sensing and oxidation and cell signaling. Unexpectedly, 281 cells had to be removed from the analysis due to low viability, low sequencing quality, or contamination resulting in the detection of more than one islet hormone. Collectively, we provide a resource for identification of high-quality datasets to help expand insights into genes and pathways characterizing islet cell types. We reveal limitations in the C1 Fluidigm cell capture process resulting in contaminated cells with altered gene expression patterns. This calls for caution when interpreting single-cell transcriptomics data using the C1 Fluidigm system.

single-cell RNA sequencing | pancreatic islet cells | Fluidigm C1 | glucagon

Significance

Pancreatic islets are complex structures composed of four cell types whose primary function is to maintain glucose homeostasis. Owing to the scarcity and heterogeneity of the islet cell types, little is known about their individual gene expression profiles. Here we used the Fluidigm C1 platform to obtain high-quality gene expression profiles of each islet cell type from mice. We identified cell-type–specific transcription factors and pathways providing previously unrecognized insights into genes characterizing islet cells. Unexpectedly, our data uncover technical limitations with the C1 Fluidigm cell capture process, which should be considered when analyzing single-cell transcriptomics data.

Author contributions: Y.X., Y.W., H.O., H.C.L., and J.G. designed research; J.K., M.N., J.L., C.A., and K.C. performed research; Y.X., J.K., Y.W., H.O., J.L., and J.G. analyzed data; and Y.X., A.J.M., G.D.Y., H.C.L., and J.G. wrote the paper.

Reviewers: P.S., University of Texas Southwestern Medical Center; and L.S., Columbia University.

Conflict of interest statement: All authors are employees and shareholders of Regeneron Pharmaceuticals.

Data deposition: The data reported in this paper have been deposited in the Gene Expression Omnibus (GEO) database, www.ncbi.nlm.nih.gov/geo (accession no. GSE77980).

1To whom correspondence may be addressed. Email: [email protected] or Jesper. [email protected].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1602306113/-/DCSupplemental.

Islets of Langerhans are miniature endocrine organs within the pancreas that are essential for control of blood glucose levels (1). They are composed of four endocrine cell types producing glucagon (α-cells), insulin (β-cells), somatostatin (δ-cells), and pancreatic polypeptide (PP cells). Whole-genome transcriptome analysis has been performed on enriched populations of human and mouse α- and β-cells (2–4). These studies report the ensemble average on the cell populations and do not report variation in expressed genes among cells. The studies also do not allow study of the low abundant δ-cells and PP cells. In addition, data interpretation in these analyses can be affected by the presence of a few contaminating cells. Single-cell RNA sequencing circumvents these problems and has recently been applied to a low number of human pancreatic islet cells (5) as well as to other cell types in complex tissues (6–13). Pancreatic islet cells are suited for single-cell RNA sequencing because they express high levels of a single hormone. This allows for unequivocal identification and unbiased understanding of gene expression in each cell type.

Here we used the C1 Fluidigm system to analyze the transcriptome of dispersed mouse pancreatic islet cells. We also studied how the capture process affected cell quality and contamination. We report identification of all islet cells with high-quality gene expression profiles. Unexpectedly, our data uncover technical limitations with the cell capture, which should be considered when analyzing single-cell transcriptomics data.

Results

Islet Cell Identity. RNA FISH simultaneously using probes to glucagon (Gcg), insulin (Ins2), somatostatin (Sst), and pancreatic polypeptide (Ppy) showed that 99.2% (n = 15,542) of islet cells used for single-cell RNA sequencing expressed high levels of one hormone (Fig. 1A). The distribution of the cell types is shown in Fig. 1B. The intensity distributions of the fluorescence signal were bell-shaped, suggesting one cell population for each cell type (Fig. 1C). Interestingly, we detected few cells that coexpressed Gcg-Ppy (0.8%; n = 125) (Fig. 1D). Using RNA FISH and immunohistochemistry in sections from mice, we confirmed the existence of rare Gcg+-Ppy+ cells (SI Appendix, Fig. S1). Consistent with the high sensitivity of RNA FISH (14), we detected low levels (0.02–0.3%) of other endocrine hormones in each single hormone-expressing cell. These data show that the dissociated islet cell preparations used for single-cell RNA sequencing consist nearly exclusively of single hormone-expressing cells.

Viability of Captured Cells. We used two methods to determine viability of the captured cells. The first method is based on LIVE/DEAD staining in the C1 Single-Cell Auto Prep System. We found 77% live (LIVE+) cells, 2% dead (DEAD+) cells, and 21% cells that stained positive for both (LIVE+/DEAD+). Viability of the islet cells before capture was 78 ± 16% (n = 9 preparations). The second approach uses unsupervised hierarchical clustering of the top 100 variable genes in the sequenced cells. We used 622 cells from nine preparations for the analysis, after excluding 34 cells where debris or contaminating cells were observed (SI Appendix, Fig. S2). Twelve acinar cells were detected [≥1 reads per kilobase per million (RPKM) for ≥2 of the following genes: Amy2a5, Amy2b, or Pnlip]. This represents an exocrine contamination rate of 1.9% (12/622 cells). Two distinct cell clusters were identified: cells with low (cluster 1) or high viability (cluster 2) (Fig. 2A). Of note, mitochondrial genome-encoded genes are more abundantly expressed in cells in cluster 1. In particular, ATP6, ATP8, COX1, COX2, COX3, CYTB, ND1, Rnr2, and LOC100503946 are highly up-regulated and assigned

www.pnas.org/cgi/doi/10.1073/pnas.1602306113 PNAS | March 22, 2016 | vol. 113 | no. 12 | 3293–3298

Fig. 1. Mouse islet cells rarely express more than one hormone. (A) Representative RNA FISH images of single mouse islet cells expressing glucagon (Gcg), insulin (Ins2), somatostatin (Sst), or pancreatic polypeptide (Ppy). (B) Distribution of islet cells: Gcg, 4.6% (727); Ins2, 84.5% (13,242); Sst, 5.9% (922); Ppy, 4.2% (651); Gcg-Ppy, 0.8% (125); total, 15,667. (C) Intensity distribution histograms (cell number vs. fluorescent intensity) of Gcg+, Ins2+, Sst+, or Ppy+ cells. (D) Representative RNA FISH images of Gcg+-Ppy+ cells.
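As a quick arithmetic check, the percentages in Fig. 1B and the 99.2% single-hormone figure can be recomputed from the cell counts shown in the panel. The following is a minimal Python sketch; the variable names are illustrative and do not come from the paper.

```python
# Cell counts per RNA FISH category, as shown in Fig. 1B.
fish_counts = {
    "Gcg": 727,
    "Ins2": 13242,
    "Sst": 922,
    "Ppy": 651,
    "Gcg-Ppy": 125,
}

total_cells = sum(fish_counts.values())

# Percentage of each category, rounded to one decimal place as in the figure.
fish_percent = {
    name: round(100 * n / total_cells, 1) for name, n in fish_counts.items()
}

# Single-hormone cells are all cells except the Gcg-Ppy double positives.
single_hormone = total_cells - fish_counts["Gcg-Ppy"]
single_hormone_percent = round(100 * single_hormone / total_cells, 1)

print(total_cells, fish_percent, single_hormone, single_hormone_percent)
```

The recomputed values agree with the reported total of 15,667 cells, of which 15,542 express a single hormone.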

as the cell viability gene set (Methods). These genes account for >30% of total expression in RPKM. Fig. 2B shows that the median expression of the cell viability gene set is 12-fold higher (P = 5.6e−23) in cluster 1 cells, whereas the expression of all other genes is 285-fold (P = 6.0e−23) reduced. Fig. 2C shows the distribution of the sequenced cells according to their viability score (Methods). Cells with a score >0.3 are likely to be of low quality and were removed from the analysis. We found no pattern of changes in cell quality throughout the C1 Fluidigm circuit (SI Appendix, Fig. S3). In total, the assessments of cell quality resulted in removal of 65 cells (10%; SI Appendix, Fig. S2).

Characterization of Sequenced Islet Cells. Each sample was sequenced to an average depth of 1 million read pairs (SI Appendix, Table S1). This sequencing depth was sufficient to detect expressed genes and did not improve when 64 cells were resequenced at 14.8 million read pairs (SI Appendix, Fig. S4A), even when considering low or highly expressed genes (SI Appendix, Fig. S4B). Average single-cell expression agreed well with two matching intact islet samples (r = 0.88 and 0.89) (SI Appendix, Fig. S5). The stochastic nature of single-cell gene expression seems highest for genes with average RPKM values <100 (SI Appendix, Fig. S6). The sequenced cells were also evaluated for technical quality (Methods). Thirty-seven cells failed to meet our criteria and were removed from further analysis.

Islet cell types were defined by their expression of Gcg (α-cell), Ins2 (β-cell), Sst (δ-cell), and Ppy (PP cell). Unexpectedly, of the 520 cells that passed viability and quality control assessments, only 341 cells (66%) expressed one hormone. Among the remaining 179 cells, 10 cells expressed low levels of any hormone (2%), whereas 169 cells (33%) expressed high levels of two or more hormones. These multiple-hormone–expressing cells showed gene profiles reminiscent of fused cells (Fig. 3A). This is supported by principal component analysis because the coexpressing cells typically cluster between their corresponding single-hormone–expressing cells (Fig. 3B). For example, Gcg+-Ins2+ cells mainly cluster between Gcg+ and Ins2+ cells (SI Appendix, Fig. S7). This, combined with the RNA FISH data of the input islet cell suspensions (cf. Fig. 1), suggests that nearly all multiple-hormone–expressing cells are artifacts that arise during the cell capture process due to damage or cell–cell fusion. Therefore, the cells that coexpress more than one hormone were excluded from subsequent analysis (SI Appendix, Fig. S2). Fig. 3C shows the distribution of the remaining single-hormone–expressing islet cells. The cells clustered into populations of α-cells (5%), β-cells (92%), δ-cells (1%), and PP cells (2%), matching the distribution in the input islet cell suspensions measured by RNA FISH. Fig. 3C also shows that each cell expresses low levels (0.003–0.27%) of other endocrine hormones. Total number of detected genes varied between 3,900 and 5,300 (SI Appendix, Table S2). These data show that single-cell capture and lower coverage sequencing can be used to profile gene expression of islet cells. The data reveal important technical limitations of the cell capture process resulting in a large number of cells with false expression patterns.

Transcription Factor Expression. Previous work suggests that 150–300 transcription factors are expressed in mammalian tissues and constitute 5–8% of all expressed genes (15). Consistent with these data, we detected 372 out of 721 curated transcription factors (7.0–9.5% of expressed genes) with average RPKM ≥1 in at least one cell type (Fig. 3D and Dataset S1). Owing to the low number of identified δ-cells and PP cells and the stochastic nature of gene expression (cf. SI Appendix, Fig. S6), we limited the analysis to 42 abundant or previously reported transcription factors (Fig. 3E). The heat map is sorted by average expression in β-cells. Interestingly, α-cells and PP cells have similar expression patterns for transcription factors. These cells showed enriched expression for Arx, Mafb, , Atf3, Fosb, and Id3. α-Cells selectively expressed Pou3f4 (Brn4). On the contrary, δ-cells have the most distinct expression pattern. Hhex and Neurog3 are only expressed in this cell type and Pa2g4, Egr1, and Fos have

[Fig. 2 panels. (A) Heat map, color scale log2(RPKM+1), with viability score bar (high/low) above clusters 1 and 2. (B) Median RPKM of gene set per cell, viability genes vs. rest genes, in clusters 1 and 2. (C) Number of cells vs. viability score, cutoff = 0.3.]

Fig. 2. Identification of islet cells with low viability. (A) Hierarchical clustering of top 100 variable genes in 622 mouse islet cells. Cluster 1 in the heat map characterizes cells with low viability (high viability score). Cluster 2 shows gene expression of cells mainly with high viability (low viability score). Viability score for each cell is depicted in the top horizontal bar. The gene set defining the viability score is marked by a black vertical bar left to the heat map. (B) Median expression of viability gene set and all other expressed genes in cells in cluster 1 and 2. (C) Distribution of cells according to their viability score. Cells with a viability score >0.3 (indicated by red dotted line) were removed from further analysis.
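The viability score used in Fig. 2 is the fraction of a cell's total expression that comes from the nine mitochondrial-genome genes named in the text, with cells scoring above 0.3 treated as low viability. The sketch below illustrates that calculation in Python under stated assumptions: each cell is represented as a hypothetical dictionary mapping gene name to RPKM, and the function names and example values are invented for illustration rather than taken from the authors' code.

```python
# Viability gene set as listed in the text.
VIABILITY_GENES = (
    "ATP6", "ATP8", "COX1", "COX2", "COX3",
    "CYTB", "ND1", "Rnr2", "LOC100503946",
)
CUTOFF = 0.3  # cells scoring above this were removed in the paper


def viability_score(rpkm):
    """Sum of RPKM over the viability genes divided by the sum over all genes."""
    total = sum(rpkm.values())
    if total <= 0:
        raise ValueError("cell has no detected expression")
    return sum(rpkm.get(gene, 0.0) for gene in VIABILITY_GENES) / total


def is_low_viability(rpkm, cutoff=CUTOFF):
    """True when the score exceeds the cutoff (a high score means low viability)."""
    return viability_score(rpkm) > cutoff


# Hypothetical cells: values are made up to show both outcomes.
healthy_cell = {"ATP6": 50.0, "COX1": 50.0, "Ins2": 900.0}
damaged_cell = {"ATP6": 400.0, "COX1": 300.0, "Ins2": 300.0}

print(viability_score(healthy_cell), is_low_viability(healthy_cell))
print(viability_score(damaged_cell), is_low_viability(damaged_cell))
```

Note the inverted sense of the score: a high value indicates that mitochondrial transcripts dominate the profile, which the paper associates with permeable or dying cells.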

enriched expression. δ-Cells are also characterized by lack of expression of Id3, Hdac2, Sin3a, Hnf1a, and Klf4 (Fig. 3E). β-Cells coexpress Pdx1, Nkx6-1, Nkx2-2, Pax6, and Neurod1, consistent with previous descriptions of genes expressed in mature β-cells (16, 17). Pax4 was not detected and MafA had expression <1 RPKM. These data confirm and expand our understanding of transcription factor expression in islet cells.

Enriched and Abundant α- and β-Cell Genes. We identified 26 enriched genes in α-cells and 151 genes in β-cells. The average expression is summarized in Datasets S2 and S3. It is important to note that extensive variation in expression was observed for many of the genes (SI Appendix, Figs. S8–S10). Despite the variation in gene expression, we did not observe subpopulations of β-cells (SI Appendix, Fig. S11). The lower number of α-cells precluded meaningful subgroup analysis. Pathway analysis revealed that the genes were enriched in 18 pathways and molecular function gene sets (Fig. 4A). The primary function of the β-cell is to sense glucose and mount an appropriate insulin secretory response. Consistent with this, we found that genes involved in the sensing of glucose, cell signaling, and exocytosis were enriched in β-cells. In α-cells, gene enrichment analysis identified pathways regulating cell proliferation and signaling as well as synthesis and modification (Fig. 4A). We also identified abundantly expressed genes (average RPKM >100) in α- and β-cells. The average expression for these genes is summarized in Datasets S4 and S5. Pathway analysis identified the top three functional gene sets as pathways involved in oxidative phosphorylation, mitochondrial dysfunction, and EIF2 signaling (Fig. 4B). Collectively, the pathway gene enrichment analyses confirm function of the identified α- and β-cells.

Discussion

Our data show that the C1 Fluidigm platform can be used for single-cell RNA sequencing, allowing identification of all islet cell types. We also demonstrate that half of the cells were damaged during the capture process, resulting in markedly altered gene expression patterns. Therefore, we have developed a workflow that allows identification of low-quality and contaminated cells. This critical evaluation of each captured and sequenced cell is possible because islet cells express high amounts of one hormone, allowing for unequivocal identification and unbiased understanding of gene expression profiles. The workflow can be adapted to any cell type with a distinct molecular gene signature. This is, however, not always possible, calling for caution when interpreting single-cell transcriptomics data using the C1 Fluidigm system.

RNA FISH analysis revealed that 99.2% of mouse islet cells express high levels of one hormone. Consistent with a previous report (16), we observed few Gcg+-Ppy+ cells. These double-hormone–positive cells are unlikely to be artifacts arising from the cell isolation procedure because they were also observed in intact islets in pancreas sections using RNA FISH and immunofluorescence staining. It is important to emphasize that islet cells do express very low levels (0.003–0.3%) of other endocrine hormones, consistent with a previous study (18). This could reflect low-level contamination, but if real the functional significance remains to be determined.

Our workflow revealed that 45% of captured cells did not meet our inclusion criteria for final analysis. Because the capture rate was 76% (656 captured cells/864 capture sites), the overall

Fig. 3. Gene expression profiles of islet hormone-expressing cells. (A) Top 100 most significant genes in 510 cells with the following hormone-expressing patterns (color coding shown in B): Gcg+ (n = 18), Ins2+ (n = 313), Sst+ (n = 4), Ppy+ (n = 6), Gcg+-Ins2+ (n = 42), Gcg+-Ins2+-Ppy+ (n = 30), Ins2+-Sst+ (n = 9), Ins2+-Sst+-Ppy+ (n = 32), Ins2+-Ppy+ (n = 22), Gcg+-Ppy+ (n = 11), Gcg+-Sst+-Ppy+ (n = 2), Sst+-Ppy+ (n = 19), and Gcg+-Ins2+-Sst+-Ppy+ (n = 2). (B) Principal component analysis of the 510 cells. The first two principal components (PC1, PC2) are depicted and each symbol represents a cell, and cells are color-coded by hormone-expressing pattern. (C) Distribution of Gcg+ (α-cells; n = 18), Ins2+ (β-cells; n = 313), Sst+ (δ-cells; n = 4), and Ppy+ (PP cells; n = 6). Each column represents gene expression in one cell. (D) Expression pattern of 721 transcription factors in single mouse islet cells. (E) Average expression of 42 abundant or known islet cell transcription factors in mouse α-cells (n = 18), β-cells (n = 313), δ-cells (n = 4), and PP cells (n = 6). Color scales show log2(RPKM+1).

[Fig. 3E row labels: Fos, Egr1, Atf4, Creb3, Xbp1, Ctnnb1, Isl1, Tcf25, Hmgb1, Id3, 0610031J06Rik, Stat3, Ddit3, Tshz1, Gatad1, Pax6, Fosb, Neurod1, Pa2g4, Insm1, Pura, Jun, Gpbp1, Foxo1, Atf3, Foxa2, Nkx2-2, Pdx1, Nkx6-1, Tbp, Bhlha15, Sin3a, Irf9, Hnf1b, Mafb, Etv1, Hnf1a, Klf4, Hhex, Neurog3, Pou3f4, Arx.]
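The per-pattern counts in the Fig. 3 legend can be tied back to the headline numbers in the text (341 single-hormone cells, 169 multiple-hormone cells, 27% of sequenced cells coexpressing hormones, and 39% overall efficiency). The Python sketch below redoes that bookkeeping. The counts, 622 sequenced cells, and 864 capture sites come from the paper; the variable and function names are illustrative only.

```python
# Cells per hormone-expression pattern, from the Fig. 3 legend.
pattern_counts = {
    "Gcg+": 18, "Ins2+": 313, "Sst+": 4, "Ppy+": 6,
    "Gcg+-Ins2+": 42, "Gcg+-Ins2+-Ppy+": 30, "Ins2+-Sst+": 9,
    "Ins2+-Sst+-Ppy+": 32, "Ins2+-Ppy+": 22, "Gcg+-Ppy+": 11,
    "Gcg+-Sst+-Ppy+": 2, "Sst+-Ppy+": 19, "Gcg+-Ins2+-Sst+-Ppy+": 2,
}

SEQUENCED_CELLS = 622  # cells used for analysis (text)
CAPTURE_SITES = 864    # total capture sites (text)


def n_hormones(pattern):
    """Number of hormones in a pattern label such as 'Gcg+-Ins2+'."""
    return pattern.count("+")


single = sum(n for p, n in pattern_counts.items() if n_hormones(p) == 1)
multi = sum(n for p, n in pattern_counts.items() if n_hormones(p) > 1)

# Share of each cell type among single-hormone cells, in whole percent.
cell_type_percent = {
    p: round(100 * n / single)
    for p, n in pattern_counts.items()
    if n_hormones(p) == 1
}

multi_percent_of_sequenced = round(100 * multi / SEQUENCED_CELLS)
overall_efficiency_percent = round(100 * single / CAPTURE_SITES)

print(single, multi, cell_type_percent)
print(multi_percent_of_sequenced, overall_efficiency_percent)
```

The totals reproduce the 5%, 92%, 1%, and 2% split for α-, β-, δ-, and PP cells, and the 510 cells shown in panels A and B.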

efficiency of the C1 Fluidigm system was 39%. Surprisingly, 27% of sequenced cells (169/622 cells) coexpressed more than one endocrine hormone. These cells are most likely artifacts because the islet cell suspension used for cell capture consisted of 99% single-hormone–expressing cells. The high sensitivity of RNA FISH and the detection of rare Gcg-Ppy double-positive cells make it unlikely that the other double-hormone–positive cells detected by the C1 Fluidigm system are real. The flow or pressure in the microfluidics system of the C1 cell capture circuit might somehow cause transient cell damage or cell–cell fusion. Our gene expression analysis suggests that cell–cell fusion might be frequently occurring in the C1 capture circuit. These liabilities hamper important utilities of the C1 Fluidigm system in islet cell research and possibly in all areas of biology.

Islet cells express between 3,900 and 5,300 genes, yet only 26 genes were enriched in α-cells and 151 genes in β-cells. This implies that a small number of genes control cell identity. This is perhaps not surprising because it has previously been shown that only three transcription factors control endocrine cell fate (19, 20). Our transcription factor analysis revealed few differences in their expression between α- and β-cells, as well as between α-cells and PP cells. δ-Cells had the most distinct transcription factor expression profile. The high degree of similarity between the islet cell types at the mRNA level might be important for cells to undergo, for example, to meet metabolic demand (21, 22). Despite the great similarity between islet cells, we rarely detected cells that coexpress endocrine hormones. We observed high variability in the expression of many genes, yet we were unable to identify distinct subpopulations of β-cells. This was a surprising finding but might reflect the relatively high degree of stochastic variation in gene expression. In particular, we found that the variation in expression was highest for genes with RPKM <100, which represent the majority of the genes.

In conclusion, we describe the utility of the C1 Fluidigm platform to identify and sequence all mouse islet cell types. The islet cells show overall similar expression profiles, suggesting that few genes are likely to control cell identity and function. We uncovered liabilities in the C1 Fluidigm cell capture process leading to a high number of contaminated cells with markedly altered gene expression profiles. We describe a workflow that allows identification of low-quality and contaminated cells. We suggest adapting this workflow when analyzing single-cell RNA sequencing data for any cell type using the C1 Fluidigm platform.

Methods

Islet Cells. C57BL/6 mice (males, 3–7 mo of age; Taconic) were housed in a controlled environment (12-h light/dark cycle, 22 ± 1 °C, 60–70% humidity) and fed standard chow for ad libitum consumption (Purina Laboratory Rodent Diet 5001; LabDiet). All animal procedures were conducted in compliance with protocols approved by the Regeneron Pharmaceuticals Institutional Animal Care and Use Committee. Islets were isolated by density gradient separation after perfusing the pancreas with Liberase TL (Roche) through the common bile duct. Following digestion for 13 min at 37 °C, the pancreas solution was washed and filtered through a 400-μm wire mesh strainer and islets were separated by Histopaque gradient centrifugation (Sigma). Islets were cultured in RPMI-1640 medium with 10% (vol/vol) FBS, 10 mM Hepes, 50 μM β-mercaptoethanol, 1 mM sodium-pyruvate, 100 U/mL penicillin, and 100 μg/mL streptomycin at

37 °C with 5% CO2 in air atmosphere. Following overnight incubation, islets were hand-picked and enzymatically digested at 37 °C for 11 min using TrypLE Express (Life Technologies). Dissociated cells were filtered through a 40-μm cell strainer and suspended in RPMI-1640 medium.

Cell Capture, RNA Isolation, and Library Construction. Single islet cells in RPMI-1640 medium (300–500 cells/μL) were mixed (3:2 ratio) with C1 Cell Suspension Reagent (Fluidigm) before loading onto a 10- to 17-μm-diameter C1 Integrated Fluidic Circuit (IFC; Fluidigm). LIVE/DEAD staining solution was prepared by adding 2.5 μL ethidium homodimer-1 and 0.625 μL calcein AM (Life Technologies) to 1.25 mL C1 Cell Wash Buffer (Fluidigm) and 20 μL was loaded onto the C1 IFC. Each capture site was carefully examined under a Nikon microscope in bright field, GFP, and Texas Red channels for cell doublets and viability. Cell lysing, reverse transcription, and cDNA amplification were performed on the C1 Single-Cell Auto Prep IFC, as specified by the manufacturer (protocol 100-7168 E1). The SMARTer Ultra Low RNA Kit (Clontech) was used for cDNA synthesis from the single cells. Illumina NGS library was constructed with Nextera XT DNA Sample Prep kit (Illumina), according to the manufacturer’s recommendations (protocol 100-7168 E1). Sequencing was performed on Illumina HiSeq2500 (Illumina) rapid mode by multiplexed single-read run with 50 cycles.

RNA Sequencing Data Analysis. Raw sequence data (BCL files) were converted to FASTQ format via Illumina Casava 1.8.2. Reads were decoded based on their barcodes. Read quality was evaluated using FastQC (www.bioinformatics.babraham.ac.uk/projects/fastqc/). Reads were mapped to the reference genome (mouse: GRCm38) using CLC bio Genomics Workbench Version 7.0 (CLC Bio) allowing one mismatch. Reads mapped to the exons of a gene were summed at the gene level. Sequencing statistics including mapped counts, unique counts, and intron and intergenic counts were summarized and checked for outliers. Expression correlation between samples was calculated using log2-scale Pearson correlation. Principal component analysis was used to examine outliers and potential batch effect. Depending on variances explained in principal components, we used the first three to five components to select top genes accounting for most of the variances. The top 100 genes were selected based on maximum loading of the first few principal components (8). Cells were removed from further analysis using the following criteria: (i) exocrine pancreas contamination (≥1 RPKM) based on marker genes (Amy2a5, Amy2b, and Pnlip); (ii) permeable or dying cells with >0.3 viability score (discussed below) and/or dead cells based on DEAD/LIVE cell staining; (iii) cells with high expression of nonislet tissue markers; (iv) cells with <100,000 exon counts, low exon-to-mapped ratio (<0.2), or high intergenic-to-mapped ratio (>0.3); and (v) outliers using hierarchical clustering. Mann–Whitney test was used for statistical analysis of the data in Fig. 2B. Single-cell transcriptome data are deposited in the Gene Expression Omnibus.

Identification of Islet Cells. Islet cell types were identified using densityMclust (Mclust in R) to estimate bimodal expression distribution of Gcg, Ins2, Sst, and Ppy. To obtain more reliable cell-type identification, cells were excluded if their expression is >2 SDs from the average of the high expression mode.

Cell Viability Score. We defined the following viability score to measure the quality of cells:

$$\text{viability score} = \frac{\sum_i g_i}{\sum_j G_j},$$

where $g_i$ is one of the abundant genes (ATP6, ATP8, COX1, COX2, COX3, CYTB, ND1, Rnr2, and LOC100503946) and $G_j$ is one of the annotated genes. A viability score >0.3 was used to identify cells with low viability.

Evaluation of Gene Expression Variation. One-way ANOVA was used to identify gene expression variation in cells with the following hormone expression patterns: Gcg+, Ins2+, Sst+, Ppy+, Gcg+-Ins2+, Gcg+-Ins2+-Ppy+, Ins2+-Sst+, Ins2+-Sst+-Ppy+, Ins2+-Ppy+, Gcg+-Ppy+, Gcg+-Sst+-Ppy+, Sst+-Ppy+, and Gcg+-Ins2+-Sst+-Ppy+. After filtering out genes with average expression ≤5 RPKM in α-cells, β-cells, δ-cells, and PP cells, the top 100 most significant genes were selected based on false discovery rate to illustrate gene signatures.

Cell-Type Enriched and Abundant Genes. Expressed genes were defined by ≥1 RPKM. Abundant genes were defined by (i) average expression >100 RPKM in the selected cell type and (ii) >50% of cells in the selected cell type with expression >1 RPKM. DESeq2 was used to identify enriched α- and β-cell genes according to (i) expression in the selected cell type is >10-fold compared with the other cell type, (ii) false discovery rate <0.01, (iii) average expression in the selected cell type >10 RPKM, and (iv) >50% of cells in the selected cell type with expression >1 RPKM.

RNA in Situ Hybridization and Immunofluorescence. Dissociated mouse islet cells were fixed in 10% neutral buffered formalin and centrifuged onto charged slides. Whole mouse pancreata were fixed in 10% neutral buffered formalin. After fixation pancreata were paraffin-embedded and sectioned onto slides. For RNA analysis, pancreas tissue and cells were permeabilized and hybridized with combinations of mRNA probes for mouse Gcg, Ins2, Sst, and Ppy, according to the manufacturer’s instructions (Advanced Cell Diagnostics). A fluorescent kit was used to amplify mRNA signal. For protein analysis, pancreas sections were stained with a combination of an antiglucagon (REGN745, an antiglucagon monoclonal antibody generated in-house), an antiinsulin (Dako A0564), an antisomatostatin (Sigma SAB4502861), or an antipancreatic polypeptide (Sigma SAB2500747) antibody. Fluorescent signal was detected using a microscope slide scanner (Zeiss Axio Scan.Z1). Islet cell types were quantified using the HALO image analysis using the Cytonuclear Fluorescence module (Indica Labs).

Fig. 4. α-Cell and β-cell pathways with enriched or abundant genes. (A) Pathways and functional gene sets with enriched genes in α-cells (n = 18) and β-cells (n = 312). (B) Pathways and functional gene sets with abundant genes in α-cells (n = 18) and β-cells (n = 312). Color scales show −log10(P value).

[Fig. 4A row labels: Concentration of D-glucose; Homeostasis of D-glucose; Quantity of carbohydrate; Metabolism of nucleotide; Maturity onset diabetes of young (MODY) signaling; Synthesis of nucleotide; Exocytosis; FXR/RXR activation; Synthesis of lipid; Metabolism of D-glucose; Metabolism of cyclic nucleotides; G protein signaling mediated by tubby; Endoplasmic reticulum stress response; Proliferation of cells; Glutamate signaling; Tetramerization of protein; Homotetramerization of protein; Abnormal quantity of retinol.]

[Fig. 4B row labels: Oxidative phosphorylation; Mitochondrial dysfunction; EIF2 signaling; Synthesis of protein; Metabolism of protein; Translation; Regulation of eIF4 and p70S6K signaling; Translation of protein; mTOR signaling; Expression of protein; Metabolism of nucleoside triphosphate; Phagosome maturation; Translation of mRNA; Expression of mRNA; Metabolism of ATP; Metabolism of nucleotide; Metabolism of purine nucleotide; Metabolism of nucleic acid component or derivative; Synthesis of ATP; Processing of rRNA; Biosynthesis of nucleoside triphosphate; Metabolism of reactive oxygen species; Synthesis of purine nucleotide; Synthesis of reactive oxygen species.]
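Several of the cell-exclusion criteria listed under RNA Sequencing Data Analysis are simple numeric thresholds, so they can be expressed as a small filter. The Python sketch below is an illustration, not the authors' pipeline: the function name, argument names, and reason strings are hypothetical, and the exocrine rule follows the Results wording (≥1 RPKM for at least two of the three marker genes). Criteria that need extra data or judgment (nonislet markers, LIVE/DEAD staining, hierarchical clustering outliers) are left out.

```python
EXOCRINE_MARKERS = ("Amy2a5", "Amy2b", "Pnlip")


def exclusion_reasons(exon_counts, mapped_counts, intergenic_counts,
                      viability_score, marker_rpkm):
    """Return the list of threshold-based criteria that a cell fails.

    An empty list means the cell passes these particular checks.
    """
    reasons = []

    # Exocrine contamination: >=1 RPKM for at least two marker genes.
    n_markers = sum(
        1 for gene in EXOCRINE_MARKERS if marker_rpkm.get(gene, 0.0) >= 1.0
    )
    if n_markers >= 2:
        reasons.append("exocrine contamination")

    # Permeable or dying cells.
    if viability_score > 0.3:
        reasons.append("low viability")

    # Technical sequencing quality.
    if exon_counts < 100_000:
        reasons.append("low exon counts")
    if mapped_counts <= 0:
        reasons.append("no mapped reads")
    else:
        if exon_counts / mapped_counts < 0.2:
            reasons.append("low exon-to-mapped ratio")
        if intergenic_counts / mapped_counts > 0.3:
            reasons.append("high intergenic-to-mapped ratio")

    return reasons


# Hypothetical cells with made-up statistics.
good_cell = exclusion_reasons(500_000, 1_000_000, 100_000, 0.1, {"Amy2a5": 0.0})
bad_cell = exclusion_reasons(50_000, 1_000_000, 400_000, 0.5,
                             {"Amy2a5": 5.0, "Pnlip": 2.0})
print(good_cell)
print(bad_cell)
```

Returning the list of failed criteria, rather than a single pass/fail flag, makes it easy to tabulate how many cells are lost at each step, as the paper does in SI Appendix, Fig. S2.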

ACKNOWLEDGMENTS. We thank Erqian Na for help with immunofluorescence staining, Dr. Yu Bai for helpful discussion of single-cell RNA sequencing data analysis, Dr. Weikeat Lim for providing curated mouse transcription factors, and Dr. Judith Altarejos for critical reading of the manuscript.

1. Ashcroft FM, Rorsman P (2012) Diabetes mellitus and the β cell: The last ten years. Cell 148(6):1160–1171.
2. Bramswig NC, et al. (2013) Epigenomic plasticity enables human pancreatic α to β cell reprogramming. J Clin Invest 123(3):1275–1284.
3. Dorrell C, et al. (2011) Transcriptomes of the major human pancreatic cell types. Diabetologia 54(11):2832–2844.
4. Nica AC, et al. (2013) Cell-type, allelic, and genetic signatures in the human pancreatic beta cell transcriptome. Genome Res 23(9):1554–1562.
5. Li J, et al. (2015) Single-cell transcriptomes reveal characteristic features of human pancreatic islet cell types. EMBO Rep 17(2):178–187.
6. Grün D, et al. (2015) Single-cell messenger RNA sequencing reveals rare intestinal cell types. Nature 525(7568):251–255.
7. Henry FE, Sugino K, Tozer A, Branco T, Sternson SM (2015) Cell type-specific transcriptomics of hypothalamic energy-sensing neuron responses to weight-loss. eLife 4:4.
8. Pollen AA, et al. (2014) Low-coverage single-cell mRNA sequencing reveals cellular heterogeneity and activated signaling pathways in developing cerebral cortex. Nat Biotechnol 32(10):1053–1058.
9. Shalek AK, et al. (2013) Single-cell transcriptomics reveals bimodality in expression and splicing in immune cells. Nature 498(7453):236–240.
10. Shalek AK, et al. (2014) Single-cell RNA-seq reveals dynamic paracrine control of cellular variation. Nature 510(7505):363–369.
11. Wu AR, et al. (2014) Quantitative assessment of single-cell RNA-sequencing methods. Nat Methods 11(1):41–46.
12. Xue Z, et al. (2013) Genetic programs in human and mouse early embryos revealed by single-cell RNA sequencing. Nature 500(7464):593–597.
13. Zeisel A, et al. (2015) Brain structure. Cell types in the mouse cortex and hippocampus revealed by single-cell RNA-seq. Science 347(6226):1138–1142.
14. Wang F, et al. (2012) RNAscope: A novel in situ RNA analysis platform for formalin-fixed, paraffin-embedded tissues. J Mol Diagn 14(1):22–29.
15. Vaquerizas JM, Kummerfeld SK, Teichmann SA, Luscombe NM (2009) A census of human transcription factors: Function, expression and evolution. Nat Rev Genet 10(4):252–263.
16. Chiang MK, Melton DA (2003) Single-cell transcript analysis of pancreas development. Dev Cell 4(3):383–393.
17. Edlund H (2002) Pancreatic organogenesis–developmental mechanisms and implications for therapy. Nat Rev Genet 3(7):524–532.
18. Katsuta H, et al. (2010) Single pancreatic beta cells co-express multiple islet hormone genes in mice. Diabetologia 53(1):128–138.
19. Li W, et al. (2014) In vivo reprogramming of pancreatic acinar cells to three islet endocrine subtypes. eLife 3:e01846.
20. Zhou Q, Brown J, Kanarek A, Rajagopal J, Melton DA (2008) In vivo reprogramming of adult pancreatic exocrine cells to beta-cells. Nature 455(7213):627–632.
21. Chera S, et al. (2014) Diabetes recovery by age-dependent conversion of pancreatic δ-cells into insulin producers. Nature 514(7523):503–507.
22. Thorel F, et al. (2010) Conversion of adult pancreatic alpha-cells to beta-cells after extreme beta-cell loss. Nature 464(7292):1149–1154.
