ARTICLE

https://doi.org/10.1038/s41467-019-12917-9 OPEN Single-nuclei RNA-seq on human retinal tissue provides improved transcriptome profiling

Qingnan Liang 1,2,3,9, Rachayata Dharmat1,2,8,9, Leah Owen4, Akbar Shakoor4, Yumei Li1, Sangbae Kim1, Albert Vitale4, Ivana Kim4, Denise Morgan4,5, Shaoheng Liang 6, Nathaniel Wu1, Ken Chen 6, Margaret M. DeAngelis4,5,7* & Rui Chen1,2,3*

Single-cell RNA-seq is a powerful tool in decoding the heterogeneity in complex tissues by

1234567890():,; generating transcriptomic profiles of the individual cell. Here, we report a single-nuclei RNA- seq (snRNA-seq) transcriptomic study on human retinal tissue, which is composed of mul- tiple cell types with distinct functions. Six samples from three healthy donors are profiled and high-quality RNA-seq data is obtained for 5873 single nuclei. All major retinal cell types are observed and marker for each cell type are identified. The expression of the macular and peripheral retina is compared to each other at cell-type level. Furthermore, our dataset shows an improved power for prioritizing genes associated with human retinal dis- eases compared to both mouse single-cell RNA-seq and human bulk RNA-seq results. In conclusion, we demonstrate that obtaining single cell transcriptomes from human frozen tissues can provide insight missed by either human bulk RNA-seq or animal models.

1 HGSC, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA. 2 Department of Molecular and Human Genetics, Baylor College of Medicine, Houston 77030 TX, USA. 3 Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, TX 77030, USA. 4 Department of Ophthalmology and Visual Sciences, University of Utah School of Medicine, Salt Lake City, UT 84132, USA. 5 Department of Pharmacotherapy, the College of Pharmacy, University of Utah, Salt Lake City, UT 84132, USA. 6 Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA. 7 Department of Population Health Sciences, University of Utah School of Medicine, Salt Lake City, UT 84132, USA. 8Present address: St. Jude Children’s Research Hospital, 262 Danny Thomas Pl, Memphis, TN 38105, USA. 9These authors contributed equally: Qingnan Liang, Rachayata Dharmat. *email: [email protected]; [email protected]

NATURE COMMUNICATIONS | (2019) 10:5743 | https://doi.org/10.1038/s41467-019-12917-9 | www.nature.com/naturecommunications 1 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-12917-9

he retina is a heterogeneous tissue and is composed of Using the snRNA-seq approach, a total of 5873 nuclei are Tmultiple neuronal and non-neuronal cell types1. In the profiled from both the peripheral and macular region of three human retina, there is an ordered array of ~70 different frozen human donor retina samples. Through unsupervised cell types across five major neuron classes: photoreceptors (rods clustering of the profiles, clusters corresponding and cones), retinal ganglion cells (RGCs), horizontal cells (HC), to all seven major cell types in the human retina (rod, cone, MG, bipolar cells (BC), amacrine cells (AC), and a non-neuronal HC, AC, BC, and RGC) are identified. Differentially expressed Müller glial cell (MG), each playing a unique role in processing genes from each cell type are obtained. We compare the gene visual signals1,2. The transcriptome of the human retina has been expression profile between macular and peripheral region for reported using bulk tissue RNA-seq3,4, and the overall gene each cell types. Significantly higher expression of mitochondrial- expression profiles from different retinal regions (macular and electron transport genes is found in macular rod cells compared peripheral region) were compared to each other5. These studies to the expression of the peripheral ones, which may indicate that provided general transcriptomic information of the retina as a comparatively higher levels of oxidation stress exists in the whole tissue, but they could not reveal the entirety of the retina’s macular, providing a potential explanation of higher vulnerability complexity because it lacks individual cell-type resolution. In of macular rod cells during aging. In addition, as expected, addition, transcriptomic studies of selected cell types of the compared to the published mouse single-cell data, the single- human and primate retina were performed but were not sufficient nuclei human data show stronger predictive power in prioritizing in obtaining the complete profile of all major cell types6,7. The genes associated with human disease. Finally, we find that pho- profiles of individual cell types, particularly rare types that toreceptor DEGs significantly enrich inherited retinal disease account for a small portion of the retina, will offer important (IRD) genes, indicating that they can serve as a prioritization tool insights to the scientific knowledge of biology and disease. For for novel disease gene discovery and cell-specific pathway ana- example, particular cell types, such as the cone cells in the cone- lysis. Overall, our study reports a transcriptome profile of all rod dystrophy (CRD) and the retinal ganglion cells in glaucoma, major cell types of the human healthy retina at the resolution of are the first targets of the diseases8,9. However, both cell types are the individual cell, which would serve as a rich resource for the found in low proportions in the human retina and their tran- scientific community. scriptome profiles have not been well studied. Given the great benefit of obtaining the transcriptome at the individual cell type or at the individual cell level, single-cell RNA-seq on human Results retina tissue is an area of great interest. The generation of the snRNA-seq data of human retina.To Recently, single-cell RNA-seq studies have been performed on generate transcriptome profiles for human photoreceptor cells, mouse retina, both within the whole retina10 and within specifi- retinae from three separate, healthy donors were obtained (n = 3 cally sorted cell types11,12. These studies provided unprecedent- donors). All three donors were Caucasian, ages between 60 and edly high-resolution transcriptomic data of each cell types and 80 years old (Table 1), and were thoroughly examined as pre- allowed for novel cell subtype discovery. However, usage of viously described in the methods. Figure 1 shows an example of mouse datasets could be limiting given the considerable differ- the donor tissues used for this study. There was no visible ences between the human and mouse retina. For example, the pathogenic indication found, not even macular drusen, a hall- mouse retina lacks the macula region of the retina that is found in mark of age-related macular degeneration commonly found in humans and primates13,14, a structure that is essential for both this age group, according to the fundus images15. Two-sample high visual acuity and color vision perception in the retina. punches from each retina (n = 6), one from the macula region Mouse cone cells are also different from those of humans in their and the other from the peripheral region, were collected and wavelength-sensitive opsin expression patterns. Thus, we propose subjected to single-cell nuclei RNA-Seq. After dispensing, nano- that single-cell transcriptomic study on human retinal tissues will wells were imaged and only wells with single nuclei were selected. expand our knowledge of the retina and become a rich resource cDNA library construction and sequencing were performed for a for related research, such as human disease-related studies. total of 6542 individual nuclei. Distribution of the number of Unlike model-organism-based studies, more factors are nuclei from each sample is listed in Table 2. To exclude low- needed to be taken into consideration for human tissue studies quality data, several QC steps were conducted (described in the in order to provide reliable results. One of the major concerns methods). As a result, 669 nuclei were filtered out, leaving a total in profiling the transcriptome of post-mortem human tissues is number of 5873 nuclei for downstream analysis. On average, RNA integrity. This issue worsens in single-cell-level studies 31,186 mapped reads were obtained per nucleus, with the median because the dissociation of tissue occurs for an extended period number of genes detected at 1044. To further evaluate the quality of time after the retina is dissected. Thus, the dissociation itself of snRNA-Seq data, bulk RNA-seq from the corresponding could also lead to changes in due to damage of the sample was performed. Good correlation between the bulk gene retina that might have occurred during the waiting period. expression (FPKM, paired-end seq) and single-nuclei gene Another concern is the health condition of the human tissue, expression (average of normalized read count after transforma- since studies using multiple individuals with differing health tion and 3′ end seq, see methods) was observed with positive conditions could potentially add complexity in proper inter- correlation coefficient ranges from 0.66 to 0.71 (all genes with pretation of the data. Here, we report a single-cell tran- non-zero expression were used with gene numbers ranging from scriptomic study on healthy human retina tissues using snRNA- 12163 to 13657, among six samples. Spearman correlation coef- seq. We tackle the potential issues by two approaches. The first ficient was used). consisted of using snRNA-seq instead of single-cell RNA-seq (scRNA-seq) to profile tissues confirmed as healthy by a strict Table 1 Medical information of the sample donors. post-mortem phenotyping. The second approach involved fl snRNA-seq, where tissues were ash-frozen immediately after Donor number Globe Age Sex Race dissection to preserve RNA integrity, minimizing the variation 12–887 OS (left eye) 78 F Caucasian due to differences in tissue dissociation conditions. With the 13–0025 OD (right eye) 78 M Caucasian extensive post-mortem phenotyping, we can ensure only heal- 13–0347 OS (left eye) 83 M Caucasian thy tissues are used in our study.

2 NATURE COMMUNICATIONS | (2019) 10:5743 | https://doi.org/10.1038/s41467-019-12917-9 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-12917-9 ARTICLE

a

b 37/73

200 µm 200 µm

Fig. 1 Analogous color fundus and OCT images demonstrating normal findings for post-mortem eyes. a Color fundus imaging of post-mortem retina showing a normal phenotype. b OCT image of post-mortem retina showing a normal phenotype.

Table 2 Basic sample information showing nuclei numbers in Fig. 2c and Table 3, the composition of different cell types from and correlation with bulk RNA-seq of the same sample. the human peripheral retina was generally consistent with that from previous mouse studies, with the exception of a higher percentage of MG cells and a lower percentage of AM cells Donor Region Number Number of Correlation 10,16 of nuclei nuclei with bulk observed in the human retina , a piece of information that after QC would require further experimental validation. This trend is 12–887 Macular 1621 1428 0.68 consistent with the results reported from a previous study in 12–887 Peripheral 415 388 0.66 monkey, in which the relative ratio of BC: MG: AC: HC is close to 16,17 13–0025 Macular 1613 1458 0.69 40:28:22:9 . As expected, a lower rod proportion and higher 13–0025 Peripheral 424 391 0.67 BC, HC, and RGC proportions were observed in the human 13–0347 Macular 1805 1669 0.69 macular sample compared to the human peripheral retina. Fur- 13–0347 Peripheral 664 539 0.71 thermore, we noticed that the cone proportion in the macula region was only slightly higher than that of the peripheral, which Calculation of correlation was performed between the single-nuclei RNA-seq and bulk RNA-seq of the same samples. For single-nuclei RNA-seq (3′-end seq), the gene expression data from was because that the macula samples collected for this study did each nucleus were gene-filtered, normalized, and log-transformed (method) before averaging. not contain the fovea, where cone cells have a much more For bulk RNA-seq (paired-end seq), FPKM was used increased proportion. Since snRNA-seq is less biased in sampling in comparison to single-cell sequencing, a better estimation of cell Identification of major cell types in the human retina.To proportion can be obtained. By comparing the transcriptome of identify individual retinal cell types, we performed unbiased cells in each cell type with all other cells, a total of 139, 101, 147, clustering on the gene expression profiles of 5873 human retinal 167, 174, 255, and 249 cell type differentially expressed genes nuclei. Seven clusters were identified, each of which contained (DEGs) was identified for rod, BC, MG, AC, HC, cone, and RGC, cells from all six samples (Fig. 2a, Supplementary Fig. 1), sug- respectively (here, DEGs are defined by transcriptome compar- gesting relatively low sample bias. Based on the expression pat- ison between one cell type and all other cells, e.g., rods vs. non- tern of cell-type-specific marker genes in the cluster, each cluster rods; see methods; Supplementary Data 2). is mapped to individual cell types (Supplementary Data 1). As enrichment analysis of biological process terms was performed shown in Fig. 2b, markers for known retinal cell types, such as with these DEGs (Fig. 2d, Supplementary Data 3). Top GO terms PDE6A for rod cells, NETO1 for bipolar cells (BC), SLC1A3 for enriched by each DEG lists were consistent with our previous Müller glial cells (MG), GAD1 for amacrine cells (AC), SEPT4 for knowledge for each cell type, such as visual perception term for photoreceptor cells18, ion transmembrane transport term for horizontal cells (HC), ARR3 for cone cells and RBPMS for retinal – ganglion cells (RGC), showed cluster-specific expression pattern. retinal interneurons19 21, and neuron migration term for Müller Thus, each cluster could be assigned to a known retinal cell type. glia cells. These results indicated that our result faithfully repre- Based on the number of nuclei in each cluster, we were able to sented the transcriptome profiles of major cell types of the human quantify the proportion of each cell type in the sample. As shown retina.

NATURE COMMUNICATIONS | (2019) 10:5743 | https://doi.org/10.1038/s41467-019-12917-9 | www.nature.com/naturecommunications 3 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-12917-9

a b PDE6A NETO1 SLC1A3 GAD1

RGC 4 4 4 4 4 4 BC AC 0 0 0 0 HC Rod Person1_mac UMAP_2 UMAP_2 UMAP_2 BC UMAP_2 Person1_per MG 0 Person2_mac 0 MG AC Person2_per –4 –4 –4 –4 HC UMAP_2 Person3_mac UMAP_2 Cone Person3_per Rod RGC

–4 0 48 –4 0 48 –4 0 48 –4 0 48 –4 –4 UMAP_1 UMAP_1 UMAP_1 UMAP_1

Cone SEPT4 ARR3 RBPMS

–4 0 48 –4 0 48 UMAP_1 UMAP_1 4 4 4

0 0 0 UMAP_2 UMAP_2 UMAP_2

c –4 –4 –4

–4 0 4 8 –4 0 48 –4 0 48 UMAP_1 UMAP_1 UMAP_1 0.4

Mac d Per Rod BC MG AC HC Cone RGC Visual perception Rhodopsin mediated signaling pathway 0.2 Regulation of rhodopsin mediated signaling pathway Photoreceptor cell maintenance Phototransduction 0 10 20 30 Response to calcium ion development Regulation of long-term neuronal synaptic plasticity Ion transmembrane transport Intracellular signal transduction 0 1 2 3 0.0 Neuron migration 2 Angiogenesis 1 Platelet degranulation Cell maturation 0 Visual perception –1 0 1 2 3 4 BC_Per MG_Per AC_Per HC_Per Rod_Per BC_Mac MG_Mac AC_Mac HC_Mac Chemical synaptic transmission Rod_Mac Cone_Per RGC_Per –2 Cone_Mac RGC_Mac Synapse assembly Gamma-aminobutyric acid signaling pathway Neurotransmitter secretion Cholesterol biosynthetic process 0 2 468 Retina layer formation Negative regulation of neuron apoptotic process Positive regulation of neuron projection development Renal water homeostasis 0 1 2 3 Visual perception Phototransduction Response to stimulus Actin filament network formation Actin filament bundle assembly 0510 Sodium ion transport Neuromuscular process controlling balance Innervation Cholesterol biosynthetic process Membrane depolarization during action potential 0 1 2 3 4 5 –log10 (p-value)

Fig. 2 Unsupervised clustering identifies seven major cell types in the human retina. a Clustering of 5873 human retina single-nuclei expression profiles into seven populations (right) and representation of the alignment of six datasets from three donors (left). b Profiles of known markers (PDE6A, NETO1, SLC1A3, GAD1, SEPT4, ARR3, RBPMS) in each cluster. c The proportion of the seven cell types (rod, BC, MG, AC, HC, cone, RGC) in the macular and peripheral samples (bar graph shows the mean of the proportion; single data points are visualized in dots, N = 3). d Heatmap of DEGs in each cell type and the gene ontology term enrichment by each set of DEGs. For visualization, top 50 DEGs with least FDR q-value and top five terms under the biological process category with least p-value were used. Each column represents a cell while each row represents a gene. Gene expression values are scaled across all the cells.

Table 3 Comparison of proportion of major cell types of mouse and human retina.

Cell type Mouse retinal cell proportion Mouse retinal cell proportion (%) Human macular cell Human peripheral cell (%) (Jeon et al.16) (Macosko et al.10) proportion (%) proportion (%) Rod 79.9 65.6 44.81 53.29 Cone 2.1 4.2 4.81 4.61 MG 2.8 3.6 11.92 14.08 BC 7.3 14.0 20.12 15.60 AC 7.0 9.9 7.11 7.74 HC 0.5 0.6 5.33 3.61 RGC 0.5 1.0 4.15 1.07

Regional gene expression variation revealed for human retina. combining snRNA-Seq from each region. As a result, we obtained The macula is a structure unique to human and other primates. a list of 234 genes that were highly expressed in the macular Several studies aimed at identifying genes that are differentially region and 214 genes that were highly expressed in the peripheral expressed between macular and peripheral retina have been region (Fig. 3a, Supplementary Data 4). Significant overlapping conducted. For example, using a SAGE approach, Sharon et al. between the SAGE gene list and our list was observed. Specifi- reported 20 genes with high expression in the macula and 23 cally, for the SAGE gene list, 13 out of 21 macular genes genes with high expression in the peripheral region in the retina (SLC17A6, SNCG, NEFL, NEAT1, STMN2, YWHAH, UCHL1, (referred as SAGE gene list)22. Based on bulk RNA-Seq, Li et al. DPYSL2, APP, NDRG4, TUBA1B, MDH1, EEF2) and 11 out of 24 reported 1239 genes with high expression in macular and 812 peripheral genes (SAG, RCVRN, UNC119, GPX3, PDE6G, ROM1, genes with high expression in the peripheral region5. To compare ABCA4, DDC, PDE6B, GNB1, NRL) were also observed in our our data against these published data, we generated the virtual gene list. However, most of the genes reported by the SAGE list bulk macular and peripheral RNA-Seq data by in silico but not included in ours (14 out of 21) also showed a consistent

4 NATURE COMMUNICATIONS | (2019) 10:5743 | https://doi.org/10.1038/s41467-019-12917-9 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-12917-9 ARTICLE

a b 200 Peripheral DEGs Macular DEGs

71

150 RGC/HC/BC DEGs

Others Rod DEGs 173 100 RGC DEGs

50 q-values (logarithm base 2) q-values

85 Rod DEGs 0 Others 129 –3 –2 –1 0 1 2 3 Fold-change (logarithm base 2) of average expression

c

0.8 0.85 0.9 0.95 1 0.75 0.8 0.85 0.9 0.95 1 0.75 0.8 0.85 0.9 0.95 1 0.75 0.8 0.85 0.9 0.95 1

P3-mac P3-mac P3-mac P2-mac P2-mac P2-mac P1-mac P3-mac P1-mac P1-mac P2-mac P1-mac P1-per P1-per P1-per P3-per P3-per P3-per P3-per P1-per P2-per P2-per P2-per P2-per Rod MG BC AC

d Mitochondrial electron transport

NADH to ubiquinone Cytochrome C to Oxygen

NDUFB9 NDUFV2 NDUFAB1 NDUFA13 CYCS COX4l1 COX6C 5

4

3

2

1 Expression in rod cells

0 Mac Per Mac Per Mac Per Mac Per Mac Per Mac Per Mac Per

Fig. 3 Transcriptome difference revealed between macular and peripheral region. a Differentially expressed genes between the overall macular and peripheral cells. All genes detected in more than 20% of macular or peripheral cells were plotted. Macular/Peripheral DEGs by overall comparison were labeled orange. Rod DEGs and RGC DEGs were labeled as red and green boxes, respectively. b The proportion of cell-type DEGs in the overall macular- peripheral DEGs showing cell population affection on DEG detection. c Heatmap demonstrating the correlation of averaged gene expression of four major cell types (rod, MG, BC, AC) in six samples. d Demonstration of mitochondrial-electron transport-related genes showing differential expression in rod cells in the macular and peripheral region. In the box plots, bounds of the box spans from 25 to 75% percentile, center line represents median, and whiskers visualize 1.5 times inter-quartile range less than quartile 1 or 1.5 times inter-quartile range more than quartile 3 of all the data points. trend, with fold-change or expression proportion under our exhibit different expression levels between the macular and selected threshold. Similar GO terms, such as ion transport in peripheral regions in bulk transcriptome profiling experiments macular genes and visual perception in peripheral genes (Sup- might be due to the differences in cell proportion instead of true plementary Data 4), are enriched in both our dataset and the Li expression level differences. In the 244 macular DEGs (‘virtual dataset (gene set not publicly available). Consistent data com- bulk’, compared with peripheral), 71 were also found in DEG lists pared to previous studies further supports that our data are high that contained highly expressed genes of RGC, BC, or HC quality and could be used for further analysis, such as macular- (comparison between the cell type with all other cells) (Fig. 3a, b). peripheral comparison within individual cell types. These cell types were found to be in a much higher proportion in In both the previous reports and our data, although the the macular region (Fig. 2c, d). Consistently, out of 214 peripheral macular and the peripheral region consists of the same types of DEGs, 85 are found as rod DEGs (Fig. 3a, b), while rods were cell, the proportion of cell types are different between the macular found to be in higher proportion in the peripheral region. These and peripheral human retina (Fig. 2c, d). As a result, genes that results suggested that the virtual bulk RNA-seq data tended to be

NATURE COMMUNICATIONS | (2019) 10:5743 | https://doi.org/10.1038/s41467-019-12917-9 | www.nature.com/naturecommunications 5 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-12917-9 affected by cell population variations in different regions in the Supplementary Data 5, Supplementary Fig. 3). GO analysis on retina and provided a general estimation of the limitations in the biological process terms revealed that these genes enrich terms data found caused by small nuances in the population. Thus, such as visual perception, GTP metabolic process, phototrans- single-cell level study is required for revealing genuine macular- duction, response to stimulus, and regulation of ion transmem- peripheral similarities and differences. The snRNA-seq data brane transport (Supplementary Data 5). Using published single- overcome the constraint of the bulk RNA-seq data and allow a cell RNA-seq data of mouse retina10, rod and cone cells were direct comparison of gene expression between macular and identified with a method similar to what is used for human data. peripheral retina for the same cell type. Correlation of averaged Two hundred thirty three genes that were highly expressed in gene expression patterns of four cell types (rod, MG, BC, and AC) cone cells compared to rod cells were identified (mouse cone- from each sample were calculated (Fig. 3c). For all the four cell over-rod list, referred to as mCOR list, Supplementary Data 6). types, all samples showed reasonably high mutual correlation to Interestingly, this mCOR list shared less than 10% similarity to each other, but the macula samples possessed a tighter cluster, the hCOR list (Fig. 4b), indicating considerable differences indicating that differences do exist within the same cell type between human and mouse cone cells. Excluding the 15 shared depending on the region. Within each cell types (rod, MG, BC, genes by hCOR and mCOR, the rest of the genes in hCOR enrich and AC) in the macular cells and peripheral cells, we performed biological process GO terms, such as GTP metabolic processes, differential expression analysis and found cell-type DEGs visual perception, and regulation of ion transmembrane trans- (Supplementary Data 4). Among the DEGs, it is interesting to port, while the top terms enriched by the mCOR list were not note that some mitochondrial-electron-transport-related genes phototransduction-related. This result indicated more insights of showed higher expression in macular rod cells compared to cone cell functions could be potentially obtained within our peripheral ones (Fig. 3d). Rod photoreceptor cells have large dataset. Indeed, in the hCOR list (excluding shared genes), 7 numbers of mitochondria packed in the inner segment23, because genes (AHI1, ATXN7, KCNV2, PROM1, RD3, RPGRIP1) have they require a higher amount of energy to maintain the high been linked to human CRD or Leber’s congenital amaurosis turnover rate of the outer segment and support phototransduc- (LCA) diseases, while mCOR list only contains one such gene tion24. The comparatively higher expression level of these genes (GUCA1A). As an example, RPGRIP1 and RD3, both are known indicates higher oxidative stress, which is linked to photoreceptor human IRD genes30–39, are expressed at a significantly higher death25. Mitochondria is a major source of retinal oxidative stress, level in human cones compared to rods, as is shown in Fig. 4c. which accumulates as the organism ages26. Mitochondrial Consistent with the expression pattern found, patients with dysfunction in retinal pigment epithelial cells (RPEs) has been mutations in RPGRIP1 and RD3 show LCA and CRD phenotype, associated with retinal diseases, like age-related macular degen- where more severe defects are found in cones than rods. In eration (AMD), while its specific effect in photoreceptor cells contrast, these two genes show no differential expression in rod remains relatively unknown24. Our finding might offer a potential and cone cells in the mice dataset (Fig. 4c). In live animals, KO explanation as to why macula photoreceptors are more vulnerable mouse models of these two genetic defects are reported to display than peripheral ones. However, it is worth mentioning that the retinitis pigmentosa (RP)-like phenotypes39–41, which are the detected expression level of mitochondria-related genes could be result of early defects in rod cells. With immunofluorescence affected by experimental conditions or sample biases. We staining, we confirmed our findings that the expression level of confirmed that our findings were consistent with sample pairs RPGRIP1 and RD3 was higher in human cone cells compared to from all three donors (Supplementary Fig. 2). In conclusion, the human rod cells (Fig. 4d, e). Additionally, the expression pattern cell-type-based macular-peripheral comparisons reveal regional of RPGRIP1 in macaque photoreceptor cells reported by Peng variances that bulk RNA-seq cannot find. Macular-peripheral et al.42 is consistent with our finding (RD3 was not well detected differences within the same cell types were not studied with cone, in the macaque data, Supplementary Fig. 4). Therefore, the HC, or RGC, because there was limited quantity of these cell types differences in mouse and human phenotype are at least partially in the dataset. due to differences in cell-specific expression of the gene. The human cone profile would serve as an informative resource to better understand mechanisms behind human retinal biology and DEG analysis reveals human cone-rod differences. More than diseases. half of the retina cell population is composed of photoreceptor cells. By combining rod and cone photoreceptor cells, a list of 177 genes that are highly expressed in photoreceptors was obtained Retinal disease genes are enriched in photoreceptor DEGs. (PR cells compared with all other cells, Supplementary Data 2). With the expression profile for each retinal cell type generated in These genes show significant enrichment of biological process GO this study, we sought to examine its potential utility in identifying terms of phototransduction, sensory perception, response to light genes associated with human retinal diseases. A gene list of 246 stimulus, etc. (Fig. 4a). This is mostly consistent with the known genes that include known IRD genes for retinitis pigmentosa functions of photoreceptor cells, which further validate our (RP), Leber’s congenital amaurosis (LCA), cone-rod dystrophy cluster assignment and DEG analysis. (CRD), and other retinopathies was obtained from the retnet In photoreceptor cells, cones are important for (RetNet, http://www.sph.uth.tmc.edu/RetNet/) (Supplementary function1,18,27,28 but poorly studied. During retina development, Data 7). As expected, robust expression of most of these known photoreceptor precursors first commit to their photoreceptor cell IRD genes (233 of the 246) could be detected in our dataset fate and then differentiate into subtypes, namely rods and (Supplementary Data 7). Additionally, the detected IRD genes cones29. Thus, cone cells and rod cells are closely related in were expressed at a significantly higher level than the average development and share similarities in functions. Comparison of (Fig. 5a, p-value = 1.22e-08, two-sample t-test). Since the vast transcriptomes between cone cells and rod cells would provide majority of known IRD-associated genes affects photoreceptor informative insights to cone-specific functions. Genes with higher cells exclusively, we believed that significant overlap should be cone expression compared to rod expression may be relevant in detected between the IRD genes and the photoreceptor-enriched determining cone-specific functions. We identified 212 genes that gene set. We investigated the PR DEG list (Supplementary were highly expressed in human cone cells compared to rod cells Data 2), which includes 177 genes that exhibit significantly higher (human cone-over-rod gene list, referred to as hCOR list, expression in photoreceptor cells than other cell types (may not

6 NATURE COMMUNICATIONS | (2019) 10:5743 | https://doi.org/10.1038/s41467-019-12917-9 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-12917-9 ARTICLE

a Cell projection assembly b Neuron differentiation mCOR hCOR

Neurogenesis Neuron development

21815 197 Generation of neurons Sensory organ development System process Photoreceptor cell maintenance

Detection of external stimulus Transferrin transport GTP metabolic process Detection of stimulus Visual perception Neurological system process Phagosome acidification Detection of abiotic stimulus ATP hydrolysis coupled proton transport Regulation of ion transmembrane transport Chaperone-mediated protein folding Regulation of postsynaptic membrane potential Response to light stimulus Response to radiation Proton transport Regulation of rhodoptin mediated signaling pathway Sensory perception 64200 1 2 Phototransduction –log10(p-value)

Response to abiotic stimulus

c AHI1 ATXN7 KCNV2 PROM1 RD3 RPGRIP1 GUCA1A 5

4

3

Human 2

1

0

5

4

3 Expression in photoreceptor cells

Mouse 2

1

0 Rod Cone Rod Cone Rod Cone Rod Cone Rod Cone Rod Cone Rod Cone d e

RD3 RPGRIP1

RHO RHO

OS OS IS IS ONL ONL OPL OPL INL INL IPL GCL Merge + DAPI Merge + DAPI be rigorously exclusive). Indeed, 48 IRD genes are among the 177 genes are obtained from a list of 472 genes, an odds ratio of 154. PR DEGs, representing a significant enrichment over background Due to its higher performance quality, we propose that the PR (odds ratio = 27.06, p-value < 2.2e-16, fisher’s exact test) (Fig. 5b, DEG list can be potentially used as a gene prioritization tool for Supplementary Data 8). This result outperforms previous reports novel IRD gene discovery. using bulk RNA-seq data. Pinelli et al. generated an RNA-seq of Depending on the timing and severity of rod and cone 50 retina samples and used co-expression analysis to predict photoreceptors defects, IRDs can be classified into different potential IRDs4 and only 56 known retinal disease associated subtypes clinically. For example, although cone photoreceptor cell

NATURE COMMUNICATIONS | (2019) 10:5743 | https://doi.org/10.1038/s41467-019-12917-9 | www.nature.com/naturecommunications 7 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-12917-9

Fig. 4 Differentially expressed genes are revealed in rod and cone photoreceptors. a Gene ontology networks demonstrate the biological process GO terms enrichment by photoreceptor cell DEGs. All photoreceptor cell DEGs (177 genes) were used as input for the online software NetworkAnalyst76 and only the terms with FDR adjusted p-value less than 0.05 were visualized. Terms were colored according to the p-value from low (red) to high (yellow). b Overlap of the hCOR (human cone-over-rod) and mCOR (mouse cone-over-rod) gene list and biological process GO term enriched in non-overlapping part. c Demonstration of the expression of CRD/LCA genes in the non-overlapping part of hCOR (AHI1, ATXN7, KCNV2, PROM1, RD3, RPGRIP1) and mCOR (GUCA1A). In the box plots, bounds of the box spans from 25 to 75% percentile, center line represents median, and whiskers visualize 1.5 times inter- quartile range less than quartile 1 or 1.5 times inter-quartile range more than quartile 3 of all the data points. d Immunofluorescent staining for RD3 and RHO on human retina cryosections. White triangles and arrows highlight the fluorescent signal of the cones and rods, respectively, in the top and bottom panel. Retinal cell layers are labeled in the bottom panel (OS outer segment, IS inner segment, ONL outer nuclear layer, OPL outer plexiform layer, INL inner nuclear layer, IPL inner plexiform layer, GCL ganglion cell layer). Scale bar is 20μm. e Immunofluorescent staining for RPGRIP1 and RHO on human retina cryosections. White triangles and arrows highlight the fluorescent signal of the cones and rods, respectively, in the top and bottom panel. Retinal cell layers are labeled in the bottom panel. Scale bar is 20μm.

a b 2.0 100 4.08% 2.97% 0.59% 0.57% 0.4% 15.38% 17.24% 10.44% 27.12% 18.81% 24.49% 75 36.55% 11.3% 42.6% 32.76% 27.21% 38.61% 1.5 50 23.16% Percentage 25 49.43% 52.61% 38.42% 44.22% 39.6% 41.42%

1.0 0 PR MG BC AC HC RGC

Category 0.5 Known human disese gene Average expression level Average Related to mouse eye phenotype (no human disease reported) Known expression in mouse retina (no mouse eye phenotype reported) Others 0.0

IRD Non-IRD c CRD/LCA genes RP genes

1.0 1.0

0.5 0.5 PDE6H AHI1 KCNV2 RP1 RP1L1 PRCD Higher cone expression EYS RBP3 FAM161A TMEM237 GUCA1B 0.0 0.0 CNGB1 PDE6B NRL MAK AIPL1 REEP6 CNGA1 NR2E3 CDHR1 USH2A ATXN7 RPGRIP1 GUCA1A IMPG2 RAX2 ROM1 –0.5 –0.5 SAG UNI119 CABP4 PDE6G RHO NEUROD1 PDE6A CACNA2D4 –1.0 –1.0 Average differential expression differential Average Higher rod expression 246 81012 5101520 Gene index Gene index

Fig. 5 Photoreceptor DEGs enrich human IRDs. a IRD genes generally show higher expression level than the rest of genes in the human retina. Expression values for each gene were normalized by cell total reads, multiplied by 10,000 and then log-transformed (natural logarithm), before averaging. In the box plots, bounds of the box spans from 25 to 75% percentile, center line represents median, and whiskers visualize 1.5 times inter-quartile range less than quartile 1 or 1.5 times inter-quartile range more than quartile 3 of all the data points. b Distribution of human retinal disease genes, mouse eye phenotype genes (but no human disease discovered), mouse retinal expressing genes (but no eye phenotype found) in DEGs of all cell types. c RP genes and CRD genes show different expression trend in rod and cone photoreceptors. The y-axis is representing the differential expression of each gene in cone cells compared with rod cells (cone expression level minus rod expression level). Expression values for each gene were normalized by cell total reads, multiplied by 10,000 and then log-transformed (natural logarithm), before averaging within cell types (rod, cone). degeneration is observed as the disease progresses, RP is primarily six of the 23 RP genes actually showed higher expression in cone due to a defect in rod photoreceptors43–45. In contrast, CRD and cells, which are ‘counter examples’. This indicates that the LCA mostly result from cone degeneration from the very tolerance of malfunction of some gene products by rods and beginning8,9. Among the 48 IRD genes that show photoreceptor cones- though the products are required by both cell types- could cell-specific expression, 23 have been associated with RP, while 12 be different. One possible reason could be that the difference of have been associated with CRD or LCA. This is consistent with the rod and cone transcriptome could lead to different redundant the clinical phenotype where, on average, RP-associated genes features. have higher expression in rod cells while CRD- or LCA-associated Given the significant enrichment of IRD-associated genes in genes show higher in cone cells (Fig. 5c). It is worth noting that the cell-type DEG set, it can be potentially used to prioritize

8 NATURE COMMUNICATIONS | (2019) 10:5743 | https://doi.org/10.1038/s41467-019-12917-9 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-12917-9 ARTICLE candidate IRD disease genes. For the 129 photoreceptor DEGs method. These nuclei were from six samples, obtained from three that have yet to be associated with human retinal diseases, 61 has donor retinae. It is worth emphasizing that the donors had a already shown expression in mouse retina (MGI gene expression similar genetic background, all underwent post-mortem pheno- data). Furthermore, viable knock-out mice have already been typing, and were confirmed to has normal retina. OCT images are obtained for 93 genes. Among these 93 genes, phenotypes in the used to resolve the appearance of subretinal drusen, fluid, atro- visual system have been observed in 20 genes (MGI database; phy, and fibrosis, and to differentiate artifact from pathology. Supplementary Data 8), such as GNGT1, RDH8, and RCVRN. These images (Fig. 1) demonstrate the usage of a combination of The protein encoded by GNGT1 is known to be located in the fundus and OCT imaging techniques for examining post-mortem outer segment of photoreceptor cells and plays an important role eye common in clinical practice and in accordance to the Utah in phototransduction46,47. RDH8 gene encodes a human retinol Grading Scale for Post-mortem Eyes61. All major cell types were dehydrogenase, and a mutation of its mouse ortholog causes found in every sample within a reasonable proportion and delayed rod function recovery after light exposure and an expected marker expression. Thus, the transcriptome profiles we accumulation of all-trans-retinal in the rod outer segment48. demonstrated here are reproducible and reliable. RCVRN is a calcium-binding protein, a regulator of rod The use of snRNA-seq method especially ensured the quality of sensitivity of dim lights, and it is also found to be an auto- our dataset, as the RNA integrity of the samples could be well antigen of cancer-associated retinopathy49. Although this list preserved with a much-reduced death-to-preservation time proves itself useful, the power of this list for prioritizing candidate compared to other methods that used cell dissociation. Another IRD genes could be underestimated because some genes could advantage of the snRNA-seq is that the sampling bias could be have mutations that would cause an eye phenotype in mice but minimized, which is an issue for profiling complex tissue. Thus, might not be reflected in the MGI data. For example, Top2b our estimation of cell proportion is likely to be more accurate conditional knock-out mice have been reported to show severe compared to previous results. In terms of faithfully representing photoreceptor cell loss during eye development50; however, this the transcriptome, Gao et al. reported that, in breast cancer cells, phenotype was not captured by the MGI data. In summary, our the snRNA-seq could be representative of single-cell RNA-seq62. photoreceptor cell DEG list would serve as a useful prioritization To corroborate these previous findings, our single-nuclei profiles tool for novel diseased gene discovery. showed a reliable correlation to bulk RNA-seq of the same sample, and the cell-type profiles are also consistent with pub- lished human retinal cell markers. Retinal disease genes in other retinal cell types. In contrast to Regional transcriptomes for tissues are of great interest for the significant overlap between IRD genes and photoreceptor researchers while single-cell level studies could provide insight to (PR) DEGs, no significant enrichment is observed in DEGs for future research questions. Here, we demonstrated that, for het- other retinal cell types. Although rare, it has been shown that erogeneous tissue, single-cell studies out-perform bulk studies in defects in cell types in the neural retina other than PR can also regional specific transcriptome profiling. We concluded that bulk lead to IRD. Indeed, 12 known IRD genes show enriched studies are limited by variations of the cell population in different expression in cell types not restricted to photoreceptor cells regions, where findings with variable expression levels in genes (Supplementary Data 8). For example, RDH11 is highly expressed may be due to changes in cell proportions, rather than true in Müller Glia cell, which plays an important function in the changes in expression in genes. Single-cell studies, on the other retinoid cycle, and mutations in RDH11 lead to RP51. GRM6 and hand, allows for comparisons between each cell types to eliminate TRPM1 are bipolar DEGs and their mutations could cause a this source of error. Interestingly, based on our dataset, it is recessive congenital stationary night blindness (CSNB), a condi- – observed that genes related to mitochondrial-electron transport tion due to abnormal photoreceptor-bipolar signaling52 58.In showed higher expression in macular rod cells compared to addition, 36, 30, 26, 26, and 19 genes that are highly expressed in peripheral ones, suggesting that the macular rod cells may have MG, HC, RGC, AC, and BC, respectively, showed mouse eye higher oxidative stress. In a recent article, Voigt et al.63 reported a phenotypes (Supplementary Data 8). These genes are likely dataset of human retina using single-cell RNA-seq and identified associated with human diseases. genes that showed differential expression when comparing the corresponding cell type from the foveal and peripheral retina. By Discussion comparing our results to the Viogt dataset, we found that these Studying the cell-type specific transcriptome expands our two datasets are largely consistent. For example, among the top understanding of the cell function within heterogeneous tissues, 20 DEGs between foveal and peripheral cones in the Voigt including the retina. The retina contains seven major cell types in dataset, 17 were detected in our dataset, with 10 showing con- differing proportions: rod cells consist of over half while other sistent trend with both the ‘Voigt dataset’ and the dataset types, such as amacrine cells and RGCs, are much rare. All the reported by Peng et al.42 (Supplementary Fig. 5). cell types have distinct functions and coordinate to allow for The transcriptome profiles of all major cell types, especially visual perception and regulation. Under pathological conditions, cone cells, in the human retina are also important findings in our not all the cell types are affected equally at the early stage of cell study. Mice, the most studied animal model for retinal degen- development8,43,59. Thus, understanding the transcriptome at the eration, are not an ideal model for studying cone biology since its cell type or the single-cell level will expand disease-related studies. cones are significant different from those in humans. For exam- In the retina, transcriptome profiles of many cell subtypes, ple, in the human retina, cone cells are highly rich in a region especially rare ones, are usually masked in bulk RNA-seq. Cell- called the fovea near the central macula, which is absent in the surface-marker-based sorting and purification methods have not mouse retina13,64. In addition, mice have two types of cone- been developed to enrich all cell types. Thus, single-cell RNA-seq opsins, namely Opn1sw for short-wavelength sensing and stands out as the most effective and unbiased method for Opn1mw for middle- and long-wavelength sensing. Some mouse obtaining the transcriptome of each cell types in the retina. cone cells express only one type of opsin but a considerable Efforts have been made to obtain single-cell level transcriptome proportion (40% for C57/BL6 mice) express both65. In contrast, profiles of human or primate retina42,60. Here, we report a humans have three types of cone-opsins, OPN1SW, OPN1MW, transcriptome profiling of the human retinal major cell types at and OPN1LW, and each cone cell only expresses one type of the single-cell level from 5873 nuclei with the snRNA-seq opsin29. Mustafi et al. previously compared the transcriptomes

NATURE COMMUNICATIONS | (2019) 10:5743 | https://doi.org/10.1038/s41467-019-12917-9 | www.nature.com/naturecommunications 9 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-12917-9 between the cone-enriched macular region and rod-enriched Preparation of single-nucleus suspensions. Nuclei from frozen neural retinal peripheral region in monkey to analyze rod and cone signatures7. tissue was isolated using RNase-free lysis buffer (10 mM Tris-HCl, 10 mM NaCl, They found known cone makers, such as SLC24A2 and 3 mM MgCl2, 0.1% NP40). The frozen tissue was resuspended in ice-cold lysis buffer and triturated to break the tissue structure. The tissue aggregates were then OPN1MW; however, they also found other genes, including NEFL homogenized using a Wheaton™ Dounce Tissue Grinder and centrifuged (500 g) to and NEFM, which were actually highly expressed in RGCs. This pellet the nuclei. The pellet was resuspended in fresh lysis buffer and homogenized finding had limited sensitivity and might be driven by the uneven to yield clean single-nuclei suspension. The collected nuclei were stained with ′ μ proportion of RGCs. Welby et al. developed a sorting method to DAPI (4 ,6-diamidino-2-phenylindole, 10 ug/ml) and were diluted to 1000 lof fi 3E4/ml with 1 × PBS (without Ca and Mg ions, pH 7.4, Thermo Fisher), RNase speci cally select fetal cones in human and reported the tran- inhibitor (NEB, 40 KU/ml) and Cell Diluent Buffer. scriptome of these fetal cone cells6. Their findings identified the – cone signature during the development (9 20 weeks post con- The ICELL8™ single-cell-based single-cell capture. Single nuclear capture and ception), which could be useful in recapitulating some aspects of sequencing were performed on the ICELL8 single-cell platform (Wafergen Biosy- the human adult cone. Our study took advantage of in silico tems). ICELL8 platform comprised of a multi-sample nano-dispenser that precisely sorting to separate cones from other cell types and reported the dispensed 50 nl of the single-nuclei suspension into an ICELL8 nanowell human cone cell transcriptome profile, which could be a useful microchip-containing 5184 wells (150 nl capacity). Assuming a Poisson distribu- fi tion frequency for the number of cells per well, about 30% of the nanowells were resource in future studies. With this cone pro le, we were able to expected to contain a single nucleus under optimal conditions. Automated scan- perform a cone-rod comparison to gain insight into cone function ning fluorescent microscopy of the microchip was performed using an Olympus and investigate discrepancies between phenotype in patients and BX43 fluorescent microscope with a robotic stage to visualize wells containing phenotype of genetic mouse models of cone diseases. single nuclei (see Table 1 for single-cell capture number across different experi- fi mental repeats). The automated well selection was performed using the CellSelect The rich single-cell transcriptome pro les could be a useful software (Wafergen Biosystems), which identified nanowells containing single resource for the research community. For example, we observed nuclei and excluded wells with >1 nuclear, debris, nuclei clumps or empty wells. significant enrichment of IRD genes among the genes with high The candidate wells were manually evaluated for debris or clumps as an specificity in photoreceptor cells. In addition, the list also contains additional QC. multiple genes that lead to eye defect in mice when knocked out. Thus, we could use this list to prioritize novel IRD gene discovery. Single-cell RT-PCR and library preparation. The chip was subjected to freeze- fi thaw in order to lyse the cells and 50 nl of reverse transcription and amplification Additionally, it is worth noting that a list of genes speci cally solution (following ICELL8 protocol) was dispensed using the multi-sample nano- expressed in cells other than PR was also found to cause retinal dispenser to candidate wells. Each well had a preprinted primer that contains an phenotype in human and/or mice when mutated. Besides the uses 11-nucleotide-well-specific barcode. This barcode was added to the 3′ (A)n RNA that were described, this dataset can be informative for many tail of the nuclear transcripts during reverse transcription and cDNA amplification, other studies. For instance, the cell-type-specific markers reported the latter of which was performed on candidate wells using SCRB-seq chemistry. fi After RT, the cDNA products from candidate wells were pooled, concentrated in our data would facilitate general cell-type-speci c studies. (Zymo Clean and Concentrator kit, Zymogen) and purified using 0.6× AMPure XP As a proof of concept study, this dataset has certain limitations. beads. A 3′ transcriptome enriched library was made using Nextera XT Kit and 3′- The complexity of the retina could not be fully solved with the specific P5 primer, which was sequenced on the Illumina Hiseq2500. number of nuclei we sequenced. For example, due to the modest number of nuclei profiled in this study, this dataset is under- Immunofluorescence (IF). Healthy human donor retinal tissue was dissected and powered and cannot be used to classify subtypes of cells in the fixed in 4% PFA for 48 h, cryo-protected with 30% sucrose overnight in 4 degrees, and then embedded in OCT to be flash-frozen. Cryosections from a peripheral retina. Recognizing this limitation, we mainly focused on the μ fi region of the human retina with 10 m thickness were used for the IF. pro les of the major cell types in the retina in this study. For the IF, sections were fixed with 4% PFA for 5 min at room temperature. Nevertheless, given the robustness of snRNA-Seq, it is feasible to They were then blocked for 3 h with blocking buffer (10% normal goat serum in scale up the current study to provide much improved resolution PBS + 0.1% Triton X-100) in room temperature. Sections were then incubated fi in future studies. overnight at 4 °C with primary antibodies (anti-RD3: Thermo Scienti c PA583118; anti-RPGRIP1: gift from Dr. Tiansen Li’s lab; anti-RHO 1D4: Santa Cruz 57432) diluted in blocking buffer (1:150 for anti-RD3 and anti-RHO 1D4; 1:50 for anti- RPGRIP1). Species-specific fluorophore conjugated secondary antibody in blocking Methods buffer (1:200) was applied for 120 min in room temperature, followed by DAPI Macular and peripheral sample collection. As previously described in detail61, staining for 10 min. The sections were washed with PBS (three times, 5 min each) between every buffer change. The sections were then mounted with AquaMount human donor eyes were obtained in collaboration with the Utah Lions Eye Bank. fi Only eyes within 6 h post-mortem were used for this study. Both eyes of the donor Slide Mounting Media (Thermo Scienti c 13800). underwent rigorous post-mortem phenotyping, including spectral domain optical coherence tomography (SD-OCT) and color fundus photography. Most impor- Data analysis. Generation of human snRNA-seq expression matrices and quality tantly, these images were taken in a manner consistent with the appearance of the control. FASTQ files were generated from Illumina base call files using bcl2fastq2 analogous images utilized in the clinical setting. Dissections of donor’s eyes were conversion software (v2.17). Sequence reads were aligned to the carried out immediately according to a standardized protocol to reliably isolate the hg19 (GRCh37), and aligned reads were counted within exons using HTseq-count RPE/choroid from the retina and segregate the layers into quadrants61,66. After the using default parameters67 to generate the expression matrices of raw read counts. eye was flowered and all imaging was complete, macula retina tissue was collected Quality control of the expression matrices of six samples (macular region and using a 6 mm disposable biopsy punch (Integra, Cat # 33–37) centered over the peripheral region for each of the three tissues) were performed separately before fovea and flash frozen and stored at −80 °C. The peripheral retina was collected in they were merged together. For quality control of each matrix, genes that were not a similar manner for each of the four quadrants. To determine precise ocular detected in at least 0.5% of all cells were discarded. Cells were filtered based on a phenotype relative to disease and healthy aging, analysis of each set of images was minimum number of 500 and a maximum number of 3000 expressed genes per performed by a team of retinal specialists and ophthalmologists at the University of cell, and a minimum number of 6000 and a maximum number of 100000 tran- Utah School of Medicine, Moran Eye Center and the Massachusetts Eye and Ear scripts per cell. Infirmary Retina Service. Specifically, each donor’s eye was checked by an inde- Data alignment and unsupervised clustering. For all six expression matrices, the pendent review of the color fundus and OCT imaging; discrepancies were resolved expression value of each gene in each cell was normalized by the following by collaboration between a minimum of three specialists to ensure a robust and conversions: total read counts of that cell multiplied by 10,000 and then log- rigorous phenotypic analysis. This diagnosis was then compared to medical records transformed (natural logarithm, ln (value + 1)). Using the Seurat v3 package68, and a standardized epidemiological questionnaire for the donor. For this study, variable genes were detected in each matrix and were used as input for the both eyes for each donor were classified as AREDS 0/1 to be considered normal. ‘FindIntegrationAnchors’ function, and thus the six matrices were integrated with Only one eye was used for each donor. Donors with any history of retinal the ‘IntegrateData’ function. The integrated data were then clustered with principal degeneration, diabetes, macular degeneration, or drusen were not used for this component analysis (PCA; top 10 principal components were used) and the study. Institutional approval for the consent of patients to donate their eyes was clusters were visualized in two dimensions with UMAP. All the data alignment, obtained from the University of Utah and conformed to the tenets of the integration, and clustering were performed under standard Seurat workflow. Declaration of Helsinki. All retinal tissues were de-identified in accordance with Detection of differentially expressed genes (DEGs) for each cluster. Raw read HIPAA Privacy Rules. count was used as input for DEG analysis with the VGAM R package implemented

10 NATURE COMMUNICATIONS | (2019) 10:5743 | https://doi.org/10.1038/s41467-019-12917-9 | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-12917-9 ARTICLE by the Monocle 3 R package69. For obtaining the DEG list of each cluster, only the 18. Fu, Y. Phototransduction in Rods and Cones. Webvision: The Organization of genes expressed in more than 20% of that cluster were taken. For each gene, the the Retina and Visual System (University of Utah Health Sciences Center, expression level of all other clusters was used as background for comparison. DEGs 1995). were defined as genes that had a 1.5-fold expression over the background with a q- 19. Masland, R. H. The tasks of amacrine cells. Vis. Neurosci. 29,3–9 (2012). value (FDR) less than 0.05. For calculating the fold-change, we used the raw read 20. Euler, T., Haverkamp, S., Schubert, T. & Baden, T. Retinal bipolar cells: counts normalized with total read counts in each cell and averaged by cluster (or elementary building blocks of vision. Nat. Rev. Neurosci. 15, 507–519 (2014). background). DEGs between two clusters were obtained in similar methods, where 21. Poche, R. A. & Reese, B. E. Retinal horizontal cells: challenging paradigms of all the genes with more than 20% detection in either cluster were used. With the neural development and cancer biology. Development 136, 2141–2151 (2009). 70 gene lists, functional enrichment analysis was performed using DAVID . 22. Sharon, D., Blackshaw, S., Cepko, C. L. & Dryja, T. P. Profile of the genes For bulk RNA-seq on the same samples, FASTQ sequences were mapped to expressed in the human peripheral retina, macula, and retinal pigment human genome hg19 (GRCh37), which was downloaded from UCSC genome epithelium determined through serial analysis of gene expression (SAGE). 71 browser website and aligned using STAR . Transcript structure and abundance Proc. Natl Acad. Sci. USA 99, 315–320 (2002). fl 72–75 were estimated using Cuf inks . 23. Stone, J., van Driel, D., Valter, K., Rees, S. & Provis, J. The locations of For mouse single-cell RNA-seq, the expression data was obtained from 10 mitochondria in mammalian photoreceptors: relation to retinal vasculature. GSE63473 . Matrices from seven P14 mice, GSM1626793-1626799, were used for Brain Res. 1189,58–69 (2008). analysis. The major analysis pipeline, including data normalization, integration, 24. Lefevere, E. et al. Mitochondrial dysfunction underlying outer retinal diseases. clustering, and DEG detection, was the same as the one used for human data. The Mitochondrion 36,66–76 (2017). only differences were in data QC. For mice data, cells were filtered based on a 25. Wright, A. F., Chakarova, C. F., Abd El-Aziz, M. M. & Bhattacharya, S. S. minimum number of 400 and a maximum number of 3000 expressed genes per Photoreceptor degeneration: genetic and mechanistic dissection of a complex cell, and a minimum number of 800 transcripts per cell. trait. Nat. Rev. Genet. 11, 273–284 (2010). 26. Kam, J. H. & Jeffery, G. To unite or divide: mitochondrial dynamics in the Data availability murine outer retina that preceded age related photoreceptor loss. Oncotarget All relevant data are available from the authors. snRNA-seq data could be accessed by 6, 26690–26701 (2015). GEO series number GSE133707. 27. Cepko, C. L. The Determination of Rod and Cone Photoreceptor Fate. Annu. Rev. Vis. Sci. 1, 211–234 (2015). 28. Helga Kolb. Cone Pathways through the Retina. Webvision. https://webvision. Code availability med.utah.edu/book/part-iii-retinal-circuits/cone-pathways-through-the- The Seurat v3 and monocle 3 R packages are major tools for the single-cell RNA-seq data fl retina/ (2018) analysis and standard work ows are used. All relevant code is available from the authors. 29. Swaroop, A., Kim, D. & Forrest, D. Transcriptional regulation of photoreceptor development and homeostasis in the mammalian retina. Nat. Received: 11 February 2019; Accepted: 2 October 2019; Rev. Neurosci. 11, 563–576 (2010). 30. Hameed, A. et al. Evidence of RPGRIP1 gene mutations associated with recessive cone-rod dystrophy. J. Med. Genet. 40, 616–619 (2003). 31. Dryja, T. P. et al. Null RPGRIP1 alleles in patients with leber congenital amaurosis. Am. J. Hum. Genet. 68, 1295–1298 (2001). 32. Kuznetsova, T. et al. Exclusion of RPGRIP1 ins44 from primary causal association with early-onset cone–rod dystrophy in dogs. Investig. References Opthalmology Vis. Sci. 53, 5486 (2012). 1. Masland, R. H. The neuronal organization of the retina. Neuron 76, 266–280 33. Mavlyutov, T. A., Zhao, H. & Ferreira, P. A. Species-specific subcellular (2012). localization of RPGR and RPGRIP isoforms: implications for the phenotypic 2. Kolb, H. Gross Anatomy of the Eye. Webvision (2012). http://webvision.med. variability of congenital retinopathies among species. Hum. Mol. Genet. 11, utah.edu/book/part-i-foundations/gross-anatomy-of-the-ey/ (2018). 1899–1907 (2002). 3. Farkas, M. H. et al. Transcriptome analyses of the human retina identify 34. Mellersh, C. S. et al. Canine RPGRIP1 mutation establishes cone–rod unprecedented transcript diversity and 3.5 Mb of novel transcribed sequence dystrophy in miniature longhaired dachshunds as a homologue of human via significant and novel genes. BMC Genomics 14, 486 Leber congenital amaurosis. Genomics 88, 293–301 (2006). (2013). 35. Hanein, S. et al. Leber congenital amaurosis: comprehensive survey of the 4. Pinelli, M. et al. An atlas of gene expression and gene co-regulation in the genetic heterogeneity, refinement of the clinical definition, and genotype- human retina. Nucleic Acids Res. 44, 5773–5784 (2016). phenotype correlations as a strategy for molecular diagnosis. Hum. Mutat. 23, 5. Li, M. et al. Comprehensive analysis of gene expression in human retina and 306–317 (2004). supporting tissues. Hum. Mol. Genet. 23, 4001–4014 (2014). 36. Roepman, R. et al. The retinitis pigmentosa GTPase regulator (RPGR) 6. Welby, E. et al. Isolation and comparative transcriptome analysis of human interacts with novel transport-like proteins in the outer segments of rod fetal and iPSC-derived cone photoreceptor cells. Stem Cell Rep. 9, 1898–1915 photoreceptors. Hum. Mol. Genet. 9, 2095–2105 (2000). (2017). 37. Boylan, J. P. & Wright, A. F. Identification of a novel protein interacting with 7. Mustafi, D. et al. Transcriptome analysis reveals rod/cone photoreceptor RPGR. Hum. Mol. Genet. 9, 2085–2093 (2000). specific signatures across mammalian retinas. Hum. Mol. Genet. 25, ddw268 38. Friedman, J. S. et al. Premature truncation of a novel protein, RD3, exhibiting (2016). subnuclear localization Is associated with retinal degeneration. Am. J. Hum. 8. Hamel, C. P. Cone rod dystrophies. Orphanet J. Rare Dis. 2, 7 (2007). Genet. 79, 1059–1070 (2006). 9. Michaelides, M., Hardcastle, A. J., Hunt, D. M. & Moore, A. T. Progressive 39. Chang, B., Heckenlively, J. R., Hawes, N. L. & Roderick, T. H. New mouse cone and cone-rod dystrophies: phenotypes and underlying molecular genetic primary retinal degeneration (rd-3). Genomics 16,45–49 (1993). basis. Surv. Ophthalmol. 51, 232–258 (2006). 40. Zhao, Y. et al. The retinitis pigmentosa GTPase regulator (RPGR)- interacting 10. Macosko, E. Z. et al. Highly parallel genome-wide expression profiling of protein: subserving RPGR function and participating in disk morphogenesis. individual cells using nanoliter droplets. Cell 161, 1202–1214 (2015). Proc. Natl Acad. Sci. USA 100, 3965–3970 (2003). 11. Shekhar, K. et al. Comprehensive classification of retinal bipolar neurons by 41. Won, J. et al. RPGRIP1 is essential for normal rod photoreceptor outer single-cell transcriptomics. Cell 166, 1308–1323.e30 (2016). segment elaboration and morphogenesis. Hum. Mol. Genet. 18, 4329–4339 12. Rheaume, B. A. et al. Single cell transcriptome profiling of retinal ganglion (2009). cells identifies cellular subtypes. Nat. Commun. 9, 2759 (2018). 42. Peng, Y.-R. et al. Molecular classification and comparative taxonomics of 13. Marmorstein, A. D. & Marmorstein, L. Y. The challenge of modeling macular foveal and peripheral cells in primate retina. Cell 176, 1222–1237.e22 (2019). degeneration in mice. Trends Genet. 23, 225–231 (2007). 43. Ferrari, S. et al. Retinitis pigmentosa: genes and disease mechanisms. Curr. 14. Hageman, G. S., Gaehrs, K., L. V. J. and D. A. Age-Related Macular Genomics 12, 238–249 (2011). Degeneration (AMD). Webvision. https://webvision.med.utah.edu/book/part- 44. Zobor, D. & Zrenner, E. Retinitis pigmentosa – eine Übersicht. Der xii-cell-biology-of-retinal-degenerations/age-related-macular-degeneration- Ophthalmol. 109, 501–515 (2012). amd/ (2018). 45. Goodwin, P. Hereditary retinal disease. Curr. Opin. Ophthalmol. 19, 255–262 15. DeAngelis, M. M. et al. Genetics of age-related macular degeneration (AMD). (2008). Hum. Mol. Genet. 26, R246–R246 (2017). 46. Zhao, L. et al. Integrative subcellular proteomic analysis allows accurate 16. Jeon, C. J., Strettoi, E. & Masland, R. H. The major cell populations of the prediction of human disease-causing genes. Genome Res. 26, 660–669 mouse retina. J. Neurosci. 18, 8936–8946 (1998). (2016). 17. Martin, P. R. & Grünert, U. Spatial density and immunoreactivity of bipolar 47. Kolesnikov, A. V. et al. G-protein betagamma-complex is crucial for efficient cells in the macaque monkey retina. J. Comp. Neurol. 323, 269–287 (1992). signal amplification in vision. J. Neurosci. 31, 8067–8077 (2011).

NATURE COMMUNICATIONS | (2019) 10:5743 | https://doi.org/10.1038/s41467-019-12917-9 | www.nature.com/naturecommunications 11 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-019-12917-9

48. Maeda, A. et al. Role of photoreceptor-specific retinol dehydrogenase in the 74. Roberts, A., Trapnell, C., Donaghey, J., Rinn, J. L. & Pachter, L. Improving retinoid cycle in vivo. J. Biol. Chem. 280, 18822–18832 (2005). RNA-Seq expression estimates by correcting for fragment bias. Genome Biol. 49. Matsubara, S., Yamaji, Y., Sato, M., Fujita, J. & Takahara, J. Expression of a 12, R22 (2011). photoreceptor protein, recoverin, as a cancer-associated retinopathy 75. Trapnell, C. et al. Transcript assembly and quantification by RNA-Seq reveals autoantigen in human lung cancer cell lines. Br. J. Cancer 74, 1419–1422 unannotated transcripts and isoform switching during cell differentiation. (1996). Nat. Biotechnol. 28, 511–515 (2010). 50. Li, Y. et al. Top2b is involved in the formation of outer segment and synapse 76. Zhou, G. et al. NetworkAnalyst 3.0: a visual analytics platform for during late-stage photoreceptor differentiation by controlling key genes of comprehensive gene expression profiling and meta-analysis. Nucleic Acids Res. photoreceptor transcriptional regulatory network. J. Neurosci. Res. 95, 1951 https://doi.org/10.1093/nar/gkz240 (2019). (2017). 51. Xie, Y. (Angela) et al. New syndrome with retinitis pigmentosa is caused by nonsense mutations in retinol dehydrogenase RDH11. Hum. Mol. Genet. 23, Acknowledgements 5774–5780 (2014). This work is supported by grants from Retina Research Foundation (R.C.) and NEI 52. van Genderen, M. M. et al. Mutations in TRPM1 are a common cause of (R01EY018571 and R01EY022356 to R.C.); the Carl Marshall Reeves & Mildred Almen complete congenital stationary night blindness. Am. J. Hum. Genet. 85, Reeves Foundation, Inc. (MMD), the Macular Degeneration Foundation, Inc. (MMD), 730–736 (2009). the Eunice Kennedy Shriver National Institute of Child Health & Human Development 53. Li, Z. et al. Recessive mutations of the gene TRPM1 abrogate ON bipolar cell (LAO), the Office of Research on Women’s Health of the National Institutes of Health function and cause complete congenital stationary night blindness in humans. under Award K12HD085852 (LAO), National Institutes of Health Core Grant EY014800, Am. J. Hum. Genet. 85, 711–719 (2009). and an Unrestricted Grant from Research to Prevent Blindness, Inc., New York, NY, to 54. Bellone, R. R. et al. Differential gene expression of TRPM1, the potential cause the Department of Ophthalmology & Visual Sciences, University of Utah. Bulk and of congenital stationary night blindness and coat spotting patterns (LP) in the single-cell RNA-Seq were performed at the Single Cell Genomics Core at BCM partially Appaloosa Horse (Equus caballus). Genetics 179, 1861–1870 (2008). supported by NIH shared instrument grants (S10OD018033, S10OD023469) and 55. Audo, I. et al. TRPM1 is mutated in patients with autosomal-recessive P30EY002520 to R.C. We would thank Shangyi Fu for helpful comments and proof complete congenital stationary night blindness. Am. J. Hum. Genet. 85, reading. 720–729 (2009). 56. Zeitz, C. et al. Mutations in GRM6 cause autosomal recessive congenital Author contributions stationary night blindness with a distinctive scotopic 15-Hz flicker R.C. and M.D. conceived the project. Q.L., S.K., R.D., S.L. and N.W. participated in the electroretinogram. Investig. Opthalmology Vis. Sci. 46, 4328 (2005). data analysis. K.C. advised the data analysis. L.O., A.S., A.V., I.K., D.M. and M.D. 57. Dryja, T. P. et al. Night blindness and abnormal cone electroretinogram ON collected the human donor retinas, performed the phenotyping and dissection. R.D. and responses in patients with mutations in the GRM6 gene encoding mGluR6. Y.L. prepared the nuclei sample and performed the single-nuclei RNA-seq. Q.L., R.D., Proc. Natl Acad. Sci. USA 102, 4884–4889 (2005). M.D. and R.C. wrote the manuscript with input from all other authors. All authors 58. Masu, M. et al. Specificdeficit of the ON response in visual transmission by proofread the manuscript. targeted disruption of the mGluR6 gene. Cell 80, 757–765 (1995). 59. Zeitz, C., Robson, A. G. & Audo, I. Congenital stationary night blindness: an analysis and update of genotype–phenotype correlations and pathogenic Competing interests mechanisms. Prog. Retin. Eye Res. 45,58–110 (2015). The authors declare no competing interests. 60. Phillips, M. J. et al. A novel approach to single cell RNA-sequence analysis facilitates in silico gene reporting of human pluripotent stem cell-derived Additional information retinal cell types. Stem Cells 36, 313–324 (2018). Supplementary information 61. Owen, L. A. et al. The Utah protocol for postmortem eye phenotyping and is available for this paper at https://doi.org/10.1038/s41467- molecular biochemical analysis. Investig. Opthalmology Vis. Sci. 60, 1204 (2019). 019-12917-9. 62. Gao, R. et al. Nanogrid single-nucleus RNA sequencing reveals phenotypic Correspondence diversity in breast cancer. Nat. Commun. 8, 228 (2017). and requests for materials should be addressed to M.M.D. or R.C. 63. Voigt, A. P. et al. Molecular characterization of foveal versus peripheral human Peer review information retina by single-cell RNA sequencing. Exp. Eye Res. 184,234–242 (2019). Nature Communications thanks the anonymous reviewers for 64. Bringmann, A. et al. The primate fovea: structure, function and development. their contribution to the peer review of this work. Prog. Retin. Eye Res. 66,49–84 (2018). Reprints and permission information 65. Ortín-Martínez, A. et al. Number and distribution of mouse retinal cone is available at http://www.nature.com/reprints photoreceptors: differences between an albino (Swiss) and a pigmented (C57/ Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in BL6) strain. PLoS One 9, e102392 (2014). fi 66. Morgan, D. J. & DeAngelis, M. M. Differential gene expression in age-related published maps and institutional af liations. macular degeneration. Cold Spring Harb. Perspect. Med. 5, a017210 (2014). 67. Anders, S., Pyl, P. T. & Huber, W. HTSeq–a Python framework to work with high-throughput sequencing data. Bioinformatics 31, 166–169 (2015). Open Access This article is licensed under a Creative Commons 68. Stuart, T. et al. Comprehensive integration of single-cell data. Cell 177, Attribution 4.0 International License, which permits use, sharing, 1888–1902.e21 (2019). adaptation, distribution and reproduction in any medium or format, as long as you give 69. Qiu, X. et al. Single-cell mRNA quantification and differential analysis with appropriate credit to the original author(s) and the source, provide a link to the Creative Census. Nat. Methods 14, 309–315 (2017). Commons license, and indicate if changes were made. The images or other third party 70. Huang, D. et al. The DAVID Gene Functional Classification Tool: a novel material in this article are included in the article’s Creative Commons license, unless biological module-centric algorithm to functionally analyze large gene lists. indicated otherwise in a credit line to the material. If material is not included in the Genome Biol. 8, R183 (2007). article’s Creative Commons license and your intended use is not permitted by statutory 71. Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, regulation or exceeds the permitted use, you will need to obtain permission directly from 15–21 (2013). the copyright holder. To view a copy of this license, visit http://creativecommons.org/ 72. Trapnell, C. et al. Differential analysis of gene regulation at transcript licenses/by/4.0/. resolution with RNA-seq. Nat. Biotechnol. 31,46–53 (2013). 73. Roberts, A., Pimentel, H., Trapnell, C. & Pachter, L. Identification of novel transcripts in annotated genomes using RNA-Seq. Bioinformatics 27, © The Author(s) 2019 2325–2329 (2011).

12 NATURE COMMUNICATIONS | (2019) 10:5743 | https://doi.org/10.1038/s41467-019-12917-9 | www.nature.com/naturecommunications