Cell Population Characterization and Discovery Using Single-Cell Technologies in Endocrine Systems
Journal of Molecular Endocrinology 65:2, R35–R51
REVIEW

Leonard Y M Cheung 1 and Karine Rizzoti 2

1 Department of Human Genetics, University of Michigan, Ann Arbor, Michigan, USA
2 Laboratory of Stem Cell Biology and Developmental Genetics, The Francis Crick Institute, London, UK

Correspondence should be addressed to K Rizzoti: [email protected]

https://jme.bioscientifica.com © 2020 Society for Endocrinology https://doi.org/10.1530/JME-19-0276 Published by Bioscientifica Ltd. Printed in Great Britain Downloaded from Bioscientifica.com at 10/01/2021 08:51:31AM via free access

Abstract

In the last 15 years, single-cell technologies have become robust and indispensable tools to investigate cell heterogeneity. Beyond transcriptomic, genomic and epigenomic analyses, technologies are constantly evolving, in particular toward multi-omics, where analyses of different source materials from a single cell are combined, and spatial transcriptomics, where cellular heterogeneity can be resolved in situ. While some of these techniques are still being optimized, single-cell RNA-seq has commonly been used because the examination of transcriptomes allows characterization of cell identity and, therefore, helps unravel previously uncharacterized diversity within cell populations. Most endocrine organs have now been investigated using this technique, and this has given new insights into organ embryonic development, characterization of rare cell types, and disease mechanisms. Here, we highlight recent studies, particularly on the hypothalamus and pituitary, and examine recent findings on the pancreas and reproductive organs where many single-cell experiments have been performed.

Key Words: technology, single cell, microfluidics, multi-omics, transcriptome, endocrine organs

Journal of Molecular Endocrinology (2020) 65, R35–R51

Introduction

Single-cell technologies have become an essential tool, offering an unprecedented glimpse into cellular heterogeneity. The decreasing costs of next generation sequencing, allowing the sequencing of millions or billions of DNA fragments in parallel, and the development of microfluidic techniques, easing the handling of a large number of single cells, have both underpinned the emergence and success of relevant new technologies over the last 15 years. Among single-cell technologies, transcriptomic (scRNA-seq) analyses are most frequently performed, followed by genome and epigenome sequencing. As new technologies are constantly emerging, and progressing very rapidly, the proteome and metabolome may also soon be analyzed at the single-cell level (Palii et al. 2019) (for review see Duncan et al. 2019). In parallel, multi-omics technologies are utilized to profile simultaneously different material sources from one single cell, enabling, for example, correlations between genomic mutations and alteration of gene expression using G&T-seq (Genome and Transcriptome-seq; Macaulay et al. 2015; for review see Macaulay et al. 2017). Furthermore, by combining scRNA-seq with genome-editing tools, genetic screens can now be conducted with an unprecedented level of characterization using tools such as PERTURB-seq (Dixit et al. 2016), CRISP-seq (Jaitin et al. 2016) and CROP-seq (Datlinger et al. 2017). While one obvious caveat of most of these analyses is the loss of spatial information, because most techniques rely on the physical isolation of single cells or nuclei from their in vivo or in vitro context, progress has been made to solve this issue by performing in situ sequencing. Several different platforms, such as seqFISH (Eng et al. 2019), MERFISH (Chen et al. 2015), FiSSEQ (Lee et al. 2015) and TIVA (Lovatt et al. 2014), have been developed, and this technology is fast becoming accessible to the wider community (https://spatialtranscriptomics.com/).

To handle the large quantity of sequencing data generated from thousands up to hundreds of thousands of cells, specialized bioinformatic tools are essential. They need to be more sophisticated than those used for bulk analyses to deal with the particularities of single-cell datasets, such as the low starting material, which leads to enhanced noise and batch effects. Similar to bulk analyses, these tools are first used to check the quality of the data, remove noise, and generate the results by aligning the reads to the relevant genome. More advanced algorithms, with the frequent development of new ones (Amezquita et al. 2019), are then used to extract as much information as possible from the sequencing data, to visualize results and extrapolate biological meaning. Essentially, the first step in result analysis is dimensional reduction, where cells with similar gene expression profiles are grouped together, allowing separation and, therefore, identification of distinct cell populations. Principal component analysis (PCA), t-distributed stochastic neighbour embedding (t-SNE) and uniform manifold approximation and projection (UMAP) are commonly used algorithms to achieve dimensional reduction. The Seurat package from the Satija lab comprises quality check and dimensional reduction analyses and is very commonly used to analyze scRNA-seq datasets (Stuart et al. 2019). Then, according to the biological question, ordering of cell populations may be performed, because cells have been captured as they transit between different states or stages and this information can be used to reconstruct cellular hierarchies and trajectories. A commonly utilized algorithm to allow distribution of cells along a pseudo-time trajectory is Monocle from the Trapnell lab (Trapnell et al. 2014). Other widely used tools include RNA velocity, where trajectories are reconstituted according to the balance between unspliced and spliced transcripts, reflecting how recently a gene has been up-regulated (La Manno et al. 2018), and many others have been developed and extensively used (for review see Todorov & Saeys 2019). Because the quantity of datasets generated is now growing exponentially, bioinformatic tools are currently being developed to allow comparisons between different datasets and platforms (Adey 2019). In this constantly evolving technological landscape, the range of possibilities is truly amazing. However, it is biological validation that gives meaning to dataset mining, and single-cell datasets are the beginning rather than the end of the story. Here, we will mostly focus our attention on transcriptome and genome analyses.

Applications and remaining hurdles of single-cell genome and transcriptome analyses

While the focus of this review will be on endocrine systems, we will first briefly review more widely the advances brought about by single-cell genomic and transcriptomic analyses. We will focus on pioneer studies and illustrate the areas where these technologies have been particularly instrumental in opening new avenues for investigation. Broadly, the immense advantage of single-cell approaches is that they highlight and inform about cell heterogeneity, while previous bulk analyses wiped away all this precious information (for review see Trapnell 2015). However, there are still technical hurdles. In genome analyses, allelic or locus dropouts and incomplete coverage lead to failed detection of single-nucleotide variants, while in transcriptome analyses, not all transcripts are reverse transcribed, and therefore, absence of a certain gene in the dataset does not always mean that it is not expressed. This is more of a problem for lowly transcribed genes, such as transcription factors. It clearly implies that a substantial amount of information is simply missing from the datasets. Spike-ins (RNAs in known quantities) can be added as internal controls, but they may not always truly reflect the rate of capture of endogenous transcripts. Platforms and reagents are constantly being improved to reduce these problems, and algorithms to take them into account. For genome analysis, insertion of transposons for linear amplification (LIANTI) improves both coverage and fidelity compared with exponential amplification (Chen et al. 2017a). In addition, there are notable technological differences between the different platforms used for transcriptomic analyses (for review see Svensson et al. 2018). In terms of results, the two main differences are sequencing depth (number of reads) and the number of cells processed, which are inversely correlated (Ziegenhain et al. 2017). Maybe counterintuitively, it appears that sequencing more cells, at the expense of the number of reads, is better for dimensional reduction. Furthermore, it is sometimes difficult to obtain a good quality single-cell suspension, or this is simply not an option when working with frozen clinical samples. An alternative is to use single nuclei, where the transcriptome is comparable to that found in

[...] and ultimately for regenerative medicine. Single-cell approaches have thus been instrumental to resolve cell heterogeneity in different