Comparative Transcriptomic Analyses and Single-Cell RNA Sequencing Of
Total Page:16
File Type:pdf, Size:1020Kb
Swapna et al. Genome Biology (2018) 19:124 https://doi.org/10.1186/s13059-018-1498-x RESEARCH Open Access Comparative transcriptomic analyses and single-cell RNA sequencing of the freshwater planarian Schmidtea mediterranea identify major cell types and pathway conservation Lakshmipuram Seshadri Swapna1, Alyssa M. Molinaro1,2, Nicole Lindsay-Mosher1,2, Bret J. Pearson1,2,3* and John Parkinson1,2,4* Abstract Background: In the Lophotrochozoa/Spiralia superphylum, few organisms have as high a capacity for rapid testing of gene function and single-cell transcriptomics as the freshwater planaria. The species Schmidtea mediterranea in particular has become a powerful model to use in studying adult stem cell biology and mechanisms of regeneration. Despite this, systematic attempts to define gene complements and their annotations are lacking, restricting comparative analyses that detail the conservation of biochemical pathways and identify lineage-specific innovations. Results: In this study we compare several transcriptomes and define a robust set of 35,232 transcripts. From this, we perform systematic functional annotations and undertake a genome-scale metabolic reconstruction for S. mediterranea. Cross-species comparisons of gene content identify conserved, lineage-specific, and expanded gene families, which may contribute to the regenerative properties of planarians. In particular, we find that the TRAF gene family has been greatly expanded in planarians. We further provide a single-cell RNA sequencing analysis of 2000 cells, revealing both known and novel cell types defined by unique signatures of gene expression. Among these are a novel mesenchymal cell population as well as a cell type involved in eye regeneration. Integration of our metabolic reconstruction further reveals the extent to which given cell types have adapted energy and nucleotide biosynthetic pathways to support their specialized roles. Conclusions: In general, S. mediterranea displays a high level of gene and pathway conservation compared with other model systems, rendering it a viable model to study the roles of these pathways in stem cell biology and regeneration. Keywords: Schmidtea mediterranea, Transcriptomics, Metabolism, Single-cell genomics, Metabolic reconstruction, Transcription factors, Tissue regeneration, Comparative genomics * Correspondence: [email protected]; [email protected] 1Hospital for Sick Children, Toronto, ON, Canada Full list of author information is available at the end of the article © The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. Swapna et al. Genome Biology (2018) 19:124 Page 2 of 22 Background analyses can mask the contribution of specific cell subpop- Investigations using model organisms such as Caenor- ulations, which can be particularly problematic when habditis elegans, Drosophila melanogaster, zebrafish, and attempting to elucidate, for example, pathways expressed mice continue to drive fundamental insights into the during key cellular events. While cell sorting offers the cap- molecular mechanisms driving a variety of conserved ability to enrich for specific cell subpopulations, the emer- biochemical processes [1]. However, much attention has gence of single-cell RNA sequencing (scRNAseq) offers a recently turned to the use of non-traditional organisms powerful route for interrogating gene expression profiles as models to explore more specialized pathways. For ex- from individual cells [23, 24]. Applied to S. mediterranea, ample, while freshwater planarians (flatworms) have this technology is expected to yield molecular-level insights been used in a laboratory setting for more than 100 years into the roles of distinct cell types, such as neoblasts, due to their ability to regenerate following virtually any during homeostatic tissue maintenance and regeneration injury, the planarian Schmidtea mediterranea has [7, 25–27]. Indeed, scRNAseq experiments have already emerged as a powerful model for dissecting the molecu- been used to resolve neoblast heterogeneity and identify lar basis of tissue regeneration [2, 3]. Despite significant regulators of lineage progression [26–30]. resources put forth to develop S. mediterranea as a In this study, we generate a high-confidence transcrip- model in the lab, systematic genome-scale investigations tome pruned from an integrated transcriptome gener- of gene function and conservation are lacking. ated earlier in the lab [18], which, through combining Much of the interest in planarians is driven by the fact transcriptomes from diverse physiological conditions that approximately 20% of their adult cells are stem cells and experimental techniques, leads to a large number of (called neoblasts), at least some of which are pluripotent transcripts (n = 83,469) for S. mediterranea.Next,we [4–7]. In addition, planarians are one of the only models apply systematic bioinformatic approaches to annotate that can be used to rapidly test gene function in adult and compare the complement with model organisms animals through RNA interference (RNAi) screening. and other Platyhelminthes. This pipeline predicts puta- Placing gene function in an evolutionary context is crit- tive functional annotations of the transcriptome, identi- ical not only to inform on the conservation of pathways fying a set of transcriptionally active transposons as well related to stem cell biology and regeneration, but also as extended families of cadherins and tumor necrosis because planarians represent a key member of the other- factor (TNF) receptor associated factor (TRAF) proteins. wise neglected superphylum Lophotrochozoa/Spiralia Metabolic reconstruction further reveals an increased (subsequently referred to as Lophotrochozoa), and they biochemical repertoire relative to related parasitic platy- can further be used to model closely related parasitic helminths. In order to gain insights into the role of these flatworm species (e.g., flukes and tapeworms), which in- pathways in planarian biology, high-throughput scRNA- fect an estimated hundreds of millions worldwide [8]. seq was performed, capturing the transcriptional signa- In attempts to complement ongoing genome sequen- tures from ~ 2000 cells. From the 11 distinct clusters of cing efforts [9, 10], several transcriptome datasets have transcriptional profiles, we identified clusters corre- been generated for S. mediterranea under various sponding to neoblasts, epithelial progenitors, muscle, physiological conditions using a variety of experimental neurons, and gut, among which neoblasts exhibit the techniques [11–18]. In isolation, each set provides a most metabolically active profiles. We also identify a snapshot of planarian gene expression under a specific novel cluster: a cathepsin+ cluster representing multiple condition; however, recent efforts have focused on inte- unknown mesenchymal cells. Beyond giving us new in- grating several transcriptomes to generate a more com- sights into the evolution and dynamics of genes involved prehensive overview of gene expression [9, 19]. The in regenerative pathways, the data and analyses presented SmedGD repository was generated by integrating tran- here provide a complementary resource to ongoing scriptomes from whole-animal sexual and asexual genome annotation efforts for S. mediterranea.Theyare worms, whereas the PlanMine database serves as a re- available for download from http://www.compsysbio.org/ pository for the published genome as well as existing datasets/schmidtea/. transcriptomes from the community to be deposited and queried. However, they lack systematic and comparative Results evolutionary and functional genomics analyses, which A definitive transcriptome for S. mediterranea are required for understanding the mechanistic basis of A definitive transcriptome of S. mediterranea was gener- biological processes. Together these datasets comprise ated by integrating the RNA sequencing (RNA-seq) more than 82,000 “transcripts” with little assessment of reads generated from five separate experiments and cell “completeness” from an evolutionary perspective. purifications [18, 31–33] (National Center for Biotech- Typically, transcriptome datasets are generated from nology Information [NCBI] Bioproject PRJNA215411). entire organisms or tissues [20–22]; however, such From an initial set of 83,469 transcripts, a tiered set of Swapna et al. Genome Biology (2018) 19:124 Page 3 of 22 filters were applied to define a single set of 36,026 co-localize on the genome with known S. mediterranea high-confidence transcripts (Fig. 1a). First, protein-coding transcripts derived from complementary resources transcripts are identified on the basis of sequence similar- (EST database of the NCBI, SmedGD v2.0 [9]andthe ity to known transcripts or proteins, as well as the pres- Oxford dataset [14]) were included in our final filtered ence of predicted protein domains with reference to the dataset (Fig. 1a, b). following databases: UniProt [34], MitoCarta [35], InterPro Together,