TSGene 2.0: an updated literature-based knowledgebase for tumor suppressor genes
Min Zhao1, Pora Kim2, Ramkrishna Mitra2, Junfei Zhao2 and Zhongming Zhao2,3,4,5,*
Published online 20 November 2015. Nucleic Acids Research, 2016, Vol. 44, Database issue D1023–D1031. doi: 10.1093/nar/gkv1268

1 School of Engineering, Faculty of Science, Health, Education and Engineering, University of the Sunshine Coast, Maroochydore DC, Queensland 4558, Australia
2 Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, TN 37232, USA
3 Department of Cancer Biology, Vanderbilt University School of Medicine, Nashville, TN 37232, USA
4 Department of Psychiatry, Vanderbilt University School of Medicine, Nashville, TN 37212, USA
5 School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA

Received September 14, 2015; Revised October 30, 2015; Accepted November 03, 2015

* To whom correspondence should be addressed. Tel: +1 615 343 9158; Fax: +1 615 936 8545; Email: [email protected]

© The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

ABSTRACT

Tumor suppressor genes (TSGs) are a major type of gatekeeper genes in cell growth. A knowledgebase with the systematic collection and curation of TSGs in multiple cancer types is critically important for further studying their biological functions as well as for developing therapeutic strategies. Since its development in 2012, the Tumor Suppressor Gene database (TSGene) has become a popular resource in the cancer research community. Here, we report TSGene version 2.0, which has substantial updates of contents (e.g. up-to-date literature and pan-cancer genomic data collection and curation), data types (non-coding RNAs and protein-coding genes) and content accessibility. Specifically, the current TSGene 2.0 contains 1217 human TSGs (1018 protein-coding and 199 non-coding genes) curated from over 9000 articles. Additionally, TSGene 2.0 provides thousands of expression and mutation patterns derived from pan-cancer data of The Cancer Genome Atlas. A new web interface is available at http://bioinfo.mc.vanderbilt.edu/TSGene/. Systematic analyses of 199 non-coding TSGs provide numerous cancer-specific non-coding mutational events for further screening and clinical use. Intriguingly, we identified 49 protein-coding TSGs that were consistently down-regulated in 11 cancer types. In summary, TSGene 2.0, which is the only available database for TSGs, provides the most updated TSGs and their features in pan-cancer.

INTRODUCTION

Cancer is a large family of diseases that cause millions of deaths worldwide every year (1). It is characterized by abnormal cell growth with a potential to spread through the body. It often arises from two types of genetic alterations related to cell proliferation, differentiation, apoptosis and cell-to-cell communication (2,3): the loss-of-function of tumor suppressor genes (TSGs) and the gain-of-function of oncogenes (OCGs). The inactivation or reduced function of protein-coding TSGs can be induced in many ways, including promoter methylation changes (4), copy number alterations (5), deregulated mRNA expression due to microRNA (miRNA) activities (6) and competing endogenous long non-coding RNAs (lncRNAs) (7). In general, TSGs play key roles in the cell cycle checkpoints and in maintaining genomic stability. Defective TSGs often allow uncontrolled cell growth without normal DNA repair, apoptosis and normal metabolic regulation (8). Accumulating lines of evidence have shown that non-protein coding RNAs, such as miRNAs (9–11) and lncRNAs (12), can act as TSGs to initiate and promote cancer development.

To provide a comprehensive TSG resource for the cancer research community, we developed the Tumor Suppressor Gene database (TSGene 1.0) in 2012 (13), and have been continuously maintaining it since then. TSGene 1.0 is the only active data resource specifically designed for TSGs. It has received 82573 web hits based on daily unique internet protocol addresses. Since its release, the TSGene database has become a popular resource, enjoying wide use for testing drug resistance (14), studying HIV integration of cancer-related genes (15,16), exploring phosphorylation regulatory networks in cancer cells (17), identifying cancer-associated transcript fusions (18), uncovering intronic enhancers through loss of methylation (19) and designing genome-scale CRISPR-based gene repression (20). Moreover, TSGene 1.0 has been frequently used as a special gene list in systems biology-based studies of cancer genomic data (21–24).

In the past several years, we have witnessed the unprecedented growth of cancer genomic data, such as those from the Cancer Genome Project (CGP), The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC). Furthermore, many more TSGs have been reported, including non-protein coding genes (miRNAs and lncRNAs). Accordingly, there is a strong need to characterize the tumor suppressor gene landscape at the genome, epigenome, transcriptome and proteome levels, across all types of cancer. We have addressed this need with substantial updates to TSGene 2.0. Our changes include more extensive literature curation, data integration and annotation, and a user-friendly web interface.

As the only literature-based database dedicated to TSGs, TSGene 2.0 provides not only a comprehensive resource for the cancer research community, but also a classified TSG catalog for advanced integrative analyses across multiple cancers. For example, as described in this paper, we observed that two lncRNA TSGs, DLEU1 and DLEU2, are highly mutated not only in multiple prostate cancers, but also in cancers of the bladder and ovaries. Our analyses of 24 highly confident miRNA TSGs revealed that they were associated with cancer-specific signaling pathways in multiple cancers. Furthermore, we pinpointed a testable prevalent deletion of hsa-miR-31 in high-grade glioblastoma (GBM). We also used a pan-cancer expression data analysis method and found that the TGF-beta signaling pathway is dominated by TSGs that are consistently down-regulated in tumor samples. These cancer genomics-based integrative analyses can provide complementary evidence of novel functions of known TSGs in new cancer types with potential lethal effects that otherwise might have been overlooked in the analysis of individual cancers. TSGene 2.0's new web interface has more user-friendly features for browsing the relevant information and querying the functionalities of TSGs. The web server is available at http://bioinfo.mc.vanderbilt.edu/TSGene/.

DATA COLLECTION AND DISCUSSION

Curation of the known and conflicting TSGs in literature

To maintain consistency, we duplicated the literature querying strategy used for TSGene 1.0 on PubMed and GeneRIF (Gene Reference Into Function). The most recent systematic PubMed search was conducted on 25 April 2015 using the term: 'tumor suppressor' [Title] NOT (P53 [Title] OR TP53 [Title]). To avoid false results, we searched for matches in titles only. The search returned 6178 PubMed abstracts. On the same day, we also extracted 5454 additional short statements associated with 3719 PubMed abstracts

translated them to the official gene symbols from the Entrez Gene database.

In this round, we were more cautious about the TSGs with potential oncogenic roles. TSGs may also play different roles in different cancers or at different stages of disease. For example, the histone deacetylase 1 gene (HDAC1) has been reported as a TSG during cancer initiation, but as an oncogene during the tumor maintenance process (26). Other TSGs may have oncogenic roles in different cancer grades. For example, RASSF1 is a TSG mainly reported in neuroendocrine tumors of the lung. One of its isoforms acts as an oncogene in some high-grade lung tumors (27). The Notch signaling pathway has been identified as oncogenic in hematopoietic cancers (28). However, accumulating lines of evidence suggest that pathway members have growth-suppressive roles in some hematopoietic cells, in skin, pancreatic epithelium and hepatocytes (28). SIRT1 can negatively regulate the TGF-beta signaling pathway and enhance tumorigenesis, but it also interacts with promyelocytic leukaemia (PML) protein to stabilize TP53 and induce cell senescence (29). WT1, a well-studied tumor suppressor, has dual roles in cancer progression depending on the presence or absence of regulatory protein partners (30). A few well-studied oncogenes, such as MYC, have weak evidence for tumor suppression (31). We did not include them in the TSGene database to avoid potential misuse.

To create an overview of TSGs that have also been reported to act as oncogenes, we compiled a gene list of 320 protein-coding oncogenes. Our information sources included a classical review of cancer genes (32), a research article for oncogenic miRNAs (33), the UniProtKB keyword 'proto-oncogene' and 17 TSGs with dual roles from our literature curation (Supplementary Table S1). As a result, we obtained a list of 73 TSGs with potential oncogenic roles (including 54 protein-coding TSGs and 19 miRNA TSGs) (Supplementary Table S2). This list of 73 genes is also available through TSGene 2.0. We will update it frequently.

In summary, we compiled 1217 human TSGs, including 1018 protein-coding and 199 non-coding genes, from 3354 PubMed abstracts with confirmed literature evidence. We stored all the curated TSGs and relevant annotations in a MySQL relational database. A dynamic web interface was implemented in Perl CGI and JavaScript for data browsing and querying.

Representative entries in the TSGene 2.0

Supplementary Figure S1 shows a compilation of information provided by TSGene 2.0.