D1044–D1055 Nucleic Acids Research, 2019, Vol. 47, Database issue Published online 16 November 2018 doi: 10.1093/nar/gky1139 OncoBase: a platform for decoding regulatory somatic mutations in human cancers Xianfeng Li1,2,†, Leisheng Shi3,†, Yan Wang1,†, Jianing Zhong4,†, Xiaolu Zhao5, Huajing Teng2, Xiaohui Shi6, Haonan Yang5, Shasha Ruan7, MingKun Li3, Zhong Sheng Sun2,*, Qimin Zhan1,* and Fengbiao Mao 5,*
1Key laboratory of Carcinogenesis and Translational Research (Ministry of Education/Beijing), Laboratory of Molecular Oncology, Peking University Cancer Hospital & Institute, Beijing 100142, China, 2Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing 100101, China, 3Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China, 4Key Laboratory of Prevention and Treatment of Cardiovascular and Cerebrovascular Diseases of Ministry of Education, Gannan Medical University, Ganzhou 341000,China, 5Department of Pathology, University of Michigan, Ann Arbor, MI 48109, USA, 6Sino-Danish college, University of Chinese Academy of Sciences, Beijing 100049, China and 7Department of Clinical Oncology, Renmin Hospital of Wuhan University, Wuhan, Hubei 430072, China
Received August 16, 2018; Revised October 16, 2018; Editorial Decision October 24, 2018; Accepted November 11, 2018
ABSTRACT a large pool. Thus, OncoBase is a useful resource for the functional annotation of regulatory noncoding Whole-exome and whole-genome sequencing have regions and systematically benchmarking the regu- revealed millions of somatic mutations associated latory effects of embedded noncoding somatic mu- with different human cancers, and the vast ma- tations in human carcinogenesis. jority of them are located outside of coding se- quences, making it challenging to directly interpret their functional effects. With the rapid advances in high-throughput sequencing technologies, genome- INTRODUCTION scale long-range chromatin interactions were de- Noncoding variants are capable of causing common dis- tected, and distal target genes of regulatory elements eases and account for the vast majority of heritability (1). were determined using three-dimensional (3D) chro- To date, a majority of studies have focused exclusively on matin looping. Herein, we present OncoBase (http:// the effects of missense variants in coding regions (2–4) that < www.oncobase.biols.ac.cn/), an integrated database comprise 2% of the human genome (5). Mapping variants for annotating 81 385 242 somatic mutations in 68 to the whole genome indicate that disease-associated sin- gle nucleotide polymorphisms (SNPs) are strongly enriched cancer types from more than 120 cancer projects by in regulatory elements, especially those activated in rele- exploring their roles in distal interactions between vant cell types (6). Moreover, numerous studies have shown target genes and regulatory elements. OncoBase in- that associated variants for a particular trait/disease are tegrates local chromatin signatures, 3D chromatin in- significantly enriched in certain regulatory regions of rel- teractions in different cell types and reconstruction evant tissues/cell types (7). Importantly, the noncoding re- of enhancer-target networks using state-of-the-art al- gions possess many functional elements based on one di- gorithms. It employs informative visualization tools mensional (1D) epigenomic features and three-dimensional to display the integrated local and 3D chromatin sig- (3D) spatial long-range interactions that could help to build natures and effects of somatic mutations on regu- accurate enhancer-promoter regulatory pairs; therefore, in- latory elements. Enhancer-promoter interactions es- tegrating noncoding variants with 1D coordinated epige- timated from chromatin interactions are integrated netic profiles and 3D long-range interactions in specific tissue/cell types will provide a promising direction to fine- into a network diffusion system that quantitatively map causal regulatory variants and understand underlying prioritizes somatic mutations and target genes from regulatory mechanisms in human diseases.
*To whom correspondence should be addressed. Tel: +1 734 545 9666; Email: [email protected] Correspondence may also be addressed to Zhong Sheng Sun. Email: [email protected] Correspondence may also be addressed to Qimin Zhan. Email: [email protected] †The authors wish it to be known that, in their opinion, the first four authors should be regarded as Joint First Authors.