Humcfs: a Database of Fragile Sites in Human Chromosomes

Humcfs: a Database of Fragile Sites in Human Chromosomes

Kumar et al. BMC Genomics 2019, 19(Suppl 9):985 https://doi.org/10.1186/s12864-018-5330-5 DATABASE Open Access HumCFS: a database of fragile sites in human chromosomes Rajesh Kumar1,2†, Gandharva Nagpal2†, Vinod Kumar1,2, Salman Sadullah Usmani1,2, Piyush Agrawal1,2 and Gajendra P. S. Raghava1* From 17th International Conference on Bioinformatics (InCoB 2018): Genomics New Delhi, India. 26-28 September, 2018 Abstract Background: Fragile sites are the chromosomal regions that are susceptible to breakage, and their frequency varies among the human population. Based on the frequency of fragile site induction, they are categorized as common and rare fragile sites. Common fragile sites are sensitive to replication stress and often rearranged in cancer. Rare fragile sites are the archetypal trinucleotide repeats. Fragile sites are known to be involved in chromosomal rearrangements in tumors. Human miRNA genes are also present at fragile sites. A better understanding of genes and miRNAs lying in the fragile site regions and their association with disease progression is required. Result: HumCFS is a manually curated database of human chromosomal fragile sites. HumCFS provides useful information on fragile sites such as coordinates on the chromosome, cytoband, their chemical inducers and frequency of fragile site (rare or common), genes and miRNAs lying in fragile sites. Protein coding genes in the fragile sites were identified by mapping the coordinates of fragile sites with human genome Ensembl (GRCh38/hg38). Genes present in fragile sites were further mapped to DisGenNET database, to understand their possible link with human diseases. Human miRNAs from miRBase was also mapped on fragile site coordinates. In brief, HumCFS provides useful information about 125 human chromosomal fragile sites and their association with 4921 human protein-coding genes and 917 human miRNA’s. Conclusion: User-friendly web-interface of HumCFS and hyper-linking with other resources will help researchers to search for genes, miRNAs efficiently and to intersect the relationship among them. For easy data retrieval and analysis, we have integrated standard web-based tools, such as JBrowse, BLAST etc. Also, the user can download the data in various file formats such as text files, gff3 files and Bed-format files which can be used on UCSC browser. Database URL:http://webs.iiitd.edu.in/raghava/humcfs/ Keywords: Chromosomal fragile site, Genomic instability, Database, Chemical inducers, Carcinogenesis, miRNA, DNA elements, Replication stress Background gaps and breaks when cells are exposed to DNA synthesis in- Genomic instability is the hallmark of cancer [1]andseveral hibitors [3]. These are specific loci which conserved among other pathologies such as mental retardation [2]; preferen- human and mouse genomes [4]. CFSs can be classified as tially occurs at specific genomic regions, known as the rare and common depending on their frequency of induction chromosome fragile sites (CFSs). CFSs are the specific within the population. Rare fragile sites are induced explicitly chromosomal regions that exhibit an increased frequency of by BrdU (Bromodeoxyuridine), and folic acid, thus leading to categorization as BrdU (Bromodeoxyuridine) sensitive and * Correspondence: [email protected] folate-sensitive. In case of common fragile sites, Aphidicolin, †Rajesh Kumar and Gandharva Nagpal contributed equally to this work. 5-Azacytidine, and Distamycin-A are most common in- 1 Center for Computational Biology, Indraprastha Institute of Information ducers [5]. Induced fragile sites are involved in sister chroma- Technology, New Delhi 110020, India Full list of author information is available at the end of the article tid exchange, deletion and translocation [6]. © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. Kumar et al. BMC Genomics 2019, 19(Suppl 9):985 Page 2 of 8 Over the past few years, fragile sites have been realized to Construction and content be an important aspect of cancer biology, as most of the Data collection and compilation cancer-related genes occur in the CFSs [7]. The genetic in- The relevant articles were collected from PubMed by stability at fragile sites often results in aberrant expression searching the combination of strings “fragile sites”, of oncogenes and tumor-suppressing genes, a step towards “chromosomal fragile region”, “chromosomal breakage initiation of cancer progression [8]. It has been shown by in region”, “genomic breakpoints and fragile sites”, “gen- vitro studies that translocation, deletion, intra-chromo- omic breakpoints”, to collect comprehensive informa- somal gene arrangement and sister chromatid exchange of tion about fragile sites. The total number of hits cancer-specific genomic regions occur as a consequence of found using the above-searched terms were 4068. cell treatment with fragile site inducers [9, 10]. Common Each abstract was manually examined for the infor- fragile sites even co-localize with breakpoints and deletions mation regarding human CFSs, which includes fragile specific to various tumors [11, 12]. Epigenetic alterations site name, coordinate, cytoband, inducer, frequency, such as histone hypo-acetylation and methylation contrib- type and technique used for their identification. Final ute towards genomic instability at CFS [13]. For example, datawasgatheredfrom83PubMedarticles.Someof TumorsuppressorgeneWWOXlocatedwithinthe the fragile sites are not characterized at the molecular FRA16D fragile site is often aberrantly methylated and can level, for them, regions corresponding to cytoband be correlated with the development of various tumors such positions were considered as fragile sites. Coordinate as ovary, prostate, and breast cancer [14, 15]. Micro-RNAs information is considered to define the position for (miRNAs), which are essential for cell survival, cell differen- molecularly cloned fragile sites as per literature. tiation, metabolism and cell death, also lies in fragile sites, e.g., FRA4D contains miR-218-1 and FRA5G contains Data curation and organization miR-218-2 [16]. The deregulated expression of miRNAs Primary information about the human CFSs, such as the due to chromosomal rearrangement has been associated name of the fragile site, chromosomal location, cytoband, with cancer-specific events and tumor development [17]. type, frequency, technique used for their characterization For instance, the differential expression of miR218 due to and identification, and the chromosome number were chromosomal rearrangement is allied with bladder cancer manually extracted from the PubMed articles. Apart from development [18, 19]. the primary information, we have processed three levels of In the past, numerous resources have been developed to annotation for each fragile site, which provides a complete maintain a wide range of information related to instabil- and well-annotated picture of the fragile sites. First level ities in genome and chromosomes such as i) TICdb, which annotation includes mapping of protein-coding genes on contain information about chromosomal translocation the chromosomal coordinates of fragile sites. For this, data breakpoint in human tumors [20], ii) HYBRIDdb, from human reference genome assembly (GRCh38/hg38) maintains information of hybrid genes in humans [21], iii) (http://hgdownload.soe.ucsc.edu/goldenPath/hg38/chromo- dbCRID, contains information about chromosomal somes/) was downloaded from Ensembl. An in-house Py- rearrangement in diseases [22], iv) chimerDB3.0 is a re- thon script was used to extract the data from human source of fusion genes [23] and v) COSMIC is a Catalogue genome reference assembly and to map on the coordinates of Somatic Mutations in Cancer [24]. Since the discovery of CFSs. Second level annotation of fragile sites includes in- of CFSs, several lines of evidence suggest their involve- formation about miRNAs lying in fragile sites. Human ment in human disease progression, as CFSs are preferred miRNAs data from miRBase [25] was downloaded and sites for exogenous DNA insertion, chromosomal trans- mapped on the fragile site by using a Python script. The location, re-arrangement, and breakpoint. But there re- third level of annotation provides information about hu- mains a significant gap in the current understanding of man CFSs gene and their association with diseases. Each the human CFSs and the functional components lying gene present in the database was linked to the disease cat- within these sites such as miRNAs. One possible reason egorybytakingdatafromDisGenNET[26] database. Fi- could be a lack of attempts to associate and integrate nally, information encompassing 4921 protein-coding divergent CFS studies in literature, leading to the absence genes, 917 miRNA and association of genes to human of a complete view for the study of these versatile human diseases, was systematically and comprehensively compiled genomic regions. Best of our knowledge there is no data- in HumCFS. The complete architecture of HumCFS

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    8 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us