Cancer-Dedicated Gene Set Interpretation Authors

Total Page:16

File Type:pdf, Size:1020Kb

Cancer-Dedicated Gene Set Interpretation Authors Title: oncoEnrichR: cancer-dedicated gene set interpretation Authors: Sigve Nakken1,2,*, Sveinung Gundersen3, Fabian L. M. Bernal3, Eivind Hovig1,4, Jørgen Wesche1,2 Affiliations: 1Department of Tumor Biology, Institute for Cancer Research, Oslo University Hospital, Norway 2Centre for Cancer Cell Reprogramming, Institute of Clinical Medicine, Faculty of Medicine, University of Oslo, Norway 3ELIXIR Norway, Department of Informatics, University of Oslo, Norway 4Centre for Bioinformatics, Department of Informatics, University of Oslo, Norway *Corresponding author: [email protected] Abstract Summary: Interpretation and prioritization of candidate hits from genome-scale screening experiments represent a significant analytical challenge, particularly when it comes to an understanding of cancer relevance. We have developed a flexible tool that substantially refines gene set interpretability in cancer by leveraging a broad scope of prior knowledge unavailable in existing frameworks, including data on target tractabilities, tumor-type association strengths, protein complexes and protein-protein interactions, tissue and cell-type expression specificities, subcellular localizations, prognostic associations, cancer dependency maps, and information on genes of poorly defined or unknown function. Availability: oncoEnrichR is developed in R, and is freely available as a stand-alone R package. A web interface to oncoEnrichR is provided through the Galaxy framework (https://oncotools.elixir.no). All code is open-source under the MIT license, with documentation, example datasets and and instructions for usage available at https://github.com/sigven/oncoEnrichR/ Contact: [email protected] 1 Introduction The search for novel cancer-implicated genes is currently fueled by different types of genome-scale screening experiments, exemplified by gene silencing through siRNA and CRISPR/cas-9, protein interaction screens, or differential gene expression from RNA-seq. Importantly, most high-throughput screens typically produce the same conceptual output, that is dozens or hundreds of candidate genes ranked by effect size. A significant challenge in this respect is to aggregate sufficient prior knowledge regarding the candidate hits, both individually and at a systems level, in order to prioritize likely cancer-implicated genes for in-depth experimental validation. Presently, this effort is mostly supported by various gene set enrichment tools, but also platforms that can prioritize hits through their aberration patterns in different cancer omics datasets. Valuable resources in this regard include Metascape, DAVID, g:Profiler, and EnrichR, which leverage up-to-date databases for different types of enrichment analyses of gene lists (Huang et al., 2009; Chen et al., 2013; Zhou et al., 2019; Reimand et al., 2007). Furthermore, the web-based cBioPortal platform enables elegant interactive visualizations of custom gene lists in relation to tumor gene expression and aberration datasets from The Cancer Genome Atlas (TCGA) and other large-sequencing sequencing projects (Cerami et al., 2012; The Cancer Genome Atlas Research Network et al., 2013). To our knowledge, there is however a lack of gene set interpretation tools that successfully combine a broad set of cancer-dedicated annotations at the individual gene level with multiple perspectives of system-level properties. Specifically, information that is significant for comprehensive geneset interpretability in cancer, yet not sufficiently integrated in available frameworks, include data on gene-tumor type associations, prognostic gene expression associations, target druggability, oncogene/tumor suppressor/cancer driver gene information, tissue- and cell-type expression specificities, as well data on target subcellular localizations, protein complex memberships, and protein-protein interactions. In order to fill this gap, we have developed oncoEnrichR, a cancer-dedicated gene set interpretation workflow where a diverse suite of gene- centric annotation and prioritization modules are supplemented with system-level modules that offer insight at the level of signaling pathways, molecular gene signatures and the protein interactome. A structured and interactive gene set analysis report, highly configurable by the user, is readily available as the main output per run. We believe that oncoEnrichR can function as an easily accessible and all- inclusive resource for candidate hit interpretation in cancer. 2 Construction and Content The oncoEnrichR gene set interpretation tool has been developed as an R package, encompassing four main components (schematically illustrated in Figure 1). In order to make the tool accessible to biologists, oncoEnrichR is also available through the web-based Galaxy platform, which facilitates a reproducible and collaborative analysis framework. Input The main input to the tool is a simple list of gene identifiers. To cater for ease of use, multiple common identifier types are permitted, e.g. official gene symbols, UniProt accessions, RefSeq/Ensembl transcript and protein identifiers etc. An optional specification of a background gene set for use in enrichment analysis is further available, which ensures that enrichment results are performed according to the nature of the underlying screen. The user can also configure different parameters of the individual workflow modules (e.g. enrichment and protein-protein interaction settings), and which parts to include in the output. Data resources & analysis content The backend of oncoEnrichR is based on a comprehensive collection of prior knowledge on cancer- relevant properties of human genes and their interrelationships (a complete overview is provided in Supplementary Table 1). Query genes are annotated with status as tumor suppressor genes, oncogenes or potential cancer drivers according to significant support from the CancerMine text-mining resource and Network of Cancer Genes (Repana et al., 2019; Lever et al., 2019). For gene-disease associations, we exploit the powerful Open Targets Platform, which have established an aggregated association score between phenotypes (i.e. in the form of Experimental Factor Ontology (EFO) terms) and human genes from a multitude of sources, e.g. genetic associations, text mining, or data from animal models (Malone et al., 2010; Ochoa et al., 2021). These association scores enable an effective ranking of the candidate hits in terms of their global association to cancer, as well as tumor-type specific rankings. Importantly, in order to enable the latter functionality, we have established a semi-manually curated mapping between ontology terms and main tumor sites (e.g. breast, lung, colon/rectum etc.), using OncoTree as the starting point (Kundra et al., 2021). The Open Targets Platform is further utilized to gather information on cancer hallmark evidence, known cancer drugs targeted toward candidate genes, as well as overall target tractability (druggability) data, which provides a prioritization of the hits in terms of their potential as targets for inhibition by small molecules or antibodies. Genes of poorly defined or unknown function, as evident from limited Gene Ontology (GO) annotations or gene summary descriptions, can be conveniently highlighted. A systems-level view of the candidate set is provided through a standard enrichment/over- representation analysis towards several databases, including GO, the Molecular Signatures Database, KEGG and NetPath (Kanehisa and Goto, 2000; Liberzon et al., 2011; Kandasamy et al., 2010; Ashburner et al., 2000). This is supplemented by a protein-protein interaction view of the candidate set and their neighbouring partners (STRING), a view where we also integrate druggable targets and known cancer genes (Von Mering et al., 2005). Information on candidate hit memberships in known protein complexes (CORUM) and their subcellular localization patterns (ComPPI) adds additional dimensions to the global view of the candidate set (Giurgiu et al., 2019; Veres et al., 2015). Through the use of omics data from TCGA, candidate hits can be interrogated and prioritized for aberration frequencies in different tumor types, and also the potential co-expression of candidate hits with other cancer-relevant genes. Expression profiling data from healthy tissues (Human Protein Atlas (HPA) and Genotype-Tissue Expression Project) are used to portray tissue- and cell-type specific expression patterns for the candidate genes, which subsequently shows enrichment of particular tissues/cell-types (Jain and Tuteja, 2019; Uhlen et al., 2010). The tool can further list gene- survival associations for multiple genetic features (expression, mutation, copy-number) based on HPA and recently published data (Smith and Sheltzer, 2021) . Finally, the oncoEnrichR workflow integrates gene fitness data and target priority scores from CRISPR-Cas9 mediated cell inactivation of human cancer cell lines (Behan et al., 2019), which have lately shown to be an effective approach to prioritize drug target candidates in cancer. Importantly, intersection of the candidate hits with genetic screen data also indicate relevant cell lines to be used in potential experimental follow-up studies. Output The output of the oncoEnrichR workflow is two-fold. A structured HTML report is provided with 13 separate analysis modules of the above-mentioned features and datasets, encompassing a variety of visualizations and interactive tables for prioritization of candidate hits. Moreover, a multi-sheet Excel workbook with all annotations and
Recommended publications
  • Foamy Viral Vector Integration Sites in SCID-Repopulating Cells After MGMTP140K-Mediated in Vivo Selection
    Gene Therapy (2015) 22, 591–595 © 2015 Macmillan Publishers Limited All rights reserved 0969-7128/15 www.nature.com/gt SHORT COMMUNICATION Foamy viral vector integration sites in SCID-repopulating cells after MGMTP140K-mediated in vivo selection ME Olszko1, JE Adair2, I Linde1,DTRae1, P Trobridge1, JD Hocum1, DJ Rawlings3, H-P Kiem2 and GD Trobridge1,4 Foamy virus (FV) vectors are promising for hematopoietic stem cell (HSC) gene therapy but preclinical data on the clonal composition of FV vector-transduced human repopulating cells is needed. Human CD34+ human cord blood cells were transduced with an FV vector encoding a methylguanine methyltransferase (MGMT)P140K transgene, transplanted into immunodeficient NOD/SCID IL2Rγnull mice, and selected in vivo for gene-modified cells. The retroviral insertion site profile of repopulating clones was examined using modified genomic sequencing PCR. We observed polyclonal repopulation with no evidence of clonal dominance even with the use of a strong internal spleen focus forming virus promoter known to be genotoxic. Our data supports the use of FV vectors with MGMTP140K for HSC gene therapy but also suggests additional safety features should be developed and evaluated. Gene Therapy (2015) 22, 591–595; doi:10.1038/gt.2015.20; published online 19 March 2015 INTRODUCTION Here we examined clonality and retroviral insertion sites (RIS) of Retroviral vectors derived from the foamy viruses (FVs) efficiently FV vectors in human SCID-repopulating cells after in vivo selection. transduce hematopoietic stem cells (HSCs) and are promising for use in HSC gene therapy.1 One challenge for HSC gene therapy is that some diseases like hemoglobinopathies require high RESULTS AND DISCUSSION percentages of gene-marked cells in order to achieve therapeutic Our goal was to investigate the genotoxicity of FV vectors in the correction.
    [Show full text]
  • Mutation Signatures and Mutable Motifs in Cancer Research
    This is an Open Access document downloaded from ORCA, Cardiff University's institutional repository: http://orca.cf.ac.uk/117709/ This is the author’s version of a work that was submitted to / accepted for publication. Citation for final published version: Rogozin, Igor B., Pavlov, Youri I., Goncearenco, Alexander, De, Subhajyoti, Lada, Artem G., Poliakov, Eugenia, Panchenko, Anna R. and Cooper, David N. 2018. Mutational signatures and mutable motifs in cancer genomes. Briefings in Bioinformatics 19 (6) , pp. 1085-1101. 10.1093/bib/bbx049 file Publishers page: http://dx.doi.org/10.1093/bib/bbx049 <http://dx.doi.org/10.1093/bib/bbx049> Please note: Changes made as a result of publishing processes such as copy-editing, formatting and page numbers may not be reflected in this version. For the definitive version of this publication, please refer to the published source. You are advised to consult the publisher’s version if you wish to cite this paper. This version is being made available in accordance with publisher policies. See http://orca.cf.ac.uk/policies.html for usage policies. Copyright and moral rights for publications made available in ORCA are retained by the copyright holders. Mutational signatures and mutable motifs in cancer genomes Igor B. Rogozin1, Youri I. Pavlov2,3, Alexander Goncearenco1, Subhajyoti De4, Artem G. Lada5, Eugenia Poliakov6, Anna R. Panchenko1, David N. Cooper7 1 National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA 2 Eppley Institute for
    [Show full text]
  • Targeted Sequencing Reveals the Somatic Mutation Landscape in a Swedish Breast Cancer Cohort
    www.nature.com/scientificreports OPEN Targeted sequencing reveals the somatic mutation landscape in a Swedish breast cancer cohort Argyri Mathioudaki1*, Viktor Ljungström2, Malin Melin3, Maja Louise Arendt4,5, Jessika Nordin 5, Åsa Karlsson5, Eva Murén5, Pushpa Saksena6, Jennifer R. S. Meadows5, Voichita D. Marinescu5, Tobias Sjöblom 2 & Kerstin Lindblad‑Toh5,7* Breast cancer (BC) is a genetically heterogeneous disease with high prevalence in Northern Europe. However, there has been no detailed investigation into the Scandinavian somatic landscape. Here, in a homogeneous Swedish cohort, we describe the somatic events underlying BC, leveraging a targeted next‑generation sequencing approach. We designed a 20.5 Mb array targeting coding and regulatory regions of genes with a known role in BC (n = 765). The selected genes were either from human BC studies (n = 294) or from within canine mammary tumor associated regions (n = 471). A set of predominantly estrogen receptor positive tumors (ER + 85%) and their normal tissue counterparts (n = 61) were sequenced to ~ 140 × and 85 × mean target coverage, respectively. MuTect2 and VarScan2 were employed to detect single nucleotide variants (SNVs) and copy number aberrations (CNAs), while MutSigCV (SNVs) and GISTIC (CNAs) algorithms estimated the signifcance of recurrent somatic events. The signifcantly mutated genes (q ≤ 0.01) were PIK3CA (28% of patients), TP53 (21%) and CDH1 (11%). However, histone modifying genes contained the largest number of variants (KMT2C and ARID1A, together 28%). Mutations in KMT2C were mutually exclusive with PI3KCA mutations (p ≤ 0. 001) and half of these afect the formation of a functional PHD domain. The tumor suppressor CDK10 was deleted in 80% of the cohort while the oncogene MDM4 was amplifed.
    [Show full text]
  • Deep Sequencing of the X Chromosome Reveals the Proliferation History of Colorectal Adenomas
    De Grassi et al. Genome Biology 2014, 15:437 http://genomebiology.com/2014/15/8/437 RESEARCH Open Access Deep sequencing of the X chromosome reveals the proliferation history of colorectal adenomas Anna De Grassi1,7†, Fabio Iannelli1†, Matteo Cereda1,2, Sara Volorio3, Valentina Melocchi1, Alessandra Viel4, Gianluca Basso5, Luigi Laghi5, Michele Caselle6 and Francesca D Ciccarelli1,2* Abstract Background: Mismatch repair deficient colorectal adenomas are composed of transformed cells that descend from a common founder and progressively accumulate genomic alterations. The proliferation history of these tumors is still largely unknown. Here we present a novel approach to rebuild the proliferation trees that recapitulate the history of individual colorectal adenomas by mapping the progressive acquisition of somatic point mutations during tumor growth. Results: Using our approach, we called high and low frequency mutations acquired in the X chromosome of four mismatch repair deficient colorectal adenomas deriving from male individuals. We clustered these mutations according to their frequencies and rebuilt the proliferation trees directly from the mutation clusters using a recursive algorithm. The trees of all four lesions were formed of a dominant subclone that co-existed with other genetically heterogeneous subpopulations of cells. However, despite this similar hierarchical organization, the growth dynamics varied among and within tumors, likely depending on a combination of tumor-specific genetic and environmental factors. Conclusions: Our study provides insights into the biological properties of individual mismatch repair deficient colorectal adenomas that may influence their growth and also the response to therapy. Extended to other solid tumors, our novel approach could inform on the mechanisms of cancer progression and on the best treatment choice.
    [Show full text]
  • TF–RBP–AS Triplet Analysis Reveals the Mechanisms of Aberrant
    International Journal of Molecular Sciences Article TF–RBP–AS Triplet Analysis Reveals the Mechanisms of Aberrant Alternative Splicing Events in Kidney Cancer: Implications for Their Possible Clinical Use as Prognostic and Therapeutic Biomarkers Meng He and Fuyan Hu * Department of Statistics, School of Science, Wuhan University of Technology, 122 Luoshi Road, Wuhan 430070, China; [email protected] * Correspondence: [email protected]; Tel.: +86-027-87108033 Abstract: Aberrant alternative splicing (AS) is increasingly linked to cancer; however, how AS contributes to cancer development still remains largely unknown. AS events (ASEs) are largely regulated by RNA-binding proteins (RBPs) whose ability can be modulated by a variety of genetic and epigenetic mechanisms. In this study, we used a computational framework to investigate the roles of transcription factors (TFs) on regulating RBP-AS interactions. A total of 6519 TF–RBP– AS triplets were identified, including 290 TFs, 175 RBPs, and 16 ASEs from TCGA–KIRC RNA sequencing data. TF function categories were defined according to correlation changes between RBP expression and their targeted ASEs. The results suggested that most TFs affected multiple targets, and six different classes of TF-mediated transcriptional dysregulations were identified. Then, Citation: He, M.; Hu, F. TF–RBP–AS regulatory networks were constructed for TF–RBP–AS triplets. Further pathway-enrichment analysis Triplet Analysis Reveals the showed that these TFs and RBPs involved in triplets were enriched in a variety of pathways that were Mechanisms of Aberrant Alternative associated with cancer development and progression. Survival analysis showed that some triplets Splicing Events in Kidney Cancer: were highly associated with survival rates.
    [Show full text]
  • NF-Y Subunits Overexpression in HNSCC
    cancers Article NF-Y Subunits Overexpression in HNSCC Eugenia Bezzecchi 1, Andrea Bernardini 1 , Mirko Ronzio 1 , Claudia Miccolo 2 , Susanna Chiocca 2 , Diletta Dolfini 1 and Roberto Mantovani 1,* 1 Dipartimento di Bioscienze, Università degli Studi di Milano, Via Celoria 26, 20133 Milano, Italy; [email protected] (E.B.); [email protected] (A.B.); [email protected] (M.R.); diletta.dolfi[email protected] (D.D.) 2 Department of Experimental Oncology, IEO, European Institute of Oncology IRCCS, Via Adamello 16, 20139 Milan, Italy; [email protected] (C.M.); [email protected] (S.C.) * Correspondence: [email protected]; Tel.: +39-02-5031-5005 Simple Summary: Cancer cells have altered gene expression profiles. This is ultimately elicited by altered structure, expression or binding of transcription factors to regulatory regions of genomes. The CCAAT-binding trimer is a pioneer transcription factor involved in the activation of “cancer” genes. We and others have shown that the regulatory NF-YA subunit is overexpressed in epithelial cancers. Here, we examined large datasets of bulk gene expression profiles, as well as single-cell data, in head and neck squamous cell carcinomas by bioinformatic methods. We partitioned tumors according to molecular subtypes, mutations and positivity for HPV. We came to the conclusion that high levels of the histone-like subunits and the “short” NF-YAs isoform are protective in HPV-positive tumors. On the other hand, high levels of the “long” NF-YAl were found in the recently identified aggressive and metastasis-prone cell population undergoing partial epithelial to mesenchymal transition, p-EMT.
    [Show full text]
  • Comparative Assessment of Genes Driving Cancer and Somatic Evolution
    bioRxiv preprint doi: https://doi.org/10.1101/2021.08.31.458177; this version posted August 31, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license. Comparative assessment of genes driving cancer and somatic evolution. Lisa Dressler1,2,3*, Michele Bortolomeazzi1,2*, Mohamed Reda Keddar1,2*, Hrvoje Misetic1,2*, Giulia Sartini1,2*, Amelia Acha-Sagredo1,2*, Lucia Montorsi1,2*, Neshika Wijewardhane1,2, Dimitra Repana1,2, Joel Nulsen1,2, Jacki Goldman4, Marc Pollit4, Patrick Davis4, Amy Strange4, Karen Ambrose4, Francesca D. Ciccarelli1,2** 1 Cancer Systems Biology Laboratory, The Francis Crick Institute, London NW1 1AT, UK 2 School of Cancer and Pharmaceutical Sciences, King’s College London, London SE11UL, UK 3 Department of Medical and Molecular Genetics, King’s College London, London SE1 9RT, UK 4 Scientific Computing, The Francis Crick Institute, London NW1 1AT, UK * = equal contribution ** = corresponding author ([email protected]) 1 bioRxiv preprint doi: https://doi.org/10.1101/2021.08.31.458177; this version posted August 31, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license. ABSTRACT Genetic alterations of somatic cells can drive non-malignant clone formation and promote cancer initiation. However, the link between these processes remains unclear hampering our understanding of tissue homeostasis and cancer development.
    [Show full text]
  • The Network of Cancer Genes (NCG): a Comprehensive Catalogue of Known and Candidate Cancer Genes from Cancer Sequencing Screens
    bioRxiv preprint doi: https://doi.org/10.1101/389858; this version posted December 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. The Network of Cancer Genes (NCG): a comprehensive catalogue of known and candidate cancer genes from cancer sequencing screens. Dimitra Repana*, Joel Nulsen*, Lisa Dressler*, Michele Bortolomeazzi*, Santhilata Kuppili Venkata*, Aikaterini Tourna, Anna Yakovleva, Tommaso Palmieri and Francesca D. Ciccarelli¶ Cancer Systems Biology Laboratory, The Francis Crick Institute, London NW1 1AT, UK and School of Cancer and Pharmaceutical Sciences, King’s College London, London SE1 1UL, UK * Equal contribution ¶ To whom correspondence should be addressed: Tel: +44 (0)20 7848 6616; Fax: +44 (0)20 7848 6220; Email: [email protected] Author contact details: [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] [email protected] 1 bioRxiv preprint doi: https://doi.org/10.1101/389858; this version posted December 1, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. ABSTRACT The Network of Cancer Genes (NCG) is a manually curated repository of 2,372 genes whose somatic modifications have a known or predicted cancer driver role.
    [Show full text]
  • A Graph-Theoretic Approach to Model Genomic Data and Identify Biological Modules Asscociated with Cancer Outcomes
    A Graph-Theoretic Approach to Model Genomic Data and Identify Biological Modules Asscociated with Cancer Outcomes Deanna Petrochilos A dissertation presented in partial fulfillment of the requirements for the degree of Doctor of Philosophy University of Washington 2013 Reading Committee: Neil Abernethy, Chair John Gennari, Ali Shojaie Program Authorized to Offer Degree: Biomedical Informatics and Health Education ©Copyright 2013 Deanna Petrochilos University of Washington Abstract Using Graph-Based Methods to Integrate and Analyze Cancer Genomic Data Deanna Petrochilos Chair of the Supervisory Committee: Assistant Professor Neil Abernethy Biomedical Informatics and Health Education Studies of the genetic basis of complex disease present statistical and methodological challenges in the discovery of reliable and high-confidence genes that reveal biological phenomena underlying the etiology of disease or gene signatures prognostic of disease outcomes. This dissertation examines the capacity of graph-theoretical methods to model and analyze genomic information and thus facilitate using prior knowledge to create a more discrete and functionally relevant feature space. To assess the statistical and computational value of graph-based algorithms in genomic studies of cancer onset and progression, I apply a random walk graph algorithm in a weighted interaction network. I merge high-throughput co-expression and curated interaction data to search for biological modules associated with key cancer processes and evaluate significant modules in terms
    [Show full text]
  • Oncogenes, Tumor Suppressor and Differentiation Genes Represent The
    www.nature.com/scientificreports OPEN Oncogenes, tumor suppressor and diferentiation genes represent the oldest human gene classes and evolve concurrently A. A. Makashov1,2,5, S. V. Malov3,4 & A. P. Kozlov1,2,5,6* Earlier we showed that human genome contains many evolutionarily young or novel genes with tumor- specifc or tumor-predominant expression. We suggest calling such genes Tumor Specifcally Expressed, Evolutionarily New (TSEEN) genes. In this paper we performed a study of the evolutionary ages of diferent classes of human genes, using homology searches in genomes of diferent taxa in human lineage. We discovered that diferent classes of human genes have diferent evolutionary ages and confrmed the existence of TSEEN gene classes. On the other hand, we found that oncogenes, tumor- suppressor genes and diferentiation genes are among the oldest gene classes in humans and their evolution occurs concurrently. These fndings confrm non-trivial predictions made by our hypothesis of the possible evolutionary role of hereditary tumors. The results may be important for better understanding of tumor biology. TSEEN genes may become the best tumor markers. We are interested in the possible evolutionary role of tumors. In previous publications1–5 we formulated the hypothesis of the possible evolutionary role of hereditary tumors, i.e. tumors that can be passed from parent to ofspring. According to this hypothesis, hereditary tumors were the source of extra cell masses which could be used in the evolution of multicellular organisms for the expression of evolutionarily novel genes, for the origin of new diferentiated cell types with novel functions and for building new structures which constitute evolutionary innovations and morphological novelties.
    [Show full text]
  • (NCG): a Comprehensive Catalogue of Known and Candidate Cancer Genes from Cancer Sequencing Screens
    King’s Research Portal DOI: 10.1186/s13059-018-1612-0 Document Version Version created as part of publication process; publisher's layout; not normally made publicly available Link to publication record in King's Research Portal Citation for published version (APA): Repana, D., Nulsen, J., Dressler, L., Bortolomeazzi, M., Venkata, S. K., Tourna, A., Yakovleva, A., Palmieri, T., & Ciccarelli, F. D. (2019). The Network of Cancer Genes (NCG): a comprehensive catalogue of known and candidate cancer genes from cancer sequencing screens. Genome Biology, 20(1), [1]. https://doi.org/10.1186/s13059-018-1612-0 Citing this paper Please note that where the full-text provided on King's Research Portal is the Author Accepted Manuscript or Post-Print version this may differ from the final Published version. If citing, it is advised that you check and use the publisher's definitive version for pagination, volume/issue, and date of publication details. And where the final published version is provided on the Research Portal, if citing you are again advised to check the publisher's website for any subsequent corrections. General rights Copyright and moral rights for the publications made accessible in the Research Portal are retained by the authors and/or other copyright owners and it is a condition of accessing publications that users recognize and abide by the legal requirements associated with these rights. •Users may download and print one copy of any publication from the Research Portal for the purpose of private study or research. •You may not further distribute the material or use it for any profit-making activity or commercial gain •You may freely distribute the URL identifying the publication in the Research Portal Take down policy If you believe that this document breaches copyright please contact [email protected] providing details, and we will remove access to the work immediately and investigate your claim.
    [Show full text]
  • The Network of Cancer Genes (NCG): a Comprehensive Catalogue Of
    Repana et al. Genome Biology (2019) 20:1 https://doi.org/10.1186/s13059-018-1612-0 DATABASE Open Access The Network of Cancer Genes (NCG): a comprehensive catalogue of known and candidate cancer genes from cancer sequencing screens Dimitra Repana1,2†, Joel Nulsen1,2†, Lisa Dressler1,2†, Michele Bortolomeazzi1,2†, Santhilata Kuppili Venkata1,2†, Aikaterini Tourna1,2, Anna Yakovleva1,2, Tommaso Palmieri1,2 and Francesca D. Ciccarelli1,2* Abstract The Network of Cancer Genes (NCG) is a manually curated repository of 2372 genes whose somatic modifications have known or predicted cancer driver roles. These genes were collected from 275 publications, including two sources of known cancer genes and 273 cancer sequencing screens of more than 100 cancer types from 34,905 cancer donors and multiple primary sites. This represents a more than 1.5-fold content increase compared to the previous version. NCG also annotates properties of cancer genes, such as duplicability, evolutionary origin, RNA and protein expression, miRNA and protein interactions, and protein function and essentiality. NCG is accessible at http://ncg.kcl.ac.uk/. Keywords: Cancer genomics screens, Cancer genes, Cancer heterogeneity, Systems-level properties Background project [5] and for the recent analysis of the whole set of One of the main goals of cancer genomics is to find the TCGA samples [6], which accompanied the conclusion genes that, upon acquiring somatic alterations, play a of the TCGA sequencing phase [7]. As more and more role in driving cancer (cancer genes). To this end, in the studies contribute to our knowledge of cancer genes, it last 10 years, hundreds of cancer sequencing screens becomes increasingly challenging for the research com- have generated mutational data from thousands of munity to maintain an up-to-date overview of cancer cancer samples.
    [Show full text]