ABSTRACT DATA DRIVEN APPROACHES to IDENTIFY DETERMINANTS of HEART DISEASES and CANCER RESISTANCE Avinash Das Sahu, Doctor Of

Total Page:16

File Type:pdf, Size:1020Kb

ABSTRACT DATA DRIVEN APPROACHES to IDENTIFY DETERMINANTS of HEART DISEASES and CANCER RESISTANCE Avinash Das Sahu, Doctor Of ABSTRACT Title of dissertation: DATA DRIVEN APPROACHES TO IDENTIFY DETERMINANTS OF HEART DISEASES AND CANCER RESISTANCE Avinash Das Sahu, Doctor of Philosophy, 2016 Dissertation directed by: Professor Sridhar Hannenhalli Department of Computer Science Cancer and cardio-vascular diseases are the leading causes of death world-wide. Caused by systemic genetic and molecular disruptions in cells, these disorders are the manifestation of profound disturbance of normal cellular homeostasis. People suffering or at high risk for these disorders need early diagnosis and personalized therapeutic intervention. Successful implementation of such clinical measures can significantly improve global health. However, development of effective therapies is hindered by the challenges in identifying genetic and molecular determinants of the onset of diseases; and in cases where therapies already exist, the main challenge is to identify molecular determinants that drive resistance to the therapies. Due to the progress in sequencing technologies, the access to a large genome-wide biolog- ical data is now extended far beyond few experimental labs to the global research community. The unprecedented availability of the data has revolutionized the ca- pabilities of computational researchers, enabling them to collaboratively address the long standing problems from many different perspectives. Likewise, this thesis tackles the two main public health related challenges using data driven approaches. Numerous association studies have been proposed to identify genomic variants that determine disease. However, their clinical utility remains limited due to their inability to distinguish causal variants from associated variants. In the presented thesis, we first propose a simple scheme that improves association studies in su- pervised fashion and has shown its applicability in identifying genomic regulatory variants associated with hypertension. Next, we propose a coupled Bayesian re- gression approach { eQTeL, which leverages epigenetic data to estimate regulatory and gene interaction potential, and identifies combinations of regulatory genomic variants that explain the gene expression variance. On human heart data, eQTeL not only explains a significantly greater proportion of expression variance in sam- ples, but also predicts gene expression more accurately than other methods. We demonstrate that eQTeL accurately detects causal regulatory SNPs by simulation, particularly those with small effect sizes. Using various functional data, we show that SNPs detected by eQTeL are enriched for allele-specific protein binding and hi- stone modifications, which potentially disrupt binding of core cardiac transcription factors and are spatially proximal to their target. eQTeL SNPs capture a substantial proportion of genetic determinants of expression variance and we estimate that 58% of these SNPs are putatively causal. The challenge of identifying molecular determinants of cancer resistance so far could only be dealt with labor intensive and costly experimental studies, and in case of experimental drugs such studies are infeasible. Here we take a fundamentally different data driven approach to understand the evolving landscape of emerging re- sistance. We introduce a novel class of genetic interactions termed synthetic rescues (SR) in cancer, which denotes a functional interaction between two genes where a change in the activity of one vulnerable gene (which may be a target of a cancer drug) is lethal, but subsequently altered activity of its partner rescuer gene restores cell viability. Next we describe a comprehensive computational framework {termed INCISOR{ for identifying SR underlying cancer resistance. Applying INCISOR to mine The Cancer Genome Atlas (TCGA), a large collection of cancer patient data, we identified the first pan-cancer SR networks, composed of interactions common to many cancer types. We experimentally test and validate a subset of these in- teractions involving the master regulator gene mTOR. We find that rescuer genes become increasingly activated as breast cancer progresses, testifying to pervasive ongoing rescue processes. We show that SRs can be utilized to successfully predict patients' survival and response to the majority of current cancer drugs, and impor- tantly, for predicting the emergence of drug resistance from the initial tumor biopsy. Our analysis suggests a potential new strategy for enhancing the effectiveness of ex- isting cancer therapies by targeting their rescuer genes to counteract resistance. The thesis provides statistical frameworks that can harness ever increasing high throughput genomic data to address challenges in determining the molecular underpinnings of hypertension, cardiovascular disease and cancer resistance. We discover novel molecular mechanistic insights that will advance the progress in early disease prevention and personalized therapeutics. Our analyses sheds light on the fundamental biological understanding of gene regulation and interaction, and opens up exciting avenues of translational applications in risk prediction and therapeutics. DATA DRIVEN APPROACHES TO IDENTIFY DETERMINANTS OF HEART DISEASES & CANCER RESISTANCE by Avinash Das Sahu Dissertation submitted to the Faculty of the Graduate School of the University of Maryland, College Park in partial fulfillment of the requirements for the degree of Doctor of Philosophy 2016 Advisory Committee: Professor Sridhar Hannenhalli, Chair/Advisor Professor Eytan Ruppin, Co-Advisor Professor Laura Elnitski Professor Hector Corrada Bravo Professor Michael Cummings c Copyright by Avinash Das Sahu 2016 ii Contents iii 1 Introduction1 1.1 How does a cell function?........................1 1.2 How does a cell produce proteins?....................2 1.3 How can same DNA give rise to drastically different cells?......4 1.4 How do eukaryotes regulate genes in a cell-type specific manner?..6 1.5 Biological processes performed by genes.................8 1.6 Disruption of biological processes causes diseases...........9 1.6.1 Mutation.............................9 1.6.2 Coding mutation......................... 10 1.6.3 Non-coding mutation....................... 10 1.7 Heritable mutation disorder....................... 12 1.7.1 Expression quantitative trail loci (eQTL)............ 13 1.7.2 Genome wide association studies (GWAS)........... 14 1.7.3 Limitation of association studies................. 14 1.7.4 How to improve association studies?.............. 16 1.8 Somatic mutation disorder........................ 17 1.8.1 Hallmarks of cancer........................ 17 1.8.2 Cancer therapies......................... 20 1.8.3 Cancer resistance and molecular reprogramming........ 21 1.8.4 Genetic interactions in cancer.................. 22 1.9 Computation challenges......................... 23 1.10 Significance................................ 26 1.10.1 Cardio-vascular disease and hypertension............ 27 1.10.2 Cancer............................... 28 1.11 Organization of Thesis.......................... 29 1.12 Contribution................................ 30 I Cardio-vascular disease and hyper-tension 33 2 EPIGENOMIC MODEL OF CARDIAC ENHANCERS WITH AP- PLICATION TO GENOME WIDE ASSOCIATION STUDIES 35 2.1 Overview.................................. 35 iii 2.2 Background................................ 38 2.2.1 Expression quantitive trait loci................. 38 2.2.2 Genome wide association studies................ 40 2.2.3 Epigenetics and regulation.................... 41 2.2.4 Epigenetic Modifications..................... 42 2.2.5 Epigenetic Inheritance...................... 43 2.2.6 Support vector machines (SVM)................. 43 2.3 Methods.................................. 44 2.3.1 Correlating DNase Hypersensitivity and Gene Expression... 46 2.4 Results................................... 47 2.4.1 SVM model for cardiac enhancers................ 47 2.4.2 Identification of cardiac enhancers near SNPs associated with cardiac phenotypes........................ 51 2.4.3 Cardiac enhancers near cardiac GWAS SNPs are enriched for cardiac regulator motifs..................... 53 2.4.4 Cardiac enhancers near cardiac GWAS SNPs are likely to reg- ulate the nearby genes...................... 54 2.4.5 Genes near cardiac enhancers are enriched for cardiac function 56 2.5 Conclusion................................. 57 3 Bayesian integration of genetics and epigenetics detects causal regulatory SNPs underlying expression variability 59 3.1 Introduction................................ 59 3.2 Results................................... 61 3.3 Quantitative Trait enhancer Loci (eQTeL) model........... 61 3.4 eQTeL detects expression regulatory SNP in MAGNet......... 65 3.5 eQTeL detects causal SNPs in semi-synthetic data........... 68 3.6 eQTeL detects SNPs with small effect sizes............... 71 3.7 eeSNPs lie within protein-bound genomic regions........... 74 3.8 eeSNPs exhibit binding and regulatory allele specificity........ 75 3.9 eeSNPs are spatially proximal to their target gene........... 78 3.10 eeSNPs disrupt motifs of cardiac transcription factors......... 78 3.11 Proportion of eeSNPs that are causal.................. 80 3.12 Methods.................................. 82 3.12.1 Modeling regulatory-interaction potential:........... 82 3.12.2 Modeling Gene Expression:................... 82 3.12.3 Cardiac expression data (MAGNet):.............. 85 3.12.4 Selection of genes:......................... 86 3.12.5 Pre-procession of gene-expression:................ 86 3.12.6 Genotypes and imputation for cardiac
Recommended publications
  • John Snow Lecture, by George Davey Smith
    Published by Oxford University Press on behalf of the International Epidemiological Association International Journal of Epidemiology 2011;40:537–562 ß The Author 2011; all rights reserved. doi:10.1093/ije/dyr117 IEA JOHN SNOW LECTURE 2011 Epidemiology, epigenetics and the ‘Gloomy Prospect’: embracing randomness in population health research and practice George Davey Smith MRC Centre for Causal Analyses in Translational Epidemiology, School of Social and Community Medicine, University of Bristol, Oakfield House, Oakfield Grove, Bristol BS8 2BN, UK Epidemiologists aim to identify modifiable causes of disease, this often being a prerequisite for the application of epidemiological findings in public health pro- grammes, health service planning and clinical medicine. Despite successes in identifying causes, it is often claimed that there are missing additional causes for even reasonably well-understood conditions such as lung cancer and coronary heart disease. Several lines of evidence suggest that largely chance events, from the biographical down to the sub-cellular, contribute an important stochastic element to disease risk that is not epidemiologically tractable at the individual level. Epigenetic influences provide a fashionable contemporary explanation for such seemingly random processes. Chance events—such as a particular lifelong smoker living unharmed to 100 years—are averaged out at the group level. As a consequence population-level differences (for example, secular trends or differ- ences between administrative areas) can be entirely explicable by causal factors that appear to account for only a small proportion of individual-level risk. In public health terms, a modifiable cause of the large majority of cases of a disease may have been identified, with a wild goose chase continuing in an attempt to discipline the random nature of the world with respect to which particular indi- viduals will succumb.
    [Show full text]
  • Manual Annotation and Analysis of the Defensin Gene Cluster in the C57BL
    BMC Genomics BioMed Central Research article Open Access Manual annotation and analysis of the defensin gene cluster in the C57BL/6J mouse reference genome Clara Amid*†1, Linda M Rehaume*†2, Kelly L Brown2,3, James GR Gilbert1, Gordon Dougan1, Robert EW Hancock2 and Jennifer L Harrow1 Address: 1Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK, 2University of British Columbia, Centre for Microbial Disease & Immunity Research, 2259 Lower Mall, Vancouver, BC, V6T 1Z4, Canada and 3Department of Rheumatology and Inflammation Research, Göteborg University, Guldhedsgatan 10, S-413 46 Göteborg, Sweden Email: Clara Amid* - [email protected]; Linda M Rehaume* - [email protected]; Kelly L Brown - [email protected]; James GR Gilbert - [email protected]; Gordon Dougan - [email protected]; Robert EW Hancock - [email protected]; Jennifer L Harrow - [email protected] * Corresponding authors †Equal contributors Published: 15 December 2009 Received: 15 May 2009 Accepted: 15 December 2009 BMC Genomics 2009, 10:606 doi:10.1186/1471-2164-10-606 This article is available from: http://www.biomedcentral.com/1471-2164/10/606 © 2009 Amid et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Abstract Background: Host defense peptides are a critical component of the innate immune system. Human alpha- and beta-defensin genes are subject to copy number variation (CNV) and historically the organization of mouse alpha-defensin genes has been poorly defined.
    [Show full text]
  • A Computational Approach for Defining a Signature of Β-Cell Golgi Stress in Diabetes Mellitus
    Page 1 of 781 Diabetes A Computational Approach for Defining a Signature of β-Cell Golgi Stress in Diabetes Mellitus Robert N. Bone1,6,7, Olufunmilola Oyebamiji2, Sayali Talware2, Sharmila Selvaraj2, Preethi Krishnan3,6, Farooq Syed1,6,7, Huanmei Wu2, Carmella Evans-Molina 1,3,4,5,6,7,8* Departments of 1Pediatrics, 3Medicine, 4Anatomy, Cell Biology & Physiology, 5Biochemistry & Molecular Biology, the 6Center for Diabetes & Metabolic Diseases, and the 7Herman B. Wells Center for Pediatric Research, Indiana University School of Medicine, Indianapolis, IN 46202; 2Department of BioHealth Informatics, Indiana University-Purdue University Indianapolis, Indianapolis, IN, 46202; 8Roudebush VA Medical Center, Indianapolis, IN 46202. *Corresponding Author(s): Carmella Evans-Molina, MD, PhD ([email protected]) Indiana University School of Medicine, 635 Barnhill Drive, MS 2031A, Indianapolis, IN 46202, Telephone: (317) 274-4145, Fax (317) 274-4107 Running Title: Golgi Stress Response in Diabetes Word Count: 4358 Number of Figures: 6 Keywords: Golgi apparatus stress, Islets, β cell, Type 1 diabetes, Type 2 diabetes 1 Diabetes Publish Ahead of Print, published online August 20, 2020 Diabetes Page 2 of 781 ABSTRACT The Golgi apparatus (GA) is an important site of insulin processing and granule maturation, but whether GA organelle dysfunction and GA stress are present in the diabetic β-cell has not been tested. We utilized an informatics-based approach to develop a transcriptional signature of β-cell GA stress using existing RNA sequencing and microarray datasets generated using human islets from donors with diabetes and islets where type 1(T1D) and type 2 diabetes (T2D) had been modeled ex vivo. To narrow our results to GA-specific genes, we applied a filter set of 1,030 genes accepted as GA associated.
    [Show full text]
  • Mapping DNA Structural Variation in Dogs
    Downloaded from genome.cshlp.org on October 3, 2021 - Published by Cold Spring Harbor Laboratory Press Resource Mapping DNA structural variation in dogs Wei-Kang Chen,1,4 Joshua D. Swartz,1,4,5 Laura J. Rush,2 and Carlos E. Alvarez1,3,6 1Center for Molecular and Human Genetics, The Research Institute at Nationwide Children’s Hospital, Columbus, Ohio 43205, USA; 2Department of Veterinary Biosciences, The Ohio State University, Columbus, Ohio 43210, USA; 3Department of Pediatrics, The Ohio State University College of Medicine, Columbus, Ohio 43210, USA DNA structural variation (SV) comprises a major portion of genetic diversity, but its biological impact is unclear. We propose that the genetic history and extraordinary phenotypic variation of dogs make them an ideal mammal in which to study the effects of SV on biology and disease. The hundreds of existing dog breeds were created by selection of extreme morphological and behavioral traits. And along with those traits, each breed carries increased risk for different diseases. We used array CGH to create the first map of DNA copy number variation (CNV) or SV in dogs. The extent of this variation, and some of the gene classes affected, are similar to those of mice and humans. Most canine CNVs affect genes, including disease and candidate disease genes, and are thus likely to be functional. We identified many CNVs that may be breed or breed class specific. Cluster analysis of CNV regions showed that dog breeds tend to group according to breed classes. Our combined findings suggest many CNVs are (1) in linkage disequilibrium with flanking sequence, and (2) associated with breed-specific traits.
    [Show full text]
  • Assigning Folds to the Proteins Encoded by the Genome of Mycoplasma Genitalium (Protein Fold Recognition͞computer Analysis of Genome Sequences)
    Proc. Natl. Acad. Sci. USA Vol. 94, pp. 11929–11934, October 1997 Biophysics Assigning folds to the proteins encoded by the genome of Mycoplasma genitalium (protein fold recognitionycomputer analysis of genome sequences) DANIEL FISCHER* AND DAVID EISENBERG University of California, Los Angeles–Department of Energy Laboratory of Structural Biology and Molecular Medicine, Molecular Biology Institute, University of California, Los Angeles, Box 951570, Los Angeles, CA 90095-1570 Contributed by David Eisenberg, August 8, 1997 ABSTRACT A crucial step in exploiting the information genitalium (MG) (10), as a test of the capabilities of our inherent in genome sequences is to assign to each protein automatic fold recognition server and as a case study to sequence its three-dimensional fold and biological function. identify the difficulties facing automated fold assignment. Here we describe fold assignment for the proteins encoded by the small genome of Mycoplasma genitalium. The assignment MATERIALS AND METHODS was carried out by our computer server (http:yywww.doe- mbi.ucla.eduypeopleyfrsvryfrsvr.html), which assigns folds to The MG Sequences. The 468 MG sequences were obtained amino acid sequences by comparing sequence-derived predic- from The Institute for Genome Research (TIGR) through its tions with known structures. Of the total of 468 protein ORFs, Web address: http:yywww.tigr.orgytdbymdbymgdbymgd- 103 (22%) can be assigned a known protein fold with high b.html. Three types of annotation (based on searches in the confidence, as cross-validated with tests on known structures. sequence database) accompany each TIGR sequence (10): (i) Of these sequences, 75 (16%) show enough sequence similarity functional assignment—a clear sequence similarity with a to proteins of known structure that they can also be detected protein of known function from another organism was found by traditional sequence–sequence comparison methods.
    [Show full text]
  • Practical Structure-Sequence Alignment of Pseudoknotted Rnas Wei Wang
    Practical structure-sequence alignment of pseudoknotted RNAs Wei Wang To cite this version: Wei Wang. Practical structure-sequence alignment of pseudoknotted RNAs. Bioinformatics [q- bio.QM]. Université Paris Saclay (COmUE), 2017. English. NNT : 2017SACLS563. tel-01697889 HAL Id: tel-01697889 https://tel.archives-ouvertes.fr/tel-01697889 Submitted on 31 Jan 2018 HAL is a multi-disciplinary open access L’archive ouverte pluridisciplinaire HAL, est archive for the deposit and dissemination of sci- destinée au dépôt et à la diffusion de documents entific research documents, whether they are pub- scientifiques de niveau recherche, publiés ou non, lished or not. The documents may come from émanant des établissements d’enseignement et de teaching and research institutions in France or recherche français ou étrangers, des laboratoires abroad, or from public or private research centers. publics ou privés. 1 NNT : 2017SACLS563 Thèse de doctorat de l’Université Paris-Saclay préparée à L’Université Paris-Sud Ecole doctorale n◦580 (STIC) Sciences et Technologies de l’Information et de la Communication Spécialité de doctorat : Informatique par M. Wei WANG Alignement pratique de structure-séquence d’ARN avec pseudonœuds Thèse présentée et soutenue à Orsay, le 18 Décembre 2017. Composition du Jury : Mme Hélène TOUZET Directrice de Recherche (Présidente) CNRS, Université Lille 1 M. Guillaume FERTIN Professeur (Rapporteur) Université de Nantes M. Jan GORODKIN Professeur (Rapporteur) University of Copenhagen Mme Johanne COHEN Directrice de Recherche (Examinatrice)
    [Show full text]
  • Detailed Characterization of Human Induced Pluripotent Stem Cells Manufactured for Therapeutic Applications
    Stem Cell Rev and Rep DOI 10.1007/s12015-016-9662-8 Detailed Characterization of Human Induced Pluripotent Stem Cells Manufactured for Therapeutic Applications Behnam Ahmadian Baghbaderani 1 & Adhikarla Syama2 & Renuka Sivapatham3 & Ying Pei4 & Odity Mukherjee2 & Thomas Fellner1 & Xianmin Zeng3,4 & Mahendra S. Rao5,6 # The Author(s) 2016. This article is published with open access at Springerlink.com Abstract We have recently described manufacturing of hu- help determine which set of tests will be most useful in mon- man induced pluripotent stem cells (iPSC) master cell banks itoring the cells and establishing criteria for discarding a line. (MCB) generated by a clinically compliant process using cord blood as a starting material (Baghbaderani et al. in Stem Cell Keywords Induced pluripotent stem cells . Embryonic stem Reports, 5(4), 647–659, 2015). In this manuscript, we de- cells . Manufacturing . cGMP . Consent . Markers scribe the detailed characterization of the two iPSC clones generated using this process, including whole genome se- quencing (WGS), microarray, and comparative genomic hy- Introduction bridization (aCGH) single nucleotide polymorphism (SNP) analysis. We compare their profiles with a proposed calibra- Induced pluripotent stem cells (iPSCs) are akin to embryonic tion material and with a reporter subclone and lines made by a stem cells (ESC) [2] in their developmental potential, but dif- similar process from different donors. We believe that iPSCs fer from ESC in the starting cell used and the requirement of a are likely to be used to make multiple clinical products. We set of proteins to induce pluripotency [3]. Although function- further believe that the lines used as input material will be used ally identical, iPSCs may differ from ESC in subtle ways, at different sites and, given their immortal status, will be used including in their epigenetic profile, exposure to the environ- for many years or even decades.
    [Show full text]
  • UC San Diego UC San Diego Electronic Theses and Dissertations
    UC San Diego UC San Diego Electronic Theses and Dissertations Title Analysis and applications of conserved sequence patterns in proteins Permalink https://escholarship.org/uc/item/5kh9z3fr Author Ie, Tze Way Eugene Publication Date 2007 Peer reviewed|Thesis/dissertation eScholarship.org Powered by the California Digital Library University of California UNIVERSITY OF CALIFORNIA, SAN DIEGO Analysis and Applications of Conserved Sequence Patterns in Proteins A dissertation submitted in partial satisfaction of the requirements for the degree Doctor of Philosophy in Computer Science by Tze Way Eugene Ie Committee in charge: Professor Yoav Freund, Chair Professor Sanjoy Dasgupta Professor Charles Elkan Professor Terry Gaasterland Professor Philip Papadopoulos Professor Pavel Pevzner 2007 Copyright Tze Way Eugene Ie, 2007 All rights reserved. The dissertation of Tze Way Eugene Ie is ap- proved, and it is acceptable in quality and form for publication on microfilm: Chair University of California, San Diego 2007 iii DEDICATION This dissertation is dedicated in memory of my beloved father, Ie It Sim (1951–1997). iv TABLE OF CONTENTS Signature Page . iii Dedication . iv Table of Contents . v List of Figures . viii List of Tables . x Acknowledgements . xi Vita, Publications, and Fields of Study . xiii Abstract of the Dissertation . xv 1 Introduction . 1 1.1 Protein Homology Search . 1 1.2 Sequence Comparison Methods . 2 1.3 Statistical Analysis of Protein Motifs . 4 1.4 Motif Finding using Random Projections . 5 1.5 Microbial Gene Finding without Assembly . 7 2 Multi-Class Protein Classification using Adaptive Codes . 9 2.1 Profile-based Protein Classifiers . 14 2.2 Embedding Base Classifiers in Code Space .
    [Show full text]
  • High-Resolution Analysis of Chromosomal Alterations in Adult Acute Lymphoblastic Leukemia
    Elmer Press Original Article J Hematol. 2014;3(3):65-71 High-Resolution Analysis of Chromosomal Alterations in Adult Acute Lymphoblastic Leukemia Lam Kah Yuena, c, Zakaria Zubaidaha, Ivyna Bong Pau Nia, Megat Baharuddin Puteri Jamilatul Noora, Esa Ezaliaa, Chin Yuet Menga, Ong Tee Chuanb, Vegappan Subramanianb, Chang Kiang Mengb Abstract Introduction Background: Chromosomal alterations occur frequently in acute Acute lymphoblastic leukemia (ALL) is a heterogeneous lymphoblastic leukemia (ALL), affecting either the chromosome disease, resulting from the accumulation of chromosomal al- number or structural changes. These alterations can lead to inacti- terations either in the form of numerical or structural chang- vation of tumor suppressor genes and/or activation of oncogenes. es such as amplification, deletion, inversion or translocation. The objective of this study was to identify recurrent and/or novel The frequency of chromosomal alterations in adult ALL is chromosomal alterations in adult ALL using single nucleotide poly- 64-85% [1], compared to 60-69% in childhood ALL. Trans- morphism (SNP) array analysis. location t(9.22), one of the most common recurring chromo- Methods: We studied 41 cases of adult ALL compared with healthy somal alterations, is found in 20-40% adult ALL patients [2], normal controls using SNP array. and its incidence increases with age. Some of the chromo- somal alterations are significantly associated with remission Results: Our analysis revealed 43 copy number variant regions, duration, complete remission rate and disease-free survival of which 44% were amplifications and 56% were deletions. The [1]. Increasing age is also associated with lower remission most common amplifications were on chromosome regions 8p23.1 rates, shorter remissions and poor outcomes in adult ALL (71%), 1q44 (66%), 1q23.3 (54%), 11q23.3 (54%), 12p13.33 [1].
    [Show full text]
  • Genomic Alterations on 8P21-P23 Are the Most Frequent Genetic Events in Stage I Squamous Cell Carcinoma of the Lung
    EXPERIMENTAL AND THERAPEUTIC MEDICINE 9: 345-350, 2015 Genomic alterations on 8p21-p23 are the most frequent genetic events in stage I squamous cell carcinoma of the lung JIUN KANG Department of Biomedical Laboratory Science, Korea Nazarene University, Cheonan 330-718, Republic of Korea Received April 23, 2014; Accepted October 31, 2014 DOI: 10.3892/etm.2014.2123 Abstract. Genetic alterations in the early stages of cancer genetic changes, and chromosomal aberrations are a hallmark have a close correlation with tumor initiation and poten- of cancer cells, occurring at a high prevalence in SCC. Although tially activate downstream pathways implicated in tumor a number of studies have been performed to evaluate genetic progression; however, the method of initiation in sporadic events associated with the development and progression of neoplasias is largely unknown. In this study, whole-genome SCC (2,3), the molecular mechanism remains to be uncovered, microarray-comparative genomic hybridization was and the identification of predictive markers is crucial. performed to identify the early genetic alterations that define Genetic alterations in the early stages of cancer have a the prognosis of patients with stage I squamous cell carcinoma close correlation with tumor initiation, and potentially activate (SCC) of the lung. The most striking finding was the high downstream pathways that are implicated in tumor progres- frequency of copy number losses and hemizygous deletions on sion. With the recent advances of computed tomographic chromosome 8p, which occurred in 94.7% (18/19) and 63.2% technology, the number of patients diagnosed with stage I (12/19) of the cases, respectively, with a delineated minimal lung SCC has been increasing (3).
    [Show full text]
  • Etwork Inference from Vector Representations of Words
    RZ 3918 (# ZUR1801-015) 01/09/2018 Computer Science/Mathematics 19 pages Research Report INtERAcT: Interaction Network Inference from Vector Representations of Words Matteo Manica, Roland Mathis, Maria Rodriguez Martinez IBM Research – Zurich 8803 Rüschlikon Switzerland LIMITED DISTRIBUTION NOTICE This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside pub- lisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies (e.g., payment of royalties). Some re- ports are available at http://domino.watson.ibm.com/library/Cyberdig.nsf/home. Research Africa • Almaden • Austin • Australia • Brazil • China • Haifa • India • Ireland • Tokyo • Watson • Zurich INtERAcT: Interaction Network Inference from Vector Representations of Words Matteo Manica1,2,*, Roland Mathis1,*, Mar´ıa Rodr´ıguez Mart´ınez1, y ftte,lth,[email protected] 1 IBM Research Z¨urich, S¨aumerstrasse4, 8803 R¨uschlikon, Switzerland 2 ETH Z¨urich, Institute f¨urMolekulare Systembiologie, Auguste-Piccard-Hof 1 8093, Z¨urich, Switzerland * Shared first authorship y Corresponding author Abstract Background In recent years, the number of biomedical publications made freely available through literature archives is steadfastly growing, resulting in a rich source of untapped new knowledge. Most biomedical facts are however buried in the form of unstructured text, and their exploitation requires expert-knowledge and time-consuming manual curation of published articles.
    [Show full text]
  • Predicting Protein Folding Pathways Using Ensemble Modeling and Sequence Information
    Predicting Protein Folding Pathways Using Ensemble Modeling and Sequence Information by David Becerra McGill University School of Computer Science Montr´eal, Qu´ebec, Canada A thesis submitted to McGill University in partial fulfillment of the requirements of the degree of Doctor of Philosophy c David Becerra 2017 Dedicated to my GRANDMOM. Thanks for everything. You are such a strong woman i Acknowledgements My doctoral program has been an amazing adventure with ups and downs. The most valuable fact is that I am a different person with respect to the one who started the program five years ago. I am not better or worst, I am just different. I would like to express my gratitude to all the people who helped me during my time at McGill. This thesis contains only my name as author, but it should have the name of all the people who played an important role in the completion of this thesis (and they are a lot of people). I am certain I would have not reached the end of my thesis without the help, advises and support of all people around me. First at all, I want to thank my country Colombia. Being abroad, I learnt to value my culture, my roots and my south-american positiveness, happiness and resourcefulness. I am really proud of being Colombian and of showing to the world a bit of what our wonderful land has given to us. I would also like to thank all Colombians, because with your taxes (via Colciencia’s funding) I was able to complete this dream.
    [Show full text]