An Integrative Bioinformatics Approach Reveals Coding and Non-Coding

Total Page:16

File Type:pdf, Size:1020Kb

An Integrative Bioinformatics Approach Reveals Coding and Non-Coding www.nature.com/bjc ARTICLE Genetics & Genomics An integrative bioinformatics approach reveals coding and non-coding gene variants associated with gene expression profiles and outcome in breast cancer molecular subtypes Balázs Győrffy1,2,Lőrinc Pongor1,2, Giulia Bottai3, Xiaotong Li4, Jan Budczies5, András Szabó2, Christos Hatzis4, Lajos Pusztai4 and Libero Santarpia3 BACKGROUND: Sequence variations in coding and non-coding regions of the genome can affect gene expression and signalling pathways, which in turn may influence disease outcome. METHODS: In this study, we integrated somatic mutations, gene expression and clinical data from 930 breast cancer patients included in the TCGA database. Genes associated with single mutations in molecular breast cancer subtypes were identified by the Mann-Whitney U-test and their prognostic value was evaluated by Kaplan-Meier and Cox regression analyses. Results were confirmed using gene expression profiles from the Metabric data set (n = 1988) and whole-genome sequencing data from the TCGA cohort (n = 117). RESULTS: The overall mutation rate in coding and non-coding regions were significantly higher in ER-negative/HER2-negative tumours (P = 2.8E–03 and P = 2.4E–07, respectively). Recurrent sequence variations were identified in non-coding regulatory regions of several cancer-associated genes, including NBPF1, PIK3CA and TP53. After multivariate regression analysis, gene signatures associated with three coding mutations (CDH1, MAP3K1 and TP53) and two non-coding variants (CRTC3 and STAG2)in cancer-related genes predicted prognosis in ER-positive/HER2-negative tumours. CONCLUSIONS: These findings demonstrate that sequence alterations influence gene expression and oncogenic pathways, possibly affecting the outcome of breast cancer patients. Our data provide potential opportunities to identify non-coding variations with functional and clinical relevance in breast cancer. British Journal of Cancer (2018) 118:1107–1114; https://doi.org/10.1038/s41416-018-0030-0 INTRODUCTION associated gene expression signatures often have a greater Results from DNA-sequencing studies revealed a great degree of prognostic value than single mutations.7, 8 genomic heterogeneity in breast cancer, which may partly explain In this study, we used the data from 930 breast cancer patients the diversity of clinical behaviour of breast cancer subtypes.1 It is of The Cancer Genome Atlas (TCGA) database to identify genes also apparent that breast cancer genome is characterised by only with sequence variations in coding and non-coding regions that a few frequently mutated genes together with a long tail of rare were captured by exome sequencing. We next identified genes mutations in a great variety of genes, whose potential prognostic whose expression was associated with a given mutation by impact is difficult to assess due to the small sample size of affected comparing cases with and without the mutation. This analysis was cases.2 Furthermore, although it has been suggested that these done separately for each molecular breast cancer subtype. Once genetic aberrations can be useful drug targets or biomarkers for subtype-specific mutation-associated gene signatures were iden- patient stratification, their utility in clinical practice has been tified, we assessed their effects on survival. Results were also limited so far.3 validated using whole-genome sequencing (WGS) data from the In addition to mutations in protein-coding regions, a new class TCGA cohort (n = 117) and gene expression profiles from the of oncogenic events in non-coding areas of the genome has also Metabric data set (n = 1988). been identified, suggesting that cancer can result from an array of genomic alterations affecting both coding and non-coding regions.4 Most pathogenic DNA sequence alterations directly or MATERIALS AND METHODS indirectly impact gene expression and protein functions, leaving Data source and acquisition an imprint on messenger RNA expression that can be captured by Whole-exome sequencing (WES), RNA sequencing (RNA-seq) data gene expression profiling.5, 6 It is noteworthy that mutation- and the related clinical information from the TCGA database (n = 1MTA TTK Lendület Cancer Biomarker Research Group, Institute of Enzymology, Budapest H-1117, Hungary; 22nd Department of Pediatrics, Semmelweis University, Budapest H- 1094, Hungary; 3Oncology Experimental Therapeutics, Humanitas Clinical and Research Institute, Rozzano, Milan 20089, Italy; 4Breast Medical Oncology, Yale Cancer Center, Yale School of Medicine, New Haven, Connecticut 06520, USA and 5Institute of Pathology, Charité University Hospital, 10117 Berlin, Germany Correspondence: Libero Santarpia ([email protected]) These authors contributed equally: Lőrinc Pongor and Giulia Bottai Received: 19 July 2017 Revised: 19 January 2018 Accepted: 22 January 2018 Published online: 21 March 2018 © Cancer Research UK 2018 Breast cancer genetic variants and prognosis BGyőrffy et al. 1108 930), as well as WGS data (n = 117) were obtained from http:// Gene expression signatures cancergenome.nih.gov/. Gene expression analysis was performed Mann-Whitney U-test was performed to identify genes whose on the pre-processed RNA-seq data (i.e., level 3 data) using expression was significantly associated with a given genotype (i.e., MapSplice and RNA-Seq by Expectation-Maximisation. Individual somatic mutation) in each breast cancer subtype, separately. patient files were merged into a single database using the plyr R Samples were divided into two cohorts according to the mutation package.9 Information for relapse-free survival was available only status and the cohorts were compared to each other. The analysis for 52 patients and therefore we used overall survival (OS) as was performed for all coding and non-coding sequence variations outcome measure. Clinical and pathological features of the breast without filtering for functional significance. The average expres- cancer cohort are described in Supplementary Table 1. The entire sion of the significantly mutation-associated genes was desig- workflow of the study is summarised in Supplementary Figure 1. nated as the gene expression signature of a given genotype. The expression of the downregulated genes was inverted before Processing of sequence variations computing the mean expression of the signature. Significant Aligned data were downloaded via The Cancer Genomics Hub associations from the Mann-Whitney U-test (P ≤ 0.01) were ranked (https://cghub.ucsc.edu/) for both tumour and matched normal based on their achieved P-values. Finally, a maximum of samples. Somatic mutation calls were performed with the MuTect 100 significant genes for each signature was included to reduce programme, as previously described.8 The identified variants were noise in large gene sets. annotated with MuTect using SNP database (dbSNP, build 139) and Catalogue Of Somatic Mutations in Cancer (COSMIC, version Statistical analyses 68) library.10 The recognised sequence variations were functionally To examine the association of mutation status and mutation- annotated via SNPeff (version 3.5) and filtered to include the associated signatures with OS, we performed Kaplan-Meier and COSMIC-identified genes only.8 Known cancer-associated genes Cox proportional hazard regression analyses using the median were defined by the Cancer Gene Census.10 We also analysed expression of the signature to dichotomise the population. sequence variants occurring in introns, promoters (defined as − Multivariate analysis was performed including tumour size 2.5 kb from transcription starting sites) and other regulatory (T stage), lymph node status (N stage), the presence of distant elements, including enhancers, untranslated regions (UTRs) and metastases (M stage) and MKI67 gene expression as a measure transcription factor (TF) binding sites, which were captured by of proliferation. The level of statistical significance was set at WES. We used the LARVA software to identify significantly P<0.05. 1234567890();,: mutated non-coding regulatory elements, via modelling with β- binomial distribution and mutation rate calculation through DNA replication timing correction.11 To identify potential non-coding RESULTS drivers, somatic mutations were annotated and prioritised by Database characteristics FunSeq2, combining inter- and intra-species conservation, loss- Nine hundred and thirty patients with invasive breast cancer were and gain-of-function events for TF binding, enhancer-gene analysed, including 50.2% ER-positive/HER2-negative, 19.9% ER- linkages and network centrality, and per-element recurrence negative/HER2-negative and 29.9% HER2-positive cancers. The across samples.12 mean age of patients was 58.3 years. Nodal status was available for 905 patients, of which 45.3% were lymph node positive. The Sample classification according to breast cancer molecular median follow-up was 31.5 months. The detailed characteristics of subtypes patients included in the analysis are presented in Supplementary Breast cancer molecular subtypes were defined by the oestrogen Table 1. receptor (ER) and human epidermal growth factor receptor 2 (HER2) status determined by RNA-seq data. For ER the RNA-seq ID Genetic variants in coding and non-coding regions in breast 2099 was used with a cut-off of 3,700, whereas for HER2 the RNA- cancer seq ID 2064 was used with a cut-off of 27,000. Cut-off values were We found 208 and 3562 genes with sequence variations in coding determined by a receiver operating curve (ROC) analysis compar- and non-coding areas in at
Recommended publications
  • Supplemental Table S1
    Entrez Gene Symbol Gene Name Affymetrix EST Glomchip SAGE Stanford Literature HPA confirmed Gene ID Profiling profiling Profiling Profiling array profiling confirmed 1 2 A2M alpha-2-macroglobulin 0 0 0 1 0 2 10347 ABCA7 ATP-binding cassette, sub-family A (ABC1), member 7 1 0 0 0 0 3 10350 ABCA9 ATP-binding cassette, sub-family A (ABC1), member 9 1 0 0 0 0 4 10057 ABCC5 ATP-binding cassette, sub-family C (CFTR/MRP), member 5 1 0 0 0 0 5 10060 ABCC9 ATP-binding cassette, sub-family C (CFTR/MRP), member 9 1 0 0 0 0 6 79575 ABHD8 abhydrolase domain containing 8 1 0 0 0 0 7 51225 ABI3 ABI gene family, member 3 1 0 1 0 0 8 29 ABR active BCR-related gene 1 0 0 0 0 9 25841 ABTB2 ankyrin repeat and BTB (POZ) domain containing 2 1 0 1 0 0 10 30 ACAA1 acetyl-Coenzyme A acyltransferase 1 (peroxisomal 3-oxoacyl-Coenzyme A thiol 0 1 0 0 0 11 43 ACHE acetylcholinesterase (Yt blood group) 1 0 0 0 0 12 58 ACTA1 actin, alpha 1, skeletal muscle 0 1 0 0 0 13 60 ACTB actin, beta 01000 1 14 71 ACTG1 actin, gamma 1 0 1 0 0 0 15 81 ACTN4 actinin, alpha 4 0 0 1 1 1 10700177 16 10096 ACTR3 ARP3 actin-related protein 3 homolog (yeast) 0 1 0 0 0 17 94 ACVRL1 activin A receptor type II-like 1 1 0 1 0 0 18 8038 ADAM12 ADAM metallopeptidase domain 12 (meltrin alpha) 1 0 0 0 0 19 8751 ADAM15 ADAM metallopeptidase domain 15 (metargidin) 1 0 0 0 0 20 8728 ADAM19 ADAM metallopeptidase domain 19 (meltrin beta) 1 0 0 0 0 21 81792 ADAMTS12 ADAM metallopeptidase with thrombospondin type 1 motif, 12 1 0 0 0 0 22 9507 ADAMTS4 ADAM metallopeptidase with thrombospondin type 1
    [Show full text]
  • TMEM50A Shrna (M) Lentiviral Particles: Sc-154474-V
    SANTA CRUZ BIOTECHNOLOGY, INC. TMEM50A shRNA (m) Lentiviral Particles: sc-154474-V BACKGROUND PRODUCT TMEM50A (transmembrane protein 50A), also known as small membrane pro- TMEM50A shRNA (m) Lentiviral Particles is a pool of concentrated, trans- tein 1, is a 157 amino acid protein encoded by a gene mapping to human duction-ready viral particles containing 3 target-specific constructs that chromosome 1. Chromosome 1 is the largest human chromosome spanning encode 19-25 nt (plus hairpin) shRNA designed to knock down gene expres- about 260 million base pairs and making up 8% of the human genome. There sion. Each vial contains 200 µl frozen stock containing 1.0 x 106 infectious are about 3,000 genes on chromosome 1, and considering the great number units of virus (IFU) in Dulbecco’s Modified Eagle’s Medium with 25 mM of genes there are also a large number of diseases associated with chromo- HEPES pH 7.3. Suitable for 10-20 transductions. Also see TMEM50A some 1. Notably, the rare aging disease Hutchinson-Gilford progeria is asso- siRNA (m): sc-154474 and TMEM50A shRNA Plasmid (m): sc-154474-SH as ciated with the LMNA gene which encodes Lamin A. When defective, the alternate gene silencing products. LMNA gene product can build up in the nucleus and cause characteristic nuclear blebs. The mechanism of rapidly enhanced aging is unclear and is APPLICATIONS a topic of continuing exploration. The MUTYH gene is located on chromo- TMEM50A shRNA (m) Lentiviral Particles is recommended for the inhibition some 1 and is partially responsible for familial adenomatous polyposis.
    [Show full text]
  • A Computational Approach for Defining a Signature of Β-Cell Golgi Stress in Diabetes Mellitus
    Page 1 of 781 Diabetes A Computational Approach for Defining a Signature of β-Cell Golgi Stress in Diabetes Mellitus Robert N. Bone1,6,7, Olufunmilola Oyebamiji2, Sayali Talware2, Sharmila Selvaraj2, Preethi Krishnan3,6, Farooq Syed1,6,7, Huanmei Wu2, Carmella Evans-Molina 1,3,4,5,6,7,8* Departments of 1Pediatrics, 3Medicine, 4Anatomy, Cell Biology & Physiology, 5Biochemistry & Molecular Biology, the 6Center for Diabetes & Metabolic Diseases, and the 7Herman B. Wells Center for Pediatric Research, Indiana University School of Medicine, Indianapolis, IN 46202; 2Department of BioHealth Informatics, Indiana University-Purdue University Indianapolis, Indianapolis, IN, 46202; 8Roudebush VA Medical Center, Indianapolis, IN 46202. *Corresponding Author(s): Carmella Evans-Molina, MD, PhD ([email protected]) Indiana University School of Medicine, 635 Barnhill Drive, MS 2031A, Indianapolis, IN 46202, Telephone: (317) 274-4145, Fax (317) 274-4107 Running Title: Golgi Stress Response in Diabetes Word Count: 4358 Number of Figures: 6 Keywords: Golgi apparatus stress, Islets, β cell, Type 1 diabetes, Type 2 diabetes 1 Diabetes Publish Ahead of Print, published online August 20, 2020 Diabetes Page 2 of 781 ABSTRACT The Golgi apparatus (GA) is an important site of insulin processing and granule maturation, but whether GA organelle dysfunction and GA stress are present in the diabetic β-cell has not been tested. We utilized an informatics-based approach to develop a transcriptional signature of β-cell GA stress using existing RNA sequencing and microarray datasets generated using human islets from donors with diabetes and islets where type 1(T1D) and type 2 diabetes (T2D) had been modeled ex vivo. To narrow our results to GA-specific genes, we applied a filter set of 1,030 genes accepted as GA associated.
    [Show full text]
  • 4-6 Weeks Old Female C57BL/6 Mice Obtained from Jackson Labs Were Used for Cell Isolation
    Methods Mice: 4-6 weeks old female C57BL/6 mice obtained from Jackson labs were used for cell isolation. Female Foxp3-IRES-GFP reporter mice (1), backcrossed to B6/C57 background for 10 generations, were used for the isolation of naïve CD4 and naïve CD8 cells for the RNAseq experiments. The mice were housed in pathogen-free animal facility in the La Jolla Institute for Allergy and Immunology and were used according to protocols approved by the Institutional Animal Care and use Committee. Preparation of cells: Subsets of thymocytes were isolated by cell sorting as previously described (2), after cell surface staining using CD4 (GK1.5), CD8 (53-6.7), CD3ε (145- 2C11), CD24 (M1/69) (all from Biolegend). DP cells: CD4+CD8 int/hi; CD4 SP cells: CD4CD3 hi, CD24 int/lo; CD8 SP cells: CD8 int/hi CD4 CD3 hi, CD24 int/lo (Fig S2). Peripheral subsets were isolated after pooling spleen and lymph nodes. T cells were enriched by negative isolation using Dynabeads (Dynabeads untouched mouse T cells, 11413D, Invitrogen). After surface staining for CD4 (GK1.5), CD8 (53-6.7), CD62L (MEL-14), CD25 (PC61) and CD44 (IM7), naïve CD4+CD62L hiCD25-CD44lo and naïve CD8+CD62L hiCD25-CD44lo were obtained by sorting (BD FACS Aria). Additionally, for the RNAseq experiments, CD4 and CD8 naïve cells were isolated by sorting T cells from the Foxp3- IRES-GFP mice: CD4+CD62LhiCD25–CD44lo GFP(FOXP3)– and CD8+CD62LhiCD25– CD44lo GFP(FOXP3)– (antibodies were from Biolegend). In some cases, naïve CD4 cells were cultured in vitro under Th1 or Th2 polarizing conditions (3, 4).
    [Show full text]
  • Cellular and Molecular Signatures in the Disease Tissue of Early
    Cellular and Molecular Signatures in the Disease Tissue of Early Rheumatoid Arthritis Stratify Clinical Response to csDMARD-Therapy and Predict Radiographic Progression Frances Humby1,* Myles Lewis1,* Nandhini Ramamoorthi2, Jason Hackney3, Michael Barnes1, Michele Bombardieri1, Francesca Setiadi2, Stephen Kelly1, Fabiola Bene1, Maria di Cicco1, Sudeh Riahi1, Vidalba Rocher-Ros1, Nora Ng1, Ilias Lazorou1, Rebecca E. Hands1, Desiree van der Heijde4, Robert Landewé5, Annette van der Helm-van Mil4, Alberto Cauli6, Iain B. McInnes7, Christopher D. Buckley8, Ernest Choy9, Peter Taylor10, Michael J. Townsend2 & Costantino Pitzalis1 1Centre for Experimental Medicine and Rheumatology, William Harvey Research Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, Charterhouse Square, London EC1M 6BQ, UK. Departments of 2Biomarker Discovery OMNI, 3Bioinformatics and Computational Biology, Genentech Research and Early Development, South San Francisco, California 94080 USA 4Department of Rheumatology, Leiden University Medical Center, The Netherlands 5Department of Clinical Immunology & Rheumatology, Amsterdam Rheumatology & Immunology Center, Amsterdam, The Netherlands 6Rheumatology Unit, Department of Medical Sciences, Policlinico of the University of Cagliari, Cagliari, Italy 7Institute of Infection, Immunity and Inflammation, University of Glasgow, Glasgow G12 8TA, UK 8Rheumatology Research Group, Institute of Inflammation and Ageing (IIA), University of Birmingham, Birmingham B15 2WB, UK 9Institute of
    [Show full text]
  • Role of CREB/CRTC1-Regulated Gene Transcription During Hippocampal-Dependent Memory in Alzheimer’S Disease Mouse Models
    ADVERTIMENT. Lʼaccés als continguts dʼaquesta tesi queda condicionat a lʼacceptació de les condicions dʼús establertes per la següent llicència Creative Commons: http://cat.creativecommons.org/?page_id=184 ADVERTENCIA. El acceso a los contenidos de esta tesis queda condicionado a la aceptación de las condiciones de uso establecidas por la siguiente licencia Creative Commons: http://es.creativecommons.org/blog/licencias/ WARNING. The access to the contents of this doctoral thesis it is limited to the acceptance of the use conditions set by the following Creative Commons license: https://creativecommons.org/licenses/?lang=en Institut de Neurociències Universitat Autònoma de Barcelona Departament de Bioquímica i Biologia Molecular Unitat de Bioquímica, Facultat de Medicina Role of CREB/CRTC1-regulated gene transcription during hippocampal-dependent memory in Alzheimer’s disease mouse models Arnaldo J. Parra Damas TESIS DOCTORAL Bellaterra, 2015 Institut de Neurociències Departament de Bioquímica i Biologia Molecular Universitat Autònoma de Barcelona Role of CREB/CRTC1-regulated gene transcription during hippocampal-dependent memory in Alzheimer’s disease mouse models Papel de la transcripción génica regulada por CRTC1/CREB durante memoria dependiente de hipocampo en modelos murinos de la enfermedad de Alzheimer Memoria de tesis doctoral presentada por Arnaldo J. Parra Damas para optar al grado de Doctor en Neurociencias por la Universitat Autonòma de Barcelona. Trabajo realizado en la Unidad de Bioquímica y Biología Molecular de la Facultad de Medicina del Departamento de Bioquímica y Biología Molecular de la Universitat Autònoma de Barcelona, y en el Instituto de Neurociencias de la Universitat Autònoma de Barcelona, bajo la dirección del Doctor Carlos Saura Antolín.
    [Show full text]
  • CREB-Regulated Transcription Coactivator 3 (CRTC3) Polymorphism Associated with Type 2 Diabetes and Hyperlipidemia in the Taiwanese Population
    Tzu Chi Medical Journal 26 (2014) 114e118 Contents lists available at ScienceDirect Tzu Chi Medical Journal journal homepage: www.tzuchimedjnl.com Original Article CREB-regulated transcription coactivator 3 (CRTC3) polymorphism associated with type 2 diabetes and hyperlipidemia in the Taiwanese population y y Kuei-Fang Lee a, b, , Cheng-Chia Lin c, , Tsung-Cheng Hsieh a, Chun-Te Wu c, * Lawrence Shih-Hsin Wu a, a Institute of Medical Sciences, Tzu Chi University, Hualien, Taiwan b Laboratory for Cytogenetics, Center for Genetic Counseling, Buddhist Tzu Chi General Hospital, Hualien, Taiwan c Department of Urological Surgery, Chang Gung Memorial Hospital, Keelung, Taiwan article info abstract Article history: Objective: Type 2 diabetes mellitus (T2D) is a pathologically and genetically heterogeneous disease Received 24 April 2014 influenced by genetic and environmental factors. This study aims to investigate the association between Received in revised form T2D and polymorphism(s) in CREB-regulated transcription coactivator 3 (CRTC3) in Asian Taiwan. 27 May 2014 Materials and methods: In this study, 417 participants with T2D and 197 without T2D were recruited. Accepted 9 July 2014 Anthropometrics, the metabolic profile, blood pressure, fasting plasma glucose, glycosylated hemoglobin (HbA1c), serum triglycerides, serum total cholesterol, low-density lipoprotein, high-density lipoprotein Keywords: (HDL), and C-peptide were analyzed. TaqMan genotyping was used to identify individual genotypes, and CRTC3 Hyperlipidemia the association of CRTC3 polymorphism with clinical and biochemical parameters was assessed. SNP Results: Single nucleotide polymorphism (SNP) rs8033595 showed an association with diabetes Type 2 diabetes (p ¼ 0.031) and hyperlipidemia (p ¼ 0.002). Odds ratio analysis showed that A carriers (AA or AG) had a protective effect against developing T2D and hyperlipidemia.
    [Show full text]
  • 1714 Gene Comprehensive Cancer Panel Enriched for Clinically Actionable Genes with Additional Biologically Relevant Genes 400-500X Average Coverage on Tumor
    xO GENE PANEL 1714 gene comprehensive cancer panel enriched for clinically actionable genes with additional biologically relevant genes 400-500x average coverage on tumor Genes A-C Genes D-F Genes G-I Genes J-L AATK ATAD2B BTG1 CDH7 CREM DACH1 EPHA1 FES G6PC3 HGF IL18RAP JADE1 LMO1 ABCA1 ATF1 BTG2 CDK1 CRHR1 DACH2 EPHA2 FEV G6PD HIF1A IL1R1 JAK1 LMO2 ABCB1 ATM BTG3 CDK10 CRK DAXX EPHA3 FGF1 GAB1 HIF1AN IL1R2 JAK2 LMO7 ABCB11 ATR BTK CDK11A CRKL DBH EPHA4 FGF10 GAB2 HIST1H1E IL1RAP JAK3 LMTK2 ABCB4 ATRX BTRC CDK11B CRLF2 DCC EPHA5 FGF11 GABPA HIST1H3B IL20RA JARID2 LMTK3 ABCC1 AURKA BUB1 CDK12 CRTC1 DCUN1D1 EPHA6 FGF12 GALNT12 HIST1H4E IL20RB JAZF1 LPHN2 ABCC2 AURKB BUB1B CDK13 CRTC2 DCUN1D2 EPHA7 FGF13 GATA1 HLA-A IL21R JMJD1C LPHN3 ABCG1 AURKC BUB3 CDK14 CRTC3 DDB2 EPHA8 FGF14 GATA2 HLA-B IL22RA1 JMJD4 LPP ABCG2 AXIN1 C11orf30 CDK15 CSF1 DDIT3 EPHB1 FGF16 GATA3 HLF IL22RA2 JMJD6 LRP1B ABI1 AXIN2 CACNA1C CDK16 CSF1R DDR1 EPHB2 FGF17 GATA5 HLTF IL23R JMJD7 LRP5 ABL1 AXL CACNA1S CDK17 CSF2RA DDR2 EPHB3 FGF18 GATA6 HMGA1 IL2RA JMJD8 LRP6 ABL2 B2M CACNB2 CDK18 CSF2RB DDX3X EPHB4 FGF19 GDNF HMGA2 IL2RB JUN LRRK2 ACE BABAM1 CADM2 CDK19 CSF3R DDX5 EPHB6 FGF2 GFI1 HMGCR IL2RG JUNB LSM1 ACSL6 BACH1 CALR CDK2 CSK DDX6 EPOR FGF20 GFI1B HNF1A IL3 JUND LTK ACTA2 BACH2 CAMTA1 CDK20 CSNK1D DEK ERBB2 FGF21 GFRA4 HNF1B IL3RA JUP LYL1 ACTC1 BAG4 CAPRIN2 CDK3 CSNK1E DHFR ERBB3 FGF22 GGCX HNRNPA3 IL4R KAT2A LYN ACVR1 BAI3 CARD10 CDK4 CTCF DHH ERBB4 FGF23 GHR HOXA10 IL5RA KAT2B LZTR1 ACVR1B BAP1 CARD11 CDK5 CTCFL DIAPH1 ERCC1 FGF3 GID4 HOXA11 IL6R KAT5 ACVR2A
    [Show full text]
  • Literature Mining Sustains and Enhances Knowledge Discovery from Omic Studies
    LITERATURE MINING SUSTAINS AND ENHANCES KNOWLEDGE DISCOVERY FROM OMIC STUDIES by Rick Matthew Jordan B.S. Biology, University of Pittsburgh, 1996 M.S. Molecular Biology/Biotechnology, East Carolina University, 2001 M.S. Biomedical Informatics, University of Pittsburgh, 2005 Submitted to the Graduate Faculty of School of Medicine in partial fulfillment of the requirements for the degree of Doctor of Philosophy University of Pittsburgh 2016 UNIVERSITY OF PITTSBURGH SCHOOL OF MEDICINE This dissertation was presented by Rick Matthew Jordan It was defended on December 2, 2015 and approved by Shyam Visweswaran, M.D., Ph.D., Associate Professor Rebecca Jacobson, M.D., M.S., Professor Songjian Lu, Ph.D., Assistant Professor Dissertation Advisor: Vanathi Gopalakrishnan, Ph.D., Associate Professor ii Copyright © by Rick Matthew Jordan 2016 iii LITERATURE MINING SUSTAINS AND ENHANCES KNOWLEDGE DISCOVERY FROM OMIC STUDIES Rick Matthew Jordan, M.S. University of Pittsburgh, 2016 Genomic, proteomic and other experimentally generated data from studies of biological systems aiming to discover disease biomarkers are currently analyzed without sufficient supporting evidence from the literature due to complexities associated with automated processing. Extracting prior knowledge about markers associated with biological sample types and disease states from the literature is tedious, and little research has been performed to understand how to use this knowledge to inform the generation of classification models from ‘omic’ data. Using pathway analysis methods to better understand the underlying biology of complex diseases such as breast and lung cancers is state-of-the-art. However, the problem of how to combine literature- mining evidence with pathway analysis evidence is an open problem in biomedical informatics research.
    [Show full text]
  • Supp Table 6.Pdf
    Supplementary Table 6. Processes associated to the 2037 SCL candidate target genes ID Symbol Entrez Gene Name Process NM_178114 AMIGO2 adhesion molecule with Ig-like domain 2 adhesion NM_033474 ARVCF armadillo repeat gene deletes in velocardiofacial syndrome adhesion NM_027060 BTBD9 BTB (POZ) domain containing 9 adhesion NM_001039149 CD226 CD226 molecule adhesion NM_010581 CD47 CD47 molecule adhesion NM_023370 CDH23 cadherin-like 23 adhesion NM_207298 CERCAM cerebral endothelial cell adhesion molecule adhesion NM_021719 CLDN15 claudin 15 adhesion NM_009902 CLDN3 claudin 3 adhesion NM_008779 CNTN3 contactin 3 (plasmacytoma associated) adhesion NM_015734 COL5A1 collagen, type V, alpha 1 adhesion NM_007803 CTTN cortactin adhesion NM_009142 CX3CL1 chemokine (C-X3-C motif) ligand 1 adhesion NM_031174 DSCAM Down syndrome cell adhesion molecule adhesion NM_145158 EMILIN2 elastin microfibril interfacer 2 adhesion NM_001081286 FAT1 FAT tumor suppressor homolog 1 (Drosophila) adhesion NM_001080814 FAT3 FAT tumor suppressor homolog 3 (Drosophila) adhesion NM_153795 FERMT3 fermitin family homolog 3 (Drosophila) adhesion NM_010494 ICAM2 intercellular adhesion molecule 2 adhesion NM_023892 ICAM4 (includes EG:3386) intercellular adhesion molecule 4 (Landsteiner-Wiener blood group)adhesion NM_001001979 MEGF10 multiple EGF-like-domains 10 adhesion NM_172522 MEGF11 multiple EGF-like-domains 11 adhesion NM_010739 MUC13 mucin 13, cell surface associated adhesion NM_013610 NINJ1 ninjurin 1 adhesion NM_016718 NINJ2 ninjurin 2 adhesion NM_172932 NLGN3 neuroligin
    [Show full text]
  • WO 2012/174282 A2 20 December 2012 (20.12.2012) P O P C T
    (12) INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) (19) World Intellectual Property Organization International Bureau (10) International Publication Number (43) International Publication Date WO 2012/174282 A2 20 December 2012 (20.12.2012) P O P C T (51) International Patent Classification: David [US/US]; 13539 N . 95th Way, Scottsdale, AZ C12Q 1/68 (2006.01) 85260 (US). (21) International Application Number: (74) Agent: AKHAVAN, Ramin; Caris Science, Inc., 6655 N . PCT/US20 12/0425 19 Macarthur Blvd., Irving, TX 75039 (US). (22) International Filing Date: (81) Designated States (unless otherwise indicated, for every 14 June 2012 (14.06.2012) kind of national protection available): AE, AG, AL, AM, AO, AT, AU, AZ, BA, BB, BG, BH, BR, BW, BY, BZ, English (25) Filing Language: CA, CH, CL, CN, CO, CR, CU, CZ, DE, DK, DM, DO, Publication Language: English DZ, EC, EE, EG, ES, FI, GB, GD, GE, GH, GM, GT, HN, HR, HU, ID, IL, IN, IS, JP, KE, KG, KM, KN, KP, KR, (30) Priority Data: KZ, LA, LC, LK, LR, LS, LT, LU, LY, MA, MD, ME, 61/497,895 16 June 201 1 (16.06.201 1) US MG, MK, MN, MW, MX, MY, MZ, NA, NG, NI, NO, NZ, 61/499,138 20 June 201 1 (20.06.201 1) US OM, PE, PG, PH, PL, PT, QA, RO, RS, RU, RW, SC, SD, 61/501,680 27 June 201 1 (27.06.201 1) u s SE, SG, SK, SL, SM, ST, SV, SY, TH, TJ, TM, TN, TR, 61/506,019 8 July 201 1(08.07.201 1) u s TT, TZ, UA, UG, US, UZ, VC, VN, ZA, ZM, ZW.
    [Show full text]
  • Human Induced Pluripotent Stem Cell–Derived Podocytes Mature Into Vascularized Glomeruli Upon Experimental Transplantation
    BASIC RESEARCH www.jasn.org Human Induced Pluripotent Stem Cell–Derived Podocytes Mature into Vascularized Glomeruli upon Experimental Transplantation † Sazia Sharmin,* Atsuhiro Taguchi,* Yusuke Kaku,* Yasuhiro Yoshimura,* Tomoko Ohmori,* ‡ † ‡ Tetsushi Sakuma, Masashi Mukoyama, Takashi Yamamoto, Hidetake Kurihara,§ and | Ryuichi Nishinakamura* *Department of Kidney Development, Institute of Molecular Embryology and Genetics, and †Department of Nephrology, Faculty of Life Sciences, Kumamoto University, Kumamoto, Japan; ‡Department of Mathematical and Life Sciences, Graduate School of Science, Hiroshima University, Hiroshima, Japan; §Division of Anatomy, Juntendo University School of Medicine, Tokyo, Japan; and |Japan Science and Technology Agency, CREST, Kumamoto, Japan ABSTRACT Glomerular podocytes express proteins, such as nephrin, that constitute the slit diaphragm, thereby contributing to the filtration process in the kidney. Glomerular development has been analyzed mainly in mice, whereas analysis of human kidney development has been minimal because of limited access to embryonic kidneys. We previously reported the induction of three-dimensional primordial glomeruli from human induced pluripotent stem (iPS) cells. Here, using transcription activator–like effector nuclease-mediated homologous recombination, we generated human iPS cell lines that express green fluorescent protein (GFP) in the NPHS1 locus, which encodes nephrin, and we show that GFP expression facilitated accurate visualization of nephrin-positive podocyte formation in
    [Show full text]