DNA Methylation Marks in Peripheral Blood and the Risk of Developing Mature B Cell Neoplasms

Nicole Wong Doo

A thesis in fulfillment of the requirements for the degree of Doctor of Philosophy

School of Population and Global Health
The University of Melbourne
2018

ORCID ID: 0000-0003-3725-3397

Abstract

Dysregulation of DNA methylation is a feature of mature B cell neoplasms (MBCN) but it is not known whether methylation changes can be detected in blood-derived DNA prior to MBCN diagnosis.

In this prospective cohort study, peripheral blood was collected from healthy participants at recruitment (1990-1994). Participants who were subsequently diagnosed with MBCN (chronic lymphocytic lymphoma, B cell non-Hodgkin lymphoma and myeloma) up to 2012 were matched to the same number of controls based on age, sex, ethnicity, and type of blood sample (Guthrie cards, mononuclear cells, buffy coats). DNA methylation was measured using the Infinium® HumanMethylation450 BeadChip. Peripheral blood DNA was collected from 438 matched case-control pairs, a median of 10.6 years prior to diagnosis with MBCN.

A series of analytical approaches was used in order to evaluate whether there was a distinct methylation profile associated with MBCN. First, global methylation analysis was performed, identifying increased methylation in CpG island and promoter-associated CpGs and widespread hypomethylation. Second, conditional logistic regression was used to identify differentially methylated CpG sites (DMPs) and kernel smoothing was used to identify differentially methylated regions (DMRs). Third, differential methylation variability, considered to be a distinctive feature in cancer, was assessed.
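The conditional logistic regression step described above can be illustrated with a minimal sketch. It is not the thesis pipeline. It assumes 1:1 matched pairs, a single CpG site and no adjustment covariates, in which case the conditional likelihood of each pair reduces to 1 / (1 + exp(-beta * d)), where d is the case-minus-control difference in methylation. The function name `fit_paired_clogit` and the toy values are hypothetical and chosen only for illustration.

```python
import math

import numpy as np


def fit_paired_clogit(case_x, control_x, n_iter=50, tol=1e-12):
    """Conditional logistic regression for 1:1 matched pairs, one covariate.

    Hypothetical helper for illustration only. Returns (beta_hat, std_error),
    where beta_hat is the log odds ratio per unit increase in the covariate.
    """
    # Within-pair difference: case value minus matched control value.
    d = np.asarray(case_x, dtype=float) - np.asarray(control_x, dtype=float)
    beta = 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-beta * d))     # conditional probability per pair
        score = np.sum(d * (1.0 - p))           # first derivative of log-likelihood
        info = np.sum(d ** 2 * p * (1.0 - p))   # negative second derivative
        if info == 0.0:
            raise ValueError("no informative (discordant) pairs")
        step = score / info                     # Newton-Raphson update
        beta += step
        if abs(step) < tol:
            break
    # Standard error from the observed information at the final estimate.
    p = 1.0 / (1.0 + np.exp(-beta * d))
    info = np.sum(d ** 2 * p * (1.0 - p))
    return beta, 1.0 / math.sqrt(info)


# Toy data (invented): methylation values at one CpG for four matched pairs.
# Three pairs have the case one unit higher, one pair has the case one unit lower.
case_values = [1.5, 2.0, 0.5, 1.0]
control_values = [0.5, 1.0, -0.5, 2.0]

beta_hat, se = fit_paired_clogit(case_values, control_values)
print(f"log odds ratio = {beta_hat:.4f}, odds ratio = {math.exp(beta_hat):.2f}")
```

With three pairs discordant in one direction and one in the other, the estimate converges to log(3), matching the familiar matched-pair odds ratio of 3 to 1. In an array-wide analysis, a fit of this kind would be repeated for each CpG site and followed by a multiple-testing correction; established implementations such as `clogit` in the R survival package handle covariates and more general matching.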
In total, 1,338 DMPs were identified, of which 90 had gain of methylation in CpG sites associated with homeobox genes and 1,248 had loss of methylation in CpG sites associated with MAPK signaling pathway genes and genes involved in chemokine signaling pathways. There were 9,857 DMRs, with a cluster of 151 DMRs located in a 3.8kb region on 6p21.3, corresponding to the major histocompatibility locus. Differential methylation variability analysis identified 144 novel CpG sites distinctively located outside CpG islands.

Conclusion: Distinctive changes in peripheral blood DNA methylation can be detected many years prior to diagnosis with MBCN, suggesting that changes in DNA methylation are an early epigenetic event. This contributes to our understanding of the timing of methylation changes in the development of MBCN.

Declaration

This is to certify that:
(i) the thesis comprises only my original work towards the PhD except where indicated,
(ii) due acknowledgement has been made in the text to all other material used,
(iii) the thesis is less than 100,000 words in length, exclusive of tables, maps, bibliographies, appendices and footnotes.

Preface

(i) The work contained within this thesis was performed as a collaboration with Graham G. Giles, Dallas R. English and John L. Hopper at the School of Population and Global Health, University of Melbourne and the Cancer Epidemiology and Intelligence Division, Cancer Council Victoria, who established the Melbourne Collaborative Cohort Study (MCCS); Melissa C. Southey, JiHoon E. Joo and Ee Ming Wong at the Genetic Epidemiology Laboratory, University of Melbourne who performed the DNA methylation assay; Enes Makalic, Daniel F. Schmidt and Chol-Hee Jung who performed the bioinformatics analysis.
The component of the work which I contributed as original research was to assist in planning the nested study design, wholly planning and designing the statistical analyses, wholly interpreting the results and writing of the manuscript. The regional methylation analysis was performed by JiHoon E. Joo, according to specifications outlined by me and I interpreted the analysis in full. The differential variability analysis was performed by Dr Pierre Antoine-Dugué, and I interpreted the results.
(ii) No portion of this thesis has been submitted for other qualifications
(iii) No portion of this thesis was carried out prior to enrolment in the degree
(iv) No third party editorial assistance was provided in preparation of the thesis
(v) For the publication included in this thesis, the roles of the authors are as follows:
Enes Makalic – biostatistical analysis of array data
JiHoon E. Joo – DNA methylation assay
Claire M Vajdic – contribution to manuscript
Daniel F. Schmidt – biostatistical analysis of array data
Ee Ming Wong – DNA methylation assay
Gianluca Severi – contribution to study design and review of manuscript
Daniel J. Park – contribution to manuscript
Jessica Chung – contribution to developing bioinformatics pipeline
Laura Baglietto – contribution to study design and review of manuscript
Henry M. Prince – contribution to manuscript
John F. Seymour – contribution to manuscript
Constantine Tam – contribution to manuscript
John L. Hopper – contribution to study design
Dallas R. English – contribution to study design
Roger L. Milne – contribution to manuscript
Simon J. Harrison – contribution to manuscript
Melissa C. Southey – DNA methylation assay and contribution to study design
Graham G. Giles – contribution to study design and manuscript
(vii) The MCCS was funded by VicHealth and Cancer Council Victoria.
The MCCS was further supported by Australian NHMRC grants 209057, 251553 and 504711 and by infrastructure provided by Cancer Council Victoria.

Acknowledgements

Nicholas Brennan – for your patience and understanding
Ella & Luca – you embody the forces of nature and time
Elsa & Victor Wong Doo – a lifetime of support and encouragement
Colleagues at Concord Hospital – for your unquestioning support

Table of Contents

Abstract
Declaration
Preface
Acknowledgements
Table of Contents
List of Tables
List of Figures
List of Abbreviations
1 Introduction
2 Background
  2.1 Mature B cell neoplasms – Background
  2.2 DNA Methylation
  2.3 Differential methylation as a marker of cancer risk
  2.4 Measuring DNA methylation
  2.5 Measures of differential methylation
  2.6 Biological challenges in measuring DNA methylation
3 Study Design
  3.1 Melbourne Collaborative Cohort Study
  3.2 Nested Case-Control Study, participant selection
4 Methods
  4.1 DNA source and sample collection
  4.2 DNA Extraction and Bisulfite conversion
  4.3 DNA methylation measurement
  4.4 Data processing
  4.5 CpG site selection
  4.6 Assembly of Candidate Genes
5 Results
  5.1 Global DNA Methylation
  5.2 Differentially methylated positions
    Background
    Analysis
    Results