A CATH Domain Functional Family Based Approach to Identify
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Improving the Prediction of Transcription Factor Binding Sites To
Improving the prediction of transcription factor binding sites to aid the interpretation of non-coding single nucleotide variants. Narayan Jayaram Research Department of Structural and Molecular Biology University College London A thesis submitted to University College London for the degree of Doctor of Philosophy 1 Declaration I, Narayan Jayaram confirm that the work presented in this thesis is my own. Where information has been derived from other sources, I confirm that this has been indicated in the thesis. Narayan Jayaram 2 Abstract Single nucleotide variants (SNVs) that occur in transcription factor binding sites (TFBSs) can disrupt the binding of transcription factors and alter gene expression which can cause inherited diseases and act as driver SNVs in cancer. The identification of SNVs in TFBSs has historically been challenging given the limited number of experimentally characterised TFBSs. The recent ENCODE project has resulted in the availability of ChIP-Seq data that provides genome wide sets of regions bound by transcription factors. These data have the potential to improve the identification of SNVs in TFBSs. However, as the ChIP-Seq data identify a broader range of DNA in which a transcription factor binds, computational prediction is required to identify the precise TFBS. Prediction of TFBSs involves scanning a DNA sequence with a Position Weight Matrix (PWM) using a pattern matching tool. This thesis focusses on the prediction of TFBSs by: (a) evaluating a set of locally-installable pattern-matching tools and identifying the best performing tool (FIMO), (b) using the ENCODE ChIP-Seq data to evaluate a set of de novo motif discovery tools that are used to derive PWMs which can handle large volumes of data, (c) identifying the best performing tool (rGADEM), (d) using rGADEM to generate a set of PWMs from the ENCODE ChIP-Seq data and (e) by finally checking that the selection of the best pattern matching tool is not unduly influenced by the choice of PWMs. -
Genetic Variants Contribute to Gene Expression Variability in Humans
INVESTIGATION Genetic Variants Contribute to Gene Expression Variability in Humans Amanda M. Hulse* and James J. Cai*,†,1 *Interdisciplinary Program in Genetics and †Department of Veterinary Integrative Biosciences, Texas A&M University, College Station, Texas 77843-4458 ABSTRACT Expression quantitative trait loci (eQTL) studies have established convincing relationships between genetic variants and gene expression. Most of these studies focused on the mean of gene expression level, but not the variance of gene expression level (i.e., gene expression variability). In the present study, we systematically explore genome-wide association between genetic variants and gene expression variability in humans. We adapt the double generalized linear model (dglm) to simultaneously fit the means and the variances of gene expression among the three possible genotypes of a biallelic SNP. The genomic loci showing significant association between the variances of gene expression and the genotypes are termed expression variability QTL (evQTL). Using a data set of gene expression in lymphoblastoid cell lines (LCLs) derived from 210 HapMap individuals, we identify cis-acting evQTL involving 218 distinct genes, among which 8 genes, ADCY1, CTNNA2, DAAM2, FERMT2, IL6, PLOD2, SNX7, and TNFRSF11B, are cross-validated using an extra expression data set of the same LCLs. We also identify 300 trans-acting evQTL between .13,000 common SNPs and 500 randomly selected representative genes. We employ two distinct scenarios, emphasizing single-SNP and multiple-SNP effects on expression variability, to explain the formation of evQTL. We argue that detecting evQTL may represent a novel method for effectively screening for genetic interactions, especially when the multiple-SNP influence on expression variability is implied. -
Applied Category Theory for Genomics – an Initiative
Applied Category Theory for Genomics { An Initiative Yanying Wu1,2 1Centre for Neural Circuits and Behaviour, University of Oxford, UK 2Department of Physiology, Anatomy and Genetics, University of Oxford, UK 06 Sept, 2020 Abstract The ultimate secret of all lives on earth is hidden in their genomes { a totality of DNA sequences. We currently know the whole genome sequence of many organisms, while our understanding of the genome architecture on a systematic level remains rudimentary. Applied category theory opens a promising way to integrate the humongous amount of heterogeneous informations in genomics, to advance our knowledge regarding genome organization, and to provide us with a deep and holistic view of our own genomes. In this work we explain why applied category theory carries such a hope, and we move on to show how it could actually do so, albeit in baby steps. The manuscript intends to be readable to both mathematicians and biologists, therefore no prior knowledge is required from either side. arXiv:2009.02822v1 [q-bio.GN] 6 Sep 2020 1 Introduction DNA, the genetic material of all living beings on this planet, holds the secret of life. The complete set of DNA sequences in an organism constitutes its genome { the blueprint and instruction manual of that organism, be it a human or fly [1]. Therefore, genomics, which studies the contents and meaning of genomes, has been standing in the central stage of scientific research since its birth. The twentieth century witnessed three milestones of genomics research [1]. It began with the discovery of Mendel's laws of inheritance [2], sparked a climax in the middle with the reveal of DNA double helix structure [3], and ended with the accomplishment of a first draft of complete human genome sequences [4]. -
A Community Proposal to Integrate Structural
F1000Research 2020, 9(ELIXIR):278 Last updated: 11 JUN 2020 OPINION ARTICLE A community proposal to integrate structural bioinformatics activities in ELIXIR (3D-Bioinfo Community) [version 1; peer review: 1 approved, 3 approved with reservations] Christine Orengo1, Sameer Velankar2, Shoshana Wodak3, Vincent Zoete4, Alexandre M.J.J. Bonvin 5, Arne Elofsson 6, K. Anton Feenstra 7, Dietland L. Gerloff8, Thomas Hamelryck9, John M. Hancock 10, Manuela Helmer-Citterich11, Adam Hospital12, Modesto Orozco12, Anastassis Perrakis 13, Matthias Rarey14, Claudio Soares15, Joel L. Sussman16, Janet M. Thornton17, Pierre Tuffery 18, Gabor Tusnady19, Rikkert Wierenga20, Tiina Salminen21, Bohdan Schneider 22 1Structural and Molecular Biology Department, University College, London, UK 2Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, CB10 1SD, UK 3VIB-VUB Center for Structural Biology, Brussels, Belgium 4Department of Oncology, Lausanne University, Swiss Institute of Bioinformatics, Lausanne, Switzerland 5Bijvoet Center, Faculty of Science – Chemistry, Utrecht University, Utrecht, 3584CH, The Netherlands 6Science for Life Laboratory, Stockholm University, Solna, S-17121, Sweden 7Dept. Computer Science, Center for Integrative Bioinformatics VU (IBIVU), Vrije Universiteit, Amsterdam, 1081 HV, The Netherlands 8Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Belvaux, L-4367, Luxembourg 9Bioinformatics center, Department of Biology, University of Copenhagen, Copenhagen, DK-2200, -
Genome-Wide Analysis of DNA Methylation In
Villalba‑Benito et al. Clin Epigenet (2021) 13:51 https://doi.org/10.1186/s13148‑021‑01040‑6 RESEARCH Open Access Genome‑wide analysis of DNA methylation in Hirschsprung enteric precursor cells: unraveling the epigenetic landscape of enteric nervous system development Leticia Villalba‑Benito1,2, Daniel López‑López3,4, Ana Torroglosa1,2, Carlos S. Casimiro‑Soriguer3,4, Berta Luzón‑Toro1,2, Raquel María Fernández1,2, María José Moya‑Jiménez5, Guillermo Antiñolo1,2, Joaquín Dopazo2,3,4 and Salud Borrego1,2* Abstract Background: Hirschsprung disease (HSCR, OMIM 142623) is a rare congenital disorder that results from a failure to fully colonize the gut by enteric precursor cells (EPCs) derived from the neural crest. Such incomplete gut coloniza‑ tion is due to alterations in EPCs proliferation, survival, migration and/or diferentiation during enteric nervous system (ENS) development. This complex process is regulated by a network of signaling pathways that is orchestrated by genetic and epigenetic factors, and therefore alterations at these levels can lead to the onset of neurocristopathies such as HSCR. The goal of this study is to broaden our knowledge of the role of epigenetic mechanisms in the disease context, specifcally in DNA methylation. Therefore, with this aim, a Whole‑Genome Bisulfte Sequencing assay has been performed using EPCs from HSCR patients and human controls. Results: This is the frst study to present a whole genome DNA methylation profle in HSCR and reveal a decrease of global DNA methylation in CpG context in HSCR patients compared with controls, which correlates with a greater hypomethylation of the diferentially methylated regions (DMRs) identifed. -
Bayesian Hierarchical Modeling of High-Throughput Genomic Data with Applications to Cancer Bioinformatics and Stem Cell Differentiation
BAYESIAN HIERARCHICAL MODELING OF HIGH-THROUGHPUT GENOMIC DATA WITH APPLICATIONS TO CANCER BIOINFORMATICS AND STEM CELL DIFFERENTIATION by Keegan D. Korthauer A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy (Statistics) at the UNIVERSITY OF WISCONSIN–MADISON 2015 Date of final oral examination: 05/04/15 The dissertation is approved by the following members of the Final Oral Committee: Christina Kendziorski, Professor, Biostatistics and Medical Informatics Michael A. Newton, Professor, Statistics Sunduz Kele¸s,Professor, Biostatistics and Medical Informatics Sijian Wang, Associate Professor, Biostatistics and Medical Informatics Michael N. Gould, Professor, Oncology © Copyright by Keegan D. Korthauer 2015 All Rights Reserved i in memory of my grandparents Ma and Pa FL Grandma and John ii ACKNOWLEDGMENTS First and foremost, I am deeply grateful to my thesis advisor Christina Kendziorski for her invaluable advice, enthusiastic support, and unending patience throughout my time at UW-Madison. She has provided sound wisdom on everything from methodological principles to the intricacies of academic research. I especially appreciate that she has always encouraged me to eke out my own path and I attribute a great deal of credit to her for the successes I have achieved thus far. I also owe special thanks to my committee member Professor Michael Newton, who guided me through one of my first collaborative research experiences and has continued to provide key advice on my thesis research. I am also indebted to the other members of my thesis committee, Professor Sunduz Kele¸s,Professor Sijian Wang, and Professor Michael Gould, whose valuable comments, questions, and suggestions have greatly improved this dissertation. -
Analysis of Gene Expression Data for Gene Ontology
ANALYSIS OF GENE EXPRESSION DATA FOR GENE ONTOLOGY BASED PROTEIN FUNCTION PREDICTION A Thesis Presented to The Graduate Faculty of The University of Akron In Partial Fulfillment of the Requirements for the Degree Master of Science Robert Daniel Macholan May 2011 ANALYSIS OF GENE EXPRESSION DATA FOR GENE ONTOLOGY BASED PROTEIN FUNCTION PREDICTION Robert Daniel Macholan Thesis Approved: Accepted: _______________________________ _______________________________ Advisor Department Chair Dr. Zhong-Hui Duan Dr. Chien-Chung Chan _______________________________ _______________________________ Committee Member Dean of the College Dr. Chien-Chung Chan Dr. Chand K. Midha _______________________________ _______________________________ Committee Member Dean of the Graduate School Dr. Yingcai Xiao Dr. George R. Newkome _______________________________ Date ii ABSTRACT A tremendous increase in genomic data has encouraged biologists to turn to bioinformatics in order to assist in its interpretation and processing. One of the present challenges that need to be overcome in order to understand this data more completely is the development of a reliable method to accurately predict the function of a protein from its genomic information. This study focuses on developing an effective algorithm for protein function prediction. The algorithm is based on proteins that have similar expression patterns. The similarity of the expression data is determined using a novel measure, the slope matrix. The slope matrix introduces a normalized method for the comparison of expression levels throughout a proteome. The algorithm is tested using real microarray gene expression data. Their functions are characterized using gene ontology annotations. The results of the case study indicate the protein function prediction algorithm developed is comparable to the prediction algorithms that are based on the annotations of homologous proteins. -
Mouse Net1 Knockout Project (CRISPR/Cas9)
https://www.alphaknockout.com Mouse Net1 Knockout Project (CRISPR/Cas9) Objective: To create a Net1 knockout Mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering. Strategy summary: The Net1 gene (NCBI Reference Sequence: NM_019671 ; Ensembl: ENSMUSG00000021215 ) is located on Mouse chromosome 13. 12 exons are identified, with the ATG start codon in exon 1 and the TAA stop codon in exon 12 (Transcript: ENSMUST00000091853). Exon 4~11 will be selected as target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO Mouse production. The pups will be genotyped by PCR followed by sequencing analysis. Note: Mice homozygous for a knock-out allele exhibit delayed mammary gland development during puberty associated with slower ductal extension, reduced ductal branching and epithelial cell proliferation, disorganized myoepithelial and ductal epithelial cells, and increased collagen deposition. Exon 4 starts from about 14.34% of the coding region. Exon 4~11 covers 63.25% of the coding region. The size of effective KO region: ~4176 bp. The KO region does not have any other known gene. Page 1 of 9 https://www.alphaknockout.com Overview of the Targeting Strategy Wildtype allele 5' gRNA region gRNA region 3' 1 4 5 6 7 8 9 10 11 12 Legends Exon of mouse Net1 Knockout region Page 2 of 9 https://www.alphaknockout.com Overview of the Dot Plot (up) Window size: 15 bp Forward Reverse Complement Sequence 12 Note: The 2000 bp section upstream of Exon 4 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. -
Chromosome 10, Frequently Lost in Human Melanoma, Encodes Multiple Tumor-Suppressive Functions
Published OnlineFirst January 22, 2014; DOI: 10.1158/0008-5472.CAN-13-1446 Cancer Tumor and Stem Cell Biology Research Chromosome 10, Frequently Lost in Human Melanoma, Encodes Multiple Tumor-Suppressive Functions Lawrence N. Kwong and Lynda Chin Abstract Although many DNA aberrations in melanoma have been well characterized, including focal amplification and deletions of oncogenes and tumor suppressors, broad regions of chromosomal gain and loss are less well understood. One possibility is that these broad events are a consequence of collateral damage from targeting single loci. Another possibility is that the loss of large regions permits the simultaneous repression of multiple tumor suppressors by broadly decreasing the resident gene dosage and expression. Here, we test this hypothesis in a targeted fashion using RNA interference to suppress multiple candidate residents in broad regions of loss. We find that loss of chromosome regions 6q, 10, and 11q21-ter is correlated with broadly decreased expression of most resident genes and that multiple resident genes impacted by broad regional loss of chromosome 10 are tumor suppressors capable of affecting tumor growth and/or invasion. We also provide additional functional support for Ablim1 as a novel tumor suppressor. Our results support the hypothesis that multiple cancer genes are targeted by regional chromosome copy number aberrations. Cancer Res; 74(6); 1814–21. Ó2014 AACR. Introduction mas (7). In contrast, the observed approximately 60% of CDKN2A/B Increasingly high-resolution genomic studies have estab- deletions on chromosome 9p are focal. These data PTEN lished that recurrent focal deletions and amplifications in suggest that although is a major driver of chromosome cancer can selectively target specific oncogenes and tumor 10 loss, other genes may also be targeted for inactivation. -
Environmental Influences on Endothelial Gene Expression
ENDOTHELIAL CELL GENE EXPRESSION John Matthew Jeff Herbert Supervisors: Prof. Roy Bicknell and Dr. Victoria Heath PhD thesis University of Birmingham August 2012 University of Birmingham Research Archive e-theses repository This unpublished thesis/dissertation is copyright of the author and/or third parties. The intellectual property rights of the author or third parties in respect of this work are as defined by The Copyright Designs and Patents Act 1988 or as modified by any successor legislation. Any use made of information contained in this thesis/dissertation must be in accordance with that legislation and must be properly acknowledged. Further distribution or reproduction in any format is prohibited without the permission of the copyright holder. ABSTRACT Tumour angiogenesis is a vital process in the pathology of tumour development and metastasis. Targeting markers of tumour endothelium provide a means of targeted destruction of a tumours oxygen and nutrient supply via destruction of tumour vasculature, which in turn ultimately leads to beneficial consequences to patients. Although current anti -angiogenic and vascular targeting strategies help patients, more potently in combination with chemo therapy, there is still a need for more tumour endothelial marker discoveries as current treatments have cardiovascular and other side effects. For the first time, the analyses of in-vivo biotinylation of an embryonic system is performed to obtain putative vascular targets. Also for the first time, deep sequencing is applied to freshly isolated tumour and normal endothelial cells from lung, colon and bladder tissues for the identification of pan-vascular-targets. Integration of the proteomic, deep sequencing, public cDNA libraries and microarrays, delivers 5,892 putative vascular targets to the science community. -
A Computational Approach for Defining a Signature of Β-Cell Golgi Stress in Diabetes Mellitus
Page 1 of 781 Diabetes A Computational Approach for Defining a Signature of β-Cell Golgi Stress in Diabetes Mellitus Robert N. Bone1,6,7, Olufunmilola Oyebamiji2, Sayali Talware2, Sharmila Selvaraj2, Preethi Krishnan3,6, Farooq Syed1,6,7, Huanmei Wu2, Carmella Evans-Molina 1,3,4,5,6,7,8* Departments of 1Pediatrics, 3Medicine, 4Anatomy, Cell Biology & Physiology, 5Biochemistry & Molecular Biology, the 6Center for Diabetes & Metabolic Diseases, and the 7Herman B. Wells Center for Pediatric Research, Indiana University School of Medicine, Indianapolis, IN 46202; 2Department of BioHealth Informatics, Indiana University-Purdue University Indianapolis, Indianapolis, IN, 46202; 8Roudebush VA Medical Center, Indianapolis, IN 46202. *Corresponding Author(s): Carmella Evans-Molina, MD, PhD ([email protected]) Indiana University School of Medicine, 635 Barnhill Drive, MS 2031A, Indianapolis, IN 46202, Telephone: (317) 274-4145, Fax (317) 274-4107 Running Title: Golgi Stress Response in Diabetes Word Count: 4358 Number of Figures: 6 Keywords: Golgi apparatus stress, Islets, β cell, Type 1 diabetes, Type 2 diabetes 1 Diabetes Publish Ahead of Print, published online August 20, 2020 Diabetes Page 2 of 781 ABSTRACT The Golgi apparatus (GA) is an important site of insulin processing and granule maturation, but whether GA organelle dysfunction and GA stress are present in the diabetic β-cell has not been tested. We utilized an informatics-based approach to develop a transcriptional signature of β-cell GA stress using existing RNA sequencing and microarray datasets generated using human islets from donors with diabetes and islets where type 1(T1D) and type 2 diabetes (T2D) had been modeled ex vivo. To narrow our results to GA-specific genes, we applied a filter set of 1,030 genes accepted as GA associated. -
Functional Parsing of Driver Mutations in the Colorectal Cancer Genome Reveals Numerous Suppressors of Anchorage-Independent
Supplementary information Functional parsing of driver mutations in the colorectal cancer genome reveals numerous suppressors of anchorage-independent growth Ugur Eskiocak1, Sang Bum Kim1, Peter Ly1, Andres I. Roig1, Sebastian Biglione1, Kakajan Komurov2, Crystal Cornelius1, Woodring E. Wright1, Michael A. White1, and Jerry W. Shay1. 1Department of Cell Biology, University of Texas Southwestern Medical Center, 5323 Harry Hines Boulevard, Dallas, TX 75390-9039. 2Department of Systems Biology, University of Texas M.D. Anderson Cancer Center, Houston, TX 77054. Supplementary Figure S1. K-rasV12 expressing cells are resistant to p53 induced apoptosis. Whole-cell extracts from immortalized K-rasV12 or p53 down regulated HCECs were immunoblotted with p53 and its down-stream effectors after 10 Gy gamma-radiation. ! Supplementary Figure S2. Quantitative validation of selected shRNAs for their ability to enhance soft-agar growth of immortalized shTP53 expressing HCECs. Each bar represents 8 data points (quadruplicates from two separate experiments). Arrows denote shRNAs that failed to enhance anchorage-independent growth in a statistically significant manner. Enhancement for all other shRNAs were significant (two tailed Studentʼs t-test, compared to none, mean ± s.e.m., P<0.05)." ! Supplementary Figure S3. Ability of shRNAs to knockdown expression was demonstrated by A, immunoblotting for K-ras or B-E, Quantitative RT-PCR for ERICH1, PTPRU, SLC22A15 and SLC44A4 48 hours after transfection into 293FT cells. Two out of 23 tested shRNAs did not provide any knockdown. " ! Supplementary Figure S4. shRNAs against A, PTEN and B, NF1 do not enhance soft agar growth in HCECs without oncogenic manipulations (Student!s t-test, compared to none, mean ± s.e.m., ns= non-significant).