Analysis, Visualization, and Machine Learning of Epigenomic Data
Total Page:16
File Type:pdf, Size:1020Kb

Load more
Recommended publications
-
Aberrant Methylation Underlies Insulin Gene Expression in Human Insulinoma
ARTICLE https://doi.org/10.1038/s41467-020-18839-1 OPEN Aberrant methylation underlies insulin gene expression in human insulinoma Esra Karakose1,6, Huan Wang 2,6, William Inabnet1, Rajesh V. Thakker 3, Steven Libutti4, Gustavo Fernandez-Ranvier 1, Hyunsuk Suh1, Mark Stevenson 3, Yayoi Kinoshita1, Michael Donovan1, Yevgeniy Antipin1,2, Yan Li5, Xiaoxiao Liu 5, Fulai Jin 5, Peng Wang 1, Andrew Uzilov 1,2, ✉ Carmen Argmann 1, Eric E. Schadt 1,2, Andrew F. Stewart 1,7 , Donald K. Scott 1,7 & Luca Lambertini 1,6 1234567890():,; Human insulinomas are rare, benign, slowly proliferating, insulin-producing beta cell tumors that provide a molecular “recipe” or “roadmap” for pathways that control human beta cell regeneration. An earlier study revealed abnormal methylation in the imprinted p15.5-p15.4 region of chromosome 11, known to be abnormally methylated in another disorder of expanded beta cell mass and function: the focal variant of congenital hyperinsulinism. Here, we compare deep DNA methylome sequencing on 19 human insulinomas, and five sets of normal beta cells. We find a remarkably consistent, abnormal methylation pattern in insu- linomas. The findings suggest that abnormal insulin (INS) promoter methylation and altered transcription factor expression create alternative drivers of INS expression, replacing cano- nical PDX1-driven beta cell specification with a pathological, looping, distal enhancer-based form of transcriptional regulation. Finally, NFaT transcription factors, rather than the cano- nical PDX1 enhancer complex, are predicted to drive INS transactivation. 1 From the Diabetes Obesity and Metabolism Institute, The Department of Surgery, The Department of Pathology, The Department of Genetics and Genomics Sciences and The Institute for Genomics and Multiscale Biology, The Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA. -
Analysis of Gene Expression Data for Gene Ontology
ANALYSIS OF GENE EXPRESSION DATA FOR GENE ONTOLOGY BASED PROTEIN FUNCTION PREDICTION A Thesis Presented to The Graduate Faculty of The University of Akron In Partial Fulfillment of the Requirements for the Degree Master of Science Robert Daniel Macholan May 2011 ANALYSIS OF GENE EXPRESSION DATA FOR GENE ONTOLOGY BASED PROTEIN FUNCTION PREDICTION Robert Daniel Macholan Thesis Approved: Accepted: _______________________________ _______________________________ Advisor Department Chair Dr. Zhong-Hui Duan Dr. Chien-Chung Chan _______________________________ _______________________________ Committee Member Dean of the College Dr. Chien-Chung Chan Dr. Chand K. Midha _______________________________ _______________________________ Committee Member Dean of the Graduate School Dr. Yingcai Xiao Dr. George R. Newkome _______________________________ Date ii ABSTRACT A tremendous increase in genomic data has encouraged biologists to turn to bioinformatics in order to assist in its interpretation and processing. One of the present challenges that need to be overcome in order to understand this data more completely is the development of a reliable method to accurately predict the function of a protein from its genomic information. This study focuses on developing an effective algorithm for protein function prediction. The algorithm is based on proteins that have similar expression patterns. The similarity of the expression data is determined using a novel measure, the slope matrix. The slope matrix introduces a normalized method for the comparison of expression levels throughout a proteome. The algorithm is tested using real microarray gene expression data. Their functions are characterized using gene ontology annotations. The results of the case study indicate the protein function prediction algorithm developed is comparable to the prediction algorithms that are based on the annotations of homologous proteins. -
Original Article Correlation Between LSP1 Polymorphisms and the Susceptibility to Breast Cancer
Int J Clin Exp Pathol 2015;8(5):5798-5802 www.ijcep.com /ISSN:1936-2625/IJCEP0005835 Original Article Correlation between LSP1 polymorphisms and the susceptibility to breast cancer Hai Chen, Xiaodong Qi, Ping Qiu, Jiali Zhao Department of Galactophore, The General Hospital of Beijing Military Command, Beijing, China Received January 12, 2015; Accepted March 16, 2015; Epub May 1, 2015; Published May 15, 2015 Abstract: Objective: The present study aimed at assessing the relationship between Leukocyte-specific protein 1 gene (LSP1) polymorphisms (rs569550 and rs592373) and the pathogenesis of breast cancer (BC). Methods: 70 BC patients and 72 healthy subjects were enrolled in the study. Rs569550 and rs592373 polymorphisms were genotyped by polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP). Odds ratio (OR) with 95% confidence interval (CI) were calculated by the chi-squared test to assess the relationship between LSP1 polymorphisms and BC risk. Linkage disequilibrium (LD) and haplotypes were also analyzed by HaploView software. Results: Genotype distribution of the control was in accordance with Hardy-Weinberg equilibrium (HWE). The homo- zygous genotype TT and T allele of rs569550 could significantly increase the risk of BC (TT vs. GG: OR=3.17, 95% CI=1.23-8.91; T vs. G: OR=1.63, 95% CI=1.01-2.64). For rs592373, mutation homozygous genotype CC and C allele were significantly associated with BC susceptibility (CC vs. TT: OR=4.45, 95% CI=1.38-14.8; C vs. T: OR=1.70, 95% CI=1.03-2.81). LD and haplotypes analysis of rs569550 and rs592373 polymorphisms showed that T-C haplotype was a risk factor for BC (T-C vs. -
Meta-Analysis of Nasopharyngeal Carcinoma
BMC Genomics BioMed Central Research article Open Access Meta-analysis of nasopharyngeal carcinoma microarray data explores mechanism of EBV-regulated neoplastic transformation Xia Chen†1,2, Shuang Liang†1, WenLing Zheng1,3, ZhiJun Liao1, Tao Shang1 and WenLi Ma*1 Address: 1Institute of Genetic Engineering, Southern Medical University, Guangzhou, PR China, 2Xiangya Pingkuang associated hospital, Pingxiang, Jiangxi, PR China and 3Southern Genomics Research Center, Guangzhou, Guangdong, PR China Email: Xia Chen - [email protected]; Shuang Liang - [email protected]; WenLing Zheng - [email protected]; ZhiJun Liao - [email protected]; Tao Shang - [email protected]; WenLi Ma* - [email protected] * Corresponding author †Equal contributors Published: 7 July 2008 Received: 16 February 2008 Accepted: 7 July 2008 BMC Genomics 2008, 9:322 doi:10.1186/1471-2164-9-322 This article is available from: http://www.biomedcentral.com/1471-2164/9/322 © 2008 Chen et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Abstract Background: Epstein-Barr virus (EBV) presumably plays an important role in the pathogenesis of nasopharyngeal carcinoma (NPC), but the molecular mechanism of EBV-dependent neoplastic transformation is not well understood. The combination of bioinformatics with evidences from biological experiments paved a new way to gain more insights into the molecular mechanism of cancer. Results: We profiled gene expression using a meta-analysis approach. Two sets of meta-genes were obtained. Meta-A genes were identified by finding those commonly activated/deactivated upon EBV infection/reactivation. -
LSP1-Myosin1e Bi-Molecular Complex Regulates Focal Adhesion Dynamics
bioRxiv preprint doi: https://doi.org/10.1101/2020.02.26.963991; this version posted February 27, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. LSP1-myosin1e bi-molecular complex regulates focal adhesion dynamics and cell migration Katja Schäringer1*, Sebastian Maxeiner1*, Carmen Schalla1, Stephan Rütten2, Martin Zenke1 and Antonio Sechi1 1Institute of Biomedical Engineering, Dept. of Cell Biology, RWTH Aachen University, Pauwelsstrasse, 30, D-52074 Aachen, Germany 2Electron Microscopy Facility, Institute of Pathology, RWTH Aachen University, Pauwelsstrasse, 30, D-52074 Aachen, Germany *equal contribution Corresponding author: Email: [email protected] Telephone: +49 241 8085248 Running head: LSP1-myosin1e complex regulates cell migration. Keywords: Actin cytoskeleton remodelling, focal adhesions, cell motility, macrophages. 1 bioRxiv preprint doi: https://doi.org/10.1101/2020.02.26.963991; this version posted February 27, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. Abstract Several cytoskeleton-associated proteins and signalling pathways work in concert to regulate actin cytoskeleton remodelling, cell adhesion and migration. We have recently demonstrated that the bi-molecular complex between the leukocyte-specific protein 1 (LSP1) and myosin1e controls actin cytoskeleton remodelling during phagocytosis. In this study, we show that LSP1 down regulation severely impairs cell migration, lamellipodia formation and focal adhesion dynamics in macrophages. Inhibition of the interaction between LSP1 and myosin1e also impairs these processes resulting in poorly motile cells, which are characterised by few and small lamellipodia. -
Anti-LSP1 Antibody (ARG55377)
Product datasheet [email protected] ARG55377 Package: 50 μl anti-LSP1 antibody Store at: -20°C Summary Product Description Rabbit Polyclonal antibody recognizes LSP1 Tested Reactivity Hu Tested Application WB Host Rabbit Clonality Polyclonal Isotype IgG Target Name LSP1 Antigen Species Human Immunogen Synthetic peptide corresponding to aa. 121-149 of Human LSP1. Conjugation Un-conjugated Alternate Names Lymphocyte-specific protein 1; pp52; 52 kDa phosphoprotein; 47 kDa actin-binding protein; WP34; Lymphocyte-specific antigen WP34 Application Instructions Application table Application Dilution WB 1:1000 Application Note * The dilutions indicate recommended starting dilutions and the optimal dilutions or concentrations should be determined by the scientist. Positive Control Daudi Calculated Mw 37 kDa Properties Form Liquid Purification Affinity purification with immunogen. Buffer PBS (without Mg2+ and Ca2+, pH 7.4), 150mM NaCl, 0.02% Sodium azide and 50% Glycerol Preservative 0.02% Sodium azide Stabilizer 50% Glycerol Storage instruction For continuous use, store undiluted antibody at 2-8°C for up to a week. For long-term storage, aliquot and store at -20°C. Storage in frost free freezers is not recommended. Avoid repeated freeze/thaw cycles. Suggest spin the vial prior to opening. The antibody solution should be gently mixed before use. Note For laboratory research only, not for drug, diagnostic or other use. www.arigobio.com 1/2 Bioinformation Database links GeneID: 4046 Human Swiss-port # P33241 Human Gene Symbol LSP1 Gene Full Name lymphocyte-specific protein 1 Background This gene encodes an intracellular F-actin binding protein. The protein is expressed in lymphocytes, neutrophils, macrophages, and endothelium and may regulate neutrophil motility, adhesion to fibrinogen matrix proteins, and transendothelial migration. -
Expression of Mouse LSP1/S37 Isoforms S37 Is Expressed in Embryonic Mesenchymal Cells
Journal of Cell Science 107, 3591-3600 (1994) 3591 Printed in Great Britain © The Company of Biologists Limited 1994 Expression of mouse LSP1/S37 isoforms S37 is expressed in embryonic mesenchymal cells V. L. Misener1, C.-c. Hui2, I. A. Malapitan1, M.-E. Ittel1, A. L. Joyner2,3 and J. Jongstra1,* 1The Arthritis Centre-Research Unit, The Toronto Hospital Research Institute and Department of Immunology, University of Toronto, Toronto, Ontario, Canada 2Division of Molecular and Developmental Biology, Samuel Lunenfeld Research Institute, Mount Sinai Hospital, Toronto, Ontario, Canada 3Department of Molecular and Medical Genetics, University of Toronto, Toronto, Ontario, Canada *Author for correspondence at: Toronto Western Hospital, Room 13-419, 399 Bathurst Street, Toronto, Ontario, M5T 2S8 Canada SUMMARY Mouse LSP1 is a 330 amino acid intracellular F-actin 18 bp encoding the 6 amino acids HLIRHQ of the acidic binding protein expressed in lymphocytes and domain. Therefore, the Lsp1 gene encodes four protein macrophages but not in non-hematopoietic tissues. A 328 isoforms: full-length LSP1 and S37 proteins, designated amino acid LSP1-related protein, designated S37, is LSP1-I and S37-I and the same proteins without the expressed in murine bone marrow stromal cells, in fibro- HLIRHQ sequence, designated LSP1-II and S37-II. By in blasts, and in a myocyte cell line. The two proteins differ situ hybridization analysis we show that the S37 isoforms only at their N termini, the first 23 amino acid residues of are expressed in mesenchymal tissue, but not in adjacent LSP1 being replaced by 21 different residues in S37. The epithelial tissue, of several developing organs during mouse presence of different amino termini suggests that the LSP1 embryogenesis. -
Ontology-Based Methods for Analyzing Life Science Data
Habilitation a` Diriger des Recherches pr´esent´ee par Olivier Dameron Ontology-based methods for analyzing life science data Soutenue publiquement le 11 janvier 2016 devant le jury compos´ede Anita Burgun Professeur, Universit´eRen´eDescartes Paris Examinatrice Marie-Dominique Devignes Charg´eede recherches CNRS, LORIA Nancy Examinatrice Michel Dumontier Associate professor, Stanford University USA Rapporteur Christine Froidevaux Professeur, Universit´eParis Sud Rapporteure Fabien Gandon Directeur de recherches, Inria Sophia-Antipolis Rapporteur Anne Siegel Directrice de recherches CNRS, IRISA Rennes Examinatrice Alexandre Termier Professeur, Universit´ede Rennes 1 Examinateur 2 Contents 1 Introduction 9 1.1 Context ......................................... 10 1.2 Challenges . 11 1.3 Summary of the contributions . 14 1.4 Organization of the manuscript . 18 2 Reasoning based on hierarchies 21 2.1 Principle......................................... 21 2.1.1 RDF for describing data . 21 2.1.2 RDFS for describing types . 24 2.1.3 RDFS entailments . 26 2.1.4 Typical uses of RDFS entailments in life science . 26 2.1.5 Synthesis . 30 2.2 Case study: integrating diseases and pathways . 31 2.2.1 Context . 31 2.2.2 Objective . 32 2.2.3 Linking pathways and diseases using GO, KO and SNOMED-CT . 32 2.2.4 Querying associated diseases and pathways . 33 2.3 Methodology: Web services composition . 39 2.3.1 Context . 39 2.3.2 Objective . 40 2.3.3 Semantic compatibility of services parameters . 40 2.3.4 Algorithm for pairing services parameters . 40 2.4 Application: ontology-based query expansion with GO2PUB . 43 2.4.1 Context . 43 2.4.2 Objective . -
PREDICTD: Parallel Epigenomics Data Imputation with Cloud-Based Tensor Decomposition
bioRxiv preprint doi: https://doi.org/10.1101/123927; this version posted April 4, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license. PREDICTD: PaRallel Epigenomics Data Imputation with Cloud-based Tensor Decomposition Timothy J. Durham Maxwell W. Libbrecht Department of Genome Sciences Department of Genome Sciences University of Washington University of Washington J. Jeffry Howbert Jeff Bilmes Department of Genome Sciences Department of Electrical Engineering University of Washington University of Washington William Stafford Noble Department of Genome Sciences Department of Computer Science and Engineering University of Washington April 4, 2017 Abstract The Encyclopedia of DNA Elements (ENCODE) and the Roadmap Epigenomics Project have produced thousands of data sets mapping the epigenome in hundreds of cell types. How- ever, the number of cell types remains too great to comprehensively map given current time and financial constraints. We present a method, PaRallel Epigenomics Data Imputation with Cloud-based Tensor Decomposition (PREDICTD), to address this issue by computationally im- puting missing experiments in collections of epigenomics experiments. PREDICTD leverages an intuitive and natural model called \tensor decomposition" to impute many experiments si- multaneously. Compared with the current state-of-the-art method, ChromImpute, PREDICTD produces lower overall mean squared error, and combining methods yields further improvement. We show that PREDICTD data can be used to investigate enhancer biology at non-coding human accelerated regions. PREDICTD provides reference imputed data sets and open-source software for investigating new cell types, and demonstrates the utility of tensor decomposition and cloud computing, two technologies increasingly applicable in bioinformatics. -
Manolis Kellis Piotr Indyk
6.095 / 6.895 Computational Biology: Genomes, Networks, Evolution Manolis Kellis Rapid database search Courtsey of CCRNP, The National Cancer Institute. Piotr Indyk Protein interaction network Courtesy of GTL Center for Molecular and Cellular Systems. Genome duplication Courtesy of Talking Glossary of Genetics. Administrivia • Course information – Lecturers: Manolis Kellis and Piotr Indyk • Grading: Part. Problem sets 50% Final Project 25% Midterm 20% 5% • 5 problem sets: – Each problem set: covers 4 lectures, contains 4 problems. – Algorithmic problems and programming assignments – Graduate version includes 5th problem on current research •Exams – In-class midterm, no final exam • Collaboration policy – Collaboration allowed, but you must: • Work independently on each problem before discussing it • Write solutions on your own • Acknowledge sources and collaborators. No outsourcing. Goals for the term • Introduction to computational biology – Fundamental problems in computational biology – Algorithmic/machine learning techniques for data analysis – Research directions for active participation in the field • Ability to tackle research – Problem set questions: algorithmic rigorous thinking – Programming assignments: hands-on experience w/ real datasets • Final project: – Research initiative to propose an innovative project – Ability to carry out project’s goals, produce deliverables – Write-up goals, approach, and findings in conference format – Present your project to your peers in conference setting Course outline • Organization – Duality: -
Prof. Manolis Kellis April 15, 2008
Chromosomes inside the cell Introduction to Algorithms 6.046J/18.401J • Eukaryote cell LECTURE 18 • Prokaryote Computational Biology cell • Bio intro: Regulatory Motifs • Combinatorial motif discovery - Median string finding • Probabilistic motif discovery - Expectation maximization • Comparative genomics Prof. Manolis Kellis April 15, 2008 DNA packaging DNA: The double helix • Why packaging • The most noble molecule of our time – DNA is very long – Cell is very small • Compression – Chromosome is 50,000 times shorter than extended DNA • Using the DNA – Before a piece of DNA is used for anything, this compact structure must open locally ATATTGAATTTTCAAAAATTCTTACTTTTTTTTTGGATGGACGCAAAGAAGTTTAATAATCATATTACATGGCATTACCACCATATA ATATTGAATTTTCAAAAATTCTTACTTTTTTTTTGGATGGACGCAAAGAAGTTTAATAATCATATTACATGGCATTACCACCATATA ATCCATATCTAATCTTACTTATATGTTGTGGAAATGTAAAGAGCCCCATTATCTTAGCCTAAAAAAACCTTCTCTTTGGAACTTTC ATCCATATCTAATCTTACTTATATGTTGTGGAAATGTAAAGAGCCCCATTATCTTAGCCTAAAAAAACCTTCTCTTTGGAACTTTC AATACGCTTAACTGCTCATTGCTATATTGAAGTACGGATTAGAAGCCGCCGAGCGGGCGACAGCCCTCCGACGGAAGACTCTCCTC AATACGCTTAACTGCTCATTGCTATATTGAAGTACGGATTAGAAGCCGCCGAGCGGGCGACAGCCCTCCGACGGAAGACTCTCCTC GCGTCCTCGTCTTCACCGGTCGCGTTCCTGAAACGCAGATGTGCCTCGCGCCGCACTGCTCCGAACAATAAAGATTCTACAATACT GCGTCCTCGTCTTCACCGGTCGCGTTCCTGAAACGCAGATGTGCCTCGCGCCGCACTGCTCCGAACAATAAAGATTCTACAATACT TTTTATGGTTATGAAGAGGAAAAATTGGCAGTAACCTGGCCCCACAAACCTTCAAATTAACGAATCAAATTAACAACCATAGGATG TTTTATGGTTATGAAGAGGAAAAATTGGCAGTAACCTGGCCCCACAAACCTTCAAATTAACGAATCAAATTAACAACCATAGGATG AATGCGATTAGTTTTTTAGCCTTATTTCTGGGGTAATTAATCAGCGAAGCGATGATTTTTGATCTATTAACAGATATATAAATGGAA -
ENCODE Consortium Meeting
ENCODE Consortium Meeting June 17-19, 2008 Hilton Washington DC/Rockville Executive Meeting Center Rockville, Maryland PARTICIPANTS Bradley Bernstein Piero Carninci, Ph.D. Molecular Pathology Unit Leader Massachusetts General Hospital Functional Genomics Technology Team and 149 13th Street Omics Resource Development Unit Charlestown, MA 02129 Deputy Project Director (617) 726-6906 LSA Technology Development Group (617) 726-5684 Fax Omics Science Center [email protected] RIKEN Yokohama Institute 1-7-22 Suehiro-cho, Tsurumi-ku Ewan Birney Yokohama 230-0045 Joint Team Leader Japan Panda Group Nucleotides +81-(0)901-709-2277 Panda Coordination and Outreach [email protected] Panda Metabolism European Molecular Biology Laboratory Philip Cayting European Bioinformatics Institute Gerstein Laboratory Hinxton Outstation Department of Molecular Biophysics and Wellcome Trust Genome Campus Biochemistry Hinxton, Cambridge CB10 1SD Yale University United Kingdom P.O. Box 208114 +44-(0)1223-494 444, ext. 4420 New Haven, CT 06520-8114 +44-(0)1223-494 494 Fax (203) 432-6337 [email protected] [email protected] Michael Brent, Ph.D. Howard Y. Chang, M.D., Ph.D. Professor Assistant Professor Center for Genome Sciences Stanford University Washington University Center for Clinical Sciences Research, Campus Box 8510 Room 2155C 4444 Forest Park 269 Campus Drive Saint Louis, MO 63108 Stanford, CA 94305 (314) 286-0210 (650) 736-0306 [email protected] [email protected] James Bentley Brown Mike Cherry, Ph.D. Graduate Student Researcher Associate Professor Graduate Program in Applied Science and Department of Genetics Technology Stanford University Bickel Group 300 Pasteur Drive University of California, Berkeley Stanford, CA 94305-5120 Room 2 (650) 723-7541 1246 Hearst Avenue [email protected] Berkeley, CA 94702 (510) 703-4706 [email protected] Francis S.