Co-Expression Network Analysis of Down's Syndrome Based on Microarray Data

Total Page:16

File Type:pdf, Size:1020Kb

Co-Expression Network Analysis of Down's Syndrome Based on Microarray Data EXPERIMENTAL AND THERAPEUTIC MEDICINE 12: 1503-1508, 2016 Co-expression network analysis of Down's syndrome based on microarray data JIANPING ZHAO1*, ZHENGGUO ZHANG2*, SHUMIN REN3, YANAN ZONG3 and XIANGDONG KONG3 1Clinical Laboratory, Women and Infants Hospital of Zhengzhou, Zhengzhou, Henan 450012; 2Clinical Laboratory, Henan Provincial Hospital of Traditional Chinese Medicine, Zhengzhou, Henan 450002; 3Center of Prenatal Diagnosis, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan 450052, P.R. China Received February 28, 2015; Accepted April 11, 2016 DOI: 10.3892/etm.2016.3462 Abstract. Down's syndrome (DS) is a type of chromosome significant differential modules were significantly enriched in disease. The present study aimed to explore the underlying 5 functions, including the endoplasmic reticulum (P=0.018) molecular mechanisms of DS. GSE5390 microarray data and regulation of apoptosis (P=0.061). The identified DEGs, downloaded from the gene expression omnibus database was in particular the 12 DEGs in the significant differential used to identify differentially expressed genes (DEGs) in DS. modules, such as B-cell lymphoma 2-associated transcrip- Pathway enrichment analysis of the DEGs was performed, tion factor 1, heat shock protein 90 kDa beta member 1, UBX followed by co‑expression network construction. Significant domain-containing protein 2 and transmembrane protein 50B, differential modules were mined by mutual information, may serve important roles in the pathogenesis of DS. followed by functional analysis. The accuracy of sample classification for the significant differential modules of DEGs Introduction was evaluated by leave-one-out cross-validation. A total of 997 DEGs, including 638 upregulated and 359 downregulated Down's syndrome (DS), also known as trisomy 21, is a genetic genes, were identified. Upregulated DEGs were enriched in disorder caused by trisomy of part of, or all of, chromosome 21, 15 pathways, such as cell adhesion molecules, whereas down- and is associated with significant intellectual disability, phys- regulated DEGs were enriched in maturity onset diabetes of ical growth delays and slanted eyes (1,2). DS may cause other the young. Three significant differential modules with the complications, such as autism, coeliac disease, hypothyroidism highest discriminative scores (mutual information>0.35) and leukemia (3-5). Over the last several decades, numerous were selected from a co-expression network. The classifica- genes on trisomy 21 have been suggested to be associated with tion accuracy of GSE16677 expression profile samples was DS (6). 54.55% and 72.73% when characterized by 12 DEGs and Modern genetic pathology states the hypothesis that 3 significant differential modules, respectively. Genes in phenotypes of DS are determined by an extra copy of one or several genes, such as oligodendrocyte transcription factor 1 present on human chromosome 21 (Hsa21) (7). A previous study demonstrated that sorting nexin family member 27, which is suppressed by a bioactive substance Correspondence to: Mr. Xiangdong Kong, Center of Prenatal Diagnosis, The First Affiliated Hospital of Zhengzhou University, encoded on Hsa21, is upregulated in the brains of DS 1 Jiangshe Donglu, Zhengzhou, Henan 450052, P.R. China mice (8). The DS critical region (DSCR) is composed of E-mail: [email protected] 6 genes, including v-ets erythroblastosis virus E26 onco- gene homolog 2 (ETS2) and ETS-related gene (ERG), which *Contributed equally encode transcription factors (9). ERG and ETS2 belong to the E26 transformation‑specific family, which contains key regula- Abbreviations: APP, amyloid precursor protein; DAVID, database tors of embryonic development (10). Numerous sensitive genes for annotation, visualization and integrated discovery; such as ubiquitin specific peptidase 16, which are associated DGEs, differentially expressed genes; DS, Down's syndrome; with the reduced ubiquitination of cyclin-dependent kinase DSCR, Down's syndrome critical region; EST, erythroblast inhibitor 2A and expedited senescence in Ts65Dn fibroblasts, transformation-specific; GEO, gene expression omnibus; have been confirmed to be associated with DS (11,12). OLIG1, oligodendrocyte transcription factor 1; SAM, significant analysis of microarray; SVMs, support vector machines The identification of sensitive genes on Hsa21 has become the focus of investigations with the aim of elaborating suit- Key words: Down's syndrome, differentially expressed genes, able therapeutic strategies for DS, and global gene expression significant differential module, classification accuracy, co-expressed profiling techniques have been widely used to analyze the network molecular mechanisms underlying trisomy 21, as the expression levels of trisomic genes and those in the wider genome can be assessed (13). The majority of DS studies have been performed 1504 ZHAO et al: CO-EXPRESSION NETWORK ANALYSIS OF DOWN'S SYNDROME in animal models, such as the Ts65Dn mouse model of DS, analytical tools for extracting biological meaning from large which can be used to analyze the association between hippo- lists of genes (25,26), and was used for pathway enrichment campal pathways and cognitive dysfunction (14-16). Various analysis of upregulated and downregulated DEGs in the samples, such as fibroblasts, whole blood, T cells and placenta, current study. have been used to investigate the pathogenesis of DS (17-20). The samples used in the present study are derived from the Co‑expressed network construction. Gene co-expression brain tissue samples of patients with DS. Lockstone et al (21) network analysis is a fusion product of microarray profile screened differentially expressed genes (DEGs) in DS and techniques and network theory, that offers an intuitive concept performed functional analysis, and investigated significant for illustrating the association between two genes (27). The differential modules from a co-expressed network (21). gene expression data set GSE5390 was applied to build a In the present study, GSE5390 and GSE16677 were down- co-expressed network. Cytoscape is an open software platform loaded from the Gene Expression Omnibus (GEO) database, for visualizing complex networks and integrating these with DEGs were screened for pathway enrichment analysis, then any type of attributed data (28). The Pearson correlation coef- a co-expressed network was constructed and screened for ficient is a popular coefficient for evaluating the dependence significant differential modules. In addition, the classification of two variables (29) and, in the present study, Pearson's coef- accuracy of GSE16677 expression profile samples were evalu- ficient analysis was performed using the Perl script program ated, characterized by the value of average expressed DEGs (version 5.22.1; https://www.perl.org/get.html). R>0.9 was and all DEGs in significant differential modules. Finally, func- chosen as the criterion for selecting co-expressed genes. The tional enrichment analysis of DEGs in significant differential co-expression network was visualized using Cytoscape soft- modules was performed. ware (version 2.8; http://www.cytoscape.org/). Materials and methods Analysis of differentially co‑expressed gene modules. GraphWeb (http://biit.cs.ut.ee/graphweb/) is a public web Microarray data. The transcription profiles of GSE5390 and server used to mine heterogeneous biological networks for GSE16677 were acquired from the GEO database (http://www. significant function gene modules (30). The hidden Markov ncbi.nlm.nih.gov/geo/) (22,23). The platform names were model (HMM) of GraphWeb can be used for modeling by GPL96 Affymetrix Human Genome U133A Array and defining and observing the joint probability of sequences and GPL570 Affymetrix Human Genome U133 plus 2.0 Array, label sequences (31). In the present study, HMM was used for respectively. A total of 15 samples from GSE5390 were studied, mining network modules. Mutual information (MI) between including 8 normal samples and 7 DS samples of human adult the average expression value of modules and sample labels was brain tissue (dorsolateral prefrontal cortex) from healthy calculated according to the following formula (32): controls and individuals with DS (23). A total of 11 samples from GSE16677 were studied, including 5 normal samples and 6 DS samples of megakaryocytic leukemia blasts from patients with DS-acute megakaryoblastic leukemia (AMKL) or French-American-British M7 leukemia, and patients with In this formula, a' is the discretization average expres- non-DS-AMKL. The raw data were downloaded for further sion value; c is the corresponding vector of the sample analysis. The present study was approved by the Ethics label. To derive a' from a, activity levels are discretized into Committee of the First Affiliated Hospital of Zhengzhou [log2 (number of samples) + 1] values (33). Differentially University (Zhengzhou, China) and performed in accordance co-expressed gene modules were screened with the threshold with ethical standards (22,23). of discriminative score (defined as MI)> 0.35. Then, functional enrichment analysis of differentially expressed genes in the DEG analysis. Gene expression data from GSE5390 differentially co-expressed gene modules was performed by were processed through background correction, quantile DAV I D. P<0.1 was used as the cut-off criterion. normalization and probe summarization using the robust multi-array average algorithm in the oligo software package Assessment of classification
Recommended publications
  • Analysis of Gene Expression Data for Gene Ontology
    ANALYSIS OF GENE EXPRESSION DATA FOR GENE ONTOLOGY BASED PROTEIN FUNCTION PREDICTION A Thesis Presented to The Graduate Faculty of The University of Akron In Partial Fulfillment of the Requirements for the Degree Master of Science Robert Daniel Macholan May 2011 ANALYSIS OF GENE EXPRESSION DATA FOR GENE ONTOLOGY BASED PROTEIN FUNCTION PREDICTION Robert Daniel Macholan Thesis Approved: Accepted: _______________________________ _______________________________ Advisor Department Chair Dr. Zhong-Hui Duan Dr. Chien-Chung Chan _______________________________ _______________________________ Committee Member Dean of the College Dr. Chien-Chung Chan Dr. Chand K. Midha _______________________________ _______________________________ Committee Member Dean of the Graduate School Dr. Yingcai Xiao Dr. George R. Newkome _______________________________ Date ii ABSTRACT A tremendous increase in genomic data has encouraged biologists to turn to bioinformatics in order to assist in its interpretation and processing. One of the present challenges that need to be overcome in order to understand this data more completely is the development of a reliable method to accurately predict the function of a protein from its genomic information. This study focuses on developing an effective algorithm for protein function prediction. The algorithm is based on proteins that have similar expression patterns. The similarity of the expression data is determined using a novel measure, the slope matrix. The slope matrix introduces a normalized method for the comparison of expression levels throughout a proteome. The algorithm is tested using real microarray gene expression data. Their functions are characterized using gene ontology annotations. The results of the case study indicate the protein function prediction algorithm developed is comparable to the prediction algorithms that are based on the annotations of homologous proteins.
    [Show full text]
  • Identification of C3 As a Therapeutic Target for Diabetic Nephropathy By
    www.nature.com/scientificreports OPEN Identifcation of C3 as a therapeutic target for diabetic nephropathy by bioinformatics analysis ShuMei Tang, XiuFen Wang, TianCi Deng, HuiPeng Ge & XiangCheng Xiao* The pathogenesis of diabetic nephropathy is not completely understood, and the efects of existing treatments are not satisfactory. Various public platforms already contain extensive data for deeper bioinformatics analysis. From the GSE30529 dataset based on diabetic nephropathy tubular samples, we identifed 345 genes through diferential expression analysis and weighted gene coexpression correlation network analysis. GO annotations mainly included neutrophil activation, regulation of immune efector process, positive regulation of cytokine production and neutrophil-mediated immunity. KEGG pathways mostly included phagosome, complement and coagulation cascades, cell adhesion molecules and the AGE-RAGE signalling pathway in diabetic complications. Additional datasets were analysed to understand the mechanisms of diferential gene expression from an epigenetic perspective. Diferentially expressed miRNAs were obtained to construct a miRNA-mRNA network from the miRNA profles in the GSE57674 dataset. The miR-1237-3p/SH2B3, miR-1238-5p/ ZNF652 and miR-766-3p/TGFBI axes may be involved in diabetic nephropathy. The methylation levels of the 345 genes were also tested based on the gene methylation profles of the GSE121820 dataset. The top 20 hub genes in the PPI network were discerned using the CytoHubba tool. Correlation analysis with GFR showed that SYK, CXCL1, LYN, VWF, ANXA1, C3, HLA-E, RHOA, SERPING1, EGF and KNG1 may be involved in diabetic nephropathy. Eight small molecule compounds were identifed as potential therapeutic drugs using Connectivity Map. It is estimated that a total of 451 million people sufered from diabetes by 2017, and the number is speculated to be 693 million by 2045 1.
    [Show full text]
  • Physical and Linkage Mapping of Mammary-Derived Expressed Sequence Tags in Cattle
    Genomics 83 (2004) 148–152 www.elsevier.com/locate/ygeno Physical and linkage mapping of mammary-derived expressed sequence tags in cattle E.E. Connor,a,* T.S. Sonstegard,a J.W. Keele,b G.L. Bennett,b J.L. Williams,c R. Papworth,c C.P. Van Tassell,a and M.S. Ashwella a U.S. Beltsville Agricultural Research Center, ARS, U.S. Department of Agriculture, 10300 Baltimore Avenue, Beltsville, MD 20705, USA b U.S. Meat Animal Research Center, ARS, U.S. Department of Agriculture, P.O. Box 166, Clay Center, NE 68933-0166, USA c Roslin Institute (Edinburgh), Roslin, Midlothian EH25 9PS, Scotland, United Kingdom Received 2 June 2003; accepted 5 July 2003 Abstract This study describes the physical and linkage mapping of 42 gene-associated markers developed from mammary gland-derived expressed sequence tags to the cattle genome. Of the markers, 25 were placed on the USDA reference linkage map and 37 were positioned on the Roslin 3000-rad radiation hybrid (RH) map, with 20 assignments shared between the maps. Although no novel regions of conserved synteny between the cattle and the human genomes were identified, the coverage was extended for bovine chromosomes 3, 7, 15, and 29 compared with previously published comparative maps between human and bovine genomes. Overall, these data improve the resolution of the human–bovine comparative maps and will assist future efforts to integrate bovine RH and linkage map data. Crown Copyright D 2003 Published by Elsevier Inc. All rights reserved. Keywords: RH mapping; Linkage mapping; SNP; Cattle; EST Selection of positional candidate genes controlling eco- pig [4,5], and cattle [6], and serve as a resource for nomically important traits in cattle requires a detailed candidate gene identification.
    [Show full text]
  • (P -Value<0.05, Fold Change≥1.4), 4 Vs. 0 Gy Irradiation
    Table S1: Significant differentially expressed genes (P -Value<0.05, Fold Change≥1.4), 4 vs. 0 Gy irradiation Genbank Fold Change P -Value Gene Symbol Description Accession Q9F8M7_CARHY (Q9F8M7) DTDP-glucose 4,6-dehydratase (Fragment), partial (9%) 6.70 0.017399678 THC2699065 [THC2719287] 5.53 0.003379195 BC013657 BC013657 Homo sapiens cDNA clone IMAGE:4152983, partial cds. [BC013657] 5.10 0.024641735 THC2750781 Ciliary dynein heavy chain 5 (Axonemal beta dynein heavy chain 5) (HL1). 4.07 0.04353262 DNAH5 [Source:Uniprot/SWISSPROT;Acc:Q8TE73] [ENST00000382416] 3.81 0.002855909 NM_145263 SPATA18 Homo sapiens spermatogenesis associated 18 homolog (rat) (SPATA18), mRNA [NM_145263] AA418814 zw01a02.s1 Soares_NhHMPu_S1 Homo sapiens cDNA clone IMAGE:767978 3', 3.69 0.03203913 AA418814 AA418814 mRNA sequence [AA418814] AL356953 leucine-rich repeat-containing G protein-coupled receptor 6 {Homo sapiens} (exp=0; 3.63 0.0277936 THC2705989 wgp=1; cg=0), partial (4%) [THC2752981] AA484677 ne64a07.s1 NCI_CGAP_Alv1 Homo sapiens cDNA clone IMAGE:909012, mRNA 3.63 0.027098073 AA484677 AA484677 sequence [AA484677] oe06h09.s1 NCI_CGAP_Ov2 Homo sapiens cDNA clone IMAGE:1385153, mRNA sequence 3.48 0.04468495 AA837799 AA837799 [AA837799] Homo sapiens hypothetical protein LOC340109, mRNA (cDNA clone IMAGE:5578073), partial 3.27 0.031178378 BC039509 LOC643401 cds. [BC039509] Homo sapiens Fas (TNF receptor superfamily, member 6) (FAS), transcript variant 1, mRNA 3.24 0.022156298 NM_000043 FAS [NM_000043] 3.20 0.021043295 A_32_P125056 BF803942 CM2-CI0135-021100-477-g08 CI0135 Homo sapiens cDNA, mRNA sequence 3.04 0.043389246 BF803942 BF803942 [BF803942] 3.03 0.002430239 NM_015920 RPS27L Homo sapiens ribosomal protein S27-like (RPS27L), mRNA [NM_015920] Homo sapiens tumor necrosis factor receptor superfamily, member 10c, decoy without an 2.98 0.021202829 NM_003841 TNFRSF10C intracellular domain (TNFRSF10C), mRNA [NM_003841] 2.97 0.03243901 AB002384 C6orf32 Homo sapiens mRNA for KIAA0386 gene, partial cds.
    [Show full text]
  • Inaugural-Dissertation Zur Erlangung Der Doktorwürde Der Naturwissenschaftlich-Mathematischen Gesamtfakultät Der Ruprecht-Kar
    Inaugural-Dissertation zur Erlangung der Doktorwürde der Naturwissenschaftlich-Mathematischen Gesamtfakultät der Ruprecht-Karls-Universität Heidelberg vorgelegt von Diplom-Biochemiker Stefan Knoth Tag der mündlichen Prüfung: 25.07.2006 Titel der Arbeit: Regulation der Genexpression bei der Erythroiden Differenzierung Gutachter: Prof. Dr. Stefan Wölfl, Universität Heidelberg Prof. Dr. Katharina Pachmann, Universität Jena Prof. Dr. Jürgen Reichling Prof. Dr. Ulrich Hilgenfeldt Zusammenfassung Viele Erkrankungen des blutbildenden Systems, insbesondere Leukämien, haben ihre Ursache in einer Störung der Differenzierungvorgänge, die ausgehend von undifferenzierten Stammzellen des Knochenmarks zu allen Arten von hochspezialisierten Blutzellen führen. Bei der Chronischen Myeloischen Leukämie (CML) im Stadium der Blastenkrise kommt es zu einer unkontrollierten Vermehrung unreifer Blasten. Alle Chronischen Myeloischen Leukämien exprimieren dass Fusionsprotein BCR-ABL welches aus der Translokation t(9;22)(q34;q11) resultiert. BCR-ABL ist eine konstitutiv aktive Proteinkinase und stimuliert die Zellen zu unkontrolliertem Wachstum. Es ist bekannt, dass sich bei transformierten Zellen mit verschiedenen Substanzen wie AraC oder Butyrat die Proliferation unterdrücken und eine weitere Differenzierung induzieren lässt. Daher werden diese Substanzen oder ihre Derivate als Therapeutika eingesetzt bzw. als solche erprobt. Bei CML-Zellen geht man davon aus, dass durch die genannten Chemikalien die Differenzierung in Richtung der erythroiden Reihe stimuliert wird, da verstärkt Hämoglobine synthetisiert werden. Wegen ihrer einfachen Handhabung in Kultur wurden diese Zellsysteme in der Vergangenheit häufig als Modelle zur Untersuchung der erythroiden Differenzierung verwendet. Die therapeutikastimulierte erythroide Differenzierung von CML-Zellen ist also in zweierlei Hinsicht interessant, einerseits aus medizinischer Sicht zur Bewertung der Vorgänge im behandelten Patienten und andererseits hinsichtlich der Eignung dieser Systeme als Modelle zur Erforschung der Erythropoese.
    [Show full text]
  • Ubiquitin-Dependent Regulation of the WNT Cargo Protein EVI/WLS Handelt Es Sich Um Meine Eigenständig Erbrachte Leistung
    DISSERTATION submitted to the Combined Faculty of Natural Sciences and Mathematics of the Ruperto-Carola University of Heidelberg, Germany for the degree of Doctor of Natural Sciences presented by Lucie Magdalena Wolf, M.Sc. born in Nuremberg, Germany Date of oral examination: 2nd February 2021 Ubiquitin-dependent regulation of the WNT cargo protein EVI/WLS Referees: Prof. Dr. Michael Boutros apl. Prof. Dr. Viktor Umansky If you don’t think you might, you won’t. Terry Pratchett This work was accomplished from August 2015 to November 2020 under the supervision of Prof. Dr. Michael Boutros in the Division of Signalling and Functional Genomics at the German Cancer Research Center (DKFZ), Heidelberg, Germany. Contents Contents ......................................................................................................................... ix 1 Abstract ....................................................................................................................xiii 1 Zusammenfassung .................................................................................................... xv 2 Introduction ................................................................................................................ 1 2.1 The WNT signalling pathways and cancer ........................................................................ 1 2.1.1 Intercellular communication ........................................................................................ 1 2.1.2 WNT ligands are conserved morphogens .................................................................
    [Show full text]
  • "The Genecards Suite: from Gene Data Mining to Disease Genome Sequence Analyses". In: Current Protocols in Bioinformat
    The GeneCards Suite: From Gene Data UNIT 1.30 Mining to Disease Genome Sequence Analyses Gil Stelzer,1,5 Naomi Rosen,1,5 Inbar Plaschkes,1,2 Shahar Zimmerman,1 Michal Twik,1 Simon Fishilevich,1 Tsippi Iny Stein,1 Ron Nudel,1 Iris Lieder,2 Yaron Mazor,2 Sergey Kaplan,2 Dvir Dahary,2,4 David Warshawsky,3 Yaron Guan-Golan,3 Asher Kohn,3 Noa Rappaport,1 Marilyn Safran,1 and Doron Lancet1,6 1Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel 2LifeMap Sciences Ltd., Tel Aviv, Israel 3LifeMap Sciences Inc., Marshfield, Massachusetts 4Toldot Genetics Ltd., Hod Hasharon, Israel 5These authors contributed equally to the paper 6Corresponding author GeneCards, the human gene compendium, enables researchers to effectively navigate and inter-relate the wide universe of human genes, diseases, variants, proteins, cells, and biological pathways. Our recently launched Version 4 has a revamped infrastructure facilitating faster data updates, better-targeted data queries, and friendlier user experience. It also provides a stronger foundation for the GeneCards suite of companion databases and analysis tools. Improved data unification includes gene-disease links via MalaCards and merged biological pathways via PathCards, as well as drug information and proteome expression. VarElect, another suite member, is a phenotype prioritizer for next-generation sequencing, leveraging the GeneCards and MalaCards knowledgebase. It au- tomatically infers direct and indirect scored associations between hundreds or even thousands of variant-containing genes and disease phenotype terms. Var- Elect’s capabilities, either independently or within TGex, our comprehensive variant analysis pipeline, help prepare for the challenge of clinical projects that involve thousands of exome/genome NGS analyses.
    [Show full text]
  • Supplementary Materials and Tables a and B
    SUPPLEMENTARY MATERIAL 1 Table A. Main characteristics of the subset of 23 AML patients studied by high-density arrays (subset A) WBC BM blasts MYST3- MLL Age/Gender WHO / FAB subtype Karyotype FLT3-ITD NPM status (x109/L) (%) CREBBP status 1 51 / F M4 NA 21 78 + - G A 2 28 / M M4 t(8;16)(p11;p13) 8 92 + - G G 3 53 / F M4 t(8;16)(p11;p13) 27 96 + NA G NA 4 24 / M PML-RARα / M3 t(15;17) 5 90 - - G G 5 52 / M PML-RARα / M3 t(15;17) 1.5 75 - - G G 6 31 / F PML-RARα / M3 t(15;17) 3.2 89 - - G G 7 23 / M RUNX1-RUNX1T1 / M2 t(8;21) 38 34 - + ND G 8 52 / M RUNX1-RUNX1T1 / M2 t(8;21) 8 68 - - ND G 9 40 / M RUNX1-RUNX1T1 / M2 t(8;21) 5.1 54 - - ND G 10 63 / M CBFβ-MYH11 / M4 inv(16) 297 80 - - ND G 11 63 / M CBFβ-MYH11 / M4 inv(16) 7 74 - - ND G 12 59 / M CBFβ-MYH11 / M0 t(16;16) 108 94 - - ND G 13 41 / F MLLT3-MLL / M5 t(9;11) 51 90 - + G R 14 38 / F M5 46, XX 36 79 - + G G 15 76 / M M4 46 XY, der(10) 21 90 - - G NA 16 59 / M M4 NA 29 59 - - M G 17 26 / M M5 46, XY 295 92 - + G G 18 62 / F M5 NA 67 88 - + M A 19 47 / F M5 del(11q23) 17 78 - + M G 20 50 / F M5 46, XX 61 59 - + M G 21 28 / F M5 46, XX 132 90 - + G G 22 30 / F AML-MD / M5 46, XX 6 79 - + M G 23 64 / M AML-MD / M1 46, XY 17 83 - + M G WBC: white blood cell.
    [Show full text]
  • Mclean, Chelsea.Pdf
    COMPUTATIONAL PREDICTION AND EXPERIMENTAL VALIDATION OF NOVEL MOUSE IMPRINTED GENES A Dissertation Presented to the Faculty of the Graduate School of Cornell University In Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy by Chelsea Marie McLean August 2009 © 2009 Chelsea Marie McLean COMPUTATIONAL PREDICTION AND EXPERIMENTAL VALIDATION OF NOVEL MOUSE IMPRINTED GENES Chelsea Marie McLean, Ph.D. Cornell University 2009 Epigenetic modifications, including DNA methylation and covalent modifications to histone tails, are major contributors to the regulation of gene expression. These changes are reversible, yet can be stably inherited, and may last for multiple generations without change to the underlying DNA sequence. Genomic imprinting results in expression from one of the two parental alleles and is one example of epigenetic control of gene expression. So far, 60 to 100 imprinted genes have been identified in the human and mouse genomes, respectively. Identification of additional imprinted genes has become increasingly important with the realization that imprinting defects are associated with complex disorders ranging from obesity to diabetes and behavioral disorders. Despite the importance imprinted genes play in human health, few studies have undertaken genome-wide searches for new imprinted genes. These have used empirical approaches, with some success. However, computational prediction of novel imprinted genes has recently come to the forefront. I have developed generalized linear models using data on a variety of sequence and epigenetic features within a training set of known imprinted genes. The resulting models were used to predict novel imprinted genes in the mouse genome. After imposing a stringency threshold, I compiled an initial candidate list of 155 genes.
    [Show full text]
  • ADHD) Gene Networks in Children of Both African American and European American Ancestry
    G C A T T A C G G C A T genes Article Rare Recurrent Variants in Noncoding Regions Impact Attention-Deficit Hyperactivity Disorder (ADHD) Gene Networks in Children of both African American and European American Ancestry Yichuan Liu 1 , Xiao Chang 1, Hui-Qi Qu 1 , Lifeng Tian 1 , Joseph Glessner 1, Jingchun Qu 1, Dong Li 1, Haijun Qiu 1, Patrick Sleiman 1,2 and Hakon Hakonarson 1,2,3,* 1 Center for Applied Genomics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA; [email protected] (Y.L.); [email protected] (X.C.); [email protected] (H.-Q.Q.); [email protected] (L.T.); [email protected] (J.G.); [email protected] (J.Q.); [email protected] (D.L.); [email protected] (H.Q.); [email protected] (P.S.) 2 Division of Human Genetics, Department of Pediatrics, The Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA 3 Department of Human Genetics, Children’s Hospital of Philadelphia, Philadelphia, PA 19104, USA * Correspondence: [email protected]; Tel.: +1-267-426-0088 Abstract: Attention-deficit hyperactivity disorder (ADHD) is a neurodevelopmental disorder with poorly understood molecular mechanisms that results in significant impairment in children. In this study, we sought to assess the role of rare recurrent variants in non-European populations and outside of coding regions. We generated whole genome sequence (WGS) data on 875 individuals, Citation: Liu, Y.; Chang, X.; Qu, including 205 ADHD cases and 670 non-ADHD controls. The cases included 116 African Americans H.-Q.; Tian, L.; Glessner, J.; Qu, J.; Li, (AA) and 89 European Americans (EA), and the controls included 408 AA and 262 EA.
    [Show full text]
  • Ejhg2009157.Pdf
    European Journal of Human Genetics (2010) 18, 342–347 & 2010 Macmillan Publishers Limited All rights reserved 1018-4813/10 $32.00 www.nature.com/ejhg ARTICLE Fine mapping and association studies of a high-density lipoprotein cholesterol linkage region on chromosome 16 in French-Canadian subjects Zari Dastani1,2,Pa¨ivi Pajukanta3, Michel Marcil1, Nicholas Rudzicz4, Isabelle Ruel1, Swneke D Bailey2, Jenny C Lee3, Mathieu Lemire5,9, Janet Faith5, Jill Platko6,10, John Rioux6,11, Thomas J Hudson2,5,7,9, Daniel Gaudet8, James C Engert*,2,7, Jacques Genest1,2,7 Low levels of high-density lipoprotein cholesterol (HDL-C) are an independent risk factor for cardiovascular disease. To identify novel genetic variants that contribute to HDL-C, we performed genome-wide scans and quantitative association studies in two study samples: a Quebec-wide study consisting of 11 multigenerational families and a study of 61 families from the Saguenay– Lac St-Jean (SLSJ) region of Quebec. The heritability of HDL-C in these study samples was 0.73 and 0.49, respectively. Variance components linkage methods identified a LOD score of 2.61 at 98 cM near the marker D16S515 in Quebec-wide families and an LOD score of 2.96 at 86 cM near the marker D16S2624 in SLSJ families. In the Quebec-wide sample, four families showed segregation over a 25.5-cM (18 Mb) region, which was further reduced to 6.6 Mb with additional markers. The coding regions of all genes within this region were sequenced. A missense variant in CHST6 segregated in four families and, with additional families, we observed a P value of 0.015 for this variant.
    [Show full text]
  • Proteomic Analysis of Monolayer-Integrated Proteins on Lipid Droplets Identifies Amphipathic Interfacial Α-Helical Membrane Anchors
    Proteomic analysis of monolayer-integrated proteins on lipid droplets identifies amphipathic interfacial α-helical membrane anchors Camille I. Patakia, João Rodriguesb, Lichao Zhangc, Junyang Qiand, Bradley Efrond, Trevor Hastied, Joshua E. Eliasc, Michael Levittb, and Ron R. Kopitoe,1 aDepartment of Biochemistry, Stanford University, Stanford, CA 94305; bDepartment of Structural Biology, Stanford University, Stanford, CA 94305; cDepartment of Chemical and Systems Biology, Stanford University, Stanford, CA 94305; dDepartment of Statistics, Stanford University, Stanford, CA 94305; and eDepartment of Biology, Stanford University, Stanford, CA 94305 Edited by Jennifer Lippincott-Schwartz, Howard Hughes Medical Institute, Ashburn, VA, and approved July 23, 2018 (received for review May 9, 2018) Despite not spanning phospholipid bilayers, monotopic integral The first step in understanding the structural diversity of proteins (MIPs) play critical roles in organizing biochemical reac- monotopic HMDs is to identify a sufficiently large set of proteins tions on membrane surfaces. Defining the structural basis by with unequivocal monotopic topology. In this study, we exploited which these proteins are anchored to membranes has been the fact that lipid droplets (LDs) are enveloped by phospholipid hampered by the paucity of unambiguously identified MIPs and a monolayers that separate a highly hydrophobic lipid core from lack of computational tools that accurately distinguish monolayer- the aqueous cytoplasm (8). Because TMDs are flanked on both integrating motifs from bilayer-spanning transmembrane domains ends by soluble, hydrophilic domains, bilayer-spanning proteins (TMDs). We used quantitative proteomics and statistical modeling are strongly disfavored from integrating into the monolayer to identify 87 high-confidence candidate MIPs in lipid droplets, membranes of LDs.
    [Show full text]