Of Triple Negative Breast Cancer (TNBC)

Total Page:16

File Type:pdf, Size:1020Kb

Of Triple Negative Breast Cancer (TNBC) Supplementary Methods Gene set enrichment analysis (GSEA) of triple negative breast cancer (TNBC) We obtained the primary gene expression data of breast cancer patients and breast cancer cell lines from Gene Expression Omnibus (GEO; http://www.ncbi.nlm.nih.gov/geo/) and analyzed the gene signature with Gene Set Enrichment Analysis (GSEA), as described in our previous report [1]. Briefly, the clinical data of breast cancer patients were obtained from the GEO database under series accession no. GSE27447, and human breast cancer cell line data were obtained from GSE32474. We utilized GEO2R (https://www.ncbi.nlm.nih.gov/geo/geo2r/) to compare TNBC and non‐TNBC groups to identify differentially expressed genes. From GSE27447, we compared the gene expression profiles of five TNBC patient tumors with those of fourteen non‐TNBC patient tumors, and we identified 1972 genes that were differentially expressed between TNBC and non‐TNBC patients (p‐value<5E‐02, Supplementary Table 1). From the GSE32474 dataset, we compared the genomic signature of TNBC cell lines (BT549, HS578T, and MDA‐MB‐231) with that of non‐TNBC cell lines (MCF7 and T47D), and we identified a total of 653 annotated genes that were differentially expressed between human TNBC and non‐TNBC cell lines (p‐value<1E‐08, Supplementary Table 2). Then, GSEA of the ranked gene list was performed using the Java implementation of GSEA obtained from http://www.broadinstitute.org/gsea/ (5000 permutations, minimum term size of 15, and maximum term size of 500). The entire gene lists for TNBC patients as well as for TNBC cell lines were pre‐ranked based on the mean fold change. The analysis included gene sets from MSigDB pathways and C2: curated gene sets (c2.all.v6.2.symbols.gmt), and an FDR q‐value < 0.05 was set as the significance threshold. Supplementary Tables Supplementary Table 1. Differentially expressed gene list of TNBC patient tumors (n=5) versus non‐ TNBC patient tumors (n=14). Supplementary Table 2. Differentially expressed gene list of TNBC cell lines (BT549, HS578T, and MDA‐MB‐231) versus non‐TNBC cell lines (MCF7 and T47D). Supplementary Tables are attached as excel files. References 1. Kim J.‐H.; Park S.‐Y; Jun Y.; Kim J.‐Y.; Nam J.‐S. Roles of Wnt Target Genes in the Journey of Cancer Stem Cells. International Journal of Molecular Sciences 2017, 18, pii: E1604 Supplementary Table 1. Differentially expressed gene list of TNBC patient tumors (n=5) versus non-TNBC patient tumors (n=14). # ID Gene.symbTNBC group non-TNBC group logFC P.Value Gene.title 1 8103706 AADAT 2.272663 -3.5244 1.667202 0.0348422 aminoadipate aminotransferase 2 7989953 AAGAB -2.9889457 -2.4408 -0.59936 0.0075418 alpha- and gamma-adaptin binding protein 3 8052798 AAK1 2.220929 -3.5981 0.519726 0.0387047 AP2 associated kinase 1 4 8100478 AASDH -2.1427178 -3.7078 -0.60192 0.0452934 aminoadipate-semialdehyde dehydrogenase 5 8070708 AATBC 2.2450704 -3.5638 0.524513 0.0368558 apoptosis associated transcript in bladder cancer 6 8131682 ABCB5 -2.1401391 -3.7114 -1.18796 0.0455271 ATP binding cassette subfamily B member 5 7 7975607 ACOT4 -2.8771048 -2.6158 -1.45266 0.0096492 acyl-CoA thioesterase 4 8 7952503 ACRV1 2.1522735 -3.6945 0.541704 0.0444369 acrosomal vesicle protein 1 9 8137474 ACTR3B 2.497213 -3.196 0.932199 0.0218645 ARP3 actin related protein 3 homolog B 10 8092002 ACTRT3 2.1043293 -3.761 0.477936 0.0488877 actin related protein T3 11 7955535 ACVR1B -3.4919358 -1.6429 -0.7804 0.0024384 activin A receptor type 1B 12 8066431 ADA 2.5949572 -3.0493 0.613998 0.0177718 adenosine deaminase 13 7983191 ADAL -3.6757514 -1.3502 -1.13951 0.0016052 adenosine deaminase like 14 8115490 ADAM19 2.2082558 -3.616 0.784916 0.0397089 ADAM metallopeptidase domain 19 15 7979927 ADAM20 -2.5968716 -3.0464 -0.97852 0.0176993 ADAM metallopeptidase domain 20 16 7979925 ADAM20P1 -3.1352823 -2.21 -1.10773 0.0054469 ADAM metallopeptidase domain 20 pseudogene 1 17 7990736 ADAMTS7 2.7036207 -2.8839 0.471716 0.0140756 ADAM metallopeptidase with thrombospondin type 1 motif 7 18 7952752 ADAMTS8 2.267395 -3.5319 0.373428 0.0352187 ADAM metallopeptidase with thrombospondin type 1 motif 8 19 8088560 ADAMTS9 2.223479 -3.5944 0.814403 0.0385054 ADAM metallopeptidase with thrombospondin type 1 motif 9 20 8132667 ADCY1 -2.3826722 -3.3652 -2.13113 0.027783 adenylate cyclase 1 21 7999079 ADCY9 -2.178401 -3.658 -0.68505 0.0421698 adenylate cyclase 9 22 8052882 ADD2 2.3413871 -3.4253 0.920914 0.0302602 adducin 2 23 8157777 ADGRD2 3.0434101 -2.3551 0.753347 0.006684 adhesion G protein-coupled receptor D2 24 7996064 ADGRG5 2.6787531 -2.9219 0.748457 0.0148507 adhesion G protein-coupled receptor G5 25 7902565 ADGRL2 3.1775711 -2.143 1.370978 0.0049553 adhesion G protein-coupled receptor L2 26 8106827 ADGRV1 -2.9017974 -2.5773 -2.19419 0.0091399 adhesion G protein-coupled receptor V1 27 7928882 ADIRF -2.6158137 -3.0177 -1.52148 0.0169976 adipogenesis regulatory factor 28 7928471 ADK -2.4563235 -3.2568 -0.54747 0.0238268 adenosine kinase 29 8005134 ADORA2B 2.5798829 -3.072 0.501918 0.0183519 adenosine A2b receptor 30 7925550 ADSS -2.2428494 -3.5669 -0.80113 0.0370224 adenylosuccinate synthase 31 7977273 ADSSL1 -3.924199 -0.9566 -1.24256 0.0009105 adenylosuccinate synthase like 1 32 8106429 AGGF1 -2.3379787 -3.4303 -0.45223 0.0304736 angiogenic factor with G-patch and FHA domains 1 33 8130628 AGPAT4 2.2572734 -3.5464 1.030554 0.0359524 1-acylglycerol-3-phosphate O-acyltransferase 4 34 8138381 AGR2 -3.034222 -2.3696 -4.76127 0.0068217 anterior gradient 2, protein disulphide isomerase family member 35 8138392 AGR3 -2.8541838 -2.6514 -3.82108 0.0101462 anterior gradient 3, protein disulphide isomerase family member 36 7896822 AGRN -3.6272845 -1.4273 -1.2464 0.0017926 agrin 37 8169492 AGTR2 -2.350242 -3.4125 -0.79261 0.0297122 angiotensin II receptor type 2 38 8049737 AGXT 2.147625 -3.701 0.561296 0.0448517 alanine-glyoxylate aminotransferase 39 8122396 AIG1 -2.8242038 -2.698 -0.85036 0.0108335 androgen induced 1 40 8011912 AIPL1 2.428713 -3.2976 0.446297 0.0252438 aryl hydrocarbon receptor interacting protein like 1 41 8112458 AK6///TAF9 -2.1319152 -3.7228 -0.37853 0.0462798 adenylate kinase 6///TATA-box binding protein associated factor 9 42 8177635 AK6///TAF9 -2.1319152 -3.7228 -0.37853 0.0462798 adenylate kinase 6///TATA-box binding protein associated factor 9 43 8164766 AK8 -2.2637759 -3.5371 -0.58121 0.0354794 adenylate kinase 8 44 8128777 AK9 -4.3419818 -0.3051 -6.19443 0.0003509 adenylate kinase 9 45 8128788 AK9 -2.2081295 -3.6162 -0.79441 0.0397191 adenylate kinase 9 46 7975066 AKAP5 -2.1550402 -3.6907 -1.10602 0.0441917 A-kinase anchoring protein 5 47 8163569 AKNA 2.3821898 -3.3659 0.831578 0.0278109 AT-hook transcription factor 48 7904084 AKR7A2P1 -2.0949187 -3.7739 -0.5781 0.0498074 aldo-keto reductase family 7 member A2 pseudogene 1 49 7913146 AKR7A3 -2.1430004 -3.7074 -1.01223 0.0452679 aldo-keto reductase family 7 member A3 50 7994737 ALDOA -2.6232516 -3.0064 -0.86986 0.0167294 aldolase, fructose-bisphosphate A 51 7936923 ALDOAP2 -2.1384024 -3.7138 -0.76954 0.0456851 aldolase, fructose-bisphosphate A pseudogene 2 52 7969228 ALG11///U -2.5976779 -3.0452 -0.49494 0.0176689 ALG11, alpha-1,2-mannosyltransferase///UTP14, small subunit proces 53 8141757 ALKBH4 2.769381 -2.7828 0.541324 0.0122074 alkB homolog 4, lysine demethylase 54 8040053 ALLC 3.2541557 -2.0215 1.017958 0.0041729 allantoicase 55 8004221 ALOX12 2.1981782 -3.6302 0.453382 0.0405245 arachidonate 12-lipoxygenase, 12S type 56 8012309 ALOX12B 2.2575772 -3.546 0.439605 0.0359302 arachidonate 12-lipoxygenase, 12R type 57 8001477 AMFR -2.6542221 -2.9593 -0.77238 0.0156549 autocrine motility factor receptor 58 8090852 AMOTL2 -3.0848012 -2.2898 -0.7071 0.0060961 angiomotin like 2 59 8090866 ANAPC13 -3.0338613 -2.3702 -0.95255 0.0068272 anaphase promoting complex subunit 13 60 8033892 ANGPTL6 2.1121698 -3.7501 0.470955 0.0481332 angiopoietin like 6 61 8150439 ANK1 2.3629479 -3.394 0.46731 0.028942 ankyrin 1 62 7984227 ANKDD1A 3.0400253 -2.3605 0.497822 0.0067344 ankyrin repeat and death domain containing 1A 63 8060949 ANKEF1 -5.0439266 0.7419 -1.35595 0.000072 ankyrin repeat and EF-hand domain containing 1 64 7943943 ANKK1 2.3528345 -3.4087 0.400823 0.0295535 ankyrin repeat and kinase domain containing 1 65 7934979 ANKRD1 2.1881199 -3.6443 0.493973 0.0413538 ankyrin repeat domain 1 66 7927033 ANKRD30A -3.1461142 -2.1929 -6.27253 0.0053167 ankyrin repeat domain 30A 67 8069499 ANKRD30B -3.3952732 -1.7969 -3.55391 0.0030354 ankyrin repeat domain 30B 68 8112672 ANKRD31 2.493414 -3.2017 0.877722 0.0220402 ankyrin repeat domain 31 69 8016725 ANKRD40 -2.728627 -2.8455 -0.59072 0.0133352 ankyrin repeat domain 40 70 7942858 ANKRD42 -2.3024942 -3.4816 -1.0422 0.0327798 ankyrin repeat domain 42 71 7965686 ANKS1B -2.3903619 -3.3539 -1.47992 0.027343 ankyrin repeat and sterile alpha motif domain containing 1B 72 7993815 ANKS4B 2.6442458 -2.9745 0.669602 0.0159937 ankyrin repeat and sterile alpha motif domain containing 4B 73 7938951 ANO5 -2.3424692 -3.4238 -1.12212 0.0301927 anoctamin 5 74 8049799 ANO7 2.1350723 -3.7184 0.452277 0.0459895 anoctamin 7 75 7905283 ANXA9 -2.4597308 -3.2517 -2.26573 0.0236572 annexin A9 76 8034084 AP1M2 -2.7490871 -2.814 -2.50749 0.0127571 adaptor related protein complex 1 mu 2 subunit 77 8161618 APBA1 3.5176519 -1.6019 0.960177 0.0023002 amyloid beta precursor protein binding family A member 1 78 8099982 APBB2 -2.5988325 -3.0434 -0.94924 0.0176255 amyloid beta precursor protein binding family B member 2 79 7897632 APITD1-CO -2.8836635 -2.6055 -1.1645 0.0095113
Recommended publications
  • Dominantly Acting Variants in ARF3 Have Disruptive Consequences on Golgi Integrity and Cause Microcephaly Recapitulated in ZebraSh
    Dominantly acting variants in ARF3 have disruptive consequences on Golgi integrity and cause microcephaly recapitulated in zebrash Giulia Fasano Ospedale Pediatrico Bambino Gesù Valentina Muto Ospedale Pediatrico Bambino Gesù Francesca Clementina Radio Genetic and Rare Disease Research Division, Bambino Gesù Children's Hospital IRCCS, Rome, Italy https://orcid.org/0000-0003-1993-8018 Martina Venditti Ospedale Pediatrico Bambino Gesù Alban Ziegler Département de Génétique, CHU d’Angers Giovanni Chillemi Tuscia University https://orcid.org/0000-0003-3901-6926 Annalisa Vetro Pediatric Neurology, Neurogenetics and Neurobiology Unit and Laboratories, Meyer Children’s Hospital, University of Florence Francesca Pantaleoni https://orcid.org/0000-0003-0765-9281 Simone Pizzi Bambino Gesù Children's Hospital Libenzio Conti Ospedale Pediatrico Bambino Gesù, IRCCS, 00146 Rome https://orcid.org/0000-0001-9466-5473 Stefania Petrini Bambino Gesù Children's Hospital Simona Coppola Istituto Superiore di Sanità Alessandro Bruselles Istituto Superiore di Sanità https://orcid.org/0000-0002-1556-4998 Ingrid Guarnetti Prandi University of Pisa, 56124 Pisa, Italy Balasubramanian Chandramouli Super Computing Applications and Innovation, CINECA Magalie Barth Céline Bris Département de Génétique, CHU d’Angers Donatella Milani Fondazione IRCCS Ca' Granda Ospedale Maggiore Policlinico Angelo Selicorni ASST Lariana Marina Macchiaiolo Ospedale Pediatrico Bambino Gesù, IRCCS Michaela Gonantini Ospedale Pediatrico Bambino Gesù, IRCCS Andrea Bartuli Bambino Gesù Children's
    [Show full text]
  • Genome-Wide Analysis of 5-Hmc in the Peripheral Blood of Systemic Lupus Erythematosus Patients Using an Hmedip-Chip
    INTERNATIONAL JOURNAL OF MOLECULAR MEDICINE 35: 1467-1479, 2015 Genome-wide analysis of 5-hmC in the peripheral blood of systemic lupus erythematosus patients using an hMeDIP-chip WEIGUO SUI1*, QIUPEI TAN1*, MING YANG1, QIANG YAN1, HUA LIN1, MINGLIN OU1, WEN XUE1, JIEJING CHEN1, TONGXIANG ZOU1, HUANYUN JING1, LI GUO1, CUIHUI CAO1, YUFENG SUN1, ZHENZHEN CUI1 and YONG DAI2 1Guangxi Key Laboratory of Metabolic Diseases Research, Central Laboratory of Guilin 181st Hospital, Guilin, Guangxi 541002; 2Clinical Medical Research Center, the Second Clinical Medical College of Jinan University (Shenzhen People's Hospital), Shenzhen, Guangdong 518020, P.R. China Received July 9, 2014; Accepted February 27, 2015 DOI: 10.3892/ijmm.2015.2149 Abstract. Systemic lupus erythematosus (SLE) is a chronic, Introduction potentially fatal systemic autoimmune disease characterized by the production of autoantibodies against a wide range Systemic lupus erythematosus (SLE) is a typical systemic auto- of self-antigens. To investigate the role of the 5-hmC DNA immune disease, involving diffuse connective tissues (1) and modification with regard to the onset of SLE, we compared is characterized by immune inflammation. SLE has a complex the levels 5-hmC between SLE patients and normal controls. pathogenesis (2), involving genetic, immunologic and envi- Whole blood was obtained from patients, and genomic DNA ronmental factors. Thus, it may result in damage to multiple was extracted. Using the hMeDIP-chip analysis and valida- tissues and organs, especially the kidneys (3). SLE arises from tion by quantitative RT-PCR (RT-qPCR), we identified the a combination of heritable and environmental influences. differentially hydroxymethylated regions that are associated Epigenetics, the study of changes in gene expression with SLE.
    [Show full text]
  • (12) Patent Application Publication (10) Pub. No.: US 2015/0337275 A1 Pearlman Et Al
    US 20150337275A1 (19) United States (12) Patent Application Publication (10) Pub. No.: US 2015/0337275 A1 Pearlman et al. (43) Pub. Date: Nov. 26, 2015 (54) BOCONVERSION PROCESS FOR Publication Classification PRODUCING NYLON-7, NYLON-7.7 AND POLYESTERS (51) Int. C. CI2N 9/10 (2006.01) (71) Applicant: INVISTATECHNOLOGIES S.a.r.l., CI2P 7/40 (2006.01) St. Gallen (CH) CI2PI3/00 (2006.01) CI2PI3/04 (2006.01) (72) Inventors: Paul S. Pearlman, Thornton, PA (US); CI2P 13/02 (2006.01) Changlin Chen, Cleveland (GB); CI2N 9/16 (2006.01) Adriana L. Botes, Cleveland (GB); Alex CI2N 9/02 (2006.01) Van Eck Conradie, Cleveland (GB); CI2N 9/00 (2006.01) Benjamin D. Herzog, Wichita, KS (US) CI2P 7/44 (2006.01) CI2P I 7/10 (2006.01) (73) Assignee: INVISTATECHNOLOGIES S.a.r.l., (52) U.S. C. St. Gallen (CH) CPC. CI2N 9/13 (2013.01); C12P 7/44 (2013.01); CI2P 7/40 (2013.01); CI2P 13/005 (2013.01); (21) Appl. No.: 14/367,484 CI2P 17/10 (2013.01); CI2P 13/02 (2013.01); CI2N 9/16 (2013.01); CI2N 9/0008 (2013.01); (22) PCT Fled: Dec. 21, 2012 CI2N 9/93 (2013.01); CI2P I3/04 (2013.01); PCT NO.: PCT/US2012/071.472 CI2P 13/001 (2013.01); C12Y 102/0105 (86) (2013.01) S371 (c)(1), (2) Date: Jun. 20, 2014 (57) ABSTRACT Embodiments of the present invention relate to methods for Related U.S. Application Data the biosynthesis of di- or trifunctional C7 alkanes in the (60) Provisional application No.
    [Show full text]
  • PARSANA-DISSERTATION-2020.Pdf
    DECIPHERING TRANSCRIPTIONAL PATTERNS OF GENE REGULATION: A COMPUTATIONAL APPROACH by Princy Parsana A dissertation submitted to The Johns Hopkins University in conformity with the requirements for the degree of Doctor of Philosophy Baltimore, Maryland July, 2020 © 2020 Princy Parsana All rights reserved Abstract With rapid advancements in sequencing technology, we now have the ability to sequence the entire human genome, and to quantify expression of tens of thousands of genes from hundreds of individuals. This provides an extraordinary opportunity to learn phenotype relevant genomic patterns that can improve our understanding of molecular and cellular processes underlying a trait. The high dimensional nature of genomic data presents a range of computational and statistical challenges. This dissertation presents a compilation of projects that were driven by the motivation to efficiently capture gene regulatory patterns in the human transcriptome, while addressing statistical and computational challenges that accompany this data. We attempt to address two major difficulties in this domain: a) artifacts and noise in transcriptomic data, andb) limited statistical power. First, we present our work on investigating the effect of artifactual variation in gene expression data and its impact on trans-eQTL discovery. Here we performed an in-depth analysis of diverse pre-recorded covariates and latent confounders to understand their contribution to heterogeneity in gene expression measurements. Next, we discovered 673 trans-eQTLs across 16 human tissues using v6 data from the Genotype Tissue Expression (GTEx) project. Finally, we characterized two trait-associated trans-eQTLs; one in Skeletal Muscle and another in Thyroid. Second, we present a principal component based residualization method to correct gene expression measurements prior to reconstruction of co-expression networks.
    [Show full text]
  • Hypoxia and Oxygen-Sensing Signaling in Gene Regulation and Cancer Progression
    International Journal of Molecular Sciences Review Hypoxia and Oxygen-Sensing Signaling in Gene Regulation and Cancer Progression Guang Yang, Rachel Shi and Qing Zhang * Department of Pathology, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA; [email protected] (G.Y.); [email protected] (R.S.) * Correspondence: [email protected]; Tel.: +1-214-645-4671 Received: 6 October 2020; Accepted: 29 October 2020; Published: 31 October 2020 Abstract: Oxygen homeostasis regulation is the most fundamental cellular process for adjusting physiological oxygen variations, and its irregularity leads to various human diseases, including cancer. Hypoxia is closely associated with cancer development, and hypoxia/oxygen-sensing signaling plays critical roles in the modulation of cancer progression. The key molecules of the hypoxia/oxygen-sensing signaling include the transcriptional regulator hypoxia-inducible factor (HIF) which widely controls oxygen responsive genes, the central members of the 2-oxoglutarate (2-OG)-dependent dioxygenases, such as prolyl hydroxylase (PHD or EglN), and an E3 ubiquitin ligase component for HIF degeneration called von Hippel–Lindau (encoding protein pVHL). In this review, we summarize the current knowledge about the canonical hypoxia signaling, HIF transcription factors, and pVHL. In addition, the role of 2-OG-dependent enzymes, such as DNA/RNA-modifying enzymes, JmjC domain-containing enzymes, and prolyl hydroxylases, in gene regulation of cancer progression, is specifically reviewed. We also discuss the therapeutic advancement of targeting hypoxia and oxygen sensing pathways in cancer. Keywords: hypoxia; PHDs; TETs; JmjCs; HIFs 1. Introduction Molecular oxygen serves as a co-factor in many biochemical processes and is fundamental for aerobic organisms to maintain intracellular ATP levels [1,2].
    [Show full text]
  • Distinct RNA N-Demethylation Pathways Catalyzed by Nonheme Iron ALKBH5 and FTO Enzymes Enable Regulation of Formaldehyde Release Rates
    Distinct RNA N-demethylation pathways catalyzed by nonheme iron ALKBH5 and FTO enzymes enable regulation of formaldehyde release rates Joel D. W. Toha,1, Steven W. M. Crossleya,1, Kevin J. Bruemmera, Eva J. Gea, Dan Hea, Diana A. Iovana, and Christopher J. Changa,b,2 aDepartment of Chemistry, University of California, Berkeley, CA 94720; and bDepartment of Molecular and Cell Biology, University of California, Berkeley, CA 94720 Edited by Amy C. Rosenzweig, Northwestern University, Evanston, IL, and approved August 24, 2020 (received for review April 17, 2020) The AlkB family of nonheme Fe(II)/2-oxoglutarate–dependent oxy- m6A (23) is relatively stable compared to other hydroxymethyl- genases are essential regulators of RNA epigenetics by serving as containing nucleobases and has been reported to decay to the free erasers of one-carbon marks on RNA with release of formaldehyde adenosine (A) base over 10 h (22). m6A is the most prominent (FA). Two major human AlkB family members, FTO and ALKBH5, modification of messenger RNA (mRNA) (24, 25) and is installed by both act as oxidative demethylases of N6-methyladenosine (m6A) the S-adenosylmethionine–dependent METTL3/14-WTAP writer but furnish different major products, N6-hydroxymethyladenosine complex (26) and removed by two human AlkB eraser enzymes, fat (hm6A) and adenosine (A), respectively. Here we identify founda- mass and obesity-associated protein (FTO) (4) and AlkB family tional mechanistic differences between FTO and ALKBH5 that pro- member 5 (ALKBH5) (27). Internal m6A modifications control the mote these distinct biochemical outcomes. In contrast to FTO, fate of mRNA (28–32) through translation, splicing, localization, which follows a traditional oxidative N-demethylation pathway stability, and decay and are connected to cancer progression, im- to catalyze conversion of m6A to hm6A with subsequent slow mune responses, and metabolic states (27, 29, 32–36).
    [Show full text]
  • Analysis of Gene Expression Data for Gene Ontology
    ANALYSIS OF GENE EXPRESSION DATA FOR GENE ONTOLOGY BASED PROTEIN FUNCTION PREDICTION A Thesis Presented to The Graduate Faculty of The University of Akron In Partial Fulfillment of the Requirements for the Degree Master of Science Robert Daniel Macholan May 2011 ANALYSIS OF GENE EXPRESSION DATA FOR GENE ONTOLOGY BASED PROTEIN FUNCTION PREDICTION Robert Daniel Macholan Thesis Approved: Accepted: _______________________________ _______________________________ Advisor Department Chair Dr. Zhong-Hui Duan Dr. Chien-Chung Chan _______________________________ _______________________________ Committee Member Dean of the College Dr. Chien-Chung Chan Dr. Chand K. Midha _______________________________ _______________________________ Committee Member Dean of the Graduate School Dr. Yingcai Xiao Dr. George R. Newkome _______________________________ Date ii ABSTRACT A tremendous increase in genomic data has encouraged biologists to turn to bioinformatics in order to assist in its interpretation and processing. One of the present challenges that need to be overcome in order to understand this data more completely is the development of a reliable method to accurately predict the function of a protein from its genomic information. This study focuses on developing an effective algorithm for protein function prediction. The algorithm is based on proteins that have similar expression patterns. The similarity of the expression data is determined using a novel measure, the slope matrix. The slope matrix introduces a normalized method for the comparison of expression levels throughout a proteome. The algorithm is tested using real microarray gene expression data. Their functions are characterized using gene ontology annotations. The results of the case study indicate the protein function prediction algorithm developed is comparable to the prediction algorithms that are based on the annotations of homologous proteins.
    [Show full text]
  • Expression Gene Network Analyses Reveal Molecular Mechanisms And
    www.nature.com/scientificreports OPEN Diferential expression and co- expression gene network analyses reveal molecular mechanisms and candidate biomarkers involved in breast muscle myopathies in chicken Eva Pampouille1,2, Christelle Hennequet-Antier1, Christophe Praud1, Amélie Juanchich1, Aurélien Brionne1, Estelle Godet1, Thierry Bordeau1, Fréderic Fagnoul2, Elisabeth Le Bihan-Duval1 & Cécile Berri1* The broiler industry is facing an increasing prevalence of breast myopathies, such as white striping (WS) and wooden breast (WB), and the precise aetiology of these occurrences remains poorly understood. To progress our understanding of the structural changes and molecular pathways involved in these myopathies, a transcriptomic analysis was performed using an 8 × 60 K Agilent chicken microarray and histological study. The study used pectoralis major muscles from three groups: slow-growing animals (n = 8), fast-growing animals visually free from defects (n = 8), or severely afected by both WS and WB (n = 8). In addition, a weighted correlation network analysis was performed to investigate the relationship between modules of co-expressed genes and histological traits. Functional analysis suggested that selection for fast growing and breast meat yield has progressively led to conditions favouring metabolic shifts towards alternative catabolic pathways to produce energy, leading to an adaptive response to oxidative stress and the frst signs of infammatory, regeneration and fbrosis processes. All these processes are intensifed in muscles afected by severe myopathies, in which new mechanisms related to cellular defences and remodelling seem also activated. Furthermore, our study opens new perspectives for myopathy diagnosis by highlighting fne histological phenotypes and genes whose expression was strongly correlated with defects. Te poultry industry relies on the production of fast-growing chickens, which are slaughtered at high weights and intended for cutting and processing.
    [Show full text]
  • Seq2pathway Vignette
    seq2pathway Vignette Bin Wang, Xinan Holly Yang, Arjun Kinstlick May 19, 2021 Contents 1 Abstract 1 2 Package Installation 2 3 runseq2pathway 2 4 Two main functions 3 4.1 seq2gene . .3 4.1.1 seq2gene flowchart . .3 4.1.2 runseq2gene inputs/parameters . .5 4.1.3 runseq2gene outputs . .8 4.2 gene2pathway . 10 4.2.1 gene2pathway flowchart . 11 4.2.2 gene2pathway test inputs/parameters . 11 4.2.3 gene2pathway test outputs . 12 5 Examples 13 5.1 ChIP-seq data analysis . 13 5.1.1 Map ChIP-seq enriched peaks to genes using runseq2gene .................... 13 5.1.2 Discover enriched GO terms using gene2pathway_test with gene scores . 15 5.1.3 Discover enriched GO terms using Fisher's Exact test without gene scores . 17 5.1.4 Add description for genes . 20 5.2 RNA-seq data analysis . 20 6 R environment session 23 1 Abstract Seq2pathway is a novel computational tool to analyze functional gene-sets (including signaling pathways) using variable next-generation sequencing data[1]. Integral to this tool are the \seq2gene" and \gene2pathway" components in series that infer a quantitative pathway-level profile for each sample. The seq2gene function assigns phenotype-associated significance of genomic regions to gene-level scores, where the significance could be p-values of SNPs or point mutations, protein-binding affinity, or transcriptional expression level. The seq2gene function has the feasibility to assign non-exon regions to a range of neighboring genes besides the nearest one, thus facilitating the study of functional non-coding elements[2]. Then the gene2pathway summarizes gene-level measurements to pathway-level scores, comparing the quantity of significance for gene members within a pathway with those outside a pathway.
    [Show full text]
  • Supplementary Table S1. Upregulated Genes Differentially
    Supplementary Table S1. Upregulated genes differentially expressed in athletes (p < 0.05 and 1.3-fold change) Gene Symbol p Value Fold Change 221051_s_at NMRK2 0.01 2.38 236518_at CCDC183 0.00 2.05 218804_at ANO1 0.00 2.05 234675_x_at 0.01 2.02 207076_s_at ASS1 0.00 1.85 209135_at ASPH 0.02 1.81 228434_at BTNL9 0.03 1.81 229985_at BTNL9 0.01 1.79 215795_at MYH7B 0.01 1.78 217979_at TSPAN13 0.01 1.77 230992_at BTNL9 0.01 1.75 226884_at LRRN1 0.03 1.74 220039_s_at CDKAL1 0.01 1.73 236520_at 0.02 1.72 219895_at TMEM255A 0.04 1.72 201030_x_at LDHB 0.00 1.69 233824_at 0.00 1.69 232257_s_at 0.05 1.67 236359_at SCN4B 0.04 1.64 242868_at 0.00 1.63 1557286_at 0.01 1.63 202780_at OXCT1 0.01 1.63 1556542_a_at 0.04 1.63 209992_at PFKFB2 0.04 1.63 205247_at NOTCH4 0.01 1.62 1554182_at TRIM73///TRIM74 0.00 1.61 232892_at MIR1-1HG 0.02 1.61 204726_at CDH13 0.01 1.6 1561167_at 0.01 1.6 1565821_at 0.01 1.6 210169_at SEC14L5 0.01 1.6 236963_at 0.02 1.6 1552880_at SEC16B 0.02 1.6 235228_at CCDC85A 0.02 1.6 1568623_a_at SLC35E4 0.00 1.59 204844_at ENPEP 0.00 1.59 1552256_a_at SCARB1 0.02 1.59 1557283_a_at ZNF519 0.02 1.59 1557293_at LINC00969 0.03 1.59 231644_at 0.01 1.58 228115_at GAREM1 0.01 1.58 223687_s_at LY6K 0.02 1.58 231779_at IRAK2 0.03 1.58 243332_at LOC105379610 0.04 1.58 232118_at 0.01 1.57 203423_at RBP1 0.02 1.57 AMY1A///AMY1B///AMY1C///AMY2A///AMY2B// 208498_s_at 0.03 1.57 /AMYP1 237154_at LOC101930114 0.00 1.56 1559691_at 0.01 1.56 243481_at RHOJ 0.03 1.56 238834_at MYLK3 0.01 1.55 213438_at NFASC 0.02 1.55 242290_at TACC1 0.04 1.55 ANKRD20A1///ANKRD20A12P///ANKRD20A2///
    [Show full text]
  • A Computational Approach for Defining a Signature of Β-Cell Golgi Stress in Diabetes Mellitus
    Page 1 of 781 Diabetes A Computational Approach for Defining a Signature of β-Cell Golgi Stress in Diabetes Mellitus Robert N. Bone1,6,7, Olufunmilola Oyebamiji2, Sayali Talware2, Sharmila Selvaraj2, Preethi Krishnan3,6, Farooq Syed1,6,7, Huanmei Wu2, Carmella Evans-Molina 1,3,4,5,6,7,8* Departments of 1Pediatrics, 3Medicine, 4Anatomy, Cell Biology & Physiology, 5Biochemistry & Molecular Biology, the 6Center for Diabetes & Metabolic Diseases, and the 7Herman B. Wells Center for Pediatric Research, Indiana University School of Medicine, Indianapolis, IN 46202; 2Department of BioHealth Informatics, Indiana University-Purdue University Indianapolis, Indianapolis, IN, 46202; 8Roudebush VA Medical Center, Indianapolis, IN 46202. *Corresponding Author(s): Carmella Evans-Molina, MD, PhD ([email protected]) Indiana University School of Medicine, 635 Barnhill Drive, MS 2031A, Indianapolis, IN 46202, Telephone: (317) 274-4145, Fax (317) 274-4107 Running Title: Golgi Stress Response in Diabetes Word Count: 4358 Number of Figures: 6 Keywords: Golgi apparatus stress, Islets, β cell, Type 1 diabetes, Type 2 diabetes 1 Diabetes Publish Ahead of Print, published online August 20, 2020 Diabetes Page 2 of 781 ABSTRACT The Golgi apparatus (GA) is an important site of insulin processing and granule maturation, but whether GA organelle dysfunction and GA stress are present in the diabetic β-cell has not been tested. We utilized an informatics-based approach to develop a transcriptional signature of β-cell GA stress using existing RNA sequencing and microarray datasets generated using human islets from donors with diabetes and islets where type 1(T1D) and type 2 diabetes (T2D) had been modeled ex vivo. To narrow our results to GA-specific genes, we applied a filter set of 1,030 genes accepted as GA associated.
    [Show full text]
  • Comparative Transcriptome Analysis of Embryo Invasion in the Mink Uterus
    Placenta 75 (2019) 16–22 Contents lists available at ScienceDirect Placenta journal homepage: www.elsevier.com/locate/placenta Comparative transcriptome analysis of embryo invasion in the mink uterus T ∗ Xinyan Caoa,b, , Chao Xua,b, Yufei Zhanga,b, Haijun Weia,b, Yong Liuc, Junguo Caoa,b, Weigang Zhaoa,b, Kun Baoa,b, Qiong Wua,b a Institute of Special Animal and Plant Sciences, Chinese Academy of Agricultural Sciences, Changchun, China b State Key Laboratory for Molecular Biology of Special Economic Animal and Plant Science, Chinese Academy of Agricultural Sciences, Changchun, China c Key Laboratory of Embryo Development and Reproductive Regulation of Anhui Province, College of Biological and Food Engineering, Fuyang Teachers College, Fuyang, China ABSTRACT Introduction: In mink, as many as 65% of embryos die during gestation. The causes and the mechanisms of embryonic mortality remain unclear. The purpose of our study was to examine global gene expression changes during embryo invasion in mink, and thereby to identify potential signaling pathways involved in implantation failure and early pregnancy loss. Methods: Illumina's next-generation sequencing technology (RNA-Seq) was used to analyze the differentially expressed genes (DEGs) in implantation (IMs) and inter- implantation sites (inter-IMs) of uterine tissue. Results: We identified a total of 606 DEGs, including 420 up- and 186 down-regulated genes in IMs compared to inter-IMs. Gene annotation analysis indicated multiple biological pathways to be significantly enriched for DEGs, including immune response, ECM complex, cytokine activity, chemokine activity andprotein binding. The KEGG pathway including cytokine-cytokine receptor interaction, Jak-STAT, TNF and the chemokine signaling pathway were the most enriched.
    [Show full text]