Mouse Snapc4 Knockout Project (CRISPR/Cas9)

https://www.alphaknockout.com

Objective: To create a Snapc4 knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary:
The Snapc4 gene (NCBI Reference Sequence: NM_172339; Ensembl: ENSMUSG00000036281) is located on mouse chromosome 2. 23 exons are identified, with the ATG start codon in exon 2 and the TAG stop codon in exon 22 (Transcript: ENSMUST00000035427). Exons 2~16 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note: Exon 2 starts from the coding region. Exons 2~16 cover 49.66% of the coding region. The size of the effective KO region is ~9498 bp. The KO region does not contain any other known gene.

Overview of the Targeting Strategy

[Figure: wildtype allele, 5' to 3', showing exons 1 to 16 and exon 23 of mouse Snapc4, with a gRNA region on each side of the knockout region. Legend: exon of mouse Snapc4; knockout region.]

Overview of the Dot Plot (up)

[Figure: dot plot, window size 15 bp, forward and reverse-complement matches.]

Note: The 1912 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.

Overview of the Dot Plot (down)

[Figure: dot plot, window size 15 bp, forward and reverse-complement matches.]

Note: The 730 bp section downstream of Exon 16 is aligned with itself to determine if there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
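The self-alignment behind a dot plot can be sketched in a few lines of Python. This is an illustrative sketch only, not the tool used to produce the plots in this report: the function names (`revcomp`, `self_dotplot`, `offset_counts`) and the toy sequence are assumptions introduced here, and exact word matching is a simplification of what dot plot software typically does.

```python
from collections import Counter, defaultdict

# Translation table for complementing DNA bases (hypothetical helper).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def self_dotplot(seq: str, window: int = 15):
    """Compare a sequence with itself using exact matches of `window` bases.

    Returns two lists of (i, j) start positions:
    forward matches with j > i (the trivial main diagonal is skipped),
    and matches between the window at i and the reverse complement at j.
    """
    seq = seq.upper()
    index = defaultdict(list)
    last = len(seq) - window + 1
    for i in range(last):
        index[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for i in range(last):
        word = seq[i:i + window]
        forward.extend((i, j) for j in index[word] if j > i)
        reverse.extend((i, j) for j in index.get(revcomp(word), []))
    return forward, reverse


def offset_counts(forward):
    """Count forward matches per diagonal offset (j - i)."""
    return Counter(j - i for i, j in forward)


# Toy input: three copies of a 7 bp unit, compared with a 7 bp window.
TOY = "GATTACA" * 3
toy_forward, toy_reverse = self_dotplot(TOY, window=7)
print(offset_counts(toy_forward).most_common(3))
```

In the toy input the most frequent offset equals the length of the repeat unit, which is how a tandem repeat shows up as parallel off-diagonal lines in a dot plot. For the flanking sequences described above, the window would be 15 bp as stated in the report.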
Overview of the GC Content Distribution (up)

[Figure: GC content distribution, window size 300 bp.]

Summary: Full Length (1912 bp)

Base  Percent  Count
A     21.29%   407
C     22.86%   437
T     29.81%   570
G     26.05%   498

Note: The 1912 bp section upstream of Exon 2 is analyzed to determine the GC content. Significant high GC-content regions are found. The gRNA site is selected outside of these high GC-content regions.

Overview of the GC Content Distribution (down)

[Figure: GC content distribution, window size 300 bp.]

Summary: Full Length (730 bp)

Base  Percent  Count
A     21.37%   156
C     24.11%   176
T     27.12%   198
G     27.40%   200

Note: The 730 bp section downstream of Exon 16 is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.

BLAT Search Results (up)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
-----------------------------------------------------------------------------------------
YourSeq  1912   1       1912  1912   100.0%    chr2   -       26378624   26380535   1912
YourSeq  216    732     1125  1912   91.0%     chr7   -       73609495   73610204   710
YourSeq  185    859     1120  1912   89.1%     chr16  +       82752580   82753109   530
YourSeq  183    854     1120  1912   91.5%     chr11  -       62188828   62189337   510
YourSeq  182    854     1131  1912   91.4%     chr2   -       18222038   18222343   306
YourSeq  182    860     1120  1912   90.7%     chr11  -       29672435   29672863   429
YourSeq  181    811     1102  1912   92.2%     chr11  -       75527361   75527962   602
YourSeq  178    812     1084  1912   91.3%     chr1   -       131991175  131991759  585
YourSeq  176    860     1127  1912   93.6%     chr1   +       72232488   72242579   10092
YourSeq  171    860     1125  1912   91.4%     chr11  +       55472438   55472987   550
YourSeq  163    854     1100  1912   94.6%     chr11  -       86589606   86941702   352097
YourSeq  162    784     1003  1912   87.4%     chr2   +       142919278  142919477  200
YourSeq  158    865     1085  1912   91.7%     chr12  +       106536845  106537382  538
YourSeq  157    793     995   1912   88.3%     chr17  -       33591056   33591239   184
YourSeq  156    809     1085  1912   93.0%     chr11  -       101662641  101663080  440
YourSeq  155    793     1048  1912   84.2%     chr12  +       21459328   21459525   198
YourSeq  153    854     1096  1912   91.3%     chr8   -       90911463   90911941   479
YourSeq  150    775     1009  1912   86.5%     chr15  +       96897091   96897313   223
YourSeq  150    775     993   1912   87.0%     chr12  +       40083902   40084109   208
YourSeq  146    809     993   1912   88.5%     chr4   -       55197046   55197210   165

Note: The 1912 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
-----------------------------------------------------------------------------------------
YourSeq  730    1       730   730    100.0%    chr2   -       26368409   26369138   730
YourSeq  23     461     486   730    96.0%     chr2   -       122250581  122250606  26
YourSeq  22     219     241   730    100.0%    chr16  -       52215386   52215411   26
YourSeq  22     428     451   730    87.0%     chr15  +       93300234   93300256   23
YourSeq  20     516     537   730    95.5%     chr15  +       67553906   67553927   22

Note: The 730 bp section downstream of Exon 16 is BLAT searched against the genome. No significant similarity is found.
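The BLAT rows above can be checked mechanically. The sketch below is a hedged illustration in Python, not the vendor's screening procedure: `BlatHit`, `parse_hit`, `span_is_consistent`, `query_coverage` and `significant_secondary_hits` are hypothetical names, and the coverage and identity thresholds are arbitrary assumptions chosen only to show the idea. The three sample rows are copied from the upstream table.

```python
from typing import List, NamedTuple


class BlatHit(NamedTuple):
    score: int
    q_start: int
    q_end: int
    q_size: int
    identity: float  # percent identity
    chrom: str
    strand: str
    t_start: int
    t_end: int
    span: int


def parse_hit(line: str) -> BlatHit:
    """Parse one whitespace-separated BLAT row; only the last 10 fields are used."""
    f = line.split()[-10:]
    return BlatHit(int(f[0]), int(f[1]), int(f[2]), int(f[3]),
                   float(f[4].rstrip("%")), f[5], f[6],
                   int(f[7]), int(f[8]), int(f[9]))


def span_is_consistent(hit: BlatHit) -> bool:
    """SPAN should equal the inclusive length of the genomic interval."""
    return hit.t_end - hit.t_start + 1 == hit.span


def query_coverage(hit: BlatHit) -> float:
    """Fraction of the query sequence covered by the aligned block."""
    return (hit.q_end - hit.q_start + 1) / hit.q_size


def significant_secondary_hits(hits: List[BlatHit],
                               min_coverage: float = 0.5,
                               min_identity: float = 90.0) -> List[BlatHit]:
    """Hits other than the full-length self match that pass both thresholds.

    The thresholds are illustrative assumptions, not criteria from the report.
    """
    result = []
    for h in hits:
        is_self = h.identity == 100.0 and h.q_start == 1 and h.q_end == h.q_size
        if is_self:
            continue
        if query_coverage(h) >= min_coverage and h.identity >= min_identity:
            result.append(h)
    return result


# Sample rows copied from the upstream BLAT table.
ROWS = [
    "YourSeq 1912 1 1912 1912 100.0% chr2 - 26378624 26380535 1912",
    "YourSeq 216 732 1125 1912 91.0% chr7 - 73609495 73610204 710",
    "YourSeq 185 859 1120 1912 89.1% chr16 + 82752580 82753109 530",
]
HITS = [parse_hit(row) for row in ROWS]
for hit in HITS:
    print(hit.chrom, span_is_consistent(hit), round(query_coverage(hit), 3))
```

Coverage of the query is used here rather than raw score because the secondary hits in the table align only a few hundred bases of the 1912 bp query. Other filters are equally defensible.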
Gene and protein information:

Snapc4 small nuclear RNA activating complex, polypeptide 4 [Mus musculus (house mouse)]
Gene ID: 227644, updated on 14-Aug-2019

Gene summary

Official Symbol: Snapc4 (provided by MGI)
Official Full Name: small nuclear RNA activating complex, polypeptide 4 (provided by MGI)
Primary source: MGI:MGI:2443935
See related: Ensembl:ENSMUSG00000036281
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: 5730436L13Rik
Expression: Ubiquitous expression in ovary adult (RPKM 8.1), genital fat pad adult (RPKM 7.7) and 28 other tissues
Orthologs: human

Genomic context

Location: 2; 2 A3
Exon count: 25

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  2    NC_000068.7 (26362765..26380661, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    2    NC_000068.6 (26218285..26236173, complement)

[Figure: genomic context on Chromosome 2 - NC_000068.7.]

Transcript information: This gene has 14 transcripts

Gene: Snapc4 ENSMUSG00000036281

Description: small nuclear RNA activating complex, polypeptide 4 [Source:MGI Symbol;Acc:MGI:2443935]
Gene Synonyms: 5730436L13Rik
Location: Chromosome 2: 26,362,765-26,380,653 reverse strand. GRCm38:CM000995.2
About this gene: This gene has 14 transcripts (splice variants), 176 orthologues, 15 paralogues, is a member of 1 Ensembl protein family and is associated with 3 phenotypes.
Transcripts

Name        Transcript ID          bp    Protein     Translation ID        Biotype         CCDS       UniProt  Flags
Snapc4-201  ENSMUST00000035427.10  4363  1325aa      ENSMUSP00000041767.4  Protein coding  CCDS15802  Q8BP86   TSL:1 GENCODE basic APPRIS P2
Snapc4-202  ENSMUST00000114115.8   4368  1333aa      ENSMUSP00000109750.2  Protein coding  -          A2AIV6   TSL:1 GENCODE basic APPRIS ALT2
Snapc4-203  ENSMUST00000123934.1   761   254aa       ENSMUSP00000122456.1  Protein coding  -          F7D5X7   CDS 5' and 3' incomplete TSL:3
Snapc4-212  ENSMUST00000149850.7   4459  No protein  -                     lncRNA          -          -        TSL:1
Snapc4-213  ENSMUST00000150121.1   2039  No protein  -                     lncRNA          -          -        TSL:2
Snapc4-207  ENSMUST00000135171.1   814   No protein  -                     lncRNA          -          -        TSL:2
Snapc4-211  ENSMUST00000149316.7   797   No protein  -                     lncRNA          -          -        TSL:2
Snapc4-209  ENSMUST00000144871.1   793   No protein  -                     lncRNA          -          -        TSL:3
Snapc4-208  ENSMUST00000136054.1   724   No protein  -                     lncRNA          -          -        TSL:3
Snapc4-210  ENSMUST00000148024.1   709   No protein  -                     lncRNA          -          -        TSL:5
Snapc4-204  ENSMUST00000124843.1   678   No protein  -                     lncRNA          -          -        TSL:3
Snapc4-205  ENSMUST00000125789.1   543   No protein  -                     lncRNA          -          -        TSL:2
Snapc4-214  ENSMUST00000155643.1   372   No protein  -                     lncRNA          -          -        TSL:3
Snapc4-206  ENSMUST00000133778.7   319   No protein  -                     lncRNA          -          -        TSL:3

[Figure: Ensembl region view, 37.89 kb, chromosome 2 from 26.36 Mb to 26.39 Mb. Forward strand: Gm13562 (lncRNA), Gm13563 (lncRNA), Pmpca (protein coding and lncRNA transcripts). Contig: AL732541.11. Reverse strand: Card9, Snapc4 and Entr1 (protein coding and lncRNA transcripts). Regulation legend: CTCF, enhancer, open chromatin, promoter, promoter flank. Gene legend: protein coding (Ensembl protein coding; merged Ensembl/Havana); non-protein coding (RNA gene).]

Transcript: ENSMUST00000035427

[Figure: Snapc4-201 (protein coding), reverse strand, 17.88 kb, with protein ENSMUSP00000041... annotated along a scale of 0 to 1325 residues. Tracks: MobiDB lite; low complexity (Seg); coiled-coils (Ncoils); sequence variants (dbSNP and all other sources). Variant legend: inframe insertion, missense variant, synonymous variant.]

Domain annotations shown in the figure:

Superfamily:       Homeobox-like domain superfamily
SMART:             SANT/Myb domain
Pfam:              PF13921
PROSITE profiles:  Myb-like domain; Myb domain
PANTHER:           PTHR46621
Gene3D:            1.10.10.60
CDD:               SANT/Myb domain

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.