Mouse Baat Knockout Project (CRISPR/Cas9)


https://www.alphaknockout.com

Objective:
To create a Baat knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary:
The Baat gene (NCBI Reference Sequence: NM_007519; Ensembl: ENSMUSG00000039653) is located on mouse chromosome 4. Four exons are identified, with the ATG start codon in exon 2 and the TGA stop codon in exon 4 (Transcript: ENSMUST00000043056). Exon 2 will be selected as the target site. Cas9 and gRNA will be co-injected into fertilized eggs for KO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note:
Exon 2 starts from the coding region.
Exon 2 covers 36.98% of the coding region.
The size of the effective KO region: ~517 bp.
The KO region does not contain any other known gene.

Overview of the Targeting Strategy
[Figure: wildtype allele drawn 5' to 3' with two gRNA regions marked; exon labels 1, 2 and 4. Legend: exon of mouse Baat; knockout region.]

Overview of the Dot Plot (up)
Window size: 15 bp
[Figure: dot plot of the sequence against itself, showing forward and reverse-complement matches.]
Note: The 2000 bp section upstream of Exon 2 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.

Overview of the Dot Plot (down)
Window size: 15 bp
[Figure: dot plot of the sequence against itself, showing forward and reverse-complement matches.]
Note: The 2000 bp section downstream of Exon 2 is aligned with itself to determine if there are tandem repeats. Tandem repeats are found in the dot plot matrix. The gRNA site is selected outside of these tandem repeats.

Overview of the GC Content Distribution (up)
Window size: 300 bp
[Figure: GC content plotted along the sequence.]
Summary: Full length (2000 bp) | A (26.5%, 530) | C (21.8%, 436) | T (33.6%, 672) | G (18.1%, 362)
Note: The 2000 bp section upstream of Exon 2 is analyzed to determine the GC content.
No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

Overview of the GC Content Distribution (down)
Window size: 300 bp
[Figure: GC content plotted along the sequence.]
Summary: Full length (2000 bp) | A (26.5%, 530) | C (18.75%, 375) | T (33.15%, 663) | G (21.6%, 432)
Note: The 2000 bp section downstream of Exon 2 is analyzed to determine the GC content. No significant high GC-content region is found. So this region is suitable for PCR screening or sequencing analysis.

BLAT Search Results (up)
QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
-----------------------------------------------------------------------------------------------
YourSeq  2000   1       2000  2000   100.0%    chr4   -       49503121   49505120   2000
YourSeq  62     332     569   2000   89.8%     chr7   -       122348103  122348621  519
YourSeq  61     500     624   2000   93.1%     chr6   -       144810232  144810389  158
YourSeq  61     503     604   2000   93.2%     chr14  -       122594669  122594849  181
YourSeq  59     500     625   2000   76.9%     chr12  +       84280267   84280348   82
YourSeq  52     563     629   2000   96.5%     chr10  +       115780579  115780659  81
YourSeq  50     515     610   2000   91.4%     chr11  -       104077538  104077653  116
YourSeq  49     541     607   2000   83.7%     chr16  -       95041747   95041811   65
YourSeq  48     508     571   2000   91.7%     chr17  -       58664717   58664796   80
YourSeq  47     517     590   2000   89.7%     chr14  -       86635750   86635986   237
YourSeq  45     520     610   2000   83.4%     chr1   +       71636848   71636934   87
YourSeq  42     502     593   2000   80.8%     chr8   -       32041311   32041398   88
YourSeq  39     502     548   2000   91.5%     chr9   -       73173185   73173231   47
YourSeq  37     516     564   2000   86.1%     chr2   -       167216814  167216860  47
YourSeq  37     537     587   2000   93.2%     chr14  +       12155116   12155176   61
YourSeq  37     574     654   2000   93.2%     chr1   +       117338750  117338833  84
YourSeq  35     558     608   2000   85.4%     chr6   +       82468318   82468366   49
YourSeq  35     522     569   2000   92.5%     chr4   +       93604430   93604479   50
YourSeq  33     495     528   2000   100.0%    chr12  +       17827732   17827806   75
YourSeq  32     559     599   2000   92.2%     chr5   -       21980936   21980977   42
Note: The 2000 bp section upstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)
QUERY    SCORE  QSTART  QEND  QSIZE  IDENTITY  CHROM  STRAND  TSTART     TEND       SPAN
-----------------------------------------------------------------------------------------------
YourSeq  2000   1       2000  2000   100.0%    chr4   -       49500655   49502654   2000
YourSeq  183    849     1159  2000   86.2%     chr2   -       12377781   12378123   343
YourSeq  174    809     1171  2000   81.2%     chr13  +       36569867   36570376   510
YourSeq  173    849     1159  2000   81.2%     chr5   -       20754940   20755281   342
YourSeq  171    812     1159  2000   84.2%     chr9   -       123355749  123356230  482
YourSeq  169    849     1133  2000   82.2%     chr3   -       67381162   67381483   322
YourSeq  167    849     1157  2000   88.6%     chr2   -       171944317  171944660  344
YourSeq  166    852     1133  2000   86.8%     chr12  -       110064908  110065223  316
YourSeq  164    813     1108  2000   84.6%     chr10  -       59545338   59545775   438
YourSeq  162    849     1133  2000   84.2%     chr8   +       109971339  109971634  296
YourSeq  159    812     1114  2000   82.2%     chr4   -       63639164   63639611   448
YourSeq  159    849     1115  2000   84.8%     chr2   -       166503039  166547698  44660
YourSeq  158    815     1113  2000   84.0%     chr1   -       156379976  156380422  447
YourSeq  157    849     1114  2000   83.5%     chr11  -       75990670   75990975   306
YourSeq  154    849     1118  2000   88.9%     chrX   +       48809547   48809860   314
YourSeq  153    849     1129  2000   87.0%     chr17  +       88492303   88492626   324
YourSeq  152    849     1106  2000   82.1%     chr3   -       11303036   11303329   294
YourSeq  151    393     1108  2000   86.6%     chr14  +       118818441  118819151  711
YourSeq  148    813     1159  2000   87.8%     chr8   +       126463771  126464199  429
YourSeq  146    853     1114  2000   85.3%     chr15  +       36933294   36933590   297
Note: The 2000 bp section downstream of Exon 2 is BLAT searched against the genome. No significant similarity is found.

Gene and protein information:
Baat bile acid-Coenzyme A: amino acid N-acyltransferase [ Mus musculus (house mouse) ]
Gene ID: 12012, updated on 28-Sep-2019

Gene summary
Official Symbol: Baat (provided by MGI)
Official Full Name: bile acid-Coenzyme A: amino acid N-acyltransferase (provided by MGI)
Primary source: MGI:MGI:106642
See related: Ensembl:ENSMUSG00000039653
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: BAT; AI118337; AI158864
Expression: Biased expression in liver adult (RPKM 17.8), liver E18 (RPKM 3.3) and 1 other tissue
Orthologs: human, all

Genomic context
Location: 4 B1; 4 26.51 cM
Exon count: 4

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  4    NC_000070.6 (49489416..49507915, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    4    NC_000070.5 (49502290..49519430, complement)

[Figure: Chromosome 4 - NC_000070.6]

Transcript information:
This gene has 2 transcripts
Gene: Baat ENSMUSG00000039653
Description: bile acid-Coenzyme A: amino acid N-acyltransferase [Source:MGI Symbol;Acc:MGI:106642]
Gene Synonyms: BAT, taurine N-acyltransferase
Location: Chromosome 4: 49,489,422-49,506,557 reverse strand. GRCm38:CM000997.2
About this gene: This gene has 2 transcripts (splice variants), 146 orthologues, 9 paralogues, is a member of 1 Ensembl protein family and is associated with 2 phenotypes.

Transcripts
Name      Transcript ID         bp    Protein  Translation ID        Biotype         CCDS       UniProt  Flags
Baat-201  ENSMUST00000043056.8  1941  420aa    ENSMUSP00000041983.2  Protein coding  CCDS18173  Q91X34   TSL:1, GENCODE basic, APPRIS P1
Baat-202  ENSMUST00000166036.1  1263  420aa    ENSMUSP00000129603.1  Protein coding  CCDS18173  Q91X34   TSL:3, GENCODE basic, APPRIS P1

[Figure: Ensembl region view, 37.14 kb, axis ticks from 49.48 Mb to 49.51 Mb. Tracks: Contigs (AL772310.27); Genes (Comprehensive set...): Baat-201, Baat-202 and Mrpl50-201, all protein coding, drawn on the reverse strand; Regulatory Build. Regulation legend: CTCF, Open Chromatin, Promoter Flank. Gene legend: protein coding, merged Ensembl/Havana; Ensembl protein coding.]

Transcript: ENSMUST00000043056
[Figure: Baat-201 (protein coding), reverse strand, 17.14 kb, with the protein ENSMUSP00000041... annotated as follows.]
Low complexity (Seg)
Coiled-coils (Ncoils)
Superfamily: Alpha/Beta hydrolase fold
Pfam: Acyl-CoA thioester hydrolase/bile acid-CoA amino acid N-acetyltransferase; BAAT/Acyl-CoA thioester hydrolase C-terminal
PIRSF: Acyl-CoA thioesterase, long chain
PANTHER: PTHR10824; PTHR10824:SF18
Gene3D: Alpha/Beta hydrolase fold; Acyl-CoA thioester hydrolase/BAAT, N-terminal
All sequence SNPs/i...: Sequence variants (dbSNP and all other sources)
Variant legend: frameshift variant, missense variant, synonymous variant
Scale bar: 0 to 420

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
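
The base counts given in the two GC content summaries can be cross-checked with a short Python sketch. This is illustrative only: the report does not say which tool produced its figures, so the function names and the step size below are assumptions. The base counts and the 300 bp window are taken from the report.

```python
from typing import Dict, List

# Base counts copied from the report's two "Summary" lines (2000 bp each).
UPSTREAM: Dict[str, int] = {"A": 530, "C": 436, "T": 672, "G": 362}
DOWNSTREAM: Dict[str, int] = {"A": 530, "C": 375, "T": 663, "G": 432}


def gc_percent_from_counts(counts: Dict[str, int]) -> float:
    """Overall GC content (percent) from a dictionary of base counts."""
    total = sum(counts.values())
    return 100.0 * (counts["G"] + counts["C"]) / total


def sliding_gc(seq: str, window: int = 300, step: int = 1) -> List[float]:
    """GC content (percent) of each full window along the sequence.

    The 300 bp default mirrors the window size named in the report;
    the step size is a hypothetical choice made for this sketch.
    """
    seq = seq.upper()
    values: List[float] = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        values.append(100.0 * gc / window)
    return values


if __name__ == "__main__":
    print(f"upstream GC:   {gc_percent_from_counts(UPSTREAM):.2f}%")
    print(f"downstream GC: {gc_percent_from_counts(DOWNSTREAM):.2f}%")
```

With the reported counts, the overall GC content works out to 39.9% upstream and 40.35% downstream. These are whole-region averages only; checking for local high-GC stretches would require running `sliding_gc` on the actual flanking sequences, which the report does not include.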