Human-Specific Tandem Repeat Expansion and Differential Gene Expression During Primate Evolution

Arvis Sulovari^a, Ruiyang Li^a, Peter A. Audano^a, David Porubsky^a, Mitchell R. Vollger^a, Glennis A. Logsdon^a, Human Genome Structural Variation Consortium^1, Wesley C. Warren^b, Alex A. Pollen^c, Mark J. P. Chaisson^a,d, and Evan E. Eichler^a,e,2

^a Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195; ^b Bond Life Sciences Center, University of Missouri, Columbia, MO 65201; ^c Department of Neurology, University of California, San Francisco, CA 94143; ^d Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089; and ^e Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195

Edited by Stephen T. Warren, Emory University School of Medicine, Atlanta, GA, and approved October 1, 2019 (received for review July 17, 2019)

GENETICS

Short tandem repeats (STRs) and variable number tandem repeats (VNTRs) are important sources of natural and disease-causing variation, yet they have been problematic to resolve in reference genomes and genotype with short-read technology. We created a framework to model the evolution and instability of STRs and VNTRs in apes. We phased and assembled 3 ape genomes (chimpanzee, gorilla, and orangutan) using long-read and 10x Genomics linked-read sequence data for 21,442 human tandem repeats discovered in 6 haplotype-resolved assemblies of Yoruban, Chinese, and Puerto Rican origin. We define a set of 1,584 STRs/VNTRs expanded specifically in humans, including large tandem repeats affecting coding and noncoding portions of genes (e.g., MUC3A, CACNA1C). We show that short interspersed nuclear element–VNTR–Alu (SVA) retrotransposition is the main mechanism for distributing GC-rich human-specific tandem repeat expansions throughout the genome but with a bias against genes. In contrast, we observe that VNTRs not originating from retrotransposons have a propensity to cluster near genes, especially in the subtelomere. Using tissue-specific expression from human and chimpanzee brains, we identify genes where transcript isoform usage differs significantly, likely caused by cryptic splicing variation within VNTRs. Using single-cell expression from cerebral organoids, we observe a strong effect for genes associated with transcription profiles analogous to intermediate progenitor cells. Finally, we compare the sequence composition of some of the largest human-specific repeat expansions and identify 52 STRs/VNTRs with at least 40 uninterrupted pure tracts as candidates for genetically unstable regions associated with disease.

tandem repeat | STR | VNTR | tandem repeat expansion | genome instability

Significance

Short tandem repeats (STRs) and variable number tandem repeats (VNTRs) are among the most mutable regions of our genome but are frequently underascertained in studies of disease and evolution. Using long-read sequence data from apes and humans, we present a sequence-based evolutionary framework for ∼20,000 phased STRs and VNTRs. We identify 1,584 tandem repeats that are specifically expanded in human lineage. We show that VNTRs originate by short interspersed nuclear element–VNTR–Alu retrotransposition or accumulate near genes in subtelomeric regions. We identify associations with expanded tandem repeats and genes differentially spliced or expressed between human and chimpanzee brains. We identify 52 loci with long, uninterrupted repeats (≥40 pure tandem repeats) as candidates for genetically unstable regions associated with disease.

Author contributions: A.S. and E.E.E. designed research; A.S. performed research; A.S., R.L., G.A.L., W.C.W., A.A.P., and M.J.P.C. contributed new reagents/analytic tools; A.S., R.L., P.A.A., D.P., and M.R.V. analyzed data; D.P. helped with enrichment analysis; HGSVC provided early data access; A.A.P. helped with brain organoid expression analysis; and A.S. and E.E.E. wrote the paper.

Competing interest statement: E.E.E. is on the scientific advisory board of DNAnexus, Inc.

This article is a PNAS Direct Submission.

Published under the PNAS license.

Data deposition: All long-read human and nonhuman primate assemblies of the short tandem repeats/variable number tandem repeats presented in this study were aligned against the GRCh38 and deposited in Zenodo (https://zenodo.org/record/3401477).

^1 A complete list of the Human Genome Structural Variation Consortium can be found in the SI Appendix.

^2 To whom correspondence may be addressed. Email: [email protected].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1912175116/-/DCSupplemental.

www.pnas.org/cgi/doi/10.1073/pnas.1912175116

Short tandem repeats (STRs) and variable number tandem repeats (VNTRs), also referred to as micro- and minisatellites (1, 2), are operationally defined as tandemly repeating units of DNA of 1 to 6 and ≥7 bp in length, respectively (3). The mutation rates among these tandem repeats can be several orders of magnitude higher than the unique portions of the genome, ranging from $10^{-6}$ to $10^{-2}$ nucleotides per generation in STRs (4, 5). The mutation rate for a given locus can vary widely, while the longest and purest tandem repeat tract often defines the most unstable STRs and VNTRs (6–8). As a result, STRs/VNTRs have long been recognized among the most polymorphic markers of genomes. They are also an important source of genomic instability associated with several human disorders, including repeat expansion disorders, due to their tendency to expand through replication slippage, DNA repair, or nonallelic homologous recombination (9). Tandem repeats can also harbor cryptic disease-causing variation in the form of single-nucleotide variants (SNVs) (10) or short insertions and deletions (indels) (11, 12), emphasizing the importance of accurately predicting both their size and sequence composition.

Despite their established importance in population genetics and disease association, tandem repeats, particularly VNTRs, are among the least characterized forms of genetic variation in the human genome (13, 14). Their repetitive nature and sometimes extreme GC content make them particularly challenging to sequence and assemble with standard whole-genome shotgun sequencing assembly strategies, including next generation sequencing approaches that depend on bridge amplification (15). Their large size, often many kilobases in length, and their inherent instability during clonal propagation through Escherichia coli vectors have limited their accurate representation in the human reference genome, which was largely dependent on hierarchical bacterial artificial chromosome (BAC) clone sequence and assembly (16, 17). As a result, recent surveys of human genomes using orthogonal single-molecule long-read sequencing technologies (18, 19) have shown that the length and number of these repeats have been systematically underestimated. Because their length and purity are critical to determining their mutability and inherent instability, accurate haplotype resolution of larger alleles is important (20).

The goal of this study is 2-fold: 1) produce a high-quality set of haplotype-resolved tandem repeat loci in the human genome with a specific emphasis on those that are misrepresented in the current human reference genome and 2) generate an evolutionary framework for their origin by establishing the likely ape ancestral state of each allele. As a starting point, we leveraged the haplotype-resolved structural variants (SVs) from 3 diverse individuals generated as part of the Human Genome Structural Variation Consortium (HGSVC) (20). Next, we generated haplotype-resolved sequences for the homologous loci in nonhuman primates (NHPs) using a combination of PacBio long reads and 10x Genomics linked reads from individuals of 3 species: chimpanzee, gorilla, and orangutan (21). This comparative analysis allowed us to delineate human-specific tandem repeat expansions and further investigate their potential effects on gene expression and splicing using single-cell and tissue-specific RNA sequencing (RNA-seq) of human and NHP [...]

[...] the 7 human and 6 NHP haplotypes, with the goal of identifying human-specific expansions (HSEs) of tandem repeats. SI Appendix has more details.

We also identified ab initio human repeats; these consisted of loci with tandem repeats content in all human samples and no tandem repeat content in any of the NHPs’ homologous loci. Due to the filters used above for both ab initio and HSE, we expect to observe a set of tandem repeats that are differently expanded in humans while also being ab initio, which leads to some STRs/VNTRs being classified in both categories. Thus, we took into account this redundancy between the 2 categories to avoid double counting in the downstream analyses.

A subset of the tandem repeats is expected to differ in both length and sequence composition from the human genome reference. To estimate the number of reference collapses or misassemblies in GRCh38, we counted STRs/VNTRs with ≥2-fold as many tandem copies in the shortest human haplotype than in GRCh38 (i.e., collapsed regions) and repeat unit sequence with ≤90% sequence identity between the human haplotypes and the reference (i.e., misassembled). Lastly, a tandem repeat locus was classified as polymorphic if its standard deviation of copy numbers across the human haplotypes was ≥10th percentile.

Differential [...]
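
The thresholds stated in the text (STR versus VNTR by repeat-unit length, collapsed, misassembled, and polymorphic loci) can be illustrated with a minimal Python sketch. This is not the authors' pipeline. All function names are hypothetical, copy numbers and unit identity are assumed to be precomputed inputs, the population standard deviation is an assumed choice, and the "10th percentile" is assumed to refer to the distribution of per-locus standard deviations across all loci, which the text does not spell out.

```python
from math import ceil
from statistics import pstdev


def repeat_class(unit_len: int) -> str:
    """STR for repeat units of 1 to 6 bp, VNTR for units of 7 bp or more."""
    if unit_len < 1:
        raise ValueError("repeat unit length must be at least 1 bp")
    return "STR" if unit_len <= 6 else "VNTR"


def is_collapsed(hap_copies, ref_copies):
    """Collapsed in the reference: even the shortest human haplotype
    carries at least 2-fold as many copies as GRCh38.
    Assumes ref_copies is a positive copy number."""
    return min(hap_copies) >= 2 * ref_copies


def is_misassembled(unit_identity):
    """Misassembled: repeat-unit identity to the reference of 90% or less.
    unit_identity is a fraction in [0, 1]; how it is computed is not
    specified in the text, so it is taken as an input here."""
    return unit_identity <= 0.90


def nearest_rank_percentile(values, pct):
    """Nearest-rank percentile (an assumed, simple percentile definition)."""
    ordered = sorted(values)
    rank = max(1, ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def polymorphic_loci(copies_by_locus, pct=10.0):
    """Loci whose copy-number SD across haplotypes is at or above the
    pct-th percentile of SDs over all loci (assumed interpretation)."""
    sds = {locus: pstdev(copies) for locus, copies in copies_by_locus.items()}
    cutoff = nearest_rank_percentile(sds.values(), pct)
    return {locus for locus, sd in sds.items() if sd >= cutoff}
```

For example, `is_collapsed([20, 24, 30], 10)` holds because the shortest haplotype has 20 copies against 10 in the reference, whereas `is_collapsed([19, 24], 10)` does not. With only a handful of loci the 10th-percentile cutoff is simply the smallest standard deviation, so the polymorphic filter only becomes informative on a genome-scale set.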