Structural Variation in the 3D Genome
Total Page:16
File Type:pdf, Size:1020Kb
REVIEWS TRANSLATIONAL GENETICS Structural variation in the 3D genome Malte Spielmann 1, Darío G. Lupiáñez2 and Stefan Mundlos 3,4* Abstract | Structural and quantitative chromosomal rearrangements, collectively referred to as structural variation (SV), contribute to a large extent to the genetic diversity of the human genome and thus are of high relevance for cancer genetics, rare diseases and evolutionary genetics. Recent studies have shown that SVs can not only affect gene dosage but also modulate basic mechanisms of gene regulation. SVs can alter the copy number of regulatory elements or modify the 3D genome by disrupting higher- order chromatin organization such as topologically associating domains. As a result of these position effects, SVs can influence the expression of genes distant from the SV breakpoints, thereby causing disease. The impact of SVs on the 3D genome and on gene expression regulation has to be considered when interpreting the pathogenic potential of these variant types. Structural variation The application of molecular karyotyping and, more the position and/or function of cis- regulatory elements, (SV). Genetic variation that recently, next- generation sequencing (NGS) has such as promoters and enhancers. Owing to advances in includes all structural and revealed the enormous structural complexity of the 3D genome mapping technologies, it is now becoming quantitative chromosomal human genome1–3. Structural and quantitative chromo- increasingly evident that position effects are the result of rearrangements, that is, deletions and duplications, as somal rearrangements, collectively referred to as much more complex alterations than just changes in the well as copy- number-neutral structural variation (SV), include deletions, duplications, linear genome. Recent findings have shown that posi- aberrations, such as inversions, inversions, insertions and translocations and make tion effects can be understood only by taking the third insertions and translocations. up the majority of varying nucleotides among dimension of chromosomes into account — the folding human genomes4. Unbalanced rearrangements, also of chromatin in the 3D space of the nucleus. known as copy number variation (CNV), alter the dip- In this Review, we first discuss how SVs can rear- loid status of DNA by changing the copy number of range the order and number of genes and regulatory chromosomes or chromosomal regions5, whereas elements, resulting in human disease. We then describe balanced rearrangements such as inversions, recip- in detail how SVs can modify the 3D organization of rocal translocations or copy- number-neutral inser- the genome by disrupting chromatin domains. We tions do not result in loss or gain of genetic material. also describe the phenotypic consequences of genomic Germline SVs ranging from whole- chromosome disorders resulting from reshuffling of non- coding 1Department of Genome abnormalities, such as trisomy 13, 18 or 21, to recur- enhancer sequences and chromatin domain bounda- Sciences, University of rent microdeletion and microduplication syndromes are ries with the aim of presenting possible strategies for Washington, Seattle, 6–10 WA, USA. a common cause of congenital disease , and high the medical interpretation of SVs in the 3D genome. 2Epigenetics and Sex levels of somatic SVs are a key signature of human cancer We do not review the mechanisms underlying chromo- 1,11 Development Group, Berlin genomes . High- throughput sequencing technologies some organization, nor do we discuss methods for SV Institute for Medical Systems have dramatically accelerated the discovery and charac- detection by chromosome conformation capture (3C) or Biology, Max- Delbrück terization of SVs; however, the medical interpretation related technologies; for recent reviews on these topics, Center for Molecular of SVs and the prediction of phenotypic consequences see REFS4,18–22. Medicine, Berlin- Buch, 12 Germany. remain unsatisfactory . Structural variation in disease 3Max Planck Institute for The discovery that SVs can be pathogenic without Molecular Genetics, RG changing coding genome sequences first indicated that Great progress has been made in the detection of single Development & Disease, SVs can have regulatory effects, which were explained by nucleotide variants in health and disease. However, the Berlin, Germany. so-called position effects13,14. Indeed, 98% of the genome is genetic analysis of SV has been limited by technical chal- 4Institute for Medical and non-coding 15, and a large part of the non-coding genome lenges. Although array comparative genomic hybridiza- Human Genetics, Charité is, in one way or another, involved in gene regulation. tion (array CGH) is effectively used in routine clinical Universitätsmedizin Berlin, 5,10 Berlin, Germany. The complexity of assigning function to non- coding diagnostics for the detection of CNVs , this technique *e- mail: mundlos@ regions makes accurate estimates difficult; the fraction has limitations owing to its fairly low resolution, its ina- molgen.mpg.de of the genome implicated in gene regulation has been bility to detect balanced rearrangements and its low effi- 16 17 23 https://doi.org/10.1038/ estimated to be 5% or 80% depending on the study. cacy in mosaic individuals . Whole-genome sequencing s41576-018-0007-0 Nonetheless, SVs have a high probability of affecting (WGS) by short- read technology has the potential to NATURE REVIEWS | GENETICS VOLUME 19 | JULY 2018 | 453 © 2018 Macmillan Publishers Limited, part of Springer Nature. All rights reserved. REVIEWS Copy number variation detect all rearrangements but lacks the ability to detect even within the same data sets, highlights the current (CNV). Genetic variation that breakpoints in repetitive regions, where many break- uncertainty in both the mutation rate of de novo SVs refers only to quantitative points occur24. Long- read sequencing technology, for and their impact on disease. chromosomal rearrangements, example, single molecule real time sequencing or nano- To establish disease- specific SV morbidity maps such as deletions and duplications. pore sequencing, is expected to overcome these problems of the human genome, several large initiatives such as but currently remains too expensive for routine clinical the 100,000 Genomes Project42 and the Deciphering Microdeletion and application25,26. Developmental Disorders (DDD) study43 are evaluat- microduplication syndromes ing the amount of de novo SVs in a variety of patient A group of syndromes that are De novo structural variation cohorts. Equivalent CNV morbidity maps have proved caused by chromosomal microdeletions and One important question in human genetics is how much to be extremely useful for the clinical interpretation of 6,7,10,31 microduplications, which make variation is normally present in the human genome and CNVs . The most comprehensive SV database to up a subset of copy number how much is specific to disease. Using all currently date, the Database of Genomic Variants (DGV) provides variations that are usually available methodologies to define the full spectrum of a catalogue of SVs found in the genomes of control indi- smaller than 5 Mb. Classic 44 examples include the 7q11 human genetic variation, >30,000 SVs can be detected viduals from diverse populations . The current version 27 deletion (Williams–Beuren per human genome . In a clinical diagnostic setting, includes >22,300 genomes. syndrome), the 15q11–15q13 following their identification, each variant needs to be Although de novo SVs are more likely to cause deletion (Prader–Willi and carefully evaluated according to the specific phenotype disease, it should be noted that a considerable num- Angelman syndromes) and and family history of the patient, as well as in terms of ber of recurrent microdeletion syndromes exhibit the 17p11 deletion penetrance (Smith–Magenis syndrome). the gene content and genetic context of the variant, in reduced or high clinical variability. For order to assign pathogenicity to an SV. To differenti- example, only 15% of 15q13.3 microdeletions arise Position effects ate polymorphic from potentially disease- causing SVs, de novo, and they present with a wide range and sever- Effects of structural variation it is essential to validate whether an SV has occurred ity of phenotypes45. Reduced penetrance and high clin- (SV), classically translocations, on the expression of a gene de novo or whether it is inherited, as de novo SVs are ical variability represent key challenges for the medical 10 without any changes to its more frequently associated with disease . Large- scale interpretation of SVs. A possible explanation for reduced coding sequence or promoter studies have established a role for de novo CNVs in the penetrance can be underlying recessive or biallelic inher- region. These effects can also aetiology of many diseases, including intellectual disabil- itance, which was described for thrombocytopenia be observed if a gene is ity28,29, autism spectrum disorders (ASDs)6,30, congenital absent radius syndrome46, for example. inserted into different regions 7 8,9 of the genome or if SVs heart disease , sporadic schizophrenia and congenital 31 connect previously limb malformation . For ASDs, approximately 3.7% Gene dosage unconnected genes and their of affected individuals were found to carry a large The pathogenicity of SVs that disrupt or include cod- regulatory units. In these cases, (>500 kb) de novo CNV, in contrast to only 0.4−0.8% ing sequences is currently interpreted