Global Patterns of Copy Number Variation in Humans from a Population-Based Analysis

Total Page:16

File Type:pdf, Size:1020Kb

Global Patterns of Copy Number Variation in Humans from a Population-Based Analysis Global patterns of copy number variation in humans from a population-based analysis. ICHG Kyoto Jean Monlong April 5, 2016 BOURQUE LAB MCGILL UNIVERSITY HUMAN GENETICS DEPT. Disclosure Information I have no financial relationships to disclose 2 Copy-Number Variation Copy-Number Variation 3 Copy Number Variation (CNV) Imbalanced genetic variation involving more than 500bp. Copy-Number Variation 4 CNV detection from High-Throughput Sequencing Baker 2012, Nature Methods. Copy-Number Variation 5 More prone to CNV. Enriched in Segmental Duplications (Sharp Annual Review 2006). Short Tandem Repeats highly polymorphic (Warbuton BMC Genomics 2008). Transposons involved in CNV formation (Sen AJHG 2006). Involved in phenotype and disease. Short Tandem Repeats and gene expression (Gymrek Nat. Genetics 2016). Repeats CNV involved in ∼30 genetic disorders (Mirkin Nature 2007). Retrotransposition in cancer (Lee Science 2012). Low-mappability regions Repeat-rich regions, centromeres, telomeres. ∼13% of the human genome. Copy-Number Variation 6 Involved in phenotype and disease. Short Tandem Repeats and gene expression (Gymrek Nat. Genetics 2016). Repeats CNV involved in ∼30 genetic disorders (Mirkin Nature 2007). Retrotransposition in cancer (Lee Science 2012). Low-mappability regions Repeat-rich regions, centromeres, telomeres. ∼13% of the human genome. More prone to CNV. Enriched in Segmental Duplications (Sharp Annual Review 2006). Short Tandem Repeats highly polymorphic (Warbuton BMC Genomics 2008). Transposons involved in CNV formation (Sen AJHG 2006). Copy-Number Variation 6 Low-mappability regions Repeat-rich regions, centromeres, telomeres. ∼13% of the human genome. More prone to CNV. Enriched in Segmental Duplications (Sharp Annual Review 2006). Short Tandem Repeats highly polymorphic (Warbuton BMC Genomics 2008). Transposons involved in CNV formation (Sen AJHG 2006). Involved in phenotype and disease. Short Tandem Repeats and gene expression (Gymrek Nat. Genetics 2016). Repeats CNV involved in ∼30 genetic disorders (Mirkin Nature 2007). Retrotransposition in cancer (Lee Science 2012). Copy-Number Variation 6 PopSV approach PopSV approach 7 PopSV approach Objective Test the entire genome, including low-mappability regions, and detect subtle abnormal coverage. PopSV: Population-based approach Use a set of reference experiments to detect abnormal patterns. sample reference tested number of reads mapped number genomic window PopSV approach 8 Benchmark and validation Existing methods FREEC LASSO-based segmentation; GC and mappability correction. cn.MOPS Multi-sample Bayesian-based segmentation. Whole-Genome Sequencing data 45 samples, including 10 twin families (i.e 2 twins + 2 parents). 95 pairs of normal/tumor samples from Renal Cell Carcinoma (CageKid). PopSV approach 9 Benchmark and validation Replication in the twins. Concordance with pedigree. Replication in the paired tumor. Concordance of different bin sizes PCR validation. Overall performance and in different repeat context. PopSV approach 10 Validation conclusions PopSV detects 3-5x more variants. Wider genomic range. Robust across challenging regions: Low-coverage. Segmental duplications. DNA satellites. Short tandem repeats GC-rich/poor. Resolution down to half the bin size. PopSV approach 11 CNV patterns in normal genomes CNV patterns in normal genomes 12 Where are CNVs located ? In Centromere ? Telomere ? Segmental duplication ? DNA satellites ? Short tandem repeats ? Transposable Elements ? Exons ? Promoters ? Control regions Same size distribution. Randomly distributed. CNV in normal genomes 640 normal genomes 45 samples from the Twin study (∼40X) 95 normal samples from Renal Cell Carcinoma (∼54X). 500 unrelated samples from GoNL (∼14X). CNV patterns in normal genomes 13 Control regions Same size distribution. Randomly distributed. CNV in normal genomes 640 normal genomes 45 samples from the Twin study (∼40X) 95 normal samples from Renal Cell Carcinoma (∼54X). 500 unrelated samples from GoNL (∼14X). Where are CNVs located ? In Centromere ? Telomere ? Segmental duplication ? DNA satellites ? Short tandem repeats ? Transposable Elements ? Exons ? Promoters ? CNV patterns in normal genomes 13 CNV in normal genomes 640 normal genomes 45 samples from the Twin study (∼40X) 95 normal samples from Renal Cell Carcinoma (∼54X). 500 unrelated samples from GoNL (∼14X). Where are CNVs located ? In Centromere ? Telomere ? Segmental duplication ? DNA satellites ? Short tandem repeats ? Transposable Elements ? Exons ? Promoters ? Control regions Same size distribution. Randomly distributed. CNV patterns in normal genomes 13 Enriched close to Centromere/Telomere/Gap (CTG) 1.00 0.75 0.50 cumulative proportioncumulative 0.25 region CNV 0.00 control 0e+00 2e+07 4e+07 6e+07 distance to centromere/telomere/gap (bp) CNV patterns in normal genomes 14 Enriched in SD and low-coverage regions CNV patterns in normal genomes 15 Same proportion overlapping a segmental duplication. Similar distance to CTG. Going further 1. Control for the SD and CTG patterns. 2. Look at other repeat classes. Control regions Randomly distributed. Same size distribution. CNV patterns in normal genomes 16 Going further 1. Control for the SD and CTG patterns. 2. Look at other repeat classes. Control regions Randomly distributed. Same size distribution. Same proportion overlapping a segmental duplication. Similar distance to CTG. CNV patterns in normal genomes 16 Controlling for SD and distance to CTG CNV patterns in normal genomes 17 Controlling for SD and distance to CTG CNV patterns in normal genomes 17 Controlling for SD and distance to CTG Satellites enrichment driven by ALR/Alpha, (GAATG)n/(CATTC)n families. Short Tandem Repeats Enrichment distributed across families... ... but stronger for larger STR. Transposable elements (TE): SVA class enriched. Expected: L1HS, L1PA2 to L1PA5. Surprises: HERVH, LTR38, LTR4. CNV patterns in normal genomes 18 Repeat CNVs and protein-coding genes Genes with CNVs Set CNVs Exon + Promoter + Intron All CNVs 91733 7206 11341 13259 Low coverage 26888 682 1151 1977 Extremely low coverage 10010 347 465 521 STR 4286 45 286 748 Satellite 1822 2 21 33 TE 20491 164 1747 3998 STR/Satellite/TE 22313 166 1760 4014 Repeat CNV: more than 90% of the CNV is annotated as repeat. CNV patterns in normal genomes 19 Conclusion Conclusion 20 In normal genomes: CNVs enriched in low coverage regions. Specific enrichment in satellites, simple repeats, TEs. Not due to segmental duplication enrichment. Replicated across datasets but different from somatic patterns. Some CNVs in low coverage regions or repeats hit exonic sequence. Summary PopSV uses reference samples. detects more CNVs. is robust across the entire genome. Conclusion 21 Summary PopSV uses reference samples. detects more CNVs. is robust across the entire genome. In normal genomes: CNVs enriched in low coverage regions. Specific enrichment in satellites, simple repeats, TEs. Not due to segmental duplication enrichment. Replicated across datasets but different from somatic patterns. Some CNVs in low coverage regions or repeats hit exonic sequence. Conclusion 21 Guillaume Bourque Simon Gravel Mathieu Bourgey Mathieu Blanchette Louis Letourneau Francois Lefebvre Eric Audemard Toby Hocking Simon Girard Patrick Cossette Guy Rouleau Caroline Meloche 23 Workflow 24 Replication in twins 25 Robust across challenging regions 1.00 1.00 0.75 0.75 PopSV set PopSV set 0.50 call 0.50 call null null 0.25 0.25 proportion of regions with concordant samples proportion of regions with concordant samples 0.00 0.00 low expected high [0,0.2] (0.2,0.4] (0.4,0.6] (0.6,0.8] (0.8,1] coverage class GC content 1.00 1.00 0.75 0.75 PopSV set PopSV set 0.50 call 0.50 call null null 0.25 0.25 proportion of regions with concordant samples proportion of regions with concordant samples 0.00 0.00 [0,0.2] (0.2,0.4] (0.4,0.6] (0.6,0.8] (0.8,1] [0,0.2] (0.2,0.4] (0.4,0.6] (0.6,0.8] (0.8,1] segmental duplication proportion simple repeat proportion 26 Robust across challenging regions PopSV 0.8 0.6 0.4 0.2 0.0 ● ● ● ● ● ● ● ● ● ● other5 other1 other3 other2 other4 1652−Twin1 1652−Twin2 1480−Twin2 1480−Twin1 1389−Twin1 1389−Twin2 1207−Twin1 1207−Twin2 1286−Twin1 1286−Twin2 1301−Twin1 1301−Twin2 1323−Twin1 1323−Twin2 1443−Twin2 1443−Twin1 1121−Twin1 1121−Twin2 1490−Twin1 1490−Twin2 1652−Father 1207−Father 1286−Father 1389−Father 1301−Father 1480−Father 1323−Father 1443−Father 1121−Father 1490−Father 1652−Mother 1480−Mother 1389−Mother 1207−Mother 1286−Mother 1301−Mother 1323−Mother 1443−Mother 1121−Mother 1490−Mother sample Father ● Mother Twin family ● 1121 ● 1207 ● 1286 ● 1301 ● 1323 ● 1389 ● 1443 ● 1480 ● 1490 ● 1652 Using only CNVs in extremely low coverage regions ! 27 Resolution - 500 bp bins Vs 5 Kbp bins 1.00 0.75 0.50 0.25 proportion 5kbp−bin calls overlapping 0.00 0 2500 5000 7500 10000 12500 15000 17500 20000 size of the 500bp−bin call 28 Control regions QC − SD, low−coverage and CTG distance control 1.00 0.75 set 0.50 CNV control 0.25 proportion the feature overlapping 0.00 lowmap segdup feature 29 Control regions QC − SD, low−coverage and CTG distance control 1.00 0.75 0.50 cumulative proportioncumulative 0.25 region CNV 0.00 control 0e+00 2e+07 4e+07 6e+07 distance to centromere/telomere/gap (bp) 30 Control regions S/2 S/2 S/2 S/2 S/2 S/2 Random region of size S = Random base in green S 31 Controlling for SD and distance to CTG SINE SVA TE LTR LINE DNA SVA_F SVA_E SVA_D MER65A LTR4 TE top families LTR38−int L1PA5 L1PA4 L1PA3 L1PA2 L1HS HERVH−int AluY Twins CK Normal GoNL CK Somatic cohort Significance (−log10 Pvalue) 4 8 12 Depleted Enriched 32 Controlling for SD and distance to CTG TAR1 SUBTEL_sa SST1 SATR2 SATR1 SAR REP522 MSR1 LSAU Satellite HSATII HSATI HSAT6 HSAT5 HSAT4 GSATX GSATII GSAT D20S16 CER BSR/Beta ALR/Alpha ACRO1 (GAATG)n (CATTC)n other TTTC TTCT TTCC TCTT TCTA TCCT TATC TAGA TA GGAA GCG GATA STR GAAG GAAA CTTT CTTC CTG CGC CCTT CAG ATCT ATAG AT AGGA AGAT AGAA AAGG AAGA AAAG Twins CK Normal GoNL CK Somatic cohort Depleted Enriched Significance (−log10 Pvalue) 0 4 8 12 33 Distance to CTG for somatic CNVs 1.00 0.75 0.50 cumulative proportioncumulative 0.25 region CNV 0.00 control 0e+00 2e+07 4e+07 6e+07 distance to centromere/telomere/gap (bp) 34.
Recommended publications
  • Genome‐Wide Copy Number Variation Analysis in Early Onset Alzheimer's
    Genome‐wide Copy Number Variation Analysis in Early Onset Alzheimer’s disease A thesis Submitted to the Faculty of Drexel University by Basavaraj V. Hooli in partial fulfillment of the requirements for the degree of Doctorate of Philosophy August 2011 © Copyright 2011 Basavaraj Hooli. All Rights Reserved. iii Dedication To my family, mentors and friends for their enduring encouragement, love and support. iv Acknowledgements Thankful acknowledgments are owed to some really awesome people. First and foremost, to my incredible mentors over the past years – Drs. Rudy Tanzi, Lars Bertram and Aleister Saunders. I would like to express sincere gratitude to Rudy for the opportunity to pursue PhD in his illustrious lab, for being an inspiring mentor, all the support, patience and guidance over the years. I will be always indebted to Lars for all the knowledge and training in Alzheimer’s genetics and conducting methodical and systematic research – it is an absolute priceless experience. I cannot thank Aleister enough for introducing me to the field of scientific research, for providing a strong foundation in basics of biological research in such a short duration of time, and for the continued advice and counsel. I will always be grateful for the contribution of my mentors to my intellectual and professional development – I feel privileged to have them as my mentors. My sincere thanks to the committee members Drs. Guillermo Alexander, Jacob Russell and Daniel Marenda for their support and valuable input towards successful and timely completion of the project. I appreciate meeting some of the smartest minds and nicest people during these past years ‐ Sara Ansaloni, Neha Patel, Ranjita Mukherjee, Preeti v Khandelwal, Trinna Cuellar in Aleisterʹs lab.
    [Show full text]
  • Novel Copy-Number Variations in Pharmacogenes Contribute to Interindividual Differences in Drug Pharmacokinetics
    ORIGINAL RESEARCH ARTICLE © American College of Medical Genetics and Genomics Novel copy-number variations in pharmacogenes contribute to interindividual differences in drug pharmacokinetics María Santos, MSc1, Mikko Niemi, PhD2, Masahiro Hiratsuka, PhD3, Masaki Kumondai, BSc3, Magnus Ingelman-Sundberg, PhD4, Volker M. Lauschke, PhD4 and Cristina Rodríguez-Antona, PhD1,5 Purpose: Variability in pharmacokinetics and drug response is of the genes studied. We experimentally confirmed novel deletions shaped by single-nucleotide variants (SNVs) as well as copy- in CYP2C19, CYP4F2, and SLCO1B3 by Sanger sequencing and number variants (CNVs) in genes with importance for drug validated their allelic frequencies in selected populations. absorption, distribution, metabolism, and excretion (ADME). Conclusion: CNVs are an additional source of pharmacogenetic While SNVs have been extensively studied, a systematic assessment variability with important implications for drug response and of the CNV landscape in ADME genes is lacking. personalized therapy. This, together with the important contribu- Methods: We integrated data from 2,504 whole genomes from the tion of rare alleles to the variability of pharmacogenes, emphasizes 1000 Genomes Project and 59,898 exomes from the Exome the necessity of comprehensive next-generation sequencing–based Aggregation Consortium to identify CNVs in 208 relevant genotype identification for an accurate prediction of the genetic pharmacogenes. variability of drug pharmacokinetics. Results: We describe novel exonic deletions
    [Show full text]
  • A Visualization and Annotation Tool for Copy Number Variation from Whole-Genome Sequencing Ryan L
    Bioinformatics, YYYY, 0–0 doi: 10.1093/bioinformatics/xxxxx Advance Access Publication Date: DD Month YYYY Applications Note Genome Analysis CNView: a visualization and annotation tool for copy number variation from whole-genome sequencing Ryan L. Collins1, Matthew R. Stone1, Harrison Brand1,2, Joseph T. Gless- ner1,2, Michael E. Talkowski1,2,3,* 1Psychiatric and Neurodevelopmental Genetics Unit and Molecular Neurogenetics Unit, Center for Human Ge- netic Research, Massachusetts General Hospital, Boston, MA 02114, USA, 2Department of Neurology, Harvard Medical School, Boston, MA 02114, USA, 3Program in Medical and Population Genetics, Broad Institute, Cam- bridge, MA 02141, USA *To whom correspondence should be addressed. Associate Editor: XXXXXXX Received on XXXXX; revised on XXXXX; accepted on XXXXX Abstract Summary: Copy number variation (CNV) is a major component of structural differences between individual genomes. The recent emergence of population-scale whole-genome sequencing (WGS) datasets has enabled genome-wide CNV delineation. However, molecular validation at this scale is impractical, so visualization is an invaluable preliminary screening approach when evaluating CNVs. Standardized tools for visualization of CNVs in large WGS datasets are therefore in wide demand. Methods & Results: To address this demand, we developed a software tool, CNView, for normalized visualization, statistical scoring, and annotation of CNVs from population-scale WGS datasets. CNView surmounts challenges of sequencing depth variability between individual libraries by locally adapting to cohort-wide variance in sequencing uniformity at any locus. Importantly, CNView is broadly extensible to any reference genome assembly and most current WGS data types. Availability and Implementation: CNView is written in R, is supported on OS X, MS Windows, and Linux, and is freely distributed under the MIT license.
    [Show full text]
  • Systematic Analysis of Copy Number Variation Associated with Congenital Diaphragmatic Hernia
    Systematic analysis of copy number variation associated with congenital diaphragmatic hernia Qihui Zhua,1, Frances A. Highb,c,d,1, Chengsheng Zhanga,1, Eliza Cerveiraa, Meaghan K. Russellb, Mauro Longonib,d, Maliackal P. Joyb, Mallory Ryana, Adam Mil-homensa, Lauren Bellfya, Caroline M. Colettib, Pooja Bhayanib, Regis Hilab, Jay M. Wilsonc,d, Patricia K. Donahoeb,d,2,3, and Charles Leea,e,2,3 aThe Jackson Laboratory for Genomic Medicine, Farmington, CT 06032; bPediatric Surgical Research Laboratories, Massachusetts General Hospital, Boston, MA 02114; cDepartment of Surgery, Boston Children’s Hospital, Boston, MA 02115; dHarvard Medical School, Boston, MA 02115; and eDepartment of Life Sciences, Ewha Womans University, Seoul 03760, South Korea Contributed by Patricia K. Donahoe, April 4, 2018 (sent for review August 22, 2017; reviewed by Sally A. Camper and Deborah Krakow) Congenital diaphragmatic hernia (CDH), characterized by malfor- mouse models were also later found to harbor rare and predicted mation of the diaphragm and hypoplasia of the lungs, is one of the pathogenic mutations in patients with CDH (8, 10, 13). most common and severe birth defects, and is associated with high Numerous studies have implicated copy number variants morbidity and mortality rates. There is growing evidence demon- (CNVs) in human disease with associated phenotypes ranging strating that genetic factors contribute to CDH, although the patho- from cognitive disabilities to predispositions to obesity, cancer, and genesis remains largely elusive. Single-nucleotide polymorphisms otherdiseases(14–17). Both inherited and de novo CNVs play a have been studied in recent whole-exome sequencing efforts, causative role in CDH (18–22), and at least 10% of CDH cases are but larger copy number variants (CNVs) have not yet been studied estimated to be caused by genomic imbalances (21).
    [Show full text]
  • Genome-Wide Mapping of Copy Number Variations and Loss of Heterozygosity Using the Infinium® Human1m Beadchip
    TECHNICAL NOTE: ILLUMINA® DNA ANALYSIS Genome-Wide Mapping of Copy Number Variations and Loss of Heterozygosity Using the Infinium® Human1M BeadChip Contributed by Ku Chee-Seng, Sim Xueling, and Chia Kee-Seng, Centre for Molecular Epidemiology and Department of Community, Occupational, and Family Medicine, Yong Loo Lin School of Medicine, National University of Singapore INTRODUCTION With only a preliminary understanding of the roles Genetic variations within the human genome can take CNVs and LOH play in complex disease development, it many forms, including single-nucleotide polymorphisms is imperative that we generate a comprehensive catalog (SNPs), copy number variations (CNVs), and copy-neutral of structural variations in the human genome. This loss of heterozygosity (LOH). SNPs involve the change in a approach may provide the opportunity to unravel novel single nucleotide, while CNVs and LOH encompass larger disease loci. To date, there has been little research into segments of DNA. In this application note, we focus on CNV information in Asian populations. Therefore, we methods for accurately mapping these structural variations have begun exploring the extent of CNVs in several and their potential involvement in disease manifestations. South-East Asian populations (Singaporean Chinese, CNVs, defined as additions or deletions in the number Malay, and Indian) with the goal of constructing a of copies of a particular segment of DNA (larger than genome-wide map reflecting CNVs and copy-neutral 1kb in length) when compared to a reference genome LOH within these populations. In our study, we demon- sequence, provide further insight into the complexity and strate the advantages of using high-density SNP arrays diversity of genetic variations.
    [Show full text]
  • Genome-Wide Analysis of Copy Number Variation in Latin American Parkinson’S Disease Patients
    medRxiv preprint doi: https://doi.org/10.1101/2020.05.29.20100859; this version posted June 2, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license . Genome-wide Analysis of Copy Number Variation in Latin American Parkinson’s Disease Patients 1 1 2 3,4,5 Elif Irem Sarihan ,​ Eduardo Pérez-Palma ,​ Lisa-Marie Niestroj ,​ Douglas Loesch ,​ Miguel ​ ​ ​ ​ 1 6 7,8 9,10 Inca-Martinez ,​ Andrea R. V. R. Horimoto ,​ Mario Cornejo-Olivas ,​ Luis Torres ,​ Pilar ​ ​ ​ ​ ​ 7,10 9,10 7 7 Mazzetti ,​ Carlos Cosentino ,​ Elison Sarapura-Castro ,​ Andrea Rivera-Valdivia ,​ Elena​ ​ ​ ​ ​ 11 12 12 13 14 Dieguez ,​ Victor Raggio ,​ Andres Lescano ,​ Vitor Tumas ,​ Vanderci Borges ,​ Henrique B. ​ ​ ​ ​ ​ 14 15 16 17 Ferraz ,​ Carlos R. Rieder ,​ Artur Schumacher-Schuh ,​ Bruno L. Santos-Lobato ,​ Carlos ​ ​ ​ ​ 18 18 18 18 Velez-Pardo ,​ Marlene Jimenez-Del-Rio ,​ Francisco Lopera ,​ Sonia Moreno ,​ Pedro ​ ​ ​ ​ 19 20 20 20 Chana-Cuevas ,​ William Fernandez ,​ Gonzalo Arboleda ,​ Humberto Arboleda ,​ Carlos E. ​ ​ ​ ​ 20 21,22 21,22 23 Arboleda-Bustos ,​ Dora Yearout ,​ Cyrus P. Zabetian ,​ Timothy A. Thornton ,​ Timothy D. ​ ​ ​ ​ 3,4,5 1,2,24,25 21,22^,1* O’Connor ,​ Dennis Lal ,​ Ignacio F. Mata on behalf of the Latin American Research ​ ​ ​ ​ Consortium on the Genetics of Parkinson’s Disease (LARGE-PD) 1 Lerner​ Research Institute, Genomic Medicine, Cleveland Clinic,
    [Show full text]
  • A High-Resolution X Chromosome Copy-Number Variation Map in Fertile Females and Women with Primary Ovarian Insufficiency
    © American College of Medical Genetics and Genomics ARTICLE A high-resolution X chromosome copy-number variation map in fertile females and women with primary ovarian insufficiency Svetlana A. Yatsenko, MD 1,2,3,4, Michelle Wood-Trageser, PhD1, Tianjiao Chu, PhD3, Huaiyang Jiang, MD3 and Aleksandar Rajkovic, MD, PhD1,2,3,4,5,6 Purpose: Sex-biased expression of genes on the X chromosome is sequences among POI females, highlighting structural differences accomplished by a complex mechanism of dosage regulation that between X chromosomes of fertile and POI females. Furthermore, leads to anatomical and physiological differences between males we discovered a ~4% carrier incidence for X-linked disorders and females. Copy-number variations (CNVs) may impact the among fertile women. human genome by either affecting gene dosage or disturbing a Conclusion: We constructed a high-resolution map of female- chromosome structural and/or functional integrity. specific CNVs that provides critical insights into the spectrum of Methods: We performed a high-resolution CNV profiling to human genetic variation, sex-specific disease risk factors, and investigate the X chromosome integrity in cohorts of 269 fertile reproductive potential. We discovered novel CNVs associated with females and 111 women affected with primary ovarian insufficiency ovarian dysfunction and support polygenic models for POI. (POI) and assessed CNVs impact into functional and nonfunctional genomic elements. Genetics in Medicine (2019) 21:2275–2284; https://doi.org/10.1038/s41436- Results: In POI patients, we observed a 2.5-fold enrichment for 019-0505-2 rare CNVs comprising ovary-expressed genes, and genes implicated in autoimmune response and apoptotic signaling.
    [Show full text]
  • S41467-021-21341-X.Pdf
    ARTICLE https://doi.org/10.1038/s41467-021-21341-x OPEN Telomeres reforged with non-telomeric sequences in mouse embryonic stem cells Chuna Kim1,2,7, Sanghyun Sung1,7, Jong-Seo Kim 1,3,7, Hyunji Lee1, Yoonseok Jung3, Sanghee Shin1,3, Eunkyeong Kim1, Jenny J. Seo1,3, Jun Kim 1, Daeun Kim4,5, Hiroyuki Niida6, V. Narry Kim 1,3, ✉ ✉ Daechan Park4,5 & Junho Lee1 Telomeres are part of a highly refined system for maintaining the stability of linear chro- 1234567890():,; mosomes. Most telomeres rely on simple repetitive sequences and telomerase enzymes to protect chromosomal ends; however, in some species or telomerase-defective situations, an alternative lengthening of telomeres (ALT) mechanism is used. ALT mainly utilises recombination-based replication mechanisms and the constituents of ALT-based telomeres vary depending on models. Here we show that mouse telomeres can exploit non-telomeric, unique sequences in addition to telomeric repeats. We establish that a specific subtelomeric element, the mouse template for ALT (mTALT), is used for repairing telomeric DNA damage as well as for composing portions of telomeres in ALT-dependent mouse embryonic stem cells. Epigenomic and proteomic analyses before and after ALT activation reveal a high level of non-coding mTALT transcripts despite the heterochromatic nature of mTALT-based tel- omeres. After ALT activation, the increased HMGN1, a non-histone chromosomal protein, contributes to the maintenance of telomere stability by regulating telomeric transcription. These findings provide a molecular basis to study the evolution of new structures in telomeres. 1 Department of Biological Sciences, Seoul National University, Seoul, Korea. 2 Aging Research Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, Korea.
    [Show full text]
  • Detecção De Copy Number Variation (CNV) E Sua Caracterização Na População Brasileira
    Detecção de Copy Number Variation (CNV) e sua caracterização na população brasileira Ana Cláudia Martins Ciconelle Dissertação apresentada ao Instituto de Matemática e Estatística da Universidade de São Paulo para Obtenção do Título de Mestre em Bioinformática Programa: Mestrado em Bionformática Orientadora: Prof. Dra. Júlia Maria Pavan Soler Durante o desenvolvimento deste trabalho a autora recebeu auxílio financeiro do CNPq e CAPES São Paulo, Janeiro de 2018 Detection of Copy Number Variation (CNV) and its characterization in Brazilian population Esta versão da dissertação contém as correções e alterações sugeridas pela Comissão Julgadora durante a defesa da versão original do trabalho, realizada em 06/02/2018. Uma cópia da versão original está disponível no Instituto de Matemática e Estatística da Universidade de São Paulo. Comissão Julgadora: • Prof. Dra. Júlia Maria Pavan Soler - IME-USP • Prof. Dr. Alexandre da Costa Pereira - HCFMUSP • Prof. Dr. Benilton de Sá Carvalho - UNICAMP Agradecimentos Agradeço aos meus pais, Claudio e Marcia, meu irmão Lucas, minhas avós, Dorga e Isabel, e todos os outros familiares que sempre me apoiaram e me fazem acreditar no significado de família. Agradeço especialmente á minha professora, orientadora e amiga, Júlia M. P. Soler, que desde da minha iniciação científica sempre esteve disponível para me ensinar, orientar e aconselhar com muita paciência, carinho e apoio. Agradeço também meus amigos de graduação em Ciências Moleculares, em especial ao Chico, Otto e o Leo, e aos amigos do IME que sempre me ajudaram em todos os sentidos possíveis. Agradeço aos professores do Ciências Moleculares e do IME por me mostrarem o caminho da ciência.
    [Show full text]
  • Copy Number Variation Detection from 1000 Genomes Project Exon
    Wu et al. BMC Bioinformatics 2012, 13:305 http://www.biomedcentral.com/1471-2105/13/305 METHODOLOGY ARTICLE Open Access Copy Number Variation detection from 1000 Genomes project exon capture sequencing data Jiantao Wu1†, Krzysztof R Grzeda1†, Chip Stewart1, Fabian Grubert2, Alexander E Urban2, Michael P Snyder2 and Gabor T Marth1* Abstract Background: DNA capture technologies combined with high-throughput sequencing now enable cost-effective, deep-coverage, targeted sequencing of complete exomes. This is well suited for SNP discovery and genotyping. However there has been little attention devoted to Copy Number Variation (CNV) detection from exome capture datasets despite the potentially high impact of CNVs in exonic regions on protein function. Results: As members of the 1000 Genomes Project analysis effort, we investigated 697 samples in which 931 genes were targeted and sampled with 454 or Illumina paired-end sequencing. We developed a rigorous Bayesian method to detect CNVs in the genes, based on read depth within target regions. Despite substantial variability in read coverage across samples and targeted exons, we were able to identify 107 heterozygous deletions in the dataset. The experimentally determined false discovery rate (FDR) of the cleanest dataset from the Wellcome Trust Sanger Institute is 12.5%. We were able to substantially improve the FDR in a subset of gene deletion candidates that were adjacent to another gene deletion call (17 calls). The estimated sensitivity of our call-set was 45%. Conclusions: This study demonstrates that exonic sequencing datasets, collected both in population based and medical sequencing projects, will be a useful substrate for detecting genic CNV events, particularly deletions.
    [Show full text]
  • Variants and Health
    Help Me Understand Genetics Variants and Health Reprinted from MedlinePlus Genetics U.S. National Library of Medicine National Institutes of Health Department of Health & Human Services Table of Contents 1 What is a gene variant and how do variants occur? 1 2 How can gene variants affect health and development? 3 3 Do all gene variants affect health and development? 4 4 What kinds of gene variants are possible? 6 5 Can a change in the number of genes affect health and development? 12 6 Can changes in the number of chromosomes affect health and development? 13 7 Can changes in the structure of chromosomes affect health and development? 19 8 Can changes in noncoding DNA affect health and development? 26 9 Can changes in mitochondrial DNA affect health and development? 28 10 What are complex or multifactorial disorders? 29 11 What does it mean to have a genetic predisposition to a disease? 30 12 How are gene variants involved in evolution? 32 13 What information can statistics provide about a genetic condition? 34 Reprinted from MedlinePlus Genetics (https://medlineplus.gov/genetics/) i 14 How are genetic conditions and genes named? 36 Reprinted from MedlinePlus Genetics (https://medlineplus.gov/genetics/) ii Variants and Health 1 What is a gene variant and how do variants occur? A gene variant is a permanent change in the DNA sequence that makes up a gene. This type of genetic change used to be known as a gene mutation, but because changes in DNA do not always cause disease, it is thought that gene variant is a more accurate term.
    [Show full text]
  • "Interpretation of Genomic Copy Number Variants Using DECIPHER"
    Interpretation of Genomic Copy Number UNIT 8.14 Variants Using DECIPHER Manuel Corpas,1 Eugene Bragin,1 Stephen Clayton,1 Paul Bevan,1 and Helen V. Firth1,2 1Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom 2Cambridge University Hospitals Foundation Trust, Addenbrooke’s Hospital, Cambridge, United Kingdom ABSTRACT Many patients suffering from developmental disorders have submicroscopic deletions or duplications affecting the copy number of dosage-sensitive genes or disrupting normal gene expression. Many of these changes are novel or extremely rare, making clinical interpretation problematic and genotype/phenotype correlations difficult. Identification of patients sharing a genomic rearrangement and having phenotypes in common increases certainty in the diagnosis and allows characterization of new syndromes. The DECIPHER database is an online repository of genotype and phenotype data whose chief objective is to facilitate the association of genomic variation with phenotype to enable the clinical interpretation of copy number variation (CNV). This unit shows how DECIPHER can be used to (1) search for consented patients sharing a defined chromosomal location, (2) navigate regions of interest using in-house visualization tools and the Ensembl genome browser, (3) analyze affected genes and prioritize them according to their likelihood of haploinsufficiency, (4) upload patient aberrations and phenotypes, and (5) create printouts at different levels of detail. By following this protocol, clinicians
    [Show full text]