Discovery of the Human Genome Sequence in the Public and Private Databases

Current Biology Vol 11 No 20, R808. Feature

Genomes: Much heat has been generated in discussions about the key human genome sequence databases, generated by the Human Genome Project and Celera, and what specific features each offers genome researchers. Stephen W. Scherer and Joseph Cheung, who are intense users of both, offer a personal assessment of the developing contents.

Deservedly, there has been much celebration over the publication of two draft versions of the human genome sequence. There have also been other recent assemblies of the sequence, producing more complete coverage and reliable DNA sequence annotation. However, to date, a finished reference sequence of the human genome does not exist. Furthermore, only a fraction of the genes and other important biological features of chromosomes have been characterized. The goal of this piece is to share our experiences with other scientists contemplating if and how they might benefit from subscribing to the Celera DNA sequence database.

Our observations are based on having access to the Celera Discovery System through an ‘academic’ subscription for the past year. We are also intense users (and contributors) of data in the Human Genome Project (HGP) databases. Many of our experiences are based on gene mapping and sequencing studies of human chromosome 7, but also through positional cloning studies in other regions of the genome. We are most often asked to comment subjectively on the following three datasets:

Important DNA sequence sources

(i) The Celera version of the human genome published in February (called component-3 or C3; data at http://www.celera.com/) and their more recent component-4 (C4) assembly (by subscription since August 2001). The C3 assembly was derived from combining 14,808 Mb of Whole Genome Shotgun (WGS) Celera sequence with 4,405 Mb from the HGP. C4 builds on C3 using improved algorithms as well as additional Celera sequences and new HGP data as of December 2000;

(ii) The successive assemblies of the clone-based approach of the HGP from the February publication up until August 2001 (up-to-date statistics for the HGP sequence can be found at http://www.ebi.ac.uk/genomes/mot/). The best websites for accessing HGP data are listed in Table 1. HGP does not have Celera data in their assemblies;

(iii) The Celera mouse genome (available since June 2001), assembled solely using a WGS based on approximately 6X genome coverage with DNA from three different mouse strains.

Figure 1. [Plot: order of markers on chromosome 7 (p-arm and q-arm) against order of markers on Celera mapped scaffolds.] The order of 5343 chromosome 7 DNA markers present in the C4 scaffolds (each scaffold in a different color) was almost entirely consistent with the marker order established by hand-curated data from radiation and somatic cell hybrid, yeast and bacterial-artificial chromosome, and genetic mapping experiments. The 246 markers that did not fall into these larger scaffolds were all found in smaller ones or in the Celera fragment database. The 22 DNA markers that are not in the expected order tend to map to the centromere or to intrachromosomal duplications. Over 98% of known markers could be placed on the map.

We have summarized our experiences using Celera compared to HGP information in Table 1. Both datasets, and the accompanying annotation, have strengths and weaknesses. While we constantly access both, if website hits alone were counted, the HGP would win out over Celera primarily because of ease of accessibility and increased number of entry points to the DNA sequence. In our group, more sophisticated analysts performing large-scale annotation experiments usually occupy our laboratory’s single-portal access (per subscription) to Celera (the release of the C3 assembly on DVD has relieved some of this pressure). Molecular biologists and medical geneticists almost always start by accessing the public databases to find out what can be found or what is missing, and then they check Celera.

Table 1. General characteristics of the Celera and HGP sequence databases.*

Category: Celera / Human Genome Project

Accessibility†
  To data: Good / Good to excellent
  Via cytolocation: Very good (mirrors public data) / Very good
  Via gene or marker: Good / Excellent
  Via DNA sequence: Excellent / Good

Coverage
  Euchromatin: Outstanding / Good (~50% still in draft)
  Pericentromeric: Good / Good
  Large duplication: Not represented / Better than Celera

Accuracy
  Internal accuracy: Excellent / Excellent
  Long-range order and orientation: Outstanding / Good, continues to improve

Gene annotation‡
  Known genes: Very good / Very good
  New genes: Rudimentary / Rudimentary

Other strengths§
  Celera:
    DNA sequence in fragment database often assists in gap filling
    Long sequence scaffolds favor genome-wide comparison/annotation
    Availability of assembled mouse sequence to assist human annotation
  Human Genome Project:
    Ease of accessibility to data at multiple websites
    Availability of clones to confirm or complete sequencing and mapping
    Clone-based strategy essential for completion of difficult regions

Recommendations (wish list)
  Celera:
    Be more dynamic incorporating latest public data
    Make clones available for sequencing of gap regions
    Release human component 4 and mouse data on DVD to academic subscribers
    Sequence a third mammalian genome to assist comparative analyses
  Human Genome Project:
    Increase resolution and accuracy of cytolocations
    Top up and finish human sequence
    Increase efforts to incorporate highly-curated data from community
    Sequence a third mammalian genome to assist comparative analyses

*Based on survey of 10 users of varying levels of sophistication; bioinformatics analysts (4), molecular biologists (3), medical geneticists (3).

†While the Celera database is generally user friendly with excellent service support the limited number of portals per academic subscription can inhibit accessibility. Data retrieval can sometimes be slow. Navigating/searching HGP databases is more intuitive primarily since familiar nomenclature is used compared with the obscure identifiers often found in Celera. The favorite entry points to public DNA sequence data based on cytolocation, gene marker, and by DNA sequence itself are UCSC Golden Path (http://genome.ucsc.edu/) and the BAC resources (http://www.ncbi.nlm.nih.gov/genome/cyto/hbrc.shtml), Locus Link at NCBI (http://www.ncbi.nlm.nih.gov/LocusLink/) and again Golden Path, respectively. 'Ensembl' (http://www.ensembl.org/) was also a good entry point into the public data. Medical geneticists often use the Genome Database (http://www.gdb.org/) or GeneCards (http://bioinfo.weizmann.ac.il/cards/index.html).

‡It is still premature to comment on the accuracy and completeness of the overall annotation of genes since many are based only on gene-prediction algorithms. Earlier versions of the Celera and HGP assemblies/annotation often missed contiguous gene family members but in both cases this continues to improve. For academic subscribers the supporting evidence for new Celera transcripts is not intuitively available to the end user. HGP is also more dynamic in updating new cDNA and gene data from the literature.

§There are multitudes of DNA sequence analysis programs available in the public domain not mentioned (see http://searchlauncher.bcm.tmc.edu/). Celera has basic Blast search capabilities, GO ontology, Panther Ontology (proprietary), and Genome Browser which is an outstanding (proprietary) gene-model building tool (available to corporate but not academic subscribers). Celera's mouse genome assembly used the 129X1/SvJ, DBA/2J, and A/J strains and the HGP is sequencing C57BL6/J.

Figure 2. [VISTA plot: four 50 kb windows spanning 0k to 200k, conservation scale 50% to 100%, CFTR exons 1 to 27 marked.] Comparison of 200 kb of Celera human (C4) and mouse DNA sequence encompassing the cystic fibrosis (CFTR) gene on human chromosome 7 and mouse chromosome 6, respectively. Each window represents 50 kb of syntenic DNA sequence displayed using the program VISTA (http://www-gsd.lbl.gov/vista). Each of the 27 CFTR exons was present in the assembled mouse sequence. Blue shading represents exons and red highlights other highly conserved sequences.

[...] identification of large (and sometimes small) genes that would have otherwise been fragmented or missing and, therefore, not detected using HGP data. For example, using [...] of 50 kb of DNA (consistent with a chromosome 21 gene size of 57 kb). This suggests that the annotation of genes, in particular by the HGP, will become more accurate as the genome sequence moves from draft to finished form. As Hogenesch and colleagues have shown, however, the current Celera and Ensembl (HGP) sets of predicted genes are largely mutually exclusive, suggesting that even when a consensus genome sequence is achieved, the resulting gene maps will still vary greatly.

An example of where the HGP clone-based strategy outperforms the Celera WGS approach is in proper assembly of large nearly identical DNA segments that occur in more than one copy in the genome. Such duplications might account for up to 5% of human DNA. When duplications are >50 kb in size, in our experience, they are not represented in large C3 or C4 scaffolds (they are found in the Celera ‘fragment’ database). The same sequences may also be underrepresented or mistakenly assembled by the HGP. However, we have found the HGP data usually to be more representative for these chromosomal regions with the added advantage of having access to a physical resource (the clone) for confirmatory analyses. For example, duplications involved in Williams–Beuren syndrome at 7q11.23 are not represented in Celera scaffolds, but they are better covered by the HGP. The same seems to be true for duplications flanking