Cancer Sequencing Service Data File Formats File Format V2.4 Software V2.4 December 2012

CGA Tools, cPAL, and DNB are trademarks of Complete Genomics, Inc. in the US and certain other countries. All other trademarks are the property of their respective owners.

Disclaimer of Warranties. COMPLETE GENOMICS, INC. PROVIDES THESE DATA IN GOOD FAITH TO THE RECIPIENT “AS IS.” COMPLETE GENOMICS, INC. MAKES NO REPRESENTATION OR WARRANTY, EXPRESS OR IMPLIED, INCLUDING WITHOUT LIMITATION ANY IMPLIED WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE OR USE, OR ANY OTHER STATUTORY WARRANTY. COMPLETE GENOMICS, INC. ASSUMES NO LEGAL LIABILITY OR RESPONSIBILITY FOR ANY PURPOSE FOR WHICH THE DATA ARE USED.

Any permitted redistribution of the data should carry the Disclaimer of Warranties provided above.

Data file formats are expected to evolve over time. Backward compatibility of any new file format is not guaranteed.

Complete Genomics data is for Research Use Only and not for use in the treatment or diagnosis of any human subject.

Information, descriptions and specifications in this publication are subject to change without notice.

Copyright © 2011-2012 Complete Genomics Incorporated. All rights reserved.

RM_DFFCS_2.4-01

Table of Contents

Preface
    Conventions
    Analysis Tools
    References
Introduction
    Sequencing Approach
    Mapping Reads and Calling Variations
    Read Data Format
    Data Delivery
Data File Formats and Conventions
    Data File Structure
    Header Format
    Sequence Coordinate System
    Data File Content and Organization
    Identifier Map
        idMap-[ASM-ID].tsv
ASM Results
    Small Variations and Annotations Files
    Variations
        ASM/var-[ASM-ID].tsv.bz2
    Master Variations
    Normal Sample MasterVariations
        ASM/masterVarBeta-[ASM-ID]-T1.tsv.bz2
    Tumor Sample MasterVariations
        ASM/masterVarBeta-[ASM-ID]-N1.tsv.bz2
    Individual Genomes’ Small Variations, CNVs, SVs, and MEIs in VCF Format
        ASM/vcfBeta-[ASM-ID].vcf.bz2
    Comparative Results of Small Variations, CNVs, and SVs in VCF Format
        ASM/somaticVcfBeta-[ASM-ID]-N1.vcf.bz2
    Annotated Variants within Genes
        ASM/gene-[ASM-ID].tsv.bz2
    Annotated Variants within Non-coding RNAs
        ASM/ncRNA-[ASM-ID].tsv.bz2
    Count of Variations by Gene
        ASM/geneVarSummary-[ASM-ID].tsv
    Variations at Known dbSNP Loci
        ASM/dbSNPAnnotated-[ASM-ID].tsv.bz2
    Sequencing Metrics and Variations Summary
        ASM/summary-[ASM-ID].tsv
    Copy Number Variation Files
    Copy Number Segmentation
        ASM/CNV/cnvSegmentsDiploidBeta-[ASM-ID].tsv
    Detailed Ploidy and Coverage Information
        ASM/CNV/cnvDetailsDiploidBeta-[ASM-ID].tsv.bz2
    Genomic Copy Number Analysis of Non-Diploid Samples Files