Ji RR, Serebriyskaya T, Kuzkina N. Genetic Landscape of aHUS: A Comprehensive Analysis of Genetic Variants Reported in The Literature. J Rare Dis Res Treat. (2018) 4(1): 13-51

Journal of Rare Diseases Research & Treatment
www.rarediseasesjournal.com

Research Article
Open Access

Genetic Landscape of aHUS: A Comprehensive Analysis of Genetic Variants Reported in The Literature

Rui-Ru Ji1*, Tatiana Serebriyskaya2,3, Natalia Kuzkina2,3

1Alexion Pharmaceuticals, Inc., 121 Seaport Boulevard, Boston, MA 02210, USA
2EPAM Systems, 22/2 Zastavskaya Street, MegaPark, 196084, Saint-Petersburg, Russia
3Moscow Institute of Physics and Technology, School of Biological and Medical Physics, 9 Institutskiy per., Dolgoprudny, 141701, Moscow, Russia

Article Info

Article Notes
Received: October 4, 2018
Accepted: December 19, 2018

*Correspondence:
Dr. Rui-Ru Ji, Alexion Pharmaceuticals, Inc., 121 Seaport Boulevard, Boston, MA 02210, USA; Telephone No: 862-200-6617; Email: [email protected]

© 2018 Ji RR. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License.

Keywords
Genotype-phenotype
Allelic frequency
aHUS
Alternative complement pathway
Pathogenicity

Abstract

Genetic information provides important guidance for long-term management of patients with atypical hemolytic uremic syndrome (aHUS), an extremely rare disease that primarily affects a patient’s kidney. To better understand the phenotypic impact of variants identified in aHUS patients, we systematically mined the National Library of Medicine database for case studies of aHUS patients with identifiable genetic variants. Allelic variants from 10 genes (C3, CFB, CFH, CFI, CFHR1, CFHR3, CFHR5, DGKE, CD46/MCP, and THBD) associated with aHUS were collected from 1652 patients. We analyzed the enrichment of genetic variants in this “literature cohort” compared with a reference population, the Genome Aggregation Database (gnomAD). We also used a number of tools to predict the pathogenicity of the variants, attempting to reconcile all the results using the protein structure and conservation data. In total, we identified 447 unique genetic variants: 301 of these were not present in the gnomAD database and thus have “moderate” evidence of pathogenicity; 33 variants have “strong” evidence of pathogenicity by enrichment analysis. This study showcases an in silico framework in which patient data aggregation and a large-scale sequencing database provide a novel opportunity to understand genotype-phenotype associations in aHUS. This framework can be efficiently applied to other rare diseases where data are sparse to help improve the diagnosis and management of these patients.

Abbreviations: aHUS: atypical hemolytic uremic syndrome; gnomAD: Genome Aggregation Database; CD46/MCP: cluster of differentiation 46/membrane cofactor protein; CFH: complement factor H; CFI: complement factor I; CFB: complement factor B; C3: complement component 3; ACMG: American College of Medical Genetics; AF: allele frequency; CFHR1: complement factor H-related protein 1; CFHR3: complement factor H-related protein 3; CFHR5: complement factor H-related protein 5; DGKE: diacylglycerol kinase epsilon; THBD: thrombomodulin; MEDLINE: Medical Literature Analysis and Retrieval System Online; VEP: variant effect predictor; SIFT: sorts intolerant from tolerant substitutions; PROVEAN: protein variation effect analyzer; FATHMM: functional analysis through Hidden Markov Models

Introduction

Atypical hemolytic uremic syndrome (aHUS) is a disease characterized by the triad of microangiopathic hemolytic anemia, thrombocytopenia, and acute kidney injury.1 aHUS is extremely rare.
Although the precise incidence and prevalence are unknown, it is estimated that aHUS affects 1–2 individuals per million inhabitants in the United States.2 aHUS can occur at any age, from the neonatal period to adulthood, although the onset appears to be more frequent in childhood than adulthood.3 In childhood, aHUS affects males and females equally.4 In adulthood, however, there is a preponderance in females affected and pregnancy can be a triggering event.5

aHUS is caused by chronic, uncontrolled activation or dysregulation of the alternative pathway of complement.6 Unlike the classical and lectin pathways of complement, the alternative pathway is continually activated and closely regulated by a number of complement regulatory proteins (such as complement factor H [CFH], complement factor I [CFI], and CD46/membrane cofactor protein [MCP]), so that host cells are not damaged.3 In most patients with aHUS, it has been demonstrated that the excessive activation of complement can result from mutations in any of the regulatory proteins or production of neutralizing autoantibody of these proteins, for example, anti-factor H antibodies.7,8

Although not required for diagnosis of aHUS, information on pathogenic complement mutations can be used to establish the cause of disease with accuracy, as well as to guide long-term disease management decisions and effective treatment.9-11 Approximately 20% of patients experience extrarenal manifestations, including central nervous system, cardiac, gastrointestinal, distal extremity, and severe systemic organ involvement.4,8,10 The ongoing risk of these manifestations varies among genotypes. For example, patients having mutations in C3, complement factor B (CFB), and CFH are particularly at risk for developing cardiovascular complications.12

In approximately 30–50% of individuals with aHUS, no mutation in a complement gene and no autoantibodies can be detected.7,8 However, recent literature exploring larger panels and new high-throughput sequencing technology show that other genes and pathways such as the coagulation pathway are also associated with aHUS.13 Nonetheless, many variants identified in patients are classified as variants of unknown clinical significance.14 Interpretation of these variants is difficult because: (1) some functional assays may be less reliable in predicting the impact of variants on protein function than others; (2) the rarity of the disease hinders the aggregation of a sufficiently large patient cohort; (3) incomplete penetrance is the norm in these patients.14,15

The standards and guidelines published in 2015 by the American College of Medical Genetics (ACMG) lay out an extensive framework of evidence for interpretation of sequence variants, including guidance for using population data and computational and predictive tools.14 By definition, causal genetic variants are enriched in patient populations as compared with reference populations. This principle has been exploited in genome-wide association studies to identify genes implicated in common diseases.16 In rare diseases, similar analyses can be utilized to assess pathogenicity of rare genetic variants if their allele frequencies (AFs) in patient populations [...] databases such as the Genome Aggregation Database (gnomAD),17,18 containing 123 136 exome sequences and 15 496 whole-genome sequences of unrelated individuals from a wide variety of large-scale sequencing projects, now make it possible to obtain AFs of rare variants in reference populations. However, lack of a large patient cohort makes estimating variant AFs in patients with rare diseases challenging.

To overcome this difficulty, we created a “literature cohort” of more than 1600 patients with aHUS by compiling all case studies in MEDLINE. The allelic variants of 10 genes (namely, C3, CD46/MCP, CFB, CFH, complement factor H-related protein 1 [CFHR1], complement factor H-related protein 3 [CFHR3], complement factor H-related protein 5 [CFHR5], complement factor I [CFI], diacylglycerol kinase epsilon [DGKE], and thrombomodulin [THBD]) known to be associated with aHUS,3,19 along with the patient characteristics, were compiled and stored together in a relational database. We assessed the pathogenicity of genetic variants in the 10 genes with bioinformatic, statistical, and structural analyses.

Materials And Methods

Literature Mining and Variant Annotation

MEDLINE was searched systematically through April 30, 2017, for all case reports of patients with aHUS with identifiable genetic variants in any of the 10 genes known to be associated with aHUS: C3, CD46/MCP, CFB, CFH, CFHR1, CFHR3, CFHR5, CFI, DGKE, and THBD.3,19 If reported, information about the patients’ relatives was also recorded. When possible, genetic variants were annotated with patient demographics (e.g., genetic ancestry, age, sex), as well as specific disease characteristics such as onset, severity, etc. All data were stored in a relational database to allow data retrieval using structured query language.

To account for repeated-case reporting in the literature, patients were first matched based on age, sex, onset, genetic ancestry, and variant information. Potential matches were manually checked before being merged. In cases where one or more of the fields (e.g., age, sex, onset, genetic ancestry) were missing, the original articles were double-checked to ensure that they referenced each other and came from the same source (i.e., authors). All matches had to have the same genetic variant(s). We designated the resulting cases the “aHUS literature cohort.”

Variant Pathogenicity Analysis (Enrichment Analysis)

For every variant (e.g., allele A) present in both the aHUS