Identification of Pathogenic Gene Mutations in LMNA and MYBPC3 That Alter RNA Splicing
Kaoru Ito(a,1), Parth N. Patel(a), Joshua M. Gorham(a), Barbara McDonough(a,b), Steven R. DePalma(a,b), Emily E. Adler(a), Lien Lam(a), Calum A. MacRae(c), Syed M. Mohiuddin(d), Diane Fatkin(e,f,g), Christine E. Seidman(a,b,c,2), and J. G. Seidman(a,2,3)

(a) Department of Genetics, Harvard Medical School, Boston, MA 02115; (b) Howard Hughes Medical Institute, Harvard Medical School, Boston, MA 02115; (c) Cardiovascular Division, Brigham and Women’s Hospital, Boston, MA 02115; (d) Department of Internal Medicine, Creighton University Medical Center, Omaha, NE 68131; (e) Faculty of Medicine, University of New South Wales, Kensington, NSW 2052, Australia; (f) Victor Chang Cardiac Research Institute, Darlinghurst, NSW 2010, Australia; and (g) Cardiology Department, St Vincent’s Hospital, Darlinghurst, NSW 2010, Australia

Contributed by J. G. Seidman, June 9, 2017 (sent for review May 16, 2017; reviewed by Gerald Dorn and Jil C. Tardiff)

Author contributions: K.I., J.M.G., C.E.S., and J.G.S. designed research; K.I., P.N.P., J.M.G., B.M., E.E.A., L.L., C.A.M., S.M.M., and D.F. performed research; K.I., P.N.P., S.R.D., and J.G.S. analyzed data; and K.I., P.N.P., C.E.S., and J.G.S. wrote the paper.
Reviewers: G.D., Washington University; and J.C.T., University of Arizona.
Conflict of interest statement: C.E.S. and J.G.S. are founders of and own shares in Myokardia Inc., a startup company that is developing therapeutics that target the sarcomere.
(1) Present address: Laboratory for Cardiovascular Diseases, RIKEN Center for Integrative Medical Sciences, Yokohama 230-0045, Japan.
(2) C.E.S. and J.G.S. contributed equally to this work.
(3) To whom correspondence should be addressed. Email: [email protected]. harvard.edu.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1707741114/-/DCSupplemental.
PNAS Early Edition, MEDICAL SCIENCES, www.pnas.org/cgi/doi/10.1073/pnas.1707741114

Genetic variants that cause haploinsufficiency account for many autosomal dominant (AD) disorders. Gene-based diagnosis classifies variants that alter canonical splice signals as pathogenic, but due to imperfect understanding of RNA splice signals other variants that may create or eliminate splice sites are often clinically classified as variants of unknown significance (VUS). To improve recognition of pathogenic splice-altering variants in AD disorders, we used computational tools to prioritize VUS and developed a cell-based minigene splicing assay to confirm aberrant splicing. Using this two-step procedure we evaluated all rare variants in two AD cardiomyopathy genes, lamin A/C (LMNA) and myosin binding protein C (MYBPC3). We demonstrate that 13 LMNA and 35 MYBPC3 variants identified in cardiomyopathy patients alter RNA splicing, representing a 50% increase in the numbers of established damaging splice variants in these genes. Over half of these variants are annotated as VUS by clinical diagnostic laboratories. Familial analyses of one variant, a synonymous LMNA VUS, demonstrated segregation with cardiomyopathy affection status and altered cardiac LMNA splicing. Application of this strategy should improve diagnostic accuracy and variant classification in other haploinsufficient AD disorders.

VUS | splicing | cardiomyopathy | LMNA | MYBPC3

Significance

Sequence variants that create or eliminate splice sites are often clinically classified as variants of unknown significance (VUS) due to imperfect understanding of RNA splice signals and cumbersome functional assays. In autosomal dominant disorders caused by haploinsufficiency, variants that alter normal splicing of one allele are pathogenic. We developed enhanced computational tools to prioritize potential splice-altering VUS and used a minigene assay to functionally confirm splice-altering sequence variants. In studying all reported variants across LMNA and MYBPC3, two known heart disease genes, we demonstrate that ∼5% of VUS from affected patients alter splicing and are undetected disease-causing variants. This strategy improves clinical detection of pathogenic variants and should be broadly relevant to other human disorders that are caused by haploinsufficiency.

DNA sequence analysis of patient DNA has been used to clinically diagnose associated inherited diseases, leading to the identification of thousands of rare sequence variants in affected and unaffected individuals. Interpreting the medical significance of these variants has proven to be a significant challenge. The annotation of nonsense, frameshift, insertion, and deletion variants that cause a highly predictable outcome [i.e., loss of function (LoF) of proteins encoded by disease genes] allows these to be clinically classified as pathogenic variants in more than 4,000 genes that cause disease by haploinsufficiency (1). Classification of rare synonymous and missense variants in these genes, however, has been more difficult. Although bioinformatic algorithms predict that some variants may damage the encoded protein, these annotations are imperfect, and many rare variants remain unclassified and are clinically designated as variants of unknown significance (VUS) (2).

Splicing signals reside at all exon–intron junctions and include a 9-bp 5′ splice donor sequence with an invariant GT dinucleotide and a 23-bp 3′ splice acceptor sequence with an invariant AG dinucleotide (3) (Fig. 1A). Variants that alter the canonical GT and AG dinucleotides in haploinsufficient autosomal dominant (AD) genes are diagnostically classified as pathogenic, whereas substitutions of the remaining seven residues within the donor sequence, or the 21 residues within the acceptor sequence, are often classified as VUS (4). VUS that alter these sequences, however, have the potential to disrupt normal RNA splicing (5) (resulting in donor or acceptor loss; Fig. 1C). Similarly, variants within a nearby intron or exon may sufficiently mimic the consensus donor or acceptor sites to generate new, inappropriate splice signals (donor or acceptor gain; Fig. 1C). Although existing in vitro splicing assays can identify variants that alter splicing and thereby identify pathogenic variants, cost and time constraints limit their application to clinical gene-based diagnosis (6, 7).

Results

LMNA and MYBPC3 Variants That Affect RNA Splicing. LoF variants in LMNA and MYBPC3 are prevalent causes of dilated cardiomyopathy (DCM) with associated conduction defects (8) and hypertrophic cardiomyopathy (HCM) (9, 10), respectively. Databases report 948 unique LMNA and 2,029 unique MYBPC3 variants observed across cardiomyopathy patients and normal subjects (Materials and Methods). Among these, we sought to identify splice-altering variants. After excluding all LoF variants (nonsense, frameshifts, and variants that alter canonical donor GT and acceptor AG splice sequences) and common variants (allele frequency >0.003; LMNA n = 133; MYBPC3 n = 454, Table S1), we identified 815 LMNA and 1,575 MYBPC3 rare missense, synonymous, and intronic variants for study.

First, we prioritized variants using a computational algorithm to calculate MaxEnt scores (11), which are computed as the “distance” from splice signals found in human genes (Materials and Methods and Fig. S1). Based on the calculated MaxEnt scores for 815 LMNA (Dataset S1) and 1,575 MYBPC3 variants (Dataset S2), we selected 57 LMNA and 139 MYBPC3 splicing candidates for further assessment.

We developed a cell-based minigene assay to test all splicing candidates (Fig. S2). The assay compared RNA transcripts extracted from HEK293 cells transfected with [...]

[...] observed in “normal” subjects. Approximately 0.4% (LMNA: 1/395, MYBPC3: 4/767) of rare synonymous, missense, or intronic variants reported only in normal subjects altered splicing, whereas ∼4% (LMNA: 13/420; MYBPC3: 35/808) of such variants reported [...]

Fig. 1. Schematics of gene splice signals and a functional assay of potential splice-altering variants. (A) A gene segment containing two exons (blue) and flanking introns (gray) with consensus 9-bp splice donor and 23-bp splice acceptor sequences is shown. The nucleotide letter sizes (drawn using Pictogram; genes.mit.edu/pictogram.html) denote use of that base in splicing across the genome. Dinucleotides GT (donor site) and AG (acceptor site) are invariant. Normally spliced transcript (red) that excludes intron sequence is depicted. (B) Functional assay of splicing candidates used a 1,200-bp minigene construct comprised of CMV promoter, exon–intron–exon sequence, a 2-bp barcode to identify the construct, and SV40polyA sequence. Exon–intron–exon test sequences were encoded on a synthetic 500-bp oligonucleotide that limited the inclusion of intron sequences to 200 bp (further details provided in Materials and Methods and Datasets S3 and S4). A normally spliced minigene-derived transcript (red) that excludes intron sequence is depicted. (C) Splice-altering variants detected by functional assays. Variants that destroy donor or acceptor signals prevent normal splicing (“x”) and yield transcripts with included introns, partial deletions, or entirely skipped exons. [Note that analyses reported excluded all variants that altered the invariant GT (donor site) and AG (acceptor site) sequences.] Variants that create a novel splice signal (donor or acceptor gain, “✓”) inappropriately delete sequences from the 5′ or 3′ exon. Novel splice sites that occur within introns and insert sequences are not depicted.
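The variant counts quoted in the Results can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' analysis code. The dictionary and function names are hypothetical, and the label "patients" for the second group is an assumption: the sentence giving 13/420 and 35/808 is truncated in this text, and the label is inferred from the abstract, which attributes 13 LMNA and 35 MYBPC3 splice-altering variants to cardiomyopathy patients.

```python
# Counts copied from the text; all names here are illustrative, not from the paper.
reported = {"LMNA": 948, "MYBPC3": 2029}   # unique variants in databases
excluded = {"LMNA": 133, "MYBPC3": 454}    # excluded LoF and common variants (Table S1)

# Rare missense, synonymous, and intronic variants retained for study.
rare = {gene: reported[gene] - excluded[gene] for gene in reported}

# (splice-altering, total) per gene. The "patients" label is an assumption
# inferred from the abstract, because the source sentence is truncated.
normal_only = {"LMNA": (1, 395), "MYBPC3": (4, 767)}
patients = {"LMNA": (13, 420), "MYBPC3": (35, 808)}


def pooled_rate(groups):
    """Return the pooled fraction of splice-altering variants across genes."""
    hits = sum(h for h, _ in groups.values())
    total = sum(n for _, n in groups.values())
    return hits / total


# The two groups together account for every rare variant in each gene.
for gene in rare:
    assert normal_only[gene][1] + patients[gene][1] == rare[gene]

print(rare)
print(f"normal only: {100 * pooled_rate(normal_only):.2f}%")
print(f"patients:    {100 * pooled_rate(patients):.2f}%")
```

Pooling the two genes gives 5/1,162 and 48/1,228, which round to the approximately 0.4% and ∼4% stated in the text.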