Long-Range PCR-Based NGS Applications to Diagnose Mendelian Retinal Diseases
Total Page:16
File Type:pdf, Size:1020Kb
International Journal of Molecular Sciences Article Long-Range PCR-Based NGS Applications to Diagnose Mendelian Retinal Diseases Jordi Maggi 1 , Samuel Koller 1 , Luzy Bähr 1, Silke Feil 1, Fatma Kivrak Pfiffner 1, James V. M. Hanson 2, Alessandro Maspoli 1, Christina Gerth-Kahlert 2 and Wolfgang Berger 1,3,4,* 1 Institute of Medical Molecular Genetics, University of Zurich, 8952 Schlieren, Switzerland; [email protected] (J.M.); [email protected] (S.K.); [email protected] (L.B.); [email protected] (S.F.); kivrakpfi[email protected] (F.K.P.); [email protected] (A.M.) 2 Department of Ophthalmology, University Hospital Zurich and University of Zurich, 8091 Zurich, Switzerland; [email protected] (J.V.M.H.); [email protected] (C.G.-K.) 3 Zurich Center for Integrative Human Physiology (ZIHP), University of Zurich, 8057 Zurich, Switzerland 4 Neuroscience Center Zurich (ZNZ), University and ETH Zurich, 8057 Zurich, Switzerland * Correspondence: [email protected]; Tel.: +41-44-556-33-50 Abstract: The purpose of this study was to develop a flexible, cost-efficient, next-generation sequenc- ing (NGS) protocol for genetic testing. Long-range polymerase chain reaction (PCR) amplicons of up to 20 kb in size were designed to amplify entire genomic regions for a panel (n = 35) of inherited retinal disease (IRD)-associated loci. Amplicons were pooled and sequenced by NGS. The analysis was applied to 227 probands diagnosed with IRD: (A) 108 previously molecularly diagnosed, (B) 94 without previous genetic testing, and (C) 25 undiagnosed after whole-exome sequencing (WES). The method was validated with 100% sensitivity on cohort A. Long-range PCR-based sequencing revealed likely causative variant(s) in 51% and 24% of proband from cohorts B and C, respectively. Break- points of 3 copy number variants (CNVs) could be characterized. Long-range PCR libraries spike-in Citation: Maggi, J.; Koller, S.; Bähr, L.; Feil, S.; Kivrak Pfiffner, F.; Hanson, extended coverage of WES. Read phasing confirmed compound heterozygosity in 5 probands. The J.V.M.; Maspoli, A.; Gerth-Kahlert, C.; proposed sequencing protocol provided deep coverage of the entire gene, including intronic and Berger, W. Long-Range PCR-Based promoter regions. Our method can be used (i) as a first-tier assay to reduce genetic testing costs, NGS Applications to Diagnose (ii) to elucidate missing heritability cases, (iii) to characterize breakpoints of CNVs at nucleotide Mendelian Retinal Diseases. Int. J. resolution, (iv) to extend WES data to non-coding regions by spiking-in long-range PCR libraries, Mol. Sci. 2021, 22, 1508. https:// and (v) to help with phasing of candidate variants. doi.org/10.3390/ijms22041508 Keywords: genetic testing; retinal diseases; sequencing; NGS; diagnostics; long-range PCR; CNV; ˙ Academic Editor: Tomasz Zarnowski phasing; missing heritability; ABCA4; PRPH2; BEST1 Received: 4 January 2021 Accepted: 28 January 2021 Published: 3 February 2021 1. Introduction Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in Inherited retinal diseases (IRDs) are a group of disorders affecting the retina and its published maps and institutional affil- function. They are characterized by high phenotypic variability and genetic heterogeneity, iations. including different modes of inheritance [1]. These disorders may present as either isolated or syndromic, progressive or stationary. Also, IRDs may affect the entire retina (panretinal) or may be restricted to the macula. Moreover, they are often classified based on the primarily affected photoreceptor type. Panretinal forms include retinitis pigmentosa (RP), cone-rod dystrophy (CRD), cone dystrophy (COD), and others, whereas macular Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. dystrophies (MDs) include Stargardt disease (STGD), Best disease (BEST), and others [1,2]. This article is an open access article Due to the multitude of loci associated with IRDs (almost 300 loci, https://sph. distributed under the terms and uth.edu/RetNet), molecular diagnostics often relies on targeted enrichment and high- conditions of the Creative Commons throughput sequencing, either whole-exome (WES) [3–6] or gene panels [5–10]. Studies Attribution (CC BY) license (https:// have reported that diagnostic yield when using these standard methods (WES/gene panels) creativecommons.org/licenses/by/ is typically between 50% and 76%, depending, amongst other factors, on the specific clinical 4.0/). subtype being investigated [6–9,11–14]. Int. J. Mol. Sci. 2021, 22, 1508. https://doi.org/10.3390/ijms22041508 https://www.mdpi.com/journal/ijms Int. J. Mol. Sci. 2021, 22, 1508 2 of 29 Int. J. Mol. Sci. 2021, 22, x FOR PEER REVIEW 3 of 33 Missing heritability is defined as the portion of the phenotypic trait that could not (yet) be explained by genotype data [15]. Since inherited diseases are, by definition, char- acterized byELOVL4 complete or near-complete heritability,2 missing heritability34,792 in these diseases indicates casesDHS6S1 in which genetic testing could1 not identify a molecular diagnosis13,473 [15]. RecentRP1L1 efforts to reduce missing heritability3 in IRDs (and improve diagnostic51,877 output) have focusedRP1 on non-coding regions of the genome,1 and on the detection16,930 of structural or copy numberCNGB3 variants (CNVs) [16–19]. These12 studies were performed172 using,052 customized microarrayKCNV2 probes [17], customized capturing1 probes [16,18], or genome15,214 sequencing [19]. SeveralATOH7 studies have focused on the ABCA41 locus [16,18,20–23], as9275 it has often been reported toPDE6C be one of the most prevalent mutated3 genes in IRD cohorts53,794 [10 ,12–14,24]. Variants inBEST1 this gene are the cause of STGD11 (MIM 248200) [25]. Previous17,272 studies have reported thatC1QTNF5 15–40% of STGD patients remain1 without a molecular diagnosis10,536 after standard genetic testing [16,18,20–23,26]. These studies identified CNVs and several non-canonical PDE6H 1 11,084 splice site and deep-intronic variants that have been shown to affect splicing, findings that RDH5 1 6122 partially explain missing heritability in STGD [16,18,20–23]. OTX2 1 12,758 Here, we present a flexible and cost-effective method to comprehensively sequence loci of interest.NR2E3 The method relies on high-throughput1 sequencing of pooled11,097 long-range (LR) polymeraseRLBP1 chain reaction (PCR) fragments1 of up to 20 kb in size.13,667 We designed primers toGUCY2D sequence 35 genomic loci that have1 been associated with IRDs,18,939 particularly MDs and X-linkedFSCN2 RP, including ABCA4, PRPH2,1BEST1 , CRB1, CNGB3, RP111,574, and RPGR. RAX2 1 5845 2. Results TIMP3 5 67,146 LR PCRsRS1 were established for 35 loci associated3 with IRDs. The total34,731 genomic target equals 1.81RPGR Mb, and the longest locus is CRB14 , with a total genomic60,808 target of 212 kb (Table1). TheRP2 entire regions of most target3 genes could be amplified46,521 and sequenced. Several intronic regions could not be amplified after several attempts: CRB1 (3.8 kb in size,The NC_000001.10:g.197420721–197424549), primers for loci that require multipleEFEMP1 PCRs were(0.5 designed kb, NC_000002.11:56102261– in a way that ampli- cons56102747), of adjacentIMPG1 regions(1.9 kb, overlap NC_000006.11:76729633–76731567 with each other. As an example, and Figure 1.6 kb, 1 shows NC_000006.11: the cov- erage76633571–76635199), obtained for theCNGB3 ABCA4(7.1 locus kb, by NC_000008.10:g.87647022–87654129 different NGS assays: WES, custom andcapture 3.8 kb, probes, NC_ and000008.10:87625835–87629598), LR PCRs. and TIMP3 (1.5 kb, NC_000022.10:33202359–33203885). Figure 1. ABCA4ABCA4 coveragecoverage comparison comparison of of different different assays. assays. Screenshot Screenshot of of the the Alamut Alamut Visual Visual software software showing the coverage coverage results for the ABCA4 locus from different next-generation sequencing assays. On the top, the software illustrates the results for the ABCA4 locus from different next-generation sequencing assays. On the top, the software illustrates the relative exon locations. Below the gene structure, coverage plots for a typical whole-exome sequencing assay (top), custom relative exon locations. Below the gene structure, coverage plots for a typical whole-exome sequencing assay (top), custom capture probes assay (middle), and, finally, the long-range polymerase chain reaction method (bottom). capture probes assay (middle), and, finally, the long-range polymerase chain reaction method (bottom). As per design, WES only provides coverage for the coding portions of the locus. The custom capture probes result in coverage for the majority of the locus, including introns. However, gaps accounting for 15–20% of the locus are present. Finally, the LR PCR Int. J. Mol. Sci. 2021, 22, 1508 3 of 29 Table 1. Established long-range polymerase chain reactions (PCRs) for retinal diseases-associated loci. The first column indicates the names of the loci included in the panel, ordered by chromosome coordinates. The second column highlights the number of PCRs necessary to cover the locus compre- hensively, and the last column shows the total size of the genomic target sequence. Abbreviations: bp, base pairs. Locus Number of PCRs Needed Genomic Target Size (bp) PPT1 2 26,746 ABCA4 8 137,748 CRB1 14 212,898 PCARE 1 18,628 EFEMP1 5 60,569 IMPG2 6 94,768 PROM1 7 117,712 MFSD8 3 49,747 CTNNA1 13 182,694 GUCA1A 2 27,191 GUCA1B 1 13,548 PRPH2 2 32,242 IMPG1 11 144,700 ELOVL4 2 34,792 DHS6S1 1 13,473 RP1L1 3 51,877 RP1 1 16,930 CNGB3 12 172,052 KCNV2 1 15,214 ATOH7 1 9275 PDE6C 3 53,794 BEST1 1 17,272 C1QTNF5 1 10,536 PDE6H 1 11,084 RDH5 1 6122 OTX2 1 12,758 NR2E3 1 11,097 RLBP1 1 13,667 GUCY2D 1 18,939 FSCN2 1 11,574 RAX2 1 5845 TIMP3 5 67,146 RS1 3 34,731 RPGR 4 60,808 RP2 3 46,521 The primers for loci that require multiple PCRs were designed in a way that amplicons of adjacent regions overlap with each other.