© American College of Medical Genetics and Genomics ORIGINAL RESEARCH ARTICLE

Genomic diagnostics within a medically underserved population: efficacy and implications

Kevin A. Strauss, MD1, Claudia Gonzaga-Jauregui, PhD2, Karlla W. Brigatti, MS1, Katie B. Williams, MD, PhD1, Alejandra K. King, PhD2, Cristopher Van Hout, PhD2, Donna L. Robinson, CRNP1, Millie Young, RNC1, Kavita Praveen, PhD2, Adam D. Heaps, MS1, Mindy Kuebler, MS1, Aris Baras, MD2, Jeffrey G. Reid, PhD2, John D. Overton, PhD2, Frederick E. Dewey, MD2, Robert N. Jinks, PhD3, Ian Finnegan, BA3, Scott J. Mellis, MD, PhD2, Alan R. Shuldiner, MD2 and Erik G. Puffenberger, PhD1

Purpose: We integrated whole-exome sequencing (WES) and chromosomal microarray analysis (CMA) into a clinical workflow to serve an endogamous, uninsured, agrarian community.

Methods: Seventy-nine probands (newborn to 49.8 years) who presented between 1998 and 2015 remained undiagnosed after biochemical and molecular investigations. We generated WES data for probands and family members and vetted variants through rephenotyping, segregation analyses, and population studies.

Results: The most common presentation was neurological (64%). Seven (9%) probands were diagnosed by CMA. Family WES data were informative for 37 (51%) of the 72 remaining individuals, yielding a specific genetic diagnosis (n = 32) or revealing a novel molecular etiology (n = 5). For five (7%) additional subjects, negative WES decreased the likelihood of genetic disease. Compared to trio analysis, “family” WES (average seven exomes per proband) reduced filtered candidate variants from 22 ± 6 to 5 ± 3 per proband. Nineteen (51%) alleles were de novo and 17 (46%) inherited; the latter added to a population-based diagnostic panel. We found actionable secondary variants in 21 (4.2%) of 502 subjects, all of whom opted to be informed.

Conclusion: CMA and family-based WES streamline and economize diagnosis of rare genetic disorders, accelerate novel disease discovery, and create new opportunities for community-based screening and prevention in underserved populations.

Genet Med advance online publication 20 July 2017

Key Words: chromosomal microarray; developmental delay; exome; genomic; intellectual disability

INTRODUCTION

Whole-exome sequencing (WES) and chromosomal microarray analysis (CMA) have revolutionized investigation of rare genetic disorders and intellectual disability,1–6 but important diagnostic and service gaps remain. The pretest probability of a genetic lesion is high for individuals who move through contemporary diagnostic algorithms to arrive at CMA or WES,7–9 yet many remain undiagnosed at the culmination of the process.2,3,9,10 Moreover, the cost and complexity of these methods limit access for people who are poor, uninsured, or otherwise medically underserved.11

The Clinic for Special Children (CSC) is a medical home for children who derive from endogamous Old Order Amish and Mennonite (Plain) populations of Pennsylvania and surrounding states,12,13 integrating clinical care with an in-house laboratory for Clinical Laboratory Improvement Amendments–certified targeted testing and research-based CMA analysis. Approximately 90% of CSC patients are medically underserved, as defined by their geographic, social, and economic circumstances (http://www.hrsa.gov/). Many live below the federal poverty threshold (http://www.census.gov/) in state-designated Health Professional Shortage Areas (http://www.health.pa.gov/) and the majority are uninsured, which is the strongest predictor of health disparity in the United States.14

Uninsured Americans are most commonly served by community health centers,15 only 12% of which offer the most basic forms of genetic testing.16 Such services remain particularly sparse in rural settings.16,17 Urban areas have meanwhile witnessed the rise of ambitious genomic centers funded by academic18 and industry stakeholders19 enthused by the promise of precision medicine.20 However, these large-scale genomics initiatives are not necessarily intended to democratize genomic testing16 or confront barriers to its broader implementation.11,21

To bridge the gap between technical resources and medical need, CSC and the Regeneron Genetics Center (RGC) forged a collaboration to make WES freely accessible to uninsured members of the Plain community (Supplementary Figure S1 online). The arrangement provided benefit to all major stakeholders: uninsured patients received high-quality genomic testing at no cost, CSC received genomic data and operating support, and RGC streamlined their investigation of

1Clinic for Special Children, Strasburg, Pennsylvania, USA; 2Regeneron Genetics Center, Regeneron Pharmaceuticals Inc., Tarrytown, New York, USA; 3Department of Biology, Franklin & Marshall College, Lancaster, Pennsylvania, USA. Correspondence: Kevin A. Strauss ([email protected]) Submitted 2 February 2017; accepted 13 March 2017; advance online publication 20 July 2017. doi:10.1038/gim.2017.76

GENETICS in MEDICINE | Volume 20 | Number 1 | January 2018

clinically relevant disease and pathways. Through this partnership, we have been able to optimize the yield of genomic testing, explore its broader social and economic value in community practice, and advance precision medicine while simultaneously redressing health care disparities unique to the genomic era.

MATERIALS AND METHODS

We identified 79 probands (36 female, mean age 6.9 ± 9.4 years, range newborn to 49.8 years) who presented to CSC for evaluation between September 1998 and 2015, had clinical signs of an underlying genetic disorder, and remained without a diagnosis following focused biochemical and genetic investigations spanning an average of 3.3 ± 3.2 (range 0.1 to 16.9) years. All but three probands descended from Old Order Amish and Mennonite founder populations.12,22

The Lancaster General Hospital Institutional Review Board approved the study. Subjects (or their parents) had pretest counseling to explain goals, process, timing, and limitations of CMA and WES before consenting in writing to participate. Subjects could choose whether or not to receive medically actionable secondary findings that fit American College of Medical Genetics and Genomics (ACMG) guidelines,23 including pathogenic variants known to segregate with high frequency in Plain populations (e.g., APOB c.10580G>A; p.Arg3527Gln).24

Clinicians phenotyped each proband following a structured and standardized assessment guided by PhenoTips (https://phenotips.org)25 and using Human Phenotype Ontology (HPO) terms. The likelihood of a monogenic disorder was based primarily on conventional clinical indices such as abnormal brain size or morphology, developmental delay or regression, the presence of craniofacial/skeletal dysmorphisms, or characteristic end-organ pathology (e.g., hearing loss, vision impairment, or ) in the absence of environmental antecedents.7 The apparent inheritance of an autosomal recessive, dominant, or X-linked phenotype supported a genetic etiology in only 14 (17.7%) of 79 cases; the 65 remaining probands presented with a unique clinical phenotype in the context of an uninformative family history.

Prior to CMA and WES, most probands with developmental delay or neurological disease had additional analyte testing that could include, but was not limited to, plasma amino acids, acylcarnitines, lactate, ammonia, transferrin glycoforms, homocysteine, urine organic acids, purines and pyrimidines, creatine and guanidinoacetate, lysosomal storage markers, and cerebrospinal fluid glucose and neurotransmitters.26 The specific constellation of analyte tests for each proband was shaped by the clinical presentation, its attendant differential diagnosis, and the newborn screening history. In general, we took a parsimonious approach to metabolic analyte testing based upon its relatively low diagnostic yield (1–5%) in this clinical context.26–29

Several subjects with Rett- or fragile X–like phenotypes had targeted MECP2 and FMR1 testing, respectively, prior to CMA and WES. Finally, the phenotype of each proband was crossed against an existing panel of more than 200 known population-specific alleles detected by the CSC laboratory using high-resolution melt analysis or Sanger sequencing.12 Vetting the process in this way (Figure 1a), we ensured a high pretest probability of genetic illness while limiting representation of known recessive “founder alleles” among probands who advanced to WES; institutional knowledge allowed us to enrich for phenotypes caused by de novo, X-linked, compound heterozygous, and copy-number variants (CNVs).2,3,7

Sixty-eight (86%) probands had a 2.6-million marker high-density CMA (CytoScan HD Array, Affymetrix) to detect pathogenic CNVs to a resolution of between 25 kb (losses) and 50 kb (gains) using results from Affymetrix Analysis Suite software (ChAS 3.1) filtered against CNV data from more than 350 individuals of Amish and Mennonite descent. We investigated any deletion (regardless of size) that encompassed at least one exon of an OMIM gene and impacted at least three separate NspI fragments.

For probands with an uninformative high-density CMA, we proceeded to WES in collaboration with RGC. Briefly, 1 μg of high-quality genomic DNA was exome-captured using the NimbleGen VCRome SeqCap 2.1 reagent. Libraries were sequenced on the Illumina HiSeq 2500 platform using v4 chemistry, achieving coverage of > 85% of bases at 20x or greater. Raw sequence reads were mapped and aligned to the GRCh37/hg19 genome reference assembly using BWA/GATK bioinformatics algorithms (https://software.broadinstitute.org/). Called variants were assessed by standard metrics (read depth ≥ 10, genotype quality ≥ 30, allelic balance ≥ 20%), annotated for potential functional effects (e.g., synonymous, missense, frameshift, nonsense), and subsequently filtered by observed minor allele frequency ≤ 1% within public (1000 Genomes, ExAC, and NHLBI ESP6500), RGC internal, and CSC population-specific allele frequency databases.

The annotation process incorporated in silico predictions of functional effect (e.g., LRT, Polyphen2, SIFT, CADD, MutationTaster) and conservation scores based on multi-species alignment (GERP, PhyloP, PhastCons). Primary analyses were performed using RGC’s trio-based pipeline and further vetted through segregation analyses among available affected and unaffected family members. In the large majority of cases, we succeeded in generating WES data for all members (affected and unaffected) of the proband’s nuclear family and, when indicated, more distantly related individuals germane to the analysis. Classification of pathogenicity for candidate exome variants was based upon ACMG guidelines.30 Informative case results were restricted to “pathogenic” and “likely pathogenic” variants as judged by these criteria, whereas variants of unknown significance were deemed “open” cases. Prior to reporting, all copy-number and allelic variants were validated in CSC’s Clinical Laboratory Improvement Amendments–certified molecular laboratory.12,13
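The call-level and allele-frequency thresholds described in the Methods can be sketched as a simple filter. The following is a minimal Python illustration, not the RGC pipeline: the `Variant` record, its field names, and the helper functions are hypothetical, and only the numeric thresholds (read depth ≥ 10, genotype quality ≥ 30, allelic balance ≥ 20%, minor allele frequency ≤ 1% in each database consulted) come from the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Thresholds quoted in the Methods; everything else here is illustrative.
MIN_READ_DEPTH = 10
MIN_GENOTYPE_QUALITY = 30
MIN_ALLELIC_BALANCE = 0.20
MAX_MINOR_ALLELE_FREQ = 0.01


@dataclass
class Variant:
    """Hypothetical record for one called variant in one sample."""
    name: str
    read_depth: int
    genotype_quality: int
    alt_reads: int
    # Minor allele frequency per database, e.g. {"ExAC": 0.0004}.
    allele_freqs: Dict[str, float] = field(default_factory=dict)

    @property
    def allelic_balance(self) -> float:
        # Fraction of reads supporting the alternate allele.
        return self.alt_reads / self.read_depth if self.read_depth else 0.0


def passes_filters(v: Variant) -> bool:
    """True if the call meets the quality metrics and is rare in every database."""
    quality_ok = (
        v.read_depth >= MIN_READ_DEPTH
        and v.genotype_quality >= MIN_GENOTYPE_QUALITY
        and v.allelic_balance >= MIN_ALLELIC_BALANCE
    )
    # Assumption: a database with no entry for the variant imposes no constraint.
    rare_ok = all(f <= MAX_MINOR_ALLELE_FREQ for f in v.allele_freqs.values())
    return quality_ok and rare_ok


def filter_variants(variants: List[Variant]) -> List[Variant]:
    """Keep only the calls that pass both the quality and frequency filters."""
    return [v for v in variants if passes_filters(v)]
```

Requiring rarity in every database, including the population-specific one, reflects the text's point that a variant rare in public data may still be common among Plain founders; how missing database entries should be handled is a design choice the source does not specify.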


[Figure 1 flowcharts. (a) Phenotyping (history and physical, diagnostic studies, medical problem list, HPO terms); is genetic testing indicated?; targeted allele detection; chromosomal microarray; family exome analysis; follow-up of nondiagnostic cases (refine phenotype, recruit relatives, analyze segregation; dynamic reanalysis by gene matching, iterative literature review, additional samples, cellular/animal modeling, mRNA studies). (b) Probands, n = 79; molecular karyotype: diagnostic, n = 7; uninformative, n = 72; family exome: diagnostic, n = 32; likely diagnostic, n = 5; exclusionary, n = 5; yield 51%; “open”, n = 35; nondiagnostic, n = 30.]

Figure 1 Molecular diagnostic algorithm. (a) The standard diagnostic algorithm for each proband included population-specific allele detection, high-density chromosomal microarray (CMA), family whole-exome sequencing (WES), and posttest follow-up to refine phenotype, recruit additional relatives, and confirm segregation. (b) The pretest probability of Mendelian disease was high for 79 probands, 7 of whom had a pathogenic molecular karyotype by CMA. Seventy-two probands proceeded to WES, which allowed us to identify a definitive molecular diagnosis (n = 32; 44%) or find compelling evidence for a novel genetic mechanism (n = 5; 7%). In 5 (7%) additional subjects, “negative” WES reduced the likelihood of monogenic disease. Thirty cases remain “open” for iterative reanalysis. HPO, Human Phenotype Ontology.
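The percentages quoted in the Figure 1 caption follow directly from the cohort counts. The short Python sketch below recomputes them; the variable and function names are illustrative, and only the counts are taken from the caption.

```python
# Counts reported in the Figure 1 caption.
probands_total = 79
cma_diagnostic = 7
wes_definitive = 32
wes_novel = 5
wes_exclusionary = 5


def percent(part: int, whole: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


# Probands who advanced to WES after an uninformative CMA.
wes_probands = probands_total - cma_diagnostic
# Informative WES = definitive diagnosis + novel molecular etiology.
wes_informative = wes_definitive + wes_novel
# Cases left for iterative reanalysis.
open_cases = wes_probands - wes_informative - wes_exclusionary

print(wes_probands)                              # 72
print(percent(cma_diagnostic, probands_total))   # 9
print(percent(wes_definitive, wes_probands))     # 44
print(percent(wes_novel, wes_probands))          # 7
print(percent(wes_informative, wes_probands))    # 51
print(open_cases)                                # 30
```

Note that the CMA yield uses all 79 probands as its denominator, whereas the WES yields use the 72 probands who advanced to sequencing.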

We studied NIN constructs in human embryonic kidney epithelial cells (HEK-293T; ATCC) using a STAT phosphorylation assay (See Supplementary Methods).31

RESULTS

Study population and testing indications

The most common indications for genomic testing (Figure 1a) were central nervous system disease (64%), auditory or visual impairment (7%), neuromuscular weakness (6%), growth delay (5%), hepatopathy (4%), and skeletal dysplasia (4%). Among 52 probands with neurological disease, 85% had developmental delay characterized by diverse and overlapping phenotypes such as global developmental delay/intellectual disability (73%), motor disability with or without hypotonia (60%), executive dysfunction (44%), epilepsy (44%), autism (27%), extrapyramidal movement disorders (17%), and affective illness (15%). Nearly half of children who presented with developmental disability had abnormal brain size and/or morphology (microcephaly, 23%; macrocephaly, 12%; and/or cortical malformation, 13%). Prior to high-density CMA and WES, 61% of probands had between one and six (average two) uninformative targeted molecular tests and several were subjects of unsuccessful low-density autozygosity mapping.32

Molecular diagnoses

A pathogenic abnormality was identified by high-density CMA in 7 (9%) of 79 cases, including split-hand/split foot malformation with long bone deficiency-3 (MIM 246560), latent hereditary neuropathy with liability to pressure palsies (PMP22 deletion; MIM 162500), novel pathogenic CNVs in syndromic developmental delay accompanied by congenital heart disease, and atypical presentations of Angelman (MIM 105830) and Turner syndrome (Table 1, Figure 1b). One three-year-old boy (Proband 4) who presented with the classic cortical dysplasia-focal epilepsy syndrome (CDFES, MIM 610042) inherited one copy of the common Amish CNTNAP2 variant (c.3709delG) through the maternal line and a second pathogenic 37,556 bp deletion of CNTNAP2 (c.403_550del) from his Mennonite father.

Seventy-two probands advanced to “family” WES for phenotypes unique to the individual (n = 62), found among more than one sibling (n = 6), or segregating within a larger pedigree (n = 4) (Table 2). Family WES data were informative for 37 (51%) of 72 remaining individuals, yielding a definitive genetic diagnosis (n = 32, 44%) or suggesting a novel molecular etiology (n = 5, 7%) (Figure 2b). The diagnostic yield of WES was highest (71%) for the 14 probands


Table 1 Pathogenic copy number variants identified by molecular karyotype

Proband no. | Age (yr) | Phenotype | OMIM | Copy-number variant
1 | 1.0 | Angelman syndrome | 105830 | arr[hg19] 15q11.2q13.1(22,770,421-28,723,454)x1
2 | 14.9 | Hereditary neuropathy with liability to pressure palsies (HNPP) | 118220 | arr[hg19] 17p12(14,087,787-15,491,532)x1
3 | 0.4 | Turner syndrome | 313000 | arr(X)x1
4 | 3.0 | Cortical dysplasia focal epilepsy syndrome (CDFE) | 610042 | arr[hg19] 7q35(146,737,887-146,770,928)x1
5 | 0.1 | Split-hand/split-foot malformation with long bone deficiency-3 (SHFLD3) | 612576 | arr[hg19] 17p13.3(716,836-1,165,473)x3
6 | 4.6 | Syndromic global developmental delay, congenital heart disease | - | arr[hg19] 2q32.1q32.2(189,297,817-191,024,072)x1
7 | 6.6 | Syndromic global developmental delay, congenital heart disease | - | arr[hg19] 10q25.2q26.3(112,298,956-135,427,143)x3
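The copy-number variants in Table 1 are written in array nomenclature (genome build, cytoband region, coordinate range, copy number). As a rough illustration of how such entries can be read programmatically, the Python sketch below parses the segmental form used in the table. It is an assumption-laden sketch: the regular expression covers only the pattern seen in Table 1 rather than the full ISCN grammar, the names `Cnv` and `parse_arr` are hypothetical, and classifying gain or loss relative to two copies assumes an autosomal region.

```python
import re
from typing import NamedTuple, Optional

# Matches entries such as "arr[hg19] 17p12(14,087,787-15,491,532)x1".
# Whole-chromosome entries such as "arr(X)x1" are deliberately not handled.
_ARR = re.compile(
    r"arr\[(?P<build>[^\]]+)\]\s*"
    r"(?P<region>\w[\w.]*)"
    r"\((?P<start>[\d,]+)-(?P<end>[\d,]+)\)"
    r"x(?P<copies>\d+)"
)


class Cnv(NamedTuple):
    build: str
    region: str
    start: int
    end: int
    copies: int

    @property
    def span_bp(self) -> int:
        # Distance between the reported start and end coordinates.
        return self.end - self.start

    @property
    def kind(self) -> str:
        # Assumes a diploid (two-copy) baseline, i.e. an autosomal region.
        if self.copies < 2:
            return "loss"
        if self.copies > 2:
            return "gain"
        return "neutral"


def _to_int(text: str) -> int:
    """Convert a comma-grouped coordinate such as '14,087,787' to an int."""
    return int(text.replace(",", ""))


def parse_arr(text: str) -> Optional[Cnv]:
    """Parse one segmental array entry; return None if the pattern does not fit."""
    m = _ARR.fullmatch(text.strip())
    if m is None:
        return None
    return Cnv(
        build=m["build"],
        region=m["region"],
        start=_to_int(m["start"]),
        end=_to_int(m["end"]),
        copies=int(m["copies"]),
    )
```

For example, `parse_arr("arr[hg19] 17p12(14,087,787-15,491,532)x1")` yields a loss in region `17p12` on build `hg19`, while the Turner syndrome entry `arr(X)x1` returns `None` because it carries no coordinate range.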

who shared a phenotype with one or more related individuals in an apparently recessive, dominant, or X-linked segregation pattern. For 5 (7%) additional subjects with an ambiguous clinical phenotype (e.g., varicella encephalitis, transient hypercholanemia, transient glycogen hepatopathy, borderline QTc prolongation, extensive dental caries), negative WES results markedly reduced the likelihood of a genetic disease mechanism. We performed an average of 7 (range 3–17) exomes per proband (502 exomes for the cohort). When compared to trio analysis (proband and parents only), this inclusive strategy narrowed filtered candidate variants more than fourfold, from 22 ± 6 to 5 ± 3 alleles per proband (Figure 2a).

We identified cases of two Mendelian syndromes segregating in the same proband to produce a complex phenotype, consistent with recent reports of multilocus genomic variation.33 Proband 20 had de novo pathogenic variants in two genes (SHANK3 and TCF20) underlying a presentation of autism spectrum disorder, intellectual disability, and bipolar illness. A sibling pair (Proband 24) with skeletal dysplasia, scoliosis, and clubfoot shared homozygous pathogenic variants in two genes, SLC26A2 (diastrophic dysplasia, MIM 222600) and SH3TC2 (Charcot-Marie-Tooth type 4C, MIM 601596), segregating on the same haplotype (Figure 2c,d). Although diastrophic dysplasia dominated the clinical presentation, nerve conduction velocities subsequently revealed a motor neuropathy characteristic of CMT4C.

Novel, as yet provisional, gene-disease associations listed in Table 2 (Probands 40–44) include four autosomal recessive (CHD1, JKAMP, NIN, NUP188) and one de novo dominant (BMP2) phenotypes. Each allele in Table 2 represents the only compelling variant(s) to pass all filtering criteria and segregate appropriately within the family. However, each is classified as “uncertain significance” according to ACMG criteria,30 largely because such criteria do not accommodate novel gene discoveries or phenotypes that diverge from published reports (Table 2, Figure 3). Pathogenicity of BMP2 was first suspected by matching the “Amish” phenotype to unrelated non-Plain probands (https://genematcher.org) and corroborated by rephenotyping of all affected subjects (Figure 1b).

Our analyses were nondiagnostic in 30 (38%) cases, 18 of which were characterized by a short list of candidate alleles (in mostly uncharacterized genes) that could not be narrowed to a specific variant. For a number of such cases, bioinformatic analyses in conjunction with expression and literature investigations implicated a single candidate allele as pathogenic, but an association could not be firmly established without further evidence, such as in vitro functional data, animal models, or additional patients (Figure 1b).

Allele spectrum

We identified 37 pathogenic or likely pathogenic exome variants among 32 probands represented in Table 2 (Supplementary Table S1). Twenty-seven (84%) of these individuals presented with primary neurological disease, most commonly symptomatic epilepsy (n = 7), intellectual disability (n = 7), or syndromic global developmental delay (n = 6). Half the variants were missense changes, 27% were insertions or deletions leading to frameshift variants, 13% were nonsense, and 13% affected canonical splice sites. Inheritance was de novo dominant in 16 (50%) cases, autosomal recessive in 12 (38%; 9 homozygous, 3 compound heterozygous) cases, and X-linked recessive in 1 case. Dominant inheritance was observed for two probands within large multigenerational families segregating nonlesional generalized epilepsy; in one such pedigree, seizures were attributable to three variants in two different genes: SCN1B (MIM 604233) and NPRL3 (MIM 617118) (Figure 2e). We identified one putative case of germ-line mosaicism in which two siblings with Rubinstein–Taybi syndrome (MIM 180849) carried a variant of CREBBP that was not present in peripheral blood DNA of either biological parent.

Secondary findings

Among 502 subjects included for WES analysis, 490 (98%) elected to receive secondary ACMG findings. Twenty-one (4.2%) subjects harbored one of four known or likely

pathogenic variants in three genes: BRCA2 (c.5073dupA and c.7378_7379delAA), APOB (c.10580G>A), and DSC2 (c.1580_1583delTCAA); all opted to receive these results.

Table 2 Exome data verifying (n=32) or suggesting (n=5) a Mendelian disorder

Proband no. | Age at intake (yrs) | Configuration (type: affected; unaffected) | Samples post-exome (affected; unaffected) | Final diagnosis/phenotype | OMIM/status | Allele 1 | Zygosity | Allele 2 | Zygosity | Inheritance
8 | 5.8 | I:1;6 | - | Bethlem myopathy | 158810 | COL6A1 c.1056+1G>A | Het | - | - | DND
9 | 0.3 | I:1;4 | - | CHARGE syndrome | 214800 | CHD7 c.7957C>T, p.(Arg2653Ter) | Het | - | - | DND
10 | 49.8 | I:1;2 | - | Coffin-Siris syndrome 1 (CSS1) | 614562 | ARID1B c.5226_5229delAGAA, p.(Glu1743Alafs) | Het | - | - | DND
11 | 0.7 | I:1;3 | - | Cortical dysplasia (CDCBM1) | 614039 | TUBB3 c.317C>T, p.(Thr106Met) | Het | - | - | DND
12 | 6.2 | I:1;8 | - | Epileptic encephalopathy | 615369 | CHD2 c.4767_4768delCT, p.(Leu1591Aspfs) | Het | - | - | DND
13 | 1.1 | I:1;2 | - | Intellectual disability (MRD19) | 615075 | CTNNB1 c.998dupA, p.(Tyr333Ter) | Het | - | - | DND
14 | 8.9 | I:1;7 | - | Intellectual disability (MRD29) | 616078 | SETBP1 c.4309A>G, p.(Lys1437Glu) | Het | - | - | DND
15 | 0 | I:1;3 | - | Intellectual disability (MRD31) | 616158 | PURA c.697_699delTTC, p.(Phe233del) | Het | - | - | DND
16 | 3.4 | I:1;4 | - | Intellectual disability (MRD5) | 612621 | SYNGAP1 c.1526C>A, p.(Ala509Asp) | Het | - | - | DND
17 | 1.6 | I:1;4 | - | Intellectual disability (MRD5) | 612621 | SYNGAP1 c.3582+3A>C | Het | - | - | DND
18 | 12.9 | I:1;10 | - | Intellectual disability (MRD5) | 612621 | SYNGAP1 c.936_937insC, p.(Glu313Argfs) | Het | - | - | DND
19 | 0.1 | I:1;6 | - | Myhre syndrome | 139210 | SMAD4 c.1182_1197delAGGTGATGTTTGGGTC, p.(Asp396Alafs) | Het | - | - | DND
20 | 15.7 | I:1;8 | - | Phelan–McDermid syndrome | 606232 | SHANK3 c.5129G>A, p.(Leu1696Gln) | Het | TCF20 c.1151C>G, p.(Ser384Cys) | Het | DND
21 | 1 | I:1;6 | - | Pitt–Hopkins syndrome | 610954 | TCF4 c.2033G>A, p.(Arg412Gln) | Het | - | - | DND
22 | 1 | I:1;5 | - | Primary aldosteronism, seizures, neurologic abnormalities | 615474 | CACNA1D c.2245G>A, p.(Ala749Thr) | Het | - | - | DND
23 | 5.2 | I:1;10 | - | Rett syndrome | 312750 | MECP2 c.3G>A, p.(M1?) | Het | - | - | DND
24 | 0 | S:2;3 | - | Diastrophic dysplasia, Charcot-Marie-Tooth 4C (CMT4C) | 222600, 601596 | SLC26A2 c.835C>T, p.(Arg279Trp) | Hmz | SH3TC2 c.2860C>T, p.(Arg954Ter) | Hmz | AR
25 | 0.3 | I:1;5 | - | GM3 synthase deficiency | 609056 | ST3GAL5 c.862C>T, p.(Arg265Ter) | Hmz | - | - | AR
26 | 16.1 | S:2;2 | - | Intellectual disability (MRT34) | 614499 | CRADD c.382G>C, p.(Gly128Arg) | Hmz | - | - | AR
27 | 3 | I:1;5 | - | Myoclonic-astatic epilepsy | 616421 | CACNA1G c.6423_6424delCT, p.(Ser2142Tyrfs) | Hmz | - | - | AR
28 | 0 | S:2;3 | - | Neonatal inflammatory skin and bowel disease (NISBD2) | 616069 | EGFR c.560-2_565delAGGCCAAA | Hmz | - | - | AR
29 | 1.4 | I:1;4 | - | Non-syndromic sensorineural hearing loss | 611451 | LRTOMT c.95_108delGGACCATGTCCCCT, p.(Thr33Hisfs) | Hmz | - | - | AR
30 | 0.9 | I:1;4 | 2;4 | Poretti-Boltshauser syndrome | 615960 | LAMA1 c.8556+1G>T | Hmz | - | - | AR
31 | 19 | I:1;2 | - | Syndromic global developmental delay | - | ELP2 c.1385A>G, p.(Arg462Gln) | Hmz | - | - | AR
32 | 1.1 | I:1;7 | 3;2 | Syndromic global developmental delay | - | YARS c.499C>A, p.(Pro167Thr) | Hmz | - | - | AR
33 | 13.5 | I:1;3 | - | Dopa-responsive (Segawa) syndrome | 605407 | TH c.1083C>G, p.(His361Gln) | Het | TH c.1411G>T, p.(Ala471Ser) | Het | AR
34 | 0 | S:2;5 | - | Oculocutaneous albinism II (OCA2) | 203200 | OCA2 c.823A>G, p.(Thr275Ala) | Het | OCA2 c.2433G>T, p.(Arg811Ser) | Het | AR
35 | 5.9 | I:1;2 | - | Epilepsy with febrile seizures+ | 604233 | SCN1B c.305_313delAGGATCTGT, p.(Gln102_Ser105delinsPro) | Het | - | - | AD
36 | 9.5 | P:6;16 | - | Epilepsy with febrile seizures+ | 604233 | NPRL3 c.349_349delG, p.(Glu117Lysfs) | Het | - | - | AD
37 | 12.2 | P:5;5 | - | Epilepsy with febrile seizures+ | 604233 | SCN1B c.350G>A, p.(Gly117Asp) | Het | - | - | AD
38 | 11.6 | S:4;3 | 4;2 | X-linked Turner-type syndromic developmental delay | 300706 | HUWE1 c.12389G>A, p.(Arg4130Gln) | Hemi | - | - | XLR
39 | 6.6 | S:2;3 | - | Rubinstein–Taybi syndrome | 180849 | CREBBP c.1823+5G>A | Het | - | - | GLM
40 | 3 | I:1;6 | - | Microcephaly, vision impairment, absent corpus callosum, epilepsy | Provisional | CHD1 c.377C>A, p.(Ser126Tyr) | Het | CHD1 c.4681C>T, p.(His1561Tyr) | Het | AR
41 | 9.8 | S:2;3 | 1;0 | ID, epilepsy, OCD, autism, dysmorphic features | Provisional | JKAMP c.243_244dupG, p.(Lys81Glufs*16) | Hmz | - | - | AR
42 | 13 | S:3;2 | 1;7 | Progressive high-frequency hearing loss | Provisional | NIN c.4666C>T, p.(Gln1556Ter) | Hmz | - | - | AR
43 | 3.2 | S:2;6 | - | Motor delay, C1 dysplasia, hearing loss, tracheomalacia, cryptorchidism | Provisional | NUP188 c.313C>T, p.(Arg105Trp) | Het | NUP188 c.3429+5G>A | Het | AR
44 | 4.8 | I:1;3 | - | Growth failure, speech delay, hypotonia, motor apraxia, Ebstein anomaly | Provisional | BMP2 c.949dupC, p.(Tyr320Valfs*16) | Het | - | - | DND

AD, autosomal dominant; AR, autosomal recessive; DND, de novo dominant; GLM, germ-line mosaic; hemi, hemizygous; Het, heterozygous; hmz, homozygous; I, Individual; ID, intellectual disability; indel, insertion/deletion; OCD, obsessive-compulsive disorder; P, pedigree (multiple generations); S, sibship.

DISCUSSION

Those who stand to benefit most from genetic testing often have complex medical needs and experience their health care as expensive, fragmented, and confusing. As a corollary, referrals for WES are commonly rejected by insurance carriers2,21 and authorized samples are sometimes linked to incomplete or unreliable clinical data.1,3 Such prosaic problems reinforce healthcare disparities and also reduce diagnostic efficacy. In one study of 814 consecutive probands, WES had a diagnostic yield of 26%, but provided potential diagnoses for 228 (28%) additional probands. For the latter, promising variants were assigned “uncertain significance” pending further segregation (50%), phenotyping (25%), or CNV analyses (25%).

Integration of molecular methods into medical practice not only engenders better clinical outcomes but also improves laboratory performance.13 Nonprofit-industry collaboration allowed us to apply this principle to deep sequencing by incorporating a sophisticated genomic testing pipeline, optimized for performance, into the clinical workflow (Supplementary Figure S1). Our overall WES diagnostic yield across diverse phenotypes was 44–51%, approaching the theoretical yield (~50%) proposed for larger outbred cohorts2 and the observed (45–49%) yield among carefully selected children with neurological disease.34 Embedding this service in a community-based practice with clinical laboratory capability12 allowed us to fully interrogate the genomic data, rephenotype patients as needed, validate WES variants on-site, directly report clinically actionable results, and apply new molecular findings to population-based health initiatives.13

To test the broader applicability of this strategy, we took steps to attenuate inflated yield (i.e., > 70%) when WES is used as a first-tier diagnostic test for multiplex, consanguineous families.35 Within our Plain patients, “founder” alleles enriched by genetic drift manifest as more than 150 autosomal recessive and 25 autosomal dominant disorders.12,22 By recognizing and testing for these variants, we provide molecular diagnoses for more than 40% of probands after one office encounter, obviating their need for CMA or WES (Figure 1a).13 To further limit representation of homozygous recessive genotypes within the cohort, we selected study subjects who had unique clinical phenotypes, in many cases with uninformative homozygosity mapping results. These pre-WES procedures largely abrogated the impact of founder variants, as half the pathogenic alleles we discovered were de novo (Table 2), approximating what one expects to find in an outbred cohort.1,2

In complex clinical contexts, WES data clarify the relative contribution of genetic versus environmental factors and can deliver unexpected results. In some cases, WES data reveal digenic or multigenic interactions (e.g., Table 2, Probands 20 and 24) and in others, are informative only when combined with CMA results (e.g., Table 1, Proband 4). Five probands underscore the phenotypic overlap that often exists between genetic (e.g., neonatal rigidity and multifocal seizure syndrome; MIM 614498)32 and nongenetic (e.g., congenital viral encephalitis) afflictions. In such cases, “negative” WES data reduce the likelihood of a genetic disease mechanism and can critically inform clinical management, whereas “positive” WES data can challenge tacit assumptions about environmental pathogenesis, as we found in one sibship (Proband 16, Figure 2f) affected by both maternal phenylketonuria and SYNGAP1 haploinsufficiency (MIM 612621).

Full genetic ascertainment can reveal surprising complexity at the root of seemingly simple diagnostic problems. Such was the case for a nonlesional epilepsy phenotype segregating through a 38-member Mennonite pedigree (Figure 2e), in which we expected to find a single dominant risk allele. Instead, we identified three different pathogenic epilepsy variants in two epilepsy-associated genes (SCN1B and NPRL3). The relatively low observed penetrance (40–45% as compared to an expected value of ~70%) is noteworthy, but true penetrance may prove higher if currently asymptomatic individuals develop new seizure onsets over time or systemic electroencephalography (not done) reveals epileptiform cortical signatures in otherwise asymptomatic individuals. Such cases provide a potentially informative platform for discovering loci that modify disease expression, providing a fruitful area for future study.

A financial calculation invariably weighs on the use of new technologies and, without better value accounting, constrains the use of WES in clinical practice. In a recent study of 2,000 probands,3 WES was performed at the discretion of the referring physician unless denied by an insurance carrier. Pre-authorization for WES is required by more than 80% of US insurance carriers, who may ultimately fail to reimburse as many as 50% of completed studies.21 As with other measures of health care, this “reimbursement wall” stands as a principal determinant of disparate access to genetic testing.

This study was enabled by a nonprofit–industry collaboration that posed opportunities as well as challenges (Supplementary Figure S1). The final decision to enter into partnership was reached after careful negotiations to insure CSC’s clinical and operational autonomy, shared ownership of data, stringent protection of patient privacy, and unanimous acceptance by the CSC’s nonprofit Board of Directors, most of whom are leaders within Old Order communities. Adult members of the Plain community tend to be entrepreneurial and exceptionally pragmatic, and generally embrace creative forms of collaboration that allow their people to flourish.13 The overall success of the partnership has engendered strong ongoing community support for collaboration, which should enable us prospectively to perform WES on each proband for whom it is indicated.

Growing evidence supports the economy of this approach. Within the US healthcare system, standard evaluation of a child with neurodevelopmental disability costs an average of US$19,000 (range $9,000 to $35,000)9,13,36 for testing, not

including professional fees or other indirect institutional expenses.9,36 This approach, which does not encompass WES,7, provides a genetic diagnosis in about one third of cases. By comparison, first-tier WES for children with neurodevelopmental disorders yields a molecular diagnostic rate of 40–60%5,9,34 for an average $1,920 (range $1,170

Figure 2 panel labels: (a) Variants per proband, TRIO versus FAMILY analysis. (b) Outcome categories: monogenic, known; monogenic, novel; genetic unlikely; pathogenic CNV; candidate list; uninformative. (d) Gene map of the region containing SH3TC2 (c.2860C>T, p.Arg954Ter) and SLC26A2 (c.835C>T, p.Arg279Trp). (e) Pedigree, generations I to III. (f) Plasma phe (µM) versus maternal age (years).


Figure 3 panel labels: (a) Pedigree with NIN c.4666C>T genotypes (M/M, M/+, +/+). (b) Hearing level (dB) versus age in years, at 250 Hz and 4000 Hz. (c) Immunoblots (IB) for pSTAT1, STAT1, pSTAT3, and STAT3 with (+IFN-γ) and without (−IFN-γ) interferon-γ; pSTAT1/STAT1 and pSTAT3/STAT3 fold-increase for vector, WT, p.Q1556*, aa 1179–1931, and empty vector. (d) 24-kHz evoked ABR threshold (dB SPL) for female and male mutant and wild-type (WT) mice, with means.

Figure 3 Novel disease gene discovery. (a), (b) We documented progressive, high-frequency sensorineural hearing loss in Proband 42 and three of her siblings, who shared a homozygous nonsense variant of NIN (c.4666C>T; p.Gln1556Ter). Segregation of the wild-type (+) and c.4666C>T (M) alleles is shown. (b) Characteristic audiogram data for two subjects (circle, square), showing selective insensitivity to high (4000 Hz, red symbols) versus low (250 Hz, gray symbols) frequencies (dotted line represents normal hearing level). Missense alterations of NIN were previously associated with microcephalic primordial dwarfism (MIM 210600) and spondyloepimetaphyseal dysplasia (MIM 603546), neither of which was observed in our subjects. (c) Ninein inhibits JAK2/STAT signaling through its C-terminus, displayed here by a decrease in STAT1 and STAT3 phosphorylation upon overexpression of WT Ninein and of C-terminal Ninein amino acids 1179–1931 in HEK-293T cells. Ninein p.Gln1556Ter was 2.8-fold and 1.5-fold less effective in inhibiting STAT1 and STAT3 phosphorylation, respectively (N = 3), suggesting constitutive upregulation of JAK2/STAT signaling. Inhibition of JAK2/STAT3 signaling attenuates noise-induced hearing loss in mice. (d) Nin-/- mice (yellow) have isolated high-frequency sensorineural hearing loss and elevated auditory brainstem response (ABR) thresholds at 18 and 24 kHz (Figure 3d from RIKEN BioResource Center: http://www.mousephenotype.org/phenoview/?gid=8293&qeid=IMPC_ABR_010_001).

Figure 2 Whole-exome sequencing (WES) results. (a) We generated an average of 7 (range 3–17) exomes per proband (white circles). Compared to trio analysis, this strategy reduced the average number of filtered candidate variants from 22 ± 6 (gray) to 5 ± 3 (red). (b) Overall results of diagnostic evaluation for 79 subjects. (c), (d) Proband 24 and her younger brother had cleft palate, clubfeet, early-onset scoliosis, short stature, and skeletal dysplasia. (c) Anterior-posterior radiograph shows a severe scoliotic angle (46.1 degrees, yellow dotted line) and abnormal morphology of the proximal femurs (yellow arrows). Family WES data showed the siblings to be homozygous for two pathogenic variants: one for Charcot-Marie-Tooth type 4C (SH3TC2 [c.2860C>T; p.Arg954Ter]; CMT4C, MIM 601596) and another for diastrophic dysplasia (SLC26A2 [c.835C>T; p.Arg279Trp]; DTD, MIM 222600), in linkage disequilibrium on a 953-kb haplotype. A digenic mechanism explained the unusually severe course of scoliosis, a manifestation of both DTD and CMT4C, and unmasked a demyelinating sensorimotor neuropathy confirmed by nerve conduction velocity testing. (e) We generated WES data for 38 members of a three-generation Mennonite pedigree segregating nonlesional epilepsy, expecting to find a single dominant risk allele. Instead, we identified three different pathogenic variants in two epilepsy-associated genes: a de novo deletion in SCN1B (Proband 37 (yellow): c.305_313delAGGATCTGT; p.Q102PdelDLS), a missense variant of SCN1B (Proband 35: c.350G>A; p.G117D) segregating in a dominant fashion (MIM 604233; blue), and a dominantly segregating frameshift deletion (red) in NPRL3 (Proband 36: c.349_349delG; p.E117Kfs*5; MIM 617118). Color-filled symbols represent affected variant carriers, while open symbols represent unaffected variant carriers. Both dominant variants were incompletely penetrant, further complicating the clinical picture.
(f) In an Amish sibship, the eldest child (Proband 16, red circle) presented with intellectual disability, hyperactivity, inattention, and epilepsy. Her three younger siblings had a similar behavioral phenotype but without epilepsy. The mother had classical phenylketonuria, and records revealed teratogenic (red dotted line) maternal phenylalanine levels (white circles) during all four pregnancies. Family WES analysis identified a pathogenic de novo missense variant of SYNGAP1 (c.1526C>A; p.A509D) in the eldest proband, explaining her unique manifestation of epilepsy (MIM 612621), which does not commonly result from maternal hyperphenylalaninemia (diamonds). Note: light red, yellow, and green shading indicate the teratogenic potential of hyperphenylalaninemia as high, intermediate, or low, respectively, during serial phases of pregnancy.

to $3,150) per exome trio (based on 34 reporting labs at http://www.scienceexchange.com). Using this information to calculate a simple metric of value (i.e., favorable outcomes per dollar spent),37 we assign a theoretical genomic evaluation cost of $4,000 per study subject (to comprise costs of targeted allele detection, CMA, and 0–4 additional exomes per proband; Figure 1a) to return actionable information in at least 50% of cases. This strategy yields one molecular diagnosis per $8,000 spent, compared to one diagnosis per $60,000 via the standard approach.

The implication is clear: for select patients, a diagnostic method that prioritizes CMA and WES can be efficient and cost-effective in a variety of clinical contexts, provided cases are chosen carefully and executed systematically. Embedding this service within community-based practice further improves its value and aligns well with the World Health Organization’s call to implement genetics in underserved settings.17,38 We returned actionable secondary results to 21 subjects and, by designing rapid molecular tests for 17 (46%) alleles discovered by WES,13 created new opportunities for screening and prevention (Figure 3). We conclude that emerging genomic technologies, judiciously applied, can empower communities to curtail wasteful medical spending and improve population health.

SUPPLEMENTARY MATERIAL
Supplementary material is linked to the online version of the paper at http://www.nature.com/gim

ACKNOWLEDGMENTS
This work was supported in part by charitable contributions from Old Order Amish and Mennonite Communities of Pennsylvania and surrounding states. CMA analysis at CSC and functional studies performed by R.N.J. were supported in part by a grant to Franklin & Marshall College from the Howard Hughes Medical Institute through the Precollege and Undergraduate Science Education Program. The authors thank D. Holmes Morton, Zineb Ammous, Olivia Wenger, and James Deline for contributions to proband phenotyping and sample collection.

DISCLOSURE
C.G.-J., A.K.K., C.V.H., K.P., A.B., J.G.R., J.D.O., F.E.D, S.J.M., and A.R.S. are full-time employees of the Regeneron Genetics Center, Regeneron Pharmaceuticals, Inc., and receive stock options as part of their compensation. The other authors declare no conflicts of interest.

REFERENCES
1. Retterer K, Juusola J, Cho MT, et al. Clinical application of whole-exome sequencing across clinical indications. Genet Med 2016;18:696–704.
2. Lee H, Deignan JL, Dorrani N, et al. Clinical exome sequencing for genetic identification of rare Mendelian disorders. JAMA 2014;312:1880–1887.
3. Yang Y, Muzny DM, Xia F, et al. Molecular findings among patients referred for clinical whole-exome sequencing. JAMA 2014;312:1870–1879.
4. Yang Y, Muzny DM, Reid JG, et al. Clinical whole-exome sequencing for the diagnosis of mendelian disorders. N Engl J Med 2013;369:1502–1511.
5. Sharma P, Gupta N, Chowdhury MR, et al. Application of chromosomal microarrays in the evaluation of intellectual disability/global developmental delay patients - A study from a tertiary care genetic centre in India. Gene 2016;590:109–119.
6. Miller DT, Adam MP, Aradhya S, et al. Consensus statement: chromosomal microarray is a first-tier clinical diagnostic test for individuals with developmental disabilities or congenital anomalies. Am J Hum Genet 2010;86:749–764.
7. Moeschler JB, Shevell M, Committee on G. Comprehensive evaluation of the child with intellectual disability or global developmental delays. Pediatrics 2014;134:e903–918.
8. Rauch A, Wieczorek D, Graf E, et al. Range of genetic mutations associated with severe non-syndromic sporadic intellectual disability: an exome sequencing study. Lancet 2012;380:1674–1682.
9. Soden SE, Saunders CJ, Willig LK, et al. Effectiveness of exome and genome sequencing guided by acuity of illness for diagnosis of neurodevelopmental disorders. Sci Transl Med 2014;6:265ra168.
10. Fogel BL, Lee H, Deignan JL, et al. Exome sequencing in the clinical diagnosis of sporadic or familial cerebellar ataxia. JAMA Neurol 2014;71:1237–1246.
11. Cragun D, Bonner D, Kim J, et al. Factors associated with genetic counseling and BRCA testing in a population-based sample of young Black women with breast cancer. Breast Cancer Res Treat 2015;151:169–176.
12. Strauss KA, Puffenberger EG. Genetics, medicine, and the Plain people. Annu Rev Genomics Hum Genet 2009;10:513–536.
13. Strauss KA, Puffenberger EG, Morton DH. One community’s effort to control genetic disease. Am J Public Health 2012;102:1300–1306.
14. Noonan D. No insurance?: that’s a killer. Newsweek 2008;152:20.
15. Ku L, Patrick R, Ellen T, et al. Strengthening Primary Care to Bend the Cost Curve: The Expansion of Community Health Centers Through Health Reform, 2010.
16. Hawkins AK, Hayden MR. A grand challenge: providing benefits of clinical genetics to those in need. Genet Med 2011;13:197–200.
17. Tekola-Ayele F, Rotimi CN. Translational Genomics in Low- and Middle-Income Countries: Opportunities and Challenges. Public Health Genomics 2015;18:242–247.
18. NIH Prepares to Launch Precision Medicine Study. Cancer Discov 2016;6:938.
19. Shuldiner AR. An audience with: Alan Shuldiner. Nat Rev Drug Discov 2016;15:378–378.
20. Shapiro SD. The promise of precision medicine for health systems. Am J Health Syst Pharm 2016;73:1907–1908.
21. Lennerz JK, McLaughlin HM, Baron JM, et al. Health care infrastructure for financially sustainable clinical genomics. J Mol Diagn 2016;18:697–706.
22. Puffenberger EG. Genetic heritage of the Old Order Mennonites of southeastern Pennsylvania. Am J Med Genet C Semin Med Genet 2003;121C:18–31.
23. Green RC, Berg JS, Grody WW, et al. ACMG recommendations for reporting of incidental findings in clinical exome and genome sequencing. Genet Med 2013;15:565–574.
24. Shen H, Damcott CM, Rampersaud E, et al. Familial defective apolipoprotein B-100 and increased low-density lipoprotein cholesterol and coronary artery calcification in the Old Order Amish. Arch Intern Med 2010;170:1850–1855.
25. Girdea M, Dumitriu S, Fiume M, et al. PhenoTips: patient phenotyping software for clinical and research use. Hum Mutat 2013;34:1057–1065.
26. van Karnebeek CD, Shevell M, Zschocke J, Moeschler JB, Stockler S. The metabolic evaluation of the child with an intellectual developmental disorder: diagnostic algorithm for identification of treatable causes and new digital resource. Mol Genet Metab 2014;111:428–438.
27. Battaglia A, Bianchini E, Carey JC. Diagnostic yield of the comprehensive assessment of developmental delay/mental retardation in an institute of child neuropsychiatry. Am J Med Genet 1999;82:60–66.
28. Shevell M, Ashwal S, Donley D, et al. Practice parameter: evaluation of the child with global developmental delay: report of the Quality Standards Subcommittee of the American Academy of Neurology and The Practice Committee of the Child Neurology Society. Neurology 2003;60:367–380.
29. Michelson DJ, Shevell MI, Sherr EH, Moeschler JB, Gropman AL, Ashwal S. Evidence report: Genetic and metabolic testing on children with global developmental delay: report of the Quality Standards Subcommittee of the American Academy of Neurology and the Practice Committee of the Child Neurology Society. Neurology 2011;77:1629–1635.
30. Richards S, Aziz N, Bale S, et al. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet Med 2015;17:405–424.
31. Jay J, Hammer A, Nestor-Kalinoski A, Diakonova M. JAK2 tyrosine kinase phosphorylates and is negatively regulated by centrosomal Ninein. Mol Cell Biol 2015;35:111–131.
32. Puffenberger EG, Jinks RN, Sougnez C, et al. Genetic mapping and exome sequencing identify variants associated with five novel diseases. PLoS One 2012;7:e28936.
33. Posey JE, Harel T, Liu P, et al. Resolution of disease phenotypes resulting from multilocus genomic variation. N Engl J Med 2017;376:21–31.
34. Kuperberg M, Lev D, Blumkin L, et al. Utility of whole exome sequencing for genetic diagnosis of previously undiagnosed pediatric neurology patients. J Child Neurol 2016;31:1534–1539.
35. Charng WL, Karaca E, Coban Akdemir Z, et al. Exome sequencing in mostly consanguineous Arab families with neurologic disease provides a high potential molecular diagnosis rate. BMC Med Genomics 2016;9:42.
36. Joshi C, Kolbe DL, Mansilla MA, Mason SO, Smith RJ, Campbell CA. Reducing the cost of the diagnostic odyssey in early onset epileptic encephalopathies. Biomed Res Int 2016;2016:6421039.
37. Porter ME. What is value in health care? N Engl J Med 2010;363:2477–2481.
38. Kingsmore SF, Lantos JD, Dinwiddie DL, et al. Next-generation community genetics for low- and middle-income countries. Genome Med 2012;4:25.
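Worked example for the value comparison in the Discussion. The text compares one diagnosis per $8,000 (genomic strategy) with one per $60,000 (standard approach). The sketch below assumes the metric is simply cost per subject divided by diagnostic yield; that formula is an assumption rather than a calculation the text spells out, and the function name `cost_per_diagnosis` is hypothetical. The inputs are the figures quoted in the text: $4,000 at a yield of at least 50%, and an average of $19,000 at a yield of about one third.

```python
def cost_per_diagnosis(cost_per_subject: float, diagnostic_yield: float) -> float:
    """Expected spending per molecular diagnosis (assumed metric: cost / yield)."""
    if not 0 < diagnostic_yield <= 1:
        raise ValueError("diagnostic_yield must be in the interval (0, 1]")
    return cost_per_subject / diagnostic_yield


# Figures quoted in the Discussion; the pairing of cost and yield is our assumption.
genomic = cost_per_diagnosis(4_000, 0.50)     # targeted testing, CMA, and family WES
standard = cost_per_diagnosis(19_000, 1 / 3)  # standard evaluation without WES

print(f"Genomic strategy:  ${genomic:,.0f} per diagnosis")
print(f"Standard approach: ${standard:,.0f} per diagnosis")
```

Under these assumptions the genomic strategy comes to $8,000 per diagnosis and the standard approach to about $57,000, which is consistent with the rounded $60,000 quoted in the text. Because the 50% yield is stated as a lower bound, $8,000 would be an upper bound on the cost per diagnosis for the genomic strategy.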
