<<

© 2010 Nature America, Inc. All rights reserved. D.R.T. ([email protected]). School, Boston, USA. Massachusetts, Health Board, Wellington, New Zealand. 6 Genetics Services, Royal Children’s Hospital, Melbourne, Victoria, Australia. Boston, USA. Massachusetts, 1 ( I complex or of stability assembly, maturation proper the for required are that factors sory (7 mtDNA and complex 12 nuclear genes) and 6 acces nuclear-encoded the of subunits 19 include or These analysis mapping. linkage homozygosity sequencing, candidate by identified been nuclear defects by caused probably are rest the mtDNA, and the in to due are cases deficiency I complex isolated of 15–20% roughly that gest relies on assessment of biochemical biopsy material usually and heterogeneity genetic and clinical its given challenging acidosis lactic and drome, skeletal muscle , cardiomyopathy, , stroke, and shows a wide range of clinical manifestations, including Leigh syn births live 5,000 in 1 of incidence an has in collectively which disease, chain Defects membrane. respiratory human of type common inner most the are activity I complex mitochondrial pumping the while across ubiquinone to protons NADH from the electrons catalyzes of and chain ­transfer respiratory the to point entry main the is by both the nuclear and mitochondrial (mtDNA) genomes. Complex I encoded subunits 45 of composed machine macromolecular ~1-MDa large a is chain respiratory mitochondrial the of I Complex and FOXRED1 unsolved identify pools sequencing mitochondrial Discovering Mark J Daly C Michelle Redman Noel P Burtt E Sarah Calvo in High-throughput, sequencing identifies pooled mutations Nature Ge Nature Received 2 March; accepted 11 August; published online 5 September 2010; Department Department of Paediatrics and Child Health, University of Otago Wellington, Wellington, New Zealand. Center for Human Genetic Research, General Massachusetts Hospital, Boston, USA. Massachusetts, Twenty-five genes underlying human complex I deficiency have have deficiency I complex human underlying genes Twenty-five

experimental 1

NUBPL of . Complex I deficiency can present in infancy or early adulthood 1

DNA

cases 5 n can 1 1

and etics

the rare

7,8 1

from

3

and cause , using 3 , , Manuel Rivas

, though most of these mutations most of , remain unknown. these though 1 experimental molecular , , David R Thorburn

– variants

validation, 3

ADVANCE ONLINE PUBLICATION ONLINE ADVANCE

nuclear , a 10

complex confirmatory cohort , Elena , J Elena Tucker 3 2–4 and 3 , , Esko Wiltshire Broad Broad Institute of Harvard and MIT, Cambridge, USA. Massachusetts,

. The diagnosis of complex I deficiency is is deficiency of I complex diagnosis The . that upeetr Tbe 1 Table Supplementary

basis genes

of

can

10 validation I were 1

These These authors contributed equally to this work. should Correspondence be addressed to V.K.M. or ([email protected]) deficiency. 8

1 03 of National National Metabolic Service, Starship Children’s Hospital, Auckland, New Zealand.

be , that FOXRED1 3

experiments, , Candace , Guiducci Candace

mitochondrial

cases predicted

used

are 4 4

, to , 5

5 and to involved. & Vamsi K Mootha 6

,

10 Our , uncover 7

identify , Callum , J Callum Wilson , , Alison G Compton

42 to

5,6 study

including

affect healthy . . Estimates sug

). Many more more Many ). respiratory

We the

causal

illustrates

protein

report molecular

controls 3 in human complex I deficiency

cDNA , , Damien L Bruno

mutations 5 Department Department of Paediatrics, University of Melbourne, Melbourne, Victoria, Australia.

doi:10.1038/ng.65 chain

a

- - - function.

how strategy 1

8

complementation – and

, , David Altshuler basis approach to assess candidate gene across many individuals. individuals. many across exons gene candidate assess to approach sequencing a pooled we used Therefore, individual. a in single genes candidate 103 interrogate to needed that than greater far run, each in sequence of amount tremendous a yields technology sequencing and of evidence biochemical complex I deficiency. clinical Such massively parallel with individuals of cohort a in genes candidate 103 all ( complex I deficiency for genes human of candidate 103 set a targeted comprise bly factors, assem and subunits structural I complex known the with combined ited forms of complex I deficiency inher causing mutations harbor to shown been have which of three We generalized this method to identify 34 additional candidate genes previously used to identify the complex I assembly factor NDUFAF2 diction through phylogenetic profiling MitoCarta inventory of mitochondrial of complex I. To systematically predict such proteins, we combined the to reside in the and aid in the assembly and regulation I complex with genes known in mutations have individuals ­deficiency of half only that estimate that IV studies complex smaller the of assembly for necessary are factors ­assembly by probably as required, suggested the 20 3

disease 4

Recent technological advances technological Recent Additional proteins that are required for complex I activity are likely large-scale ,

10 then in

, , Denise M Kirby

of

We of individual

mitochondrial

focused performed is

established 9

challenging 4 4 2

Murdoch Murdoch Childrens Research Institute and Victorian Clinical sequencing, Department Department of Systems Biology, Harvard Medical School, , , Olga A Goldberger 7 Central Central Regional Genetics Service, Capital and Coast District

cases. candidate to

1 show deep , Supplementary TableSupplementary 1 3

genetic ,

9

complex given , , Stacey B Gabriel

4

sequencing that

, , Gabriel Crawford coupled

gene

the diagnoses

mutations 9 2 Department Department of Genetics, Harvard Medical

14,18,19 0 I

offer the prospect of sequencing sequencing of prospect the offer prediction, large disorders.

with 15,16

1 of . The remaining predictions, . predictions, The remaining

, number . Phylogenetic profiling was 2 1

10–13 , in functional

in

03 1

). 1

NUBPL

We 3 3

4 . candidate high-throughput , with functional pre

of

s e l c i t r A

of

created 3 60 9 ,

and by cohort cohort by and both

prediction previously and

genes

seven

factors factors

to

1 1

7 4  - - -

, .

© 2010 Nature America, Inc. All rights reserved. GC content ( content GC skewed to owing largely covered, poorly were regions target nuclear average an ( to individual per corresponding 168× pool, of per coverage median 3,359× at regions target nuclear our of 90% ( pool each for data quality Methods). Online lane (see per one pool with flowcell, Analyzer Genome Illumina a single using and sheared to construct libraries. The seven libraries were sequenced concatenated amounts, equimolar in combined were amplicons PCR successful 952 The bases. of targeted 97% captured successfully tions reac PCR Kb). (7 mtDNAregions two and Kb) (138 exons encoded to capture the 145 Kb of target sequence, which included 653 nuclear- amplification PCR For we 20 or performed pool, each 21 individuals. 2 from pools HapMap controls, with each containing pool DNA from and cases from pools We 5 DNA into European combined HapMap collection. the from subjects control healthy 42 sequenced also We established molecular diagnoses ( who as lacked a diagnosis well previous as molecular 43 controls with shown by biochemical assessment. The cohort included 60 individuals deficiency I ‘definite’, complex had isolated cases 103 of cohort Our Rare RESULTS deficiency. I complex with individuals 103 in sequenced genes candidate 103 the reflect to fibroblasts. subject in cDNA rescue approaches including variants of using genicity molecular prioritized pathogenic mutations, in all subjects. Finally, we confirmed the patho to type these newly discovered variants, as well as previously reported ( new pool each in detected variants and target depth, high selected to regions individuals, these sequenced ~20 regions, from DNA of pools created We 1 Figure s e l c i t r A  showed ihtruhu sqecn yedd ag aons f high- of amounts large yielded sequencing High-throughput Here, we report the results of our project, which we term ‘Mito10K’

variant

substantially higher substantially Schematic overview of the Mito10K project. Mito10K the of overview Schematic 2 previouslyundescribeddisease-relatedgenes: 5 previouslyundescribedmutationsinknowndisease-relatedgenes 60 withunknownmutations 43 withknownmutations (+ 42healthycontrols)

Supplementary Fig. 1 Fig. Supplementary discovery 13/60 (22%)individualswithnewgeneticdiagnosis 103 cases Fig. Fig. Illumina sequence(3,400X) Predict deleteriousvariants Rescue complexIdefectin

Identify SNVs/indels(898) Establish pathogenicity by Pool DNA(~20perpool) upeetr Fg 1 Fig. Supplementary PCR amplifytargets Genotyping phase Resequence indels 1 Discovery screen subject fibroblasts

Sequenom SNVs Supplementary Table 3 Table Supplementary pooled ). We then used genotyping technology technology genotyping used We ). then

coverage (10,144× median coverage). coverage). median (10,144× coverage Results Table X

sequencing ). The mtDNA target regions regions target mtDNA The ). ≥ 1 and 100× coverage and achieved achieved and coverage 100× 103 genes(145Kb) 81 nuclear(protein) 7 mtDNA(protein) 15 mtDNA(tRNA) Supplementary Table NUBPL ). Around 10% of of 10% Around ). , FOXRED1 ). We captured Wecaptured ).

2 ). - -

to bethesameas proband. a lactic acidosis,stroke-likeepisodes;VCFS, velo-cardio-facialsyndrome. LIMD, lethalinfantilemitochondrialdisease; MELAS,mitochondrialencephalopathy, Overall, we achieved 92% sensitivity and 99.6% specificity for for and specificity Methods with sites 99.6% DNA and nuclear at SNVs sensitivity control 92% achieved we Overall, controls HapMap and case our from available genotypes known ( variants. strand low-confidence each and high- on reads 3 least at by hoc ad ( pools case the in Note and communication, personal M.R., and (M.J.D. variants rare identify confidently to order in base at each rates error estimate Therefore, to we Syzygy empirically applied a called method present reads to makes in it alleles detect 1:40 . difficult the Illumina in individual of rate error 1% (indels) estimated The samples. variants pooled insertion/ small and (SNVs) ants samples. represented poorly some in discover even to variants us allowed individual. mtDNA of single coverage a deep the from came Nonetheless, mtDNA the of 96% example, for by ( introduced amplification biases whole-genome to owing primarily subjects, across tributed pooled the in mtDNA the However, c Fibroblast defect Family history Consanguinity T VCFS/DiGeorge plus Mitochondrial hepatopathy Mitochondrial cytopathy Mitochondrial myopathy MELAS LIMD Cardiomyopathy/ Other mitochondrialencephalopathy Clinical diagnosis T devastat and rare a ing underlie to likely are that those on attention ( fidelity high pools HapMap in counts frequencies ( read expected from with each estimated strongly within correlated frequencies mtDNA allele of minor The distribution pool. (32%) nonuniform controls the case to for owing sensitivity lower much but 100%, and respectively) (96% controls HapMap of and DNA genomic in sensitivity high specificity achieved we variants, mtDNA For singletons pool. for a 66% in and doubletons for 86% variants: nuclear rare ( variants 3 Fig. control allele HapMap high many for relatively the and frequency coverage sequence deep the to due Two subjectsrepresentprenataldiagnosesthatwereterminatedanddiagnosiswasassumed Complex Ienzymedefect presentinsubjectfibroblasts. otal R able able 1 Next, we assessed the accuracy of these 898 variants using using variants 898 these of accuracy the assessed we Next, vari nucleotide single low-frequency identify to aimed Wenext Next we prioritized the 898 discovered variants to focus our our focus to variants discovered 898 the prioritized we Next 2

= 0.96), indicating that the pooled sequencing protocol had had protocol sequencing pooled the that indicating 0.96), = ( phenotype ). Using this method, we detected 652 high 652 we detected ). Using method, this ). However, as expected, we achieved lower sensitivity for for sensitivity lower achieved we expected, as However, ). approach to identify 246 low 246 identify to approach

C linical, linical, molecular and biochemical features of the cohort b : definite,possible c Supplementary Fig. 3 Fig. Supplementary (no.tested) Supplementary Table 3 Table Supplementary a Fig. 2 Fig. DVANCE ONLINE PUBLICATION ONLINE DVANCE Table Table b a Family historyconsistent withamitochondrialdisorder. ). Briefly, we filtered out: (i) variants that that variants (i) out: filtered we Briefly, ). 2 ). To improve To ). Supplementary Fig. 2 Fig. Supplementary 17 (20) mutations mtDNA

7, 9 - samples was not uniformly dis uniformly not was samples confidence variants supported supported variants confidence 25 11 ). 0 0 1 2 6 0 2 0 3 Table Table ). This high sensitivity is is sensitivity high This ).

sensitivity, we applied an an applied we sensitivity, Individuals with ≥ 100 reads (see Online Online (see reads 100 2 10 (15) mutations Nuclear ). We identified 898 898 identified We ).

- confidence variants variants confidence 9, 0 Nature Ge Nature 18 7 0 3 0 0 0 2 1 6 6 Supplementary Supplementary Supplementary Supplementary a

). In one pool, pool, one In ). 18 (32) mutations Unknown n 9, 9 60 12 13 15 etics 6 1 2 3 5 0 9 2

1 - - - .

© 2010 Nature America, Inc. All rights reserved. vertebrate exons) shown for training data: missense variants annotated as disease-associated in HGMD (red curve) or present in dbSNP128 (blue curve). curve). (blue dbSNP128 in present or curve) (red HGMD in disease-associated as annotated variants missense data: training for shown exons) vertebrate ( deleterious’. ‘likely considered positions splice indicates asterisk and threshold frequency indicates line Dashed rectangles). (black exons acceptor splice and donor splice nearest to relative ( consequences. deleterious predicted by categorized 2 Figure ( site conserved variant splice nuclear 1 out located 4 filtered bp into an intron and 1 and mtDNA variant missense at a poorly screen, discovery the in variants mtDNA 17 and nuclear 4 missed approach The controls. I nuclear causal the of ­variants (78%) and 7/25 (28%) of causal 18/23 mtDNA variants within captured our variants deleterious’ that variants deleterious’. ‘likely low-confidence deemed 107 were and 109 variants genotyping for prioritized high-confidence we filters, these Using Methods). ( data training of on based frequency mutations reduced a pathogenic had sites evo these as low with sites conservation, at lutionary variants missense out filtered we (HGMD) addition, Database Gene Human the in variants splice disease-associated 8,189 of data training using tions posi site 8 splice We sites. selected splice or tRNA to corresponded synonymous (ii) controls, HapMap on based dbSNP individuals, healthy in present were T Noncoding Synonymous Missense Nonsense mtDNA Coding indels UTR Synonymous Splice Missense Nonsense Nuclear DNA Variant type T Nature Ge Nature variants. deleterious’ ‘likely for required conservation minimum indicates line Dashed otal able able 2 Together, the discovery screen and stringent definition of ‘likely ‘likely of definition stringent and screen discovery the Together, a No. variants 2

200 400 600 2 Number Number of variants detected in pooled sequencing discovery screen Definition of ‘likely deleterious’ variants detected in pooled sequencing screen. ( screen. sequencing pooled in detected variants deleterious’ ‘likely of Definition , mtDB , 0 n confidence variants etics High- Supplementary Table 2 Supplementary

2 variants; and (iii) non-coding variants, unless they they unless variants, non-coding (iii) and variants; in subjects 3 Detected

and pilot data from the 1,000 genomes project; project; genomes 1,000 the from data pilot and 214 131 652 ADVANCE ONLINE PUBLICATION ONLINE ADVANCE confidence Less likelydeleterious Likely deleterious 85 37 92 78 variants 0 3 3 9 High-confidence variantcalls Found inhealthyindividuals Non-coding Synonymous Missense (notconserved Missense (conserved) Splice site Coding indel Nonsense Mitochondrial tRNA Known pathogenic Low-

deleterious Likely 109 14 28 60 2 0 0 3 0 0 2 c

) Histogram of conservation score (no. species with identical amino acid, out of 44 aligned aligned 44 of out acid, amino identical with species (no. score conservation acid amino of ) Histogram ). ) Validated b HGMD pathogenic splice variants 91 12 22 51 2 0 0 3 0 0 1 b Fig. 2 Fig. ) Histogram of known disease-associated splice variants, annotated in HGMD in annotated variants, splice disease-associated known of ) Histogram 30% 10% 15% 20% 25% 0% 5%

2 Donor 4 –2 ( c in subjects ; see Online Online see ; Fig. –1 Detected * HGMD splicevariants 246 5 4 3 2 1 71 33 40 97 Position relativetosplicedonor/acceptor

* complex

0 0 0 0 0 5 2 Low-confidence variantcalls * b ). In In ). *

- -

* deleterious – 6 genotypes showed an estimated 11% false positive rate for extremely rate for extremely positive false 11% an estimated showed genotypes Sequenom as sequencing, Sanger using high-confidence interest particular of SNVs additional 101 on ( genotyped based variants rate, validation 96% 60 undiagnosed cases for individuals harboring either known known same the in either alleles mutant harboring two or mutations individuals mtDNA for pathogenic our cases searched next undiagnosed we hand, 60 in data sequence Mito10K the With Prioritizing ancestry in ( differences to due be might enrichment this although in than in our variants controls, European cases of deleterious’ ‘likely in provided 12 and pathogenic variants missed in the screen). discovery data Detailed are low-confidence 12 high-confidence, (91 loci unique 115 to gene. the resequence fully to sequencing which heterozygous we variants of identified interest, we used Sanger ( variants rare Likely Supplementary Note Supplementary 107 16 86 In total, we validated 151 ‘likely deleterious’ variants corresponding ( n 0 0 0 0 0 0 0 5 = 8,189) – 6

– 5 – 4 a Validated * Supplementary Table Supplementary 2 – 3 ) Barplot of high-confidence and low-confidence variants, variants, low-confidence and high-confidence of ) Barplot

variants * – 2 12 upeetr Note Supplementary 0 0 0 0 0 0 0 2 9 1 * 1 1 Acceptor Supplementary Table 4 Supplementary 2%

for ). ple ( ple ously known disease variants, previ in as each well case as sam variants, deleterious’ ‘likely Our next goal was to genotype the discovered Genotyping ‘Less likely deleterious’ variants had a higher a had higher variants deleterious’ likely ‘Less confidence variants ( low- of 11% only expected, as and variants, variants, we validated 84% of high us to assign the variants to individuals. allowed itexample,mtDNA Third,variants). (for power of lack a to owing screen covery dis our in detected not were that deficiency I complex underlying mutations known for covery screen. Second, it enabled us to search newly identified variants from the pooled dis validate to necessary was it First, purposes. multiple served genotyping The Methods).

Of Of the newly discovered ‘likely deleterious’ complex Supplementary Table4 Supplementary c Normalized count 0.03 0.06 0

. . We a frequency higher detected I

1 0 rare ). In a subset of instances in in instances of subset a In ). deficiency Amino acidconservationscores Missense variants

variants Supplementary Table 2 0 dbSNP HGMD ). We further validated validated We ). further 2 ( s e l c i t r A 4 ( 3 0 n n , by position position , by = 31,831) = 62,957) and see Online Online see and 4 0 - confidence 0

4 ).  - - - -

© 2010 Nature America, Inc. All rights reserved. inherited from his father and mother, respectively. respectively. mother, RT-PCR indicated analyses that and both the and c.99-1G>A c.351-2A>G father his from inherited were which c.351-2A>G, mutation new a and mutation same c.99-1G>A the for heterozygous compound was DT107 individual lated (p.Lys154AsnfsX34) c.462delA mutations reported the for heterozygous compound were ( NDUFS4 below. controls noted as except HapMap sequenced, and cases other all from absent were mutations genes: disease NDUFAF2 I complex reported previously in MT-TW (in mutations mtDNA causal with individuals three the in detected variants of pathogenicity the Weassessed next Establishing variants deleterious’ ‘likely no with ( 27 and significance clinical 17 with heterozygous ‘likely deleterious’ nuclear variants of unknown deleterious’ mtDNA‘likely variants of clinical unknown ( genes Table disease candidate in ­mutations ( mutations reported and two undescribed five including genes, in disease known mutations recessive-type had subjects eight only and mutations Of after ascertained inheritance. be phasing. of only confirmatory can mode heterozygosity recessive compound a course, with consistent variants, compound and homozygous include which variants, ( gene nuclear a,b diagnoses. genetic established experimentally new indicate arrowheads Black subject. each in variants deleterious’ ‘likely containing genes list Boxes variants. deleterious’ ‘likely without individuals indicates gray and (VUS) significance uncertain of variants with individuals indicates blue variants, pathogenic with individuals indicates Red gene. per detected variants deleterious’ ‘likely of type by categorized diagnosis, genetic 3 Figure s e l c i t r A  alter mutations Table Table 2 Supplementary NDUFS4 MCCC2 MCCC2 C7orf10 DCI Nuclear VUS Pairs of affected siblings. affected of Pairs We identified one undescribed and two previously reported reported previously two and undescribed one identified We mtDNA pathogenic reported previously had subjects three Only 27

3

3 ). The remaining subjects included three individuals with with individuals three included subjects remaining The ).

and 2 Sixty individuals with complex I deficiency without a previous a previous without I deficiency complex with individuals Sixty mutations in three individuals with Leigh syndrome syndrome Leigh with individuals three in mutations 6 ) and the eight individuals with recessive-type variants variants recessive-type with individuals eight the and ) 1 NPL NIPSNAP3A NIPSNAP3A NDUFV3 7 , , 11 11 Supplementary Fig. 4 Fig. Supplementary NDUFV1 Fig. Fig. NDUFS4 genetic 3 3 0 ). We refer to the latter as ‘recessive-type’ ‘recessive-type’ as latter the to refer We ). and c.99-1G>A (p.Ser34IlefsX4) c.99-1G>A and PTCD1 PHYH OXCT1 OXCT1 Table Table 3 3 2 ).

and and diagnoses splicing. The heterozygous c.351-2A>G c.351-2A>G heterozygous The splicing. NDUFS5/NDUFS8 NDUFV3/AMACR 17 NDUFS4/MGST3 3 C7orf10/AMACR NDUFS8 ). Two subjects had recessive-type recessive-type had subjects Two ). 8 ). Two siblings, DT37 and DT38, and DT38, Two). DT37 siblings,

2 in 3

3 known 3 ( NUBPL Table ND4 ND1 ND1 mtDNA VUS NDUFS8 NDUFV1 NDUFAF2 NDUFAF NDUFAF2 NDUFS4 NDUFS4 NDUFS4 known diseasegene Recessive-type, FOXRED1 NUBPL candidate gene Recessive-type, MT-TW ND5 ND3 Pathogenic mtDNA

disease ND3 3 ). The discovered discovered The ). and and NDUFS4 a a 2 b b 2

heterozygous heterozygous 5 n silico In 1

, , significance, significance,

0 genes FOXRED1 ND5 . The unre The .

previously previously 10,27–31 2 6 and and and and - , ;

blot analysis, which indicates that the truncated protein products products protein unstable. are truncated the that indicates which analysis, blot All lacked anythree subjects NDUFAF2detectable protein by protein (p.Ala73GlyfsX5). protein a truncated encodes also that a transcript generates which 3 skipping, occasional in resulted 3) exon into 4 bp (located in DT16 mutation nonsense c.221G>A the In addition, that of Analysis cDNA (p.Ile35SerfsX17). from showed fibroblasts subject mutation c.103delA a homozygous harbored DT68, and DT67 lings, 250K Affymetrix by (determined homo of region 6.3-Mb a within (p.Trp74X) mutation c.221G>A Fig. 5 syndrome Leigh with individuals three gene. this in exist but founder mutations also that previously mayunrecognized several in mutations recurrent that only tion muta c.99-1G>A the of report second the is This protein. NDUFS4 on analysis detectable no showed DT107 and DT38 blot individuals from fibroblasts Protein unstable. was mRNA the that suggesting (CHX), cycloheximide without or with cDNA in it undetectable was however, DT107, from DNA genomic in detected was mutation ( 1–4 exons of spanning deletion ~240-Kb a including rearrangement on DNA from individual DT35. We detected a complex chromosomal analysis cytogenetic array-based of exon 2, Affymetrix we performed portion this involving deletion a transmitted have could mother the ( mutation the carry not did mother the mutation, this for heterozygous was ject’sfather by site TargetPpredicted ( cleavage sequence targeting mitochondrial the (p.Gly56Arg), from acids amino 18 arginine with residue glycine conserved highly a of substitution cause to predicted is mutation This sequenced. somes the 204 other subject chromosomes or the 84 HapMap control chromo in and was found to an carry apparent mutationhomozygous c.166G>A and deficiency: I complex to linked previously not genes two in mutations recessive-type discovered also we subjects, 60 our Within NUBPL carriers. heterozygous were parents both unaffected in whereas homozygous also disease was sibling with affected an family: segregated this mutation This (pfam00037). domain cluster iron-sulfur 4Fe-4S Fer4 conserved highly the within polarity alters and acid amino conserved highly a affects mutation ( encephalopathy chondrial with who mito DT61, presented in a subject, Sudanese p.Gly154Ser) species. eukaryotic across conserved highly is which (pfam10589), site binding sulfur iron the for motif consensus the in residue charged positively a introduces mutation 6 Fig. lethal infantile mitochondrial (LIMD) disease ( with presented who DT3, individual, Lebanese sanguineous 250K Affymetrix by (determined homo zygosity of region 2.1-Mb a in p.Glu377Lys) (c.1129G>A, mutation Supplementary Fig. 8 Fig. Supplementary NUBPL We also identified new homozygous mutations in in mutations homozygous new identified also We Subject DT35 presented with mitochondrial encephalomyopathy encephalomyopathy mitochondrial with presented DT35 Subject homozygous new a Weidentified homozygous undescribed previously a identified We NUBPL 1 FOXRED1 NDUFAF2 0 ). ). A DT16, individual, consanguineous harbored a homozygous and the third of the c.462delA mutation c.462delA the of third the and ). Both unaffected parents were heterozygous carriers. This This carriers. heterozygous were parents unaffected Both ).

and ( and a ~130-Kb duplication involving exon 7 of of 7 exon involving duplication ~130-Kb a and Supplementary Supplementary Fig. 8

FOXRED1 . transcripts containing these mutations were stable. stable. were mutations these containing transcripts a DVANCE ONLINE PUBLICATION ONLINE DVANCE Supplementary Fig. 8 Fig. Supplementary ). Next, we assessed assessed we Next, ).

in Supplementary Fig. 8 Supplementary

Table complex ). ). We did not detect this mutation in NDUFS4 3 and NDUFS8

I

(Table deficiency Supplementary Fig. 7 Supplementary Table underlie Leigh syndrome syndrome Leigh underlie Nsp Nsp ). To determine whether whether Todetermine ). NUBPL mutation (c.460G>A, (c.460G>A, mutation 3 3 SNP chip). Twosib chip). SNP SNP chip) in a con a in chip) SNP and

and 28,30 ). Although the sub ). Although Nature Ge Nature , suggesting not not suggesting , Supplementary Supplementary Supplementary Supplementary mRNA species species mRNA NDUFAF2 NDUFV1

binding binding NUBPL NUBPL n ). ). This etics in in - - - - - ­ ­ ­

© 2010 Nature America, Inc. All rights reserved. when assayed by spectrophometric assay and 40% residual residual 40% and assay enzyme spectrophometric by assayed when activity I complex residual 19% only with defect, I complex strong a in the defect complex I activity. Fibroblasts show from this individual rescued fibroblasts subject cDNA into wild-type of introduction the skipping. 10 exon causes probably that mutation that harbors both a p.Gly56Arg missense allele mutation second and a a and c.815-27T>C 1–4 exons spans that deletion a harboring allele one contains DT35 Thus, ancestry. European of individuals sequence. This mutation branch was present in 2 out of 232 consensus control chromosomes from a ablate to predicted is that mutation Wecoverage). sequence found a high-throughput c.815-27T>C poor previous of area (an regions intronic flanking the and 10 exon of ing sequenc Sanger performed we skipping, 10 exon of cause the mine allele. To allele. deter of maternal the paternal of expression no evidence was There the was it that suggesting mutation, c.166G>A the shorter fragment resulted from exon 10 skipping, and that it contained ( fragment a shorter was mRNA species predominant the and transcript, length of full- the low expression RT-PCR very DT35. showed in individual DT67 DT107 DT38 DT37 DT20 DT55 DT58 Subject T Nature Ge Nature a Using assay. enzyme dipstick by assayed when activity I complex DT35 DT61 DT3 DT16 DT68 a,b 250K observed insubjectfibroblastcDNAwithorwithoutCHX;conservation,aminoacidconserved protein, bySDS-PAGE andproteinblot;seg.,variantsegregateswithdiseaseinfamily;reseq.,confirmedbySangersequencingofgenomicDNA;splice,splicingdefect het., heterozygous/heteroplasmic;cmpd.compoundheterozygous;rescue,pathogenicityconfirmedbyrescueofcomplexIdefectinsubjectfibroblasts;NDP, nodetectable Bold indicateslikelycausalvariants.Mt.enc.,mitochondrialencephalopathy;LS,Leighsyndrome;LIMD,lethalinfantiledisease;hom.,homozygous/homoplasmic; DT22 able able 3 Affected siblingpairs. We performed a complementation experiment to assess whether whether assess to experiment complementation a Weperformed b b a a Nsp

SNPchip. LS LS LS LS LIMD LS Mt. enc. diagnosis Clinical LS Mt. enc. Mt. enc. LIMD LS LS New New genetic diagnoses for cases of complex n upeetr Fg 8 Fig. Supplementary etics

Firm ( diagnosis Genetic Firm ( Firm ( Probable ( Probable ( Firm ( Firm ( Firm ( Firm ( Firm ( Firm ( Firm ( Firm (

c Novel variant,notpreviouslyreported. ADVANCE ONLINE PUBLICATION ONLINE ADVANCE NDUFAF2 NDUFS4 NDUFS4 NDUFS4 MT-TW ND5 ND3 FOXRED1 NUBPL NDUFAF2 NDUFAF2

NDUFS8 NDUFV1 het.) het.) hom.) cmpd.het. cmpd.het. cmpd.het.) cmpd.het.) cmpd.het. hom. hom. hom. hom. hom. ). Sequencing revealed that the the that revealed Sequencing ). c c c ) ) ) c c c ) ) ) c ) c ) NDUFS8 NDUFV1 NDUFAF2 p. NDUFAF2 p. NDUFAF2 DCI ND5 ND2 MT-TW Homozygous variants I I le35 le35 :c.392T>C,p.Leu131Pro :m.13676A>G,p.Asn447Ser :m.4890A>G,p.Ile141Val, :m.5567 S S erfsX17 erfsX17 :c.1129G>A,p.Glu377 :c.460G>A,p.Gly154 :c.221G>A,p. :c.103delA, :c.103delA,

I deficiency T c c > C ,

NUBPL

T rp74X - - S

c er L ys c low of area an in HapMap was mutation 84 or c.1289A>G The chromosomes chromosomes. control case other 204 in detected and not screen was discovery the in detected was mutation c.694C>T The (p.Gln232X) and ( (p.Asn430Ser) c.1289A>G in mutations two for heterozygous compound disease. cause to mutation branch-site the with synergy in mutation p.Gly56Arg might missense abolish patho be the Alternatively, as in DT35. a allele, with null may inherited when genic mutation this However, functionality. NUBPL for full-length sufficient generates homozygous if that allele, pseudo-deficiency a be may mutation site branch 27T>C c.815- the controls, in prevalence to its Owing mutations. individual of pathogenicity the established not have we subject, this in ciency case. this in gene causal the who harbored DT22 subject from not but DT35 subject from fibroblasts in activity of wild-type cDNA. Expression wild-type with fibroblasts subject transduced we system, expression lentiviral c C20orf7 GPAM GPAM 1G>A,p. NDUFS4 c.392T>C,p.Leu131Pro GAD1 NDUFS4 NDUFS4 Glu330Asp NDUFS2 NDUFS4 NDUFS4 TMEM22 C2orf56 ND5 ND3 Heterozygous variants NIPSNAP1 FOXRED1 FOXRED1 NDUFB9 (31,345,080_31,350,225)dup chr14g.(31,211,800_31,212,780)_ (31,193,278_31,194,846)del [chr14:g.(30,932,976_30,953,766)_ 815-27 NUBPL NDUFV3 Subject DT22 presented with Leigh syndrome and syndrome was with DT22 Leigh found presented Subject to be that shown have we Although

≥ coverage but was subsequently identified by Sanger sequencing sequencing Sanger by identified subsequently was but coverage 30/44 vertebratespecies;250KSNP, regionofhomozygosityfromAffymetrix :m.10197G>A,p.Ala47 :m.13094 :c.990A>T,p.Glu330Asp :c.1340C>T,p.Thr447Met :c.1340C>T,p.Thr447Met :[c.166G>A,p.Gly56Arg T :c.412G>A,p.Val138Ile :c.208C>G,p.Pro70Ala S :c.826G>A,p.Glu276Lys :c.351-2A>G :c.99-1G>A,p. :c.462delA,p. :c.96-3C>T, :c.99-1G :c.462delA,p. > :c.290A>G,p.YTyr97Cys :c.500G>A,p.Arg167Gln er34 :c.1289A>G,p.Asn430 :c.694 C :c.215A>G,p.Tyr72Cys ,p.Asp273GlnfsX31 I FOXRED1 T lefsX4 > C C > > ,p.Val253Ala A,p. T ,p.Gln232X GAD1 c L L , S S ys154AsnfsX34, ys154AsnfsX34, NDUFS4 er34 er34 mutations ( :c.990A>T,p. T hr I I lefsX4, lefsX4, c DCI c c c + , c + ], ], S , :c.99-

er

: c ,

NUBPL Fig. Fig. 4 250K SNP, reseq.,conservedin NDP, 250KSNP, reseq.,splice NDP, reseq.,splice,conservation NDP, reseq,splice,conservation reseq., splice,conservation,NDP Known diseasevariant Reseq., splice,NDP Known diseasevariants Reseq., splice Known diseasevariants and fibroblasts homoplasmic inblood,muscle,liver Known diseasevariant mutant loadinmuscle Known diseasevariant mutant loadinblood Known diseasevariant Supporting evidence Rescue, reseq.,conservation,splice Rescue, reseq.,conservation,splice domain Seg., reseq.,conservationinFer4 NADH 4Fe-4Sdomain underlies complex I defi I complex underlies NUBPL a ), establishing ), establishing Supplementary Fig. Supplementary NUBPL FOXRED1 NUBPL s e l c i t r A rescued complex I complex rescued function or function act 1 2 2 2 , c.694C>T c.694C>T , transcript transcript 10,30 10,30 0 6 6 5 NUBPL , seg., , 100% , ~60% , ~90% , ,

9 as ). ).  - - © 2010 Nature America, Inc. All rights reserved. tRNA mutations required for mtDNA , 4% are in other other in are 4% translation, mtDNA for required mutations tRNA (including factors assembly I complex established in are 6% subunits, structural I complex in are 33% mutations, these (1%; mutations X-linked and (22%) mutations sive-type reces (29%), mutations mtDNA to due diagnoses including noses, diag genetic firm have now them of 52% individuals; unrelated 94 includes subjects 103 of cohort Our date. to deficiency I complex of our at diagnostic laboratory,seen provide the largest systematic deficiency sequencing study I complex isolated definite with individuals other 43 all of diagnosis molecular previous the to addition in here, and for studies validation 60 discovery reported cases The large-scale Mutational respectively. DT22, and DT35 individuals in genes disease-related I that evidence provide cDNA, to cell line this ( and was specific rescue this cDNA using lentiviral-mediated with rescue the wild-type stick enzyme assay. We were able to rescue the defect in these fibroblastsenzyme assay and 15% residual complex I activity when assayed by dip only 9% residual complex I activity when with assayed bydefect, spectrophometric I complex striking a show subject this from Fibroblasts of role the assess to fibroblasts ject residues internal ( 40 lack to predicted is that transcript a in results which also c.694C>T), cDNA (containing 6 exon of subject skipping of occasional shows analysis RT-PCR genotyping. for available (p.Asn430Ser; ine ser a with residue asparagine conserved cause highly a of to substitution the predicted is and mother, subject’s the from inherited was compound ( with heterozygosity consistent species, predominant the as mutation c.1289A>G the containing transcript the leaving undetectable, was of CHX the transcript containing the c.694C>T (p.Gln232X) mutationabsence the in However, present. were decay mutations both that showed nonsense-mediated inhibit to CHX with treated fibroblasts European ancestry screened by RFLP analysis. Analysis of cDNA from of replicates wild-type wild-type with transduction after and before fibroblasts, subject and control in measured (CIV), activity IV complex by ( fibroblasts. subject 4 Figure s e l c i t r A  C10orf2, and POLG proteins replication (mtDNA factors auxiliary Supplementary Fig. 9 Fig. Supplementary a NUBPL Together, the mutation data and complementation experiments experiments complementation and data mutation the Together, sub in experiment complementation a performed we above, As FOXRED1 Relative CI:CIV 100 120 CIV 20 40 60 80 CI 0

NUBPL FOXRED1-V5 Control + – ± + – s.e.m. * s.e.m. NUBPL NUBPL

spectrum and was not present in 102 control chromosomes of of chromosomes control 102 in present not was and and and Supplementary Fig. 9 Fig. Supplementary + – DT35 P a Supplementary Fig. 9 Fig. Supplementary < 0.01. Representative dipstick assays shown below. shown assays dipstick Representative < 0.01. , FOXRED1 * b mRNA ( mRNA ) Barplots show complex I activity (CI), normalized normalized (CI), I activity complex show ) Barplots NUBPL

). of

complex DT22 + – b cDNA rescue of complex I defects in I defects complex of rescue cDNA ). Data shown are mean of three biological biological of three mean are shown ). Data and and FOXRED1 FOXRED1 FOXRED1

b I

Relative CI:CIV deficiency 100 120 CIV 20 40 60 80 ). The c.1289A>G mutation mutation c.1289A>G The ). CI 0 NUBPL-V5 ). Paternal DNA was not not was DNA Paternal ). Control + – + – are bona fide complex complex fide bona are FOXRED1 FOXRED1 in complex I activity. I complex in Fig. Fig. 4 mRNA ( mRNA NUBPL + – DT35 b Fig. ). FOXRED1 a ), 7% are are 7% ), ) ) or + – DT22

5 *

). Of Of ). - - - - -

species. However, further analysis indicated that this individual was was individual this that indicated analysis However,further species. an amino acid that has been across conserved all 36 aligned vertebrate in mutation missense an p.Gly56Arg homozygous revealed apparent individual this from DNA of Sequencing controls). to relative respectively, activity, normalized 19% and (37% deficiency I complex marked showed fibroblasts skin and biopsy Muscle tion). CSF lactate (see 2 years of age with developmental delay, and elevated at presented who male a deficiency, I complex with individual an in mal mitochondrial morphology arm of complexcomplex andperipheral I abnorI, activity decreased the of assembly improper causes knockdown its and subunits, I plex com into clusters Fe/S of incorporation the for essential is NUBPL complex I and approach. We prioritized candidate genes on the basis of functional functional of basis the on genes candidate We prioritized approach. disease. of cases sporadic individual, to applicable readily be not same phenotype with the mutations regions different in individuals these unrelated by identifying pathogenicity of and regions have interest, established Mendelian disease by using multiple affected for individuals to variants hone in causal on detected have projects sequencing whole-exome variants rare protein-modifying 400–500 mated esti an carries person each genome, the of portion coding protein the Even within individuals. between differences sequence of benign plethora the from alleles pathogenic distinguish to be moving will genetics forward human of cases. challenge important individual in most the even Perhaps disease of basis genetic the solve to opportu nity new a offer technology sequencing genome in Advances DISCUSSION deficiency. I complex of unique mutations47 in 20 to genes, highlighting the allelic and correspond heterogeneity cohort our in diagnoses genetic new and 1% gene are ( in an uncharacterized membrane) inner mitochondrial the within pools cardiolipin maintaining by stability I complex for TAZ required the protein and selected as all unrelated individuals within the 103 individuals sequenced. the 103 individuals within individuals as all unrelated selected cohort, are a representative Subjects diagnosis. of genetic absence and indicates gray diagnosis, genetic confirmed with individuals indicates Red genome. (nDNA) or nuclear (mtDNA) in the mitochondrial location and gene of underlying by function grouped I deficiency complex isolated 5 Figure exists. phenotype cellular a which for any disorder to principle in applied be can strategy This individuals. single in tions which we could establish the pathogenicity of newly discovered muta with disease, of models cellular of availability the was approach our rare variants that we Key to predicted be to deleterious. the of success of DNA a and pooled identified cohort, sequencing performed clues, In the Mito10K project, we have demonstrated an alternative alternative an demonstrated have we project, Mito10K the In Our approach successfully identified pathogenic roles for roles pathogenic identified successfully approach Our 48% FOXRED1

Genetic diagnosis of 94 unrelated individuals with definite, definite, with individuals of 94 unrelated diagnosis Genetic 3 8 . Similar . to Similar its role in the yeast 2% . NUBPL, also known as IND1, is an assembly factor for 3% a Supplementary NoteSupplementary 36,37 7% DVANCE ONLINE PUBLICATION ONLINE DVANCE 6% . . Although this approach has broad utility, it may 12% 21% 38,39 Other mtDNA replication CI assemblyfactor CI subunit(nDNA) CI subunit(mtDNA) mt-tRNA . We now report FOXRED1

[ TAZ for a complete descrip clinical FOXRED1 ] Yarrowia lipolytica , ). ). In the total, previous

35,36 nDNA X-linked(1% nDNA recessive(22% mtDNA (29% Nature Ge Nature NUBPL . Several recent recent Several . NUBPL mutations ) , , human NUBPL n 3 4 etics , and and , ) in in ) ------

© 2010 Nature America, Inc. All rights reserved. sequence sequence data analysis, M. Garber for assistance with evolutionary conservation PolyPhen-2.0 predictions, M. DePristo, E. Banks and A. Sivachenko for advice on assistance with sequence analysis, pooled I. Adzhubei and S. Sunyaev for L. Ziaugra for genotyping assistance, M. Cabili for tool evaluation, J. Flannick for for Illumina sequence project management, T. Fennel for sequence alignment, S. Mahan for assistance in DNA sample preparation, J. Wilkinson and L. Ambrogio human subjects protocols, R. Onofrio for designing PCR primers, K. Ardlie and antibody, J. for Boehm the lentiviral expression vector, S. Flynn for assistance with assays and DNA preparation, M. McKenzie and M. Ryan for the NDUFAF2 We thank S. Tregoning, A. Laskowski and S. Smith for assistance with enzyme Note: Supplementary information is available on the online the at http://www.nature.com/naturegenetics/. paper the of in ­version available are references associated any and Methods M bases of these with remaining cases. combined sequencing, elucidaterequiredfullyfunctionalvalidation,theto be will Broader mechanisms. epigenetic or someindividuals,thatin causedcomplex diseasebyisthe inheritance discovery screen but filtered out by our stringentdeletions, which ourcriteria. approach cannot It detect, or (v)is were presentalso inpossible our gene orexon sensitivity,mtDNA,contain full(iv) ofthe inespecially toryregion or un-annotated exon, (iii) were not detected owing to lack non-targeted gene, (ii) reside in a non-targeted region, such as a regula a resideunsolvedincausal(i) variantsthecases true inlikely thethat thatmaycontribute pathogenesis,to suchvariants.nomost carry It is agnosedcomplexindividuals I detectedwe‘likely deleterious’ variants studyofX-linked mental retardation for the I remaining complexhalf. Our with results are subjects comparable 103 todeficiency ( a the recent sequencing of half in mutations pathogenic I. complex and metabolism acid amino between link a potential suggesting catabolism, in acid amino redox reactions perform PDPR) and PIPOX SARDH, (DMGDH, homologs human related gene. of The function FOXRED1 is not clear, its although four with As acid. amino in a mutation conserved missense and a p.Asn430Ser compound revealed heterozygous subject this from samples Sequencing thase). syn citrate to relative samples both in mean control normal of (9% deficiency I complex severe showed fibroblasts and biopsy Muscle (see age of years 6 at syndrome Leigh with diagnosed was and acidosis congenital lactic with birth at presented who infant male a in mutations subunits I complex with profile logenetic solely on the basis of localization its mitochondrial . This gene was as selected a candidate protein that uncharacterized derives its name from a FAD-dependent rolepathogenic for of allele wild-type a of expression by rescued was fibroblasts in defect I complex the Nevertheless, looked. over or missed be may mutations site branch as such variants and second-generation pooled duplication sequencing. Large deletions and are not detected of 1–4 7 exon exons of of deletion involving rearrangement chromosomal complex a contained allele other the and 10, exon of skipping caused that mutation site branch a and mutation missense p.Gly56Arg the both contained allele one heterozygous: compound Nature Ge Nature analyses, J. Pirruccello, R. Do and S. Kathiresan for data and analysis of control A cknowledgments et Although the Mito10KprojectsuccessfullyAlthoughtheidentified or We also discovered pathogenic mutations in h NUBPL ods Supplementary Note Supplementary n Fig. above, cDNA established rescue etics NUBPL FOXRED1 5 ),we were unable to identify ‘smoking gun’ mutations

NUBPL ADVANCE ONLINE PUBLICATION ONLINE ADVANCE Ti idvda hglgt te iiain of limitations the highlights individual This . mutations: a p.Gln232X nonsense mutation nonsense a mutations: p.Gln232X mutations in complex I deficiency. for a complete clinical description). description). clinical complete a for 4 1 . Although. someinofthe undi NUBPL 1 N 4 . We detected We . detected a t u FOXRED1 , thereby establishing a establishing thereby , FOXRED1 r e

G e 4 n 0 e and shared phy

t i c s website. as a disease- , , which is an

FOXRED1

confirmed molecular

- - - - -

reprintsandpermissions/. http://npg.nature.com/ at online available is information permissions and Reprints Published online at http://www.nature.com/naturegenetics/. The authors declare no competing interests.financial ofAll aspects the study bywere D.R.T.supervised and V.K.M. M.J.D. The manuscript was written by S.E.C., E.J.T., A.G.C., D.R.T. and V.K.M. analysis was performed by D.L.B. was Syzygy developed and run by andM.R. performed by E.J.T., A.G.C. and O.A.G.array-based cytogenetic Affymetrix with assistance from E.J.T., A.G.C. and All M.R. experiments were anddesigned the genotyping. S.E.C. anddesigned performed the computational analyses, by S.E.C., N.P.B. and C.G. G.C. andM.C.R. performed pooling. C.G. performed Broad Institute by D.A., M.J.D. and S.B.G. Project management was performed and E.J.T. The sequencing protocolpooled was anddesigned established at the Samples were bycollected D.M.K., E.W. and C.J.W. and prepared by A.G.C. E.W. and C.J.W. provided clinical interaction and assisted with sample collection. from M.J.D. and S.B.G. Enzyme diagnosis of the cohort was coordinated by D.M.K. This study was conceived and bydesigned S.E.C., D.R.T. and V.K.M. with input scientist and dear colleague who died during the preparation of this manuscript. dedicate this article to the ofmemory our co-author Kirby,Denise an outstanding the US National Institutes of Health to(GM077465) V.K.M. The authors wish to awarded to D.R.T., an Australian Postgraduate Award to E.J.T. and a grant from Fellowship from the Australian National Health & Medical Research Council studies. This work was supported by a grant (436901) and Principal Research data, and the many physicians who referred subjects and assisted with these 20. 19. 18. 17. 16. 15. 14. 13. 12. 11. 10. 9. 8. 7. 6. 5. 4. 3. 2. 1. COMPETING FINANCIAL INTERESTS FINANCIAL COMPETING AUTHOR CONTRIBUTIONS

ete, D.R. Bentley, C. Sugiana, for A. Saada, molecular A E.A. Shoubridge, & N.G. Kennaway, I., Ogilvie, T.O. Yeates, & D. Eisenberg, M.J., Thompson, E.M., Marcotte, M., Pellegrini, A D. Eisenberg, & T.O. Yeates, M.J., Thompson, M., Pellegrini, E.M., Marcotte, D.J. Pagliarini, Smeitink, J., Sengers, R., Trijbels, F. & van den Heuvel, L. Human NADH:ubiquinone S. Lebon, M. Bugiani, P.Bénit, mitochondrial of Assembly A. Barrientos, & D. Horn, I.C., Soto, F., disease. Fontanesi, and DNA Mitochondrial G. Davidzon, & S. Dimauro, R. McFarland, E. Morava, F.P.Bernier, Lazarou, M., Thorburn, D.R., Ryan, M.T. & McKenzie, M. Assembly of mitochondrial Mitochondrial L.P.J.A. Heuvel, Smeitink, den & van L.G., Nijtmans, R.J., Janssen, Distelmaier, F. Skladal, D., Halliday, J. & Thorburn, D.R. Minimum birth prevalence of mitochondrial terminator chemistry. terminator neonatal (2008). disease. fatal mitochondrial neonatal lethal causes cause protein, assembly I disease. mitochondrial complex (C6ORF66)-interacting encephalopathy. progressive a in Invest. Clin. mutated J. is assembly I complex mitochondrial profiles. phylogenetic protein analysis: genome comparative by functions protein Assigning (1999). 83–86 function. protein of prediction genome-wide for algorithm combined biology. disease oxidoreductase. deficiency. chain deficiency. syndrome. −1) inthe Leigh (IVS1nt mutation nonsense anovel of rapid identification allows deficiency I complex chain respiratory with families inbred/multiplex Physiol. Cell Physiol. process. cellular regulated highly and complicated a c-oxidase, cytochrome (2005). 222–232 55 deficiency. I complex and encephalopathy mitochondrial infantile of Neurology children. disease. in defects and I complex (2006). pathology.and function structure, I: complex disease. clinical to children. in disorders chain respiratory , 58–64 (2004). 58–64 , Proc. Natl. Acad. Sci. USA Sci. Acad. Natl. Proc. Neurology et al. et

67 t al. et et al. et Biochim. Biophys. Acta Biophys. Biochim. t al. et et al. et , 1823–1826 (2006). 1823–1826 , t al. et t al. et t al. et

et al. Genotyping microsatellite DNA markers at putative disease loci in loci disease putative at markers DNA microsatellite Genotyping et al. et 115 t al. et J. Bioenerg. Biomembr. Bioenerg. J. criteria: diagnostic applications in children. in applications diagnostic criteria: disease Mitochondrial Cell eurn d nv mtcodil N mttos n respiratory in mutations DNA mitochondrial novo de Recurrent Hum. Genet. Hum. uain i NUA3 COF0, noig n NDUFAF4 an encoding (C3ORF60), NDUFAF3 in Mutations J. Med. Genet. Med. J. Diagnostic criteria for respiratory chain disorders in adults and adults in disorders chain respiratory for criteria Diagnostic Clinical and molecular findings in children with complex I complex with children in findings molecular and Clinical

, 2784–2792 (2005). 2784–2792 , Brain MitochondrialcomplexIdeficiency:fromorganelledysfunction cuae hl hmn eoe eunig sn reversible using sequencing genome human whole Accurate De novo mutations in the mitochondrial ND3 gene as a cause a as gene ND3 mitochondrial the in mutations novo De uain f 2of dsut cmlx asml and assembly I complex disrupts C20orf7 of Mutation

59

134 mtcodil rti cmedu euiae cmlx I complex elucidates compendium protein mitochondrial A 291 Nature , 1406–1411 (2002). 1406–1411 , Am. J. Hum. Genet. Hum. J. Am.

, 112–123 (2008). 112–123 , 132 , C1129–C1147 (2006). C1129–C1147 ,

456 , 833–842 (2009). 833–842 ,

112

40 , 53–59 (2008). 53–59 ,

Biochim. Biophys. Acta Biophys. Biochim. , 563–566 (2003). 563–566 , 1659

, 896–899 (2003). 896–899 , 96 Brain , 4285–4288 (1999). 4285–4288 ,

33 , 136–147 (2004). 136–147 , , 259–266 (2001). 259–266 ,

84

126 J. Inherit. Metab. Dis. Metab. Inherit. J. m J Hm Genet. Hum. J. Am. , 718–727 (2009). 718–727 , , 1905–1912 (2003). 1905–1912 ,

s e l c i t r A 1793 , 78–88 (2009). 78–88 , NDUFS4 n. Med. Ann.

83 29 Ann. Neurol. Ann. Nature 468–478 , , 499–515 , ee in gene Am. J. Am.

402

37  , ,

© 2010 Nature America, Inc. All rights reserved. 30. 29. 28. 27. 26. 25. 24. 23. 22. 21. s e l c i t r A 

Anderson, S.L. Anderson, V. Petruzzella, E. Leshinsky-Silver, S.M. Budde, Valente, L. D.M. Kirby, P.D.Stenson, a Database, Genome Mitochondrial Human mtDB: U. Gyllensten, & M. Ingman, S.T.Sherry, K.A. Frazer, Ashkenazi Jewish family. Jewish Ashkenazi (2001). 529–535 18 kDa (AQDQ) subunit of complex I abolishes assembly and activity of of activity and syndrome. assembly Leigh-like abolishes with I patient a complex in of complex subunit the (AQDQ) kDa 18 involvement. brainstem predominant gene. NDUFS4 Commun. encoded nuclear the in mutations with encephalomyopathy. deficiency. I complex mitochondrial Med (2006). sciences. D749–D751 medical and genetics population for resource Res. SNPs.

1 29 , 13 (2009). 13 , Nature , 308–311 (2001). 308–311 ,

275 et al. et al. et et al. et t al. et

t al. et et al. et , 63–68 (2000). 63–68 , 449 et al. et t al. et Identification of novel mutations in five patients with mitochondrial dbSNP: the NCBI database of genetic variation. genetic of database NCBI the dbSNP: A second generation human haplotype map of over 3.1 million 3.1 over of map haplotype human generation second A , 851–861 (2007). 851–861 , Combined enzymatic complex I and III deficiency associated deficiency III and I complex enzymatic Combined DF6 uain ae nvl as o lta neonatal lethal of cause novel a are mutations NDUFS6 The Human Gene Mutation Database: 2008 update. 2008 Database: Mutation Gene Human The Biochim. Biophys. Acta Biophys. Biochim. A novel mutation in mutation novel A A nonsense mutation in the NDUFS4 gene encoding the the encoding gene NDUFS4 the in mutation nonsense A t al. et J. Inherit. Metab. Dis. Metab. Inherit. J. DF4 uain cue eg snrm with syndrome Leigh cause mutations NDUFS4 J. Clin. Invest. Clin. J. Mol. Genet. Metab. Genet. Mol. NDUFS4

1787

32 , 491–501 (2009). 491–501 ,

, 121 (2009). 121 , causes Leigh syndrome in an in syndrome Leigh causes 114 , 837–845 (2004). 837–845 ,

97 u. o. Genet. Mol. Hum. ice. ipy. Res. Biophys. Biochem. uli Ais Res. Acids Nucleic , 185–189 (2009). 185–189 , Nucleic Acids Nucleic Genome

10 34 , ,

41. 40. 39. 38. 37. 36. 35. 34. 33. 32. 31.

Tarpey,P.S. S. Calvo, K. Bych, Sheftel, A.D. S.B. Ng, S.B. Ng, M. Choi, respiratory M.T.Mitochondrial Ryan, & D.R. Thorburn, M., Lazarou, M., McKenzie, J. Loeffen, M. Schuelke, L. Heuvel, den van coding exons in mental retardation. mental in exons coding genomics. integrative through assembly. I. complex Genet. Nat. exomes. sequencing. DNA 361 patients. Syndrome Barth in destabilized are supercomplexes chain syndrome. Leigh epilepsy. myoclonic and leukodystrophy subunit. (AQDQ) 18-kD the encoding gene nuclear the in duplication 5-bp a deficiency: I complex , 462–469 (2006). 462–469 , Nature t al. et t al. et t al. et t al. et t al. et EMBO J. EMBO Mol. Cell. Biol. Cell. Mol.

t al. et et al. et 42 et al. t al. et eei danss y hl eoe atr ad asvl parallel massively and capture exome whole by diagnosis Genetic agtd atr ad asvl prle sqecn o 1 human 12 of sequencing parallel massively and capture Targeted , 30–35 (2010). 30–35 , h io-upu poen n1 s eurd o efcie ope I complex effective for required is Ind1 protein iron-sulphur The identifies the cause of a mendelian disorder. amendelian of cause the identifies sequencing Exome

Systematic identification of human mitochondrial disease genes disease mitochondrial human of identification Systematic a 461 The first nuclear-encoded complex I mutation in a patient with inapatient Imutation complex nuclear-encoded first The Am. J. Hum. Genet. Hum. J. Am. Am. J. Hum. Genet. Hum. J. Am. A systematic, large-scale resequencing screen of X- of screen resequencing large-scale systematic, A DVANCE ONLINE PUBLICATION ONLINE DVANCE Human ind1, an iron-sulfur cluster assembly factor for respiratory Proc. Natl. Acad. Sci. USA Sci. Acad. Natl. Proc.

uat DF1 uui o mtcodil ope I causes I complex mitochondrial of subunit NDUFV1 Mutant 27 et al. et , 272–276 (2009). 272–276 , , 1736–1746 (2008). 1736–1746 , Demonstration of a new pathogenic mutation in human in mutation pathogenic new a of Demonstration

29 , 6059–6073 (2009). 6059–6073 , Nat. Genet. Nat. Nat. Genet. Nat.

63 62 Nat. Genet. Nat. , 1598–1608 (1998). 1598–1608 , , 262–268 (1998). 262–268 ,

38

, 576–582 (2006). 576–582 , 106

41 , 19096–19101 (2009). 19096–19101 , , 535–543 (2009). 535–543 ,

21 , 260–261 (1999). 260–261 ,

Nature Ge Nature . o. Biol. Mol. J. n etics

© 2010 Nature America, Inc. All rights reserved. adjacent to homopolymer runs (see (see runs homopolymer to adjacent indels excluding exon, a targeted to match 20-bp exact an preceding deletion by were supported and reads, unaligned within from identified were Indels strand). each quality (base strand each on reads NoteSupplementary (see >0.1 bias or a of LOD > Fisher’sstrand −1.5 test exact strand-specific the on each strand). High confidence SNVs had log odds (LOD) scores scores (LOD) odds log had SNVs confidence High strand). each on 100 of quality minimum (base a reads aligned with high-quality bases targeted on algorithm Syzygy the using ple Variant discovery. software SAMtools the Genome Illumina algorithm an on cycles 76 with reads were Analyzer. to aligned the Seventy-six-base-pair genome using MAQ sequenced single-end then and PCR, of cycles 14 using enrichment PCR and with selection size protocol, gel 225–275-bp library single-end Illumina modified a described by as constructed were fragments into sheared and adapters NotI using Sequencing. EX. HT II Multiprobe Packard using pool DNA sample by pooled then were products PCR The size. product PCR to visualize ladder one column of PCR product per plate on 2% agarose E-gel against a 1-kb testing DNA by obtained was confirmation Secondary above. described as pooled of cycle one by followed 72 °C min; for 3 1 min. The for PCR products °C were separately 72 quantified, normalized and and s 30 for °C 60 s, 20 for °C 95 volume. PCR cycling parameters were: one cycle of 95 0.25 °C for 15 and min; 35 cycles of (Qiagen), MgCl mM 2.5 dNTPs, mM 0.8 buffer, HotStar regions were PCR-amplified using 20 ng of whole-genome-amplified DNA,Target 1× catenation. downstream for site recognition a provide to tails added NotI were iterations. design three using samples, CEU HapMap 3 on dated vali and buffer) no length, amplicon bp (150–600 sequence reference hg17 Table1 exons of 111 RefSeq transcripts (release 29) from 103 gene loci ( selection. Target See 4. identifiers. sample 5, 11, = 4; 5 5, pool 12, 3; = 3 5, pool 12, 3; = 5, 4 13, pool = 2 pool 4; 5, 12, = 1 Pool counts: following with the mutations, nuclear known and mutations, mtDNA known diagnoses, unknown with individuals contained pool case Each amounts. equimolar in pooled then were samples twenty-one or Twenty error. pipetting uniform a automation was used across the entire set and in robotic all steps in order to same guarantee The EX. HT II Multiprobe Packard the using automated were that is the limit accuracy of PicoGreen quantification. The normalization steps mean concentration of 19.2 ng ng 20 to normalized was DNAVarioskan Flash. concentration Scientific on a Thermo concentration was measured by Quant-iT PicoGreen dsDNA reagent detected with Kit REPLI-g QIAGEN 100 a ng input DNA. HapMap using samples were not whole-genome amplified. DNA amplified whole-genome Each was salting-out. by sample followed digestion K proteinase by liver) and Nucleon DNA Extraction kit or from subject tissues (skeletal or cardiac DNA and pooling. preparation DNA suitable sequencing. for no available was whom from individuals nine of exception the to with 1992 2007, from Melbourne in diagnosed individuals such all includes cohort ( I complex of that than higher twofold least at mal, and of the activity normalized complexes II, III and IV was required to be ity to citrate or synthase relative to complex II was required to be assays interpreted by published criteria diagnosis of isolated complex I deficiency, based on spectrophotometric enzyme cases. I deficiency Complex ONLINE doi:10.1038/ng.659 ). Primers were iteratively designed using PRIMER3 software on the the on software PRIMER3 using designed iteratively were Primers ). μ 4 l

4 −1 MET within the Picard analysis pipeline, and further processed using using processed further and pipeline, analysis Picard the within The PCR products for each pooled sample were concatenated concatenated were sample pooled each for products PCR The based on two rounds of quantification and dilution, yielding a yielding dilution, and quantification of rounds two on based ≥ 10 unaligned reads on 10 unaligned each strand that an contained insertion/ H Targets included 2 mtDNA regions and coding and UTR UTR and coding and regions mtDNA 2 included Targets High-confidence SNVs were in sam High-confidence detected each pooled ODS μ ). Low-confidence ). SNVs Low-confidence were supported by at least three frad n rvre rmr i a 10- a in primers reverse and forward M 4 5 and custom scripts. custom and The 60 cases plus 43 case controls had a definite had a controls plus definite 43 case The 60 cases μ DNA was from isolated using cells a cultured l −1 (1.56 s.d.). We for allowed 10% as variance ≥ Supplementary Note Supplementary 20, mapping quality >0, >0, quality mapping 20, 5,42 ≥ . . Briefly, the ratio of complex I activ Supplementary Note Supplementary 20, mapping quality >0, >0, quality mapping 20, 2 Supplementary Fig. 10 Fig. Supplementary , 0.2 units of HotStar Enzyme Enzyme HotStar of units 0.2 , ). Supplementary ≥ for HapMap for 200 reads on on reads 200 ≤ 4 25% 25% of nor μ 3 . Libraries Libraries . l reaction reaction l ≥ 30 reads reads 30 ≥

subject subject muscle 3, with with 3, ). The The ). - - - -

the 103 individuals with complex I deficiency using Sequenom MassARRAY Sequenom using deficiency I complex with individuals 103 the Genotyping. nonsynonymous, as annotated sites dbSNP128 HGMD. in present those ­excluding all and 2009.1, version from selected training data: missense variants all disease-associated in HGMD mutations mtDNA pathogenic of carriers mtDB in frequency allele minor dbSNP controls, HapMap 42 in present if excluded were disease with associated previously not were that (HumVar data) PolyPhen-2.0 training browser genome species, based on the multiz44way genome in alignments downloadedconserved fromacid UCSC amino an at variant; variant nonsense missense (vii) (vi) indel; coding (v) variants); splice disease-associated HGMD of 8,189 all data consisting on of basis the training selected −1,1,2,3,5 gene; (iv) present at a tRNA splice site (splice acceptor and sites −1,-2,-3, splice donor sites mitochondrial 5 a in in present (iii) present (ii) 2009.1; version (HGMD) based Database professional Mutation variant, Gene Human disease the a and as curation manual reported on previously (i) criteria: following the hg18 the contained allele. reference individuals all where sites at calculated was specificity where medium was replaced. Cells were grown in antibiotic-free medium for 30 h 30 for medium antibiotic-free in grown were Cells replaced. was medium before °C 37 at h 24 for incubated and min 90 for r.p.m. 2,500 at spun were atpolybrene a concentration of final 5 of addition 62.5 0.45- a incubation, h through filtered and 24 harvested were after virus packaged and, containing supernatants transfection after h 16 cells the to applied was medium Fresh protocol. manufacturer’s the to according (Qiagen) reagents or pDest (pCMV- plasmid packaging a with cotransfected and confluence 60% to plates 10-cm and transduction. production Viral particle the generate to cloning Gateway by tag) C-terminal (V5 TRC970 pLEX into into cloned was then adaptors, Gateway incorporating RT-PCR by cells MCH58 FOXRED1 (primers instructions listed in manufacturer’s to according (Stratagene) kit mutagenesis site-directed XL II QuikChange using (alanine) GCA codon reference hg18 the to rs17855445) mutagenesis was to performed change codon 343 from CCA (proline) (dbSNP experiments using this vector did not rescue complex I activity so site-directed Initial (Invitrogen). cloning Gateway by tag) C-terminal (V5 TRC970 pLEX into cloned and Biosystems) Open 3956972, ID: (Clone vector pDONR223 Cloning. protocols. manufacturer’s the to according Biosystems) (Applied Terminators v3.1 BigDye and 3130XL ABI using DNA, genomic on formed variants. rare for reviewed manually and algorithm, by analyzed by SpectroCaller real-time SpectroTyper v.4.0 software SNV. one least at had Variants called that were samples all in >1%) MAF and Inc.). We rate, call data quality high genotype (>95% HWE obtained Compact system with a solid phase laser mass (Bruker spectrometer Daltonics SpectroCHIPs were analyzed in automated mode by a MassArray MALDI-TOF 384-well SpectroCHIP preloaded with 7 nl of matrix (3-hydroxypicolinic acid). a of position each onto loaded was reaction of nl 7 Around pool. DNAof per of by 20–38 v.3.1assays, designed AssayDesigner with 10 starting software, ng pools multiplexed in genotyped were SNVs All DNA Techologies. Integrated chemistry GOLD iPLEX Variants were annotated as ‘likely deleterious’ on the basis of any of of any of basis the on deleterious’ ‘likely as annotated were Variants sites using data genotype from estimated was sensitivity screen Discovery Subject fibroblasts were grown to 80% confluence in 6-well plates before before plates 6-well in confluence 80% to grown were fibroblasts Subject per resequencing, Sanger by validated were SNVs selected and Deletions NUBPL ≥ μ 1 1 individual in the pool contained a variant compared to hg18, whereas M membrane filter. membrane M δ The The FOXRED1 8.91), a pseudotyping plasmid (pMD2-VSVg) and either -V5 pDest vector. The full-length vector. -V5 pDest The full-length -V5 pDest vector. pDest -V5 SNVs were assayed in whole-genome-amplified DNA from from DNA whole-genome-amplified in assayed were SNVs FOXRED1 μ 4 ′ 7 L L of T ad leig h peec o a usra ORF upstream an of presence the altering and UTR (see (see -V5-pDest. Transfection was performed using Effectene Effectene using performed was Transfection -V5-pDest. NUBPL Supplementary Note Supplementary 5 0 pn edn fae OF ws ucae i a in purchased was (ORF) frame reading open . Oligos were synthesized and mass-spec QCed at QCed mass-spec and synthesized were Oligos . 2 -V5 -V5 or 125 Supplementary TableSupplementary 5 2 , 1,000 genomes pilot 1, or present at >0.005 >0.005 at present or 1, pilot genomes 1,000 , 2 3 based on the frequency of asymptomatic asymptomatic of frequency the on based μ 4 8 g g ml (see (see μ L L 4 9 −1 FOXRED1 NUBPL . Conservation thresholds were were thresholds Conservation . Supplementary Note Supplementary HEK-293T cells were cells grown HEK-293T on ), or predicted as as predicted or ), in 8.75 ml total medium. Plates ORF was amplified from ORF was amplified ) ) to generate the RefSeq ≥ -V5 viral particles and particles -V5 viral 10 aligned vertebrate vertebrate aligned 10 Nature Ge Nature ‘ damaging’ by damaging’ NUBPL ). ). Variants P n > 0.001 etics -V5- 4 2 6 4 - ;

© 2010 Nature America, Inc. All rights reserved. 4 °C. After washing, membranes were incubated in anti-mouse or or anti-mouse Bioscience). (Amersham in incubated reagents were or ECL Plus detection ECL using for 1 h and temperature developed membranes at overnight washing, rabbit antibodies After primary °C. with 4 incubated powder, and milk skim Tween-20) 5% 0.05% containing (PBS blocked (Millipore), membranes PVDF to transferred were proteins and (Invitrogen), gels NuPAGE Bis-Tris 25–50 cock Next, inhibitor (Roche). protease tail containing SDS) 0.1% and deoxycholate 0.5% NP-40, sodium 1% NaCl, mM 150 8.0, pH Tris mM (50 buffer RIPA in lyzed blot. protein and SDS-PAGE (Qiagen). kit Extraction Gel MinElute the using purification gel after or sequenced pro tocols manufacturer’s per as Biosystems) (Applied Terminators v3.1 BigDye and 3130XL ABI using sequenced directly either were products PCR seg ments. overlapping in or product PCR one in either cDNA entire the amplify preparation ng 100 fibroblasts containing medium in splicing, cultured mRNA were and decay nonsense-mediated of protocols. analysis manufacturers’ For per as (Invitrogen) kit synthesis strand First III SuperScript the using generated was cDNA and (Illustra) Kit Mini RNAspin RT-PCR. (Affymetrix). software ToolClient GCOS of Analysis Facility.Research Data were using the analyzed Loss of (LOH) Heterozygosity GeneChip Homozygosity mapping. differences. significant statistically determine to of parisons groups followed by post hoc using analysis the method Bonferroni Two-waydensitometry. repeated measures analysis of for (ANOVA)variance was used used for com was reader dipstick immunochromatographic 1000 to according A the manufacturer’s Hamamatsu ICA-protocol (Mitosciences). on 10 were performed assays assays. activity enzyme Dipstick assays. dipstick for harvested were cells selection, of days 12–20 1 containing medium selection applying before Nature Ge Nature HRP RNA was extracted from cultured subject fibroblasts using the the using fibroblasts subject cultured from extracted was RNA secondary antibodies (DakoCytomation used at 1:10,000) at room at 1:10,000) used (DakoCytomation antibodies secondary Nsp 5 n 1 PR rmr ( primers PCR . 250 k Array (Affymetrix), performed by the Australian Genome etics Homozygosity was using determined SNP Mapping μ g of cleared lysate were run per lane on 10% 10% on lane per run were lysate cleared of g μ g g and 15 upeetr Tbe 5 Table Supplementary Primary control and case fibroblasts were were fibroblasts case and control Primary Complex I Complex and complex IV activity dipstick μ g, respectively, of cleared cell lysates lysates cell of cleared g, respectively, μ l −1 CHX for 24 h before RNA before h 24 for CHX μ g ml g −1 ) were designed to to designed were ) puromycin. After After puromycin. - - - -

44. 43. 42. request. upon available are dated patient variants, and the data sequence seven pooled (BAM files format) availability. Data (Affymetrix). v1.2 software (ChAS) Suite Analysis manufacturer’s Data analysis instructions. was using performed Chromosome the to array, according 2.7M GeneChip Affymetrix the using conducted was analysis. DNA number copy Microarray at 1:5,000. Trobe La Ryan, University) M. and McKenzie M. from gift (kind NDUFAF2 and 1:1,000, at Probes) Molecular (A-1142, subunit 70-kD II complex at 1:10,000, Calbiochem) Porin (529534, at 1:1,000, Mitosciences) blotting. protein for Antibodies gels. manufacturer’s on as per agarose 1% and resolved Biolabs) protocol, England (New NlaIV, or AflIII respectively with overnight digested phoresis, from 100 ng of subject genomic DNA. The products were checked by gel electro FOXRED1 ( RFLP screen 51. 50. 49. 48. 47. 46. 45.

i H, un J & ubn R Mpig hr DA eunig ed ad calling and reads sequencing DNA short Mapping R. Durbin, & J. Ruan, H., Li, A. Gnirke, Kirby, D.M. Lamandé, S.R. Gabriel, S., Ziaugra, L. & Tabbaa, D. SNP genotyping using the Sequenom MassARRAY Cree, L.M., Samuels, D.C. & Chinnery, P.F. using function The inheritance SNP of pathogenic Inferring mitochondrial C.D. Bustamante, & S. Sunyaev, M.W., Dimmic, Karolchik, D., Hinrichs, A.S. & Kent, W.J. The UCSC Genome Browser. cause frames reading open Upstream V.K. Mootha, & D.J. Pagliarini, S.E., Calvo, H. Li, variants using mapping quality scores. quality mapping using variants sequencing. targeted parallel massively disorder. generation Hum. Mol. Genet. Mol. Hum. COL6A1 nonsense mutation results in mRNA decay and functional haploinsufficiency. platform. iPLEX mutations.DNA methods. (2005). 382–384 computational and structural, evolutionary, Bioinformatics USA Sci. humans. Acad. Natl. among Proc. polymorphic are and expression protein of reduction widespread 25 , 2078–2079 (2009). 2078–2079 , t al. et or exon 10 of et al. t al. et FOXRED1 h Sqec AinetMp omt n SAMtools. and format Alignment/Map Sequence The Chapter 1: Unit 1.4 (2009). 1.4 Unit 1: Chapter et al.

Curr. Protoc. Hum. Genet. Hum. Curr.Protoc. Biochim. Biophys. ActaBiophys.Biochim. Respiratory chain complex I deficiency: an underdiagnosed energy Supplementary Table 2 Supplementary ouin yrd eeto wt utaln oiouloie for oligonucleotides ultra-long with selection hybrid Solution

7 Reduced collagen VI causes Bethlem myopathy: a heterozygous , 981–989 (1998). 981–989 , Neurology NUBPL :c.1289A>G and :c.1289A>G

106

52 was PCR-amplified ( , 7507–7512 (2009). 7507–7512 , , 1255–1264 (1999). 1255–1264 , Antibodies included NDUFS4 (MS104, (MS104, NDUFS4 included Antibodies Genome Res. Genome

Nat. Biotechnol. Nat. 1792 Chapter 2: Unit 2.12 (2009). 2.12 Unit 2: Chapter Genome-wide microarray analysis analysis microarray Genome-wide NUBPL provides detailed data on all vali all on data detailed provides , 1097–1102 (2009).1097–1102 , :c.815-27T>C).

18 Supplementary Supplementary Table 5 , 1851–1858 (2008). 1851–1858 ,

a. yp Biocomput. Symp. Pac. 27 , 182–189 (2009). 182–189 , doi:10.1038/ng.659 Bioinformatics Exon 11 of Curr. Protoc. ­ - )