Personal Genomics and Mitochondrial Disease

The Harvard community has made this article openly available. Please share how this access benefits you. Your story matters.

Hershman, Steven Gregory. 2013. Personal Genomics and Citation Mitochondrial Disease. Doctoral dissertation, Harvard University.

Accessed April 17, 2018 4:03:57 PM EDT

Citable Link http://nrs.harvard.edu/urn-3:HUL.InstRepos:11129104

This article was downloaded from Harvard University's DASH Terms of Use repository, and is made available under the terms and conditions applicable to Other Posted Material, as set forth at http://nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms-of- use#LAA

(Article begins on next page) Personal Genomics and Mitochondrial Disease

A dissertation presented by Steven Gregory Hershman to e Committee on Higher Degrees in Systems Biology in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the subject of Systems Biology

Harvard University Cambridge, Massachusetts April  ©  -Steven Gregory Hershman All rights reserved. Vamsi K Mootha Steven Gregory Hershman

Personal Genomics and Mitochondrial Disease

A

Mitochondrial diseases involving dysfunction of the respiratory chain are the most common inborn errors of metabolism. Mitochondria are found in all cell types besides red blood cells; consequently, patients can present with any symp- tom in any organ at any age. ese diseases are genetically heterogeneous, and exhibit maternal, autosomal dominant, autosomal recessive and X-linked modes of inheritance. Historically, clinical genetic evaluation of mitochondrial disease has been limited to sequencing of the mitochondrial DNA (mtDNA) or several can- didate . As sequencing transformed from a research grade effort costing , to a clinical test orderable by doctors for under ,, it has become practical for researchers to sequence individual patients. is the- sis describes our experiences in applying “MitoExome” sequencing of the mtDNA and exons of >  nuclear genes encoding mitochondrial in ~ pa- tients with suspected mitochondrial disease. In  infants, we found that  harbored pathogenic mtDNA variants or compound heterozygous mutations in candidate genes. e pathogenicity of two nuclear genes not previously linked to disease, NDUFB and AGK, was supported by complementation studies and evi- dence from multiple patients, respectively. In an additional two unrelated children presenting with Leigh syndrome and combined OXPHOS deficiency, we identified compound heterozygous mutations in MTFMT. Patient fibroblasts exhibit severe defects in mitochondrial translation that can be rescued by exogenous expres- sion of MTFMT. Furthermore, patient fibroblasts have dramatically reduced fMet- tRNAMet levels and an abnormal formylation profile of mitochondrially translated

iii Vamsi K Mootha Steven Gregory Hershman

COX. ese results demonstrate that MTFMT is critical for human mitochon- drial translation. Lastly, to facilitate evaluation of copy number variants (CNVs), we developed a web-interface that integrates CNV calling with genetic and pheno- typic information. Additional diagnoses are suggested and in a male with ataxia, neuropathy, azoospermia, and hearing loss we found a deletion compounded with a missense variant in D-bifunctional , HSDB, a peroxisomal enzyme that catalyzes beta-oxidation of very long chain fatty acids. Retrospective review of metabolic testing from this patient revealed alterations of long- and very-long chain fatty acid metabolism consistent with a peroxisomal disorder. is work ex- pands the molecular basis of mitochondrial disease and has implications for clin- ical genomics.

iv Contents

 Introduction  . Mitochondria ......  . Mitochondrial disease ......  . DNA Sequencing of mitochondrial disease ......  . esis outline ...... 

 Molecular diagnosis of infantile mitochondrial disease with targeted next-generation sequencing  . Introduction ......  . Results ......  .. MitoExome sequencing and validation ......  .. Variant prioritization ......  .. Enrichment of prioritized genes in patients ......  .. Prioritized variants in  patients ......  .. Molecular diagnoses in the mtDNA ......  .. Molecular diagnoses in previously established nuclear dis- ease genes ......  .. Mutations in candidate genes ......  .. AGK mutations in two patients with myopathic mtDNA depletion ......  .. Establishing the pathogenicity of NDUFB ......  . Discussion ......  . Materials and Methods ......  .. Patients ......  .. MitoExome target selection ...... 

v .. Hybrid selection and Illumina sequencing ......  .. Sequence alignment, variant detection, and annotation ..  .. mtDNA sequence analysis ......  .. Sensitivity, specificity, and reproducibility ......  .. Variant prioritization ......  .. Prevalence in cases and controls ......  .. Variant validation and phasing ......  .. Evidence of pathogenicity ......  .. Molecular karyotyping ......  .. NDUFB complementation ......  .. Complex I and IV enzyme activity assays ......  . Manuscript Information ......  .. Previously Published As ......  .. Author’s Contributions ...... 

 Mutations in MTFMT Underlie a Human Disorder of Formylation Caus- ing Impaired Mitochondrial Translation  . Introduction ......  . Results ......  .. Mitochondrial translation is impaired in two unrelated pa- tients with Leigh syndrome ......  .. MitoExome Sequencing Identifies MTFMT Mutations ...  .. Mitochondrial Translation Is Rescued in Patient Fibrob- lasts by Exogenous MTFMT ......  .. Mitochondrial tRNAMet Pools Are Abnormal in Patient Fi- broblasts ......  .. COX Protein Formylation Is Decreased in Patient Fibrob- lasts ......  . Discussion ......  . Experimental Procedures ......  .. Cell Culture ......  .. Biochemical Analysis ......  .. Translation Assays ......  .. SDS-PAGE and Immunoblotting ...... 

vi .. MitoExome Sequencing ......  .. Variant Prioritization ......  .. Sanger DNA Sequencing ......  .. Lentiviral Transduction ......  .. Acid-Urea PAGE and Northern Blotting ......  .. Mass Spectrometric Analysis of COX N-termini ......  .. Statistical Analysis ......  .. Accession Numbers ......  . Manuscript Information ......  .. Previously Published As ......  .. Author’s Contributions ...... 

 Copy Number Variant Detection From Targeted Sequencing Data  . Introduction ......  .. Structural Variants ......  .. Methods for detecting structural variants ......  .. Goals of this chapter ......  . Materials and Methods ......  .. Exome and MitoExome Datasets ......  .. Depth of coverage calculation ......  .. Read Depth-Based CNV Detection ......  .. Read-Based Variant Evaluation ......  .. Personal Genome Browser ......  .. Calculations of sensitivity and false positive rates .....  . Results ......  .. Performance of CNV Discovery ......  .. CNV Discovery on MitoExome patients ......  . Discussion ......  . Manuscript Information ......  .. Author Contributions ...... 

 Implication and Future Directions  . Implications ......  .. Discovery of new disease genes ...... 

vii .. Development of clinical genetic diagnostics ......  . Future Directions ......  .. Creating tools for personal genomic evaluation ......  .. Finding candidates for the rest ......  .. Evaluating the pathogenicity of candidates ...... 

A Supplementary Materials for Chapter   A. Materials and Methods ......  A.. Mitochondrial localization of known OXPHOS disease- related genes ......  A.. mtDNA alignment ......  A.. mtDNA de novo assembly ......  A.. Mutations in dominant-acting genes ......  A.. Analysis of variants reported previously as pathogenic ..  A.. Variant Validation, Phasing, and Cellular Assay ......  A. Support of Pathogenicity ......  A.. ACAD mutations in P ......  A.. POLG mutations in P ......  A.. BCSL mutations in P ......  A.. COXB mutations in P ......  A.. GFM mutations in P and P ......  A.. TSFM mutations in P and P ......  A.. AARS mutations in P ......  A.. AGK mutations in P and P ......  A. Clinical Information for AGK and NDUFB Patients ......  A. Evolutionary Conservation ...... 

B Supplementary Materials for Chapter   B. Detailed clinical summaries ......  B.. Patient  (P) ......  B.. Patient  (P) ......  B. Next generation sequencing coverage analysis ......  B. Supplementary Methods ......  B.. Quantitative RT-PCR (qRT-PCR) ...... 

viii B.. Full N-termini Mass Spectrometry methods ...... 

ix T  

x A L

e author for Chapter  is Steven G Hershman. e authors for Chapter  are Sarah E Calvo, Alison G Compton, Steven G Hershman, Sze Chern Lim, Daniel S Lieber, Elena J Tucker, Adrienne Laskowski, Caterina Garone, Shangtao Liu, David B Jaffe, John Christodoulou, Janice M. Fletcher, Damien L. Bruno, Jack Goldblatt, Salvatore DiMauro, David R or- burn, and Vamsi K Mootha e authors for Chapter  are Elena J Tucker, Steven G Hershman, Caro- line Köhrer, Casey A Belcher-Timme, Jinal Patel, Olga A Goldberger, John Christodoulou, Jonathon M Silberstein, Matthew McKenzie, Michael T Ryan, Alison G Compton, Jacob D Jaffe, Steven A Carr, Sarah E Calvo, Uttam L RajB- handary, David R orburn, and Vamsi K Mootha e authors for Chapter  are Steven G Hershman, Daniel S Lieber and Vamsi K Mootha. e author for Chapter  is Steven G Hershman. Detailed information about author contributions are provided at the end of each chapter in the section entitled “Manuscript Information.”

xi A

is thesis would not have been possible without the support of many individ- uals throughout my graduate studies. anks goes first and foremost to my visionary adviser, Professor Vamsi K Mootha. Without his guidance and support, this research would not have been possible. It has been wonderful working with the multidisciplinary set of clini- cians and scientists in his laboratory. Whether bouncing experimental ideas off of Robert Bao or deciding how to process data with Sarah Calvo, I have always been able to find a great mentor. I’ve wholeheartedly enjoyed working on these projects with Casey Belcher-Timme, Shangtao Liu, Nancy Slate, Nina Gold, and Daniel Lieber and having discussions with everyone in the lab. is work was done collaboratively with others at Harvard and beyond. Specific contributions to each chapter are listed in the sections marked “Manuscript In- formation.” In particular, I am grateful for having the opportunity to work closely with David orburn, Elena Tucker, Alison Compton, Uttam RajBhandari Caroline Koehrer, Steve Carr and Jacob Jaffe. For the first few years of my PhD, I have had the pleasure of working in the wake of Mark DePristo and his team, including Kiran Garimella, Eric Banks, Chris Hartl, Aaron McKenna, Matthew Hanna, as they developed the Genomic Analysis Toolkit. I would also like to thank my dissertation advisory committee of George Church, Steve McCarroll and especially the chair, Jagesh Shah, for giving me an opportunity to present my research and providing guidance. It has been a treat being surrounded by so many wonderful people in the Sys- tems Biology and Leder Programs. I’d like to the thank program heads, Sam Reed, Tim Mitchison, Pam Silver, Connie Cepko, and omas Michel in addition to class- mates in my year, other years and those from the Harvard/MIT communities. I’m grateful for my family and friends, especially my parents, Morris Hershman and Shirley Seiden, and my sister Samantha Hershman. I would finally like to thank the Department of Defense for awarding me the National Defense Science and Engineering Graduate fellow, which funded most of my studies. Text is typeset in X TE EX using a modified version of Andrew Leifer’s template,

xii http://github.com/aleifer/LaTeX-template-for-Harvard-dissertation.

xiii My mitochondria comprise a very large proportion of me. I can- not do the calculation, but I suppose there is almost as much of them in sheer dry bulk as there is the rest of me. Looked at in this way, I could be taken for a very large, motile colony of respiring bacteria, operating a complex system of nuclei, micro- tubules, and neurons for the pleasure and sustenance of their families, and running, at the moment, a typewriter.

Lewis omas, [] 1 Introduction

     is to link phenotype to genotype, T thereby enabling basic biology research and providing direction for the development of clinical diagnostics and therapeutics. As of January , the molecular basis of just fewer than half of the , Mendelian disorders described in the Online Mendelian Inheritance in Man (OMIM) have been identified[].

Most of these associations were discovered through linkage studies followed by positional cloning in large pedigrees[]; however, such studies require several af-

 fected individuals, preferably in a multi-generation pedigree, which is not always possible as disease often presents in singleton cases. Introduction of massively parallel sequencing instruments [, , ], capable of generating hundreds of millions of short reads in a single experiment, have given researchers the se- quencing capabilities necessary to conduct genome-scale resequencing projects on such singleton cases. In this thesis, I present the application of targeted se- quencing on approximately  individuals with suspected mitochondrial disease and phenotypically similar disorders. is chapter will present an introduction to mitochondria and mitochondrial disease and then will discuss how massively parallel sequencing can be used to gain insight into disease.

. M

Mitochondria formed when an eukaryotic progenitor engulfed an alpha- proteobacterium[]. Inside the eukaryote, the early mitochondria began to transfer its genome to the nucleus as it specialized into a compartment rich in metabolic activity. Mitochondria are most well known for their role in produc- ing cellular energy in the form of adenosine triphosphate (ATP); however, they also play roles in the production of biosynthetic intermediates, calcium regulation and cellular stress responses such as autophagy and apoptosis. At a macro scale, mitochondria are a series of tubes . to . micrometer (µm) in diameter that form a dynamically interconnected network. All cell types, except for red blood cells, house -, mitochondria, which are actively transported to various subcellular compartments. Populations of mitochondria are constantly fusing and dividing. At a microscale, mitochondria have two membranes – a somewhat- selective mitochondrial outer membrane (MOM), which encloses the intermem-

 brane space, and a highly selective mitochondrial inner membrane (MIM), which encloses the matrix.

e MIM houses the respiratory chain, a series of five macromolecular pro- tein complexes that performs oxidative phosphorylation (OXPHOS) to produce

ATP. e first four complexes couple electron transfer from donors (NADH and

FADH) to oxygen while pumping protons from the matrix across the MIM to the intermembrane space, thereby generating an electrochemical gradient. Complex

V, the world’s tiniest rotary motor, uses this electrochemical gradient to gener- ate ATP. Oxidative phosphorylation is extremely efficient: aerobic catabolism of one molecule of glucose produces - molecules of ATP while only  ATPs are generated under anaerobic conditions. However, when cells undergo oxidative phosphorylation they are playing with fire since reactive oxygen species, capable of causing oxidative stress, are generated by Complexes I and III[].

Mitochondria are unique in that they house an extranuclear DNA: the mtDNA.

Four of the five respiratory chain complexes, all but complex II, have components encoded by the mtDNA. mtDNAs are .Kb circular DNAs found in - copies in each mitochondrial matrix in a subcellular structure known as a nucleoid. e mtDNA encodes  protein subunits of the respiratory chain,  mitochondrial rRNAs and  tRNAs. Additional machinery required for mtDNA replication, tran- scription and translation are imported from the nucleus. Mitochondria DNA is materially inherited as paternal mitochondria are eliminated through autophagy shortly after fertilization [, ]. As mitochondria and cells contain multiple mtDNAs, it is possible that a fraction may harbor mutations – a phenomena known as heteroplasmy. Normal human cells contain dozens of heteroplasmic sites and the frequency of heteroplasmic variants can vary considerably between different

 tissues in the same individual [].

While mitochondria have their own genome, > of mitochondrial proteins are imported from the nucleus. e most complete mitochondrial protein inven- tory published to date is MitoCarta[], which includes over > proteins.

A Bayesian framework was used to integrate proteomic, computational (protein domains, induction with mitochondria, coexpression with other mitochondrial genes, yeast homology, ancestry in R. prowazekii and targeting sequences), litera- ture and microscopy data to make a sum more sensitive and specific than its parts.

It was estimated to be  complete and contain  false positives. is thesis makes use of this molecular framework to better understand mitochondrial dis- ease.

. M 

Mitochondrial disease is a class of devastating disorders caused by dysfunction of the mitochondria. ey are the most common inborn errors of metabolism, with a minimum estimated prevalence of at least  in ,[]. Illness occurs when tissues become starved for energy, damaged by excess ROS, or accumulates nox- ious intermediate metabolites. Many tissue types can be affected, especially those most reliant on mitochondrial energy such as the central nervous system, skeletal muscle, heart, endocrine organs, and kidney[]. Patients commonly present with lactic acidosis, skeletal myopathy, deafness, blindness, subacute neurodegenera- tion, intestinal dysmotility and/or peripheral neuropathy[]. A rule of thumb is to suspect mitochondrial disease in patients with symptoms in three or more or- gan systems without a unifying cause [], though more sophisticated diagnostic systems exist [, ]. In addition to clinical observations, diagnosis of mito-

 chondrial disease relies on laboratory observations, cerebral imaging and muscle biopsies. Muscle tissue from biopsies is inspected histologically for a prolifera- tion of mitochondria, which present as “ragged red fibers”, and biochemically for respiratory chain function. is testing can run up to , and often returns ambiguous findings, partially due to lab-to-lab variabilities[, ]. Patients can go through a long diagnostic journey before mitochondrial disease is suspected and many cases of suspected mitochondrial disease may actually be due to a re- lated disorder. With the exception of disease due to CoQ deficiency, no proven cures exist and treatments focus on relieving symptoms[].

While influenced by environmental factors, most cases of mitochondrial dis- eases are believed to caused by genetic factors. Genetically, mitochondrial dis- eases exhibit maternal (mitochondrial), autosomal dominant, autosomal reces- sive and X-linked modes of inheritance. Mutations in  mitochondrial and  nuclear genes have been implicated in mitochondrial disease []. mtDNA mu- tations may be homoplasmic, being present in all molecules, or heteroplasmic, being present in only a fraction. Patients with heteroplasmic mutations present with symptoms only when heteroplasmy exceeds a certain level, a phenomenon known as the threshold effect. ese mitochondrial mutations have been more thoroughly studied than those of nuclear localization since the .KB mtDNA is much easier to sequence than the full human genome. e majority of mito- chondrial disease cases (-) are believed to be due to nuclear mutations[], yet as of  only half of patients with complex I disease could be diagnosed with the current list of Mitochondrial disease genes []. Nuclear proteins that cause mitochondrial disease are not just found in structural subunits of the respi- ratory chain complexes, but also include proteins that fall in a variety of processes

 necessary for mitochondrial formation and function including those essential for mtDNA replication, transcription, RNA-processing, translation, assembly factors and mitochondrial membrane dynamics and composition.

Historically, clinical genetic evaluation of patients with mitochondrial disease has been limited to sequencing of the mtDNA or a handful of candidate genes, but it is hard to prioritize candidate genes as there is often a poor link between genotype and phenotype in mitochondrial disease. A recent literature review of

 patients with nuclear encoded complex I deficiency found that most children had a severe multi-system disease with prominent neurological involvement, but actual symptoms and onset varied and no clear genotype-phenotype correlations were found[]. mtDNA mutations are also difficult to interpret as some show variable levels of penetrance. For example, the majority of families with the mi- tochondrial disease of Leber’s hereditary optic neuropathy are often homoplastic for mtDNA mutations; however, only a subset of individuals in these families de- velops disease, leading to the hypothesis that there are additional nuclear factors to be identified[].

. DNA S   

Over the past five years, as human genome sequencing transformed from a re- search grade effort costing ,[] to a clinical test orderable by doctors for under ,, it has become practical for researchers to sequence individual patients. is price reduction was enabled by the creation of massively-parallel next generation sequencing (NGS) instruments, which were spurred by a series of grants that succeeded the sequencing of the human genome [] and hope to reduce costs to , and beyond. While the first generation sequencing instru-

 ments used to sequence the human genome were cable of generating hundreds of concurrent sequences of ~ bases, second-generation sequencing instru- ments are cable of generating data from one or both ends of hundreds of millions of short DNA fragments in parallel, though these sequences are much shorter with read lengths of -bp. To discover variants, these reads are first mapped to a reference genome. Single changes, referred to as single-nucleotide vari- ants (SNVs), and small insertions and deletions (indels) are detected by evaluation reads that overlap each reference base. Larger variants, referred to as structural variants, also exist and are discussed more in Chapter .

As it is still somewhat expensive to conduct whole genome sequencing of all  billion bases in a human haploid genome, many researchers opt instead to perform exome sequencing, which costs a tenth to a third as much. In this form of targeted sequencing, DNA is processed to enrich for coding DNA sequences before being placed on a sequencer. A variety of selection methods exist[, ], but most projects, including those described in this thesis, use a liquid phase capture strat- egy known as hybrid selection []. Early exome designs targeted the Mb (~ of the genome) defined by the Consensus CDS (CCDS) []; however this target only accounts for about  of the , human genes. Larger targets based on

GENCODE Encyclopedia of genes have since been developed []. e first study to show that exome sequencing could identify a known candidate was per- formed in  unrelated cases of Freeman-Shlend syndrome, who were shown to all have mutations in MYH[]. Since then, there has been a flurry of Mendelian disease genes discovered through exome sequencing [], including many that cause mitochondrial disease [, , , , –, , , , , ].

Exome sequencing is not perfectly suited for a cohort of patients with mito-

 chondrial diseases. First, mtDNA is not reliably captured by a standard exome target and and CCDS target lacks coverage for several genes in MitoCarta[]. Ad- ditionally, > of mitochondrial disease genes are localized in the mitochondria, making sequencing of an entire exome seem inefficient. In this thesis, we present an alternative: MitoExome sequencing. We target <Mb of DNA that consists of the mtDNA and exons from over , nuclear genes. ese nuclear genes come from the MitoCarta inventory, but also include additional candidate mitochon- drial genes and known disease genes that cause diseases phenotypically similar to mitochondrial disease.

. T 

is thesis begins with Chapter , which introduces MitoExome sequencing and benchmarks its performance on  patients with infantile mitochondrial disease.

A case study showing how discovery of a new disease can lead to new biology can be found in Chapter . Chapter  describes efforts to increase the diagnostic yield by attempting to call large amplifications and deletions. is thesis concludes with a summary of implications and future directions in Chapter .

 2 Molecular diagnosis of infantile mitochondrial disease with targeted next-generation sequencing

  - sequencing (NGS) promise to facilitate diag- A nosis of inherited disorders. Although in research settings NGS has pin- pointed causal alleles using segregation in large families, the key challenge for clin- ical diagnosis is application to single individuals. To explore its diagnostic use, we

 performed targeted NGS in  unrelated infants with clinical and biochemical evi- dence of mitochondrial oxidative phosphorylation disease. ese devastating mi- tochondrial disorders are characterized by phenotypic and genetic heterogeneity, with more than  causal genes identified to date. We performed “MitoExome” sequencing of the mitochondrial DNA (mtDNA) and exons of ~ nuclear genes encoding mitochondrial proteins and prioritized rare mutations predicted to dis- rupt function. Because patients and healthy control individuals harbored a com- parable number of such heterozygous alleles, we could not prioritize dominant- acting genes. However, patients showed a fivefold enrichment of genes with two such mutations that could underlie recessive disease. In total,  of  () pa- tients harbored such recessive genes or pathogenic mtDNA variants. Firm diag- noses were enabled in  patients () who had mutations in genes previously linked to disease. irteen patients () had mutations in nuclear genes not pre- viously linked to disease. e pathogenicity of two such genes, NDUFB and AGK, was supported by complementation studies and evidence from multiple patients, respectively. e results underscore the potential and challenges of deploying NGS in clinical settings.

. I

Advances in next-generation sequencing (NGS) technology are facilitating the dis- covery of new disease genes and promise to transform the routine clinical diag- nosis of inherited disease. Since the first reports in  [, ], NGS has aided in the discovery of more than  new disease genes in research settings.

Although sequence-based clinical diagnosis has long been available for individual genes, NGS could be particularly useful for conditions characterized by genetic

 heterogeneity, where individual genes cannot be readily prioritized for traditional sequencing. e success of NGS for clinical use, however, hinges on distinguish- ing the small number of pathogenic alleles from the thousands of DNA variants present in each person. When DNA is available from large affected families or from individuals with a closely related phenotype, variants have been successfully prioritized on the basis of segregation with disease. However, it is not clear how useful NGS-based diagnostics will be in the more typical clinical scenario where an individual has no clear family history of disease and where the mode of inheritance is ambiguous.

Human oxidative phosphorylation (OXPHOS) disease represents a challenging class of disorders that could benefit tremendously if sequence-based tests could provide accurate diagnosis. Inherited defects in OXPHOS affect at least  in  live births [] and are characterized by a biochemical defect in the respiratory chain. OXPHOS disease is clinically heterogeneous because it can present early in infancy or in adulthood, can be severe or mild in its presentation, and typi- cally affects multiple organ systems. Clinical manifestations can include, but are not limited to, myopathy, lactic acidosis, seizures, ataxia, peripheral neuropathy, blindness, deafness, gastrointestinal dysmotility, liver failure, and bone marrow dysfunction[]. OXPHOS disease can show maternal, recessive, dominant, or X- linked inheritance, although most cases are sporadic and have no obvious family history [, ]. Biochemical diagnosis typically requires invasive biopsies and even then can be inconclusive.

Genetically, OXPHOS disease can be caused by mutations either in the mito- chondrial DNA (mtDNA) or in the nuclear genome. It is possible to routinely re- sequence the mtDNA, but lesions in this tiny genome likely account for no more

 than  to  of all childhood cases [, ]. A recent review listed  nu- clear disease genes, identified largely through pedigree analysis, with up to  new

OXPHOS disease genes described each year []. Most involve recessive, loss- of-function alleles, although nine genes can harbor dominant-acting mutations

(POLG, POLG, Corf, RRMB, SLCA, OPA, CYCS, MFN, and HSPD)[].

Individual sequence-based diagnostic tests are readily available for each of  genes

[] and, more recently, NGS gene panels [, ]; however, pleiotropy makes it difficult for clinicians to predict which specific genes may be mutated in a given case []. Moreover, the known disease genes likely account for only a fraction of the total genetic burden of these disorders []. Given that  of the known nu- clear disease genes encode mitochondrial-localized proteins (Supplementary Ma- terial in Appendix A), the full set of ~ genes encoding mitochondrial proteins represents strong candidates for OXPHOS disease []. e extensive het- erogeneity, allelic heterogeneity, and pleiotropy of OXPHOS disease, combined with the availability of a focused set of ~ high-confidence candidate genes encoding the known mitochondrial proteome, make OXPHOS disorders an ideal application area for sequence-based diagnostics. ree recent studies reported the technical feasibility of sequencing several hundred of these genes [, , ], and NGS diagnostic tests are newly available for panels of disease-related genes

[, ]. We performed targeted NGS to capture and sequence the entire mtDNA and the exons of the ~ genes encoding the mitochondrial proteome (the “Mi- toExome”). After benchmarking the performance on several DNA samples, we ap- plied MitoExome sequencing to  unrelated patients with clinical and biochemi- cal evidence of infantile OXPHOS disease in whom a molecular diagnosis was not previously available (Table .).

 Table 2.1: Patient clinical, biochemical and molecular characteristics. Patients are ordered by primary biochemical defect. Where appropriate, age at death is shown in parentheses. Family history (Hx) includes consanguinity (cons.), with level of con- sanguinity noted (1st cousin, 2nd cousin, or 1st cousin once removed indicated by 1st, 2nd, and 1st+1) and/or sibling with fea- tures of OXPHOS disease (sib.). Biochemical features were measured in up to four tissues (skm, skeletal muscle; liv, liver; fib, fi- broblasts; hea, heart) and expressed relative to the marker enzymes citrate synthase and complex II. Enzyme activity per tissue per complex (I, II, III, IV) were categorized as ↓↓ markedly deficient; ↓ moderately deficient; ∼, equivocal; or nl, normal. MtDNA quantity denotes % in given tissue relative to the mean value for normal controls, with values <20% regarded as mtDNA depletion. Gene names shown in bold represent genes with mutations that are believed to cause the clinical and biochemical phenotype. Pre- viously reported OXPHOS disease genes are marked with an asterisk. Other abbreviations: na/nd, not available/not determined; d/wk/mo/yr, day/week/month/year; M/F, Male/Female; DD, Developmental Delay; Dev., Developmental; FTT, Failure To Thrive; IUGR, Intra-Uterine Growth Restriction; LFT, Liver Function Test; CSF, CerebroSpinal Fluid; URTI, Upper Respiratory Tract Infec- tion; GI, GastroIntestinal; del. Deletion; dep., Depletion.

ID Sex Age of Clinical Presenta- Family Biochemical Features Gene Can-

 onset tion Hx didate(s) (death) Tissue I II III IV mtDNA quantity Complex I P M <mo lethal infantile mi- fib ↓↓ nd *ACAD tochondrial disease P F <wk hypotonia, fib ↓↓ ↓ ↓ nd *POLG (wk) lethargy, respi- ratory distress, metabolic acidosis, FTT Continued on next page Table . – Continued from previous page ID Sex Age of Clinical Presenta- Family Tissue I II III IV mtDNA Gene Can- onset tion Hx quantity didate(s) (death) P F <wk severe IUGR, FTT, skm, fib ↓↓ nd NDUFB (mo) recurrent episodes of metabolic acido- sis P F in utero multiple pterygium cons. liv ↓↓ nd AKRB (in syndrome, severe (st) fib ∼ nd utero) fetal hydrops, IUGR, terminated

 at w P F <wk lethargy, tachypnea, liv ↓↓  - (yr) organomegaly, renal skm,hea ↓↓ nd & liver failure P M <wk bilateral optic skm,liv,fib ↓ nd - nerve hypoplasia, lethargy, FTT P M <mo cystic leukodystro- skm,liv,fib ↓↓ nd - (mo) phy, FTT, severe metabolic acidosis P F <m leukodystrophy, skm ↓↓ nd - (yr) DD, abnormal LFTs, fib ↓↓ nd vasculitic rash liv ∼ nd Continued on next page Table . – Continued from previous page ID Sex Age of Clinical Presenta- Family Tissue I II III IV mtDNA Gene Can- onset tion Hx quantity didate(s) (death) P F in utero leukodystrophy, liv ↓↓  - (d) IUGR, metabolic skm,fib ↓↓ nd acidosis, seizures P F <yr FFT, DD, seizures, skm ↓ nd - (na) hypotonia, promi- nent extra-axial CSF space Complex III P F <wk severe global skm,liv ↓↓  *BCSL

 DD, tachypnea, fib nl nd metabolic acido- sis, renal tubular acidosis, FTT P M <wk severe IUGR, cons. skm ↓ ↓↓ nd *TYMP, (na) lethargy, metabolic (st). fib ∼ ↓↓ nd MTCH, acidosis, renal Corf tubular acidosis, dysmorphic facies P M <mo hypotonia, FTT, cons. skm ↓↓ nd ACADSB DD, short stature, (st) liv ∼ nd rickets, pruritis, dysmorphic facies Continued on next page Table . – Continued from previous page ID Sex Age of Clinical Presenta- Family Tissue I II III IV mtDNA Gene Can- onset tion Hx quantity didate(s) (death) P M <wk fetal hypokine- skm,fib ↓↓ nd UCP, (na) sia, Pierre Robin MTHFDL sequence, intra- abdominal calcifica- tion P F in utero ventriculomegaly, skm ∼ ↓↓ nd UQCR (yr) apnea, fib ↓ ↓↓ nd dev.regression,

 hypotonia, seizures P M <wk cardiorespiratory fib ↓↓ nd - arrest, metabolic acidosis, renal tubular acidosis Complex IV P M <wk neonatal lactic skm ↓↓ nd *COXB acidosis, cystic fib nl nd leukodystrophy P M <wk hypsarrhythmia, skm ∼ ∼ ↓↓ nd *GFM, FTT, dystonia, fib ↓ nd ACOX squint Continued on next page Table . – Continued from previous page ID Sex Age of Clinical Presenta- Family Tissue I II III IV mtDNA Gene Can- onset tion Hx quantity didate(s) (death) P F <mo DD, seizure, hypo- cons. fib ↓↓ nd ACAD (mo) tonia, doesn’t fix & (st+) follow, cortical atro- phy P F <yr mild global DD, hy- sib. skm ↓↓ nd Corf, potonia, mild cere- fib nl nd MTERF bral atrophy P M mo Leigh syndrome, cons. fib ↓↓ nd -

 (mo) global DD, FTT, (nd) hypotonia, seizures P F in utero IUGR, oligohydram- cons. skm,liv ↓↓ nd - nios, severe global (st), fib nl nd DD, seizures, micro- sib. cephaly, hyperther- mia P F <mo profound hypo- skm ↓↓  - tonia, paucity of movement P M <yr DD, hypotonia, ap- skm ↓↓  - nea, spastic diple- fib ↓ nd gia, dysmorphic Continued on next page Table . – Continued from previous page ID Sex Age of Clinical Presenta- Family Tissue I II III IV mtDNA Gene Can- onset tion Hx quantity didate(s) (death) P M <mo Leigh-like syn- skm ↓↓ nd - drome, respiratory failure, lactic acido- sis, RRF Combined OXPHOS P M wk FTT, metabolic cons. skm ↓↓ ↓↓ ↓↓ nd LYRM deficiencies acidosis, hep- (st) liv ∼ ∼ ∼ nd atomegaly, apnea fib nl nl nl nd P M <wk FTT, hypotonia, cons. liv ↓↓ ↓↓ ↓↓  *TSFM  (mo) hypothermia, (st), fib ↓ ↓ ↓ nd hypertrophic car- sib. diomyopathy, hepatomegaly P F <mo FTT, hyperto- cons. fib ↓ ↓ ↓ nd *TSFM (mo) nia, hypertrophic (st), cardiomyopathy, sib. respiratory arrest P F <yr DD, seizures, hy- liv ↓↓ ↓↓  *GFM potonia, episodic metabolic acidosis Continued on next page Table . – Continued from previous page ID Sex Age of Clinical Presenta- Family Tissue I II III IV mtDNA Gene Can- onset tion Hx quantity didate(s) (death) P M <mo FTT, DD, sen- cons. skm ↓↓ ↓↓ ↓↓  - sorineural deafness, (st), fib ↓ ↓ ↓ nd renal failure, dys- sib. morphic P M <yr hypertrophic car- hea ↓↓ ↓↓ nd Corf diomyopathy fib nl nl nd P M <mo gastroesophageal skm ↓↓ ↓↓ ↓↓ nd EARS reflux, FFT, hypo-

 tonia, respiratory distress, lactic acidosis P M <yr FTT, dev.regression, skm ↓↓ ↓↓ ↓↓  ALDHB, (mo) microcephaly liv ∼ ↓↓ ↓↓ nd mtDNA del. P F <wk FTT, cardiac ar- skm ↓↓ ↓↓ ↓↓  - (mo) rest, hypertrophic fib ↓ ↓ ∼ nd cardiomyopathy liv ∼ nl nl nd P F <wk metabolic acidosis, liv ↓↓ ↓↓  - (d) cardiac failure, fib nl nl nd hemorrhagic brain infarct Continued on next page Table . – Continued from previous page ID Sex Age of Clinical Presenta- Family Tissue I II III IV mtDNA Gene Can- onset tion Hx quantity didate(s) (death) P M <yr sudden-onset en- skm ↓↓ ∼ ↓↓  - cephalopathy with liv ∼ ∼ ↓ nd seizures after URTI P F <yr DD, ptosis, mi- skm ↓↓ ↓↓  - crocephaly, GI liv nl nl nd dysmotility P F <yr leukodystrophy, skm ↓↓ ∼ ↓↓  - speech delay, fib nl nl nl nd

 dysaesthe- sia in hands, dev.regression P M <yr Leigh syndrome, skm ↓ ↓ ↓ nd - global DD, de- fib ↓ ↓ ∼ nd terioration after URTI mtDNA dep. P F in utero stillborn with hypo- skm ↓↓ ↓↓ ↓↓  *AARS (still- tonia and multiple fib nl nl nl nd born) fractures P F <yr cardioskeletal my- skm ↓↓ ↓↓ ↓↓  AGK (y) opathy, cataracts, FTT, fatigue Continued on next page Table . – Continued from previous page ID Sex Age of Clinical Presenta- Family Tissue I II III IV mtDNA Gene Can- onset tion Hx quantity didate(s) (death) P F <wk marked respiratory cons. skm ↓↓ ↓↓ ↓↓  AGK (d) distress, pulmonary (st) liv nl nl nl  hypertension, cataracts  . R

.. ME   

We targeted for sequencing the entire mitochondrial genome and all coding exons of  nuclear genes encoding mitochondrial proteins based on the MitoCarta inventory [] (Table S available on the attached CD). ese regions were en- riched using hybrid selection and then sequenced on an Illumina GAII. For each individual, we generated ~ billion bases of sequence, which yielded an average coverage of ~× at each targeted nuclear base and ~,× at each mtDNA base (Table .). On average,  of targeted bases were covered and  of tar- geted bases exceeded the × coverage threshold required for confident analysis, defined as  power to detect a variant (Table .).

Table 2.2: MitoExome sequencing statistics. *Median values per patient, with range across patients indicated in brackets. † Excluding reads with ambiguous align- ment

Targeted DNA Targeted DNA* Targeted DNA*  gene loci   target size (bp) ,, , coverage† mean coverage  [–] , [,-,]  target bp covered ≥X  [–]  []  target bp covered ≥X  [–]  []  target bp covered ≥X  [–]  []  variants compared to reference DNA  [–]  [–]  rare  [–]  [–]  rare, protein-modifying  [–]  [–]  genes with  rare, protein-modifying alleles  [–] NA

e accuracy and reproducibility of MitoExome variant detection was assessed using DNA obtained from the parents and daughter of a family with indepen-

 dent sequence data available through HapMap []. Detection of nuclear single- nucleotide variants (SNVs) was  sensitive and  specific based on  and

 sites genotyped using independent technology. Specifically, there was  genotype concordance at heterozygous sites,  concordance at homozygous sites, and . concordance at sites with at least  reads. Similarly, nuclear SNV detection was  sensitive based on  variant sites detected by whole-genome sequencing of these samples (. concordance at sites with at least  reads).

Two technical replicates of MitoExome sequencing showed  concordance at variant sites, consistent with the reported sensitivity. mtDNA SNV detection was

 sensitive and  specific based on five patient samples for which indepen- dent data were available.

MitoExome sequencing was next applied to whole-genome amplified DNA ob- tained from  unrelated patients who fulfilled the clinical and biochemical crite- ria for OXPHOS disease before  years of age (Table .). ese patients included

 from consanguineous families and  additional patient with an affected fam- ily member, although none of these families were large enough to identify a sin- gle locus by a traditional mapping approach. Each sequenced individual harbored about  nuclear SNVs,  small nuclear insertions or deletions (indels), and  mtDNA variants compared to the reference human genome (Table . and Table

S available on the attached CD).

.. V 

e key challenge for interpreting sequence data for molecular diagnosis is to dis- tinguish causal alleles from the thousands of variants present in each individual.

As we focused on rare, devastating clinical phenotypes, we hypothesized that the

 underlying causal variants would be uncommon in the general population, disrup- tive to protein function, and would either be inherited in an autosomal recessive fashion from unaffected carrier parents or be de novo, dominant-acting mutations.

We therefore highlighted all nuclear variants that were rare and predicted to be protein-modifying (Materials and Methods), as well as all variants previously as- sociated with disease in the Human Gene Mutation Database (HGMD) (Table .).

If rare, protein-modifying variants are enriched for causal alleles, we would ex- pect increased prevalence in patients compared to healthy individuals. erefore, we analyzed the MitoExome regions in  healthy individuals of European an- cestry with available whole-exome sequence data. To avoid overestimating patient variants, we limited this comparison to the  patients from nonconsanguineous families and to MitoExome regions well covered in all individuals ( of all au- tosomal coding exons, see Materials and Methods). Genes containing only a sin- gle rare, protein-modifying variant showed only a .-fold enrichment in patients compared to controls (P = .), and this subtle enrichment is likely caused by greater ethnic diversity in patients (Fig. .A and Fig. A.). However, we observed a marked .-fold enrichment of genes containing two such prioritized alleles in patients compared to controls (P =  × ) (Fig. .A and Materials and Meth- ods). ere was no enrichment observed in the  auxiliary genes sequenced

(that had only weak evidence of mitochondrial localization), suggesting that the enrichment is not due to population stratification (Fig. .A). Forty-five percent of all nonconsanguineous cases contained mitochondrial genes with two priori- tized variants compared to only  of controls (.-fold enrichment, P =  × )

(Fig. .B). us, the background rate of rare, predicted deleterious heterozygous variants makes it difficult to spotlight dominant-acting, heterozygous causal al-

 Figure 2.1: Enrichment of prioritized variants in cases compared to healthy individuals. (A) Mean number of genes containing rare, protein-modifying alle- les in cases and healthy controls within the 1034 mitochondrial genes and the 347 auxiliary genes sequenced. Error bars indicate SE. (B) Percent of cases and con- trols containing prioritized mitochondrial genes (red), defined as harboring two rare, protein-modifying alleles. (C) Percent of cases and controls containing prioritized mi- tochondrial genes, excluding variants with allele frequency of >0.005 in either cases or controls, with enrichment in cases versus controls displayed above. Analysis was re- stricted to regions with sequence coverage in all individuals and to the 31 cases from nonconsanguineous families. leles from individual DNA samples. However, the excess of genes harboring two prioritized variants provides a signal to distinguish recessive variants. erefore, we prioritized only variants consistent with a recessive mode of pathogenesis, and denoted “prioritized genes” as those containing two such alleles.

We additionally prioritized two classes of mtDNA variants: variants annotated as pathogenic in MITOMAP [] at greater than  heteroplasmy (as estimated by sequence reads) [], and structural deletions and rearrangements as detected by de novo mtDNA assembly. Detection of heteroplasmic variants via deep se- quencing was previously validated on the basis of mixing experiments [].

For the nine genes known to harbor dominant mutations in OXPHOS disease, we carefully evaluated all rare, protein-modifying variants and all variants that were reported as pathogenic in HGMD. As described in the Supplementary Mate-

 rial, none of the six such variants appeared to be dominant mutations associated with the observed patient phenotypes.

.. E     

We note that the fivefold enrichment of prioritized genes in cases compared to controls is likely to be a conservative estimate, especially as databases of allele fre- quencies improve. Indeed, many prioritized variants showed greater than . allele frequency in the  control samples. When we excluded variants exceeding

. allele frequency from cases and  controls (similar to exclusion criteria based on  genomes and dbSNP thresholds), we observed a .-fold enrich- ment in nonconsanguineous cases versus controls (P =  × ) (Fig. .C). is corresponded to a marked .-fold enrichment (P =  × ) specifically within the  established disease-related genes [] and .-fold enrichment (P =  ×

) within the remaining  mitochondrial genes (Fig. .C). us, the  infan- tile patients are enriched for prioritized variants in both known pathogenic genes and candidate genes not previously linked to disease.

.. P    

Of the  patients with severe OXPHOS disorders,  () harbored at least one “prioritized gene” or mtDNA variant:  () of these were in known, nu- clear OXPHOS disease–related genes,  () were in nuclear genes not pre- viously linked to disease, and  patient () harbored a large mtDNA deletion

(Fig. .A). e remaining  () patients lacked prioritized genes or mtDNA variants. Most prioritized genes harbored missense changes at amino acid residues highly conserved across evolution (Fig. .B). Because there was about fivefold en-

 Figure 2.2: Prioritized variants in 42 patients with OXPHOS disease. (A) Forty-two patients categorized by the presence of prioritized genes (red). The gene names are listed alphabetically at right, with parentheses indicating genes prioritized in two unrelated patients, and triangle indicating support of pathogenicity. (B) Fifty- two prioritized alleles, categorized by type of protein modification. richment in patients with prioritized genes compared to controls, we can estimate that about  of the  patients () will contain causal variants in prioritized genes. Determining which of the prioritized DNA variants are actually causal will require additional experimental evidence.

.. M    DNA

rough de novo assembly of the mtDNA (Materials and Methods), we discovered a .-kb deletion in the mitochondrial genome in patient P (Fig. .A). is pa- tient presented with progressive developmental delay and regression, failure to thrive, microcephaly, lactic acidosis, and death after a sudden metabolic and neu- rological decompensation. Muscle biopsy showed ragged red fibers and reduced cytochrome c oxidase (COX) staining in affected fibers. e deletion (m._ del with imperfect tandem repeats at both ends) has been previously reported

[] and was present in  of sequence reads at this locus. e deletion was confirmed by quantitative polymerase chain reaction (qPCR) to be at  het-

 Figure 2.3: mtDNA deletion in patient P33. (A) Schematic diagram of mtDNA indicates deletion (black arc) relative to position 0/16,569 (black triangle). Arrows indicate LR-PCR primers. (B) mtDNA sequence coverage in 50-bp windows. Inset shows Sanger electropherogram of breakpoint. (C) Gel electrophoresis of LR-PCR amplicon displays a 14,939-bp fragment in control DNA and 7681-bp fragment in P33, along with marker II (Roche) and mtDNA from individuals with confirmed single and multiple deletions. eroplasmy in genomic DNA from affected muscle tissue. Because of mtDNA pro- liferation ( compared to control), the actual amount of nondeleted mtDNA present in muscle of this patient was  of control. e deletion was confirmed by long-range PCR (LR-PCR) (Fig. .C), and the breakpoints were mapped by

Sanger sequencing (Fig. .B). us, MitoExome sequencing enables simultane- ous detection of mtDNA point mutations and deletions.

.. M      



Ten patients harbored recessive mutations in eight genes previously shown to cause OXPHOS disease: ACAD [, ], POLG [, ], BCSL [], COXB

[], GFM [], TSFM [], AARS [], and TYMP [] (Fig. .A). All such variants were confirmed through Sanger sequencing. We established compound heterozygosity via sequencing familial DNA, complementary DNA (cDNA), cloning,

 or molecular haplotyping [] because short sequence reads do not typically pro- vide the ability to determine whether variants are in cis or trans.

With additional experiments, we established a firm genetic diagnosis for nine of these patients (Table .). ese methods included the demonstration that the detected alleles segregated with disease in a family, that they were rare in controls, and/or that they caused reduced cellular abundance of full-length mRNA tran- scripts, protein products, or OXPHOS subunits and assembly factors (Table . and Materials and Methods). e enzyme ACAD [, ], recently linked to complex I deficiency [Mendelian Inheritance in Man (MIM) ], was mutated in a male infant with isolated complex I deficiency and lethal infantile mitochon- drial disease (Fig. A.). A female infant with encephalopathy and clear complex

I deficiency with less marked deficiency of complexes III and IV harbored previ- ously reported causal mutations in POLG [, ], which encodes the mtDNA polymerase and is the most commonly mutated nuclear gene underlying OXPHOS disease () (Fig. A.). e complex III assembly factor gene BCSL [] which is the most commonly mutated gene in complex III deficiency (MIM ) [], was mutated in an infant girl with complex III deficiency (Fig. A.). Complex

IV subunit COXB [] was mutated in a boy with neonatal onset of mito- chondrial encephalopathy, metabolic acidosis, and complex IV deficiency (MIM

) (Fig. A.). e protein GFM (MIM ) [], which is required for translational elongation of mtDNA-encoded proteins, was mutated in two unre- lated patients with mitochondrial encephalopathy presenting within the first year of life (Fig. A.). A second elongation factor, TSFM [], was mutated in one pa- tient with combined OXPHOS deficiencies (MIM ) and in an unrelated pa- tient with cardioencephalomyopathy (Fig. A.). A stillborn fetus with mitochon-

 drial myopathy harbored one new and one previously reported causal variant in the mitochondrial translational protein AARS, which was recently shown to un- derlie fatal infantile cardiomyopathy (MIM )[] (Fig. A.).

Although in all nine cases the MitoExome genetic diagnosis was consistent with the patient phenotype, the clinical presentation alone was never specific enough to suggest the underlying gene as the most likely candidate for an available gene- based sequence diagnostic test, except perhaps for BCSL. In the remaining th case, the compound heterozygous TYMP mutations in patient P do not fit with the clinical presentation, complex III defect, or familial consanguinity, and it is possible that homozygous mutations in candidate genes MTCH or Corf may prove to be causal.

 Table 2.3: Support of molecular diagnosis in 13 patients. *Indicates genes previously associated with OXPHOS disease. Abbre- viations: cmpd het, compound heterozygous; hom., homozygous; het, heterozygous; mut, heteroplasmic mutant load.

ID Disease sub- Gene Type Mutations ACMG cate- Support of pathogenicity type gory[] P complex I *ACAD cmpd het c.G>C,  conservation, protein degrada- deficiency p.AP[] tion (MIM:) c.T>A,  p.IN P complex I *POLG cmpd het c.G>A,  deficiency p.GS[] (MIM:)  c.C>T,  p.RW[] P complex I NDUFB hom. c.T>C,  Mother carrier, no deletion or deficiency p.WR isoparental disomy detected by (MIM:) Affy.M & Illumina K ar- rays, protein degradation, conser- vation, correction studies P complex III *BCSL cmpd het c.C>T,  conservation, mRNA and protein deficiency p.RX[] degradation (MIM:) c.C>T,  p.RC Continued on next page Table . – Continued from previous page ID Disease sub- Gene Type Mutations ACMG cate- Support of pathogenicity type gory[] P complex IV *COXB hom. c.A>C,  protein degradation, family segre- deficiency p.TP gation (MIM:) P complex IV de- *GFM cmpd het c.delT,  mRNA degradation, protein ficiency p.ENfsX degradation, family segregation c.C>T,  p.RC P combined *GFM cmpd het c.delT,  mRNA degradation, protein deficiency p.ENfsX degradation maternal carrier

 (MIM:) c.delT c.A>G,  p.KE P combined *TSFM hom. c.C>T,  family segregation deficiency p.RW[] (MIM:) P combined *TSFM hom. c.C>T,  family segregation deficiency p.RW[] (MIM:) P combined defi- mtDNA  mut m._  qPCR, LR-PCR and sequencing ciency deletion del [] P mtDNA deple- *AARS cmpd het c.G>A,  conservation, protein degrada- tion p.RH tion Continued on next page Table . – Continued from previous page ID Disease sub- Gene Type Mutations ACMG cate- Support of pathogenicity type gory[] c.C>T,  p.RW[] P myopathic AGK cmpd het c.+T>C,  independent mutations in  pa- mtDNA p.KQfsX tients depletion (MIM:) c.T>A,  p.YX P myopathic AGK hom. c.+G>T,  independent mutations in  pa-

 mtDNA p.SEfsX tients, family segregation depletion (MIM:) .. M   

About one-third of patients harbored prioritized DNA sequence variants in genes never before linked to OXPHOS disease. In total,  such candidate disease genes were identified in  patients (Fig. .A and table S) after excluding mutations that were not compound heterozygous (MRPL) and mutations that did not seg- regate with disease in available familial DNA (COX, FAMA). Our collec- tion of candidate genes is likely greatly enriched for new, bona fide mitochon- drial disease genes, but additional proof of pathogenicity is required. One method is to identify independent mutations of the same gene in unrelated individuals presenting with exquisitely similar phenotypes, as was recently shown for Miller

[], Kabuki [], and Bartter [] syndromes. A second definitive method is to complement a cellular defect present within patient cells by introducing a wild- type copy of the locus, either through cybrid fusions for mtDNA defects [] or through overexpression of a nuclear gene []. We obtained support of pathogenic- ity for two of the new disease genes, as described below.

.. AGK       DNA -



One new candidate gene, acylglycerol kinase (AGK), had recessive mutations in two unrelated patients. is protein [also known as multisubstrate lipid kinase

(MuLK)] phosphorylates monoacylglycerols and diacylglycerols to lysophospha- tidic acid and phosphatidic acid, respectively []. e two unrelated patients each presented with severe myopathy, combined complex I, III, and IV deficiency, bi- lateral cataracts, and severe mtDNA depletion in skeletal muscle ( and  of normal). Detailed clinical histories are available in the Supplementary Material.

 Currently, only two genes are known to underlie the myopathic form of mtDNA depletion [][TK/MIM  [] and RRMB/MIM  []], and coding mutations in both these genes were excluded. Our two unrelated patients harbored three severe mutations in AGK (Fig. . and Fig. A.): patient P har- bored a homozygous splice variant that caused a shortened transcript with a pre- mature termination codon (c.+G>T, p.SEfsX), and patient P har- bored a compound heterozygous nonsense variant (p.YX) and splice variant that caused a shortened transcript with a premature termination codon (c.+T>C, p.KQfsX), inherited respectively from her mother and father. All mRNAs were stable and not affected by nonsense-mediated decay. All three mutations would cause deletion of a C-terminal region that is highly conserved across evolu- tion and thus likely to be important for proper protein function, especially because it contains the conserved C domain from related sphingosine-type kinases []

(Fig. .). We screened for coding AGK mutations in eight additional individuals with myopathic mtDNA depletion but detected no further mutations.

Our identification of independent, likely deleterious mutations in AGK in pa- tients with myopathic mtDNA depletion suggests that AGK is a new locus under- lying this syndrome and may offer a link between lipid metabolism and the control of mtDNA copy number.

.. E    NDUFB

e complex I subunit NDUFB was mutated in patient P, a girl from a noncon- sanguineous family with complex I deficiency and lethal infantile mitochondrial disease. She harbored an apparently homozygous c.T>C (p.WR) mutation.

e p.W residue is highly conserved (conserved in  of  vertebrates) and

 Figure 2.4: Patient mutations in acylglycerol kinase (AGK). Schematic dia- gram shows protein structure with the location of truncating mutations in unrelated patients P41 and P42 shown in red and a box indicating the diacylglycerol kinase catalytic domain. Conservation of C-terminal region is shown below, with identical residues shaded and the C5 domain shared with yeast sphingosine kinases underlined.

 lies adjacent to an important lysine residue that undergoes N-acetylation within the N terminus (Fig. A.). Sanger sequencing identified the proband’s mother as a heterozygous carrier for this mutation. Paternal DNA was unavailable for testing. To assess possible DNA deletions or uniparental isodisomy, we analyzed proband and maternal DNA using single-nucleotide polymorphism (SNP) arrays.

No large deletions were evident, and isodisomy of  was excluded on the basis of the SNPduo identity by state algorithm []. However, the CNVPar- tition v.. algorithm (Illumina) detected a .-Mb long contiguous stretch of homozygosity (LCSH) encompassing NDUFB in the proband’s DNA (Fig. A.).

e maternal sample did not show statistically significant evidence of LCSH. Anal- ysis of the patient’s fibroblast cDNA (with and without cycloheximide) showed a single stable transcript of expected size, indicating that the mutation did not alter mRNA splicing or the presence of any small indels below the detection range of the cytogenetic arrays.

We next performed a complementation experiment to assess whether the in- troduction of wild-type cDNA into subject fibroblasts rescued the defect in com- plex I activity. Fibroblasts from this individual showed a strong complex I defect, with ~ residual complex I activity when assayed by spectrophotometric en- zyme assay and < residual complex I activity when assayed by dipstick enzyme assay. Using a lentiviral expression system, we transduced subject fibroblasts with wild-type cDNA. Expression of wild-type NDUFB rescued complex I activity and protein levels in fibroblasts from subject P but not from a previously diagnosed patient with complex I deficiency, who had pathogenic mutations in Corf[] and vice versa (Fig. .), establishing NDUFB as the causal gene in this case. is is the  nuclear encoded complex I subunit gene in which mutations have been

 shown to cause complex I deficiency in humans.

. D

We performed MitoExome sequencing of the entire mtDNA and exons of  nuclear genes encoding mitochondrial proteins, including all  nuclear OXPHOS disease–related genes that were reviewed recently []. We applied MitoExome sequencing to  unrelated patients with a spectrum of early-onset OXPHOS dis- orders who lacked molecular diagnoses. Notably, we did not know what percent of unsolved OXPHOS cases would be expected to be due to mutations in estab- lished loci, because, to our knowledge, no study had sequenced even the  known disease-related genes in a collection of cases. We found that only  of unsolved cases were due to mutations in the known disease loci (including one mtDNA dele- tion and nine nuclear gene defects shown in Fig. .), highlighting the locus het- erogeneity of OXPHOS disease. A further  of patients harbored rare, protein- modifying, recessive variants in candidate genes not previously linked to OXPHOS disease. Given that such variants exhibit fivefold enrichment in cases over back- ground, they are likely enriched for truly causal alleles. We performed comple- mentation experiments to firmly establish pathogenicity for NDUFB in complex

I deficiency, and based on independent mutations in P and P, we suggest a new link between the gene AGK and myopathic mtDNA depletion through an un- known mechanism.

Recessive mutations in  additional genes, not previously linked to mitochon- drial OXPHOS disease, were identified. Formal proof of pathogenicity will be es- tablished by cDNA complementation experiments in cellular models or patient

fibroblasts [], when such cells exhibit an OXPHOS defect, or by detecting inde-

 Figure 2.5: NDUFB3 complementation of complex I defects in subject P3 fibroblasts. (A) Bar plots show complex I (CI) activity, normalized by complex IV (CIV) activity, in fibroblasts from patient P3, a healthy control individual, and an un- related patient with complex I deficiency because of a C8orf38 mutation P(C8orf38). Enzyme activity was measured before and after transduction with wild-type C8orf38 or NDUFB3 mRNA. Data are means of three biological replicates ± SEM. *P < 0.001, Student’s t test. (B) Western blot shows protein expression of complex I, com- plex IV, and complex V (CV) as the loading control.

 pendent mutations in individuals with a similar phenotype. We estimate that  of these  genes will prove to be bona fide disease genes. ese candidates are particularly exciting because until recently such discoveries generally required fa- milial forms of disease to narrow down the search region. Some of the candidate genes have established roles in OXPHOS biology, consistent with the enzyme de- fect found in the relevant patient (for example, UQCR, LYRM, and EARS), whereas most of the candidates encode poorly characterized proteins that have never before been linked to OXPHOS and will reveal fundamentally new biochem- ical insights (for example, Corf, Corf, and AKRB).

is pilot study showed that about half of the  sequenced patients lacked

“smoking gun” prioritized variants. We can envision at least five possible expla- nations. First, we may have missed the causal variants because of a technical lack of sensitivity. Although we detected more than  of SNVs present in the Mi- toExome, there is an unknown sensitivity for indels and exon deletions because of a lack of training data. Second, the causal variant may have been located in a gene that we did not target with MitoExome sequencing. Although possible, this explanation seems less likely because linkage and homozygosity mapping strate- gies to date have found that  of causal genes encode mitochondrial proteins.

ird, the causal variant may be located in a nontargeted intronic or regulatory region. Although these are likely to exist, no robust methods are available yet for interpreting such variants. Fourth, and perhaps most likely, our stringent filters may have rejected the truly causal variant. For example, de novo dominant alleles

[], acting through gain of function or haploinsufficiency, were not prioritized because, without parental DNA, these alleles are difficult to distinguish from the high heterozygote burden of apparently deleterious alleles in healthy individuals.

 Similarly, it is difficult to distinguish benign from pathogenic mtDNA variants

[]. By applying MitoExome sequencing to parental DNA, such de novo alleles may be identified in the future []. Finally, and potentially most interesting, our assumptions on the genetic architecture of OXPHOS disease may be inaccu- rate. Nearly all of the nuclear genes discovered to date correspond to Mendelian syndromes, characterized by strong, highly penetrant alleles. It is possible that many of the sporadic cases of OXPHOS disease in our cohort are due to the com- bined action of multiple “weak” alleles, each with incomplete penetrance.

Our study underscores the need for clinical standards for interpretation of ge- netic variants to evolve as NGS is applied more widely. First, current guidelines for interpreting clinical genetic tests, such as the American College of Medical Ge- netics (ACMG) guidelines [], are deliberately restricted to gene loci with estab- lished roles in disease. Second, it is notable that many variants purported to be causal for disease may, in some cases, be benign polymorphisms []. For exam- ple, we detected  alleles previously reported as pathogenic, but only  actually appear to explain the phenotype, whereas the remainder were heterozygous or present in patients with unrelated phenotypes (Supplementary Methods).

We anticipate that the success rate for establishing molecular diagnosis in un- selected cases of infantile OXPHOS disease using NGS will be higher than that observed in this study. Indeed, our  patients were refractory to molecular di- agnosis using traditional methods because most patients had been screened for common mutations in mtDNA or in relevant genes suggested by phenotype (for example, POLG and SURF) and were not from informative pedigrees. Analysis of a representative cohort of  unrelated infantile patients with “definite” OX-

PHOS disease from the Murdoch Childrens Research Institute, of which  cases

 had previous molecular diagnosis, suggests that MitoExome sequencing could en- able diagnosis in roughly  of all infantile patients and prioritize candidates in a further  (see Fig. A. and Supplementary Material).

ree advances will greatly aid interpretation of DNA variation for routine clin- ical diagnosis. First, NGS studies of Mendelian families will rapidly expand the set of validated OXPHOS disease–related genes from  to perhaps  nuclear loci. Second, NGS studies of ethnically diverse, healthy individuals will generate databases of allele frequencies that are necessary for filtering out common vari- ants unlikely to cause severe disease, as was recently shown for interpretation of carrier screening [] and has long been appreciated in the interpretation of mtDNA variation [, ]. ird, whereas costs currently support MitoExome sequencing (roughly one-third the cost of exome sequencing), future cost reduc- tions will enable sequencing of the entire exome or genome, as well as simultane- ous analysis of parental DNA to detect de novo mutations and to confirm the mode of transmission.

e subset of patients for whom a clear molecular etiology was possible spot- lights the immediate promise of NGS in clinical diagnosis. For these patients, Mi- toExome sequencing, requiring a single blood or tissue sample, can accelerate clin- ical diagnosis and enable genetic counseling where appropriate. Genetic diagno- sis will also enable rational subclassification of disease, which may help to predict clinical course and severity, and lead to patient grouping for targeted therapeutics.

However, we anticipate that even with improvements afforded by broader catalogs of genomic variation, interpretation of most sequence variants will be of greatest use when integrated with the broader clinical and biochemical presentation.

 . M  M

.. P

On the basis of clinical and biochemical criteria from established diagnostic algo- rithms [], we selected  patients with definite OXPHOS disease who lacked a molecular diagnosis (Table .). Of these,  patients were selected from the Mur- doch Childrens Research Institute laboratory and  from the Columbia University

H. Houston Merritt Clinical Research Center. All patients had severe biochemical

OXPHOS defects and were selected to represent five main categories of OXPHOS defects: complex I deficiency ( patients), complex III deficiency ( patients), complex IV deficiency ( patients), combined OXPHOS deficiencies ( patients), and mtDNA depletion ( patients) (Table .). e severity of enzyme defects in

Table . was classified as marked (<), moderate ( to ), or equivocal

( to ), corresponding to residual activities of normal control mean relative to citrate synthase or complex II. e inclusion criterion was infantile age of onset, defined for this study as under  years of age. e exclusion criterion was maternal family history or presence of a known nuclear or mtDNA mutation. In  cases, some candidate genes had previously been sequenced and excluded. e study protocols were approved by the ethics committee at the Royal Children’s Hospital,

Melbourne, Columbia University, and the Massachusetts Institute of Technology.

All samples were obtained as part of diagnostic investigations, and families pro- vided informed consent.

 .. ME  

e . Mb of targeted DNA included the .-kb mtDNA and all coding and un- translated exons of  nuclear genes, including  mitochondrial genes from the MitoCarta database [],  genes with recent strong evidence of mitochon- drial association, and  additional genes with weak evidence of mitochondrial association, all listed in Table S. Gene transcripts from RefSeq [] and Uni- versity of California Santa Cruz (UCSC) known gene [] collections were down- loaded from the UCSC genome browser [] assembly hg (February ). All analyses presented here, except one described in Fig. .A, were restricted to the mtDNA and coding exons and splice sites of  genes with confident evidence of mitochondrial association (. Mb), because no significant results were found in the untranslated regions (. Mb) or in the coding regions of  auxiliary genes sequenced (. Mb).

.. H   I 

An in-solution hybridization capture method [] was used to isolate targeted

DNA, which was sequenced on the Illumina GAII platform [] with paired – base pair (bp) reads. Single -bp baits were synthesized (Agilent) for each tar- get region or tiled across targets that exceeded  bp. A total of , baits were synthesized to cover , target regions. Each subject sample was whole genome–amplified with a Qiagen REPLI-g Kit with  ng of input DNA. HapMap and the National Institutes of Mental Health (NIMH) control samples were not whole genome–amplified. All samples were sequenced at the Broad Institute Se- quencing Platform. Genomic DNA was sheared, ligated to Illumina sequencing adapters, and selected for lengths between  and  bp. is “pond” of DNA

 was hybridized with an excess of Agilent baits in solution. e “catch” was pulled down by magnetic beads coated with streptavidin and then eluted [, ]. Each patient or HapMap sample was sequenced in one lane of an Illumina GAII instru- ment. NIMH control samples were subjected to whole-exome hybrid selection with the same protocol and were sequenced in two or three lanes of an Illumina

GAII instrument with paired -bp reads.

.. S ,  ,  

Illumina reads were aligned to the GRCh reference human genome assembly with the BWA [] algorithm in SAMtools [] within the Picard analysis pipeline

(http://picard.sourceforge.net/). Illumina quality scores were recalibrated and re- aligned to GRCh, duplicates were removed, and the alignment was modified to account for indels with the GATK software package [] version v... SNVs were detected and genotyped with the GATK UnifiedGenotyper in single-sample mode (with parameters -im ALL -mbq  -mmq  -mm  -deletions .). Vari- ants were filtered with GATK VariantFiltration module (with filters “QUAL<.

& QD<. & HRun> & DP<” and parameters -cluster  -window ). In- dels were detected with GATK IndelGenotyperV (with parameters -im ALL) and

filtered with a custom python module that removed sites with a max_cons_av

≥. (maximum average number of mismatches across reads supporting the in- del) or max_cons_nqs_av_mm ≥. (maximum average mismatch rate in the - bp NQS window around the indel, across indel-supporting reads). Indels were genotyped as homozygous (allele balance >.) or heterozygous (allele balance

≤.). Variants were annotated with the GATK GenomicAnnotator with data ob- tained from RefSeq transcripts [], UCSC known gene transcripts [], known

 disease-associated variants obtained from the HGMD [] professional version

., allele frequency from dbSNP [] version , the  Genomes Project

[], and amino acid conservation scores calculated from -vertebrate species align- ments downloaded from the UCSC genome browser [].

Eight hundred and forty-seven variants likely to be alignment artifacts were excluded by filtering out variants supported by greater than two Illumina read- pairs that aligned to the genome in an aberrant orientation. We observed stacks of aberrantly oriented reads at the boundaries of genomic regions with extremely high coverage, and sequence mismatches at the boundaries to these regions sug- gested that the underlying DNA had been circularized before amplification and fragmentation, perhaps during whole-genome amplification.

.. DNA  

Analysis of mtDNA variants is challenging given the large copy number of mito- chondrial genomes per cell, as well as the nuclear genome sequences of mitochon- drial origin (NUMTs) []. Subject DNA samples typically contain >× copies of mtDNA molecules compared to nuclear DNA molecules. To avoid alignment artifacts caused by mtDNA reads being improperly assigned to NUMTs, we sepa- rately aligned all Illumina reads to the mtDNA revised Cambridge Reference Se- quence (NC_) using BWA []. Unlike the nuclear genome, read-pairs with identical alignment positions were retained (resulting in ~,× coverage) rather than excluded (resulting in ~× coverage) because the observed number of such duplicate read-pairs matched the expected number based on simulation of high mtDNA copy number. Variants were detected in each BAM file with the

GATK UnifiedGenotyper version v.. in pooled-sample mode to capture het-

 eroplasmic variants (with parameters -mbq  -mmq  -mm  -deletions .

-exp -gm POOLED -poolSize  -confidence  -pl SOLEXA). Variants supported by > of aligned reads were identified, because it is difficult to resolve low- heteroplasmy variants from variants present in NUMTs []. mtDNA variants were annotated with data from MITOMAP [], and allele frequency was ob- tained from mtDB [].

mtDNA deletions or rearrangements were detected through de novo mtDNA as- sembly with a modified ALLPATHS assembly approach []. Reads aligned to the mtDNA were randomly down-sampled to × coverage for computational tractability. Errors in reads were corrected with the ALLPATHS-LG [] error cor- rection algorithm. Next, a k-mer graph [] was constructed (k = ) and filtered using four steps (Supplementary Methods). Each resulting mtDNA assembly (in a circular, directed graph representation) was manually inspected to identify alter- ations supported by > of aligned reads.

.. S, ,  

Sensitivity and specificity of detecting nuclear SNVs in HapMap individuals (NA,

NA, and NA) were estimated with independent genotype training data obtained from HapMap Phase  release  [] as well as in Illumina whole- genome sequence data from the  Genomes Project [] pilot . Genotype con- cordance was assessed at  targeted sites that differed between HapMap Phase  and the reference genome, as well as  variant sites based on the  Genomes

Project [] pilot . Reproducibility was measured by technical duplicates of sam- ple NA, from which one aliquot of lymphoblastoid cell line DNA was used to create two sequence libraries that were separately hybridized and sequenced, with

 genotype concordance observed in  of  sites detected as variant in either replicate. mtDNA sensitivity and specificity were calculated at  nonreference and , reference sites for which independent sequence data were available in

five patient samples (Newcastle University).

.. V 

Nuclear SNVs and indels that passed quality control metrics were prioritized ac- cording to three criteria: (i) variants that were rare in healthy individuals, with

SNV allele frequency below . within public databases [dbSNP [] version

 and the  Genomes Project [] release  including low-pass whole- genome sequence data from  individuals, and OMNI chip genotype data from

 individuals], consistent with the frequencies observed in  of all  re- ported pathogenic OXPHOS alleles within HGMD version . []. Because estimates of indel allele frequency were less robust, only indels absent in the 

Genomes data were considered rare. Variants too common among cases (> allele frequency) were also excluded. Variants associated with any HGMD disease phenotype were not excluded, unless associated with allele frequency exceeding

. (ii) Variants predicted to modify protein function, including nonsense, splice site, coding indel, or missense variants at sites conserved across evolution, as pre- viously described []. Briefly, eight splice site locations were included (acceptor sites −, −, −, and donor sites −, , , , ), and missense alleles were required to be at protein residues that were identical in at least  of  aligned vertebrate species. (iii) Variants consistent with a recessive model of pathogenesis, includ- ing homozygous variants or two heterozygous variants present in the same gene.

All rare, protein-modifying, and recessive-type variants in both patient and con-

 trol samples were manually reviewed to remove sequence artifacts and to exclude variants present that were not compound heterozygous based on read or read-pair evidence.

We prioritized mtDNA variants that were annotated as pathogenic in MITO-

MAP[] and mtDNA deletions or rearrangements that were detected through de novo mtDNA assembly with the ALLPATHS algorithm [].

.. P    

Prevalence of prioritized variants was assessed in patients and in  healthy in- dividuals of European ancestry obtained with permission from the NIMH Control

Sample collection. Existing whole-exome data for the  controls, generated by the same Broad Institute Sequencing Platform protocol, were analyzed with the

MitoExome analysis pipeline. Prevalence comparisons within the  mitochon- drial genes were restricted to the  autosomal coding exons that were well mea- sured in both study designs, totaling  Mb ( of all autosomal coding exons), where each included exon had > of bases covered by > sequence reads (base quality >, mapping quality >), in at least  of patient samples and at least

 of control samples. Using the same criteria, we restricted comparisons within the  auxiliary genes to the  autosomal coding exons, totaling  kb ( of all such coding exons), that were well measured in both study designs. Signifi- cance was assessed by one-sided, unpaired t test with unequal variance (Fig. .A) and by one-sided Wilcoxon rank sum (Fig. ., B and C). Details of variant preva- lence in cases and controls are shown in Fig. A.. For patients, we conservatively excluded heterozygous variants shown to be in cis based on haplotype phasing ( of  cases) as well as variants not validated by Sanger sequencing ( of  vari-

 ants). ese exclusions may underestimate the degree of enrichment in cases over controls.

.. V   

All prioritized variants detected in patients were independently validated by Sanger sequencing ( of  variants validated), and compound heterozygous variants were phased through sequencing cDNA (GFM), cloned DNA (BCSL, Corf,

TYMP, MTHFDL), familial DNA (GFM, AGK, EARS), or by a molecular haplo- typing approach () (ACAD, AARS, POLG) described in Supplementary Meth- ods. Ten of  putative compound heterozygous variants were confirmed, and two instances are not yet phased (ACOX and ALDHB). Two heterozygous vari- ants in MRPL were determined to be on the same maternal haplotype, and therefore, MRPL was not listed as a prioritized candidate. One Corf vari- ant was determined to have been introduced by whole-genome amplification and was excluded. e mtDNA deletion was validated by LR-PCR with primers F

(m._) and DR (m._) and Sanger sequencing with internal primers near the breakpoint (available on request). e heteroplasmy level in skeletal mus- cle was determined by quantitative real-time PCR as previously described [], and the relative abundance of mtDNA versus nuclear DNA was measured in patient and control tissues as previously described [].

.. E  

Multiple assays were performed to support the pathogenicity of novel alleles within genes known to cause OXPHOS disease (Fig. A.-A.). Presence and segregation of alleles in family members (when available) were assayed by Sanger sequencing.

 e effect of mutations on mature RNA transcript splicing and abundance was measured in patient and control fibroblasts by reverse transcription–PCR (RT-

PCR) in the presence and absence of cycloheximide ( ng/µl) for  hours to inhibit nonsense-mediated decay []. Relative abundance of full-length pro- tein products and OXPHOS subunits or assembly factors from all five complexes was measured in patient and control fibroblast lines and skeletal muscle via SDS– polyacrylamide gel electrophoresis (SDS-PAGE) Western blot[, ]. Prioritized genes were excluded if they did not segregate with disease in the family (FAMA in P and COX in P) or if variants showed >. allele frequency in healthy individuals (MUL in P and MRPL in P). See Supplementary Methods for de- tails. Prioritized variants were within individuals of western European ancestry, except as noted below, and therefore, allele frequencies were obtained from the

 Genomes Project (Table S). Prioritized variants within patients of Lebanese decent (P, P, P, P, P, P, P, P, and P) were screened with

Sequenom in  ethnically matched . Only the ACADSB variant

(homozygous in P) was present in any other sample, where it was observed in heterozygous state in  of  Lebanese chromosomes (. allele frequency).

Ethnically matched allele frequencies were not obtained for prioritized variants in P (Vietnamese ancestry), P (Pakistani ancestry), P, or P (uncertain ancestry).

.. M 

Molecular karyotyping of proband P and maternal DNA samples was performed with the Illumina HumanCytoSNP- array (version .) as previously described

[]. Automated LCSH detection was performed with the CNVPartition v..

 algorithm in KaryoStudio software. SNP genotypes were generated in GenomeS- tudio software (Illumina) with data from a set of  intra-run samples. Both the proband and the maternal DNA samples yielded an SNP call rate of .. Iden- tity by state (IBS) analysis, assessing the sharing of alleles between individuals, was performed with the online SNPduo program [](http://pevsnerlab. kennedykrieger.org/SNPduo) with default parameters.

.. NDUFB 

e NDUFB open reading frame (ORF) was amplified from control fibroblast cDNA and cloned into the -hydroxytamoxifen–inducible lentiviral vector pF-

xUAS-MCS-SV-puroGalERTVP (GEV)-W [] with ′ Bci I and ′ Xba

I. e pF-xUAS-Corf-SV puroGEV-W vector was described previously

[]. To generate lentiviral particles, we grew human embryonic kidney (HEK)

T cells on -cm plates to  confluence and cotransfected them using Ef- fectene reagents (Qiagen) with packaging plasmid (pCMVδR.), a pseudotyping plasmid (pCMV-VSVg), and either pF-xUAS-NDUFB-SV puroGEV-W or pF-xUAS-Corf-SV puroGEV-W. Fresh medium was applied to the cells

 hours after transfection, and after another  hours of incubation, super- natants containing packaged virus were harvested and filtered through a .-

µm membrane filter. Primary human fibroblasts were infected with viral super- natant along with polybrene ( µg/ml) (Sigma) for  hours. Cells were grown in antibiotic-free medium for  hours before application of selection medium containing puromycin ( µg/ml). We found that our lentivectors were “leaky” in primary human fibroblast lines and that sufficient expression of Corf and

NDUFB was achieved without tamoxifen induction. After  days of selection,

 cells were harvested for enzyme activity assays and SDS-PAGE Western blotting.

.. C I  IV   

Complex I and complex IV dipstick activity assays (MitoSciences) were performed on  and  µg, respectively, of cleared cell lysates according to the manufac- turer’s protocol. A Hamamatsu ICA- immunochromatographic dipstick reader was used for densitometry. Two-way repeated-measures analysis of variance (ANOVA) was used for comparisons of groups followed by post hoc analysis with Bonferroni method to determine statistically significant differences.

. M I

.. P P A

A previous version of this chapter appeared in []:

Sarah E Calvo, Alison G Compton, Steven G Hershman, Sze Chern Lim, Daniel S

Lieber, Elena J Tucker, Adrienne Laskowski, Caterina Garone, Shangtao Liu, David B

Jaffe, John Christodoulou, Janice M Fletcher, Damien L Bruno, Jack Goldblatt,

Salvatore DiMauro, David R orburn, and Vamsi K Mootha. Molecular Diagno- sis of Infantile Mitochondrial Disease with Targeted Next-Generation Sequencing

. Science Translational Medicine, ():ra–ra, January . URL http://stm.sciencemag.org/content///ra.abstract

.. A’ C

is study was conceived and designed by S.E.C., D.R.T., S.D., and V.K.M. Char- acterization of patient phenotypes was performed by J.C., J.M.F., and J.G. Next- generation sequencing analysis was performed by S.G.H., D.S.L., and S.E.C., with

 mtDNA assembly performed by D.B.J. Variant validation and follow-up on patient samples were performed by A.G.C., S.C.L., S.L., C.G., E.J.T., D.L.B., and A.L. Molec- ular haplotyping was performed by S.G.H. e manuscript was written by S.E.C.,

A.G.C., S.G.H., S.C.L., S.D., D.R.T., and V.K.M. All aspects of the study were super- vised by D.R.T. and V.K.M.

 3 Mutations in MTFMT Underlie a Human Disorder of Formylation Causing Impaired Mitochondrial Translation

   translation machinery is unusual in having T a single tRNAMet that fulfills the dual role of the initiator and elongator tRNAMet. A portion of the Met-tRNAMet pool is formylated by mitochondrial methionyl- tRNA formyltransferase (MTFMT) to generate N-formylmethionine-tRNAMet (fMet-

 tRNAMet), which is used for translation initiation; however, the requirement of formylation for initiation in human mitochondria is still under debate. Using targeted sequencing of the mtDNA and nuclear exons encoding the mitochon- drial proteome (MitoExome), we identified compound heterozygous mutations in

MTFMT in two unrelated children presenting with Leigh syndrome and combined

OXPHOS deficiency. Patient fibroblasts exhibit severe defects in mitochondrial translation that can be rescued by exogenous expression of MTFMT. Furthermore, patient fibroblasts have dramatically reduced fMet-tRNAMet levels and an abnor- mal formylation profile of mitochondrially translated COX. Our findings demon- strate that MTFMT is critical for efficient human mitochondrial translation and reveal a human disorder of Met-tRNAMet formylation.

. I

Of the ~ protein components of the oxidative phosphorylation (OXPHOS) ma- chinery,  are encoded by the mitochondrial DNA (mtDNA) and translated within the organelle. Defects in mitochondrial protein synthesis lead to combined OX-

PHOS deficiency. Although the mtDNA encodes the ribosomal and transfer RNAs, all remaining components of the mitochondrial translational machinery are en- coded by nuclear genes and imported into the organelle. To date, mutations in more than  different nuclear genes have been shown to cause defective mito- chondrial translation in humans. However, molecular diagnosis by sequencing these candidates in patients with defects in mitochondrial translation is far from perfect [], underscoring the need to identify additional pathogenic mutations underlying these disorders.

Translation within metazoan mitochondria is reminiscent of the bacterial path-

 way, initiating with N-formylmethionine (fMet) []. Unlike bacteria, which en- code distinct tRNAMet molecules for translation initiation and elongation, meta- zoan mitochondria express a single tRNAMet that fulfills both roles []. After aminoacylation of tRNAMet, a portion of Met-tRNAMet is formylated by mitochon- drial methionyl-tRNA formyltransferase (MTFMT) to generate fMet-tRNAMet. e

Met mitochondrial translation initiation factor (IFmt) has high affinity for fMet-tRNA , which is recruited to the ribosomal P-site to initiate translation []. In contrast,

Met the mitochondrial elongation factor (EF-Tumt) specifically recruits Met-tRNA to the ribosomal A-site to participate in polypeptide elongation. Synthesized pro- teins can then be deformylated by a mitochondrial peptide deformylase (PDF) and demethionylated by a mitochondrial methionyl aminopeptidase (MAPD)[,

].

Here, we applied targeted exome sequencing to two unrelated patients with

Leigh syndrome and combined OXPHOS deficiency to discover pathogenic mu- tations in MTFMT. Fibroblasts from these patients have impaired Met-tRNAMet formylation, peptide formylation, and mitochondrial translation. Despite stud- ies in yeast suggesting that MTFMT is not essential for mitochondrial translation

[, , ], we show that in humans this gene is required for efficient mito- chondrial translation and function.

. R

.. M       -

  L 

We studied two unrelated patients with Leigh syndrome and combined OXPHOS deficiency (Fig. .A). Clinical summaries for Patient  (P) and Patient  (P)

 are provided in Supplementary Data. Patient fibroblasts had reduced synthesis of most mtDNA-encoded proteins as assayed by [³⁵S]-methionine labeling in the presence of inhibitors of cytosolic translation (Fig. .B). is correlated with re- duced steady state protein levels as detected by immunoblotting (Fig. .C), and, at least for ND, was not due to reduced mRNA (Fig. B.). Collectively, these data suggest a defect in translation of mtDNA-encoded proteins.

.. ME S I MTFMT M

Toelucidate the molecular basis of disease in P and P, we performed next-generation sequencing of coding exons from  nuclear-encoded mitochondrial-associated genes and the mtDNA (collectively termed the “MitoExome”). DNA was captured via an in-solution hybridization method [] and sequenced on an Illumina GA-II platform []. Details are provided in the Supplemental Results and Table ..

Table 3.1: Next-generation sequencing statistics. Q20 = Phred-scaled quality ￿20, nuDNA = MitoExome nuclear DNA, Cov = coverage

P P Bases Targeted ,, Q Bases ,, ,,,  selected   nuDNA Cov (X)   mtDNA Cov (X) ,  Bases Analyzed ,, Cov >=  ,, (.) ,, (.) 

We identified ~ single-nucleotide variants (SNVs) and short insertion or deletion variants (indels) in each patient relative to the reference genome, and pri-

 Figure 3.1: Combined OXPHOS deficiency due to a defect in mitochon- drial translation. (A) Biochemical analysis of OXPHOS complexes relative to cit- rate synthase (CS) in fibroblasts (Fb), muscle (M) or liver (L), expressed as a % of mean from healthy controls. (B) SDS-PAGE analysis of 35S-methionine-labeled mtDNA-encoded proteins from control and patient fibroblasts. MtDNA-encoded sub- units of complex I (ND1, ND2, ND3, ND5, ND4L), complex III (cytb), complex IV (COX1, COX2, COX3) and complex V (ATP6, ATP8) are shown. (C) Gel in (b) was immunoblotted with against mtDNA-encoded ND1, COX1, COX2 and nuclear-encoded SDHA (complex II; loading control). See also Fig. B.1.

 oritized those that may underlie a severe, recessive disease (Fig. .A). We first fil- tered out likely benign variants present at a frequency of >. in public databases which left ~ variants in each patient. We then prioritized variants that were pre- dicted to have a deleterious impact on protein function [], leaving ~ variants.

Focusing on genes that fit autosomal recessive inheritance, having either homozy- gous variants or two different variants in the same gene, only one candidate gene,

MTFMT, remained in each patient (Fig. .A).

 

Figure 3.2: Identification of Compound Heterozygous Mutations in MTFMT. (A) Number of MitoExome variants that pass prioritization filters. (B) Schematic diagram of MTFMT showing the location of mutations in P1 and P2 (red bars), exon skip- ping (gray boxes), and primers for RT-PCR (forward and reverse arrows). (C) Electrophoresis of RT-PCR products demonstrates a smaller cDNA species (280 bp) in P1 and P2 that is particularly prominent in cells grown in the presence of cycloheximide (+CHX). Top: Sequence chromatograms of full-length MTFMT RT-PCR products (–CHX) to confirm compound heterozygosity. Bottom: Sequence chromatograms of the smaller RT-PCR products (+CHX) shows patient cDNA lacks the c.382C→T (P1) or c.374C→T (P2) mutations and skips exon 4, which carries the shared c.626C→T mutation. See also Fig. B.2 and Table 3.1. We identified three distinct heterozygous variants in our patients (Fig. .B).

Both patients harbor a c.C→T mutation. e c.C site is  bp upstream of the ￿ end of exon  and is predicted to eliminate two overlapping exonic splic- ing enhancers (GTCAAG, TCAAGA) [] and to generate an exonic splicing sup- pressor (GTTGTT) []. Skipping of exon  results in a frameshift and prema- ture stop codon (p.RSfsX). e second mutation in P is a nonsense mu- tation (c.C→T, p.RX), while the second mutation in P changes a highly conserved serine to leucine in the catalytic core of MTFMT (c.C→T, p.SL)

(Fig. B.). An affected cousin of P also carries the c.C→T and c.C→T mu- tations.

As predicted by in silico analysis, the shared c.C→T mutation caused skip- ping of exon  (Fig. .C). qRT-PCR analysis revealed that P had only  full- length MTFMT transcript compared to controls (Fig. B.), the majority of which carries the c.C→T nonsense mutation and lacks the c.C→T splicing muta- tion (Fig. .C). P had  full-length MTFMT transcript (Fig. B.), all of which appears to carry the c.C→T mutation and to lack the c.C→T splicing mu- tation (Fig. .C). Collectively, these results confirm compound heterozygosity of the MTFMT mutations and almost complete exon skipping due to the c.C→T mutation.

.. M T I R  P F

 E MTFMT

We used complementary DNA (cDNA) complementation to prove that the transla- tion defect in these patients is due to mutations in MTFMT. Fibroblasts from both patients showed reduced levels of the mtDNA-encoded complex IV subunit, COX,

 consistent with a defect in mitochondrial translation, and of the nuclear-encoded complex I subunit, NDUFB, reflecting instability of complex I in the absence of mtDNA-encoded proteins (Fig. .A). Lentiviral transduction of MTFMT cDNA caused a significant increase of COX and NDUFB in both patients (Fig. .B).

In contrast, lentiviral transduction of a control cDNA, Corf, caused no change of these subunits (Fig. .B). ese data confirm that an MTFMT defect is respon- sible for the combined OXPHOS deficiency in these patients.

.. M RNAM P A A  P F-



To directly analyze the mitochondrial tRNAMet pools (Fig. .A), we used a modi-

fied protocol of acid-urea PAGE followed by northern blotting [, , ] (Fig. .B).

We were able to separate the mitochondrial uncharged tRNAMet, Met-tRNAMet, and fMet-tRNAMet from total RNA isolated from fibroblasts and to show that two independent wild-type cell lines contained uncharged tRNAMet and fMet-tRNAMet, but very little Met-tRNAMet (Fig. .B, lanes –). In striking contrast, fibrob- lasts from P and P lacked detectable fMet-tRNAMet and contained mostly Met- tRNAMet along with traces of the uncharged tRNAMet (Fig. .B, compare lanes

– to control lanes –). We also observed a .-fold increase of the overall mi- tochondrial tRNAMet signal in patient fibroblasts compared to control (Fig. .B, top panel; compare lanes  and  to control lane ), while the cytoplasmic initia-

Met tor tRNAi showed constant signal throughout (Fig. .B, bottom panel). e analysis of the mitochondrial tRNAMet pools clearly shows a defect in tRNAMet formylation.

 Figure 3.3: Proving pathogenicity of MTFMT. Patient and control fibroblasts were transduced with MTFMT cDNA or control C8orf38 cDNA. (A) Representative SDS-PAGE western blot shows reduced COX2 and NDUFB8 in patient fibroblasts and restoration of protein levels with MTFMT but not C8orf38 transduction. The 70kDa complex II subunit acts as a loading control.(B) Protein expression was quantified by densitometry and bar charts show the level of complex I (NDUFB8) or complex IV (COX2) relative to complex II (70 kDa) normalized to control, before and after transduction. Error bars show the mean of three biological replicates and error bars indicate ± 1 standard error of the mean (SEM). Asterisks indicate p < 0.05 (*), p < 0.01 (**), and p < 0.001 (***).

 Figure 3.4: Patient Fibroblasts Have a Defect in Met-tRNAMet Formylation. (A) In metazoan mitochondria, a single tRNAMet species acts as both initiator and elongator tRNAMet. After aminoacylation of tRNAMet by the mitochondrial methionyl- Met tRNA synthetase (MetRSmt), a portion of Met-tRNA is formylated by MTFMT to Met Met generate fMet-tRNA . fMet-tRNA is used by the mitochondrial IF2 (IF2mt) to initiate translation, whereas Met-tRNAMet is recognized by the mitochondrial EF-Tu (EF-Tumt) for the elongation of translation products. (B) Total RNA from control (lanes 5–7) and patient fibroblasts (P1, lanes 8–10; P2, lanes 11–13) was separated by acid-urea PAGE. Total RNA from MCH58 cells is shown as a reference (lanes 1– Met Met 4). The mitochondrial tRNA (top) and the cytoplasmic initiator tRNAi (bot- tom) were detected by northern hybridization. Total RNA was isolated under acidic conditions, which preserves both the Met-tRNAMet and fMet-tRNAMet (ac); tRNAs were treated with copper sulfate (Cu+), which specifically deacylates Met-tRNAMet but not fMet-tRNAMet, or with base (OH−), which deacylates both Met-tRNAMet and fMet-tRNAMet. Base-treated tRNA was reaminoacylated in vitro with Met using MetRS generating a Met-tRNAMet standard (M).

 .. COX P F I D  P F

Although fibroblasts from P and P have severely impaired mitochondrial trans- lation, they do retain residual activity (Fig. .B). is residual activity could be due to () low activity of mutant MTFMT generating a small amount of fMet- tRNAMet that is rapidly consumed in translation initiation and, therefore, unde- tectable by Northern blot analyses and/or () the human IFmt recognizing, albeit weakly, the nonformylated Met-tRNAMet species to support translation initiation.

Translation through the first mechanism would produce formylated protein, while translation through the second mechanism would produce unformylated protein.

To investigate these two possibilities, we used semiquantitative mass spectro- metric analysis to simultaneously measure three possible N-terminal states of mi- tochondrially translated COX: formylated (Fig. .A), unformylated (Fig. .B), and demethionylated (des-Met) (Fig. .C). We applied this method to complex IV immunoprecipitated from fibroblasts from P and P and two independent wild- type cell lines (Fig. .D). Although no fMet-tRNAMet was detected in patient fi- broblasts by northern blotting (Fig. .B), the dominant COX peptide in all four samples is the formylated species as estimated from total ion current of each form

(Fig. .E). e expression of mitochondrial PDF and MAPD was normal in pa- tient fibroblasts (Fig. B.). ese semiquantitative analyses clearly demonstrate that patient fibroblasts retain residual MTFMT activity.

. D

Here, we report human patients with mutations in MTFMT, a gene that has not been previously linked to human disease. We verified the causal mutations by rescuing the mitochondrial translation defects in patient fibroblasts via lentiviral

 Figure 3.5: Analysis of the COX1 N-terminus in Patient Fibroblasts. (A–C) Annotated MS/MS spectra confirming correct targeting of the three possible N- termini of COX1. The sequence of the peptide is MFADRWLFSTNHK where the Met residue may be (A) formylated, (B) unformylated, or (C) absent (des-Met). In- sets show high-resolution, high-mass accuracy precursors from which the fragmenta- tion spectra were derived. Given their sequence similarity, peptides are expected to have similar ionization efficiencies. (D) Extracted ion chromatograms (XICs) of three N-terminal states of COX1 ([fMet,Met,des-Met]FADRWLFSTNHK), normalized to an internal COX1 peptide (VFSWLATLHGSNMK). (E) Fractional ion current of the three N-terminal states of COX1 from immunoprecipitated complex IV of patients and controls. See also Fig. B.3.

 transduction of MTFMT. Analysis of the tRNAMet pools in patient fibroblasts re- vealed severe MTFMT dysfunction. To our knowledge, the human mitochondrial tRNAMet profile has not been previously reported. It is interesting to note that control fibroblasts lack detectable Met-tRNAMet, suggesting that it is utilized as quickly as it is produced; either converted to fMet-tRNAMet or used to donate Met to the growing polypeptide chain. Strikingly, patient fibroblasts lack detectable levels of fMet-tRNAMet and contain mostly Met-tRNAMet.

Drastically decreased fMet-tRNAMet levels prevent efficient mitochondrial trans- lation as demonstrated by the reduced translation observed in patient fibroblasts.

Although fibroblasts from P and P have severely impaired mitochondrial trans- lation, they do retain some residual activity. To understand the origin of this ac- tivity, we measured the relative distribution of three possible N-terminal states of mitochondrially translated COX by mass spectrometry. While previous stud- ies have interrogated the formylation status of the N-terminus of COX [], our study interrogates all three modification states and demonstrates mitochondrial methionine excision activity, which is detectable albeit weak.

Formylated COX is the dominant species in patient fibroblasts, indicating resid- ual MTFMT activity. Assuming P’s nonsense mutation has a full loss of function, then the allele harboring the shared c.C→T mutation must confer MTFMT ac- tivity. Transcript that has not undergone skipping of exon  encodes an MTFMT variant harboring a p.SL missense mutation. Residue p.S is moderately conserved and lies on the periphery of MTFMT based on homology with the bacte- rial enzyme. Similarly, P’s residual MTFMT activity must originate from enzyme variants carrying the p.SL mutation and/or the p.SL mutation located in the active site.

 Studies in bacteria and yeast have raised questions about the absolute require- ment for Met-tRNAMet formylation. Formylation is not essential in all bacteria

[] and in yeast disruption of FMT causes no discernible defect in mitochon- drial protein synthesis or function [, , ]. Additionally, bovine IFmt is able to restore respiration in a yeast mutant lacking both IFmt and FMT [], suggesting that bovine IFmt, like yeast IFmt, can initiate protein synthesis with- out fMet-tRNAMet. However, a number of studies in mammals indicate that formy- lation of mitochondrial Met-tRNAMet is required for translation initiation. Bovine

Met Met IFmt has a - to -fold greater affinity for fMet-tRNA than for Met-tRNA in vitro [] and  of the  bovine mtDNA-encoded proteins retain fMet at the

N-terminus [].

What are the factors that could allow nonformylated Met-tRNAMet to initiate mitochondrial translation? In Salmonella typhimurium, amplification of initiator tRNA genes compensates for a lack of methionyl-tRNA formyltransferase activity and allows translation initiation without formylation of the initiator tRNA [].

e “upregulation” of the mitochondrial tRNAMet in patient fibroblasts (Fig. .B) could, in principle, be a compensatory response due to limited fMet-tRNAMet.

In summary, we have used MitoExome sequencing to identify MTFMT as a gene underpinning combined OXPHOS deficiency associated with Leigh syndrome. We have shown that patient fibroblasts have a striking deficiency of fMet-tRNAMet leading to impaired mitochondrial translation. Despite studies in yeast suggest- ing that MTFMT is not essential for mitochondrial translation [, , ], we show here that in humans this gene is required for efficient mitochondrial trans- lation and function. More generally, this study demonstrates how MitoExome sequencing can reveal insights into basic biochemistry and the molecular basis of

 mitochondrial disease.

. E P

.. C C

Cells were grown at  ◦C and  CO in Dulbecco’s modified Eagle’s medium

(DMEM; Invitrogen, Carlsbad, CA) supplemented with  (v/v) fetal bovine serum

(FBS, Invitrogen).

.. B A

Spectrophotometric analysis of mitochondrial OXPHOS activity was performed as described previously (Kirby et al., ). Investigations were performed with informed consent and in compliance with ethics approval by the Human Research

Ethics Committee of the Royal Children’s Hospital, Melbourne.

.. T A

MtDNA-encoded proteins in patient fibroblasts were labeled with ³⁵S-methionine/³⁵S- cysteine (EXPRE³⁵S³⁵S Protein Labeling Mix; Perkin Elmer Life Sciences) prior to mitochondrial isolation and analysis of translation products by SDS-PAGE as pre- viously described [].

.. SDS-PAGE  I

Immunoblotting was performed as previously described []. Proteins were de- tected with the following antibodies: complex II α- kDa subunit monoclonal (MitoSciences, MS), ND polyclonal antibody (kind gift from Anne

Lombes, Paris), α-complex IV subunit I monoclonal antibody (Invitrogen, ),

 α-complex IV subunit II monoclonal antibody (Invitrogen, A), Total OXPHOS

Human WB Antibody Cocktail containing αNDUFB and αCOX (MitoSciences,

MS), and either α-mouse or α-rabbit IgG horseradish peroxidase (HRP; Dako-

Cytomation).

.. ME S

We used an in-solution hybridization capture method [] to isolate target DNA, which was sequenced on the Illumina GA-II platform []. e . Mb of targeted

DNA included the  kb mtDNA and all coding and untranslated exons of  nuclear genes, including  mitochondrial genes from the MitoCarta database

[],  genes with recent strong evidence of mitochondrial association, and  additional genes. All analyses were restricted to the mtDNA and coding exons of the  genes with confident evidence of mitochondrial association (. Mb).

Detailed methods for target selection, sequencing, alignment and variant detec- tion are in Chapter .

.. V P

Differences in DNA sequence between each individual and the GRCh human ref- erence assembly were identified. Nuclear variants that passed quality control met- rics were prioritized according to three criteria: () SNV allele frequency <. in public databases (dbSNP [] version  and the  genomes project [] released November ) or indels absent in the  genomes data, () variants predicted to modify protein function as previously described [], and () vari- ants consistent with recessive inheritance (homozygous variants or two heterozy- gous variants in the same gene). We also prioritized mtDNA variants annotated

 as pathogenic in MITOMAP []. Detailed methods are in Chapter .

.. S DNA S

DNA isolation, RNA isolation, cDNA synthesis, inhibition of nonsense mediated decay, and sequencing of PCR products were performed as described previously

[].

.. L T

e MTFMT open reading frame (ORF) was purchased in a pCMV-

SPORT vector (Clone ID: BC., Open Biosystems) and was cloned into the -hydroxytamoxifen-inducible lentiviral vector, pF_x_UAS_MCS_SV_puroGEV-W []. MTFMT viral particles were generated and patient fibroblasts were transduced as described previously [].

ree independent transductions were performed and cells were harvested after

– days selection with  µg/ml puromycin.

.. A-U PAGE  N B

Total RNA was isolated from frozen cells under acidic conditions with TRIzol (In- vitrogen) according to the manufacturer’s instructions. Acid-washed glass beads

(. mm diameter; Sigma) were added during extraction. Total RNAs were sepa- rated by acid-urea PAGE as described previously [, , ] with modifications.

In brief, . A units of each RNA sample were loaded onto a . polyacry- lamide gel containing  M urea and . M sodium acetate (pH .). Individual tRNAs were detected by northern blotting [] with the following hybridization probes: ′TAGTACGGGAAGGGTATAA′ (mitochondrial tRNAMet) and ′TTCCACTGCACCACTCTGCT′

 Met (cytoplasmic initiator tRNAi ). Northern blots were quantified by Phospho- rImager analysis with ImageQuant software (Molecular Dynamics). Experiments were performed in duplicate. Aminoacyl-tRNAs and formylaminoacyl-tRNAs were deacylated by base treatment in . M Tris-HCl (pH .) at  ◦C for  min, fol- lowed by incubation at  ◦C for  hr. Aminoacyl-tRNAs were selectively deacy- lated by treatment with copper sulfate as described []. Total RNA was aminoa- cylated in vitro with methionine using E. coli MetRS as previously described with minor modifications [].

.. M S A  COX N-

In brief, complex IV was immunoprecipitated from control and patient fibroblasts with MitoSciences’ complex IV Immunocapture kit (MS-) and separated by gel electrophoresis on a NuPAGE – Bis-Tris gel (Invitrogen). A band corre- sponding to the MW of COX was excised and subjected to in-gel Lys-C digestion

(Kinter and Sherman). Extracted peptides were separated on C column with a

-Series nano-LC pump (Agilent) and run on a LTQ-Velos-Orbitrap mass spec- trometer (ermoFisher) set to scan and to targeted MS/MS for m/zs correspond- ing to the z =  states of the unformylated, formylated and des-Met species of the

N-terminal peptide of COX (MFADRWLFSTNHK).

Extracted ion chromatographs (XICs) were generated from the Orbitrap sur- vey scans based on the z =  states of the unformylated, formylated and des-Met species of the N-terminal peptide of COX using XCalibur software (ermoFisher

Scientific). e identities of the peaks corresponding to these species were veri-

fied with the accompanying static MS/MS spectra (Figs. .A-C). e areas under these peaks were integrated with the Genesis peak detection algorithm included

 in XCalibur with all standard defaults. Peak areas were further normalized to the peak area of a distal peptide of COX (VFSWLATLHGSNMK, m/z ., z = ) to ensure that comparisons allowed for the different amounts of COX in control and patient samples. Full methods are in the Appendix B.

.. S A

Two-way repeated-measures analysis of variance (ANOVA) was used for compar- isons of groups followed by post hoc analysis via the Bonferroni method.

.. A N

e GenBank accession number for the MTFMT sequence reported in this paper is NM_..

. M I

.. P P A

A previous version of this chapter appeared in []:

Elena J.Tucker, Steven G. Hershman, Caroline Köhrer, Casey A. Belcher-Timme,

Jinal Patel, Olga A. Goldberger, John Christodoulou, Jonathon M. Silberstein,

Matthew McKenzie, Michael T. Ryan, Alison G. Compton, Jacob D. Jaffe, Steven A.

Carr, Sarah E. Calvo, Uttam L. RajBhandary, David R. orburn, and Vamsi K.

Mootha. Mutations in MTFMT Underlie a Human Disorder of Formylation Caus- ing Impaired Mitochondrial Translation, September . URL http://linkinghub. elsevier.com/retrieve/pii/S

 .. A’ C

is study was conceived and designed by D.R.T. and V.K.M. Characterization of patient phenotypes was performed by J.C. and J.S. Biochemical analysis was per- formed by E.J.T., A.G.C., M.M., and M.T.R. Next-generation sequencing analysis was performed by S.G.H and S.E.C. Variant validation was performed by E.J.T. and

A.G.C. cDNA rescue was performed by E.J.T. and O.A.G. Northern blots were per- formed by C.K. and C.A.B. Mass Spectrometry was performed by J.P.,J.D.J, S.A.C. and C.A.B. e manuscript was written by E.J.T., S.G.H., C.K., J.D.J, U.L.R., D.R.T. and V.K.M. All aspects of the study were supervised by U.L.R., D.R.T. and V.K.M.

 What’s cool is that we’re a mosaic of pieces of genomes...None of us is truly normal.

Evan Eichler, []

4 Copy Number Variant Detection From Targeted Sequencing Data

 , such as large amplifications and deletions, are known S to contribute to human disease but are rarely detected in targeted sequenc- ing experiments. In this chapter, we evaluate the ability to detect copy number variants (CNVs) from the read depths of whole-exome sequencing data and evalu- ate the utility of including CNVs when analyzing the MitoExomes of patients with suspected mitochondrial disease. We expand upon a previously published CNV

 detection algorithm, CoNIFER, which uses singular value decomposition (SVD) to isolate and remove noise in the read depths of targeted sequencing experiments in order to facilitate the detection of rare CNVs. In exome data from a control set of individuals that had previously undergone CNV detection via genome-wide array-CGH, we are able to recover  of deletions and  of amplifications.

Performance is better when recovering synthetically inserted variants; however, events that overlap few, highly variable and/or poorly covered targets are diffi- cult to detect. To facilitate the evaluation of CNV callsets in a cohort of  pa- tients with suspected mitochondrial disease, we developed a web-interface, CN-

Visualizer, which integrates CNV calling with other genetic and phenotypic infor- mation. In one patient, we detect an event consistent with the kb common deletion responsible for ~ of cases of Krabbe Disease. In another, we detect an amplification of chromosome p→pter that is consistent with the patient’s multi-systemic disorder. We find signals for deletions in genes that already had a prioritized variant and were consistent with phenotypic information in four pa- tients. ree of these deletions have signal from single exons and need to be con-

firmed. In two patients with Leigh syndrome and the pathogenic p.SL variant described in Chapter , we detect exons with low coverage in the mitochondrial methionyl-tRNA formyltransferase (MTFMT). In a patient who first presented af- ter a teenage bout of flu with a series of symptoms consistent with adult Refsum disease, we detect a deletion in the promoter region of phytanoyl-CoA hydroxy- lase (PHYH) that may be compound heterozygous with a known partial loss-of- function variant. Progression of symptoms in Refsum disease can be avoided by reducing dietary intake of phytanic acid, a fatty acid whose catabolism requires

PHYH activity. Lastly, in a  year-old male patient with cerebellar ataxia, pe-

 ripheral neuropathy, azoospermia, and mild hearing loss we detect, confirmed and phased a kb heterozygous deletion that was compounded with a likely loss-of- function missense variant in D-bifunctional protein, HSDB, a peroxisomal en- zyme that catalyzes multiple steps of beta-oxidation of very long chain fatty acids.

Retrospective review of metabolic testing from this patient revealed alterations of long- and very-long chain fatty acid metabolism consistent with a peroxisomal disorder. ese vignettes illuminate the capabilities and limitations of CNV de- tection from targeted sequencing data and demonstrates the importance of CNV detection for clinical diagnostics.

. I

To date, we have sequenced the MitoExome of over  cases and controls. We evaluated single nucleotide polymorphisms (SNPs) and short insertion and dele- tions (indels) to prioritize genes with rare, protein modifying variants that fit a re- cessive mode of inheritance and matched the patient phenotype. In one case, a sin- gle rare protein modifying variant in POLG was prioritized, as that gene had pre- viously been shown to act dominantly. We benchmarked MitoExome sequencing performance on two cohorts of individuals:  infants with severe mitochondrial disease (Chapter ) and  patients with suspected mitochondrial disease[].

While many samples had at least one candidate disease gene, a striking number lack convincing candidates (Fig. .).

ere are a number of reasons why a genetic test may not be fully informative

(Fig. .). e cause of disease in some cases may not be genetic, but due to an en- vironmental factor. Even if the cause were genetic, if it is due to a somatic[, ] or heteroplasmic variant, then the causal variant may not have been present in the

 Figure 4.1: Diagnostic yield of MitoExome sequencing in two cohorts of pa- tients with OXPHOS disease. Abbreviations: MCRI, Murdoch Childrens Research Institute; MGH, Massachusetts General Hospital sample that was sequenced. Even if the causal variant was in the sequenced sam- ple, it may not have been prioritized. Some cases of ciliopathies have been shown to be due to oligogenicity, that is to say multiple heterozygous mutations across several genes [, ]. It is difficult to prioritize such cases, as the number of pairs of candidate genes can be quite large (>,, for mitochondrial disease).

Similarly, disease may be due to dominant mutations, such as those that enable a gain-of-function, but it is hard to identify causal variants over the background of rare protein-modifying variants (~ MitoExome variants per individual). Lastly, even if disease were due to recessive-type variants we may miss them either be- cause they were a class that was not prioritized (such as splice variants distant from splice junctions, as in the MTFMT cases in Chapter ), incorrectly filtered

(ie. “common” pathogenic variants) or never called, such as structural variants.

is chapter describes an attempt to expand our analysis of MitoExome data by including some structural variant calls.

 Figure 4.2: Possible failure modes of MitoExome sequencing.

.. S V

Structural variants (SVs) are a diverse class of genomic alterations that encom- passes all sub-chromosomal changes larger than SNPs and indels. Originally struc- tural variants were restricted to events >kb [], but more recently it has been appreciated that events as short as ~bp frequently occur and are in fact more common than larger events []. Structural variants may be balanced, causing no change in copy number (such as inversion or translocations), or can be copy num- ber variants (CNVs) such as deletions or amplifications. Some structural events are complex and may include a combination of deletions, amplifications, inver- sions and/or translocations [].

P

Individuals do not have as many structural variants as SNPs or indels, but such events are still prevalent throughout the human genome. A typical individual has about ~, CNVs with a median size of .kb[]. One of the first analyses of large, rare (.-) CNVs in the general population used Illumina BeadArray tech-

 nology to survey variants in a cohort of ~, individuals. Variants > kb were found in - of individuals and ones greater than Mb were found in - of individuals []. A follow-up study looked to see how many of these events were due to de novo mutations by looking at structural variants in trios []. Events larger than kb had a mutation rate µ = .×− CNVs/genome/transmission, while events > kb were twice as rare with a mutation rate of µ = . × −, or once every ~ births.

L   

Pathogenic structural variants are not as numerous as pathogenic SNPs or indels; however, they represent an appreciable proportion (~) of entries in the Hu- man Gene Mutation Database (HGMD). ey are being discovered at a more rapid rate than SNPs and indels suggesting an ascertainment bias (Table .). Structural variation has been linked to many human diseases (reviewed in []), both spo- radic [] and complex []. Examples include left-sided congenital heart disease

[], autism [, ], schizophrenia [, ], Crohn’s disease [, ], and mitochondrial disease [].

.. M    

e earliest structural variants were discovered by karyotyping using microscopy of metaphase chromosomes. In the early ’s two methods of dying metaphase chromosomes to create band patterns enabled discovery of events >Mb in size

[, ]. Since then, a variety of molecular, microarray and sequencing based methods have been developed that have increased the resolution of structural vari- ant detection by  orders of magnitude to events less than  bases[].

 Table 4.1: Human Genome Mutation Database variants stratified by type. “Small” events are ≤ bp. While structural variants only account for 10% of entries, new ones are rapidly becoming discovered.

Type  in Q ()  in Q ()  Growth Missense/nonsense , () , ()  Splicing , () , ()  Regulatory , () , ()  Small deletions , () , ()  Small insertions , () , ()  Small indels , () , ()  Gross deletions , () , ()  Gross insertions , () , ()  Complex rearrangements  () , ()  Repeat variations  ()  () 

M M

e ability to conjugate DNA to fluorophores in the ’s greatly advanced struc- tural variant detection. In fluorescent in situ hybridization (FISH) [], fluores- cently labeled DNA molecules are used to label specific loci within genomes. As methods of DNA preparation for FISH have evolved, the resolution limit has ac- celerated from >Mb to kb [, , , ]. In multiplex-FISH, also known as chromosome painting or spectral karyotyping, genome-wide libraries of FISH probes are used to differentially label whole chromosomes [, , ]. is advancement enabled the first genome-wide screen for gross chromosomal abnor- malities with resolution < Mb [, ].

e first test designed specifically to detect copy number variants was compara- tive genome hybridization (CGH). In CGH, reference and sample DNA are labeled with green and red fluorophores, mixed in equal amounts and co-hybridized to normal human metaphase chromosomes []. CGH has a modest resolution of

 >Mb for low-level amplifications and deletions and >Mb for high level ampli-

fications [, ].

M M

Microarray technology radically advanced the detection of CNVs as reviewed in

[]. Both array-CGH [, ] and SNP arrays can be used to detect structural variations.

Array-CGH is a microarray designed specifically to perform CGH. By separat- ing genomic loci on a slide, array-CGH is able to achieve much higher resolution than traditional CGH. In array-CGH, DNA from a test and a reference sample are both labeled and hybridized to a two-color microarray. e ratio of intensi- ties between sample and reference is used to detect amplifications and deletions.

Early CNV calling algorithms smoothed consecutive ratios and then looked for series of probes that exceeded a threshold []; however, more nuanced detec- tion schemes including circular binary segmentation (CBS) [], hidden Markov models (HMM) [] and wavelet methods [] were later developed.

e first human genome-wide array-CGH experiment used an array of bacterial artificial chromosomes (BAC) spaced every Mb.  unrelated individuals were tested and  distinct gains or losses were found.  of these were recurrent

[]. As array-CGH experiments moved from BAC arrays to oligonucleotide ar- rays, the number of CNVs discovered has drastically increased. To date, there have been two large-scale high-resolution surveys have used genome-wide array-CGH to assay copy number variations across the entire human genome. Using a set of

 NimbleGen arrays each comprising ~. million long oligonucleotide probes,

Conrad et al. surveyed  Hapmap samples ( CEU,  YRI) to discover ,

 CNVs greater than  base pairs []. Using a set of  Agilent arrays each com- prising ~M oligonucleotide probes, Park et al. surveyed  individuals from three

Asian populations to discover , CNVs of which , were specific to Asians

[].

While SNP arrays were not originally designed to detect copy number variants, intensity values from individual probes have been massaged into revealing vari- ants. B-allele frequencies from the ratio of two SNP-specific probes can be used to refine calls. For instance, stretches of homozygosity are suggestive of a deletion and stretches of abnormal B-allele frequencies are suggestive of an amplification[].

e prevalence of CNVs in human genome has been measured using SNP arrays on several cohorts [, , ]. A comparative survey of SNP arrays for the detec- tion of structural variants concluded that while SNP arrays can be used to detect rare CNVs and genotype common events, their inability to detect balanced events and bias against probes that fall in segmental duplications limited their ability to capture a large proportion of genomic variation[].

S M

With advances in sequencing technology, it is now more feasible than ever to de- tect structural variation on a genomic scale. Several strategies exist to detect struc- tural variation [], including read-depth, pair-end mapping, split-read and se- quence assembly methods.

• Read-depth (DP): Inspired by CGH and array-CGH, these methods use depth

of coverage profiles as a proxy for copy number [, , , , , ].

• Paired-end mapping (PR): Also known as “end sequence profiling”, these

methods detect structural variants by scanning for read pairs with abnormal

 orientations or gap sizes. Pair-end analysis was first conducted on fosmid

clones [, , ], but has since been applied to whole genome data

[, , , ].

• Split-read (SR): Split read strategies also use sequence pair information,

but restrict the analysis to pairs where only one end maps to the reference

genome. e unmapped read is split in half and both ends are realigned

[, , , ]. ’Clipping reveals structure’ (CREST) is a tool that uses

a variant of split reads where both ends of a read pair align to a genome, but

one has an incomplete alignment [].

• Sequence assembly (SA): ese are computationally expensive methods that

do away with a reference genome and instead align reads to each other to

build up a genome sequence [, , ]. ey are uniquely able to detect

novel sequence insertions.

ese methods were all used in the structural variant analysis of the  Genomes

Project (KG), the largest survey of human structural variation to date. Vari- ant calling focused on deletions, but insertions and duplications were also inves- tigated.  methods were used to generate  call sets using a cohort of  individuals[]. Candidate loci were verified either by PCR or genome wide array-

CGH []. e  call sets exhibited a large range in false discovery rates (FDRs), with  having FDRs greater than the  cutoff (-). Sensitivity of deletion discovery was estimated to be only  in high-coverage and  in low-coverage data. e best algorithm, GenomeSTRiP, integrated signals from RD, PR and SR detection strategies with population level analysis []. Unfortunately, benefits from integrating population data are not applicable to rare or de novo events.

 Recently there have been several efforts to find CNVs in targeted sequencing data [, , , , , , , , , ]. Targeted sequencing data is more difficult to analyze than whole genome data for two reasons. First, DP measures are biased and noised by the capture step. Second, non-DP detection methods require the presence of reads that cover breakpoints, which is unlikely when only .-. of the genome is captured in a typical targeted sequencing experiment.

Algorithms differ on their exact implementation, but follow the same two-step work-flow: a target-by-target investigation a sample’s of the depth profile followed by a calling step. Initial algorithms were designed to compare pairs of samples and would use a detailed statistical model of read depth and in order to report

P-values to a caller [, , , ]. Using cohort of samples instead of a pair enables new opportunities for improved noise reduction. CoNIFER pioneered the approach of using singular value decomposition and the removal of the top components in order to reduce noise in the DP measure. XHMM uses CoNIFER’s noise reduction approach, but claims to significantly improve calling by moving from a threshold caller to a three state hidden Markov model [].

.. G   

In this chapter we used a modified version of the CoNIFER algorithm to call copy number variants from targeted sequencing experiments. We assess CNV detection performance both on a cohort of samples with known variants and on a cohort of samples with synthetically inserted variants. We then call copy number variants in a collection of  patients with suspected mitochondrial disease who underwent

MitoExome sequencing.

 . M  M

.. E  ME D

Targeted sequencing data came from the  Genomes Project (KG)[] and Mi- toExome experiments[]. Exomes from the KG project were generated by Bay- lor College Of Medicine, Broad Institute, and Beijing Genomics Institute using ei- ther Illumina or SOLiD sequencers and a variety of selection platforms. Mapped read files were versioned  for SOLiD data and  for Illumina data. Consensus targets as of  were obtained from the KG FTP site ftp://ftp.genomes.ebi.ac.uk/. Sensitivity and specificity were calcu- lated using a sets of  samples analyzed by both the KG project and array-CGH

[]. Synthetic variants were tested using randomly chosen cohorts of  or  samples from KG. Discovery was completed on a set of  patients with mito- chondrial disease, pseudo-obstruction or ataxia that had undergone MitoExome sequencing of the mtDNA and nuclear-encoded genes implicated in mitochondrial biology, mitochondrial disease, or monogenic disorders with phenotypic overlap.

 of these samples were captured using a second generation MitoExome target set that covered  genes and were individually sequenced in a single lane of an Illumina GA-IIx. e remaining  samples underwent selection using a third generation target set that included  genes and were barcoded and sequenced on an Illumina Hi-Seq. ird generation targets were used for analysis. Study pro- tocols were approved by the Partners Human Research Committee. All samples were obtained with informed consent for sequence analysis and data deposition into public databases.

 .. D   

Depth of coverage was calculated on single samples using the Genomic Analysis

Toolkit v.. using the DepthOfCoverage walker []. By default the GATK ignores duplicate reads. Walker parameters were set to ignore bases with quality scores < and reads with mapping qualities <.

.. R D-B CNV D

A copy number variant detection framework inspired by CoNIFER[] was de- veloped using the Python program language and the iPython, Numpy, Scipy, Mat- plotlib, Pytables and Pymongo libraries. An initialization step reads and processes depth data into an intermediate file that can be used for noise reduction and call- ing. GATK “sample_interval_summary” files provide depth counts by target and the total number of mapped reads. Reads Per Kilobase per Million mapped reads

(RPKMs) were calculated and used to remove probes where the median RPKM

<.. e RPKM matrix of unfiltered probes is Z-normalized by subtracting the by-target mean and dividing by the standard deviation. Singular value decompo- sition is then computed on the entire ZRPKM matrix. Because rare CNVs are not expected to contribute much to the noise in the ZRPKM matrix, noise is reduced by setting a given number of the top singular values to zero and calculating the dot product of the U, S, and VT matrices to reconstruct a matrix of SVD-ZRPKM val- ues. e number of components to remove is determined by looking for an elbow in a Scree plot, a line plot showing the variance in the data explained by each prin- ciple component. e elbow is automatically detected by finding the point furthest from a line that connects the beginning and end of the Scree curve. Before calling

CNVs, SVD-ZRPKM values are smoothed over ± or ± probes using weightings

 from a Blackman filter. CNV calls were made from smoothed SVD-ZRPKM values that exceeded a set threshold.

.. R-B V E

Additional tools were developed to obtain data from aligned read (BAM) files that support structural variants using the pysam library. Methods implemented in- clude pair read analysis by looking for mate pairs where only one end is mapped or in which there is an abnormally large insertion size. Clipped read analysis was per- formed by filtering for reads whose CIGAR alignment contains two regions, one that is aligned and another that is soft clipped.

.. P G B

A web-based tool capable of calling structural variants and integrating with Mi- toExome results was developed with the Python Flask library.

.. C      

True positive call sets were obtained from the Database of Genomic Variants (DGV)[].

Raw GVF formatted files were converted to bed format via a custom Python script and translated to hg using the UCSC liftover tool []. Sensitivity was calcu- lated on rare (in < of samples) variants that overlapped at least  unfiltered targets but had less than half of their sequence overlapped by segmental duplica- tions. It is nontrivial to determine the allele frequency of structural events as their endpoints are highly polymorphic; variants were considered overlapping if their minimum reciprocal overlap was at least . Calls were considered being true positives if at least a quarter of overlapping probes exceeded the calling threshold.

 Call sets used for sensitivity analysis were also used to create synthetic variants.

e FPR was calculated on chromosomes ,  and  treating all variants as true.

. R

.. P  CNV D

P   A-CGH   E

To benchmark the performance of our depth-based CNV calling system, we made calls in a cohort of  samples whose DNA was analyzed for CNVs by genome- wide array-CGH[] and also underwent whole exome sequencing as part of the

 genomes project[]. ese samples had  gains and  loss events that met our filtration criteria of being in at most two samples, having less than half their sequence covered by segmental duplications and overlapping at least three well covered targets. e inflection point of the scree plot was detected at , which was chosen as the number of components to remove (Fig. .A). Plots of three putative amplifications and deletions before and after SVD noise reduction can be found in the top two panels of Fig. .B-G. Setting the top singular values to zero signifi- cantly reduced the variability of the data while having a minimal, but noticeable, impact on real events. For example, in the loss event overlapping MASP, setting the top singular values to zero brings NA’s smoothed depth just under the default calling threshold of ..

Calls made using the default parameters for CoNIFER (smoothing with a Black- man window of ± probes and a calling threshold of .) have a sensitivity of . for detecting amplifications and  for detecting deletions. e FPR for both

 A H I J

B C

D E

F G

Figure 4.3: Performance of CNV detection in a control cohort. (A) Scree plot revealing the contribution of each principle component in a cohort of 37 control sam- ples has an elbow at 5. (B-G) Visualization of three amplifications (left) and three deletions (right) when processing 37 samples without (top panels) or with (middle panels) SVD noise reduction or when fully processing 200 samples (bottom panels). The graphed CNV detection threshold is ±1.5. (H) Scree plot for 200 samples has an elbow at 15. (I) ROC curve of CNV detection in 37 samples when processing 37 or 200 samples. (J) ROC curve of CNV detection when processing 200 total samples and using smoothing over ±7 or ±4 probes. Abbreviations: ROC, Receiver operating characteristic; W, Window size for smoothing

 amplifications and deletions is ., which for the ~, targets in an exome would amount to about one call per sample. Relaxing the calling threshold to . increased the sensitivities to . and .; but also increased the FPR roughly -fold. A full receiver operating characteristic (ROC) curve can be found in (Fig. .I).

e SVD noise reduction strategy should work better as more samples are con- currently analyzed and the systematic noise becomes more evident. CNV detec- tion was repeated after adding additional samples from the  genomes project such that a total of  samples are analyzed. e inflection point of the scree plot was detected at  (Fig. .H). Running the analysis with  verses  samples does not seem to have a significant impact on signal to noise in detecting the six example variants (Fig. .B-G, bottom panel vs middle panel) or on performance when viewed as as ROC curve (Fig. .I). We were concerned that the algorithm’s default parameters use excessive smoothing as the SVD-RPKM values are averaged over a window of ± probes and < % of our variant set spans such a distance.

Changing the window to ± probes slightly improved calling performance on syn- thetic variants (Fig. .J); however, no matter how these samples are processed, sensitivity for amplifications and deletions maxes out at around  and  re- spectively.

P   

To better understand why these sensitivities are so low, we artificially inserted the true positive set of variants into a cohort of  randomly chosen samples.

e read depths overlapping the  amplifications and  loss events above were modified by  (“mild”, .X change),  (± copy) or  (“severe”, X

 change) simulating a partial, single or many copy number change (Fig. .A-F).

“Mild” variants simulate the situation where that may not be a : correspon- dence between copy number and sequencing depth, perhaps due to non-linearities during DNA target capture. In such instances, only . of amplifications and

. of deletions were recovered at a threshold of .. Calls made after inserting

± copy variants faired better with sensitivities of . for detecting amplifica- tions and . for detecting deletions using that threshold. A full ROC curve can be found in (Fig. .G). Unlike for the real variants, changing the window to

± probes significantly improved calling performance on both this set of synthetic variants (Fig. .H) and another set inserted into a different cohort of  sam- ples (Fig. .I), suggesting that the default settings over-smooth the data.

.. CNV D  ME 

We next applied CNV detection to  individuals with suspected mitochondrial disease who had subsequently undergone MitoExome sequencing at MGH. To fa- cilitate variant interpretation, we developed CNVisualizer, a web interface that integrates CNV detection with MitoExome sequencing variants and clinical sum- maries (Fig. .). A conservative call-set of  CNVs was made after removing

 singular values using a threshold of . and smoothing ± probes. Almost half of these variants were in genes known to commonly harbor copy number poly- morphisms such as PDPR (N=)[], GTPBP (N=), ACOT (N=) and HPR

(N=). Many CNVs were reported to be larger than the depth data suggested due to smoothing. In addition to evaluating these variants, a liberal call-set was made using a threshold of . and smoothing over ± probes. A feature was added to

CNVisualizer that automatically detected genes with multiple variants. While the

 A B

C D

E F

G H I

Figure 4.4: Performance of CNV detection of synthetic variants. (A-F) Visual- ization of the three amplifications (left) and three deletions (right) from Fig. 4.3B-G when artificially inserted into a cohort of 200 samples. The top panel shows a “mild” CNV which models a half gain or loss. The middle panel represents a copy number gain or loss of one copy. The bottom panel shows a “severe” CNV caused by a 40X gain or loss of depth. The graphed CNV detection threshold is ±1.5. (G) ROC curve of CNV detection of synthetic CNVs. (H-I) ROC curve of CNV detection of two sets synthetic gains or losses of one copy using smoothing over ±7 or ±4 probes in two. Abbreviations: ROC, Receiver operating characteristic; W, Window size for smoothing

 liberal call-set had ~ events per sample (, total), analysis could be focused on the few recessive-type candidates in each sample. Additionally, depth profiles for genes harboring known pathogenic variants were manually investigated. Ex- emplary variants and cases are discussed below.

D      K D

Krabbe Disease (globoid cell leukodystrophy, OMIM ) is a progressive disease of demyelination caused by recessive mutations in galactocerebrosidase

(GALC), a lysosomal enzyme involved in the catabolism of galactosylceramide.

Most cases present during infancy with rapid central and peripheral nervous sys- tem degradation, but - of cases are late-onset and can appear as late as the fifth decade of life []. Presentation of these late onset cases is highly vari- able. About  of both infantile and adult patients have one or two copies of a “common” kb deletion that overlaps the seven exons of GALC[, , ].

We detect a CNV consistent with this variant in one patient, HL (Fig. .A).

In addition to detecting a reduced depth signature, variants from MitoExome se- quencing provided further support for this call. e patient has six variants over- lapping the deletion that all appear homozygous (c.A→G, c.+C→G, c.A→T, c.T→C, c.A→G, and c.C→T) and is heterozygous for the “T” allele known to be associated with the deletion (at c.C→T under current nomenclature[]).

HL is a French Canadian female who presented at age  with weakness, muscle pain and exercise intolerance and underwent a muscle biopsy. Light mi- croscopy of the biopsy revealed scattered small and round angulated fibers that varied in size. We do not detect any additional rare-protein modifying GALC vari-

 A

B

Figure 4.5: CNVisualizer is a web application that can make CNV calls and inte- grate them with MitoExome sequencing results and clinical phenotypes to facilitate variant interpretation. (A) The sample view automatically detects recessive-type can- didates and provides relevant links to GeneCards, UCSC Genome Browser and DGV. (B) Each CNV call is linked to a plot that facilitates visualization of the supporting depth data and provides controls to alter calling parameters.

 ants in this patient, but it is interesting to note that mutations in GALC have previ- ously been implicated in altered muscle fiber diameters[] and there have been reports of psychomotor [] and cognitive effects in carriers[]. Detection of this variant underscores the importance of performing CNV detection for genetic diagnostics - especially in diseases with common deletions, such as Krabbe Dis- ease.

MB       HL

HL had an amplification of the distal quarter of the short arm of chromo- some  (p→pter) (Fig. .B). Full trisomy of p is a clinically recognizable multiple congenital anomaly/mental retardation syndrome[]. To date, there have been five case reports of submicroscopic trisomy  ranging in size from .-

Mb in pure form [, ] or in tandem with large deletions in q[], q[] or p[]. Our female patient presented as an adolescent with many symptoms shared by other patients with submicroscopic trisomy p such as GI reflux[], cerebellar hypoplasia [], optic nerve defects [], learning disability [, ,

, , ], hypotonia [], and immune deficiencies[]. She additionally presented with migraine, seizures, balance problems, autonomic nervous system dysfunction, muscle tremors, twitching, spasms, dysmenorrhea, bone density loss and disc desiccation. e next steps for this patient are to confirm this variant which should be possible through karyotyping and test if it formed de novo by in- vestigating parental DNA.

 A B

C D

E F

Figure 4.6: CNV Detection in MitoExome patients. (A) An event consistent with the 30kb deletion found in about half of Krabbe Disease cases is found in HL1173. (B) A large subtelomeric amplification with a breakpoint between ATP4C1 and ECHDC3 is detected in HL1107. (C) A potential single exon deletion in PHYH in HL1033. (D) A candidate deletion of the last exon of MTFMT in HL1254. (E) A candidate deletion of the second exon of MTFMT in SM-1HOMR. (F) A 12kb dele- tion of exons 10-13 of HSD17B4 in HL1228 supported by a reduced depth signature and read pairs with long inserts.

 P R   HL

Refsum disease (OMIM ) is a disorder of impaired alpha oxidation of fatty acids which leads to the buildup of neurotoxic phytanic acid. About  of cases are caused by recessive mutations in phytanoyl-CoA hydroxylase (PHYH), a per- oxisomal enzyme[]. In PHYH in sample HL, we detect an exon in the promoter region is covered by only three read pairs (Fig. .C) and the patient additionally harbors a variant (p.RQ) previously shown to cause partial en- zyme deficiency[]. If Refsum disease is left unnoticed through childhood, it may present in the teens, often after an infection, with symptoms including acute weakness, ataxia, sensory neuropathy, cardiomyopathy and ichthyosis[]. Our patient presented at age  after a bout of flu with muscle pain in her lower left ex- tremities. e patient went on to develop exercise intolerance, heart palpitations, mitral valve prolapse, tinnus in her right ear, dry eyes, hot and cold intolerance, complex migraines and stroke. GI symptoms were notable for Crohn’s disease and stress incontinence. An ultrasound performed on her sural nerve  years later was consistent with demyelination. While blood work was not notable for elevated levels of phytanic acid, there have been case reports of Refsum Disease with nor- mal blood phytanic acid levels []. Given the phenotypic overlap of this patient with Refsum disease, phytanoyl-CoA hydroxylase enzyme activity levels should be tested and this deletion should be confirmed and phased. If a diagnosis can be es- tablished, further symptoms can be prevented by reducing dietary intake of free phytanic acid by avoiding dairy, fish and meats[].

 T   L S    MTFMT

In two cases with Leigh syndrome and one copy of the pathogenic p.SL variant in MTFMT described in Chapter , we detect exons of MTFMT with low coverage.

We have previously shown that recessive mutations in MTFMT cause Combined

Oxidative Phosphorylation Deficiency  (OMIM ). HL presented in infancy with Leigh syndrome, optic atrophy, ataxia and elevated lactic acid in cerebral spinal fluid. e patient has severely reduced coverage of the last exon of MTFMT (Fig. .D). SM-HOMR is a male who presented with developmental delay, ataxic gate, dyspraxia of upper limb, and Leigh syndrome. He has low cover- age of the second and fourth exons of MTFMT (Fig. .E). Evidence of a deletion in this patient is weaker than in HL; the low coverage in the fourth exon is likely explained by the p.SL variant, and the reduction in coverage of the sec- ond exon is modest. Before a diagnosis can be established in either of these cases, these variants will need to be confirmed and phased.

HSDB-  HL

In one case, we make a new molecular diagnosis. HL is a  year-old male pa- tient with cerebellar ataxia, peripheral neuropathy, azoospermia, and mild hear- ing loss. We detected a kb heterozygous in-frame deletion of exons - of

HSDB (Fig. .F) that was compounded with a missense variant (p.AV) at a highly conserved residue predicted to impact NAD-binding (Fig. .A). e dele- tion is supported by both PR and SR signals in the form of reads with long inserts and clipped reads. Inspection of the clipped reads reveals that a single adenine was inserted into the deletion site. is deletion is indicative of a non-homologous end joining following a double strand break as it lacks extensive regions of homology

 g- 8 V>:- />>- wV/- Vgg >gg- >- - x - x #_w g-- #Y/- _YY>_cY/gYYVwVV: #wY_gY>cw - #>V:cV- - x x x x -- -- - P. aeruginosa w_ V/ wc :/ Y>#V-- VY/

Figure 4.7: Confirmation and phasing of HSD17B4 variants in HL1228. (A) HSD17B4 gene structure showing multiple enzymatic domains and the location of the p.A196V variant at a conserved residue and 12kb in-frame deletion. (B) Mutations in HSD17B4 in HL1228 and family members. The proband is the only member with evidence of both variants. (C) Confirmation of p.A196V variant in the proband and mother via Sanger sequencing. (D) Confirmation of the 12kb deletion in the proband and sister, but not mother. Abbreviations: PTS, peroxisomal targeting signal; SCP, sterol carrier protein at both ends of the breakpoint and has the insertion of a nucleotide []. Analy- sis of familial DNA confirmed the presence of the deletion and revealed that the two mutations were compound heterozygous, with the missense variant inher- ited from the mother and the deletion, which was heterozygous in the HL’s healthy sister, likely inherited from the father, from whom DNA was unavailable

(Fig. .B-D).

HSDB, also known as D-bifunctional protein and multi-functional protein

, is a peroxisomal enzyme that catalyzes multiple steps of beta-oxidation of very

 long chain fatty acids. e protein consists of three domains: an N-terminal de- hydrogenase domain, a central hydratase domain, and a C-terminal sterol car- rier protein (SCP) domain. e final three amino acids form the peroxisomal tar- geting signal (PTS). After peroxisomal import, the kDa full-length protein is proteolytically cleaved near amino acid  to yield a kDa dehydrogenase sub- unit and a  kDa hydratase/SCP subunit. Recessive mutations in HSDB are the known cause of D-bifunctional protein deficiency (OMIM ), an au- tosomal recessive disorder of peroxisomal fatty acid beta-oxidation that is gen- erally fatal within the first  years of life[], and have recently been identified in two sisters, diagnosed with Perrault syndrome (OMIM ), who exhib- ited severe sensorineural deafness, ovarian dysgenesis, peripheral neuropathy and ataxia[]. Retrospective review of metabolic testing from our patient revealed alterations of long- and very-long chain fatty acid metabolism consistent with a peroxisomal disorder, including elevated ratios of pristanic:phytanic acid and arachidonic:docosahexaenoic acid. Due to the phenotypic similarity to a previous case, the predicted severity of the HSDB mutations, and the metabolic evi- dence of peroxisomal dysfunction, the patient was given a diagnosis of HSDB- deficiency. A full case report on this patient is in preparation.

. D

We have shown that is is possible to call copy number variants from targeted se- quencing experiments and that these variants can lead to new molecular diagnoses of disease.

We show that we are able to recover a portion of known deletions and, to a lesser extent, amplifications in a control cohort. ere are a number of reasons

 why we may not have detected more of these bona fide variants. One possibility is that these variants may not all be real. Multiple attempts to call copy number variants from the same samples often produce callsets with poor concordance [,

, ]. Our experience from recalling synthetic variants, especially attenuated ones, underscores the difficulty in calling copy number variants from depth data.

A recent paper presents a similar sensitivity of  of detecting CNVs using the

 Genomes exome data []. ese data have been produced on multiple sequencing platforms in multiple sequencing centers, which could contribute to variability in depths. Lastly, it could be that increased coverage or larger sample cohorts are required to make accurate calls. While we show that increasing the cohort size from  to  samples did not significantly improve variant calling, it is possible that an even larger cohort could improve performance.

More work remains in improving performance of structural variant calling from targeted sequencing data. For instance, breakpoints can be more accurately identi-

fied and small events can be more readily detected by switching from a smoothing method that treats all data points as equal to one that weighs data based on dis- tance [] or on their coverage. Call quality may also be improved by moving from thresholding to a more advanced calling algorithm such as CBS[, , ,

] or a HMM [, , , ]. Further improvements to structural calling tools can be made by integrating signals beyond just depth. At least two structural vari- ant callers, GenomeSTRiP and GSAVPro, integrate read depth signals with pair end signals [, ]. e deletion in HSDB in HL is an example of a variant that has an abundance of supporting information from raw reads and the

Krabbe disease deletion in HL was supported by B-allele frequencies and a

SNP in linkage. Further data integration opportunities exist if additional data is

 collected, such as array-CGH, SNP arrays [] or emerging technologies like mi- crofluidic nanochannel-assays of DNA length distributions []. e design of a sequencing experiment can also be altered to improve structural variant detec- tion. Increasing depth and adding more baits for bases flanking exons should im- prove the signal from variants and capture more breakpoints, though for the most complete coverage, samples should undergo whole genome sequencing, preferably with de novo assembly.

We are able to make one firm and four tentative genetic diagnoses in our co- hort of  patients with mitochondrial and related disease. Our strategy to au- tomatically detect recessive candidates using CNVisualizer enabled the discovery of several potential single exon events that would otherwise have gone undiscov- ered. Many of these variants need to be confirmed and phased before a link to pathogenicity can fully be established. Given the prevalence of large insertion and deletion events in HGMD, one may expect more than five examples of potentially pathogenic variants in a cohort of over  individuals; however, given our lim- ited sensitivity of calling variants in a control cohort and for synthetic variants, it is likely that there are more CNVs in this cohort that have yet to be discovered.

ese results underscore the importance of looking beyond SNPs and indels when investigating the genetic causes of disease.

. M I

.. A C

is study was conceived and designed by S.G.H. and V.K.M. Analysis was per- formed by S.G.H, with the exception of the HSDB case which was assisted by

D.S.L. Variant confirmation was performed by D.S.L. e manuscript was written

 by S.G.H. All aspects of the study were supervised by V.K.M.

 5 Implication and Future Directions

   next-generation sequencing, we have investigated the ge- T netic basis of mitochondrial disease in approximately  patients. We have identified three new disease genes and have presented several additional can- didates. Using patient samples deficient in MTFMT, we were able to show that formylation is required for mitochondrial translation. Lastly, we were able to re- analyze our targeted sequencing data and detect copy number variants leading to the discovery of a case of HSDB deficiency. is work has implications for our

 understanding of mitochondrial biology and personalized medicine.

. I

.. D    

We present strong evidence that recessive mutations in AGK, NDUFB and

MTFMT cause mitochondrial disease and present  additional prioritized can- didates. Since publication of these findings, other groups have found individuals with pathogenic variants in AGK[] and MTFMT[, ] and in one candidate,

EARS[].

.. D    

Genetic testing for mitochondrial disorders has historically not been terribly suc- cessful. It has been myopic in that typically only the mtDNA or a handful of can- didate disease genes would be sequenced. Recently, several companies including

Courtagen and GeneDx have begun to offer more comprehensive genetic tests modeled after MitoExome sequencing. Another option is exome sequencing; how- ever, standard exome kits do not reliably capture the mtDNA. To remedy this, a group developed a “: Mito-Plus Whole-Exome” that combined an exome kit with additional probes for  MitoExome genes and the mtDNA[] and Baylor

College of Medicine Medical Genetics Laboratories currently offers an Exome+mtDNA test (http://www.bcm.edu/geneticlabs/?pmid=). It is exciting to see such tests being offering to patients so quickly.

 . F D

.. C     

is work addressed the so called “, interpretation” that threatens the

“, genome”[]. In the course of investigating the MitoExomes of ~ in- dividuals, approximately half a million variants were detected and processed. Such a feat was accomplished largely through a series of bioinformatic scripts that in- tegrated data from several sources and generated user-digestible reports. In order to dive into cases, I prototyped a web-based data exploration tool, CNVisualizer.

While some believe that advances in CNV detection algorithms will “eventually eliminate the need for chromosome microarrays as a separate test”[], simula- tions with synthetic variants suggest that targeted sequencing data is inherently limited and is not a suitable replacement for microarrays. In clinical sequencing, it is common for a patient to undergo several tests. As more patients undergo more genetic and “omic” tests, tools that integrate and automate the analysis of these data will need to be developed.

.. F    

It has been estimated that exome strategies are successful in identifying Mendelian disease genes in  of projects[]. is is consistent with our experience in

Chapter  where we were able to identify candidates underlying genetic defect in

 of  () infants with mitochondrial oxidative phosphorylation disease. But what about the rest? Several possibilities exist and were discussed in the introduc- tion of Chapter  (Fig .). Some of these possibilities can be explored with new informatics. For example, the possibility of additional splice mutations is exciting

 as it has been estimated that between [] and [] of disease-causing mutations affect splicing. An example of one such mutation is the p.SL variant in MTFMT, which we showed caused skipping of an exon (Chapter ). Progress is being made on algorithms that detect such mutations [], but their sensi- tivity/specificity are not yet ready for prime time on a genome-scale[]. In- stead of using informatics approaches, generating additional genetic or pheno- typic data may enable better evaluation of sequencing failure modes. MitoExome sequencing does not fully detect the wide array of informative variants that live within a patient mtDNA or nuclear genome. Whole genome sequencing would be ideal as it would not only detect the untargeted SNPs and indels, but would addi- tionally improve structural variant detection by providing data from breakpoints; however, it is still expensive to apply on large cohorts. Even with whole genome sequencing, coverage of the mtDNA would not be complete as most sequencing projects suffer from limited ability to detect low level heteroplasmic variants, due to the presence of nuclear mitochondrial DNA homologs (NUMTs), which gener- ate short reads that are indistinguishable from ones generated from mtDNAs. A report recently showed that single amplicon long-range PCR instead of hybrid cap- ture reduces the background from NUMTs and increases sensitivity to heteroplas- mic mtDNA mutations[]. Another PCR-related strategy, long padlock probes can also be used to supplement coverage for nuclear exons. A set for  exons from  nuclear-encoded mitochondrial genes has recently been created[].

Lastly, microarrays could be deployed to improve detection of CNVs and mtDNA deletions. Two groups have presented array designs that cover the entire mtDNA and ~ nuclear genes[, ], which have enabled the discovery of single exon deletions[]. With current array synthesis technologies, single sample arrays

 have been developed that cover the entire exome[]. A MitoExome target is much smaller than an Exome, and, in theory, a MitoExome array can be designed that can process eight samples using a single slide.

On the phenotypic front, it should be possible to molecularly characterize pa- tient cell lines in order to understand which mitochondrial processes are affected.

“Central Dogma” phenotyping can be applied to look for defects in mtDNA copy number, the expression and processing in mitochondrial RNAs and the expression and assembly of respiratory chain complexes. A recent study introduced the con- cept of an integrative personal omics profile that used genomics, metabolomics and transcriptomics to provide medical insight []. Such an approach could also be fruitful in a research setting. Metabolic profiling could be used to detect en- zyme deficiencies. Expression analysis of RNA-sequencing data can detect dys- functional regulatory pathways and allelic expression[, , ] can be mea- sured which can uncover heterozygous mutations that are expressed in a homozy- gous or near-homozygous state[].

.. E    

More candidate mutations would be instructive; however, sequencing projects have already produced a backlog of candidates whose pathogenicity has yet to be proven. In this thesis, pathogenicity of genes was proven via complementation studies in patient cell lines or recurrent hits in phenotypically similar patients.

Cell line studies can only be completed if patient material is available and it shows a phenotype. Even if such materials are available, getting a rescue is a laborious and finicky process. In some cases, yeast have been used to prove pathogenicity

[], but this is only an option if a homologue exists. e ability to rapidly gen-

 erate DNA editing reagents, such as those based on zinc-fingers nucleases[], transcription activator-like effector nuclease (TALENs) [] or clustered regu- larly interspaced short palindromic repeats (CRISPR)[], will enable studies in a variety of model systems ranging from human cell lines to model organisms. Phe- notypes can be enhanced with stem cell technologies, such as MyoD transforma- tion of fibroblasts into myotubes[]. Rescue experiments will still be laborious and given the proliferation of sequencing, the future of variant evaluation may come from statistical tests on large cohorts of phenotyped individuals. Such co- horts will arise both from research efforts, many of which are available via the database of Genotypes and Phenotypes (dbGAP)[], but also from the health care industry, which is rapidly becoming digitized [].

 A Supplementary Materials for Chapter 

A. M  M

A.. M    OXPHOS -



Four of the  nuclear OXPHOS disease-related genes recently reviewed [] did not have evidence of mitochondrial localization in the MitoCarta compendium

[]: TAZ, PUS, RRMB, TYMP. Manual review of these genes confirmed uncer- tain or non-mitochondrial localization.

 A.. DNA 

Several different approaches were compared for mtDNA sequence alignment and analysis. Since the nuclear genome has multiple nuclear sequences of mitochon- drial origin (NuMTs), care must be taken in alignment and identification of het- eroplasmic variants. On average, nuclear exons show X whereas the mtDNA has ,X coverage, reflecting the high number of mtDNA copies per nuclear genome in the sampled tissues.

When reads were aligned to the entire reference genome GRCh (including mtDNA sequence NC_),  of the reads aligned to the mtDNA aligned uniquely while the remainder had an alternative alignment of equal score. us the vast majority of mtDNA-aligned reads are expected to be indeed derived from the mtDNA rather than a NuMT. However, we observed that sequence polymor- phisms could cause thousands of reads to erroneously map a NuMT instead of the mtDNA. Since we aimed to compare mtDNA sequence and coverage across individ- uals, we chose to align all reads to the mtDNA to avoid differences in alignment due to polymorphisms. From the number of NuMT sequences in the reference genome and the extremely high depth of mtDNA coverage, we estimated that at least  of reads aligned to the mtDNA were derived from the mtDNA instead of any NuMT (data not shown). is estimation was consistent with the variant data that showed that the vast majority of potential variants were supported by

<  of reads. erefore we only analyzed variants present at >  of reads, consistent with previously methodologies[].

 A.. DNA    mtDNA deletions or rearrangements were detected through de novo mtDNA as- sembly using a modified ALLPATHS assembly approach[]. Reads aligned to the mtDNA were randomly downsampled to X coverage for computational tractability. Errors in reads were corrected using the ALLPATHS-LG[] error cor- rection algorithm. Next a k-mer graph[] was constructed (k=), and filtered using four steps:

. Removal of low coverage edges. At each graph vertex that had two edges (ei-

ther entering or exiting), any edge supported by >  fold fewer reads than

the other edge was removed.

. Removal of small components. Components of the graph containing less than

 k-mers were discarded.

. Removal of all-C edges. Edges whose sequence consisted entirely of Cs were

presumed to be artifacts and discarded.

. Removal of hanging ends. Edges of length <  k-mers for which one end

was a source or sink of the graph were removed.

e resulting assembly consisted of a directed graph, with a DNA sequence as- signed to each edge. Each assembly was manually inspected to identify alterations supported by >  of aligned reads.

A.. M  - 

Nine genes were previously reported to cause dominant-acting OXPHOS disease:

POLG, POLG, Corf, RRMB, SLCA, OPA, CYCS, MFN, HSPD. We man-

 ually inspected all rare, protein-modifying heterozygous variants or HGMD disease- associated alleles in these nine genes with the hypothesis that such heterozygous variants might cause disease. We would expect to detect . such variants in pa- tients by chance, given that we observe  such variants within the  healthy control individuals (including  variants associated with a phenotype in HGMD).

In our  patients, we detected six rare, protein-modifying or HGMD variants in the nine dominant-acting genes and in none of the cases was the observed patient phenotype consistent with the previously reported phenotype for dom- inant mutations. Two individuals harbored MFN variants previously associated with Charcot-Marie-Tooth in HGMD: p.GR [] and p.VI[]. However, existing reports of these variants did not provide any evidence of pathogenicity, and furthermore the severe phenotypes of our two cases were not consistent with previously reported symptoms (details provided in next section). ree individ- uals harbored previously unreported mutations in genes known to cause autoso- mal dominant progressive external ophthalmoplegia (adPEO): OPA:p.RH and

Corf:p.PL. None of the cases showed symptoms consistent with adPEO.

Individual P, from a consanguineous Lebanese family, harbored a previously unreported heterozygous POLG:p.AV variant of unknown significance. us, there is no strong support that any of these heterozygous variants are likely to cause the severe disease observed in our patients.

A.. A      

We carefully inspected all variants which were previously associated with any dis- ease in HGMD or MITOMAP (ACMG category  variants). Supplementary Table  lists the  such variants with allele frequency <., including  in autosomal

 genes,  in mtDNA genes and  in X-linked genes. Of the autosomal variants,  of  variants were regarded as unlikely to be pathogenic in the patient(s) they were found in for the following reasons:

.  variants had been reported in autosomal recessive disorders with differ-

ent clinical presentations to the patient and were heterozygous variants in

patients with no other mutation identified in that gene (in the PINK, CPT,

ACADM, SLCA x , CPS, DHGDH, AMT, MUT, GLDC, ACADSB, TCIRG,

ACAD, SUOX, SPG x , ALDHA, FECH, GCDH and ADSL genes).

. One patient (P) was homozygous for a reported ACADSB p.EK vari-

ant, but this mutation has only been reported in biochemical newborn screen-

ing and showed no clinical symptoms attributable to the enzyme defect up

to  years of age [].

. A WFS heterozygous p.KQ variant present as a single heterozygous

variant in an isolated case of sensorineural hearing impairment[] was

present as a single heterozygous variant in two patients (P, P) with

early onset cardiomyopathy or Leigh syndrome, respectively. It was also

present as a heterozygous variant in two sets of healthy control individuals

at . and . allele frequency.

. An HTRA heterozygous p.GS variant previously reported as a risk fac-

tor for Parkinson disease [] was heterozygous in one patient (P) with

a markedly more severe phenotype. It was also present as a heterozygous

variant in one of our control groups at . allele frequency.

. An ELAC heterozygous p.RQ variant listed in HGMD as being linked

to prostate cancer was not interpreted as being a risk factor in the original

 publication[]. It was heterozygous in one patient (P).

. An MFN heterozygous p.GR variant was described [] as a previously

reported missense mutation and listed in their table on molecular findings

but also in a polymorphisms table. Pathogenic MFN mutations typically

cause autosomal dominant Charcot-Marie-Tooth Neuropathy, Type A. e

pedigree described two affected members but only the proband appears to

have been tested. e previous report [] described this variant as a poly-

morphism, having found it in  of  control chromosomes but not in

MFN patients. is variant was present as a single heterozygous variant

in one patient (P) with a severe leukodystrophy.

. An MFN heterozygous p.VI variant was reported in the proband of a

family with an autosomal dominant history of peripheral neuropathy []

but other family members were not tested. is variant was present as a sin-

gle heterozygous variant in one patient (P) with a severe leukodystrophy.

e p.V residue is poorly conserved (/ species) and was present as a

heterozygous variant in two of our control groups at . and . allele

frequency.

Of  mtDNA variants reported as disease-associated in MITOMAP,only  (m.C>T in the MTND subunit gene) was listed as confirmed as it was reported as a homo- plasmic mutation in  families with adult-onset Leber’s Hereditary Optic Neuropa- thy and spontaneous recovery of vision []. However, analysis elsewhere sug- gests it lacks evidence of pathogenicity[]. It was present in one patient (P) with a severe multisystem disease and combined OXPHOS complex I, III and IV deficiency, making it unlikely to be causative in this patient. e other  reported

 mtDNA mutations in MITOMAP were not classified as confirmed, lacked evidence of pathogenicity, were associated with different clinical and enzyme abnormalities to those present in the patients identified in this study, and were present in the mtDB database at allele frequencies of . to .. us these variants were regarded as unlikely to be causative in our patients. Two patients had mutations in X-linked genes that had previously been reported as pathogenic in HGMD. One patient (P) was a girl with a heterozygous synonymous ABCD p.RR mu- tation. ABCD mutations typically cause X-linked recessive adrenoleukodystro- phy and are an unlikely cause of the lethal neonatal mitochondrial disease found in P. e other reported X-linked mutation was an in-frame bp deletion in

OTC reported previously in a case of late-onset hyperammonemic coma[]. is hemizygous indel was present in a boy (P) who manifested symptoms of OTC deficiency including orotic aciduria, hyperammonemia plus borderline low OTC activity in liver. However, this OTC deficiency alone did not explain the sever- ity of his neurological deficit given quick resolution of hyperammonemia nor his combined OXPHOS complex I, III and IV defect in skeletal muscle with normal

OXPHOS activities in liver.

us only  of the  reported pathogenic variants identified in our cohort appeared likely to explain the phenotype of the patients they were identified in, namely one homozygous TSFM variant found in two patients, two POLG variants found in one patient and a heterozygous BCSL variant found in one patient in compound heterozygosity with a novel variant as described in the Results section and Table .

Extrapolation of MitoExome diagnostic efficacy in retrospective cohort Patients selected for sequencing were refractory to molecular diagnosis using traditional

 approaches. To estimate the efficacy of our approach on unselected samples, we analyzed a retrospective cohort of  unrelated patients with biochemical evi- dence of a “definite” infantile OXPHOS disease from the Murdoch Childrens Re- search Institute laboratory, which performs OXPHOS diagnostic tests for most of

Australia and New Zealand. is cohort contained  of the samples sequenced in this study as well as  cases with molecular diagnoses established through tra- ditional methods such as mtDNA sequencing, candidate gene sequencing, family studies, or NGS of Complex I candidates[](Fig. A.A). ese  cases included

 nuclear gene diagnoses ( unique genes) and  mtDNA diagnoses ( unique genes/deletions), with the most common mutations in POLG ( cases), SURF

( cases), mtDNA site m.T>G ( cases) and mtDNA site m.T>G ( cases). e  cases encompassed  causal alleles corresponding to  unique variants, where each homozygous variant represents two causal alleles.

How many of the  known causal alleles would have been detected and prior- itized by MitoExome sequencing? Since MitoExome sequencing showed high and reproducible coverage across samples, we used the observed coverage to roughly estimate which of the  known causal alleles would have been detected. We assumed all SNVs and short indels (<bp) would be detected if supported by at least  reads (mapping quality > and base quality ≥ ) in half of the  se- quenced individuals. Variants were assumed to be undetected if coverage <X, indel length > bp, or nuclear deletion >bp. Using these assumptions, we es- timated detection and prioritization of  of the  unique variants,  of the

 causal alleles, and estimated diagnoses in  of the  cases (Fig. A.B,

D, E, F). We note that all four variants poorly covered by MitoExome technology are now well-covered by newer hybrid selection technology, however current tech-

 nology would most likely miss or not prioritize three large nuclear deletions, one splice variant located bp from an exon boundary, and one large indel (SURF: c._delinsAT).

We next extrapolated the MitoExome results from the  sequenced patients to the  remaining index cases that lacked a molecular diagnosis. Together, these analyses suggest that MitoExome sequencing would enable molecular diagnosis in roughly  of cases with infantile OXPHOS disease and would provide poten- tial candidate mutations in a further  of cases requiring additional follow-up

(Fig. A.C).

A.. V V, P,  C A

All prioritized variants detected in patients were independently validated by Sanger sequencing (/ variants validated), and compound heterozygous variants were phased though sequencing cDNA (GFM), cloned DNA (BCSL, Corf, MTHFDL), familial DNA (GFM, AGK, EARS), or by a molecular haplotyping approach de- tailed below (ACAD, AARS, POLG, TYMP). / putative compound heterozy- gous variants were confirmed and  instances are not yet phased, including ACOX and ALDHB. We note that the candidate genes ACOX and ALDHB harbor variants in patients with an established diagnosis in known disease-related genes.

Two heterozygous variants in MRPL were determined to be on the same ma- ternal haplotype and therefore this gene was not listed as a prioritized candidate.

e Corf variant was determined to be an artifact of whole genome amplifi- cation (WGA) since it was present in WGA DNA but absent in genomic DNA based on Sanger sequencing.

 DNA  RNA   DNA 

DNA was isolated from cultured fibroblast cells or blood using a Nucleobond CB

DNA Extraction kit (Scientifix) or from subject tissues (skeletal or cardiac muscle and liver) by proteinase K digestion followed by salting-out. RNA was extracted from cultured fibroblasts using the Illustra RNAspin Mini Kit (GE healthcare) and cDNA was generated using the SuperScript III First strand synthesis system (Invit- rogen) as per manufacturers’ protocols. For analysis of nonsense-mediated decay and mRNA splicing, fibroblasts were cultured in medium with and without  ng/µl CHX for  h before RNA preparation[].

PCR  S 

PCR products (primers supplied upon request) were either directly sequenced us- ing GENEWIZ, ABI XL and BigDye v. Terminators (Applied Biosystems) as per manufacturer’s protocols or sequenced after gel purification using the MinE- lute Gel Extraction kit (Qiagen).

A 

For patients P, P and P, the relevant exons containing both heterozygous mutations in BCSL, MTHFDL and Corf respectively were PCR amplified from genomic DNA and cloned into the TOPO®TA Cloning®system (Invitrogen) as per manufacturer’s instructions before Sanger sequencing.

M 

Our molecular haplotyping protocol was adapted from a previous report[].

Briefly, DNA was amplified from a single input molecule (- replicates) and

 then the two target sites were genotyped via PCR-amplification and Sanger se- quencing. Genomic DNA (gDNA) was quantified using a TaqMan assay for Intra-

AluYb[] and pre-amplified using a primer-extension preamplification. In a  well microtiter plate, gDNA was diluted to .- genomes/well and mixed with X

PCR Buffer II (Applied Biosystems), .mM MgCl (Applied Biosystems), µM each dNTP (Denville Scientific),  µM N oligonucleotide (Gene Link), .U Taq polymerase (Amplitaq, Applied Biosystems). Reactions were cycled ( ◦C ×  min, then  cycles of  ◦C ×  s,  ◦C × min,  −  ◦C ramp over  min,  ◦C ×

 min). Products were diluted by adding µL of water. Each pair of target sites was genotyped by PCR-amplification and Sanger sequencing. Hemi-nested PCR primers were designed using Primer[](http://primer.sourceforge.net/) and filtered by a custom Python script based on the following criteria: ) -bp primer length, ) − ◦C primer Tm, ) two G/C bases at ’ primer end and one

G/C base at ’ primer end, ) -bp length for inner PCR amplicon, and ) minimum length for outer PCR amplicon. When necessary, criteria  and  were relaxed to enable primer design. e first round of PCR was multiplexed; µL of preamplified patient DNA was amplified in a  µL PCR reaction containing X

PCR Gold buffer (Applied Biosystems),  mM MgCl (Applied Biosystems), 

µM each dNTP (Denville Scientific), . µM of the forward-external and reverse primer for each of the two sites and .U Amplitaq Gold DNA polymerase (Ap- plied Biosystems). Reactions were cycled ( ◦C ×  min, then  cycles of  ◦C ×

 s,  ◦C ×  s,  ◦C ×  s). Products were diluted by adding µL of water.

Separate second round PCR reactions were run for each of the two amplicons; µL of diluted first round PCR product was amplified in a  µL PCR reactions con- taining X PCR Gold buffer (Applied Biosystems),  mM MgCl (Applied Biosys-

 tems),  µM each dNTP (Denville Scientific),  µM of the forward-internal and reverse primers, .U Amplitaq Gold DNA polymerase (Applied Biosystems), .X

SyBr Green I (Invitrogen) and .X ROX (Invitrogen). Amplification was assayed by melting curve analysis ( Fast Real Time PCR System, Applied Biosystems).

Successful reactions were cleaned by adding U Antarctic phosphatase (NEB) and

U Exonuclease I (NEB) and incubated for  min at ,◦ C. Enzymes were deacti- vated by incubating for  min at ,◦ C. Cleaned samples were Sanger sequenced

(Genewiz or MGH DNA Sequencing Core). Sequence tracers were manually in- spected, and a minimum of  informative reactions were required to determine the phase.

SDS-PAGE   

Primary control and patient fibroblasts were lyzed in RIPA buffer ( mM Tris pH

.,  mM NaCl,  NP-, . sodium deoxycholate and . SDS) contain- ing protease inhibitor cocktail (Roche)[]. Skeletal muscle samples and comple- mentation assay samples were lyzed in SDS/glycerol solubilization buffer (mM

Tris pH.,  glycerol,  SDS, mM DTT, . Bromophenol blue), con- taining protease inhibitor cocktail (Roche)[]. After sonication and heat denatu- ration, - µg of whole lysate were run per lane on  NuPAGE Bis-Tris gels (In- vitrogen), and proteins were transferred to PVDF membranes (Millipore). Protein blots were later developed using ECL or ECL Plus detection reagents (Amersham

Bioscience).

 A   

Antibodies included MitoProfile Total OXPHOS Human WB Antibody Cocktail (MS

Mitosciences) at :, Porin ( Calbiochem) at :,, complex II - kD subunit (A- Molecular Probes) at :,, complex II kDa subunit

(MS MitoSciences) at :, COXB (MS MitoSciences) at :, GFM

(--AP ProteinTech) at :, BCSL (--AP ProteinTech) at :,

CIA and ECSIT (kind gifts from M Ryan and M McKenzie, La Trobe University) at :, UQCRFS (MS Mitosciences) at :, UQCRC (MS Mito- sciences) at : and anti-mouse or rabbit HRP secondary antibodies (P,

P DakoCytomation).

D   PCR

e relative abundance of mtDNA vs nuclear DNA was tested in patient and con- trol tissues as previously described previously []. e level of deleted mtDNA in the skeletal muscle of patient P was determined by quantitative real-time

PCR as described[].

L- PCR  DNA

DNA extracted from the skeletal muscle of patient P was checked for large-scale mtDNA rearrangements and the large mtDNA deletion mapped using long-range

PCR using primers F (nt -) and DR (nt -) which in control DNA amplifies a ,bp fragment. e mtDNA was then sequenced using an internal primer closer to the break point (primer sequence available on request) as above.

 S   L  

SNVs in candidate disease genes (TSFM, ACAD, ACADSB, Corf, LYRM and

MTCH) found in patients of Lebanese descent were assayed in genomic DNA from  Lebanese probands ( ethnically matched control chromosomes) us- ing Sequenom MassARRAY iPLEX GOLD chemistry. Oligos were synthesized and mass-spec QCed at Sigma Aldrich. All SNVs were genotyped in a multiplexed pool of  assays, designed by AssayDesigner v.. software, starting with  ng of

DNA for each sample. Around  nl of reaction was loaded onto each position of a

-well SpectroCHIP preloaded with matrix. SpectroCHIPs were analyzed in au- tomated mode by a Sequenom MassArray Analyszer Compact mass spectrometer.

Variants were analyzed by Typer v.. software.

A. S  P

A.. ACAD   P

DNA from patient P, with lethal neonatal mitochondrial disease and complex

I deficiency, contained two mutations in ACAD, which were Sanger validated

(Fig. A.A) and determined to be compound heterozygous via molecular haplotyp- ing. Mutation c.G>C, p.AP was previously reported as pathogenic [].

Western blots show a reduction of complex I assembly factors CIA and ECSIT protein levels, in keeping with previously published data[] (Fig. A.B) and dis- tinct from a range of other complex I defects, which had normal levels of both as- sembly factors. e second mutation, c.T>A, p.IN, is located within a highly conserved region (Fig. A.C) and the isoleucine residue is conserved within  of

 aligned vertebrate species based on the UCSC multiple genome alignments.

 A.. POLG   P

DNA from patient P, with lethal neonatal mitochondrial encephalopathy and clear complex I deficiency with less marked deficiency of complexes III and IV,con- tained two mutations in POLG which were Sanger validated (Fig. A.) and deter- mined to be compound heterozygous via molecular haplotyping. Both mutations were previously reported as pathogenic[, ].

A.. BCSL   P

DNA from patient P, an infant girl with neonatal mitochondrial cytopathy and complex III deficiency, contained two mutations in BCSL, which were Sanger val- idated (Fig. A.A) and determined to be compound heterozygous by cloning ge- nomic DNA. Mutation c.C>T (p.RX) was previously reported to be pathogenic[], and this mutation was undetectable in cDNA without the presence of cyclohex- imide, indicating that the mRNA is degraded via nonsense mediated decay (Fig. A.B).

e novel c.C>T (p.RC) mutation is located in a highly conserved region, with identical residues in / aligned species (Fig. A.C) and alters the charge of a highly conserved residue within the BCS_N superfamily domain (pfam).

Western blots show that although BCSL protein levels are within normal limits, the patient shows a reduction of complex III protein UQCRFS, in keeping with previous patient reports[] (Fig. A.D).

A.. COXB   P

DNA from patient P, a boy with neonatal onset of mitochondrial encephalopa- thy, metabolic acidosis, and complex IV deficiency, contained a homozygous muta- tion in COXB, which was inherited from carrier parents (Fig. A.A). Protein blot

 analysis showed a complete absence of COXB protein in both the patient’s mus- cle and fibroblasts (Fig. A.B). e p.T residue is located within the COXB:COXA interface[] and is therefore likely to affect complex IV assembly and stability.

A.. GFM   P  P

DNA from patients P and P, both presenting with mitochondrial encephalopa- thy, contained compound heterozygous mutations in GFM, which were Sanger validated and phased by sequencing familial DNA and cDNA. P contained mu- tations c.delT and c.C>T inherited from his mother and father, respec- tively (Fig. A.A). P contained mutations c.delT and c.A>G, where the mother carried only the former mutation and paternal DNA was unavailable for testing (Fig. A.A). In both patients the c.delT allele was undetectable in cDNA without CHX, indicating that the transcript undergoes nonsense mediated decay

(Fig. A.B). Protein blots showed reduced or absent levels of GFM protein in pa- tient muscle and fibroblasts (Fig. A.C). In addition, protein blots showed reduced amounts of OXPHOS proteins COX and NDUFB (Fig. A.C), indicating that the

GFM missense mutations cause protein instability leading to a defect in mtDNA translation and deficiency of complexes I, III and IV.

A.. TSFM   P  P

A previously reported pathogenic mutation (c.C>T, p.RW)[] in TSFM was homozygous in DNA from patient P, with combined OXPHOS deficiency, and patient P, with cardioencephalomyopathy. Sanger sequencing of familial

DNA showed segregation of the mutation with disease in both consanguineous

Lebanese families (Fig. A.A). Western blots showed that OXPHOS subunits COX,

 NDUFB and CIII core were deficient in all patients and their affected siblings

(Fig. A.B), consistent with a defect in a mitochondrial translation elongation fac- tor such as TSFM.

A.. AARS   P

DNA from patient P, a stillborn fetus with mitochondrial myopathy and com- bined complex I, III and IV was found to harbored one novel (c.G>A, p.RH) and one previously reported pathogenic mutation (c.C>T, p.RW [] in the mitochondrial translational protein AARS (Fig. A.A). e novel p.RH mutation is located within a highly conserved region of the AARS protein and the arginine residue is conserved in  out of the  aligned vertebrate species

(Fig. A.B). Western blots show that OXPHOS subunits COX, NDUFB and CIII core  were deficient in this patient’s skeletal muscle (Fig. A.C), consistent with a defect in mtDNA protein synthesis.

A.. AGK   P  P

DNA from patients P and P, described in detail below, contained mutations in the gene AGK. P had compound heterozygous mutations c.+T>C and c.T>A inherited from her father and mother, respectively (Fig. A.A). P contained a homozygous mutation c.+G>T that was Sanger validated; parental

DNA was not available for testing (Fig. A.B). cDNA from patient P showed the presence of a shorter isoform P (Fig. A.C), and subsequent sequencing identi-

fied a bp deletion corresponding to the complete skipping of exon  consistent with the observed splice mutation. Sequence analysis of the full length PCR prod- uct in this patient validated the p.YX nonsense mutation. cDNA from patient

 P showed that only a shorter isoform was present (Fig. A.C), and sequencing confirmed that it resulted from exon  skipping consistent with the observed splicing mutation.

A. C I  AGK  NDUFB P

P presented to hospital with cardiac failure and hypertrophic obstructive car- diomyopathy at  months of age. She had marked hyperlacticacidemia (- mmol/l; normal range . – .) with metabolic acidosis. Clinical review revealed generalized muscle weakness. She responded to medical management of cardiac failure therapy and carnitine. Recurrent episodes of metabolic acidosis accompa- nied viral infections. At age  months she could pull to stand, sit without support and crawl on all fours. Her speech, fine motor and social skills were normal. She walked in her third year. Bilateral cataracts were removed at / years of age.

ere were multiple episodes of congestive cardiac failure in childhood but the cardiomyopathy was not progressive and cardiac failure medications were ceased at age . She developed aphakic glaucoma. Fatigue was the major problem affect- ing her day to day life throughout childhood, with muscle power in adolescence better than in childhood. She used a wheelchair for school from age  because of fatigue and lactic acidosis on exertion. Her weight and height remained below the  percentile with head circumference on the  centile. Commencement of riboflavin therapy coincided with a degree of stabilization. At  years of age, verbal intellectual functioning lay within the borderline range while performance intellectual functioning was in the average range. Her school progress was within normal limits for age with reading and spelling. Recurrent headaches were prob- lematic in teenage years. At  years of age, verapamil was introduced because of

 cardiac wall thickening. She developed osteopenia and premature ovarian failure.

Initial psychometric assessments were normal, but the gap between her abilities and those of her peers widened. She graduated from high school, having required a modified curriculum in the final  years. At age  years  months, suddenly,

 weeks after contracting a viral illness, she developed worsening acidosis, had a cardiac arrest and died. Skeletal muscle biopsy at the initial presentation revealed punctuate red staining on Gomori trichrome preparations and markedly reduced cytochrome oxidase staining. Electron microscopy revealed lipid vacuoles, with subsarcolemmal accumulation of mitochondria in some fibers. e mitochondria were of abnormal shape, being rounded, with the usual internal pattern of cristae replaced by amorphous granular interior.

P was the first child of first cousin parents of Pakistani origin. e pregnancy was uncomplicated, but delivery was by lower segment cesarean section because of unstable fetal lie. Birthweight was  gm, and Apgar scores were  and  at

 and  minutes, respectively. She developed respiratory distress and collapsed within the first hour of life with rapidly progressive circulatory failure, requiring mechanical ventilation. At this time she had profound metabolic acidosis (pH .; base excess -) and severe hyperlacticacidemia (up to  mmol/L). Urinary or- ganic acids showed an elevated lactate. Liver function tests were normal. Car- diac echo was suggestive of persistent pulmonary hypertension. Head ultrasound showed no significant abnormality, and EEG was not severely encephalopathic.

Bilateral cataracts were noted, and she had persistent thrombocytopenia. Hemo- dynamic stability was restored but lactic acidemia persisted. Her parents were counseled about the likely poor prognosis, and mechanical ventilation was with- drawn at three days of age. She succumbed within one hour of extubation. Muscle

 and liver samples were collected within two hour of death for histopathology and respiratory chain enzymology, and a skin biopsy was performed to establish a cul- tured fibroblast line.

P had lethal infantile mitochondrial disease and her clinical presentation was described previously due to her muscle biopsy being unusual in having nemaline rods together with a severe complex I defect and borderline low complex III defi- ciency (Patient  in ref. []). She was born to unrelated parents (father is of

British decent and mother has Dutch ancestry) with no previous family history of mitochondrial disease. Briefly, she was born at  weeks gestation by caesarian section because of failure to grow, and weighed g at birth. She required venti- lation, was hypotonic, failed to thrive and developed worsening metabolic acidosis prior to her death at  months of age.

A. E C

Evolutionary conservation of protein AGK (Fig. .) was determined by identify- ing RefSeq protein homologs in selected species via BLASTP[]. e following

GenBank sequences were then multiply aligned using MUSCLE software[] with default parameters: gi|, gi|, gi|, gi|, gi|, gi|, gi|, gi|. Evolutionary conservation of BCSL protein used the following GenBank sequences: gi|, gi|, gi|, gi|, gi|, gi|, gi|, gi|, gi|. Evolutionary conservation of ACAD protein used the following

GenBank sequences: gi|, gi|, gi|, gi|, gi|, gi|, gi|, gi|. Evolutionary con- servation of AARS protein used the following GenBank sequences: gi|,

 Figure A.1: Variant prevalence in cases and controls. Number of variants present in cases (red) compared to control individuals of European ancestry (gray). Barcharts display the mean number of variants present per sample, within autosomal coding re- gions of 1034 mitochondrial genes that were well-covered in all individuals. Variants are categorized by rare (<0.005 allele frequency), protein-modifying (PM), heterozy- gous, and recessive-type (homozygous or 2 heterozygous variants present in the same gene) filters. Error bars indicate standard error of mean. gi|, gi|, gi|, gi|, gi|, gi|, gi|, gi|. Evolutionary conservation of NDUFB protein (Fig. A.C) used the following GenBank sequences: gi|, gi|, gi|, gi|, gi|, gi|, gi|, gi|, gi|, gi|, gi|, gi|, gi|

 Figure A.2: ACAD9 variants in patient P1. (A) Electropherograms show Sanger sequence validation of the two mutations. (B) SDS-PAGE western blotting of fibrob- lasts from control (C), patient P1, and patient controls with mutations in NDUFS6 (FS6), NDUFS4 (FS4), C20orf7 (C20) and C8orf38 (C8) Asterisk indicates non- specific band in ECSIT blot. (C) Multiple sequence alignment of ACAD9 protein region surrounding the novel p.I87N mutation (asterisk).

 Figure A.3: POLG mutations in P2. Electropherograms show Sanger sequence validation of the two mutations found in POLG.

Figure A.4: BCS1L mutations in P11. (A) Electropherograms show Sanger se- quence validation of the two mutations. (B) Sanger sequencing of cDNA with (+) and without (-) cycloheximide (CHX). (C) Multiple sequence alignment of BCS1L protein region surrounding the novel p.R90C mutation (asterisk). (D) SDS-PAGE western blots of patient and control (C1 and C2) skeletal muscle.

 Figure A.5: COX6B1 mutations in P17. (A) Family pedigree of P17. Carrier sta- tus of the parents was confirmed by Sanger sequencing parental DNA (WT represents Wild-Type allele). (B) SDS-PAGE western blots of OXPHOS complex subunits in patient and control (C1, C2) fibroblasts (Fib) and skeletal muscle (Skm). Loading controls included CV α (for top panel) and Porin (for COX6B1 panel).

 Figure A.6: GFM1 mutations in P18 and P29. (A) Family pedigrees of P18 and P29, showing the mutant alleles segregated with disease in the families. For P29 the c.720delT was inherited from her mother, DNA from her father was unavailable for testing but is a presumed carrier of the c.910A>G (WT represents Wild-Type allele) or that this mutation is de novo. textbf(B) Chromatograms from Sanger sequencing of cDNA +/- cycloheximide (CHX). (C) SDS-PAGE western blots of GFM1 and OX- PHOS complex subunits in patient and control (C1, C2) fibroblasts (Fib) and skeletal muscle (Skm). Loading controls included CII 70kD (for top panel) and CII 30kD (for GFM1 panel).

 Figure A.7: TSFM mutations in P27 and P28. (A) Family pedigrees of P27 and P28. Sanger sequencing of familial DNA confirmed segregation of mutant alle- les with disease in both consanguineous Lebanese families. P27 and his affected sib- ling (P27S) were homozygous for c.997C>T. Their parents and 2 unaffected siblings were carriers (WT represents Wild-Type allele), whereas the 3rd unaffected sibling was homozygous wild-type. P28 and her affected sibling (P28S) were homozygous for c.997C>T while their unaffected sibling was a carrier. DNA from both unaffected parents was unavailable for testing however are presumed to be carriers. (B) SDS- PAGE western blot of OXPHOS complex subunits in patients (P27, P28), affected patient siblings (P27S, P28S) and control (C). CV α was used as a loading control.

 Figure A.8: AARS2 mutations in P40. (A) Electropherograms show Sanger se- quence validation of the two mutations. (B) Multiple sequence alignment of AARS2 protein region surrounding the novel p.R329H mutation (asterisk). Grey indicates conserved amino acids. (C) SDS-PAGE western blots of OXPHOS complex subunits in patient and control skeletal muscle. CV α was used as a loading control.

 Figure A.9: AGK mutations in P41 and P42. (A) Family pedigree for P41. Sanger sequencing of familial DNA confirmed segregation of mutation allele with disease. P41 is compound heterozygous for the c.297+2T>C and c.1170T>A. Her parents are carriers and her unaffected sibling was not tested (WT represents Wild- Type allele). (B) Family pedigree for P42. Sanger sequencing of P42 confirmed her homozygous state for the c.1131+1G>T mutation, her unaffected consanguineous parents were presumed to be carriers as DNA was unavailable for testing. (C) PCR products of AGK exon 1-16 from control and patient’s fibroblast cDNA -/+ cyclo- heximide (CHX). A full length PCR product of 1417bp was detected in control (-/+ CHX). Sequence analyses of smaller products are summarized in the schematic repre- sentation of the AGK genomic structure indicating the location and consequences of the mutations identified.

 Figure A.10: NDUFB3 mutation in P3. (A) Family pedigree of patient P3. Sanger sequencing of genomic DNA confirmed homozygosity in P3 for the c.64C>T (p.W22R) mutation and carrier status of the mother (WT represents Wild-Type al- lele), Paternal DNA was unavailable for testing but is presumed to be a carrier. (B) Microarray-based detection of copy number change for the Chromosome 2 area sur- rounding the NDUFB3 locus (chromosome 2q33.1) using the high-resolution Illumina HumanCytoSNP-12 array (version 2.1), Blue dots indicate B-allele frequency (BAF) data for all SNPs in region and smoothed LogR data (solid red line centred at 0) are shown for P3. The LogR data indicate that there are no detectable deletions around the NDUFB3 locus. The 1.3Mb region of Long Contiguous Stretch of Homozygos- ity (LCSH) is also shown (solid grey bar). (C) ClustalW alignment of NDUFB3 or- thologs shows conservation of the p.W22 residue (asterisk) as well as the adjacent lysine residue, which undergoes N6-acetylation (triangle).

 Figure A.11: Estimated MitoExome diagnostic efficacy in a retrospective co- hort. (A) Retrospective cohort containing 291 patients with biochemically proven OXPHOS disease, of which 124 cases (blue) had pre-existing molecular diagnoses obtained via traditional sequencing. 38 of the remaining 167 cases resistant to tradi- tional methods were sampled for MitoExome sequencing (red and light gray). (B) Estimated efficacy of MitoExome sequencing in the 124 cases with a pre-existing molecular diagnosis. (C) Estimated MitoExome diagnostic efficacy in cohort of 291 cases, where rates of molecular diagnosis for the 38 sequenced patients were extrapo- lated to the remaining 129 patients lacking a diagnosis. (D-F) Estimated MitoExome sequencing detection and bioinformatic prioritization of 98 unique variants (D), the 198 unique causal variants (where homozygous variants each count as two causal alle- les) (E), and 124 patient cases (F). Asterisk indicates estimated detection of variants based on sequence coverage observed in 38 sequenced individuals, and prioritization using the MitoExome annotation pipeline.

 B Supplementary Materials for Chapter 

B. D  

B.. P  (P)

P was investigated at  years of age with an acquired strabismus (bilateral in- ferior oblique overaction thought to be due to an internuclear ophthalmoplegia) and mild impairment of visual acuity. A normal ophthalmological examination had been documented one year previously. ere was a past history of develop- mental delay affecting linguistic and to a lesser extent coordination abilities. She

 had mild difficulty performing tandem walking and standing on one leg and was unable to ride a bicycle. Her printing was legible but slow and immature for her age. She scored  months below her age level on the Schonell Graded Word Read- ing Test. e optic fundi showed slightly pale optic discs. Visual acuity was / on the right and / on the left. CT scan revealed diminished attenuation within the putamina, the left caudate and the subcortical white matter of the parietal and occipital regions. MRI showed nearly symmetrical high signal on T lesions involving the lentiform nuclei, caudate nuclei, mid-brain tectum, red nuclei, cor- pus callosum and subcortical white matter.

Urine organic acid and amino acid levels were normal. ree separate fast- ing CSF samples showed elevated lactate (., . and . mmol/L, reference range .-.) and mildly elevated pyruvate (,  and  µmol/L, refer- ence range -). Blood lactate and pyruvate were consistently normal.

Cardiovascular, respiratory and abdominal systems were normal on physical ex- amination. Histology, histochemistry and electronmicroscopy of muscle biopsy material were normal. Electrocardiography showed a short PR interval and delta waves consistent with Wolff-Parkinson-White syndrome. She was prescribed mul- tivitamins including Riboflavin, iamine and Ubidecanone. Subsequently she stopped experiencing night terrors which were previously occurring almost ev- ery night and no deterioration of her condition has been noted. An intellectual assessment at  years of age showed verbal, performance and full scale abilities below the . percentile level. At  years of age she is working in her family’s shop where she can stack the shelves, serve customers and handle money. Cranial

MRI shows no change in the lesions.

P’s maternal first cousin also had Leigh-like changes on brain MRI. At  years

 of age he was more severely affected with global developmental delay, optic at- rophy, mild bilateral pyramidal tract signs and coordination difficulty. He had significantly impaired visual acuity, to / in the right eye and / in the left.

He was mildly obese, had small hands and feet, a left single palmar crease, mildly dysmorphic facies and a past history of bilateral inguinal hernia repair. Plasma lactate level was normal but CSF lactate was elevated at .mmol/L, with a raised lactate/pyruvate ratio of . He also showed evidence of Wolff-Parkinson-White syndrome on electrocardiography. He was commenced on similar high dose mul- tivitamin therapy. At  years of age there was an acute deterioration with right sided weakness and ataxia. MRI showed progression of T signal abnormality af- fecting periaqueductal grey matter of the midbrain and upper pons. With rehabil- itation he recovered much of the lost function.

B.. P  (P)

P was the first born child of non-consanguineous Australian parents and had no significant neonatal problems. Early milestones were age appropriate and there was no overt history of neurodevelopmental regression. She was initially inves- tigated at  years of age for excess weight gain and hypertension associated with elevated cortisol and ACTH suggestive of Cushing’s disease. Her height was on the

 centile and weight on the  centile.

Cerebral MRI identified a mm pituitary adenoma and bilateral symmetrical abnormal high signal areas in the putamen, globus pallidus and brainstem. MRI performed one month later showed marked progression of previous basal ganglia, midbrain and medullary disease with new extensive bilateral frontoparietal and occipital white matter changes. Following the general anaesthetic for the MRI, P

 experienced a right focal seizure that progressed to a tonic clonic seizure. High

CSF and serum lactate levels were noted (serum: .mmol/L, CSF: .mmol/L).

She experienced a second seizure one week later which resulted in respiratory ar- rest and she was unable to be weaned from mechanical ventilation. She had fur- ther neurological deterioration. Care was withdrawn and she soon succumbed,

five weeks after her initial admission.

Tissue biopsies and cell lines from all  patients had deficient activities of OX-

PHOS complexes I and IV, with deficient complex III in some samples (Fig. B.A).

ese combined OXPHOS defects were not due to mtDNA depletion since mtDNA levels estimated by qPCR [] were normal to elevated in all tissues studied, at

 to  of normal mean in muscle biopsies from the  patients and  of normal mean in P liver biopsy.

B. N    

Roughly  of our sequence aligned to the target region resulting in an average of x coverage of nuclear DNA-encoded exons and >x coverage of mtDNA.

For accurate variant detection, we required that each target base have independent sequence data from at least  reads. Based on this cutoff, ~ of the MitoExome was well covered. Regions of poor coverage tended to be GC-rich, consistent with previous reports [].

 B. S M

B.. Q RT-PCR (RT-PCR)

RNA was extracted from cultured patient fibroblasts using the Illustra RNAspin

Mini Kit (GE Healthcare) and cDNA was generated using the TaqMan Reverse

Transcription Kit (Applied Biosystems) according to manufacturers’ protocols. For

ND qRT-PCR, residual genomic DNA was digested using the Turbo DNA-free Kit

(Ambion). Quantitative evaluation of ND, MTFMT, PDF, MAPD and HPRT ex- pression was performed using the Taqman Human HPRT Endogenous Control

Gene Expression Assay (Applied Biosystems), the TaqMan Human MTFMT Gene

Expression Assay, Hs__m (Applied Biosystems), the TaqMan Human

PDF Gene expression Assay (Hs_m), the TaqMan Human MAPD Gene expression Assay (Hs_m) or ND probes and primers described previ- ously []. Cycling was performed in a BioRad iQ iCycler and results were an- alyzed with BioRad iQ Optical System Software v.. Dilution series were gen- erated to calculate probe efficiency and to validate analysis via the ΔΔCt or the

Pfaffl method [, ]. ree independent qPCR experiments were performed, each in triplicate.

B.. F N- M S 

C IV P D

Complex IV was immunoprecipitated from control and patient fibroblasts using

MitoSciences’ Complex IV Immunocatpure kit (MS-) following manufacturer’s instructions. .mg of purified mitochondria were solubilized with n-dodecyl-β-

D-maltoside (Sigma-Aldrich), pulled down with .uL of matrix in a total volume

 of .mL. Bound Complex IV was eluted with uL of  SDS solution.

S P

Complex IV was prepared as described above and dissolved in X LDS sample buffer (Invitrogen) with  mM dithiothreitol (Pierce) for  min at  ◦C, fol- lowed by  min at RT. Proteins were then alkylated with iodoaceteamide (Sigma) added to a final concentration of mM for min at RT in the dark. e complex was separated by gel electrophoresis on a NuPAGE - Bis-Tris gel according to the manufacturer’s instructions (Invitrogen). A band corresponding to the MW of

COI was excised and subjected to in-gel Lys-C digestion over night at  ◦C[].

Peptides were extracted twice from the gel with  acetonitrile,  formic acid.

e peptides were vacuum concentrated to dryness and resuspended in  ace- tonitrile,  formic acid for mass spectrometry analysis.

LC-MS/MS A

e peptides were separated using a homemade column packed with µm Reprosil

C resin (Dr. Maisch GmbH) and fashioned with a PicoFrit nanoelectrospray emitter (New Objective). e column was connected to a -Series nano-LC pump (Agilent) and coupled to a LTQ-Velos-Orbitrap mass spectrometer (er- moFisher). e following buffers were used for separation: Buffer A: . formic acid and Buffer B: . formic acid/ acetonitrile. A -minute separation was performed: - min at  B and . µl/min, - min at -. B and .

µl/min, - min at .- B and . µl/min, - min at - B and .

µl/min, - min at  B and . µl/min, - min at  B and . µl/min. e peptides (~ of digest) were ionized with a spray voltage of .kV. Each duty

 cycle in the  min data acquisition method included an Orbitrap survey scan -

 m/z range and , resolution, and LTQ-based data-dependent MS/MS of top  most abundant precursors in the survey followed by static events MS/MS for m/zs ., ., . corresponding to the z= states of the unformylated, formylated and des-Met species of the N-terminal peptide of COI (MFADRWLF-

STNHK). Dynamic exclusion was enabled for data dependent scans with the fol- lowing parameters: repeat count was set equal to , precursor m/z tolerance was

+/-.ppm and the exclusion duration was set at  seconds.

D A

Extracted ion chromatographs (XICs) were generated from the Orbitrap survey scans based on the z= states of the unformylated, formylated and des-Met species of the N-terminal peptide of COI (m/zs ., ., ., . ppm, respectively) using XCalibur software (ermoFisher Scientific). e identi- ties of the peaks corresponding to these species were verified using the accompa- nying static MS/MS spectra (Fig. S). e areas under these peaks were integrated using the Genesis peak detection algorithm included in XCalibur with all standard defaults. Peak areas were further normalized to the peak area of a distal peptide of

COI (VFSWLATLHGSNMK, m/z ., z=) to ensure that subsequent com- parisons among samples were based on an equivalent amount of COI.

 Figure B.1: mtDNA transcript analysis. The level of ND1 relative to HPRT ex- pression was analyzed by qRT-PCR. There is no reduction in ND1 transcription which could have accounted for the lack of ND1 translation. Error bars show ± 1 s.e.m of 3 qRT-PCR experiments, each performed in triplicate.

 Figure B.2: Additional support for mutation pathogenicity. (A) MUSCLE align- ment of the protein sequence from MTFMT orthologues shows strong conservation of p.S125 and p.R128 and moderate conservation of p.S209. Note that p.S209 lies within a solvent exposed region of a leucine zipper and is substituted solely by small polar residues. (B) MUSCLE alignment of the mRNA sequence from MTFMT or- thologues shows conservation of c.626C. (C) Location of patient variants homolo- gous to S125, R128 and S209 in the crystal structure of E. coli FMT are shown in purple[232]. Red indicates sequence that is deleted or frameshifted. (D) MTFMT expression relative to HPRT expression was analysed in patient fibroblasts using a probe across the exon 4/5 junction. P1 has only 9% of control expression in the ab- sence of CHX and P2 has 56%, consistent with essentially complete exon skipping of the c.626C>T containing allele. Error bars represent ±1 s.e.m of 3 independent qRT-PCR experiments, each performed in triplicate.

 Figure B.3: Analysis of PDF and MAP1D Expression, Related to Fig. 3.5. The level of PDF and MAP1D relative to HPRT expression was analyzed by qRT- PCR. There is no increase in steady state levels of PDF or MAP1D transcripts, sug- gesting the increased proportion of the deformylated N-terminal peptide of COX1 in patient fibroblasts is not due to increased PDF levels. Error bars represent ±1 s.e.m of 3 independent qRT-PCR experiments, each performed in triplicate.

 Bibliography

[] Initial sequencing and analysis of the human genome. Nature, ():–, February . ISSN -. URL http://dx.doi.org/./http://www.nature.com/ nature/journal/v/n/suppinfo/a_S.html.

[]  Genomes Project. A map of human genome variation from population-scale sequencing. Nature, ():– , October . ISSN -. URL http://dx.doi. org/./naturehttp://www.nature.com/nature/ journal/v/n/abs/.-nature-unlocked.html# supplementary-information.

[] Ritu Agarwal, Guodong (Gordon) Gao, Catherine DesRoches, and Ashish K Jha. Research Commentary—e Digital Transformation of Healthcare: Current Status and the Road Ahead. Information Systems Research, (): –, December . doi: ./isre... URL http://isr. journal.informs.org/content///.abstract.

[] A Agostino, L Valletta, P F Chinnery, G Ferrari, F Carrara, R W Tay- lor, A M Schaefer, D M Turnbull, V Tiranti, and M Zeviani. Mu- tations of ANT, Twinkle, and POLG in sporadic progressive ex- ternal ophthalmoplegia (PEO). Neurology, ():–, . URL http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd= Retrieve&db=PubMed&dopt=Citation&list_uids=.

[] Sara Al Rawi, Sophie Louvet-Vallée, Abderazak Djeddi, Martin Sachse, Em- manuel Culetto, Connie Hajjar, Lynn Boyd, Renaud Legouis, and Vin- cent Galy. Postfertilization Autophagy of Sperm Organelles Prevents Paternal Mitochondrial DNA Transmission. Science, ():– , November . doi: ./science.. URL http://www. sciencemag.org/content///.abstract.

[] Can Alkan, Jeffrey M Kidd, Tomas Marques-Bonet, Gozde Aksay, Francesca Antonacci, Fereydoun Hormozdiari, Jacob O Kitzman, Carl Baker, Maika

 Malig, Onur Mutlu, S Cenk Sahinalp, Richard A Gibbs, and Evan E Eich- ler. Personalized copy number and segmental duplication maps using next-generation sequencing. Nat Genet, ():–, October . ISSN -. URL http://dx.doi.org/./ng.http:// www.nature.com/ng/journal/v/n/suppinfo/ng._S.html. [] Can Alkan, Bradley P Coe, and Evan E Eichler. Genome structural variation discovery and genotyping. Nat Rev Genet, ():–, May . [] Kaushalya Amarasinghe, Jason Li, and Saman Halgamuge. CoNVEX: copy number variation estimation in exome sequencing data using HMM. BMC Bioinformatics, (Suppl ):S, . ISSN -. URL http://www. biomedcentral.com/-//S/S. [] Joanna Amberger, Carol A Bocchini, Alan F Scott, and Ada Hamosh. McKu- sick’s Online Mendelian Inheritance in Man (OMIM®). Nucleic Acids Research, (suppl ):D–D, January . doi: ./nar/ gkn. URL http://nar.oxfordjournals.org/content//suppl_ /D.abstract. [] S Anderson, A T Bankier, B G Barrell, M H de Bruijn, A R Coul- son, J Drouin, I C Eperon, D P Nierlich, B A Roe, F Sanger, P H Schreier, A J Smith, R Staden, and I G Young. Sequence and organiza- tion of the human mitochondrial genome. Nature, ():–, . URL http://www.ncbi.nlm.nih.gov/entrez/query.fcgi? cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=. [] Jeffrey A Bailey, Zhiping Gu, Royden A Clark, Knut Reinert, Rhea V Samonte, Stuart Schwartz, Mark D Adams, Eugene W Myers, Peter W Li, and Evan E Eichler. Recent Segmental Duplications in the Hu- man Genome. Science, ():–, August . doi: . /science.. URL http://www.sciencemag.org/content/ //.abstract. [] N P S Bajaj, A Waldman, R Orrell, N W Wood, and K P Bhatia. Familial adult onset of Krabbe’s disease resembling hereditary spastic paraplegia with normal neuroimaging. Journal of Neurology, Neurosurgery & Psychi- atry, ():–, May . doi: ./jnnp.... URL http: //jnnp.bmj.com/content///.abstract. [] Eleanor J Baldwin, F Brian Gibberd, Claire Harley, Margaret C Sidey, Michael D Feher, and Anthony S Wierzbicki. e effectiveness of long-term dietary therapy in the treatment of adult Refsum disease. Journal of Neu- rology, Neurosurgery & Psychiatry, ():–, September . doi:

 ./jnnp... URL http://jnnp.bmj.com/content/// .abstract.

[] Michael J Bamshad, Sarah B Ng, Abigail W Bigham, Holly K Ta- bor, Mary J Emond, Deborah A Nickerson, and Jay Shendure. Ex- ome sequencing as a tool for Mendelian disease gene discovery. Nat Rev Genet, ():–, November . ISSN -. URL http://dx.doi.org/./nrghttp://www.nature.com/ nrg/journal/v/n/suppinfo/nrg_S.html.

[] Vafa Bayat, Isabelle iffault, Manish Jaiswal, Martine Tétreault, Taraka Donti, Florin Sasarman, Geneviève Bernard, Julie Demers-Lamarche, Marie-Josée Dicaire, Jean Mathieu, Michel Vanasse, Jean-Pierre Bouchard, Marie-France Rioux, Charles M Lourenco, Zhihong Li, Claire Haueter, Eric A Shoubridge, Brett H Graham, Bernard Brais, and Hugo J Bellen. Mu- tations in the Mitochondrial Methionyl-tRNA Synthetase Cause a Neurode- generative Phenotype in Flies and a Recessive Ataxia (ARSAL) in Humans. PLoS Biol, ():e, March . URL http://dx.doi.org/. /journal.pbio..

[] Medical Genetics Laboratories Baylor College of Medicine. Mito- chondrial DNA Test Requisition, . URL https://www.bcm.edu/ geneticlabs/index.cfm?pmid=.

[] M Bektas, S G Payne, H Liu, S Goparaju, S Milstien, and S Spiegel. A novel acylglycerol kinase that produces lysophosphatidic acid modu- lates cross talk with EGFR in prostate cancer cells. J Cell Biol,  ():–, . doi: jcb.[pii]./jcb.. URL http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd= Retrieve&db=PubMed&dopt=Citation&list_uids=.

[] C J Bell, D L Dinwiddie, N A Miller, S L Hateley, E E Ganusova, J Mudge, R J Langley, L Zhang, C C Lee, F D Schilkey, V Sheth, J E Woodward, H E Peckham, G P Schroth, R W Kim, and S F Kingsmore. Carrier testing for severe childhood recessive diseases by next-generation sequencing. Sci Transl Med, ():ra, . doi: //ra[pii]./scitranslmed. . URL http://www.ncbi.nlm.nih.gov/entrez/query.fcgi? cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=.

[] D R Bentley, S Balasubramanian, H P Swerdlow, G P Smith, J Milton, C G Brown, K P Hall, D J Evers, C L Barnes, H R Bignell, J M Boutell, J Bryant, R J Carter, R Keira Cheetham, A J Cox, D J Ellis, M R Flatbush, N A Gorm- ley, S J Humphray, L J Irving, M S Karbelashvili, S M Kirk, H Li, X Liu,

 K S Maisinger, L J Murray, B Obradovic, T Ost, M L Parkinson, and M R Pratt. Accurate whole human genome sequencing using reversible termi- nator chemistry. Nature, :–, . [] Martin Bentz, Anja Plesch, Stephan Stilgenbauer, Hartmut Döhner, and Peter Lichter. Minimal sizes of deletions detected by comparative genomic hybridization. Genes, Chromosomes and Cancer, ():–, . [] B Benzacken, J M Lapierre, J P Siffroi, A Chalvon, and G Tachdjian. Identifi- cation and characterization of a de novo partial trisomy p by comparative genomic hybridization (CGH). Clinical Genetics, ():–, October . ISSN -. doi: ./j.-...x. URL http://dx.doi.org/./j.-...x. [] F P Bernier, A Boneh, X Dennett, C W Chow, M A Cleary, and D R orburn. Diagnostic criteria for respiratory chain disor- ders in adults and children. Neurology, ():–, . URL http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd= Retrieve&db=PubMed&dopt=Citation&list_uids=. [] Valentina Boeva, Tatiana Popova, Kevin Bleakley, Pierre Chiche, Julie Cappo, Gudrun Schleiermacher, Isabelle Janoueix-Lerosey, Olivier Delat- tre, and Emmanuel Barillot. Control-FREEC: a tool for assessing copy num- ber and allelic content using next-generation sequencing data . Bioinfor- matics, ():–, February . URL http://bioinformatics. oxfordjournals.org/content///.abstract. [] David Botstein and Neil Risch. Discovering genotypes underlying human phenotypes: past successes for mendelian disease, future approaches for complex disease. Nat Genet, oo. [] Damien L Bruno, Zornitza Stark, David J Amor, Trent Burgess, Kathy Butler, Sylvea Corrie, David Francis, Devika Ganesamoorthy, Louise Hills, Paul A James, Darren O’Rielly, Ralph Oertel, Ravi Savarirayan, Krishna- murthy Prabhakara, Nicholas Salce, and Howard R Slater. Extending the scope of diagnostic chromosome analysis: Detection of single gene defects using high-resolution SNP microarrays. Human Mutation, ():– , December . ISSN -. doi: ./humu.. URL http://dx.doi.org/./humu.. [] Paul R Buckland. Allele-specific gene expression differences in humans. Hu- man Molecular Genetics, (suppl ):R–R, October . doi: . /hmg/ddh. URL http://hmg.oxfordjournals.org/content/ /suppl_/R.abstract.

 [] Sandeep Burma, Benjamin P C Chen, and David J Chen. Role of non-homologous end joining (NHEJ) in maintaining genomic integrity. DNA Repair, (–):–, September . ISSN -. doi: http://dx.doi.org/./j.dnarep.... URL http://www. sciencedirect.com/science/article/pii/S.

[] S E Calvo, E J Tucker, A G Compton, D M Kirby, G Crawford, N P Burtt, M Rivas, C Guiducci, D L Bruno, O A Goldberger, M C Red- man, E Wiltshire, C J Wilson, D Altshuler, S B Gabriel, M J Daly, D R orburn, and V K Mootha. High-throughput, pooled sequencing identifies mutations in NUBPL and FOXRED in human complex I defi- ciency. Nat Genet, ():–, . doi: ng.[pii]./ng. . URL http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd= Retrieve&db=PubMed&dopt=Citation&list_uids=.

[] Sarah E Calvo, Alison G Compton, Steven G Hershman, Sze Chern Lim, Daniel S Lieber, Elena J Tucker, Adrienne Laskowski, Caterina Garone, Shangtao Liu, David B Jaffe, John Christodoulou, Janice M Fletcher, Damien L Bruno, Jack Goldblatt, Salvatore DiMauro, David R orburn, and Vamsi K Mootha. Molecular Diagnosis of Infantile Mitochondrial Dis- ease with Targeted Next-Generation Sequencing . Science Translational Medicine, ():ra–ra, January . URL http://stm. sciencemag.org/content///ra.abstract.

[] Peter J Campbell, Philip J Stephens, Erin D Pleasance, Sarah O’Meara, Heng Li, omas Santarius, Lucy A Stebbings, Catherine Leroy, Sarah Edkins, Claire Hardy, Jon W Teague, Andrew Menzies, Ian Goodhead, Daniel J Turner, Christopher M Clee, Michael A Quail, Antony Cox, Clive Brown, Richard Durbin, Matthew E Hurles, Paul A W Edwards, Graham R Bignell, Michael R Stratton, and P Andrew Futreal. Identification of somatically acquired rearrangements in cancer using genome-wide massively parallel paired-end sequencing. Nat Genet, ():–, June . ISSN - . URL http://dx.doi.org/./ng.http://www.nature. com/ng/journal/v/n/suppinfo/ng._S.html.

[] Valerio Carelli, Carla Giordano, and Giulia D’Amati. Pathogenic ex- pression of homoplasmic mtDNA mutations needs a complex nuclear– mitochondrial interaction. Trends in Genetics, ():–, May . ISSN -. doi: http://dx.doi.org/./S-() -. URL http://www.sciencedirect.com/science/article/ pii/S.

 [] Christopher J Carroll, Pirjo Isohanni, Rosanna Pöyhönen, Liliya Euro, Uwe Richter, Virginia Brilhante, Alexandra Götz, Taina Lahtinen, Anders Pae- tau, Helena Pihko, Brendan J Battersby, Henna Tyynismaa, and Anu Suo- malainen. Whole-exome sequencing identifies a mutation in the mitochon- drial ribosome protein MRPL to underlie mitochondrial infantile car- diomyopathy. Journal of Medical Genetics, ():–, March . doi: ./jmedgenet--. URL http://jmg.bmj.com/content/ //.abstract. [] N P Carter. Methods and strategies for analyzing copy number variation using DNA microarrays. Nat Genet, ( Suppl):S–S, . [] C Casasnovas, I Banchs, J Cassereau, N Gueguen, A Chevrollier, J A Martínez-Matos, D Bonneau, and V Volpini. Phenotypic spectrum of MFN mutations in the Spanish population. Journal of Medical Genetics, (): –, April . doi: ./jmg... URL http://jmg. bmj.com/content///.abstract. [] T Caspersson, L Zech, and C Johansson. Differential binding of alkylating fluorochromes in human chromosomes. Experimental Cell Research, (): –, June . [] Patricia B S Celestino-Soper, Chad A Shaw, Stephan J Sanders, Jian Li, Michael T Murtha, A Gulhan Ercan-Sencicek, Lea Davis, Susanne omson, Tomasz Gambin, A Craig Chinault, Zhishuo Ou, Jennifer R German, Alek- sandar Milosavljevic, James S Sutcliffe, Edwin H Cook, Pawel Stankiewicz, Matthew W State, and Arthur L Beaudet. Use of array CGH to detect ex- onic copy number variants throughout the genome in autism families de- tects a novel deletion in TMLHE. Human Molecular Genetics, ():– , November . doi: ./hmg/ddr. URL http://hmg. oxfordjournals.org/content///.abstract. [] Ken Chen, John W Wallis, Michael D McLellan, David E Larson, Joelle M Kalicki, Craig S Pohl, Sean D McGrath, Michael C Wendl, Qunyuan Zhang, Devin P Locke, Xiaoqi Shi, Robert S Fulton, Timo- thy J Ley, Richard K Wilson, Li Ding, and Elaine R Mardis. Break- Dancer: an algorithm for high-resolution mapping of genomic struc- tural variation. Nat Meth, ():–, September . ISSN - . URL http://dx.doi.org/./nmeth.http://www. nature.com/nmeth/journal/v/n/suppinfo/nmeth._S.html. [] Qun Chen, Edwin J Vazquez, Shadi Moghaddas, Charles L Hoppel, and Ed- ward J Lesnefsky. Production of Reactive Oxygen Species by Mitochon- dria: CENTRAL ROLE OF COMPLEX III. Journal of Biological Chemistry, 

 ():–, September . doi: ./jbc.M. URL http://www.jbc.org/content///.abstract.

[] Rui Chen, George I. Mias, Jennifer Li-Pook-an, Lihua Jiang, Hugo Y.K. Lam, Rong Chen, Elana Miriami, Konrad J. Karczewski, Manoj Hari- haran, Frederick E. Dewey, Yong Cheng, Michael J. Clark, Hogune Im, Lukas Habegger, Suganthi Balasubramanian, Maeve O’Huallachain, Joel T. Dudley, Sara Hillenmeyer, Rajini Haraksingh, Donald Sharon, Ghia Eu- skirchen, Phil Lacroute, Keith Bettinger, Alan P. Boyle, Maya Kasowski, Fabian Grubert, Scott Seki, Marco Garcia, Michelle Whirl-Carrillo, Mer- cedes Gallardo, Maria A. Blasco, Peter L. Greenberg, Phyllis Snyder, Teri E. Klein, Russ B. Altman, Atul J Butte, Euan A. Ashley, Mark Gerstein, Kari C. Nadeau, Hua Tang, and Michael Snyder. Personal Omics Profil- ing Reveals Dynamic Molecular and Medical Phenotypes. Cell, (): –, March . URL http://linkinghub.elsevier.com/ retrieve/pii/S.

[] Wanting Chen. Copy Number Variants in the human genome and their asso- ciation with quantitative traits. PhD thesis, University of Edinburgh, . URL http://hdl.handle.net//.

[] Xiulian Chen, David R orburn, Lee-Jun Wong, Georgirene D Vladutiu, Richard H Haas, uy Le, Charles Hoppel, Margaret Sedensky, Philip Mor- gan, and Si Houn Hahn. Quality improvement of mitochondrial respira- tory chain complex enzyme assays using Caenorhabditis elegans. Genet Med, ():–, September . ISSN -. URL http: //dx.doi.org/./GIM.beafca.

[] Derek Y Chiang, Gad Getz, David B Jaffe, Michael J T O’Kelly, Xiaojun Zhao, Scott L Carter, Carsten Russ, Chad Nusbaum, Matthew Meyerson, and Eric S Lander. High-resolution map- ping of copy-number alterations with massively parallel sequenc- ing. Nat Meth, ():–, January . ISSN -. URL http://dx.doi.org/./nmeth.http://www.nature.com/ nmeth/journal/v/n/suppinfo/nmeth._S.html.

[] A Craig Chinault, Chad A Shaw, Ellen K Brundage, Lin-Ya Tang, and Lee- Jun C Wong. Application of dual-genome oligonucleotidearray-based com- parative genomic hybridization to the molecular diagnosis of mitochon- drial DNA deletion and depletion syndromes. Genet Med, ():–, July . ISSN -. URL http://dx.doi.org/./GIM. beabdc.

 [] Patrick F Chinnery and Douglass M Turnbull. Epidemiology and treatment of mitochondrial disorders. American Journal of Medical Genetics, (): –, March . ISSN -. doi: ./ajmg.. URL http://dx.doi.org/./ajmg..

[] M Choi, U I Scholl, W Ji, T Liu, I R Tikhonova, P Zumbo, A Nayir, A Bakkaloglu, S Ozen, S Sanjad, C Nelson-Williams, A Farhi, S Mane, and R P Lifton. Genetic diagnosis by whole exome capture and mas- sively parallel DNA sequencing. Proc Natl Acad Sci U S A, (): –, . doi: [pii]./pnas.. URL http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd= Retrieve&db=PubMed&dopt=Citation&list_uids=.

[] H Christomanou, S Jaffé, J Martinius, C Cáp, and Betke K. Biochemical, genetic, psychometric, and neuropsychological studies in heterozygotes of a family with globoid cell leucodystrophy (Krabbe’s disease). Hum Genet., ():–, .

[] Tomáš Cipra. Exponential smoothing for irregular data. Applications of Mathematics, ():–, . URL http://dml.cz/bitstream/ handle/.dmlcz//AplMat_--_.pdf.

[] M J Coenen, H Antonicka, C Ugalde, F Sasarman, R Rossi, J G Heis- ter, R F Newbold, F J Trijbels, L P van den Heuvel, E A Shoubridge, and J A Smeitink. Mutant mitochondrial elongation factor G and combined oxidative phosphorylation deficiency. N Engl J Med,  ():–, . doi: //[pii]./NEJMoa. URL http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd= Retrieve&db=PubMed&dopt=Citation&list_uids=.

[] Alison J Coffey, Felix Kokocinski, Maria S Calafato, Carol E Scott, Priit Palta, Eleanor Drury, Christopher J Joyce, Emily M LeProust, Jen Har- row, Sarah Hunt, Anna-Elina Lehesjoki, Daniel J Turner, Tim J Hubbard, and Aarno Palotie. e GENCODE exome: sequencing the complete hu- man exome. Eur J Hum Genet, ():–, July . ISSN - . URL http://dx.doi.org/./ejhg..http://www. nature.com/ejhg/journal/v/n/suppinfo/ejhgs.html.

[] Jon Cohen. DNA Duplications and Deletions Help Determine Health. Science, ():–, September . doi: ./science. ... URL http://www.sciencemag.org/content/// .short.

 [] Le Cong, F Ann Ran, David Cox, Shuailiang Lin, Robert Barretto, Naomi Habib, Patrick D Hsu, Xuebing Wu, Wenyan Jiang, Luciano A Marraffini, and Feng Zhang. Multiplex Genome Engineering Using CRISPR/Cas Sys- tems. Science, ():–, February . doi: ./science. . URL http://www.sciencemag.org/content///. abstract.

[] D F Conrad, D Pinto, R Redon, L Feuk, O Gokcumen, Y Zhang, J Aerts, T D Andrews, C Barnes, P Campbell, T Fitzgerald, M Hu, C H Ihm, K Kristians- son, D G Macarthur, J R Macdonald, I Onyiah, A W Pang, S Robson, K Stir- rups, A Valsesia, K Walter, J Wei, Wellcome Trust Case Control Consortium, C Tyler-Smith, N P Carter, C Lee, S W Scherer, and M E Hurles. Origins and functional impact of copy number variation in the human genome. Nature, :–, .

[] Donald F Conrad, Dalila Pinto, Richard Redon, Lars Feuk, Omer Gokc- umen, Yujun Zhang, Jan Aerts, T Daniel Andrews, Chris Barnes, Peter Campbell, Tomas Fitzgerald, Min Hu, Chun Hwa Ihm, Kati Kristiansson, Daniel G MacArthur, Jeffrey R MacDonald, Ifejinelo Onyiah, Andy Wing Chun Pang, Sam Robson, Kathy Stirrups, Armand Valsesia, Klaudia Walter, John Wei, Chris Tyler-Smith, Nigel P Carter, Charles Lee, Stephen W Scherer, and Matthew E Hurles. Origins and functional impact of copy number variation in the human genome. Nature, ():–, April . ISSN -. URL http://dx.doi.org/./naturehttp://www.nature. com/nature/journal/v/n/suppinfo/nature_S.html.

[] Gregory M Cooper, Troy Zerr, Jeffrey M Kidd, Evan E Eichler, and Deborah A Nickerson. Systematic assessment of copy num- ber variant detection via genome-wide SNP genotyping. Nat Genet, ():–, October . ISSN -. URL http://dx.doi.org/./ng.http://www.nature.com/ng/ journal/v/n/suppinfo/ng._S.html.

[] S T Cooper, H P Lo, and K N North. Single section Western blot: improving the molecular diagnosis of the muscular dystrophies. Neurology, ():– , . URL http://www.ncbi.nlm.nih.gov/entrez/query.fcgi? cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=.

[] Winnie Courtens, Wim Wuyts, Stefaan Scheers, Rob Van Luijk, Ed- win Reyniers, Liesbeth Rooms, Berten Ceulemans, Frank Kooy, and Jan Wauters. A de novo subterminal trisomy p and monosomy q in a girl

 with MCA/MR: case report and review. European Journal of Medical Genet- ics, ():–, September . ISSN -. doi: http://dx. doi.org/./j.ejmg.... URL http://www.sciencedirect. com/science/article/pii/S.

[] T Cremer, P Lichter, J Borden, D C Ward, and L Manuelidis. Detection of chromosome aberrations in metaphase and interphase tumor cells by in situ hybridization using chromosome-specific library probes. Human Ge- netics, ():–, November . doi: ./BF.

[] Kim Cryns, Markus Pfister, RonaldJ. Pennings, StevenJ. Bom, Kris Floth- mann, Goele Caethoven, Hannie Kremer, Isabelle Schatteman, KarenA. Köln, Tímea Tóth, Susan Kupka, Nikolaus Blin, Peter Nürnberg, Holger iele, PaulH. Heyning, William Reardon, Dafydd Stephens, CorW. Cre- mers, RichardJ. Smith, and Guy Camp. Mutations in the WFS gene that cause low-frequency sensorineural hearing loss are small non-inactivating mutations. Human Genetics, ():–, . ISSN -. doi: ./s---. URL http://dx.doi.org/./ s---.

[] Hong Cui, Fangyuan Li, David Chen, Guoli Wang, Cavatina K Truong, Gregory M Enns, Brett Graham, Margherita Milone, Megan L Landsverk, Jing Wang, Wei Zhang, and Lee-Jun C Wong. Comprehensive next- generation sequence analyses of the entire mitochondrial genome reveal new insights into the molecular diagnosis of mitochondrial DNA disorders. Genet Med, January . ISSN -. URL http://dx.doi.org/./gim..http://www.nature. com/gim/journal/vaop/ncurrent/suppinfo/gims.html.

[] Erica E Davis, Qi Zhang, Qin Liu, Bill H Diplas, Lisa M Davey, Jane Hartley, Corinne Stoetzel, Katarzyna Szymanska, Gokul Ramaswami, Clare V Logan, Donna M Muzny, Alice C Young, David A Wheeler, Pe- dro Cruz, Margaret Morgan, Lora R Lewis, Praveen Cherukuri, Baishali Maskeri, Nancy F Hansen, James C Mullikin, Robert W Blakesley, Ger- ard G Bouffard, Gabor Gyapay, Susanne Rieger, Burkhard Tonshoff, Ilse Kern, Neveen A Soliman, omas J Neuhaus, Kathryn J Swo- boda, Hulya Kayserili, Tomas E Gallagher, Richard A Lewis, Carsten Bergmann, Edgar A Otto, Sophie Saunier, Peter J Scambler, Philip L Beales, Joseph G Gleeson, Eamonn R Maher, Tania Attie-Bitach, Helene Dollfus, Colin A Johnson, Eric D Green, Richard A Gibbs, Friedhelm Hildebrandt, Eric A Pierce, and Nicholas Katsanis. TTCB contributes both causal and modifying alleles across the ciliopathy spectrum.

 Nat Genet, ():–, March . ISSN -. URL http://dx.doi.org/./ng.http://www.nature.com/ng/ journal/v/n/abs/ng..html#supplementary-information. [] Robert L Davis, Harold Weintraub, and Andrew B Lassar. Expression of a single transfected cDNA converts fibroblasts to myoblasts. Cell, (): –, December . ISSN -. doi: http://dx.doi.org/. /-()-X. URL http://www.sciencedirect.com/ science/article/pii/X. [] R De Gasperi, M A Gama Sosa, E L Sartorato, S Battistini, H Mac- Farlane, J F Gusella, W Krivit, and E H Kolodny. Molecular het- erogeneity of late-onset forms of globoid-cell leukodystrophy. e American Journal of Human Genetics, ():–, . URL http://www.pubmedcentral.nih.gov/articlerender.fcgi? artid=&tool=pmcentrez&rendertype=abstract. [] P de Lonlay, I Valnot, A Barrientos, M Gorbatyuk, A Tzagoloff, J W Taanman, E Benayoun, D Chretien, N Kadhom, A Lombes, H O de Baulny, P Niaudet, A Munnich, P Rustin, and A Rotig. A mutant mitochondrial respiratory chain assembly protein causes complex III deficiency in patients with tubulopathy, encephalopathy and liver fail- ure. Nat Genet, ():–, . doi: ./ngng[pii]. URL http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd= Retrieve&db=PubMed&dopt=Citation&list_uids=. [] Xutao Deng. SeqGene: a comprehensive software solution for mining exome- and transcriptome- sequencing data. BMC Bioinformatics, (): , . ISSN -. URL http://www.biomedcentral.com/ -//. [] S Dimauro and G Davidzon. Mitochondrial DNA and disease. Ann Med, ():–, . doi: RRRU[pii]./ . URL http://www.ncbi.nlm.nih.gov/entrez/ query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_ uids=. [] Salvatore DiMauro and Eric A Schon. Mitochondrial Respiratory-Chain Diseases. New England Journal of Medicine, ():–, June . ISSN -. doi: ./NEJMra. URL http://dx.doi. org/./NEJMra. [] Salvatore DiMauro, Michio Hirano, and Eric A Schon. Mitochondrial Medicine. Informa Healthcare, New York, .

 [] Robert C Edgar. MUSCLE: multiple sequence alignment with high accu- racy and high throughput. Nucleic Acids Research, ():–, March . doi: ./nar/gkh. URL http://nar.oxfordjournals. org/content///.abstract. [] Jenni M Elo, Srujana S Yadavalli, Liliya Euro, Pirjo Isohanni, Alexandra Götz, Christopher J Carroll, Leena Valanne, Fowzan S Alkuraya, Johanna Uusimaa, Anders Paetau, Eric M Caruso, Helena Pihko, Michael Ibba, Henna Tyynismaa, and Anu Suomalainen. Mitochondrial phenylalanyl- tRNA synthetase mutations underlie fatal infantile Alpers encephalopa- thy. Human Molecular Genetics, ():–, October . doi: . /hmg/dds. URL http://hmg.oxfordjournals.org/content/ //.abstract. [] Kathrin Engelfried, Matthias Vorgerd, Michaela Hagedorn, Gerhard Haas, Jurgen Gilles, Jorg Epplen, and Moritz Meins. Charcot-Marie-Tooth neu- ropathy type A: novel mutations in the mitofusin  gene (MFN). BMC Medical Genetics, ():, . ISSN -. URL http://www. biomedcentral.com/-//. [] JA Enríquez and G Attardi. Analysis of aminoacylation of human mito- chondrial tRNAs. Methods Enzymol., :–, . [] Sindy Escobar-Alvarez, Jeffrey Gardner, Aneesh Sheth, Giovanni Manfredi, Guangli Yang, Ouathek Ouerfelli, Mark L Heaney, and David A Scheinberg. Inhibition of Human Peptide Deformylase Disrupts Mitochondrial Func- tion. Molecular and Cellular Biology, ():–, . doi: . /MCB.-. URL http://mcb.asm.org/content///. abstract. [] William G Fairbrother, Ru-Fang Yeh, Phillip A Sharp, and Christopher B Burge. Predictive Identification of Exonic Splicing Enhancers in Human Genes. Science, ():–, August . doi: ./science. . URL http://www.sciencemag.org/content///. abstract. [] Marni J. Falk, Eric A. Pierce, Mark Consugar, Michael H. Xie, Moraima Guadalupe, Owen Hardy, Eric F. Rappaport, Douglas C. Wallace, Emily LeP- roust, and Xiaowu Gai. Mitochondrial Disease Genetic Diagnostics: Opti- mized Whole-Exome Analysis for All MitoCarta Nuclear Genes and the Mi- tochondrial Genome. Discov Med, ():–, . [] Sacha Ferdinandusse, Simone Denis, Petra A W Mooyer, Conny Dekker, Marinus Duran, Roelineke J Soorani-Lunsing, Eugen Boltshauser, Alfons

 Macaya, Jutta Gärtner, Charles B L M Majoie, Peter G Barth, Ronald J A Wanders, and Bwee Tien Poll-e. Clinical and biochemical spectrum of D-bifunctional protein deficiency. Annals of Neurology, ():–, Jan- uary . ISSN -. doi: ./ana.. URL http://dx. doi.org/./ana..

[] Lars Feuk, Andrew R Carson, and Stephen W Scherer. Structural variation in the human genome. Nat Rev Genet, ():–, February .

[] Sheila Fisher, Andrew Barry, Justin Abreu, Brian Minie, Jillian Nolan, Toni Delorey, Geneva Young, Timothy Fennell, Alexander Allen, Lauren Ambrogio, Aaron Berlin, Brendan Blumenstiel, Kristian Cibulskis, Dennis Friedrich, Ryan Johnson, Frank Juhn, Brian Reilly, Ramy Shammas, John Stalker, Sean Sykes, Jon ompson, John Walsh, Andrew Zimmer, Zac Zwirko, Stacey Gabriel, Robert Nicol, and Chad Nusbaum. A scalable, fully automated process for construction of sequence-ready human exome tar- geted capture libraries. Genome Biology, ():R, . ISSN -. URL http://genomebiology.com////R.

[] Lars Anders Forsberg, Devin Absher, and Jan Piotr Dumanski. Non- heritable genetics of human disease: spotlight on post-zygotic genetic vari- ation acquired during lifetime. Journal of Medical Genetics, ():–, Jan- uary . doi: ./jmedgenet--. URL http://jmg.bmj. com/content///.abstract.

[] J Fridlyand. Hidden Markov models approach to the analysis of array CGH data. Special Issue on Multivariate Methods in Genomic Data Analysis, (): –, July . doi: doi:./j.jmva....

[] Menachem Fromer, Jennifer L. Moran, Kimberly Chambert, Eric Banks, Sarah E. Bergen, Douglas M. Ruderfer, Robert E. Handsaker, Steven A. Mc- Carroll, Michael C. ODonovan, Michael J. Owen, George Kirov, Patrick F. Sullivan, Christina M. Hultman, Pamela Sklar, and Shaun M. Purcell. Dis- covery and Statistical Genotyping of Copy-Number Variation from Whole- Exome Sequencing Depth, October . URL http://linkinghub. elsevier.com/retrieve/pii/SX.

[] Jean-Pierre Fryns, K Vandenberghe, and D Deschrijver. Early urethral ob- struction sequence and unbalanced translocation with terminal p dupli- cation/p deficiency. Genetic Counseling, ():–, .

[] Pauline A Fujita, Brooke Rhead, Ann S Zweig, Angie S Hinrichs, Donna Karolchik, Melissa S Cline, Mary Goldman, Galt P Barber, Hiram Claw-

 son, Antonio Coelho, Mark Diekhans, Timothy R Dreszer, Belinda M Giar- dine, Rachel A Harte, Jennifer Hillman-Jackson, Fan Hsu, Vanessa Kirkup, Robert M Kuhn, Katrina Learned, Chin H Li, Laurence R Meyer, Andy Pohl, Brian J Raney, Kate R Rosenbloom, Kayla E Smith, David Haussler, and W James Kent. e UCSC Genome Browser database: update . Nucleic Acids Research, (suppl ):D–D, January . doi: ./nar/ gkq. URL http://nar.oxfordjournals.org/content//suppl_ /D.abstract.

[] Louise Galmiche, Valérie Serre, Marine Beinat, Zahra Assouline, Anne- Sophie Lebre, Dominique Chretien, Patrick Nietschke, Vladimir Benes, Nathalie Boddaert, Daniel Sidi, Francis Brunelle, Marlène Rio, Arnold Munnich, and Agnès Rötig. Exome sequencing identifies MRPL muta- tion in mitochondrial cardiomyopathy. Human Mutation, ():– , November . ISSN -. doi: ./humu.. URL http://dx.doi.org/./humu..

[] Beatriz Garcia-Diaz, Mario H. Barros, Simone Sanna-Cherchi, Valentina Emmanuele, Hasan O. Akman, Claudia C. Ferreiro-Barros, Rita Horvath, Saba Tadesse, Nader El Gharaby, Salvatore DiMauro, Darryl C. De Vivo, Aly Shokr, Michio Hirano, and Catarina M. Quinzii. Infantile Encephaloneu- romyopathy and Defective Mitochondrial Translation Are Due to a Ho- mozygous RMND Mutation. e American Journal of Human Genetics, ():–, October . ISSN -. doi: http://dx.doi. org/./j.ajhg.... URL http://www.sciencedirect.com/ science/article/pii/S.

[] Frank Norbert Gellerich, Johannes A Mayr, Sebastian Reuter, Wolfgang Sperl, and Stephan Zierz. e problem of interlab variation in meth- ods for mitochondrial disease diagnosis: enzymatic measurement of res- piratory chain complexes. Mitochondrion, (–):–, September . ISSN -. doi: http://dx.doi.org/./j.mito... . URL http://www.sciencedirect.com/science/article/pii/ S.

[] GeneDX. Next-Generation Sequence Analysis of  Nuclear Genes Impor- tant for Normal Mitochondrial Function. URL www.genedx.com/pdf_ files/Nuclear_panel__gene_descriptions.pdf.

[] Christian Gilissen, Alexander Hoischen, Han G Brunner, and Joris A Velt- man. Disease gene identification strategies for exome sequencing. Eur J Hum Genet, ():–, May . ISSN -. URL http: //dx.doi.org/./ejhg...

 [] Santhosh Girirajan, Catarina D Campbell, and Evan E Eichler. Human Copy Number Variation and Complex Genetic Disease. Annual Review of Genetics, ():–, November . ISSN -. doi: ./annurev-genet--. URL http://dx.doi.org/. /annurev-genet--.

[] Sante Gnerre, Iain MacCallum, Dariusz Przybylski, Filipe J Ribeiro, Joshua N Burton, Bruce J Walker, Ted Sharpe, Giles Hall, Terrance P Shea, Sean Sykes, Aaron M Berlin, Daniel Aird, Maura Costello, Riza Daza, Louise Williams, Robert Nicol, Andreas Gnirke, Chad Nusbaum, Eric S Lander, and David B Jaffe. High-quality draft assemblies of mammalian genomes from massively parallel sequence data. Proceedings of the National Academy of Sci- ences, ():–, January . doi: ./pnas.. URL http://www.pnas.org/content///.abstract.

[] A Gnirke, A Melnikov, J Maguire, P Rogov, E M LeProust, W Brockman, T Fennell, G Giannoukos, S Fisher, C Russ, S Gabriel, D B Jaffe, E S Lander, and C Nusbaum. Solution hybrid selection with ultra-long oligonucleotides for massively parallel targeted sequencing. Nat Biotechnol, ():–, . URL http://www.ncbi.nlm.nih.gov/entrez/query.fcgi? cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=.

[] Alexandra Götz, Henna Tyynismaa, Liliya Euro, Pekka Ellonen, Tuulia Hyötyläinen, Tiina Ojala, Riikka H. Hämäläinen, Johanna Tommiska, Taneli Raivio, Matej Oresic, Riitta Karikoski, Outi Tammela, Kalle O.J. Simola, Anders Paetau, Tiina Tyni, and Anu Suomalainen. Exome Sequenc- ing Identifies Mitochondrial Alanyl-tRNA Synthetase Mutations in Infan- tile Mitochondrial Cardiomyopathy. American journal of human genetics, ():–, May . URL http://linkinghub.elsevier.com/ retrieve/pii/S.

[] T B Haack, K Danhauser, B Haberberger, J Hoser, V Strecker, D Boehm, G Uziel, E Lamantea, F Invernizzi, J Poulton, B Rolinski, A Iuso, S Biskup, T Schmidt, H W Mewes, I Wittig, T Meitinger, M Zeviani, and H Prokisch. Exome sequencing identifies ACAD mutations as a cause of complex I de- ficiency. Nat Genet, ():–, . doi: ng.[pii]./ng. . URL http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd= Retrieve&db=PubMed&dopt=Citation&list_uids=.

[] Tobias B Haack, Birgit Haberberger, Eva-Maria Frisch, omas Wieland, Arcangela Iuso, Matteo Gorza, Valentina Strecker, Elisabeth Graf, Jo- hannes A Mayr, Ulrike Herberg, Julia B Hennermann, omas Klop- stock, Klaus A Kuhn, Uwe Ahting, Wolfgang Sperl, Ekkehard Wilichowski,

 Georg F Hoffmann, Marketa Tesarova, Hana Hansikova, Jiri Zeman, Bar- bara Plecko, Massimo Zeviani, Ilka Wittig, Tim M Strom, Markus Schuelke, Peter Freisinger, omas Meitinger, and Holger Prokisch. Molecular di- agnosis in mitochondrial complex I deficiency using exome sequencing. Journal of Medical Genetics, ():–, April . doi: ./ jmedgenet--. URL http://jmg.bmj.com/content/// .abstract. [] Richard H Haas, Sumit Parikh, Marni J Falk, Russell P Saneto, Nicole I Wolf, Niklas Darin, and Bruce H Cohen. Mitochondrial Disease: A Practical Ap- proach for Primary Care Physicians. Pediatrics, ():–. doi: ./peds.-. URL http://pediatrics.aappublications. org/content///.abstract. [] Iman Hajirasouliha, Fereydoun Hormozdiari, Can Alkan, Jeffrey M Kidd, Inanc Birol, Evan E Eichler, and S Cenk Sahinalp. Detection and characterization of novel sequence insertions using paired-end next- generation sequencing. Bioinformatics, ():–, May . doi: ./bioinformatics/btq. URL http://bioinformatics. oxfordjournals.org/content///.abstract. [] Robert E Handsaker, Joshua M Korn, James Nemesh, and Steven A Mc- Carroll. Discovery and genotyping of genome structural polymorphism by sequencing on a population scale. Nat Genet, ():–, March . ISSN -. URL http://dx.doi.org/./ng.http: //www.nature.com/ng/journal/v/n/abs/ng..html# supplementary-information. [] Klaus Harzer, Rupert Knoblich, Arndt Rolfs, Peter Bauer, and Jür- gen Eggers. Residual galactosylsphingosine (psychosine) β-galactosidase activities and associated GALC mutations in late and very late on- set Krabbe disease. Clinica Chimica Acta, (–):–, March . ISSN -. doi: http://dx.doi.org/./S-() -. URL http://www.sciencedirect.com/science/article/ pii/S. [] Einat Hazkani-Covo, Raymond M Zeller, and William Martin. Molecu- lar Poltergeists: Mitochondrial DNA Copies (numts) in Se- quenced Nuclear Genomes. PLoS Genet, ():e, February . URL http://dx.doi.org/./journal.pgen.. [] Langping He, Patrick F Chinnery, Steve E Durham, Emma L Blakely, eresa M Wardell, Gillian M Borthwick, Robert W Taylor, and Dou- glass M Turnbull. Detection and quantification of mitochondrial DNA

 deletions in individual cells by real￿time PCR. Nucleic Acids Research,  ():e–e, July . doi: ./nar/gnf. URL http://nar. oxfordjournals.org/content///e.abstract.

[] Yiping He, Jian Wu, Devin C Dressman, Christine Iacobuzio-Donahue, Sanford D Markowitz, Victor E Velculescu, Luis A Diaz Jr, Kenneth W Kinzler, Bert Vogelstein, and Nickolas Papadopoulos. Heteroplasmic mitochondrial DNA mutations in normal and tumour cells. Na- ture, ():–, March . ISSN -. URL http://dx.doi.org/./naturehttp://www.nature. com/nature/journal/v/n/suppinfo/nature_S.html.

[] Mervi Heiskanen, Leena Peltonen, and Aarno Palotie. Visual mapping by high resolution FISH. Trends in Genetics, ():–, October . ISSN -. URL http://www.sciencedirect.com/science/ article/pii/.

[] H H Heng, J Squire, and L C Tsui. High-resolution mapping of mammalian genes by in situ hybridization to free chromatin . Proceedings of the National Academy of Sciences, ():–, October .

[] L. Hirt, P. J. Magistretti, J. Bogousslavsky, O. Boulat, and F. X. Borruat. Large deletion (. kb) of mitochondrial DNA with novel boundaries in a case of progressive external ophthalmoplegia. J. Neurol. Neurosurg. Psychi- atry, ():–, .

[] Marc-Phillip Hitz, Louis-Philippe Lemieux-Perreault, Christian Marshall, Yassamin Feroz-Zada, Robbie Davies, Shi Wei Yang, Anath Christopher Li- onel, Guylaine D’Amours, Emmanuelle Lemyre, Rebecca Cullum, Jean-Luc Bigras, Maryse ibeault, Philippe Chetaille, Alexandre Montpetit, Paul Khairy, Bert Overduin, Sabine Klaassen, Pamela Hoodless, Mona Nemer, Alexandre F R Stewart, Cornelius Boerkoel, Stephen W Scherer, Andrea Richter, Marie-Pierre Dubé, and Gregor Andelfinger. Rare Copy Number Variants Contribute to Congenital Left-Sided Heart Disease. PLoS Genet, ():e, September . URL http://dx.doi.org/./ journal.pgen..

[] Fereydoun Hormozdiari, Can Alkan, Evan E Eichler, and S Cenk Sahi- nalp. Combinatorial algorithms for structural variation detection in high- throughput sequenced genomes. Genome Research, ():–, July . doi: ./gr... URL http://genome.cshlp.org/ content///.abstract.

 [] Fang-Han Hsu, Hung-I Chen, Mong-Hsun Tsai, Liang-Chuan Lai, Chi- Cheng Huang, Shih-Hsin Tu, Eric Chuang, and Yidong Chen. A model- based circular binary segmentation algorithm for the analysis of array CGH data. BMC Research Notes, ():, . ISSN -. URL http: //www.biomedcentral.com/-//.

[] Li Hsu, Steven G Self, Douglas Grove, Tim Randolph, Kai Wang, Jeffrey J Delrow, Lenora Loo, and Peggy Porter. Denoising array-based compara- tive genomic hybridization data using wavelets. Biostatistics, ():–, April .

[] Timothy R Hughes, Matthew J Marton, Allan R Jones, Christopher J Roberts, Roland Stoughton, Christopher D Armour, Holly A Bennett, Ernest Coffey, Hongyue Dai, Yudong D He, Matthew J Kidd, Amy M King, Michael R Meyer, David Slade, Pek Y Lum, Sergey B Stepaniants, Daniel D Shoemaker, Daniel Gachotte, Kalpana Chakraburtty, Julian Simon, Mar- tin Bard, and Stephen H Friend. Functional Discovery via a Compendium of Expression Profiles, July . URL http://linkinghub.elsevier. com/retrieve/pii/S.

[] A John Iafrate, Lars Feuk, Miguel N Rivera, Marc L Listewnik, Patricia K Donahoe, Ying Qi, Stephen W Scherer, and Charles Lee. Detection of large- scale variation in the human genome. Nat Genet, ():–, Septem- ber .

[] M Ingman and U Gyllensten. mtDB: Human Mitochondrial Genome Database, a resource for population genetics and medi- cal sciences. Nucleic Acids Res, (Database issue):D–, . URL http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd= Retrieve&db=PubMed&dopt=Citation&list_uids=.

[] Andy Itsara, Gregory M Cooper, Carl Baker, Santhosh Girirajan, Jun Li, Devin Absher, Ronald M Krauss, Richard M Myers, Paul M Ridker, Daniel I Chasman, Heather Mefford, Phyllis Ying, Deborah A Nickerson, and Evan E Eichler. Population Analysis of Large Copy Number Variants and Hotspots of Human Genetic Disease. e American Journal of Human Genetics, ():–, February . ISSN -. URL http://www. sciencedirect.com/science/article/pii/S.

[] Andy Itsara, Hao Wu, Joshua D Smith, Deborah A Nickerson, Isabelle Romieu, Stephanie J London, and Evan E Eichler. De novo rates and selec- tion of large copy number variation. Genome Research, ():–, November . doi: ./gr...

 [] Gerbert A Jansen, Rob Oftnan, Sacha Ferdinandusse, Lodewijk Ijlst, An- ton O Muijsers, Ola H Skjeldal, Oddvar Stokke, Cornells Jakobs, Guy T N Besley, J Ed Wraith, and Ronald J A Wanders. Refsum disease is caused by mutations in the phytanoyl-CoA hydroxylase gene. Nat Genet, ():– , October . URL http://dx.doi.org/./ng-.

[] Gerbert A Jansen, Eveline M Hogenhout, Sacha Ferdinandusse, Hans R Wa- terham, Rob Ofman, Cornelis Jakobs, Ola H Skjeldal, and Ronald J A Wan- ders. Human phytanoyl-CoA hydroxylase: resolution of the gene struc- ture and the molecular basis of Refsum’s disease. Human Molecular Genet- ics, ():–, May . doi: ./hmg/... URL http: //hmg.oxfordjournals.org/content///.abstract.

[] Yue Jiang, Yadong Wang, and Michael Brudno. PRISM: Pair-read in- formed split-read mapping for base-pair level detection of insertion, dele- tion and structural variants. Bioinformatics, ():–, October . URL http://bioinformatics.oxfordjournals.org/content/ //.abstract.

[] Mark Johnson, Irena Zaretskaya, Yan Raytselis, Yuri Merezhuk, Scott McGinnis, and omas L Madden. NCBI BLAST: a better web interface. Nucleic Acids Research, (suppl ):W–W, July . doi: ./nar/ gkn. URL http://nar.oxfordjournals.org/content//suppl_ /W.abstract.

[] Anne Kallioniemi, Olli-P. Kallioniemi, Damir Sudar, Denis Rutovitz, Joe W Gray, Fred Waldman, and Dan Pinkel. Comparative Genomic Hybridization for Molecular Cytogenetic Analysis of Solid Tumors. Science, (): –, October . doi: ./.

[] Olli-P.Kallioniemi, Anne Kallioniemi, Jim Piper, Jorma Isola, Fred M Wald- man, Joe W Gray, and Dan Pinkel. Optimizing comparative genomic hy- bridization for analysis of DNA sequence copy number changes in solid tu- mors. Genes, Chromosomes and Cancer, ():–, .

[] Emre Karakoc, Can Alkan, Brian J O’Roak, Megan Y Dennis, Laura Vives, Kenneth Mark, Mark J Rieder, Debbie A Nickerson, and Evan E Eichler. Detection of structural variants and indels within exome data. Nat Meth, ():–, February . ISSN - . URL http://dx.doi.org/./nmeth.http: //www.nature.com/nmeth/journal/v/n/abs/nmeth..html# supplementary-information.

 [] John P Kemp, Paul M Smith, Angela Pyle, Vivienne C M Neeve, Helen A L Tuppen, Ulrike Schara, Beril Talim, Haluk Topaloglu, Elke Holinski- Feder, Angela Abicht, Birgit Czermin, Hanns Lochmüller, Robert McFar- land, Patrick F Chinnery, Zofia M A Chrzanowska-Lightowlers, Robert N Lightowlers, Robert W Taylor, and Rita Horvath. Nuclear factors involved in mitochondrial translation cause a subgroup of combined respiratory chain deficiency. Brain, ():–, January . doi: ./brain/ awq. URL http://brain.oxfordjournals.org/content/// .abstract.

[] W J Kent. BLAT-the BLAST-like alignment tool. Genome Res, ():– , .

[] Jeffrey M Kidd, Gregory M Cooper, William F Donahue, Hillary S Hayden, Nick Sampas, Tina Graves, Nancy Hansen, Brian Teague, Can Alkan, Francesca Antonacci, Eric Haugen, Troy Zerr, N Alice Yamada, Peter Tsang, Tera L Newman, Eray Tuzun, Ze Cheng, Heather M Ebling, Nadeem Tusneem, Robert David, Will Gillett, Karen A Phelps, Molly Weaver, David Saranga, Adrianne Brand, Wei Tao, Erik Gustafson, Kevin McKernan, Lin Chen, Maika Malig, Joshua D Smith, Joshua M Korn, Steven A McCarroll, David A Altshuler, Daniel A Peiffer, Michael Dorschner, John Stamatoyannopoulos, David Schwartz, Deborah A Nickerson, James C Mullikin, Richard K Wilson, Laurakay Bruhn, May- nard V Olson, Rajinder Kaul, Douglas R Smith, and Evan E Eichler. Mapping and sequencing of structural variation from eight human genomes. Nature, ():–, May . ISSN -. URL http://dx.doi.org/./naturehttp://www.nature. com/nature/journal/v/n/suppinfo/nature_S.html.

[] Ji Yeon Kim, Jeong-Min Hwang, and Sung Sup Park. Mitochondrial DNA CA/ND is a novel primary causative mutation of Leber’s hereditary optic neuropathy with a good prognosis. Annals of Neurology, ():– , May . ISSN -. doi: ./ana.. URL http: //dx.doi.org/./ana..

[] M P King, Y Koga, M Davidson, and E A Schon. Defects in mi- tochondrial protein synthesis and respiratory chain activity seg- regate with the tRNA(Leu(UUR)) mutation associated with mito- chondrial myopathy, encephalopathy, lactic acidosis, and stroke- like episodes. Mol Cell Biol, ():–, . URL http: //www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db= PubMed&dopt=Citation&list_uids=.

 [] Michael Kinter and Nicholas E Sherman. e Preparation of Protein Digests for Mass Spectrometric Sequencing Experiments. In Protein Sequencing and Identification Using Tandem Mass Spectrometry, pages –. John Wiley & Sons, Inc., . ISBN . doi: ./.ch. URL http://dx.doi.org/./.ch. [] D M Kirby and D R orburn. Approaches to finding the molecular ba- sis of mitochondrial oxidative phosphorylation disorders. Twin Res Hum Genet, ():–, . doi: ./twin..../twin... [pii]. URL http://www.ncbi.nlm.nih.gov/entrez/query.fcgi? cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=. [] S Koene, R J Rodenburg, M S Knaap, M.A.A.P. Willemsen, W Sperl, V Laugel, E Ostergaard, M Tarnopolsky, M A Martin, V Nesbitt, J Fletcher, S Edvardson, V Procaccio, A Slama, L.P.W.J. Heuvel, and J A M Smeitink. Natural disease course and genotype-phenotype correlations in Complex I deficiency caused by nuclear gene defects: what we learned from  cases. Journal of Inherited Metabolic Disease, ():–, . ISSN - . doi: ./s---z. URL http://dx.doi.org/. /s---z. [] Caroline Köhrer and Uttam L RajBhandary. e many applications of acid urea polyacrylamide gel electrophoresis to studies of tRNAs and aminoacyl-tRNA synthetases. Methods, ():–, February . ISSN -. doi: http://dx.doi.org/./j.ymeth... . URL http://www.sciencedirect.com/science/article/pii/ S. [] Daisuke Komura, Fan Shen, Shumpei Ishikawa, Karen R Fitch, Wenwei Chen, Jane Zhang, Guoying Liu, Sigeo Ihara, Hiroshi Nakamura, Matthew E Hurles, Charles Lee, Stephen W Scherer, Keith W Jones, Michael H Shap- ero, Jing Huang, and Hiroyuki Aburatani. Genome-wide detection of human copy number variations using high-density DNA oligonucleotide arrays. Genome Research, ():–, December . doi: ./gr.. URL http://genome.cshlp.org/content/// .abstract. [] Bernard A Konfortov, Alan T Bankier, and Paul H Dear. An efficient method for multi-locus molecular haplotyping. Nucleic Acids Research,  ():e–e, January . doi: ./nar/gkl. URL http://nar. oxfordjournals.org/content///e.abstract. [] Jan O Korbel, Alexander Eckehart Urban, Jason P Affourtit, Brian God- win, Fabian Grubert, Jan Fredrik Simons, Philip M Kim, Dean Palejev,

 Nicholas J Carriero, Lei Du, Bruce E Taillon, Zhoutao Chen, Andrea Tanzer, A C Eugenia Saunders, Jianxiang Chi, Fengtang Yang, Nigel P Carter, Matthew E Hurles, Sherman M Weissman, Timothy T Harkins, Mark B Gerstein, Michael Egholm, and Michael Snyder. Paired-End Mapping Re- veals Extensive Structural Variation in the Human Genome. Science,  ():–, October . doi: ./science.. URL http: //www.sciencemag.org/content///.abstract.

[] M Kozak. Comparison of initiation of protein synthesis in procaryotes, eu- caryotes, and organelles. Microbiological Reviews, ():–, March . URL http://mmbr.asm.org/content///.short.

[] Niklas Krumm, Peter H Sudmant, Arthur Ko, Brian J O’Roak, Maika Malig, Bradley P Coe, NHLBI NHLBI Exome Sequencing Project, Aaron R Quinlan, Deborah A Nickerson, and Evan E Eichler. Copy number variation detec- tion and genotyping from exome sequence data . Genome Research, May . URL http://genome.cshlp.org/content/early//// gr...abstract.

[] Ernest T Lam, Alex Hastie, Chin Lin, Dean Ehrlich, Somes K Das, Michael D Austin, Paru Deshpande, Han Cao, Niranjan Nagarajan, Ming Xiao, and Pui- Yan Kwok. Genome mapping on nanochannel arrays for structural variation analysis and sequence assembly. Nat Biotech, ():–, August . ISSN -. URL http://dx.doi.org/./nbt.http: //www.nature.com/nbt/journal/v/n/abs/nbt..html# supplementary-information.

[] S R Lamande, J F Bateman, W Hutchison, R J McKinlay Gard- ner, S P Bower, E Byrne, and H H Dahl. Reduced collagen VI causes Bethlem myopathy: a heterozygous COLA nonsense mu- tation results in mRNA decay and functional haploinsufficiency. Hum Mol Genet, ():–, . doi: ddb[pii]. URL http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd= Retrieve&db=PubMed&dopt=Citation&list_uids=.

[] E Lamantea, V Tiranti, A Bordoni, A Toscano, F Bono, S Servidei, A Pa- padimitriou, H Spelbrink, L Silvestri, G Casari, G P Comi, and M Ze- viani. Mutations of mitochondrial DNA polymerase gammaA are a fre- quent cause of autosomal dominant or recessive progressive external oph- thalmoplegia. Ann Neurol, ():–, . doi: ./ana. . URL http://www.ncbi.nlm.nih.gov/entrez/query.fcgi? cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=.

 [] P J Lamont, D R orburn, V Fabian, J Vajsar, C Hawkins, A Saada-Reisch, H Durling, NG Laing, and Y Nevo. Nemaline rods and complex I deficiency in three infants with hypotonia, motor delay and failure to thrive. Neuro- pediatrics, ():–, .

[] Nick Lane and William Martin. e energetics of genome complexity. Na- ture, ():–, October . ISSN -. URL http: //dx.doi.org/./nature.

[] P R Langer-Safer, M Levine, and D C Ward. Immunological method for mapping genes on Drosophila polytene chromosomes . Proceedings of the National Academy of Sciences, ():–, July .

[] Victoria H Lawson, Brad V Graham, and Kevin M Flanigan. Clinical and electrophysiologic features of CMTA with mutations in the mito- fusin  gene. Neurology, ():–, July . doi: ./.wnl. ... URL http://www.neurology.org/content// /.abstract.

[] N-C Lee, D Dimmock, W-L Hwu, L-Y Tang, W-C Huang, A C Chinault, and L-J C Wong. Targeted array CGH as a valuable molecular diagnos- tic approach: Experience in the diagnosis of mitochondrial and metabolic disorders. Archives of Disease in Childhood, ():–, January . doi: ./adc... URL http://adc.bmj.com/content// /.abstract.

[] Seunghak Lee, Elango Cheran, and Michael Brudno. A robust frame- work for detecting structural variations in a genome. Bioinformatics, ():i–i, July . doi: ./bioinformatics/btn. URL http://bioinformatics.oxfordjournals.org/content/// i.abstract.

[] Carmen C Leitch, Norann A Zaghloul, Erica E Davis, Corinne Stoetzel, Anna Diaz-Font, Suzanne Rix, Majid Alfadhel, Richard Alan Lewis, Wafaa Eyaid, Eyal Banin, Helene Dollfus, Philip L Beales, Jose L Badano, and Nicholas Katsanis. Hypomorphic mutations in syndromic encephalocele genes are associated with Bardet-Biedl syndrome. Nat Genet, ():–, April . ISSN -. URL http://dx.doi.org/./ng.http: //www.nature.com/ng/journal/v/n/suppinfo/ng._S.html.

[] Gang Li, Jae Hoon Bahn, Jae-Hyung Lee, Guangdun Peng, Zugen Chen, Stanley F Nelson, and Xinshu Xiao. Identification of allele-specific al- ternative mRNA processing via transcriptome sequencing. Nucleic Acids

 Research, ():e–e, July . doi: ./nar/gks. URL http://nar.oxfordjournals.org/content///e.abstract.

[] H Li and R Durbin. Fast and accurate short read alignment with Burrows- Wheeler transform. Bioinformatics, :–, .

[] Jason Li, Richard Lupat, Kaushalya C Amarasinghe, Ella R omp- son, Maria A Doyle, Georgina L Ryland, Richard W Tothill, Saman K Halgamuge, Ian G Campbell, and Kylie L Gorringe. CONTRA: copy number analysis for targeted resequencing . Bioinformatics, April . URL http://bioinformatics.oxfordjournals.org/content/ early////bioinformatics.bts.abstract.

[] Mingkun Li, Anna Schönberg, Michael Schaefer, Roland Schroeder, Ivane Nasidze, and Mark Stoneking. Detecting Heteroplasmy from High-roughput Sequencing of Complete Human Mitochondrial DNA Genomes. American journal of human genetics, ():–, Au- gust . URL http://linkinghub.elsevier.com/retrieve/pii/ S.

[] Ruiqiang Li, Wei Fan, Geng Tian, Hongmei Zhu, Lin He, Jing Cai, Quanfei Huang, Qingle Cai, Bo Li, Yinqi Bai, Zhihe Zhang, Yaping Zhang, Wen Wang, Jun Li, Fuwen Wei, Heng Li, Min Jian, Jianwen Li, Zhaolei Zhang, Rasmus Nielsen, Dawei Li, Wanjun Gu, Zhentao Yang, Zhaoling Xuan, Oliver A Ryder, Frederick Chi-Ching Leung, Yan Zhou, Jianjun Cao, Xiao Sun, Yonggui Fu, Xiaodong Fang, Xiaosen Guo, Bo Wang, Rong Hou, Fujun Shen, Bo Mu, Peixiang Ni, Runmao Lin, Wubin Qian, Guodong Wang, Chang Yu, Wenhui Nie, Jinhuan Wang, Zhigang Wu, Huiqing Liang, Jiumeng Min, Qi Wu, Shifeng Cheng, Jue Ruan, Mingwei Wang, Zhongbin Shi, Ming Wen, Binghang Liu, Xiaoli Ren, Huisong Zheng, Dong Dong, Kathleen Cook, Gao Shan, Hao Zhang, Carolin Kosiol, Xueying Xie, Zuhong Lu, Hancheng Zheng, Yingrui Li, Cynthia C Steiner, Tommy Tsan-Yuk Lam, Siyuan Lin, Qinghui Zhang, Guoqing Li, Jing Tian, Timing Gong, Hongde Liu, Dejin Zhang, Lin Fang, Chen Ye, Juanbin Zhang, Wenbo Hu, Anlong Xu, Yuanyuan Ren, Guojie Zhang, Michael W Bruford, Qibin Li, Lijia Ma, Yiran Guo, Na An, Yujie Hu, Yang Zheng, Yongyong Shi, Zhiqiang Li, Qing Liu, Yanling Chen, Jing Zhao, Ning Qu, Shancen Zhao, Feng Tian, Xiaoling Wang, Haiyin Wang, Lizhi Xu, Xiao Liu, Tomas Vinar, Yajun Wang, Tak-Wah Lam, Siu-Ming Yiu, Shiping Liu, Hemin Zhang, Desheng Li, Yan Huang, Xia Wang, Guohua Yang, Zhi Jiang, Junyi Wang, Nan Qin, Li Li, Jingxiang Li, Lars Bolund, Karsten Kristiansen, Gane Ka-Shu Wong, Maynard Olson, Xiuqing Zhang, Songgang Li, Huanming Yang, Jian Wang, and Jun

 Wang. e sequence and de novo assembly of the giant panda genome. Nature, ():–, January . ISSN -. URL http://dx.doi.org/./naturehttp://www.nature. com/nature/journal/v/n/suppinfo/nature_S.html.

[] Yan Li, William B Holmes, Dean R Appling, and Uttam L RajBhandary. Initiation of Protein Synthesis in Saccharomyces cerevisiae Mitochondria without Formylation of the Initiator tRNA. Journal of Bacteriology,  ():–, May . doi: ./JB...-.. URL http://jb.asm.org/content///.abstract.

[] P Lichter, T Cremer, J Borden, L Manuelidis, and D C Ward. Delineation of individual human chromosomes in metaphase and interphase cells by in situ suppression hybridization using recombinant DNA libraries. Human Genetics, ():–, November . doi: ./BF.

[] Daniel S. Lieber, Sarah E. Calvo, Kristy Shanahan, Nacy G. Slate, Shang- tao Liu, Steven G. Hershman, Nina B. Gold, Brad A. Chapman, David R. orburn, Gerard T. Berry, Jeremy D. Schmahmann, Mark L. Borowsky, David M. Mueller, Katherine B. Sims, and Vamsi K. Mootha. Targeted ex- ome sequencing of suspected mitochondrial disorders. Neurology, (): in press, .

[] Kian Huat Lim and William Fairbrother. Spliceman - A computational web server that predicts sequence variations in pre-mRNA splicing. Bioinfor- matics, ():–, . ISSN . doi: ./bioinformatics/ bts. URL http://www.ncbi.nlm.nih.gov/pubmed/.

[] Kian Huat Lim, Luciana Ferraris, Madeleine E Filloux, Benjamin J Raphael, and William G Fairbrother. Using positional distribution to identify splic- ing elements and predict pre-mRNA processing defects in human genes. Proceedings of the National Academy of Sciences, June . URL http: //www.pnas.org/content/early////.abstract.

[] Kenneth J Livak and omas D Schmittgen. Analysis of Relative Gene Ex- pression Data Using Real-Time Quantitative PCR and the −ΔΔCT Method. Methods, ():–, December . ISSN -. doi: http:// dx.doi.org/./meth... URL http://www.sciencedirect. com/science/article/pii/S.

[] H Shuen Lo, Zhining Wang, Ying Hu, Howard H Yang, Sheryl Gere, Ken- neth H Buetow, and Maxwell P Lee. Allelic Variation in Gene Expression Is

 Common in the Human Genome. Genome Research, ():–, Au- gust . doi: ./gr.. URL http://genome.cshlp.org/ content///.abstract.

[] Núria López-Bigas, Benjamin Audit, Christos Ouzounis, Genís Parra, and Roderic Guigó. Are splicing mutations the most frequent cause of heredi- tary disease? FEBS Letters, ():–, March . ISSN - . URL http://www.sciencedirect.com/science/article/pii/ SX.

[] Michael I Love. Modeling Read Counts for CNV Detection in Exome Sequencing Data, . URL http://www.degruyter.com/view/j/ sagmb...issue-/-./-..xml.

[] James R Lupski. Genomic rearrangements and sporadic disease. Nat Genet, June .

[] Paola Luzi, Mohammad A Rafi, and David A Wenger. Characterization of the large deletion in the GALC gene found in patients with Krabbe disease. Human Molecular Genetics, ():–, December . doi: ./hmg/... URL http://hmg.oxfordjournals.org/ content///.abstract.

[] Iain MacCallum, Dariusz Przybylski, Sante Gnerre, Joshua Burton, Ilya Shlyakhter, Andreas Gnirke, Joel Malek, Kevin McKernan, Swati Ranade, Terrance Shea, Louise Williams, Sarah Young, Chad Nusbaum, and David Jaffe. ALLPATHS : small genomes assembled accurately and with high continuity from short paired reads. Genome Biology, ():R, . ISSN -. URL http://genomebiology.com////R.

[] Matthew D Mailman, Michael Feolo, Yumi Jin, Masato Kimura, Kim- berly Tryka, Rinat Bagoutdinov, Luning Hao, Anne Kiang, Justin Paschall, Lon Phan, Natalia Popova, Stephanie Pretel, Lora Ziyabari, Moira Lee, Yu Shao, Zhen Y Wang, Karl Sirotkin, Minghong Ward, Michael Kholodov, Kerry Zbicz, Jeffrey Beck, Michael Kimelman, Sergey Shevelev, Don Preuss, Eugene Yaschenko, Alan Graeff, James Ostell, and Stephen T Sherry. e NCBI dbGaP database of genotypes and pheno- types. Nat Genet, ():–, October . ISSN -. URL http://dx.doi.org/./ng-http://www.nature. com/ng/journal/v/n/suppinfo/ng-_S.html.

[] Lira Mamanova, Alison J Coffey, Carol E Scott, Iwanka Kozarewa, Emily H Turner, Akash Kumar, Eleanor Howard, Jay Shendure, and

 Daniel J Turner. Target-enrichment strategies for next-generation se- quencing. Nat Meth, ():–, February . ISSN - . URL http://dx.doi.org/./nmeth.http://www. nature.com/nmeth/journal/v/n/suppinfo/nmeth._S.html.

[] Aron Marchler-Bauer, Shennan Lu, John B Anderson, Farideh Chitsaz, Myra K Derbyshire, Carol DeWeese-Scott, Jessica H Fong, Lewis Y Geer, Renata C Geer, Noreen R Gonzales, Marc Gwadz, David I Hurwitz, John D Jackson, Zhaoxi Ke, Christopher J Lanczycki, Fu Lu, Gabriele H March- ler, Mikhail Mullokandov, Marina V Omelchenko, Cynthia L Robertson, James S Song, Narmada anki, Roxanne A Yamashita, Dachuan Zhang, Naigong Zhang, Chanjuan Zheng, and Stephen H Bryant. CDD: a Con- served Domain Database for the functional annotation of proteins. Nucleic Acids Research, (suppl ):D–D, January . doi: ./nar/ gkq. URL http://nar.oxfordjournals.org/content//suppl_ /D.abstract.

[] Elaine Mardis. e , genome, the , analysis? Genome Medicine, ():, . ISSN -X. URL http://genomemedicine.com/content///.

[] M Margulies, M Egholm, W E Altman, S Attiya, J S Bader, L A Bemben, J Berka, M S Braverman, Y J Chen, Z Chen, S B Dewell, L Du, J M Fierro, X V Gomes, B C Godwin, W He, S Helgesen, C H Ho, G P Irzyk, S C Jando, M L I Alenquer, T P Jarvie, K B Jirage, J B Kim, J R Knight, J R Lanza, J H Leamon, S M Lefkowitz, M Lei, J Li, K L Lohman, H Lu, V B Makhijani, K E McDade, M P McKenna, E W Myers, E Nickerson, J R Nobile, R Plant, B P Puc, M T Ronan, G T Roth, G J Sarkis, J F Simons, J W Simpson, M Srini- vasan, K R Tartaro, A Tomasz, K A Vogt, G A Volkmer, S H Wang, Y Wang, M P Weiner, P Yu, R F Begley, and J M Rothberg. Genome sequencing in mi- crofabricated high-density picolitre reactors. Nature, ():–, .

[] Borivoj Marjanovic, Dubravka Cvetković, Slobodan Dozić, Slobodanka Todorović, and Milena Djurić. Association of Krabbe leukodystrophy and congenital fiber type disproportion. Pediatric Neurology, ():–, July . ISSN -. doi: http://dx.doi.org/./-() -. URL http://www.sciencedirect.com/science/article/ pii/.

[] V Massa, E Fernandez-Vizarra, S Alshahwan, E Bakhsh, P Goffrini, I Fer- rero, P Mereghetti, P D’Adamo, P Gasparini, and M Zeviani. Severe in- fantile encephalomyopathy caused by a mutation in COXB, a nucleus-

 encoded subunit of cytochrome c oxidase. Am J Hum Genet, (): –, . doi: S-()-[pii]./j.ajhg.. .. URL http://www.ncbi.nlm.nih.gov/entrez/query.fcgi? cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=.

[] Johannes A. Mayr, Tobias B. Haack, Elisabeth Graf, Franz A. Zimmer- mann, omas Wieland, Birgit Haberberger, Andrea Superti-Furga, Jan- bernd Kirschner, Beat Steinmann, Matthias R. Baumgartner, Isabella Mo- roni, Eleonora Lamantea, Massimo Zeviani, Richard J. Rodenburg, Jan Smeitink, Tim M. Strom, omas Meitinger, Wolfgang Sperl, and Holger Prokisch. Lack of the Mitochondrial Protein Acylglycerol Kinase Causes Sengers Syndrome. American journal of human genetics, ():–, February . URL http://linkinghub.elsevier.com/retrieve/ pii/S.

[] Steven A McCarroll and David M Altshuler. Copy-number variation and association studies of human disease. Nat Genet, June .

[] Steven A McCarroll, Alan Huett, Petric Kuballa, Shannon D Chilewski, Aimee Landry, Philippe Goyette, Michael C Zody, Jennifer L Hall, Steven R Brant, Judy H Cho, Richard H Duerr, Mark S Silverberg, Kent D Taylor, John D Rioux, David Altshuler, Mark J Daly, and Ramnik J Xavier. Deletion polymorphism upstream of IRGM associated with altered IRGM expres- sion and Crohn’s disease. Nat Genet, ():–, September . ISSN -. URL http://dx.doi.org/./ng.http:// www.nature.com/ng/journal/v/n/suppinfo/ng._S.html.

[] Shane E McCarthy, Vladimir Makarov, George Kirov, Anjene M Adding- ton, Jon McClellan, Seungtai Yoon, Diana O Perkins, Diane E Dickel, Mary Kusenda, Olga Krastoshevsky, Verena Krause, Ravinesh A Kumar, Detelina Grozeva, Dheeraj Malhotra, Tom Walsh, Elaine H Zackai, Paige Kaplan, Jaya Ganesh, Ian D Krantz, Nancy B Spinner, Patricia Roc- canova, Abhishek Bhandari, Kevin Pavon, B Lakshmi, Anthony Leotta, Jude Kendall, Yoon-ha Lee, Vladimir Vacic, Sydney Gary, Lilia M Iak- oucheva, Timothy J Crow, Susan L Christian, Jeffrey A Lieberman, T Scott Stroup, Terho Lehtimaki, Kaija Puura, Chad Haldeman-Englert, Justin Pearl, Meredith Goodell, Virginia L Willour, Pamela DeRosse, Jo Steele, Layla Kassem, Jessica Wolff, Nisha Chitkara, Francis J McMahon, Anil K Malhotra, James B Potash, omas G Schulze, Markus M Nothen, Sven Cichon, Marcella Rietschel, Ellen Leibenluft, Vlad Kustanovich, Clara M Lajonchere, James S Sutcliffe, David Skuse, Michael Gill, Louise Gal- lagher, Nancy R Mendell, Nick Craddock, Michael J Owen, Michael C

 O’Donovan, Tamim H Shaikh, Ezra Susser, Lynn E DeLisi, Patrick F Sul- livan, Curtis K Deutsch, Judith Rapoport, Deborah L Levy, Mary-Claire King, and Jonathan Sebat. Microduplications of p. are associated with schizophrenia. Nat Genet, ():–, November . ISSN -. URL http://dx.doi.org/./ng.http://www. nature.com/ng/journal/v/n/suppinfo/ng._S.html.

[] Aaron McKenna, Matthew Hanna, Eric Banks, Andrey Sivachenko, Kris- tian Cibulskis, Andrew Kernytsky, Kiran Garimella, David Altshuler, Stacey Gabriel, Mark Daly, and Mark A DePristo. e Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data . Genome Research, ():–, September . URL http: //genome.cshlp.org/content///.abstract.

[] Matthew McKenzie, Michael Lazarou, and Michael T Ryan. Chapter  Analysis of Respiratory Chain Complex Assembly with Radiolabeled Nuclear￿ and Mitochondrial￿Encoded Subunits. In William S Allison in Enzymology and Immo E Scheffler B T Methods, editors, Mitochon- drial Function, Part A: Mitochondrial Electron Transport Complexes and Reac- tive Oxygen Species, volume Volume , pages –. Academic Press, . ISBN -. doi: http://dx.doi.org/./S-() -. URL http://www.sciencedirect.com/science/article/ pii/S.

[] Matthew McKenzie, Elena J Tucker, Alison G Compton, Michael Lazarou, Christa George, David R orburn, and Michael T Ryan. Mutations in the Gene Encoding Corf Block Complex I Assembly by Inhibiting Produc- tion of the Mitochondria-Encoded Subunit ND. Journal of Molecular Biol- ogy, ():–, December . ISSN -. doi: http://dx.doi. org/./j.jmb.... URL http://www.sciencedirect.com/ science/article/pii/S.

[] MEDomics. MEDomics. URL www.medomics.com/MitoNucleomeDx.

[] Paul Medvedev, Monica Stanciu, and Michael Brudno. Computational methods for discovering structural variation with next-generation sequenc- ing. Nat Meth, (s):S–S, November .

[] Heather C Mefford. Diagnostic Exome Sequencing — Are We ere Yet? New England Journal of Medicine, ():–, October . ISSN -. doi: ./NEJMe. URL http://dx.doi.org/. /NEJMe.

 [] Laurence R Meyer, Ann S Zweig, Angie S Hinrichs, Donna Karolchik, Robert M Kuhn, Matthew Wong, Cricket A Sloan, Kate R Rosenbloom, Greg Roe, Brooke Rhead, Brian J Raney, Andy Pohl, Venkat S Malladi, Chin H Li, Brian T Lee, Katrina Learned, Vanessa Kirkup, Fan Hsu, Steve Heitner, Rachel A Harte, Maximilian Haeussler, Luvina Guruvadoo, Mary Goldman, Belinda M Giardine, Pauline A Fujita, Timothy R Dreszer, Mark Diekhans, Melissa S Cline, Hiram Clawson, Galt P Barber, David Haus- sler, and W James Kent. e UCSC Genome Browser database: exten- sions and updates . Nucleic Acids Research, (D):D–D, January . doi: ./nar/gks. URL http://nar.oxfordjournals. org/content//D/D.abstract.

[] Ryan E Mills, Christopher T Luttig, Christine E Larkins, Adam Beauchamp, Circe Tsui, W Stephen Pittard, and Scott E Devine. An initial map of in- sertion and deletion (INDEL) variation in the human genome. Genome Re- search, ():–, September . doi: ./gr.. URL http://genome.cshlp.org/content///.abstract.

[] Ryan E Mills, Klaudia Walter, Chip Stewart, Robert E Handsaker, Ken Chen, Can Alkan, Alexej Abyzov, Seungtai Chris Yoon, Kai Ye, R Keira Cheetham, Asif Chinwalla, Donald F Conrad, Yutao Fu, Fabian Grubert, Iman Hajira- souliha, Fereydoun Hormozdiari, Lilia M Iakoucheva, Zamin Iqbal, Shuli Kang, Jeffrey M Kidd, Miriam K Konkel, Joshua Korn, Ekta Khurana, Deniz Kural, Hugo Y K Lam, Jing Leng, Ruiqiang Li, Yingrui Li, Chang-Yun Lin, Ruibang Luo, Xinmeng Jasmine Mu, James Nemesh, Heather E Peck- ham, Tobias Rausch, Aylwyn Scally, Xinghua Shi, Michael P Stromberg, Adrian M Stutz, Alexander Eckehart Urban, Jerilyn A Walker, Jiantao Wu, Yujun Zhang, Zhengdong D Zhang, Mark A Batzer, Li Ding, Gabor T Marth, Gil McVean, Jonathan Sebat, Michael Snyder, Jun Wang, Kenny Ye, Evan E Eichler, Mark B Gerstein, Matthew E Hurles, Charles Lee, Steven A McCar- roll, and Jan O Korbel. Mapping copy number variation by population-scale genome sequencing. Nature, ():–, February . ISSN -. URL http://dx.doi.org/./naturehttp: //www.nature.com/nature/journal/v/n/abs/. -nature-unlocked.html#supplementary-information.

[] A L Mitchell, J L Elson, N Howell, R W Taylor, and D M Turnbull. Se- quence variation in mitochondrial complex I genes: mutation or polymor- phism? Journal of Medical Genetics, ():–, February . doi: ./jmg... URL http://jmg.bmj.com/content/// .abstract.

 [] E Morava, L van den Heuvel, F Hol, M C de Vries, M Hogeveen, R J Rodenburg, and J A Smeitink. Mitochondrial disease crite- ria: diagnostic applications in children. Neurology, ():–, . URL http://www.ncbi.nlm.nih.gov/entrez/query.fcgi? cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=.

[] Vivienne C M Neeve, Angela Pyle, Veronika Boczonadi, Aurora Gomez- Duran, Helen Griffin, Mauro Santibanez-Koref, Ulrike Gaiser, Peter Bauer, Andreas Tzschach, Patrick F Chinnery, and Rita Horvath. Clinical and func- tional characterisation of the combined respiratory chain defect in two sis- ters due to autosomal recessive mutations in MTFMT. Mitochondrion, (), . ISSN -. doi: http://dx.doi.org/./j.mito... . URL http://www.sciencedirect.com/science/article/pii/ SX.

[] D Trevor Newton, Carole Creuzenet, and Dev Mangroo. Formylation Is Not Essential for Initiation of Protein Synthesis in All Eubacteria. Jour- nal of Biological Chemistry, ():–, August . doi: ./jbc.... URL http://www.jbc.org/content/// .abstract.

[] S B Ng, K J Buckingham, C Lee, A W Bigham, H K Tabor, K M Dent, C D Huff, P T Shannon, E W Jabs, D A Nickerson, J Shendure, and M J Bamshad. Exome sequencing identifies the cause of a mendelian disorder. Nat Genet, ():–, . doi: ng.[pii]./ng. . URL http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd= Retrieve&db=PubMed&dopt=Citation&list_uids=.

[] Sarah B Ng, Emily H Turner, Peggy D Robertson, Steven D Flygare, Abigail W Bigham, Choli Lee, Tristan Shaffer, Michelle Wong, Arindam Bhattacharjee, Evan E Eichler, Michael Bamshad, Deborah A Nickerson, and Jay Shendure. Targeted capture and massively parallel sequencing of  human exomes. Nature, ():–, September . ISSN -. URL http://dx.doi.org/./naturehttp: //www.nature.com/nature/journal/v/n/suppinfo/ nature_S.html.

[] Sarah B Ng, Abigail W Bigham, Kati J Buckingham, Mark C Han- nibal, Margaret J McMillin, Heidi I Gildersleeve, Anita E Beck, Holly K Tabor, Gregory M Cooper, Heather C Mefford, Choli Lee, Emily H Turner, Joshua D Smith, Mark J Rieder, Koh-ichiro Yoshiura, Naomichi Matsumoto, Tohru Ohta, Norio Niikawa, Deborah A Nick- erson, Michael J Bamshad, and Jay Shendure. Exome sequencing

 identifies MLL mutations as a cause of Kabuki syndrome. Nat Genet, ():–, September . ISSN -. URL http://dx.doi.org/./ng.http://www.nature.com/ng/ journal/v/n/abs/ng..html#supplementary-information.

[] Annika I Nilsson, Anna Zorzet, Anna Kanth, Sabina Dahlström, Otto G Berg, and Dan I Andersson. Reducing the fitness cost of antibiotic re- sistance by amplification of initiator tRNA genes. Proceedings of the Na- tional Academy of Sciences, ():–, May . doi: ./ pnas.. URL http://www.pnas.org/content///. abstract.

[] I Nishino, A Spinazzola, and M Hirano. ymidine phosphorylase gene mutations in MNGIE, a human mitochondrial disorder. Science,  ():–, . URL http://www.ncbi.nlm.nih.gov/entrez/ query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_ uids=.

[] Alex Nord, Ming Lee, Mary-Claire King, and Tom Walsh. Accurate and exact CNV identification from targeted high-throughput sequence data. BMC Genomics, ():, . ISSN -. URL http://www. biomedcentral.com/-//.

[] J Nouws, L Nijtmans, S M Houten, M van den Brand, M Huynen, H Venselaar, S Hoefs, J Gloerich, J Kronick, T Hutchin, P Willems, R Rodenburg, R Wanders, L van den Heuvel, J Smeitink, and R O Vogel. Acyl-CoA dehydrogenase  is required for the biogenesis of oxidative phosphorylation complex I. Cell Metab, ():– , . doi: S-()-[pii]./j.cmet... . URL http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd= Retrieve&db=PubMed&dopt=Citation&list_uids=.

[] Adam B Olshen, E S Venkatraman, Robert Lucito, and Michael Wigler. Circular binary segmentation for the analysis of array￿based DNA copy number data. Biostatistics, ():–, October . doi: ./ biostatistics/kxh.

[] Brian J O’Roak, Pelagia Deriziotis, Choli Lee, Laura Vives, Jerrod J Schwartz, Santhosh Girirajan, Emre Karakoc, Alexandra P MacKenzie, Sarah B Ng, Carl Baker, Mark J Rieder, Deborah A Nickerson, Raphael Bernier, Simon E Fisher, Jay Shendure, and Evan E Eichler. Exome se- quencing in sporadic autism spectrum disorders identifies severe de novo mutations. Nat Genet, ():–, June . ISSN -. URL

 http://dx.doi.org/./ng.http://www.nature.com/ng/ journal/v/n/abs/ng..html#supplementary-information.

[] Maeve O’Huallachain, Konrad J Karczewski, Sherman M Weissman, Alexander Eckehart Urban, and Michael P Snyder. Extensive ge- netic variation in somatic human tissues. Proceedings of the National Academy of Sciences, ():–, October . doi: ./ pnas.. URL http://www.pnas.org/content///. abstract.

[] David J Pagliarini, Sarah E Calvo, Betty Chang, Sunil A Sheth, Scott B Vafai, Shao-En Ong, Geoffrey A Walford, Canny Sugiana, Avihu Boneh, William K Chen, David E Hill, Marc Vidal, James G Evans, David R or- burn, Steven A Carr, and Vamsi K Mootha. A Mitochondrial Protein Com- pendium Elucidates Complex I Disease Biology. Cell, ():–, July . ISSN -. URL http://www.sciencedirect.com/ science/article/pii/SX.

[] Alistair T Pagnamenta, Jan-Willem Taanman, Callum J Wilson, Neil E Anderson, Rosetta Marotta, Andrew J Duncan, Maria Bitner Glindzicz, Robert W Taylor, Adrienne Laskowski, David R orburn, and Shamima Rahman. Dominant inheritance of premature ovarian failure associ- ated with mutant mitochondrial DNA polymerase gamma. Human Re- production, ():–, October . doi: ./humrep/ del. URL http://humrep.oxfordjournals.org/content/// .abstract.

[] Ricardo Palacios, Elodie Gazave, Joaquín Goñi, Gabriel Piedrafita, Olga Fer- nando, Arcadi Navarro, and Pablo Villoslada. Allele-Specific Gene Expres- sion Is Widespread Across the Genome and Biological Processes. PLoS ONE, ():e, January . URL http://dx.doi.org/./ journal.pone..

[] Andy Pang, Jeffrey MacDonald, Dalila Pinto, John Wei, Muhammad Rafiq, Donald Conrad, Hansoo Park, Matthew Hurles, Charles Lee, J Craig Ven- ter, Ewen Kirkness, Samuel Levy, Lars Feuk, and Stephen Scherer. To- wards a comprehensive structural variation map of an individual human genome. Genome Biology, ():R, . ISSN -. URL http: //genomebiology.com////R.

[] Hansoo Park, Jong-Il Kim, Young Seok Ju, Omer Gokcumen, Ryan E Mills, Sheehyun Kim, Seungbok Lee, Dongwhan Suh, Dongwan Hong, Hyun- seok Peter Kang, Yun Joo Yoo, Jong-Yeon Shin, Hyun-Jin Kim, Maryam

 Yavartanoo, Young Wha Chang, Jung-Sook Ha, Wilson Chong, Ga-Ram Hwang, Katayoon Darvishi, HyeRan Kim, Song Ju Yang, Kap-Seok Yang, Hyungtae Kim, Matthew E Hurles, Stephen W Scherer, Nigel P Carter, Chris Tyler-Smith, Charles Lee, and Jeong-Sun Seo. Discovery of common Asian copy number variants using integrated high-resolution array CGH and mas- sively parallel DNA sequencing. Nat Genet, ():–, May . ISSN -. URL http://dx.doi.org/./ng.http:// www.nature.com/ng/journal/v/n/suppinfo/ng._S.html.

[] Irma Parra and Bradford Windle. High resolution visual mapping of stretched DNA by fluorescent hybridization. Nat Genet, ():–, September .

[] Daniel A Peiffer, Jennie M Le, Frank J Steemers, Weihua Chang, Tony Jen- niges, Francisco Garcia, Kirt Haden, Jiangzhen Li, Chad A Shaw, John Bel- mont, Sau Wai Cheung, Richard M Shen, David L Barker, and Kevin L Gun- derson. High-resolution genomic profiling of chromosomal aberrations using Infinium whole-genome genotyping. Genome Research, ():– , September . doi: ./gr.. URL http://genome. cshlp.org/content///.abstract.

[] Pavel A Pevzner, Haixu Tang, and Michael S Waterman. An Eulerian path approach to DNA fragment assembly. Proceedings of the National Academy of Sciences, ():–, August . doi: ./pnas.. URL http://www.pnas.org/content///.abstract.

[] Michael W Pfaffl. A new mathematical model for relative quantification in real-time RT–PCR. Nucleic Acids Research, ():e–e, May . doi: ./nar/..e. URL http://nar.oxfordjournals.org/ content///e.abstract.

[] Sarah B Pierce, Tom Walsh, Karen M Chisholm, Ming K Lee, Anne M ornton, Agata Fiumara, John M Opitz, Ephrat Levy-Lahad, Rachel E Klevit, and Mary-Claire King. Mutations in the DBP-Deficiency Protein HSDB Cause Ovarian Dysgenesis, Hearing Loss, and Ataxia of Per- rault Syndrome, August . URL http://linkinghub.elsevier. com/retrieve/pii/S.

[] Sarah B Pierce, Karen M Chisholm, Eric D Lynch, Ming K Lee, Tom Walsh, John M Opitz, Weiqing Li, Rachel E Klevit, and Mary-Claire King. Mu- tations in mitochondrial histidyl tRNA synthetase HARS cause ovarian dysgenesis and sensorineural hearing loss of Perrault syndrome. Proceed- ings of the National Academy of Sciences, ():–, April .

 doi: ./pnas.. URL http://www.pnas.org/content/ //.abstract.

[] Tyler Mark Pierson, David Adams, Florian Bonn, Paola Martinelli, Praveen F Cherukuri, Jamie K Teer, Nancy F Hansen, Pedro Cruz, James C Mullikin for the NISC Comparative Sequencing Program, Robert W Blakesley, Gretchen Golas, Justin Kwan, Anthony Sandler, Karin Fuentes Fajardo, omas Markello, Cynthia Tifft, Craig Blackstone, Elena I Rugarli, omas Langer, William A Gahl, and Camilo Toro. Whole-Exome Sequenc- ing Identifies Homozygous AFGL Mutations in a Spastic Ataxia-Neuropathy Syndrome Linked to Mitochondrial m- AAA Proteases. PLoS Genet, ():e, October . URL http: //dx.doi.org/./journal.pgen..

[] D Pinkel, J Landegent, C Collins, J Fuscoe, R Segraves, J Lucas, and J Gray. Fluorescence in situ hybridization with human chromosome-specific li- braries: detection of trisomy  and translocations of chromosome  . Pro- ceedings of the National Academy of Sciences, ():–, December .

[] Dalila Pinto, Alistair T Pagnamenta, Lambertus Klei, Richard Anney, Daniele Merico, Regina Regan, Judith Conroy, Tiago R Magalhaes, Cata- rina Correia, Brett S Abrahams, Joana Almeida, Elena Bacchelli, Gary D Bader, Anthony J Bailey, Gillian Baird, Agatino Battaglia, Tom Berney, Nadia Bolshakova, Sven Bolte, Patrick F Bolton, omas Bourgeron, Sean Brennan, Jessica Brian, Susan E Bryson, Andrew R Carson, Guillermo Casallo, Jillian Casey, Brian H Y Chung, Lynne Cochrane, Christina Corsello, Emily L Crawford, Andrew Crossett, Cheryl Cytrynbaum, Geral- dine Dawson, Maretha de Jonge, Richard Delorme, Irene Drmic, Eftichia Duketis, Frederico Duque, Annette Estes, Penny Farrar, Bridget A Fernan- dez, Susan E Folstein, Eric Fombonne, Christine M Freitag, John Gilbert, Christopher Gillberg, Joseph T Glessner, Jeremy Goldberg, Andrew Green, Jonathan Green, Stephen J Guter, Hakon Hakonarson, Elizabeth A Heron, Matthew Hill, Richard Holt, Jennifer L Howe, Gillian Hughes, Vanessa Hus, Roberta Igliozzi, Cecilia Kim, Sabine M Klauck, Alexander Kolevzon, Olena Korvatska, Vlad Kustanovich, Clara M Lajonchere, Janine A Lamb, Magdalena Laskawiec, Marion Leboyer, Ann Le Couteur, Bennett L Lev- enthal, Anath C Lionel, Xiao-Qing Liu, Catherine Lord, Linda Lotspeich, Sabata C Lund, Elena Maestrini, William Mahoney, Carine Mantoulan, Christian R Marshall, Helen McConachie, Christopher J McDougle, Jane McGrath, William M McMahon, Alison Merikangas, Ohsuke Migita, Nancy J Minshew, Ghazala K Mirza, Jeff Munson, Stanley F Nelson,

 Carolyn Noakes, Abdul Noor, Gudrun Nygren, Guiomar Oliveira, Katerina Papanikolaou, Jeremy R Parr, Barbara Parrini, Tara Paton, Andrew Pickles, Marion Pilorge, Joseph Piven, Chris P Ponting, David J Posey, Annemarie Poustka, Fritz Poustka, Aparna Prasad, Jiannis Ragoussis, Katy Renshaw, Jessica Rickaby, Wendy Roberts, Kathryn Roeder, Bernadette Roge, Michael L Rutter, Laura J Bierut, John P Rice, Jeff Salt, Katherine Sansom, Daisuke Sato, Ricardo Segurado, Ana F Sequeira, Lili Senman, Naisha Shah, Val C Sheffield, Latha Soorya, Ines Sousa, Olaf Stein, Nuala Sykes, Vera Stoppioni, Christina Strawbridge, Raffaella Tancredi, Katherine Tansey, Bhooma iruvahindrapduram, Ann P ompson, Susanne omson, Ana Tryfon, John Tsiantis, Herman Van Engeland, John B Vincent, Fred Volk- mar, Simon Wallace, Kai Wang, Zhouzhi Wang, omas H Wassink, Caleb Webber, Rosanna Weksberg, Kirsty Wing, Kerstin Wittemeyer, Shawn Wood, Jing Wu, Brian L Yaspan, Danielle Zurawiecki, Lonnie Zwaigen- baum, Joseph D Buxbaum, Rita M Cantor, Edwin H Cook, Hilary Coon, Michael L Cuccaro, Bernie Devlin, Sean Ennis, Louise Gallagher, Daniel H Geschwind, Michael Gill, Jonathan L Haines, Joachim Hallmayer, Judith Miller, Anthony P Monaco, John I Nurnberger Jr, Andrew D Paterson, Margaret A Pericak-Vance, Gerard D Schellenberg, Peter Szatmari, Astrid M Vicente, Veronica J Vieland, Ellen M Wijsman, Stephen W Scherer, James S Sutcliffe, and Catalina Betancur. Functional impact of global rare copy number variation in autism spectrum disorders. Nature, (): –, July . ISSN -. URL http://dx.doi.org/. /naturehttp://www.nature.com/nature/journal/v/ n/abs/nature.html#supplementary-information.

[] Vincent Plagnol, James Curtis, Michael Epstein, Kin Y Mok, Emma Stebbings, Sofia Grigoriadou, Nicholas W Wood, Sophie Hambleton, Siobhan O Burns, Adrian J rasher, Dinakantha Kumararatne, Rainer Doffinger, and Sergey Nejentsev. A robust model for read count data in exome sequencing experiments and implications for copy number variant calling. Bioinformatics, ():–, November . doi: ./bioinformatics/bts. URL http://bioinformatics. oxfordjournals.org/content///.abstract.

[] Jonathan R Pollack, Charles M Perou, Ash A Alizadeh, Michael B Eisen, Alexander Pergamenschikov, Cheryl F Williams, Stefanie S Jeffrey, David Botstein, and Patrick O Brown. Genome-wide analysis of DNA copy- number changes using cDNA microarrays. Nat Genet, ():–, Septem- ber .

[] Jonathan R Pollack, erese Sø rlie, Charles M Perou, Christian A Rees, Ste-

 fanie S Jeffrey, Per E Lonning, Robert Tibshirani, David Botstein, Anne-Lise Bø rresen Dale, and Patrick O Brown. Microarray analysis reveals a major direct role of DNA copy number alteration in the transcriptional program of human breast tumors . Proceedings of the National Academy of Sciences,  ():–, October . doi: ./pnas..

[] A Poulos, A C Pollard, J D Mitchell, G Wise, and G Mortimer. Patterns of Refsum’s disease. Phytanic acid oxidase deficiency. Archives of Disease in Childhood, ():–, .

[] Kim D Pruitt, Tatiana Tatusova, and Donna R Maglott. NCBI refer- ence sequences (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Research, (suppl ): D–D, January . doi: ./nar/gkl. URL http://nar. oxfordjournals.org/content//suppl_/D.abstract.

[] Kim D Pruitt, Jennifer Harrow, Rachel A Harte, Craig Wallin, Mark Diekhans, Donna R Maglott, Steve Searle, Catherine M Farrell, Jane E Loveland, Barbara J Ruef, Elizabeth Hart, Marie-Marthe Suner, Melissa J Landrum, Bronwen Aken, Sarah Ayling, Robert Baertsch, Julio Fernandez- Banet, Joshua L Cherry, Val Curwen, Michael DiCuccio, Manolis Kellis, Jen- nifer Lee, Michael F Lin, Michael Schuster, Andrew Shkeda, Clara Amid, Garth Brown, Oksana Dukhanina, Adam Frankish, Jennifer Hart, Bonnie L Maidak, Jonathan Mudge, Michael R Murphy, Terence Murphy, Jeena Ra- jan, Bhanu Rajput, Lillian D Riddick, Catherine Snow, Charles Steward, David Webb, Janet A Weber, Laurens Wilming, Wenyu Wu, Ewan Birney, David Haussler, Tim Hubbard, James Ostell, Richard Durbin, and David Lipman. e consensus coding sequence (CCDS) project: Identifying a com- mon protein-coding gene set for the human and mouse genomes. Genome Research, ():–, July . doi: ./gr... URL http://genome.cshlp.org/content///.abstract.

[] Mohammad A Rafi, Paola Luzi, Yue Qun Chen, and David A Wenger. A large deletion together with a point mutation in the GALC gene is a common mutant allele in patients with infantile Krabbe disease. Human Molecular Genetics, ():–, August . doi: ./hmg/... URL http://hmg.oxfordjournals.org/content///.short.

[] Arthi Ramachandran, Mariann Micsinai, and Itsik Pe’er. CONDEX: Copy number detection in exome sequences. Bioinformatics and Biomedicine Work- shop, IEEE International Conference on, :–, . doi: http://doi. ieeecomputersociety.org/./BIBMW...

 [] DavidA. Ray, Kyudong Han, JerilynA. Walker, and MarkA. Batzer. Labora- tory Methods for the Analysis of Primate Mobile Elements. Genetic Varia- tion SE - , :–, . doi: ./----\_. URL http://dx.doi.org/./----_.

[] R Redon, S Ishikawa, K R Fitch, L Feuk, G H Perry, T D Andrews, H Fiegler, M H Shapero, A R Carson, W Chen, E K Cho, S Dallaire, J L Freeman, J R Gonzalez, M Gratacos, J Huang, D Kalaitzopoulos, D Komura, J R MacDon- ald, C R Marshall, R Mei, L Montgomery, K Nishimura, K Okamura, F Shen, M J Somerville, J Tchinda, A Valsesia, C Woodwark, F Yang, J Zhang, T Zer- jal, L Armengol, D F Conrad, X Estivill, C Tyler-Smith, N P Carter, H Abu- ratani, C Lee, K W Jones, S W Scherer, and M E Hurles. Global variation in copy number in the human genome. Nature, ():–, .

[] C Sue Richards, Sherri Bale, Daniel B Bellissimo, Soma Das, Wayne W Grody, Madhuri R Hegde, Elaine Lyon, and Brian E Ward. ACMG recommendations for standards for interpretation and reporting of se- quence variations: Revisions . Genet Med, ():–, April . ISSN -. URL http://dx.doi.org/./GIM. bebcae.

[] Lisa G Riley, Sandra Cooper, Peter Hickey, Joëlle Rudinger-irion, Matthew McKenzie, Alison Compton, Sze Chern Lim, David orburn, Michael T Ryan, Richard Giegé, Melanie Bahlo, and John Christodoulou. Mutation of the Mitochondrial Tyrosyl-tRNA Synthetase Gene, YARS, Causes Myopathy, Lactic Acidosis, and Sideroblastic Anemia—MLASA Syndrome. e American Journal of Human Genetics, ():–, July . ISSN -. doi: http://dx.doi.org/./j.ajhg... . URL http://www.sciencedirect.com/science/article/pii/ S.

[] Elisha D O Roberson and Jonathan Pevsner. Visualization of Shared Ge- nomic Regions and Meiotic Recombination in High-Density SNP Data. PLoS ONE, ():e, August . URL http://dx.doi.org/./ journal.pone..

[] Steve Rozen and Helen Skaletsky. Primer on the WWW for general users and for biologist programmers. Methods in Molecular Biology, ():– , . ISSN . doi: ./---:. URL http: //www.ncbi.nlm.nih.gov/pubmed/.

[] E Ruiz-Pesini, M T Lott, V Procaccio, J C Poole, M C Brandon, D Mish- mar, C Yi, J Kreuziger, P Baldi, and D C Wallace. An enhanced

 MITOMAP with a global mtDNA mutational phylogeny. Nucleic Acids Res, (Database issue):D–, . doi: gkl[pii]./nar/ gkl. URL http://www.ncbi.nlm.nih.gov/entrez/query.fcgi? cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=.

[] A Saada, A Shaag, H Mandel, Y Nevo, S Eriksson, and O Elpeleg. Mutant mitochondrial thymidine kinase in mitochondrial DNA depletion my- opathy. Nat Genet, ():–, . doi: ./ngng[pii]. URL http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd= Retrieve&db=PubMed&dopt=Citation&list_uids=.

[] Jeffry D Sander, Elizabeth J Dahlborg, Mathew J Goodwin, Lindsay Cade, Feng Zhang, Daniel Cifuentes, Shaun J Curtin, Jessica S Black- burn, Stacey ibodeau-Beganny, Yiping Qi, Christopher J Pierick, Ellen Hoffman, Morgan L Maeder, Cyd Khayter, Deepak Reyon, Drena Dobbs, David M Langenau, Robert M Stupar, Antonio J Giraldez, Daniel F Voytas, Randall T Peterson, Jing-Ruey J Yeh, and J Keith Joung. Selection-free zinc-finger-nuclease engineering by context- dependent assembly (CoDA). Nat Meth, ():–, January . ISSN -. URL http://dx.doi.org/./nmeth.http: //www.nature.com/nmeth/journal/v/n/abs/nmeth..html# supplementary-information.

[] Jörn Oliver Sass, Regina Ensenauer, Wulf Röschinger, Horst Reich, Ul- rike Steuerwald, Oliver Schirrmacher, Katharina Engel, Johannes Häberle, Brage Storstein Andresen, André Mégarbané, Willy Lehnert, and Johannes Zschocke. -Methylbutyryl-coenzyme A dehydrogenase deficiency: Func- tional and molecular studies on a defect in isoleucine catabolism. Molecu- lar Genetics and Metabolism, ():–, January . ISSN -. doi: http://dx.doi.org/./j.ymgme.... URL http://www. sciencedirect.com/science/article/pii/S.

[] J Fah Sathirapongsasuti, Hane Lee, Basil A J Horst, Georg Brun- ner, Alistair J Cochran, Scott Binder, John Quackenbush, and Stan- ley F Nelson. Exome Sequencing-Based Copy-Number Variation and Loss of Heterozygosity Detection: ExomeCNV . Bioinformatics, August . URL http://bioinformatics.oxfordjournals.org/content/ early////bioinformatics.btr.abstract.

[] Miyuki Sato and Ken Sato. Degradation of Paternal Mitochondria by Fertilization-Triggered Autophagy in C. elegans Embryos. Science,  ():–, November . doi: ./science.. URL http://www.sciencemag.org/content///.abstract.

 [] Andrew M Schaefer, Robert W Taylor, Douglass M Turnbull, and Patrick F Chinnery. e epidemiology of mitochondrial disorders– past, present and future. Biochimica et Biophysica Acta (BBA) - Bioen- ergetics, (-):–, . doi: DOI:./j.bbabio.. .. URL http://www.sciencedirect.com/science/article/ BTS-DBX-//abfbdecaeeb.

[] Curt Scharfe, Henry Horng-Shing Lu, Jutta K Neuenburg, Edward A Allen, Guan-Cheng Li, omas Klopstock, Tina M Cowan, Gregory M Enns, and Ronald W Davis. Mapping Gene Associations in Human Mitochondria us- ing Clinical Disease Phenotypes. PLoS Comput Biol, ():e, . URL http://dx.doi.org/./journal.pcbi..

[] E Schleiermacher, U Schliebitz, G Rompe, C Steffens, and U Schmidt. Brother and sister with trisomy p: A new syndrome. Humangenetik, ():–, . ISSN -. doi: ./BF. URL http://dx.doi.org/./BF.

[] Emmanuelle Schmitt, Michel Panvert, Sylvain Blanquet, and Yves Mechu- lam. Crystal structure of methionyl-tRNAfMet transformylase complexed with the initiator formyl-methionyl-tRNAfMet. EMBO J, ():– , December . ISSN -. URL http://dx.doi.org/. /emboj/...

[] Peter Schofield and Paul C Zamecnik. Cupric ion catalysis in hydroly- sis of aminoacyl-tRNA. Biochimica et Biophysica Acta (BBA) - Nucleic Acids and Protein Synthesis, ():–, February . ISSN -. doi: http://dx.doi.org/./-()-. URL http://www. sciencedirect.com/science/article/pii/.

[] E Schröck, S du Manoir, T Veldman, B Schoell, J Wienberg, M A Ferguson- Smith, Y Ning, D H Ledbetter, I Bar-Am, D Soenksen, Y Garini, and T Ried. Multicolor Spectral Karyotyping of Human Chromosomes. Science,  ():–, July . doi: ./science....

[] Marina Seabright. A rapid banding technique for human chromosomes. Lancet, ():–, October . URL http://europepmc. org/abstract/MED/.

[] Jonathan Sebat, B Lakshmi, Dheeraj Malhotra, Jennifer Troge, Christa Lese-Martin, Tom Walsh, Boris Yamrom, Seungtai Yoon, Alex Krasnitz, Jude Kendall, Anthony Leotta, Deepa Pai, Ray Zhang, Yoon-Ha Lee, James Hicks, Sarah J Spence, Annette T Lee, Kaija Puura, Terho Lehtimäki, David

 Ledbetter, Peter K Gregersen, Joel Bregman, James S Sutcliffe, Vaidehi Jobanputra, Wendy Chung, Dorothy Warburton, Mary-Claire King, David Skuse, Daniel H Geschwind, T Conrad Gilliam, Kenny Ye, and Michael Wigler. Strong Association of De Novo Copy Number Mutations with Autism. Science, ():–, April . doi: ./science. . URL http://www.sciencemag.org/content///. abstract.

[] Bertrand Ségues, Pascale Saugier Veber, Daniel Rabier, Patrick Calvas, Jean-Marie Saudubray, Brigitte Gilbert-Dussardier, Jean-Paul Bonnefont, and Arnold Munnich. A -base pair in-frame deletion in exon  (del- Glu/) of the ornithine transcarbamylase gene in late-onset hyper- ammonemic coma. Human Mutation, ():–, January . ISSN -.

[] Alexandre Serero, Carmela Giglione, Alessandro Sardini, Juan Martinez- Sanz, and ierry Meinnel. An Unusual Peptide Deformylase Features in the Human Mitochondrial N-terminal Methionine Excision Pathway. Jour- nal of Biological Chemistry, ():–, December . doi: ./jbc.M. URL http://www.jbc.org/content/// .abstract.

[] Hanan E Shamseldin, Muneera Alshammari, Tarfa Al-Sheddi, Mustafa A Salih, Hisham Alkhalidi, Amal Kentab, Gabriela M Repetto, Mais Hashem, and Fowzan S Alkuraya. Genomic analysis of mitochondrial diseases in a consanguineous population reveals novel candidate disease genes. Journal of Medical Genetics, ():–, April . doi: ./ jmedgenet--. URL http://jmg.bmj.com/content/// .abstract.

[] Peidong Shen, Wenyi Wang, Sujatha Krishnakumar, Curtis Palm, Aung- Kyaw Chi, Gregory M Enns, Ronald W Davis, Terence P Speed, Michael N Mindrinos, and Curt Scharfe. High-quality DNA sequence capture of  disease candidate genes. Proceedings of the National Academy of Sciences, ():–, . doi: ./pnas.. URL http: //www.pnas.org/content///.abstract.

[] Jay Shendure, Gregory J Porreca, Nikos B Reppas, Xiaoxia Lin, John P Mc- Cutcheon, Abraham M Rosenbaum, Michael D Wang, Kun Zhang, Robi D Mitra, and George M Church. Accurate Multiplex Polony Sequencing of an Evolved Bacterial Genome. Science, ():–, September . doi: ./science.. URL http://www.sciencemag.org/ content///.abstract.

 [] S T Sherry, M H Ward, M Kholodov, J Baker, L Phan, E M Smigielski, and K Sirotkin. dbSNP: the NCBI database of ge- netic variation. Nucleic Acids Res, ():–, . URL http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd= Retrieve&db=PubMed&dopt=Citation&list_uids=. [] Jared T Simpson, Kim Wong, Shaun D Jackman, Jacqueline E Schein, Steven J M Jones, and İnanç Birol. ABySS: A parallel assembler for short read sequence data. Genome Research, ():–, June . doi: ./gr... URL http://genome.cshlp.org/content// /.abstract. [] Suzanne Sindi, Selim Onal, Luke Peng, Hsin-Ta Wu, and Benjamin Raphael. An integrative probabilistic model for identification of structural variation in sequencing data. Genome Biology, ():R, . ISSN -. URL http://genomebiology.com////R. [] Daniela Skladal, Jane Halliday, and David R orburn. Minimum birth prevalence of mitochondrial respiratory chain disorders in chil- dren . Brain, ():–, August . URL http://brain. oxfordjournals.org/content///.abstract. [] J A Smeitink, O Elpeleg, H Antonicka, H Diepstra, A Saada, P Smits, F Sasarman, G Vriend, J Jacob-Hirsch, A Shaag, G Rechavi, B Welling, J Horst, R J Rodenburg, B van den Heuvel, and E A Shoubridge. Distinct clinical phenotypes associated with a mutation in the mito- chondrial translation elongation factor EFTs. Am J Hum Genet,  ():–, . doi: S-()-[pii]./. URL http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd= Retrieve&db=PubMed&dopt=Citation&list_uids=. [] S Solinas-Toldo, S Lampel, S Stilgenbauer, J Nickolenko, A Benner, H Dohner, T Cremer, and P Lichter. Matrix-based comparative genomic hybridization: biochips to screen for genomic imbalances. Genes Chromo- somes Cancer, ():–, . [] Michael R Speicher, Stephen Gwyn Ballard, and David C Ward. Karyotyping human chromosomes by combinatorial multi-fluor FISH. Nat Genet, (): –, April . [] Angela C Spencer and Linda L Spremulli. Interaction of mitochondrial ini- tiation factor  with mitochondrial fMet-tRNA. Nucleic Acids Research,  ():–, January . doi: ./nar/gkh. URL http: //nar.oxfordjournals.org/content///.abstract.

 [] A Spinazzola, F Invernizzi, F Carrara, E Lamantea, A Donati, M Dirocco, I Giordano, M Meznaric-Petrusa, E Baruffini, I Ferrero, and M Zeviani. Clin- ical and molecular features of mitochondrial DNA depletion syndromes. J Inherit Metab Dis, ():–, . doi: ./s---z. URL http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd= Retrieve&db=PubMed&dopt=Citation&list_uids=.

[] Paweł Stankiewicz and James R Lupski. Structural Variation in the Human Genome and its Role in Disease. Annual Review of Medicine, ():–, January . ISSN -. doi: . /annurev-med--. URL http://dx.doi.org/./ annurev-med--.

[] Hreinn Stefansson, Dan Rujescu, Sven Cichon, Olli P H Pietilainen, Andres Ingason, Stacy Steinberg, Ragnheidur Fossdal, Engilbert Sigurdsson, or- dur Sigmundsson, Jacobine E Buizer-Voskamp, omas Hansen, Klaus D Jakobsen, Pierandrea Muglia, Clyde Francks, Paul M Matthews, Arnaldur Gylfason, Bjarni V Halldorsson, Daniel Gudbjartsson, orgeir E orgeirs- son, Asgeir Sigurdsson, Adalbjorg Jonasdottir, Aslaug Jonasdottir, As- geir Bjornsson, Sigurborg Mattiasdottir, orarinn Blondal, Magnus Har- aldsson, Brynja B Magnusdottir, Ina Giegling, Hans-Jurgen Moller, An- nette Hartmann, Kevin V Shianna, Dongliang Ge, Anna C Need, Caro- line Crombie, Gillian Fraser, Nicholas Walker, Jouko Lonnqvist, Jaana Su- visaari, Annamarie Tuulio-Henriksson, Tiina Paunio, Timi Toulopoulou, Elvira Bramon, Marta Di Forti, Robin Murray, Mirella Ruggeri, Evan- gelos Vassos, Sarah Tosato, Muriel Walshe, Tao Li, Catalina Vasilescu, omas W Muhleisen, August G Wang, Henrik Ullum, Srdjan Djurovic, In- grid Melle, Jes Olesen, Lambertus A Kiemeney, Barbara Franke, Chiara Sabatti, Nelson B Freimer, Jeffrey R Gulcher, Unnur orsteinsdottir, Augustine Kong, Ole A Andreassen, Roel A Ophoff, Alexander Georgi, Marcella Rietschel, omas Werge, Hannes Petursson, David B Goldstein, Markus M Nothen, Leena Peltonen, David A Collier, David St Clair, and Kari Stefansson. Large recurrent microdeletions associated with schizophre- nia. Nature, ():–, September . ISSN -. URL http://dx.doi.org/./naturehttp://www.nature. com/nature/journal/v/n/suppinfo/nature_S.html.

[] P D Stenson, M Mort, E V Ball, K Howells, A D Phillips, N S omas, and D N Cooper. e Human Gene Mutation Database:  update. Genome Med, ():, . URL http://www.ncbi.nlm.nih.gov/entrez/ query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_ uids=.

 [] Timothy Sterne-Weiler, Jonathan Howard, Matthew Mort, David N Cooper, and Jeremy R Sanford. Loss of exon identity is a common mech- anism of human inherited disease. Genome Research, ():–, October . URL http://genome.cshlp.org/content///. abstract.

[] Deborah Stone, Y Ning, Xin-Yuan Guan, Muriel Kaiser-Kupfer, Anthony Wynshaw-Boris, and L Biesecker. Characterization of familial partial p trisomy by chromosomal microdissection, FISH, and microsatellite dosage analysis. Human Genetics, ():–, . ISSN - . doi: ./s. URL http://dx.doi.org/./ s.

[] Karsten M Strauss, L Miguel Martins, Helene Plun-Favreau, Frank P Marx, Sabine Kautzmann, Daniela Berg, omas Gasser, Zbginiew Wszolek, omas Müller, Antje Bornemann, Hartwig Wolburg, Julian Downward, Olaf Riess, Jörg B Schulz, and Rejko Krüger. Loss of function mutations in the gene encoding Omi/HtrA in Parkinson’s disease. Human Molec- ular Genetics, ():–, . doi: ./hmg/ddi. URL http://hmg.oxfordjournals.org/content///.abstract.

[] JeffreyD. Stumpf and WilliamC. Copeland. Mitochondrial DNA replica- tion and disease: insights from DNA polymerase γ mutations. Cellu- lar and Molecular Life Sciences, ():–, . ISSN -X. doi: ./s---. URL http://dx.doi.org/./ s---.

[] Anu Suomalainen. erapy for mitochondrial disorders: Little proof, high research activity, some promise. Seminars in Fetal and Neonatal Medicine, ():–, August . ISSN -X. doi: http://dx.doi. org/./j.siny.... URL http://www.sciencedirect.com/ science/article/pii/SX.

[] Gabriella P Szabó, Alida C Knegt, Anikó Ujfalusi, Erzsébet Balogh, Tamás Szabó, and Éva Oláh. Subtelomeric . Mb trisomy p and . Mb mono- somy q detected by FISH and array-CGH in three related patients. Amer- ican Journal of Medical Genetics Part A, A():–, April . ISSN -. doi: ./ajmg.a.. URL http://dx.doi.org/. /ajmg.a..

[] Beril Talim, Angela Pyle, Helen Griffin, Haluk Topaloglu, Aysegul Tokatli, Michael J Keogh, Mauro Santibanez-Koref, Patrick F Chinnery, and Rita Horvath. Multisystem fatal infantile disease caused by a novel homozygous

 EARS mutation. Brain, ():e–e, February . doi: ./ brain/aws. URL http://brain.oxfordjournals.org/content/ //e.short. [] Barbara Tappino, Roberta Biancheri, Matthew Mort, Stefano Regis, Fabio Corsolini, Andrea Rossi, Marina Stroppiano, Susanna Lualdi, Agata Fiu- mara, Bruno Bembi, Maja Di Rocco, David N Cooper, and Mirella Filo- camo. Identification and characterization of  novel GALC gene muta- tions causing Krabbe disease. Human Mutation, ():E–E, De- cember . ISSN -. doi: ./humu.. URL http: //dx.doi.org/./humu.. [] e International HapMap  Consortium. Integrating common and rare genetic variation in diverse human populations. Nature,  ():–, . doi: nature[pii]./nature. URL http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd= Retrieve&db=PubMed&dopt=Citation&list_uids=. [] Many e Welcome Trust Case Control Consortium. Genome-wide association study of CNVs in , cases of eight common diseases and , shared controls. Nature, ():–, April . ISSN -. URL http://dx.doi.org/./naturehttp: //www.nature.com/nature/journal/v/n/suppinfo/ nature_S.html. [] L omas. e lives of a cell: notes of a biology watcher. Bantam Books, . ISBN . URL http://books.google.com/books? id=QTsNCSLtNAC. [] D R orburn and J Smeitink. Diagnosis of mitochondrial disorders: Clini- cal and biochemical approach. Journal of Inherited Metabolic Disease, (): –, . ISSN -. doi: ./A:. URL http://dx.doi.org/./A:. [] David R orburn, Canny Sugiana, Renato Salemi, Denise M Kirby, Lisa Worgan, Akira Ohtake, and Michael T Ryan. Biochemical and molec- ular diagnosis of mitochondrial respiratory chain disorders. Biochimica et Biophysica Acta (BBA) - Bioenergetics, (–):–, December . ISSN -. doi: http://dx.doi.org/./j.bbabio... . URL http://www.sciencedirect.com/science/article/pii/ S. [] A S Tibbetts. Mammalian Mitochondrial Initiation Factor  Supports Yeast Mitochondrial Translation without Formylated Initiator tRNA. Journal

 of Biological Chemistry, ():–, . doi: ./jbc. M.

[] E J Tucker, A G Compton, and D R orburn. Recent advances in the genetics of mitochondrial encephalopathies. Curr Neurol Neu- rosci Rep, ():–, . doi: ./s---. URL http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd= Retrieve&db=PubMed&dopt=Citation&list_uids=.

[] Elena J. Tucker, Steven G. Hershman, Caroline Köhrer, Casey A. Belcher- Timme, Jinal Patel, Olga A. Goldberger, John Christodoulou, Jonathon M. Silberstein, Matthew McKenzie, Michael T. Ryan, Alison G. Compton, Ja- cob D. Jaffe, Steven A. Carr, Sarah E. Calvo, Uttam L. RajBhandary, David R. orburn, and Vamsi K. Mootha. Mutations in MTFMT Underlie a Hu- man Disorder of Formylation Causing Impaired Mitochondrial Translation, September . URL http://linkinghub.elsevier.com/retrieve/ pii/S.

[] H A Tuppen, F Fattori, R Carrozzo, M Zeviani, S DiMauro, S Seneca, J E Martindale, S E Olpin, E P Treacy, R McFarland, F M San- torelli, and R W Taylor. Further pitfalls in the diagnosis of mtDNA mutations: homoplasmic mt-tRNA mutations. J Med Genet,  ():–, . doi: //[pii]./jmg... URL http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd= Retrieve&db=PubMed&dopt=Citation&list_uids=.

[] Emily H Turner, Sarah B Ng, Deborah A Nickerson, and Jay Shendure. Methods for Genomic Partitioning. Annual Review of Genomics and Hu- man Genetics, ():–, August . ISSN -. doi: ./annurev-genom--. URL http://dx.doi.org/. /annurev-genom--.

[] Eray Tuzun, Andrew J Sharp, Jeffrey A Bailey, Rajinder Kaul, V Anne Morrison, Lisa M Pertz, Eric Haugen, Hillary Hayden, Donna Albertson, Daniel Pinkel, Maynard V Olson, and Evan E Eichler. Fine-scale structural variation of the human genome. Nat Genet, ():–, July . ISSN -. URL http://dx.doi.org/./nghttp:// www.nature.com/ng/journal/v/n/suppinfo/ng_S.html.

[] Scott B Vafai and Vamsi K Mootha. Mitochondrial disorders as windows into an ancient organelle. Nature, ():–, November . ISSN -. URL http://dx.doi.org/./nature.

 [] Rafael Valdés-Mas, Silvia Bea, Diana A Puente, Carlos López-Otín, and Xose S Puente. Estimation of Copy Number Alterations from Exome Se- quencing Data. PLoS ONE, ():e, December . URL http: //dx.doi.org/./journal.pone..

[] G Van Goethem, B Dermaut, A Lofgren, J J Martin, and C Van Broeckhoven. Mutation of POLG is associated with progressive ex- ternal ophthalmoplegia characterized by mtDNA deletions. Nat Genet, ():–, . doi: ./[pii]. URL http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd= Retrieve&db=PubMed&dopt=Citation&list_uids=.

[] U Varshney, C P Lee, and U L RajBhandary. Direct analysis of aminoacy- lation levels of tRNAs in vivo. Application to studying recognition of Es- cherichia coli initiator tRNA mutants by glutaminyl-tRNA synthetase. Jour- nal of Biological Chemistry, ():–, December . URL http://www.jbc.org/content///.abstract.

[] Valeria Vasta, Sarah Ng, Emily Turner, Jay Shendure, and Si Houn Hahn. Next generation sequence analysis for mitochondrial disorders. Genome Med, ():, . URL http://genomemedicine.com/content// /.

[] E S Venkatraman and Adam B Olshen. A faster circular binary seg- mentation algorithm for the analysis of array CGH data. Bioinformat- ics, ():–, March . doi: ./bioinformatics/btl. URL http://bioinformatics.oxfordjournals.org/content/// .abstract.

[] Lionel Vial, Pilar Gomez, Michel Panvert, Emmanuelle Schmitt, Sylvain Blanquet, and Yves Mechulam. Mitochondrial Methionyl-tRNAfMet Formyltransferase from Saccharomyces cerevisiae:￿ Gene Disruption and tRNA Substrate Specificity. Biochemistry, ():–, January . ISSN -. doi: ./bix. URL http://dx.doi.org/. /bix.

[] I Visapaa, V Fellman, J Vesa, A Dasvarma, J L Hutton, V Kumar, G S Payne, M Makarow, R Van Coster, R W Taylor, D M Turnbull, A Suoma- lainen, and L Peltonen. GRACILE syndrome, a lethal metabolic disorder with iron overload, is caused by a point mutation in BCSL. Am J Hum Genet, ():–, . doi: S-()-[pii]./ . URL http://www.ncbi.nlm.nih.gov/entrez/query.fcgi? cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=.

 [] John E Walker, Joe Carroll, Matthew C Altman, and Ian M Fearnley. Chap- ter  Mass Spectrometric Characterization of the irteen Subunits of Bovine Respiratory Complexes that are Encoded in Mitochondrial DNA. In William S Allison in Enzymology and Immo E Scheffler B T Meth- ods, editors, Mitochondrial Function, Part A: Mitochondrial Electron Trans- port Complexes and Reactive Oxygen Species, volume Volume , pages – . Academic Press, . ISBN -. doi: http://dx.doi.org/. /S-()-. URL http://www.sciencedirect.com/ science/article/pii/S.

[] Jianmin Wang, Charles G Mullighan, John Easton, Stefan Roberts, Sue L Heatley, Jing Ma, Michael C Rusch, Ken Chen, Christopher C Harris, Li Ding, Linda Holmfeldt, Debbie Payne-Turner, Xian Fan, Lei Wei, David Zhao, John C Obenauer, Clayton Naeve, Elaine R Mardis, Richard K Wilson, James R Downing, and Jinghui Zhang. CREST maps somatic structural variation in cancer genomes with base-pair resolution. Nat Meth, ():–, August . ISSN -. URL http://dx.doi.org/./nmeth.http: //www.nature.com/nmeth/journal/v/n/abs/nmeth..html# supplementary-information.

[] Jing Wang, Hongli Zhan, Fang-Yuan Li, Amber N Pursley, Eric S Schmitt, and Lee-Jun Wong. Targeted array CGH as a valuable molecular diagnos- tic approach: Experience in the diagnosis of mitochondrial and metabolic disorders. Molecular Genetics and Metabolism, ():–, June . ISSN -. doi: http://dx.doi.org/./j.ymgme... . URL http://www.sciencedirect.com/science/article/pii/ S.

[] Liang Wang, Shannon K McDonnell, David A Elkins, Susan L Slager, Eric Christensen, Angela F Marks, Julie M Cunningham, Brett J Peterson, Steven J Jacobsen, James R Cerhan, Michael L Blute, Daniel J Schaid, and Stephen N ibodeau. Role of HPC/ELAC in Hereditary Prostate Can- cer. Cancer Research, ():–, September . URL http: //cancerres.aacrjournals.org/content///.abstract.

[] Zefeng Wang, Michael E Rolish, Gene Yeo, Vivian Tung, Matthew Maw- son, and Christopher B Burge. Systematic Identification and Analysis of Exonic Splicing Silencers, December . URL http://linkinghub. elsevier.com/retrieve/pii/S.

[] J Wiegant, W Kalle, L Mullenders, S Brookes, J M N Hoovers, J G Dauwerse, G J B van Ommen, and A K Raap. High-resolution in situ hybridization

 using DNA halo preparations. Human Molecular Genetics, ():–, October . doi: ./hmg/...

[] Jiantao Wu, KrzysztofR Grzeda, Chip Stewart, Fabian Grubert, AlexanderE Urban, MichaelP Snyder, and GaborT Marth. Copy Number Variation de- tection from  Genomes project exon capture sequencing data. BMC Bioinformatics, ():–, . doi: ./---. URL http://dx.doi.org/./---.

[] Chao Xie and Martti Tammi. CNV-seq, a new method to detect copy number variation using high-throughput sequencing. BMC Bioinformatics, ():, .

[] Kai Ye, Marcel H Schulz, Quan Long, Rolf Apweiler, and Zemin Ning. Pindel: a pattern growth approach to detect break points of large deletions and medium sized insertions from paired-end short reads. Bioinformatics, ():–, November . doi: ./bioinformatics/btp. URL http://bioinformatics.oxfordjournals.org/content// /.abstract.

[] Yvonne Y C Yeap, Ivan H W Ng, Bahareh Badrian, Tuong-Vi Nguyen, Yan Y Yip, Amardeep S Dhillon, Steven E Mutsaers, John Silke, Marie A Bogoyevitch, and Dominic C H Ng. c-Jun N-terminal kinase/c-Jun in- hibits fibroblast proliferation by negatively regulating the levels of stath- min/oncoprotein , .

[] Seungtai Yoon, Zhenyu Xuan, Vladimir Makarov, Kenny Ye, and Jonathan Sebat. Sensitive and accurate detection of copy number variants using read depth of coverage. Genome Research, ():–, September . doi: ./gr... URL http://genome.cshlp.org/content/ //.abstract.

[] N A Zaghloul, Y Liu, J M Gerdes, C Gascue, E C Oh, C C Leitch, Y Bromberg, J Binkley, R L Leibel, A Sidow, J L Badano, and N Katsanis. Functional anal- yses of variants reveal a significant role for dominant negative and common alleles in oligogenic Bardet-Biedl syndrome. Proc Natl Acad Sci U S A,  ():–, . doi: [pii]./pnas.. URL http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd= Retrieve&db=PubMed&dopt=Citation&list_uids=.

[] Bruno Zeitouni, Valentina Boeva, Isabelle Janoueix-Lerosey, Sophie Loeil- let, Patricia Legoix-né, Alain Nicolas, Olivier Delattre, and Emmanuel Baril- lot. SVDetect: a tool to identify genomic structural variations from paired-

 end and mate-pair sequencing data. Bioinformatics, ():–, Au- gust . doi: ./bioinformatics/btq.

[] Feng Zhang, Le Cong, Simona Lodato, Sriram Kosuri, George M Church, and Paola Arlotta. Efficient construction of sequence-specific TAL effectors for modulating mammalian transcription. Nat Biotech, ():–, February . ISSN -. URL http: //dx.doi.org/./nbt.http://www.nature.com/nbt/ journal/v/n/abs/nbt..html#supplementary-information.

