Molecular Psychiatry (2013) 18, 141 -- 153 & 2013 Macmillan Publishers Limited All rights reserved 1359-4184/13 www.nature.com/mp

REVIEW A new paradigm emerges from the study of de novo mutations in the context of neurodevelopmental disease

CS Ku,1, C Polychronakos2,EKTan3,4, N Naidoo1, Y Pawitan5, DH Roukos6,7, M Mort8 and DN Cooper8

The study of de novo point mutations (new germline mutations arising from the gametes of the parents) remained largely static until the arrival of next-generation sequencing technologies, which made both whole-exome sequencing (WES) and whole-genome sequencing (WGS) feasible in practical terms. Single nucleotide polymorphism genotyping arrays have been used to identify de novo copy-number variants in a number of common neurodevelopmental conditions such as schizophrenia and autism. By contrast, as point mutations and microlesions occurring de novo are refractory to analysis by these microarray- based methods, little was known about either their frequency or impact upon neurodevelopmental disease, until the advent of WES. De novo point mutations have recently been implicated in schizophrenia, autism and mental retardation through the WES of case-parent trios. Taken together, these findings strengthen the hypothesis that the occurrence of de novo mutations could account for the high prevalence of such diseases that are associated with a marked reduction in fecundity. De novo point mutations are also known to be responsible for many sporadic cases of rare dominant Mendelian disorders such as Kabuki syndrome, Schinzel--Giedion syndrome and Bohring--Opitz syndrome. These disorders share a common feature in that they are all characterized by intellectual disability. In summary, recent WES studies of neurodevelopmental and neuropsychiatric disease have provided new insights into the role of de novo mutations in these disorders. Our knowledge of de novo mutations is likely to be further accelerated by WGS. However, the collection of case-parent trios will be a prerequisite for such studies. This review aims to discuss recent developments in the study of de novo mutations made possible by technological advances in DNA sequencing.

Molecular Psychiatry (2013) 18, 141--153; doi:10.1038/mp.2012.58; published online 29 May 2012 Keywords: de novo mutation; exome sequencing; Mendelian disorders; next-generation sequencing; neurodevelopmental disease

INTRODUCTION each generation, their incidence in the entire is Germline mutations arise anew in each generation during meiosis. difficult to estimate owing to the lack of appropriate tools. Further, Such spontaneously occurring lesions are termed de novo muta- regional differences in the mutation rate across the genome (for tions that is, heritable mutations that neither parent possessed or example, coding versus non-coding regions, or evolutionarily transmitted. De novo mutations refer to mutations that arose conserved versus non-conserved regions) and between families in the gametes of the parents. They are not easily distinguished and populations have been largely refractory to analysis. Hence, from the new/de novo mutations that arise somatically in the the relative contributions of de novo mutations from the paternal offspring, as both types of mutations are undetectable in the and maternal sets of in offspring have until recently 1 parents. To avoid confusion, de novo is used here specifically to also remained largely unclear. Finally, the molecular features or denote mutations that have arisen from the gametes of the factors that influence the de novo mutation rate in the human parents during meiosis. The presence of variants not found in a genome are subject to speculation but might include chromoso- given individual’s parents may also be due to postzygotic mal location, length of coding sequence, number and length of mutations that have attained a high level of somatic mosaicism introns, nucleotide composition (for example, GC level), sequence either because they occurred at an early stage of embryonic repetitivity both within and flanking the , and the efficiency 2 development or because they conferred a survival or proliferation of DNA repair mechanisms. advantage upon the cells harboring them. As current methodol- Our inability to identify de novo mutations both genome wide ogy cannot distinguish between the two instances, they will be in a given individual or within specific populations has precluded discussed together under the de novo term. the estimation of the proportions of mutations, respectively, As with inherited mutations, de novo mutations range in size predicted to be neutral (genetic drift), advantageous (positive from single nucleotide variants (SNVs)/point mutations and other selection) and deleterious (purifying selection) using available 3--5 microlesions to much larger copy-number variants (CNVs) and bioinformatic tools such as SIFT, PolyPhen and ANNOVAR. structural rearrangements. Although they arise spontaneously in However, these prediction tools have comparatively high error

1Saw Swee Hock School of Public Health, National University of Singapore, Singapore, Singapore; 2Department of Pediatrics/Human Genetics, McGill University Health Center, Montreal, QC, Canada; 3Department of Neurology, National Neuroscience Institute, Singapore, Singapore; 4Duke-National University of Singapore Graduate Medical School, Singapore General Hospital, Singapore, Singapore; 5Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden; 6Department of Surgery, Ioannina University School of Medicine, Ioannina, Greece; 7Biosystems and Synthetic Genomic Network Medicine Center, Ioannina University, Ioannina, Greece and 8Institute of Medical Genetics, School of Medicine, Cardiff University, Cardiff, UK. Correspondence: Dr CS Ku, Saw Swee Hock School of Public Health, National University of Singapore, 14 Medical Drive, Singapore 117599, Singapore or DN Cooper, Institute of Medical Genetics, School of Medicine, Cardiff University, Cardiff, UK. E-mail: [email protected] or [email protected] Received 21 November 2011; revised 22 March 2012; accepted 13 April 2012; published online 29 May 2012 De novo mutations in inherited disease CS Ku et al 142 rates. For example, the sensitivity of SIFT and PolyPhen was trios.12,17--19 In a similar vein, de novo point mutations have also reasonably high that is, B70%, but their specificity was low that is, been found to be responsible for sporadic cases of various rare o20%.6 This suggests that although SIFT and PolyPhen may be dominant Mendelian disorders such as Kabuki syndrome, Schin- useful in prioritizing variants that are likely to be deleterious zel--Giedion syndrome and Bohring--Opitz syndrome.20--22 These resulting in a loss-of- function, their low specificity also Mendelian disorders are also characterized by multiple neurode- requires that their predictions should be interpreted with a degree velopmental defects; thus, for example, Bohring--Opitz syndrome of caution. The specificity of the functional prediction might is a clinically recognizable syndrome characterized by severe be increased by applying more stringent criteria, for example, by intellectual disability, distinctive facial features and multiple con- seeking consistency in prediction of deleterious mutations genital malformations.22 by several different algorithms and/or by incorporating evolu- In this review, we appraise the recent paradigm shift in the tionary conservation data. However, further evidence beyond the study of de novo mutations as a consequence of the availability in silico predictions is invariably needed to support or exclude the of new sequence data emerging through technological pathological significance of newly detected missense variants. advances in sequence capture methods and NGS technologies, In addition to the need for case-parent trios, studying de novo which have potentiated both WES and WGS. We also review the mutations on a genome-wide scale requires a whole-genome identification of de novo point mutations in various neurode- screening tool. Clearly, comparison with the parental genomes is velopmental and neuropsychiatric conditions by means of WES. necessary to distinguish from inherited variants. Although the Finally, we discuss future directions in the study of de novo conventional gene-centric approach of limited Sanger sequencing mutations. is capable of identifying de novo point mutations, this method cannot realistically be scaled up to the entire genome, exome or even to thousands of . The targeted sequencing of THE TECHNOLOGICAL ADVANCES THAT LED TO THE EMER- candidate genes has been successful in identifying numerous GENCE OF A NEW PARADIGM IN THE STUDY OF DE NOVO de novo point mutations underlying a variety of neurodevelop- MUTATIONS mental conditions.7,8 However, it is a hypothesis-driven approach. Owing to technical limitations, it used not to be possible to study The selection of candidate genes is highly dependent upon an a the occurrence and pathological relevance of de novo mutations priori hypothesis, which is in turn constrained by our still limited in the context of the entire human genome. This all changed with knowledge of the biological pathways that characterize the technological advances in oligonucleotide-comparative genomic diseases in question. Most of the candidate genes were selected hybridization and SNP genotyping arrays that enabled high- on the basis of their biological plausibility; as a result, this will have resolution CNV analysis (compared with lower resolution methods rendered unlikely the discovery of new or unexpected genes/ such as traditional cytogenetic analysis and bacterial artificial pathways affected by de novo mutations underlying these clone microarrays), thereby providing us with the diseases. Although the availability of multiple oligonucleotide- first opportunity to perform relatively unbiased genome-wide comparative genomic hybridization and single nucleotide poly- surveys of these mutations in a variety of different disease morphism (SNP) genotyping arrays has provided new opportu- states.23--25 High-density SNP genotyping arrays have been the nities to study de novo events at the whole-genome level, the mainstay of the genome-wide association studies that have application of this methodology is limited to the analysis of sought to identify associations of SNPs with various complex CNVs.9,10 By contrast, the arrival of next-generation sequencing diseases.26 It also became apparent that these arrays would allow (NGS) technologies has revolutionized the detection and analysis CNVs to be investigated on the basis of their signal intensity. Two of microlesions occurring de novo, allowing first estimations of the types of signal intensity data are generated that is, (a) the total mutation rate as well as a glimpse of the spatial distribution of de signal intensity (or log ratio) and (b) the allelic intensity ratio or novo mutations in the human genome11 and their association with B allele frequency. The combination of log ratio and B allele heritable disease.12 Thus, the whole-genome sequencing (WGS) frequency can be used to determine several different copy-number approach (by mapping or aligning sequence reads against a states such as homozygous and heterozygous deletions, and reference genome) should lead to the identification of different one-copy or two-copy duplications, as implemented in multiple types of de novo mutation within case-parent trios. However, CNV-calling algorithms.27,28 Figure 1 illustrates the difference in the comprehensive identification of de novo mutations would log ratio and B allele frequency patterns between a deletion and a necessarily require a de novo genome assembly approach13 duplication. where the sequence reads are assembled into contigs that are Copy-number variation is one of the most intriguing types of then compared between genomes to identify variants. WGS human genetic polymorphism discovered over the past few years; studies performed to date have employed the mapping it generally refers to deletions or duplications that are typically approach, both because of the short sequence reads generated larger than 1 kb in size.29--31 Many studies have now been per- but also because of the current lack of powerful de novo formed to examine the associations of CNVs with complex assembly tools.14--16 diseases, using SNP genotyping arrays. These studies have yielded Owing to technical limitations, the contribution of de novo somewhat mixed results. The study performed by the Wellcome mutations to human genetic disease (both Mendelian disorders Trust Case Control Consortium, which investigated the association and complex diseases) has remained largely unexplored. De novo between B3400 common CNVs and 8 complex diseases (that is, mutations could have severe biological or phenotypic conse- bipolar disorder, breast cancer, coronary artery disease, Crohn’s quences when they affect functionally important nucleotides in disease, hypertension, rheumatoid arthritis, type 1 diabetes and thegenome(forexample,nonsense mutations leading to truncated type 2 diabetes) in 19 000 samples, served as a proof of principle ) or when they occur in evolutionarily conserved regions. and did not itself yield any novel discoveries.32 By contrast, However, only a relatively small proportion of the de novo numerous CNVs have been identified for various neurodevelop- mutations occurring at each meiosis will affect functionally mental and neuropsychiatric disorders such as autism/autism important nucleotides. Although de novo CNVs have been shown, spectrum disorders (ASD),9,33 schizophrenia,10,34--36 mental retar- using whole-genome SNP genotyping arrays, to be associated dation37 as well as bipolar disorder.38 More importantly, de novo with various common neurodevelopmental diseases,9,10 it was not CNVs have also been discovered in the context of several neuro- until very recently that de novo point mutations were also developmental diseases (discussed below). The paucity of CNV implicated in the etiology of schizophrenia, autism and mental associations with non-neurodevelopmental diseases is currently retardation through the whole-exome sequencing (WES) of affected unclear. However, it should be noted that common CNVs were the

Molecular Psychiatry (2013), 141 -- 153 & 2013 Macmillan Publishers Limited De novo mutations in inherited disease CS Ku et al 143 a

Log ratio

b

B allele frequency

One copy Normal One copy duplication diploid copy deletion

Figure 1. Different patterns of signal intensity of CNVs (copy-number variations; one-copy duplication and one-copy deletion) for oligonu- cleotide-comparative genomic hybridization (CGH) and short nucleotide polymorphism (SNP) genotyping arrays. These arrays have been used to detect CNVs; however, oligonucleotide-CGH generated only one type of signal intensity data that is, log ratio compared with the SNP genotyping array where both log ratio and B allele frequency (BAF) were generated. (a) In array CGH, the signal intensity ratio between a test sample and a reference sample (one sample or a pool of samples can be used as a reference) is normalized and converted to a log2 ratio, which acts as a proxy for copy number. An increased log2 ratio represents a gain in copy number in the test compared with the reference; conversely, a decrease indicates a loss in copy number. SNP array generates a similar metric by comparing the signal intensity of the sample being analyzed to a collection of reference signal intensities. (b) By contrast, BAF was generated only by SNP array. BAF is the proportion of the total allele signal (A þ B) explained by a single allele (A) and can be interpreted as follows: a BAF of 0 represents the genotype (A/A or A/À), whereas 0.5 represents (A/B) and 1 represents (B/B or B/À). Different BAF values occur for one-copy duplication and one-copy deletion compared with normal haploid copy. main focus of the Wellcome Trust Case Control Consortium study. detection of inherited SNVs. In addition, SNV detection is very In sharp contradistinction, it has been de novo and rare CNVs that sensitive to the mapping quality, as false positives could result have been found to be associated with various neurodevelop- from misalignment of the reads to the reference genome. mental disorders. Further, rare CNVs have not been widely tested Misalignment of reads is attributed to the fact that the human for their associations with complex non-neurodevelopmental genome has a significant portion of homologous regions, in diseases; this could be because of the concepts encapsulated in particular paralogous loci. However, these alignment artifacts can the common disease-common variant model. The notion that be reduced in number by using paired-end sequencing (together neurodevelopmental diseases could be subject to stronger puri- with a longer sequence read length, for example, 100 bp) as there fying selection pressure is somewhat speculative; hence, the is a very low likelihood with this approach that misalignment underlying causal mutations are individually rare or could have occurs for both reads. In addition, the misalignment can also be arisen de novo, but collectively contribute to the commonness of alleviated by scoring the SNVs according to the uniqueness the diseases. of their genomic locations. The UCSC genome browser provides Although some success has been achieved in the study of de a tool that allows all genomic regions to be assigned to their novo CNVs by means of SNP genotyping arrays, de novo point corresponding homologous regions if they exist (UCSC Self Chain mutations and microlesions have until recently remained refrac- Track). tory to analysis. The advent of NGS technologies has provided new Thus, the massive throughput of NGS technologies has enabled opportunities to explore this latter type of mutation. The output WES and WGS to become technically feasible. NGS is a very of these NGS platforms is a very large number of sequence reads powerful tool to study SNVs, as it has a fairly high sensitivity and (up to hundreds of millions). The read lengths vary from B100 bp specificity rate (490%) for detecting SNVs. Although the reported (Illumina Genome Analyzer (Illumina, San Diego, CA, USA)/HiSeq or specificity rate for small indels and structural rearrangements is ABI SOLiD, Life Technologies, Grand Island, NY, USA) to several much lower, B30--70%, sensitivity is difficult to estimate as the hundreds of base pairs (Roche 454 GS FLX, Roche Diagnostics, true number of variants is unknown.20,41--43 Thus, both WES and Mannheim, Germany).39,40 The designs of the sequencing experi- WGS in their current forms are good enough to study de novo ments are such that each nucleotide (either in the exome or whole point mutations, but probably not as yet good enough for other genome) is represented in a large number of different reads, types of lesions. These NGS technologies, when coupled with typically at least 30-fold on average (depth of sequencing). sequence-enrichment methods and applied to case-parent trio Naturally, the coverage of specific sequences varies around this samples, have revolutionized the way in which we identify the average, with the minimum coverage required to call a variant de novo point mutations, and to a lesser extent other mutation depending upon homozygosity versus heterozygosity and on the types through WES.1,12,17--19 Indeed, these WES studies have error rate of the method employed. Although the raw base-calling used the commercial whole-exome enrichment kits coupled error rate is much higher for all NGS technologies than for Sanger with NGS technologies (Table 1). The lower cost of WES, and the sequencing, this may be remedied by increasing the depth of fact that it is less analytically challenging compared with the sequencing to improve the consensus accuracy. However, a WGS approach, are probably the major reasons for the wide minimum of eight-fold coverage is deemed sufficient for accurate acceptance of WES.

& 2013 Macmillan Publishers Limited Molecular Psychiatry (2013), 141 -- 153 De novo mutations in inherited disease CS Ku et al 144 Table 1. Summary of exome sequencing studies---experimental designs

Sequence enrichment/ sequencing Disease/study platforms Sample size and characteristics Sequencing statistics

Mental Agilent Exomes of 10 case-parent trios. Obtained 3.1 Gb of mappable sequence data per retardation SureSelect Human All cases had moderate-to-severe mental individual after exome enrichment (37 Mb of Vissers et al.12 Exome Kit retardation and a negative family history. genomic sequence targeting B18 000 genes). Life Technologies Array-based genomic profiling did not reveal On average, 79.6% of the bases originated from SOLiD3 de novo or other CNVs. the targeted exome, with 90% of the targeted exons covered at least ten times. The median exon coverage was 42-fold. Schizophrenia Agilent Exomes of 53 family trios of subjects diagnosed with Obtained 7.3 Gb of mappable sequence data per Xu et al.18 SureSelect Human schizophrenia or schizo-affective disorder with no individual after exome enrichment, targeting 37 Mb All Exon Target history of the disease in a first- or second-degree from exons and their flanking regions. Enrichment relative as well as family trios of 22 unrelated healthy 97.9% of the reads were properly aligned to the Illumina controls. reference genome. HiSeq2000 Excluded carriers of rare de novo CNVs. Median read depth was 65.2 Â . 92.4% of the captured target exons were covered by high quality genotype calls at least eight times to ensure good detection sensitivity. Schizophrenia Agilent Exomes from 14 trios, with each trio consisting of an An average of B56 million reads were available for Girard et al.19 SureSelect All individual with schizophrenia and his or her parents. each individual, of which B50% aligned to the Exome Kit v.1 None of the selected probands had a first- or targeted regions. Illumina Genome second-degree family history of psychotic disorders. An average of B72% of targeted regions covered Analyzer IIx Examined genomic DNA from every selected trio with a read depth 420 Â . platform using a CNV-targeted array to exclude potentially causative structural genetic variations. Autism Roche SeqCap EZ 20 trios with an idiopathic ASD, each consistent with Obtained sufficient coverage to call variants for spectrum Exome probes v.1.0 a sporadic ASD. B90% of the primary target (26.4 Mb). disorders Illumina Genome Each family was initially screened by array- Genotype concordance with SNP microarray data (ASDs) Analyzer IIx comparative genomic hybridization. was 99.7% and on average. O’Roak et al.17 Identified no large (4250 kb) de novo CNVs but did identify a maternally inherited deletion (B350 kb) at 15q11.2 in one family (this deletion has been associated with an increased risk of epilepsy and schizophrenia but has not been considered causal for autism). Abbreviations: ASDs, Autism spectrum disorders; CNVs, copy-number variants.

WES refers to the sequencing of the entire set of exons tools. The alignment steps are then followed by variant calling and (B200 000 exons) in the genome. It requires several sequence- filtering to remove false positives, and variant annotation to enrichment steps followed by massively parallel sequencing. The obtain information such as genomic position. Although most of development of commercial whole human exome enrichment kits the studies have adopted the alignment or mapping approach, it has been largely responsible for making this approach feasible in is widely anticipated that with rapid developments in sequencing practical terms. During the enrichment steps, the genomic regions technologies designed to produce longer sequence read lengths48 of interest (that is, all exons) are captured through hybrid selection and more powerful analytical tools,14 de novo genome assembly of DNA fragments using oligonucleotide probes, whereas the will become even easier.13 In de novo genome assembly, the unwanted DNA sequences (that is, the non-coding regions) are sequence reads are assembled into contigs by bioinformatic removed before sequencing, leading to a significant reduction means without mapping against a reference genome sequence; as in the proportion of the genome that needs to be sequenced. such, this process of assembly has the potential to identify new or The difference between on-solid and in-solution capture methods previously unknown sequences (and hence the variants within is that the oligonucleotide probes are tethered on microarray or these sequences). It follows that de novo genome assembly should are suspended in-solution, respectively. The performance of different make possible the more thorough and potentially comprehensive commercial exome-enrichment methodologies has been evaluated, detection of the various types of genetic variant, including those with pros and cons associated with each of the methods.44--47 that have occurred de novo. Figure 2 illustrates a general workflow of WES from genomic DNA extraction to biological interpretation and the identification of the causal mutation. However, in the context of de novo DE NOVO MUTATION RATES mutations, intersecting with the exome data from parents is Recent studies have estimated the per-generation mutation rate in needed to distinguish them from inherited variants. humans to be between 7.6 Â 10À9 and 2.2 Â 10À8 (see refs. 11, 49). An inherent limitation of NGS technologies (particularly Illumina This equates to an average of 50--100 de novo mutations in a GA/HiSeq and ABI SOLiD) is, however, the short length of the newborn genome, which corresponds to B0.86 de novo amino- sequence reads, which therefore needs to be computationally acid-altering mutations. The genome-wide CNV mutation rate has aligned to the reference genome using conventional alignment also been estimated using SNP microarrays; at a resolution of o30 kb,

Molecular Psychiatry (2013), 141 -- 153 & 2013 Macmillan Publishers Limited De novo mutations in inherited disease CS Ku et al 145 1. Genomic DNA extracted from peripheral blood sequenced to 422-fold mapped depth during the pilot phase of the 1000 Genomes Project;51 a total of 49 and 35 germline de novo mutations were identified in two parent-offspring (CEU and YRI) 1 2. Sequencing library preparation trios, respectively. In cellula and somatic mutations may have adversely affected published estimates of de novo mutation rates. To take these 3. Exome capture and enrichment artifacts into consideration, Conrad et al1 performed validation of

Sequencing all putative de novo mutations detected to unambiguously distinguish germline de novo mutations from somatic or in cellula

Sample Preparation and Sample Preparation 4. Exome sequencing to sufficient coverage mutations. Thus, the de novo mutations were classified into five i.e., average coverage ≥30 categories that is, (a) germline de novo mutations, (b) non-germ- line (somatic or arising in cellula) de novo mutations, (c) inherited variant, (d) false positive and (e) inconclusive. In addition to the 5. Data processing to raw sequence reads germline de novo mutations, through validation and classification, the study also detected 952 and 643 non-germline de novo mutations in the CEU and YRI trios, respectively. The observed 6. Reads alignment to reference genome using ratio of non-germline versus germline de novo mutations was conventional alignment tools (ELAND, BWA etc) substantially greater than the 1:1 ratio reported previously. This Processing Primary Data difference could be because of the age of the cell lines (number of 7. PCR duplicates removal passages), the mutagenicity of the cell culture conditions and/or the clonality of the cell lines. However, to ensure that mutations are truly germline and occurred in the parents, a three-generation 8. Variant calling and filtering using commonly applied study is required to demonstrate that the de novo mutations in the quality control criteria (Samtools etc) second generation have been transmitted to the third generation. Alternatively, DNA from another tissue for example, skin (assum- ing DNA from peripheral blood was used for the analysis of de 9. Variant annotation (dbSNP, 1000GP etc.) novo mutations) in the offspring needs to be tested to discern

Processing whether the mutations that were absent in parents are germline

Secondary Data 10. Biological interpretation and identification of causal de novo or somatically acquired. If the former, theoretically these mutation mutations should also be observed in another constitutional tissue, otherwise they are likely to have been somatically acquired. Figure 2. A general workflow of whole-exome sequencing (WES) Further, in addition to the estimation of the germline de novo from genomic DNA extraction to biological interpretation and the mutation rate, differences in parental transmission were also identification of the causal mutation. The workflow is divided into observed. In one family, 92% of the de novo mutations originated three phases, that is, sample preparation and sequencing (steps 1-- from the paternal germline, whereas the equivalent figure in the 4), primary data processing (steps 5--7) and secondary data pro- 1 cessing (steps 8--10). The genomic DNA is used for sequencing li- other family was 36% of de novo mutations. In future, an in-depth brary preparation involving several steps such as DNA investigation of the variation in mutation rates within and fragmentation and adapter ligation (steps 1 and 2). Exome capture between families/populations will require a larger sample size. and enrichment is usually performed using commercial kits such as The estimation of the de novo mutation rate using WES could be the Agilent (Agilent Technologies, Santa Clara, CA, USA), Illumina, affected by the incomplete capture of the known exons. However, and Nimblegen (NimbleGen Systems, Madison, WI, USA) human the implications for the de novo mutation rate calculations will exome capture kits. WES is then performed using high-throughput depend upon whether the incomplete capture of the exons is next-generation sequencing (NGS) technologies such as Illumina random or systematic (for example, exons with low/high GC GA/HiSeq, Life Technologies SOLiD (Life Technologies), and Roche 454 Genome Sequencer. These steps are performed according to content that have been reported to have lower capture efficiency, the manufacturer’s recommended protocols. An average coverage or exons with a specific sequence feature such as highly repetitive of B30-fold is deemed sufficient for the accurate calling of variants sequences). The former is unlikely to affect the estimation of the (steps 3 and 4). Subsequently, the data from NGS platforms are de novo mutation rate significantly. By contrast, the effects of processed to raw sequence reads (step 5). These reads are then exons with low/high GC content on the estimation of the de novo aligned to the reference genome using conventional alignment mutation rate will depend upon whether this specific group of tools. PCR duplicates, which introduce noise to variant calling, are exons has a significantly lower/higher than average de novo removed (steps 6 and 7). The analytical pipeline is then followed by mutation rate as compared with the other exons. Unfortunately, variant calling and filtering to remove false positives according to there are no empirical data available to demonstrate this one commonly applied quality control criteria (step 8). Finally, the var- iants are annotated to obtain information such as genomic position way or the other. Thus, the consequences of the incomplete and functional effect (for example, missense and nonsense variants; capture of exons on estimates of the de novo mutation rate remain step 9), before being examined to identify the causal mutations speculative. (step 10). DE NOVO MUTATIONS IN NEURODEVELOPMENTAL DISOR- DERS Itsara et al.50 observed nine de novo CNVs from 772 transmissions, Rett syndrome: a classical example of a neurodevelopmental corresponding to a mutation rate of 1.2 Â 10À2 CNVs per genome disorder affected by de novo mutations per transmission. However, as yet, a direct comparison of male A classical example of a neurodevelopmental disorder caused by with female germline mutation rates with complete genome de novo mutations is Rett syndrome. It is a progressive neuro- sequences has not been performed. Thus, the gender-specific developmental disorder and one of the most common causes of mutation rates still remain to be formally ascertained. Complete mental retardation. The syndrome is transmitted as an X-linked genomes from two parent-offspring trios (from the HapMap CEU dominant trait, and therefore almost exclusively affects females. (Utah residents with Northern and Western European ancestry) Further, the syndrome is sporadic in most cases.52 One of the first and YRI (Yoruba people in Ibadan, Nigeria) populations) were pieces of evidence showing the importance of de novo mutations

& 2013 Macmillan Publishers Limited Molecular Psychiatry (2013), 141 -- 153 De novo mutations in inherited disease CS Ku et al 146 in Rett syndrome was the identification of a de novo-balanced carry at least one de novo mutational event. It is noteworthy that translocation between chromosomes X and 3 (with the chromo- de novo SNVs (four non-synonymous and three synonymous some constitution 46,X,t(X;3)(p22.11;q13.31)).53 This genetic mutations) were also identified in the control subjects.18 This evidence is in line with the hypothesis of an X-linked dominant suggests that identifying the pathogenic and deleterious de novo mutation. The mode of inheritance of Rett syndrome as an mutations from among a background of non-disease-associated X-linked dominant disorder was further supported by the identifi- SNVs is likely to be a tricky task. However, the finding of a cation of mutations in MECP2.54 More interestingly, in 5 of 21 significantly larger than expected number of de novo mutations in sporadic patients, three de novo missense mutations were found schizophrenia probands appears to be supported by the results in the region encoding the highly conserved methyl-binding of another study.19 It could, of course, be that this excess of domain, as well as a de novo frameshift and a de novo nonsense de novo mutations is simply a proxy indicator of increased mutation, both of which disrupted the transcription repression genome-wide mutability and that only one mutation, possibly not domain. Approximately 80% of Rett syndrome cases are sporadic even one within the exome screened, in each case is responsible and caused by mutations in the MECP2 gene. Almost all mutations for the disease. Tables 1 and 2 summarize the (a) experimental in MECP2 occur de novo, suggesting that such mutations are designs and (b) analysis pipeline and major findings, respectively, largely responsible for the almost invariable occurrence of this for the WES studies. highly deleterious disorder in females and presumed lethality In a parallel development, de novo mutations were also found to in males.52,54--56 be enriched in ASD in a WES study of 20 sporadic ASD cases and their parents; 18 de novo-coding mutations were identified, of which 11 were protein altering.17 Potentially causative de novo Common neurodevelopmental disorders mutations were identified in the FOXP1, GRIN2B, SCN1A and LAMC3 A comprehensive summary of de novo mutations in rare Mendelian genes in four ASD probands who were among the most severely forms of neurodevelopmental disorders is beyond the scope of affected individuals studied.17 Interestingly, de novo mutations in this review. Instead, we focus on the importance of de novo some of these genes have already been previously found in mutations in common neurodevelopmental and neuropsychiatric association with a variety of neurodevelopmental phenotypes. For disorders such as schizophrenia, autism and mental retardation. example, in an earlier study, which sequenced the GRIN2B gene in It has been postulated that the occurrence of de novo mutations 468 individuals with mental retardation, four de novo mutations might explain why diseases characterized by dramatically reduced in the gene were identified.67 Similarly, de novo mutations in fecundity such as mental retardation and schizophrenia never- the FOXP1 gene have also been identified in individuals with theless remain frequent in the population. If this were indeed to intellectual disability, autism and language impairment.7,68 This be the case, the de novo mutations occurring in sporadic cases of is therefore suggestive of a pleiotropic effect for these genes these diseases would serve to replenish the number of highly in different neurodevelopmental disorders---de novo mutations in penetrant disease mutations in every generation despite the these genes could give rise to variable phenotypic expressivity or continual negative selection. Indeed, de novo CNVs are a known even a spectrum of neurodevelopmental phenotypes, depending cause of schizophrenia, autism and mental retardation. For upon mutation severity and probably modulated by other genetic example, in one of the first CNV studies to examine de novo and environmental factors.69 CNVs in schizophrenia, Xu et al.10 showed that de novo CNVs are De novo point mutations have also been identified in individuals significantly more common in sporadic cases than in controls that with unexplained mental retardation. More specifically, Vessers is, 10 versus 1.3%, respectively. A recent analysis of de novo CNVs et al.12 employed the WES approach to identify de novo point implicated the involvement of postsynaptic signaling complexes mutations in children with a normal karyotype in whom array- in the pathogenesis of schizophrenia.57 Similarly, de novo CNVs based genome profiling had excluded a potential contribution have also been found to be associated with autism and/or ASD from de novo CNVs, a known cause of mental retardation. Thus, and mental retardation.9,58--62 For example, a recent genome-wide WES was performed on 10 children with unexplained mental analysis was performed to examine rare CNVs in 1124 ASD families retardation and their parents, leading to the identification of an (each family comprising a single proband and their unaffected average of 21 755 genetic variants per individual. One of the main parents), which identified multiple rare recurrent de novo CNVs, in challenges in WES is therefore clearly the huge number of variants particular a significant association of ASD with the de novo dupli- found in each exome and in identifying the causal mutation cation of 7q11.23.60 The identification of de novo and inherited responsible for the disease. As with other WES studies, a series of CNVs in neurodevelopmental disorders has been reviewed else- filtering steps were applied and, after validation by Sanger where.63--66 sequencing, a total of nine de novo mutations were identified. De novo point mutations have also been identified in The commonly applied filters serve to exclude (a) all nongenic, schizophrenia through the exome sequencing of affected case- intronic and synonymous variants (as these variants do not alter parent trios. The study of Xu et al.18 is by far the largest in this the amino-acid sequence of the gene concerned and are much context; the exomes of 53 sporadic schizophrenia cases (with no less likely to be causally related), (b) known and likely benign history of the disease in a first- or second-degree relative), 22 variants by comparison with SNP databases and 1000 Genomes unaffected controls and their parents were sequenced. This Project to exclude common variants and (c) inherited variants by endeavor identified a total of 34 de novo point mutations (33 comparison with the parental genomes.12 These de novo muta- SNVs and 1 dinucleotide substitution) and 4 de novo indels in the tions occurred in different genes, including RAB39B and SYNGAP1, affected trios. Further, this analysis identified an excess of non- which have previously been implicated in mental retardation,70,71 synonymous changes amongst the identified de novo mutations in thereby lending further support to the results derived from exome schizophrenia cases, as well as an increased likelihood of these sequencing. Six of these nine de novo mutations were predicted to missense variants affecting protein structure and function when be pathogenic on the basis of gene function, evolutionary compared with the rare inherited variants identified in the same conservation and likely mutational impact.12 study. Thus, of the 34 de novo point mutations, 32 were missense In addition to data from genome-wide CNV and WES analyses, mutations, 19 of which affected evolutionarily conserved positions evidence is also accumulating from studies that have employed a and were predicted (by PolyPhen-2 analysis) to alter protein more targeted gene-centric approach; such studies have identified function.18 In addition, three of the indels resulted in protein de novo CNVs and point mutations for a number of different truncations and one gave rise to a single amino-acid deletion. neurodevelopmental disorders.72,73 However, this targeted ap- Overall, 27 out of 53 cases of schizophrenia (B51%) were found to proach is hypothesis-driven. For example, on the basis that

Molecular Psychiatry (2013), 141 -- 153 & 2013 Macmillan Publishers Limited De novo mutations in inherited disease CS Ku et al 147 Table 2. Summary of exome-sequencing studies---analysis pipeline

Analysis of evolutionary conservation and Disease/study functional effects Variant filtering strategies Major findings

Mental De novo mutations in genes with a Identified 21 755 genetic variants per 38 mutations could not be validated in retardation functional link to mental retardation individual the proband (covered by a median of five Vissers et al.12 showed a higher phyloP (mean, 4.7) and Excluded all nongenic, intronic and variant reads), but 13 mutations could be Grantham score (mean, 135) than synonymous variants other than those validated (covered by a median of 17 mutations in genes without such a occurring at canonical splice sites. variant reads). functional indication (mean phyloP score, Reduced the number to an average of Parental analysis validated the de novo À0.5 and mean Grantham score, 38). 5640 non-synonymous and canonical occurrence for 9 of these 13 mutations, splice site variants per affected individual. detected in seven different individuals. Further reduced this number to 143 by All de novo mutations occurred in excluding all known, probably benign, different genes, including two genes variants by comparison with data from recently implicated in mental retardation. the dbSNP database v.130 and in-house variant database. Used the exome data from each case’s parents to exclude all remaining inherited variants and this resulted in an average of five candidate de novo non- synonymous mutations per affected individual. Performed Sanger sequencing for all 51 candidate mutations. Schizophrenia Explored the possibility that de novo Cases Cases Xu et al.18 mutations in cases have a greater The variant filtering steps were not Found 27 out of 53 cases (B51%) potential to affect protein structure and described in detail. harboring at least one de novo mutational function than private-inherited variants Observed a total of 34 de novo point event. by examining the evolutionary mutations (33 SNVs and 1 dinucleotide 10 of the 27 cases carried more than conservation of affected nucleotides substitution) and 4 de novo indels in the one de novo mutation, and the rest each (using the phyloP score20) as well as the affected trios. carried a single mutation or indel. potential of the de novo protein-altering All identified de novo mutations were In sporadic schizophrenia cases, rare de mutations to affect the structure or absent in a total of 1658 control novo variants are approximately ten times function of the resulting proteins (using chromosomes. more likely than inherited rare variants to the Grantham score 21) Controls harbor non-synonymous changes. Compared the cumulative distribution Using the same pipeline and filtering Controls of these scores between de novo and criteria, identified seven exonic de novo 7 out of 22 controls carry at least one de private-inherited variants in the sporadic SNVs but no indel in 7 out of 22 control novo event, with one control having cases cohort, and observed that the subjects. more than one de novo mutation. distribution of the de novo variants was clearly shifted to the right (phyloP P ¼ 0.0005 and Grantham P ¼ 0.14). Schizophrenia Among the 11 missense de novo Examined any variation from the Identified 15 de novo mutation in 8 of Girard et al.19 mutations identified, 7 were predicted to reference genome in his or her unaffected the 14 schizophrenia probands. be deleterious when analyzed using the parents and in a pool of 26 control Out of the 15 validated mutations, 4 are bioinformatic algorithms SIFT, PolyPhen, individuals (comprising the parents of nonsense mutations that are predicted to the PhyloP conservation score or the the other trios) and placed every variant lead to a premature stop codon, whereas Grantham matrix. found to be unique to the proband on a the rest are missense mutations. Two missense variants in CCDC137 and putative de novo mutation list. CHD4, respectively, were predicted to be Considered all mutations within coding damaging by all four prediction sequences and identified 73 putative algorithms. variants. For each of the putative 73 variants, performed direct PCR amplification and Sanger sequencing in the probands and their respective parents. Validated a total of 15 variants as genuine de novo mutations. None of the de novo mutations was reported in dbSNP build 131 or in the latest release of the 1000 Genomes Project. ASDs Assessed whether proband de novo Filtered variants previously observed in 11 of the 18 coding de novo events are O’Roak et al.17 mutations were enriched for disruptive the dbSNP database, 1000 Genomes Pilot predicted to alter protein function. events by considering two independent Project data and 1490 other exomes. Identified a subset of trios (4 out of 20) quantitative measures: the nature of the Performed Sanger sequencing on the with disruptive de novo mutations that amino-acid replacement (Grantham matrix remaining de novo candidates (o5 per are potentially causative, including genes score) and the degree of nucleotide-level trio), validating 18 events within coding previously associated with autism, evolutionary conservation (Genomic sequence and three additional events intellectual disability and epilepsy. Evolutionary Rate Profiling/GERP). mapping to 30 untranslated regions.

& 2013 Macmillan Publishers Limited Molecular Psychiatry (2013), 141 -- 153 De novo mutations in inherited disease CS Ku et al 148 Table 2 (Continued )

Analysis of evolutionary conservation and Disease/study functional effects Variant filtering strategies Major findings

Determined by simulation of the expected mean GERP and Grantham distributions for ten randomly selected common or private control SNVs. Compared the observed means of the ten de novo protein-altering ASD proband variants with the distribution of common control SNVs; they corresponded to more highly conserved (GERP, Po0.001) and disruptive amino-acid mutations (Grantham, P ¼ 0.015). Abbreviations: ASDs, Autism spectrum disorders; dbSNP, single nucleotide polymorphism databases; SNVs, single nucleotide variants.

de novo mutations in glutamatergic pathways can account for a In similar vein, most cases of Schinzel--Giedion syndrome occur substantial fraction of sporadic cases of nonsyndromic intellectual sporadically, suggesting that heterozygous de novo mutations disability, Hamdan et al.8 sequenced 197 genes encoding may be responsible for causing this disorder. This syndrome is also glutamate receptors and a large subset of their known interacting characterized by severe mental retardation, distinctive facial proteins in 95 sporadic cases of nonsyndromic intellectual features and multiple congenital malformations. De novo causal disability. This approach led to the identification of 11 de novo mutations in SETBP1 were first identified in four individuals with mutations, 10 of which were deemed to be potentially deleterious this disorder through exome sequencing.21 Further sequencing of (that is, three nonsense, two splicing, one frameshift, four SETBP1 identified de novo mutations in an additional six (of eight) missense) and one synonymous substitution in eight different cases with parental samples. All these mutations affected amino- genes. De novo-truncating and/or -splicing mutations in SYNGAP1, acid residues within a highly conserved domain of SETBP1, STXBP1 and SHANK3 were noted in six patients and were held to consistent with their deleterious effect. Thus, it should be noted be probably pathogenic. In addition, de novo missense mutations that the degree of nucleotide-level evolutionary conservation can in KIF1A, GRIN1, CACNG2 and EPB41L1 were deemed to be be used as an indicator of the deleterious nature and potential potentially pathogenic. In earlier studies, de novo mutations in pathogenicity of de novo mutations. Finally, de novo nonsense STXBP1 have already been identified in patients with severe (protein-truncating) mutations in the ASXL1 and KAT6B genes have mental retardation and nonsyndromic epilepsy74 and early-onset also been found to cause Bohring--Opitz syndrome (characterized epileptic encephalopathy75 also by a means of targeted gene by severe intellectual disability, distinctive facial features and sequencing. Although most of the targeted sequencing studies multiple congenital malformations) and Say-Barber-Biesecker- have employed traditional PCR-Sanger sequencing, this approach Young-Simpson syndrome (a multiple anomaly syndrome which has now been augmented by high-throughput multiplex-PCR and is also characterized by severe intellectual disability), respec- custom enrichment methods, and NGS technologies to selectively tively.22,79 In addition, numerous studies have now applied WES to enrich for specific genes, allowing up to hundreds of genes to be the identification of de novo mutations in a range of rare diseases sequenced efficiently. (Table 3).80--93 Taken together, these studies have demonstrated the power of exome sequencing in revealing causal genes for Rare Mendelian disorders associated with neurodevelopmental Mendelian disorders with hitherto unknown genetic etiology as defects well as identifying de novo point mutations when parental samples are available. More importantly, these results suggest In addition to common diseases, de novo mutations have also that de novo mutations are a likely cause of sporadic conditions been found to have an important role in sporadic cases of with reduced fecundity, such as congenital malformations, mental Mendelian disorders associated with neurodevelopmental defects. retardation and various psychiatric disorders. A good example is provided by one of the first exome-sequencing studies performed in Kabuki syndrome with a hitherto unknown genetic etiology.20 Kabuki syndrome is a multiple malformation disorder characterized by mild-to-moderate mental retardation in PERSPECTIVES AND CONCLUSIONS addition to a distinctive facial appearance and various congenital The development of sequence-enrichment methods and NGS cardiac, skeletal and immunological anomalies. It is an extremely technologies has made WES a very powerful and cost-effective rare autosomal-dominant Mendelian disorder, with the majority of research tool that can completely bypass the hypothesis-driven reported cases being sporadic. De novo mutations have been targeted approach of sequencing a limited set of candidate genes. suspected to be responsible for these sporadic cases. More However, the current shortcoming of WES lies in its limited ability specifically, a total of 33 distinct mutations in the MLL2 gene were to study CNVs, and its necessary focus on coding regions. Hence, identified in 35 of 53 individuals or families, with Kabuki syndrome the clinicobiological importance of de novo mutations in non- using exome sequencing and follow-up-targeted sequencing. In coding regions remains unclear. Eventually, WGS and especially de accord with the original hypothesis, the MLL2 mutation was found novo genome assembly promises to permit the comprehensive to have occurred de novo in all 12 cases where DNA from both analysis of de novo mutations and other inherited variants in the parents was available.20 The de novo nature of the mutations in human genome in collections of case-parent trios affected by MLL2 for Kabuki syndrome has been further demonstrated by various heritable diseases. As a considerable number of inherited follow-up studies.76--78 Indeed, MLL2 mutations have been found in mutations for various diseases have now been found in non- a majority of 110 families with Kabuki syndrome (74%) in an coding regions,94 it is reasonable to suppose that this could also independent study and in those 25 cases where DNA was available pertain to de novo mutations within non-coding regions. This in from both parents, all mutations were confirmed to be de novo.76 turn supports the likely efficacy of the strategy of eventually

Molecular Psychiatry (2013), 141 -- 153 & 2013 Macmillan Publishers Limited De novo mutations in inherited disease CS Ku et al 149 Table 3. Summary of whole-exome sequencing studies identifying de novo mutations underlying rare disorders

Study Disorder Characteristics Major findings

Ng et al.20 Kabuki A rare, multiple malformation disorder characterized Seven probands had newly identified nonsense or syndrome (MIM# by a distinctive facial appearance, cardiac anomalies, frameshift mutations in MLL2 gene. Follow-up Sanger 147920) skeletal abnormalities, immunological defects and sequencing detected MLL2 mutations in two of the mild-to-moderate mental retardation. three remaining individuals with Kabuki syndrome, and in 26 of 43 additional cases. In families where parental DNA was available, the mutation was confirmed to be de novo (n ¼ 12) or transmitted (n ¼ 2) in concordance with the phenotype. Hoischen Schinzel-- Characterized by severe mental retardation, distinctive Heterozygous de novo variants identified in SETBP1 et al.21 Giedion facial features, multiple congenital malformations gene in four affected individuals. The study also syndrome (MIM# (including skeletal abnormalities, genitourinary and identified SETBP1 mutations in eight additional cases 269150) renal malformations, and cardiac defects) and a using Sanger sequencing. For six of the eight follow- higher-than-normal prevalence of tumors, notably up cases, parental DNA was available, and the neuroepithelial neoplasia. mutations present in the affected individuals were again shown to have occurred de novo. Hoischen Bohring--Opitz Characterized by severe intellectual disability, Heterozygous de novo nonsense mutations identified et al.22 syndrome (MIM# prominent metopic suture, exophthalmus, forehead in ASXL1 gene in three individuals with Bohring--Opitz 605039) nevus flammeus, upslanting palpebral fissures, ulnar syndrome. Analysis of additional samples found that in deviation and flexion of the wrists and total, 7 out of 13 subjects with a Bohring--Opitz metacarpophalangeal joints, and severe feeding phenotype had de novo ASXL1 mutations. problems. Clayton- SBBYSS or Ohdo A multiple anomaly syndrome characterized by Identified de novo protein-truncating mutations in the Smith et al.79 syndrome (MIM# severe intellectual disability, blepharophimosis, and a highly conserved histone acetyltransferase gene, 603736) mask-like facial appearance. A number of individuals KAT6B, in three out of four individuals. Sanger with SBBYSS also have thyroid abnormalities and sequencing was used to confirm truncating mutations cleft palate. of KAT6B, clustering in the final exon of the gene in all four individuals and in a further nine persons with typical SBBYSS. Where parental samples were available, the mutations were shown to have occurred de novo. Bochukova Hypothyroidism A child with classic features of hypothyroidism (growth De novo heterozygous nonsense mutation identified in et al.80 (MIM# 614450) retardation, developmental retardation, skeletal the THRA gene encoding thyroid hormone receptor dysplasia and severe constipation) but only borderline- alpha, generating a mutant protein that inhibits wild- abnormal thyroid hormone levels. type receptor action in a dominant-negative manner. Simpson GPS (MIM# A rare disorder in which patellar aplasia or hypoplasia De novo mutations of KAT6B gene identified in five et al.81 606170) is associated with external genital anomalies and individuals with GPS; a single nonsense variant and severe intellectual disability. three frameshift indels, including a 4-bp deletion observed in two cases. All identified mutations are located within the terminal exon of the gene and are predicted to generate a truncated protein product lacking evolutionarily conserved domains. Campeau GPS (MIM# A rare skeletal dysplasia combining hypoplastic or De novo heterozygous-truncating mutations identified et al.82 606170) absent patellae, genital anomalies, craniofacial defects, in KAT6B gene in three subjects; Sanger sequencing of and intellectual disability among other features. KAT6B was used to identify similar mutations in three additional subjects. The mutant transcripts do not undergo nonsense-mediated decay in cells from subjects with GPS. Sirmaci KBG syndrome Characterized by intellectual disability associated with Deleterious heterozygous mutations identified in et al.83 (MIM# 148050) macrodontia of the upper central incisors as well as ANKRD11 gene encoding repeat domain 11. distinct craniofacial findings, short stature and skeletal A splice-site mutation, c.7570-1G4C anomalies. (p.Glu2524_Lys2525del), co-segregated with the disease in a family with three affected members, whereas in a simplex case a de novo-truncating mutation, c.2305delT (p.Ser769GlnfsX8), was detected. Sanger sequencing revealed additional de novo-truncating ANKRD11 mutations in three other simplex cases. Gibson Weaver A rare congenital anomaly syndrome consists of Two different de novo mutations identified in EZH2 et al.84 syndrome (MIM# generalized overgrowth, advanced bone age, marked gene. Sanger sequencing of EZH2 in a third classically 614421) macrocephaly, hypertelorism and characteristic facial affected proband identified a third de novo mutation in features. Intellectual disability is common. this gene. Hood et al.85 FHS (MIM# A rare condition characterized by short stature, Heterozygous truncating mutations identified in SRCAP 136140) delayed osseous maturation, expressive-language gene in five unrelated individuals with sporadic FHS. deficits and a distinctive facial appearance. Sanger sequencing identified mutations in SRCAP in eight additional affected persons. Mutations were found to be de novo in all six instances in which parental DNA was available.

& 2013 Macmillan Publishers Limited Molecular Psychiatry (2013), 141 -- 153 De novo mutations in inherited disease CS Ku et al 150 Table 3 (Continued )

Study Disorder Characteristics Major findings

Rademakers HDLS (MIM# An autosomal-dominant central nervous system white- 14 different mutations affecting the tyrosine kinase et al.86 221820) matter disease with variable clinical presentations, domain of the colony-stimulating factor-1 receptor including personality and behavioral changes, (encoded by CSF1R gene) identified in 14 families with dementia, depression, parkinsonism, seizures and HDLS. In one kindred, the study confirmed the de novo other phenotypes. occurrence of the mutation. Santen Coffin--Siris Characterized by developmental delay, severe speech De novo truncating mutations identified in ARID1B et al.87 syndrome (MIM# impairment, coarse facial features, hypertrichosis, gene in three individuals with Coffin--Siris syndrome. 135900) hypoplastic or absent fifth fingernails or toenails and All variants truncated the ARID1B reading frame (two agenesis of the corpus callosum. nonsense variants and one frameshift). The mutations were validated using Sanger sequencing and shown to occur de novo in all three individuals. Tsurusaki Coffin--Siris A rare congenital anomaly syndrome characterized by De novo SMARCB1 gene mutations identified in two of et al.88 syndrome (MIM# growth deficiency, intellectual disability, microcephaly, five individuals with typical Coffin--Siris syndrome. Two 135900) coarse facial features and hypoplastic nail of the fifth de novo coding-sequence mutations occurring within a finger and/or toe. specific gene is an extremely unlikely event, supporting the idea that SMARCB1 is the causative gene in Coffin--Siris syndrome. Lin et al.89 Olmsted A rare congenital disorder characterized by bilateral De novo missense mutation identified in the TRPV3 syndrome (MIM# mutilating PPK and periorificial keratotic plaques with gene yielding a p.Gly573Ser substitution in an 607066) severe itching at all lesions. individual with Olmsted syndrome. Nucleotide sequencing of five additional affected individuals also revealed missense mutations in TRPV3 (p.Gly573Ser in three cases plus p.Gly573Cys and p.Trp692Gly). Rivie`re Baraitser--Winter A rare but well-defined developmental disorder De novo missense changes identified in the et al.90 syndrome recognized by the combination of congenital ptosis, cytoplasmic actin-encoding genes ACTB and ACTG1 in high-arched eyebrows, hypertelorism, ocular one and two probands, respectively. Sequencing of colobomata and a brain malformation consisting of both genes in 15 additional affected individuals anterior predominant lissencephaly. Other typical identified disease-causing mutations in all probands, features include postnatal short stature and including two recurrent de novo alterations. microcephaly, intellectual disability, seizures and hearing loss. Winkelmann ADCA-DN (MIM# Characterized by late onset (age 30--40 years) DNMT1 identified as the only gene with mutations et al.91 604121) narcolepsy--cataplexy, sensory neuronal deafness, found in all five affected individuals. Sanger cerebellar ataxia, dementia and, more variably, sequencing confirmed a de novo mutation p.Ala570Val psychosis, optic atrophy and other symptoms. in one family, and showed co-segregation of p.Val606Phe and p.Ala570Val with the ADCA-DN phenotype in two other kindreds. Lines et al.92 MFDM (MIM# A rare sporadic syndrome comprising craniofacial Heterozygous mutations or deletions of EFTUD2 gene 610536) malformations, microcephaly, developmental delay identified in all four unrelated affected individuals. and a recognizable dysmorphic appearance. Major Validation studies of eight additional individuals with sequelae, including choanal atresia, sensorineural MFDM demonstrated causative EFTUD2 mutations in hearing loss and cleft palate, each occur in a significant all affected individuals tested. A range of EFTUD2 proportion of affected individuals. mutation types, including null alleles and frameshifts, is seen in MFDM, consistent with haploinsufficiency; segregation is de novo in all cases assessed to date. Caputo Myhre A developmental disorder characterized by reduced Heterozygous mutations identified in the SMAD4 gene, et al.93 syndrome (MIM# growth, generalized muscular hypertrophy, facial which encodes for a transducer mediating 139210) dysmorphism, deafness, cognitive deficits, joint transforming growth factor b and bone stiffness and skeletal anomalies. morphogenetic protein-signaling branches. Two recurrent de novo SMAD4 mutations were identified in eight unrelated subjects. Abbreviations: ADCA-DN, autosomal-dominant cerebellar ataxia, deafness and narcolepsy; FHS, Floating-Harbor syndrome; GPS, Genitopatellar syndrome; HDLS, hereditary diffuse leukoencephalopathy with spheroids; MFDM, mandibulofacial dysostosis with microcephaly; PPK, palmoplantar keratoderma; SBBYSS, Say-Barber-Biesecker-Young-Simpson syndrome. We have included all recent studies applying the whole-exome sequencing approach that identified de novo mutations in rare disorders. However, the list is by no means complete. This table briefly summarizes the studies, which have applied whole-exome sequencing to identify de novo mutations in rare disorders together with their major findings. However, readers are encouraged to refer to the full papers for more detailed information.

adopting WGS to ascertain the full impact of de novo mutations in sions to identify de novo CNVs. The identified de novo CNVs were human genetic disease and to study the molecular characteristics then tested for association in 1433 schizophrenia cases and 33 250 of such lesions. controls. The case-parent trio study design is gaining in popularity; Collections of a large number of such trios would be important in addition to being able to detect de novo events, this study in this context. Although genome-wide SNP arrays were applied, design should also be capable of determining the parent of origin Stefansson et al.95 analyzed 9878 parent-to-offspring transmis- of the mutations, for example, the paternal versus the maternal

Molecular Psychiatry (2013), 141 -- 153 & 2013 Macmillan Publishers Limited De novo mutations in inherited disease CS Ku et al 151 haplotype. The importance of studying parent of origin effects has GJB1, a variant that had been reported previously as being already been demonstrated in genome-wide association studies of pathogenic in X-linked Charcot--Marie--Tooth families. Hence, the inherited variants for complex diseases.96 prospect of this application is also to be anticipated in the context The de novo point mutations and CNVs discovered to date in of de novo mutations in sporadic cases of neurodevelopmental neurodevelopmental disorders are likely to represent only the tip diseases or Mendelian disorders if parental samples are available. of the iceberg. Although WES also allows the detection of larger Indeed, just as the use of microarrays to detect CNVs has CNVs using depth-of-coverage from mapped short sequence revolutionized the clinical diagnostic approach to individuals with reads,97 this approach suffers from the same inherent limitation disorders such as intellectual disability,37,108--110 WES and WGS are that only coding regions are currently being interrogated. The increasingly acquiring a dual role as both a discovery and a potential contribution of de novo mutations residing in other diagnostic tool in neurodevelopmental and neuropsychiatric regions of the genome remains to be explored. A further limitation disease. It follows that microarrays could eventually be supplanted of WES is that coverage of the entire exome is not complete using by WGS as a single diagnostic test for various genetic variants. A the existing commercial enrichment kits. Although whole-genome further advantage of WGS lies in its ability to detect CNVs and SNP genotyping arrays allow a relatively unbiased genome-wide structural rearrangements in addition to point mutations and survey of CNVs across the entire genome, they are a rather blunt small indels. In contrast to WES, WGS has clear advantages in tool in terms of their sensitivity with respect to the detection of detecting CNVs and structural rearrangements based on both the CNVs as compared with NGS-based approaches.98,99 detection methodology that is, paired-end mapping and/or However, studies performed to date have generated sufficient depth-of-coverage. Further, not only do the reads generated by data to justify the application of WGS to assess the contribution of WGS cover the entire genome but WGS also avoids the biases various types of de novo mutation, particularly in the context of introduced by exome enrichment in WES (for the depth-of- neurodevelopmental and neuropsychiatric diseases. Taken to- coverage method) in identifying/predicting CNVs.98,99 gether, these studies support the emerging general postulate that the genetic basis of sporadic cases of neurodevelopmental Note added in proof disorders is more likely to involve de novo mutations than While this review paper was being typeset, three studies appeared inherited variants. On the basis that neurodevelopmental dis- which described the identification of de novo mutations in autism orders share some genetic (that is, overlap of genetic loci) and spectrum disorders by means of the WES approach.111--113 These 61,100,101 pathological (overlap of phenotypic manifestations) basis, findings provide further support for the importance of de novo newly identified de novo mutations in one disorder should also be mutations in the etiology of these complex disorders. For example, sought in other related conditions. More specifically, pleiotropic in their large-scale study of 928 individuals, including 200 effects have recently been demonstrated, by means of WES, for phenotypically discordant sib-pairs, Sanders et al.112 found that other neurological diseases such as VCP mutations in amyotrophic the highly disruptive (nonsense and splice-site) de novo mutations 102 103 lateral sclerosis and NOTCH3 mutations in Alzheimer disease, in brain-expressed genes were not only associated with autism further emphasizing that these effects may have a more spectrum disorders but also carried large effects. Further, the significant role in neurological disease than previously appre- occurrence of multiple independent de novo single nucleotide ciated. Indeed, the entire gene in which the de novo mutation is variants in the same gene among unrelated probands was found found should be screened (instead of just the single mutation) in to be a reliable identifier of risk alleles, thereby providing a new other related conditions, in order to allow for the possibility of approach to gene discovery in autism spectrum disorders. allelic heterogeneity, a situation in which different de novo mutations in the same gene are responsible for different diseases. In parallel to the emergence of sequencing technologies CONFLICT OF INTEREST capable of generating longer sequence reads and the develop- The authors declare no conflict of interest. ment of ever more powerful bioinformatics tools, the study of de novo mutations in the human genome would be further facilitated by de novo genome assembly. This would allow the contribution of de novo mutations to genetic disease to be fully assessed ACKNOWLEDGEMENTS genome wide. Although the importance of de novo mutations to We would like to thank Touati Benoukraf and Mengchu Wu (Cancer Science Institute the etiology of common neurodevelopmental disorders has been of Singapore) for discussion. relatively well established, their significance in the context of other types of inherited disease still remains to be examined. For example, sporadic cases of nonsyndromic tetralogy of Fallot (the REFERENCES most common severe congenital heart malformation) also 1 Conrad DF, Keebler JE, DePristo MA, Lindsay SJ, Zhang Y, Casals F et al. Variation commonly result from de novo CNVs.104 The high-throughput in genome-wide mutation rates within and between human families. Nat Genet sequencing (both WES and WGS) of case-parent trios as part of a 2011; 43: 712--714. strategy to search for disease-causing mutations should be 2 Cooper DN, Bacolla A, Ferec C, Vasquez KM, Kehrer-Sawatzki H, Chen JM. On the advocated as this study design possesses distinct advantages, sequence-directed nature of human gene mutation: the role of genomic architecture and the local DNA sequence environment in mediating gene muta- which cannot be substituted by the study of samples from tions underlying human inherited disease. Hum Mutat 2011; 32: 1075--1099. unrelated individuals. In the light of the expected increase in 3 Ng PC, Henikoff S. SIFT: predicting amino acid changes that affect protein number of reported de novo mutations and their associations with function. Nucleic Acids Res 2003; 31: 3812--3814. various human diseases, a curated database to catalog all such 4 Sunyaev S, Ramensky V, Koch I, Lathe 3rd W, Kondrashov AS, Bork P. Prediction mutations is seen as an invaluable research tool for the future. of deleterious human alleles. Hum Mol Genet 2001; 10: 591--597. Finally, the diagnostic application of exome sequencing has 5 Wang K, Li M, Hakonarson H. ANNOVAR: functional annotation of genetic been increasingly demonstrated in identifying inherited causal variants from high-throughput sequencing data. Nucleic Acids Res 2010; 38: mutations for Mendelian disorders.105--107 Montenegro et al.107 e164. showed the feasibility and power of WES as a diagnostic tool in 6 Flanagan SE, Patch AM, Ellard S. Using SIFT and PolyPhen to predict loss-of- function and gain-of-function mutations. Genet Test Mol Biomarkers 2010; 14:533-- two affected individuals from a family with Charcot--Marie--Tooth 537. disease (an inherited peripheral neuropathy characterized by 7 Hamdan FF, Daoud H, Rochefort D, Piton A, Gauthier J, Langlois M et al. De novo extensive heterogeneity involving mutations in at least 35 mutations in FOXP1 in cases with intellectual disability, autism, and language genes). The analysis identified a nonsynonymous mutation in impairment. Am J Hum Genet 2010; 87: 671--678.

& 2013 Macmillan Publishers Limited Molecular Psychiatry (2013), 141 -- 153 De novo mutations in inherited disease CS Ku et al 152 8 Hamdan FF, Gauthier J, Araki Y, Lin DT, Yoshizawa Y, Higashi K et al. Excess of de 37 Vissers LE, de Vries BB, Veltman JA. Genomic microarrays in mental retardation: novo deleterious mutations in genes associated with glutamatergic systems in from copy number variation to gene, from research to diagnosis. J Med Genet nonsyndromic intellectual disability. Am J Hum Genet 2011; 88: 306--316. 2010; 47: 289--297. 9 Sebat J, Lakshmi B, Malhotra D, Troge J, Lese-Martin C, Walsh T et al. Strong asso- 38 Priebe L, Degenhardt FA, Herms S, Haenisch B, Mattheisen M, Nieratschker V ciation of de novo copy number mutations with autism. Science 2007; 316: 445--449. et al. Genome-wide survey implicates the influence of copy number variants 10 Xu B, Roos JL, Levy S, van Rensburg EJ, Gogos JA, Karayiorgou M. Strong (CNVs) in the development of early-onset bipolar disorder. Mol Psychiatry 2011; association of de novo copy number mutations with sporadic schizophrenia. Nat 17: 421--432. Genet 2008; 40: 880--885. 39 Shendure J, Ji H. Next-generation DNA sequencing. Nat Biotechnol 2008; 26: 11 Roach JC, Glusman G, Smit AF, Huff CD, Hubley R, Shannon PT et al. Analysis of 1135--1145. genetic inheritance in a family quartet by whole-genome sequencing. Science 40 Metzker ML. Sequencing technologies - the next generation. Nat Rev Genet 2010; 2010; 328: 636--639. 11: 31--46. 12 Vissers LE, de Ligt J, Gilissen C, Janssen I, Steehouwer M, de Vries P et al. A de 41 Ng SB, Turner EH, Robertson PD, Flygare SD, Bigham AW, Lee C et al. Targeted novo paradigm for mental retardation. Nat Genet 2010; 42: 1109--1112. capture and massively parallel sequencing of 12 human exomes. Nature 2009; 13 Li Y, Zheng H, Luo R, Wu H, Zhu H, Li R et al. Structural variation in two human 461: 272--276. genomes mapped at single-nucleotide resolution by whole genome de novo 42 Bentley DR, Balasubramanian S, Swerdlow HP, Smith GP, Milton J, Brown CG assembly. Nat Biotechnol 2011; 29: 723--730. et al. Accurate whole human genome sequencing using reversible terminator 14 Paszkiewicz K, Studholme DJ. De novo assembly of short sequence reads. Brief chemistry. Nature 2008; 456: 53--59. Bioinform 2010; 11: 457--472. 43 Wang J, Wang W, Li R, Li Y, Tian G, Goodman L et al. The diploid genome 15 Li R, Zhu H, Ruan J, Qian W, Fang X, Shi Z et al. De novo assembly of human sequence of an Asian individual. Nature 2008; 456: 60--65. genomes with massively parallel short read sequencing. Genome Res 2010; 20: 44 Clark MJ, Chen R, Lam HY, Karczewski KJ, Euskirchen G, Butte AJ et al. 265--272. Performance comparison of exome DNA sequencing technologies. Nat 16 Li Y, Hu Y, Bolund L, Wang J. State of the art de novo assembly of human Biotechnol 2011; 29: 908--914. genomes from massively parallel sequencing data. Hum Genomics 2010; 4: 271--277. 45 Parla JS, Iossifov I, Grabill I, Spector MS, Kramer M, McCombie WR. A comparative 17 O’Roak BJ, Deriziotis P, Lee C, Vives L, Schwartz JJ, Girirajan S et al. Exome analysis of exome capture. Genome Biol 2011; 12: R97. sequencing in sporadic autism spectrum disorders identifies severe de novo 46 Sulonen AM, Ellonen P, Almusa H, Lepisto M, Eldfors S, Hannula S et al. mutations. Nat Genet 2011; 43: 585--589. Comparison of solution-based exome capture methods for next generation 18 Xu B, Roos JL, Dexheimer P, Boone B, Plummer B, Levy S et al. Exome sequencing sequencing. Genome Biol 2011; 12: R94. supports a de novo mutational paradigm for schizophrenia. Nat Genet 2011; 43: 47 Asan, Xu Y, Jiang H, Tyler-Smith C, Xue Y, Jiang T et al. Comprehensive 864--868. comparison of three commercial human whole-exome capture platforms. 19 Girard SL, Gauthier J, Noreau A, Xiong L, Zhou S, Jouan L et al. Increased exonic Genome Biol 2011; 12: R95. de novo mutation rate in individuals with schizophrenia. Nat Genet 2011; 43: 860--863. 48 Schadt EE, Turner S, Kasarskis A. A window into third-generation sequencing. 20 Ng SB, Bigham AW, Buckingham KJ, Hannibal MC, McMillin MJ, Gildersleeve HI Hum Mol Genet 2010; 19(R2): R227--R240. et al. Exome sequencing identifies MLL2 mutations as a cause of Kabuki 49 Lynch M. Rate, molecular spectrum, and consequences of human mutation. syndrome. Nat Genet 2010; 42: 790--793. Proc Natl Acad Sci USA 2010; 107: 961--968. 21 Hoischen A, van Bon BW, Gilissen C, Arts P, van Lier B, Steehouwer M et al. 50 Itsara A, Wu H, Smith JD, Nickerson DA, Romieu I, London SJ et al. De novo De novo mutations of SETBP1 cause Schinzel-Giedion syndrome. Nat Genet 2010; rates and selection of large copy number variation. Genome Res 2010; 20: 42: 483--485. 1469--1481. 22 Hoischen A, van Bon BW, Rodriguez-Santiago B, Gilissen C, Vissers LE, de Vries P 51 1000 Genomes Project Consortium. A map of human genome variation from et al. De novo nonsense mutations in ASXL1 cause Bohring-Opitz syndrome. Nat population-scale sequencing. Nature 2010; 467: 1061--1073. Genet 2011; 43: 729--731. 52 Matijevic T, Knezevic J, Slavica M, Pavelic J. Rett syndrome: from the gene to the 23 Carter NP. Methods and strategies for analyzing copy number variation using disease. Eur Neurol 2009; 61:3--10. DNA microarrays. Nat Genet 2007; 39(7 Suppl): S16--S21. 53 Zoghbi HY, Ledbetter DH, Schultz R, Percy AK, Glaze DG. A de novo X;3 24 Ragoussis J. Genotyping technologies for genetic research. Annu Rev Genomics translocation in Rett syndrome. Am J Med Genet 1990; 35: 148--151. Hum Genet 2009; 10: 117--133. 54 Amir RE, Van den Veyver IB, Wan M, Tran CQ, Francke U, Zoghbi HY. Rett 25 Girirajan S, Campbell CD, Eichler EE. Human copy number variation and complex syndrome is caused by mutations in X-linked MECP2, encoding methyl-CpG- genetic disease. Annu Rev Genet 2011; 45: 203--226. binding protein 2. Nat Genet 1999; 23: 185--188. 26 Ku CS, Loy EY, Pawitan Y, Chia KS. The pursuit of genome-wide association 55 Wan M, Lee SS, Zhang X, Houwink-Manville I, Song HR, Amir RE et al. Rett studies: where are we now? J Hum Genet 2010; 55: 195--206. syndrome and beyond: recurrent spontaneous and familial MECP2 mutations at 27 Peiffer DA, Le JM, Steemers FJ, Chang W, Jenniges T, Garcia F et al. High- CpG hotspots. Am J Hum Genet 1999; 65: 1520--1529. resolution genomic profiling of chromosomal aberrations using Infinium whole- 56 Dragich J, Houwink-Manville I, Schanen C. Rett syndrome: a surprising result of genome genotyping. Genome Res 2006; 16: 1136--1148. mutation in MECP2. Hum Mol Genet 2000; 9: 2365--2375. 28 Wang K, Li M, Hadley D, Liu R, Glessner J, Grant SF et al. PennCNV: an 57 Kirov G, Pocklington AJ, Holmans P, Ivanov D, Ikeda M, Ruderfer D et al. De novo integrated hidden Markov model designed for high-resolution copy number CNV analysis implicates specific abnormalities of postsynaptic signalling variation detection in whole-genome SNP genotyping data. Genome Res complexes in the pathogenesis of schizophrenia. Mol Psychiatry 2012; 17: 2007; 17: 1665--1674. 142--153. 29 Sebat J, Lakshmi B, Troge J, Alexander J, Young J, Lundin P et al. Large-scale copy 58 Pinto D, Pagnamenta AT, Klei L, Anney R, Merico D, Regan R et al. Functional number polymorphism in the human genome. Science 2004; 305: 525--528. impact of global rare copy number variation in autism spectrum disorders. 30 Iafrate AJ, Feuk L, Rivera MN, Listewnik ML, Donahoe PK, Qi Y et al. Detection of Nature 2010; 466:368--372. large-scale variation in the human genome. Nat Genet 2004; 36: 949--951. 59 Levy D, Ronemus M, Yamrom B, Lee YH, Leotta A, Kendall J et al. Rare de novo 31 Freeman JL, Perry GH, Feuk L, Redon R, McCarroll SA, Altshuler DM et al. Copy and transmitted copy-number variation in autistic spectrum disorders. Neuron number variation: new insights in genome diversity. Genome Res 2006; 16: 949--961. 2011; 70: 886--897. 32 Craddock N, Hurles ME, Cardin N, Pearson RD, Plagnol V, Robson S et al. 60 Sanders SJ, Ercan-Sencicek AG, Hus V, Luo R, Murtha MT, Moreno-De-Luca D Genome-wide association study of CNVs in 16 000 cases of eight common et al. Multiple recurrent de novo CNVs, including duplications of the 7q11.23 diseases and 3000 shared controls. Nature 2010; 464: 713--720. Williams syndrome region, are strongly associated with autism. Neuron 2011; 70: 33 Glessner JT, Wang K, Cai G, Korvatska O, Kim CE, Wood S et al. Autism genome- 863--885. wide copy number variation reveals ubiquitin and neuronal genes. Nature 2009; 61 Berkel S, Marshall CR, Weiss B, Howe J, Roeth R, Moog U et al. Mutations in the 459: 569--573. SHANK2 synaptic scaffolding gene in autism spectrum disorder and mental 34 Walsh T, McClellan JM, McCarthy SE, Addington AM, Pierce SB, Cooper GM et al. retardation. Nat Genet 2010; 42: 489--491. Rare structural variants disrupt multiple genes in neurodevelopmental pathways 62 McMullan DJ, Bonin M, Hehir-Kwa JY, de Vries BB, Dufke A, Rattenberry E et al. in schizophrenia. Science 2008; 320: 539--543. Molecular karyotyping of patients with unexplained mental retardation by SNP 35 International Schizophrenia Consortium. Rare chromosomal deletions and arrays: a multicenter study. Hum Mutat 2009; 30: 1082--1092. duplications increase risk of schizophrenia. Nature 2008; 455: 237--241. 63 Cook Jr EH, Scherer SW. Copy-number variations associated with neuropsychia- 36 Mulle JG, Dodd AF, McGrath JA, Wolyniec PS, Mitchell AA, Shetty AC et al. tric conditions. Nature 2008; 455: 919--923. Microdeletions of 3q29 confer high risk for schizophrenia. Am J Hum Genet 2010; 64 Kakinuma H, Sato H. Copy-number variations associated with autism spectrum 87: 229--236. disorder. Pharmacogenomics 2008; 9: 1143--1154.

Molecular Psychiatry (2013), 141 -- 153 & 2013 Macmillan Publishers Limited De novo mutations in inherited disease CS Ku et al 153 65 Tam GW, Redon R, Carter NP, Grant SG. The role of DNA copy number variation 89 Lin Z, Chen Q, Lee M, Cao X, Zhang J, Ma D et al. Exome sequencing reveals in schizophrenia. Biol Psychiatry 2009; 66: 1005--1012. mutations in TRPV3 as a cause of Olmsted syndrome. Am J Hum Genet 2012; 90: 66 Merikangas AK, Corvin AP, Gallagher L. Copy-number variants in neurodevelop- 558--564. mental disorders: promises and challenges. Trends Genet 2009; 25: 536--544. 90 Riviere JB, van Bon BW, Hoischen A, Kholmanskikh SS, O’Roak BJ, Gilissen C et al. 67 Endele S, Rosenberger G, Geider K, Popp B, Tamer C, Stefanova I et al. De novo mutations in the actin genes ACTB and ACTG1 cause Baraitser-Winter Mutations in GRIN2A and GRIN2B encoding regulatory subunits of NMDA syndrome. Nat Genet 2012; 44: 440--444. receptors cause variable neurodevelopmental phenotypes. Nat Genet 2010; 91 Winkelmann J, Lin L, Schormair B, Kornum BR, Faraco J, Plazzi G et al. Mutations 42: 1021--1026. in DNMT1 cause autosomal dominant cerebellar ataxia, deafness and 68 Horn D, Kapeller J, Rivera-Brugues N, Moog U, Lorenz-Depiereux B, Eck S et al. narcolepsy. Hum Mol Genet 2012; 21: 2205--2210. Identification of FOXP1 deletions in three unrelated patients with mental 92 Lines MA, Huang L, Schwartzentruber J, Douglas SL, Lynch DC, Beaulieu C et al. retardation and significant speech and language deficits. Hum Mutat 2010; 31: Haploinsufficiency of a spliceosomal GTPase encoded by EFTUD2 causes mandi- E1851--E1860. bulofacial dysostosis with microcephaly. Am J Hum Genet 2012; 90: 369--377. 69 Sundaram SK, Huq AM, Wilson BJ, Chugani HT. Tourette syndrome is associated 93 Caputo V, Cianetti L, Niceta M, Carta C, Ciolfi A, Bocchinfuso G et al. A restricted with recurrent exonic copy number variants. Neurology 2010; 74: 1583--1590. spectrum of mutations in the SMAD4 tumor-suppressor gene underlies Myhre 70 Giannandrea M, Bianchi V, Mignogna ML, Sirri A, Carrabino S, D’Elia E et al. syndrome. Am J Hum Genet 2012; 90: 161--169. Mutations in the small GTPase gene RAB39B are responsible for X-linked mental 94 Cooper DN, Chen JM, Ball EV, Howells K, Mort M, Phillips AD et al. Genes, retardation associated with autism, epilepsy, and macrocephaly. Am J Hum Genet mutations, and human inherited disease at the dawn of the age of personalized 2010; 86: 185--195. genomics. Hum Mutat 2010; 31: 631--655. 71 Hamdan FF, Gauthier J, Spiegelman D, Noreau A, Yang Y, Pellerin S et al. 95 Stefansson H, Rujescu D, Cichon S, Pietilainen OP, Ingason A, Steinberg S et al. Mutations in SYNGAP1 in autosomal nonsyndromic mental retardation. N Engl J Large recurrent microdeletions associated with schizophrenia. Nature 2008; 455: Med 2009; 360: 599--605. 232--236. 72 Gauthier J, Champagne N, Lafreniere RG, Xiong L, Spiegelman D, Brustein E et al. 96 Kong A, Steinthorsdottir V, Masson G, Thorleifsson G, Sulem P, Besenbacher S De novo mutations in the gene encoding the synaptic scaffolding protein et al. Parental origin of sequence variants associated with complex diseases. SHANK3 in patients ascertained for schizophrenia. Proc Natl Acad Sci USA 2010; Nature 2009; 462: 868--874. 107: 7863--7868. 97 Sathirapongsasuti JF, Lee H, Horst BA, Brunner G, Cochran AJ, Binder S et al. 73 Tarabeux J, Champagne N, Brustein E, Hamdan FF, Gauthier J, Lapointe M et al. Exome sequencing-based copy-number variation and loss of heterozygosity De novo truncating mutation in Kinesin 17 associated with schizophrenia. Biol detection: exomeCNV. Bioinformatics 2011; 27: 2648--2654. Psychiatry 2010; 68: 649--656. 98 Medvedev P, Stanciu M, Brudno M. Computational methods for discovering 74 Hamdan FF, Piton A, Gauthier J, Lortie A, Dubeau F, Dobrzeniecka S et al. De novo structural variation with next-generation sequencing. Nat Methods 2009; 6(11 STXBP1 mutations in mental retardation and nonsyndromic epilepsy. Ann Neurol Suppl): S13--S20. 2009; 65: 748--753. 99 Medvedev P, Fiume M, Dzamba M, Smith T, Brudno M. Detecting copy number 75 Deprez L, Weckhuysen S, Holmgren P, Suls A, Van Dyck T, Goossens D et al. variation with mated short reads. Genome Res 2010; 20: 1613--1622. Clinical spectrum of early-onset epileptic encephalopathies associated with 100 Carroll LS, Owen MJ. Genetic overlap between autism, schizophrenia and bipolar STXBP1 mutations. Neurology 2010; 75: 1159--1165. disorder. Genome Med 2009; 1: 102. 76 Hannibal MC, Buckingham KJ, Ng SB, Ming JE, Beck AE, McMillin MJ et al. 101 Fernandez BA, Roberts W, Chung B, Weksberg R, Meyn S, Szatmari P et al. Spectrum of MLL2 (ALR) mutations in 110 cases of Kabuki syndrome. Am J Med Phenotypic spectrum associated with de novo and inherited deletions and Genet A 2011; 155A: 1511--1516. duplications at 16p11.2 in individuals ascertained for diagnosis of autism 77 Li Y, Bogershausen N, Alanay Y, Simsek Kiper PO, Plume N, Keupp K et al. A spectrum disorder. J Med Genet 2010; 47: 195--203. mutation screen in patients with Kabuki syndrome. Hum Genet 2011; 130: 102 Johnson JO, Mandrioli J, Benatar M, Abramzon Y, Van Deerlin VM, Trojanowski JQ 715--724. et al. Exome sequencing reveals VCP mutations as a cause of familial ALS. Neuron 78 Paulussen AD, Stegmann AP, Blok MJ, Tserpelis D, Posma-Velter C, Detisch Y 2010; 68: 857--864. et al. MLL2 mutation spectrum in 45 patients with Kabuki syndrome. Hum Mutat 103 Guerreiro RJ, Lohmann E, Kinsella E, Bras JM, Luu N, Gurunlian N et al. Exome 2011; 32: E2018--E2025. sequencing reveals an unexpected genetic cause of disease: NOTCH3 mutation 79 Clayton-Smith J, O’Sullivan J, Daly S, Bhaskar S, Day R, Anderson B et al. Whole- in a Turkish family with Alzheimer’s disease. Neurobiol Aging 2012; 33: 1008.e17-- exome-sequencing identifies mutations in histone acetyltransferase gene KAT6B 1008.e23. in individuals with the Say-Barber-Biesecker variant of Ohdo syndrome. Am J 104 Greenway SC, Pereira AC, Lin JC, DePalma SR, Israel SJ, Mesquita SM et al. Hum Genet 2011; 89: 675--681. De novo copy number variants identify new genes and loci in isolated sporadic 80 Bochukova E, Schoenmakers N, Agostini M, Schoenmakers E, Rajanayagam O, tetralogy of Fallot. Nat Genet 2009; 41: 931--935. Keogh JM et al. A mutation in the thyroid hormone receptor alpha gene. N Engl J 105 Choi M, Scholl UI, Ji W, Liu T, Tikhonova IR, Zumbo P et al. Genetic diagnosis by Med 2012; 366: 243--249. whole exome capture and massively parallel DNA sequencing. Proc Natl Acad Sci 81 Simpson MA, Deshpande C, Dafou D, Vissers LE, Woollard WJ, Holder SE et al. De USA 2009; 106: 19096--19101. novo mutations of the gene encoding the histone acetyltransferase KAT6B cause 106 Worthey EA, Mayer AN, Syverson GD, Helbling D, Bonacci BB, Decker B et al. Genitopatellar syndrome. Am J Hum Genet 2012; 90: 290--294. Making a definitive diagnosis: successful clinical application of whole exome 82 Campeau PM, Kim JC, Lu JT, Schwartzentruber JA, Abdul-Rahman OA, Schlaubitz S sequencing in a child with intractable inflammatory bowel disease. Genet Med et al. Mutations in KAT6B, encoding a histone acetyltransferase, cause Genitopatellar 2011; 13: 255--262. syndrome. Am J Hum Genet 2012; 90:282--289. 107 Montenegro G, Powell E, Huang J, Speziani F, Edwards YJ, Beecham G et al. 83 Sirmaci A, Spiliopoulos M, Brancati F, Powell E, Duman D, Abrams A et al. Exome sequencing allows for rapid gene identification in a Charcot-Marie-Tooth Mutations in ANKRD11 cause KBG syndrome, characterized by intellectual family. Ann Neurol 2011; 69: 464--470. disability, skeletal malformations, and macrodontia. Am J Hum Genet 2011; 89: 108 Edelmann L, Hirschhorn K. Clinical utility of array CGH for the detection of 289--294. chromosomal imbalances associated with mental retardation and multiple 84 Gibson WT, Hood RL, Zhan SH, Bulman DE, Fejes AP, Moore R et al. Mutations in congenital anomalies. Ann N Y Acad Sci 2009; 1151: 157--166. EZH2 cause weaver syndrome. Am J Hum Genet 2012; 90: 110--118. 109 Poot M, Hochstenbach R. A three-step workflow procedure for the interpretation 85 Hood RL, Lines MA, Nikkel SM, Schwartzentruber J, Beaulieu C, Nowaczyk MJ of array-based comparative genome hybridization results in patients with idiopathic et al. Mutations in SRCAP, encoding SNF2-related CREBBP activator protein, mental retardation and congenital anomalies. Genet Med 2010; 12:478--485. cause floating-harbor syndrome. Am J Hum Genet 2012; 90: 308--313. 110 Schaaf CP, Wiszniewska J, Beaudet AL. Copy number and SNP arrays in clinical 86 Rademakers R, Baker M, Nicholson AM, Rutherford NJ, Finch N, Soto-Ortolaza A diagnostics. Annu Rev Genomics Hum Genet 2011; 12: 25--51. et al. Mutations in the colony stimulating factor 1 receptor (CSF1R) gene cause 111 Neale BM, Kou Y, Liu L, Ma’ayan A, Samocha KE, Sabo A et al. Patterns and rates hereditary diffuse leukoencephalopathy with spheroids. Nat Genet 2012; 44: of exonic de novo mutations in autism spectrum disorders. Nature 2012; 485: 200--205. 242--245. 87 Santen GW, Aten E, Sun Y, Almomani R, Gilissen C, Nielsen M et al. Mutations in 112 Sanders SJ, Murtha MT, Gupta AR, Murdoch JD, Raubeson MJ, Willsey AJ et al. SWI/SNF chromatin remodeling complex gene ARID1B cause Coffin-Siris De novo mutations revealed by whole-exome sequencing are strongly syndrome. Nat Genet 2012; 44: 379--380. associated with autism. Nature 2012; 485: 237--241. 88 Tsurusaki Y, Okamoto N, Ohashi H, Kosho T, Imai Y, Hibi-Ko Y et al. Mutations 113 O’Roak BJ, Vives L, Girirajan S, Karakoc E, Krumm N, Coe BP et al. Sporadic autism affecting components of the SWI/SNF complex cause Coffin-Siris syndrome. Nat exomes reveal a highly interconnected protein network of de novo mutations. Genet 2012; 44: 376--378. Nature 2012; 485: 246--250.

& 2013 Macmillan Publishers Limited Molecular Psychiatry (2013), 141 -- 153