Supplementary Information
Functional impact of global rare copy number variation in autism spectrum disorders

Dalila Pinto1, Alistair T. Pagnamenta2, Lambertus Klei3, Richard Anney4, Daniele Merico5, Regina Regan6, Judith Conroy6, Tiago R. Magalhaes7,8, Catarina Correia7,8, Brett S. Abrahams9, Joana Almeida10, Elena Bacchelli11, Gary D. Bader5, Anthony J. Bailey12, Gillian Baird13, Agatino Battaglia14, Tom Berney15,56, Nadia Bolshakova4, Sven Bölte16, Patrick F. Bolton17, Thomas Bourgeron18, Sean Brennan4, Jessica Brian19, Susan E. Bryson20, Andrew R. Carson1, Guillermo Casallo1, Jillian Casey6, Brian H.Y. Chung1, Lynne Cochrane4, Christina Corsello21, Emily L. Crawford22, Andrew Crossett23, Cheryl Cytrynbaum1, Geraldine Dawson24,25, Maretha de Jonge26, Richard Delorme27, Irene Drmic19, Eftichia Duketis16, Frederico Duque10, Annette Estes28, Penny Farrar2, Bridget A. Fernandez29, Susan E. Folstein30, Eric Fombonne31, Christine M. Freitag16, John Gilbert30, Christopher Gillberg32, Joseph T. Glessner33, Jeremy Goldberg34, Andrew Green6, Jonathan Green35, Stephen J. Guter36, Hakon Hakonarson33,37, Elizabeth A. Heron4, Matthew Hill4, Richard Holt2, Jennifer L. Howe1, Gillian Hughes4, Vanessa Hus21, Roberta Igliozzi14, Cecilia Kim33, Sabine M. Klauck38, Alexander Kolevzon39, Olena Korvatska40, Vlad Kustanovich41, Clara M. Lajonchere41, Janine A. Lamb42, Magdalena Laskawiec12, Marion Leboyer43, Ann Le Couteur15,56, Bennett L. Leventhal44,45, Anath C. Lionel1, Xiao-Qing Liu1, Catherine Lord21, Linda Lotspeich46, Sabata C. Lund22, Elena Maestrini11, William Mahoney47, Carine Mantoulan48, Christian R. Marshall1, Helen McConachie15,56, Christopher J. McDougle49, Jane McGrath4, William M. McMahon50, Alison Merikangas4, Ohsuke Migita1, Nancy J. Minshew51, Ghazala K. Mirza2, Jeff Munson52, Stanley F. Nelson53, Carolyn Noakes19, Abdul Noor54, Gudrun Nygren32, Guiomar Oliveira10, Katerina Papanikolaou55, Jeremy R.
Parr56, Barbara Parrini14, Tara Paton1, Andrew Pickles57, Marion Pilorge58, Joseph Piven59, Chris P. Ponting60, David J. Posey49, Annemarie Poustka38‡, Fritz Poustka16, Aparna Prasad1, Jiannis Ragoussis2, Katy Renshaw12, Jessica Rickaby1, Wendy Roberts19, Kathryn Roeder23, Bernadette Roge48, Michael L. Rutter61, Laura J. Bierut62, John P. Rice62, Jeff Salt36, Katherine Sansom1, Daisuke Sato1, Ricardo Segurado4, Ana F. Sequeira7,8, Lili Senman19, Naisha Shah6, Val C. Sheffield63, Latha Soorya39, Inês Sousa2, Olaf Stein64, Nuala Sykes2, Vera Stoppioni65, Christina Strawbridge34, Raffaella Tancredi14, Katherine Tansey4, Bhooma Thiruvahindrapduram1, Ann P. Thompson34, Susanne Thomson22, Ana Tryfon39, John Tsiantis55, Herman Van Engeland26, John B. Vincent54, Fred Volkmar66, Simon Wallace12, Kai Wang33, Zhouzhi Wang1, Thomas H. Wassink67, Caleb Webber60, Rosanna Weksberg1, Kirsty Wing2, Kerstin Wittemeyer48, Shawn Wood3, Jing Wu23, Brian L. Yaspan22, Danielle Zurawiecki39, Lonnie Zwaigenbaum68, Joseph D. Buxbaum39*, Rita M. Cantor53*, Edwin H. Cook36*, Hilary Coon50*, Michael L. Cuccaro30*, Bernie Devlin3*, Sean Ennis6*, Louise Gallagher4*, Daniel H. Geschwind9*, Michael Gill4*, Jonathan L. Haines69*, Joachim Hallmayer46*, Judith Miller50, Anthony P. Monaco2*, John I. Nurnberger Jr49*, Andrew D. Paterson1*, Margaret A. Pericak-Vance30*, Gerard D. Schellenberg70*, Peter Szatmari34*, Astrid M. Vicente7,8*, Veronica J. Vieland64*, Ellen M. Wijsman71*, Stephen W. Scherer1,72*, James S. Sutcliffe22* & Catalina Betancur58*

CONTENTS

A. AUTISM SPECTRUM DISORDER (ASD) SAMPLE AND CONTROL COLLECTIONS
    ASD samples
    Control cohorts
B. GENOTYPING AND DATA CLEANING
    SNP quality control
    Intensity quality control for CNV detection
C. CNV DETECTION AND QUALITY CONTROL EVALUATION
    Pilot experiment to evaluate the quality of detected stringent CNVs
D. CNV VERIFICATION
    Confirmation using computational methods
    Experimental CNV validation
E. RARE CNV BURDEN ANALYSIS
    CNV global burden analysis
    CNV Region (CNVR) burden analysis
    CNV-based gene association test
    Burden analysis for genes known to be implicated in ASD and/or ID
    Population attributable risk
F. GENE-SET ENRICHMENT AND FUNCTIONAL MAP
    Analytical synopsis
    Deriving gene sets
    Gene-set enrichment test
    Cryptic bias
    Network visualization of gene-sets enriched for deletions: functional enrichment map
    Expanded functional map between deletion-enriched gene sets and known ASD/ID genes
G. SUPPLEMENTARY FIGURES AND TABLES
H. ACKNOWLEDGEMENTS
I. SUPPLEMENTARY REFERENCES

LIST OF FIGURES

Figure 1. Quality control and analysis flow chart
Figure 2. Results from ancestry analysis using genome-wide genotype data
Figure 3. Global CNV measures for blood versus cell-line DNA-derived samples
Figure 4. Examples of CNVs overlapping ASD candidate genes or loci
Figure 5. Number of enriched gene-sets at different FDR q-value thresholds
Figure 6. Overview of all enriched gene-sets in deletions
Figure 7. Detailed annotation of the main enriched clusters
Figure 8. Control for bias in length and number of deletions
Figure 9. Bias control for genome proximity
Figure 10. Distribution of q-values per gene-set
Figure 11. Gene coverage between ASD and/or ID gene-sets and sets enriched for deletions
Figure 12. Expanded functional map with high-level of annotation detail

LIST OF TABLES

Table 1. Quality control (QC) steps for CNV analysis of ASD cases and controls
Table 2. Chromosome abnormalities larger than 7.5 Mb detected during QC
Table 3. Summary of characteristics of stringent CNVs in cases and controls
Table 4. Characteristics of rare CNVs in European ASD probands and controls
Table 5. Global rare CNV burden analyses with respect to CNV size and CNV rate
Table 6. Examples of ASD candidate genes or loci
Table 7. Rare CNVs confirmed experimentally
Table 8. Rare CNVs in 996 ASD cases
Table 9. List of known ASD genes, ID genes, and ASD-candidates
Table 10. Clinically-relevant findings
Table 11. Population attributable risk (PAR)
Table 12. Number of analyzed gene-sets in the functional map enriched for deletions
Table 13. List of gene-sets enriched for deletions

A. AUTISM SPECTRUM DISORDER (ASD) SAMPLE AND CONTROL COLLECTIONS

ASD samples

In an ongoing effort, the international Autism Genome Project (AGP) Consortium is collecting ASD families for genetic studies. The first phase of this initiative involved examining genetic linkage and chromosomal rearrangements in 1,168 families having at least two ASD individuals¹. In this second phase of the project, we collected more families and genotyped them to examine copy number variation (CNV) and single nucleotide polymorphisms (SNPs) affecting risk for ASD. Here, we present the analysis of rare CNVs, which is outlined in Fig. 1 in the main text.

As discussed, 1,275 ASD cases (1,256 cases with both parents) were available for genotyping for our study (Fig. 1). DNA was obtained from blood (63%), buccal swabs (10%) or cell lines (22%); in 5% of cases the DNA source was not available. The Autism Diagnostic Interview-Revised (ADI-R)² and Autism Diagnostic Observation Schedule (ADOS)³ were used for research diagnostic classification. Subjects with previously known karyotypic abnormalities or other genetic disorders associated with ASD were excluded.

Our ascertainment of ASD cases was based on the following criteria. Affected subjects were grouped in three classes (strict, broad and spectrum ASD) based on proband diagnostic measures. To qualify for the strict class, affected individuals met criteria for autism on both primary instruments, the ADI-R and the ADOS. The broad class included individuals who met ADI-R criteria for autism and ADOS criteria for ASD, but not autism, or vice versa.
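The class assignment can be read as a small decision rule over the two instrument outcomes. The following is a minimal sketch of that rule, not the consortium's actual procedure: the names `Result` and `classify_proband` are hypothetical, and the spectrum rule follows the definition given later in this subsection (ASD on both instruments, or autism on one instrument when the other was not administered).

```python
from enum import Enum
from typing import Optional


class Result(Enum):
    """Outcome of a single instrument (ADI-R or ADOS). Hypothetical encoding."""
    AUTISM = "autism"  # meets criteria for autism
    ASD = "asd"        # meets criteria for ASD but not autism
    NONE = "none"      # evaluated, meets neither set of criteria


def classify_proband(adi_r: Optional[Result], ados: Optional[Result]) -> Optional[str]:
    """Return 'strict', 'broad', 'spectrum', or None if no class applies.

    A value of None for an instrument means it was not administered.
    """
    results = (adi_r, ados)
    # Strict: autism on both primary instruments.
    if adi_r is Result.AUTISM and ados is Result.AUTISM:
        return "strict"
    # Broad: autism on one instrument and ASD (not autism) on the other.
    if set(results) == {Result.AUTISM, Result.ASD}:
        return "broad"
    # Spectrum: ASD on both instruments ...
    if adi_r is Result.ASD and ados is Result.ASD:
        return "spectrum"
    # ... or autism on one instrument when the other was not administered.
    if None in results and Result.AUTISM in results:
        return "spectrum"
    return None
```

For example, `classify_proband(Result.AUTISM, Result.ASD)` falls in the broad class, while a proband evaluated on both instruments who meets criteria on only one of them is assigned no class under this sketch.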
ADI-R-based diagnostic classification of subjects as ASD followed criteria published by Risi et al.⁴. Specifically, individuals who almost met ADI-R criteria for autism were classified as ASD if they (1) met criteria on the social domain and on either the communication or repetitive behavior domain; or (2) met criteria on the social domain and were within 2 points of criteria for communication, or met criteria on communication and were within 2 points of social criteria, or were within 1 point of criteria on both the social and communication domains⁴. Finally, the spectrum class included all individuals who were classified as ASD on both the ADI-R and ADOS, or who were not evaluated on one of the instruments but were diagnosed with autism on the other instrument. Subjects from all classifications (strict, broad, and spectrum) were included in the CNV analysis.

Family-history reports were used to determine the family type. Multiplex (MPX) families had at least two individuals with validated ASD diagnoses who were first- to third-degree relatives (for third-degree relatives, only cousins were considered). This included families with affected dizygotic twins. Simplex (SPX) families had only one known individual with ASD among first- to third-degree (cousin) relatives. Families with only affected monozygotic twins were considered SPX. Unknown (UKN) families were any families that did not meet the MPX or SPX criteria above.

Given the international and multi-site nature of the project and the range of chronological and mental ages of the probands, a range of cognitive tests were administered, and standard scores were combined across tests to provide consolidated IQ estimates.

Control cohorts

Our primary considerations for selecting control groups for the genome-wide CNV comparison studies included using individuals with no obvious psychiatric history