Diagnóstico Molecular Do Transtorno Do Espectro Autista Através Do Sequenciamento Completo De Exoma Molecular Diagnosis Of
Total Page:16
File Type:pdf, Size:1020Kb
Universidade de São Paulo Tatiana Ferreira de Almeida Diagnóstico molecular do transtorno do espectro autista através do sequenciamento completo de exoma Molecular diagnosis of autism spectrum disorder through whole exome sequencing São Paulo 2018 Universidade de São Paulo Tatiana Ferreira de Almeida Diagnóstico molecular do transtorno do espectro autista através do sequenciamento completo de exoma Molecular diagnosis of autism spectrum disorder through whole exome sequencing Tese apresentada ao Instituto de Biociências da Universidade de São Paulo, para a obtenção de Título de Doutor em Ciências, na Área de Biologia/Genética. Versão corrigida. Orientador(a): Maria Rita dos Santos Passos-Bueno São Paulo ii 2018 Dedico esta tese à ciência, e a todos os seus amantes. O, reason not the need! Our basest beggars Are in the poorest thing superfluous. Allow not nature more than nature needs, Man’s life is cheap as beast’s. Thou art a lady: If only to go warm were gorgeous, Why, nature needs not what thou gorgeous wear’st Which scarcely keeps thee warm. But, for true need- You heavens, give me that patience, patience I need! You see me here, you gods, a poor old man, As full of grief as age; wretched in both.” iii King Lear, William Shakespeare, 1608 Agradecimentos Agradeço em primeiro lugar ao meu grande amor Danilo, sem ele, há 11 anos, essa jornada não teria começado, a vida não teria me surpreendido e eu estaria de volta em casa, com todos os peixinhos do mar, e infeliz pelo resto da minha vida. Agradeço minha família, principalmente minha mãe Flávia e meu pai Airton por sacrificarem com mínimos protestos, as melhores horas que passaríamos juntos para que eu cumprisse meu trabalho. Aos compadres e amigos Bruna e Wagniel e Cíntia e Fernando que não somente me apoiaram neste caminho, mas colocaram na minha vida os maiores presentes que eu poderia sonhar, Alice e Juliana, que mesmo em tão pouco tempo de vida e com tantas ausências ainda sorriem ao me ver. À minha orientadora Maria Rita Santos Passos-Bueno que com um raciocínio feroz e implacável não mediu esforços para me ajudar a percorrer o caminho mais desafiador que eu já enfrentei. Ao professor Paulo Otto que solidificou meu amor pela matemática e me ajudou a pensar em probabilidades. Agradeço ao Renato Puga que pegou na minha mão e me ensinou a programar (em awk!!), obrigada por ser o ursinho toda vez que eu errava uma vírgula. Jovem, “Até logo, e obrigada pelos pães!” Aos amigos de laboratório, o Lab200, que me ensinaram a força da cooperação, a pensar em todos. Eu nunca tinha visto pessoas trabalharem com tamanha vontade de fazer o melhor por todos e espero ter aprendido com vocês e levar para a vida essa tão valiosa lição. Gerson, Lucas e Luciano obrigada pela constante inspiração. Obrigada meninas do GeneEvol, Carol, Ágatha, Camila e Clarice, e mais tarde Duda, por me deixarem entrar na vida de vocês e sempre lembrarem de me incluir nos encontros mesmo eu quase nunca podendo ir. Aos amigos do DLE, principalmente Dayse, Karina, Cíntia e Mi, que sempre me viam correndo de um lado pro outro com as “coisas do doutorado” e foram compreensivos e pacientes com a vida dupla que eu levava. Ao Gustavo Campana que me mostrou como deve ser um exame para o diagnóstico laboratorial e me encantou com esse mundo da patologia clínica. Às amadas bailarinas, obrigada por fazerem da minha vida melhor e mais divertida, nada disso teria sido possível se não fosse minha terapia em roupas de ballet. Lili, obrigada por me ensinar tanto na vida e no palco. Paty, a bola é sua agora!! E finalmente à CAPES pelo apoio financeiro durante o percurso deste trabalho. iv Table of Contents 1 Introduction 1 1.1 Clinical and epidemiological aspects of autism spectrum disorder 1 1.2 Molecular aspects of autism spectrum disorder 3 1.3 Next Generation Sequencing and the Diagnosis of Rare Diseases 6 1.4 Next Generation Sequencing and the ASD Diagnosis 8 2 Objectives 9 3 Material and Methods 10 3.1 Subjects 11 3.1.1 Austism Spectrum Disorder Subjects 11 3.1.2 Control group 11 3.1.3 Kinship and ancestry analysis 15 3.2 Gene lists 17 3.2.1 Genes with critical exons 17 3.2.2 Chromatin modifier genes 18 3.2.3 1527 genes related to ASD 18 3.2.4 FMRP targets 18 3.2.5 208 genes related to ASD 19 3.2.6 GABA pathway 19 3.2.7 Hypermutable Genes 19 3.2.8 mTOR and RAS/MAPK Pathways 20 3.2.9 IGF1 Pathway 20 3.2.10 Genes intolerant to loss-of-function mutations 20 3.2.11 Differentially expressed genes 21 v 3.2.12 SFARI 21 3.3 Sequencing and annotation strategies 23 3.4 Comparisons pipeline 25 3.4.1 Rare Loss-of-function and missense 27 3.4.2 Discriminating models 29 3.4.2.1 Classification Performance 29 3.4.2.2 Logistic Regression 30 3.4.2.3 Decision Tree 31 3.4.2.4 Neural network 31 3.4.2.5 Support Vector Machine 32 3.4.3 Principal Component Analysis 32 4 Results 33 4.1 Exploratory Analysis 34 4.1.1 Removing outliers 37 4.1.2 Variants distribution 37 4.1.3 Kmeans 39 4.1.4 Correlation Matrices 41 4.1.5 Variant distribution between affected and control 42 4.1.6 Comparison Analysis 45 4.1.6.1 Loss-of-Function grouping 45 4.1.6.2 Multivariate models 46 4.1.6.3 Principal Components Analysis 47 4.1.7 Analysis by clusters 51 4.1.7.1 Cluster 1 52 4.1.7.2 Cluster 2 52 4.1.8 Separate regions 53 vi 4.2 Report Analysis 54 4.2.1 Genes with critical exons 55 4.2.2 Chromatin modifier genes 56 4.2.3 1,527 genes related to ASD 57 4.2.4 FMRP targets 57 4.2.5 208 genes related to ASD 59 4.2.6 GABA pathway 59 4.2.7 Hipermutable Genes 60 4.2.8 mTOR Pathway 60 4.2.9 IGF1 Pathway 61 4.2.10 Genes intolerant to loss-of-function mutations 61 4.2.11 Differentially expressed genes M1 62 4.2.12 Differentially expressed genes M2 62 4.2.13 Differentially expressed genes M8 63 4.2.14 Differentially expressed genes M16 63 4.2.15 Differentially expressed genes M18 65 4.2.16 RAS/MAPK Pathway 66 4.2.17 SFARI High Confidence 68 4.2.18 SFARI Strong Candidate 68 4.2.19 SFARI Suggestive Evidence 68 4.2.20 SFARI Syndromic 68 4.2.21 SFARI High Confidence, Strong Evidence, Suggestive Evidence and Syndromic... 69 5 Discussion and Conclusions 69 6 Resumo 77 7 Abstract 79 8 Reference 81 vii ?? Attachments ? ?? Attachment 1 ? ?? Attachment 2 ? ?? Attachment 3 ? ?? Attachment 4 ? ?? Attachment 5 ? ?? Attachment 6 ? ?? Attachment 7 ? ?? Attachment 8 ? ?? Attachment 9 ? ?? Attachment 10 ? ?? Attachment 11 ? ?? Attachment 12 ? ?? Attachment 13 ? ?? Attachment 14 ? ?? Attachment 15 ? ?? Attachment 16 ? ?? Attachment 17 ? ?? Attachment 18 ? ?? Attachment 19 ? ?? Attachment 20 ? ?? Attachment 21 ? ?? Attachment 22 ? ?? Prefilter ? ?? Spring ? ?? Counting ? viii ?? Attachment 23 ? ?? Attachment 24 ? ?? Attachment 25 ? ix 1 Introduction 1.1 Clinical and epidemiological aspects of autism spectrum disorder Autism spectrum disorders (ASD) are a group of neurodevelopmental disorders characterized by impaired communication skills, behavior and social interaction. According to the DSM-5 criteria (Diagnostic and Statistical Manual of Mental Disorders, 5th edition), the diagnosis of ASD is made based on clinical findings in two different categories, namely: (1) impaired social and communication skills and (2) stereotypic behavior and repetitive motions (LA TORRE-UBIETA et al., 2016; LORD; BISHOP, 2015). Social-communicative skill changes must present deficits in three distinct areas: (a) socio-emotional reciprocity, (b) nonverbal behaviors used in social interaction, and (c) maintaining, developing and understanding relationships, in addition to adjusting to various social contexts (LORD; BISHOP, 2015). This new classification encompasses every pervasive developmental disorder previously sorted into Asperger's syndrome, autism, childhood disintegrative disorders, and unspecified pervasive developmental disorders. According to the new classification criteria, although there is no age limit for symptom onset, all the individuals with ASD should present them from childhood, even if they are only recognized later. In addition to these core characteristics, ASD can present highly complex clinical features and harbor many comorbidities, where 50% of the cases present language delay, 35-50% are associated with intellectual disability (ID), approximately 21.5% of individuals with ASD and ID are affected by epilepsy, while only 8% of them present average intellect (BOURGERON, 2015; GESCHWIND; STATE, 2015; WOODBURY-SMITH; SCHERER, 2018). Gastrointestinal problems, sleep disorders and psychiatric disorders are also common ASD comorbidities (ANAGNOSTOU et al., 2014). Approximately 10-25% of the ASD cases are part of a recognizable Mendelian disorder, such as fragile X-Syndrome (1.9%), Rett syndrome, tuberous sclerosis (0.9%), neurofibromatosis type 1 (0.28%), Williams-Beuren syndrome and many others (ANAGNOSTOU et al., 2014; LA TORRE-UBIETA et al., 2016), a great number with a known molecular basis that can be tested on suspected cases. Thus, routine screening for Rett syndrome and fragile X syndrome should be performed on individuals with ASD (SCHAEFER; MENDELSOHN, 2013). This particular characteristic confers ASD the label of being a common syndrome that, at least in part, comprises several rare disorders (GESCHWIND; STATE, 2015). The available data suggest that at least some forms of ASD involve time-specific developmental deficits that might be targets for treatment, even after the symptoms first emerge (BETANCUR; SAKURAI; BUXBAUM, 2009).