A Gene Ontology Approach to Autism Spectrum Disorders Through the Window of Hypoxia

Shreya Lakshmi Ramachandran April 15, 2016


A GENE ONTOLOGY APPROACH TO AUTISM SPECTRUM DISORDERS THROUGH THE WINDOW OF HYPOXIA

An Honors Thesis Submitted to the Department of Biology in partial fulfillment of the Honors Program STANFORD UNIVERSITY

by SHREYA LAKSHMI RAMACHANDRAN APRIL 2016


Acknowledgements

This honors thesis is my first step towards achieving my dreams as a research scientist. I could not be more honored or grateful to have so many people cheering me on through the process. During every late night spent typing away in my room, I was supported not only by copious amounts of tea, but also by the knowledge that I have the world’s best team of mentors and cheerleaders on my side.

I of course would never have been able to complete this process without my amazing research mentors, Dr. Ruth O’Hara and Dr. Joachim Hallmayer. Thank you both for our weekly meetings, the inspiration to think more critically and write more precisely, and your constant kindness and encouragement. Dr. O’Hara, since my freshman year at Stanford, you have been such a figure of support and guidance in my life: teaching me to be a conscientious scientist, helping me set goals I am truly excited about, and encouraging and training me to work towards them. Dr. Hallmayer, you have been so generous in sharing your expertise in genetics with me while encouraging me to think critically and scientifically, and I am honored to have you training me to become a research geneticist like you. You have both put so much of your time and energy into making sure I am the best scientist I can be, and I will always carry your lessons with me.

To my second reader and major advisor, Dr. Russ Fernald, I am so grateful for every conversation in your office hours; I always leave feeling confident and excited about the future. I am thankful not only for your willingness to be my second reader and provide feedback, but also for your advice on science and life!

I am also grateful for the support of Dr. Rainald Schmidt-Kastner of Florida Atlantic University. Thank you for sharing your passion for hypoxia and gene-environment interactions with me. And of course, thank you for taking on the task of helping me validate my data. I have been so lucky to have you as a mentor!
To my amazing friends: Thank you for all the boba runs, the pictures of cute animals you sent me as motivation, and generally, all the love and support. Some stereotypes say that science is a lonely career, but even when I’m working by myself in the lab, I know I’ll never be lonely with people like you just a quick Snapchat away. And of course, during every one of my achievements, I’ve had the best family in the world behind me. Thank you for being my cheerleaders since day one. I am so lucky to have people who love me all over the world, from Berkeley to Chennai. Aolakah! Daddy, I couldn’t have had a better role model to show me what a kindhearted and ethical science nerd looks like. Thank you for showing me that I could be your princess and be a scientist at the same time. And of course, thank you for teaching me to chill, keep it simple, and, when in doubt, ask “what would Yoda do?” Mama, thank you for the chai, the proofreading, and the motivational Milo texts. I think anyone who talks to you now would think you have a deep and abiding passion for gene-environment interactions—but really, you just care about listening to all the things I babble about at the dinner table. Thank you for supporting your daughter in all her nerdy dreams, and teaching me that being happy and healthy is the most important goal of all. Milo, thank you for all the love, cuddles, and disemboweled stuffed animals! And Shasta, you remind me why I want to be a scientist: every day, you remind me both of the intrinsic beauty of science itself and its power to make the world a better place. I want to be you when I grow up. Love, Shreya

Table of Contents

Acknowledgements
List of Tables
List of Figures
Abstract
Introduction
Materials and Methods
Results
Discussion
Bibliography
Appendix

List of Tables

Table 1: AutDB database broken down by category of ASD evidence
Table 2: Results of literature search by two independent researchers
Table 3: Final ASD/Hypoxia gene set, separated by evidence of hypoxia regulation
Table 4: ASD/Hypoxia genes, divided by category of ASD evidence
Table 5: ASD/Hypoxia genes sorted by hypoxia evidence and ASD evidence
Table 6: Most significantly represented biological functions and diseases in ASD and ASD/Hypoxia gene sets
Table 7: Most commonly represented networks in ASD and ASD/Hypoxia gene sets

List of Figures

Figure 1: Comparison of ASD evidence categories in ASD and ASD/Hypoxia gene sets
Figure 2: Biological function heatmap for ASD gene set
Figure 3: Biological function heatmap for ASD/Hypoxia gene set
Figure 4: Most significantly represented network for ASD gene set
Figure 5: Most significantly represented network for ASD/Hypoxia gene set

Abstract

Autism spectrum disorders (ASDs) are fundamentally social and behavioral disorders with a range of comorbidities and social and financial impacts. Recent studies have estimated ASDs to have a heritability of around 50%, while indicating that there is also a significant environmental component. It is clear that neither genes nor environment in isolation can explain the etiology of autism. Large-scale studies have identified a set of genes that have been shown to have a high association with ASDs. In addition, recent studies have identified certain environmental factors associated with an increased risk for developing ASDs, with pre- and perinatal hypoxia as one of the more salient factors. However, the interaction between genes and environment through the lens of hypoxia has yet to be evaluated. This study aimed to find and characterize the intersection between genes associated with autism and the genes associated with the cellular response to hypoxia. Every gene in a database of autism-associated genes was interrogated, through a thorough literature search and comparison with a set of microarray data, for evidence of its regulation by hypoxia. This process created a set of genes associated with both autism and the hypoxia response. A statistical test for overrepresentation indicated that hypoxia-regulated genes were overrepresented in the ASD database; the proportion of ASD genes also responsive to hypoxia was roughly twice what would be expected by chance. Functional and network analyses then showed that specific biological functions were overrepresented in the ASD/Hypoxia gene set, indicating that the hypoxia-regulated genes among all ASD genes were not only more numerous than would be expected by chance, but also fell into specific networks and pathways. This lays the groundwork for functional characterization of variations in these genes in a population affected with autism as well as a neurotypical population.
With a better understanding of how variations in certain genes can affect an individual’s response to a potential hypoxic event, researchers and medical professionals alike can open up better therapeutic avenues for children and adults affected by ASDs.

Introduction

Overview

The most recent figure from the Centers for Disease Control and Prevention estimates the prevalence of Autism Spectrum Disorders (ASDs) as 1 in 68 [1]. ASDs are characterized in the DSM-5 by two primary core domains of symptoms: social and communicative difficulties and repetitive behaviors [2]. Yet their symptoms are not limited to social and communicative dysfunction; ASDs are complex conditions with a wide range of comorbidities. Moreover, the effects of ASDs extend beyond the affected children. Studies on their social impact have shown that families of children with ASDs experience significant financial problems [3] as well as emotional stress and stigmatization from the community [4]. The profound impact of ASDs at a patient, family, and societal level has led to a significant effort to identify the risk factors for and underlying etiology of ASDs in order to develop the best possible interventions and preventative approaches. As a result, ASDs are the subject of substantial research efforts.

For several decades, research on the etiology of ASDs has operated under the view that there is a very strong genetic basis for these disorders. A series of twin studies [5, 6, 7] from the 1970s through the 1990s all suggested a very high heritability of autism, i.e., a significant genetic component to its etiology. Motivated by this perspective, molecular genetics studies, including the Autism Genome Project, aimed to use new technologies to find specific gene variants associated with ASDs. These large-scale studies analyzed both copy number variants (CNVs) and common variants, or single nucleotide polymorphisms (SNPs), in search of variants that conferred an increased risk of ASDs. However, these studies suggested a lower genetic basis for ASDs than was previously understood.
While CNVs associated with ASDs were found, and shown to be in genes associated with specific processes in the central nervous system [8], a 2008 review estimated that “defined mutations, genetic syndromes, and de novo CNV account for about 10-20% of ASD cases.” [9] Likewise, data on common variants from the Autism Genome Project showed that the effects of individual SNPs were merely “modest” [10] and insufficient to explain the etiology of ASDs. Studies such as these showed that while genetic variants certainly have some effect on the development of ASDs, they do not provide a comprehensive explanation.

Further, a highly influential 2011 paper by Hallmayer et al. [11] also countered the idea that autism is highly heritable. As the largest twin study to date using the most contemporary diagnostic standards, this study showed that the heritability of ASDs, while still significant, was less than previously understood; it estimated a 38% heritability in contrast to previous twin studies, which suggested estimates as high as 90% [11]. More recent research [12] has only confirmed this finding.

The recently understood lower heritability of ASDs suggests that environmental factors also play a significant role. From the earliest days of research on ASDs, certain non-genetic biological factors have been shown to associate with an increased risk of ASDs. As early as 1971, a “strikingly high” rate of autism was found in a sample of children with congenital rubella [13]. Likewise, children exposed to thalidomide in utero demonstrated a 50-fold increase in autism prevalence compared to the general population [14]. Given the documented brain damage induced by thalidomide, these findings provided support for a model in which “brain damage sustained at different times of neural development” [14] served as a risk factor for ASDs. Indeed, a 1977 twin study found, in addition to a significant genetic component, that “in 12 out of 17 pairs discordant for autism, the presence of autism was associated with a biological hazard liable to cause brain damage.” [5]

Given the likely contribution of fetal brain damage to ASD development, many studies have focused on pre- and perinatal risk factors for ASDs. Of the risk factors studied, which include parental age, smoking and alcohol consumption during pregnancy, and place of birth, one of the more salient factors is the set of obstetric complications and symptoms associated with fetal hypoxia [15]. Pre- and perinatal hypoxia has been shown to induce damage to the developing brain, suggesting it as a likely risk factor for neurodevelopmental disorders.
However, studies specifically investigating the link between hypoxia and ASDs have shown that only a proportion of children who experience a hypoxic event then develop an ASD. It therefore stands to reason that pre- and perinatal hypoxia can partially explain the etiology of ASDs, but not entirely; likewise, genetic factors also only provide part of the explanation. The fact that only some children undergoing hypoxic trauma develop ASDs, coupled with the weak contributions of individual genetic variants, suggests that gene-environment interactions may play a role in their etiology. Here, I lay the groundwork for research on the role of gene-environment interactions in ASDs through the lens of hypoxia, by evaluating the intersection of ASD-associated genes and genes associated with the cellular response to hypoxia.

Autism Spectrum Disorders: An Overview

The core symptoms of autism spectrum disorders are fundamentally social and behavioral. The DSM-5 describes the two main diagnostic criteria as “persistent deficits in social communication and social interaction across contexts” and “restricted, repetitive patterns of behavior, interests, or activities.” [16] These core symptoms can vary in severity; while some affected individuals are able to function in society with support, more severely affected individuals, many of whom are nonverbal, suffer from severely impaired function and an inability to be independent. IQ levels in children with ASDs can also vary; while ASDs are often associated with delayed language development, they are prevalent in both high- and low-IQ individuals [17]. However, a longitudinal study showed that children with ASDs showed daily living skills “considerably below age level expectations.” [18]

Aside from the core symptoms of ASDs, affected children also suffer from a number of comorbidities. ASDs have been shown to have overlaps with psychiatric conditions such as ADHD, social anxiety, and OCD [19]. Yet their comorbidities can be somatic as well as psychiatric; children with autism have been shown to have a high rate of gastrointestinal symptoms, such as chronic constipation and diarrhea [20], as well as markers of immune dysfunction such as high levels of inflammatory cytokines [21].

The debilitating nature of severe ASDs (and the need for support even for less severe ASDs), together with their many comorbidities, means that ASDs take a financial and emotional toll not just on affected children, but on their families and communities as well. Families with children with ASDs were shown to have “greater financial, employment, and time burdens compared with other children with special health care needs.” [3, 22] In addition, unaffected siblings of children with ASDs showed higher depressive symptoms than a control population [23].
The effects of ASDs on both affected individuals and their families have underscored the necessity for effective therapies and interventions for children with ASDs. However, interventions to date have primarily focused on improving measures to diagnose ASDs earlier [24], and occupational therapies to improve social functioning [25]. While these can decrease the severity of ASD symptoms, they do not fully prevent the development of the disorders. In order to better develop preventative measures and more effective interventions, it is important to understand the risk factors and etiology of ASDs. However, ASDs are complex disorders with a large variety of factors that each confer a small risk of developing the disorders, and they lack a simple etiology. Studies have shown that both genetic and environmental factors can contribute to the development of the disorders; future directions will include looking at the interaction of the two in order to better understand the etiology of these disorders, leading to better therapies and interventions.

Genetic Contributions to Autism

The idea of genetic contributions to Autism Spectrum Disorders was initially suggested by the higher rate of ASDs in siblings of affected children, compared to the general population [26]. In 1977, Folstein and Rutter conducted a small-scale twin study in order to compare the concordance of ASDs in monozygotic versus dizygotic twins, thereby gaining insight into the heritability of ASDs, or the degree to which they can be attributed to genetic factors. The results indicated a much higher concordance for monozygotic than dizygotic twin pairs, suggesting a significant role for genetic factors [5]. These results, indicating a high degree of heritability for ASDs, were supported by later twin studies, including a 1989 Swedish study by Steffenburg et al. [6] and a 1994 British study by Bolton et al. [7]; the overall estimate of heritability was 90% [12].

The rise of molecular genetic technology in recent years supported these findings on the role of genetics by identifying a rapidly growing set of mostly rare genetic variants associated with the development of ASDs. Pathway analyses of these genes have shown that they are not randomly distributed throughout the genome, but fall into certain genetic pathways [27]. Knowledge of these pathways was used in 2014 [28] to develop a genetic diagnostic tool to predict ASDs in an individual based on the presence or absence of single nucleotide polymorphisms (SNPs), some associated with ASD vulnerability, and some with a protective effect. The 71.7% accuracy of this prediction tool supported the idea of a strong genetic component in ASDs. Similarly, a 2015 study by Sanders et al. investigating de novo copy number variants in ASDs found “strong evidence” for the association of de novo variants and ASD risk, and identified 65 new risk genes [29]. Analyzing the protein-protein interactions of these genes showed that they fell into two large subnetworks: one for synaptic genes and one for chromatin genes, suggesting that the ASD-associated genes pointed to specific biological mechanisms. The identification of these genes provides support for a model in which genetic factors contribute to the development of ASDs.

However, the most recent studies on autism heritability suggest that genetic studies alone cannot provide a complete explanation for the etiology of ASDs. Genetic studies on ASDs have analyzed both rare copy number variants (CNVs) and common single nucleotide polymorphisms (SNPs), and found that neither form of gene variant provided a complete etiology for ASDs. Studies on CNVs found that there were significant associations between certain CNVs and ASDs; for example, one analysis of the Autism Genome Project found seven samples with ASD-associated gains in 15q [30]. This same paper, however, pointed out that there were “a number of families in which only one of the affected relatives has a detected CNA [copy number abnormality],” and suggested that “relevant CNVs might be risk factors and not the only causal event.” [30] Analysis of SNPs showed similar results; individual common variants were shown to “exert weak effects on the risk for autism spectrum disorders” [10]. Studies such as these showed that while genetic effects were certainly present, individual variants only exerted very small effects on the risk for ASDs.

These results were supported by a study in 2011, the California Autism Twin Study by Hallmayer et al. [11], which aimed to rigorously quantify the heritability of autism spectrum disorders.
With 192 twin pairs analyzed, this was the largest study to date that used the most contemporary diagnostic standards for ASDs. The results, however, did not confirm prior studies’ finding that autism has a very high heritability. Instead, this study suggested that ASD risk has “moderate genetic heritability and a substantial shared twin environmental component.” [11] These results were later supported by a large-scale 2014 study on a Swedish cohort that estimated the heritability of ASDs to be around 50% [12]. It therefore follows that while genetic factors must contribute to ASD susceptibility, they are not the only factors implicated, indicating that environmental factors may play a role.

Hypoxia as an Environmental Risk Factor for ASDs

While many recent studies have focused on elucidating genetic mechanisms of ASD susceptibility, especially given the previous conception of a very high degree of genetic heritability, studies have also been performed to explore the idea of environmental risk factors. Many such risk factors have been suggested, from the oft-discussed vaccines to congenital diseases such as rubella. One salient idea throughout the history of ASD research is the suggestion that fetal brain damage can have an effect on ASDs, be it via thalidomide poisoning, maternal alcohol use, or obstetric complications. While factors such as thalidomide poisoning and rubella conferred a hugely elevated ASD risk, these are rare factors, and certainly do not account for more recent ASD cases; however, obstetric complications, and especially resulting fetal hypoxia, are still prevalent and have been shown to confer a moderate risk for ASDs.

Indeed, since the early days of research on children with ASDs and similar conditions, it has been suggested that complications during pregnancy were associated with neuropsychiatric and behavioral disorders. A 1956 study, published less than two decades after Leo Kanner wrote the first clinical description of autism [31], proposed that children experiencing trauma in utero and surviving would then experience the residual effects of their injury into adulthood, in the form of neurological and behavioral disorders [32]. More recent studies have focused on characterizing this connection more precisely, specifically by interrogating the connection between various environmental factors and injuries and fetal brain development. Studies have shown that the brain is particularly vulnerable during this time, because of the many sensitive cerebral processes, such as cell proliferation and synaptogenesis, whose temporal and spatial occurrences can be disturbed by environmental insults [33].
Many types of trauma can affect a developing brain, including low levels of nutrients [34], infection [35], and exposure to environmental toxins [33]. Of the wide variety of environmental factors, cerebral hypoxia-ischemia (the lack of oxygen reaching brain tissues) has been identified as a major cause of infant mortality or lifelong neurological conditions [36, 37, 34].

The effects of hypoxia on the brain have been extensively studied, and there is evidence for a connection between hypoxic injury and neurodevelopmental and psychiatric disorders. A 19-year longitudinal study [38] found a “strikingly elevated” risk for “schizophrenia and other nonaffective psychoses” associated with hypoxic-ischemic fetal and neonatal complications. This is not surprising given the number of studies elucidating the mechanism by which hypoxia can injure the developing brain. A 2011 review by Rees et al. of the effect of perinatal hypoxia [34] emphasized that brief periods of acute hypoxemia can lead to the death of vulnerable neuronal populations; for example, hypoxic events in later stages of gestation can result in neuronal death in the cerebral cortex as well as white matter damage [39].

Given the significant effects that hypoxic injury can have on a developing brain, it is a logical avenue of exploration with regard to the environmental sources of ASDs. To this end, a 2007 review by Kolevzon et al. of pre- and perinatal risk factors for autism analyzed seven different epidemiological studies, and isolated a set of obstetric variables that could be reporters of fetal hypoxia, including but not limited to “low Apgar score,” “fetal distress,” and “bleeding during pregnancy.” [40] Four of the seven studies analyzed by Kolevzon et al. examined measures of fetal hypoxia, and all four showed that low Apgar scores were predictors of autism; in addition, each study also found significant associations between autism risk and other markers of hypoxia. Overall, this led Kolevzon et al. to suggest that “hypoxia-related obstetric complications and fetal hypoxia may possibly increase the risk of autism” [40]. This confirmed the results of previous studies that analyzed a variety of obstetric factors as risk factors for autism and found that fetal distress and low Apgar scores, which can be markers of fetal hypoxia, are associated with increased risk of ASDs [41].
The idea of prenatal hypoxia as a risk factor was further supported by a 2009 meta-analysis by Gardener et al. on prenatal risk factors for autism. As part of the study, Gardener et al. analyzed 13 studies evaluating the link between maternal gestational bleeding and autism risk, and found an 81% elevated risk for autism in relation to maternal bleeding [42]. As mentioned above, maternal bleeding is “believed to be associated with fetal hypoxia” [42]. However, Gardener et al. showed that the data remained inconclusive with regard to other markers of hypoxia as risk factors for autism, and suggested that more investigation was necessary to solidify a connection between the two.

A follow-up meta-analysis by Gardener et al. [43] focused specifically on perinatal and neonatal risk factors for ASDs, and found many similar results. This study analyzed 40 studies and found that out of the factors with the strongest evidence for an autism association, many could also be associated with hypoxia. Specifically, “growth retardation, fetal distress, umbilical-cord wrapping around the neck, low Apgar score, respiratory distress, resuscitation, meconium aspiration, and Cesarean delivery” [43] were all shown to have evidence for an increased risk for autism. However, not all the studies analyzed by Gardener et al. for a given perinatal factor indicated an increased risk of autism. For example, of the four studies analyzed that investigated fetal distress, three showed null evidence for a risk of ASDs, and only one showed evidence. Likewise, of thirteen studies investigating neonatal respiratory distress, nine showed null results while four showed positive results. All studies analyzed by the paper found either null results or positive results for almost all hypoxia-associated risk factors. The only anomaly was Cesarean section, which one study found to be associated with a decreased risk of developing ASDs. The overall results do suggest a weak association between the analyzed risk factors and ASD. However, as Gardener et al. conclude, there is “insufficient evidence to implicate any one perinatal or neonatal factor in autism etiology.” [43]

A 2012 meta-analysis by Guinchat et al. found similar results, this time by grouping together a specific collection of factors known as markers of hypoxia: “lack of first cry, breath, or oxygen; blue baby; respiratory distress syndrome or assisted ventilation or asphyxia” [44] and analyzing 85 studies for evidence of a connection between these markers of hypoxia and ASD risk.
Out of the nine studies that mentioned “lack of first cry, breath, or oxygen; blue baby,” five showed positive results while four showed negative or null results. Likewise, out of the 17 studies that analyzed “respiratory distress syndrome or assisted ventilation or asphyxia”, only seven showed positive results. These associations were shown to be statistically significant, indicating a potential, if not all-encompassing, role for fetal hypoxia.

These meta-analyses have isolated fetal hypoxia as a risk factor based on a set of markers that could be associated with hypoxia, such as fetal distress. Based on this evidence, albeit weak, for fetal hypoxia as a potential risk factor for ASDs, recent studies have focused specifically on fetal hypoxia using more direct markers. A 2011 study by Burstyn et al. [45] used blood tests of acidity as a marker for fetal hypoxia. The data from this study showed a “weak effect of fetal hypoxia on risk of ASD among males” [45], a result that is interesting in light of the higher prevalence of ASDs in males than females.

Indeed, the same cohort of twins analyzed by Hallmayer et al. was later analyzed specifically for prenatal and perinatal risk factors by Froehlich-Santino et al. in 2014 [46]. Given the perinatal risk factors identified by Gardener et al. which were shown to be associated with hypoxia, Froehlich-Santino et al. devised a variable for “markers of hypoxia” like that used by Guinchat et al. In this study, “markers of hypoxia” were measured through a parent questionnaire that asked about the presence of factors such as “fetal distress, unscheduled C-section due to low fetal heart rate and/or distress, umbilical cord complications” among others [46]. The presence of a marker for hypoxia was shown to be “significantly more common” in twins with ASDs than in twins without. Specifically, of ASD-discordant twin pairs, 3 pairs had only the ASD-affected twin experience a marker of hypoxia, while no pairs had only the unaffected twin experience hypoxia [46]. Froehlich-Santino et al. confirmed the 2011 findings by Burstyn et al., showing that markers of hypoxia “were associated with an increased risk for ASDs in males.” [46]

Similar to the study by Burstyn et al., a very recent meta-analysis by Modabbernia et al. [47] focused specifically on impaired gas exchange at birth, which includes neonatal hypoxia, and its association with autism spectrum disorders.
This meta-analysis evaluated 67 studies, and, like the study by Burstyn et al., focused on “clinical proxies” of impaired gas exchange: “acidosis, Apgar score, need for resuscitation or oxygenation, apnea or delayed crying or breathing, respiratory distress syndrome, and asphyxia/hypoxia” [47]. The results of the meta-analysis showed that the presence of these markers for impaired gas exchange was strongly associated with “an increased risk of [intellectual disabilities], and (to a lesser extent) ASDs.” [47] This confirmed findings from a 2014 study by Maramara et al., which found infant hypoxia to be one of seven perinatal factors more common in a cohort of children with ASDs than in the general population in New Jersey. Maramara et al. also found an association between newborn hypoxia and maternal age; an increased maternal age, which is one of the more commonly found risk factors for ASDs, was shown to be associated with an increased risk of newborn hypoxia [48].

While the effects of neonatal hypoxia on ASD risk were shown to be stronger when specifically analyzed through direct markers of hypoxia such as acidosis, the correlation was still not strong enough to provide a complete explanation for ASDs. These studies have suggested that hypoxia may be just one factor in the development of ASDs. It is abundantly clear that while hypoxia is likely associated with an increased risk of ASDs, there are many children who experience fetal or perinatal hypoxia and do not subsequently develop an ASD. Guinchat et al. remained inconclusive on the effects of external factors such as hypoxia on ASD risk, instead suggesting that such factors may work in tandem with “genetic vulnerability.” [44] Likewise, Gardener et al. suggest that “environmental factors may interact with genetic factors to increase risk.” [43]

Independently, genetic and environmental analyses have shown that genetics and hypoxia can both increase the risk of ASDs. However, neither in isolation can provide a full explanation for the etiology. Given the strong evidence in favor of both genetic and environmental contributions to ASDs, and the failure of either lens to provide a comprehensive etiology, the natural next step is to look at the intersection of the two.

Interaction of Hypoxic and Genetic Factors

The response to hypoxia is highly genetically controlled. The cell’s response to hypoxia has been extensively studied on a molecular level; the effects of hypoxia are mediated by transcription factors known as hypoxia-inducible factors (HIFs), which contribute to homeostasis in changing oxygen concentrations [49]. As regulators of the cell’s hypoxia response, HIFs are implicated in the brain’s ability to respond to hypoxic or ischemic injury [50]. HIFs, being transcription factors, are merely one part of complex, multiple-step signal transduction pathways; therefore, there are numerous genes of various functions, all involved in regulating the brain’s response to hypoxia.

As perinatal hypoxia has been shown to have an association with ASDs, and hypoxia’s effects on the brain are mediated by a large set of genes, it is possible that variants in genes associated with hypoxia response are implicated in determining whether an ASD develops after a hypoxic event. However, the intersection between ASD-related gene variants and genes that control hypoxia response is not yet known. Based on the fact that hypoxia response and ASDs have a documented association, and that both conditions have a significant genetic component, it is hypothesized that there is a subset of genes associated with both ASD and hypoxia that would account for the vulnerability of individuals with those genes to develop ASD when exposed to hypoxia.

A similar framework was used in a study by Schmidt-Kastner et al. [51], focusing on the gene-environment interactions of hypoxia and genetics as a risk factor for schizophrenia. Schmidt-Kastner et al. showed that a significant portion of genes associated with schizophrenia were also linked to hypoxia-ischemia, and suggested that this supported a model in which gene-environment interactions, specifically interactions between hypoxia and the genome, could be investigated as a plausible etiology for schizophrenia. With regard to autism, the strong evidence in favor of both genetic and environmental factors suggests that a similar approach will be successful in identifying genes with a demonstrated link to autism that happen to be regulated by hypoxia.

In this paper, I hypothesize that there is a genetic component to the link between ASDs and hypoxia. I start with a large and publicly available dataset of genes shown to be associated with ASDs, collected from genome-wide association studies as well as other large-scale research. I then investigate each of these genes for regulation by hypoxia or hypoxia-inducible transcription factors. The resulting dataset of genes that are both associated with ASDs and regulated by hypoxia is then statistically analyzed. I see, as expected based on my hypothesis, an overrepresentation of hypoxia-regulated genes in the dataset of ASD-associated genes; based on this, I can describe the two gene sets from a functional standpoint.

Materials and Methods

Based on the known relationship between pre- and perinatal hypoxia and the risk of developing ASDs, combined with the role of genetic risk factors, I hypothesize that genetic factors interact with hypoxia in the development of at least a subgroup of ASDs. Therefore, my first research question asked whether the proportion of hypoxia-regulated genes in the ASD gene dataset was larger than would be expected due to chance. If the number of hypoxia-regulated genes is greater than expected by chance, this would provide support for a gene-environment interaction that involves hypoxia. To answer this question, I first curated a dataset comprising all genes in an ASD gene database that are also regulated by hypoxia. I then performed a χ2 test to see if the genes in this ASD/hypoxia dataset were overrepresented in the ASD set compared to the genome as a whole. Having determined that the hypoxia genes were significantly overrepresented, I then used Gene Ontology functional and network analyses to compare the ASD/hypoxia gene set with the ASD set as a whole.

Aim 1: Overrepresentation Analysis of ASD/Hypoxia Genes

The selection of genes associated with both autism spectrum disorders and the cellular hypoxia response was performed using an analytical approach similar to that used in 2012 by Dr. Rainald Schmidt-Kastner51 for the selection of genes associated with both schizophrenia and hypoxia.

Aim 1: Step 1: Selection of ASD-associated genes

The first step in the analysis was to obtain a dataset of genes shown to have associations with ASDs. While multiple autism-related databases have been created, I selected the database most frequently cited in research papers according to PubMed: a publicly available database named AutDB, produced and published by MindSpec, Inc. in McLean, VA, and licensed through the Simons Foundation as SFARI Gene52. The database consists of human genes shown to have an association with autism spectrum disorders. Since its launch in 2008, it has been updated every three months by expert researchers analyzing all published studies on ASD; as such, it is the most comprehensive database, representing the most current primary research at time of access. In addition to containing a curated set of genes from published and peer-reviewed studies, AutDB is unique among autism databases because it includes detailed annotations on the molecular function of each gene, as well as a description of the evidence for its connection to autism53, information that can be used to further characterize subsets of the dataset.

While the database has since grown, when accessed for this project in January 2015, it consisted of 667 genes selected as follows. For a gene to be included in the AutDB database, it must have a demonstrated association with ASDs in peer-reviewed studies, and match at least one of the following criteria:
1. rare single gene variants, such as polymorphisms and sub-microscopic CNVs, with direct links to ASDs
2. genes implicated in syndromes with significant autistic symptoms
3. genes shown to confer a small risk for ASDs through genome-wide association studies
4. functional candidate genes that are biologically relevant for ASDs, but not experimentally verified
5. genes included in multigenic CNVs associated with ASDs

A small subset of genes fell into more than one of the abovementioned categories.52 The contents of the database when accessed in January 2015 are described in Table 1.

Evidence for ASD link | Number of Genes | %
Rare single gene variant only | 323 | 48
Genetic association only | 165 | 25
Functional candidate only | 74 | 11
Syndromic only | 61 | 9
Multigenic CNV only | 18 | 3
Rare single gene variant, Genetic association | 14 | 2
Genetic association, Functional candidate | 5 | .8
Multigenic CNV, rare single gene variant | 3 | .5
Functional candidate, rare single gene variant | 2 | .3
Functional, Negative association | 1 | .2
Genetic association, Multigenic CNV | 1 | .2
Total | 667 | 100

Table 1: AutDB database genes broken down by category of ASD evidence

Aim 1: Step 2: Selection of Hypoxia-associated Genes

Having obtained a dataset of autism-associated genes, the next step was to find which of these genes were also regulated by hypoxia. I used a database curated by Dr. Schmidt-Kastner called the Ischemia-Hypoxia Response (IHR) database51. The IHR database comprises genes identified from 24 microarray expression studies on hypoxia in the brain54. These genes have all been shown to be either up- or down-regulated by hypoxia in the brains of rats and mice. The database originally consisted of 1750 genes; it was then refined through a comparison of the rat and mouse microarray data with a list of human gene symbols for autosomes55. In its current form, as used for this paper, the database consists of 1629 genes. Like AutDB, the genes comprising the database are manually curated by experts in the field; however, unlike AutDB’s multiple criteria, the evidence for a gene’s regulation by hypoxia must come specifically from microarray expression studies to qualify the gene for inclusion in the database.

Aim 1: Step 3: Comparison of AutDB database and Hypoxia-Regulated Genes

The IHR database’s inclusion criteria meant that evidence from non-microarray studies that a gene is hypoxia-regulated was not considered; likewise, the database’s use of mouse and rat brains meant that human-specific hypoxia-regulated genes would be excluded as well. As such, it was necessary not only to use the IHR database for comparison with AutDB, but also to search the literature for evidence of hypoxia regulation for the autism-associated genes. In order to be fully blinded, the literature review was done first; this eliminated bias in the assessment of genes from the literature based on knowledge of the IHR database’s contents.

Step 3, Part A: Method for Literature Review of Genes

I performed a search for evidence in the published scientific literature for regulation either directly by hypoxia or by Hypoxia-Inducible Factors (HIFs). All 667 genes in the AutDB database at time of access were systematically interrogated for association with hypoxia. If a literature search revealed a match for either of the following criteria, the gene was labeled as an ASD/Hypoxia gene.

The criteria were as follows:
1. The gene was shown to be regulated by ischemia-hypoxia in the brain, in studies found in the PubMed literature. It was necessary for the gene to be specifically mentioned as undergoing a change in expression in response to ischemia or hypoxia; both upregulation and downregulation as a response to ischemia and hypoxia were accepted.
2. The gene was regulated by hypoxia-inducible factors; specifically, the gene was a target gene for HIF-1 or HIF-2, as shown in the PubMed literature. Genes not directly acted upon by HIFs, but shown to be downstream from HIFs in signaling pathways, were included. However, genes upstream from HIFs, or exerting regulatory effects on HIFs, were not included unless they were shown to be directly regulated by hypoxia.

As autism is a neurodevelopmental disorder, and the brain is known to be extremely vulnerable to hypoxic-ischemic injury56,57, I hypothesized that a genetic component to central nervous system vulnerability to hypoxia was involved. Therefore, I selected genes that were shown to respond to brain-targeted ischemic-hypoxic injury. The exception was genes that directly bound to or formed a complex with HIFs, even if the study was in non-neural tissues.

The matching procedure started by using the criteria for the first test described above: the presence of literature evidence for regulation by hypoxia or HIFs. I entered each ASD gene in a PubMed search, first by its gene symbol and then by its full gene name, combined with search terms “hypoxia,” “ischemia,” and “HIF” in succession; a search for “HIF” was sufficient to bring up results for both HIF-1 and HIF-2. I then read and analyzed the abstract of each search result for evidence of HIF or hypoxia regulation. If the abstract did not provide sufficient information, the entire paper was then read. One study result demonstrating hypoxia-ischemia/HIF regulation for the gene was deemed sufficient evidence, and the PubMed ID number of one relevant study was added to the database as well as a brief summary of its link to hypoxia; the gene was labeled as an ASD/Literature gene.

To eliminate bias in the assessment of the papers, this analysis was performed blind to the contents of the IHR database described above, ensuring that prior knowledge of a hypoxia response did not affect the analysis of the abstracts. To ensure the reliability of results, two researchers carried out this analysis independently, and did not compare results until the entire literature analysis was complete. The two resulting gene sets were then compared and tested for inter-rater reliability using Cohen’s kappa58. Any discrepancy in the two researchers’ gene sets from the literature review portion was discussed until a conclusion was drawn.
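The search procedure described above can be sketched in a few lines of Python. This is only an illustration of how the six queries per gene might be enumerated: the gene symbol and name are hypothetical placeholders, and the quoting and `AND` syntax are assumptions, since the exact query strings entered into PubMed are not recorded here.

```python
from itertools import product

def build_queries(symbol, full_name, terms=("hypoxia", "ischemia", "HIF")):
    """Pair the gene symbol, then the full gene name, with each search term in turn."""
    # The quoting and the AND operator are assumed query syntax, not taken from the thesis.
    return [f'"{gene}" AND {term}' for gene, term in product((symbol, full_name), terms)]

# "GENE1" / "example gene 1" are placeholders, not genes from the AutDB dataset.
queries = build_queries("GENE1", "example gene 1")
for query in queries:
    print(query)
```

Each gene therefore yields six searches, symbol first and full name second, each combined with the three terms in succession.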

Step 3, Part B: Comparison of AutDB and IHR Database

Having completed the literature search, blind to each gene’s presence or absence in the IHR database, compiled by Dr. Schmidt-Kastner and described above, I then turned to the IHR database. For every gene in the AutDB database, I examined the IHR database to see if it was included. If a gene was included in both databases, it was labeled as an ASD/IHR gene.

At the end of Step 3, the dataset of ASD-associated genes that were also regulated by hypoxia was complete. All genes from the AutDB database with literature evidence for hypoxia regulation were labeled as ASD/Literature genes, while all AutDB genes that also appeared in the IHR database were labeled as ASD/IHR genes. To qualify as an ASD/Hypoxia gene, a gene was required to fall into at least one of the above two categories. This ensured that genes with hypoxia regulation shown in non-microarray studies would be included in the final analysis, as well as genes found through microarray studies but not independently analyzed for hypoxia responsiveness in other studies. Genes that fell into both categories were also included in the final dataset. The labels indicating whether a gene’s hypoxia evidence was from the literature or IHR database or both were retained when the gene sets were combined into the final ASD/Hypoxia dataset.
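The way the two evidence sources combine can be illustrated with set operations. The sketch below uses hypothetical placeholder symbols rather than genes from the actual dataset, and the variable names are my own; it only shows the logic of intersecting AutDB with each source, taking the union, and retaining the evidence label.

```python
# Placeholder symbols only; none of these are genes from AutDB or the IHR database.
autdb = {"GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"}
literature_hits = {"GENE_A", "GENE_B"}           # genes with literature evidence
ihr_database = {"GENE_B", "GENE_C", "GENE_X"}    # GENE_X is not an ASD gene

asd_literature = autdb & literature_hits         # ASD/Literature genes
asd_ihr = autdb & ihr_database                   # ASD/IHR genes
asd_hypoxia = asd_literature | asd_ihr           # at least one source of evidence

def evidence_label(gene):
    """Return which source(s) of hypoxia evidence support the gene."""
    in_literature, in_ihr = gene in asd_literature, gene in asd_ihr
    if in_literature and in_ihr:
        return "Literature and IHR"
    return "Literature only" if in_literature else "IHR only"

labels = {gene: evidence_label(gene) for gene in sorted(asd_hypoxia)}
print(labels)
```

Genes outside AutDB (GENE_X) and AutDB genes with no hypoxia evidence (GENE_D, GENE_E) drop out, while the label records where each remaining gene's evidence came from.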

Aim 1: Step 4: χ2 Test for Overrepresentation

The differences in the two methods for finding hypoxia-regulated genes meant that only genes from the IHR database were suitable for performing a χ2 test for overrepresentation. The IHR database was constructed using very specific criteria, focusing only on microarray gene expression studies. Because microarray studies probe all known genes and the same criteria were used to find every gene in the IHR database, the ASD/IHR gene set could be compared to the genome as a whole51. In contrast, the literature search for hypoxia-responsive genes only identified genes that had been of specific interest to researchers; it is impossible to perform a χ2 test on the literature genes, because no number could serve as a denominator in the way that the total number of genes in the genome does for the IHR genes.

Therefore, the ASD/IHR genes were analyzed as a subset of the larger ASD gene set, and statistical analyses were performed to test the hypothesis that the number of IHR genes in the ASD set was larger than would be expected by chance. The number of genes in the IHR database, compared to the genome as a whole, was used to estimate the number of IHR genes found in a random gene set, and this was compared to the percentage of IHR genes in the AutDB set. A χ2 analysis was used to test the validity of the hypothesis, and answer my first research question: are the hypoxia-regulated genes overrepresented in the ASD-associated gene set?

Aim 2: Statistical and Gene Ontology Comparison of ASD/Hypoxia and ASD Genes

My second set of research questions asked whether or not the ASD/Hypoxia gene set resembled a random sampling of the ASD set based on the evidence of association, and whether the ASD/Hypoxia genes impact specific functional pathways. To this end, I functionally annotated the ASD/Hypoxia gene set, and compared it to the annotated set of ASD-associated genes as a whole. I first examined the annotations for each gene provided by AutDB to find annotations that were over- or underrepresented in the ASD/Hypoxia subset. I then used Ingenuity Pathway Analysis (IPA) to functionally annotate the gene sets using Gene Ontology terms and network analysis. The gene sets were imported into IPA for comparison and analysis. A Core Analysis was performed on each gene set, and then a Comparison Analysis was performed on the resulting analyses.

Aim 2: Step 1: Statistical Analysis of Support for Autism

The first analysis performed focused on the “Support for Autism” annotation included for every gene in AutDB. As described above, the compilers of AutDB defined five categories of evidence associating a gene with ASDs, including results from association studies and rare single gene variants, and a gene was required to match at least one of these categories to qualify for inclusion in the database. In order to answer the question of whether the ASD/Hypoxia gene set resembled a random sample of the ASD genes, I sorted each gene set by the different categories of ASD evidence. I then performed a χ2 test for goodness of fit to see if the proportions of the different categories were significantly different between the two gene sets.

In order to more specifically characterize the ASD/Hypoxia genes, I also performed this analysis on subsets of the ASD/Hypoxia gene set. I divided the ASD/Hypoxia genes into five categories based on the evidence I used to designate the gene as regulated by hypoxia. These categories did overlap, in order to thoroughly compare the two forms of hypoxia evidence. The categories were as follows:
1. Literature evidence only
2. IHR database evidence only
3. Both literature and IHR evidence
4. Literature evidence (both with and without additional IHR evidence)
5. IHR database evidence (both with and without additional literature evidence)

I then performed a similar analysis, sorting these subsets by ASD evidence, and performed χ2 tests to compare the proportions of ASD evidence categories for each subset, as compared to the overall ASD/Hypoxia gene set.

Aim 2: Step 2: Functional Analysis

The next step was to use Ingenuity Pathway Analysis (IPA) to biologically characterize the differences between the two gene sets. The first analysis focused on the biological functions represented by the two sets. IPA draws upon a large repository of genetic and molecular information, including Gene Ontology, to label each gene with its associated biological functions. Using IPA’s functional analysis tool, the most significant biological functions represented by the two different gene sets were compared. For each gene set, the Core Analysis was performed and the Diseases and Biological Functions tool was used to examine the most overrepresented biological functions in the set. The list of functions was sorted by increasing P-value, to find the most significant results for each set. The sorted lists for each gene set were then compared to determine whether the ASD/Hypoxia gene subset resembled a random selection of the ASD gene set from a biological standpoint, or if specific functions were overrepresented in the subset.

Aim 2: Step 3: Network Analysis

In a similar manner, IPA’s network analysis tool was used to look for differences in the most significant networks represented by the two datasets. Each genetic network identified by IPA from the datasets included information about the main diseases and functions associated with the network. As above, it was possible to compare the top functions for each network.

Results

Aim 1: Overrepresentation Analysis of ASD/Hypoxia Genes

Statistical Test for Significance

The ability of microarray tests to investigate the entire genome, and the fact that only microarray studies were used to create the IHR database, meant that the IHR database could be used as a measure of whether representation of hypoxia-regulated genes in the ASD set was greater than would be expected by chance. The IHR database consists of 1629 genes; assuming a low estimate of 20,000 genes in the genome, 1629/20000 or approximately 8% of genes in a given gene set would be in the IHR database by chance alone. As such, we would expect approximately 54 of the 667 ASD genes to be IHR genes by chance. However, we found 124 ASD genes to also be IHR genes, approximately 19%; a χ2 test showed a P-value < 0.00001. Likewise, we would expect 667/20000 = 3% of a given gene set to be associated with ASDs by chance; however, 124/1629 = 8% of all IHR genes were associated with ASD. Based on the IHR microarray data, ASD genes had more than twice the likelihood of being associated with ischemia-hypoxia than would be expected by chance, again with a P-value < 0.00001.
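The arithmetic behind this comparison can be reproduced with a short sketch. It assumes a one-degree-of-freedom χ2 goodness-of-fit test of the observed split (IHR versus non-IHR genes among the 667 ASD genes) against the genome-wide proportion; the exact variant of the χ2 test used is not specified above, so the function below is an illustration rather than a record of the original calculation.

```python
import math

def chi2_overrepresentation(observed_hits, set_size, background_hits, background_size):
    """Chi-square goodness-of-fit (1 df) of a gene set against a background proportion."""
    background_rate = background_hits / background_size
    expected_hits = set_size * background_rate
    expected_misses = set_size - expected_hits
    observed_misses = set_size - observed_hits
    chi2 = ((observed_hits - expected_hits) ** 2 / expected_hits
            + (observed_misses - expected_misses) ** 2 / expected_misses)
    # Survival function of the chi-square distribution with one degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return expected_hits, chi2, p_value

# 124 of 667 ASD genes are IHR genes; 1629 of an assumed 20,000 genes are in the IHR database.
expected, chi2, p = chi2_overrepresentation(124, 667, 1629, 20000)
print(f"expected IHR genes: {expected:.1f}, chi2: {chi2:.1f}, p < 0.00001: {p < 0.00001}")
```

Under these assumptions the expected count comes out near 54 genes, matching the figure given above, and the P-value falls far below the 0.00001 threshold.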

Gene Selection and Reliability Test

The AutDB database included 667 ASD-associated genes. An initial comparison of the literature-derived gene sets for hypoxia association revealed 37 genes with identical papers selected by both reviewers, as well as 35 more identified in similar, but not identical, papers. As only one paper was required for evidence of hypoxia, this meant 72 out of 667 genes were agreed to be associated with hypoxia or HIFs given the criteria for the literature search. In addition, 505 out of 667 genes were agreed to have no association with hypoxia under the same criteria. However, there were also discrepancies in the gene sets identified by the two researchers. S.R. identified 63 hypoxia-associated genes not identified by R.S-K., and R.S-K. found 27 genes not identified by S.R. The resulting Cohen’s kappa58 between the two researchers for the hypoxia test was 0.536 (Table 2).

 | S: Hypoxia | S: Not Hypoxia | Total
R: Hypoxia | 72 | 27 | 99
R: Not Hypoxia | 63 | 505 | 568
Total | 135 | 532 | 667

Table 2: Results of literature search by two independent researchers
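As a check on the inter-rater reliability, Cohen's kappa can be recomputed directly from the four cells of Table 2. The sketch below uses the standard definition of kappa (observed agreement corrected for the agreement expected by chance); the function and argument names are my own.

```python
def cohens_kappa(both_yes, r_only, s_only, both_no):
    """Cohen's kappa for two raters making a yes/no call on the same items."""
    n = both_yes + r_only + s_only + both_no
    p_observed = (both_yes + both_no) / n
    r_yes = both_yes + r_only          # genes R.S-K. called hypoxia-associated
    s_yes = both_yes + s_only          # genes S.R. called hypoxia-associated
    # Chance agreement: both say yes or both say no, if the raters were independent.
    p_expected = (r_yes * s_yes + (n - r_yes) * (n - s_yes)) / n ** 2
    return (p_observed - p_expected) / (1 - p_expected)

kappa = cohens_kappa(both_yes=72, r_only=27, s_only=63, both_no=505)
print(round(kappa, 3))  # 0.536
```

The raw agreement is 577 of 667 genes (about 87%), but because most genes are "not hypoxia" for both raters, chance agreement is high and kappa is correspondingly lower.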

S.R. and R.S-K. then collaborated to evaluate the genes for which there was a discrepancy. Each researcher revisited the set of genes that the other researcher had identified and he or she had not, and critically analyzed each gene. This eliminated discrepancies in genes that had been simply overlooked during the first pass by either researcher. Other discrepancies were then eliminated after discussion and further investigation of the literature; for example, for a gene lacking consensus, it was collectively determined that relative hypoxia, induced by high altitude, was acceptable as a source of hypoxia in the brain. This process produced a dataset of 131 genes found in the literature to be associated with both ASDs and hypoxia, on which both researchers were in complete consensus.

The test of overlap between the AutDB database and the IHR database was less subjective than the test for evidence from scientific literature. Out of the 667 ASD genes, 124 were shown to also appear in the IHR database. Of these IHR genes, 60 showed additional evidence from the literature (that is, they were also found to be hypoxia-regulated by the literature search described above), and 64 IHR genes were not additionally supported by the literature search (Table 3). The complete gene set, as well as the evidence for each gene, can be found in the Appendix.

Literature Only | Literature and IHR | IHR only | Total
71 | 60 | 64 | 195

Table 3: Final ASD/Hypoxia gene set, separated by evidence of hypoxia regulation

Aim 2: Statistical and Gene Ontology Comparison of ASD/Hypoxia and ASD Genes

Statistical Analysis of Support for Autism

To start to answer the question of whether or not the 195 ASD/Hypoxia genes resembled a random sample of the ASD gene set, I divided both the full ASD dataset and the ASD/Hypoxia dataset into the different categories of ASD evidence as listed in AutDB. The category breakdown for the ASD/Hypoxia genes is shown in Table 4.

Evidence for ASD link | Number of Genes in ASD/Hypoxia Set | %
Rare single gene variant only | 70 | 36
Genetic association only | 55 | 28
Functional candidate only | 42 | 22
Syndromic only | 16 | 8
Multigenic CNV only | 3 | 1.5
Rare single gene variant, Genetic association | 5 | 2.5
Genetic association, Functional candidate | 1 | 0.5
Multigenic CNV, rare single gene variant | 1 | 0.5
Functional candidate, rare single gene variant | 2 | 1
Functional, Negative association | 0 | 0
Genetic association, Multigenic CNV | 0 | 0
Total | 195 | 100

Table 4: ASD/Hypoxia genes, divided by category of ASD evidence

Having sorted the gene sets by category, I performed a χ2 test for goodness of fit to compare the proportions of genes falling into each evidence category in the ASD set and the ASD/Hypoxia set. The results showed significant differences in the proportions of the different categories (p<0.0005), as shown in Figure 1. Specifically, the most significantly overrepresented category in the ASD/Hypoxia set was the functional candidates (p<0.00005), and the most significantly underrepresented was the rare single gene variants (p<0.05).
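The overall goodness-of-fit comparison can be sketched from the counts in Tables 1 and 4. The code below assumes that the expected counts for the ASD/Hypoxia set are the full ASD set's category counts scaled down to 195 genes, with all eleven categories kept separate (10 degrees of freedom); whether categories were pooled in the original analysis is not stated, so this is an illustration under those assumptions. The closed-form P-value used here is valid only for an even number of degrees of freedom.

```python
import math

# Category counts in the order of Tables 1 and 4.
asd_counts = [323, 165, 74, 61, 18, 14, 5, 3, 2, 1, 1]       # full ASD set (n=667)
hypoxia_counts = [70, 55, 42, 16, 3, 5, 1, 1, 2, 0, 0]       # ASD/Hypoxia set (n=195)

def chi2_goodness_of_fit(observed, reference):
    """Chi-square statistic of observed counts against reference proportions."""
    scale = sum(observed) / sum(reference)
    expected = [count * scale for count in reference]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, len(observed) - 1

def chi2_sf_even_df(chi2, df):
    """Upper-tail probability of a chi-square variable; closed form for even df only."""
    if df % 2 != 0:
        raise ValueError("this closed form requires an even number of degrees of freedom")
    half = chi2 / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))

chi2, df = chi2_goodness_of_fit(hypoxia_counts, asd_counts)
p = chi2_sf_even_df(chi2, df)
print(f"chi2 = {chi2:.2f}, df = {df}, p < 0.0005: {p < 0.0005}")
```

Under these assumptions the statistic is consistent with the reported p<0.0005. Several expected counts are below 5, which strains the χ2 approximation, so the result of this sketch should be read as indicative only.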

[Bar chart: percentage of genes in each ASD evidence category (0% to 60%) for the ASD and ASD/Hypoxia gene sets, with one bar per evidence category as listed in Table 1; labeled bar values: 48%, 36%, 28%, 25%, 22%, 11%.]

Figure 1: Comparison of ASD evidence categories in ASD and ASD/Hypoxia gene sets

I then sorted the hypoxia evidence subsets, described above, of the ASD/Hypoxia gene set, by ASD evidence category, as shown in Table 5. This was done in order to identify any significant differences in ASD evidence category that might occur when comparing different sources of hypoxia evidence. For every hypoxia evidence category, I performed a χ2 test for goodness of fit, comparing the proportions of ASD evidence between the hypoxia category and the overall ASD/Hypoxia dataset. Out of all five hypoxia categories, none showed significant differences from the ASD/Hypoxia dataset at a 0.05 significance level. From a statistical standpoint, that is, each category of hypoxia evidence resembled a random sampling of the ASD/Hypoxia dataset.

Evidence for ASD | Full ASD Set | ASD/Hypoxia | Lit Only | IHR Only | Overlap | All Lit | All IHR
Rare single gene variant only | 323 | 70 | 22 | 28 | 20 | 46 | 48
Genetic association only | 165 | 55 | 22 | 17 | 16 | 38 | 33
Functional candidate only | 74 | 42 | 16 | 10 | 16 | 32 | 26
Syndromic only | 61 | 16 | 8 | 4 | 4 | 12 | 8
Multigenic CNV only | 18 | 3 | 0 | 0 | 1 | 1 | 3
Rare single gene variant, Genetic association | 14 | 5 | 1 | 1 | 3 | 1 | 4
Genetic association, Functional candidate | 5 | 1 | 0 | 1 | 0 | 0 | 1
Multigenic CNV, rare single gene variant | 3 | 1 | 1 | 0 | 0 | 1 | 0
Functional candidate, rare single gene variant | 2 | 2 | 1 | 1 | 0 | 0 | 1
Functional, Negative association | 1 | 0 | 0 | 0 | 0 | 0 | 0
Genetic association, Multigenic CNV | 1 | 0 | 0 | 2 | 0 | 0 | 0
Sum | 667 | 195 | 71 | 64 | 60 | 131 | 124

Table 5: ASD/Hypoxia genes sorted by hypoxia evidence and ASD evidence

Functional Analysis

Given the significant overrepresentation of hypoxia-regulated genes in the ASD database, my next aim was to functionally characterize the two gene sets using Gene Ontology terms and network analysis. The larger ASD and hypoxia dataset (n=195), including IHR genes as well as genes with literature evidence not found in the IHR database, was imported into Ingenuity Pathway Analysis along with the original 667-gene AutDB dataset. The first analysis performed examined the most strongly represented diseases and biological function terms in the two datasets, in order to determine whether the hypoxia-associated genes comprised a random selection of the entire ASD dataset, or whether they clustered into specific functional groups. The 25 most strongly represented biological function terms for the two datasets are listed in order of increasing P-values in Table 6.

Rank | ASD Disease/Function | ASD P-Value | ASD/Hypoxia Disease/Function | ASD/Hypoxia P-Value
1 | Behavior | 1.98E-73 | behavior | 4.62E-63
2 | cognitive impairment | 8.08E-64 | neurotransmission | 7.93E-53
3 | development of neurons | 1.46E-55 | cognition | 9.65E-48
4 | Cognition | 6.32E-55 | learning | 1.08E-47
5 | schizophrenia spectrum disorder | 2.24E-54 | development of neurons | 1.05E-45
6 | neurotransmission | 9.16E-54 | synaptic transmission | 3.00E-44
7 | gastrointestinal adenocarcinoma | 3.58E-52 | organismal death | 1.08E-43
8 | gastrointestinal carcinoma | 6.57E-52 | schizophrenia spectrum disorder | 1.53E-42
9 | Learning | 9.19E-52 | cognitive impairment | 2.60E-42
10 | development of central nervous system | 2.29E-51 | quantity of cells | 3.81E-42
11 | abdominal adenocarcinoma | 1.20E-50 | transport of molecule | 6.55E-42
12 | mental retardation | 3.44E-50 | Movement Disorders | 4.77E-41
13 | pervasive developmental disorder | 3.59E-50 | generation of cells | 7.52E-41
14 | gastrointestinal tract cancer | 3.60E-50 | quantity of neurons | 1.33E-40
15 | abdominal carcinoma | 1.03E-49 | locomotion | 3.99E-40
16 | Gastrointestinal Tract Cancer and Tumors | 2.88E-49 | morphology of nervous system | 9.62E-39
17 | morphology of nervous system | 5.32E-48 | emotional behavior | 3.45E-38
18 | familial mental retardation | 5.89E-48 | seizure disorder | 7.34E-38
19 | microtubule dynamics | 4.82E-47 | conditioning | 5.27E-37
20 | intestinal cancer | 1.23E-46 | Schizophrenia | 8.58E-37
21 | malignant neoplasm of large intestine | 1.69E-46 | cell death | 1.80E-36
22 | synaptic transmission | 4.25E-46 | necrosis | 4.90E-35
23 | gastroesophageal adenocarcinoma | 6.85E-46 | cellular homeostasis | 1.03E-33
24 | digestive organ tumor | 9.08E-46 | differentiation of cells | 3.01E-33
25 | digestive system cancer | 1.53E-45 | abnormal morphology of nervous system | 3.03E-33

Table 6: Most significantly represented biological functions and diseases in ASD and ASD/Hypoxia gene sets
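The comparison of the two ranked lists can be illustrated with a small sketch using only the first ten rows of Table 6. It sorts each list by increasing P-value and then separates the shared terms from those unique to each set. Term names are lower-cased for matching, which is my own normalization; the comparison in IPA itself was done with its Comparison Analysis tool, not with this code.

```python
# (term, P-value) pairs copied from the first ten rows of Table 6, lower-cased.
asd_top = [
    ("behavior", 1.98e-73), ("cognitive impairment", 8.08e-64),
    ("development of neurons", 1.46e-55), ("cognition", 6.32e-55),
    ("schizophrenia spectrum disorder", 2.24e-54), ("neurotransmission", 9.16e-54),
    ("gastrointestinal adenocarcinoma", 3.58e-52), ("gastrointestinal carcinoma", 6.57e-52),
    ("learning", 9.19e-52), ("development of central nervous system", 2.29e-51),
]
hypoxia_top = [
    ("behavior", 4.62e-63), ("neurotransmission", 7.93e-53),
    ("cognition", 9.65e-48), ("learning", 1.08e-47),
    ("development of neurons", 1.05e-45), ("synaptic transmission", 3.00e-44),
    ("organismal death", 1.08e-43), ("schizophrenia spectrum disorder", 1.53e-42),
    ("cognitive impairment", 2.60e-42), ("quantity of cells", 3.81e-42),
]

def ranked_terms(rows):
    """Return the term names ordered by increasing P-value."""
    return [term for term, _ in sorted(rows, key=lambda row: row[1])]

asd_terms = ranked_terms(asd_top)
hypoxia_terms = ranked_terms(hypoxia_top)
shared = set(asd_terms) & set(hypoxia_terms)
asd_only = [term for term in asd_terms if term not in shared]
hypoxia_only = [term for term in hypoxia_terms if term not in shared]

print("shared:", len(shared))
print("ASD only:", asd_only)
print("ASD/Hypoxia only:", hypoxia_only)
```

Within these ten rows, the terms unique to the ASD list include the two gastrointestinal cancer terms, while those unique to the ASD/Hypoxia list concern synaptic transmission, organismal death, and cell quantity.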

Highly noticeable from the functional analysis was the high prevalence of biological functions related to nervous system development, neurological disease, and cell-to-cell signaling in the ASD/Hypoxia dataset. However, the two most strongly represented functional categories in the set of ASD genes were “organismal injury and abnormalities” and “cancer” (Figure 2). These two functional categories were conspicuously underrepresented in the ASD/Hypoxia dataset (Figure 3) compared to the whole ASD dataset.


Figure 2: Biological function heatmap for ASD gene set. Group A: Organismal Injury and Abnormalities. Group B: Cancer. Group C: Nervous System Development and Function. Group D: Neurological Disease.

Figure 3: Biological function heatmap for ASD/Hypoxia gene set. Group A: Organismal Injury and Abnormalities. Group B: Cancer. Group C: Nervous System Development and Function. Group D: Neurological Disease.

Network Analysis

The next analysis focused on the genetic networks that were most strongly represented by the two different datasets. IPA’s network analysis tool took a gene set as input, and found networks of interacting genes that included significant portions of the set. It then annotated the network with the biological function terms with which the network was most significantly associated. I hypothesized that if the hypoxia-associated genes resembled a random sampling of the ASD genes, the most significant networks represented in both gene sets would have similar biological function annotations. I ran network analyses on both the ASD/Hypoxia gene set and the ASD gene set. As with the functional analysis, the most significant networks represented by the two datasets were not the same. Considering only direct relationships between genes, the most significant network represented by the ASD gene set was annotated as “RNA Damage and Repair, Cancer, Dermatological Diseases and Conditions” (Figure 4).

Using the same criteria, the most significant network of the ASD/Hypoxia set was annotated as “Cell-to-Cell Signaling and Interaction, Nervous System Development and Function, Neurological Disease” (Figure 5).

Figure 4: Most significantly represented network for ASD gene set


Figure 5: Most significantly represented network for ASD/Hypoxia gene set

Similarly, the second most represented network for the ASD set was labeled “Cancer, Organismal Injury and Abnormalities, Reproductive System Disease” while the second most represented network for the ASD/Hypoxia set was labeled “Neurological Disease, Hereditary Disorder, Organismal Injury and Abnormalities.” Similar results were found when indirect relationships were also included in the analysis. As shown in Table 7, more neurological and psychological functions are represented by the ASD/Hypoxia networks as compared to the ASD networks.

ASD Networks | ASD/Hypoxia Networks
Cellular Assembly and Organization, Nervous System Development and Function, Organ Morphology | Cell-To-Cell Signaling and Interaction, Nervous System Development and Function, Behavior
Developmental Disorder, Hereditary Disorder, Organismal Injury and Abnormalities | Behavior, Cell-To-Cell Signaling and Interaction, Nervous System Development and Function
Behavior, Cellular Development, Cellular Growth and Proliferation | Gene Expression, Developmental Disorder, Neurological Disease
Cell-To-Cell Signaling and Interaction, Nervous System Development and Function, Behavior | Neurological Disease, Psychological Disorders, Developmental Disorder
Cancer, Dermatological Diseases and Conditions, Organismal Injury and Abnormalities | Cardiovascular System Development and Function, Behavior, Embryonic Development
Cell Signaling, Nucleic Acid Metabolism, Small Molecule Biochemistry | Psychological Disorders, Nutritional Disease, Digestive System Development and Function
Cell-To-Cell Signaling and Interaction, Neurological Disease, Nervous System Development and Function | Neurological Disease, Psychological Disorders, Gastrointestinal Disease
Cell Morphology, Cellular Compromise, Cellular Assembly and Organization | Behavior, Endocrine System Development and Function, Lipid Metabolism
Embryonic Development, Organismal Development, Cancer | Cell Morphology, Cellular Assembly and Organization, Cellular Development
Cell Death and Survival, Cellular Compromise, Neurological Disease | Lipid Metabolism, Molecular Transport, Small Molecule Biochemistry
Neurological Disease, Embryonic Development, Nervous System Development and Function | Cellular Development, Cellular Growth and Proliferation, Nervous System Development and Function
Neurological Disease, Auditory Disease, Inflammatory Disease | Cell Morphology, Hair and Skin Development and Function, Cancer
Behavior, Psychological Disorders, Nutritional Disease | Cell Morphology, Cellular Assembly and Organization, Cellular Development
Cellular Movement, Nervous System Development and Function, Cell Signaling | Skeletal and Muscular System Development and Function, Cancer, Cardiovascular System Development and Function
RNA Damage and Repair, Cellular Development, Nervous System Development and Function | Cancer, Organismal Injury and Abnormalities, Carbohydrate Metabolism

Table 7: Most commonly represented networks (indirect relationships included) in ASD and ASD/Hypoxia gene set

Discussion

In this study, I used a combination of data from microarray studies and literature searches to identify hypoxia-regulated genes within a dataset of autism-associated genes. I knew that a) genetic risk factors for ASDs exist but fail to provide a complete etiology, b) environmental risk factors likewise provide an incomplete etiology, and pre- and perinatal hypoxia is one of the most salient of these, and c) the cellular response to hypoxic injury is genetically controlled. Based on this, I hypothesized that there is a genetic component that impacts susceptibility to hypoxia and, as a consequence, ASD risk.

Therefore, the first main question I asked was whether the number of hypoxia-related genes found in the ASD dataset varied significantly from what would be expected by chance. Indeed, I found that a given gene had roughly twice the chance of being associated with hypoxia if it was selected from the ASD dataset rather than from the genome as a whole. This provides support for an interaction between hypoxia and genetic factors in the development of ASDs.

The ASD gene database I used included not just gene names, but also the evidence for each gene’s association with ASDs. This allowed me to compare the proportion of genes with a given type of evidence between the ASD gene set and the ASD/Hypoxia gene set. I found that the proportions of the varying types of ASD evidence did differ significantly across the two gene sets (p<0.0005). Specifically, the most significantly overrepresented category in the ASD/Hypoxia set was the functional candidates, and the most significantly underrepresented was the rare single gene variants. The functional candidate genes in the AutDB database are not yet directly experimentally linked to ASDs, but are biologically relevant for ASDs and have been associated with autism-like symptoms in humans or mice.
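The comparison of evidence-type proportions described above can be illustrated with a χ2 goodness-of-fit calculation. The following is a minimal sketch in Python, not the analysis code used for this thesis. All counts are made up for illustration, the category labels "syndromic" and "genetic_association" are assumed names for the remaining AutDB evidence types, and the function name is hypothetical. Expected counts for the ASD/Hypoxia set are derived from the category proportions in the full ASD set, which is a simplification because the subset is itself part of the full set.

```python
# Illustrative sketch only: every count below is hypothetical, not thesis data.

def chi_square_goodness_of_fit(observed, reference):
    """Compare category counts in a subset against reference proportions.

    observed  -- dict of category -> count in the subset (e.g. ASD/Hypoxia genes)
    reference -- dict of category -> count in the full set (e.g. all ASD genes)
    Returns (chi-square statistic, degrees of freedom, standardized residuals).
    """
    n_obs = sum(observed.values())
    n_ref = sum(reference.values())
    statistic = 0.0
    residuals = {}
    for category, ref_count in reference.items():
        # Expected count if the subset had the same proportions as the full set.
        expected = n_obs * ref_count / n_ref
        diff = observed.get(category, 0) - expected
        statistic += diff ** 2 / expected
        # Positive residual: overrepresented; negative: underrepresented.
        residuals[category] = diff / expected ** 0.5
    return statistic, len(reference) - 1, residuals


# Hypothetical evidence-category counts (assumed labels, invented numbers).
asd_counts = {
    "rare_single_gene": 300,
    "syndromic": 100,
    "genetic_association": 400,
    "functional_candidate": 200,
}
asd_hypoxia_counts = {
    "rare_single_gene": 40,
    "syndromic": 20,
    "genetic_association": 80,
    "functional_candidate": 60,
}

# Tabulated chi-square critical value for alpha = 0.05 with 3 degrees of freedom.
CRITICAL_VALUE_DF3 = 7.815

stat, df, resid = chi_square_goodness_of_fit(asd_hypoxia_counts, asd_counts)
print(f"chi-square = {stat:.2f} on {df} degrees of freedom")
print("differs from reference proportions:", stat > CRITICAL_VALUE_DF3)
print("most overrepresented:", max(resid, key=resid.get))
print("most underrepresented:", min(resid, key=resid.get))
```

The standardized residuals are what identify which categories drive a significant result; in this invented example they point to the functional candidates and the rare single gene variants only because the counts were chosen to echo the pattern reported in the text.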
It is plausible that these functional candidate genes are associated with neurodevelopmental processes; in my proposed model, these genes might mediate the functioning of these processes in response to hypoxic-ischemic injury.

The underrepresentation of rare single gene variants in the ASD/Hypoxia dataset also fits with my model. Many of these rare gene variants have higher penetrance and contribute more directly to the development of autism, so environmental factors play less of a role. In contrast, only a smaller subset of children carrying more common gene variants develop ASDs; pre- or perinatal hypoxia could be the factor differentiating children with the same gene variant who do or do not develop an ASD. The need for a second risk factor, in the form of hypoxic injury, could partially explain the small effect of individual common variants. Gene variants that are rarer but have a larger effect, by contrast, might not require the addition of hypoxic injury to contribute to ASD risk.

Of note is the finding that while the ASD evidence categories vary significantly in representation between the ASD/Hypoxia gene set and the ASD gene set, the different subsets of the ASD/Hypoxia set do not mirror this result. A χ2 analysis showed that the genes identified by each category of hypoxia evidence did not differ significantly from the ASD/Hypoxia dataset as a whole in the proportion of ASD evidence categories. This suggests that both methods of finding hypoxia evidence, the literature search and the microarray database search, found a representative sample of the ASD/Hypoxia genes, and neither method was strongly biased towards a certain set of ASD-related genes. This validated my choice to augment the IHR database with evidence from a literature search: adding evidence from the literature did not skew the findings towards a different set of ASD-associated genes.

It was thus shown that hypoxia-regulated genes were overrepresented in the ASD dataset, and that there were significant differences in composition between the ASD/Hypoxia set and the ASD set. The next question, asked from a molecular biological perspective, was whether these hypoxia-regulated genes were randomly selected from the ASD set or whether they fell into particular functional classifications. Here, my results were particularly striking, as seen in the heat maps.
Had the ASD/Hypoxia genes been a random selection of genes, the ASD functional category heat map and the ASD/Hypoxia heat map would have looked similar; however, one can see drastic differences in the functional categories represented. The ASD dataset showed the largest functional enrichment in the categories of “organismal injury and abnormality”, “cancer”, “nervous system development”, “neurological disease”, and “gastrointestinal disease.” The presence of these categories supports existing knowledge of ASDs’ complexity, and the fact that they have significant comorbidities, especially gastrointestinal, as well as neurodevelopmental symptoms. However, the ASD/Hypoxia dataset showed a very different enrichment pattern of functional categories. The “cancer” and “organismal injury and abnormality” categories were barely represented in this dataset. The comparative significance of “nervous system development and function” and “neurological disease” also supports a model for hypoxia increasing ASD risk by affecting neurodevelopment.

It might appear to be a trivial finding that brain-related Gene Ontology terms would be overrepresented in the ASD/Hypoxia dataset, given that my search was for genes regulated by hypoxia in the brain. Of note, however, is the striking lack of cancer-associated genes in the ASD/Hypoxia dataset, when they comprised a large portion of the entire ASD set. This is an interesting finding given that hypoxia has long been known to be associated with cancer, and HIFs are well known to be key factors in many processes involved in cancer [59], such as tumor formation, angiogenesis, and metastasis [59,60]. Despite the search for HIF-regulated genes, many of which presumably would be associated with cancer, no cancer-related terms were significantly enriched in the ASD/Hypoxia dataset. This suggests that the cancer-related functions of HIFs are less significant in this context than the neurodevelopmental functions. Paired with the significant overrepresentation of hypoxia-regulated genes, this provides support for my model as well.

While my findings do support the idea of genetic factors mediating the link between ASDs and hypoxia, it is important to note the limitations of the study as well. I utilized a relatively conservative approach to identifying genes associated with hypoxia; studies had to have been performed testing each gene’s regulation by hypoxia or HIFs. It is therefore possible that I overlooked other neuroprotective genes that may well protect against hypoxic injury, even if studies have not been performed showing direct regulation by hypoxia. In addition, I did not include genes involved in vascular expression or function without specific evidence for hypoxia regulation.
HIFs are also implicated in regulating angiogenesis and vascular function [51], and these functions of HIFs could additionally be involved in ASDs. More genes from the AutDB database would have matched criteria that included angiogenesis; future studies could broaden the scope of the criteria to include genes involved in angiogenesis in the brain.

Additional limitations of the study include the slight inconsistencies in forms of evidence inherent to the databases I used. While the IHR database contained genes identified in microarray studies in animals, the majority of the genes in the AutDB database were found through association studies in humans. Therefore, it may not be the most accurate comparison. However, given that there is no direct microarray test for ASDs comparable to the one used to check for hypoxia regulation, there is no choice but to use genes from association studies and functional candidate genes (many of which stem from knockdown experiments in mice). There are weaknesses inherent in this method; while the AutDB database is well-curated by experts, it is very possible that some genes identified in association studies and as functional candidates do not actually contribute to ASDs.

As a way of combating this limitation, the compilers of the AutDB database have added a “Gene Scoring Module” through which the larger research community can contribute assessments of the strength of the evidence supporting a given gene’s links to ASDs [52]. Genes can be assigned a score from 1 (high confidence) to 6 (not supported), with a separate category, S, for genes known to be related to syndromes that have ASD-like symptoms. The scoring process is performed by the research community at large, not the compilers of AutDB. As such, not all genes are represented in the list of scored genes; rather, researchers are able to pick specific genes of interest to them and assign them scores. Because not every gene in the AutDB database has been assigned a score in this module, it is difficult to statistically analyze the ASD/Hypoxia gene set in terms of scored genes. For example, performing a χ2 test to find over- or under-represented scores would be inaccurate because not all genes in the ASD/Hypoxia set have scores.

However, the presence of the Scoring Module does shed some light onto which of the ASD/Hypoxia genes have the strongest associations with ASDs. Here, I highlight the eight genes with literature evidence that have the highest ASD scores, and briefly describe the evidence for each gene’s responsiveness to neural hypoxia.
These genes provide support to my model in which variation in genes responsive to hypoxia determines the risk of ASDs by affecting neurodevelopment.

Tbr1 (T-box, brain, 1): ASD Score: 1. Tbr1, a high-confidence ASD-associated gene, is a transcription factor that determines differentiation of “early-born glutamatergic neocortical neurons” [61]. A 2009 study found an increase in the generation of Tbr1+ neurons in the brains of wild type mice exposed to hypoxic injury [62].

Cul3 (Cullin 3): ASD Score: 2. De novo mutations in Cul3 have been shown to be associated with idiopathic ASDs [63]. Cul3 has also been shown to be directly downstream of hypoxia signaling; hypoxia-induced microRNAs have been shown to induce signaling pathways by directly binding to Cul3 [64].

DSCAM (Down syndrome cell adhesion molecule): ASD Score: 2. De novo coding mutations in this gene have been associated with ASDs [65]. DSCAM is known to play a role in neuron development, and the number of DSCAM-positive cells in monkey dentate gyri was shown to increase significantly after cerebral ischemia [66].

NRXN1 (Neurexin 1): ASD Score: 2. Disruption of neurexin 1, a presynaptic protein, in subjects with chromosomal abnormalities was shown to be associated with ASDs [67]. Neurexin 1 was also shown to be upregulated in hypoxic rat brains [68].

RELN (Reelin): Polymorphisms in reelin, an extracellular matrix glycoprotein, were shown to be associated with ASDs [69]; in addition, reelin expression was shown to be significantly upregulated by hypoxia [70].

GRIP1 (Glutamate Receptor Interacting Protein 1): ASD Score: 2. Significant evidence exists relating ASDs to glutamate receptors and their regulators [71]. GRIP1 was shown to be downregulated in response to oxygen deprivation [72].

KDM5B (Lysine demethylase 5B): ASD Score: 2. As a histone demethylase involved in posttranslational modifications, KDM5B has been shown to be mutated in neurodevelopmental disorders including ASDs [73]. A 2009 study showed that demethylases such as KDM5B require oxygen to function and often respond to HIFs [74].

MET (HGF receptor): ASD Score: 2. Familial mutations in MET have been shown to be associated with ASDs in multiple studies [75,76]. MET also shows significant delays in expression following cerebral ischemia [77].

The aforementioned genes are by no means the only genes both associated with ASDs and responsive to hypoxia. However, they are the highest-scoring genes in terms of association with ASDs, and demonstrate significant expression changes in response to hypoxia-ischemia. Therefore, these genes provide support to my model.
It is possible to imagine that a mutation in one of these genes might increase the risk of ASD by altering the gene’s response to hypoxia, and thereby altering neurodevelopment.

Despite the abovementioned limitations of the study, the significant overrepresentation of hypoxia-regulated genes in the AutDB database does suggest that there is a connection between genetic risk factors, hypoxic injury, and the risk of developing ASDs. This is in line with studies that suggest that both genetic and environmental factors contribute to ASDs’ etiology. Indeed, my findings support a model in which neither genes nor environment function alone, but rather, gene-environment interactions can explain the etiology of ASDs. This is a cutting-edge theoretical framework in ASD research, and as yet there have been no studies interrogating gene-environment interactions specifically through the lens of hypoxia. However, this study lays the groundwork for future research along this avenue. Future studies could include in vitro analyses of these genes’ function in response to hypoxia in cells. In vivo, knockdown experiments in mice exposed to hypoxia could test whether certain genotypes increase the risk of autism-like symptoms only when paired with hypoxic injury. Eventually, these findings could lead to interventions such as improved prenatal care, perhaps including additional oxygen supplementation for fetuses known to carry an ASD/Hypoxia gene variant.

This study opens the door to a more nuanced characterization of the documented links between hypoxia and ASDs, and between genetic factors and ASDs. The future is full of possibilities to better explore these connections; with this database in hand, we are entering an exciting world of gene-environment analysis.
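The overrepresentation result that anchors this Discussion (a gene drawn from the ASD dataset being roughly twice as likely to be hypoxia-regulated as a gene drawn from the genome as a whole) can be framed as a hypergeometric test. The sketch below is an illustration under stated assumptions rather than the procedure used in this thesis: the genome size, gene counts, and function names are all hypothetical, chosen only so that the fold enrichment comes out at two.

```python
from math import comb

# Illustrative sketch only: every count below is hypothetical, not thesis data.


def hypergeom_upper_tail(overlap, genome_size, hypoxia_in_genome, asd_set_size):
    """P(X >= overlap) when asd_set_size genes are drawn at random, without
    replacement, from a genome containing hypoxia_in_genome hypoxia-regulated genes."""
    max_overlap = min(hypoxia_in_genome, asd_set_size)
    # Exact integer count of the draws that contain at least `overlap` hypoxia genes.
    favorable = sum(
        comb(hypoxia_in_genome, i)
        * comb(genome_size - hypoxia_in_genome, asd_set_size - i)
        for i in range(overlap, max_overlap + 1)
    )
    return favorable / comb(genome_size, asd_set_size)


def fold_enrichment(overlap, genome_size, hypoxia_in_genome, asd_set_size):
    """Ratio of the hypoxia-gene fraction in the ASD set to that in the genome."""
    return (overlap / asd_set_size) / (hypoxia_in_genome / genome_size)


# Hypothetical inputs, picked to give a twofold enrichment.
GENOME_SIZE = 20000        # assumed number of genes in the genome
HYPOXIA_IN_GENOME = 2000   # assumed hypoxia-regulated genes genome-wide
ASD_SET_SIZE = 600         # assumed size of the ASD gene set
OVERLAP = 120              # assumed ASD genes that are also hypoxia-regulated

p_value = hypergeom_upper_tail(OVERLAP, GENOME_SIZE, HYPOXIA_IN_GENOME, ASD_SET_SIZE)
fold = fold_enrichment(OVERLAP, GENOME_SIZE, HYPOXIA_IN_GENOME, ASD_SET_SIZE)
print(f"fold enrichment = {fold:.2f}, one-sided p = {p_value:.3g}")
```

Summing exact integer binomial coefficients and dividing once avoids floating-point underflow in the individual terms; a statistics library would normally be used instead, and the choice of background (whole genome versus brain-expressed genes, for example) would change the result.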

Bibliography

1. Prevalence of Autism Spectrum Disorder Among Children Aged 8 Years - Autism and Developmental Disabilities Monitoring Network, 11 Sites, United States, 2010. http://www.cdc.gov/mmwr/preview/mmwrhtml/ss6302a1.htm?s_cid=ss6302a1_w. Accessed February 25, 2016.
2. Jones EJH, Gliga T, Bedford R, Charman T, Johnson MH. Developmental pathways to autism: a review of prospective studies of infants at risk. Neurosci Biobehav Rev. 2014;39:1-33. doi:10.1016/j.neubiorev.2013.12.001.
3. Kogan MD, Strickland BB, Blumberg SJ, Singh GK, Perrin JM, van Dyck PC. A national profile of the health care experiences and family impact of autism spectrum disorder among children in the United States, 2005-2006. Pediatrics. 2008;122(6):e1149-e1158. doi:10.1542/peds.2008-1057.
4. Corcoran J, Berry A, Hill S. The lived experience of US parents of children with autism spectrum disorders: A systematic review and meta-synthesis. J Intellect Disabil. 2015;(1744-6309 (Electronic)). doi:10.1177/1744629515577876.
5. Folstein S, Rutter M. Infantile autism: a genetic study of 21 twin pairs. J Child Psychol Psychiatry. 1977;18(4):297-321. doi:10.1111/j.1469-7610.1977.tb00443.x.
6. Steffenburg S, Gillberg C, Hellgren L, et al. A Twin Study of Autism in Denmark, Finland, Iceland, Norway and Sweden. J Child Psychol Psychiatry. 1989;30(3):405-416. doi:10.1111/j.1469-7610.1989.tb00254.x.
7. Bolton P, Macdonald H, Pickles A, et al. A case-control family history study of autism. J Child Psychol Psychiatry Allied Discip. 1994;35(5):877-900. doi:10.1111/j.1469-7610.1994.tb02300.x.
8. Glessner JT, Wang K, Cai G, et al. Autism genome-wide copy number variation reveals ubiquitin and neuronal genes. Nature. 2009;459(7246):569-573. doi:10.1038/nature07953.
9. Abrahams BS, Geschwind DH. Advances in autism genetics: on the threshold of a new neurobiology. Nat Rev Genet. 2008;9(5):341-355. doi:10.1038/nrg2346.
10. Anney R, Klei L, Pinto D, et al. Individual common variants exert weak effects on the risk for autism spectrum disorders. Hum Mol Genet. 2012;21(21):4781-4792. doi:10.1093/hmg/dds301.
11. Hallmayer J, Cleveland S, Torres A, et al. Genetic Heritability and Shared Environmental Factors among Twin Pairs with Autism. Arch Gen Psychiatry. 2011;68(11):1095-1102. doi:10.1001/archgenpsychiatry.2011.76.
12. Sandin S, Lichtenstein P, Kuja-Halkola R, Larsson H, Hultman CM, Reichenberg A. The familial risk of autism. JAMA. 2014;311(17):1770-1777. doi:10.1001/jama.2014.4144.
13. Chess S. Follow-up report on autism in congenital rubella. J Autism Child Schizophr. 1977;7(1):69-81. doi:10.1007/BF01531116.
14. Stromland K, Nordin V, Miller M, Gillberg C. Autism in Thalidomide Embryopathy: a Population Study. 1993.
15. Hultman CM, Sparén P, Cnattingius S. Perinatal Risk Factors for Infantile Autism. Epidemiology. 2002;13(4):417-423. doi:10.1097/01.EDE.0000016968.14007.E6.
16. Carpenter L. DSM-5 Autism Spectrum Disorder. Dsm-5. 2013;(February):1-7. https://depts.washington.edu/dbpeds/Screening Tools/DSM-5(ASD.Guidelines)Feb2013.pdf.
17. Charman T, Jones CRG, Pickles A, Simonoff E, Baird G, Happé F. Defining the cognitive phenotype of autism. Brain Res. 2010. file:///Users/shreya/Downloads/download.pdf. Accessed October 5, 2015.
18. Bal VH, Kim SH, Cheong D, Lord C. Daily living skills in individuals with autism spectrum disorder from 2 to 21 years of age. Autism. 2015;19(7):774-784. doi:10.1177/1362361315575840.
19. Kaufmann W. DSM-5: The new diagnostic criteria for autism spectrum disorders. Res Symp Consortium, Boston, MA. 2012. http://www.autismconsortium.org/symposium-files/WalterKaufmannAC2012Symposium.pdf. Accessed October 5, 2015.
20. Gorrindo P, Williams KC, Lee EB, Walker LS, McGrew SG, Levitt P. Gastrointestinal dysfunction in autism: parental report, clinical evaluation, and associated factors. Autism Res. 2012;5(2):101-108. doi:10.1002/aur.237.
21. Onore C, Careaga M, Ashwood P. The role of immune dysfunction in the pathophysiology of autism. Brain Behav Immun. 2012;26(3):383-392. doi:10.1016/j.bbi.2011.08.007.
22. Zablotsky B, Kalb LG, Freedman B, Vasa R, Stuart EA. Health care experiences and perceived financial impact among families of children with an autism spectrum disorder. Psychiatr Serv. 2014;65(3):395-398. doi:10.1176/appi.ps.201200552.
23. Lovell B, Wetherell MA. The psychophysiological impact of childhood autism spectrum disorder on siblings. Res Dev Disabil. 2016;49-50:226-234. doi:10.1016/j.ridd.2015.11.023.
24. Dawson G, Rogers S, Munson J, et al. Randomized, Controlled Trial of an Intervention for Toddlers With Autism: The Early Start Denver Model. Pediatrics. 2010;125(1):e17-e23. doi:10.1542/peds.2009-0958.
25. Tanner K, Hand BN, Toole GO, Lane AE. Effectiveness of Interventions to Improve Social Participation, Play, Leisure, and Restricted and Repetitive Behaviors in People With Autism Spectrum Disorder: A Systematic Review. 2015;69(5).
26. Rutter M. Genetic studies of autism: From the 1970s into the millennium. J Abnorm Child Psychol. 2000;28(1):3-14. doi:10.1023/A:1005113900068.
27. Parikshak NN, Luo R, Zhang A, et al. Integrative functional genomic analyses implicate specific molecular pathways and circuits in autism. Cell. 2013;155(5):1008-1021. doi:10.1016/j.cell.2013.10.031.
28. Skafidas E, Testa R, Zantomio D, Chana G, Everall IP, Pantelis C. Predicting the diagnosis of autism spectrum disorder using gene pathway analysis. Mol Psychiatry. 2014;19(4):504-510. doi:10.1038/mp.2012.126.
29. Sanders SJ, He X, Willsey AJ, et al. Insights into Autism Spectrum Disorder Genomic Architecture and Biology from 71 Risk Loci. Neuron. 2015;87(6):1215-1233. doi:10.1016/j.neuron.2015.09.016.
30. Consortium TAGP, Szatmari P, Paterson AD, et al. Mapping autism risk loci using genetic linkage and chromosomal rearrangements. Nat Genet. 2007;39(3):319-328. doi:10.1038/ng1985.
31. Kanner L. Autistic Disturbances of Affective Contact.
32. Pasamanick B, Rogers ME, Lilienfeld AM. Pregnancy Experience and the Development of Behavior Disorder in Children. Am J Psychiatry. 1956;112(8):613-618. http://ajp.psychiatryonline.org/doi/pdf/10.1176/ajp.112.8.613.
33. Rice D, Barone S. Critical periods of vulnerability for the developing nervous system: Evidence from humans and animal models. Environ Health Perspect. 2000;108(SUPPL. 3):511-533. doi:10.1289/ehp.00108s3511.
34. Rees S, Harding R, Walker D. The biological basis of injury and neuroprotection in the fetal and neonatal brain. Int J Dev Neurosci. 2011;29(6):551-563. doi:10.1016/j.ijdevneu.2011.04.004.
35. Rees S, Harding R, Walker D. An adverse intrauterine environment: implications for injury and altered development of the brain. Int J Dev Neurosci. 2008;26(1):3-11. doi:10.1016/j.ijdevneu.2007.08.020.
36. Vannucci SJ, Hagberg H. Hypoxia-ischemia in the immature brain. J Exp Biol. 2004;207(Pt 18):3149-3154. doi:10.1242/jeb.01064.
37. Rees S, Harding R. Brain development during fetal life: Influences of the intra-uterine environment. Neurosci Lett. 2004;361(1-3):111-114. doi:10.1016/j.neulet.2004.02.002.
38. Zornberg GL, Buka SL, Tsuang MT. Hypoxic-ischemia-related fetal/neonatal complications and risk of schizophrenia and other nonaffective psychoses: A 19-year longitudinal study. Am J Psychiatry. 2000;157(2):196-202. doi:10.1176/appi.ajp.157.2.196.
39. Loeliger M, Watson CS, Reynolds JD, et al. Extracellular glutamate levels and neuropathology in cerebral white matter following repeated umbilical cord occlusion in the near term fetal sheep. Neuroscience. 2003;116(3):705-714. doi:10.1016/S0306-4522(02)00756-X.
40. Kolevzon A, Gross R, Reichenberg A. Prenatal and Perinatal Risk Factors for Autism. Arch Pediatr Adolesc Med. 2007;161(4):326. doi:10.1001/archpedi.161.4.326.
41. Glasson EJ, Bower C, Petterson B, de Klerk N, Chaney G, Hallmayer JF. Perinatal factors and the development of autism: a population study. Arch Gen Psychiatry. 2004;61(6):618-627. doi:10.1001/archpsyc.61.6.618.
42. Gardener H, Spiegelman D, Buka SL. Prenatal risk factors for autism: comprehensive meta-analysis. Br J Psychiatry. 2009;195(1):7-14. doi:10.1192/bjp.bp.108.051672.
43. Gardener H, Spiegelman D, Buka SL. Perinatal and Neonatal Risk Factors for Autism: A Comprehensive Meta-Analysis. Pediatrics. 2011;128(2):peds.2010-1036d - peds.2010-1036d. doi:10.1542/peds.2010-1036d.
44. Guinchat V, Thorsen P, Laurent C, Cans C, Bodeau N, Cohen D. Pre-, peri- and neonatal risk factors for autism. Acta Obstet Gynecol Scand. 2012;91(3):287-300. doi:10.1111/j.1600-0412.2011.01325.x.
45. Burstyn I, Wang X, Yasui Y, Sithole F, Zwaigenbaum L. Autism spectrum disorders and fetal hypoxia in a population-based cohort: accounting for missing exposures via Estimation-Maximization algorithm. BMC Med Res Methodol. 2011;11(1):2. doi:10.1186/1471-2288-11-2.
46. Froehlich-Santino W, Londono Tobon A, Cleveland S, et al. Prenatal and perinatal risk factors in a twin study of autism spectrum disorders. J Psychiatr Res. 2014;54(1):100-108. doi:10.1016/j.jpsychires.2014.03.019.
47. Modabbernia A, Mollon J, Boffetta P, Reichenberg A. Impaired Gas Exchange at Birth and Risk of Intellectual Disability and Autism: A Meta-analysis. J Autism Dev Disord. 2016. doi:10.1007/s10803-016-2717-5.
48. Maramara LA, He W, Ming X. Pre- and Perinatal Risk Factors for Autism Spectrum Disorder in a New Jersey Cohort. J Child Neurol. 2014;29(12):1645-1651. doi:10.1177/0883073813512899.
49. Benita Y, Kikuchi H, Smith AD, Zhang MQ, Chung DC, Xavier RJ. An integrative genomics approach identifies Hypoxia Inducible Factor-1 (HIF-1)-target genes that form the core response to hypoxia. Nucleic Acids Res. 2009;37(14):4587-4602. doi:10.1093/nar/gkp425.
50. Ziello JE, Jovin IS, Huang Y. Hypoxia-Inducible Factor (HIF)-1 regulatory pathway and its potential for therapeutic intervention in malignancy and ischemia. Yale J Biol Med. 2007;80(2):51-60. http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2140184&tool=pmcentrez&rendertype=abstract. Accessed March 4, 2015.
51. Schmidt-Kastner R, van Os J, Esquivel G, Steinbusch HWM, Rutten BPF. An environmental analysis of genes associated with schizophrenia: hypoxia and vascular factors as interacting elements in the neurodevelopmental model. Mol Psychiatry. 2012;17(12):1194-1205. doi:10.1038/mp.2011.183.
52. Basu SN, Kollu R, Banerjee-Basu S. AutDB: a gene reference resource for autism research. Nucleic Acids Res. 2009. http://nar.oxfordjournals.org/content/37/suppl_1/D832.full.pdf. Accessed September 16, 2015.
53. Banerjee-Basu S, Packer A. SFARI Gene: an evolving database for the autism research community. Dis Model Mech. 2010;3(3-4):133-135. doi:10.1242/dmm.005439.
54. Schmidt-Kastner R, Yamamoto H, Hamasaki D, et al. Hypoxia-regulated components of the U4/U6.U5 tri-small nuclear riboprotein complex: possible role in autosomal dominant retinitis pigmentosa. Mol Vis. 2008;14(December 2007):125-135. doi:v14/a16 [pii].
55. Cooper GM, Coe BP, Girirajan S, et al. A copy number variation morbidity map of developmental delay. Nat Genet. 2011;43(9):838-846. doi:10.1038/ng.909.
56. Bhalala US, Koehler RC, Kannan S. Neuroinflammation and neuroimmune dysregulation after acute hypoxic-ischemic injury of developing brain. Front Pediatr. 2014;2:144. doi:10.3389/fped.2014.00144.
57. Smith TF, Schmidt-Kastner R, McGeary JE, Kaczorowski JA, Knopik VS. Pre- and Perinatal Ischemia-Hypoxia, the Ischemia-Hypoxia Response Pathway, and ADHD Risk. Behav Genet. February 2016. doi:10.1007/s10519-016-9784-4.
58. McHugh ML. Interrater reliability: the kappa statistic. Biochem medica. 2012;22(3):276-282. http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3900052&tool=pmcentrez&rendertype=abstract. Accessed August 12, 2015.
59. Yang Y, Sun M, Wang L, Jiao B. HIFs, angiogenesis, and cancer. J Cell Biochem. 2013;114(5):967-974. doi:10.1002/jcb.24438.
60. Mucaj V, Shay JES, Simon MC. Effects of hypoxia and HIFs on cancer metabolism. Int J Hematol. 2012;95(5):464-470. doi:10.1007/s12185-012-1070-5.
61. Hevner RF, Shi L, Justice N, et al. Tbr1 regulates differentiation of the preplate and layer 6. Neuron. 2001;29(2):353-366. http://www.ncbi.nlm.nih.gov/pubmed/11239428. Accessed April 14, 2016.
62. Fagel DM, Ganat Y, Cheng E, et al. Fgfr1 is required for cortical regeneration and repair after perinatal hypoxia. J Neurosci. 2009;29(4):1202-1211. doi:10.1523/JNEUROSCI.4516-08.2009.
63. Codina-Solà M, Rodríguez-Santiago B, Homs A, et al. Integrated analysis of whole-exome sequencing and transcriptome profiling in males with autism spectrum disorders. Mol Autism. 2015;6:21. doi:10.1186/s13229-015-0017-0.
64. Kim J-H, Lee K-S, Lee D-K, et al. Hypoxia-responsive microRNA-101 promotes angiogenesis via heme oxygenase-1/vascular endothelial growth factor axis by targeting cullin 3. Antioxid Redox Signal. 2014;21(18):2469-2482. doi:10.1089/ars.2014.5856.
65. Turner TN, Hormozdiari F, Duyzend MH, et al. Genome Sequencing of Autism-Affected Families Reveals Disruption of Putative Noncoding Regulatory DNA. Am J Hum Genet. 2016;98(1):58-74. doi:10.1016/j.ajhg.2015.11.023.
66. Yamashima T, Popivanova BK, Guo J, et al. Implication of “Down syndrome cell adhesion molecule” in the hippocampal neurogenesis of ischemic monkeys. Hippocampus. 2006;16(11):924-935. doi:10.1002/hipo.20223.
67. Kim H-G, Kishikawa S, Higgins AW, et al. Disruption of neurexin 1 associated with autism spectrum disorder. Am J Hum Genet. 2008;82(1):199-207. doi:10.1016/j.ajhg.2007.09.011.
68. Sommer JU, Schmitt A, Heck M, et al. Differential expression of presynaptic genes in a rat model of postnatal hypoxia: relevance to schizophrenia. Eur Arch Psychiatry Clin Neurosci. 2010;260 Suppl:S81-S89. doi:10.1007/s00406-010-0159-1.
69. Shen Y, Xun G, Guo H, et al. Association and gene-gene interactions study of reelin signaling pathway related genes with autism in the Han Chinese population. Autism Res. August 2015. doi:10.1002/aur.1540.
70. Komitova M, Xenos D, Salmaso N, et al. Hypoxia-induced developmental delays of inhibitory interneurons are reversed by environmental enrichment in the postnatal mouse forebrain. J Neurosci. 2013;33(33):13375-13387. doi:10.1523/JNEUROSCI.5286-12.2013.
71. Uzunova G, Hollander E, Shepherd J. The role of ionotropic glutamate receptors in childhood neurodevelopmental disorders: autism spectrum disorders and fragile x syndrome. Curr Neuropharmacol. 2014;12(1):71-98. doi:10.2174/1570159X113116660046.
72. Fernandes J, Vieira M, Carreto L, et al. In vitro ischemia triggers a transcriptional response to down-regulate synaptic proteins in hippocampal neurons. PLoS One. 2014;9(6):e99958. doi:10.1371/journal.pone.0099958.
73. Vallianatos CN, Iwase S. Disrupted intricacy of histone H3K4 methylation in neurodevelopmental disorders. Epigenomics. 2015;7(3):503-519. doi:10.2217/epi.15.1.
74. Yang J, Ledaki I, Turley H, et al. Role of hypoxia-inducible factors in epigenetic regulation via histone demethylases. Ann N Y Acad Sci. 2009;1177:185-197. doi:10.1111/j.1749-6632.2009.05027.x.
75. Campbell DB, Buie TM, Winter H, et al. Distinct genetic risk based on association of MET in families with co-occurring autism and gastrointestinal conditions. Pediatrics. 2009;123(3):1018-1024. doi:10.1542/peds.2008-0819.
76. Lambert N, Wermenbol V, Pichon B, et al. A familial heterozygous null mutation of MET in autism spectrum disorder. Autism Res. 2014;7(5):617-622. doi:10.1002/aur.1396.
77. Nagayama T, Nagayama M, Kohara S, et al. Post-ischemic delayed expression of hepatocyte growth factor and c-Met in mouse brain following focal cerebral ischemia. Brain Res. 2004;999(2):155-166. doi:10.1016/j.brainres.2003.11.052.

Appendix: ASD/Hypoxia Genes

The following is the collected set of genes from the ASD database with evidence for regulation by hypoxia. Genes for which evidence was found from literature searches are annotated with the PubMed reference number for a study demonstrating hypoxia regulation, as well as a brief note describing the evidence. Genes found in the IHR database are labeled as IHR genes; some genes fell into both evidence categories.

| # | Gene Symbol | PubMed ID for literature evidence | Notes on hypoxia regulation of gene | IHR |
|---|---|---|---|---|
| 1 | ABAT | 11044580 | Elevated expression after ischemia | |
| 2 | ADA | 25720338 | important in perinatal hypoxia response | |
| 3 | ADARB1 | 16504947 | neuroprotection during ischemia | IHR |
| 4 | ADCY5 | | | IHR |
| 5 | ADK | 25720338 | expression altered by neonatal HI | IHR |
| 6 | ADNP | 21453737? | | |
| 7 | ADORA2A | 17033689 | upregulated after hypoxic preconditioning in immature brain | |
| 8 | ADORA3 | 12825837 | neuroprotection during hypoxia | |
| 9 | ADSL | | | IHR |
| 10 | AFF4 | 23746844 | part of a complex bound by HIF1A to stimulate RNAPII elongation | |
| 11 | AGTR2 | 24849663 | involved in heat acclimation-mediated neuroprotection | |
| 12 | ANK3 | | | IHR |
| 13 | ANXA1 | 16917513 | p53-dependent hypoxia response | IHR |
| 14 | APBA2 | | | IHR |
| 15 | APC | 24609463 | cells in hypoxic white matter lesions lack expression of APC | IHR |
| 16 | APH1A | 18063223 | hypoxia enhances expression | |
| 17 | APP | 25771168 | APP processing involved in retinal ganglion cell axonopathy after hypoxia | IHR |
| 18 | ARNT2 | 25017895 | connected to HIF, differential expression in dopaminergic expression could relate to hypoxia | |
| 19 | ASS1 | 17900569 | expressed after transient ischemia in rat brain | |
| 20 | ATG7 | PMC3512007 | role in autophagy and apoptosis under hypoxia | |
| 21 | ATP2B2 | | | IHR |
| 22 | BAIAP2 | 23363253 | hypoxia affects gene function in tumor cells | IHR |
| 23 | BCL2 | 25714473 | sleep apnea drug mitigated effects of this gene | IHR |
| 24 | BDNF | 24397751 | increased expression after hypoxia | IHR |
| 25 | BRAF | 2065747 | upregulated after hypoxia | |
| 26 | C4B | 18691639 | activated by hypoxia | |
| 27 | CA6 | 24275196 | levels increased after hypoxia | |
| 28 | CACNA1B | | | IHR |
| 29 | CACNA1C | PMC3726947 | Chronic hypoxia upregulates protein expression of Ca channel in pulmonary artery | IHR |
| 30 | CACNA1D | | | IHR |
| 31 | CACNB2 | | | IHR |
| 32 | CADPS2 | | | IHR |
| 33 | CAMK4 | 23868268 | neuroprotective in ischemia | |
| 34 | CD44 | 18638458 | induced after ischemia | IHR |
| 35 | CDKN1B | 11438580 | Activated after ischemia | |
| 36 | CHRNA7 | 10923672 | increased expression after transient hypoxia | IHR |
| 37 | CNR1 | 16632332 | regulated by hypoxia | |
| 38 | CNTN3 | | | IHR |
| 39 | CREBBP | 24383849 | hypoxia affects CREB signaling; CREBBP binds to CREB | IHR |
| 40 | CTCF | 22354964 | regulation dependent on hypoxic stress | |
| 41 | CTNNB1 | 25946682 | catenin-binding protein; catenin is involved in Wnt pathway mediating hypoxia response | IHR |
| 42 | CUL3 | 24844779 | Hypoxia-responsive miRNA stimulates angiogenesis via Cul3 | IHR |
| 43 | CX3CR1 | 11960641 | mice deficient in fractalkine (agonist for this receptor) less susceptible to ischemia-reperfusion injury | |
| 44 | CXCR3 | 24799675 | HIFs regulate these genes in breast cancer | |
| 45 | CYFIP1 | | | IHR |
| 46 | DAB1 | 19635490 | signaling levels decreased in prenatal hippocampus after maternal hypoxia | |
| 47 | DAPK1 | 24806680 | DAPK-1 interacts with p53 --> necrosis/apoptosis (mediating ischemic injury) | |
| 48 | DDC | 19457096 | HIF-2a function necessary for expression of DDC in sympathoadrenal progenitor cells | |
| 49 | DISC1 | 25738396 | Stability altered after hypoxia | |
| 50 | DLG4 | | | IHR |
| 51 | DLGAP3 | | | IHR |
| 52 | DLX1 | 16580139 | ischemia increased number of cells expressing this | IHR |
| 53 | DMD | PMC2958114 | dystrophin deficiency is associated with pathological hypoxic stress response, more sensitive to hypoxia induced muscle dysfunction | |
| 54 | DOCK10 | 22388811 | Involved in inflammation in response to hypoxia | |
| 55 | DPP6 | | | IHR |
| 56 | DPYD | | | IHR |
| 57 | DRD1 | 18535281 | D1R activation presynaptically depresses excitatory synaptic transmission in striatal neurons after ischemia | IHR |
| 58 | DRD2 | 26104289 | cerebral ischemia induces D2 expression in inflammatory cells | IHR |
| 59 | DRD3 | 19653907 | slow-reacting hypoxia-sensitive transcription factors may be involved in transactivation of these genes; connection with low neonatal cerebral blood flow as predictor of ADHD | |
| 60 | DSCAM | 16983647 | upregulated after ischemia | IHR |
| 61 | EEF1A2 | | | IHR |
| 62 | EGR2 | 21185809 | expression decreased after ischemia in enriched environment | IHR |
| 63 | EHMT1 | 25414343 | transcriptionally induced in response to hypoxia | |
| 64 | EIF4E | 25622105 | transcription factor, binds to HIF1a mRNA | |
| 65 | EIF4EBP2 | 23591646 | Correlates with neuron death after ischemia | |
| 66 | EP300 | 24782431 | Interacts with HIFs | |
| 67 | EPHA6 | | | IHR |
| 68 | ERBB4 | 24966332 | HIF-1a directly induces this in mammary gland | |
| 69 | ETFB | | | IHR |
| 70 | FABP3 | | | IHR |
| 71 | FABP5 | 19623607 | possibly involved in adult postischemic neuronal antiapoptosis | IHR |
| 72 | FABP7 | 25263561 | involved in HIF-regulated Notch pathway for glioblastoma stem cells | IHR |
| 73 | FLT1 | 25085511 | HIF target gene, necessary for angiogenesis | IHR |
| 74 | FMR1 | 26239490 | HI brain injury can affect protein product of gene | |
| 75 | FRK | 12419528 | Fyn kinase interactions increase after transient ischemia | |
| 76 | GABRA1 | 24513087 | receptor phosphorylation associated with neuronal death in in vitro ischemia | IHR |
| 77 | GABRB1 | | | IHR |
| 78 | GABRB3 | | | IHR |
| 79 | GAD1 | 20969567 | intermittent hypoxia during sleep modifies this protein | |
| 80 | GAP43 | 10037503 | dephosphorylation induced after hypoxia | |
| 81 | GPD2 | | | IHR |
| 82 | GPHN | 24513087 | Oxygen deprivation decreased interactions of GABA receptors with this scaffold protein | |
| 83 | GPX1 | 21193029 | pathway in cerebral ischemic injury | IHR |
| 84 | GRID2 | | | IHR |
| 85 | GRIK2 | | | IHR |
| 86 | GRIN1 | 22371606 | bound to REST which is upregulated by stroke | IHR |
| 87 | GRIN2A | | | IHR |
| 88 | GRIP1 | 24960035 | down-regulated after oxygen-glucose deprivation | IHR |
| 89 | GRM5 | 22034224 | mGluR5 is involved in proliferation of rat neural progenitor cells exposed to hypoxia | IHR |
| 90 | GSK3B | | | IHR |
| 91 | GSN | | | IHR |
| 92 | HCN1 | 25042871 | increased expression after hypoxia | |
| 93 | HDAC4 | 21917920 | regulates HIF1a protein acetylation in cancer cells | |
| 94 | HDAC6 | 26210454 | maintains mitochondrial connectivity after hypoxia | IHR |
| 95 | HOMER1 | 22465321 | protein level decreases after cerebral ischemia in rats | IHR |
| 96 | HSD11B1 | | | IHR |
| 97 | HTR1B | | | IHR |
| 98 | IMMP2L | 21824519 | deficiency increases ischemic brain damage in mice | |
| 99 | ITGB3 | 18342368 | HIF1a induces ITGB3 aggregation | IHR |
| 100 | KCNJ10 | 24431311 | expression patterns change after hypoxia | IHR |
| 101 | KCNMA1 | PMC4114839 | KCNMA1 encoded cardiac BK channels afford protection against Ischemia-Reperfusion Injury | IHR |
| 102 | KDM5B | 19845621 | histone demethylases sometimes induced by hypoxia via HIF-a | |
| 103 | KDM6B | 25520177 | hypoxia response gene regulated by HIF-2a | |
| 104 | KIF5C | | | IHR |
| 105 | KMO | 15206725 | Inhibitors of KMO reduced post ischemic neuron death | |
| 106 | LEP | 25889814 | Expressed in response to hypoxia | IHR |
| 107 | MAOA | 26499200 | target of HIF-1a | IHR |
| 108 | MAOB | 11005543 | MAO-B inhibitors can protect against ischemia-induced death | IHR |
| 109 | MAP2 | 25634435 | hypoxia --> neuronal damage/MAP2 loss | IHR |
| 110 | MAPK1 | 12871576 | | |
| 111 | MBD1 | 12421618 | different expression following hypoxia | |
| 112 | MBD3 | 12421618 | different expression following hypoxia | |
| 113 | MC4R | 16254026 | activation of receptor could reduce hippocampal brain damage after ischemia | |
| 114 | MECP2 | 3293990 | MECP2 regulates responses to hypoxia; 22297041 | |
| 115 | MEF2C | 4079651 | under hypoxic conditions the VEGF-A/bFGF-mediated upregulation of MEF2C is reduced and the production of alpha-2-macroglobulin largely abolished | IHR |
| 116 | MET | 14759494 | Delayed expression after ischemia | |
| 117 | MSN | 15181373 | increase after focal ischemia | IHR |
| 118 | MSR1 | | | IHR |
| 119 | MTF1 | 16216223 | transcription factor contributes to HIF-1 activation under hypoxia | |
| 120 | MTHFR | 24192699 | variants render neonates more vulnerable to cerebral injury in presence of hypoxia, risk factor for ischemic stroke | |
| 121 | MYT1L | | | IHR |
| 122 | NAA15 | | | IHR |
| 123 | NDUFA5 | | | IHR |
| 124 | NEFL | | | IHR |
| 125 | NFIA | 22807310 | inhibits repair after white matter injury | |
| 126 | NLGN1 | 17904739 | ischemia can decrease NLGN1 complex | IHR |
| 127 | NLGN2 | | | IHR |
| 128 | NLGN3 | | | IHR |
| 129 | NOS1AP | 23658158 | a peptide inhibiting interactions of this enzyme doubles surviving tissue in neonatal hypoxia | |
| 130 | NOS2A | 21984255 | hypoxia selective reductase | IHR |
| 131 | NR3C2 | 24564395 | expression decreased after hypoxia | IHR |
| 132 | NRG1 | 22200588 | mediates ischemia-induced angiogenesis, antioxidant | |
| 133 | NRXN1 | PMC2965359 | Neurexin is important signaling molecule in ischemia injury | IHR |
| 134 | NRXN3 | 22467039 | downregulated after ischemia | |
| 135 | NXPH1 | | | IHR |
| 136 | OPRM1 | 17360495 | ischemia promotes epigenetic reprogramming of this gene | IHR |
| 137 | P2RX4 | 19447505 | delayed expression after IH in rats | IHR |
| 138 | PAFAH1B1 | | | IHR |
| 139 | PCDHA4 | | | IHR |
| 140 | PDE4A | | | IHR |
| 141 | PDE4B | | | IHR |
| 142 | PER1 | 23469952 | increased neuronal injury in mice deficient for this gene, after ischemia | |
| 143 | PIK3CG | 17962628 | knockout impairs postischemic neovascularization | |
| 144 | PIK3R2 | | | IHR |
| 145 | PLAUR | | | IHR |
| 146 | PLCB1 | 24465776 | upregulation after hypoxia | IHR |
| 147 | PLCD1 | | | IHR |
| 148 | PON1 | 23497787 | polymorphism increases risk of ischemic stroke | IHR |
| 149 | PRKCB | | | IHR |
| 150 | PRODH | 18195713 | associated w/ hypoxia and schizophrenia | IHR |
| 151 | PTEN | 15102920 | suppression protects against ischemic neuron death | IHR |
| 152 | PTGS2 | 24685982 | hypoxia induces expression | IHR |
| 153 | PTPN11 | 11023980 | increased ischemia-induced brain damage with overexpression of this gene | |
| 154 | PTPRB | | | IHR |
| 155 | PTPRC | | | IHR |
| 156 | RAI1 | 21586670 | deletion increases hypoxia tolerance in yeast | |
| 157 | RAPGEF4 | | | IHR |
| 158 | RASD1 | 24548484 | | |
| 159 | RELN | PMC3742925 | Hypoxia increases Reelin expression | |
| 160 | RORA | 18658046 | function of RORalpha in amplification of hypoxia signaling | IHR |
| 161 | RPS6KA2 | | | IHR |
| 162 | SCN1A | 20483028 | decrease reactivity | IHR |
| 163 | SCN2A | | | IHR |
| 164 | SCN7A | | | IHR |
| 165 | SDC2 | 18803305 | expressed in developing rat microglia after hypoxia | IHR |
| 166 | SERPINE1 | 21193004 | inhibits tissue plasminogen activator (tPA), differentially regulated by hypoxia | IHR |
| 167 | SHANK3 | 21950801 | Regulated by ischemia | |
| 168 | SLC1A1 | 25326682 | HIF-dependent expression | IHR |
| 169 | SLC24A2 | | | IHR |
| 170 | SLC25A14 | 23266757 | UCP5 up in high altitude (relative hypoxia) | IHR |
| 171 | SLC4A10 | 17928512 | hypoxia decreases expression of gene | |
| 172 | SLC6A3 | 23800715 | could mediate neural damage following injury | IHR |
| 173 | SLC6A8 | | | IHR |
| 174 | SND1 | 23770094 | Responsive to hypoxia, alters miRNAs in response | |
| 175 | SOD1 | 24526229 | protection against cell death | IHR |
| 176 | STX1A | 20945070 | expression altered by hypoxia in developing rat brain | IHR |
| 177 | STXBP1 | | | IHR |
| 178 | SYN1 | | | IHR |
| 179 | SYN2 | | | IHR |
| 180 | TBL1XR1 | | | IHR |
| 181 | TBR1 | 19176828 | Increased generation of Tbr1+ neurons after hypoxia | |
| 182 | TCF4 | 10197513 | Decreased after ischemia in hippocampus | |
| 183 | TH | 25807177 | upregulated by HIF1A | |
| 184 | THBS1 | 26503986 | HIF-2a mediated | IHR |
| 185 | THRA | | | IHR |
| 186 | TNIP2 | | | IHR |
| 187 | TPH2 | 24336886 | greater ventilatory response in ++ mice than --, lower TPH2 levels associated with SIDS in humans | |
| 188 | TRPC6 | 24961969 | activated by severe hypoxia | |
| 189 | TSN | | | IHR |
| 190 | VIP | 23946395 | levels decreased in hypoxia-reared mice | IHR |
| 191 | WNT1 | 20716939 | neuroprotection during ischemia | |
| 192 | XPO1 | | | IHR |
| 193 | YWHAE | | | IHR |
| 194 | ZBTB16 | 20731660 | Zinc finger proteins downregulated after hypoxia | |
| 195 | ZMYND11 | | | IHR |
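The evidence categories described for this table (literature evidence, IHR membership, or both) can be tallied with a short script. The following is a minimal sketch for illustration only, not the procedure used in this thesis: the `GeneRow` type, the `evidence_category` helper, and the category labels are hypothetical names, and only six rows from the table are included.

```python
from collections import Counter
from typing import NamedTuple, Optional


class GeneRow(NamedTuple):
    """One table row (hypothetical structure, for illustration)."""
    symbol: str
    literature_id: Optional[str]  # PubMed or PMC ID; None when no literature evidence is listed
    in_ihr: bool                  # True when the row is labeled IHR


# Six rows copied from the table; a full tally would load all 195 rows.
ROWS = [
    GeneRow("ABAT", "11044580", False),
    GeneRow("ADARB1", "16504947", True),
    GeneRow("ADCY5", None, True),
    GeneRow("ATG7", "PMC3512007", False),
    GeneRow("BDNF", "24397751", True),
    GeneRow("ZMYND11", None, True),
]


def evidence_category(row: GeneRow) -> str:
    """Classify a row by which kind(s) of hypoxia evidence it carries."""
    if row.literature_id and row.in_ihr:
        return "both"
    if row.literature_id:
        return "literature only"
    if row.in_ihr:
        return "IHR only"
    return "none"


counts = Counter(evidence_category(r) for r in ROWS)
for category, n in sorted(counts.items()):
    print(f"{category}: {n}")
```

Keeping the literature ID and the IHR flag as separate fields lets a gene fall into both categories without duplicating its row.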
