OPEN Citation: Transl Psychiatry (2014) 4, e391; doi:10.1038/tp.2014.29 © 2014 Macmillan Publishers Limited All rights reserved 2158-3188/14 www.nature.com/tp

ORIGINAL ARTICLE Genetic risk prediction and neurobiological understanding of alcoholism

DF Levey1, H Le-Niculescu1, J Frank2, M Ayalew1, N Jain1, B Kirlin1, R Learman1, E Winiger1, Z Rodd1, A Shekhar1, N Schork3, F Kiefe4, N Wodarz5, B Müller-Myhsok6, N Dahmen7, GESGA Consortium, M Nöthen8, R Sherva9, L Farrer9, AH Smith10, HR Kranzler11, M Rietschel2, J Gelernter10 and AB Niculescu1,12

We have used a translational Convergent Functional Genomics (CFG) approach to discover involved in alcoholism, by -level integration of genome-wide association study (GWAS) data from a German dependence cohort with other genetic and gene expression data, from human and animal model studies, similar to our previous work in bipolar disorder and schizophrenia. A panel of all the nominally significant P-value single-nucleotide length polymorphisms (SNPs) in the top candidate genes discovered by CFG (n = 135 genes, 713 SNPs) was used to generate a genetic risk prediction score (GRPS), which showed a trend towards significance (P = 0.053) in separating alcohol dependent individuals from controls in an independent German test cohort. We then validated and prioritized our top findings from this discovery work, and subsequently tested them in three independent cohorts, from two continents. In order to validate and prioritize the key genes that drive behavior without some of the pleiotropic environmental confounds present in humans, we used a stress-reactive animal model of alcoholism developed by our group, the D-box binding (DBP) knockout mouse, consistent with the surfeit of stress theory of addiction proposed by Koob and colleagues. A much smaller panel (n = 11 genes, 66 SNPs) of the top CFG-discovered genes for alcoholism, cross-validated and prioritized by this stress-reactive animal model showed better predictive ability in the independent German test cohort (P = 0.041). The top CFG scoring gene for alcoholism from the initial discovery step, synuclein alpha (SNCA) remained the top gene after the stress-reactive animal model cross-validation. We also tested this small panel of genes in two other independent test cohorts from the United States, one with alcohol dependence (P = 0.00012) and one with alcohol abuse (a less severe form of alcoholism; P = 0.0094). SNCA by itself was able to separate alcoholics from controls in the alcohol-dependent cohort (P = 0.000013) and the alcohol abuse cohort (P = 0.023). So did eight other genes from the panel of 11 genes taken individually, albeit to a lesser extent and/or less broadly across cohorts. SNCA, GRM3 and MBP survived strict Bonferroni correction for multiple comparisons. Taken together, these results suggest that our stress-reactive DBP animal model helped to validate and prioritize from the CFG-discovered genes some of the key behaviorally relevant genes for alcoholism. These genes fall into a series of biological pathways involved in signal transduction, transmission of nerve impulse (including myelination) and cocaine addiction. Overall, our work provides leads towards a better understanding of illness, diagnostics and therapeutics, including treatment with omega-3 fatty acids. We also examined the overlap between the top candidate genes for alcoholism from this work and the top candidate genes for bipolar disorder, schizophrenia, anxiety from previous CFG analyses conducted by us, as well as cross-tested genetic risk predictions. This revealed the significant genetic overlap with other major psychiatric disorder domains, providing a basis for comorbidity and dual diagnosis, and placing alcohol use in the broader context of modulating the mental landscape.

Translational Psychiatry (2014) 4, e391; doi:10.1038/tp.2014.29; published online 20 May 2014

INTRODUCTION Alcohol use and overuse (alcoholism) have deep historical and ‘Drunkenness is nothing but voluntary madness.’ cultural roots, as well as important medical and societal —Seneca consequences.1 Whereas there is evidence for roles for both

1Department of Psychiatry, Indiana University School of Medicine, Indianapolis, IN, USA; 2Central Institute of Mental Health, Mannheim, Germany; 3Department of Human Biology, The J. Craig Venter Institute, La Jolla, CA, USA; 4Department of Addictive Behavior and Addiction Medicine, Central Institute of Mental Health, Medical Faculty Mannheim, Heidelberg University, Heidelberg, Germany; 5Department of Psychiatry, University Medical Center Regensburg, University of Regensburg, Regensburg, Germany; 6Department of Statistical Genetics, Max-Planck-Institute of Psychiatry, Munich, Germany; 7Department of Psychiatry, University of Mainz, Mainz, Germany; 8Department of Genomics, Life & Brain Center, Institute of Human Genetics, University of Bonn, Bonn, Germany; 9Department of Medicine (Biomedical Genetics), Boston University School of Medicine, Boston, MA, USA; 10Division of Human Genetics, Department of Psychiatry, Yale University School of Medicine, VA CT Healthcare Center, New Haven, CT, USA; 11Department of Psychiatry, University of Pennsylvania Perelman School of Medicine, and Philadelphia VAMC, Philadelphia, PA, USA and 12Indianapolis VA Medical Center, Indianapolis, IN, USA. Correspondence: Professor AB Niculescu, Department of Psychiatry, Indiana University School of Medicine, Neuroscience Research Building, 320 W. 15th Street, Indianapolis, IN 46202, USA. E-mail: [email protected] Received 23 January 2014; accepted 18 March 2014 Genetic risk prediction of alcoholism DF Levey et al 2 genes and environment in alcoholism, a comprehensive biological understanding of the disorder has been elusive so far, despite extensive work in the field. Most notably, there has been until recently insufficient translational integration across functional and genetic studies, and across human and animal model studies, resulting in missed opportunities for a comprehensive understanding. As part of a translational Convergent Functional Genomics (CFG) approach, developed by us over the last 15 years,2 and expanding upon our earlier work on identifying genes for alcoholism,3–5 we set out to comprehensively identify candidate genes, pathways and mechanisms for alcoholism, integrating the available evidence in the field to date. We have used data from a published German genome-wide association study for alcoholism.6 We integrated those data in a Bayesian-like manner with other human genetic data (association or linkage) for alcoholism, as well as human gene expression data, post- mortem brain gene expression data and peripheral (blood and cell culture) gene expression data. We also used relevant animal model genetic data (transgenic and quantitative trait loci (QTL)), as well as animal model gene expression data (brain and blood) generated by our group and others (Figures 1 and 2). Human data provide specificity for the illness, and animal model data provide Figure 2. Top candidate genes for alcoholism. sensitivity of detection. Together, they helped to identify and prioritize candidate genes for the illness using a polyevidence CFG fi can differentiate alcohol-dependent subjects from controls, score, resulting in essence in a de facto eld-wide integration fi putting together all the available lines of evidence to date. Once observing a trend towards signi cance. In order to validate and prioritize genes in this panel using a that is done, biological pathway analyses can be conducted and mechanistic models can be constructed. behavioral prism, we then looked at the overlap between our An obvious next step is developing a way of applying that panel of 135 top candidate genes and genes changed in expression in a stress-reactive animal model for alcoholism knowledge to genetic testing of individuals to determine risk for 4,5 the disorder. On the basis of our comprehensive identification of developed by our group, the DBP knockout mouse. We used top candidate genes described in this paper, we have chosen all this overlap to reduce our panel to 11 genes (66 SNPs). the nominally significant P-value SNPs corresponding to each of This small panel of 11 genes was subsequently tested and those 135 genes from the GWAS data set used for discovery (top shown to be able to differentiate between alcoholics and controls 51 candidate genes prioritized by CFG with the score of 8 and above in the three independent test cohorts, one German and two US- 52 (≥ 50% maximum possible CFG score of 16) and assembled a based, suggesting that the animal model served in essence as a Genetic Risk Prediction panel out of those 713 SNPs. We then filter to identify from the larger list of CFG-prioritized genes the developed a Genetic Risk Prediction Score (GRPS) for alcoholism key behaviorally relevant genes. Our results indicate that panels of based on the presence or absence of the alleles of the SNPs SNPs in top genes identified and prioritized by CFG analysis and associated with the illness from the discovery GWAS, and tested by a behaviorally relevant animal model can differentiate between the GRPS in an independent German cohort,51 to see whether it alcoholics and controls at a population level (Figure 3), although at

Figure 1. Convergent Functional Genomics.

Translational Psychiatry (2014), 1 – 16 © 2014 Macmillan Publishers Limited Genetic risk prediction of alcoholism DF Levey et al 3

Figure 3. Genetic Risk Prediction using a panel of top candidate genes for alcoholism (GRPS-11). Testing in independent cohorts 3 and 4.

Figure 4. Overlap of alcoholism versus other major psychiatric disorders. Top candidate genes for alcoholism identified by CFG (n = 135) in the current study versus top candidate genes for other Figure 5. Mindscape (mental landscape)-dimensional view of genes psychiatric disorders and a stress-driven animal model of alcoholism that may be involved in alcoholism and other major psychiatric (DBP knockout mouse) from our previous work. disorders. an individual level the margin may be small (Supplementary for shared genes (Figures 4 and 5) as well as shared genetic risk Figure S2). The latter point suggests that, similar to bipolar (Figure 6). 53 54 disorder and schizophrenia, the contextual cumulative combi- Overall, this work sheds light on the genetic architecture and 55 natorics of common gene variants and environment has a major pathophysiology of alcoholism, provides mechanistic targets for role in risk for illness. therapeutic intervention and has implications for genetic testing Lastly, we have looked at overlap with genes for other major to assess risk for illness before the illness manifests itself clinically, psychiatric disorder domains (bipolar disorders, anxiety disorders, opening the door for enhanced prevention strategies at a young schizophrenias) from our previous studies and provide evidence age. As alcoholism is a disease that does not exist if the exogenous

© 2014 Macmillan Publishers Limited Translational Psychiatry (2014), 1 – 16 Genetic risk prediction of alcoholism DF Levey et al 4 663 644 276 585 Control All Caucasians All Caucasians

Figure 6. Genetic load for bipolar disorder and schizophrenia in alcoholism. A total of 34 out of 66 SNPs in our alcohol GRPS-11 panel (current work; in n = 10 genes), 42 out of 224 SNPs in our bipolar GRPS53 (in n = 34 genes) and 151 out of 542 SNPs in our schizophrenia GRPS54 (in n = 35 genes) were present and tested in the alcohol cohorts 3 and 4. See also Supplementary Table S7.

agent (alcohol) is not consumed, the use of genetic information to inform lifestyle choices could be quite powerful.

MATERIALS AND METHODS Human subject cohorts Discovery cohort (cohort 1): GWAS for alcohol dependence from Germany. Data for the discovery CFG work (Cohort 1) were obtained from a GWA 0 study of self-reported German descent subjects, consisting of 411 alcohol- 0 411 740

dependent male subjects and 1307 population-based controls (663 male 16871081 366 234 475 786 and 644 female subjects).6 Individuals were genotyped using HumanHap All Caucasians All Caucasians

550 BeadChips (Illumina Inc, San Diego, CA, USA). SNPs with a nominal Alcohol dependence allelic P-value o0.05 were selected for analysis. No Bonferroni correction Alcohol dependence Alcohol abuse Control was performed.

Test cohort 2 (alcohol dependence, Germany). An independent test cohort of German descent51 consisting of 740 alcohol-dependent male subjects and 861 controls (276 male and 585 female subjects) was used for testing the results of the discovery analyses. Individuals were genotyped using Illumina Human610Quad or Illumina Human660w Quad BeadChips (Illumina Inc). The controls were genotyped using Illumina HumanHap550 Bead Chips.

Test cohort 3 (alcohol dependence, United States) and test cohort 4 (alcohol abuse, United States). The sample consisted of small nuclear families originally collected for linkage studies, and unrelated individuals, Caucasians and African-American, male and female subjects. The subjects were recruited at five US clinical sites: Yale University School of Medicine (APT Foundation; New Haven, CT, USA), the University of Connecticut Health Center (Farmington, CT, USA), the University of Pennsylvania Perelman School of Medicine (Philadelphia, PA, USA), the Medical University of South Carolina (Charleston, SC, USA) and McLean Hospital (Belmont, MA, USA). All subjects were interviewed using the Semi- Structured Assessment for Dependence and Alcoholism to derive diagnoses for lifetime alcohol dependence, alcohol abuse and other major psychiatric traits according to the DSM-IV criteria. There were 1687 male subjects with alcohol dependence, 366 male subjects with alcohol abuse and 475 male controls. There were 1081 female subjects with alcohol dependence, 234 female subjects with alcohol abuse and 786 female Discovery and test cohorts controls (Table 1). Individuals were genotyped on the Illumina HumanOmni1-Quad v1.0 microarray (988 306 autosomal SNPs). GWAS Male Female Ethnicity Male Female Male genotyping was conducted at the Yale Center for Genome Analysis and Ethnicity Female Male ethnicity (Caucasian/African-American)Female ethnicity (Caucasian/African-American) 802/885 471/610 201/165 123/111 168/307 220/566 Discovery cohort 1 GWAS Germany Test cohort 2 Germany Test cohorts 3 and 4, United States the Center for Inherited Disease Research. Genotypes were called using the Table 1. Abbreviation: GWAS, genome-wide association study.

Translational Psychiatry (2014), 1 – 16 © 2014 Macmillan Publishers Limited Genetic risk prediction of alcoholism DF Levey et al 5 GenomeStudio software V2011.1 and genotyping module version 1.8.4 model data sets,3 as well as published reports from the literature curated in (Illumina Inc).52 our databases. The rat animal model experimental work from our group was previously 3 fi described. The experimental approaches used to produce the animal Gene identi cation in discovery cohort 1 model data for CFG analysis were carried out in two rat lines selectively Quality control. Genotype data had been filtered using stringent quality- bred for divergent alcohol preference: inbred alcohol-preferring (iP) versus 51 control criteria as described earlier and accounted for call rate, inbred alcohol-non-preferring (iNP) rats. Following five brain regions were population substructure, cryptic relatedness, minor allele frequency and chosen for gene expression studies in these rat lines: the frontal cortex, batch effects. amygdala, caudate–putamen, nucleus accumbens and hippocampus. Animal studies, as well as human imaging and post-mortem analyses, Association test in discovery sample. Association testing was performed had previously provided evidence that these regions are implicated in using PLINK 1.07 (http://pngu.mgh.harvard.edu/ ~ purcell)56 software alcoholism. package. A logistic regression modelling approach was applied to correct Data for the analysis came from studies of three experimental for population stratification. Therefore, principal component analysis was paradigms. Paradigm 1 examined basal level of gene expression in the conducted considering only independent autosomal SNPs with minor brains of the alcohol-naive iP and iNP lines of rats. This basal comparison allele frequency >0.05 and pairwise R2 o0.05 within a 200-SNP window. LD was performed to determine innate differences between these two lines filtering resulted in a set of 28 505 SNPs used for principal component with a marked divergence in the willingness to consume alcohol. We analysis, which was carried out using GCTA 1.04 (http://www.complex- hypothesized that the innate differences in gene expression between the traitgenomics.com/software/gcta/).57 The first two principal components iP and iNP would involve some of the genes associated with an increased susceptibility for alcohol dependence. Paradigm 2 examined the effects of resulting from this analysis were included as covariates in the logistic chronic 24-h free-choice alcohol consumption on gene expression in iP rats regression model. compared with alcohol-naive iP rats. This paradigm looked for gene expression changes in the brain associated with the direct influence of Assignment of SNPs to genes. Genes corresponding to SNPs were fi fi peripherally self-administered alcohol in the genetically susceptible rats. In identi ed initially using the annotation le from the Illumina website Paradigm 3, iP rats were allowed to self-infuse alcohol directly into the (http://www.illumina.com, HumanHAP550v3_Gene_Annotation). Next, posterior ventral tegmental area, the originating area of the mesolimbic genes were cross-checked with GeneCards (http://www.genecards.org) to system. The advantage of this latter procedure is that it isolates ensure that each gene symbol was current. Any gene symbol that matched the neurocircuitry involved in alcohol reinforcement, and eliminates the to a different gene symbol in Gene Cards was checked to verify peripheral effects of alcohol. Following the establishment of alcohol self- number and location match with the original gene, and administration into the posterior VTA, gene expression levels in target was replaced with the current GeneCards gene symbol. SNPs from the brain areas were measured and compared with P rats that received original annotation files that had no gene matches in the annotation file artificial cerebral spinal fluid infusions into the posterior VTA. and UCSC Genome Browser (that is, not falling within an exon or intron of a known gene) were assumed to regulate and thus implicate the gene Animal model genetic evidence. To search for mouse genetic evidence closest to the SNP location, using the refSNP database from NCBI (http:// (transgenic and QTL) for our candidate genes, we utilized PubMed as well www.ncbi.nlm.nih.gov/snp/?SITE = NcbiHome&submit = Go). as the Mouse Genome Informatics (http://www.informatics.jax.org; Jackson Laboratory, Bar Harbor, ME, USA) database, and used the search ‘Genes and Markers’ form to find transgenic in categories for abnormal alcohol Convergent functional genomic analyses consumption, alcohol preference, alcohol aversion, impaired behavioral Databases. We have established in our laboratory (Laboratory of response to alcohol, hyperactivity elicited by ethanol administration and Neurophenomics, Indiana University School of Medicine, www.neurophe- enhanced behavioral response to alcohol. For QTL convergence, the start nomics.info) manually curated databases of all the human gene expression of the gene had to map within 5 cM of the location of these markers. (post-mortem brain, blood and cell cultures), human genetic (association, copy number variants (CNVs) and linkage), animal model genetic and CFG scoring. We used a nominal P-value threshold (having at least one animal model gene expression studies published to date on psychiatric SNP with Po0.05) for including genes from the discovery GWAS in the disorders. Only the findings deemed significant in the primary publication, CFG analysis. No Bonferroni correction was performed. by the study authors, using their particular experimental design and Internal score: For each of these genes implicated by SNPs, we thresholds, are included in our databases. Our databases include only calculated the percent of SNPs that were nominally significant (ratio of primary literature data and do not include review papers or other number of nominally significant SNPs over total number of SNPs tested for secondary data integration analyses to avoid redundancy and circularity. that gene, multiplied by 100), obtaining a distribution of values. The genes These large and constantly updated databases have been used in our CFG in the top 0.1% of the distribution were given an internal score of 4 points, cross-validation and prioritization (Figure 1). those in the top 5% of the distribution were given 3 points and the remaining genes all received 2 points. The internal score provides a Human post-mortem brain, blood and other peripheral tissue gene expression prioritization of genes based on GWAS results and might prioritize genes evidence. Information about genes was obtained and imported in our that have higher biological relevance and heterogeneity. databases searching the primary literature with PubMed (http://ncbi.nlm. External score: Human and animal model data, genetic and gene nih.gov/PubMed), using various combinations of keywords. For this work, expression were integrated and tabulated, resulting in a polyevidence CFG the keywords were as follows: alcohol, alcoholism, human, brain, score. All six cross-validating lines of evidence (human data and animal postmortem, lymphocytes, blood, cells and gene expression. model data) were weighted such that evidence from human studies was prioritized 2x over evidence from animal models, gene expression Human genetic evidence (association, linkage). To designate convergence evidence was prioritized 2x over genetic evidence and brain evidence for a particular gene, the gene had to have independent published was prioritized 2x over peripheral tissue evidence (Figure 1). For human genetic evidence, 2 points were assigned if it was from association and 1 evidence of association or linkage for alcoholism. We sought to avoid using point if it was from linkage studies. For animal model genetic evidence, 2 any association studies that included subjects who were also included in points if it was from transgenic and 1 point if it was from QTL. The our discovery or test cohorts. For linkage, the location of each gene was maximum possible external score for each gene is 12. obtained through GeneCards (http://www.genecards.org), and the sex- We have capped (one positive study scores maximum points) the averaged cM location of the start of the gene was then obtained through hypothesis-driven candidate gene genetic association evidence and http://compgen.rutgers.edu/old/map-interpolator/. For linkage conver- animal model genetic (transgenic) lines of evidence, regardless of how gence, per our previously published criteria, the start of the gene had to many other such studies support that gene, to avoid potential ‘popularity’ map within 5 cM of the location of a marker linked to the disorder with a biases, where some genes are more studied than others. For discovery- lod score of ≥ 2. driven gene expression studies, we have capped (one positive study scores maximum points) the human post-mortem brain work because of the Animal model brain and blood gene expression evidence. For animal model paucity of brain collections and the fact that such studies often use the brain and blood gene expression evidence, we have used our own rat same brain bank sources. However, we have not similarly capped the

© 2014 Macmillan Publishers Limited Translational Psychiatry (2014), 1 – 16 Genetic risk prediction of alcoholism DF Levey et al 6 animal model brain and blood gene expression evidence, as such studies SNPs between genes, using the PLINK software package (Supplementary are not only discovery-based, but use independent cohorts of animals. Table S5). These were scored differentially, based on the number of studies showing evidence for a given gene: three or more different studies received full Genetic risk prediction maximum points, two studies 0.75 of maximum points and one study 0.5 of the maximum points. Our group generated data sets for three The software package PLINK 1.07 (http://pngu.mgh.harvard.edu/ ~ pur- cell)56 was used to extract individual genotype information for each independent animal studies for this analysis (see above). subject from the test cohorts 2, 3 and 4 data files. The more lines of evidence for a gene—that is, the more times a gene As we had previously performed for bipolar disorder and schizophrenia, shows up as a positive finding across independent studies, platforms, we developed a polygenic GRPS for alcoholism based on the presence or methodologies and species—the higher its external CFG score (Figure 1). absence of the alleles of the SNPs associated with illness in the discovery This is similar conceptually to the Google PageRank algorithm, in which the GWAS cohort 1, and tested the GRPS in three independent cohorts, from more links to a page, the higher it comes up on the search prioritization different geographic areas, ethnicities and different types of alcoholism. list. It has not escaped our attention that other ways of weighing the lines We tested two panels: a larger panel containing all the nominally of evidence may give slightly different results in terms of prioritization, if significant SNPs in top CFG scoring candidate genes (n = 135) from the not in terms of the list of genes per se. Nevertheless, we think this simple discovery GWAS1 in the top CFG-prioritized genes (Supplementary Tables scoring system provides a good separation of genes, with specificity S1 and S4) and a smaller one (n = 11) containing genes out of the larger provided by human data and sensitivity provided by animal model data. panel that were cross-validated using an animal model of alcoholism. Of note, our genes, SNP panels and choice of affected alleles were based solely on analysis of the discovery GWAS1, which is our discovery cohort, Prioritizing top alcoholism candidate genes that overlap with a stress-reactive completely independently from the test cohorts. Each SNP has two alleles animal model of alcoholism. Stress has been proposed as a driver of (represented by base letters at that position). One of them is associated alcoholism, notably by Koob and colleagues,58,59 as well as by Heilig and with the illness (affected), the other not (non-affected), based on the odds colleagues.60 We have previously identified the circadian clock gene DBP 61 3 ratios from the discovery GWAS1. We assigned the affected allele a score of as a candidate gene for bipolar disorder, as well as for alcoholism, using 1 and the non-affected allele a score of 0. A two-dimensional matrix of a CFG approach. In follow-up work, we established mice with a subjects by GRP panel alleles is generated, with the cells populated by 0 or homozygous deletion of DBP (DBP KO) as a stress-reactive genetic animal 4 1. A SNP in a particular individual subject can have any permutation of 1 model of bipolar disorder and alcoholism. We reported that DBP KO mice and 0 (1 and 1, 0 and 1, 0 and 0). By adding these numbers, the minimum have lower locomotor activity, blunted responses to stimulants and gain score for a SNP in an individual subject is 0, and the maximum score is 2. less weight over time. In response to a stress paradigm that translationally By adding the scores for all the alleles in the panel, averaging that and mimics what can happen in humans (chronic stress-isolation housing for multiplying by 100, we generated for each subject an average score 4 weeks, with acute stress, on top of that- experimental handling in week corresponding to a genetic loading for disease, which we call Genetic Risk 3), the mice exhibit a diametric switch in these phenotypes. DBP KO mice Predictive Score.53,54 are also activated by sleep deprivation, similar to bipolar patients, and that To test for significance, a one-tailed t-test with unequal variance was activation is prevented by treatment with the mood stabilizer drug performed between the alcoholic subjects and the control subjects, . Moreover, these mice show increased alcohol intake following looking at differences in GRPS. exposure to stress. Microarray studies of brain and blood revealed a pattern of gene expression changes that may explain the observed Receiver operating characteristic curves. Receiver operating characteristic phenotypes. CFG analysis of the gene expression changes identified a curves were plotted using IBM SPSS Statistics 21. Diagnosis was converted series of candidate genes and blood biomarkers for bipolar disorder, to a binary call of 0 (control) or 1 (alcohol-dependent or abuser) and alcoholism and stress reactivity. Subsequent studies by us showed that entered as the state variable, with calculated GRPS entered as the test treatment with the omega-3 fatty acid docosahexaenoic acid (DHA) variable (Supplementary Figure S2). normalized the gene expression (brain and blood) and behavioral phenotypes of this mouse model, including reducing alcohol RESULTS consumption.5 We examined the overlap between the top candidate genes for Top candidate genes alcoholism from the current analysis and the top candidate genes from the To minimize false-negatives, we initially cast a wide net, using as a DBP KO stress mice, thus reducing the list from 135 to 11 (Figure 4). filter a minimal requirement for a gene to have both some GWAS evidence and some additional independent evidence. Thus, out of Pathway analyses. IPA 9.0 (Ingenuity Systems, www.ingenuity.com, Red- the 6085 genes with at least a SNP at Po0.05 in the discovery wood City, CA, USA) was used to analyze the biological roles, including top GWAS cohort 1, we generated a list of 3142 genes that also had canonical pathways and diseases, of the candidate genes resulting from some additional line of evidence (human or animal model data), our work (Table 2 and Supplementary Table S2), as well as used to identify implicating them in alcoholism (CFG score ≥ 2.5 (≥ 2 internal) genes in our data sets that are the targets of existing +(≥ 0.5 external)). This suggests, using these minimal thresholds (Supplementary Table S3). Pathways were identified from the IPA library and requirements, that the repertoire of genes potentially of canonical pathways that were most significantly associated with genes involved directly or indirectly in alcohol consumption and in our data set. The significance of the association between the data set alcoholism may be quite large, similar to what we have previously and the canonical pathway was measured in 2 ways: (1) a ratio of the seen for bipolar disorder62 and schizophrenia.54 To minimize false- number of molecules from the data set that map to the pathway divided positives, we used an internal score based on percent of SNPs in a by the total number of molecules that map to the canonical pathway is gene that were nominally significant, with 4 points for those in the displayed; (2) Fisher’s exact test was used to calculate a P-value top 0.1% of the distribution (n = 77), 3 points for those in the top determining the probability that the association between the genes in 5% of the distribution (n = 561) and 2 points for the rest of the the data set and the canonical pathway is explained by chance alone. We fi also conducted a KEGG pathway analysis through the Partek Genomic nominally signi cant SNPs (n = 5447). We then used the CFG Suites 6.6 software package, Partek Inc, Saint Louis, MO, USA), and GeneGo analysis and scoring integrating multiple lines of evidence to MetaCore from Thomson Reuters, New York, NY, USA) pathway analyses prioritize this list of genes (Figure 1) and focused our subsequent (https://portal.genego.com/). analyses on only the top CFG scoring candidate genes. Overall, 135 genes had a CFG score of 8 and above (≥ 50% of maximum possible score of 16). Epistasis testing. The test cohort 2 data were used to test for epistatic Of note, there was no correlation between CFG prioritization interactions among the best P-value SNPs in the 11 top candidate genes and gene size, thus excluding a gene-size effect for the observed from our work. SNP–SNP allelic epistasis was tested for each distinct pair of enrichment (Supplementary Figure S1).

Translational Psychiatry (2014), 1 – 16 © 2014 Macmillan Publishers Limited 04McilnPbihr Limited Publishers Macmillan 2014 ©

Table 2. Pathway analyses: (A) biological pathways, (B) disease and disorders

Ingenuity pathways KEGG pathways GeneGO pathways

(A) Numbers Top canonical P-value Ratio Pathway name Enrichment Enrichment Networks P-value Ratio pathways score P-value

GRPS-11, top DBP KO 1GαI signaling 4.68E − 05 3/135 Cocaine addiction 11.9589 6.40226E − 06 Neurophysiological process_Transmission of nerve 6.010E − 06 6/212 genes out of CFG (0.022) impulse score ≥ 8.0 genes 2 cAMP-mediated 2.64E − 04 3/226 Gap junction 5.71209 0.00330577 Development_Neurogenesis in general 9.163E − 04 4/192 (n = 11 genes) signaling (0.013) 3 G-protein- 4.37E − 04 3/276 Glutamatergic 5.21128 0.00545466 Reproduction_GnRH signaling pathway 6.603E − 03 3/166 coupled (0.011) synapse receptor signaling 4 14-3-3-mediated 2.14E − 03 2/121 Dopaminergic 5.0126 0.00665355 Transport_Synaptic vesicle exocytosis 7.762E − 03 3/176 signaling (0.017) synapse 5 Synaptic long- 3.33E − 03 2/161 Neuroactive 3.62169 0.0267375 Development_Neurogenesis_Synaptogenesis 8.258E − 03 3/180 term depression (0.012) ligand–receptor interaction

Ingenuity GeneGO

(B) Number Diseases and disorders P-value Molecules Diseases P-value Ratio

GRPS-11, top DBP KO 1 Hereditary disorder 1.66E − 08 to 1.29E − 02 9 Schizophrenia 1.476E − 10 11/1033 genes out of CFG score 2 Neurological disease 1.66E − 08 to 1.64E − 02 10 Suicide 1.076E − 09 6/151 ≥ 8.0 genes (n = 11 3 Psychological disorders 1.66E − 08 to 1.53-02 9 Depressive disorder, major 2.106E − 09 10/966 Levey alcoholism DF of prediction risk Genetic genes) 4 Cancer 1.90E − 08 to 1.82E − 02 10 Bipolar disorder 2.983E − 09 9/706 5 Skeletal and muscular disorders 1.50E − 07 to 1.34E − 02 6 Alcoholism 3.468E − 09 7/291

Abbreviations: CFG, Convergent Functional Genomics; DBP, D-box-binding protein. Pathway analyses of top candidate genes. al et rnltoa scity(04,1 (2014), Psychiatry Translational – 16 7 Genetic risk prediction of alcoholism DF Levey et al 8 Biological pathways and drug targets more robust results than the larger panel (Table 4), suggesting Pathway analyses were carried out on the top candidate genes that it captures the key behaviorally relevant genes. (Table 2). Notably, Gαi signaling, cocaine addiction and transmis- sion of nerve impulses were the top biological pathways in DISCUSSION alcoholism, which may be informative for treatments and drug discovery efforts by pharmaceutical companies. Of note, these top Our CFG approach helped to prioritize a very rich-in-signal and candidate genes were identified and prioritized only for evidence biologically interesting set of genes (Table 3 and Supplementary for alcoholism before pathway analyses; therefore, the overlap Table S1). Some, such as SNCA, CPE, DRD2 and GRM3, have weaker with cocaine addiction is a completely independent result, evidence based on the GWAS data but strong independent suggesting a shared drive and neurobiology. Consistent with that, evidence in terms of gene expression studies and other prior human or animal genetic work. Conversely, some of the top two of our 135 top candidate genes for alcoholism (CPE and VWF) 84 85 o − 5 63 previous genetic findings in the field, such as ADH1C (CFG had SNPs with P 10 in a recent GWAS of cocaine addiction. 86 Some of the top alcohol candidate genes have prior evidence of score of 9), GABRA2 (CFG score of 8), as well as AUTS2 (CFG score being modulated by the omega-3 fatty acid DHA in our DBP of 7), CHRM2 and KCNJ6 (CFG scores of 4) have fewer different independent lines of evidence, and thus received a lower CFG mouse animal model (Table 3 and Supplementary Table S1). That prioritization score in our analysis (Supplementary Table S1), is of particular interest, as we have previously shown that although they are clearly involved in alcoholism-related processes. treatment with the omega-3 fatty acid DHA decreased alcohol Whereas we cannot exclude that more recently discovered genes consumption in that animal model, as well as in another 5 have had less hypothesis-driven work performed and thus might independent animal model, the alcohol-preferring P rats. score lower on CFG, it is to be noted that the CFG approach Omega-3 fatty acids, particularly DHA, have been described to integrates predominantly non-hypothesis-driven, discovery-type have alcoholism, mood, psychosis and suicide-modulating proper- data sets, such as GWAS data, linkage, quantitative traits loci and, ties, in preclinical models as well as some human clinical trials and particularly, gene expression. We also cap each line of evidence fi epidemiological studies. For example, de cits in omega-3 fatty from an experimental approach (Figure 1), to minimize any acids have been linked to increased depression and aggression in ‘popularity’ bias, whereas multiple studies of the same kind are animal models64,65 and humans.66,67 DHA prevents ethanol 68 conducted on better-established genes. In the end, it is gene-level damage in vitro in rat hippocampal slices. Omega-3 supple- reproducibility across multiple approaches and platforms that is mentation can prevent oxidative damage caused by prenatal built into the approach and gets prioritized most by CFG scoring 69 alcohol exposure in rats. Of note, deficits in DHA have been during the discovery process. Our top results subsequently show 70 reported in erythrocytes and in the post-mortem orbitofrontal good reproducibility and predictive ability in independent cohort cortex of patients with bipolar disorder, and were greater in those testing, the litmus test for any such work. 71 that had high versus those that had low alcohol abuse. Low DHA At the very top of our list of candidate genes for alcoholism, levels may be a risk factor for suicide.72,73 Omega-3 fatty acids with a CFG score of 13, we have SNCA, a pre-synaptic chaperone have been reported to be clinically useful in the treatment of both that has been reported to be involved in modulating brain – – mood74 77 and psychotic disorders.78 80 plasticity and neurogenesis, as well as neurotransmission, Other existing pharmacological drugs that modulate alcohol primarily as a brake.87,88 On the pathological side, low levels of candidate genes identified by us include, besides benzodiaze- SNCA might offer less protection against oxidative stress,89 pines, dopaminergic agents, glutamatergic agents, whereas high levels of SNCA may have a role in neurodegenera- agents, as well as statins (Supplementary Table S3). tive diseases, including in Parkinson disease. SNCA has been identified as a susceptibility gene for alcohol cravings7 and 90 Genetic risk prediction score response to alcohol cues. The evidence provided by our data and other previous human genetic association studies suggest a Once the genes involved in a disorder are identified, and genetic rather than purely environmental (alcohol consumption prioritized for likelihood of involvement, then an obvious next and stress) basis for its alteration in disease, and its potential utility step is developing a way of applying that knowledge to genetic as trait rather than purely state marker. testing of individuals to determine risk for the disorder. On the fi Alcoholics carry a genetic variant that leads to reduced baseline basis of our identi cation of top candidate genes described above expression of SNCA.8 SNCA is also downregulated in expression in using CFG, we pursued a polygenic panel approach, with digitized the frontal cortex and caudate–putamen of inbred alcohol- binary scoring for presence or absence, similar to the one we have preferring rats,17 as well as in the brain (amygdala) and blood of devised and used in the past for biomarkers testing53,81 and for 53 54 our stress-reactive DBP animal model of alcoholism, before genetic testing in bipolar disorder and schizophrenia. Some- exposure to any alcohol. SNCA is upregulated in expression in what similar approaches but without CFG prioritization, attempted blood in human alcoholism,12,13 as well as in the blood of by other groups, have been either unsuccessful82 or have required 83 monkeys consuming alcohol, and in rats after alcohol administra- very large panels of markers. tion.3 Thus, it may serve as a blood biomarker. Overall, we may fi o We chose all the nominally signi cant P-value SNPs (P 0.05) in infer that, whereas low levels of SNCA may predispose to cravings each of our top CFG-prioritized genes (n = 135 with CFG score ≥8; for alcohol and consequent alcoholism, possibly mediated Supplementary Table S1) in the GWAS1 data set used for through increased neurobiological activity and drive (the SNCA discovery, and assembled a GRPS-135 panel out of those SNPs deficit hypothesis), excessive alcohol consumption then increases (Table 4). We then tested the GRPS-135 in the independent SNCA expression beyond that seen in non-alcohol-consuming German test cohort 2, based on the presence or absence of the controls, potentially compounding risk for neurodegenerative alleles of the SNPs associated with the illness, comparing the diseases in individuals that have mutations that lead to its alcoholic subjects to controls (Table 4), and showed that, although aggregation. This observation is also biologically consistent with there was a trend, we were not able to distinguish alcoholics from the fact that dementia is often observed late in the course of controls in both independent test cohorts. alcohol dependence. We then prioritized a smaller panel of 11 genes (Table 3) out of GFAP (glial fibrillary acidic protein), a top candidate gene with a this larger panel, by using as a cross-validator the top genes from CFG score of 9.5, is an astrocyte intermediate filament-type a stress-reactive mouse animal model for alcoholism, the DBP protein involved in neuron–astrocyte interactions, cell adhesion, knockout mouse4 (Figure 4). The small panel (GRPS-11) showed process formation and cell–cell communication. It is decreased in

Translational Psychiatry (2014), 1 – 16 © 2014 Macmillan Publishers Limited 04McilnPbihr Limited Publishers Macmillan 2014 ©

Table 3. Top candidate genes for alcoholism

Gene Discovery Internal score Human genetic Human post- Human Animal model Animal model Animal model CFG DBP DBP symbol/name GWAS1 nominally evidence mortem brain peripheral expression genetic evidence brain expression peripheral score stress4 stress best P-value significant expression evidence evidence (blood) DHA5 SNP SNPs/total SNPs evidence expression tested (% evidence significant)

SNCA, synuclein alpha 0.02363 2 Association7,8 (D) (I) Blood12 QTL16 (D) (I) Blood male 13 (D) (I) Blood – (non A4 component of rs17015982 4/68 (5.89%) PFC8 10 (I) Blood13 FC, CP17 adult AMY, amyloid precursor) (I) (I) Serum14 (I) cynomolgus blood FC11 DNA HIP P1 monkeys18 hypermethylation15 AMY P33

GFAP, glial fibrillary 0.01052 3 Linkage19 (D) (I) 9.5 (I) acidic protein rs744281 4/12 (33.34%) FC9,11,20 FC21 AMY HIP, NAC, PFC P13

– DRD2, dopamine 0.03652 2 Association22 26 (D) (Transgenic) 9 (D) (I) AMY receptor D2 rs4938019 2/46 (4.35%) FC, CP27 alcohol aversion PFC (D)28 decreased alcohol consumption29

GRM3, 0.001126 2 Linkage30,31 (D) (I) 9 (I) glutamate receptor, rs41440 15/133 HIP32 NAC33 AMY metabotropic 3 (11.28%) (D) 34

FC Levey alcoholism DF of prediction risk Genetic (D) CP P23

MBP, myelin basic 0.006503 2 (D) (I) (D) CG-4 glial 8.5 (D) (D) Blood al et protein rs1124941 16/109 FC11 PFC35 cells rat brain38 PFC (I) HIP (14.68%) (D) (I) cerebellum36 AMY delayed expression37

MOBP, myelin- 0.01231 2 (D) QTL39 (I) 8.5 (D) (D) Blood associated rs562545 2/43 (4.66%) FC11 PFC35 PFC (I) HIP oligodendrocyte basic (D) (I) 40 rnltoa scity(04,1 (2014), Psychiatry Translational protein NAC AMY (I) whole brain41 (I) PFC P13

GNAI1, 0.00059 2 Linkage42,43 (D) (D) 8 (D) guanine nucleotide rs12706724 11/66 HIP32 NAC P33 PFC binding protein (G (16.67%) protein), alpha- inhibiting activity polypeptide 1 – 16 9 10 rnltoa scity(04,1 (2014), Psychiatry Translational

Table. 3. (Continued)

Gene Discovery Internal score Human genetic Human post- Human Animal model Animal model Animal model CFG DBP DBP symbol/name GWAS1 nominally evidence mortem brain peripheral expression genetic evidence brain expression peripheral score stress4 stress best P-value significant expression evidence evidence (blood) DHA5 SNP SNPs/total SNPs evidence expression tested (% evidence significant)

MOG, myelin 0.01429 2 (D) (D) 8 (D) (I) HIP oligodendrocyte rs3117292 3/19 (15.79%) FC, HIP11,32 VTA44 PFC

– glycoprotein (I) 16 PFC35 (I) PFC P13 alcoholism of prediction risk Genetic RXRG, retinoid X 0.005815 2 Linkage45 (D) (I) CP P33 8 (D) receptor, gamma rs10800098 1/37 (2.71%) FC11 PFC

SYT1, synaptotagmin I 0.04139 2 (D) (D) (I) 8 (D) (D) rs1245810 7/117 (5.99%) NAC46 FC34 Cultured PFC AMY (D) neurons48 HIP47 (I) Cortical

neurons49 Levey DF

TIMP2, TIMP 0.04309 2 Linkage45 (D) (I) 8 (D) (D) metallopeptidase rs7502935 1/25 (4%) FC, HIP, VTA44 PFC Blood inhibitor 2 NAC32,46,50 al et Abbreviations: AMY, amygdala; Association, association evidence; CFG, convergent functional genomics; CP, caudate–putamen; D, decreased in expression; DBP, D-box-binding protein; DHA, docosahexaenoic acid; GWAS, genome-wide association study; HIP, hippocampus; I, increased; Linkage, linkage evidence; NAC, nucleus accumbens; P1, paradigm 1; P2, Paradigm 2; P3, paradigm 3 in the Rodd et al;3 PFC, prefrontal cortex; QTL, quantitative trait loci; SNP, single-nucleotide length polymorphism; TG, transgenic; VT, ventral tegmentum; VTA, ventral tegmental area. Top genes with a CFG score of 8 and above that overlapped with top genes from the stress-reactive animal model are shown (n = 11; Figure 4). Best P-value SNP within the gene or flanking regions is depicted. A more complete list of genes with CFG score of 8 and above (n = 135) is available in the Supplementary Information section (Supplementary Table S1). Underlined gene symbol represents means gene is a blood biomarker candidate. Bold P-values o0.001. 04McilnPbihr Limited Publishers Macmillan 2014 © Genetic risk prediction of alcoholism DF Levey et al 11

Table 4. Genetic Risk Prediction Score (GRPS)-Panels from Discovery Cohort 1

Test in cohort 2 alcohol-dependent versus control GRPS-135, P = 0.053 genes with CFG score of ≥ 8 (135 genes, 713 SNPs) all nominally significant SNPs in each gene (n = 713) GRPS-11, top animal model (DBP mouse) prioritized genes P = 0.041 out of genes with CFG score of ≥ 8 (11 genes, 66 SNPs) all nominally significant SNPs in each gene (n = 66)

Test in cohort 3 alcohol-dependent versus control GRPS-11, P = 0.00012 top animal model (DBP mouse) prioritized genes (10 genes, 34 SNPs present) out of genes with CFG score of ≥ 8 all nominally significant SNPs in each gene (n = 66) GRPS-SNCA, P = 0.000013 top CFG gene (1 gene, 1 SNP rs17015888 present) all nominally significant SNPs in it (n = 4)

Test in cohort 4 alcohol abuse versus control GRPS-11, P = 0.0094 top animal model (DBP mouse) prioritized genes (10 genes, 34 SNPs present) out of genes with CFG score of ≥ 8 all nominally significant SNPs in each gene (n = 66) GRPS-SNCA, P = 0.023 top CFG gene (1 gene, 1 SNP rs17015888 present) all nominally significant SNPs in it (n = 4) Abbreviations: CFG, Convergent Functional Genomics; DBP, DNA-box-binding protein; SNCA, synuclein alpha; SNP, single-nucleotide length polymorphism. Differentiation between alcoholics and controls in three independent test cohorts using, GRPS-135, a panel composed of all the nominally significant SNPs from GWAS1 in the top candidate genes prioritized by CFG; GRPS-11, a panel additionally prioritized by a stress-reactive animal model for alcoholism, the DBP KO-stressed mouse; and GRPS-SNCA, the top candidate gene from our analyses. P-values depict one-tailed t-test results between alcoholics and controls. expression in post-mortem brain of alcoholics, but increased in deactivation and a limbic hyperactivity, which could lead to expression in brains of animal models of predisposition to impulsivity. alcoholism, before exposure to alcohol (Table 3). This is consistent with a model for increased physiological robustness in individuals Epistasis testing of top candidate genes for alcoholism predisposed to alcoholism,3 as well as with the neurodegenerative For the top 11 candidate genes, best P-value SNPs from GWAS1 consequences of protracted alcohol use. were used to test for gene–gene interactions in GWAS2 DRD2 (dopamine receptor D2), another top candidate gene with fi a CFG score of 9, has prior human genetic association evidence. It (Supplementary Table S5). Nominally signi cant interactions were is reduced in expression in the frontal cortex in the human brain found between SNPs in SNCA and RXRG, DRD2 and SYT1, MOBP from alcoholics, as well as in the DBP animal model before any and TIMP2. As a caveat, the P-value was not corrected for multiple exposure to alcohol. One possible interpretation would be that comparisons. The corresponding genes merit future follow-up lower levels of dopamine receptors are associated with reduced work to elucidate the biological and pathophysiological relevance dopaminergic signaling and anhedonia, leading individuals to of their interactions. overcompensate by alcohol and drug abuse. Another interpreta- tion, consistent with the low SNCA and consequently higher Pathways and mechanisms neurotransmitter (including dopamine) levels, would be that these Our pathway analysis (Table 2 and Supplementary Table S2) individuals are in fact in a compulsive, hyperdopaminergic state, results are consistent with the accumulating evidence about the which drives them to hedonic activities and leads to compensa- role of neuronal excitability and signaling in alcoholism.83,93,94 tory homeostatic downregulation of their DRD2 receptors. Consistent with this later scenario, mice that have a constitutive Overlap with other psychiatric disorders knockout of their DRD2 receptors, not because of a hyperdopa- minergic state, in fact consume less alcohol,29 unless they are Despite using lines of evidence for our CFG approach that have to exposed to stress.91 do only with alcoholism, the list of genes identified has a notable Another top candidate gene, GRM3, is also involved in overlap at a pathway analysis level (Table 2B and Supplementary neurotransmitter signaling. Prior evidence in the field had Table S2B) and at a gene level (Figures 4 and 5) with other implicated another metabotropic glutamate receptor, GRM2.92 psychiatric disorders. This is a topic of major interest and debate in Other top candidate genes in the panel (MOBP, MBP and MOG) the field. We demonstrate an overlap between top candidate are involved in myelination (Table 3). They are decreased in genes for alcoholism and top candidate genes for schizophrenia, expression in the prefrontal cortex of human alcoholics, as well as anxiety and bipolar disorder, previously identified by us through in our stress-reactive DBP animal model of alcoholism, before CFG (Figure 4), thus providing a possible molecular basis for the exposure to any alcohol. Decreased myelination may lead to frequently observed clinical comorbidity and interdependence decreased connectivity. Interestingly, MOBP and MBP are between alcoholism and those other major psychiatric disorders, increased in expression in the amygdala in the DBP mice, opposite as well as cross-utility of pharmacological agents. Moreover, we to the direction of change in the PFC, consistent with a frontal tested in alcoholics genetic risk predictive panels for bipolar

© 2014 Macmillan Publishers Limited Translational Psychiatry (2014), 1 – 16 Genetic risk prediction of alcoholism DF Levey et al 12

Table 5. Individual top genes and genetic risk prediction in independent cohorts

Gene/SNPs CFG score Mean GRPS t-test

Control (n = 861) Alcohol dependence cohort 2 (n = 740)

(A) Test cohort 2 Panel of 11 top genes 66 SNPs ≥8 53.98 54.61 0.041 SNCA rs7668883 13 93.93 92.84 0.086 rs17015888 rs17015982 rs6532183 GFAP rs3744473 9.5 63.99 64.69 0.303 rs3169733 rs736866 rs744281 DRD2 rs4648317 9 13.07 15.51 0.024 rs4938019 GRM3 rs17160519 9 55.44 54.94 0.271 rs6944937 rs13236080 rs17315854 rs12668989 rs41440 rs2373124 rs13222675 rs2708553 rs12673599 rs4236502 rs1554888 rs10499898 rs1527769 rs17161018 MBP rs470131 8.5 44.92 47.07 0.002 rs2282566 rs736421 rs1789094 rs9951586 rs1667952 rs1789105 rs1789103 rs1812680 rs1789139 rs4890912 rs9947485 rs1562771 rs1015820 rs1124941 rs11877526 MOBP rs562545 8.5 49.88 49.93 0.487 rs2233204 GNAI1 rs4731111 8 72.97 72.71 0.393 rs6466884 rs7803811 rs17802148 rs7805663 rs10486920 rs2523189 rs2886611 rs2886609 rs12706724 rs4731302 MOG rs3117292 8 34.53 34.56 0.493 rs2747442 rs3117294 RXRG rs10800098 8 6.04 5.27 0.174 SYT1 rs1569033 8 39.16 40.89 0.113 rs10735416 rs1245810 rs1245819 rs1268463

Translational Psychiatry (2014), 1 – 16 © 2014 Macmillan Publishers Limited Genetic risk prediction of alcoholism DF Levey et al 13

Table. 5. (Continued)

Gene/SNPs CFG score Mean GRPS t-test

Control (n = 861) Alcohol dependence cohort 2 (n = 740)

rs1245840 rs10861755 TIMP2 rs7502935 8 67.65 70.61 0.038

Gene/SNPs CFG Mean GRPS t-test score Control Alcohol dependence cohort 3 Alcohol abuse cohort 4 Alcohol dependence Alcohol abuse (n = 1261) (n = 2768) (n = 600) cohort 3 cohort 4

(B) Test cohorts 3 and 4 Panel of 10 top genes 34 SNPs ≥ 8 47.58 48.51 48.49 0.00012 0.0094 SNCA rs17015888 13 72.28 76.96 75.58 0.000013 0.023 GFAP rs3169733 9.5 58.92 60.38 60.17 0.042 0.158 rs736866 DRD2 rs4648317 9 15.38 14.92 15.61 0.293 0.429 GRM3 rs17160519 9 35.13 37.38 35.55 0.000061 0.309 rs6944937 rs17315854 rs4236502

MBP rs470131 8.5 47.23 48.01 48.31 0.0443 0.059 rs2282566 rs736421 rs1789094 rs9951586 rs1789103 rs4890912 rs9947485 rs1124941 MOBP rs562545 8.5 50.28 50.80 50.75 0.233 0.337 rs2233204 GNAI1 rs4731111 8 61.58 63.03 62.87 0.006435 0.072 rs6466884 rs17802148 rs10486920 rs2523189 rs2886611 rs2886609 MOG rs3117292 8 48.78 46.44 46.08 0.020216 0.056 rs2747442 rs3117294 RXRG rs10800098 8 2.62 3.42 3.33 0.024279 0.126 SYT1 rs1569033 8 41.56 42.48 44.25 0.11474 0.0087 rs1245819 rs1268463 rs10861755

Abbreviations: GRPS, Genetic Risk Prediction Score; SNCA, synuclein alpha; SNP, single-nucleotide length polymorphism. Italic, nominally significant; bold italic, survived Bonferroni correction.

disorder53 and for schizophrenia54 generated in previous studies disorder, consistent with increased drive, and a decreased genetic by us, and show that they are significantly different in alcoholics load for schizophrenia, consistent with increased connectivity versus controls (Figure 6), beyond the overlap in genes with before alcohol use. These results led us to develop a heuristic, alcohol. There seems to be an increased genetic load for bipolar testable model of alcoholism (Figure 5). Some people may drink to

© 2014 Macmillan Publishers Limited Translational Psychiatry (2014), 1 – 16 Genetic risk prediction of alcoholism DF Levey et al 14 be calm, mitigating the effects of stress and anxiety, some people region, SNCA is the closest gene, and the distance is well within may drink to be happy, the common drive with bipolar disorder, the range of known examples of regulatory regions (enhancers). In and some people may drink to be drunk, to disconnect from addition, the risk allele for this SNP (G/G) seems to be the major reality and/or get unstuck from internal obsessions and variant in the population (Supplementary Table S6), suggesting ruminations. that this allele per se is evolutionarily advantageous, when not coupled with the exogenous ingestion of alcohol. Genetic risk prediction A relatively large list of genes (n = 6085) was implicated by nominally significant SNPs from the discovery GWAS. There is a Of note, our SNP panels and choice of affected alleles were based fi solely on analysis of the discovery GWAS, completely indepen- risk that out of such a large list CFG will nd something to dently from the test cohorts. Our results show that a relatively prioritize. We have tried to mitigate that by developing an internal fi fi score for each gene based on the proportion of SNPs tested in a limited and well-de ned panel of SNPs identi ed based on our fi CFG analysis could differentiate between alcoholism subjects and gene that were nominally signi cant. Moreover, in the end, we fi controls in three independent cohorts. The fact that our genetic tested the reproducibility and predictive ability of our top ndings testing worked for both alcohol dependence and alcohol abuse in multiple independent cohorts, which is the ultimate litmus test suggests that these two diagnostic categories are actually for any genetic or biomarker study. overlapping, supporting the DSM-V reclassification of a single category of alcohol use disorders. CONCLUSION Overall, whereas multiple mechanistic entry points may contribute Reproducibility among studies to alcoholism pathogenesis, it is likely at its core a disease of an Our work provides striking evidence for the advantages, exogenous agent (alcohol) modulating different mind domains/ reproducibility and consistency of gene-level analyses of data, as dimensions (anxiety, mood and cognition),95 precipitated by opposed to SNP level analyses, pointing to the fundamental issue environmental stress on a background of genetic vulnerability of genetic heterogeneity at a SNP level. In fact, it may be that the (Figure 5). The degree to which various mind domains/dimensions more biologically important a gene is for higher mental functions, are affected in different individuals is a fertile area for future the more heterogeneity it has at a SNP level and the more research into subtypes of alcoholism and lends itself to evolutionary divergence, for adaptive reasons. On top of that, CFG personalization of diagnosis and treatment, by integrating genetic provides a way to prioritize genes based on disease relevance, not data, blood gene expression biomarker data and clinical data. study-specific effects (that is, fit-to-disease as opposed to fit-to- Lastly, it is important to note that individuals with a predisposition cohort). Reproducibility of findings across different studies, to alcoholism but no exposure to alcohol may in fact have a robust experimental paradigms and technical platforms is deemed more physiology and strong neurobiological drive that can be important (and scored as such by CFG) than the strength of harnessed for other, more productive endeavors. finding in an individual study (for example, P-value in a GWAS).

Potential limitations and confounds CONFLICT OF INTEREST The GWAS study (cohort 1) on which our discovery was based The authors declare no conflict of interest directly related to this work. Although not contained males as probands but contained males and females as directly relevant to this work, HRK has been a consultant or advisory board member with Alkermes, Lilly, Lundbeck, Pfizer, and Roche. He has also received honoraria controls. This was the case for the German test cohort (cohort 2) as fi from the Alcohol Clinical Trials Initiative (ACTIVE) of the American Society of Clinical well. It is possible that some of the nominally signi cant SNPs Psychopharmacology, which is supported by Lilly, Lundbeck, AbbVie and Pfizer. identified in the discovery GWAS have to do with gender differences rather than to alcoholism per se, or at least may have to do with male alcoholism. Stratification across gender and ACKNOWLEDGMENTS ethnicities may also be a factor in our test cohorts 3 and 4 This work is, in essence, a field-wide collaboration. We would like to acknowledge our (Table 1). The issue of possible ethnicity differences in alleles, debt of gratitude for the efforts and results of the many other groups, cited in our genes and the consequent neurobiology may need to be explored paper, who have conducted and published studies (genetic and gene expression) in more in the future, with larger sample sizes, and with environ- alcoholism. With their arduous and careful work, a convergent approach, such as mental and cultural factors taken into account. However, the use ours, is possible. We would particularly like to thank the subjects who volunteered to of a CFG approach using evidence from other studies of participate in these studies. Without their generous contribution, such work to alcoholism, including animal model studies, to prioritize the advance the understanding of alcoholism and help others would not be possible. ’ findings decreases the likelihood that our final top results are This work was supported by an NIH Directors New Innovator Award (1DP2OD007363) and a VA Merit Award (1I01CX000139-01) to ABN, as well as by ethnicity- or gender-related. Of note, our GRPS predictions NIH grants R01 DA12690, R01 DA12849, R01 AA11330, R01 and AA017535 to JG and separate alcoholics from controls in independent test cohorts, in collaborators, and by grant FKZ 01GS08152 from the National Genome Research both genders, and in fact work even better at separating female Network of the German Federal Ministry of Education and Research to MR and alcoholics from female controls (Figure 3). Moreover, a series of collaborators. individual genes from the panel, not just SNCA, separates alcoholics from controls in independent cohorts (Table 5). The conversion from SNPs to genes as part of our discovery AUTHOR CONTRIBUTIONS assumed the rule of proximity—that is, an intragenic SNP ABN designed the study, with input from JG, MR, NS and AS, and wrote the implicates the gene inside which it falls, or if it falls into an manuscript. DFL, HL-N, JF and MA analyzed the data. NJ, BK, EW and RL intergenic region, it implicates the most proximal gene to it. That performed database work. HL-N and ZR generated the animal model studies. JF, may not be true in reality in all cases, generating potentially false- FK, NW, BM-M, ND, MN, MR and the GESGA consortium generated German positives and false-negatives. However, the convergent approach GWAS cohorts 1 and 2. RS, LF, AHS, HRK and JG generated US cohorts 3 and 4. and focus on the top CFG scoring genes reduce the likelihood of All authors discussed the results and commented on the manuscript. false-positives. The only SNP for SNCA that was present/tested for in cohorts 3 and 4 (rs17015888) was relatively far away upstream (0.13 MB) REFERENCES from SNCA. However, no other known genes are present in that 1 Schuckit MA. Alcohol-use disorders. Lancet 2009; 373:492–501.

Translational Psychiatry (2014), 1 – 16 © 2014 Macmillan Publishers Limited Genetic risk prediction of alcoholism DF Levey et al 15 2 Niculescu AB, Le-Niculescu H. Convergent Functional Genomics: what we have 26 Joe KH, Kim DJ, Park BL, Yoon S, Lee HK, Kim TS et al. Genetic association of DRD2 learned and can learn about genes, pathways, and mechanisms. Neuropsycho- polymorphisms with anxiety scores among alcohol-dependent patients. Biochem pharmacology 2010; 35: 355–356. Biophys Res Commun 2008; 371:591–595. 3 Rodd ZA, Bertsch BA, Strother WN, Le-Niculescu H, Balaraman Y, Hayden E et al. 27 Noble EP, Blum K, Ritchie T, Montgomery A, Sheridan PJ. Allelic association of the Candidate genes, pathways and mechanisms for alcoholism: an expanded con- D2 dopamine receptor gene with receptor-binding characteristics in alcoholism. vergent functional genomics approach. Pharmacogenomics J 2007; 7:222–256. Arch Gen Psychiatry 1991; 48:648–654. 4 Le-Niculescu H, McFarland MJ, Ogden CA, Balaraman Y, Patel S, Tan J et al. 28 Volkow ND, Wang GJ, Fowler JS, Logan J, Hitzemann R, Ding YS et al. Decreases in Phenomic, convergent functional genomic, and biomarker studies in a stress- dopamine receptors but not in dopamine transporters in alcoholics. Alcohol Clin reactive genetic animal model of bipolar disorder and co-morbid alcoholism. Am J Exp Res 1996; 20: 1594–1598. Med Genet 2008; 147B:134–166. 29 Phillips TJ, Brown KJ, Burkhart-Kasch S, Wenger CD, Kelly MA, Rubinstein Met al. 5 Le-Niculescu H, Case NJ, Hulvershorn L, Patel SD, Bowker D, Gupta J et al. Con- Alcohol preference and sensitivity are markedly reduced in mice lacking dopa- vergent functional genomic studies of omega-3 fatty acids in stress reactivity, mine D2 receptors. Nat Neurosci 1998; 1:610–615. bipolar disorder and alcoholism. Transl Psychiatry 2011; 1: e4. 30 Zhu X, Cooper R, Kan D, Cao G, Wu X. A genome-wide linkage and association 6 Treutlein J, Cichon S, Ridinger M, Wodarz N, Soyka M, Zill P et al. Genome-wide study using COGA data. BMC Genet 2005; 6(Suppl 1): S128. association study of alcohol dependence. Arch Gen Psychiatry 2009; 66:773–784. 31 Wang S, Huang S, Liu N, Chen L, Oh C, Zhao H. Whole-genome linkage analysis in 7 Foroud T, Wetherill LF, Liang T, Dick DM, Hesselbrock V, Kramer J et al. Association mapping alcoholism genes using single-nucleotide polymorphisms and micro- of alcohol craving with alpha-synuclein (SNCA). Alcohol Clin Exp Res 2007; 31: satellites. BMC Genet 2005; 6(Suppl 1): S28. 537–545. 32 McClintick JN, Xuei X, Tischfield JA, Goate A, Foroud T, Wetherill L et al. Stress- 8 Janeczek P, Mackay RK, Lea RA, Dodd PR, Lewohl JM. Reduced expression of response pathways are altered in the hippocampus of chronic alcoholics. Alcohol alpha-synuclein in alcoholic brain: influence of SNCA-Rep1 genotype. Addict Biol 2013; 47:505–515. 2014; 19: 509–515. 33 Sommer W, Arlinde C, Heilig M. The search for candidate genes of alcoholism: 9Mayfield RD, Lewohl JM, Dodd PR, Herlihy A, Liu J, Harris RA. Patterns of gene evidence from expression profiling studies. Addict Biol 2005; 10:71–79. expression are altered in the frontal and motor cortices of human alcoholics. J 34 Worst TJ, Tan JC, Robertson DJ, Freeman WM, Hyytia P, Kiianmaa K et al. Tran- Neurochem 2002; 81: 802–813. scriptome analysis of frontal cortex in alcohol-preferring and nonpreferring rats. J 10 Lewohl JM, Van Dyk DD, Craft GE, Innes DJ, Mayfield RD, Cobon G et al. The Neurosci Res 2005; 80:529–538. application of proteomics to the human alcoholic brain. Ann N Y Acad Sci 2004; 35 Tapocik JD, Solomon M, Flanigan M, Meinhardt M, Barbier E, Schank JRet al. Coor- 1025:14–26. dinated dysregulation of mRNAs and microRNAs in the rat medial prefrontal cortex 11 Lewohl JM, Wang L, Miles MF, Zhang L, Dodd PR, Harris RA. Gene expression in following a history of alcohol dependence. Pharmacogenomics J 2013; 13:286–296. human alcoholism: microarray analysis of frontal cortex. Alcohol Clin Exp Res 2000; 36 Zoeller RT, Butnariu OV, Fletcher DL, Riley EP. Limited postnatal ethanol exposure 24: 1873–1882. permanently alters the expression of mRNAS encoding myelin basic protein and 12 Taraskina AE, Filimonov VA, Kozlovskaya YA, Morozova MN, Gaschin DV, myelin-associated glycoprotein in cerebellum. Alcohol Clin Exp Res 1994; 18: Schwarzman AL. High level of alpha-synuclein mRNA in peripheral lymphocytes 909–916. of patients with alcohol dependence syndrome. Bull exp Biol Med 2008; 146: 37 Chiappelli F, Taylor AN, Espinosa de los Monteros A, de Vellis J. Fetal alcohol 609–611. delays the developmental expression of myelin basic protein and transferrin in rat 13 Bonsch D, Reulbach U, Bayerlein K, Hillemacher T, Kornhuber J, Bleich S. Elevated primary oligodendrocyte cultures. Int J Dev Neurosci 1991; 9:67–75. alpha synuclein mRNA levels are associated with craving in patients with alco- 38 Bichenkov E, Ellingson JS. Ethanol alters the expressions of c-Fos and myelin basic holism. Biol Psychiatry 2004; 56:984–986. protein in differentiating oligodendrocytes. Alcohol 2009; 43: 627–634. 14 Bonsch D, Greifenberg V, Bayerlein K, Biermann T, Reulbach U, Hillemacher T et al. 39 Weng J, Symons MN, Singh SM. Ethanol-responsive genes (Crtam, Zbtb16, and Alpha-synuclein protein levels are increased in alcoholic patients and are linked Mobp) located in the alcohol-QTL region of chromosome 9 are associated with to craving. Alcohol Clin Exp Res 2005; 29:763–765. alcohol preference in mice. Alcohol Clin Exp Res 2009; 33:1409–1416. 15 Bonsch D, Lenz B, Kornhuber J, Bleich S. DNA hypermethylation of the alpha 40 Bell RL, Kimpel MW, McClintick JN, Strother WN, Carr LG, Liang T et al. Gene synuclein promoter in patients with alcoholism. Neuroreport 2005; 16:167–170. expression changes in the nucleus accumbens of alcohol-preferring rats following 16 Liang T, Spence J, Liu L, Strother WN, Chang HW, Ellison JA et al. alpha-Synuclein chronic ethanol consumption. Pharmacol Biochem Behav 2009; 94:131–147. maps to a quantitative trait locus for alcohol preference and is differentially 41 Treadwell JA, Singh SM. Microarray analysis of mouse brain gene expression expressed in alcohol-preferring and -nonpreferring rats. Proc Natl Acad Sci USA following acute ethanol treatment. Neurochem Res 2004; 29:357–369. 2003; 100: 4690–4695. 42 Bergen AW, Yang XR, Bai Y, Beerman MB, Goldstein AM, Goldin LR. Genomic 17 Liang T, Kimpel MW, McClintick JN, Skillman AR, McCall K, Edenberg HJ et al. regions linked to alcohol consumption in the Framingham Heart Study. BMC Candidate genes for alcohol preference identified by expression profiling in genet 2003; 4(Suppl 1): S101. alcohol-preferring and -nonpreferring reciprocal congenic rats. Genome Biol 2010; 43 Foroud T, Edenberg HJ, Goate A, Rice J, Flury L, Koller DL et al. Alcoholism sus- 11: R11. ceptibility loci: confirmation studies in a replicate sample and further mapping. 18 Walker SJ, Grant KA. Peripheral blood alpha-synuclein mRNA levels are elevated in Alcohol Clin Exp Res 2000; 24: 933–945. cynomolgus monkeys that chronically self-administer ethanol. Alcohol 2006; 38: 44 McBride WJ, Kimpel MW, McClintick JN, Ding ZM, Hauser SR, Edenberg HJ et al. 1–4. Changes in gene expression within the ventral tegmental area following repeated 19 Dick DM, Aliev F, Bierut L, Goate A, Rice J, Hinrichs A et al. Linkage analyses of IQ excessive binge-like alcohol drinking by alcohol-preferring (P) rats. Alcohol 2013; in the collaborative study on the genetics of alcoholism (COGA) sample. Behav 47:367–380. Genet 2006; 36:77–86. 45 Hill SY, Shen S, Zezza N, Hoffman EK, Perlin M, Allan W. A genome wide search for 20 Liu J, Lewohl JM, Dodd PR, Randall PK, Harris RA, Mayfield RD. Gene expression alcoholism susceptibility genes. Am J Med Genet 2004; 128:102–113. profiling of individual cases reveals consistent transcriptional changes in alcoholic 46 Flatscher-Bader T, van der Brug M, Hwang JW, Gochee PA, Matsumoto I, Niwa S human brain. J Neurochem 2004; 90: 1050–1058. et al. Alcohol-responsive genes in the frontal cortex and nucleus accumbens of 21 Kimpel MW, Strother WN, McClintick JN, Carr LG, Liang T, Edenberg HJ et al. human alcoholics. J Neurochem 2005; 93: 359–370. Functional gene expression differences between inbred alcohol-preferring and 47 Kabbaj M, Evans S, Watson SJ, Akil H. The search for the neurobiological basis of -non-preferring rats in five brain regions. Alcohol 2007; 41:95–132. vulnerability to drug abuse: using microarrays to investigate the role of stress and 22 Ponce G, Perez-Gonzalez R, Aragues M, Palomo T, Rodriguez-Jimenez R, Jimenez- individual differences. Neuropharmacology 2004; 47(Suppl 1): 111–122. Arriero MA et al. The ANKK1 kinase gene and psychiatric disorders. Neurotoxic Res 48 Pignataro L, Miller AN, Ma L, Midha S, Protiva P, Herrera DG et al. Alcohol regulates 2009; 16:50–59. gene expression in neurons via activation of heat shock factor 1. J Neurosci 2007; 23 Dick DM, Wang JC, Plunkett J, Aliev F, Hinrichs A, Bertelsen Set al. Family-based 27: 12957–12966. association analyses of alcohol dependence phenotypes across DRD2 and 49 Varodayan FP, Pignataro L, Harrison NL. Alcohol induces synaptotagmin 1 neighboring gene ANKK1. Alcohol Clin Exp Res 2007; 31: 1645–1653. expression in neurons via activation of heat shock factor 1. Neuroscience 2011; 24 Kraschewski A, Reese J, Anghelescu I, Winterer G, Schmidt LG, Gallinat J et al. 193:63–71. Association of the dopamine D2 receptor gene with alcohol dependence: hap- 50 Liu J, Lewohl JM, Harris RA, Iyer VR, Dodd PR, Randall PK et al. Patterns of gene lotypes and subgroups of alcoholics as key factors for understanding receptor expression in the frontal cortex discriminate alcoholic from nonalcoholic indivi- function. Pharmacogenet Genomics 2009; 19:513–527. duals. Neuropsychopharmacology 2006; 31: 1574–1582. 25 Munafo MR, Matheson IJ, Flint J. Association of the DRD2 gene Taq1A poly- 51 Frank J, Cichon S, Treutlein J, Ridinger M, Mattheisen M, Hoffmann P et al. morphism and alcoholism: a meta-analysis of case-control studies and evidence Genome-wide significant association between alcohol dependence and a variant of publication bias. Mol Psychiatry 2007; 12:454–461. in the ADH gene cluster. Addict Biol 2012; 17:171–180.

© 2014 Macmillan Publishers Limited Translational Psychiatry (2014), 1 – 16 Genetic risk prediction of alcoholism DF Levey et al 16 52 Gelernter J, Kranzler HR, Sherva R, Almasy L, Koesterer R, Smith AH et al. Genome- 75 Parker G, Gibson NA, Brotchie H, Heruc G, Rees AM, Hadzi-Pavlovic D. Omega-3 wide association study of alcohol dependence:significant findings in African- and Fatty acids and mood disorders. Am J Psychiatry 2006; 163:969–978. European-Americans including novel risk loci. Mol Psychiatry 2014; 19:41–49. 76 Osher Y, Belmaker RH. Omega-3 fatty acids in depression: a review of three 53 Patel SD, Le-Niculescu H, Koller DL, Green SD, Lahiri DK, McMahon FJ et al. studies. CNS Neurosci Ther 2009; 15: 128–133. Coming to grips with complex disorders: genetic risk prediction in bipolar dis- 77 Clayton EH, Hanstock TL, Hirneth SJ, Kable CJ, Garg ML, Hazell PL. Reduced mania order using panels of genes identified through convergent functional genomics. and depression in juvenile bipolar disorder associated with long-chain omega-3 Am J Med Genet 2010; 153B:850–877. polyunsaturated fatty acid supplementation. Eur J Clin Nutr 2009; 63: 1037–1040. 54 Ayalew M, Le-Niculescu H, Levey DF, Jain N, Changala B, Patel SD et al. Con- 78 Peet M, Stokes C. Omega-3 fatty acids in the treatment of psychiatric disorders. vergent functional genomics of schizophrenia: from comprehensive under- Drugs 2005; 65: 1051–1059. standing to genetic risk prediction. Mol Psychiatry 2012; 17: 887–905. 79 Berger GE, Wood SJ, Wellard RM, Proffitt TM, McConchie M, Amminger GP et al. 55 Niculescu AB, Le-Niculescu H. The P-value illusion: how to improve (psychiatric) Ethyl-eicosapentaenoic acid in first-episode psychosis. A 1H-MRS study. Neu- genetic studies. Am J Med Genet 2010; 153B:847–849. ropsychopharmacology 2008; 33: 2467–2473. 56 Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira M, Bender D et al. PLINK: A 80 Amminger GP, Schafer MR, Papageorgiou K, Klier CM, Cotton SM, Harrigan SM tool set for whole-genome association and population-based linkage analyses. et al. Long-chain omega-3 fatty acids for indicated prevention of psychotic dis- Am J Med Genet 2007; 81:559–575. orders: a randomized, placebo-controlled trial. Arch Gen Psychiatry 2010; 67: 57 Yang J, Lee SH, Goddard ME, Visscher PM. GCTA: a tool for genome-wide complex 146–154. trait analysis. Am J Med Genet 2011; 88:76–82. 81 Kurian SM, Le-Niculescu H, Patel SD, Bertram D, Davis J, Dike C et al. Identification 58 Edwards S, Baynes BB, Carmichael CY, Zamora-Martinez ER, Barrus M, Koob GF of blood biomarkers for psychosis using convergent functional genomics. Mol et al. Traumatic stress reactivity promotes excessive alcohol drinking and alters Psychiatry 2009; 16:37–58. the balance of prefrontal cortex-amygdala activity. Transl Psychiatry 2013; 3: e296. 82 Yan J, Aliev F, Webb BT, Kendler KS, Williamson VS, Edenberg HJ et al. Using 59 Koob GF, Buck CL, Cohen A, Edwards S, Park PE, Schlosburg JE et al. Addiction as a genetic information from candidate gene and genome-wide association studies stress surfeit disorder. Neuropharmacology 2014; 76 Pt B: 370–382. in risk prediction for alcohol dependence. Addict Biol advance online publication, 60 Schank JR, Ryabinin AE, Giardino WJ, Ciccocioppo R, Heilig M. Stress-related 30 January 2013; doi:10.1111/adb.12035 (e-pub ahead of print). neuropeptides and addictive behaviors: beyond the usual suspects. Neuron 2012; 83 Kos MZ, Yan J, Dick DM, Agrawal A, Bucholz KK, Rice JP et al. Common biological 76: 192–208. networks underlie genetic risk for alcoholism in African- and European-American 61 Niculescu AB 3rd, Segal DS, Kuczenski R, Barrett T, Hauger RL, Kelsoe JR. Identi- populations. Genes Brain Behavior 2013; 12: 532–542. fying a series of candidate genes for mania and psychosis: a convergent func- 84 Edenberg HJ, Foroud T. Genetics and alcoholism. Nat Rev Gastroenterol Hepatol tional genomics approach. Physiol Genomics 2000; 4:83–91. 2013; 10: 487–494. 62 Le-Niculescu H, Patel SD, Bhat M, Kuczenski R, Faraone SV, Tsuang MT et al. 85 Biernacka JM, Geske JR, Schneekloth TD, Frye MA, Cunningham JM, Choi DS et al. Convergent functional genomics of genome-wide association data for bipolar Replication of genome wide association studies of alcohol dependence: support disorder: comprehensive identification of candidate genes, pathways and for association with variation in ADH1C. PLoS ONE 2013; 8: e58798. mechanisms. Am J Med Genet 2009; 150B:155–181. 86 Villafuerte S, Strumba V, Stoltenberg SF, Zucker RA, Burmeister M. Impulsiveness 63 Gelernter J, Sherva R, Koesterer R, Almasy L, Zhao H, Kranzler HR et al. Genome- mediates the association between GABRA2 SNPs and lifetime alcohol problems. wide association study of cocaine dependence and related traits: FAM53B iden- Genes Brain Behavior 2013; 12:525–531. tified as a risk gene. Mol Psychiatry 2014; 19:41–49. 87 Winner B, Regensburger M, Schreglmann S, Boyer L, Prots I, Rockenstein E et al. 64 DeMar JC Jr, Ma K, Bell JM, Igarashi M, Greenstein D, Rapoport SI. One generation Role of alpha-synuclein in adult neurogenesis and neuronal maturation in the of n-3 polyunsaturated fatty acid deprivation increases depression and aggres- dentate gyrus. J Neurosci 2012; 32: 16906–16916. sion test scores in rats. J Lipid Res 2006; 47:172–180. 88 Lundblad M, Decressac M, Mattsson B, Bjorklund A. Impaired neurotransmission 65 Rao JS, Ertley RN, Lee HJ, DeMar JC Jr, Arnold JT, Rapoport SI et al. n-3 poly- caused by overexpression of alpha-synuclein in nigral dopamine neurons. Proc unsaturated fatty acid deprivation in rats decreases frontal cortex BDNF via a p38 Natl Acad Sci USA 2012; 109:3213–3219. MAPK-dependent mechanism. Mol Psychiatry 2007; 12:36–46. 89 Musgrove RE, King AE, Dickson TC. alpha-Synuclein protects neurons from 66 Zanarini MC, Frankenburg FR. omega-3 Fatty acid treatment of women with apoptosis downstream of free-radical production through modulation of the borderline personality disorder: a double-blind, placebo-controlled pilot study. MAPK signalling pathway. Neurotox Res 2013; 23: 358–369. Am J Psychiatry 2003; 160: 167–169. 90 Wilcox CE, Claus ED, Blaine SK, Morgan M, Hutchison KE. Genetic variation in the 67 Lin PY, Huang SY, Su KP. A meta-analytic review of polyunsaturated fatty acid alpha synuclein gene (SNCA) is associated with BOLD response to alcohol cues. J compositions in patients with depression. Biol Psychiatry 2010; 68:140–147. Stud Alcohol Drugs 2013; 74:233–244. 68 Collins MA, Moon KH, Tajuddin N, Neafsey EJ, Kim HY. Docosahexaenoic acid 91 Delis F, Thanos PK, Rombola C, Rosko L, Grandy D, Wang GJ et al. Chronic mild (DHA) prevents binge ethanol-dependent aquaporin-4 elevations while inhibiting stress increases alcohol intake in mice with low dopamine D2 receptor levels. neurodegeneration: experiments in rat adult-age entorhino-hippocampal slice Behav Neurosci 2013; 127:95–105. cultures. Neurotox Res 2013; 23: 105–110. 92 Zhou Z, Karlsson C, Liang T, Xiong W, Kimura M, Tapocik JDet al. Loss of meta- 69 Patten AR, Brocardo PS, Christie BR. Omega-3 supplementation can restore glu- botropic glutamate receptor 2 escalates alcohol consumption. Proc Natl Acad Sci tathione levels and prevent oxidative damage caused by prenatal ethanol USA 2013; 110: 16963–16968. exposure. J Nutr Biochem 2012; 24:760–769. 93 Heilig M, Goldman D, Berrettini W, O'Brien CP. Pharmacogenetic approaches to 70 McNamara RK. DHA deficiency and prefrontal cortex neuropathology in recurrent the treatment of alcohol addiction. Nat Rev Neurosci 2011; 12:670–684. affective disorders. J Nutr 2010; 140:864–868. 94 Zhao Z, Guo AY, van den Oord EJ, Aliev F, Jia P, Edenberg HJ et al. Multi-species 71 McNamara RK, Jandacek R, Rider T, Tso P, Stanford KE, Hahn CG et al. Deficits in data integration and gene ranking enrich significant results in an alcoholism docosahexaenoic acid and associated elevations in the metabolism of arachidonic genome-wide association study. BMC Genomics 2012; 13(Suppl 8): S16. acid and saturated fatty acids in the postmortem orbitofrontal cortex of patients 95 Niculescu AB 3rd, Schork NJ, Salomon DR. Mindscape: a convergent perspective with bipolar disorder. Psychiatry Res 2008; 160:285–299. on life, mind, consciousness and happiness. J Affect Disord 2010; 123:1–8. 72 Sublette ME, Hibbeln JR, Galfalvy H, Oquendo MA, Mann JJ. Omega-3 poly- unsaturated essential fatty acid status as a predictor of future suicide risk. Am J Psychiatry 2006; 163: 1100–1102. This work is licensed under a Creative Commons Attribution- 73 Lewis MD, Hibbeln JR, Johnson JE, Lin YH, Hyun DY, Loewke JD. Suicide deaths of NonCommercial-NoDerivs 3.0 Unported License. The images or active-duty US military and omega-3 fatty-acid status: a case-control comparison. other third party material in this article are included in the article’s Creative J Clin Psychiatry 2011; 72: 1585–1590. Commons license, unless indicated otherwise in the credit line; if the material is not 74 Stoll AL, Severus WE, Freeman MP, Rueter S, Zboyan HA, Diamond E et al. Omega included under the Creative Commons license, users will need to obtain permission 3 fatty acids in bipolar disorder: a preliminary double-blind, placebo- from the license holder to reproduce the material. To view a copy of this license, visit controlled trial. Arch Gen Psychiatry 1999; 56:407–412. http://creativecommons.org/licenses/by-nc-nd/3.0/

Supplementary Information accompanies the paper on the Translational Psychiatry website (http://www.nature.com/tp)

Translational Psychiatry (2014), 1 – 16 © 2014 Macmillan Publishers Limited 1

Supplementary Information

Table S1.Top candidate genes for alcoholism - CFG analysis of GWAS data. Top genes with a CFG score of 8 and above (n= 135) are shown. I - increased; D – decreased in expression. I –increased; D – decreased. PFC- prefrontal cortex. AMY-amygdala. Gene symbols underlined are blood biomarker candidate genes (20 out of 135). Best p-value SNP within the gene or flanking regions are depicted. P-values in bold are <0.001.

Huma Human Human Animal Animal Inte n Brain Peripher Animal Model GWAS 1 Model Total DBP Gene P-Value rnal Geneti Express al Model Brain DBP Best p- Peripheral CFG Stress Symbol Sco c ion Expressi Genetic Expressio Stress values Expression Score DHA re Evide Evidenc on Evidence n SNP Evidence nce e Evidence Evidence

(D)DBP Stress (I)DBP- rs170159 AMY SNCA 0.02363 2 2 4 2 0.5 2 0.5 13 DHA 82 (D)DBP Blood Stress Blood (D)DBP- rs176886 CPE 0.03277 2 2 4 2 0 1.5 0 11.5 DHA 88 Blood rs233553 FCGRT 0.04521 3 0 4 2 0.5 1.5 0 11 4 rs392364 INSIG1 0.03586 2 1 4 2 0 1.5 0 10.5 4 (D)DBP- rs686148 SPARC 0.02757 2 1 4 2 0 1.5 0 10.5 DHA 6 Blood rs643634 ACSL3 0.01857 2 1 4 2 0 1 0 10 6 rs170607 GABRG2 0.009148 2 2 4 0 0 2 0 10 63

SYT11 rs822519 0.007779 4 0 4 0 0.5 1.5 0 10

rs171563 CXCL12 0.005029 2 0 4 2 0 1.5 0 9.5 60 (D)DBP Stress rs802745 PFC ; (D)DBP- GABRB3 0.002829 2 2 4 0 0.5 1 0 9.5 5 (D)DBP DHA HIP Stress AMY (I)DBP GFAP rs744281 0.01052 3 1 4 0 0 1.5 0 9.5 Stress

AMY rs108733 NRXN3 0.001212 2 2 4 0 0 1.5 0 9.5 25 rs967977 PDYN 0.03805 2 2 4 0 0 1.5 0 9.5 1 rs257585 PSMA1 0.02751 2 0 4 2 0 1.5 0 9.5 0 rs124150 PTPRE 0.00429 2 0 4 2 0 1.5 0 9.5 45 (I)DBP- DHA HIP; rs706897 (I)DBP- SCD 0.01188 2 0 4 2 0 1.5 0 9.5 0 DHA PFC ; (I)DBP- DHA HIP rs217320 ADH1C 0.000618 3 2 4 0 0 0 0 9 1 rs168797 CAP2 0.02187 2 1 4 0 0 2 0 9 81 rs207555 DDX5 0.01493 3 2 4 0 0 0 0 9 2 (D)DBP (I)DBP- rs493801 DRD2 0.03652 2 2 4 0 1 0 0 9 Stress DHA 9 PFC AMY (I)DBP- rs776054 DST 0.02887 2 1 4 2 0 0 0 9 DHA 2 NAC rs671363 FN1 0.002112 2 1 4 0 0 2 0 9 7 2

(I)DBP GRM3 rs41440 0.001126 2 1 4 0 0 2 0 9 Stress

AMY rs376174 HMGCR 0.03334 2 1 4 0 0 2 0 9 0 rs230594 KDR 0.000462 2 1 4 0 0 2 0 9 8 rs479289 (I)DBP- MAPT 0.009256 2 1 4 0 0 2 0 9 1 DHA HIP rs105148 (D)DBP- MPDZ 0.001443 2 2 4 0 0 1 0 9 23 DHA HIP (D)DBP (D)DBP- rs238245 NFIB 0.00072 2 0 4 2 0 1 0 9 Stress DHA 6 AMY Blood rs111406 NTRK2 0.009416 2 1 4 0 0 2 0 9 88 rs269770 QDPR 0.02419 2 1 4 0 0 2 0 9 5 (D)DBP (D)DBP- Stress DHA rs226786 PFC ; PFC; SDC4 0.0212 2 1 4 0 0 2 0 9 9 (D)DBP (D)DBP- Stress DHA AMY NAC rs806878 SOX9 0.001764 2 0 4 2 0 1 0 9 9 rs695103 STX1A 0.02611 3 0 4 0 0 2 0 9 0 rs176690 SYN2 0.03978 2 1 4 0 0 2 0 9 26 rs263572 TFAP2B 0.01621 2 2 4 0 0 1 0 9 7 rs807922 USP32 0.02085 2 1 4 2 0 0 0 9 0 (I)DBP rs109671 VLDLR 0.002602 2 2 4 0 0 1 0 9 Stress 88 AMY (D)DBP- rs313436 YWHAZ 0.000535 2 0 4 0 1 2 0 9 DHA 9 Blood rs124391 ADAM10 0.007005 2 1 4 0 0 1.5 0 8.5 89 rs163053 ANXA2 0.0109 2 1 4 0 0 1.5 0 8.5 5 (I)DBP- rs116060 ARIH1 0.01122 3 1 4 0 0.5 0 0 8.5 DHA 8 Blood

CAST rs27772 0.00013 2 1 4 0 0 1.5 0 8.5

CERS2 rs267738 0.02953 4 0 4 0 0.5 0 0 8.5

rs938972 CITED2 0.009349 2 1 4 0 0.5 1 0 8.5 4 rs668237 ENO1 0.04055 2 0 4 0 0.5 2 0 8.5 6 rs700066 ENPP2 0.007509 2 0 4 0 0.5 2 0 8.5 5 rs188514 FOXG1 0.01082 2 0 4 2 0.5 0 0 8.5 7 rs730693 IGF1 0.007657 2 1 4 0 0 1 0.5 8.5 5 rs165453 KLK6 0.007661 3 0 4 0 0.5 1 0 8.5 7 (D)DBP Stress rs121778 PFC ; (D)DBP- MAP1B 0.01705 2 1 4 0 0 1.5 0 8.5 5 (I)DBP DHA HIP Stress AMY (D)DBP (I)DBP- Stress DHA rs112494 PFC ; HIP; MBP 0.006503 2 0 4 0 0 2 0.5 8.5 1 (I)DBP (D)DBP- Stress DHA AMY Blood MOBP rs562545 0.01231 2 0 4 0 0 2 0 8.5

rs218755 MPZL2 0.02351 3 1 4 0 0.5 0 0 8.5 7 rs168337 MYO1B 0.02792 2 1 4 0 0 1.5 0 8.5 62 rs134021 SCG2 0.01329 2 1 4 0 0 1.5 0 8.5 29 3

rs196861 SESTD1 0.007605 2 2 4 0 0.5 0 0 8.5 8 rs105188 TCF12 0.03998 2 1 4 0 0 1.5 0 8.5 90 rs990452 VEZF1 0.03904 2 1 4 0 0 1.5 0 8.5 3 rs254266 ACACA 0.02582 2 1 4 0 0 1 0 8 3 rs393692 ACO1 4.97E-05 2 1 4 0 0 1 0 8 7 rs470351 ACOT12 0.04173 2 0 4 2 0 0 0 8 6 rs696215 AGMO 0.005631 2 0 4 2 0 0 0 8 0 (I) DBP- AGT rs4762 0.01166 2 0 4 0 0 2 0 8 DHA

AMY rs117655 ANLN 0.01028 2 0 4 2 0 0 0 8 57 rs117359 ANXA5 0.02185 2 0 4 0 0 2 0 8 72 rs376442 ATAD5 0.02827 3 1 4 0 0 0 0 8 1 (D)DBP (D)DBP- rs147373 ATXN1 0.000537 2 1 4 0 0 1 0 8 Stress DHA 0 PFC Blood B4GALT rs869896 0.04123 4 0 4 0 0 0 0 8 2 rs151789 CCNA1 0.000617 3 1 4 0 0 0 0 8 7 rs207136 CD74 0.000424 3 1 4 0 0 0 0 8 8 (D)DBP- rs132667 CEBPD 0.03991 2 0 4 0 0.5 1.5 0 8 DHA 91 Blood CLK1 rs7224 0.01583 3 1 4 0 0 0 0 8

CNTNAP rs253897 0.00082 2 2 4 0 0 0 0 8 2 1 (D)DBP- DHA rs146947 COTL1 0.004344 2 0 4 2 0 0 0 8 Blood 9 (D)DBP- DHA HIP (I)DBP- CSNK1A rs771243 0.005067 2 1 4 0 0 1 0 8 DHA 1 1 Blood rs380927 CUX2 0.00291 2 2 4 0 0 0 0 8 7 (D)DBP rs819042 CYB5R3 0.006456 2 0 4 0 0 2 0 8 Stress 3 PFC rs961071 CYTH4 0.01097 2 0 4 2 0 0 0 8 3 rs105034 DLC1 0.02245 2 1 4 0 0 1 0 8 35 rs170556 DPYSL2 0.03831 2 2 4 0 0 0 0 8 87 rs381576 ELL2 0.007039 3 1 4 0 0 0 0 8 8

FAM3C rs998057 0.04555 2 0 4 2 0 0 0 8

rs147669 FARP2 0.01722 2 2 4 0 0 0 0 8 8 rs123355 FSD1L 0.009382 2 2 4 0 0 0 0 8 18 rs102585 GABRA2 0.0307 2 2 4 0 0 0 0 8 2 rs100371 GABRB2 0.04398 2 2 4 0 0 0 0 8 37 rs204980 GBAP1 0.0468 4 0 4 0 0 0 0 8 5 (D)DBP rs127067 GNAI1 0.00059 2 1 4 0 0 1 0 8 Stress 24 PFC (D)DBP- rs175311 GNG12 0.02658 2 1 4 0 0 1 0 8 DHA 47 Blood (D)DBP rs133390 GOT2 0.02053 2 0 4 0 0 2 0 8 Stress 64 PFC

GRHPR rs309453 0.0319 2 1 4 0 0 1 0 8

4

rs453081 GRIA1 0.04338 2 0 4 0 0 2 0 8 7 rs380140 H2AFV 0.03082 3 1 4 0 0 0 0 8 3 HNRNPA rs126725 0.02132 3 0 4 0 0 1 0 8 2B1 36

HSPH1 rs932715 0.01153 2 1 4 0 0 1 0 8

rs232016 HTR1B 0.001425 2 2 4 0 0 0 0 8 0 rs192388 HTR2A 0.003812 2 2 4 0 0 0 0 8 5 rs112460 IFITM3 0.03415 2 1 4 0 0 1 0 8 74 rs374326 IGF1R 0.00708 2 0 4 0 0 2 0 8 4 rs478844 IST1 0.01156 3 1 4 0 0 0 0 8 9

JMJD8 rs6597 0.003844 4 0 4 0 0 0 0 8

rs125533 KDM4C 0.000602 2 2 4 0 0 0 0 8 51 rs241622 MAN2A1 0.01905 2 1 4 0 0 1 0 8 7 rs143377 MAT2B 0.005647 2 0 4 2 0 0 0 8 9 (I)DBP- METAP2 rs282323 0.000198 3 1 4 0 0 0 0 8 DHA

Blood rs380067 MKLN1 0.01405 2 2 4 0 0 0 0 8 8 (D)DBP rs311729 (I)DBP- MOG 0.01429 2 0 4 0 0 2 0 8 Stress 2 DHA HIP PFC MYL2 rs933296 0.01066 2 2 4 0 0 0 0 8

MYO1D rs225184 0.000219 2 1 4 0 0 1 0 8

rs449619 NOL11 0.009119 2 0 4 2 0 0 0 8 8 rs473750 NSMAF 0.04528 2 0 4 2 0 0 0 8 8 (D)DBP Stress rs309978 PFC ; NTM 0.008411 2 1 4 0 0 1 0 8 7 (I)DBP Stress AMY

PDK4 rs854061 0.03023 2 1 4 0 0 1 0 8

PFDN6 rs456261 0.01101 4 0 4 0 0 0 0 8

(D)DBP- rs319350 PGM2L1 0.01831 2 2 4 0 0 0 0 8 DHA 7 Blood rs385018 PLD1 0.001101 2 0 4 0 0 2 0 8 8 rs169692 PLLP 0.03152 2 0 4 0 0 2 0 8 11 rs495270 PPM1B 0.03325 2 0 4 0 0 2 0 8 3 PPP1R1 rs575144 0.03198 2 0 4 2 0 0 0 8

PRKCZ rs908742 0.04567 2 0 4 0 0.5 1.5 0 8

rs659805 PSMD13 0.02309 2 1 4 0 0 1 0 8 5 rs999020 RFTN1 0.009928 2 0 4 2 0 0 0 8 8 (D)DBP- RHOB rs342056 0.000614 2 0 4 0 0.5 1.5 0 8 DHA

Blood (D)DBP rs108000 RXRG 0.005815 2 1 4 0 0 1 0 8 Stress 98 PFC

SETD1A rs897986 0.01892 4 0 4 0 0 0 0 8

(D)DBP- rs107746 SH2B3 0.01612 3 0 4 0 0 1 0 8 DHA 23 Blood (D)DBP- SLC11A rs229070 0.005329 3 1 4 0 0 0 0 8 DHA 1 8 Blood 5

SLC12A (I)DBP- rs36693 0.00193 2 0 4 0 0 2 0 8 2 DHA HIP (D)DBP (D) DBP- rs124581 SYT1 0.04139 2 0 4 0 0 1.5 0.75 8 Stress DHA 0 PFC AMY (D)DBP (D)DBP- rs750293 TIMP2 0.04309 2 1 4 0 0 1 0 8 Stress DHA 5 PFC Blood

TIMP3 rs762886 0.01733 2 1 4 0 0 1 0 8

TMED4 rs217361 0.01461 3 1 4 0 0 0 0 8

(I)DBP (D)DBP- TMEM10 rs555835 0.03234 4 0 4 0 0 0 0 8 Stress DHA 9 PFC AMY rs116712 URI1 0.01216 2 0 4 2 0 0 0 8 55 rs106385 VWF 0.01716 2 1 4 0 0 1 0 8 6

6

Table S2. Pathway Analyses of all top candidate genes for alcoholism CFG 8 and up (n=135) A. Biological Pathways. B. Disease and Disorders.

A. Ingenuity Pathways KEGG Pathways GeneGO Pathways

Top Canonical Enrichment Enrichment # P-Value Ratio Pathway Name Networks pValue Ratio Pathways Score p-value

Hepatic Fibrosis / Cell adhesion_Synaptic 1.927E- 7/155 Nicotine addiction 10.3005 3.36E-05 14/184 1 Hepatic Stellate Cell 3.78E-05 contact 06 (0.045) Activation Neurophysiological 2.021E- 6/123 Morphine addiction 8.97218 0.000127 process_Transmission of 15/212 2 RhoA Signaling 1.43E-04 06 GRPS-135 (0.049) nerve impulse CFG score >=8.0 genes Retrograde Development_Neurogenesis 1.907E- (N=135 Signaling by Rho 8/263 endocannabinoid 8.9528 0.000129 11/180 genes) 3 1.57E-04 _Synaptogenesis 04 Family GTPases (0.03) signaling

GABAergic Reproduction_Progesterone 2.081E- 6/132 8.48991 0.000206 12/213 4 p70S6K Signaling 1.87E-04 synapse signaling 04 (0.045)

Development_Regulation of 3.182E- 7/225 Cocaine addiction 6.55773 0.001419 5 IL-8 Signaling 2.75E-04 angiogenesis 04 12/223 (0.031)

B. Ingenuity GeneGO

# Diseases and Disorders P-Value Molecules Diseases P-Value Ratio Substance-Related 1.531E-28 1 Hereditary Disorder 1.39E-17 - 3.92E-04 44 Disorders 22/847

GRPS-135 Alcoholism 4.572E-28 CFG score 2 Neurological Disease 1.39E-17 - 5.77E-04 72 34/291 >=8.0 genes Bipolar Disorder 9.395E-28 (N=135 3 Psychological Disorders 1.39E-17 - 5.77E-04 54 47/706 genes) Depressive Disorder, Skeletal and Muscular 8.210E-23 48/966 4 2.56E-15 - 4.60E-04 48 Major Disorders Diabetus Mellitus, Type 1.038E-22 5 Metabolic Disease 1.17E-13 - 1.19E-04 41 2 59/1529

7

Table S3. Ingenuity drug targets analysis. Repositioning of existing drugs to be potentially tested for treating alcoholism. Genes with CFG score of 8 and above (n= 135)

Gene Symbol/Gene CFG Location Type(s) Drug(s) Name Score

GABRG2 pagoclone,alphadolone,SEP174559,clobazam,nitrazepam,tracazolate,adinazolam,sevofl gamma-aminobutyric Plasma urane,,gaboxadol,isoniazid,felbamate,etomidate,muscimol,,estazola 10 ion channel acid (GABA) A Membrane m,clorazepate,eszopiclone,quazepam,diazepam,temazepam,zolpidem,chlordiazepoxide,l receptor, gamma 2 orazepam,,triazolam,flumazenil,clonazepam,flurazepam,midazolam,oxazepa m,alprazolam ,pregnenolone pagoclone,ethchlorvynol,alphadolone,SEP174559,clobazam,nitrazepam,adinazolam,pipe razine,acetaminophen/butalbital/caffeine,,isoflurane,gaboxadol,isoniazid,felb amate,etomidate,muscimol,halothane,/olanzapine,amobarbital,estazolam,atropi GABRB3 ne/hyoscyamine/phenobarbital/scopolamine,clorazepate,acetaminophen/butalbital,eszopi gamma-aminobutyric Plasma 9.5 ion channel clone,quazepam,mephobarbital,hyoscyamine/phenobarbital,acetaminophen/butalbital/caf acid (GABA) A Membrane feine/codeine,butabarbital,diazepam,temazepam,zolpidem,chlordiazepoxide,lorazepam,o receptor, beta 3 lanzapine,triazolam,clonazepam,flurazepam,midazolam,oxazepam,alprazolam,zaleplon,s ecobarbital,butalbital,phenobarbital,pentobarbital,thiopental,propofol,ezogabine,desfluran e,methoxyflurane,,pregnenolone

ADH1C alcohol dehydrogenase 1C 9 Cytoplasm enzyme fomepizole,ethanol (class I), gamma polypeptide

,,,,,,,pardop runox,,abaperidone,methotrimeprazine,,SLV314,,rotig otine,,sultopride,,thioproperazine,,, pipothiazine,chloropromazine,domperidone,,sulpiride,meloxicam,amanta DRD2 Plasma G-protein dine,flupenthixol,,,,,,haloper dopamine receptor 9 Membrane coupled receptor idol ,fluphenazine decanoate,thiothixene, D2 ,pramipexol,olanzapine,,,,,mesoridazi ne,,,ropinirole,,,bromocripti ne,,,dopamine,,,droperidol/fentanyl,L- dopa

GRM3 Plasma G-protein glutamate receptor, 9 fasoracetam Membrane coupled receptor metabotropic 3

HMGCR aspirin/pravastatin,atorvastatin/ezetimibe,simvastatin/sitagliptin,pitavastatin,lovastatin/nia 3-hydroxy-3- 9 Cytoplasm enzyme cin,ezetimibe/simvastatin,amlodipine/atorvastatin,fluvastatin,cerivastatin,atorvastatin,prav methylglutaryl-CoA astatin,simvastatin,lovastatin,rosuvastatin reductase

KDR kinase insert domain Plasma AEE 788,sunitinib,cediranib,pazopanib,axitinib,XL647,CEP 7055,BMS-582664,CHIR- receptor (a type III 9 kinase Membrane 265,tivozanib,OSI930,telatinib,cabozantinib,regorafenib,vatalanib,sorafenib,vandetanib receptor tyrosine kinase)

NTRK2 neurotrophic tyrosine Plasma 9 kinase cabozantinib kinase, receptor, Membrane type 2 8

methohexital,primidone,meprobamate,aspirin/butalbital/caffeine,aspirin/butalbital/caffeine /codeine,hexobarbital,pagoclone,ethchlorvynol,alphadolone,SEP 174559,zopiclone,clobazam,nitrazepam,adinazolam,butobarbital,acetaminophen/butalbit al/caffeine,sevoflurane,isoflurane,gaboxadol,felbamate,etomidate,muscimol,halothane,flu GABRA2 oxetine/olanzapine,amobarbital,estazolam,atropine/hyoscyamine/phenobarbital/scopola gamma-aminobutyric Plasma 8 ion channel mine,clorazepate,acetaminophen/butalbital,eszopiclone,quazepam,mephobarbital,hyosc acid (GABA) A Membrane yamine/phenobarbital,/chlordiazepoxide,acetaminophen/butalbital/caffeine/c receptor, alpha 2 odeine,butabarbital,diazepam,temazepam,zolpidem,chlordiazepoxide,lorazepam,olanzap ine,triazolam,clonazepam,flurazepam,midazolam,flunitrazepam,oxazepam,alprazolam,za leplon,thiamylal,secobarbital,barbital,butalbital,phenobarbital,pentobarbital,thiopental,ezo gabine,desflurane,methoxyflurane,enflurane,pregnenolone

methohexital,aspirin/butalbital/caffeine,aspirin/butalbital/caffeine/codeine,fospropofol,pag oclone,ethchlorvynol,alphadolone,SEP 174559,clobazam,nitrazepam,adinazolam,acetaminophen/butalbital/caffeine,sevoflurane, GABRB2 isoflurane,gaboxadol,isoniazid,felbamate,etomidate,muscimol,halothane,fluoxetine/olanz gamma-aminobutyric Plasma apine,amobarbital,estazolam,atropine/hyoscyamine/phenobarbital/scopolamine,clorazep 8 ion channel acid (GABA) A Membrane ate,acetaminophen/butalbital,eszopiclone,mephobarbital,hyoscyamine/phenobarbital,ace receptor, beta 2 taminophen/butalbital/caffeine/codeine,butabarbital,diazepam,temazepam,zolpidem,chlor diazepoxide,lorazepam,olanzapine,triazolam,clonazepam,flurazepam,midazolam,oxazep am,alprazolam,zaleplon,secobarbital,butalbital,phenobarbital,pentobarbital,thiopental,pro pofol,ezogabine,desflurane,methoxyflurane,enflurane,pregnenolone

GRIA1 Plasma glutamate receptor, 8 ion channel talampanel,Org24448,LY451395,tezampanel,perampanel,sevoflurane,isoflurane,\desflur Membrane ionotropic, AMPA 1 ane,methoxyflurane,enflurane

HTR1B 5-hydroxytryptamine Plasma G-protein caffeine/,asenapine,donitriptan,,,,, () receptor 8 Membrane coupled receptor ,dihydroergotamine,ergotamine,,, 1B, G protein- coupled

paliperidone,risperidone,buspirone,caffeine/ergotamine,iloperidone,,blonanserin HTR2A ,,asenapine,ocaperidone,abaperidone,psilocybine,APD125,thioproperazine,lurasi 5-hydroxytryptamine Plasma G-protein coupled done,,pipothiazine,chloropromazine,,flupenthixol,,chlorprothixe 8 (serotonin) receptor 2A, Membrane receptor ne,clozapine,,,,,fluoxetine/olanzapine,thiot G protein-coupled hixene,,epinastine,fenfluramine,quetiapine,olanzapine,,,amit riptyline,,,lisuride,sertindole,ziprasidone,,thioridazi ne,aripiprazole,,dihydroergotamine,apomorphine,ergotamine,azatadine IGF1R Plasma transmembrane insulin-like growth factor 8 Membrane receptor OSI-906,cixutumumab,ganitumab,AVE1642,BIIB022,IGF1 1 receptor

METAP2 methionyl 8 Cytoplasm peptidase PPI-2458 aminopeptidase 2

PRKCZ 8 Cytoplasm kinase ingenol 3-angelate protein kinase C, zeta

RXRG ligand- retinoid X receptor, 8 Nucleus dependent etretinate,bexarotene,adapalene,acitretin,tretinoin,9-cis-retinoic acid gamma nuclear receptor

SLC12A2 solute carrier family 12 Plasma 8 transporter bumetanide,quinethazone (sodium/potassium/c Membrane hloride transporters), member 2

9

Table S4: GRPS-135 large panel used for genetic risk prediction of alcoholism (n=135 genes).

A2 A1 A2 OR A2 HumanHap550v3 A1 P A1 (C (Co Gene Symbol/Gene (Coh (Coh (Coh _Gene_Annotatio Gene Location (Coho (Coh (Coho oh hort Name ort ort ort 3 nSNP rt 1) ort 1) rt 2) ort 3 1) 1) ,4) 2) ,4)

0.81 0.028 rs732770 NM_198838.1 flanking_3UTR A G A G A G 71 53 0.83 0.048 rs9906543 NM_198837.1 intron C T C T A G 84 79 0.79 0.040 rs2305097 NM_198837.1 intron G T G T ACACA 33 48 acetyl-CoA 0.77 0.025 carboxylase alpha rs2542663 NM_198838.1 intron C T C T 81 82 0.78 0.034 rs2898659 NM_198838.1 intron T C T C 74 17 0.76 0.031 rs9330248 NM_198837.1 intron A G A G 31 23 1.23 0.020 rs1033216 NM_002197.1 flanking_5UTR T G T G C A 4 69 1.40 0.009 rs10757998 NM_002197.1 flanking_5UTR G A G A G A 6 29 0.000 rs10813415 NM_002197.1 flanking_5UTR G A 1.68 G A G A 117 1.28 0.004 rs10813420 NM_002197.1 flanking_5UTR T C T C A G 4 285 1.19 0.027 rs10969854 NM_002197.1 flanking_5UTR T C C T G A 4 82 1.23 0.011 rs12552052 NM_002197.1 flanking_5UTR T C T C G A 5 45 0.74 0.021 rs13302787 NM_002197.1 flanking_5UTR T G T G A C 63 73 1.40 0.005 rs2888998 NM_002197.1 flanking_5UTR G A G A G A 7 323 1.33 0.039 rs2992117 NM_002197.1 flanking_5UTR C T C T G A 6 35 1.68 4.97E rs3936927 NM_002197.1 flanking_5UTR T C T C A G ACO1 8 -05 aconitase 1, 1.22 0.048 soluble rs7027645 NM_002197.1 flanking_5UTR T C T C A G 9 7 0.76 0.023 rs7035270 NM_002197.1 flanking_5UTR G A G A G A 26 32 0.001 rs10970046 NM_002197.1 flanking_5UTR G A 1.33 G A 177 0.66 0.026 rs13285154 NM_002197.1 flanking_5UTR T C T C 6 38 0.68 0.041 rs13295330 NM_002197.1 flanking_5UTR C T C T 79 14 1.51 0.000 rs1335137 NM_002197.1 flanking_5UTR T C T C 8 433 1.18 0.046 rs17776734 NM_002197.1 flanking_5UTR A C A C 8 49 1.25 0.006 rs2375104 NM_002197.1 flanking_5UTR A G G A 2 665 1.22 0.027 rs7021230 NM_002197.1 flanking_5UTR A G A G 2 08 1.24 0.007 rs7038348 NM_002197.1 flanking_5UTR C T C T 2 778 ACOT12 0.74 0.041 acyl-CoA rs4703516 NM_130767.1 intron T G T G A C 76 73 thioesterase 12 ACSL3 0.80 0.018 rs6436346 NM_004457.3 flanking_5UTR G A G A G A acyl-CoA 66 57 synthetase long- 0.78 0.034 chain family rs2395926 NM_004457.3 flanking_5UTR A C A C member 3 46 89 1.30 0.007 rs12439189 NM_001110.2 intron C A C A C A 8 005 1.23 0.030 ADAM10 rs1869135 NM_001110.2 flanking_3UTR C T C T G A ADAM 9 5 metallopeptidase 1.23 0.043 rs2081703 NM_001110.2 intron G A G A G A domain 10 7 93 1.24 0.015 rs4238331 NM_001110.2 intron T G T G A C 3 68 10

1.34 0.031 rs11499823 NM_000669.3 flanking_5UTR C T C T G A 8 12 1.37 0.003 rs1789891 NM_000669.3 flanking_3UTR A C A C A C ADH1C 7 27 alcohol 1.26 0.004 dehydrogenase 1C rs1789924 NM_000669.3 flanking_5UTR T C T C A G 9 76 (class I), gamma 0.69 0.000 polypeptide rs2173201 NM_000669.3 flanking_3UTR A C A C A C 08 618 1.27 0.003 rs2851300 NM_000669.3 flanking_5UTR T C T C A G 6 949 NM_001004320 1.39 0.015 rs2191994 flanking_3UTR C T C T G A .1 6 13 NM_001004320 1.32 0.005 rs6962150 flanking_3UTR T G T G A C .1 4 631 NM_001004320 0.78 0.045 rs10243870 flanking_3UTR G T G T AGMO .1 97 95 alkylglycerol NM_001004320 0.82 0.022 monooxygenase rs10274730 flanking_3UTR T C T C .1 34 14 NM_001004320 0.79 0.046 rs4422690 flanking_3UTR C T C T .1 09 12 NM_001004320 0.79 0.046 rs4599703 intron A G A G .1 04 43 1.25 0.019 rs2478545 NM_000029.2 intron T C T C A G 2 19 AGT 1.18 0.046 rs2493137 NM_000029.2 flanking_5UTR C T C T A G angiotensinogen 7 27 (serpin peptidase 0.011 inhibitor, clade A, rs4762 NM_000029.2 coding T C 1.34 T C A G member 8) 66 1.26 0.027 rs7536290 NM_000029.2 flanking_3UTR G A G A G A 3 17 0.73 0.010 rs11765557 NM_018685.2 flanking_3UTR T G T G A C 9 28 ANLN 1.29 0.049 anillin, actin rs3801317 NM_018685.2 intron T C T C 5 53 binding protein 1.53 0.013 rs6954831 NM_018685.2 intron A G A G 2 62 NM_001002858 1.25 0.041 rs12148854 flanking_3UTR G A G A G A .1 8 13 ANXA2 NM_001002858 1.33 0.010 rs1630535 flanking_3UTR A G A G A G annexin A2 .1 8 9 1.29 0.021 rs11735972 NM_001154.2 flanking_3UTR C T C T G A 9 85 ANXA5 0.049 rs1158024 NM_001154.2 flanking_3UTR T C 1.19 T C annexin A5 48 1.37 0.023 rs1562250 NM_005744.2 intron T C T C A G 6 67 0.80 0.017 rs17753089 NM_005744.2 flanking_3UTR T C T C A G ARIH1 62 67 ariadne RBR E3 0.58 0.047 rs17825866 NM_005744.2 flanking_3UTR C T C T G A ubiquitin protein 41 45 ligase 1 1.43 0.011 rs1160608 NM_005744.2 intron A G A G 1 22 0.034 rs2278694 NM_005744.2 intron A G 1.36 A G 74 0.034 ATAD5 rs3816780 NM_024857.3 coding T C 1.33 T C A G ATPase family, 96 AAA domain 1.35 0.028 rs3764421 NM_024857.3 coding C A C A containing 5 3 27 0.69 0.002 rs1150639 NM_000332.2 intron A G A G A G 02 488 1.40 0.006 rs12204969 NM_000332.2 intron C T C T G A 8 857 1.55 0.000 rs1473730 NM_000332.2 flanking_5UTR G A G A G A 8 537 0.78 0.007 rs16879127 NM_000332.2 flanking_5UTR A C A C A C 14 641 1.50 0.004 rs179944 NM_000332.2 intron A G A G A G 6 517 ATXN1 0.013 rs3812199 NM_000332.2 intron C T 0.81 C T G A ataxin 1 91 1.23 0.015 rs3812200 NM_000332.2 intron T C T C A G 6 7 0.82 0.045 rs3812206 NM_000332.2 intron T C T C A G 91 34 0.75 0.001 rs4074661 NM_000332.2 flanking_5UTR A G A G A G 34 507 1.35 0.003 rs6903850 NM_000332.2 intron T G T G A C 9 059 0.83 0.049 rs697739 NM_000332.2 intron A G A G G A 75 64 11

1.39 0.017 rs9396669 NM_000332.2 intron A G A G A G 4 05 1.24 0.049 rs9396726 NM_000332.2 flanking_5UTR T G T G A C 5 65 0.71 0.002 rs1144699 NM_000332.2 intron G A G A 46 979 1.41 0.026 rs12196135 NM_000332.2 intron A G A G 5 66 0.67 0.012 rs17595094 NM_000332.2 intron A G A G 19 44 1.30 0.026 rs17606174 NM_000332.2 intron T C T C 5 56 1.95 0.009 rs2744413 NM_000332.2 intron C T C T 8 804 0.65 0.001 rs3793118 NM_000332.2 intron A G A G 05 042 0.002 rs9350018 NM_000332.2 flanking_5UTR C A 0.75 C A 317 B4GALT2 UDP- Gal:betaGlcNAc NM_001005417 1.41 0.041 rs869896 intron T C T C beta 1,4- .1 8 23 galactosyltransfera se, polypeptide 2 1.40 0.021 CAP2 rs16879781 NM_006366.2 intron T C T C A G CAP, adenylate 5 87 cyclase-associated 1.18 0.041 rs13194542 NM_006366.2 intron T C T C protein, 2 (yeast) 7 38 1.25 0.005 rs10053056 NM_173060.1 intron T C T C G A 6 7 0.71 0.000 rs13362120 NM_001750.4 intron C T C T G A 82 324 0.83 0.038 rs155051 NM_001750.4 flanking_5UTR C T C T G A 86 31 1.19 0.032 rs155053 NM_001750.4 flanking_5UTR A G A G A G 3 98 0.74 0.000 rs25862 NM_173061.1 intron A G A G A G 65 433 1.19 0.034 rs261227 NM_001750.4 flanking_5UTR G T G T C A 6 97 1.27 0.025 rs26505 NM_173061.1 coding T C T C A G 4 55 1.27 0.015 rs27991 NM_173060.1 intron A G A G A G CAST 5 24 calpastatin 0.77 0.002 rs4400148 NM_001750.4 flanking_5UTR T C T C A G 45 181 0.71 0.000 rs10037212 NM_173061.1 intron A C A C 97 31 0.81 0.017 rs155039 NM_001750.4 flanking_5UTR T C T C 86 87 0.83 0.045 rs155040 NM_001750.4 flanking_5UTR A C A C 74 81 0.012 rs17476752 NM_001750.4 flanking_5UTR G T 1.32 G T 99 0.71 0.000 rs1862609 NM_001750.4 intron A G A G 15 193 1.40 0.000 rs27772 NM_173061.1 intron G A G A 6 13 1.39 0.000 rs469532 NM_173060.1 intron C A C A 7 808 0.75 0.000 rs1517881 NM_003914.2 flanking_3UTR A G A G A G 18 797 0.77 0.002 rs1517896 NM_003914.2 flanking_3UTR T C T C A G 38 482 0.74 0.000 rs1517897 NM_003914.2 flanking_3UTR T C T C A G 88 617 0.79 0.016 rs6563485 NM_003914.2 flanking_3UTR C T C T G A 61 24 0.79 0.015 rs9315422 NM_003914.2 flanking_3UTR T G T G A C CCNA1 92 93 cyclin A1 0.80 0.019 rs943725 NM_003914.2 flanking_3UTR T C T C A G 45 49 1.22 0.034 rs9576081 NM_003914.2 flanking_3UTR A G A G A G 1 75 0.73 0.001 rs9603055 NM_003914.2 flanking_3UTR T C T C A G 9 89 0.78 0.010 rs4391925 NM_003914.2 flanking_3UTR C A C A 89 98 1.26 0.004 rs7330345 NM_003914.2 flanking_3UTR C T C T 3 912 CD74 1.21 0.025 rs13175409 NM_004355.2 flanking_5UTR T C T C A G CD74 molecule, 8 56 12

major 0.50 0.000 rs2071368 NM_004355.2 flanking_5UTR C T C T G A histocompatibility 14 424 complex, class II 0.66 0.017 rs2288817 NM_004355.2 intron G A G A G A invariant chain 92 34 0.83 0.042 CEBPD rs12541086 NM_005195.2 flanking_3UTR T C T C G A CCAAT/enhancer 31 42 binding protein 0.039 rs13266791 NM_005195.2 flanking_3UTR C T 1.44 C T (C/EBP), delta 91 CERS2 0.79 0.029 ceramide synthase rs267738 NM_181746.1 coding C A C A C A 38 53 2 1.20 0.035 rs2080112 NM_006079.3 flanking_5UTR T C T C A G 6 02 1.19 0.036 rs3010311 NM_006079.3 flanking_5UTR G A G A G A 6 38 0.020 rs4896491 NM_006079.3 flanking_5UTR T C 1.52 T C A G 62 1.24 0.030 rs6903961 NM_006079.3 flanking_5UTR A G A G A G 6 76 1.18 0.043 rs6913402 NM_006079.3 flanking_3UTR T G T G C A 5 84 0.75 0.026 rs6925308 NM_006079.3 flanking_3UTR A G A G A G CITED2 24 77 Cbp/p300- 1.20 0.040 rs887780 NM_006079.3 flanking_5UTR G A G A G A interacting 5 75 transactivator, with 1.26 0.009 Glu/Asp-rich rs9389724 NM_006079.3 flanking_5UTR A G A G A G carboxy-terminal 2 349 domain, 2 1.19 0.036 rs9495538 NM_006079.3 flanking_5UTR C T C T A G 1 41 0.78 0.024 rs12525047 NM_006079.3 flanking_5UTR T C T C 54 64 0.78 0.024 rs13204837 NM_006079.3 flanking_5UTR A G A G 54 64 0.81 0.048 rs1587171 NM_006079.3 flanking_5UTR C T C T 29 21 1.31 0.016 rs17069288 NM_006079.3 flanking_5UTR T G T G 2 71 1.17 0.048 rs7759402 NM_006079.3 flanking_5UTR A G A G 9 78 NM_001024646 1.34 0.025 rs11903236 intron C T C T G A CLK1 .1 3 1 CDC-like kinase 1 1.41 0.015 rs7224 NM_004071.2 3UTR C A C A 3 83 0.83 0.036 rs10238572 NM_014141.3 intron C T C T A G 93 88 0.70 0.030 rs10243377 NM_014141.3 intron T C T C A G 18 93 1.21 0.018 rs10250570 NM_014141.3 intron A G A G A G 2 34 1.24 0.021 rs10251377 NM_014141.3 intron G A G A G A 1 37 1.24 0.033 rs10268483 NM_014141.3 intron G A G A G A 4 62 1.24 0.032 rs10276770 NM_014141.3 intron T C T C A G 9 33 0.78 0.045 rs10279409 NM_014141.3 intron A G A G A G 19 41 1.29 0.021 rs12112963 NM_014141.3 intron C A C A C A 6 61 0.73 0.026 rs12531630 NM_014141.3 intron A G A G A G CNTNAP2 84 22 contactin 1.50 0.010 rs1404732 NM_014141.3 intron C T C T G A associated protein- 8 2 like 2 0.81 0.035 rs1881726 NM_014141.3 intron C T C T G A 47 31 0.007 rs2011815 NM_014141.3 intron G T 1.33 G T C A 536 1.23 0.016 rs2204465 NM_014141.3 flanking_5UTR C T C T G A 1 6 0.64 0.025 rs2533096 NM_014141.3 intron G A G A G A 46 03 0.67 0.000 rs2538971 NM_014141.3 intron C T C T G A 12 82 0.78 0.029 rs4421277 NM_014141.3 intron A G A G A G 02 21 1.43 0.006 rs6943628 NM_014141.3 intron A G A G A G 3 44 1.74 0.007 rs6970519 NM_014141.3 flanking_3UTR T G T G A C 4 191 0.79 0.022 rs700303 NM_014141.3 intron A G A G A G 48 34 13

1.51 0.008 rs7785839 NM_014141.3 intron T C T C A G 9 994 0.025 rs7790550 NM_014141.3 intron T C 1.25 T C A G 99 1.22 0.012 rs7808274 NM_014141.3 intron A G A G G A 6 09 0.82 0.038 rs826810 NM_014141.3 intron A G A G A G 36 38 1.21 0.041 rs10272638 NM_014141.3 intron A G A G 7 47 1.22 0.014 rs11762071 NM_014141.3 intron G A G A 1 16 0.80 0.014 rs1177007 NM_014141.3 intron G A G A 14 1 1.74 0.007 rs12538810 NM_014141.3 flanking_3UTR A C A C 2 308 1.40 0.028 rs13225365 NM_014141.3 intron A G A G 9 94 0.70 0.003 rs1351312 NM_014141.3 intron T C T C 76 337 0.80 0.024 rs1517770 NM_014141.3 flanking_5UTR T C T C 73 58 0.80 0.025 rs1534702 NM_014141.3 intron T C T C 58 98 1.67 0.001 rs1718101 NM_014141.3 intron A G A G 3 685 0.78 0.035 rs2074711 NM_014141.3 intron G A G A 35 26 1.25 0.022 rs7779225 NM_014141.3 intron G A G A 6 87 1.27 0.017 rs7784277 NM_014141.3 intron T G T G 9 52 0.83 0.031 rs7805359 NM_014141.3 intron C T C T 81 82 1.49 0.003 rs7806237 NM_014141.3 intron A C A C 7 595 0.64 0.025 rs826823 NM_014141.3 intron C T C T 1 32 0.82 0.031 rs851715 NM_014141.3 intron G A G A 55 85 0.83 0.047 rs851826 NM_014141.3 intron C T C T 79 41 0.79 0.014 rs1429264 NM_021149.2 intron C T C T 73 39 COTL1 1.34 0.004 coactosin-like 1 rs1469479 NM_021149.2 intron G T G T 1 344 (Dictyostelium) 0.82 0.035 rs7188531 NM_021149.2 intron T C T C 3 81 1.44 0.032 rs17688688 NM_001873.1 flanking_3UTR A G A G A G CPE 3 77 carboxypeptidase 1.20 0.038 E rs4691213 NM_001873.1 flanking_3UTR A G A G 6 41 NM_001025105 0.71 0.005 rs7712431 intron C T C T G A .1 17 067 CSNK1A1 NM_001025105 0.64 0.046 casein kinase 1, rs6580605 flanking_3UTR C T C T .1 79 37 alpha 1 NM_001025105 1.43 0.014 rs9325142 flanking_3UTR A G A G .1 7 1 1.25 0.025 rs7300860 NM_015267.1 intron C T C T G A 4 41 CUX2 1.49 0.002 cut-like homeobox rs3809277 NM_015267.1 intron A C A C 3 91 2 1.28 0.033 rs3809278 NM_015267.1 intron T G T G 1 29 1.20 rs1002588 NM_000609.4 flanking_5UTR T C 0.042 T C A G 4 1.38 0.005 rs17156360 NM_000609.4 flanking_5UTR T C T C A G 7 029 0.72 0.040 rs17156715 NM_000609.4 flanking_5UTR A C A C A C 5 05 1.30 0.018 rs17406863 NM_000609.4 flanking_5UTR C T C T G A CXCL12 9 28 chemokine (C-X-C NM_001033886 0.80 0.023 motif) ligand 12 rs2297630 intron A G A G A G .1 34 76 1.27 0.030 rs11591534 NM_000609.4 flanking_5UTR A G A G 9 67 0.79 0.032 rs12412154 NM_000609.4 flanking_5UTR T C T C 84 91 1.20 0.043 rs7075674 NM_000609.4 flanking_5UTR T C T C 3 71 14

0.76 0.010 rs994179 NM_000609.4 flanking_5UTR C T C T 59 34 0.67 0.006 rs2285141 NM_007326.2 intron T G T G CYB5R3 22 635 cytochrome b5 0.67 0.006 reductase 3 rs8190423 NM_007326.2 intron T C T C 13 456 0.046 rs4821628 NM_013385.2 flanking_3UTR G A 1.18 G A G A CYTH4 95 cytohesin 4 0.80 0.010 rs9610713 NM_013385.2 flanking_3UTR C T C T G A 62 97 DDX5 DEAD (Asp-Glu- 1.43 0.014 rs2075552 NM_004396.2 intron A G A G Ala-Asp) box 3 93 helicase 5 DLC1 1.68 0.022 deleted in liver rs10503435 NM_006094.3 intron C T C T G A 6 45 cancer 1 1.23 0.039 rs11775967 NM_001386.4 intron A C A C A C 9 57 1.23 0.041 DPYSL2 rs11780026 NM_001386.4 intron G A G A 6 46 dihydropyrimidinas 1.48 0.038 rs17055687 NM_001386.4 flanking_3UTR T G T G e-like 2 8 31 1.48 0.038 rs17088251 NM_001386.4 flanking_3UTR A G A G 8 31 1.25 0.044 rs4648317 NM_016574.2 intron T C T C A G DRD2 8 09 dopamine receptor 1.26 0.036 D2 rs4938019 NM_016574.2 intron C T C T 8 52 0.55 0.034 rs1024196 NM_020388.2 coding G A G A G A 29 12 DST 0.55 0.034 rs11966986 NM_183380.1 intron T C T C dystonin 29 12 0.54 0.028 rs7760542 NM_183380.1 intron G A G A 34 87 1.19 0.037 rs147295 NM_012081.3 flanking_3UTR T C T C G A 9 21 1.26 0.018 rs3777179 NM_012081.3 intron T C T C A G 6 21 1.27 0.009 rs3777185 NM_012081.3 intron A G A G A G 2 015 1.25 0.020 rs7705279 NM_012081.3 flanking_5UTR A G A G A G 6 91 1.19 0.049 ELL2 rs7717348 NM_012081.3 flanking_5UTR T G T G A C elongation factor, 7 6 RNA polymerase II, 0.023 rs918629 NM_012081.3 flanking_5UTR A G 1.24 A G A G 2 9 1.24 0.016 rs10035477 NM_012081.3 intron C T C T 9 18 1.28 0.007 rs3815768 NM_012081.3 coding A G A G 2 039 1.20 0.040 rs4538631 NM_012081.3 flanking_5UTR T C T C 6 01 1.22 0.036 rs6556898 NM_012081.3 flanking_5UTR G A G A 4 32 1.19 0.040 rs6682376 NM_001428.2 flanking_5UTR C T C T A G ENO1 2 55 enolase 1, (alpha) 1.19 0.047 rs12124851 NM_001428.2 flanking_3UTR G A G A 4 23 1.21 0.049 rs1058913 NM_006209.2 coding T C T C A G 4 19 0.023 rs1992721 NM_006209.2 intron G A 1.22 G A G A 32 0.008 rs2289887 NM_006209.2 intron A G 1.57 A G A G 782 1.20 0.024 rs6993464 NM_006209.2 flanking_3UTR T C T C A G ENPP2 3 35 ectonucleotide 1.21 0.016 rs10505364 NM_006209.2 flanking_3UTR G A G A pyrophosphatase/p 8 04 hosphodiesterase 2 1.19 0.028 rs11782176 NM_006209.2 flanking_3UTR G A G A 7 83 1.17 rs4520196 NM_006209.2 intron G A 0.05 G A 5 1.24 0.007 rs7000665 NM_006209.2 flanking_3UTR C T C T 5 509 1.20 0.022 rs7846200 NM_006209.2 flanking_3UTR A G A G 5 89 1.61 0.047 FAM3C rs2697177 NM_014888.1 flanking_5UTR G A G A G A family with 1 48 sequence similarity 0.68 0.045 rs998057 NM_014888.1 flanking_5UTR A G A G 3, member C 3 55 15

1.22 0.017 FARP2 rs1476698 NM_014808.1 intron C T C T G A FERM, RhoGEF 3 22 and pleckstrin 1.26 0.035 rs3771570 NM_014808.1 intron T C T C domain protein 2 4 52 FCGRT Fc fragment of IgG, 0.80 0.045 rs2335534 NM_004107.3 flanking_5UTR A G A G A G receptor, 57 21 transporter, alpha 0.81 0.029 rs1250085 NM_212482.1 flanking_5UTR G A G A A G 4 47 0.83 0.037 rs1250064 NM_212482.1 flanking_5UTR T G T G 17 72 FN1 0.79 0.009 rs1250079 NM_212482.1 flanking_5UTR T C T C fibronectin 1 99 874 1.21 0.024 rs4411688 NM_212482.1 flanking_5UTR G A G A 8 91 0.002 rs6713637 NM_212482.1 flanking_5UTR G A 1.35 G A 112 1.22 0.011 rs10141935 NM_005249.3 flanking_5UTR G A G A G A 7 31 0.81 0.010 rs1885147 NM_005249.3 flanking_5UTR A G A G G A 1 82 1.47 0.017 rs7143611 NM_005249.3 flanking_5UTR C T C T G A 8 9 0.014 rs730195 NM_005249.3 flanking_5UTR T C 1.34 T C A G 15 FOXG1 1.22 0.012 rs10132568 NM_005249.3 flanking_5UTR A G A G forkhead box G1 4 05 1.21 0.022 rs10483345 NM_005249.3 flanking_3UTR A G A G 3 37 0.84 0.034 rs176338 NM_005249.3 flanking_5UTR A G A G 3 31 0.82 0.024 rs2224437 NM_005249.3 flanking_5UTR T C T C 24 89 0.80 0.027 rs8012337 NM_005249.3 flanking_5UTR G T G T 06 66 FSD1L fibronectin type III 1.92 0.009 rs12335518 NM_031919.1 flanking_5UTR C A C A and SPRY domain 7 382 containing 1-like GABRA2 gamma- 0.82 0.030 aminobutyric acid rs1025852 NM_000807.1 flanking_5UTR A G A G A G 83 7 (GABA) A receptor, alpha 2 0.84 0.049 rs6888242 NM_021911.1 flanking_5UTR A G A G A G GABRB2 41 84 gamma- 1.24 0.047 aminobutyric acid rs7714930 NM_021911.1 intron C T C T G A 5 43 (GABA) A receptor, 0.83 0.043 beta 2 rs10037137 NM_021911.1 flanking_5UTR T G T G 98 98 1.29 0.006 rs10873636 NM_021912.2 intron G T G T C A 3 327 1.33 0.046 rs17738087 NM_021912.2 intron C T C T G A 4 12 1.32 0.009 rs1863459 NM_021912.2 intron G A G A G A 3 25 1.20 0.049 rs2928697 NM_021912.2 intron T C T C A G GABRB3 1 64 gamma- 1.22 0.042 aminobutyric acid rs4906676 NM_021912.2 flanking_3UTR G A G A G A 5 31 (GABA) A receptor, 1.28 0.002 beta 3 rs8027455 NM_021912.2 flanking_3UTR A C A C A C 9 829 1.28 0.003 rs933950 NM_021912.2 flanking_3UTR T C T C A G 8 382 1.29 0.006 rs11161324 NM_021912.2 intron G A G A 4 13 1.25 0.006 rs4572353 NM_021912.2 flanking_3UTR T C T C 6 747 0.63 0.009 rs17060763 NM_198904.1 flanking_3UTR T G T G A C 03 148 0.79 0.047 rs6881035 NM_198904.1 flanking_3UTR G T G T C A GABRG2 1 2 gamma- 0.60 0.011 aminobutyric acid rs7713655 NM_198904.1 flanking_5UTR A G A G A G 52 53 (GABA) A receptor, 0.78 0.038 gamma 2 rs13171710 NM_198904.1 flanking_3UTR G A G A 21 22 1.25 0.037 rs6864438 NM_198904.1 flanking_5UTR C T C T 5 59 GBAP1 0.84 0.046 glucosidase, beta, rs2049805 NR_002188.1 flanking_5UTR G A G A A G 88 8 acid pseudogene 1 16

0.81 0.028 rs3169733 NM_002055.2 flanking_5UTR G A G A G A 53 21 0.80 0.011 rs736866 NM_002055.2 flanking_5UTR A C A C A C GFAP 82 4 glial fibrillary acidic 0.82 0.036 protein rs3744473 NM_002055.2 flanking_3UTR C T C T 39 95 0.80 0.010 rs744281 NM_002055.2 flanking_5UTR T C T C 6 52 0.73 0.001 rs10486920 NM_002069.4 flanking_5UTR A G A G A G 47 82 0.75 0.021 rs17802148 NM_002069.4 flanking_5UTR C T C T G A 95 24 1.19 0.037 rs2523189 NM_002069.4 intron T C T C A G 4 24 0.81 0.018 rs2886609 NM_002069.4 flanking_3UTR A G A G G A 64 36 0.75 0.006 GNAI1 rs2886611 NM_002069.4 intron T C T C A G guanine nucleotide 58 321 binding protein (G 0.80 0.032 rs4731111 NM_002069.4 flanking_5UTR G A G A G A protein), alpha 01 5 inhibiting activity 0.75 0.011 rs6466884 NM_002069.4 flanking_5UTR C T C T G A polypeptide 1 16 01 0.71 0.000 rs12706724 NM_002069.4 flanking_3UTR A G A G 21 59 0.73 0.000 rs4731302 NM_002069.4 flanking_3UTR C T C T 92 998 0.74 0.010 rs7803811 NM_002069.4 flanking_5UTR A C A C 85 6 0.73 0.006 rs7805663 NM_002069.4 flanking_5UTR G A G A 54 723 GNG12 guanine nucleotide 1.37 0.026 rs17531147 NM_018841.3 intron A G A G binding protein (G 8 58 protein), gamma 12 0.82 0.026 rs12709027 NM_002080.2 flanking_5UTR A G A G A G 88 9 0.83 0.030 rs1351575 NM_002080.2 flanking_5UTR G T G T C A GOT2 23 67 glutamic- 1.20 0.042 oxaloacetic rs1445456 NM_002080.2 flanking_5UTR A C A C A C 4 73 transaminase 2, 0.020 mitochondrial rs13339064 NM_002080.2 flanking_5UTR T C 1.3 T C 53 0.048 rs9888768 NM_002080.2 flanking_5UTR A G 1.19 A G 85 0.82 0.031 GRHPR rs309453 NM_012203.1 intron C T C T A G glyoxylate 8 9 reductase/hydroxyp 0.73 0.039 rs309455 NM_012203.1 intron T C T C A G yruvate reductase 08 26 GRIA1 glutamate 0.84 0.043 rs4530817 NM_000827.2 intron G A G A G A receptor, ionotropic, 15 38 AMPA 1 0.008 rs17160519 NM_000840.2 flanking_5UTR A C 1.3 A C A C 344 0.70 0.021 rs17315854 NM_000840.2 flanking_5UTR C T C T G A 19 15 0.030 rs4236502 NM_000840.2 flanking_5UTR C A 1.2 C A C A 87 1.25 0.021 rs6944937 NM_000840.2 flanking_5UTR C T C T G A 4 1 0.65 0.006 rs10499898 NM_000840.2 flanking_5UTR T C T C 1 829 1.26 0.047 rs12668989 NM_000840.2 flanking_5UTR C T C T 2 4 1.30 0.022 rs12673599 NM_000840.2 flanking_5UTR T C T C GRM3 9 71 glutamate 0.62 0.002 rs13222675 NM_000840.2 flanking_5UTR C T C T receptor, 43 859 metabotropic 3 0.71 0.029 rs13236080 NM_000840.2 flanking_5UTR G A G A 74 56 0.008 rs1527769 NM_000840.2 flanking_5UTR T C 0.66 T C 384 1.29 0.028 rs1554888 NM_000840.2 flanking_5UTR C T C T 5 52 1.27 0.042 rs17161018 NM_000840.2 intron A G A G 1 77 0.74 0.004 rs2373124 NM_000840.2 flanking_5UTR G A G A 99 97 0.73 0.005 rs2708553 NM_000840.2 flanking_5UTR T C T C 9 225 0.72 0.001 rs41440 NM_000840.2 flanking_5UTR G A G A 21 126 17

H2AFV 0.77 0.030 H2A histone rs3801403 NM_138635.2 intron C T C T 36 82 family, member V 1.31 0.036 rs11742194 NM_000859.1 intron T C T C A G HMGCR 3 93 3-hydroxy-3- 1.31 0.033 rs3761740 NM_000859.1 flanking_5UTR A C A C A C methylglutaryl-CoA 5 34 reductase 1.30 0.040 rs5909 NM_000859.1 3UTR A G A G A G 7 48 HNRNPA2B1 heterogeneous 0.70 0.021 nuclear rs12672536 NM_031243.1 intron C T C T 02 32 ribonucleoprotein A2/B1 0.049 rs2280059 NM_006644.2 5UTR A G 1.23 A G A G 2 0.79 0.012 rs9591599 NM_006644.2 flanking_3UTR C T C T G A 2 93 0.68 0.011 HSPH1 rs1028713 NM_006644.2 flanking_3UTR G A G A heat shock 79 95 105kDa/110kDa 0.73 0.037 rs2224779 NM_006644.2 flanking_3UTR A G A G protein 1 62 86 0.74 0.013 rs7358865 NM_006644.2 flanking_3UTR C T C T 24 54 0.72 0.011 rs932715 NM_006644.2 flanking_3UTR C T C T 69 53 0.79 0.023 rs1145831 NM_000863.1 flanking_3UTR G A G A G A 71 64 0.80 0.042 rs12189558 NM_000863.1 flanking_5UTR A G A G A G 14 57 1.20 0.047 rs12202018 NM_000863.1 flanking_3UTR G A G A G A 4 68 1.18 0.044 rs12206214 NM_000863.1 flanking_3UTR A G A G A G 1 11 1.25 0.011 rs1343485 NM_000863.1 flanking_3UTR G T G T C A 1 28 0.041 rs1548240 NM_000863.1 flanking_3UTR A G 1.22 A G A G 13 0.009 rs17263843 NM_000863.1 flanking_3UTR T C 1.27 T C A G 987 0.73 0.015 rs1777766 NM_000863.1 flanking_3UTR C T C T G A 44 84 1.29 0.029 rs2504298 NM_000863.1 flanking_5UTR C T C T G A 7 46 1.22 0.026 rs4265000 NM_000863.1 flanking_3UTR A G A G A G 7 81 1.18 0.040 rs7747803 NM_000863.1 flanking_3UTR G A G A A G 5 26 1.47 0.034 rs9343725 NM_000863.1 flanking_5UTR C T C T G A 7 46 0.027 HTR1B rs10484725 NM_000863.1 flanking_5UTR G A 1.28 G A 5- 22 hydroxytryptamine 0.035 rs10943426 NM_000863.1 flanking_3UTR G A 1.19 G A (serotonin) receptor 07 1B, G protein- 0.82 0.021 rs1213378 NM_000863.1 flanking_5UTR A G A G coupled 62 51 1.20 0.024 rs1350406 NM_000863.1 flanking_3UTR G A G A 8 56 1.27 0.035 rs17209149 NM_000863.1 flanking_5UTR C A C A 8 86 1.55 0.001 rs2320160 NM_000863.1 flanking_3UTR A G A G 7 425 0.84 0.049 rs4075570 NM_000863.1 flanking_3UTR G A G A 62 35 1.31 0.032 rs7753704 NM_000863.1 flanking_5UTR C T C T 1 69 0.80 0.024 rs9294040 NM_000863.1 flanking_3UTR A G A G 12 02 1.55 0.001 rs9341629 NM_000863.1 flanking_3UTR T C T C 3 517 1.55 0.001 rs9343544 NM_000863.1 flanking_3UTR G A G A 7 425 1.31 0.013 rs9343676 NM_000863.1 flanking_5UTR T C T C 2 47 0.015 rs9350721 NM_000863.1 flanking_5UTR T G 1.31 T G 41 1.27 0.037 rs9352483 NM_000863.1 flanking_5UTR A G A G 5 96 1.28 0.024 rs9359291 NM_000863.1 flanking_5UTR T G T G 4 46 18

1.32 0.023 rs9361346 NM_000863.1 flanking_5UTR G A G A 3 85 1.47 0.034 rs9361361 NM_000863.1 flanking_5UTR T C T C 7 46 1.34 0.017 rs9443501 NM_000863.1 flanking_5UTR T C T C 2 13 1.28 0.025 rs953576 NM_000863.1 flanking_5UTR C T C T 3 22 0.83 0.038 rs1328684 NM_000621.2 intron C T C T G A 32 25 1.35 0.044 rs1536762 NM_000621.2 flanking_5UTR C T C T G A 3 17 0.79 0.047 rs2406091 NM_000621.2 flanking_5UTR A G A G A G 13 1 0.83 0.036 rs2802396 NM_000621.2 flanking_5UTR C T C T G A 46 58 0.80 0.048 HTR2A rs582385 NM_000621.2 intron C T C T G A 5- 8 68 hydroxytryptamine 0.78 0.030 rs7333963 NM_000621.2 flanking_5UTR T C T C A G (serotonin) receptor 1 23 2A, G protein- 0.81 0.033 rs9567733 NM_000621.2 flanking_3UTR G A G A G A coupled 26 14 1.27 0.003 rs1923885 NM_000621.2 intron C T C T 7 812 0.83 0.038 rs4142745 NM_000621.2 flanking_5UTR T C T C 65 16 0.038 rs6561363 NM_000621.2 flanking_5UTR C A 1.84 C A 41 1.36 0.030 rs9526329 NM_000621.2 flanking_5UTR T G T G 1 6 IFITM3 interferon induced 1.26 0.034 rs11246074 NM_021034.1 flanking_5UTR G A G A G A transmembrane 3 15 protein 3 0.007 IGF1 rs7306935 NM_000618.2 flanking_5UTR C A 1.33 C A C A insulin-like growth 657 factor 1 1.20 0.037 rs1350356 NM_000618.2 flanking_5UTR A G A G (somatomedin C) 8 05 1.22 0.015 rs12908437 NM_000875.2 intron T C T C G A 7 55 0.60 0.039 rs2311767 NM_000875.2 intron T C T C A G 81 92 1.26 0.007 rs3743264 NM_000875.2 intron G A G A G A 2 08 0.63 0.036 rs4426332 NM_000875.2 intron T C T C A G 37 54 1.19 0.035 rs4527002 NM_000875.2 flanking_5UTR C T C T A G 1 49 IGF1R 1.21 0.020 insulin-like growth rs6598534 NM_000875.2 flanking_5UTR A G A G G A 4 6 factor 1 receptor 1.20 0.029 rs7181975 NM_000875.2 flanking_5UTR T C T C A G 6 8 0.048 rs8038015 NM_000875.2 intron C T 1.18 C T G A 25 1.25 0.008 rs1007212 NM_000875.2 intron C T C T 5 6 1.20 0.029 rs11633294 NM_000875.2 intron A C A C 9 95 1.21 0.025 rs12898502 NM_000875.2 intron T C T C 4 18 INSIG1 0.83 0.035 insulin induced rs3923644 NM_198337.1 flanking_5UTR A C A C A C 61 86 gene 1 IST1 increased sodium 1.25 0.011 rs4788449 NM_014761.2 intron G A G A tolerance 1 3 56 homolog (yeast) JMJD8 NM_001005920 1.37 0.003 jumonji domain rs6597 3UTR G T G T C A .2 9 844 containing 8 0.80 0.038 rs10116021 NM_015061.1 intron A G A G A G 13 37 0.79 0.011 rs10491779 NM_015061.1 intron A G A G A G 44 91 0.78 0.043 rs1052489 NM_015061.1 flanking_5UTR C T C T G A KDM4C 91 02 (K)-specific 1.22 0.016 demethylase 4C rs10815532 NM_015061.1 intron C T C T A G 5 39 0.80 0.027 rs1093707 NM_015061.1 intron G T G T C A 12 29 1.25 rs10976080 NM_015061.1 intron A C 0.023 A C A C 8 19

0.044 rs10976081 NM_015061.1 intron T G 1.25 T G A C 2 1.22 0.040 rs10976092 NM_015061.1 flanking_3UTR G A G A G A 4 75 0.83 0.040 rs10976129 NM_015061.1 flanking_3UTR C A C A C A 85 34 1.29 0.001 rs11790877 NM_015061.1 intron G A G A G A 5 616 1.35 0.016 rs11998950 NM_015061.1 intron T C T C A G 4 95 0.73 0.017 rs12001316 NM_015061.1 intron A G A G A G 36 91 1.29 0.009 rs12345872 NM_015061.1 intron G T G T C A 9 38 0.76 0.002 rs1570508 NM_015061.1 intron T G T G A C 65 529 0.78 0.005 rs1570512 NM_015061.1 intron G A G A G A 35 32 0.60 0.015 rs16925068 NM_015061.1 intron T C T C A G 12 1 0.002 rs2381530 NM_015061.1 intron A C 1.32 A C A C 137 1.24 0.010 rs2770759 NM_015061.1 flanking_3UTR C T C T G A 5 81 0.80 0.008 rs2792238 NM_015061.1 intron A G A G A G 45 342 1.25 0.021 rs2820914 NM_015061.1 flanking_3UTR C T C T G A 4 48 0.82 0.029 rs2890733 NM_015061.1 intron A G A G G A 89 43 1.19 0.040 rs2990658 NM_015061.1 intron T C T C A G 4 71 0.50 0.001 rs4742269 NM_015061.1 intron A G A G A G 16 892 0.74 0.003 rs4742298 NM_015061.1 intron A G A G A G 42 066 0.76 0.011 rs4742301 NM_015061.1 intron G A G A G A 53 59 0.80 0.010 rs7019042 NM_015061.1 intron G A G A A G 4 98 0.82 0.030 rs722628 NM_015061.1 intron A G A G A G 83 19 1.27 0.003 rs7848022 NM_015061.1 intron T G T G C A 2 954 0.77 0.009 rs818887 NM_015061.1 intron G A G A G A 24 574 0.84 0.038 rs913588 NM_015061.1 coding A G A G A G 09 29 0.79 0.004 rs946929 NM_015061.1 intron T C T C G A 32 648 1.22 0.046 rs10815539 NM_015061.1 flanking_3UTR G A G A 2 37 0.79 0.040 rs12353012 NM_015061.1 flanking_3UTR G T G T 12 72 0.47 0.000 rs12553351 NM_015061.1 intron G T G T 83 602 0.79 0.008 rs2381536 NM_015061.1 intron G A G A 24 355 0.76 0.005 rs2381542 NM_015061.1 intron G A G A 18 789 0.76 0.049 rs2765964 NM_015061.1 flanking_3UTR A G A G 05 01 1.24 0.014 rs2792235 NM_015061.1 intron A G A G 1 03 1.22 0.015 rs7027375 NM_015061.1 intron G A G A 7 23 0.76 0.005 rs7031179 NM_015061.1 intron G A G A 02 425 1.23 0.042 rs7037553 NM_015061.1 flanking_3UTR G A G A 8 55 0.76 0.002 rs7045009 NM_015061.1 intron T C T C 75 663 1.31 0.025 rs876294 NM_015061.1 flanking_3UTR C T C T 1 88 1.19 0.032 rs913587 NM_015061.1 intron A C A C 6 17 0.58 0.000 KDR rs2305948 NM_002253.1 coding T C T C A G kinase insert 13 462 domain receptor (a 0.78 0.033 rs17709427 NM_002253.1 flanking_3UTR C T C T type III receptor 8 93 20

tyrosine kinase) 0.75 0.010 rs2305949 NM_002253.1 intron T C T C 52 52 0.79 0.038 rs1654531 NM_002774.3 flanking_5UTR C T C T G A 85 05 KLK6 NM_001012965 0.83 0.037 kallikrein-related rs4592765 flanking_3UTR C T C T G A .1 44 26 peptidase 6 NM_001012965 0.78 0.007 rs1654537 intron C T C T .1 32 661 0.030 rs17162540 NM_002372.2 flanking_3UTR G A 1.33 G A G A 99 0.019 MAN2A1 rs2416227 NM_002372.2 flanking_3UTR C A 1.31 C A C A mannosidase, 05 alpha, class 2A, 1.23 0.026 rs6865255 NM_002372.2 flanking_3UTR G A G A A G member 1 6 91 1.23 0.028 rs10900662 NM_002372.2 flanking_3UTR A G A G 5 19 0.017 MAP1B rs1217785 NM_005909.2 flanking_5UTR T C 1.22 T C A G microtubule- 05 associated protein 0.82 0.025 rs2118695 NM_032010.1 intron A G A G G A 1B 14 42 0.78 0.011 rs11867549 NM_005910.2 intron G A G A G A MAPT 58 96 microtubule- 0.72 0.012 rs4792893 NM_005910.2 intron A G A G A G associated protein 16 45 tau 0.78 0.009 rs4792891 NM_005910.2 intron G T G T 96 256 1.23 0.023 rs10042643 NM_013283.3 flanking_3UTR T C T C A G 9 13 1.26 0.018 rs1030674 NM_013283.3 flanking_3UTR C A C A C A 2 91 0.80 0.014 rs10515861 NM_182796.1 intron C T C T G A 5 4 1.36 0.048 rs10515902 NM_013283.3 flanking_3UTR A G A G A G 2 05 1.51 0.014 rs11740544 NM_013283.3 flanking_3UTR T C T C A G 3 52 0.79 0.012 rs1421630 NM_013283.3 flanking_3UTR A C A C A C 29 51 0.80 0.016 rs1469064 NM_013283.3 flanking_3UTR T C T C A G 84 51 0.82 0.025 rs1895139 NM_013283.3 flanking_3UTR G A G A G A 57 99 0.84 0.037 rs3980160 NM_013283.3 flanking_3UTR G A G A G A 02 51 1.20 0.044 rs6864913 NM_013283.3 flanking_3UTR A G A G A G 2 08 1.19 0.037 rs6867953 NM_013283.3 flanking_3UTR C A C A A C MAT2B 5 92 methionine 0.82 0.022 rs6875996 NM_013283.3 flanking_3UTR A G A G A G adenosyltransferas 21 46 e II, beta 0.82 0.022 rs7443611 NM_013283.3 flanking_3UTR A G A G A G 14 62 1.22 0.028 rs7702763 NM_013283.3 flanking_3UTR A G A G A G 8 94 1.45 0.040 rs7719164 NM_013283.3 flanking_3UTR G A G A G A 3 43 1.61 0.005 rs1433779 NM_013283.3 flanking_3UTR T C T C 6 647 0.62 0.046 rs1895210 NM_013283.3 flanking_3UTR C T C T 38 19 1.18 0.045 rs2635175 NM_013283.3 flanking_3UTR T C T C 1 15 1.20 0.042 rs294587 NM_013283.3 flanking_3UTR G A G A 1 4 1.29 0.047 rs297966 NM_013283.3 flanking_3UTR C T C T 1 06 0.82 0.025 rs4073827 NM_013283.3 flanking_3UTR T C T C 6 96 0.81 0.017 rs4868851 NM_013283.3 flanking_3UTR G T G T 52 07 1.30 0.044 rs7735401 NM_013283.3 flanking_3UTR A G A G 8 12 NM_001025101 1.25 0.006 rs1124941 flanking_5UTR T G T G A C .1 8 503 NM_001025101 0.77 0.021 rs1789094 flanking_5UTR T C T C A G MBP .1 6 23 myelin basic NM_001025101 0.74 0.009 protein rs1789103 flanking_5UTR T G T G A C .1 56 002 NM_001025101 1.22 0.015 rs2282566 intron T C T C A G .1 4 71 21

NM_001025101 1.35 0.019 rs470131 intron C A C A C A .1 1 54 NM_001025101 1.20 0.037 rs4890912 flanking_5UTR G A G A G A .1 2 02 NM_001025101 0.79 0.006 rs736421 intron A G A G A G .1 83 722 NM_001025101 1.27 0.032 rs9947485 flanking_5UTR T C T C A G .1 5 42 NM_001025101 0.75 0.011 rs9951586 flanking_5UTR C T C T G A .1 86 34 NM_001025101 1.22 0.013 rs1015820 flanking_5UTR G A G A .1 9 71 NM_001025101 1.25 0.007 rs11877526 flanking_5UTR G A G A .1 4 231 NM_001025101 1.26 0.041 rs1562771 flanking_5UTR A G A G .1 3 22 NM_001025101 0.75 0.010 rs1667952 flanking_5UTR C T C T .1 68 48 NM_001025101 0.74 0.008 rs1789105 flanking_5UTR C T C T .1 44 632 NM_001025101 1.40 0.009 rs1789139 flanking_5UTR T C T C .1 7 443 NM_001025101 1.26 0.008 rs1812680 flanking_5UTR G A G A .1 7 193 0.75 0.000 rs10777695 NM_006838.2 flanking_5UTR T C T C G A 18 866 0.79 0.013 rs159853 NM_006838.2 flanking_5UTR T C T C G A 94 76 0.73 0.000 rs2727639 NM_006838.2 flanking_5UTR G A G A G A 9 592 0.66 0.000 rs282323 NM_006838.2 flanking_5UTR G A G A G A 65 198 0.80 0.035 rs301008 NM_006838.2 flanking_5UTR G T G T C A 35 93 0.66 0.000 rs301013 NM_006838.2 intron A C A C A C METAP2 09 546 methionyl 0.71 0.001 aminopeptidase 2 rs2456549 NM_006838.2 flanking_5UTR G T G T 05 602 0.67 0.004 rs2596761 NM_006838.2 flanking_5UTR A G A G 23 958 0.67 0.005 rs2769432 NM_006838.2 flanking_5UTR A G A G 59 181 0.71 0.001 rs2769436 NM_006838.2 flanking_5UTR G A G A 31 818 0.67 0.004 rs301042 NM_006838.2 intron C T C T 07 69 0.67 0.004 rs964496 NM_006838.2 flanking_5UTR T C T C 23 958 MKLN1 0.79 0.045 rs13237471 NM_013255.3 flanking_5UTR A G A G A G muskelin 1, 06 99 intracellular 0.59 0.014 mediator containing rs3800678 NM_013255.3 intron G A G A kelch motifs 89 05 0.82 0.029 MOBP rs2233204 NM_006501.1 intron T C T C A G myelin-associated 94 25 oligodendrocyte 1.24 0.012 rs562545 NM_006501.1 intron A G A G A G basic protein 1 31 NM_001008229 1.19 0.042 rs2747442 flanking_3UTR G A G A G A MOG .1 2 8 myelin NM_001008229 1.22 0.014 rs3117292 flanking_3UTR G A G A A G oligodendrocyte .1 6 29 glycoprotein NM_001008229 0.025 rs3117294 flanking_3UTR G T 1.21 G T A C .1 12 1.28 0.009 rs10514824 NM_003829.1 flanking_3UTR A G A G A G 1 825 0.74 0.014 rs10809906 NM_003829.1 intron A G A G A G 6 61 0.69 0.030 rs1412110 NM_003829.1 intron T C T C A G 5 08 1.21 0.023 rs1889297 NM_003829.1 flanking_5UTR T C T C G A MPDZ 1 68 multiple PDZ 1.22 0.019 domain protein rs3264 NM_003829.1 3UTR C T C T G A 4 67 0.77 0.015 rs10435761 NM_003829.1 flanking_3UTR T C T C 88 46 0.001 rs10514823 NM_003829.1 flanking_3UTR C T 1.35 C T 443 1.26 0.016 rs1328906 NM_003829.1 flanking_3UTR T C T C 3 47 MPZL2 0.80 0.023 rs2187557 NM_005797.2 flanking_5UTR T C T C A G myelin protein 67 51 22

zero-like 2 0.82 0.024 rs3759001 NM_005797.2 flanking_5UTR T C T C G A 51 19 0.83 0.030 rs7940405 NM_005797.2 flanking_5UTR G A G A 02 4 MYL2 myosin, light chain 1.24 0.010 rs933296 NM_000432.1 intron T G T G A C 2, regulatory, 3 66 cardiac, slow MYO1B 0.73 0.027 rs16833762 NM_012223.2 flanking_3UTR C T C T G A myosin IB 77 92 1.42 0.000 rs225184 NM_015194.1 intron G A G A G A 6 219 0.78 0.004 rs225206 NM_015194.1 intron A C A C A C 93 985 MYO1D 0.80 0.008 rs225209 NM_015194.1 intron T C T C A G myosin ID 16 687 0.70 0.000 rs2640828 NM_015194.1 intron A G A G A G 93 289 1.27 0.003 rs414947 NM_015194.1 intron G A G A A G 9 381 0.82 0.023 rs10481545 NM_005596.1 flanking_5UTR T C T C G A 63 31 0.64 0.002 rs10810110 NM_005596.1 intron G A G A G A 59 111 0.67 0.009 rs10810113 NM_005596.1 intron T C T C A G 91 812 1.19 0.033 rs10810136 NM_005596.1 flanking_5UTR G A G A G A 8 35 1.19 0.042 rs11788150 NM_005596.1 flanking_3UTR A C A C A C 2 48 1.22 0.025 rs12341720 NM_005596.1 flanking_3UTR T C T C A G 4 79 0.64 0.004 rs12377502 NM_005596.1 intron G A G A G A 85 182 0.83 0.046 rs1323339 NM_005596.1 flanking_5UTR T G T G A C 86 19 1.18 0.043 rs1556032 NM_005596.1 flanking_5UTR C T C T A G 1 29 0.80 0.047 rs17708974 NM_005596.1 flanking_5UTR A C A C A C NFIB 02 34 nuclear factor I/B 0.030 rs4740570 NM_005596.1 flanking_5UTR T C 1.2 T C G A 85 1.28 0.023 rs548824 NM_005596.1 intron G A G A G A 9 64 0.77 0.031 rs7042750 NM_005596.1 flanking_5UTR T C T C A G 46 06 1.31 0.018 rs7858 NM_005596.1 3UTR A G A G A G 1 04 0.003 rs7866165 NM_005596.1 flanking_3UTR T G 1.31 T G A C 853 0.64 0.004 rs10756536 NM_005596.1 intron G A G A 55 939 0.73 0.002 rs10810098 NM_005596.1 intron T C T C 69 888 0.60 0.000 rs2382456 NM_005596.1 intron T C T C 77 72 0.60 0.000 rs2382459 NM_005596.1 intron T G T G 48 928 1.24 0.030 rs7021837 NM_005596.1 flanking_3UTR C T C T 3 5 0.012 rs1542610 NM_015462.2 intron T C 1.35 T C A G NOL11 85 nucleolar protein 1.37 0.009 11 rs4496198 NM_015462.2 flanking_5UTR C T C T 2 119 0.70 0.014 rs10142370 NM_004796.3 intron A G A G A G 15 72 1.35 0.034 rs1030127 NM_004796.3 intron C T C T G A 1 01 1.28 0.016 rs10400751 NM_004796.3 intron C T C T G A 5 67 0.75 0.001 rs10873325 NM_004796.3 flanking_3UTR T C T C A G NRXN3 09 212 neurexin 3 1.35 0.021 rs12891137 NM_138970.2 intron C T C T G A 9 3 1.27 0.024 rs17109935 NM_004796.3 intron C T C T G A 1 97 1.21 0.039 rs17110183 NM_004796.3 flanking_3UTR A G A G A G 4 96 1.23 0.047 rs17175771 NM_004796.3 flanking_3UTR G T G T C A 7 55 23

1.22 0.015 rs1861957 NM_004796.3 intron A C A C C A 3 56 0.83 0.038 rs1863034 NM_004796.3 intron C T C T A G 38 39 1.20 0.026 rs6574495 NM_004796.3 intron T C T C G A 4 94 1.45 0.001 rs766024 NM_138970.2 intron T C T C A G 1 968 1.21 0.027 rs9646166 NM_004796.3 intron C T C T G A 8 51 1.23 0.022 rs10131645 NM_004796.3 intron G A G A 4 26 0.016 rs12890943 NM_004796.3 intron C T 1.43 C T 89 0.79 0.049 rs17597295 NM_138970.2 intron T C T C 97 26 1.24 0.031 rs2193135 NM_004796.3 flanking_3UTR T C T C 6 98 1.50 0.013 rs4899727 NM_004796.3 intron G A G A 9 79 0.83 0.040 rs725693 NM_004796.3 intron G T G T 54 28 NSMAF neutral sphingomyelinase 1.27 0.045 rs4737508 NM_003580.2 flanking_5UTR C T C T G A (N-SMase) 1 28 activation associated factor 1.34 0.014 rs3099774 NM_016522.2 intron T C T C A G 2 71 1.17 0.048 rs496368 NM_016522.2 intron C T C T G A 8 36 NTM 0.80 0.012 rs7116467 NM_016522.2 intron G A G A G A neurotrimin 83 48 1.33 0.021 rs3099775 NM_016522.2 intron T C T C 1 2 0.008 rs3099787 NM_016522.2 intron G A 1.38 G A 411 0.71 0.042 rs3739570 NM_006180.3 3UTR T C T C A G 44 05 0.80 0.034 NTRK2 rs11140659 NM_006180.3 flanking_5UTR T C T C neurotrophic 98 23 tyrosine kinase, 1.37 0.009 rs11140688 NM_006180.3 flanking_5UTR T C T C receptor, type 2 1 416 0.78 0.028 rs1866438 NM_006180.3 flanking_5UTR T C T C 11 94 PDK4 pyruvate 0.70 0.030 rs854061 NM_002612.2 flanking_5UTR G A G A dehydrogenase 13 23 kinase, isozyme 4 PDYN 0.69 0.038 rs9679771 NM_024411.2 flanking_3UTR A G A G A G prodynorphin 11 05 PFDN6 0.80 0.011 rs456261 NM_014260.2 intron C T C T G A prefoldin subunit 6 6 01 PGM2L1 1.21 0.018 rs3193507 NM_173582.3 3UTR A G A G phosphoglucomuta 3 31 se 2-like 1 1.20 0.025 rs3850186 NM_002662.2 flanking_5UTR A G A G A G 7 92 0.001 rs3850188 NM_002662.2 flanking_5UTR C A 1.47 C A C A 101 1.32 0.008 rs6787821 NM_002662.2 flanking_5UTR G A G A G A 7 373 1.37 0.004 rs7631361 NM_002662.2 flanking_5UTR A G A G A G PLD1 5 761 phospholipase D1, 0.83 0.026 rs9817267 NM_002662.2 flanking_5UTR G A G A A G phosphatidylcholine 37 9 -specific 0.82 0.024 rs9848505 NM_002662.2 flanking_5UTR C T C T G A 98 8 1.39 0.002 rs9968111 NM_002662.2 flanking_5UTR A C A C A C 6 883 1.31 0.013 rs9849306 NM_002662.2 flanking_5UTR A G A G 2 75 1.31 0.011 rs9872465 NM_002662.2 flanking_5UTR T C T C 2 42 PLLP 1.21 0.031 rs16969211 NM_015993.1 flanking_5UTR G A G A plasmolipin 4 52 PPM1B protein NM_001033557 1.31 0.033 phosphatase, rs4952703 intron T C T C A G .1 5 25 Mg2+/Mn2+ dependent, 1B 24

PPP1R12C protein 1.33 0.031 phosphatase 1, rs575144 NM_017607.1 intron T C T C 2 98 regulatory subunit 12C PRKCZ NM_001033582 1.18 0.045 protein kinase C, rs908742 intron A G A G A G .1 7 67 zeta PSMA1 0.82 0.027 rs2575850 NM_148976.1 intron C T C T G A proteasome 6 51 (prosome, 0.60 0.032 macropain) subunit, rs11023274 NM_148976.1 intron A G A G alpha type, 1 56 17 PSMD13 1.32 0.035 rs532483 NM_175932.1 flanking_3UTR A G A G A G proteasome 6 9 (prosome, macropain) 26S 0.82 0.023 rs6598055 NM_175932.1 intron A G A G subunit, non- 22 09 ATPase, 13 1.30 0.020 rs12249657 NM_130435.2 intron A G A G A G 3 74 1.28 0.004 rs12415045 NM_006504.3 intron T C T C G A 8 29 0.009 PTPRE rs12769331 NM_006504.3 intron G A 1.35 G A G A protein tyrosine 19 phosphatase, 1.29 0.032 rs932837 NM_006504.3 intron A G A G A G receptor type, E 3 54 1.31 0.033 rs1359850 NM_006504.3 intron C T C T 8 97 0.80 0.023 rs7895103 NM_006504.3 intron C T C T 38 54 0.036 QDPR rs2697707 NM_000320.1 flanking_3UTR A G 1.45 A G A G quinoid 86 dihydropteridine 1.25 0.024 rs2697705 NM_000320.1 flanking_3UTR G A G A reductase 4 19 1.21 0.020 rs553681 NM_015150.1 intron T C T C A G 6 4 0.81 0.026 rs556322 NM_015150.1 intron A G A G A G 85 53 0.72 0.026 rs7621664 NM_015150.1 intron C T C T G A RFTN1 56 72 raftlin, lipid raft 0.83 0.037 linker 1 rs7644810 NM_015150.1 intron C T C T G A 3 95 0.73 0.036 rs17272509 NM_015150.1 intron C A C A 42 59 0.66 0.009 rs9990208 NM_015150.1 intron T C T C 14 928 0.81 0.014 rs13016649 NM_004040.2 flanking_3UTR A C A C A C 52 79 1.35 0.000 rs342056 NM_004040.2 flanking_3UTR A G A G A G 3 614 RHOB 1.19 0.031 ras homolog family rs959058 NM_004040.2 flanking_3UTR T C T C G A 6 17 member B 1.27 0.006 rs342092 NM_004040.2 flanking_3UTR G A G A 9 39 0.000 rs6531240 NM_004040.2 flanking_3UTR C T 1.32 C T 836 RXRG NM_001009598 1.58 0.005 retinoid X receptor, rs10800098 intron A G A G A G .1 3 815 gamma 1.41 0.019 SCD rs10883463 NM_005063.4 intron C T C T stearoyl-CoA 8 46 desaturase (delta- 1.45 0.011 rs7068970 NM_005063.4 flanking_3UTR C A C A 9-desaturase) 2 88 0.63 0.013 rs13402129 NM_003469.3 flanking_5UTR G T G T C A 28 29 0.76 0.042 rs6728104 NM_003469.3 flanking_3UTR G A G A G A 89 94 1.27 0.035 rs7565690 NM_003469.3 flanking_3UTR T G T G A C 6 05 SCG2 0.82 0.028 rs10172774 NM_003469.3 flanking_5UTR T T C secretogranin II 42 25 1.39 0.028 rs17827801 NM_003469.3 flanking_3UTR T C T C 4 67 0.79 0.047 rs1921634 NM_003469.3 flanking_3UTR A C A C 94 93 1.20 0.025 rs1921651 NM_003469.3 flanking_3UTR C T C T 8 71 0.79 0.045 rs11696248 NM_002999.2 flanking_3UTR C T C T G A SDC4 49 24 syndecan 4 0.82 0.021 rs2267869 NM_002999.2 intron T C T C G A 33 2 25

0.80 0.037 rs2284277 NM_002999.2 intron A G A G A G 36 22 0.75 0.048 rs12474600 NM_178123.3 flanking_3UTR A G A G A G 81 37 1.22 0.013 rs12693178 NM_178123.3 intron G A G A A G 7 64 1.26 0.007 rs1968618 NM_178123.3 flanking_3UTR A G A G G A 5 605 1.37 0.031 rs3731739 NM_178123.3 flanking_3UTR A G A G A G SESTD1 5 42 SEC14 and 0.043 spectrin domains 1 rs6433746 NM_178123.3 flanking_3UTR G A 1.28 G A G A 78 1.41 0.019 rs17363393 NM_178123.3 intron A G A G 2 95 1.42 0.017 rs2305170 NM_178123.3 intron G T G T 4 64 1.28 0.044 rs2592579 NM_178123.3 flanking_3UTR T C T C 8 96 1.21 0.019 rs2305884 NM_014712.1 intron C T C T SETD1A 9 96 SET domain 1.22 0.018 containing 1A rs897986 NM_014712.1 intron A G A G 1 92 1.35 0.033 rs2239194 NM_005475.1 intron A G A G A G SH2B3 9 46 SH2B adaptor 1.26 0.016 protein 3 rs10774623 NM_005475.1 flanking_5UTR G A G A 8 12 SLC11A1 NM_001032220 0.005 rs2290708 intron T C 1.31 T C A G solute carrier .1 329 family 11 (proton- coupled divalent NM_001032220 1.28 0.008 metal ion rs3816560 intron C T C T transporter), .1 5 734 member 1 0.71 0.001 rs36693 NM_001046.2 flanking_3UTR T C T C A G 99 93 0.78 0.025 rs3749748 NM_001046.2 flanking_5UTR T C T C A G SLC12A2 95 5 solute carrier 0.80 0.049 rs790155 NM_001046.2 intron G A G A G A family 12 97 05 (sodium/potassium/ 0.77 0.018 chloride rs2409037 NM_001046.2 flanking_5UTR T G T G transporter), 27 08 member 2 0.76 0.014 rs36701 NM_001046.2 flanking_3UTR G A G A 25 4 0.80 0.046 rs790154 NM_001046.2 intron T C T C 75 71 0.65 0.047 rs17015888 NM_000345.2 flanking_3UTR A G A G A G 25 6 SNCA 0.69 0.023 rs17015982 NM_000345.2 flanking_3UTR G A G A synuclein, alpha 19 63 (non A4 component 0.70 0.025 of amyloid rs6532183 NM_000345.2 flanking_3UTR G A G A precursor) 21 35 0.65 0.047 rs7668883 NM_000345.2 flanking_3UTR G A G A 25 6 1.39 0.008 rs17246175 NM_000346.2 flanking_3UTR T C T C A G 3 321 0.80 0.038 rs2430549 NM_000346.2 flanking_5UTR C T C T G A 84 32 1.38 0.006 rs4793400 NM_000346.2 flanking_3UTR A C A C A C 2 083 1.20 0.024 rs7214255 NM_000346.2 flanking_5UTR A G A G A G 9 69 1.25 0.020 rs725545 NM_000346.2 flanking_5UTR G A G A G A 3 52 1.28 0.029 rs918077 NM_000346.2 flanking_3UTR C T C T G A SOX9 3 08 SRY (sex 1.32 0.020 rs12449701 NM_000346.2 flanking_3UTR G T G T determining region 3 64 Y)-box 9 1.26 0.028 rs6501473 NM_000346.2 flanking_5UTR C T C T 7 91 1.26 0.032 rs6501474 NM_000346.2 flanking_5UTR C T C T 1 83 1.26 0.031 rs6501475 NM_000346.2 flanking_5UTR C T C T 2 28 1.31 0.016 rs713158 NM_000346.2 flanking_5UTR T C T C 3 67 1.44 0.001 rs8068789 NM_000346.2 flanking_3UTR A G A G 5 764 1.27 0.025 rs975737 NM_000346.2 flanking_5UTR T C T C 3 89 SPARC 0.047 rs2033467 NM_003118.2 flanking_3UTR C T 1.18 C T A G secreted protein, 51 26

acidic, cysteine-rich 1.20 0.027 rs7719521 NM_003118.2 intron A C A C A C (osteonectin) 2 57 1.20 0.027 rs6861486 NM_003118.2 intron T C T C 2 57 STX1A 1.26 0.026 rs6951030 NM_004603.1 intron G T G T C A syntaxin 1A (brain) 9 11 SYN2 1.25 0.039 rs17669026 NM_133625.2 flanking_5UTR G A G A G A synapsin II 1 78 1.19 0.046 rs10861755 NM_005639.1 intron G A G A G A 5 17 1.19 0.046 rs1245819 NM_005639.1 intron G A G A G A 6 14 1.19 0.046 rs1268463 NM_005639.1 intron A C A C A C 5 17 SYT1 0.69 0.047 rs1569033 NM_005639.1 flanking_5UTR A G A G A G synaptotagmin I 39 94 1.19 0.047 rs10735416 NM_005639.1 flanking_5UTR T C T C 3 05 0.041 rs1245810 NM_005639.1 intron A C 1.2 A C 39 1.19 0.047 rs1245840 NM_005639.1 intron C T C T 3 81 SYT11 1.56 0.007 rs822519 NM_152280.2 intron G A G A synaptotagmin XI 8 779 TCF12 1.30 0.039 transcription factor rs10518890 NM_003205.3 intron T C T C A G 4 98 12 0.84 0.037 rs11970011 NM_003221.2 flanking_3UTR C T C T G A 37 72 0.82 0.020 rs2143081 NM_003221.2 flanking_5UTR C T C T G A 63 05 1.24 0.016 rs2635727 NM_003221.2 flanking_3UTR T C T C A G 8 21 0.033 rs2817352 NM_003221.2 flanking_3UTR A G 1.27 A G A G 66 1.26 0.037 rs4715209 NM_003221.2 flanking_3UTR C T C T G A 3 89 0.84 0.034 TFAP2B rs1178063 NM_003221.2 flanking_3UTR T C T C transcription factor 08 2 AP-2 beta 0.033 rs1178065 NM_003221.2 flanking_3UTR C T 1.27 C T (activating 66 enhancer binding 1.27 0.028 rs2636900 NM_003221.2 flanking_3UTR A G A G protein 2 beta) 9 6 1.26 0.034 rs2817334 NM_003221.2 flanking_3UTR C A C A 8 6 0.84 0.034 rs2817351 NM_003221.2 flanking_3UTR G A G A 11 21 0.84 0.037 rs3846904 NM_003221.2 flanking_3UTR A G A G 37 72 1.19 0.049 rs4605881 NM_003221.2 flanking_3UTR G A G A 3 87 0.85 0.049 rs571503 NM_003221.2 flanking_3UTR A G A G 16 29 TIMP2 TIMP 0.82 0.043 rs7502935 NM_003255.4 intron A G A G metallopeptidase 83 09 inhibitor 2 1.26 0.017 rs762886 NM_000362.4 flanking_5UTR A G A G A G TIMP3 4 33 TIMP 1.18 0.041 rs9862 NM_000362.4 coding C T C T A G metallopeptidase 3 75 inhibitor 3 0.83 0.032 rs137487 NM_000362.4 flanking_3UTR A G A G 42 65 TMED4 transmembrane 0.80 0.014 emp24 protein rs217361 NM_182547.2 intron A G A G A G 88 61 transport domain containing 4 TMEM109 1.20 0.032 transmembrane rs555835 NM_024092.1 coding T C T C A G 5 34 protein 109 1.39 0.012 rs11671255 NM_134447.1 flanking_3UTR A G A G A G URI1 5 16 URI1, prefoldin-like 1.18 0.036 chaperone rs1006584 NM_134447.1 flanking_3UTR A G A G 9 96 USP32 1.26 0.020 ubiquitin specific rs8079220 NM_032582.3 intron T C T C G A 4 85 peptidase 32 VEZF1 vascular 0.75 0.039 rs9904523 NM_007146.2 flanking_3UTR T C T C endothelial zinc 91 04 finger 1 27

NM_001018056 0.84 0.047 rs10812407 flanking_3UTR T C T C G A .1 64 36 NM_001018056 0.70 0.002 rs10967188 flanking_5UTR G T G T C A .1 74 602 NM_001018056 1.50 0.031 rs11999258 flanking_5UTR A G A G A G .1 8 44 NM_001018056 1.36 0.049 rs1355640 flanking_5UTR A G A G A G .1 4 68 NM_001018056 1.33 0.026 rs1996055 flanking_5UTR G A G A G A .1 4 75 VLDLR NM_001018056 1.26 0.014 very low density rs4741730 flanking_5UTR T G T G A C .1 2 01 lipoprotein receptor NM_001018056 1.21 0.041 rs1006575 flanking_5UTR A G A G .1 1 15 NM_001018056 0.81 0.016 rs10812382 intron A G A G .1 63 62 NM_001018056 1.40 0.047 rs10967081 flanking_5UTR A G A G .1 3 61 NM_001018056 1.21 0.036 rs1551411 intron T C T C .1 1 63 0.81 0.021 rs2242104 NM_003383.3 intron A G A G 65 36 0.80 0.017 rs1063856 NM_000552.2 coding G A G A G A 46 16 1.19 0.039 rs2239140 NM_000552.2 intron A G A G A G VWF 7 23 von Willebrand 0.80 0.032 factor rs11610629 NM_000552.2 intron C A C A 52 08 0.80 0.029 rs11611917 NM_000552.2 intron A G A G 23 36 1.25 0.008 rs13254653 NM_003406.2 flanking_3UTR T C T C A G 3 663 0.76 0.014 rs2135121 NM_003406.2 flanking_3UTR T C T C A G 88 11 0.81 0.016 rs3134373 NM_003406.2 flanking_3UTR A G A G A G YWHAZ 41 5 tyrosine 3- 1.21 0.018 rs4734487 NM_003406.2 flanking_3UTR G A G A A G monooxygenase/try 9 35 ptophan 5- 1.19 0.044 monooxygenase rs4734497 NM_003406.2 intron C T C T G A activation protein, 4 43 zeta polypeptide 1.20 0.042 rs7816400 NM_003406.2 flanking_3UTR A G A G G A 1 23 0.74 0.000 rs3134369 NM_003406.2 flanking_3UTR G A G A 74 535 1.22 0.025 rs4734494 NM_003406.2 flanking_3UTR C T C T 5 09

28

Table S5: Epistasis testing of top candidate genes for alcoholism (n=11). Best p- value SNPs from discovery GWAS1 tested in independent cohort 2. The top epistatic interactions in GWAS2 are depicted in bold and underlined and highlighted in blue. As a caveat, the p-value was not corrected for multiple comparisons. The corresponding genes merit future follow-up work to elucidate the biological and pathophysiological relevance of their interactions.

DRD2 GFAP GNAI1 GRM3 MBP MOBP MOG RXRG SNCA SYT1 TIMP2 DRD2 0.4546 0.9179 0.5273 0.7124 0.7963 0.1758 0.8905 0.6744 0.01796 0.6186 GFAP 0.4546 0.5293 0.08965 0.773 0.4843 0.3624 0.6137 0.9092 0.2335 0.06536 GNAI1 0.9179 0.5293 0.1319 0.07375 0.1498 0.4557 0.09469 0.7188 0.1391 0.05482 GRM3 0.5273 0.08965 0.1319 0.1263 0.1912 0.3744 0.5735 0.1586 0.6271 0.937 MBP 0.7124 0.773 0.07375 0.1263 0.7845 0.4333 0.7912 0.8075 0.5515 0.3699 MOBP 0.7963 0.4843 0.1498 0.1912 0.7845 0.8991 0.3103 0.8511 0.2651 0.01883 MOG 0.1758 0.3624 0.4557 0.3744 0.4333 0.8991 0.4422 0.3425 0.4614 0.8411 RXRG 0.8905 0.6137 0.09469 0.5735 0.7912 0.3103 0.4422 0.03684 0.2641 0.4612 SNCA 0.6744 0.9092 0.7188 0.1586 0.8075 0.8511 0.3425 0.03684 0.9097 0.3429 SYT1 0.01796 0.2335 0.1391 0.6271 0.5515 0.2651 0.4614 0.2641 0.9097 0.6851 TIMP2 0.6186 0.06536 0.05482 0.937 0.3699 0.01883 0.8411 0.4612 0.3429 0.6851

29

Table S6. Genotype for SNCA SNP RS17015888 in test cohorts 3 (alcohol dependence) and 4 (alcohol abuse). The G allele is the affected allele from the discovery GWAS.

SNCA Genotype %

G/G 678/1261 53.8%

Control G/A 467/1261 37.0%

A/A 116/1261 9.2%

G/G 1724/2765 62.4%

Alcohol Dependent G/A 808/2765 29.2%

A/A 233/2765 8.4%

G/G 366/600 61.0%

Alcohol Abuse G/A 175/600 29.2%

A/A 59/600 9.8% 30

Table S7. Genetic risk prediction score (GRPS) panels for A. bipolar disorder and B. schizophrenia, tested in alcoholism (Figure 6). The affected allele was assigned based on the original bipolar and schizophrenia datasets.

A. GRPS-BP Affected Gene SNP allele rs11077135 A A2BP1 rs8046170 A ALDH1A1 rs7873724 G APP rs2829984 G ARNTL rs11022781 T ATXN1 rs909786 C CACNA1A rs16016 A CAMK2A rs4958469 A CD44 rs353615 G CDH13 rs931408 C rs932918 C CUGBP2 rs2378991 T DAPK1 rs3124236 T rs1171062 G DCLK1 rs7994174 C rs10492555 T DIAPH1 rs11954658 C rs9431714 G DISC1 rs821577 A GNAI1 rs2523189 T HTR2A rs977003 C KCND2 rs10268591 G KLF12 rs4885151 G MBNL2 rs6491345 G MBP rs12967023 A MYT1L rs17039396 G NCAM1 rs4366519 T rs11421 C NDUFS2 rs5085 G rs17209251 A NR3C1 rs2918417 C NRCAM rs3763461 G OPRM1 rs2010884 T PRKCE rs4557033 A 31

rs2711293 T PTN rs6977749 G PTPRM rs4121619 G PTPRT rs1016071 G RYR3 rs744776 A SYN3 rs3788467 C TIAM1 rs845945 A TSHZ2 rs169270 A

B. GRPS-SZ Affecte Gene SNP d allele

rs881897 C ADCYAP1 rs8091765 A

rs789046 A

rs4948256 A

rs7087489 A

ANK3 rs10761507 C rs11813307 G

rs12767186 G

rs4948422 C

rs6265 C

rs11030104 A BDNF rs11030119 A

rs11030182 T

rs11030066 T CNR1 rs9362473 T

COMT rs1544325 A

DISC1 rs9728261 T 32

rs9782927 G

rs12087592 C

DRD2 rs12791990 C rs4648318 C

DTNBP1 rs1539422 T rs1935784 A

FABP7 rs9490546 G

rs8037461 C

rs8027455 A

rs1435831 C

GABRB3 rs8025575 G rs11161309 C

rs11854349 G

rs4906680 C

rs500951 T

GNB1L rs17745302 T rs13057910 T

rs17115481 A

rs4958687 A

rs4285285 A

rs10515671 A GRIA1 rs2199123 A

rs12656429 C

rs160163 T

rs17113267 A

rs2973138 T 33

rs6877008 C

rs1493383 T

rs10039253 C

rs17115298 G

rs4958560 T

rs4398624 T

rs1542485 G

rs1422337 A

rs17568427 G

rs11951398 T

rs286969 A

rs7702336 A GRIA4 rs17104589 A

rs4363703 T

rs11055597 T

rs11055930 T

GRIN2B rs2300268 A rs7306014 G

rs2300252 C

rs10772769 A

rs10845826 A

rs992259 A

rs7396702 C

GRM5 rs12362135 G rs10831155 C

rs982010 C

rs308765 T 34

rs541279 G

rs16914531 T

HINT1 rs7716702 A rs1363696 G HTR2A rs1805055 T

rs3772753 T

rs16835783 A KALRN rs3772790 G

rs13087377 G

rs7621976 T

KIF2A rs6864793 C rs414568 T

rs12959006 T MBP rs12956305 C

rs3900176 C

rs1768141 G

MOBP rs538867 T rs2233204 C

rs9835143 T

rs1945101 G

rs10891375 G NCAM1 rs1006826 C

rs2212450 C

rs7105462 G

rs1784773 T NDUFV2 rs1573321 T

rs16954628 G 35

rs11081446 A

rs12465886 G

NR4A2 rs2691775 A rs2691787 G

rs10175502 C

rs4142284 C NRCAM rs1990713 C

rs2284284 G

rs12541516 C

rs954009 T

rs16879809 T

rs4535704 A

rs4035323 A NRG1 rs7818821 G

rs11775675 C

rs10954887 G

rs17642273 A

rs3847131 G

rs1623372 A

rs1321172 C

rs6700403 T PDE4B rs17128076 G

rs524770 A

rs4492586 C

PRKCA rs16959714 T rs4791022 T

RELN rs2711865 G 36

rs2245617 T

RGS4 rs4657235 T rs951438 G

rs12273644 C SLC1A2 rs17379710 T

rs736374 A

rs6032783 T SNAP25 rs362585 G

rs362574 A

SYN2 rs2960421 A rs3755724 C

rs17594665 A

rs10401120 T

rs1377242 C

TCF4 rs4800933 C rs10164195 A

rs2588477 A

rs9949821 A

rs17089826 C

rs360414 C

rs11927009 T

rs902952 A

TNIK rs9846083 C rs1353021 A

rs6788198 G

rs11916004 A

rs9844740 C

37

Figure S1. Gene size vs. CFG score. No enrichment of larger –size genes by CFG. Top candidate genes for alcoholism (n=135).

38

Figure S2. Genetic Risk Prediction ROC curves