Toward understanding the genetics of alcohol drinking through transcriptome meta-analysis

Megan K. Mulligan*†‡, Igor Ponomarev*†‡, Robert J. Hitzemann§, John K. Belknap§, Boris Tabakoff¶, R. Adron Harris*†, John C. Crabbe§, Yuri A. Blednov*†, Nicholas J. Grahameʈ, Tamara J. Phillips§, Deborah A. Finn§, Paula L. Hoffman¶, Vishwanath R. Iyer*,**, George F. Koob††, and Susan E. Bergeson*†‡‡

*Waggoner Center for Alcohol and Addiction Research and Sections of †Neurobiology and **Molecular Genetics and Microbiology, University of Texas, Austin, TX 78712; §Department of Veterans Affairs Medical Center, Portland Alcohol Research Center, and Department of Behavioral Neuroscience, Oregon Health and Science University, Portland, OR 97239; ¶University of Colorado Health Sciences Center, Aurora, CO 80045; ʈIndiana University School of Medicine, Indianapolis, IN 46202; and ††The Scripps Research Institute, La Jolla, CA 92037

Edited by Floyd E. Bloom, The Scripps Research Institute, La Jolla, CA, and approved February 9, 2006 (received for review November 30, 2005) Much evidence from studies in humans and animals supports the We propose to identify candidate that contribute to the hypothesis that alcohol addiction is a complex disease with both genetic propensity to drink by combining whole-brain expres- hereditary and environmental influences. Molecular determinants sion databases of genetic mouse models with differences in volun- of excessive alcohol consumption are difficult to study in humans. tary alcohol consumption. Most importantly, our approach should However, several rodent models show a high or low degree of allow identification of small- or moderate-effect genes. Three alcohol preference, which provides a unique opportunity to ap- different sets of oppositely selected lines bred for high and low proach the molecular complexities underlying the genetic predis- amounts of alcohol drinking, five inbred strains known to differ in position to drink alcohol. Microarray analyses of brain gene ex- voluntary alcohol consumption, and a hybrid strain recently shown pression in three selected lines, and six isogenic strains of mice to have the greatest degree of voluntary alcohol consumption of any known to differ markedly in voluntary alcohol consumption pro- known mouse genotype (4) were used (Table 1). Global measure- vided >4.5 million data points for a meta-analysis. A total of 107 ment of brain gene expression in each of the mouse models allowed arrays were obtained and arranged into six experimental data sets, us to explore which transcripts are consistently changed in different allowing the identification of 3,800 unique genes significantly and genetic models of high and low amounts of alcohol intake. It is consistently changed between all models of high or low amounts important to note that the mice used for array analysis were not of alcohol consumption. Several functional groups, including exposed to alcohol; thus, we are defining the transcriptional sig- mitogen-activated kinase signaling and transcription reg- natures of genetic predisposition to high and low levels of alcohol ulation pathways, were found to be significantly overrepresented consumption. Expression analysis of an additional mouse line congenic for a section of 9, which is known to contain and may play an important role in establishing a high level of genes that regulate alcohol consumption (3), allowed us to define voluntary alcohol drinking in these mouse models. Data from the candidate quantitative trait genes (QTGs) for this quantitative trait general meta-analysis was further filtered by a congenic strain (QTL). microarray set, from which cis-regulated candidate genes for an Analysis across data sets for common changes in expression alcohol preference quantitative trait locus on chromosome 9 were provides more statistical power to detect small and reliable changes identified: Arhgef12, Carm1, Cryab, Cox5a, Dlat, Fxyd6, Limd1, than analysis of any one or two of the data sets. Meta-analyses of Nicn1, Nmnat3, Pknox2, Rbp1, Sc5d, Scn4b, Tcf12, Vps11, and diverse gene expression data sets were recently used to successfully Zfp291 and four ESTs. The present study demonstrates the use of uncover genes related to carcinogenic phenotypes (10), but a similar (i) a microarray meta-analysis to analyze a behavioral phenotype approach has not been used for a neurobehavioral trait (for a review (in this case, alcohol preference) and (ii) a congenic strain for of meta-analyses of microarray data, see ref. 11). In this study, we identification of cis regulation. employ a meta-analysis of microarray data from different genetic models of high and low levels of alcohol consumption to define alcoholism ͉ gene expression ͉ microarray functional pathways and individual genes that may determine a predisposition for a high degree of alcohol intake. n understanding of the molecular mechanisms underlying the Agenetic propensity toward excessive alcohol consumption is Results and Discussion crucial for the development of new treatments for alcoholism. Use of Cohen’s d as an Effect-Size Measure Allows Comparison of Animal models for alcohol-related traits provide an important Microarray Experiments from Diverse Sources. Five experimental opportunity to explore mechanisms responsible for different as- microarray (Affymetrix and custom cDNA) data sets of mice pects of the uniquely human disease (1). In particular, selected lines genetically divergent in voluntary alcohol consumption were ana- and inbred strains of mice with a divergence in voluntary alcohol lyzed for differential gene expression with the Cohen’s d statistic drinking represent valuable tools to dissect the genetic components (12). The initial meta-analysis comprised 13 individual data sets of alcoholism (2, 6, 8, 9, §§). However, each model has different from three groups of selected lines and one group of six isogenic advantages. Inbred strains are homogeneous at all alleles, and

genetic (allelic) differences between strains lead to observed phe- Conflict of interest statement: No conflicts declared. notypic differences. Several inbred strains have been well charac- This paper was submitted directly (Track II) to the PNAS office. terized for numerous alcohol-related traits, but any two strains will Abbreviations: B6, C57BL͞6J; D2, DBA͞2J; QTG, quantitative trait gene; QTL, quantitative differ for numerous other phenotypes (including gene expression), trait locus; TF, transcription factor; TFBS, TF binding site(s). making identification of the specific genes for a given trait difficult. ‡M.K.M. and I.P. contributed equally to this work. Selected lines exploit homozygosity for large-effect and some ‡‡To whom correspondence should be addressed at: Waggoner Center for Alcohol and small-effect allelic variants that contribute to the divergent prefer- Addiction Research and Section of Neurobiology, University of Texas, A4800, ence for drinking. Although the resulting genome remains relatively MBB1.138AA, 1 University Station, Austin, TX 78712. E-mail: [email protected]. polymorphic, as selection continues, fixation of genes unrelated to §§Behm, A. L. L., Li, T. K. & Grahame, N. (2003) Alcohol. Clin. Exp. Res. 27, 49A, abstr. the selected phenotype occurs because of drift. © 2006 by The National Academy of Sciences of the USA

6368–6373 ͉ PNAS ͉ April 18, 2006 ͉ vol. 103 ͉ no. 16 www.pnas.org͞cgi͞doi͞10.1073͞pnas.0510188103 Downloaded by guest on September 29, 2021 Table 1. Mouse models and microarrays used and inbred strains, compared with the pairs of selected lines, likely Data reflects differences in genetic background, an enrichment of alleles set Study Platform Strain͞line Sex N linked to alcohol preference in selected lines, and a relative randomness of such alleles in inbred strains. 1 STS (B6.D2) Affymetrix High STS F͞M10 An overlap in regulation between mouse models sharing the Low STS F͞M10 same alcohol phenotype is readily detectable despite the use of data ͞ 2 HAP LAP1 Affymetrix HAP1 M 6 sets from different animal models and platforms. The distribution LAP1 M 6 of z test P values for the meta-analysis of the four data sets is skewed 3 HAP͞LAP2 Affymetrix HAP2 M 6 LAP2 M 6 to the left, indicating a high number of significantly coexpressed 4 Inbred strains Affymetrix B6 M 6 genes detected by the meta-analysis (Fig. 1B) and suggesting a low BalbC M 6 incidence of false positives consistent with the measured Q values. D2 M 6 Thus, the use of Cohen’s d in our meta-analysis allows for the LP͞NJ M 6 detection of transcripts significantly divergent between genetically 5 Inbred strains Custom cDNA B6 F 5 distinct mouse models displaying opposite levels of alcohol FVB͞NJ F 5 consumption. FVB.B6 hybrid F 5 6 B6.D2 congenic 9 Custom cDNA B6 F͞M12 Meta-Analysis Identifies Candidate Genes for High and Low Amounts B6.D2 Congenic 9 F͞M12 of Alcohol Consumption. Use of Cohen’s d values between the mouse Three selected lines (data sets 1–3) and the inbred strains (data sets 4 and strains and lines displaying high and low levels of alcohol drinking 5) were used to form four groups for the meta-analysis. The congenic line across data sets resulted in the identification of 5,182 significant (data set 6) was used as a filter to select genes from the meta-analysis. See (d Ͻ 0.5 and Q Ͻ 0.05) differentially expressed transcripts, Materials and Methods for details. STS, short-term selection; F, female; M, representing 3,800 unique genes (data for all transcripts is shown in male. Data set 1 was contributed by R.J.H., J.C.C., and J.K.B. Data sets 2 and 3 Table 2, which is published as supporting information on the PNAS were contributed by B.T. Data set 4 was contributed by R.J.H. and J.K.B. Data Ͻ sets 5 and 6 were contributed by S.E.B. web site). The highly significant gene expression changes (Q 0.01) across the four data sets are noticeably consistent when seen in pseudocolor raster display (Fig. 2A). Red and green indicate strains (Table 1). (A data set consisting of C57BL͞6J (B6).DBA͞2J transcripts overexpressed and underexpressed in animals displaying (D2) congenic 9 animals is used later as a filter for the initial high degrees of alcohol consumption, respectively. In general, more meta-analysis.) Two advantages of the Cohen’s d approach are transcripts appear to be up-regulated in models with high levels of important. First, data from different platforms and laboratories can consumption relative to models with low levels of consumption. be combined without the use of normalization. Second, small The sheer number of significant differences suggests that nu- differences that are consistent in direction of change can be merous pathways in the brain may be distinct between mouse detected, even though these changes might not be detected in any models with different levels of alcohol consumption. Indeed, the 75   single data set. genes with the largest effect size ( d ) and lowest Q values (Fig. 2B) After calculating Cohen’s d for each data set (Fig. 1A), a pairwise fall into broad categories of cellular homeostasis and neuronal comparison of common changes was applied to test the compati- function. Several genes of interest expressed higher in preferring models include: B2m, Man2b1, Scn4b, Mapre1, Prkce, and Sst, which bility of the four data sets. The number of transcripts regulated in ͞ the same direction between each pair of data sets is significantly function in immunity cellular defense, glycosylation, ion channel activity, microtubule binding͞dynamics, intracellular signaling cas- greater than the number ascribed to chance for five of six pairs. ␤ When all four data sets are compared, the number of transcripts cades, and neuronal signaling, respectively. 2-microglobulin may regulated in the same direction is nearly 2-fold over the chance level play a role in neuronal plasticity because B2m knockout mice exhibit abnormal synaptic connections as well as altered patterns of (P Ͻ 0.00001, ␹2). The lower similarity seen between selected lines long-term potentiation and long-term depression (13). Scn4b is an auxiliary ␤-subunit of voltage-gated sodium channels, which can alter channel function and interaction between the ion channel and other (14). In cerebellar Purkinje cells, Scn4b may be required for high-frequency firing of neurons containing voltage- gated sodium channels (15). A role for Prkce (PKC-␧) in ethanol consumption is supported by the observation that Prkce knockout mice consume less alcohol compared with controls in a two-bottle choice drinking paradigm, and introduction of conditional Prkce expression in these mice was sufficient to restore alcohol drinking levels (16). Gnb1, Hmgn2, and Hyou function in GTPase activity͞ signal transduction, DNA binding͞packaging, and the cellular response to stress, respectively, and are significantly down-regulated in mice preferring alcohol. The overall results of the meta-analysis show the complexity of the alcohol drinking phenotype as indicated by the surprisingly large number of consistent, small increases or

decreases in gene expression between the mouse models. NEUROSCIENCE

In Silico Promoter and Functional Analyses Uncover Potential Signif- icant Regulation and Pathways, Respectively. Nearly 10% of the genes in the mammalian genome were detected as divergent Fig. 1. Compatibility of data sets for meta-analysis. (A) Number of transcripts between the alcohol preference lines and strains of mice. One regulated in the same direction between any two (left y axis) or all four (right possible explanation is linkage disequilibrium, the tendency of y axis) data sets (P Ͻ 0.0001, ␹2). n.s., not significant. (B) Frequency distribution closely linked genes to be coexpressed. Another explanation is that of z test P values (x axis) is shown. The solid line represents theoretical chance some of the risk-conferring QTG are transcriptional regulators. distribution. IS, Inbred strain. Although current mouse in silico promoter analysis tools are not yet

Mulligan et al. PNAS ͉ April 18, 2006 ͉ vol. 103 ͉ no. 16 ͉ 6369 Downloaded by guest on September 29, 2021 comprehensive on a genomic scale, the oPOSSUM database (17) is a valuable tool for identifying overrepresented transcription factor (TF) binding sites (TFBS) in promoter regions near a gene’s transcription start site. oPOSSUM analysis is very conservative and has strict requirements; simulation studies have suggested that it detects few false positives (17). TFBS for Zfp143 (Staf, selenorysteine RNA gene transcription activating factor) were identified as overrepresented in the up- regulated genes (for high amounts of alcohol drinking). The Staf TFBS was present in the upstream (2,000 bp) promoter regions of 64 genes. Importantly, Zfp143 expression was also found to be significantly up-regulated across the models with high drinking levels. Similarly, the TFBS for the fork-head box TF Foxa2 (HNF- 3␤, hepatocyte nuclear factor 3␤) was detected as overrepresented across the down-regulated gene group and present in 146 genes, and Foxa2 was found to be significantly down-regulated in the meta- analysis. In both cases, the overrepresented TF has the same pattern of expression as its target genes. It is easy to imagine, as evidenced by the number of gene targets for Zfp143 and Foxa2, how small changes in the levels of a TF could cause profound differences in the brain transcriptome and may account for the large number of gene expression changes (3,800 genes) detected in this meta- analysis. Although neither Zfp143 or Foxa2 were identified as cis-regulated and are not primary candidate genes for alcohol preference, an understanding of how changes in the expression level and͞or activity of transcriptional regulators affects the predisposi- tion toward alcohol consumption phenotypes warrants further investigation. To better understand possible global interactions of the divergent genes, overrepresentation analyses for function were completed. Such analyses apply statistical methods to estimate whether some biological function͞pathway is represented in a given data set more than expected by chance. Using the WEBGESTALT search engine, which queries different databases, including KEGG, BioCarta, and GO () (18–20), functional group analysis revealed that kinase and signaling pathways were overrepresented in genes divergent between alcohol-preferring and nonpreferring genotypes. The result supports previous suggestions that the mitogen-activated protein kinase, NF-␬B, and IL-6 signaling pathways (and CREBBP) are sensitive to alcohol (21). Known functional inter- actions (summarized in Fig. 3) show that gene transcription path- ways predominant in the overrepresented categories and apopto- sis͞antiapoptosis, neurodegeneration, and neuroplasticity pathways were also functionally overrepresented (for all results, see Fig. 5 and Table 3, which are published as supporting information on the PNAS web site). In silico functional analyses are powerful tools for hypothesis generation, especially for complex phenotypes, such as predisposition toward alcohol drinking. It should now be feasible, for example, to test the effect of candidate genes in Fig. 3 for their impact on alcohol consumption.

Several Genes Identified in the Meta-Analysis Are Located Within QTLs for Human Alcoholism Susceptibility. A two-pronged approach was taken to use known QTLs in both human and rodent research. First, the list of Ϸ3,800 significant genes was annotated using SOURCE (22) for mouse and human chromosomal location. Two separate lists were created by using Feighner, DSM-III-R, and ICD-10 criteria (23): one list for genes found within the boundaries of mouse QTLs for preference drinking (3) and another for genes within the boundaries of alcoholism susceptibility QTLs in humans. Fig. 2. Visual representation of microarray data. Columns represent microar- The lists were used as filters to identify overlap of significant genes ray databases used in the meta-analysis, and rows represent transcripts. The from the meta-analysis with QTLs between the two species (Table Ͻ filtered criterion was Q 0.01, and genes were sorted by effect size. (A) 4, which is published as supporting information on the PNAS web Transcripts (2,697) are listed from positive to negative Cohen’s d values (effect size). Red indicates a positive effect size and higher expression in preferring site). Thirty-six genes met criteria as candidate genes with 11 mice, and green indicates a negative effect size and lower expression in (Gstm1, Gstm2, Gstm5, Il1f5, Il1f6, Il1f8, Il1rn, S100a1, S100a10, preferring mice. Black indicates an effect size near zero. (B) The 75 unique and S100a13) belonging to only three gene families: GSTM (glu- transcripts with the highest absolute average effect size (Q Ͻ 0.05, d Ͼ 1.94). tatione S-transferase activity), S100 class calcium-binding, and STS, short-term selection; IS, Inbred strains. cytokine͞proinflammatory activity (IL-1) (Table 5, which is pub-

6370 ͉ www.pnas.org͞cgi͞doi͞10.1073͞pnas.0510188103 Mulligan et al. Downloaded by guest on September 29, 2021 Fig. 3. Complexity of functional group interaction. Genes present in at least three overrepresented functional groups͞pathways from BioCarta, KEGG, and Gene Ontology are shown. Larger font sizes represent either smaller P values for functional groups͞pathways (from overrepresentation analysis) or larger effect size for individual genes (P Ͻ 0.001, Q Ͻ 0.01, and d Ͼ 0.5 for all genes). Lines connect gene symbols with relevant functional groups. Arrows indicate pathway connections. For full gene names, see Table 2.

lished as supporting information on the PNAS web site). GSTM1 the meta-analysis that passed the congenic filter were mapped on alleles have been shown to be associated with increased alcoholism chromosome 9 along with their relative effect size (Fig. 4). These and alcoholic liver disease (24), and recent results from mice with genes include 16 known genes (Arhgef12, Carm1, Cryab, Cox5a, mutant chemokine receptors suggest a role of inflammatory path- Dlat, Fxyd6, Limd1, Nicn1, Nmnat3, Pknox2, Rbp1, Sc5d, Scn4b, ways in alcohol consumption (25). QTL analyses for alcohol Tcf12, Vps11, and Zfp291) and four expressed sequence tags preference in rats (26, 27) (not listed in Table 5) show two syntenic (2810423O19Rik, 1110032A03Rik, 5730439E10Rik, and QTLs with mouse, and the chromosome 2 QTL is syntenic between 9030425E11Rik). Four of these (Pknox2, Scn4b, 9030425E11Rik, all three species, including humans. GST8-8, the implicated QTG and 1110032A03Rik) are correlated with alcohol preference in a within the QTL on rat chromosome 8 (28), is syntenic on mouse BXD recombinant inbred population (5). Interestingly, some chromosome 9. However, Gsta4, the mouse alias for GST8-8, was primary candidate genes also have putative transcriptional ac- not significantly divergent in this meta-analysis. Interestingly, sev- tivity: Carm1, Pknox2, Tcf12, Zfp29, and Limd1. Although their eral other significantly regulated glutatione S-transferase genes targets are not yet characterized, each may explain some of the found on mouse chromosome 1 and human were in consistently divergent gene expression observed in the meta- syntenic QTL regions for both species (again, see Table 5). analysis. The 20 genes identified with the congenic filter are It is important to note that, although there is considerable potential QTGs underlying alcohol preference and provide synteny between the genomes of rodents and man, the degree of exciting avenues of investigation into the complex behavior of evolutionary overlap between polymorphisms or alleles of genes is alcohol preference. probably much lower. Thus, translation of data from rodent models to humans is more likely to apply on a pathway level rather than as Conclusions exact mutations in specific genes. The species differences in gluta- Predisposition to excessive alcohol consumption is likely a key tione S-transferase may, in fact, be such an example. aspect of human alcoholism, but molecular determinants of this trait are difficult to study in humans. Our work provides a Use of a B6.D2 Congenic 9 Data Set Filter Identifies cis-Regulated microarray-based meta-analysis of a behavioral phenotype and Genes Occupying a QTL for Strong Alcohol Preference in Mice. QTL uses effect-size measures. The Cohen’s d metric eliminated the mapping and construction of congenic mice based on such maps need for standardization across array platforms and experi- provides another genetic approach to defining changes in gene ments, allowing identification of consistently different tran- expression linked to propensity for high levels of alcohol consump- scripts between alcohol-preferring and nonpreferring genotypes, tion. Several mouse QTL experiments show that chromosome 9 even if the transcriptional differences were small with respect to contains genes that contribute to the genetic propensity toward magnitude or significance or both.

increased alcohol intake. This QTL was found to be the largest of The meta-analysis provided 5,182 targets, including 3,800 non- NEUROSCIENCE several identified in B6- and D2-derived populations (3). Therefore, overlapping genes for further study. The significant differences B6 congenic mice containing a region that captured the D2 alleles between the genotypes for divergent amounts of alcohol drinking between 9 and 58 cM on chromosome 9 were analyzed for gene represent numerous functional groups covering a large range of expression and compared against the B6 background strain. The cellular pathways as well as many genes (Ϸ25%) whose functions differentially expressed genes located on chromosome 9 are by are not yet characterized. Thus, some genes whose functions are default cis-regulated and thus are QTG candidates. currently unknown are also likely to be important determinants of To identify cis-acting candidate genes for the chromosome 9 alcohol consumption. The 75 genes with the largest effect size (Q Ͻ alcohol preference QTL, the current meta-analysis results were 0.05, d Ͼ 1.94) (see Fig. 2) fall into the broad categories of cellular filtered with the congenic data set. Significant genes detected by homeostasis and neuronal function. Differences in the ability to

Mulligan et al. PNAS ͉ April 18, 2006 ͉ vol. 103 ͉ no. 16 ͉ 6371 Downloaded by guest on September 29, 2021 proaches provides specific candidate genes for one QTL for alcohol preference. These genes and most of the key functional groups revealed by the current study indicate our ignorance about the molecular mechanisms driving alcohol consumption and, most importantly, the opportunities to reveal mechanisms through large- scale genomic screening approaches. Materials and Methods Animal Husbandry. Naı¨ve, adult mice (60–100 d old) had 24 h ad libitum access to rodent chow, water, and 12 h:12 h lighting. All animals were housed and treated according to the National Insti- tutes of Health guidelines for the use and care of laboratory animals (36) and approved Institutional Animal Care and Use Committee protocols at each respective institution. Whole-brain total RNA from naı¨ve mice was used for all array analyses. Microarray hybridizations for the HAP͞LAP lines, inbred strains, and congenic 9 strains were completed on individual mice. The lines for high and low levels of short-term selection were hybridized with a pool of three samples for each represented N value (n ϭ 10 mice per line). Details for each strain are listed below.

Short-Term Selection Lines. Reciprocally crossed, individually housed B6ϫD2 and D2ϫB6 F2 mice (8–10 weeks of age) were initially given water in a classical two-bottle choice paradigm. One tube was then replaced with a 3% (vol͞vol) solution of ethanol in tap water followed by 10% ethanol (4 d each). The selection index for drinking was based on the 10% concentration expressed in grams per kilogram per day for each of the last days immediately Fig. 4. Congenic 9 expression analysis. Meta-analysis results were filtered by before a bottle position change (i.e., the average of days 8 and 10). significant cis-regulation on chromosome 9. Gene names and their physical The 10 highest-scoring and 10 lowest-scoring mice of each gender location are indicated on the left, and effect size and direction are shown on were used as breeding pairs to initiate the selection lines with high the right. Asterisks indicate a correlation between gene expression and the and low amounts of drinking, respectively. Bidirectional selection preference for drinking phenotype in a panel of BXD recombinant inbred strains (5) generated with the WebQTL (www.genenetwork.org) Integrative continued similarly for each of three generations; the highest of the Neuroscience Initiative on Alcoholism Brain mRNA M430 (April 2005 release) high line and the lowest of the low line were used to breed the next PDNN (Positional Dependent Nearest Neighbor) database [referred to as INIA generation of the high and low selection lines, respectively. Naı¨ve Brain mRNA M430 (Apr05) PDNN by WebQTL] (7). mice with high and low levels of drinking from selection generation S3 were used in the array experiment.

maintain or reset homeostasis and adjust neuronal function in brain HAP͞LAP. Two replicate lines of HAP and LAP mice were originally likely underlie many aspects of alcohol responses. Understanding generated from HS͞Ibg mice (2) based on bidirectional selection these differences will be key to a better understanding of alcohol- outcome of two-bottle choice 10% alcohol solution vs. water ism. For instance, it is possible that expression differences could consumption (6, §§). Male HAP1͞LAP1 and HAP2͞LAP2 mice substantially affect developmental and adaptive brain neurocir- (70–90 d old) were used from generations 26 and 19, respectively, cuitry relating to alcohol preference. for Affymetrix array analysis. It should be noted that our approach is designed to detect common differences found across the animal models used but will Inbred Strains. Male BALB͞cJ, B6, D2, and LP͞NJ inbred mice not detect differences specific for one model or one sex, nor will it were raised at the Veterans Affairs Medical Center in Portland, detect mutations that affect alcohol preference for which no OR. LP͞NJ mice were tested for two-bottle choice alcohol concomitant change in gene expression results. Unique but poten- drinking and found to drink significantly less alcohol than B6 tially important determinants of drinking require detailed analysis mice (D.A.F., unpublished data). Female B6, FVB͞NJ, and ͞ of each data set, which has been accomplished for the HAP LAP FVBϫB6 F1 mice were raised at the University of Texas. F1 mice (mouse lines demonstrating high͞low levels of alcohol preference, were recently found to drink significantly more alcohol than the respectively; see ref. 6) selected lines (29). progenitor strains when tested in an accelerating concentration Our meta-analytic approach using an effect-size metric should be two-bottle choice paradigm (4). applicable to new data sets. Two-bottle choice alcohol drinking is the most widely used model of voluntary consumption, but new Congenic Strain. A D2 QTL region for alcohol preference from 9 to models are emerging (30–33). It would be interesting to determine 58 cM on chromosome 9 was introgressed onto a B6 genomic the generality of our results to other measures of alcohol consump- background through Ն12 generations of breeder selection by tion. Follow-up studies for candidate genes identified in previous Massachusetts Institute of Technology microsatellite marker array experiments have led to the understanding of molecular genotyping and backcrossing to B6. The B6.D2Ch9 mice drink mechanisms for other behaviors (34, 35) and is likely for our significantly less than B6 control mice (difference Ϸ25%, T.J.P., candidate genes as well. unpublished data). mRNA was isolated from jointly housed and In summary, this meta-analysis shows that distinct mouse models age-matched (70–100 days) congenic and B6 mice. with genetic predisposition for elevated alcohol consumption have very consistent and reproducible differences in brain gene expres- Oligonucleotide Microarrays. Samples containing at least 10 ␮gof sion, although they have never consumed alcohol. Furthermore, the total RNA were hybridized after manufacturer’s protocol (Af- differences extend to thousands of transcripts forming many func- fymetrix, Santa Clara, CA). Details for HAP͞LAP mice have been tional groups. Combining the meta-analysis with congenic ap- published in ref. 29. RNA from STS mice and mice from inbred

6372 ͉ www.pnas.org͞cgi͞doi͞10.1073͞pnas.0510188103 Mulligan et al. Downloaded by guest on September 29, 2021 strains BALB͞cJ, B6, D2, and LP͞NJ arrays was hybridized to Congenic 9 Data Filter. We used microarray data obtained from the Affymetrix 430 A and B chips and analyzed at the Oregon Health B6.D2Ch9 congenic strain to filter transcripts detected by meta- and Science University Gene Microarray Shared Resource Facility analysis as regulated significantly between ethanol-preferring (for more details, see Supporting Materials and Methods, which is and nonpreferring genotypes and located within the introgressed published as supporting information on the PNAS web site). region containing known ethanol preference QTLs on mouse chromosome 9. Specifically, chromosome 9 transcripts signifi- Custom cDNA Microarrays. PCR products of clones from several cantly regulated between the congenic and control lines (P Ͻ cDNA libraries (see Supporting Materials and Methods for details) 0.05, t test) were identified and matched for direction of change were printed in the University of Texas array core facility on with data from the meta-analysis. Only genes detected by both poly-L-lysine-coated (Sigma) microscope slides (Erie Scientific, approaches were selected as putative candidates for cis- Portsmouth, NH) using a custom-built robotic arrayer as described regulation of alcohol preference. in ref. 37. Microarray hybridizations were performed with the Array 350 microarray labeling kit according to the manufacturer’s pro- TFBS Overrepresentation Analysis. The oPOSSUM database (www. tocol (Genisphere, Hatfield, PA). cisreg.ca͞cgi-bin͞oPOSSUM͞opossum) (17) was used to analyze TFBS in genes that were significantly (Q Ͻ 0.05 and d Ն 0.5) up- Meta-Analysis. Five data sets listed 1–5 in Table 1 were used for or down-regulated in alcohol-preferring vs. nonpreferring mice. meta-analysis. Alcohol-preferring and nonpreferring mice were The two lists had 2,388 and 1,580 genes with 1,011 and 546 first compared by using a parametric statistical test. Student’s t test recognized as orthologs in oPOSSUM for up- vs. down-regulation for independent samples was used to compare high and low selected respectively. TFBS overrepresentation was determined based on lines (studies 1–3). Both sets of inbred strains (studies 4 and 5) were one-tailed Fisher exact probabilities and the ranking of z scores. analyzed independently by an F test using single contrasts (see After the initial analysis, the genes for the TFs themselves were Supporting Materials and Methods for additional details). The individually checked to see whether (i) they were present on the resulting t and F values for each transcript were then used to arrays and (ii) transcription differences were consistent with the estimate effect size by calculating a Cohen’s d statistic, which is the promoter TFBS analysis. difference between the groups expressed in pooled within-group SD units (12). The following formulas were applied: d ϭ 2t͌͞df ϭ Functional Overrepresentation Analysis. Transcripts passing statisti- 2͌(F͞df), where df represents the degrees of freedom. The cal thresholds (Q Ͻ 0.05, d Ն 0.5) were functionally annotated direction of change was coded in the resulting Cohen’s d values so, with WEBGESTALT (web-based gene set analysis toolkit, http:͞͞ that ͗positive͘ values indicated an up-regulation and ͗negative͘ genereg.ornl.gov.webgestalt) (19). A hypergeometric test was used values indicated a down-regulation of transcripts in alcohol- to detect functional groups overrepresented in a selected gene set preferring genotypes compared with nonpreferring genotypes. (meta-regulated genes) compared with a reference set (all genes in Data from the two inbred strain studies (studies 4 and 5 in Table meta-analysis). See Supporting Materials and Methods for additional 1) were collapsed by averaging Cohen’s d values for each transcript details. (for rationale, see Supporting Materials and Methods). To minimize any outlier effects, extremely deviant effect sizes were adjusted to This study was a collaborative effort by investigators affiliated with the a maximum absolute value of d ϭ 4. Finally, the Cohen’s d values Integrative Neuroscience Initiative on Alcoholism Consortium and was generated for the three pairs of selected lines and the combined set supported by National Institute on Alcohol Abuse and Alcoholism Grants AA013182 and AA013404 (to S.E.B.); AA013475, AA006243, of inbred strains were averaged, and a z test was used to test the and AA010760 (to J.K.B.); AA013520 (to Y.A.B.); AA013519 (to significance of deviation of the mean effect size from zero. QVALUE J.C.C.); AA013483 (to N.J.G.); AA013518 (to R.A.H.); AA013484 and software was used to estimate the false discovery rate (Q) for the AA11034 (to R.J.H.); AA013517 (to G.F.K.); and AA013489 (to B.T.); meta-analysis results (38, 39), which estimates the proportion of all and by five Merit Review Programs from the Department of Veterans declared significant results that are expected to be false positives. Affairs (to J.K.B., J.C.C., D.A.F., R.J.H., and T.J.P.).

1. Crabbe, J. C. (2002) Am. J. Med. Genet. 114, 969–974. 22. Diehn, M., Sherlock, G., Binkley, G., Jin, H., Matese, J. C., Hernandez-Boussard, T., Rees, 2. McClearn, G., Wilson, J. & Meredith, W. (1970) The Use of Isogenic and Heterogenic Mouse C. A., Cherry, J. M., Botstein, D., Brown, P. O. & Alizadeh, A. A. (2003) Nucleic Acids Res. Stocks in Behavioral Research (Appleton–Century–Crofts, New York). 31, 219–223. 3. Belknap, J. K. & Atkins, A. L. (2001) Mamm. Genome 12, 893–899. 23. Foroud, T., Edenberg, H. J., Goate, A., Rice, J., Flury, L., Koller, D. L., Bierut, L. J., 4. Blednov, Y. A., Metten, P., Finn, D. A., Rhodes, J. S., Bergeson, S. E., Harris, R. A. & Conneally, P. M., Nurnberger, J. I., Bucholz, K. K., et al. (2000) Alcohol. Clin. Exp. Res. 24, Crabbe, J. C. (2005) Alcohol. Clin. Exp. Res. 29, 1949–1958. 933–945. 5. Phillips, T. J., Crabbe, J. C., Metten, P. & Belknap, J. K. (1994) Alcohol. Clin. Exp. Res. 18, 24. Engracia, V., Leite, M. M., Pagotto, R. C., Zucoloto, S., Barbosa, C. A. & Mestriner, M. A. 931–941. (2003) Am. J. Med. Genet. 123, 257–260. 6. Grahame, N. J., Li, T. K. & Lumeng, L. (1999) Behav. Genet. 29, 47–57. 25. Blednov, Y. A., Bergeson, S. E., Walker, D., Ferreira, V. M., Kuziel, W. A. & Harris, R. A. 7. Wang, J., Williams, R. W. & Manly, K. F. (2003) Neuroinformatics 1, 299–308. (2005) Behav. Brain Res. 165, 110–125. 8. Phillips, T. J., Terdal, E. S. & Crabbe, J. C. (1990) Behav. Genet. 20, 473–480. 26. Carr, L. G., Habegger, K., Spence, J., Ritchotte, A., Liu, L., Lumeng, L., Li, T. K. & Foroud, 9. Phillips, T. J., Belknap, J. K., Hitzemann, R. J., Buck, K. J., Cunningham, C. L. & Crabbe, T. (2003) Alcohol. Clin. Exp. Res. 27, 1710–1717. J. C. (2002) Genes Brain Behav. 1, 14–26. 27. Bice, P., Foroud, T., Bo, R., Castelluccio, P., Lumeng, L., Li, T. K. & Carr, L. G. (1998) 10. Rhodes, D. R., Yu, J., Shanker, K., Deshpande, N., Varambally, R., Ghosh, D., Barrette, T., Mamm. Genome 9, 949–955. Pandey, A. & Chinnaiyan, A. M. (2004) Proc. Natl. Acad. Sci. USA 101, 9309–9314. 28. Liang, T., Habegger, K., Spence, J. P., Foroud, T., Ellison, J. A., Lumeng, L., Li, T. K. & 11. Moreau, Y., Aerts, S., De Moor, B., De Strooper, B. & Dabrowski, M. (2003) Trends Genet. Carr, L. G. (2004) Alcohol. Clin. Exp. Res. 28, 1622–1628. 19, 570–577. 29. Saba, L., Bhave, S., Lapadat, R., Hoffman, P. L., Belknap, J. K. & Tabakoff, B. (2006) 12. Rosenthal, D. (1994) Parametric Measures of Effect Size (Russell Sage Found., New York). Mamm. Genome, in press. 13. Huh, G. S., Boulanger, L. M., Du, H., Riquelme, P. A., Brotz, T. M. & Shatz, C. J. (2000) 30. Finn, D. A., Belknap, J. K., Cronise, K., Yoneyama, N., Murillo, A. & Crabbe, J. C. (2005) Science 290, 2155–2159. Psychopharmacology 178, 471–480. 14. Yu, F. H., Westenbroek, R. E., Silos-Santiago, I., McCormick, K. A., Lawson, D., Ge, P., 31. Rhodes, J. S., Best, K., Belknap, J. K., Finn, D. A. & Crabbe, J. C. (2005) Physiol. Behav. 84, 53–63.

Ferriera, H., Lilly, J., DiStefano, P. S., Catterall, W. A., et al. (2003) J. Neurosci. 23, 7577–7585. 32. O’Dell, L. E., Roberts, A. J., Smith, R. T. & Koob, G. F. (2004) Alcohol. Clin. Exp. Res. 28, NEUROSCIENCE 15. Grieco, T. M., Malhotra, J. D., Chen, C., Isom, L. L. & Raman, I. M. (2005) Neuron 45, 233–244. 1676–1682. 16. Hodge, C. W., Mehmert, K. K., Kelley, S. P., McMahon, T., Haywood, A., Olive, M. F., 33. Roberts, A. J., Heyser, C. J., Cole, M., Griffin, P. & Koob, G. F. (2000) Neuropsychophar- Wang, D., Sanchez-Perez, A. M. & Messing, R. O. (1999) Nat. Neurosci. 2, 997–1002. macology 22, 581–594. 17. Ho Sui, S. J., Mortimer, J. R., Arenillas, D. J., Brumm, J., Walsh, C. J., Kennedy, B. P. & 34. McClung, C. A. & Nestler, E. J. (2003) Nat. Neurosci. 6, 1208–1215. Wasserman, W. W. (2005) Nucleic Acids Res. 33, 3154–3164. 35. Yao, W. D., Gainetdinov, R. R., Arbuckle, M. I., Sotnikova, T. D., Cyr, M., Beaulieu, J. M., 18. Harris, M. A., Clark, J., Ireland, A., Lomax, J., Ashburner, M., Foulger, R., Eilbeck, K., Torres, G. E., Grant, S. G. & Caron, M. G. (2004) Neuron 41, 625–638. Lewis, S., Marshall, B., Mungall, C., et al. (2004) Nucleic Acids Res. 32, D258–D261. 36. National Research Council (1996) Guide for the Care and Use of Laboratory Animals (Nat. 19. Zhang, B., Kirov, S. & Snoddy, J. (2005) Nucleic Acids Res. 33, W741–W748. Res. Council, Washington, DC). 20. Ogata, H., Goto, S., Sato, K., Fujibuchi, W., Bono, H. & Kanehisa, M. (1999) Nucleic Acids 37. DeRisi, J. L., Iyer, V. R. & Brown, P. O. (1997) Science 278, 680–686. Res. 27, 29–34. 38. Storey, J. D. & Tibshirani, R. (2003) Proc. Natl. Acad. Sci. USA 100, 9440–9445. 21. Jeong, H. J., Hong, S. H., Park, R. K., An, N. H. & Kim, H. M. (2005) Life Sci. 77, 2179–2192. 39. Storey, J. D. & Tibshirani, R. (2003) Methods Mol. Biol. 224, 149–157.

Mulligan et al. PNAS ͉ April 18, 2006 ͉ vol. 103 ͉ no. 16 ͉ 6373 Downloaded by guest on September 29, 2021