Comparative expression analysis of blood and brain provides concurrent validation of SELENBP1 up-regulation in schizophrenia

Stephen J. Glatt^a,b,c,d, Ian P. Everall^b,c,e, William S. Kremen^a,c, Jacques Corbeil^f,g, Roman Šášik^h, Negar Khanlou^c,e, Mark Han^i, Choong-Chin Liew^i, and Ming T. Tsuang^a,c,j,k,l

^a Center for Behavioral Genomics, Departments of ^c Psychiatry and ^g Medicine, ^h University of California San Diego Cancer Center, and ^e HIV Neurobehavioral Research Center, University of California at San Diego, La Jolla, CA 92093; ^d Veterans Medical Research Foundation, San Diego, CA 92161; ^f Department of Anatomy and Physiology, Laval University, Quebec, PQ, Canada G1V 4G2; ^i ChondroGene, Inc., Toronto, ON, Canada M3J 3K4; ^j Departments of Epidemiology and Psychiatry, Harvard Institute of Psychiatric Epidemiology and Genetics, Boston, MA 02115; and ^k Veterans Affairs Healthcare System, San Diego, CA 92161

Communicated by Eric R. Kandel, Columbia University, New York, NY, September 1, 2005 (received for review July 28, 2005)

Abbreviations: DLPFC, dorsolateral prefrontal cortex; GO, gene ontology; NBD, National Brain Databank; PBC, peripheral blood cell; SZ, schizophrenia.
^b S.J.G. and I.P.E. contributed equally to this work.
^l To whom correspondence should be addressed. E-mail: [email protected].
© 2005 by The National Academy of Sciences of the USA

Microarray techniques hold great promise for identifying risk factors for schizophrenia (SZ) but have not yet generated widely reproducible results due to methodological differences between studies and the high risk of type I inferential errors. Here we established a protocol for conservative analysis and interpretation of gene expression data from the dorsolateral prefrontal cortex of SZ patients using statistical and bioinformatic methods that limit false positives. We also compared brain gene expression profiles with those from peripheral blood cells of a separate sample of SZ patients to identify disease-associated genes that generalize across tissues and populations and further substantiate the use of gene expression profiling of blood for detecting valid SZ biomarkers. Implementing this systematic approach, we: (i) discovered 177 putative SZ risk genes in brain, 28 of which map to linked chromosomal loci; (ii) delineated six biological processes and 12 molecular functions that may be particularly disrupted in the illness; (iii) identified 123 putative SZ biomarkers in blood, 6 of which (BTG1, GSK3A, HLA-DRB1, HNRPA3, SELENBP1, and SFRS1) had corresponding differential expression in brain; (iv) verified the differential expression of the strongest candidate SZ biomarker (SELENBP1) in blood; and (v) demonstrated neuronal and glial expression of SELENBP1 protein in brain. The continued application of this approach in other brain regions and populations should facilitate the discovery of highly reliable and reproducible candidate risk genes and biomarkers for SZ. The identification of valid peripheral biomarkers for SZ may ultimately facilitate early identification, intervention, and prevention efforts as well.

microarray | ontology

Schizophrenia (SZ) has a substantial genetic basis (1), but its biological underpinnings remain largely unknown. Early attempts to profile the expression of specific neurochemicals in blood and postmortem brain detected several promising candidate risk factors for SZ (2, 3) that ultimately could not be substantiated (4, 5). Subsequent progress in mapping the human genome increased the viability of candidate gene association studies, which have since proliferated (6). Most candidate genes have been targeted based on their expression within systems widely implicated in the disorder (e.g., dopamine and glutamate neurotransmitter systems), and this approach is essential for clarifying the nature of dysfunction within these recognized candidate pathways; however, it may not be optimal for identifying additional novel risk factors outside of these systems.

The advent of microarrays that can survey the entire expressed human genome has made it possible to simultaneously investigate the roles of several thousand genes in a disorder. Relative to traditional candidate gene studies predicated on existing disease models, microarray analysis is a less-constrained strategy that could foster the discovery of novel risk genes that otherwise would not come under study. Because gene expression can reflect both genetic and environmental influences, it may be particularly useful for identifying risk factors for a complex disorder such as SZ, which is thought to have a multifactorial polygenic etiology in which many genes and environmental factors interact. However, the simultaneous consideration of thousands of dependent variables also increases the likelihood of false-positive results (7). In short, microarrays hold great promise for identifying etiologic factors for SZ but run the risk of being too liberal and failing to provide replicable results.

Several groups (8) have characterized gene expression profiles of SZ in postmortem tissue from the dorsolateral prefrontal cortex (DLPFC) of the brain, which has been consistently identified as dysfunctional in the illness (9). These studies have noted variable patterns of dysregulated gene expression in several domains, including G protein signaling, metabolism, mitochondrial function, myelination, and neuronal development. However, not all of these studies have reported significant alterations in each domain. Methodological differences, including ethnic and demographic disparities, alternative microarray platforms, and diverse methods of data analysis, as well as the high risk of false positives, have been cited as factors possibly contributing to this variability (7).

To overcome the limitations of prior microarray studies of SZ, we have adopted a rigorous and systematic approach to sequentially identifying, prioritizing, verifying, and validating potential etiologic factors in SZ. Importantly, our approach is also very conservative due to three critical design features: (i) the application of statistical and bioinformatic methods that substantially reduce type I error rates by using statistical significance criteria rather than fold-change values; (ii) the evaluation of potential confounds such as psychotropic medication use; and (iii) the comparison of gene expression profiles in two tissues (brain and blood) from two different samples. This approach has allowed us to identify numerous putative risk factors for SZ and further validate the use of gene expression profiling of blood for detecting SZ biomarkers, which we described in a pilot study earlier this year (10). The stringency of our methods bolsters the validity of the results and increases their likelihood of generalizing to other samples, which should prove essential for advancing our understanding of the biological basis of SZ. The identification of valid peripheral biomarkers for SZ may ultimately facilitate early identification, intervention, and prevention efforts as well.

Methods

Design. We first acquired data from cRNA microarrays surveying a vast portion of the expressed human genome in postmortem tissue

www.pnas.org/cgi/doi/10.1073/pnas.0507666102 PNAS | October 25, 2005 | vol. 102 | no. 43 | 15533–15538

from the DLPFC of SZ patients and nonpsychiatric control subjects. We analyzed the data with an innovative statistical tool that reduces the number of false positives relative to other methods and applied a bioinformatic algorithm to simplify the interpretation of the ontologies represented by the differentially expressed genes. In addition, following the comparative tissue approach adopted by Martin et al. (11) for studying breast cancer, we compared gene expression profiles in DLPFC with those derived from peripheral blood cells (PBCs) from a separate sample of SZ patients and nonpsychiatric control subjects. This comparison allowed for identification of those genes whose differential expression in SZ generalizes across tissues and populations and isolation of potential peripheral biomarkers for SZ. The differential expression of the strongest candidate SZ biomarker emerging from the microarray analyses (SELENBP1, which was significantly up-regulated in both brain and blood in SZ) was then verified in PBCs by quantitative RT-PCR. Finally, to demonstrate that SELENBP1 protein is expressed in brain and to preliminarily validate the differential expression of SELENBP1 between SZ patients and control subjects, we returned to the postmortem brain tissue to examine the expression of the protein product of this gene.

Gene Expression in DLPFC. Samples. Gene expression data were obtained from cRNA microarrays of fresh-frozen postmortem DLPFC tissue samples (50 mg) from 19 SZ patients and 27 nonpsychiatric control subjects in the National Brain Databank (NBD) maintained by the Harvard Brain Tissue Resource Center. Patients and controls were closely matched on gender (68% vs. 70% male; P = 0.887) and mean age (57 vs. 56 yrs; P = 0.955), and DLPFC samples were very similar in laterality (58% vs. 52% right hemisphere; P = 0.875), mean pH (6.4 vs. 6.4; P = 0.981), and mean postmortem interval (21 vs. 20 h; P = 0.739). Ascertainment and diagnosis of these subjects according to Diagnostic and Statistical Manual of Mental Disorders (DSM-IV) criteria (12); preparation of brain tissue; extraction, purification, and hybridization of RNA; quantification of expression levels on cRNA microarrays; and quality-control procedures were all performed at the Harvard Brain Tissue Resource Center by standard methods, available in Supporting Text, which is published as supporting information on the PNAS web site.

Microarray data analysis. The gene expression data generated by these procedures were downloaded as cell intensity (CEL) files from the NBD web site and subjected to the statistical tool CORGON (13), which is an academic software package developed in conjunction with two of the authors (J.C. and R.Š.) and freely available from them. CORGON utilizes a novel statistical model that assumes multiplicative rather than additive noise and eliminates statistically significant outliers. CORGON also assumes a uniform background level that is estimated from both mismatch and perfect-match probe intensities. Furthermore, CORGON accounts for mRNA preparation, hybridization, normalization, and image analysis efficiencies. There is no "gold standard" fold change in a gene that is known to be biologically relevant; thus, CORGON identifies differentially expressed genes in conjunction with the Focus algorithm (14) based instead on their statistical significance beyond a threshold of P = 0.05. The P value of each gene is determined by two-tailed unadjusted permutation testing of 100,000 permutations of sample labels for each gene. For each permutation, the t statistic was calculated from log(expression) values, and the P value was estimated as the fraction of permutations for which the absolute value of the t statistic was greater than or equal to the absolute value of the unpermuted t statistic. [When applied to a series of arrays (15), CORGON yields a type I error rate (4.4%) far superior to the rates of 29% and 15% attained by other widely accepted methods (Affymetrix MICROARRAY SUITE 5.0 and the method of Li and Wong (16), respectively).] Expression levels of all genes identified as differentially expressed by the CORGON algorithm were then examined in relation to antipsychotic and other medication use to determine whether the elevated frequency of such exposures among patients would account for group differences in gene expression.

Following the method used by Iwamoto et al. (17), the effects of anticonvulsant, antidepressant, and anxiolytic medications were independently examined by comparing gene expression levels observed in treated and untreated groups with t tests for independent samples. Advancing beyond this scheme, antipsychotic medications were evaluated in a more quantitative manner by converting daily dosages to a common metric [maximum effective dose (18)] and examining correlations between this daily dose index and the expression level of each differentially expressed gene. Highly conservative family-wise corrections for multiple testing within each medication class were performed by using the Bonferroni correction.

Ontological profiling. To assist in the biological and molecular characterization of the differentially expressed genes, we classified these genes using the MicroArray Data Characterization and Profiling (MADCAP) algorithm (19), which was developed in conjunction with one of the authors (R.Š.). MADCAP compares the list of differentially expressed genes identified by CORGON with the list of all genes on a microarray and determines which of the standardized gene ontology (GO) terms recognized by the GO Consortium (20) are more frequently represented by the CORGON-selected genes than would be expected by chance based on the genes represented in the entire microarray. Unlike the GOMINER program (21), this method calculates conditional P values, which allows for the discovery of significant terms even if they are small in size. Additional details of the method are published as Supporting Text.

Gene Expression in PBCs. Samples. Peripheral whole-blood samples (10 ml) were obtained from a separate set of 30 SZ patients and 24 nonpsychiatric control subjects from Taiwan, as described (10). Patients and controls were of similar gender (40% vs. 58% male; P = 0.180) but differed in mean age (34 vs. 42 yrs; P = 0.014), warranting the examination of age as a potential covariate of gene expression in subsequent statistical analyses. All blood samples were collected into sterile violet-capped Vacutainer tubes (Becton Dickinson) containing K3 EDTA, temporarily stored at 4°C, and processed within 6 h of collection. Ascertainment and diagnosis of these subjects according to Diagnostic and Statistical Manual of Mental Disorders (DSM-IV, ref. 12) criteria; collection and preparation of blood samples; separation and lysis of PBCs; extraction, purification, and hybridization of RNA; quantification of expression levels on cRNA microarrays; and quality-control procedures were all performed by standard methods, which are described in greater detail elsewhere (10) and are published as Supporting Text.

Microarray data analysis. Gene expression data were analyzed by the CORGON and Focus algorithms as outlined above for DLPFC samples, and the list of genes differentially expressed in the blood of these patients and controls was compared with the list of genes previously identified as differentially expressed in the DLPFC of SZ patients and controls in the NBD. Medication effects on gene expression levels were also examined as outlined above for DLPFC samples, and the effect of age (which differed between patients and controls) was evaluated by correlation as well.

Verification by RT-PCR. The level of mRNA expression of the strongest candidate biomarker gene (SELENBP1, which was significantly up-regulated in both DLPFC and PBCs in SZ) was quantified in PBCs by RT-PCR. Total blood RNA isolated by the TRIzol method was reverse-transcribed into single-stranded cDNA by using a High-Capacity cDNA Archive Kit (Applied Biosystems) in a 100-µl reaction. Each sample of cDNA (2 ng) was then mixed with SYBR green master mix (Qiagen, Valencia, CA) and primers in a 20-µl reaction. Forward and reverse primers were designed by using PRIMERQUEST (Integrated DNA Technologies, Coralville, IA). PCR amplification was performed by using DNA Engine Opticon (MJ Research, Cambridge, MA). An automatically calculated

Table 1. Biological Process ontology terms overrepresented by genes differentially expressed in DLPFC in SZ

| Ontology term | P | Gene symbol | Gene product | Gene fold change (P) in SZ |
|---|---|---|---|---|
| ATP-dependent proteolysis | 0.022 | CRBN | Cereblon | 1.08 (0.01660) |
| Circulation | 0.023 | LPL | Lipoprotein lipase | |
| | | RYR2 | Ryanodine receptor 2 (cardiac) | 1.14 (0.01030) |
| | | SRI | Sorcin | 1.20 (0.02480) |
| Energy pathways | 0.007 | ACOX1 | Acyl-Coenzyme A oxidase 1, palmitoyl | 1.20 (0.00480) |
| | | COX17 | COX17 homolog, cytochrome c oxidase assembly protein (yeast) | 1.10 (0.01780) |
| | | COX7C | Cytochrome c oxidase subunit VIIc | 1.07 (0.04030) |
| | | GLP1R | Glucagon-like peptide 1 receptor | −1.09 (0.03080) |
| | | NDUFA2 | NADH dehydrogenase (ubiquinone) 1 α subcomplex, 2, 8 kDa | 1.09 (0.02260) |
| | | SUCLG1 | Succinate-CoA ligase, GDP-forming, alpha subunit | 1.06 (0.02870) |
| Muscle contraction | 0.041 | CNN3 | Calponin 3, acidic | 1.37 (0.01660) |
| | | RYR2 | Ryanodine receptor 2 (cardiac) | 1.14 (0.01030) |
| | | SRI | Sorcin | 1.20 (0.02480) |
| | | SSPN | Sarcospan (Kras oncogene-associated gene) | 1.21 (0.04060) |
| Protein lipoylation | 0.004 | NMT1 | N-myristoyltransferase | 1.09 (0.00980) |
| Regulation of action potential | 0.013 | SRI | Sorcin | 1.20 (0.02480) |
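The gene-level P values in the last column of Table 1 were obtained, per Methods, by permuting sample labels and comparing t statistics computed on log(expression) values. The following is a minimal sketch of that idea only, not the CORGON or Focus implementation: the function names, the pooled-variance form of the t statistic, the fixed random seed, and the expression values are all assumptions made for illustration.

```python
import math
import random
from statistics import mean, variance


def t_statistic(a, b):
    """Two-sample t statistic with pooled variance (an assumed choice;
    the source does not state which form of the t statistic was used)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))


def permutation_p_value(patients, controls, n_perm=100_000, seed=0):
    """Two-tailed permutation P value for one gene.

    Expression values are log-transformed, sample labels are shuffled
    n_perm times, and the P value is the fraction of permutations whose
    |t| is greater than or equal to the unpermuted |t|.
    """
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    log_patients = [math.log(x) for x in patients]
    log_controls = [math.log(x) for x in controls]
    observed = abs(t_statistic(log_patients, log_controls))

    pooled = log_patients + log_controls
    n_patients = len(log_patients)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign sample labels at random
        t_perm = t_statistic(pooled[:n_patients], pooled[n_patients:])
        if abs(t_perm) >= observed:
            hits += 1
    return hits / n_perm


if __name__ == "__main__":
    # Hypothetical expression intensities for a single gene.
    sz = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
    ctrl = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5]
    print(permutation_p_value(sz, ctrl, n_perm=2000))
```

With 100,000 permutations the smallest nonzero P value that can be resolved is 0.00001, which is consistent with the precision of the P values reported in the tables.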

melting point dissociation curve was examined to ensure specific PCR amplification and the lack of primer–dimer formation in each well. The comparative Ct equation (Applied Biosystems) was used to calculate relative fold changes between patient and control samples. Briefly, gene expression levels were represented as $2^{-\Delta\Delta C_t}$, where $\Delta\Delta C_t = [\Delta C_t(\text{single sample})] - [\text{mean } \Delta C_t(\text{control samples})]$, $\Delta C_t = [C_t(\text{target gene})] - [C_t(\text{ACTB})]$, and ACTB is the housekeeping gene coding for β actin.

Protein Expression in DLPFC. Samples. We examined the expression of the protein product of the strongest candidate biomarker gene (SELENBP1) in DLPFC tissue from a randomly selected sample of four SZ patients and four control subjects in the Harvard Brain Tissue Resource Center from whom gene expression data were obtained.

Immunohistochemistry. Paraffin wax-embedded DLPFC brain tissue sections (10 µm) were treated with citrate buffer, microwaved for 10 min, and exposed for 24 h at 4°C to the mouse antiselenium-binding protein monoclonal antibody (1:250 dilution; MBL International, Woburn, MA). Antibody was detected by using the mouse monoclonal Vectastain ABC kit and the 3,3′ diaminobenzidine substrate for peroxidase (Vector Laboratories). Sections were counterstained with hematoxylin, visualized on a Zeiss microscope, and analyzed by using IMAGE-PRO PLUS software (Media Cybernetics, Silver Spring, MD).

Results

Gene Expression in DLPFC. Application of the CORGON and Focus algorithms to the brain gene expression data from 19 SZ patients and 27 control subjects in the NBD identified 177 genes that were differentially expressed in the DLPFC of the two groups. Of these, 111 genes were up-regulated, and 66 were down-regulated in SZ. The Affymetrix probe number, accession number, gene symbol, gene product, and chromosomal locus of each differentially expressed gene, as well as its fold-change difference in expression between SZ patients and control subjects and the corresponding P value, are provided in Table 4, which is published as supporting information on the PNAS web site. Anticonvulsant-treated subjects showed significant down-regulation of six genes and up-regulation of one gene relative to untreated subjects, whereas anxiolytic treatment increased the expression of two genes and decreased the expression of two others. Antidepressant treatment influenced the expression of many more genes, with 10 showing significant up-regulation and 7 showing significant down-regulation with treatment. Daily dosage of antipsychotic medication had a significant positive impact on the expression of 13 genes but was not significantly related to down-regulation of any genes. The significance of all of these medication effects was abolished by Bonferroni correction for multiple testing.

The 177 differentially expressed genes were then profiled by MicroArray Data Characterization and Profiling, and 25 were found to be linked to one or more overrepresented ontologies. Thirteen of the 25 genes represented six Biological Process GO terms (Fig. 2A, which is published as supporting information on the PNAS web site). The most populated term within this ontology was Energy Pathways (P = 0.007), which described the involvement of six genes (Table 1). In addition, 12 Molecular Function GO terms were represented by 18 genes (Fig. 2B). The most populated term in this ontology was oxidoreductase activity (P = 0.031), which described the function of seven genes (Table 2). Three of these genes (NDUFA2, NDUFB5, and NDUFC1) were also classified as having NADH dehydrogenase activity. Genes representing terms in both the Biological Process and Molecular Function ontologies included ACOX1, COX7C, COX17, CNN3, NDUFA2, and NMT1, of which three genes (ACOX1, COX7C, and NDUFA2) had oxidoreductase activity within energy pathways. The remaining 152 differentially expressed genes were linked to GO terms that were not significantly overrepresented.

Gene Expression in PBCs. Application of the CORGON and Focus algorithms to the gene expression data from a separate sample of 30 SZ patients and 24 control subjects from Taiwan identified 123 genes that were differentially expressed in PBCs from the two groups. Of these, 67 genes were up-regulated and 56 down-regulated in SZ. The Affymetrix probe number, accession number, gene symbol, gene product, and chromosomal locus of each differentially expressed gene, as well as its fold-change difference in expression between SZ patients and control subjects and the corresponding P value, are provided in Table 5, which is published as supporting information on the PNAS web site. Eight genes increased and 10 genes decreased significantly in expression level with age. Anticonvulsant-treated subjects differed from untreated subjects in the expression of only one gene, which was up-regulated; however, antidepressant treatment was associated with significant up-regulation of 15 genes and significant down-regulation of two others. The effects of age and these classes of medication did not remain significant after correction for multiple testing. Remarkably, anxiolytic treatment was associated with significant up-regulation of 34 genes and down-regulation of another 40. Differential expression of 12 of these genes [including CSDA, EPB42, FBXO9, FKBP8, GSK3A, HBA1 (two transcripts), HBA2, HBB (two transcripts), HLA-B, and UBB] remained significant after correcting for multiple testing. Daily dosage of antipsychotic medication was linearly related to the expression of only one gene (G0S2),

Table 2. Molecular Function ontology terms overrepresented by genes differentially expressed in DLPFC in SZ

| Ontology term | P | Gene symbol | Gene product | Gene fold change (P) in SZ |
|---|---|---|---|---|
| Beta-catenin binding | 0.041 | APC | Adenomatosis polyposis coli | 1.14 (0.02510) |
| Calpain activity | 0.009 | CAPN1 | Calpain 1, (mu/I) large subunit | −1.10 (0.02210) |
| | | CAPNS1 | Calpain, small subunit 1 | −1.10 (0.01740) |
| Copper ion transporter activity | 0.016 | COX17 | COX17 homolog, cytochrome c oxidase assembly protein (yeast) | 1.10 (0.01780) |
| Electron donor activity | 0.027 | ACOX1 | Acyl-Coenzyme A oxidase 1, palmitoyl | 1.20 (0.00480) |
| GABA receptor activity | 0.012 | GABRA2 | γ-aminobutyric acid (GABA) A receptor, α 2 | 1.22 (0.01050) |
| | | GABRB1 | γ-aminobutyric acid (GABA) A receptor, β 1 | 1.16 (0.04360) |
| Glycylpeptide N-tetradecanoyltransferase activity | 0.048 | NMT1 | N-myristoyltransferase | 1.09 (0.00980) |
| MHC protein binding | 0.048 | TAPBP | TAP-binding protein (tapasin) | −1.09 (0.02930) |
| NADH dehydrogenase activity | 0.019 | NDUFA2 | NADH dehydrogenase (ubiquinone) 1 α subcomplex, 2, 8 kDa | 1.09 (0.02260) |
| | | NDUFB5 | NADH dehydrogenase (ubiquinone) 1 β subcomplex, 5, 16 kDa | 1.09 (0.02910) |
| | | NDUFC1 | NADH dehydrogenase (ubiquinone) 1, subcomplex unknown, 1, 6 kDa | 1.07 (0.00900) |
| Oxidoreductase activity | 0.031 | ACOX1 | Acyl-Coenzyme A oxidase 1, palmitoyl | 1.20 (0.00480) |
| | | COX7C | Cytochrome c oxidase subunit VIIc | 1.07 (0.04030) |
| | | HSD17B12 | Hydroxysteroid (17-beta) dehydrogenase 12 | 1.11 (0.03900) |
| | | NDUFA2 | NADH dehydrogenase (ubiquinone) 1 α subcomplex, 2, 8 kDa | 1.09 (0.02260) |
| | | NDUFB5 | NADH dehydrogenase (ubiquinone) 1 β subcomplex, 5, 16 kDa | 1.09 (0.02910) |
| | | NDUFC1 | NADH dehydrogenase (ubiquinone) 1, subcomplex unknown, 1, 6 kDa | 1.07 (0.00900) |
| | | P4HA1 | Procollagen-proline, 2-oxoglutarate 4-dioxygenase, alpha polypeptide 1 | 1.12 (0.03640) |
| Peroxisome targeting signal receptor activity | 0.031 | PEX5 | Peroxisomal biogenesis factor 5 | −1.12 (0.00890) |
| Phosphoserine phosphatase activity | 0.009 | PSPHL | Phosphoserine phosphatase-like | −2.14 (0.04420) |
| Troponin C binding | 0.030 | CNN3 | Calponin 3, acidic | 1.37 (0.01660) |
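The ontology-term P values in Tables 1 and 2 come from MADCAP, which computes conditional P values by a method detailed in Supporting Text and not reproduced here. As a rough illustration of the underlying question only (is a term represented among the selected genes more often than chance would predict from the whole array?), the sketch below substitutes the ordinary hypergeometric upper-tail test. It is not MADCAP, it will not reproduce the tabulated values, and the function name and all counts in the example are hypothetical.

```python
from math import comb


def overrepresentation_p(n_array, n_term, n_selected, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    n_array    -- genes on the array that carry any annotation
    n_term     -- genes on the array annotated with the term of interest
    n_selected -- differentially expressed genes drawn from the array
    n_overlap  -- selected genes that carry the term
    """
    upper = min(n_term, n_selected)
    total = comb(n_array, n_selected)
    tail = sum(
        comb(n_term, k) * comb(n_array - n_term, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts: 10,000 annotated genes on the array, 60 of them
    # in the term, 177 genes selected, 6 of the selected genes in the term.
    print(overrepresentation_p(10_000, 60, 177, 6))
```

A generic test of this kind treats each term independently, so small terms nested under larger ones can be hard to detect; the source notes that conditional P values are what allow MADCAP to find significant terms even when they are small.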

which also remained statistically significant after corrections for multiple testing were applied.

Comparing the list of 123 genes differentially expressed in PBCs with that obtained from DLPFC identified six genes common to both (Table 3). BTG1, HNRPA3, and SFRS1 were significantly up-regulated in the DLPFC in SZ but significantly down-regulated in PBCs from the other sample of SZ patients; the reverse pattern of differential expression was observed for GSK3A, which was the only one of these six genes to show a significant relationship to psychotropic medication (i.e., anticonvulsant) use. In contrast, SELENBP1 was significantly up-regulated in both tissues from the two samples of SZ patients. HLA-DRB1 was significantly down-regulated in both DLPFC and PBCs in SZ; however, different probe sets (corresponding to different transcripts of the same gene) were associated with the illness in the two tissues.

SELENBP1 was identified as the strongest candidate biomarker among all genes differentially expressed in SZ, because it was the only gene for which identical probe sets indicated significant differential expression in a similar direction in both brain and blood in SZ. This significant up-regulation was substantiated by RT-PCR in PBCs from a randomly selected subset of the same SZ patients (n = 21) and controls (n = 18) that were profiled by microarray analysis. A highly significant (P = 0.003) 2.2-fold increase in SELENBP1 was observed in PBCs of SZ patients by RT-PCR, which closely corresponded to the significant 2.0-fold up-regulation of the gene observed by microarray.

Protein Expression in DLPFC. Granular cytoplasmic staining of SELENBP1 protein was observed in a proportion of neurons and glia in DLPFC tissue from each of the four control subjects and four SZ patients. A representative example of the antibody-staining pattern observed in controls is shown in Fig. 1A, whereas a representative example of the antibody-staining pattern observed in patients is shown in Fig. 1B. Compared with control tissue, the intensity and ratio of glial/neuronal SELENBP1 antibody staining was noticeably increased in DLPFC tissue from at least three of the four SZ patients. The enhanced intraglial staining of SELENBP1 antibody in samples from SZ patients was most notable in a perinuclear rim of increased expression. No staining was observed in any cell when the primary antibody was omitted.

Discussion

The pursuit of risk factors for SZ has been difficult, but systematic evaluation of widely implicated biological systems has identified

Table 3. Six genes differentially expressed in both DLPFC and PBCs in SZ

| Affymetrix probe number | Accession number | Gene symbol | Gene product | Chromosomal locus | Fold change (P) in SZ: DLPFC | Fold change (P) in SZ: PBCs |
|---|---|---|---|---|---|---|
| 200920_s_at | AL535380 | BTG1 | B cell translocation gene 1, antiproliferative | 12q22 | 1.14 (0.04020) | −1.36 (0.00008) |
| 202210_x_at | NM_019884 | GSK3A | Glycogen synthase kinase 3 α | 19q13.2 | −1.09 (0.02490) | 1.59 (0.00044) |
| 209728_at | BC005312 | HLA-DRB1 | MHC class II, DR β 1 | 6p21.3 | −1.17 (0.04220) | NS* |
| 209312_x_at | U65585 | HLA-DRB1 | MHC, class II, DRB1 | - | NS* | −1.27 (0.00007) |
| 215193_x_at | AJ297586 | HLA-DRB1 | MHC, class II, DRB1 | - | NS* | −1.33 (0.00006) |
| 211929_at | AA527502 | HNRPA3 | Heterogeneous nuclear ribonucleoprotein A3 | 2q31.2 | 1.15 (0.01410) | −2.12 (0.00004) |
| 214433_s_at | NM_003944 | SELENBP1 | Selenium-binding protein 1 | 1q21-q22 | 1.16 (0.04510) | 1.95 (0.00093) |
| 211784_s_at | BC006181 | SFRS1 | Splicing factor, arginine/serine-rich 1 (splicing factor 2, alternate splicing factor) | 17q21.3-q22 | 1.12 (0.02460) | −1.71 (0.00005) |

*NS, probe was not differentially expressed on the microarray used to profile the indicated tissue.
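The microarray fold change for SELENBP1 in PBCs listed in Table 3 was checked by RT-PCR using the comparative Ct calculation given in Methods: each sample's target Ct is normalized to ACTB, the mean normalized value of the control samples is subtracted, and relative expression is 2 raised to the negative of that difference. The sketch below works through that arithmetic. The function names and every Ct value are hypothetical and chosen only to make the calculation easy to follow.

```python
from statistics import mean


def delta_ct(ct_target, ct_actb):
    """Delta Ct: target-gene Ct normalized to the ACTB housekeeping gene."""
    return ct_target - ct_actb


def relative_expression(sample, controls):
    """Relative expression 2 ** (-delta-delta Ct) for one sample.

    sample   -- (ct_target, ct_actb) for the sample of interest
    controls -- list of (ct_target, ct_actb) pairs for the control samples
    """
    control_mean = mean(delta_ct(target, actb) for target, actb in controls)
    delta_delta_ct = delta_ct(*sample) - control_mean
    return 2 ** (-delta_delta_ct)


if __name__ == "__main__":
    # Hypothetical Ct values (target gene, ACTB).
    control_samples = [(25.0, 18.0), (26.0, 19.0)]
    patient_sample = (24.0, 18.0)
    print(relative_expression(patient_sample, control_samples))
```

Because Ct falls as template abundance rises, a sample whose normalized Ct is one cycle lower than the control mean is read as a twofold increase in expression.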

several likely contributors to its pathogenesis, including central dopamine and glutamate pathways. Yet, SZ is an undeniably complex disorder likely to result from dysfunction in not only these but also numerous other biological substrates. This complexity, in conjunction with the restrictions on throughput of candidate gene and other single-marker approaches, has prompted a transition toward high-capacity technologies such as cRNA microarrays, which can simultaneously reflect genetically and environmentally mediated effects on the differential expression of potentially all human genes in an illness. However, as illustrated by the lack of uniformity among prior microarray studies of DLPFC in SZ, a more rigorous approach must be adopted if highly reliable and generalizable candidate genes are to be identified by this method.

Here we have established a protocol for conservative analysis and interpretation of gene expression microarray data by using the CORGON and Focus algorithms to limit type I errors and MicroArray Data Characterization and Profiling to identify the ontologies represented by differentially expressed genes. In its present application to gene expression data from DLPFC in SZ, this approach identified 177 genes, 6 biological processes, and 12 molecular functions that should be identified as high-priority targets for further candidate gene association analyses and hypothesis-driven functional studies. Twenty-eight of these genes also map to chromosomal loci strongly implicated in SZ by linkage analysis (22) (Table 4) and are thus particularly attractive candidates. These include four genes (ACOX1, NDUFA2, SUCLG1, and TAPBP) that were also linked to significantly overrepresented GO terms, and SFRS1 and HLA-DRB1, which were differentially expressed in both DLPFC and PBCs in SZ. These genes provide reference points for the commencement of future fine-mapping and positional cloning of putative SZ risk loci.

Quite surprisingly, neither this study nor most of the prior microarray studies of SZ identified significant differences between SZ patients and controls in the expression of "traditional" candidate genes, such as those coding for dopamine and glutamate receptors, transporters, or catalytic enzymes. Conversely, the GO terms overrepresented in DLPFC in the present study were generally related to neurotransmitter systems not typically implicated in SZ (e.g., GABA receptor activity), neuronal processes not specific to a given neurotransmitter system (e.g., regulation of action potential), or biological processes not specific to the nervous system (e.g., energy pathways). More precisely, genes related to energy metabolism were predominant among those differentially expressed in DLPFC in SZ. These results are quite consistent with the report of Prabakaran et al.

Fig. 1. Expression of SELENBP1 protein in DLPFC. SELENBP1 protein expression in DLPFC was visualized through binding to the mouse antiselenium-binding protein monoclonal antibody. The pattern and intensity of antibody staining are shown at ×100 magnification for a representative control subject (A) and a representative SZ patient (B). Increased intraglial and decreased intraneuronal staining of SELENBP1 antibody were observed in DLPFC tissue from at least three of the four SZ patients. Colored arrows indicate antibody staining of cytoplasm in different cell types (black, glia; red, neurons).

implicated in our study are involved in electron transfer, and three of these are also part of the NADH dehydrogenase complex located in the inner mitochondrial membrane. These data suggest that dysfunction within particular pathways or processes, but perhaps not necessarily in specific genes, might be important in the etiology of SZ.

Whereas each of the 177 genes differentially expressed in DLPFC of SZ patients is a promising candidate risk gene, the 123 genes differentially expressed in PBCs from such patients can be considered putative biomarkers for the illness. Two other studies examining differences in blood-based gene expression in SZ have appeared since our initial report (10). The study of Zvara et al. (24) used an arbitrary fold-change criterion (>2.0-fold) rather than P values to identify differentially expressed genes, the limitations of which we have described above. Thus, although the two genes identified in that study as differentially expressed in PBCs in SZ (DRD2 and Kir2.3) were not among the 123 genes we identified, it cannot be determined whether this is due to nonreplication or the different criteria used for identifying differential expression. Supporting the latter possibility (and underscoring the heightened sensitivity of our current statistical methods), we note that many of the most reliable candidate biomarker genes identified in the present report were not among the most highly differentially expressed genes we previously identified based solely on fold-change criteria (10). In the other existing study of PBC gene expression in SZ reported by Middleton et al. (25), ≈300 genes were identified as differentially expressed between patients and their unaffected siblings, but the authors reported the identity of only the 40 genes with the largest fold-change difference in expression between groups. Of those 40 genes, S100A12 (which codes for S100 calcium-binding protein A12 [calgranulin C]) was significantly up-regulated 1.67-fold in PBCs from SZ patients, similar to the significant 1.50-fold up-regulation observed in our sample. Thus, although no prior report is directly comparable to the present study (either in analytic methods or sample composition), there is already some evidence for replication of results.

Six of the putative biomarker genes we identified in PBCs were also differentially expressed in the brain in SZ. Among these were BTG1, which regulates cell proliferation, and three genes (GSK3A, HNRPA3, and SFRS1) that regulate RNA splicing or transcription. These results endorse the view that, in the search for SZ risk genes and biologically meaningful disease markers, inherited mutations regulating gene expression should be evaluated as thoroughly as those that induce structural or functional changes in proteins. Although conceivable, no specific homeostatic mechanism is known whereby up-regulation of a gene in one tissue (e.g., DLPFC) is directly related to down-regulation of that gene in another (e.g., PBCs); therefore, the genes noted above should be considered secondary candidate SZ biomarkers to SELENBP1 and HLA-DRB1, which showed altered expression in the same direction in both DLPFC and PBCs (up- and down-regulation, respectively). Adding to this rationale, HLA-DRB1 maps to the MHC region on chromosome 6p21.3, which is a prime candidate locus for SZ (22), and SELENBP1 maps to chromosome 1q21-22, a locus that has been strongly linked to SZ in some but not most genome-wide linkage studies (26).

The utility of SELENBP1 as a potential peripheral biomarker was substantiated by RT-PCR verification of its up-regulation in PBCs. The altered levels of SELENBP1 transcripts measured in DLPFC and PBCs also translated into observable consequences at the functional level, because preliminary analyses suggested that expression of SELENBP1 protein was denser in glia and less dense
(23), who found significant alter- in neurons of the DLPFC in SZ; however, further stereological ations in SZ of several genes involved in the mitochondrial electron quantification of these changes are required and are presently being transport chain, NADH dehydrogenase complex, or mitochondrial undertaken in our laboratory. Little is known of the functions of membrane. Iwamoto et al. (17) also found global down-regulation SELENBP1 beyond its clear role in binding the antioxidant sele- of 76 mitochondrial genes in the DLPFC in SZ. Our results nium. Epidemiological evidence inversely relating selenium intake substantiate these observations, because at least four of the genes to the prevalence of colorectal and other cancers (27) is more

compelling in light of an established reduction in the rate of these cancers among SZ patients (28). Links between selenium deficiency and glutamate-induced excitotoxicity (29) are also provocative, because the increased expression of SELENBP1 could cause increased sequestration of selenium (or perhaps be in response to already low levels of selenium, ref. 30), which may thus promote neurodegeneration in SZ.

Although our approach has several strengths, such as careful control of type I errors, evaluation of potential confounding effects of psychotropic medications, ontological profiling of differentially expressed genes, and comparison of gene expression in two tissues and samples, this work also has important limitations. Foremost among these, studies of postmortem brain tissue are naturalistic, and as such, we could not directly control for the effects of many subject factors (e.g., diet, exercise, substance use, psychotropic medication use, etc.) that could influence gene expression and that also might differ among subject groups, thus inducing either false-negative or false-positive results. We have attempted to control for some of these factors through appropriate matching of patients and controls and through statistical modeling when possible. Statistical evaluation of psychotropic medication effects is perhaps the most critical in this study, because this factor is the most ubiquitous and likely to systematically differ between patients and controls. Ideally, first-episode or unmedicated patients should be studied to further eliminate the possibility of medication effects on gene expression. The inclusion of unaffected (and thus unmedicated) first-degree biological relatives of SZ patients could also be effective in this regard and could also help rule out the effects of illness-associated factors such as chronicity or degeneration after onset.

Another potential limitation is that the various tissues examined in this study were obtained from two very different populations, with DLPFC samples coming from subjects in the NBD identified mostly as "white" or alternatively as "unknown" ethnicity and PBC samples coming from subjects from Taiwan. This design feature may have caused us to miss some additional genes whose differential expression in DLPFC and PBCs might overlap if assessed from the same ethnic or geographic population; however, this may also strengthen the design, in that our results may be overly (or appropriately) conservative. Studying two very different samples also provides a form of replication that is rare in microarray studies and may facilitate the generalization of these putative biomarkers to other ethnically diverse samples. Finally, it is important to reiterate that SZ is an etiologically complex and heterogeneous disorder that invariably thwarts classification schemes relying on a single dimension to differentiate affected and unaffected persons. Thus, although we have initially analyzed SELENBP1 most thoroughly, the other putative risk genes identified here should also be pursued, verified, validated, and incorporated into causal models of the illness.

Conclusion

Through the implementation of a systematic approach toward the analysis of gene expression microarray data, we: (i) discovered 177 putative SZ risk genes in DLPFC, 28 of which map to chromosomal loci linked to the disorder; (ii) delineated 6 biological processes and 12 molecular functions that may be particularly disrupted in the illness; (iii) identified 123 putative SZ biomarkers in PBCs, 6 of which had corresponding differential expression in DLPFC; (iv) verified the up-regulation of the strongest candidate SZ biomarker (SELENBP1) in PBCs; and (v) demonstrated an altered pattern of expression of SELENBP1 protein in DLPFC in SZ. The continued application of this approach in other brain regions (e.g., both implicated and nonimplicated structures, to identify region-specific changes) and populations (e.g., bipolar disorder patients, to establish disease-specific changes) should facilitate the discovery of highly reliable and reproducible candidate risk genes and biomarkers for SZ. Further work to replicate and validate SZ biomarkers in peripheral blood may ultimately provide the means to identify individuals at risk for SZ before disease onset, which may in turn allow these individuals to be targeted for early intervention and prevention efforts.

We thank Arthur B. Pardee, Adam Dempsey, and Nadine Nossova for helpful comments on the manuscript; Vural Ozdemir for recommendations regarding antipsychotic medication dose equivalences; and the Harvard Brain Tissue Resource Center for providing gene expression microarray data and tissue samples from the NBD. This work was supported in part by the University of California at San Diego Center for AIDS Research Genomics Core and National Institutes of Health Grants P30MH062512 (Igor Grant), R01AG018386, R01AG022381, R01AG022982 (to W.S.K.); and R01DA012846, R01DA018662, R01MH065562, and R01MH071912 (to M.T.T.). J.C. is the Holder of a Canada Research Chair in Genomics.

1. Faraone, S. V., Seidman, L. J., Kremen, W. S., Toomey, R., Pepple, J. R. & Tsuang, M. T. (2000) Biol. Psychiatry 48, 120–126.
2. Breakefield, X. O. & Edelstein, S. B. (1980) Schizophr. Bull. 6, 282–288.
3. Rossor, M. (1984) J. Psychiatr. Res. 18, 457–465.
4. Fleissner, A., Seifert, R., Schneider, K., Eckert, W. & Fuisting, B. (1987) Eur. Arch. Psychiatry Neurol. Sci. 237, 8–15.
5. Gasque, P., Dean, Y. D., McGreal, E. P., VanBeek, J. & Morgan, B. P. (2000) Immunopharmacology 49, 171–186.
6. Abusaad, I., Mackay, D., Zhao, J., Stanford, P., Collier, D. A. & Everall, I. P. (1999) J. Comp. Neurol. 408, 560–566.
7. Pounds, S. & Cheng, C. (2004) Bioinformatics 20, 1737–1745.
8. Harrison, P. J. & Weinberger, D. R. (2005) Mol. Psychiatry 10, 40–68.
9. Hill, K., Mann, L., Laws, K. R., Stephenson, C. M., Nimmo-Smith, I. & McKenna, P. J. (2004) Acta Psychiatr. Scand. 110, 243–256.
10. Tsuang, M. T., Nossova, N., Yager, T., Tsuang, M. M., Guo, S. C., Shyu, K. G., Glatt, S. J. & Liew, C. C. (2005) Am. J. Med. Genet. 133, 1–5.
11. Martin, K. J., Graner, E., Li, Y., Price, L. M., Kritzman, B. M., Fournier, M. V., Rhei, E. & Pardee, A. B. (2001) Proc. Natl. Acad. Sci. USA 98, 2646–2651.
12. American Psychiatric Association (1994) Diagnostic and Statistical Manual of Mental Disorders (DSM-IV) (Am. Psychiatric Assoc., Washington, DC).
13. Sasik, R., Calvo, E. & Corbeil, J. (2002) Bioinformatics 18, 1633–1640.
14. Cole, S. W., Galic, Z. & Zack, J. A. (2003) Bioinformatics 19, 1808–1816.
15. Sasik, R., Woelk, C. H. & Corbeil, J. (2004) J. Mol. Endocrinol. 33, 1–9.
16. Li, C. & Wong, W. H. (2001) Proc. Natl. Acad. Sci. USA 98, 31–36.
17. Iwamoto, K., Kakiuchi, C., Bundo, M., Ikeda, K. & Kato, T. (2004) Mol. Psychiatry 9, 406–416.
18. Davis, J. M. & Chen, N. (2004) J. Clin. Psychopharmacol. 24, 192–208.
19. Lozach, J., Sasik, R., Ogawa, S. & Glass, C. K. (2003) in European Conference on Computational Biology (Institut National de la Recherche Agronomique, Paris), p. GE24.
20. Ashburner, M., Ball, C. A., Blake, J. A., Botstein, D., Butler, H., Cherry, J. M., Davis, A. P., Dolinski, K., Dwight, S. S., Eppig, J. T., et al. (2000) Nat. Genet. 25, 25–29.
21. Zeeberg, B. R., Feng, W., Wang, G., Wang, M. D., Fojo, A. T., Sunshine, M., Narasimhan, S., Kane, D. W., Reinhold, W. C., Lababidi, S., et al. (2003) Genome Biol. 4, R28.
22. Lewis, C. M., Levinson, D. F., Wise, L. H., DeLisi, L. E., Straub, R. E., Hovatta, I., Williams, N. M., Schwab, S. G., Pulver, A. E., Faraone, S. V., et al. (2003) Am. J. Hum. Genet. 73, 34–48.
23. Prabakaran, S., Swatton, J. E., Ryan, M. M., Huffaker, S. J., Huang, J. T., Griffin, J. L., Wayland, M., Freeman, T., Dudbridge, F., Lilley, K. S., et al. (2004) Mol. Psychiatry 9, 684–697.
24. Zvara, A., Szekeres, G., Janka, Z., Kelemen, J. Z., Cimmer, C., Santha, M. & Puskas, L. G. (2005) Disease Markers 21, 61–69.
25. Middleton, F. A., Pato, C. N., Gentile, K. L., McGann, L., Brown, A. M., Trauzzi, M., Diab, H., Morley, C. P., Medeiros, H., Macedo, A., et al. (2005) Am. J. Med. Genet. 136, 12–25.
26. Brzustowicz, L. M., Hodgkinson, K. A., Chow, E. W. C., Honer, W. G. & Bassett, A. S. (2000) Science 288, 678–682.
27. Jacobs, E. T., Jiang, R., Alberts, D. S., Greenberg, E. R., Gunter, E. W., Karagas, M. R., Lanza, E., Ratnasinghe, L., Reid, M. E., Schatzkin, A., et al. (2004) J. Natl. Cancer Inst. 96, 1669–1675.
28. Kotler, M., Barak, P., Cohen, H., Averbuch, I. E., Grinshpoon, A., Gritsenko, I., Nemanov, L. & Ebstein, R. P. (1999) Am. J. Med. Genet. 88, 628–633.
29. Savaskan, N. E., Brauer, A. U., Kuhbacher, M., Eyupoglu, I. Y., Kyriakopoulos, A., Ninnemann, O., Behne, D. & Nitsch, R. (2003) FASEB J. 17, 112–114.
30. Vaddadi, K. S., Soosai, E. & Vaddadi, G. (2003) Br. J. Clin. Pharmacol. 55, 307–309.
