Genome-Wide Expression Profiling of Schizophrenia Using a Large
Total Page:16
File Type:pdf, Size:1020Kb
Molecular Psychiatry (2013) 18, 215–225 & 2013 Macmillan Publishers Limited All rights reserved 1359-4184/13 www.nature.com/mp ORIGINAL ARTICLE Genome-wide expression profiling of schizophrenia using a large combined cohort M Mistry1, J Gillis2 and P Pavlidis2,3 1Graduate Program in Bioinformatics, University of British Columbia, Vancouver, BC, Canada; 2Department of Psychiatry, University of British Columbia, Vancouver, BC, Canada and 3Centre for High-Throughput Biology, University of British Columbia, Vancouver, BC, Canada Numerous studies have examined gene expression profiles in post-mortem human brain samples from individuals with schizophrenia compared with healthy controls, to gain insight into the molecular mechanisms of the disease. Although some findings have been replicated across studies, there is a general lack of consensus on which genes or pathways are affected. It has been unclear if these differences are due to the underlying cohorts or methodological considerations. Here, we present the most comprehensive analysis to date of expression patterns in the prefrontal cortex of schizophrenic, compared with unaffected controls. Using data from seven independent studies, we assembled a data set of 153 affected and 153 control individuals. Remarkably, we identified expression differences in the brains of schizophrenics that are validated by up to seven laboratories using independent cohorts. Our combined analysis revealed a signature of 39 probes that are upregulated in schizophrenia and 86 that are downregulated. Some of these genes were previously identified in studies that were not included in our analysis, while others are novel to our analysis. In particular, we observe gene expression changes associated with various aspects of neuronal communication and alterations of processes affected as a consequence of changes in synaptic functioning. A gene network analysis predicted previously unidentified functional relationships among the signature genes. Our results provide evidence for a common underlying expression signature in this heterogeneous disorder. Molecular Psychiatry (2013) 18, 215–225; doi:10.1038/mp.2011.172; published online 3 January 2012 Keywords: schizophrenia; gene expression; microarray; post-mortem brain; prefrontal cortex Introduction if there are any common features of the disease at the molecular level. Schizophrenia is a severe psychotic disorder that The diversity of results in transcriptome studies affects approximately 1% of the population world- can be attributed to many sources. Besides differences 1 wide. Many groups have attempted to identify in the sampled cohorts and disease heterogeneity, changes in gene expression in the brains of schizo- discrepancies between transcriptome studies can be 2–4 phrenics, often focusing on the prefrontal cortex. due to methodological differences in sample prepa- Such studies have suggested several altered mole- ration, choice of platform and data analysis. There cular processes including (but not limited to) synaptic are issues that are especially pertinent to the analysis 5–8 machinery and mitochondrial-related transcripts, of post-mortem human brain tissue. One is the 9 immune function and a reduction in oligodendrocyte confounding effect of factors such as age, gender 10–12 and myelination-related genes. The variety and and medication. Such factors are often associated scope of these processes, found in different subject with relatively large gene expression changes,16 while cohorts, raises the question as to whether there are psychiatric illnesses such as schizophrenia are asso- underlying commonalities in molecular signatures ciated with small effect sizes. If these factors are not among schizophrenics. Such commonalities are correctly controlled for, they can mask or masquerade presupposed by most genetic studies, which look for as expression patterns associated with the disease. alleles overrepresented in large numbers of schizo- Standard practice involves minimizing the effects of 13–15 phrenic individuals. It is important to establish such factors either in the experimental design by sample matching or treating these factors as covari- Correspondence: Dr P Pavlidis, University of British Columbia, ates in regression models. It is also increasingly Centre for High-Throughput Biology (CHiBi), 177 Michael Smith appreciated that technical artifacts such as ‘batch Laboratories, 2185 East Mall, Vancouver, Canada, BC V6T1Z4. 17–20 E-mail: [email protected] effects’ can result in substantial variability. In Received 29 June 2011; revised 19 October 2011; accepted 21 addition, post-mortem brain tissue is a limited November 2011; published online 3 January 2012 resource, leading to small sample sizes with low Expression profiling of schizophrenia using large cohort M Mistry et al 216 statistical power. For this reason, most studies have false discovery rate. We show that we are able to not applied multiple test correction, and perform detect small yet consistent and statistically significant validation only on the same RNA samples that were changes. Careful control of extraneous factors using used for profiling. All of these issues are likely to probe-specific statistical modelling, results in gene contribute to the differences in findings across expression changes associated with the disease effect. studies. We propose that a good way to address these Our results confirm some previously reported expres- problems is to re-analyze and meta-analyze the sion changes in schizophrenia in addition to identify- studies in question, a task we undertake in this paper. ing potential new targets suggesting alterations in The use of meta-analyses to combine high-through- synaptic function. put genomics studies has become increasingly used in neuropsychiatry.14,17,21–23 Combining data sets across Materials and methods studies increases power and facilitates the identifica- tion of gene expression changes that are consistent Data pre-processing and quality control and reliable, reducing false positives. In a meta- Genome-wide expression data sets were selected on analysis, multiple studies are statistically pooled to the basis of microarray platform, use of prefrontal provide an overall estimate of significance of an cortex (BA 9, 10 or 46), the availability of information effect, highlighting important yet subtle variations. on covariates such as age, and finally the availability Although meta-analysis has been used in the study of of the raw data. Each data set comprises a cohort of gene expression data,24–26 to our knowledge only a neuropathologically normal subjects and a cohort few studies have done so with post-mortem human of schizophrenia subjects, as diagnosed and reported brain data.17,22,27,28 A cross-study analysis of psycho- in their respective studies (Table 1). Sources for sis was conducted across seven data sets using data include the SMRI, the Harvard Brain Bank samples from the Stanley Medical Research Institute and the Gene Expression Omnibus (GEO). GEO (SMRI) post-mortem brain collections.22 Additionally, studies were identified by extensive manual and the SMRI report results from a cross-study analysis keyword searches. Although the SMRI has additional across schizophrenia data sets in their online geno- data sets, these represent repeated runs of the samples mics database (http://www.stanleygenomics.org), from the same subjects, so we selected one data set to computing ‘consensus’ fold changes while adjusting represent each of the two SMRI brain collections. Two for confounding variables. However, the studies used additional studies were obtained from the investiga- in these analyses use samples from the same two tors.30,31 Data sets consisted of single-channel inten- brain collections and are therefore not entirely inde- sity data generated from two Affymetrix (Santa Clara, pendent. More recently, a comparative analysis was CA, USA) platforms, but only probe sets on the HG- conducted across two independent schizophrenia U133A chip from each data set were used for analysis. cohorts; probes were identified as differentially Probe sets were re-annotated at the sequence level by expressed within each study and the intersecting alignment to the hg19 genome assembly, using meth- probes between the two studies were reported.29 ods essentially as described in Barnes et al.20 and also Thus, while there have been attempts to meta-analyze cross-referenced with problematic probe lists provided schizophrenia expression profiling data, there has not by http://masker.nci.nih.gov/ev/. The raw data (‘CEL’) yet been an integration using the primary data of more files from all the data sets were pooled together and than two independent microarray studies. expression levels were summarized, log transformed In this study, we present a cross-study analysis and normalized by using the R Bioconductor ‘affy’ of seven microarray data sets comprising a total of package (R Development Core Team, 2005) using 153 schizophrenia samples and 153 normal controls. default settings for the Robust Multi-array Average We applied a linear modelling approach to control algorithm. Data were also processed using four other for factors such as age, brain pH and batch effects, pre-processing methods for evaluating the robustness and applied multiple testing corrections to control the our meta-signatures (see Supplementary Text). We Table 1 Schizophrenia data sets Data set Reference Microarray platform Brain region(s) No. of subjects CTL:SZ Stanley Bahn SMRI database HG-U133A Frontal