Supporting Information
Total Page:16
File Type:pdf, Size:1020Kb
Supporting Information Xu et al. 10.1073/pnas.0908584106 SI Text (Fig. S1). Of the 15 families with a history of SCZ in a Supporting Information. While most of the genes in Table S3 have second-degree relative, 11 are avuncular pairs, three are grand- not been directly associated with schizophrenia (SCZ) in the parent-grandchild pairs, and one family is both. For our linkage literature, some of them have been implicated in psychiatric and studies, we genotyped 479 subjects from 130 families. Sixty-nine other neurobehavioral disorders or in mechanisms that may families are informative. In the 54 informative families with at underlie predisposition to such disorders. Two examples (NRG3 least two affected members, there are 60 and 79 affected relative and RAPGEF2) are discussed in the main text, but a few pairs for the narrow and broad affection categories, respectively. additional examples are provided below: Of the broadly affected relative pairs, 30 are sibling, 24 are parent-child, 14 are avuncular, six are cousin, three are second The Contactin Family. Burbach and van der Zwaag (1) recently cousin, and two are grandparent-grandchild. provided a detailed account of the contactin system and its All subjects were confirmed as Afrikaners by tracing ancestry potential contribution to a number of different psychiatric and back to the 1800s using state and church records and finally to neurobehavioral disorders, including SCZ and autism. In an the initial 2,000 founders through the Genealogies of Old South early study, Fernandez et al. (2, 3) showed that disruption of African Families. Diagnostic evaluations were conducted in- contactin 4 (CNTN4) results in developmental delay and other person by specially trained clinicians using the DIGS, which was features of the 3p deletion syndrome. In the present study, we translated and back-translated into Afrikaans. Upon completion identified a CNV at Ϸ130 kb downstream of CNTN6 (also named of the DIGS, the interviewers assigned DSM-IV diagnoses and NB3), another member of the contactin protein family. Inter- completed a detailed narrative report. All narratives and DIGS estingly, in an independent study, we identified another CNV were independently evaluated by appropriately trained clinicians disrupting the contactin associated protein 2 (CNTNAP2) that in New York. co-segregates with psychosis in one family. Structural variants in the CNTNAP2 gene have been reported in patients with psy- Sample Characteristics. The familial cases sample consisted of 31 chosis in at least two previous genome-wide scans (4, 5). Thus, males (65%) and 17 females (35%) and both their biological our results support the notion that abnormalities in the contactin parents. The sample enriched in sporadic cases consisted of 108 family proteins may be involved in SCZ and other psychiatric males (71%) and 44 females (29%) and both their unaffected conditions. biological parents. The control sample consisted of 93 females (58%) and 66 males (42%) and both their biological parents. It The CSMD1 (CUB and Sushi Multiple Domains 1) Gene. One CNV should be noted that both in this study and our previous Xu et identified in the present study is a duplication at 8p23.1–8p23.2 al. (11) study we did not find any sex-specific differences in involving the CSMD1 gene. A recent study (6) showed that a distribution of CNVs. We checked for population stratification transmitted duplication at 8p23.1–8p23.2 affecting the CSMD1 between cases and controls using principal component analysis gene may be associated with speech delay, autism, and learning as implemented in EIGENSTRAT (12). Both groups overlap in difficulties. the graphical representation of the top two principal compo- The ADARB1 (adenosine deaminase, RNA-specific, B1) gene: nents (eigenvectors), indicating no evidence of stratification. ADARB1 (also named ADAR2) encodes for adenosine deami- The phenotypic variables analyzed in the CNV scan, were nase, a key RNA editing enzyme. Although ADARB1 has not defined as follows: History of developmental delay was recorded been directly associated with psychiatric conditions, mRNAs for as positive when there was clear history of maturation lag or several important neuronal receptors such as 5-HT2CR and milestone delays before the age of 6 (i.e., delayed crawling, GLUR2 undergo posttranscriptional editing. Abnormal editing of HT2CR has been implicated in depression and SCZ (7). walking, speaking, etc.). History of learning disabilities was Other potentially interesting genes include the RXFP2 (relax- recorded as positive if there had been a diagnosis of a learning in/insulin-like family peptide receptor 2) gene (also known as disability, or clear history of being a ‘‘slow learner,’’ requiring LGR8), a member of the relaxin family peptide receptor system remediation at school, or placement in a special class. Age at whose members are distributed in both excitatory and inhibitory onset was defined as the age at which full DSM criteria for neurons in the brain and modulate diverse behaviors (8), as well schizophrenia or schizoaffective disorder were met. as the LRFN5 (leucine-rich repeat and fibronectin type III domain containing 5) gene (also named SALM5), which is Genotyping. For the linkage analysis, genotyping for 2005 di-, tri-, expressed exclusively in neuronal tissues, especially in mature and tetra-nucleotide repeat microsatellite markers was per- neurons. Several members of the LRFN protein family were formed by deCODE (Reykjavik, Iceland), through their fee-for- shown to associate with PSD-95 (9, 10). service genotyping facility. Among them, 1,904 autosomal mark- ers and 76 chromosome X markers passed our internal quality SI Methods checks. Of the 960,395 possible genotypes, 878,046 were suc- Patient Cohorts. We performed a genome-wide survey of rare cessfully called, corresponding to an average (ϮSD) per marker inherited CNVs in a total of 182 individuals, consisting of 48 genotyping rate of 91% (Ϯ7%). probands with familial SCZ (positive disease history in a first- For the CNV analysis, the families were genotyped using degree (n ϭ 33) or second-degree (n ϭ 15) relative and both of Human Genome-Wide SNP Array 5.0 (Affymetrix), which con- their biological parents, as well as all additional affected relatives tains 500,568 SNPs, as well as 420,000 additional nonpolymor- that were available for genotyping. Of the 33 families with a phic probes that can assess other genetic differences, such as CN history of SCZ in a first-degree relative, 17 are parent-child variation. Samples were processed as previously described (11). pairs, 12 are sibling pairs, and four families are both. In eight of Average call rate on arrays used in this study was 99.43%. All these 33 families, in addition to the first-degree relative, there is microarray experiments were performed in the Vanderbilt Mi- at least one second-degree relative who is also affected with SCZ croarray Shared Resource directed by Dr. Shawn Levy. Xu et al. www.pnas.org/cgi/content/short/0908584106 1of15 CNV Identification, Inherited CNV Detection, and Inherited CNV Ver- in length. The borders listed are the outermost borders defined ification. CEL files from all chips were analyzed using two by either analysis. software packages, DNA-Chip Analyzer (dChip) and Partek (iii) Rare inherited CNV detection: The CNV list generated Genomics Suite (Partek software, version 6.3 Beta, build above was first divided into two lists, a deletion list (mean CN Ͻ #6.07.1127; Partek), as previously described (11). The entire 1.25 for autosomal or mean CN Ͻ 0.25 for chromosome X) and procedure included five steps: (i) quality control; (ii) CNV a duplication list (mean CN Ͼ 2.75 for autosomal or mean CN Ͼ identification; (iii) inherited CNV detection; (iv) inherited CNV 1.75 for chromosome X). Deletion and duplication analyses were verification by intensity and genotyping filters; and (v) inherited conducted separately. A candidate inherited CNV was consid- CNV confirmation by MLPA. ered for further analysis only if there was more than 50% overlap (i) Quality control: A quality control step was conducted to in size with a variant at the same locus in the biological parental exclude families with one or more hybridization failures. chromosomes (size of overlap/total size of merged CNVs Ͼ 0.5), (ii) CNV identification: All CEL files that passed the quality but less than 50% overlap in size with any variant in the other control step were imported and analyzed by dCHIP (13–15) and parental chromosomes. Partek according to the authors’ instructions. For the dCHIP (iv) Inherited CNV verification using genotype information analysis, data were processed in batches, each batch containing and median smoothing inferred CN: The vast majority of CNVs arrays processed simultaneously in 96-well plate formats. An are inherited from parents (16) and since the false-positive or array with median overall intensity within each batch was chosen false-negative prediction rate is relatively high in all available as the baseline to adjust the brightness of all arrays to a CNV detection algorithms (17), it is important to use indepen- comparable level. Arrays were then normalized at probe inten- dent CNV inferring methods, as well as family information for sity level using the Invariant Set Normalization procedure (14). validation of inherited CNV calling results. Genotyping calls and Following normalization, a CNV identification step ensued. A Median smoothing inferred CN calls for all samples were stored reference signal value was first calculated for each probe set in MySQL database (version 5.1) along with the related pedigree using a model-based method. To obtain reference signal distri- information. For each identified candidate CNV, all genotypes bution (blind to the affected status information), we assumed and inferred CN of SNPs within the region were retrieved for that for any locus, less than 10% of all of the samples show each triad (the subject and his/her parents).