Kazima Bulayeva • Oleg Bulayev • Stephen Glatt

Genomic Architecture of Schizophrenia Across Diverse Genetic Isolates
A Study of Dagestan Populations

Kazima Bulayeva
Russian Academy of Sciences
Moscow, Russia

Oleg Bulayev
Russian Academy of Sciences
Moscow, Russia

Stephen Glatt
Department of Psychiatry and Behavioral Sciences
SUNY Upstate Medical University
Syracuse, New York, USA

ISBN 978-3-319-31962-9 ISBN 978-3-319-31964-3 (eBook) DOI 10.1007/978-3-319-31964-3

Library of Congress Control Number: 2016939944

© Springer International Publishing Switzerland 2016 This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made.

Printed on acid-free paper

This Springer imprint is published by Springer Nature.
The registered company is Springer International Publishing AG Switzerland.

One of the remote highland genetic isolates where we conducted our study

Foreword

Psychiatric disorders are among the world’s most complex and least understood ailments; yet, relative to other medical disorders, they cause a disproportionate share of suffering. The authors of this volume, Dr. Kazima Bulayeva and her son, Dr. Oleg Bulayev, have dedicated not just their scientific careers, but their lives, to combatting these disorders by increasing our understanding of their causes. Their work is the true embodiment of genetic epidemiology, defined by Newton Morton as “a science which deals with the etiology, distribution, and control of disease in groups of relatives and with inherited causes of disease in populations.”¹ This volume represents a fitting compilation of decades of labor, a true labor of love that has persevered despite many obstacles, from the financial to the political, to the physical challenges of navigating the mountainous terrain of the Caucasus, to the social and cultural difficulties of developing and nurturing deep interpersonal relationships with the indigenous people in the highlands of Dagestan. Through this effort, Drs. Bulayeva and Bulayev have been able to develop a truly unique relationship with their subjects and a matchless, singular research program.

In this monograph, you, the reader, will be immersed in the scientific workflow of Drs. Bulayeva and Bulayev, and you will feel the intensity of their painstaking effort to dissect the genetic underpinnings of mental disorders that aggregate in distinct ethnic isolates. Although technical advances in the field have ushered in an era of personal and precision medicine, Drs. Bulayeva and Bulayev show us the power and the promise of careful ascertainment, rich clinical characterization, and traditional family-based genetic analysis methods for discovering genomic loci that may confer risk for mental disorders. This work is a testimonial, a tutorial even, on the rewards to be reaped through the careful application of fundamental methods of genetic epidemiology and of sound science.

¹ Morton, N. E. (1982). Outline of genetic epidemiology. New York: Karger. ISBN 3-8055-2269-X.

Syracuse, NY, USA
Stephen J. Glatt
February 10, 2016

Preface

The study of the genetics of complex diseases is one of the main priorities of modern genetics, as these diseases are the leading cause of premature death and disability. Mental diseases such as schizophrenia, depression, and bipolar disorders are among the most severe complex diseases for both patients and society. Currently, the prevalence of these mental diseases is increasing in most countries. Thousands of scientists around the world study the genetic and environmental factors that influence the development of these diseases; so far, however, this area of medical genetics is full of conflicting results obtained by different researchers from different countries.

We have proposed a cross-isolate population approach, implemented in ethnically and demographically subdivided genetic isolates with aggregation of specific complex diseases. Unique genetic isolates with aggregation of certain complex diseases, including schizophrenia, were ascertained in our long-term population-genetic studies of small indigenous ethnic groups of Dagestan (Northern Caucasus, Russia). This cross-isolate approach allows us to identify genomic regions, both those common to all observed isolates and those specific to each of them, that contain candidate genes for disease and structural genomic variants (CNV and ROH) linked with schizophrenia. The study of the same complex phenotype in diverse genetic isolates with different ancestors and high rates of endogamy and inbreeding enables the determination of the entire spectrum of genes and structural genomic variants involved in the pathogenesis of schizophrenia or any other complex disease. Genetic homogeneity and the ancestor effect in such isolates help identify the genomic mechanisms of disease etiopathogenesis with substantial savings of cost and time compared with large, genetically and ethnically heterogeneous populations.

The authors cherish the memory of the prominent geneticists Timofeev-Resovskii, Dubinin, and Gindilis, whose support of our pioneering genetic studies among the indigenous peoples of Dagestan was of fundamental importance for the development of the work presented here. We also express our sincere gratitude to the staff of our research group on human genetic adaptation at the N. I. Vavilov Institute of General Genetics of RAS (VIGG RAS) and to the members of the regular expeditions in Dagestan: Pavlova, Gurgenova, Kurbanov, Guseynova, and Omarova. We are grateful to Politov, Kurbatova, and all the researchers of the Department of Population Genetics of VIGG RAS for reviewing our long-term study results and recommending their publication in this book. We are also grateful to the reviewer of the book manuscript, Professor Golimbet, whose valuable comments helped to improve the presentation of our study.

We owe endless gratitude to our foreign colleagues, whose appreciation and support of our Dagestan Genetic Heritage research program was of fundamental importance for the preservation and development of this study, in spite of the numerous difficulties in Russian science. They are outstanding scientists from the United States (Irving Gottesman, Paul Thompson and the international scientific consortium ENIGMA, Ming Tsuang, Hilary Coon, Henry Harpending, Lynn Jorde, Michael Hammer, and Tatiana Karafet); from Italy (Giorgio Paoli and Sergio Tofanelli); from Germany (Klaus-Peter Lesch); and from Japan (Toru Takumi and Hideshi Kawakami). Our internships and joint work in their laboratories helped us to master the most advanced methods of molecular genetics and bioinformatics and to apply them in our studies of Dagestan genetic isolates.

We owe endless thanks to Prof. Stephen Glatt, our coauthor in genetic studies of schizophrenia in Dagestan isolates and the scientific editor of this book. His participation in this study helped to overcome the differences in clinical and genetic methodologies between Russian and US research and made the results of our research available to a wide range of English-speaking colleagues and readers. We also owe endless thanks to all the Dagestan highlanders from diverse ethnic groups for their voluntary participation in our long-term study.

The authors are grateful to Gurgenova for help in preparing the manuscript for publication and to Marisa Peryer for the English translations.

Moscow, Russia: Kazima Bulayeva
Moscow, Russia: Oleg Bulayev
Syracuse, NY, USA: Stephen Glatt

Contents

1 Current Problems of Complex Disease Genes Mapping
  1.1 General Problems of Complex Disease Genes Mapping
  1.2 Current Approaches of Schizophrenia Spectrum Disease Mapping
  1.3 The Current State of Gene Mapping of Schizophrenia Spectrum Disorders
  References

2 Descriptions and Methods of Study in Selected Genetic Isolates of Dagestan
  2.1 History and Ethno-linguistic Diversity of Dagestan
  2.2 Genetic and Demographic Structure of the Selected Isolates
  2.3 Methods of Clinical Studies
  2.4 Molecular-Genetic Methods of Study
  2.5 Genetic and Statistical Methods of Experimental Data Analysis
  References

3 Selection of Populations for Mapping Genes of Complex Diseases
  3.1 Principles of Selection of Populations for Complex Disease Gene Mapping
  3.2 Ethnogenomic Structure of Dagestan Populations
  3.3 Genetic Epidemiology Study of Selected Genetic Isolates with the Aggregation of Schizophrenia Spectrum Disorders
  3.4 Gene Pool of Selected Isolates for Mapping Genes of Schizophrenia
  3.5 Role of Inbreeding in the Aggregation of a Schizophrenia and in Its Age of Onset
  References

4 Mapping Genes of Schizophrenia in Selected Dagestan Isolates
  4.1 Haplotype Analysis in Pedigrees Ascertained in the Isolates
  4.2 Genome-Wide Nonparametric Linkage Analysis of Schizophrenia in Selected Isolates
  4.3 Genome-Wide Parametric Linkage Analysis of Schizophrenia in Selected Isolates
  4.4 Cross-Population Analysis of Genome-Wide Linkages Scan for Schizophrenia in Selected Isolates
  References

5 Common Structural Genomic Variants in Linked with SCZ Regions
  5.1 Copy Number Variations and Runs of Homozygosity Analyses in Linked with SCZ Genomic Regions in Pedigrees of Selected Isolates
  5.2 Effect of Inbreeding on CNV and ROH Segments Sizes (Kb) and on Marker Numbers
  5.3 Cross-Isolate Study of Structural Genomic Variations in Linked with Schizophrenia Regions
  References

Conclusions

Appendix: List of Genome-Wide Scanned Loci and Markers in Studied Dagestan Genetic Isolates (Weber/CHLC 9.0 Markers)

List of Figures

Fig. 1.1 The relative risk (RR) of schizophrenia during lifetime (lifetime risk of developing schizophrenia), based on the degree of genetic affinity (Gottesman 1991). In the general population, the RR of developing schizophrenia is 1 %. In groups of relatives, the RR increases significantly with closer family ties. SCZ schizophrenia

Fig. 2.1 Women’s hats from representatives of different ethnic groups in Dagestan (Gadzhieva 1961)

Fig. 3.1 Haplotype of the founder of an isolate, carrying the mutations that define the disease, transmitted to descendants out of the total proto-population. The haplotype block with pathogenic loci shrinks through meiosis and recombination: over the generations of the population's demographic history, its members accumulated greater numbers of recombinations in meiosis. Only a short segment of the ancestral haplotype with pathogenic loci is maintained in the 50th generation of the population

Fig. 3.2 PCA plot of 250 K autosomal SNPs of 56 populations from Dagestan, the Caucasus, the Near East, Europe, Central Asia, and South Asia. The ‘drop one in’ procedure was used for analysis. PC1 and PC2 coordinates for each population were calculated as median coordinate values for individuals within populations. This revealed relatively distinct clusters of Europeans, South Asians, and Central Asians, while Dagestani samples (except Nogais and Mountain Jews, who immigrated to the Dagestan region about 700 years ago according to historical data) intermingle with other Caucasus individuals and show an affinity with European and Near Eastern samples. Dagestan-ND, ethnic groups belonging to the Dagestan and Nakh language family. Dagestan-non-ND, ethnic groups of Dagestan (from Karafet et al. 2016)

Fig. 3.3 Distribution of Dagestani ethnic groups (Kumyks, Dargins, and Laks) with 25 racial and ethnic groups worldwide (527 people) in the space of three (PC1–PC3) principal components. Every examined individual is designated by a point, the color of which reflects ethnicity. The circle indicates Dagestani ethnic groups. The Kumyks, Dargins, and Laks show clear ethnogenic proximity to the European group

Fig. 3.4 Distance from the African centroid and distribution of ethnic populations studied within the size variance alleles of STR loci

Fig. 3.5 Network (median joining) built on the basis of haplotypes of 20 STR loci of the Y-chromosome. Nuclear haplotypes of a certain number of examinees from different ethnic groups are highlighted. Pink Kubachins; red Avars; yellow Chechens-Akkin; green Tabasarans; blue Laks (Caciagli et al. 2009)

Fig. 3.6 Multivariate analysis of Y-STR major haplogroups’ frequency in major populations of Dagestan, the Caucasus, and the Middle East. Geographical regions are indicated by the following symbols: filled squares Dagestan; filled circles Caucasus; filled triangles West Asia. Gray color means haplogroups. Legend: MJ Mountain Jews; TAT Tats; Lk Laks; Avr Avars; Kbc Kubachins; Tbs Tabasarans; Drg Dargins; Lzg Lezgins; Rtl Rutuls; Abz Abazins; Abk Abkhaz; Arm Armenians; AZB_NT Azerbaijanians-North Talysh; Che Chechens; Geo Georgians; Ins Ingush; Kbd Kabardians; Krd Kurds; Ir_Teh Teheran Iranians; Ir_Isf Isfahan Iranians; Ir Iranians; Ir_Arb Iranians-Arabs; Ir_Gil Iranians-Gilaks; Ir_Bak Iranians-Bakhtiard; Ir_Maz Iranians-Mazandarinians; Ir_ST Iranians-South Talysh; Jor Jordanians; Trk Turkish; Yem Yemenis

Fig. 3.7 Multidimensional scaling of the HVS-I sequence matrix (haplotype frequencies) demonstrating the genetic relationships among the ethnic populations of the Caucasus, West Asia, and Central Asia. Filled squares Dagestan; filled circles Caucasus; filled triangles West Asia; open circles Central Asia (Uzbk Uzbeks; Trkm Turkmens; Kazh Kazakhs). For symbols of other ethnic groups, see Fig. 3.5

Fig. 3.8 Contour map showing the distribution of J1 and J*(xJ2) haplogroups in ethnic groups of the Caucasus, Middle East, Central Asia, and North Africa, professing Islam

Fig. 3.9 Distributions built based on genetic distances between Dagestan groups (Avars, Dargins, Kumyks, Lezgins, Kubachins) and other worldwide groups by the hypervariable locus HVS1 of mtDNA

Fig. 3.10 Pedigree branches of genetic isolates DGH064 (a), DGH005 (b), DGH022 (c), and DGH011 (d). Legend: P/SCZ possibly with schizophrenia; SCZ schizophrenia and related spectrum disorders

Fig. 3.11 Distribution of sizes of alleles of 21 STR loci in groups of descendants from outbred (1), endogamous (2), and inbred (3) marriages. X-axis: the size of alleles of studied loci; Y-axis: frequency of their occurrence in these groups of descendants, %

Fig. 3.12 The distribution of alleles of D17S784 in the groups of descendants of exogamous (inter-population and inter-ethnic) (1) and consanguineous (2) marriages

Fig. 3.13 A comparative analysis of the level of heterozygosity and allelic rank distribution of 28 microsatellites between 3 studied ethnic groups in Dagestan and the global summary of the John Weber lab. HWEB, level of heterozygosity in the combined sample from the John Weber lab; HLAKS, HDARG, and HTIND, levels of heterozygosity in the examined groups of Laks, Dargins, and Tindals

Fig. 3.14 Distribution of the level of heterozygosity per locus H (a) and inbreeding level F (b) in healthy subjects (1) and patients (2). SD standard deviation; SE standard error; X average value

Fig. 3.15 Genealogy fragment of a primary isolate with a high frequency of cousin marriages and aggregation of paranoid schizophrenia

Fig. 3.16 The frequencies of the descendants of outbred and inbred marriages in groups of healthy subjects (N) and schizophrenia spectrum disorders patients (SCZ). Differences in the distribution groups are valid: χ² = 10.9, df = 1, p = 0.00096, Rs = −0.498, t = 3.721, p = 0.00058

Fig. 3.17 The distribution of age at onset of schizophrenia in groups of descendants of the different types of marriage

Fig. 3.18 Multivariate genetic analysis of patient groups with different age of onset within two main components II and I

Fig. 4.1 Haplotypes of chromosome 22 in the genealogy fragment DGH005. The sequence of chromosome loci: D22S420, D22S345, D22S689, D22S685, D22S683, D22S445

Fig. 4.2 Haplotypes of chromosome 17 in the genealogy fragment DGH005. The sequence of loci: D17S1308, D17S1298, D17S974, D17S1303, D17S947, D17S2196, D17S1294

Fig. 4.3 Haplotype of chromosome 22 in the fragment of genealogy DGH064. Loci sequence (see Fig. 3.18)

Fig. 4.4 Haplotypes of chromosome 17 in the fragment of genealogy DGH064. STR loci (see Fig. 4.2)

Fig. 4.5 Haplotypes of chromosome 17 in the genealogy fragment DGH022. The letter “Y” marks the genealogy members with genome-wide scanned microsatellites

Fig. 4.6 Haplotypes of chromosome 22 in the fragment of genealogy DGH011. For the sequence of loci, see Fig. 3.18

Fig. 4.7 Haplotypes of chromosome 6 in genealogy fragment DGH064. The sequence of loci: D6S1959, D6S2439, D6S2427

Fig. 4.8 Common genomic region 6p21.2–p22.3 that we found to be linked with schizophrenia (see Tables 4.2 and 4.3) in the genetic isolate pedigrees studied. Overall LOD = 5.3, α = 1

Fig. 4.9 Genomic region 10p11.23-p11.21 in isolate DGH011 and the 10q26.12-q26.13 region in isolates DGH005 and DGH022 were linked with schizophrenia, with dominant inheritance of disease loci. In isolate DGH034, we found the 10q12-q26.13 region to inherit disease loci recessively. LODs varied from 1.96 to 2.7 (see Table 3.12). Overall for 2 isolates (DGH005 + DGH022), LOD = 5.3, α = 1

Fig. 4.10 Genomic regions at 11p14.3-p13 in isolate DGH005 and at 11q23.1-q24.3 in isolates DGH034 and DGH011 linked with schizophrenia candidate genes. LODs varied from 2.1 to 2.7 (see Table 3.12)

Fig. 4.11 Genomic region 18p11.31 in isolates DGH022/DGH034 and at 18q12.1-q12.3 in isolate DGH011 linked with schizophrenia candidate genes. LODs varied from 1.5 to 3.0 (see Table 3.12)

Fig. 4.12 In all studied isolates, genomic region 17p11-q12 linked with schizophrenia spectrum disorders (DGH022/DGH005/DGH034/DGH011). LOD = 3.7 R/M, 2.5 R/M, 1.7 R/M, and 2.98 D/M, respectively (see Tables 4.2 and 4.3)

Fig. 4.13 In all studied isolates, genomic region 22q11.1-q12.3 linked with schizophrenia spectrum disorders (DGH005/DGH034/DGH011). LOD = 3.2 D/M, 4.4 D/M, and 3.1 D/M, respectively (see Tables 4.2 and 4.3)

Fig. 4.14 Differences between primary (PI) and secondary (SI) isolates in the % of meioses in which we obtained significant linkages for SCZ, as well as in the rates of recombination events in the isolates' pedigrees. Differences between the isolates are statistically significant (t = 2.3–7.6; p < 0.05–0.000)

Fig. 5.1 Schematic representation of the SNP (a) and CNV (b): deletions and duplications in a chromosome

Fig. 5.2 Genome-wide length sizes of ROH (a) and CN (b) among affected (SCZ) and healthy (N) pedigree members summarized from studied isolates. Star marked chromosomes with reliable in the isolates and with higher LOD values obtained in our linkage analyses

Fig. 5.3 Variation (in %) by chromosomes of CN gains and losses among observed SCZ patients: the differences between groups are statistically significant, χ² = 32.385, df = 19, p = 0.02833

Fig. 5.4 CNV in the 1q21.1 and 15q11.2 regions previously reported (Stefansson et al. 2008) (A1, B1) and in SCZ-affected subjects from Dagestan genetic isolates (A2, B2). Segments with CNV and ROH were obtained in the same regions: in 1q21.1, in the genomes of 20 affected cases, we found segments in 13 genomes with ROH, 7 with deletions, and 3 with gains; in 15q11.2 we found segments with 7 ROH, 5 deletions, and 6 gains

Fig. 5.5 The association between the levels of inbreeding and the average size of segments and the number of markers in patients with homozygous CNV (deletion = 0, duplication = 4)

Fig. 5.6 The association between inbreeding coefficient and CNV: the frequency of homozygous variations of the number of copies is greater with higher inbreeding coefficients

Fig. 5.7 CNV found in gene CRIM1 in the SCZ-linked region 2p22.3-p21 in isolate DGH011 (LOD = 3.1)

Fig. 5.8 CNVs established in 3 isolates with the schizophrenia-linked 6p21-p22 region with highly reliable LOD = 4.3 (DGH034), 2.92 (DGH022), and 2.3 (DGH011). The linked region contains candidate genes for schizophrenia: NOTCH4, HLA-DRB1, TNXB, HLA-DRB1, TAP2, etc. Genes localized in the linked region had deletions in eight patients and duplications in three patients

Fig. 5.9 Segments with copy number variations in the schizophrenia-linked 8p23 region. Five patients have segments with deletions and six patients have duplications. We found no ROH segment linked with the SCZ region

Fig. 5.10 Recurrent CNVs found in five patients with a common ancestor within the linked 8p23 region

Fig. 5.11 CNV (a, 10 patients) and ROH (b, 7 patients) ‘hot spots’ obtained among SCZ cases in gene ELAVL2 (9p21.3) confirm the genomic instability reported on the DGV site

Fig. 5.12 CNV “hot spot” in the linked 17p11-p12 region and in 17q21.31

Fig. 5.13 Duplication in the CECR2 and SLC25A18 genes found in 7 SCZ patient genomes in isolates DGH005 and DGH034 at 22q11.2-q12.1. Within the second linkage peak, at 22q12.3 in isolates DGH034 and DGH011, we obtained six duplications and five deletions in genes CACNG2, PVA2B, and IFT27, as well as deletions in the genomes of six patients and duplications in three patients in gene LARGE

Fig. 5.14 Summary of the genome-wide scanned significant linkages obtained in four genetic isolates (colored vertical lines) with CNV (del & gain) and ROH found in linked regions. Results on the X and Y chromosomes are not presented

List of Tables

Table 1.1 The study of genes involved in dopaminergic mechanism
Table 1.2 Genes involved in serotonin mechanism
Table 2.1 List of parameters studied in the survey of isolates residents
Table 2.2 Number of ethnic groups in Dagestan and mono-ethnic villages in them
Table 2.3 Dynamics of the national structure of the rural population of Dagestan (1926–1989)
Table 3.1 Complex disease gene mapping in genetic isolates of outbred populations: advantages and disadvantages
Table 3.2 Frequency of Y-haplogroups in the studied ethnic populations of Dagestan (Caciagli et al. 2009)
Table 3.3 Analysis of genetic differentiation in the male genome of Caucasus people (Y-chr.), grouped according to different classification criteria
Table 3.4 Structure of morbidity in a number of examined mountain Dagestan isolates
Table 3.5 Description of selected isolates for the study and reconstructed pedigrees
Table 3.6 Structure of morbidity among the members of the pedigrees of studied isolates
Table 3.7 Comparative analysis of the level of heterozygosity and allelic loci ranks of chromosomes 17 and 18 in 5 Dagestan ethnic groups, and summary data from the John Weber lab
Table 3.8 Hardy–Weinberg equilibrium distribution compliance of studied genomic loci of chromosome 17
Table 3.9 Assessment of genetic similarity of examined isolates by summary of genomic loci (Nei 1978)
Table 3.10 The average coefficient of inbreeding in the studied populations of indigenous people of Dagestan, calculated by marital structure in 3 (Fpop) and 12 (Fped) generations of ancestors of the same individuals
Table 3.11 Median test of inbreeding level distribution in groups of patients and healthy subjects
Table 3.12 Analysis of recombination haplotype of chromosome 22 in primary and secondary isolates
Table 3.13 Summary parameters of genetic heterogeneity of examined primary and secondary isolates
Table 4.1 Nonparametric linkage with schizophrenia spectrum disorders in the genealogy of 4 genetic isolates
Table 4.2 Parametric linkage analysis with schizophrenia spectrum disorders in pedigrees of 4 genetic isolates
Table 4.3 Cross-isolates analysis of the results of parametric analysis of genomic linkages with schizophrenia pedigrees from 4 ethnically divided genetic isolates
Table 4.4 Candidate genes localized in genomic regions, linked with schizophrenia spectrum disorders
Table 5.1 Whole genome (autosomes) levels of heterozygosity and statistics of genome-wide copy number variations (CNV) in patients examined from genetic isolates based on AFFX GTC evaluations of AFFX SNP 6.0 microarray data
Table 5.2 Common number of CNV duplications, deletions, and ROH obtained in affected SCZ cases in linked genomic regions
Table 5.3 De novo mutations detected in the studied patients from Dagestan isolates

Introduction

Complex, or multifactorial, diseases are controlled by many genetic and environmental factors. For example, genes play a role in the pathogenesis of cardiovascular diseases; however, the development of this pathology largely depends on lifestyle factors such as physical inactivity, smoking, and obesity. Complex diseases are extremely common, accounting for more than 90 % of human diseases, and are the leading cause of disability and premature death. The identification of susceptibility genes for complex diseases therefore has theoretical and practical importance, as it facilitates the development of effective methods for diagnosis and treatment.

Many complex chronic diseases show familial aggregation. These familial pathologies are usually caused by an unknown number of genes, do not correspond to the Mendelian model, and often involve interaction with various environmental factors. Families with such aggregations are observed for diseases such as hypertension, diabetes, obesity, various types of cancer, coronary heart disease, Alzheimer’s disease, and Parkinson’s disease. These families have exceptional value for genetic studies: affected members often have an earlier age of onset and a more severe form of the disease. Only 1–7 % of such patients have a mutant gene that is inherited according to Mendelian laws; however, the genetic mechanisms that cause the pathogenesis of most complex diseases are largely unknown (Scheuner et al. 2004).

Currently, about 50 genes of complex human diseases are mapped. Most complex diseases are characterized by major structural abnormalities within the genome, which facilitate the final stage of the gene search, the physical mapping of the gene. Such large chromosomal rearrangements are typical in complex diseases such as chronic lymphomatoid granulomatosis, Duchenne muscular dystrophy, blastoma, familial colon polyposis, and DiGeorge syndrome. The expansion of trinucleotide repeats is furthermore characteristic of numerous complex diseases, such as Martin–Bell syndrome, spinocerebellar ataxia, and Machado–Joseph disease (Terwilliger and Ott 1994; Illarioshkin et al. 1996).


The technologies and strategies used to detect genetic factors that influence the development of complex diseases have limitations (Scheuner et al. 2004). Genetic heterogeneity causes different genes to determine the development of similar clinical symptoms in clinically homogeneous patients. Factors such as changes in habitat and migration subdivide gene pools in populations containing mutant genes involved in the pathogenesis of complex diseases. The role of these factors in forming a specific gene pool within a local population is often difficult to determine and therefore these are referred to as random or stochastic. Onset and incidence of complex diseases, as well as the evolutionary features of specific populations, are affected when genetic and environmental factors interact (Wright 1965). Scanning the entire genome and analyzing individual gene candidates are two main strategies for identifying genes that cause complex diseases. Both of these strategies have advantages and disadvantages (Maguire et al. 2000). Modern methods of complex disease gene mapping enable the establishment of a DNA segment containing genes for such complex diseases in terms of reverse genetics— in the direction of phenotype to gene. The gene determinants of complex diseases, however, are unknown. Difficulties in mapping complex and most chronic diseases are attributed to their complexity. These diseases are characterized by genetic and often clinical heterogeneity. Genetic heterogeneity of complex diseases mainly is related with interactions between genes of multiple founders in genealogy reproduced in patients from heterogeneous (outbred) human populations. Addition- ally, ethnic and social heterogeneity within such outbred populations increases the effects of penetrance and phenocopies of complex diseases due to the diversity and dynamics of environmental factors. 
These interrelated factors make it difficult to obtain reliable results in mapping genes of complex diseases, and therefore, careful selection of populations to map these genes is required. However, the characteris- tics of gene pool of populations, where complex disease genes of interest are studied, are very seldom analyzed. When identifying genomic regions linked or associated with the studied mental illness, these difficulties often lead to results that are not always supported by researchers working with the same disease in other different human populations, which have a different gene pool and ethno-social environment (Bulayeva 1991; Kruglyak et al. 1995; Bulayeva et al. 1999, 2000, 2002, 2005; Kruglyak 1999; Jorde et al. 2000; Peltonen 2000). The completion of the Project and intensive studies of genomic variability in human population open new horizons for practical applications—for example, in forensic genetics. A study of population’s gene pool developed during its particular demographic history is important because such gene pool defines a certain genetic architecture of complex disease among its members. Population genomics studies the specificity in the interaction of allelic variants between a genomic population and environmental factors forming the complex clinical phe- notype (Jorde et al. 2000). Genetic isolates of indigenous ethnic groups provide exceptional opportunities for identifying genes of complex diseases. These communities are established by a small number of founders and have a stable total volume for hundreds of genera- tions in an unchanging environment. The Neel classification (Neel 1992) makes

distinctions between primary and secondary isolates. Primary isolates are populations with a demographically ancient history that live in their native environment and are characterized by a stable population size, endogamy, and inbreeding. Secondary isolates have a relatively young demographic history and are typically represented by religious sects or by migrants in relatively remote areas. The founder effect in an isolate leads to a high frequency of haplotype blocks carrying a pathogenic locus among the affected descendants of the founder in present-day generations; in other isolates, a different ancestor may have carried a haplotype block with another pathogenic locus. Gene mapping of complex diseases in such isolates is performed using linkage analysis and linkage disequilibrium (LD) analysis. Both rest on the idea that a high degree of genetic isolation produces genetic homogeneity and consequently increases the proportion of patients who inherited haplotype blocks with a pathogenic locus from a common ancestor. The growing number of such patients, and the homogeneity of the pathogenic loci they derive from the common ancestor of the isolate, are amplified by the endogamy and inbreeding typical of isolates. These processes create a gene pool with reduced genetic heterogeneity, which is reflected in reduced heterogeneity of pathogenic alleles. The founder effect, in combination with inbreeding, promotes the accumulation of a relatively small number of pathogenic loci or alleles in a particular isolate. Studying the same complex phenotype in separate isolates with different ancestors therefore makes it possible to determine the entire spectrum of genes involved in the pathogenesis of that phenotype (Bulayeva et al. 2007, 2011; Bulayev et al. 2008, 2009). 
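As a minimal sketch of the linkage disequilibrium measure mentioned above, the following Python snippet computes the standard coefficients D, |D′|, and r² for two biallelic loci from haplotype and allele frequencies. The function name and the example frequencies are hypothetical and are not taken from the Dagestan data; they only illustrate how a founder haplotype that stays intact in an isolate appears as strong LD.

```python
def linkage_disequilibrium(p_ab, p_a, p_b):
    """Return (D, |D'|, r^2) for two biallelic loci.

    p_ab -- frequency of the haplotype carrying allele A and allele B
    p_a  -- frequency of allele A at the first locus
    p_b  -- frequency of allele B at the second locus
    """
    # D is the excess of the observed haplotype frequency over the
    # frequency expected if the two loci were independent.
    d = p_ab - p_a * p_b

    # D' rescales D by its maximum possible magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r^2 is the squared correlation between the alleles at the two loci.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r_squared


# Hypothetical founder haplotype: allele B occurs only together with allele A.
d, d_prime, r2 = linkage_disequilibrium(p_ab=0.2, p_a=0.2, p_b=0.2)
print(f"D={d:.3f}  |D'|={d_prime:.3f}  r2={r2:.3f}")
```

When the haplotype frequency equals the product of the allele frequencies, all three values are zero, which corresponds to the absence of LD expected in an old outbred population.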
Such genetic processes in isolates enable cost-effective and time-effective identification of the loci involved in the pathogenesis of complex diseases, provided that the search is conducted with a unified genetic and clinical methodology in ethnically diverse isolates. About 70 % of the specificity of the gene pool of human populations is reported to be determined by ethnicity (Cavalli-Sforza and Bodmer 1971). New molecular technologies, such as genome-wide scans using hundreds of thousands of single nucleotide polymorphisms (SNPs), allow ethnic and genetic identity to be resolved in greater depth (Novembre et al. 2008; Xing et al. 2009). These studies show that genomic data place an individual’s origin within 350 km in 50 % of cases and within 700 km in 90 % of cases. An important conclusion from these studies is the confirmation of significant genomic subdivision between ethnic populations, which is extremely important for mapping genes of human diseases and for developing “personalized” medicine. Pioneering population genetic studies by Bulayeva et al. (1976–2011) among the indigenous ethnic groups of Dagestan and neighboring regions revealed remote highland genetic isolates of different ethnicities with predominant aggregation of certain complex diseases. Dagestan is one of the few areas in Russia where gene mapping of complex diseases is effective; it is characterized by unique ethnic diversity, the antiquity of its indigenous peoples, and preserved primary isolates (Bulayeva 1991). In these studies, a group of scientists from the Institute of General Genetics of RAS (1976–

now), under the supervision of Bulayeva, studied the population and genetic structure of Dagestan’s population. These studies show that an isolate in Dagestan is a single village, called an “aul.” The unique ethnic diversity and antiquity of the 26 indigenous peoples are reflected in the features of their gene pools and in their disease patterns. Dagestan is therefore one of the most effective regions in the world for gene mapping of complex diseases, including schizophrenia. In particular:
1. The region contains 26 of the 50 indigenous peoples of the Caucasus, most of whom have lived for more than 10,000 years in a stable mountain environment and are divided into many primary isolates. This ethnic division is associated with genetic division, which makes Dagestan a uniquely subdivided region, both ethnically and genetically, that is well suited to the study of many questions in human genetics.
2. The stable mountain and ethno-cultural habitat of the indigenous peoples of Dagestan is important for complex disease gene mapping because these conditions increase penetrance and reduce the frequency of phenocopies, two factors that otherwise make it difficult to identify susceptibility genes of complex diseases.
3. The population and genetic structure of this region has been thoroughly studied by a team of scientists at the Institute of General Genetics of RAS (Dubinin and Bulayeva 1982; Bulayeva 1991). This level of prior study contributes greatly to identifying the mechanisms behind intra- and interpopulation differences in the accumulation of specific complex diseases and their linkage with genomic markers.
4. Long reproductive isolation in harsh mountain conditions over hundreds of generations has produced a high level of genetic subdivision between the populations of the region and a low level of genetic diversity within them.
5. Genetic drift, including the founder effect, and marital isolation in the demographic history of these populations resulted in highly specific aggregation of certain complex diseases in some isolates compared with others. Accordingly, the pathogenic loci accumulated in one Dagestan isolate would differ from those in other isolates, where accumulation involves other loci inherited from different ancestors. These features of the demographic history of the isolates make it possible to establish the number and localization of the genes involved in pathogenesis.
6. By tradition among Dagestan ethnic groups, every family preserves information about roughly seven generations of ancestors. This provides a highly efficient way of collecting reliable data and, through probands, of reconstructing genealogies spanning 11–14 generations, which is important for finding genes that cause specific diseases.
7. Most genetic isolates of the Dagestan mountain peoples form a single extended pedigree with a limited number of common ancestors; this enables not only the identification of a sufficient number of patients with a homogeneous clinical phenotype but also the collection of high-quality clinical and genetic material.

Mental diseases are among the most severe complex diseases for both patients and society. Depression affects 5–17 % of the world’s population. The WHO

predicts that depression, in which 60 % of cases end in suicide, will be the leading cause of lost working capacity by 2020. Schizophrenia affects about 1 % of the world’s population (~60 million people), and the USA spends 65 billion dollars annually to treat it (APA 2007). The number of patients with schizophrenia is increasing in most countries, mainly because of improved patient identification and a broadening of the diagnostic criteria for schizophrenia. In Russia, the morbidity (183 per 100,000 population) and incidence (410 per 100,000 population) of schizophrenia are at average values. In the European Union, suicides per year exceed deaths from traffic accidents (58,000 versus 50,700); furthermore, its depression rates are the highest in the world, and the number of diagnoses grows every year. To address this issue, the European Commission adopted in 2006 the Green Paper “Improving the mental health of the population: towards a strategy on mental health for the European Union,” a document that proposes ways to improve the mental health of Europeans (http://ec.europa.eu/health/ph_determinants/life_style/mental/green_paper/mental_gp_en.pdf). Russia needs similar measures, especially because of its demographic problems and the ethnic subdivision of its population. The most important step toward solving this problem is to expand basic research that identifies the genetic and environmental factors in the development of mental disorders. Schizophrenia has been found to be genetically determined to a significantly higher degree (80 %) than depression (40–50 %) (Gottesman and Shields 1972; Fowler and Tsuang 1976; Lander and Schork 1994; Kennedy 1996). The heritability of both diseases points to the involvement of genes in their development. 
In this regard, the genetic-epidemiological studies in Dagestan focused on isolates with aggregation of schizophrenia (MIM 181500) and schizophrenia spectrum disorders (schizoaffective disorders, schizotypal personality disorders, and paranoid disorders) (Gottesman and Shields 1972; Gottesman and Hanson 2005). These diseases are characterized by disturbances in thinking, emotion, and social functioning (Babigian 1980; Andrews 1985; Allebeck 1989; Mortensen et al. 1999). Diagnostic methods in the isolates were uniform and met DSM-IV criteria; the final diagnoses were established by Dagestan psychiatrists in cooperation with US psychiatrists participating in the collaborative studies. An estimated 1 % of the world’s population will develop symptoms of schizophrenia within their lifetime (Kendler 1997). The symptoms and the risk of developing the disease have been shown to be the same in all racial and ethnic groups (Jablensky et al. 1992). Several studies identified genetic markers of schizophrenia spectrum disorders in a number of chromosomal regions: 2q11-q12, 8p22-p21, 6q, 13q14.1-q32, 17p, 5q21-q31, 10p15-p11, 1q21-q22, 18p, and 22q (McGuffin and Owen 1991; Gill et al. 1993; Cardno et al. 1996, 1999; Cloninger et al. 1998; Kaufmann et al. 1998; Levinson et al. 1998, 2002; Faraone et al. 1999; Brzustowicz et al. 2000). These results, however, are rarely reproduced by researchers working with different populations (Bray and Owen 2001; Prasad et al. 2002; Lewis et al. 2003). Problems with reproducing genetic linkage results are usually caused by differences in genotyping methods, genetic structures within populations

where experimental data were collected, and clinical diagnostic criteria between countries. Differences between the gene pools of the populations in which mapping data are collected, together with the clinical and genetic heterogeneity of the disease, can produce markedly different patterns of linkage evidence between populations. In particular, such differences can give rise to false-positive or false-negative linkages. Because of these problems, careful selection of populations for mapping genes of complex diseases is required (Wright et al. 1999; Bulayeva et al. 2000; Jorde et al. 2000; Peltonen 2000). Although rarely studied, ethnic and demographic differences between populations, and the associated differences in their gene pools, influence the genetic architecture of complex phenotypes (diseases).
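The differences between population gene pools discussed above are commonly quantified with Wright’s fixation index F_ST (Wright 1965). The sketch below assumes a single biallelic locus and equally weighted subpopulations and computes F_ST = (H_T − H_S) / H_T from allele frequencies. The function name and the example frequencies are hypothetical and do not come from the Dagestan isolates.

```python
def fst_biallelic(freqs):
    """Wright's F_ST for one biallelic locus.

    freqs -- allele frequency of the same allele in each subpopulation;
             subpopulations are weighted equally (a simplifying assumption).
    """
    if not freqs:
        raise ValueError("at least one subpopulation is required")
    n = len(freqs)
    p_bar = sum(freqs) / n

    # H_T: expected heterozygosity if all subpopulations formed one random-mating pool.
    h_t = 2 * p_bar * (1 - p_bar)
    # H_S: mean expected heterozygosity within subpopulations.
    h_s = sum(2 * p * (1 - p) for p in freqs) / n

    if h_t == 0:
        return 0.0  # the locus is monomorphic everywhere, so there is no subdivision to measure
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies in two isolates that drifted in opposite directions.
print(round(fst_biallelic([0.1, 0.9]), 2))
```

Values near 0 indicate that the subpopulations share one gene pool, and values near 1 indicate that they have drifted to fix different alleles, the pattern expected among long-isolated populations.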

In this regard, the main objective of our study was mapping genes of the same complex disease, diagnosed and genotyped by standardized methods, in ethnically and demographically subdivided genetic isolates.

The study, therefore, hypothesized that in genetic isolates with distinct demographics and ethnicities, diagnosing and genotyping complex diseases using unified methods will reveal the mechanisms that determine intra- and interpopulation spectra of genomic linkages involved in the pathogenesis of the studied disease. Linkage analysis was carried out using 400 genome-wide microsatellites scanned in members of four genetic lineages from different isolates of the indigenous peoples of Dagestan, where our long-term expeditionary studies revealed a high prevalence of schizophrenia and related spectrum disorders. A more detailed study of the linked regions was performed to determine the presence of microdeletions and microduplications, using a genome scan of each subject on Affymetrix 500K SNP arrays (versions 5 and 6). Several recent publications show the importance of rare deletions and duplications of gene copies within a genome in the pathogenesis of schizophrenia. Compared to the control group, patients with schizophrenia and schizoaffective disorder had large deletions and duplications of genomic sequences (copy number variations, CNVs) ranging from thousands to millions of nucleotides (Walsh et al. 2008; Kirov et al. 2008; Stefansson et al. 2008). Patients with childhood schizophrenia furthermore had twice as many genomic abnormalities as the healthy group. These data suggest that advancing technologies for analyzing single nucleotide polymorphisms (SNPs) and CNVs have yielded reproducible findings: the genomes of patients with schizophrenia carry a higher number of structural variants. Because of its complexity, complex disease mapping requires detailed analysis of the methodological aspects of genetic testing and of the mapping results obtained by different researchers. 
The next section provides a brief description of the status and methodological problems of complex disease gene mapping and the main results for schizophrenia spectrum disorders.
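Linkage evidence of the kind described above is conventionally summarized as a LOD score. As a simplified illustration, and assuming phase-known, fully informative meioses (a much simpler setting than the pedigree-based genome scans used in real studies), the two-point LOD score at recombination fraction θ can be sketched as follows. The function names and the counts in the usage example are hypothetical.

```python
import math


def lod_score(recombinants, meioses, theta):
    """Two-point LOD score for phase-known, fully informative meioses.

    LOD(theta) = log10[ theta^k * (1 - theta)^(n - k) / 0.5^n ],
    where k is the number of recombinants among n meioses.
    """
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")
    if not 0 < theta <= 0.5:
        raise ValueError("theta must lie in (0, 0.5]")
    non_recombinants = meioses - recombinants
    log_l_linked = (recombinants * math.log10(theta)
                    + non_recombinants * math.log10(1 - theta))
    log_l_unlinked = meioses * math.log10(0.5)
    return log_l_linked - log_l_unlinked


def max_lod(recombinants, meioses):
    """Scan theta = 0.01 ... 0.50 and return (best_theta, best_lod)."""
    thetas = [i / 100 for i in range(1, 51)]
    return max(((t, lod_score(recombinants, meioses, t)) for t in thetas),
               key=lambda pair: pair[1])


# Hypothetical marker: 1 recombinant observed in 10 informative meioses.
theta_hat, lod = max_lod(1, 10)
print(f"theta={theta_hat:.2f}  LOD={lod:.2f}")
```

By convention, a LOD score of 3 or more is taken as significant evidence of linkage, which is why a small number of informative meioses in a single family rarely suffices on its own.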


References

Allebeck, P. (1989). Schizophrenia: A life-shortening disease. Schizophrenia Bulletin, 15(1), 81–89.
Andrews, G. (1985). The economic costs of schizophrenia. Archives of General Psychiatry, 42(6), 537–543.
Babigian, H. M. (1980). Schizophrenia: Epidemiology. In A. M. Freeman, H. I. Kaplan, & B. J. Sadock (Eds.), Comprehensive textbook of psychiatry II (pp. 1113–1114). Baltimore, MD: Williams and Wilkins.
Bulayeva, K. B. (1991). Genetic basis of human psychophysiology (p. 218). Moscow: Science.
Bulayeva, K., Roeder, K., Bacanu, S. A., et al. (1999). Genetic analysis of schizophrenia in isolated Daghestanian kindreds. The American Journal of Human Genetics, 65, 1086.
Bulayeva, K. B., Leal, S., Pavlova, T. A., et al. (2000). The ascertainment of schizophrenia pedigrees in Daghestan genetic isolates. Journal of Psychiatric Genetics, 5, 100–106.
Bulayeva, K. B., Glatt, S. J., et al. (2007). Genome-wide linkage scan of schizophrenia: A cross-isolate study. Genomics, 89(2), 167–177.
Bulayev, O. A., Spitcin, V. A., et al. (2008). Population approach to mapping genes of complex diseases. Medical Genetics, 4(3), 3–17.
Bulayev, O. A., Pavlova, T. A., & Bulayeva, K. B. (2009). Role of inbreeding in aggregation of complex pathology. Genetics, 45(8), 1096–1104.
Bulayeva, K. B., Lencz, T., Glatt, S., Takumi, T., Gurgenova, F. R., & Bulayev, O. A. (2011). Genome-wide linkage scan of major depressive disorder in two Dagestan genetic isolates. Central European Journal of Medicine, 6(5), 616–624.
Cavalli-Sforza, L. L., & Bodmer, W. F. (1971). The genetics of human populations. San Francisco: Freeman.
Dubinin, N. P., & Bulaeva, K. B. (1982). Genetic bases of individuality in human populations. Doklady Akademii Nauk SSSR, 265(2), 470–473.
Fowler, R. C., & Tsuang, M. T. (1976). Letter: Schizophrenics’ families. The British Journal of Psychiatry, 128, 100–101.
Freedman, R., & Leonard, S. (2001). Genetic linkage to schizophrenia at chromosome 15q14. American Journal of Medical Genetics, 105(8), 655–657.
Gottesman, I. I., & Moldin, S. O. (1997). Schizophrenia genetics at the millennium: Cautious optimism. Clinical Genetics, 52(5), 404–407.
Gottesman, I. I., & Hanson, D. R. (2005). Human development: Biological and genetic processes. Annual Review of Psychology, 56, 263–286.
Jablensky, A., Sartorius, N., Ernberg, G., Bertelsen, A., et al. (1992). Schizophrenia: Manifestations, incidence and course in different cultures. A World Health Organization ten-country study. Psychological Medicine Monograph Supplement, 20, 1–97.
Jorde, L. B., Watkins, W. S., & Bamshad, M. J. (2001). Population genomics: A bridge from evolutionary history to genetic medicine. Human Molecular Genetics, 10(20), 2199–2207.
Kendler, K. (1997). The genetic epidemiology of psychiatric disorders: A current perspective. Social Psychiatry and Psychiatric Epidemiology, 32, 5–11.
Kennedy, J. L. (1996). Schizophrenia genetics: The quest for an anchor. American Journal of Psychiatry, 153(12), 1513–1515.
Kirov, G., Gumus, D., Chen, W., Norton, N., Georgieva, L., Sari, M., O’Donovan, M. C., Erdogan, F., Owen, M. J., Ropers, H. H., & Ullmann, R. (2008). Comparative genome hybridization suggests a role for NRXN1 and APBA2 in schizophrenia. Human Molecular Genetics, 17(3), 458–465.
Kruglyak, L., Daly, M. J., Reeve-Daly, M. P., & Lander, E. S. (1996). Parametric and nonparametric linkage analysis: A unified multipoint approach. The American Journal of Human Genetics, 58(6), 1347–1363.
Lander, E. S., & Schork, N. J. (1994). Genetic dissection of complex traits. Science, 265, 2037–2048.
Lewis, C. M., Levinson, D. F., Wise, L. H., et al. (2003). Genome scan meta-analysis of schizophrenia and bipolar disorder, part II: Schizophrenia. The American Journal of Human Genetics, 73(1), 34–48.
Maguire, E. A., Gadian, D. G., Johnsrude, I. S., Good, C. D., Ashburner, J., Frackowiak, R. S., & Frith, C. D. (2000). Navigation-related structural change in the hippocampi of taxi drivers. Proceedings of the National Academy of Sciences of the United States of America, 97(8), 4398–4403.
Mortensen, P. B., Pedersen, C. B., et al. (1999). Effects of family history and place and season of birth on the risk of schizophrenia. The New England Journal of Medicine, 340(8), 603–608.
Neel, J. V. (1992). Minority populations as genetic isolates: The interpretation of inbreeding results. In A. H. Bittles & D. F. Roberts (Eds.), Minority populations: Genetics, demography and health (pp. 1–13). London: Macmillan.
Novembre, J., Johnson, T., et al. (2008). Genes mirror geography within Europe. Nature, 456(7218), 98–101.
Scheuner, M. T., Yoon, P. W., & Khoury, M. J. (2004). Contribution of Mendelian disorders to common chronic disease: Opportunities for recognition, intervention, and prevention. American Journal of Medical Genetics Part C: Seminars in Medical Genetics, 125C(1), 50–65.
Stefansson, H., Rujescu, D., Cichon, S., Pietiläinen, O. P., Ingason, A., Steinberg, S., et al. (2008a). Large recurrent microdeletions associated with schizophrenia. Nature, 455(7210), 232–236.
Walsh, T., McClellan, J., McCarthy, S., Addington, A., Pierce, S., Cooper, G., et al. (2008). Rare structural variants disrupt multiple genes in neurodevelopmental pathways in schizophrenia. Science, 320(5875), 539–543. doi:10.1126/science.1155174.
Wright, S. (1931). Evolution in Mendelian populations. Genetics, 16(2), 97–159.
Wright, S. (1965). The interpretation of population structure by F-statistics with special regard to systems of mating. Evolution, 19, 395–420.
Wright, A. F., Carothers, A. D., et al. (1999). Population choice in mapping genes for complex diseases. Nature Genetics, 23(4), 397–404.
Xing, J., Watkins, W., et al. (2009). Fine-scaled human genetic structure revealed by SNP microarrays. Genome Research, 19(5), 815–825.

Chapter 1 Current Problems of Complex Disease Genes Mapping

1.1 General Problems of Complex Disease Genes Mapping

The slow progress in mapping complex diseases is attributed to the following difficulties (Kruglyak 1999; Terwilliger and Ott 1994; Puzyrev and Stepanov 1997).

Genetic Heterogeneity. The same clinical phenotype can be caused by mutations in different genes. For example, a disruptive mutation in any of the genes that control a specific metabolic pathway can lead to an abnormal concentration of the final product of that pathway. Genetic heterogeneity is typical of many complex human diseases, and its degree can be very high. For example, mutations in 13 different genes cause Zellweger syndrome, and mutations in at least 14 different genes may cause retinitis pigmentosa. Genetic heterogeneity causes inconsistencies when linkage to a specific genetic marker is tested in a pedigree. Genetic (locus) heterogeneity should be distinguished from allelic heterogeneity, in which different mutations within the same genetic locus affect the same function but to different degrees and with different clinical consequences. For instance, “severe” mutations of the dystrophin gene produce the clinical phenotype of Duchenne muscular dystrophy, whereas “milder” mutations produce Becker muscular dystrophy (Puzyrev and Stepanov 1997).

Incomplete Penetrance. As is typical of all complex diseases, some carriers of a mutant genotype do not express the aberrant phenotype. Penetrance is the probability that a gene is expressed. Complete penetrance means that the gene or genes for a complex clinical phenotype are expressed in all individuals who carry them. “Incomplete” penetrance means that the genes are expressed in only part of the carriers because of variation in environmental influences.

Presence of Phenocopies. A clinical phenotype can be caused by environmental factors despite a normal genotype.

© Springer International Publishing Switzerland 2016 1 K. Bulayeva et al., Genomic Architecture of Schizophrenia Across Diverse Genetic Isolates, DOI 10.1007/978-3-319-31964-3_1


Non-Mendelian Mechanisms of Genetic Information Transmission. In recent years, it has become apparent that Mendelian inheritance is only one mode of genetic information transmission. A number of diseases are based on mitochondrial inheritance (e.g., Leber hereditary optic neuropathy, mitochondrial myopathy, and Kearns–Sayre syndrome), expansion of trinucleotide repeats (e.g., Martin–Bell syndrome and myotonic dystrophy), or genomic imprinting (e.g., Prader–Willi and Angelman syndromes, Beckwith–Wiedemann syndrome, and Silver–Russell syndrome).

High Frequency in Populations of Alleles Linked with the Disease. The adverse effects of a high population frequency of alleles of candidate genes for complex diseases may have different origins (Terwilliger and Ott 1994; Puzyrev and Stepanov 1997). For example, attempts to map genes that cause Alzheimer’s disease through linkage analysis in pedigrees were unsuccessful until it was discovered that the E4 allele of the apolipoprotein E gene, the main genetic risk factor for Alzheimer’s disease, has a high frequency (15–20 %) in Caucasian populations. Only by singling out patients with a relatively young age of onset was it possible to establish a specific genotype associated with this clinical phenotype.

Clinical Heterogeneity. Accurate determination of the clinical phenotype is key to disease gene mapping. Errors in clinical diagnosis may lead to both overdiagnosis (false-positive diagnoses) and underdiagnosis (false-negative diagnoses), which harms the prospects of genetic mapping. This is particularly relevant to mental illnesses, where clinical diagnoses are established through interviews and introspective techniques.

Complex disease gene mapping thus poses a number of objective difficulties for researchers, which calls for methodological approaches that minimize these problems when searching for disease genes. The expression of complex diseases additionally depends on the specific clinical phenotype and on genetic and environmental exposures.
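To make the joint effect of incomplete penetrance and phenocopies concrete, the following sketch applies Bayes’ rule under a deliberately simple single-locus assumption: a fraction of the population carries a risk genotype, carriers are affected with probability equal to the penetrance, and non-carriers are affected with probability equal to the phenocopy rate. The function name and all numbers are hypothetical.

```python
def carrier_share_among_affected(carrier_freq, penetrance, phenocopy_rate):
    """Return (share_of_affected_who_are_carriers, disease_prevalence).

    carrier_freq   -- population frequency of the risk genotype
    penetrance     -- P(affected | carrier)
    phenocopy_rate -- P(affected | non-carrier)
    """
    affected_carriers = carrier_freq * penetrance
    affected_noncarriers = (1 - carrier_freq) * phenocopy_rate
    prevalence = affected_carriers + affected_noncarriers
    if prevalence == 0:
        raise ValueError("nobody is affected under these parameters")
    return affected_carriers / prevalence, prevalence


# Hypothetical values: 10 % carriers, 50 % penetrance, 5 % phenocopy rate.
share, prevalence = carrier_share_among_affected(0.10, 0.50, 0.05)
print(f"prevalence={prevalence:.3f}  carriers among affected={share:.2f}")
```

With these illustrative values, nearly half of the affected individuals do not carry the risk genotype, and half of the carriers are unaffected; both effects dilute the linkage signal in a pedigree.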

1.2 Current Approaches of Schizophrenia Spectrum Disease Gene Mapping

Current methods used to map complex disease-causing genes primarily include linkage analysis, linkage disequilibrium analysis, and association analysis in populations and families. In particular, genealogical analysis of one Dagestani mountain isolate with a high aggregation of neuromuscular degeneration showed that the disease is caused by a trinucleotide repeat expansion in the myotonic dystrophy protein kinase gene (DMPK) (Illarioshkin et al. 1996). Subjects with 33 or fewer repeats are nearly healthy, whereas carriers of alleles with 50 or more repeats are affected.
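The repeat-count thresholds quoted above can be expressed as a small classification rule. This is only an illustration of the two cut-offs stated in the text (33 or fewer, 50 or more); the function name is hypothetical, and the label returned for the 34–49 range, which the text does not characterize, is an assumption.

```python
def classify_repeat_count(repeats):
    """Classify a DMPK allele by its trinucleotide repeat count.

    Thresholds follow the text: <= 33 repeats "nearly healthy",
    >= 50 repeats "affected". The intermediate label is an assumption.
    """
    if repeats < 0:
        raise ValueError("repeat count cannot be negative")
    if repeats <= 33:
        return "nearly healthy"
    if repeats >= 50:
        return "affected"
    return "intermediate (not characterized in the text)"


for n in (12, 40, 75):  # hypothetical repeat counts
    print(n, classify_repeat_count(n))
```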


Previous genetic and epidemiological studies show that 80 % of schizophrenia manifestation is genetically determined (Trubnikov and Gindilis 1981; Gottesman and Shields 1972), opening prospects for schizophrenia gene mapping. Individual-specific environmental effects, such as maternal intrauterine infection, social stress, and stochastic (random) factors, account for only about 20 % of schizophrenia susceptibility. The identification of susceptibility genes largely remains a challenge because of the genetic and clinical heterogeneity of schizophrenia and the presence of incomplete penetrance and phenocopies. Researchers collect empirical data for complex disease gene mapping in human populations that have specific demographic and ethnic histories and corresponding differences in their gene pools. Gene pool differences between populations are usually neither investigated nor taken into account, even though the genetic architecture of a phenotype depends on the gene pool of the specific population (Falconer 1960). Studies of twins and adopted children show that the risk of developing schizophrenia rises steeply with the degree of genetic relatedness to a patient, which indicates a substantial genetic component. Compared to a 1 % risk of developing schizophrenia in the general population, first-degree (siblings and children) and third-degree (cousins) relatives have an approximately 9 % and 2 % chance of disease onset, respectively (Fig. 1.1). Non-twin siblings and dizygotic (DZ) twins of an affected subject, who in both cases share about 50 % of their genes with that subject, show a similar risk of developing schizophrenia (Gottesman and Shields 1972). Although epigenetic factors, such as genomic

Fig. 1.1 Lifetime risk of developing schizophrenia (labeled RR) by degree of genetic relatedness (Gottesman 1991). In the general population, the lifetime risk of developing schizophrenia is 1 %. Among relatives, the risk increases significantly with closer family ties. SCZ schizophrenia

imprinting, can contribute to the dissimilarity between monozygotic (MZ) twins, about 50 % of MZ twins do not share the schizophrenic phenotype, suggesting that environmental factors play an important role in schizophrenia etiology. Adoption studies, in turn, show that the familial aggregation of schizophrenia is not a consequence of the shared family environment. The adopted children of affected parents show an increased risk of schizophrenia, and the risk increases significantly with the presence of affected biological relatives. Mutations within coding or regulatory regions of genes that affect the function and/or expression of the corresponding neurobiological protein may underlie schizophrenia pathogenesis (Gottesman and Shields 1972; Prasad et al. 2002).

Neurobiological Studies. The apparent genetic and phenotypic heterogeneity of complex diseases, as well as phenocopies and incomplete penetrance, creates difficulties in determining the neurobiological mechanisms underlying the studied disease. Attempts to identify biological causes of schizophrenia traditionally focused on neuropsychological, neuropathological, neurochemical, and perhaps psychopathological explanations. The notion that schizophrenia is caused by disturbances of the main neurotransmitters (nerve signal transmitters) emerged in the 1950s and 1960s. Most candidate gene studies examined the neurochemical model of schizophrenia and therefore focused on neurotransmitter receptors and neurotransmitter metabolism. Studies based on the dopamine hypothesis examine the genes involved in dopamine transmission (Carlsson 1988; Conneally 1991; Cohen et al. 1999). The “dopamine hypothesis” states that schizophrenia is associated with impaired transmission of nerve impulses at dopamine receptors, and it remains the dominant neurochemical theory of schizophrenia (Davis et al. 1991; Jaskiw and Weinberger 1992; Cohen et al. 1999; Ekelund et al. 
1999; Cravchik and Goldman 2000). The dopaminergic system consists of five receptors that belong to the D1 and D2 families (Kebabian et al. 1984). The D1 family contains two receptors, DRD1 and DRD5 (Table 1.1). The D2 family includes the receptors DRD2, DRD3, and DRD4 (Table 1.1). The etiology of schizophrenia may also involve the dopamine transporter gene, DAT, localized in the 5p15.3 region; however, positive linkage results for these genes are seldom reproduced (Lee et al. 1999; Hill et al. 1998; Prasad et al. 2002). Although the therapeutic effect of classical antipsychotic drugs depends heavily on dopamine receptors, clear evidence for primary dopaminergic abnormalities in schizophrenia is not yet available. Most results, including those for the D2 gene, were negative; however, positive results were reported for the Ser9Gly polymorphism in the dopamine receptor D3 gene (DRD3) (Davis et al. 1991; Ekelund et al. 1999; Faraone et al. 2001). The D3 receptor, functionally associated with the D2 subtype, is expressed in the nucleus region of the brain and is a likely target of modern antipsychotic drugs (Moises et al. 1991; Kapur and Seeman 2001). Although some studies of the D3 polymorphism reported negative results, a subsequent analysis of more than 5000 individuals confirmed the significant association between schizophrenia and


Table 1.1 The study of genes involved in dopaminergic mechanism
Receptor | Localization | Linkage and association studies | Result
Family D1 (DRD1) | 5q35.1 | Grandy et al. (1990), Litt et al. (1991), Nöthen et al. (1994), Cichon et al. (1994), Liu et al. (2004), Kojima et al. (1999) | Significant associations are not established
DRD5 | 4p15.3 | Asherson et al. (1998) | Significant associations are not established
Family D2 (DRD2) | 11q22-q23 | Gejman et al. (1994), Asherson et al. (1994), Nöthen et al. (1994), Sobell et al. (1994), Shaikh et al. (1994) | Significant associations are not established / Significant associations are established
DRD3 | 3q13.3 | Crocq et al. (1992), Nimgaonkar et al. (1993), Mant et al. (1994), Nöthen et al. (1994), Jönsson et al. (1993), Nanko et al. (1993), Chen et al. (1997), Prasad et al. (1999) | Significant associations are established / Significant associations are not established
DRD4 | 11p15.5 | Nöthen et al. (1994), Jönsson et al. (1993), Nanko et al. (1993), Chen et al. (1997), Prasad et al. (1999) | Significant associations are not established

Table 1.2 Genes involved in serotonin mechanism
Receptor | Localization | Linkage and association studies | Result
5HT2A gene | 13q14 | Williams et al. (1996), Spurlock et al. (1998), Nimgaonkar et al. (1996), Chen et al. (1997), Hawi et al. (1997), Verga et al. (1997), Lin et al. (1999) | Significant association established / Significant associations are not established
5HT2A promoter | 13 | Spurlock et al. (1998), Kouzmenko et al. (1999) | Significant associations are not established
SERT (SLC6A4) | 17q11.2-q12 | Heils et al. (1996), Hranilovic et al. (2000), Naylor et al. (1998), Oliveira et al. (1998), Rao et al. (1998) | Significant associations are established / Significant associations are not established

homozygosity at this locus (Maziade et al. 1997). The LOD value for the D3 polymorphism is only 1.2, indicating a minor effect. The serotonergic system has also been searched for candidate schizophrenia genes. To date, seven serotonin receptors have been identified: 5HT-1, 5HT-2, 5HT-3, 5HT-4, 5HT-5, 5HT-6, and 5HT-7 (Table 1.2). Clinical studies have also demonstrated dysfunction of other systems in schizophrenia, including the monoaminergic and glutamatergic systems (Garbutt and van

[email protected] 6 1 Current Problems of Complex Disease Genes Mapping

Kammen 1983; Crowe et al. 1997; Ebstein et al. 1997; Bertolino et al. 1999; Cohen et al. 1999). Studies reported positive results for the serotonin transporter gene, SERT (5-HTT), and the serotonin receptors 5-HT1A (Erdman et al. 1996) and 5HT1d beta (Nothen et al. 1993; Sidenberg et al. 1993). Since neurotransmitter systems do not function in isolation, it is likely that schizophrenia affects other neurotransmit- ter receptors as well (Gottesman and Shields 1972; Kapur and Remington 1996). In recent years, complex relationships between various systems have been determined (Knight 1983; Moises et al. 1991; Cravchik and Goldman 2000; Jacobsen et al. 2000; Blakely 2001). Presently, however, it is still unclear how these neuro- chemical complexes cause schizophrenic pathology, its compensatory mechanisms, and the environmental impact of their operation. Brain volume measurements of schizophrenic patients show small, but signifi- cant, reductions in brain size, mostly in the frontal lobe, which may have a significant genetic influence (McNeil et al. 1993; Gur et al. 2000). Identifying genetic variants involved in schizophrenia pathogenesis can be important for understanding of the disease underlying pathological mechanisms. Mapping genes of schizophrenia, however, is subject to the same problems as mapping other complex diseases, namely the genetic and phenotypic heterogeneity of schizophrenia and the nonspecific environmental influences that partially predeter- mine its manifestations. Schizophrenia defies the Mendelian model; assessing the familial risk for schizophrenia among relatives is therefore complex. Defying the Mendelian model, studying the familial risk of schizophrenia is complex. The disease involves a variety of genes, where each gene may only slightly contribute to the risk of developing schizophrenia (Alda et al. 1989; McGlashan and Hoffman 2000; Freedman et al. 2001; Levinson et al. 2002). 
To date, the exact number of genes and their level of interaction are still unknown.

1.3 The Current State of Gene Mapping of Schizophrenia Spectrum Disorders

Current methods of mapping complex disease genes rely on the search for associations, haplotype analysis, linkage, and linkage disequilibrium within populations and families.

Associative Studies. Association studies compare the frequency distribution of genetic markers in patients and healthy subjects in order to detect differences in genetic polymorphisms between the groups, which may suggest a possible link between disease genes and alleles of a particular DNA marker. The association is based on the proximity of the marker locus and the disease gene on the chromosome. A genetic marker is considered associated with a disease if its frequency in patients is significantly higher than in control samples. In simple cases, the chi-squared test is used to evaluate the significance of differences between groups of patients and healthy subjects. Examples of established associations are the link between ApoE and blood pressure, or an early form of Alzheimer’s disease manifestation, although in the latter case the results were inconsistent (Bennett et al. 1995; Fallin et al. 1997; Kukull and Martin 1998).

Current analyses of genomic associations with pathogenic loci often use linkage disequilibrium (LD). LD reflects the non-random association of alleles at adjacent loci in haplotypes inherited from a common ancestor. LD block size depends on the specific chromosomal region and on the structure of marital relations in the demographic history of the population (Reich et al. 2001; Zavattari et al. 2000; Peltonen 2000). All members of a genetic isolate are usually related to some degree; each isolate represents one lineage with a common ancestor. The extent of LD depends on the number of recombinations and meioses: historically young secondary isolates, which have passed through fewer meioses and recombinations, exhibit larger LD blocks than primary isolates, which have accumulated more recombinants (Jorde 2000; Zavattari et al. 2000). It is assumed that the age of the disease-causing mutation in the founder haplotype is equal to the age of the population. Accordingly, historically young populations are believed to be useful for establishing the primary association with the pathogenic locus within a sufficiently large region of DNA, which can then be narrowed considerably in the historically oldest (primary) isolated populations (Jorde 2000; Bulayeva et al. 2002, 2005). Gene mapping of diastrophic dysplasia (DTD), a Mendelian disease, exemplifies the success of this approach. Traditional linkage analysis in pedigrees from outbred populations localized the pathogenic locus to a genomic region of 2 Mb.
A subsequent analysis of LD in this genomic region, conducted in a Finnish isolate with similar patients, narrowed it down to 40 kb (Hastbacka et al. 1994).

Association studies must take into account interpopulation differences in genetic polymorphisms and the population specificity of associations between diseases and genetic markers, which arises from the genetic heterogeneity of the disease. To avoid false-positive or false-negative results, association studies must be carried out in a well-defined population. The results of association studies are additionally influenced by population stratification, that is, by differences between the compared groups in sex, age, gene pool, and ethno-cultural background; overcoming these effects requires that associations be assessed in ethnically homogeneous local populations matched by age, sex, and other demographic factors. To identify disease-causing mutations whose frequency is unknown, representative randomized studies require large samples of unrelated patients (cases) and healthy subjects (controls) (Bray and Owen 2001), which are difficult to obtain within a single population and even more difficult within small isolated populations. At present, association-based mapping of complex disease genes can only determine trends and guide further linkage studies across genomic loci, and it often yields ambiguous associations between markers and pathogenic loci (Terwilliger and Ott 1994).

Association analyses can assess the contribution of genetic polymorphisms that affect the function or expression of neurobiological candidate genes. Association studies, for example, identified the serotonin 2A (5-HT2A) receptor as a mediator of atypical antipsychotic action (Ebstein et al. 1997). An association study identified the T/C polymorphism at nucleotide 102 of the gene encoding 5-HT2A; a meta-analysis later confirmed the result, despite a number of contradictory studies. One detailed study, however, shows that the established association of the T/C polymorphism does not play a significant role in schizophrenia pathogenesis (Ebstein et al. 1997).

Association studies of schizophrenia are subject to the following considerations (McGuffin and Owen 1991): (1) individuals carrying genetic mutations associated with schizophrenia may not develop the disorder, so a large sample size is required to ensure detection; (2) a marker may itself be functionally neutral yet be in linkage disequilibrium with a pathogenic locus, so coding and regulatory regions, which may affect the structure and/or expression of the protein, should be screened during association studies. The outcomes of association studies require cautious interpretation until the results prove reproducible (McGuffin and Owen 1991), as positive associations may result from population stratification due to sex, age, ethnic, and other differences among the studied groups. Despite their flaws, association studies have provided insights into the molecular basis of schizophrenia.

In recent years, great hopes have been placed on exploring potential associations with complex diseases throughout the entire genome (genome-wide association, GWA) (Levinson et al. 1998). This systematic search for genes with minor effects is essential for genetically heterogeneous diseases. Linkage disequilibrium (LD) is a promising tool for detecting differences in single nucleotide polymorphisms (SNPs) between patients and healthy subjects. Genome-wide association testing, even using SNPs, does not solve the data interpretation problems of association studies.
Currently, researchers are moving away from association methods for mapping the genes of complex diseases and are beginning to pay closer attention to population subdivision and to families with an accumulation of such diseases, in which linkage analysis can be performed.

Study of the Endophenotypic Genetics of Schizophrenia. Relatively recent searches for schizophrenia susceptibility genes have used quantitative physiological endophenotypes, which may be directly related to the fundamental processes potentially relevant to schizophrenia. These “intermediate” endophenotypes can also be identified in people without schizophrenic symptoms. Endophenotypes may have a less complex genetic basis than schizophrenia itself, which facilitates association studies and linkage analyses (Gottesman and Shields 1972). The P50 wave of the evoked potential, used as an endophenotype, was shown to be linked to 15q (Boutros et al. 1993; Gottesman and Gould 2003; Clementz et al. 1998). Studies of factors that modify schizophrenia susceptibility have attempted to determine the impact of clinical heterogeneity on the expression of schizophrenia genes. Identifying these mechanisms may provide a better understanding of the clinical variants of schizophrenia, which may in turn contribute to the selection of more effective treatments. Researchers have also begun to study the relationship between polymorphisms in candidate genes and profiles of clinical symptoms, although such studies may themselves be affected by clinical and genetic heterogeneity (Lawrie et al. 2001).
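The case-control comparison described under Associative Studies can be illustrated with a small calculation. The sketch below is only a minimal example, assuming a 2 x 2 table of allele counts (marker allele versus all other alleles, in patients versus controls) and a Pearson chi-squared test with one degree of freedom and no continuity correction. The function name and the counts are hypothetical and are not taken from any study cited here.

```python
import math


def allele_chi_squared(case_counts, control_counts):
    """Pearson chi-squared test on a 2 x 2 table of allele counts.

    case_counts and control_counts are (marker_allele, other_alleles) pairs.
    Returns (chi2, p_value) for 1 degree of freedom, without continuity
    correction. Assumes every row and column total is non-zero.
    """
    a, b = case_counts
    c, d = control_counts
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom the upper-tail probability of the
    # chi-squared distribution equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical allele counts: 60 of 200 patient chromosomes carry the
# marker allele, versus 40 of 200 control chromosomes.
chi2, p = allele_chi_squared((60, 140), (40, 160))
print(f"chi2 = {chi2:.2f}, p = {p:.3f}")  # chi2 = 5.33, p = 0.021
```

A nominally significant result of this kind says nothing about stratification: the same difference in allele frequency can arise when patients and controls are drawn from subpopulations with different gene pools, which is why the matching requirements discussed above matter.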

Linkage Studies. Genetic heterogeneity may be present even in families in which genes with a major effect segregate. The first positive findings obtained by linkage analysis could not be replicated in other pedigrees, suggesting that highly penetrant mutations causing schizophrenia are extremely rare and may not exist at all (Levinson et al. 2002). The availability of well-characterized multi-locus markers for genome-wide scans opens up new possibilities for analyzing genetic linkage in schizophrenia. Such studies are scarce because of the complexity of the analysis (Gurling et al. 2001); the most relevant results of genome-wide linkage scans, together with data on individual chromosomes, are summarized below.

Chromosome 1q. A number of linkage and association studies provided evidence for linkage of a schizophrenia susceptibility gene to the 1q21-23 region [SCZD9 (MIM 604906)] (St Clair et al. 1990; Kosower et al. 1995; Blackwood et al. 2001; Hovatta et al. 1999; Brzustowicz et al. 2000; Millar et al. 2000). One large Scottish pedigree demonstrated significant linkage between schizophrenia (and related diseases) and a balanced translocation between chromosomes 1 and 11, t(1;11)(q42.1;q14.3) (St Clair et al. 1990; Millar et al. 2000).

A pedigree from a Finnish isolate in which many descendants had schizophrenia (Hovatta et al. 1999) showed disease linkage to the D1S2141 and D1S2891 loci under dominant inheritance with high penetrance (90 %). Another Finnish genome-wide linkage scan identified a linkage peak in the same region. A meta-analysis of 24 Canadian pedigrees detected a high peak in the 1q22 region, with a maximum LOD = 6.50 (P = 0.0002), and indicated that 75 % of the analyzed families showed linkage of schizophrenia to this region (Brzustowicz et al. 2000).
Chromosome 5p and 5q. A study of seven Icelandic and UK pedigrees found positive LOD scores for the D5S39 and D5S76 markers, localized in the 5q11-13 region, which contains the SCZD1 gene (MIM 181510) (Sherrington et al. 1988). Subsequent studies of chromosome 5 linkage showed conflicting results: later work by the same authors confirmed only a weaker linkage signal in 23 families (Kalsi et al. 1999). Other studies show linkage to the short arm of chromosome 5 in the 5p14.1-13.1 region (Buetow et al. 1994; Silverman et al. 1996); however, a number of studies did not confirm linkage on chromosome 5 (Moises et al. 1995; King et al. 1997). A study of families with a high burden of disease linked the 5q22-31 region to schizophrenia (Straub et al. 1997b; Levinson 2003; Schwab et al. 1997). Subsequent studies in genetic isolates from Palau (Micronesia) and Finland showed LOD = 3.0–3.4 for 5q22-31 (Devlin et al. 2002; Paunio et al. 2000).

Chromosome 6p. Linkage disequilibrium was observed in the 6p22.3 region, which contains DTNBP1, a candidate gene for schizophrenia (Straub et al. 1997, 2002). The TNFA gene (tumor necrosis factor alpha), localized to 6p21.1-21.3, showed associations and linkages of moderate statistical significance (Bolin et al. 2001), as did the 6p21.3 region, which contains NOTCH4, a candidate gene for schizophrenia. A number of studies, however, contradict the role of NOTCH4 in schizophrenia; some report positive associations or linkages with schizophrenia and others do not [see the review by Levinson (2003)]. A more recent meta-analysis (Glatt et al. 2005) indicates that NOTCH4 was originally selected as a candidate gene for schizophrenia because of its location in the 6p21.3 region, which showed linkage to schizophrenia in many studies (Wei and Hemmings 2000; Sklar et al. 2001; McGinnis et al. 2001). Association studies of the alleles of this gene provided significant evidence for their involvement in schizophrenia pathogenesis. Several subsequent studies did not confirm such associations (Fan et al. 2002; Imai et al. 2001). A meta-analysis combining all known studies of NOTCH4 also showed no significant association between schizophrenia and the alleles (TAA)n, (CTG)n, or (TTAT)n or the SNP1 and SNP2 polymorphisms. Family-based studies, however, showed significant evidence for association of these alleles and of the SNP1 and SNP2 polymorphisms, suggesting a role for them in the pathogenesis of schizophrenia under certain conditions (Glatt et al. 2005; Shayevitz et al. 2012). As more reliable and reproducible associations with schizophrenia were obtained for haplotypes of these polymorphisms [especially those containing SNP2 and (CTG)n], the authors concluded that large family studies are required to clarify the role of this NOTCH4 haplotype as a risk factor for schizophrenia. Associations or linkages of moderate confidence were obtained for the HLA region at 6p21.22 (Schwab et al. 2000). The results, however, could not be replicated in similar studies of Japanese and Chinese populations (Koishi et al. 2004; Liu et al. 2002).
Chromosome 8p. The 8p22-21 region was initially identified as potentially linked to schizophrenia [SCZD6 gene (MIM 603013)]. A number of studies reported maximum LOD values of 2.4–3.6 (Pulver et al. 1995; Blouin et al. 1998; Kendler et al. 1996; Levinson 2003; Brzustowicz et al. 1999). Other studies, however, did not confirm linkage to this region (Levinson et al. 2002).

Chromosome 11q. A balanced translocation between chromosomes 1 and 11, t(1;11)(q42.1;q14.3), was found to co-segregate with schizophrenia and other mental illnesses in the same Scottish pedigree (St Clair et al. 1990; Blackwood et al. 1998; Millar et al. 2000). Linkage analysis of the 1q and 11q translocation and the disease locus produced LOD = 3.1–6.0 (SCZD2) (Maziade et al. 1995; Devon et al. 1997; Devon and Porteous 1997). Linkage studies of these genomic regions by other authors produced negative results (Gill et al. 1993; Wang et al. 1993; Mulcrone et al. 1995) or weak signals (Nanko et al. 1992; Faraone et al. 1998; Kaufmann et al. 1998).

Chromosome 15q. Highly reliable linkage was obtained in a study of the P50 evoked potential at 15q13-q14, a region that contains the CHRNA7 gene (Freedman et al. 1997, 2001). The linkage signals for schizophrenia in this region, however, were weak. Subsequent meta-analyses revealed significant increases of LOD values for schizophrenia in that region (Levinson et al. 2002).

Chromosome 22q. Velocardiofacial syndrome (VCFS), a cytogenetic abnormality caused by microdeletions at 22q11.21, leads to a complex of morphological and psychiatric disorders, including mental retardation and chronic schizophrenia, in about 30 % of affected individuals. Linkage analysis in this region yielded moderate LOD values. An association study at 22q11.21 indicated the probable involvement of the UFD1L and SNAP29 genes in the pathogenesis of schizophrenia (De Luca et al. 2001; Saito et al. 2001). The same region contains the PRODH2 gene, for which several independent studies found associations of moderate confidence (Levinson et al. 2002). The COMT (catechol-O-methyltransferase) gene, also localized to 22q11.21, is a main participant in the dopamine neurotransmitter system (Levinson et al. 2002). Some studies report associations between COMT and schizophrenia; others do not (Waterwort et al. 2002; Herken and Erdal 2001; Norton et al. 2002).

Numerous studies found reproducible linkage of schizophrenia to several chromosomal regions, such as 22q11-q12, 6p24-p22, 8p22-p21, 6q, 13q14.1-q32, 5q21-q31, 10p15-p11, 1q21-q22, and 18p (McGuffin and Owen 1991; Gill et al. 1993; Cardno et al. 1996; Cloninger et al. 1998; Kaufmann et al. 1998; Levinson et al. 1998; Cardino et al. 1999; Faraone et al. 1999; Brzustowicz et al. 2000; Levinson et al. 2002). Both positive and negative findings have been reported for each linked chromosomal region. Some authors note that loci with LOD > 3 should not be expected for a complex mental disease of this kind; rather, one can assume the presence of multiple regions carrying specific susceptibility alleles with moderate effects, LOD = 1.5–3.0 (Owen et al. 2000; Prasad et al. 2002).
Several caveats apply to current schizophrenia linkage studies:
• The above findings could not be replicated in all studies by other researchers.
• The level of statistical significance varies between studies and is moderate in most cases.
• Linked chromosomal regions are usually quite extensive (often more than 20–30 cM) (Hovatta et al. 1999; Boehnke 2000; Kendler et al. 2000).
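The LOD values quoted throughout this section are base-10 logarithms of a likelihood ratio: the likelihood of the pedigree data at a given recombination fraction θ, divided by the likelihood under free recombination (θ = 0.5). A LOD of 3 therefore corresponds to odds of 1000:1 in favor of linkage. The sketch below illustrates only the simplest textbook case, assuming phase-known, fully informative meioses scored as recombinant or non-recombinant. It is not the multipoint or model-based analysis used in the studies cited above, and the function name and counts are hypothetical.

```python
import math


def two_point_lod(recombinants, meioses, theta):
    """Two-point LOD score for phase-known, fully informative meioses.

    LOD(theta) = log10( L(theta) / L(0.5) ), where
    L(theta) = theta**k * (1 - theta)**(n - k) for k recombinants
    observed among n meioses.
    """
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must lie in the interval (0, 0.5]")
    k, n = recombinants, meioses
    log_l_theta = k * math.log10(theta) + (n - k) * math.log10(1.0 - theta)
    log_l_null = n * math.log10(0.5)
    return log_l_theta - log_l_null


# Hypothetical data: 1 recombinant among 10 informative meioses.
thetas = [i / 100 for i in range(1, 51)]  # grid 0.01, 0.02, ..., 0.50
best_theta = max(thetas, key=lambda t: two_point_lod(1, 10, t))
max_lod = two_point_lod(1, 10, best_theta)
print(f"max LOD = {max_lod:.2f} at theta = {best_theta:.2f}")
# max LOD = 1.60 at theta = 0.10
```

In this toy case ten meioses in a single family cannot reach LOD 3 even with only one recombinant. Because LOD scores at the same θ add across independent families, evidence is usually accumulated over many pedigrees, which is also where locus heterogeneity between families begins to dilute the signal.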

References

Alda, M., Dvorakova, M., et al. (1989). Genetic aspects in chronic schizophrenia. Morbidity risks and contributory factors. Schizophrenia Research, 2, 339–344.
Asherson, P., Walsh, C., Williams, J., Sargeant, M., Taylor, C., Clements, A., Gill, M., Owen, M., & McGuffin, P. (1994). Imprinting and anticipation. Are they relevant to genetic studies of schizophrenia? British Journal of Psychiatry, 164(5), 619–624.
Asherson, P., Mant, R., Williams, N., Cardno, A., Jones, L., Murphy, K., Collier, D. A., Nanko, S., Craddock, N., Morris, S., Muir, W., Blackwood, B., McGuffin, P., & Owen, M. J. (1998). A study of chromosome 4p markers and dopamine D5 receptor gene in schizophrenia and bipolar disorder. Molecular Psychiatry, 3(4), 310–320.

Bennett, C., Crawford, F., et al. (1995). Evidence that the APOE locus influences rate of disease progression in late onset familial Alzheimer’s disease but is not causative. AJMG (Neuropsychiatric Genetics), 60, 1–6.
Bertolino, A., Knable, M. B., Saunders, R. C., Callicott, J. H., Kolachana, B., Mattay, V. S., Bachevalier, J., Frank, J. A., Egan, M., & Weinberger, D. R. (1999). The relationship between dorsolateral prefrontal N-acetylaspartate measures and striatal dopamine activity in schizophrenia. Biological Psychiatry, 45(6), 660–667.
Blackwood, S. K., MacHale, S. M., Power, M. J., Goodwin, G. M., & Lawrie, S. M. (1998). Effects of exercise on cognitive and motor function in chronic fatigue syndrome and depression. Journal of Neurology, Neurosurgery and Psychiatry, 65, 541–546.
Blackwood, D., Fordyce, A., Walker, M., et al. (2001). Schizophrenia and affective disorders—Cosegregation with a translocation at chromosome 1q42 that directly disrupts brain-expressed genes: Clinical and P300 findings in a family. American Journal of Human Genetics, 69, 428–433.
Blakely, R. D. (2001). Dopamine’s reversal of fortune. Science, 293, 2407–2408.
Blouin, J. L., Dombroski, B. A., et al. (1998). Schizophrenia susceptibility loci on chromosomes 13q32 and 8p21. Nature Genetics, 20, 70–73.
Boehnke, M. (2000). A look at linkage disequilibrium. Nature Genetics, 25, 246–247.
Bolin, S. R., Stoffregen, W. C., Nayar, G. P. S., & Hamel, A. L. (2001). Postweaning multisystemic wasting syndrome induced after experimental inoculation of cesarean-derived, colostrum-deprived piglets with type 2 porcine circovirus. Journal of Veterinary Diagnostic Investigation, 13, 185–194.
Boutros, N., Zouridakis, G., et al. (1993). The P50 component of the auditory evoked potential and subtypes of schizophrenia. Psychiatry Research, 47, 243–254.
Bray, N., & Owen, M. (2001). Searching for schizophrenia genes. Trends in Molecular Medicine, 7(4), 169–174.
Brzustowicz, L. M., Hodgkinson, K., et al. (2000). Location of a major susceptibility locus for familial schizophrenia on chromosome 1q21-q22. Science, 288, 678–682.
Brzustowicz, L. M., Honer, W. G., Chow, E. W., et al. (1999). Linkage of familial schizophrenia to chromosome 13q32. The American Journal of Human Genetics, 65(4), 1096–1103.
Buetow, K. H., Ludwigsen, S., Scherpbier-Heddema, T., et al. (1994). Human genetic map. Genome maps V. Wall chart. Science, 265(5181), 2055–2070.
Bulayeva, K. B., Leal, S. M., Pavlova, T. A., et al. (2005). Mapping genes of complex psychiatric diseases in Daghestan genetic isolates. American Journal of Medical Genetics Part B: Neuropsychiatric Genetics, 132(1), 76–84.
Bulayeva, K. B., Pavlova, T. A., Kurbanov, R. M., & Bulayev, O. A. (2002). Mapping genes of complex disease in genetic isolates of Dagestan. Journal of Genetics (Russia), 38(11), 1539–1548.
Cardino, A. G., Bowen, T., et al. (1999). CAG repeat length in the hKCa3 gene and symptom dimensions in schizophrenia. Society of Biological Psychiatry, 45(12), 1592–1596.
Cardno, A. G., Murphy, K. C., et al. (1996). Expanded CAG/CTG repeats in schizophrenia: A study of clinical correlates. British Journal of Psychiatry, 169, 766–771.
Carlsson, A. (1988). The current status of the dopamine hypothesis of schizophrenia. Neuropsychopharmacology, 1(3), 179–203.
Chen, C. H., Liu, M. Y., Wei, F. C., Koong, F. J., Hwu, H. G., & Hsiao, K. J. (1997). Further evidence of no association between Ser9Gly polymorphism of dopamine D3 receptor gene and schizophrenia. American Journal of Medical Genetics, 74(1), 40–43.
Cichon, S., Nöthen, M. M., Erdmann, J., & Propping, P. (1994). Detection of four polymorphic sites in the human dopamine D1 receptor gene (DRD1). Human Molecular Genetics, 3, 209.
Clementz, B. A., Geyer, M. A., et al. (1998). Poor p50 suppression among schizophrenia patients and their first-degree biological relatives. American Journal of Psychiatry, 155, 1691–1694.

Cloninger, C. R., Kaufman, C. A., et al. (1998). A genome-wide search for schizophrenia susceptibility loci: The NIMH genetics initiative and Millennium consortium. American Journal of Medical Genetics (Neuropsychiatric Genetics), 81, 275–281.
Cohen, B. M., Ennulat, D. J., et al. (1999). Polymorphisms of the dopamine D4 receptor and response to antipsychotic drugs. Psychopharmacology, 141, 6–10.
Conneally, P. M. (1991). Association between the D2 dopamine receptor gene and alcoholism. A continuing controversy. Archives of General Psychiatry, 48(8), 757–759.
Cravchik, A., & Goldman, D. (2000). Neurochemical individuality: Genetic diversity among human dopamine and serotonin receptors and transporters. Archives of General Psychiatry, 57, 1105–1114.
Crocq, M. A., Mant, R., Asherson, P., Williams, J., Hode, Y., Mayerova, A., Collier, D., Lannfelt, L., Sokoloff, P., Schwartz, J. C., et al. (1992). Association between schizophrenia and homozygosity at the dopamine D3 receptor gene. Journal of Medical Genetics, 29(12), 858–860.
Crowe, R. R., Wang, Z., Noyes, R., Jr., Albrecht, B. E., Darlison, M. G., Bailey, M. E., Johnson, K. J., & Zoëga, T. (1997). Candidate gene study of eight GABAA receptor subunits in panic disorder. American Journal of Psychiatry, 154(8), 1096–1100.
Davis, K. L., Kahn, R. S., et al. (1991). Dopamine in schizophrenia: A review and reconceptualization. American Journal of Psychiatry, 148(11), 1474–1486.
De Luca, A., Pasini, A., Amati, F., Botta, A., Spalletta, G., Alimenti, S., Caccamo, F., Conti, E., Trakalo, J., Macciardi, F., Dallapiccola, B., & Novelli, G. (2001). Association study of a promoter polymorphism of UFD1L gene with schizophrenia. American Journal of Medical Genetics, 105(6), 529–533.
Devlin, B., Bacanu, S. A., Roeder, K., Reimherr, F., Wender, P., Galke, B., Novasad, D., Chu, A., TCuenco, K., Tiobek, S., Otto, C., & Byerley, W. (2002). Genome-wide multipoint linkage analyses of multiplex schizophrenia pedigrees from the oceanic nation of Palau. Molecular Psychiatry, 7(7), 689–694.
Devon, R. S., Evans, K. L., Maule, J. C., Christie, S., Anderson, S., Brown, J., Shibasaki, Y., Porteous, D. J., & Brookes, A. J. (1997). Novel transcribed sequences neighbouring a translocation breakpoint associated with schizophrenia. American Journal of Medical Genetics, 74(1), 82–90.
Devon, R. S., & Porteous, D. J. (1997). Physical mapping of a glutamate receptor gene in relation to a balanced translocation associated with schizophrenia in a large Scottish family. Psychiatric Genetics, 7(4), 165–169.
Ebstein, R. P., Segman, R., et al. (1997). 5-HT2C (HTR2C) serotonin receptor gene polymorphism associated with the human personality trait of reward dependence: Interaction with dopamine D4 receptor (D4DR) and dopamine D3 receptor (D3DR) polymorphisms. American Journal of Medical Genetics, 74(1), 65–72.
Ekelund, J., Lichtermann, D., Järvelin, M. R., & Peltonen, L. (1999). Association between novelty-seeking and the type 4 dopamine receptor gene in a large Finnish cohort sample. American Journal of Psychiatry, 156(9), 1453–1455.
Erdman, J., Shimron-Abarbanell, D., Rietschel, M., et al. (1996). Systematic screening for mutations in the human serotonin 2A (5HT2A) receptor gene: Identification of two naturally occurring receptor variants and association analysis in schizophrenia. Human Genetics, 97(5), 614–619.
Falconer, D. S. (1960). Introduction to quantitative genetics (p. 365). London: Longman Group Ltd.
Fallin, D., Reading, S., et al. (1997). No interaction between the APOE and the alpha-1-antichymotrypsin genes on risk for Alzheimer’s disease. American Journal of Medical Genetics, 74, 192–194.
Fan, J. B., Tang, J. X., Gu, N. F., et al. (2002). A family-based and case-control association study of the NOTCH4 gene and schizophrenia. Molecular Psychiatry, 7, 100–103.

Faraone, S. V., Doyle, A. E., Mick, E., & Biederman, J. (2001). Meta-analysis of the association between the 7-repeat allele of the dopamine D(4) receptor gene and attention deficit hyperactivity disorder. American Journal of Psychiatry, 158(7), 1052–1057.
Faraone, S. V., Matise, T., Svrakic, D., Pepple, J., Malaspina, D., Suarez, B., Hampe, C., Zambuto, C. T., Schmitt, K., Meyer, J., Markel, P., Lee, H., Harkavy Friedman, J., Kaufmann, C., Cloninger, C. R., & Tsuang, M. T. (1998). Genome scan of European-American schizophrenia pedigrees: Results of the NIMH Genetics Initiative and Millennium Consortium. American Journal of Medical Genetics, 81(4), 290–295.
Faraone, S. V., Meyer, J., et al. (1999). Suggestive linkage of chromosome 10p to schizophrenia is not due to transmission ratio distortion. American Journal of Medical Genetics, 88, 607–608.
Freedman, R., Coon, H., Myles-Worsley, M., Orr-Urtreger, A., Olincy, A., Davis, A., Polymeropoulos, M., Holik, J., Hopkins, J., Hoff, M., Rosenthal, J., Waldo, M. C., Reimherr, F., Wender, P., Yaw, J., Young, D. A., Breese, C. R., Adams, C., Patterson, D., Adler, L. E., Kruglyak, L., Leonard, S., & Byerley, W. (1997). Linkage of a neurophysiological deficit in schizophrenia to a chromosome 15 locus. Proceedings of the National Academy of Sciences of the United States of America, 94(2), 587–592.
Freedman, R., Leonard, S., Olincy, A., et al. (2001). Evidence for the multigenic inheritance of schizophrenia. American Journal of Medical Genetics, 105(8), 794–800.
Garbutt, J. C., & van Kammen, D. P. (1983). The interaction between GABA and dopamine: Implications for schizophrenia. Schizophrenia Bulletin, 9(3), 336–353.
Gejman, P. V., Ram, A., Gelernter, J., Friedman, E., Cao, Q., Pickar, D., Blum, K., Noble, E. P., Kranzler, H. R., O’Malley, S., et al. (1994). No structural mutation in the dopamine D2 receptor gene in alcoholism or schizophrenia. Analysis using denaturing gradient gel electrophoresis. JAMA, 271(3), 204–208.
Gill, M., McGuffin, P., et al. (1993). A linkage study of schizophrenia with DNA markers from the long arm of chromosome 11. Psychological Medicine, 23(1), 27–44.
Glatt, S. J., Wang, R. S., Yeh, Y. C., Tsuang, M. T., & Faraone, S. V. (2005). Five NOTCH4 polymorphisms show weak evidence for association with schizophrenia: Evidence from meta-analyses. Schizophrenia Research, 73(2–3), 281–290.
Gottesman, I. I. (1991). Schizophrenia genesis: The origins of madness (p. 296). WH Freeman/Times Books/Henry Holt & Co.
Gottesman, I. I., & Shields, J. (1972). Schizophrenia and genetics, a Twin study vantage point. New York, London: Academic Press.
Gottesman, I. I., & Gould, T. D. (2003). The endophenotype concept in psychiatry: Etymology and strategic intentions. American Journal of Psychiatry, 160(4), 636–645.
Grandy, D. K., Zhou, Q. Y., Allen, L., Litt, R., Magenis, R. E., Civelli, O., & Litt, M. (1990). A human D1 dopamine receptor gene is located on chromosome 5 at q35.1 and identifies an EcoRI RFLP. American Journal of Human Genetics, 47(5), 828–834.
Gur, R. E., Cowell, P. E., et al. (2000). Reduced dorsal and orbital prefrontal gray matter volumes in schizophrenia. Archives of General Psychiatry, 57, 761–768.
Gurling, H. M., Kalsi, G., Brynjolfson, J., et al. (2001). Genomewide genetic linkage analysis confirms the presence of susceptibility loci for schizophrenia, on chromosomes 1q32.2, 5q33.2, and 8p21-22 and provides support for linkage to schizophrenia, on chromosomes 11q23.3-24 and 20q12.1-11.23. The American Journal of Human Genetics, 68(3), 661–673.
Hastbacka, J., de la Chapelle, A., & Mahtani, M. M. (1994). The diastrophic dysplasia gene encodes a novel sulfate transporter: Positional cloning by fine-structure linkage disequilibrium mapping. Cell, 78(6), 1073–1087.
Hawi, Z., Myakishev, M. V., Straub, R. E., O’Neill, A., Kendler, K. S., Walsh, D., & Gill, M. (1997). No association or linkage between the 5-HT2a/T102C polymorphism and schizophrenia in Irish families. American Journal of Medical Genetics, 74(4), 370–373.
Heils, A., Teufel, A., Petri, S., Stober, G., Riederer, P., Bengel, D., & Lesch, K. P. (1996). Allelic variation of human serotonin transporter gene expression. Journal of Neurochemistry, 66, 2621–2624.

Herken, H., & Erdal, M. E. (2001). Catechol-O-methyltransferase gene polymorphism in schizophrenia: Evidence for association between symptomatology and prognosis. Journal of Psychiatric Genetics, 11(2), 105–109.
Hill, S. Y., Locke, J., Zezza, N., Kaplan, B., Neiswanger, K., Steinhauer, S. R., Wipprecht, G., & Xu, J. (1998). Genetic association between reduced P300 amplitude and the DRD2 dopamine receptor A1 allele in children at high risk for alcoholism. Biological Psychiatry, 43(1), 40–51.
Hovatta, L., Varilo, T., Suvisaari, J., Terwilliger, J., et al. (1999). A genomewide screen for schizophrenia genes in an isolated Finnish subpopulation, suggesting multiple susceptibility loci. The American Journal of Human Genetics, 65(4), 1114–1124.
Hranilovic, D., Schwab, S. G., Jernej, B., Knapp, M., Lerer, B., Albus, M., et al. (2000). Serotonin transporter gene and schizophrenia: Evidence for association/linkage disequilibrium in families with affected siblings. Molecular Psychiatry, 5, 91–95. doi:10.1038/sj.mp.4000599.
Illarioshkin, S. N., Ivanova-Smolenskaya, I. A., et al. (1996). Clinical and molecular analysis of a large family with three distinct phenotypes of progressive muscular dystrophy. Brain, 119, 1895–1909.
Imai, Y., Gates, M. A., Melby, A. E., Kimelman, D., Schier, A. F., & Talbot, W. S. (2001). The homeobox genes vox and vent are redundant repressors of dorsal fates in zebrafish. Development, 128(12), 2407–2420.
Jacobsen, L. K., Staley, J. K., et al. (2000). Prediction of dopamine transporter binding availability by genotype: A preliminary report. American Journal of Psychiatry, 157(10), 1700–1703.
Jaskiw, G. E., & Weinberger, D. R. (1992). Dopamine and schizophrenia—A cortically corrective perspective. The Neurosciences, 4, 179–188.
Jönsson, E., Lannfelt, L., Sokoloff, P., Schwartz, J. C., & Sedvall, G. (1993). Lack of association between schizophrenia and alleles in the dopamine D3 receptor gene. Acta Psychiatrica Scandinavica, 87(5), 345–349.
Jorde, L. B. (2000). Linkage disequilibrium and the search for complex disease genes. Genome Research, 10(10), 1435–1444. Kalsi, G., Mankoo, B., Curtis, D., Sherrington, R., Melmer, G., Brynjolfsson, J., Sigmundsson, T., Read, T., Murphy, P., Petursson, H., & Gurling, H. (1999). New DNA markers with increased informativeness show diminished support for a chromosome 5q11-13 schizophrenia suscepti- bility locus and exclude linkage in two new cohorts of British and Icelandic families. Annals of Human Genetics, 63(Pt 3), 235–247. Kapur, S., & Remington, G. (1996). Serotonin-Dopamine interaction and its relevance to schizo- phrenia. American Journal of Psychiatry, 153(4), 466–476. Kapur, A., & Seeman, A. (2001). Does fast dissociation from the dopamine d(2) receptor explain the action of atypical antipsychotics? A new hypothesis. American Journal of Psychiatry, 158 (3), 360–369. Kaufmann, C. A., Suarez, B., et al. (1998). The NIMH genetics initiative Millennium schizophre- nia consortium: Linkage analysis of African-American pedigrees. The Journal of Medical Genetics (Neuropsychiatric Genetics), 81, 282–289. Kebabian, J. W., Beaulieu, M., & Itoh, Y. (1984). Pharmacological and biochemical evidence for the existence of two categories of dopamine receptor. The Canadian Journal of Neurological Sciences, 1, 114–117. Kendler, K. S., MacLean, C. J., O’Neill, F. A., et al. (1996). Evidence for a schizophrenia vulnerability locus on chromosome 8p in the Irish study of high-density schizophrenia fami- lies. American Journal of Psychiatry, 153, 1534–1540. Kendler, K. S., Myers, J. M., et al. (2000). Clinical features of schizophrenia and linkage to chromosomes 5q, 6p, 8p, and 10p in the Irish study of high-density schizophrenia families. American Journal of Psychiatry, 157(3), 402–408. King, N., Bassett, A. S., Honer, W. G., et al. (1997). Absence of linkage for schizophrenia on the short arm of chromosome 5 in multiplex Canadian families. 
American Journal of Medical Genetics, 74, 472–474.

[email protected] 16 1 Current Problems of Complex Disease Genes Mapping

Knight, J. (1983). Dopamine-receptor-stimulating autoantibodies: A possible cause of schizophre- nia. The Lancet (Nov. 13, 1982), 1073–1075. Koishi, S., Yamazaki, K., Yamamoto, K., Koishi, S., Enseki, Y., Nakamura, Y., Oya, A., Yasueda, M., Asakura, A., Aoki, Y., Atsumi, M., Inomata, J., Inoko, H., & Matsumoto, H. (2004). Notch4 gene polymorphisms are not associated with autism in Japanese population. American Journal of Medical Genetics Part B: Neuropsychiatric Genetics, 125B(1), 61–62. Kojima, M., Hosoda, H., Date, Y., Nakazato, M., Matsuo, H., & Kangawa, K. (1999). Ghrelin is a growth-hormone-releasing acylated peptide from stomach. Nature, 402(6762), 656–660. Kosower, N. S., Gerad, L., Goldstein, M., et al. (1995). Constitutive heterochromatin of chromo- some 1 and Duffy blood group alleles in schizophrenia. American Journal of Medical Genetics, 60, 133–138. Kouzmenko, A. P., Scaffidi, A., Pereira, A. M., Hayes, W. L., Copolov, D. L., & Dean, B. (1999). No correlation between A(-1438)G polymorphism in 5-HT2A receptor gene promoter and thedensity of frontal cortical 5-HT2A receptors in schizophrenia. Human Heredity, 49(2), 103– 105. Kruglyak, L. (1999). Genetic isolates: Separate but equal? Proceedings of the National Academy of Sciences of the United States of America, 96(4), 1170–1172. Kukull, W. A., & Martin, G. M. (1998). Editorials: APOE polymorphisms and late-onset Alzheimer disease. Journal of the American Medical Association, 279(10), 788–789. Lawrie, S. M., Whalley, H. C., Abukmeil, S. S., et al. (2001). Brain structure, genetic liability and psychotic symptoms in subjects at high risk of developing schizophrenia. Biological Psychi- atry, 49, 811–823. Lee, J. F., Lu, R. B., Ko, H. C., et al. (1999). No association between DRD2 locus and alcoholism after controlling the ADH and ALDH genotypes in Chinese Han population. Alcoholism: Clinical and Experimental Research, 23(4), 592–599. Levinson, D. F. (2003). 
Molecular genetics of schizophrenia: Review of the recent literature. Current Opinion in Psychiatry, 16, 157–170. Levinson, D. F., Holmans, P. A., et al. (2002). No major schizophrenia locus detected on chromosome 1q in a large multicenter sample. Science, 296, 739–741. Levinson, D. F., Mahtani, M. M., et al. (1998). Genome scan of schizophrenia. American Journal of Psychiatry, 155(6), 741–750. Lin, C., Hsiao, C., & Chen, W. (1999). Development of sustained attention assessed using the continuous performance test among children 6–15 years of age. Journal of Abnormal Child Psychology, 27, 403–412. Litt, M., al-Dhalimy, M., Zhou, Q., Grandy, D., & Civelli, O. (1991). A TaqI RFLP at the DRD1 locus. Nucleic Acids Research, 19(11), 3161. Liu, W. M., Mei, R., Di, X., Ryder, T. B., Hubbell, E., Dee, S., Webster, T. A., Harrington, C. A., Ho, M. H., Baid, J., & Smeekens, S. P. (2002). Analysis of high-density expression microarrays with signed-rank call algorithms. Bioinformatics, 18(12), 1593–1599. Liu, L. L., Wei, J., Zhang, X., Li, X. Y., Shen, Y., Liu, S. Z., Ju, G. Z., Shi, J. P., Yu, Y. Q., Xu, Q., & Hemmings, G. P. (2004). Lack of a genetic association between the TNXB locus and schizophrenia in a Chinese population. Neuroscience Letters, 355(1–2), 149–151. Mant, R., Williams, J., Asherson, P., Parfitt, E., McGuffin, P., & Owen, M. J. (1994). Relationship between homozygosity at the dopamine D3 receptor gene and schizophrenia. American Journal of Medical Genetics, 54(1), 21–26. Maziade, M., Martinez, M., Rodrigue, C., Gauthier, B., Tremblay, G., Fournier, C., Bissonnette, L., Simard, C., Roy, M. A., Rouillard, E., & Me´rette, C. (1997). Childhood/early adolescence- onset and adult-onset schizophrenia. Heterogeneity at the dopamine D3 receptor gene. The British Journal of Psychiatry, 170, 27–30. Maziade, M., Raymond, V., Cliche, D., Fournier, J. P., Caron, C., Garneau, Y., Nicole, L., Marcotte, P., Couture, C., Simard, C., et al. (1995). 
Linkage results on 11Q21-22 in Eastern Quebec pedigrees densely affected by schizophrenia. American Journal of Medical Genetics, 60(6), 522–528.

[email protected] References 17

McGinnis, R. E., Fox, H., Yates, P., et al. (2001). Failure to confirm NOTCH4 association with schizophrenia in a large population-based sample from Scotland. Nature Genetics, 28, 128–129. McGlashan, T. H., & Hoffman, R. E. (2000). Schizophrenia as a disorder of developmentally reduced synaptic connectivity. Archives of General Psychiatry, 57(7), 637–648. McGuffin, P., & Owen, M. (1991). The molecular genetics of schizophrenia: An overview and forward view. European Archives of Psychiatry and Clinical Neuroscience, 240(3), 169–173. McNeil, T. F., Cantor-Graae, E., et al. (1993). Prenatal cerebral development in individuals at genetic risk for psychosis: Head size at birth in offspring of women with schizophrenia. Schizophrenia Research, 10(1), 1–5. Millar, J. K., Wilson-Annan, J. C., Anderson, S., et al. (2000). Disruption of two novel genes by a translocation co-segregating with schizophrenia. Human Molecular Genetics, 9, 1415–1423. Moises, H. W., Gelernter, J., et al. (1991). No linkage between D2 dopamine receptor gene region and schizophrenia. Archives of General Psychiatry, 48(7), 643–647. Moises, H. W., Yang, L., Krisbjanarson, H., et al. (1995). An international two-stage genome-wide search for schizophrenia susceptibility genes. Nature Genetics, 11, 321–324. Mulcrone, J., Whatley, S. A., Marchbanks, R., et al. (1995). Genetic linkage analysis of schizo- phrenia using chromosome 11q13-24 markers in Israeli pedigrees. American Journal of Medical Genetics, 60, 103–108. Nanko, S., Gill, M., Owen, M., Takazawa, N., Moridaira, J., & Kazamatsuri, H. (1992). Linkage study of schizophrenia with markers on chromosome 11 in two Japanese pedigrees. The Japanese Journal of Psychiatry and Neurology, 46(1), 155–159. Nanko, S., Sasaki, T., Fukuda, R., Hattori, M., Dai, X. Y., Kazamatsuri, H., Kuwata, S., Juji, T., & Gill, M. (1993). A study of the association between schizophrenia and the dopamine D3 receptor gene. Human Genetics, 92(4), 336–338. 
Naylor, L., Dean, B., Pereira, A., Mackinnon, A., Kouzmenko, A., & Copolov, D. (1998). No association between the serotonin transporter-linked promoter region polymorphism and either schizophrenia or density of the serotonin transporter in human hippocampus. Molecular Medicine, 4(10), 671–674. Nimgaonkar, V. L., Zhang, X. R., Caldwell, J. G., Ganguli, R., & Chakravarti, A. (1993). Association study of schizophrenia with dopamine D3 receptor gene polymorphisms: Probable effects of family history of schizophrenia? American Journal of Medical Genetics, 48(4), 214–217. Nimgaonkar, V. L., Zhang, X. R., Brar, J. S., DeLeo, M., & Ganguli, R. (1996, Spring). 5-HT2 receptor gene locus: Association with schizophrenia or treatment response not detected. Psychiatric Genetics, 6(1), 23–27. Norton, N., Kirov, G., Zammit, S., Jones, G., Jones, S., Owen, R., Krawczak, M., Williams, N. M., O’Donovan, M. C., & Owen, M. J. (2002). Schizophrenia and functional polymorphisms in the MAOA and COMT genes: No evidence for association or epistasis. American Journal of Medical Genetics, 114(5), 491–496. Nothen, M. M., Cichon, S., et al. (1993). Excess of homozygosis at the dopamine D3 receptor gene in schizophrenia not confirmed. Journal of Medical Genetics, 30(8), 708. Nothen,€ M. M., Erdmann, J., Shimron-Abarbanell, D., & Propping, P. (1994). Identification of genetic variation in the human serotonin 1D beta receptor gene. Biochemical and Biophysics Research Communications, 205(2), 1194–1200. Oliveira, J. R., et al. (1998). The short variant of the polymorphism within the promoter region of the serotonin transporter gene is a risk factor for late onset Alzheimer’s disease. Molecular Psychiatry, 3, 438–441. Owen, M. J., Cardno, A. G., & O’Donovan, M. C. (2000). Psychiatric genetics: Back to the future. Molecular Psychiatry, 5(1), 22–31. Paunio, T., Ekelund, J., Hovatta, I., et al. (2000). Genome wide scan of an extended Finnish schizophrenia study sample. 
American Journal of Medical Genetics, 96, 460.

[email protected] 18 1 Current Problems of Complex Disease Genes Mapping

Peltonen, L. (2000). Positional cloning of disease genes: Advantages of genetic isolates. Human Heredity, 50(1), 66–75. Prasad, S., Deshpande, S. N., Bhatia, T., Wood, J., Nimgaonkar, V. L., & Thelma, B. K. (1999). Association study of schizophrenia among Indian families. American Journal of Medical Genetics, 88(4), 298–300. Prasad, S., Semwal, P., Deshpande, S., et al. (2002). Molecular genetics of schizophrenia: Past, present and future. Journal of Biosciences, 27, 35–52. Pulver, A. E., Lasseter, V. K., Kasch, L., et al. (1995). Schizophrenia a genome scan targets chromosomes 3p and 8p as potential sites of susceptibility genes. American Journal of Medical Genetics, 60, 252–260. Puzyrev, V. P., & Stepanov, V. A. (1997). Pathological anatomy of human genome (p. 223). Novosibirsk: Nauka. RAS Siberian company. Rao, D., Jonsson, E. G., Paus, S., Ganguli, R., Nothen, M., & Nimgaonkar, V. L. (1998). Schizophrenia and the serotonin transporter gene [In process citation]. Psychiatric Genetics, 8, 207–212. Reich, D. E., Cargill, M., Bolk, S., et al. (2001). Linkage disequilibrium in the human genome. Nature, 411(6834), 199–204. Saito, T., Guan, F., Papolos, D. F., Rajouria, N., Fann, C. S., & Lachman, H. M. (2001). Polymorphism in SNAP29 gene promoter region associated with schizophrenia. Journal of Molecular Psychiatry, 6(2), 193–201. Schwab, S. G., Eckstein, G. N., Hallmayer, J., et al. (1997). Evidence suggestive of a locus on chromosome 5q31 contributing to susceptibility for schizophrenia in German and Israeli families by multipoint affected sib-pair linkage analysis. Molecular Psychiatry, 2, 156–160. Schwab, S. G., Hallmayer, J., Albus, M., Lerer, B., Eckstein, G. N., Borrmann, M., Segman, R. H., Hanses, C., Freymann, J., Yakir, A., Trixler, M., Falkai, P., Rietschel, M., Maier, W., & Wildenauer, D. B. (2000). A genome-wide autosomal screen for schizophrenia susceptibility loci in 71 families with affected siblings: Support for loci on chromosome 10p and 6. 
Molecular Psychiatry, 5(6), 638–649. Shaikh, S., Collier, D., Arranz, M., Ball, D., Gill, M., & Kerwin, R. (1994). DRD2 Ser311/Cys311 polymorphism in schizophrenia. Lancet, 343(8904), 1045–1046. Shayevitz, C., Cohen, O. S., Faraone, S. V., & Glatt, S. J. (2012). A re-review of the association between the NOTCH4 locus and schizophrenia. American Journal of Medical Genetics Part B: Neuropsychiatric Genetics, 159(5), 477–483. doi:10.1002/ajmg.b.32050. Epub 2012 Apr 9. Sherrington, R., Brynjolfsson, J., Petursson, H., Potter, M., Dudleston, K., Barraclough, B., Wasmuth, J., Dobbs, M., & Gurling, H. (1988). Localization of a susceptibility locus for schizophrenia on chromosome 5. Nature, 336(6195), 164–167. Sidenberg, D. G., Bassett, A. S., Demchyshyn, L., et al. (1993). New polymorphism for the human serotonin 1D receptor variant (5-HT1D beta) not linked to schizophrenia in five Canadian pedigrees. Human Heredity, 43(5), 315–318. Silverman, J. M., Greenberg, D. A., Altstiel, L. D., Siever, L. J., Mohs, R. C., Smith, C. J., Zhou, G., Hollander, T. E., Yang, X. P., Kedache, M., Li, G., Zaccario, M. L., & Davis, K. L. (1996). Evidence of a locus for schizophrenia and related disorders on the short arm of chromosome 5 in a large pedigree. American Journal of Medical Genetics, 67(2), 162–171. Sklar, F. H., Fitz, H. C., Wu, Y., Van Zee, R., & McVoy, C. (2001). The design of ecological landscape models for everglades restoration. Ecological Economics, 37(3), 379–401. Sobell, J., Sigurdson, D. C., Heston, L., & Sommer, S. (1994). S311C D2DR variant: No association with schizophrenia. Lancet, 344(8922), 621–622. Spurlock, G., Williams, J., McGuffin, P., Aschauer, H. N., Lenzinger, E., Fuchs, K., Sieghart, W. C., Meszaros, K., Fathi, N., Laurent, C., Mallet, J., Macciardi, F., Pedrini, S., Gill, M., Hawi, Z., Gibson, S., Jazin, E. E., Yang, H. T., Adolfsson, R., Pato, C. N., Dourado, A. M., & Owen, M. J. (1998). 
European Multicentre Association Study of Schizophrenia: A study of the DRD2 Ser311Cys and DRD3 Ser9Gly polymorphisms. American Journal of Medical Genetics, 81(1), 24–28.

[email protected] References 19

St Clair, D., Blackwood, D., Muir, W., et al. (1990). Association within a family of a balanced autosomal translocation with major mental illness. Lancet, 336, 13–16. Straub, R. E., Jiang, Y., MacLean, C. J., et al. (2002). Genetic variation in the 6p22.3 gene DTNBP1, the human ortholog of the mouse disbinding gene, is associated with schizophrenia. The American Journal of Human Genetics, 71, 337–348. Straub, R. E., MacLean, C. J., O’Neill, F. A., et al. (1997). Support for a possible schizophrenia vulnerability locus in region 5q22-31 in Irish families. Molecular Psychiatry, 2(N2), 148–155. Terwilliger, J. D., & Ott, J. (1994). Handbook of human genetic linkage. Baltimore, MA: JHU Press. Trubnikov, V. I., & Gindilis, V. M. (1981). Table method of the component expansion of phenotypic variance based on correlations between relatives [Article in Russian]. Genetika, 17(6), 1107–1116. Verga, M., Macciardi, F., Pedrini, S., Cohen, S., & Smeraldi, E. (1997). No association of the Ser/ Cys311 DRD2 molecular variant with schizophrenia using a classical case control study and the haplotype relative risk. Schizophrenia Research, 25(2), 117–121. Wang, Z. W., Black, D., Andreasen, N. C., & Crowe, R. R. (1993). A linkage study of chromosome 11q in schizophrenia. Archives of General Psychiatry, 50(3), 212–216. Waterwort, D. M., Bassett, A. S., & Brzustowicz, L. M. (2002). Recent advances in the genetics of schizophrenia. Cellular and Molecular Life Sciences, 59(2), 331–348. Wei, J., & Hemmimgs, G. P. (2000). The NOTCH4 locus is associated with susceptibility to schizophrenia. Nature Genetics, 25(4), 376–377. Williams, J., Spurlock, G., McGuffin, P., Mallet, J., No¨then, M. M., Gill, M., Aschauer, H., Nylander, P. O., Macciardi, F., & Owen, M. J. (1996). Association between schizophrenia and T102C polymorphism of the 5-hydroxytryptamine type 2a-receptor gene. European Multicentre Association Study of Schizophrenia (EMASS) Group. Lancet, 347(9011), 1294– 1296. 
Zavattari, P., Lampis, R., & Mulargia, A. (2000). Confirmation of the DRB1-DQB1 loci as the major component of IDDM1 in the isolated founder population of Sardinia. Human Molecular Genetics, 9(20), 2967–2972.

Chapter 2 Descriptions and Methods of Study in Selected Genetic Isolates of Dagestan

2.1 History and Ethno-linguistic Diversity of Dagestan

Dagestan has unique ethnic diversity: it contains 26 indigenous ethnic groups that, according to archeological data, have lived in the same highland regions for more than 10,000 years (Gadzhiev et al. 1996); artifacts dating from the Lower Paleolithic to the Middle Ages have been found in Dagestan (Kotovich 1961).

Dagestan occupies 50,300 km² and stretches diagonally along the Great Caucasus Mountain Range. The republic is divided into three geographic zones: mountains, foothills, and plains. Flat areas constitute most of northern Dagestan. The greatest ethnic diversity is found in the highland areas, which make up two-thirds of Dagestan.

Dagestan was often invaded because of its strategic location. Arab conquests of the territory occurred in the seventh and twelfth centuries. After the conquest of , Tabasaran, Kaitag, and Zirehgeran, the Arabs invaded Dagestan in the first half of the eighth century, reached and Avaria, and ravaged Khunzakh (Gadzhiev 1971). After their conquest of Dagestan, the Arabs forcibly imposed Islam and a predatory tax policy. The Mongols later conquered Dagestan, and the area became part of the ; the Tatar-Mongols, however, could not assert their authority in the mountains (Lavrov 1951).

Tamerlane invaded Dagestan in 1395 and devastated the country. After conquering Middle and Nagorny (highland) Dagestan, his forces returned to Tarqui and moved on to Zirehgeran (Kubachi) and Kaitag, whose people asked for mercy and were pardoned. From Kaitag, Tamerlane moved to Derbent, where he ordered the fortress strengthened, and then left Dagestan. His occupation was brief, but it severely affected the mountaineers.

Economic revival began in Dagestan after Tamerlane’s death. Large settlements formed as small villages merged.
Unable to establish viable agriculture, these mountain villages developed artisan trades; in addition, men would leave their villages seasonally to look for work in cities and remote areas. The mountainous areas of Dagestan produced its most famous crafts: weapons, jewelry, leather, and cloth. Metalworking was divided between Harbuk and Kubachi, two neighboring auls: Harbuk specialized in manufacturing tools and weapons, which were then decorated in Kubachi. The village of Amuzgi also supplied Kubachi; its blades were considered the best in Dagestan.

Dagestan’s capital, Makhachkala, is now a large industrial city with typical infrastructure, including railways and airports. An official statistical survey in 1989 put the population of Makhachkala at 390,000. All indigenous peoples of Dagestan, as well as other ethnic groups of Russia, are represented in its population.

A 1723 campaign resulted in peace between Russia and ; however, the peace was not kept. The struggle among Russia, Turkey, and Iran for control of the Caucasus and the Caspian region lasted nearly 100 years. The Russian tsarist government pursued a harsh policy in the Caucasus, causing unrest among the mountaineers; their desire for freedom was first expressed in armed resistance, which evolved into a war of national liberation in 1824.

The 1897 census classified the population of Dagestan by the dialect spoken. According to these data, the Avaric-Andian dialect was spoken by the largest share of the population (27.45% of men and 28.09% of women), followed by the dialects of the Dargins (21.05% of men and 21.45% of women), Kurins (15.79% of men and 17.32% of women), and Laks (11.97% of men and 14.75% of women). The Domo-Turkish-Tatar dialect was represented by the Kumyk (9.45% of men and 8.52% of women) and Tatar (6.09% of men and 5.18% of women) dialects. A negligible percentage of the population spoke other languages and dialects (CSA, p. 22).

© Springer International Publishing Switzerland 2016. K. Bulayeva et al., Genomic Architecture of Schizophrenia Across Diverse Genetic Isolates, DOI 10.1007/978-3-319-31964-3_2
A 1926 census found that 79 nationalities made up the population of Dagestan, in addition to small ethnic groups such as the Darginians, Kubachians, Kaitagians, and Andians: Avars, Botlikhtians, Godoberintians, Katainians, Ahvahtians, Kvanadinians, Chamalinians, Tindinians, Cezians, Khvarshinians, Bejtinians, and Gunzebtians (i.e., representatives of the Ando-Tsuntin language group). In a subsequent Soviet census, the number of ethnic groups in Dagestan dropped sharply to 14, because relatively small groups, such as speakers of the Kubachi and Ando-Tsuntin languages, were merged into the larger Dargin and Avar groups.

2.2 Genetic and Demographic Structure of the Selected Isolates

Between 2000 and 2011 we conducted annual expedition studies of mono-ethnic Dagestan genetic isolates with aggregation of schizophrenia. Our group ascertained and studied four mono-ethnic genetic isolates located in remote highland areas of Dagestan; the isolates represented three indigenous groups: Laks, Dargins, and Tindals. We studied genetic structure, marriage structure, inbreeding coefficient, basic vital statistics (fertility, morbidity, and mortality), and epidemiology among residents of the isolates. A questionnaire survey covering the parameters listed in Table 2.1 was administered to 278 people.

Table 2.1 List of parameters studied in the survey of isolate residents

1. N: coded number of proband
2. MR: marriages
   1: all generations are endogamous
   2: one exogamous marriage in the generations
   3: all generations are exogamous
3. CMG: consanguineous marriage in proband ancestors
   1: there were no consanguineous marriages in the generations
   2: there were consanguineous marriages in all generations
4. CP: coefficient of parentage between proband and spouse
   0: all exogamic, remote (ethnic)
   1: there is no relationship, endogamous marriage
   2: four remote siblings (1/256)
   3: three remote uncle and niece (1/128)
   4: three remote siblings (1/64)
   5: two remote uncle and niece (1/32)
   6: two remote siblings (1/16)
5. NPR: number of pregnancies
   0: no pregnancy
   1, 2, ...: number of pregnancies
6. NLF: number of live-born children
   0: there were no live-born children
   1...n: number of live-born children
7. NLFB: number of live-born boys
   0: there were no live-born children
   1...n: number of live-born boys
8. NLFG: number of live-born girls
   0: there were no live-born children
   1...n: number of live-born girls
9. NSA: number of miscarriages
   0: there were no miscarriages
   1...n: number of miscarriages
10. NSB: number of stillbirths
   0: no stillbirths
   1...n: number of stillbirths
11. NPM: prenatal mortality
   0: none
   1...n: number of miscarriages and stillbirths
12. NDF: number of deaths of children after birth
   0: no child died
   1...n: number of deaths after birth
13. NDF1: number of deaths under 1 year
   0: no child died
   1...n: number of deaths under the age of 1 year
14. NDF2: number of deaths of children aged under 18 years
   0: no child died
   1...n: number of deaths under the age of 18 years
15. NSF: number of surviving children
   0: there were no surviving children
   1...n: number of surviving children
16. RMP: number of family members who resettled with the proband
   0: not moved
   1...n: number of resettled members
17. RMPD: number of genetic relatives who died within the first 5 years after resettlement
   0: nobody died
   1...n: number of deaths of family members after resettlement

Table 2.2 presents the mono-ethnic settlements of the ethnic groups considered in Dagestan. Among these groups, Avars are the largest, while Tindals are the smallest at about 4000 people. The studied ethnic groups possess unique customs and linguistic features. Villages in the mountainous areas of Dagestan have distinctive customs and a distinct dialect, even within relatively large ethno-linguistic groups. Figure 2.1 shows photographs of hats from different Dagestani ethnic groups and villages; Table 2.3 gives the dynamics of the national structure of the rural population.

Total population size is the most reliable statistic available over a period long enough for population genetic analysis. These values, however, cannot be used directly in the analysis. Individuals who do not produce offspring do not affect the genetic structure of the next generation; the reproductive size of a population may therefore be significantly smaller than its total size. A population in which each member of reproductive age contributes equally to the gene pool of the next generation is an abstraction. A number of biological and social factors affect reproductive population size and cause it to fluctuate over time, including unequal sex ratios, celibacy, infertility, differential fertility, infant mortality, and deviation from panmixia.
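The gap between total and reproductive population size can be illustrated with two standard textbook approximations from population genetics. They are not taken from this study: Wright's formula for an unequal sex ratio, and the harmonic mean for a size that fluctuates between generations. The sketch below assumes random mating within each sex and discrete generations; the function names and the village numbers are hypothetical.

```python
def effective_size_sex_ratio(n_males: int, n_females: int) -> float:
    """Wright's approximation: Ne = 4 * Nm * Nf / (Nm + Nf)."""
    if n_males <= 0 or n_females <= 0:
        raise ValueError("both sexes need at least one breeding individual")
    return 4.0 * n_males * n_females / (n_males + n_females)


def effective_size_over_generations(sizes: list[float]) -> float:
    """Harmonic mean of per-generation sizes; small generations dominate."""
    if not sizes or any(n <= 0 for n in sizes):
        raise ValueError("sizes must be a non-empty list of positive numbers")
    return len(sizes) / sum(1.0 / n for n in sizes)


# Hypothetical village with 100 breeding adults in both scenarios.
balanced = effective_size_sex_ratio(50, 50)  # equal sex ratio
skewed = effective_size_sex_ratio(20, 80)    # unequal sex ratio

# Hypothetical sequence of generation sizes with one sharp reduction.
bottleneck = effective_size_over_generations([100, 100, 25])

print(balanced, skewed, round(bottleneck, 2))
```

With the same census count of 100 adults, the skewed sex ratio gives a smaller effective size than the balanced one, and a single small generation pulls the long-term value well below the arithmetic mean. The other factors listed above (celibacy, differential fertility, and so on) would need separate terms that this sketch does not model.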
Spontaneous selection of mates by external qualities is referred to as assortative marriage. Social and environmental stratification of a community causes positive assortative marriage on a variety of genetically informative indicators (ethnicity, religious beliefs, origin, etc.), which in turn prevents these communities from mixing. Selection of spouses by these socio-demographic parameters can cause secondary assortativity in morphological and psychological phenotypes, which determines a certain genotypic resemblance. In human populations, assortative marriage may be environmental, phenotypic, or genetic. Environmental assortative marriage refers to the selection of spouses by factors such as origin, ethnic and religious affiliation, and socioeconomic status (Cavalli-Sforza and Bodmer 1971; Bunak 1980; Cavalli-Sforza and Feldman 1978; van den Berg 1972; Bulayeva 1981; Kurbatova and Pobedonostseva 1988; Bulayeva 1990, 1991).

Table 2.2 Number of ethnic groups in Dagestan and mono-ethnic villages in them

Ethnic group      Total (1996)   Number of settlements (range of settlement sizes)
Avars             380,758        320 (5000–250)
Dargins           321,564        250 (4200–150)
Kumykians         259,302        212 (4500–900)
Lezghinians       212,146        190 (4000–900)
Laktians          97,752         82 (3500–150)
Tabasaranians     78,439         70 (2110–280)
Agullas           13,830         18 (1200–300)
Rutulians         14,988         12 (2800–400)
Tsakhurians       5221           4 (1100–700)
Andians           30,000         9 (2800–600)
Botlikhtians      4517           1
Godoberintians    3705           1
Katainians        13,795         12 (3500–700)
Ahvahtians        9560           7 (3200–400)
Chamalinians      8939           5 (1200–300)
Bagulalyans       7955           4 (1100–550)
Tindinians        3911           3 (1200–650)
Khvarshinians     2000           1
Didoitians        20,000         29 (1900–200)
Gunzebtians       1000           1
Genuhtuntians     2000           1
Bejtinians        14,472         19 (1100–160)
Agvalintians      1924           1
Kubachians        3000           1
Archians          2000           1

Environmental assortative marriages can be positive, when spouses choose partners similar to themselves (e.g., tall men usually marry women of similar height), or negative, when spouses choose complementary partners. Genetic and demographic studies show that the spousal correlation for age ranges from 0.51 to 0.99 (Spuhler 1968) but is most often 0.8 (Cavalli-Sforza and Bodmer 1971; Kurbatova et al. 1984). This type of assortative marriage can cause secondary phenotypic assortativity, characterized by a temporary morphological trend (Cavalli-Sforza and Bodmer 1971), which is reflected in the genotypic features of the offspring of such marriages. Positive assortativity by ethnicity and spousal origin defines the radius of marital relationships, the extent of endogamy and exogamy, the frequency of different types of consanguineous marriage, and the level of inbreeding. These processes reduce the genetic diversity of a population (total heterozygosity) and are related to the main indicators of gene pool viability. These properties direct spouse selection (Bulayev et al. 2008b, 2009).
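The fractions attached to the CP codes in Table 2.1 (1/16 down to 1/256) match what the standard path-counting rule for the kinship coefficient gives for cousin-type relationships. Each chain of L parent-child links joining two relatives through a common ancestor contributes (1/2)^(L+1), provided the common ancestors are not themselves inbred. The sketch below is illustrative only: reading "two remote siblings" as first cousins, and so on down the table, is an assumption, and the function and dictionary names are hypothetical.

```python
from fractions import Fraction


def kinship_from_paths(path_lengths: list[int]) -> Fraction:
    """Sum (1/2)**(L + 1) over all paths of L links via a common ancestor.

    Assumes the common ancestors are not inbred.
    """
    return sum((Fraction(1, 2) ** (length + 1) for length in path_lengths),
               Fraction(0))


# Each relationship has two common ancestors (a couple), hence two paths.
# Mapping CP codes to these relationships is an assumed interpretation.
cp_code_paths = {
    6: [4, 4],  # first cousins
    5: [5, 5],  # first cousins once removed
    4: [6, 6],  # second cousins
    3: [7, 7],  # second cousins once removed
    2: [8, 8],  # third cousins
}

for code, paths in sorted(cp_code_paths.items(), reverse=True):
    print(code, kinship_from_paths(paths))
```

The kinship coefficient of two spouses equals the expected inbreeding coefficient of their children, which is why a marriage-level code of this kind can feed directly into an estimate of the level of inbreeding in a village.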


Fig. 2.1 Women’s hats from representatives of different ethnic groups in Dagestan (Gadzhieva 1961)

Phenotypic assortativity between spouses increases genotypic diversity in a population, because the effects of assortativity accumulate from one generation to the next. Genetic assortativity is associated with consanguineous marriages and refers to the selection of spouses within the same kindred, with common genetic ancestors. Unlike phenotypic assortativity, which is more common in modern human populations, inbreeding is typical of isolated populations. The effects of assortative marriage and inbreeding on population genetic structure are similar, but inbreeding is based on common genetic origin.

Table 2.3 Dynamics of the national structure of the rural population of Dagestan (1926–1989)

Nationality               1926    1939    1959    1970    1979    1989
Both genders, total (%)   100.0   100.0   100.0   100.0   100.0   100.0
1. Avars                  22.3    30.6    28.8    30.9    31.5    33.6
2. Darginians             17.6    19.5    17.2    17.7    18.0    18.8
3. Kumykians              13.2    11.0    10.7    10.7    11.3    11.9
4. Russians               10.6    8.1     7.1     5.4     3.6     2.5
5. Lezghinians            14.4    12.5    13.4    13.9    13.5    12.4
6. Laks                   6.3     6.5     4.2     4.5     3.9     3.4
7. Tabasaranians          5.2     4.5     4.2     4.9     5.4     5.1
8. Azerbaijanians         0.1     2.8     3.1     3.5     3.7     3.8
9. Chechens               –       3.1     1.3     2.6     2.9     3.2
10. Nogais                –       –       1.2     2.1     2.2     2.4
11. Rutulians             1.3     –       1.0     1.3     1.3     1.1
12. Hebrews               0.1     –       0.1     0.1     0.2     –
13. Agullas               1.7     –       0.1     1.0     1.0     1.0
14. Tatars                0.3     –       –       –       –       –
15. Ukrainians            0.3     0.5     0.4     0.2     0.9     0.1
16. Armenians             0.2     –       0.2     0.1     0.1     0.1
17. Tatars                –       –       0.3     0.1     0.1     0.1
18. Tsakhurians           –       –       0.6     0.5     0.4     0.4
19. Other                 6.4     0.9     6.1     0.5     0.0     0.1

Dagestan has a high proportion of centenarians and an optimal ratio of births to deaths; both are used as main criteria of the genetic adaptability of human populations. A group of scientists led by Bulayeva found that the relocation of highlanders from their native villages to the lowlands of Dagestan in 1944 led to a number of adverse genetic and demographic processes and to a significant increase in selection pressure among the migrants. The most isolated and inbred populations with unique ethno-linguistic affiliation (one language per village) reacted more negatively to the change in their environment. Genetic studies of these migrants show that morbidity and mortality rose dramatically during the first years of adaptation, with morbidity reaching up to 75% and mortality up to 35% of the total number of relocated highlanders (Bulayeva et al. 1993, 2008a).
Our study showed that the morbidity and mortality were selective. Our previous study indicated that the offspring of close consanguineous marriages had high genetic homogeneity and higher physiological sensitivity to any environmental stress (according to the Pavlovian theory of the CNS). Among the highlanders who migrated to the lowlands, such inbred offspring showed less resistance to the unfamiliar climate, water, and food and to infections absent from the highland environment, such as malaria and typhus (Bulayeva et al. 1996, 2008a). Lowered adaptability is typical of migrants born of inbred marriages; they have higher morbidity and lower life expectancy than mountaineers who remain in their historical environment (Bulayeva et al. 1993, 1996, 2008a).

2.3 Methods of Clinical Studies

A study conducted by the WHO measured the incidence of schizophrenia in ten countries; using carefully established procedures and standard diagnostic criteria, it found nearly constant incidence values (Jablensky et al. 1992). Values ranged from 0.07 to 0.14 per 1000 of population for the roughest diagnostic categories, with an average value of 0.10 per 1000 of population. Epidemiological studies show that the reported incidence of schizophrenia is rising in most countries, especially for catatonic schizophrenia, largely due to improved patient detection and the extension of diagnostic criteria. The varying incidence and prevalence rates of schizophrenia among countries are attributable mainly to differences in diagnostic tools and methods of patient identification.

Epidemiological data on the frequencies of chronic diseases for each isolate examined in our studies were collected using the following methods:
1. Examining national health through the statistical compilations regularly issued by the Ministry of Health of Dagestan.
2. Examining the registry of the regional ambulatory, which records past diseases and identifies the affected contingent and the structure of these diseases.
3. Examining the medical records of patients from regional and rural dispensaries and health centers.
4. Extensive expedition studies, including participating doctors from clinical centers in Dagestan, since 1993.

Psychiatrists Kurbanov and Guseynova participated in our regular expeditions and performed detailed clinical examinations of chronic patients and potential patients to clarify the schizophrenia diagnosis during our studies. Examination of a mental patient by international diagnostic techniques developed in the USA and approved by the WHO took 6–8 h. Diagnoses from the Republican Clinical Hospitals and District Hospitals and Clinics were refined during research expeditions.
Our clinicians used the diagnoses of Dagestan psychiatric hospitals for affected members of ascertained pedigrees, as well as the Diagnostic Interview for Genetic Studies (DIGS), based on DSM-IV criteria (American Psychiatric Association 2000; Nurnberger et al. 1994). DIGS was translated into Russian as part of joint research, funded by joint grants, with scientists from the USA. The Dagestan psychiatrists who cooperated with us (Kurbanov and Guseynova) completed a pre-internship in a psychiatric hospital at the University of Utah, USA, and were trained to use the DSM-IV-based DIGS during expeditions. Collected clinical data were translated into English for our collaborators, professional psychiatrists from US universities, for their final diagnosis. Out of 230 examined psychiatric patients registered in Dagestan psychiatric hospitals, 130 subjects met diagnostic criteria for various forms of schizophrenia in compliance with DSM-IV. The remaining patients had a range of schizoaffective states, major recurrent depression, and a spectrum of manic-depressive conditions (dysthymia, cyclothymia, depressive disorders, anxiety disorder, and bipolar disorder types 1 and 2, BPD1 and BPD2).

Out of 130 patients with schizophrenia, 93 % had chronic auditory hallucinations, 74 % had visual hallucinations, and 34 % had olfactory hallucinations; 90 % had paranoid delusions of persecution, and 57 % had other types of delusions. The frequency of delusions associated with television varied significantly between the studied mountain villages with aggregation of schizophrenic patients, ranging from 0 to 40 %. This variation is attributed to the social environment of the patients. In villages where local religious leaders banned television to protect the Muslim community, patients likewise did not experience delusions associated with television. This is a striking example of the importance of identifying all environmental factors in the description of clinical phenotypes in a genetic study.

The following are the main types and symptoms of schizophrenia, as defined in DSM-IV and ICD-10 (World Health Organization 1992):

Types of Schizophrenia
1. Paranoid;
2. Disorganized;
3. Catatonic;
4. Undifferentiated;
5. Residual.

Diagnostic Criteria for Schizophrenia
(a) Typical symptoms. Bizarre delusions, or hallucinations in which one or several voices comment on the patient's internal dialogue, warrant a diagnosis of schizophrenia. Otherwise, a patient must present two or more of the following symptoms for more than a month to warrant the diagnosis: delusions, hallucinations, disorganized speech, and highly disorganized or catatonic behavior. Negative symptoms are manifested in the form of emotional flattening, alogia, and inability to maintain social activity.
(b) Dysfunction of social and professional activities. The presence of schizophrenia affects interpersonal relationships and the performance of professional activities.
(c) Duration.
Signs of the disorder must persist for at least 6 months, and this period must include an acute phase of at least 1 month.

Thus, to determine the incidence of a particular nosology, the Republican clinical centers obtain these diagnostic data during hospitalization. Our examination found that a limited number of patients in the studied mountain isolates were not informed of their diagnosis. The families of these patients, however, were informed of the diagnosis, and it was furthermore listed in the medical records of local clinics, the district hospital, and the national clinics where the patients were hospitalized. The age of onset of schizophrenia is related to sexual maturity. During long periods of disease exacerbation, patients find it difficult to work and maintain family relationships. Although the diagnosis is based entirely on clinical manifestation with a variety of symptoms, the use of structured interviews and targeted diagnostic criteria enables a high level of diagnostic reproducibility.

We studied clinically homogeneous or heterogeneous patients on the schizophrenia spectrum in the selected isolates after reconstructing their genealogical links. We used a pedigree reconstruction method, developed during long expeditions studying the genealogy of genetic isolates, that covers 300–700 members across 9–14 generations. As a rule, these extensive pedigrees have 2–4 founders in the most demographically old isolates. We found that all patients with homogeneous diseases usually localize into one genealogy with common ancestors. Our studies also showed that demographically old primary isolates of Dagestan show high aggregation of a certain type of disease (Bulayeva et al. 1993, 2000a). The number of patients with schizophrenia in the pedigrees of some mountain isolates ranges from 14 to 60 people; in neighboring mountain isolates other complex diseases aggregate. Neighboring isolates either did not accumulate a specific pathology or showed aggregation of other nosologies, such as hypertension, with only 70–90 patients. Our previous studies (Bulayeva et al. 2000a) showed that the epidemiological Lifetime Morbid Risk (LMR) index of schizophrenia in several Dagestani isolates was 2–5 %, which is 2–5 times the known global value of 1 %. The manifestation of definitive schizophrenia symptoms varies between patients; however, the symptoms include disorganized patterns of thinking, belief disorders, delusions, auditory hallucinations, apathy, and lack of interest in communication. Since the initial description of schizophrenia a hundred years ago, researchers have attempted to understand its biological basis, but its exact pathological mechanisms remain unknown.

2.4 Molecular-Genetic Methods of Study

Experienced nurses collected 10 mL of blood from examined subjects of Dagestani isolates using disposable syringes and vacutainers. The study included 179 residents of various mountain isolates, many of which had genealogic accumulation of complex diseases. Candidates were informed of the objectives of the study in writing, and all subjects subsequently volunteered to participate. The Dagestan IRB approved the objectives of the study and the Consent and Assent forms used with isolate members. A 10 cM genomic scan using Weber/CHLC 9.0 markers of 300 members of pedigrees with aggregation of schizophrenia from four Daghestan highland isolates was performed at the Mammalian Genotyping Service of the National Institutes of Health, USA (Weber et al. 1993). Approximately 100,000 genotypes located on the 22 autosomes and the X and Y chromosomes were generated in the isolate pedigrees. The GATA, GTAT, and GGAA markers are tetranucleotide repeats, the ATA markers are trinucleotide repeats, and the AFM markers and most others are dinucleotide repeats. The resulting genotypes of each subject covered 400 microsatellites categorized as di-, tri-, and tetranucleotides. A list of genomic markers is provided in Appendix 1.

Genomic DNA was isolated from peripheral blood leukocytes using standard methods in the laboratory of molecular genetics at the University of Utah, USA. The DNA was used for large-scale STR (Weber/CHLC 9.0 markers) and SNP (Affymetrix SNP 6.0) analyses in genome-wide scans of patients and healthy members of extensive family trees containing 250–570 members across 12–14 generations, reconstructed through our study of Dagestani genetic isolates. The size of the reconstructed family trees depended on the total size (Nt) of the studied genetic isolate: the larger Nt, the greater the number of pedigree members and of samples collected in the isolate. Conversely, isolates with a small total size have smaller pedigrees with fewer members, both affected and unaffected. SNP technology dominates modern human genome research and disease gene mapping; genome-wide SNP scans (500,000 SNPs each) were performed in 100 patients from four genetic isolates of ethnic Laks, Dargins, and Tindals. The number of alleles of a DNA microsatellite may vary among individuals, typically between 5 and 20; however, greater ranges can occur. A microsatellite scan at 10 cM spacing is too crude for association analysis, because possible linkages can be "lost." Out of all possible methods for complex disease mapping, we therefore used parametric, nonparametric (model-free), and haplotype analysis.

2.5 Genetic and Statistical Methods of Experimental Data Analysis

Modern complex disease gene mapping uses three main methods: linkage analysis, association analysis, and direct mutation screening (CNV) (Lander and Schork 1994; Sobel and Lange 1996; Stefansson et al. 2008; Walsh et al. 2008). Linkage studies trace the co-segregation of a complex disease with alleles of polymorphic genetic markers in genealogies with numerous patients. Markers that are close to a susceptibility gene tend to be inherited together with it. Advances in genetics, bioinformatics, and statistical methods of analysis, and the availability of genome-wide scans of DNA markers, facilitate progress in determining the genetic nature of complex human diseases through linkage analysis (Weeks and Lange 1987; Goldgar et al. 1993; Kruglyak and Lander 1995; Kruglyak 1999). Linkage analysis is optimal for identifying genes with major or moderate effects on pathogenesis. Given the multifactorial nature of complex genetic diseases, the need to find linkages in specific human populations is obvious. Knowledge of the demographic history of the population in which gene mapping is performed makes the search for genes that cause complex diseases more efficient (Bulayeva et al. 2000b, 2002; Jorde 2000, 2001). The effective reproductive size (Ne) of real populations is always less than the reproductive size (Nr); it is calculated using a formula that takes the sex ratio into account (Li 1976, p. 134). The index of morbidity risk of schizophrenia, an index of the risk of disease during a lifetime (Lifetime Morbid Risk, LMR), is calculated by the standard formula (Gottesman and Shields 1982).

\[ \mathrm{LMR} = \frac{a}{N - (n_o + 0.5\,n_w)}, \]

where a is the number of examined schizophrenia patients, N is the total size of the population, n_o is the number of subjects younger than 15 years (below the age of manifestation), and n_w is the number of healthy subjects within the age range of schizophrenia manifestation (15–45 years).

Modern complex disease gene mapping primarily involves linkage and association analysis in families and total populations. Determining the genetic epidemiology of a complex disease at both levels is necessary before mapping can begin. Epidemiological studies identify families and populations with high aggregations of the studied disease and evaluate characteristics such as disease frequency and the proportion of familial cases in the population studied. The most important characteristic, however, is the relative risk of a disease, which describes the difference in disease risk between relatives of affected individuals and the general population (Puzyrev and Stepanov 1997). It was shown that the relative risk for developing schizophrenia is about 9. Maximizing the relative risk of a complex disease by identifying clinically homogeneous patients with an early age of onset and high familial aggregation increases the effectiveness of genetic and epidemiological studies (Terwilliger and Ott 1992). Case–control studies and linkage disequilibrium analysis may find variations in both gene frequency and the associations between a disease and its genetic markers. The genetic heterogeneity of a disease contributes to differences in association values; additionally, population stratification caused by differences in ethnicity, age, and population unit affects the results of association studies. Studies should use ethnically homogeneous individuals matched by age, sex, and other demographic indicators in order to produce consistent association values.
This often conflicts with another objective of association research, which requires large sample sizes to identify disease frequency and harmful mutations (McGuffin and Owen 1991). Association research is difficult to implement in small, individual populations such as isolates. The associations between genetic markers and pathogenic loci do not last long in the history of a population. This step in gene mapping provides a goal-directed search for susceptibility genes to further study the linkage of genome-wide scanned loci (Terwilliger and Ott 1994). In this regard, our work focused on analyzing the linkage of the clinical phenotype of schizophrenia with genome-wide scanned DNA markers in the pedigrees of primary and secondary isolates in Dagestan. Because the 10 cM marker spacing of the scan in our isolate pedigrees was too coarse to detect associations and LD, our investigations used nonparametric linkage analysis, followed by parametric analysis to clarify the mode of inheritance and the size of the linked region, and then by detailed screening of the linked genomic regions to localize the mutant gene causing the studied clinical phenotype.
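The LMR formula given earlier in this section can be sketched in a few lines of Python. This is only an illustration: the function name and the example counts are hypothetical and are not data from the studied isolates.

```python
def lifetime_morbid_risk(a: int, n_total: int, n_young: int, n_at_risk: int) -> float:
    """Lifetime Morbid Risk: LMR = a / (N - (n_o + 0.5 * n_w)).

    a         -- number of examined schizophrenia patients
    n_total   -- total size of the population (N)
    n_young   -- subjects younger than 15 years (n_o)
    n_at_risk -- healthy subjects aged 15-45, inside the manifestation range (n_w)
    """
    corrected = n_total - (n_young + 0.5 * n_at_risk)
    if corrected <= 0:
        raise ValueError("age-corrected population size must be positive")
    return a / corrected


# Hypothetical isolate: 30 patients among 2000 residents, of whom 500 are
# younger than 15 and 800 are healthy and aged 15-45.
lmr = lifetime_morbid_risk(a=30, n_total=2000, n_young=500, n_at_risk=800)
print(f"LMR = {lmr:.2%}")
```

The denominator removes subjects who have not yet reached the age of risk and counts those still inside the risk window at half weight, so the index estimates risk per person who has lived through the period of manifestation.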

Morton (1982) developed this method of genetic linkage analysis, and it remains the current approach to searching for candidate genes of complex diseases. The method matches the segregation of genetic markers in a genealogy to an assumed inheritance pattern for a clinical phenotype: dominant or recessive, with complete or incomplete (%) penetrance, and autosomal or gonosomal (linked to the X chromosome). Genetic linkage is measured quantitatively by the logarithm of the linkage likelihood ratio, called the LOD score (logarithm of the odds). The LOD score can be calculated for different values of the recombination fraction (θ), from 0 to 0.5. The value of θ with the maximum LOD score is the most probable estimate of the distance between a genetic marker and the supposed disease gene (Morton et al. 1986; Terwilliger and Ott 1994; Puzyrev and Stepanov 1997). A LOD score above the threshold of 3, corresponding to odds of 1000:1 in favor of linkage, is taken to confirm linkage. This value is not as strong as it appears (Terwilliger and Ott 1994; Lander and Kruglyak 1995): for two randomly selected loci in the human genome, the prior odds that the loci are unlinked (i.e., located on different chromosomes or chromosome arms) are approximately 50:1. Combined with these prior odds, a threshold value of 3.0 means that the selected loci are only about 20 times more likely to be linked than unlinked; in other words, about 1 out of 20 results confirming linkage may be false.

Currently, the most effective approach is multi-locus linkage analysis, in which 400 or more loci are scanned genome wide. Likelihood ratios are calculated for each interval between adjacent markers. Intervals with LOD values exceeding a threshold are the most likely regions of candidate gene localization (confirmation mapping); gene localization is excluded in intervals with a LOD below the threshold value. Multi-locus linkage analysis requires extensive calculations, which are carried out with specially designed computer software such as LINKAGE, GENEHUNTER, FASTMAR, and SIMWALK2.
The computer package SIMWALK2 is one of the most promising modern tools for multi-locus linkage analysis (Sobel and Lange 1996). The package contains a set of genetic and statistical methods for haplotype, nonparametric (Nonparametric Linkage, NPL), and parametric linkage analysis, as well as methods for analyzing genotyping errors (mistyping) and determining identity by descent (IBD). SIMWALK2 is based on Markov chain Monte Carlo (MCMC) algorithms, which enable the analysis of large, complex pedigrees with many inbred loops, as seen in our work. Other genetic and statistical packages for linkage analysis cannot analyze large pedigrees with many consanguineous marriages and therefore could not be used in our work. We used SIMWALK2 for linkage analysis of markers with susceptibility genes of complex diseases; nonparametric analysis was used for the first stage of linkage searches (Sobel and Lange 1996). The computer package PROGENY was used to create a database of family trees with haplotypes and to construct their graphical representation. Different stages of our work also warranted the use of standard genetic-statistical packages.
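The LOD score and the threshold argument described above can be made concrete with a small numerical sketch. It assumes the simplest textbook case, fully informative phase-known meioses in which recombinants can be counted directly; this is not the multi-locus pedigree likelihood computed by SIMWALK2, and the function names and counts are hypothetical.

```python
import math


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """LOD = log10[L(theta) / L(0.5)] for phase-known, fully informative meioses."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    k, n = recombinants, meioses
    log_lik = 0.0
    if k > 0:
        if theta == 0.0:
            return float("-inf")  # any recombinant rules out theta = 0
        log_lik += k * math.log10(theta)
    if n - k > 0:
        log_lik += (n - k) * math.log10(1.0 - theta)
    return log_lik - n * math.log10(0.5)


def posterior_odds_of_linkage(lod: float, prior_odds: float = 1 / 50) -> float:
    """Posterior odds = prior odds of linkage (about 1:50) x likelihood ratio 10**LOD."""
    return prior_odds * 10 ** lod


# Hypothetical data: 2 recombinants observed in 20 informative meioses.
thetas = [i / 100 for i in range(51)]
best_theta = max(thetas, key=lambda t: lod_score(2, 20, t))
print(best_theta, round(lod_score(2, 20, best_theta), 2))

# A LOD of exactly 3 gives posterior odds of about 20:1, so roughly 1 in 21
# such "confirmed" linkages would be expected to be false.
print(posterior_odds_of_linkage(3.0))
```

The grid search mirrors the statement that the value of θ with the maximum LOD score is the most probable estimate of the marker-to-gene distance; for counted recombinants that maximum falls at the observed recombinant fraction.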

To analyze genome-wide scanned molecular aberrations, copy number variations (CNVs) and losses of heterozygosity (LOHs), we used the computer packages GTC (Genotyping Console, Affymetrix) and SVS, version 7.6.4 (SNP and Variation Suite Manual, Golden Helix Inc.). Before the analysis of copy number variation and loss of heterozygosity, the data were thoroughly "cleaned," taking into account the quality of a given algorithm (99 %). The algorithm of the programs determined the CNV segments; the minimum segment length was 100 kb in a particular chromosomal region. The algorithm analyzing the loss of heterozygosity searches for segments of consecutive homozygous SNPs, enabling the detection of the loss in clinically homogeneous patients. The algorithm converts homozygous genotypes to 1 and heterozygous genotypes to 0. A run of homozygosity with a segment length of 1 Mb or more is the threshold for calling a segment of loss of heterozygosity.
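The loss-of-heterozygosity rule described above (homozygous genotypes coded 1, heterozygous 0, segments of at least 1 Mb) can be sketched as follows. This is a simplified illustration and not the GTC or SVS implementation; the function name, the genotype encoding as two-letter strings, and the example positions are assumptions made for the example.

```python
from typing import List, Tuple


def homozygosity_segments(
    snps: List[Tuple[int, str]], min_length_bp: int = 1_000_000
) -> List[Tuple[int, int]]:
    """Return (start, end) positions of runs of consecutive homozygous SNPs
    on one chromosome that span at least min_length_bp.

    snps -- (position in bp, genotype) pairs sorted by position,
            with genotypes written as two allele letters, e.g. "AA" or "AB".
    """
    segments = []
    start = end = None
    for pos, genotype in snps:
        is_homozygous = 1 if genotype[0] == genotype[1] else 0
        if is_homozygous:
            if start is None:
                start = pos  # open a new run
            end = pos
        else:
            # A heterozygous SNP closes the current run, if any.
            if start is not None and end - start >= min_length_bp:
                segments.append((start, end))
            start = end = None
    if start is not None and end - start >= min_length_bp:
        segments.append((start, end))
    return segments


# Hypothetical genotypes along one chromosome.
example = [
    (100_000, "AA"), (600_000, "BB"), (1_300_000, "AA"),
    (1_500_000, "AB"),
    (1_700_000, "BB"), (2_000_000, "AA"),
]
print(homozygosity_segments(example))  # [(100000, 1300000)]
```

Production tools typically also tolerate occasional genotyping errors and require a minimum number of SNPs per segment; those refinements are left out here.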

References

American Psychiatric Association. (2000). Appendix I: Outline for cultural formulation and glossary of culture-bound syndromes. In Diagnostic and statistical manual of mental disorders (4th ed., text rev.). Washington: American Psychiatric Association.
Bulayeva, K. B., Pavlova, T. A., Charuhilova, S. M., et al. (1996). Genetic and demographic study of mountain populations of Dagestan and its migrants on plain. Interconnection of inbreeding, homozygosity and physiological sensitivity levels. Genetics, 32(1), 93–102 (Russia).
Bulayeva, K., Marchani, E., Kurbatova, O. L., Watkins, S. W., Bulayev, O. A., & Harpending, H. C. (2008a). Genetic bottleneck among Dagestan highlanders migrating to lowlands. Central European Journal of Medicine, 8(4), 396–405.
Bulayev, O. A., Spitcin, V. A., et al. (2008b). Population approach to mapping genes of complex diseases. Medical Genetics, 4(3), 3–17.
Bulayev, O. A., Pavlova, T. A., & Bulayeva, K. B. (2009). Role of inbreeding in aggregation of complex pathology. Genetics, 45(8), 1096–1104.
Bulayeva, K. B. (1981). Population-genetic analysis of some neurodynamic parameters of man. Behavior Genetics, 11(4), 303–308.
Bulayeva, K. B. (1990). Population-Genetic variation of human psychophysiological traits (p. 346). Doctor of sciences thesis. Institute of Medical Genetics of Russ, Moscow. The Academy of Medical Sciences.
Bulayeva, K. B. (1991). Genetic basis of human psychophysiology (p. 218). Moscow: Science.
Bulayeva, K. B., Leal, S., Pavlova, T. A., et al. (2000a). The ascertainment of schizophrenia pedigrees in Daghestan genetic isolates. Journal of Psychiatric Genetics, 5, 100–106.
Bulayeva, K. B., Leal, S., Pavlova, T. A., Kurbanov, R. M., Coover, S., & Bulayev, O. A. (2000b). The ascertainment of schizophrenia pedigrees in Dagestan genetic isolates. Psychiatric Genetics, 10(2), 67–72.
Bulayeva, K. B., Pavlova, T. A., Dubinin, N. P., et al. (1993). Phenotypic and genetic affinities among ethnic populations in Dagestan (Caucasus, USSR). A comparison of polymorphic, physical, neurophysiological and psychological traits. Annals of Human Biology (UK), 20(5), 455–467.
Bulayeva, K., Roeder, K., Bacanu, S. A., et al. (1999). Genetic analysis of schizophrenia in isolated Daghestanian kindreds. The American Journal of Human Genetics, 65, 1086.
Bunak, V. V. (1980). Homo sapiens: Origin and evolution (p. 328). In A. A. Zubov (Ed.). Moscow: Science.

Bulayeva, K. B., Pavlova, T. A., Kurbanov, R. M., & Bulayev, O. A. (2002). Mapping genes of complex disease in genetic isolates of Daghestan. Journal of Genetics, 38(11), 1539–1548 (Russia).
Cavalli-Sforza, L. L., & Bodmer, W. F. (1971). The genetics of human populations. San Francisco: Freeman.
Cavalli-Sforza, L. L., & Feldman, M. W. (1978). Darwinian selection and "altruism". Theoretical Population Biology, 14(2), 268–280.
Gadzhiev, A. G. (1971). Anthropology of small populations of Dagestan (p. 368). Makhachkala: Dagestan Branch of the USSR Academy of Science.
Gadzhiev, M. G., Davudov, O. M., & Shihsaidov, S. M. (1996). (p. 345). Moscow: Science.
Gadzhieva, S. S. (1961). Kumiks. History and ethnography study (p. 387). Moscow: USSR Academy of Sciences.
Gottesman, I. I., & Shields, J. T. (1982). Schizophrenia: The epigenetic puzzle (p. 258). Cambridge: Cambridge University Press.
Goldgar, D. E., Lewis, C. M., & Gholami, K. (1993). Analysis of discrete phenotypes using a multipoint identity-by-descent method: Application to Alzheimer's disease. Genetic Epidemiology, 10(6), 383–388.
Jablensky, A., Sartorius, N., Ernberg, G., Bertelsen, A., et al. (1992). Schizophrenia: Manifestations, incidence and course in different cultures. A World Health Organization ten-country study. Psychological Medicine. Monograph Supplement, 20(1), 97–103.
Jorde, L. B. (2000). Linkage disequilibrium and the search for complex disease genes. Genome Research, 10, 1435–1444.
Jorde, L. B. (2001). Consanguinity and pre reproductive mortality in the Utah Mormon population. Human Heredity, 52(2), 61–65.
Kotovich, V. G. (1961). Archaeological works in Dagestan. Materials on the Archaeology of Dagestan, Makhachkala, 2, 5–56.
Kruglyak, L. (1999). Genetic isolates: Separate but equal? Proceedings of the National Academy of Sciences of the United States of America, 96(4), 1170–1172.
Kruglyak, L., & Lander, E. S. (1995). High-resolution genetic mapping of complex traits. American Journal of Human Genetics, 56(5), 12–23.
Kurbatova, O. L., & Pobedonostseva, E. (1988). The role of migration processes in the formation of marriage structure of Moscow population. II. Assortative mating for the age, birthplace and nationality. Russian J Genetika, 24(9), 1679–1688 (Russian).
Kurbatova, O. L., Pobedonostseva, E., & Imasheva, A. G. (1984). Role of migration processes in shaping the marriage structure of the Moscow population. I. The age, place of birth and nationality of those entering marriage. Russian J Genetika, 20(3), 501–511.
Lander, E., & Kruglyak, L. (1995). Genetic dissection of complex traits: Guidelines for interpreting and reporting linkage results. Nature Genetics, 3, 241–247.
Lander, E. S., & Schork, N. J. (1994). Genetic dissection of complex traits. Science, 265, 2037–2048.
Lavrov, L. I. (1951). The reasons for multilingualism in Dagestan. Soviet Ethnography, 2, 71–82.
Li, C. C. (1976). The testing of dominants for heterozygosity. Annals of Human Genetics, 2, 183–190.
McGuffin, P., & Owen, M. (1991). The molecular genetics of schizophrenia: An overview and forward view. European Archives of Psychiatry and Clinical Neuroscience, 240(3), 169–173.
Morton, N. E. (1982). Outline of genetic epidemiology. New York: Karger. ISBN 3-8055-2269-X.
Morton, N. E., et al. (1986). Multipoint linkage analysis. The American Journal of Human Genetics, 38(6), 868–883.
Nurnberger, J. I., Blehar, M. C., Kaufmann, C. A., et al. (1994). Diagnostic Interview for Genetic Studies (DIGS). Archives of General Psychiatry, 51, 849–859. PMID 7944874.
Puzyrev, V. P., & Stepanov, V. A. (1997). Pathological anatomy of human genome (p. 223). Novosibirsk: Nauka. RAS Siberian Company.

Sobel, E., & Lange, K. (1996). Descent graphs in pedigree analysis: Applications to haplotyping, location scores, and marker sharing statistics. The American Journal of Human Genetics, 58, 1323–1337.
Spuhler, J. N. (1968). Assortative mating with respect to physical characteristics. Eugenics Quarterly, 15(2), 128–140.
Stefansson, H., Rujescu, D., Cichon, S., Pietiläinen, O. P., Ingason, A., Steinberg, S., et al. (2008). Large recurrent microdeletions associated with schizophrenia. Nature, 455(7210), 232–236.
Terwilliger, J. D., & Ott, J. (1992). A haplotype-based 'haplotype relative risk' approach to detecting allelic associations. Human Heredity, 42(6), 337–346.
Terwilliger, J. D., & Ott, J. (1994). Handbook of human genetic linkage. Baltimore, MD: JHU Press.
Van Den Berg, J. H. (1972). A different existence; Principles of phenomenological psychopathology. Pittsburgh, PA: Duquesne University Press.
Walsh, T., McClellan, J., McCarthy, S., Addington, A., Pierce, S., Cooper, G., et al. (2008). Rare structural variants disrupt multiple genes in neurodevelopmental pathways in schizophrenia. Science, 320(5875), 539–543.
Weeks, D. E., & Lange, K. (1987). Preliminary ranking procedures for multilocus ordering. Genomics, 3, 236–242.
World Health Organization. (1992). International Statistical Classification of Diseases and Related Health Problems, Tenth Revision. Geneva: World Health Organization.

Chapter 3 Selection of Populations for Mapping Genes of Complex Diseases

3.1 Principles of Selection of Populations for Complex Disease Gene Mapping

The genetic similarities among all 30 indigenous mountain peoples of Dagestan suggest that they share common ancestors from the ancient proto-populations that existed 8–10 thousand years ago (Bulayeva et al. 2003b; Tofanelli et al. 2009; Xing et al. 2009). Our long-term population genetic studies demonstrated the feasibility of the new, cross-population approach to complex disease gene mapping within ethnically divided genetic isolates (Bulayeva et al. 2000, 2002, 2005, 2007, 2011; Karafet et al. 2015, 2016; Bulayev et al. 2008, 2009, 2011). Genetic diversity, typical of large populations, is reduced in small populations that have been isolated for many generations. Genetic drift, inbreeding, and endogamy contribute to the accumulation in isolates of a relatively small number of pathogenic loci and alleles that are identical by descent (IBD), and to reduced genetic heterogeneity. Such processes greatly facilitate the identification of pathogenic loci in the current generations of patients (Kruglyak 1999). Our approach involves gene mapping of the same complex disease in ethnically and demographically subdivided isolates using unified clinical and genetic methods (Table 3.1). Our experimental cross-isolate design for gene mapping of a complex disease (SCZ) within ethnically and demographically subdivided isolates of Dagestan focused on the following concepts:
(a) As the demographic age of an isolate increases, recombination events during meiosis narrow the genetic region adjacent to the disease mutation.
(b) Isolation for 50 or more generations preserves only a short segment of the ancestral chromosome, which accumulates in the population over generations due to endogamy and inbreeding.
(c) Due to the founder effect, demographically older primary isolates have a greater number of clinically and genetically homogeneous patients.
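Concepts (a) and (b) can be illustrated numerically. The sketch below assumes the standard no-interference approximation, in which crossovers along a single line of descent occur as a Poisson process, so that after g generations the ancestral segment retained on each side of the mutation has an expected length of 1/g Morgans. The function names are hypothetical and the calculation is not taken from the authors' analysis.

```python
import math


def expected_ancestral_segment_cm(generations: int) -> float:
    """Expected total length (cM) of founder haplotype still flanking the
    mutation: 100/g cM on each side, 200/g cM in total."""
    return 200.0 / generations


def prob_marker_still_linked(distance_cm: float, generations: int) -> float:
    """Probability that no crossover has separated a marker at the given
    distance from the mutation in any of g meioses: exp(-g * d / 100)."""
    return math.exp(-generations * distance_cm / 100.0)


for g in (10, 25, 50):
    print(g, expected_ancestral_segment_cm(g), round(prob_marker_still_linked(5.0, g), 3))
```

Under these assumptions, 50 generations of isolation leave an expected segment of about 4 cM, and a marker 5 cM from the mutation remains on the founder haplotype in fewer than one lineage in ten. Segments shared by several descendants at once are shorter still, since each line of descent accumulates its own crossovers.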

© Springer International Publishing Switzerland 2016. K. Bulayeva et al., Genomic Architecture of Schizophrenia Across Diverse Genetic Isolates, DOI 10.1007/978-3-319-31964-3_3

Table 3.1 Complex disease gene mapping in genetic isolates and outbred populations: advantages and disadvantages

Advantages of isolates
1. Ancient demographic history, due to which the haplotype block with the pathogenic locus has narrowed over successive generations through meiotic recombination, which facilitates identification of the block using LD
2. Possibility of mapping recessive genes that segregate across generations due to consanguineous marriages
3. Fewer pathogenic loci and a high frequency of the haplotype block with such disease loci in isolates where the ancestor was a carrier of the loci
4. Possibility of collecting large pedigrees and studying the large families within them, which provides a statistically significant level of mapping
5. Homogeneous, stable habitat that reduces the role of the environment in the expression of the genotype in a clinical phenotype (penetrance)
6. Low levels of migration (emigration reduces genetic homogeneity, immigration increases it)
7. More homogeneous clinical phenotypes, related to the homogeneity of the isolate's gene pool and environmental factors, which facilitates disease gene mapping

Disadvantages of isolates
1. Due to isolation and the founder effect, a particular isolate may accumulate unique pathogenic loci (which cannot be found in other human populations)

Advantages of outbred populations
1. Greater number of patients due to the large total size (Nt)
2. More genetic polymorphisms and the possibility of reproducing mapping results in other populations

Disadvantages of outbred populations
1. Heterogeneous environments, because of which the effects of penetrance and phenocopies substantially increase and make it difficult to identify the genes of complex diseases
2. Large migratory flows
3. Difficulty of reconstructing pedigrees of sufficient size for mapping, and small family sizes
4. Large number of founders and, correspondingly, increased genetic and clinical heterogeneity of the disease

Thus, primary isolates with accumulation of one particular pathology should guide the selection of populations for mapping the genes of complex diseases (Fig. 3.1).

3.2 Ethnogenomic Structure of Dagestan Populations

Complex disease gene mapping requires population-genetic and genetic-epidemiological studies that identify the genetic architecture in the demographic history of a population with a high aggregation of a specific pathology. Due to a high concentration of ethnically, genetically, and geographically subdivided isolates, Dagestan is one of the most interesting regions for gene mapping (Dubinin and Bulayeva 1982; Dubinin et al. 1983; Bulayeva 1991; Bulayeva et al. 1997; Bulayeva et al. 1993, 2000, 2003a, b).

Fig. 3.1 Haplotype of an isolate founder, carrying the mutations that define the disease, transmitted from the total proto-population to descendants. The haplotype block with the pathogenic loci shrinks through meiotic recombination: over the generations of the population's demographic history, its members accumulate a greater number of recombinations. Only a short segment of the ancestral haplotype with the pathogenic locus is maintained in the population by the 50th generation. (Figure labels: founder chromosome; mutation; 50 generations.)

The ethnic populations of the Caucasus region have a complex history. Numerous invasions of the North and South Caucasus affected the gene pool of the indigenous people. Harsh conditions and difficult access to the mountain region led to less mixing of Dagestan highland ethnic gene pools. The indigenous people of the Caucasus developed significant cultural, linguistic, and genetic diversity; however, the origin of Eurasian people and the genetic landscape of early Europeans have not been sufficiently researched. These ethnogenetic studies have important implications for complex disease gene mapping, because the genetic architecture of such diseases depends on the specific gene pool of a local population. Within our international collaboration, we compared patterns of autosomal variability and genomic variation of male and female lineages in Dagestani ethnic populations. Some research groups have used mtDNA and Y chromosome markers for non-Dagestani and other Caucasus ethnic populations (Nasidze et al. 2003, 2004). The authors obtained data that confirmed the similarity of male and female lineages to West Asian populations (Nasidze et al. 2003, 2004). These investigators studied only a small number of subjects from four ethnic groups in Dagestan, using a limited number of DNA loci.
Since 1976, our human genetic adaptation group at VIGG RAS, under the supervision of K. B. Bulayeva, has conducted annual expeditions to study the genetic origin of ethnic populations in Dagestan. A study of 8 different ethnic groups using 21 autosomal STR loci was conducted in the mid-1990s in collaboration with geneticists at the University of Utah, USA (Prof. Henry Harpending and Prof. L. Jorde). The results indicate that the ancient Dagestani ethnic groups share common genetic roots with one another and, at the next level, with European populations (Fig. 3.2) (Bulayeva et al. 2003b; Marchani et al. 2008). Our previous genetic studies using immunological and biochemical markers favor similar conclusions (Dubinin and Bulayeva 1984; Bulayeva 1991; Bulayeva et al. 2003b).


Fig. 3.2 PCA plot of 250 K autosomal SNPs in 56 populations from Dagestan, the Caucasus, the Near East, Europe, Central Asia, and South Asia. A 'drop one in' procedure was used for the analysis. PC1 and PC2 coordinates for each population were calculated as the median coordinate values of the individuals within that population. The plot reveals relatively distinct clusters of Europeans, South Asians, and Central Asians, while Dagestani samples (except Nogais and Mountain Jews, which immigrated to the Dagestan region about 700 years ago according to historical data) intermingle with other Caucasus individuals and show an affinity with European and Near Eastern samples. Dagestan-ND, ethnic groups belonging to the Nakh-Dagestan language family. Dagestan-non-ND, ethnic groups of Dagestan outside that family (from Karafet et al. 2016)

Our subsequent study of 527 subjects from 25 ethnic groups confirmed the genetic similarities between European groups and the indigenous ethnic people of Dagestan using 250,000 genome-wide SNPs (Affymetrix 250K SNP array) (Fig. 3.3) (Xing et al. 2009). Principal component analysis of genomic distances shows that the space of the first three principal components contains four groups: the Central and Eastern Asian groups, the most genetically heterogeneous group of African nations, and the relatively less genetically heterogeneous European group (Fig. 3.3). The last group, comprising people of Northern and Central Europe, includes three indigenous Dagestani groups: the Laks, Dargins, and Kumyks. The results suggest that, in its subsequent demographic history, the ancient “proto-population” differentiated into many endogamous communities, which later developed their own distinct languages. Population-genetic processes such as genetic drift, endogamy, and inbreeding contributed to the formation of gene pools with reduced genetic heterogeneity, close to the global minimum (Bulayeva et al. 2003b, 2006).
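The kind of analysis summarized in Figs. 3.2 and 3.3 can be sketched as follows. This is an illustrative outline only, assuming a small individuals-by-SNPs matrix of 0/1/2 allele counts; it does not reproduce the 'drop one in' procedure or the software used in the cited studies, and the toy genotypes, labels, and function names are hypothetical. Population coordinates are taken as medians over individuals, as described in the caption of Fig. 3.2.

```python
import numpy as np


def genotype_pca(genotypes, n_components=3):
    """PCA of an individuals x SNPs matrix of 0/1/2 allele counts via SVD.

    Returns per-individual PC coordinates and the fraction of variance
    explained by each retained component.
    """
    g = np.asarray(genotypes, dtype=float)
    centered = g - g.mean(axis=0)               # center each SNP column
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    coords = u[:, :n_components] * s[:n_components]
    explained = s**2 / np.sum(s**2)
    return coords, explained[:n_components]


def population_medians(coords, labels):
    """Median PC coordinates of the individuals within each population."""
    labels = np.asarray(labels)
    return {str(pop): np.median(coords[labels == pop], axis=0)
            for pop in np.unique(labels)}


# Hypothetical toy data: two populations with contrasting allele counts.
toy = np.array([
    [2, 2, 2, 0, 0, 0],
    [2, 2, 1, 0, 0, 0],
    [2, 1, 2, 0, 1, 0],
    [0, 0, 0, 2, 2, 2],
    [0, 0, 1, 2, 2, 2],
    [0, 1, 0, 2, 1, 2],
])
toy_labels = ["A", "A", "A", "B", "B", "B"]

coords, explained = genotype_pca(toy)
medians = population_medians(coords, toy_labels)
print(explained)                       # share of variance per component
print(medians["A"][0], medians["B"][0])  # PC1 medians fall on opposite sides
```

With real data, each row would be one of the 527 genotyped individuals and each column one SNP; the median step reduces every population to a single point per component.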


[Figure 3.3 plot: 25 populations, 527 individuals. Axes: PC 1 (75.6 % of variance), PC 2 (11.9 % of variance), PC 3 (0.787 % of variance). Labeled populations: !Kung, Alur, Brahmin, CEU (Utah N. European), CHB (Chinese), Cambodian (Khmer), Dag_Dargins, Dag_Kumiks, Hema, Iban, Irula, JPT (Japanese), Laks, Luhya, Madiga, Mala, Nguni, Pedi, Pygmy, Sotho/Tswana, Tuscan, Vietnamese, YRI]

Fig. 3.3 Distribution of Dagestani ethnic groups (Kumyks, Dargins, and Laks) among 25 racial and ethnic groups worldwide (527 people) in the space of the first three principal components (PC1–PC3). Each examined individual is represented by a point whose color reflects ethnicity. The circle indicates the Dagestani ethnic groups. The Kumyks, Dargins, and Laks show clear ethnogenetic proximity to the European group

[Figure 3.4 plot. Axes: allele size variance; distance from African centroid. Labeled populations: Pygmy, Tswana, Pedi, Zulu, Vietnamese, Sotho, Dagestan-mix, Asian, Mormon, Kung, Chinese, Japanese, Avars, Finns, French, Dargins, Cambodian, Tsonga, Lezgins, Stalskoe, Poles, Kubachi, Malay]

Fig. 3.4 Distance from the African centroid and distribution of the studied ethnic populations by the allele size variance of STR loci
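The two diversity measures used for Fig. 3.4, allele size variance and heterozygosity, can be computed per STR locus as in the sketch below. It is a generic illustration rather than the procedure of the cited studies: it assumes alleles are coded as repeat counts, uses the sample variance and Nei's unbiased expected heterozygosity, and works on made-up alleles. In practice both quantities would be averaged over the 21 loci.

```python
from collections import Counter


def allele_size_variance(repeat_counts):
    """Sample variance (n - 1 denominator) of STR allele sizes at one locus."""
    n = len(repeat_counts)
    if n < 2:
        raise ValueError("need at least two alleles")
    mean = sum(repeat_counts) / n
    return sum((x - mean) ** 2 for x in repeat_counts) / (n - 1)


def expected_heterozygosity(alleles):
    """Nei's unbiased expected heterozygosity: n/(n-1) * (1 - sum of p_i^2)."""
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two alleles")
    freqs = [count / n for count in Counter(alleles).values()]
    return n / (n - 1) * (1.0 - sum(p * p for p in freqs))


# Hypothetical repeat counts for one locus in two samples of 8 chromosomes.
diverse_sample = [10, 10, 11, 11, 12, 12, 12, 13]
drifted_sample = [11, 11, 11, 11, 11, 11, 11, 12]

for name, sample in [("diverse", diverse_sample), ("drifted", drifted_sample)]:
    print(name, allele_size_variance(sample), expected_heterozygosity(sample))
```

A sample that has lost alleles through drift shows lower values on both measures, which is the pattern the text attributes to the Dagestani populations.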

Figure 3.4 presents the results of our study of genetic heterogeneity, based on STR allele sizes and on the average level of heterozygosity, among the main continental groups and Dagestani ethnic populations. The study used the allele size variances of 21 STR loci, shown against the African centroid, which indicates the vector of the most heterogeneous group (Harpending and Eller 1999; Harpending and Rogers 2000). The results show that the Dagestani ethnic populations are localized in the region of minimal STR allele size variance (Fig. 3.4). These results support the view that all studied Dagestani ethnic populations were subject to relatively recent genetic drift (Bulayeva et al. 2003b).

Genetic diversity reacts more slowly than heterozygosity does to changes in demographic history. Genetic diversity estimated from microsatellite allele size variance (Jorde et al. 1997; Bulayeva et al. 2003b, 2006) therefore retains features of the “proto-population” gene pool. The data presented in Fig. 3.3 show high ethnogenomic homogeneity in the indigenous people of Dagestan, which provides genetic homogeneity of the pathological factors that determine the development of schizophrenia spectrum disorders. In combination with the ancestral effect and other factors of genetic drift, the historically established genetic homogeneity of the studied isolates facilitated the aggregation of a specific haplotype with certain pathogenic loci in some isolates, whereas other isolates, with different ancestors, accumulated other haplotypes and pathogenic loci, or had ancestors without any accumulation of pathological loci (Bulayeva 1991; Bulayev et al. 2008). Consequently, some Dagestani isolates contain many patients with schizophrenia, whereas other Dagestani isolates report no cases of schizophrenia but have many patients with other complex diseases.

The study of genetic systems localized in the nonrecombining region of the Y chromosome (NRY) is a popular tool for analyzing the gene pool of populations. Several hundred polymorphic SNPs (Karafet et al. 2008, 2015, 2016) and microsatellite loci (Kayser et al. 2004; Ballantyne et al. 2011; Burgarella and Navascués 2011; Bulayeva et al. 2003a, b, 2006) have been identified in the NRY. Figure 3.5 shows the main Y-haplogroups found in 5 Dagestani ethnic groups. By these Y-chromosome haplogroups, the studied Dagestani populations are divided into ten nuclear male lines: E1b1b1-M35, E1b1b1a-M78, G-M201, I-M170, J1-M267, J1e1-M367/M368, J2-M172, L-M22, R1a1-M17, and R1b1-P25 (Table 3.2). The analysis shows that the most frequent haplogroup among Dagestani people is J1 (68.6 %); much lower frequencies are observed for the J2 (9.5 %), R1b1 (5 %), and G, I, and R1a1 (4 % each) haplogroups. Variation between populations accounts for 13.12 % of the total and variation within populations for 86.88 %.

In collaboration with geneticists from the University of Pisa (Italy), male-line genomes were studied further in Dagestani indigenous people (Laks, Avars, and Kubachins) and in people resettled from Iran to the Eastern Caucasus (Tats and Mountain Jews) in the sixth century by the Sasanian Shahanshah Khosrov I Anushirwan (531–579 CE). Mountain Jews and Tats descend from several migration waves. Ethnoculturally, Mountain Jews are part of the Iranian Jews, with whom they maintained close ties since the beginning of the nineteenth century, before the inclusion of the Eastern Caucasus into Russia. The Mountain Jewish language is based on Middle Persian, which they share with Tats. Because of the common language and Iranian origin of these two ethnic groups, historians and anthropologists have not reached a consensus about their origin.


[Figure 3.5 network. Haplogroup labels: J2, R1a1, I*, J1e1, R1b1, G*, L*, J1, E1b1b1a, E1b1b1]

Fig. 3.5 Median-joining network built from Y-chromosome haplotypes of 20 STR loci. Nuclear haplotypes of a certain number of examinees from different ethnic groups are highlighted. Pink Kubachins; red Avars; yellow Chechens-Akkin; green Tabasarans; blue Laks (Caciagli et al. 2009)

Table 3.2 Frequency of Y-haplogroups in the studied ethnic populations of Dagestan (Caciagli et al. 2009)

Haplogroup        Avars   Chechens  Kubachins  Laks    Tabasarans
N                 20      20        14         21      30
R1b1-P25          0.050   0.000     0.000      0.048   0.100
R1a1-M17          0.000   0.000     0.143      0.095   0.000
J1-M267           0.600   0.600     0.857      0.429   0.767
J1e1-M367-M368    0.200   0.000     0.000      0.000   0.000
J2-M172           0.100   0.250     0.000      0.143   0.000
L-M22             0.000   0.100     0.000      0.000   0.000
G-M201            0.050   0.050     0.000      0.048   0.033
E1b1b1-M35        0.000   0.000     0.000      0.095   0.033
E1b1b1a-M78       0.000   0.000     0.000      0.000   0.033
I-M170            0.000   0.000     0.000      0.143   0.033
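The pooled haplogroup percentages quoted in the text can be checked against Table 3.2 by converting each population's frequencies back to carrier counts and summing over the 105 sampled men. The sketch below assumes that the tabulated frequencies are rounded carrier proportions (so that frequency × N rounds to a whole count); the variable and function names are hypothetical. Under that assumption, the 68.6 % figure for J1 is recovered when the J1e1 sublineage is counted together with J1-M267.

```python
# Sample sizes and haplogroup frequencies transcribed from Table 3.2.
SAMPLE_SIZES = {"Avars": 20, "Chechens": 20, "Kubachins": 14,
                "Laks": 21, "Tabasarans": 30}
POPULATIONS = list(SAMPLE_SIZES)
FREQUENCIES = {
    "R1b1-P25":       [0.050, 0.000, 0.000, 0.048, 0.100],
    "R1a1-M17":       [0.000, 0.000, 0.143, 0.095, 0.000],
    "J1-M267":        [0.600, 0.600, 0.857, 0.429, 0.767],
    "J1e1-M367-M368": [0.200, 0.000, 0.000, 0.000, 0.000],
    "J2-M172":        [0.100, 0.250, 0.000, 0.143, 0.000],
    "L-M22":          [0.000, 0.100, 0.000, 0.000, 0.000],
    "G-M201":         [0.050, 0.050, 0.000, 0.048, 0.033],
    "E1b1b1-M35":     [0.000, 0.000, 0.000, 0.095, 0.033],
    "E1b1b1a-M78":    [0.000, 0.000, 0.000, 0.000, 0.033],
    "I-M170":         [0.000, 0.000, 0.000, 0.143, 0.033],
}


def carrier_counts(haplogroup):
    """Whole-number carrier counts per population (assumes rounded proportions)."""
    return [round(freq * SAMPLE_SIZES[pop])
            for freq, pop in zip(FREQUENCIES[haplogroup], POPULATIONS)]


def pooled_percent(haplogroups):
    """Percentage of all sampled men carrying any of the listed haplogroups."""
    total = sum(SAMPLE_SIZES.values())
    carriers = sum(sum(carrier_counts(h)) for h in haplogroups)
    return 100.0 * carriers / total


print(round(pooled_percent(["J1-M267", "J1e1-M367-M368"]), 1))  # J1 with J1e1
print(round(pooled_percent(["J2-M172"]), 1))
print(round(pooled_percent(["R1b1-P25"]), 1))
```

The same function applied to G-M201, I-M170, or R1a1-M17 gives about 4 % for each haplogroup separately.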

Our pioneering study of the ethnogenomic structure of Tats and Mountain Jews compared the distributions of Y-chromosome haplogroups (male lines) and mtDNA (female lines) in these two groups and then compared them with those of other ethnic groups in Dagestan and neighboring regions (Bertoncini et al. 2012). We used a multivariate analysis of the frequencies of the main Y-STR haplogroups in 27 Dagestani ethnic groups and in other Caucasus and Middle East ethnic groups to study genetic similarities between Tats and Mountain Jews, as well as between both of these groups and Iranian groups from nine different regions (Fig. 3.6). The results of multidimensional scaling, conducted to study interethnic relations between the populations of West Asia, Central Asia, and the Caucasus, suggest a substantial differentiation between the ethnic Tats and the ethnic Mountain Jews (Bertoncini et al. 2011) (Figs. 3.6 and 3.7).

Fig. 3.6 Multivariate analysis of the frequencies of the major Y-STR haplogroups in major populations of Dagestan, the Caucasus, and the Middle East. Geographical regions are indicated by the following symbols: filled squares Dagestan; filled circles Caucasus; filled triangles West Asia. Gray color indicates haplogroups. Legend: MJ Mountain Jews; TAT Tats; Lk Laks; Avr Avars; Kbc Kubachins; Tbs Tabasarans; Drg Dargins; Lzg Lezgins; Rtl Rutuls; Abz Abazins; Abk Abkhaz; Arm Armenians; AZB_NT Azerbaijanians-North Talysh; Che Chechens; Geo Georgians; Ins Ingush; Kbd Kabardians; Krd Kurds; Ir_Teh Teheran Iranians; Ir_Isf Isfahan Iranians; Ir Iranians; Ir_Arb Iranian Arabs; Ir_Gil Iranians-Gilaks; Ir_Bak Iranians-Bakhtiari; Ir_Maz Iranians-Mazandaranis; Ir_ST Iranians-South Talysh; Jor Jordanians; Trk Turkish; Yem Yemenis

In order to study the relationship between religious community (Islam) and genetic similarity between the ethnic Tats and the indigenous Dagestani people, we analyzed the distribution of the J1-M267 haplogroup, the most common male lineage in the Dagestani population, across the Caucasus, the Middle East, and North Africa. Samples from 29 ethnic groups yielded 282 J1-M267*G chromosomes, typed at 20 Y-STRs and 6 SNPs (Fig. 3.8). Geneticists from Italy, Great Britain, Morocco, Egypt, and Iran collaborated with us on the study (Tofanelli et al. 2009). A haplogroup frequency exceeding 50 % was found in Arabia (Qatar and Yemen), Sudan (Bedouins), and the Caucasus (Dagestan). These frequencies correlate with the diversity of these haplotypes (R² = 0.387, p < 0.001). The distribution of the index R² favors the highest dispersion of these haplotypes in the Middle East and the lowest in Dagestan and Sudan.

Fig. 3.7 Multidimensional scaling of the HVS-I sequence matrix (haplotype frequencies) demonstrating the genetic relationships among the ethnic populations of the Caucasus, West Asia, and Central Asia. Filled squares Dagestan; filled circles Caucasus; filled triangles West Asia; open circles Central Asia (Uzbk Uzbeks; Trkm Turkmens; Kazh Kazakhs). For symbols of other ethnic groups, see Fig. 3.6

Fig. 3.8 Contour map showing the distribution of J1 and J*(xJ2) haplogroups in ethnic groups of the Caucasus, Middle East, Central Asia, and North Africa professing Islam

We studied Caucasus peoples connected to the region through linguistic and geographical ties in order to evaluate the role of linguistic and geographical differences in the intra- and intergroup differences in the genetic structure of the studied peoples (Table 3.3).

Table 3.3 Analysis of genetic differentiation in the male genome of Caucasus people (Y-chr.), grouped according to different classification criteria

                                              % of genetic variance
Groups                                        Between  Between populations  Within the   FCT      FSC      FST
                                              groups   within groups        population
North Caucasus(a)                             0.07     13.16                86.77        0.0007   0.1316*  0.1323*
North + South Caucasus(b)                     -0.17    6.2                  93.97        -0.0017  0.0618*  0.0603*
Languages: Caucasus + Indo-European + Turkic  -4.22    16.14                88.08        -0.0422  0.1548*  0.1192*
Languages: Caucasus + Indo-European           -2.4     16.13                86.27        -0.024   0.1575*  0.1373*

(a) North Caucasus: Mountain Jews, Tabasarans, Tats, Abadzintians, Chechens, Dargins, Ingushes, Kabardians, Lezgins, Ossetians, Rutuls
(b) South Caucasus: Abkhazians, Armenians, Azerbaijanis, Georgians
Languages: Caucasus (Abadzintians, Abkhazians, Chechens, Dargins, Georgians, Kabardians, Ingushes, Lezgins, Rutuls, Tabasarans); Indo-European (Armenians, Mountain Jews, Ossetians, Tats); Turkic (Azerbaijanis)
* P = 0.0000

We defined a geographic North Caucasus group comprising a number of its peoples (Mountain Jews, Tabasarans, Tats, Abadzintians, Chechens, Dargins, Ingushes, Kabardians, Lezgins, Ossetians, Rutuls); for comparison, we also combined peoples from the North and South (Abkhazians, Armenians, Azerbaijanis, Georgians) Caucasus into one group. The data show a significant genetic differentiation between Mountain Jews and Tats, which points to different genetic roots for these two groups, a finding first established together with our Italian colleagues (Caciagli et al. 2009; Tofanelli et al. 2009; Bertoncini et al. 2011). Using linguistic criteria, we identified two groups: Caucasus (including the Nakh-Dagestan) and Indo-European, as well as the Turkic-speaking people. The results show that differences between populations within groups vary within 6–14 %, whereas intrapopulation variability is the most significant component and lies between 87 and 94 % (Table 3.3). F statistics naturally reflect these intra- and intergroup differences.
The highest FST values were observed in the North Caucasus group (FST = 0.1323) and in the group combining the Caucasus and Indo-European languages (FST = 0.1373). The differences were statistically significant, p = 0.000. Phylogenetic analysis of the ethnogenomic subdivision of the studied groups showed that the variability of the above genomic variants is due to demographic processes in the history of these peoples and to their adaptation to environmental (climatic) conditions, rather than to the spread of Islam in these countries.
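The fixation indices in Table 3.3 follow from the three variance percentages in the same row. The sketch below uses the standard hierarchical (AMOVA-style) definitions; that the table was produced this way is an assumption, although the tabulated values are consistent with it. The function name is hypothetical, and the garbled sign characters in the extracted table are read as minus signs.

```python
def f_statistics(between_groups, between_pops_within_groups, within_pops):
    """Hierarchical fixation indices from three variance components.

    FCT = Va / Vtotal, FSC = Vb / (Vb + Vc), FST = (Va + Vb) / Vtotal,
    where Va, Vb, Vc are the among-group, among-population-within-group,
    and within-population components (percentages work equally well).
    """
    va, vb, vc = between_groups, between_pops_within_groups, within_pops
    total = va + vb + vc
    fct = va / total
    fsc = vb / (vb + vc)
    fst = (va + vb) / total
    return fct, fsc, fst


# Row "Languages: Caucasus + Indo-European" of Table 3.3.
fct, fsc, fst = f_statistics(-2.4, 16.13, 86.27)
print(round(fct, 4), round(fsc, 4), round(fst, 4))

# The three indices are linked by (1 - FST) = (1 - FSC) * (1 - FCT).
print(abs((1 - fst) - (1 - fsc) * (1 - fct)) < 1e-12)
```

Slightly negative between-group components, as in three of the four rows, are a common artifact of estimation and are usually interpreted as zero differentiation between the groups.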


[Figure 3.9 plot. Scale: 0, 4, 8, 12, 16. Labeled groups: Hema, Sotho/Tswa, Ajur, Nande, Nguni, Mongolians, Pygmy, San, Nigerian, Lowercaste, Malay, Tsonga, Vietnamese, Dagestan:KUB, Middicaste, Chinese, DagestanHM, Japanese, Cambodian, Dagestan:STAL1, Uppercaste, Itailans, Dagestan:URK, Finns, N.European, Dagestan:KUR4, French, Poles]

Fig. 3.9 Distribution of Dagestani (Avars, Dargins, Kumyks, Lezgins, Kubachins) and other worldwide groups built from genetic distances at the hypervariable segment HVS1 of mtDNA

The subdivision of Dagestani peoples by female lineage was studied using mtDNA haplogroup frequencies (Fig. 3.9). These results suggest that, by mtDNA, Dagestani ethnic groups fall within the clusters of European peoples. In this case, however, the genetic similarity is distributed nearly equally between the European and Asian groups (Fig. 3.9). This suggests that the female half of the gene pool of Dagestani people is more mixed. The frequencies of the HV, T, and U5 haplogroups in Dagestani groups are similar to those of ethnic groups in Central and Northern Europe and Turkey (Marchani et al. 2008). The Kubachins are an exception, with a frequency above 75 % of a U subtype haplogroup that is most often found in Northern European and Russian populations. Ethnic Nogais have resided on Dagestani territory for more than 700 years and, by the frequencies of mtDNA haplotypes, are closest to the Central and East Asian groups, which are characterized by a high frequency of B, C, and D haplogroups. The range of haplogroup frequency variation in indigenous Dagestani ethnic groups corresponds to the range of variation in other ethnic groups.


3.3 Genetic Epidemiology Study of Selected Genetic Isolates with the Aggregation of Schizophrenia Spectrum Disorders

The manifestation of mental disorders is exacerbated in traditional Dagestani villages. Questionnaire interviews and the DIGS and FIGS clinical instruments reveal valuable information about family and kindred (tukhum) history in these villages. Oral history, the ethnic tradition of preserving information about seven generations of ancestors and transmitting it to children in families, substantially aids the identification of common ancestors among patients and makes it possible to determine the types of consanguineous marriages (Bulayeva et al. 2005). Examinations performed by clinicians participating in our expeditions showed that the number of chronic patients recorded in the medical and statistical summary data for Dagestan was lower than the actual number of patients in the mountain villages. The lack of medical care and economic resources likely causes the disparity; it is difficult for mountaineers to make the trip to district and republican clinical centers. Our expeditionary studies used the familial history of chronic disease and its nosological forms; identification was unified across all examined mountain villages, which makes it possible to assess, with a certain degree of confidence, the reliability of the resulting epidemiological pattern. Table 3.4 shows the frequency of the most common comorbidities in the studied isolates with aggregation of mental pathology.

Our study showed that the rate of officially registered patients with complex chronic disorders in different isolates is 25–35 % of their total number. Our epidemiological study, however, indicated a greater family history, with larger numbers of chronically ill patients who had never been examined in any hospital. When we ascertained the isolates with aggregation of certain mental diseases, the total number of chronic patients in a particular isolate was considered to be high for mental disorders.
Using the sources described above, analysis of the prevalence of chronic diseases revealed that one complex pathology aggregates in nearly all primary isolates; that is, all primary genetic isolates demonstrated higher clinical homogeneity than secondary isolates. Schizophrenia is relatively more common among chronic pathologies in isolates DGH005 and DGH022 (58.7 % and 58.9 %, respectively). Isolate 6009 showed high aggregation of cancer (63.04 %), and isolate 6006 showed the accumulation of suicide and depressive disorders (major recurrent depression, anxiety, and dysthymia).

Table 3.4 Structure of morbidity in a number of examined mountain Dagestan isolates

Diagnosis                            Patients   %
Isolate DGH064                       N = 44     100
1. Schizophrenia                     24         54.5
2. Tuberculosis                      2          4.5
3. Myopia                            2          4.5
4. Cardiovascular pathology          14         31.8
5. Mental ineptitude                 2          4.7
Isolate DGH009                       N = 46     100
1. Schizophrenia                     5          10.83
2. Oncology                          29         63.04
3. Neuromuscular dystrophy           3          6.52
4. Mental ineptitude                 3          6.52
5. Epilepsy                          6          13.09
Isolate DGH002                       N = 34     100
1. Myopia                            2          5.9
2. Schizophrenia                     20         58.9
3. Gastroenterological pathology     6          17.6
4. Cardiovascular pathology          6          17.6
Isolate DGH022                       N = 62     100
1. Schizophrenia                     24         38.7
2. Affective disorder                5          8.06
3. Intellectual disability           7          11.3
4. Tuberculosis                      14         22.58
5. Myopia                            12         19.35
Isolate DGH011                       N = 75     100
1. Schizophrenia                     38         50.7
2. Intellectual disability           12         16
3. Tuberculosis                      10         13.3
4. Cr                                5          6.7
5. Rheumatoid disease                10         13.3

Notes: The number of chronic patients from each isolate diagnosed in Dagestani hospitals was taken as 100 %. The incidence was calculated based on the total number of chronically ill patients

The results of genetic analysis depend on the populations selected. In isolates with a high rate of consanguineous marriages and small Nt, the appropriate method is pedigree-based linkage analysis, while in large, outbred, genetically heterogeneous populations the appropriate methods are association analysis using samples of unrelated people (patients and healthy subjects) and linkage disequilibrium analysis.

The genetic architectonics of complex phenotypes depend on the genetic structure of a specific population (Gindilis 1979; Falconer 1960; Zhivotovsky 1984; Gindilis et al. 1989; Bulayeva 1991; Jorde 2000). Most studies in this area use outbred, genetically heterogeneous populations of the USA or Europe. Genetic heterogeneity in such populations makes reproducible genetic linkage difficult to establish for many reasons, mainly the genetic heterogeneity of the disease and the effects of interaction between the genes of multiple founders in the family pedigree. These populations experience greater effects of penetrance and phenocopies because of dynamic environmental factors, which significantly complicate the identification of real genetic linkage through false-positive or false-negative results caused by, inter alia, population stratification. These difficulties are inherent in studies of the genetics of complex diseases and therefore require an accurate clinical diagnosis, as well as careful examination of the genetic and demographic structure of the studied population (Bulayeva 1991; Wright et al. 1999). This led to the search for genetically homogeneous isolates in which the ancestor effect, along with other factors of genetic drift, contributes to a high frequency of the haplotype carrying the pathogenic locus, which facilitates the identification of genes involved in pathogenesis in patients of modern generations (Bulayev et al. 2008).

Ethnic differences in human populations are important sources of genetic subdivision (Cavalli-Sforza and Bodmer 1971; Novembre et al. 2008). Populations that are ethnically and geographically subdivided tend to have a limited number of common ancestors and may have a specific gene pool. As noted by Dobzhansky, “Social transformations involve genetic as define a choice of marriage partners” (1973). These ethnic populations may differ in rare, unique allele variants. Such interpopulation differences are much smaller, only 10–15 %, than the interindividual differences within populations (85–90 %). The 10–15 % interpopulation variation may contribute to understanding ethnogenesis and population subdivision, as well as genetic-epidemiological differences.

Genetic isolates with a small number of founders, stable total size, and marital isolation for hundreds of generations provide exceptional opportunities for complex disease gene mapping. The specific genetic processes in these isolates cause significant reductions in genomic heterogeneity, facilitating the identification of susceptibility genes for a given disease. These isolates are furthermore effective for complex disease gene mapping because their historical development often occurs in stable, yet extreme, environmental conditions (desert, jungle, highlands, or the Far North); these environments reduce the effects of low penetrance and of phenocopies on disease manifestation, which hamper the identification of susceptibility genes (Jorde 2000).
Isolates were selected for further genetic study if an aggregation of schizophrenia and schizophrenia spectrum disorders, predominantly within a certain kindred (tukhum) with common ancestors, was present. We selected four ethnically and demographically diverse isolates: DGH064, DGH022, DGH005, and DGH011 (Table 3.4). Table 3.5 presents a summary of the selected isolates. As can be seen from Table 3.5, the LMR epidemiological index in these isolates was 2.3–3 %, 2–3 times higher than the general population value of 1 %. Our study revealed a number of other isolates in which the aggregation of the disease was even higher, with LMR values of 4–5 %. These isolates could not be used in this work because the DNA of the members of their pedigrees is still undergoing a full genome scan. A higher proportion of male patients with schizophrenia in these isolates is noteworthy. In total, we examined 248 people in the selected isolates, including 123 patients. For all isolates we reconstructed complex, extensive pedigrees of patients, spanning from 282 to 572 members over 12–14 generations retrospectively. The study confirmed the results of our earlier genealogical research in the Dagestan isolates, demonstrating that, in each examined isolate, all clinically homogeneous patients with the same diagnosis are located in one pedigree with a common ancestor or a limited number of ancestors. The inbreeding coefficient was calculated by the traditional method of population genetics, taking into account the marriage structure over three generations in representative and randomized samples.

Table 3.5 Description of selected isolates for the study and reconstructed pedigrees

Isolate  Ethnic background  Nt    NPM   F       NO of SCZ cases  % affected male  LMR(a)  NO
DGH005   Laks               931   283   0.0121  21               58.1             0.0252  37
DGH064   Tindals            1800  572   0.0092  39               52.0             0.0327  86
DGH022   Dargins            1340  314   0.0114  27               63.3             0.0291  50
DGH011   Dargins            2000  533   0.0073  36               61.0             0.0297  75
Total    4                  6071  1702  0.0098  123              58.6             0.0292  248

(a) Data on the epidemiological LMR index of the DGH064 and DGH005 isolates were calculated taking into account the recent research expeditions of 2009–2010, which revealed new cases of schizophrenia among previously surveyed members of the pedigrees. NPM is the number of members of the reconstructed pedigree of the isolate, F is the mean coefficient of inbreeding in the isolate, LMR is the risk of schizophrenia over a lifetime, and NO is the total number of people surveyed in each isolate

The inbreeding coefficient values in these isolates ranged from 0.007 to 0.012; the primary isolates DGH005 and DGH022 are more inbred than the secondary isolates DGH064 and DGH011, with values up to nearly twofold higher (Table 3.5). In the current generation of ethnic Tindal migrants, more than 32 % of marriages (74 marriages) are exogamous (inter-village and interethnic). Since ethnic Tindals are one of the smallest Dagestani groups, with a total population of 3900 people, inter-village marriages with residents of neighboring villages are interethnic. Such inter-aul marriages led to a lower average coefficient of inbreeding in the studied Tindal migrants than in Tindals living in the historic mountain environment and keeping the traditional way of marital relations. The average age of disease onset in the isolates ranged from 20.8 to 24 years. The manifestation age for all three isolates is 22.9 ± 0.568 years, the minimum age of onset is 14 years, and the maximum is 36 years. Among the officially registered and examined patients, about 60 % were male.
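A mean inbreeding coefficient such as the F values in Table 3.5 can be estimated from marriage structure as a weighted average over marriage types. The sketch below is one common textbook approach and only an assumption about what "the traditional method of population genetics" refers to here; the per-marriage coefficients are the standard autosomal values for offspring, while the marriage counts and names are invented for illustration.

```python
from fractions import Fraction

# Standard autosomal inbreeding coefficients of the offspring of each marriage type.
OFFSPRING_F = {
    "uncle_niece": Fraction(1, 8),
    "first_cousins": Fraction(1, 16),
    "first_cousins_once_removed": Fraction(1, 32),
    "second_cousins": Fraction(1, 64),
    "unrelated": Fraction(0),
}


def mean_inbreeding(marriage_counts):
    """Mean inbreeding coefficient: sum(n_i * F_i) / sum(n_i) over marriage types."""
    total = sum(marriage_counts.values())
    if total == 0:
        raise ValueError("no marriages recorded")
    weighted = sum(OFFSPRING_F[kind] * n for kind, n in marriage_counts.items())
    return float(weighted / total)


# Hypothetical marriage structure of one sampled generation (not data from the study).
sample = {"first_cousins": 10, "second_cousins": 20, "unrelated": 70}
print(mean_inbreeding(sample))
```

In practice the counts would be pooled over the three sampled generations, and more remote consanguinity traced through the reconstructed pedigrees would raise the estimate.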
The average age of patients with schizophrenia at the time of the study was 37.7 Æ 1.53 years, the minimum age of 14 years, and a maximum age of 79 years. In order to determine the “purity” of the studied clinical phenotype in identified isolates, we studied the medical records of patients; 60 % of studied individuals did not have other concomitant diseases, and 40 % did. The medical records of 46 examined patients from isolate DGH064, for example, showed that 29 did not have recorded comorbidities (63 %), 5 had tuberculosis (10 %), 4 had diseases of reproductive organs and infertility (about 9 %), and 3 had gastritis or stomach ulcers (6.5 %) (Table 3.6). The study observed 48 patients with schizophrenia in primary isolates; 35.4 % of these patients had concomitant diseases. Out of the 75 patients observed from secondary isolates, 60 % had concomitant diseases. According to regional hospital records in our ascertained genetic isolates, 29 were affected by physical illnesses in primary isolates and 61 in secondary isolates. The most common somatic pathol- ogies in both types of isolates are pulmonary diseases, including lung tuberculosis, cardiovascular diseases (hypertension, atherosclerosis, and congenital heart

[email protected] 52 3 Selection of Populations for Mapping Genes of Complex Diseases

diseases), and diseases of internal organs. In both types of isolates, eye diseases were mainly associated with age-related macular degeneration; only a few cases were associated with myopia. The results therefore showed that patients with schizophrenia did not have specific comorbidities that might contribute to false-positive or false-negative results of genetic linkage in pedigrees. It is noteworthy that secondary isolates, compared to primary isolates, have increased clinical heterogeneity with regard to both the schizophrenia-related somatic disorders and the general nosologies in isolates (Table 3.6).

Table 3.6 Structure of morbidity among the members of the pedigrees of studied isolates

| Patient groups | Primary isolates (No.) | Secondary isolates (No.) |
|---|---|---|
| Schizophrenia, total patients | 48 | 75 |
| Comorbidities found with schizophrenia spectrum diseases: | | |
| Diseases of digestive tract | 3 | 8 |
| Diseases of reproductive organs | 4 | 9 |
| Intellectual disability | 2 | 7 |
| Cardiovascular disease | 1 | 8 |
| Rheumatoid diseases | 2 | 5 |
| Pulmonary diseases and tuberculosis | 5 | 8 |
| Total | 17 (35.4 %) | 45 (60 %) |
| Other diseases in healthy members of kindreds: | | |
| Pulmonary diseases and tuberculosis | 4 | 14 |
| Eye diseases | 8 | 7 |
| Diseases of the digestive tract | 6 | 9 |
| Cardiovascular disease | 3 | 10 |
| Epilepsy | | 4 |
| Rheumatoid diseases | 5 | 6 |
| Intellectual disability | 3 | 9 |
| Cancer | | 2 |
| Total | 29 | 61 |

Using DIGS and FIGS, we found that different isolates accumulate population-specific forms of schizophrenia. We established that isolate DGH022 showed preferential accumulation of disorganized schizophrenia with an earlier age of onset (average, 20.8 ± 1.51 years). Patients with disorganized schizophrenia may or may not react or display emotions adequately and may lack signs of catatonia and suicide (Bulayeva et al. 2000, 2002, 2005, 2007). Previous studies of Gindilis et al. (1989), conducted in populations of ethnic Komi, in isolates of Russian Old Believers, and in a mixed population of migrant workers in the Ural region, established population-specific accumulation of different clinical forms of schizophrenia. Genetic drift factors, of which isolated founders produce

the most significant effects, may cause cross-population differences in the accumulation of different clinical forms of schizophrenia.

The other isolates, DGH022, DGH011, and DGH064, predominantly accumulate paranoid schizophrenia with both a later average manifestation age (24.0 ± 2.35 years) and the presence of 4 (DGH064, DGH011) and 1 (DGH022) suicides in the last 2–3 generations. Patients in these isolates experience pronounced visual and auditory hallucinations, usually related to a single topic, and are burdened by paranoid delusions, fears, and significant depressive episodes during periods of exacerbation. Of the patients examined in isolates DGH022, DGH011, and DGH064, 93 % experience chronic auditory hallucinations; 74 % of these patients experienced auditory hallucinations in combination with visual hallucinations. Furthermore, 34 % of cases involved hallucinations of smells, 90 % of patients had paranoid delusions of persecution, and 57 % experienced other forms of delusions. Delusions associated with TV vary significantly between mountain isolates, from 40 % in isolate DGH022 to 0 % in isolate DGH005. Our study of these phenomena showed that variation in the diagnostic criteria reflects the social environment in which these patients reside. The aul with 0 % of TV-associated delusions, in particular, had no telecommunications or TVs in its homes. This is a striking example of the importance of accounting for all environmental factors when describing clinical phenotypes used in a genetic study.

The structure of mental illness in isolates DGH022, DGH011, and DGH064 shows accumulation of various forms of schizophrenia and clinical heterogeneity. Clinical heterogeneity of the psychotic phenotype found in secondary isolates is slightly higher, and complex phenotypes, such as a combination of mental retardation and psychosis, psychosis, or epilepsy, are present (Table 3.6).
Primary isolates DGH005 and DGH022 are characterized by relatively homogeneous clinical phenotypes, compared with secondary isolates DGH011 and DGH064 (Table 3.6). Figure 3.10 presents examples of pedigrees from the four specified isolates. Patients with schizophrenia spectrum disorders are highlighted. The branches show that in all pedigrees, first- or second-degree relatives express schizophrenia spectrum disorders, confirming the known clinical data (Gottesman and Shields 1972; Tsuang and Faraone 1995). The presented fragments represent a small part of the vast family trees, which favor the accumulation of patients with this pathology in homogeneous families with a long history of consanguineous marriages, spanning hundreds of generations. The results also show that these genes and their complexes pass into a homozygous state during inbreeding, causing clinical manifestation of the disease. Reconstructed extended pedigrees therefore accumulate schizophrenia spectrum diseases, in which the number of patients with schizoaffective disorders, schizotypal disorders, and paranoid personality disorders among first- and second-degree relatives is almost two times higher than the number of registered patients with schizophrenia. The examined isolates accumulate certain forms of schizophrenia, possibly indicating the presence of certain differences and similarities in the genetic determinants of these forms of schizophrenia. Secondary isolates DGH064 and DGH011, compared with the two primary isolates DGH005


Fig. 3.10 Pedigree branches of genetic isolates DGH064 (a), DGH005 (b), DGH022 (c), and DGH011 (d). Legend: P/SCZ, possibly with schizophrenia; SCZ, schizophrenia and related spectrum disorders

and DGH022, have more clinical heterogeneity, explained by the demographic history and the great antiquity of primary isolates. Primary isolates could have a greater number of meioses and recombinations, in which the founder haplotype with pathogenic loci was significantly narrowed compared to secondary isolates with a younger demographic history. The results of these studies show that cross-population studies in primary and secondary isolates are a promising way to identify the relationship between the clinical and genetic heterogeneity of complex diseases.

3.4 Gene Pool of Selected Isolates for Mapping Genes of Schizophrenia

The gene pool used for mapping schizophrenia genes was characterized with 54 STR loci from the scans of chromosomes 3, 17, and 18; polymorphism data came from samples of unrelated members of the isolates, 109 people in total. Data on marriage structure, fertility, mortality, and reproductive parameters were collected from all available isolate members. Figure 3.11 shows the results of the analysis of the observed heterozygosity at 15 loci of chromosome 17 in the descendants of inbred, outbred, and endogamous marriages. The data demonstrate the expected reduced heterozygosity in the group of descendants of inbred marriages. The average observed heterozygosity is 0.734 in the group of exogamous descendants and 0.671 in the inbred descendants; Rs = −0.328, df = 2, P = 0.021. The results additionally show that the descendants

Fig. 3.11 Distribution of sizes of alleles of 21 STR loci in groups of descendants from outbred (1), endogamous (2) and inbred (3) marriages. X-axis: the size of alleles of studied loci; Y-axis: frequency of their occurrence in these groups of descendants, %

of consanguineous marriages have relatively larger STR allele sizes compared to descendants of exogamic marriages. The loci D17S1293, D17S1301 (tetranucleotide), D18S854 (trinucleotide), and D17S784 (dinucleotide) show a statistically significant increase in allele size when descendants of exogamic marriages are compared with descendants of consanguineous marriages (Rs varies in different isolates from 0.48 to +0.635, t = 2.2–3.5, P = 0.03–0.002). Locus D17S784 (dinucleotide) exemplifies an increase in allele size (nucleotide repeats) in groups of descendants of exogamic and consanguineous marriages, as shown in Fig. 3.12 (descendants of endogamous marriages are excluded). Genetic drift likely causes the size increase of alleles among the children of exogamous and consanguineous marriages.

Figure 3.13 presents a comparative analysis of the level of heterozygosity and allelic rank distribution of 28 microsatellites between three studied ethnic groups in Dagestan and the global summary of the John Weber lab. The results show low levels of heterozygosity in all Dagestan ethnic groups, compared with the J. Weber


Fig. 3.12 The distribution of alleles of D17S784 in the groups of descendants of exogamous (inter-population and inter-ethnic) (1) and consanguineous (2) marriages

Fig. 3.13 A comparative analysis of the level of heterozygosity and allelic rank distribution of 28 microsatellites between 3 studied ethnic groups in Dagestan and the global summary of the John Weber lab. HWEB, level of heterozygosity in the combined sample from the John Weber lab; HLAKS, HDARG, and HTIND, levels of heterozygosity in the examined groups of Laks, Dargins, and Tindals

data. Analysis of the observed heterozygosity of the studied DNA microsatellites in the summary data and in individual isolates within ethnic groups in Dagestan gave similar results. There is less genetic variation in the ethnic Laks (DGH005) and ethnic Dargins (DGH022) isolates. The average level of heterozygosity, according to summary data from J. Weber, is 0.752; in these Dagestani groups, heterozygosity ranged from 0.713 to 0.758. Our results show that almost all Dagestani ethnic groups have lower average heterozygosity, compared to the global summary for similar loci. A previous analysis of 54 microsatellites, studied in seven other ethnic Dagestani groups, obtained similar results (Bulayeva et al. 2003a, b, 2006).

The analysis found significant differences between the Dagestani ethnic groups in the polymorphism of particular loci and in the level of genomic heterogeneity (Table 3.7). For a number of loci in some ethnic groups, the frequency of certain detected alleles is two or more times higher than the frequency of the same alleles in other ethnic groups. Allele 304 of the D17S1308 locus, for example, occurs with a frequency of 8.33 % in Tindals and 48.3 % in Dargins. Similar differences in frequency distribution are observed at almost all other loci. These Dagestani groups additionally contain unique alleles that have not previously been mentioned in world samples. Alleles 316 and 324 of locus D17S1308 are detected only in Laks and are considered rare alleles, with a frequency of 0.022. A similarly rare allele, 242 of the D17S1298 locus, was found only in Tindals; allele 258 of the same locus was found only in Laks. Only Dargins possessed alleles 221 and 264 of the D17S1303 locus, 258 and 284 of the D17S947 locus, 208 of the D17S1299 locus, and 239 of the D17S784 locus. The distribution of allele 131 of the D18S535 locus follows a similar pattern; the allele is found in 10 Dargin people, whereas in the other groups it occurs in 0 to 2 people; the allele is therefore rare.

The studied DNA loci were tested for compliance with the Hardy–Weinberg distribution in the analyzed populations. The analysis showed that allele distribution did not differ from the Hardy–Weinberg equilibrium (Table 3.8). We calculated genetic similarity parameters between the examined groups according to Nei (1978) using a summary of genomic loci (Table 3.9). The results show the relative proximity of the Dargin and Lak ethnic groups. More genetic affinity is revealed between the two Dargin isolates, caused by their single ethnicity. The geographic distances between Tindals and Dargins are reflected by their genetic distances. Tindal and Lak auls are more alpine; they are located at 2000–2750 m above sea level, while Dargin auls are located at 700–1800 m above sea level. The geographical localization of Tindals and Laks may contribute to creating similar genetic mechanisms for adapting to their high-mountain environment, explaining the certain genetic proximity of Tindals and Laks, compared to Dargins.

Results obtained in the study of DNA polymorphism in selected genetic isolates indicated significant differences between them in the frequency of alleles of DNA loci, as well as in the level of heterozygosity, and in the presence of rare unique alleles

Table 3.7 Comparative analysis of the level of heterozygosity and allelic ranks of loci of chromosomes 17 and 18 in 5 Dagestan ethnic groups, and summary data from the John Weber lab

| Loci | Summary H | Summary alleles | Laks H | Laks alleles | Dargins H | Dargins alleles | Tindals H | Tindals alleles |
|---|---|---|---|---|---|---|---|---|
| D17S1308 | 0.66 | 304–316 | 0.682 | 304–324 | 0.656 | 304–312 | 0.54 | 304–312 |
| D17S1298 | 0.6 | 246–258 | 0.429 | 246–258 | 0.532 | 246–254 | 0.587 | 242–254 |
| D17S974 | 0.64 | 201–217 | 0.69 | 197–213 | 0.657 | 197–217 | 0.63 | 197–217 |
| D17S1303 | 0.7 | 225–245 | 0.672 | 225–245 | 0.723 | 221–264 | 0.645 | 225–245 |
| D17S947 | 0.89 | 250–282 | 0.789 | 260–280 | 0.87 | 258–284 | 0.877 | 262–282 |
| D17S2196 | 0.81 | 139–163 | 0.821 | 139–167 | 0.817 | 139–167 | 0.776 | 139–163 |
| D17S1294 | 0.68 | 248–272 | 0.734 | 252–264 | 0.738 | 248–268 | 0.739 | 244–264 |
| D17S1293 | 0.83 | 262–290 | 0.856 | 262–294 | 0.837 | 262–290 | 0.825 | 262–286 |
| D17S1299 | 0.73 | 188–208 | 0.696 | 188–204 | 0.676 | 188–208 | 0.452 | 188–204 |
| D17S2180 | 0.67 | 116–128 | 0.534 | 113–125 | 0.643 | 116–128 | 0.68 | 113–125 |
| D17S1290 | 0.84 | 170–210 | 0.843 | 166–214 | 0.831 | 166–206 | 0.867 | 166–206 |
| D17S2193 | 0.79 | 138–159 | 0.78 | 141–159 | 0.79 | 141–159 | 0.78 | 141–162 |
| D17S1301 | 0.65 | 147–163 | 0.63 | 147–159 | 0.732 | 147–163 | 0.603 | 151–163 |
| D17S784 | 0.77 | 226–238 | 0.722 | 226–244 | 0.772 | 226–238 | 0.73 | 226–234 |
| D17S928 | 0.79 | 135–165 | 0.779 | 135–155 | 0.78 | 135–159 | 0.831 | 135–157 |
| GATA178F11 | 0.82 | 370–398 | 0.9 | 366–398 | 0.811 | 370–396 | 0.748 | 370–386 |
| D18S481 | 0.76 | 183–203 | 0.804 | 183–203 | 0.866 | 183–203 | 0.791 | 183–203 |
| D18S976 | 0.86 | 171–194 | 0.843 | 175–190 | 0.842 | 171–194 | 0.844 | 175–190 |
| D18S843 | 0.75 | 179–191 | 0.658 | 179–191 | 0.695 | 179–191 | 0.46 | 182–191 |
| D18S542 | 0.79 | 178–198 | 0.815 | 186–202 | 0.793 | 186–202 | 0.731 | 186–200 |
| D18S877 | 0.68 | 117–137 | 0.721 | 121–137 | 0.713 | 117–137 | 0.614 | 117–137 |
| D18S535 | 0.76 | 131–155 | 0.76 | 131–155 | 0.744 | 131–155 | 0.789 | 131–159 |
| D18S851 | 0.73 | 256–276 | 0.707 | 256–272 | 0.684 | 256–276 | 0.712 | 256–272 |
| D18S858 | 0.75 | 193–211 | 0.711* | 193–211 | 0.844* | 193–211 | | |
| D18S1357 | 0.79 | 126–147 | 0.734 | 126–138 | 0.803 | 126–147 | 0.821 | 123–144 |
| D18S1364 | 0.76 | 164–188 | 0.767 | 164–188 | 0.806 | 156–188 | 0.778 | 164–184 |
| ATA82B02 | 0.84 | 172–196 | 0.838 | 175–193 | 0.886 | 175–199 | 0.72 | 175–196 |
| D18S1371 | 0.7 | 133–153 | 0.743 | 133–153 | 0.729 | 133–153 | 0.789 | 133–157 |
| D18S844 | 0.76 | 182–200 | 0.763 | 188–200 | 0.787 | 185–200 | 0.597 | 185–200 |
| Hx | 0.752 (0.013) | | 0.74 (0.019) | | 0.758 (0.016) | | 0.713 (0.022) | |

Table 3.8 Hardy–Weinberg equilibrium distribution compliance of studied genomic loci of chromosome 17

| No. | Loci | cM | DGH022 | DGH064 | DGH005 |
|---|---|---|---|---|---|
| 1 | D17S1301 | 0 | 0.607 | 0.042 | 0.566 |
| 2 | D17S1298 | 11 | 0.041 | 0.325 | 0.051 |
| 3 | D17S974 | 22 | 0.674 | 0.918 | 0.246 |
| 4 | D17S1303 | 24 | 0.437 | 0.191 | 0.917 |
| 5 | D17S947 | 32 | 0.396 | 0.224 | 0.000 |
| 6 | D17S2196 | 45 | 0.411 | 0.008 | 0.446 |
| 7 | D17S1294 | 51 | 0.028 | 0.192 | 0.031 |
| 8 | D17S1293 | 56 | 0.091 | 0.084 | 0.398 |
| 9 | D17S1299 | 62 | 0.429 | 0.357 | 0.707 |
| 10 | D17S2180 | 67 | 0.286 | 0.956 | 0.136 |
| 11 | D17S1290 | 82 | 0.798 | 0.780 | 0.941 |
| 12 | D17S2193 | 89 | 0.574 | 0.371 | 0.175 |
| 13 | D17S1301 | 100 | 0.354 | 0.519 | 0.999 |
| 14 | D17S784 | 117 | 0.331 | 0.409 | 0.071 |
| 15 | D17S928 | 135 | 0.279 | 0.061 | 0.768 |
| Total | 15 | 0–194 | 2 | 2 | 2 |

Note: cM, the location of loci on the chromosome 17 physical map. Values with P ≤ .05 are statistically significant and are set in italic. The study was conducted on a sample of unrelated people from the studied isolates

Table 3.9 Assessment of genetic similarity of examined isolates by summary of genomic loci (Nei 1978)

| № | Isolates | Ethnicity | DGH022 | DGH011 | DGH064 | DGH005 |
|---|---|---|---|---|---|---|
| 1 | DGH022 | Dargins | 1.000 | 0.987 | 0.812 | 0.952 |
| 2 | DGH011 | Dargins | | 1.00 | 0.809 | 0.949 |
| 2 | DGH034 | Tindals | | | 1.000 | 0.892 |
| 3 | DGH005 | Laks | | | | 1.000 |

(Bulayeva et al. 1985; Enattah et al. 2002). Unique alleles are found among the smallest microsatellite allele sizes, which can most likely be explained by genetic drift.
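The similarity values of Table 3.9 are attributed to Nei (1978), but the text does not spell out the estimator. As a minimal sketch, assuming the small-sample (unbiased) form of Nei's genetic identity, the calculation for two groups could look as follows. The allele labels, frequencies, and sample sizes below are hypothetical and serve only to illustrate the arithmetic; they are not taken from the study.

```python
import math

def nei_identity(freqs_x, freqs_y, n_x, n_y):
    """Nei's (1978) unbiased genetic identity, pooled over loci.

    freqs_x, freqs_y: one dict per locus mapping allele label -> frequency.
    n_x, n_y: number of individuals sampled in each group (assumed equal
    across loci in this sketch).
    """
    jx = jy = jxy = 0.0
    for fx, fy in zip(freqs_x, freqs_y):
        alleles = set(fx) | set(fy)
        sum_x2 = sum(fx.get(a, 0.0) ** 2 for a in alleles)
        sum_y2 = sum(fy.get(a, 0.0) ** 2 for a in alleles)
        # Small-sample correction of the within-group homozygosity.
        jx += (2 * n_x * sum_x2 - 1) / (2 * n_x - 1)
        jy += (2 * n_y * sum_y2 - 1) / (2 * n_y - 1)
        # The between-group term needs no correction.
        jxy += sum(fx.get(a, 0.0) * fy.get(a, 0.0) for a in alleles)
    return jxy / math.sqrt(jx * jy)

# Hypothetical single-locus example with two alleles.
pop_a = [{"304": 0.5, "308": 0.5}]
pop_b = [{"304": 0.1, "308": 0.9}]
identity_ab = nei_identity(pop_a, pop_b, n_x=50, n_y=50)
print(round(identity_ab, 3))
```

With the small-sample correction the estimate can slightly exceed 1 for two samples with identical frequencies, so values close to 1.000 in a table of this kind are best read as "indistinguishable at this sample size".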

3.5 Role of Inbreeding in the Aggregation of a Schizophrenia and in Its Age of Onset

Long-term studies in isolated populations of indigenous people of the Caucasus (Bulayeva 1991) showed that the combination of a severe highland environment, marital isolation, and genetic drift contributes to their “self-cleaning” from severe hereditary diseases. Rare cases of severe hereditary diseases were related to drug or alcohol abuse or participation in hazardous industries (the Chernobyl accident,

environmentally harmful production). Results of the epidemiological study showed that the effects of long-term endogamy and inbreeding, along with founder effects, in primary isolates may cause population-specific aggregation of one particular complex pathology (Bulayev and Bulayeva 2001; Bulayev et al. 2008, 2009). The relationship between the age of manifestation of several complex diseases and sexual maturity also plays a significant role in the accumulation of complex disease; Dagestani people traditionally marry early and have several children before a complex disease (SCZ) manifests in the patient, leading to the accumulation of the mutant gene and, as a result, of the clinical phenotype from generation to generation.

Genetic-demographic parameters characterizing marriage structure, basic reproductive parameters, morbidity, and mortality were studied in the examined subjects. Subjects were selected at random (with no obvious relations between them) and according to the principle of representativeness; depending on the population size, the number of subjects in our work ranged from 78 to 128. Two methods were used to assess the inbreeding coefficient (F): (a) the traditional method of population genetic studies, which determines the value by examining the marriages of the parents and grandparents of an individual, and (b) the pedigree method, in which extensive pedigrees of each individual were reconstructed retrospectively over 11–13 generations. The most closely related marriages in human populations are between cousins with F = 0.0156. The F value is reduced in all marriages between relatives of higher degrees (F < 0.0156).
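For orientation, the pedigree-based coefficient can be illustrated with Wright's path-counting rule, a standard textbook formula that the text does not state explicitly. In this sketch each loop through a common ancestor contributes (1/2)^n, where n is the number of individuals on the path from one parent through the common ancestor to the other parent; the common ancestors are assumed to be non-inbred. The function name and the loop lengths are illustrative assumptions.

```python
from fractions import Fraction

def path_inbreeding(path_lengths):
    """Sum (1/2)**n over all loops connecting the two parents.

    path_lengths: for every common ancestor, the number of individuals on
    the path parent -> ... -> common ancestor -> ... -> other parent
    (both parents and the ancestor included). Ancestors are assumed to be
    non-inbred, so the (1 + F_A) factor of the full formula is 1.
    """
    return sum(Fraction(1, 2) ** n for n in path_lengths)

# Parents who are first cousins share two grandparents;
# each loop contains 5 individuals.
first_cousins = path_inbreeding([5, 5])

# Parents who are second cousins share two great-grandparents;
# each loop contains 7 individuals.
second_cousins = path_inbreeding([7, 7])

print(first_cousins, float(first_cousins))
print(second_cousins, float(second_cousins))
```

Under these assumptions, offspring of first cousins come out at 1/16 = 0.0625 and offspring of second cousins at 1/64 ≈ 0.0156. Summing such loops over every common ancestor in an 11–14 generation pedigree is what allows the pedigree-based estimate to exceed the three-generation estimate for the same individual.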
The structure of marital relations in the isolates was studied using the CM value, which determined the degree of endogamy and exogamy of each examined subject: (1) all marriages of the direct ancestors in 3 generations were exogamous, (2) these generations included 1 or 2 exogamous marriages, and (3) all marriages were endogamous and consanguineous. This value was developed in the 1980s–1990s and applied to the study of the genetic and demographic structure of ethnic populations in Dagestan. The CM value showed that these populations have high endogamy, which varies from 80 to 90 % in different populations (Bulayeva 1991); this, along with high values of the inbreeding coefficient, demonstrates the significant marital isolation of the isolates. Calculated using the traditional method, the inbreeding coefficient F was based on the marriage structure of three generations of the proband's ancestors and varied in the examined isolates from 0.0017 in ethnic Kumyks to 0.0115 in ethnic Laks (Table 3.10). A detailed study of inbreeding in the same individuals, based on the analysis of marital relationships in extensive pedigrees reconstructed for 11–14 generations, indicated that the mean values of this coefficient increase by an average of 2.3 times, and in some cases by more than 3 times. The average coefficient of inbreeding in a population of relatively young ethnic Kumyks is lower than in other populations of indigenous people in Dagestan (Table 3.10). Inbreeding coefficient values vary considerably between different ethnic populations and among populations of one ethnicity. The average coefficient of inbreeding is 1.7–3.5 times greater in two populations of ethnic Dargins listed in Table 3.10 (Nos


Table 3.10 The average coefficient of inbreeding in the studied populations of indigenous people of Dagestan, calculated by marital structure in 3 (Fpop) and 12 (Fped) generations of ancestors of the same individuals

| No. | Isolate type | Ethnicity | Number of individuals (a) | Isolation years | Fpop | Fped | Fpop − Fped difference |
|---|---|---|---|---|---|---|---|
| 1 | PI | Laks | 87 | 5000 | 0.011 | 0.0306 | 0.0192 |
| 2 | PI | Laks | 121 | 5000 | 0.012 | 0.0291 | 0.0176 |
| 3 | VI | Tindals | 128 | 6000 | 0.009 | 0.0319 | 0.0230 |
| 4 | VI | Kumyks | 87 | 600 | 0.002 | 0.0023 | 0.0006 |
| 5 | VI | Dargins | 113 | 700 | 0.007 | 0.0073 | 0.0007 |
| 6 | PI | Dargins | 115 | 4000 | 0.009 | 0.0329 | 0.0239 |
| 7 | PI | Dargins | 57 | 3000 | 0.011 | 0.0250 | 0.0140 |
| 8 | VI | Avars | 78 | 800 | 0.010 | 0.0130 | 0.0031 |
| 9 | PI | Avars | 112 | 3000 | 0.011 | 0.0257 | 0.0145 |
| 10 | VI | Dargins | 87 | 700 | 0.007 | 0.0135 | 0.0069 |
| Total | | 5 | 985 | 28,800 | 0.088 | 0.2110 | 0.1234 |
| Average value | | | 98.5 | 2880 | 0.009 | 0.0211 | 0.0123 |

(a) We analyzed individual genealogy by marriage structure for 3 generations of ancestors (Fpop) and for 11–14 generations of ancestors in their pedigrees (Fped)

5 and 7). Peculiarities of their demographic history cause the differences between mono-ethnic populations. Isolate number 5, for example, is demographically young and is located in the foothills, close to auls with ethnic Avars. This Dargin aul (village) was founded by migrants from different high-mountain Dargin auls about 700 years ago, and its history shows recurrent marriages with neighboring residents, including individuals from Avar auls. Dargin aul No. 7 is demographically more ancient and has more pronounced isolation of marital relationships, which, in combination with its location in nearly inaccessible mountainous regions, contributes to its genetic isolation, reflected in a relatively high coefficient of inbreeding.

Analysis of marriage structure in pedigrees with schizophrenia accumulation showed that more than 60 % of patients with schizophrenia and schizophrenia spectrum disorders descend from inbred marriages. The Levene test for homogeneity of variance (F = 9.55, df = 1, p = 0.003) reflects a statistically significant increase of schizophrenia patients among offspring of consanguineous marriages (Fig. 3.14). The average value of the inbreeding level F is 1.364 for healthy subjects and 1.901 for patients (SD = 0.649 and 0.755, respectively). The average value of the heterozygosity level H per locus is 0.753 for healthy subjects and 0.696 for patients (SD = 0.116 and 0.144, respectively). The differences between patients and unaffected individuals for both values are supported by the nonparametric Mann–Whitney Z statistic: Z for H = 2.05, P = 0.040; Z for F = 3.91, P = 0.000; N1 = 55, N2 = 51. The data support that inbred marriages produce a greater frequency of genetically homogeneous ill ancestors, compared to exogamous marriages, and that pathogenic loci in ill ancestors from inbred marriages are more likely to be recessive homozygous.


Fig. 3.14 Distribution of H level of heterozygosity per locus (a) and inbreeding level F (b) in healthy subjects (1) and patients (2). SD, standard deviation; SE, standard error; X, average value

The founder effect, in conjunction with the features of marriage structure (endogamy, inbreeding, early marriage) and the specific age of manifestation of complex diseases, contributes to the accumulation of mutant genes of complex diseases in isolates. Traditions and customs of Dagestani people, specifically consanguineous marriages and the non-recognition of divorce even when a spouse develops an illness, contribute to the accumulation of ancestral pathogenic loci among such patients in the isolates. Fatal hereditary diseases that manifest within the first years of life prevent affected children from marrying or having children, so the harmful mutations gradually disappear from the population (with the deaths of affected children). Endogamy and inbreeding promote the manifestation of serious hereditary diseases, as most of these diseases are recessive. Adversities such as harsh environmental conditions in the high mountains, lack of medical care, and malnutrition in the history of alpine isolates do not favor the survival and health of these ill children. SCZ is chronic with periodic exacerbations, and such complex diseases therefore do not severely impact the viability of the carrier in the stable conditions of alpine isolates, compared to severe hereditary disorders (Bulayeva et al. 2000, 2002, 2003a, b, 2005, 2007, 2011; Bulayeva 2006; Bulayev et al. 2008, 2009, 2011). Previous studies showed the effect of inbreeding in Dagestani isolates on the accumulation of patients with schizophrenia, hypertension, depression, and mental retardation (Bulayev et al. 2008, 2011; Bulayeva et al. 2002, 2003a, b, 2000, 2005, 2007, 2011; Bulayev and Bulayeva 2001). Thus, analysis of the effect of inbreeding on the accumulation of patients with these diseases has shown that the risk of developing these complex diseases is much higher in offspring of inbred marriages than in descendants of exogamic marriages in the same populations.


Table 3.11 Median test of inbreeding level distribution in groups of patients and healthy subjects

| Inbreeding level values | Healthy subjects | Patients | Total |
|---|---|---|---|
| ≤ Median: Observed (obs) | 86 | 24 | 110 |
| Expected (exp) | 68.15 | 41.85 | |
| Obs − exp | 17.85 | −17.85 | |
| > Median: Observed (obs) | 28 | 46 | 74 |
| Expected (exp) | 45.85 | 28.15 | |
| Obs − exp | −17.85 | 17.85 | |
| Total: observed | 114 | 70 | 184 |

Notes: The difference between the two groups of healthy subjects and patients is significant by nonparametric Kruskal–Wallis test for independent variables “patients with schizophrenic spectrum” and “healthy subjects” depending on “level of inbreeding” variable: χ2 = 30.5, df = 1, P = 0.000
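The expected counts and the χ2 value of Table 3.11 can be reproduced from the observed counts alone. The sketch below assumes an ordinary Pearson chi-square on the 2 × 2 table without continuity correction; the statistical package actually used is not named in the text, so this is a check of the arithmetic rather than a reconstruction of the original procedure.

```python
import math

# Observed counts from Table 3.11.
# Rows: inbreeding <= median, inbreeding > median.
# Columns: healthy subjects, patients.
observed = [[86, 24], [28, 46]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count = row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi2 = sum(
    (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
    for i in range(2)
    for j in range(2)
)

# With 1 degree of freedom the upper tail of the chi-square
# distribution equals erfc(sqrt(chi2 / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

print([[round(e, 2) for e in row] for row in expected])
print(round(chi2, 1), p_value)
```

Under this assumption the expected counts match the tabulated 68.15, 41.85, 45.85, and 28.15, and the statistic rounds to the reported 30.5; the P value is far below 0.001, which is why it is printed as 0.000.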

We evaluated individual inbreeding values in order to determine differences in the level of inbreeding between sick and healthy individuals, as well as between groups of patients with different manifestation ages. We determined the marriage type of the direct ancestors of each member over the greatest possible number of generations retrospectively and calculated the individual inbreeding coefficient. If a subject's parents are first cousins, for example, the coefficient of inbreeding of that subject is 1/16 or 0.0625, and so forth. The analysis showed that the average coefficient of inbreeding is almost 2.7 times higher in schizophrenic patients, compared to healthy subjects (Table 3.11). The results show higher values of inbreeding than expected (Table 3.13). The number of patients with inbreeding at or below the median is lower than expected, and the observed number of healthy subjects with inbreeding at or below the median is higher than expected.

Genealogy fragments from one of our isolates confirmed the effect of inbreeding on the aggregation of schizophrenia (Fig. 3.15). The genealogy includes 89 people from 8 generations of descendants (240 years) and includes 19 patients with schizophrenia (13 are alive) and 5 possible patients with schizophrenia. Out of 5 generations in a nuclear family of this fragment, 8 descendants of 9 first cousin marriages developed paranoid schizophrenia. This genealogy fragment favors the accumulation of patients with schizophrenic pathology in homogeneous isolate families from generation to generation. Identification and study of pedigrees from isolates clearly demonstrate that the determination of schizophrenia involves genes that pass into the homozygous state during inbreeding, causing the manifestation of the clinical phenotype. Genetic-epidemiological studies of genetic isolates in Finland also showed this effect (Peltonen et al. 1997; Peltonen 2000; Ekelund et al. 1999). The results of the study of the effect of inbreeding on the accumulation of pathology show a statistically significant increase in the number of patients who descend from closer inbred marriages (Fig. 3.16).

We also studied the influence of inbreeding on the onset age of schizophrenia spectrum disorders. The findings of this study showed that the average onset age is 21.7 years in descendants of exogamous (inter-village and inter-ethnic) marriages and 17.57 years in descendants of inbred marriages (Fig. 3.17).


Fig. 3.15 Genealogy fragment of a primary isolate with a high frequency of cousin marriages and aggregation of paranoid schizophrenia

Fig. 3.16 The frequencies of the descendants of outbred and inbred marriages in groups of healthy subjects (N) and schizophrenia spectrum disorders patients (SCZ). Differences in the distribution between the groups are statistically significant: χ2 = 10.9, df = 1, p = 0.00096; Rs = −0.498, t = 3.721, p = 0.00058

Multivariate analysis of genetic discrimination was performed to identify genetic differences between groups of patients with different ages of onset, divided into three groups (17–20, 20–25, and 25–36 years), using 54 randomly selected genomic loci scanned on chromosomes 3, 17, and 18 (Fig. 3.18). The results of this analysis show a statistically reliable separation of the groups with early (17–20) and late (25–36) onset age within the principal components: χ2 = 89.7, df = 68, p = 0.040. In almost all of the three selected groups, the age of


Fig. 3.17 The distribution of age at onset of schizophrenia in groups of descendants of the different types of marriage


Fig. 3.18 Multivariate genetic analysis of patient groups with different age of onset within two main components II and I

manifestation falls in different clusters along the axes of the two main components; however, the greatest differences are between groups with early- and late-onset age of the disease (Fig. 3.18). This indicates specific differences in the genomic structure of patients with different ages of onset.

Clinical studies of patients from the examined isolates using DIGS, in addition to the diagnosis of the Republican Psychiatric Hospital, showed that primary isolates are characterized by 2–3 clinical phenotypes. Most often these are schizophrenic patients, “probable” schizophrenic patients (with mild symptoms at the time of the study), and a small number of patients with schizoaffective disorders. In secondary isolates, the number of clinical phenotypes typically ranges from 5 to 7, and, in addition to the above phenotypes, clinical phenotypes often include affective disorders, mental retardation, and/or congenital somatic pathologies. Analyzing the impact of demographic


Table 3.12 Analysis of recombination haplotype of chromosome 22 in primary and secondary isolates

| Position, Haldane, cM | Loci | Recombination fraction | Primary isolate Robs | Primary isolate Rexp | Secondary isolate Robs | Secondary isolate Rexp |
|---|---|---|---|---|---|---|
| 0.000 | D22S420 | 0.12959 | 1 | 26.696 | 10 | 11.663 |
| 15.000 | D22S345 | 0.08236 | 4 | 16.966 | 4 | 7.412 |
| 23.999 | D22S689 | 0.03844 | 0 | 7.919 | 1 | 3.460 |
| 27.999 | D22S685 | 0.03844 | 1 | 7.919 | 3 | 3.430 |
| 31.999 | D22S683 | 0.09063 | 2 | 18.670 | 5 | 8.157 |
| 41.998 | D22S445 | | | | | |
| Total | | | 8 | 78.2 | 23 | 34.2 |
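The recombination fractions in Table 3.12 are consistent with the Haldane map function applied to the distance between adjacent markers. The sketch below assumes that each fraction refers to the interval between the locus in its row and the next locus, which is an interpretation of the table layout rather than something the text states.

```python
import math

def haldane_recombination_fraction(distance_cm):
    """Haldane map function: r = 0.5 * (1 - exp(-2d)), with d in Morgans."""
    d = distance_cm / 100.0
    return 0.5 * (1.0 - math.exp(-2.0 * d))

# Marker positions (Haldane cM) from Table 3.12, D22S420 through D22S445.
positions_cm = [0.000, 15.000, 23.999, 27.999, 31.999, 41.998]

# One recombination fraction per interval between adjacent markers.
recomb_fractions = [
    haldane_recombination_fraction(right - left)
    for left, right in zip(positions_cm, positions_cm[1:])
]

# Table 3.12 lists 0.12959, 0.08236, 0.03844, 0.03844, 0.09063.
print([round(r, 5) for r in recomb_fractions])
```

Because the Haldane function assumes no crossover interference, the expected number of recombinants in an interval is simply this fraction multiplied by the number of informative meioses, which is one plausible way of arriving at an Rexp column.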

Table 3.13 Summary parameters of genetic heterogeneity of examined primary and secondary isolates

| Isolates | Average level of inbreeding F | Average level of heterozygosity Hobs | Average recombination level Rexp/Robs | Disease risk value LMR | The number of clinical phenotypes |
|---|---|---|---|---|---|
| Primary | 0.0108 | 0.652 ± 0.054 | 5.185 | 0.0312 | 2–3 |
| Secondary | 0.0067 | 0.691 ± 0.047 | 2.940 | 0.0195 | 5–7 |

age on recombination frequency in the search for genomic linkage yielded interesting results. Genome-wide haplotype analysis of the examined isolates using the computer package SIMWALK2 found a higher level of recombination in secondary isolate pedigrees than in those of primary isolates. Table 3.12 presents the results of such an analysis for chromosome 22 as an example.

The observed level of recombination (Robs) in the genealogy of the primary isolate is 8, the expected level (Rexp) is 78.2, and the Rexp/Robs index ratio = 9.8. In the genealogy of the secondary isolate, Robs = 23, Rexp = 34.2, and the Rexp/Robs index ratio = 1.5. In other words, in demographically ancient primary isolates with high levels of endogamy and inbreeding, the level of recombination is about three times lower than in demographically young isolates, a consequence of the greater genetic homogeneity of primary isolates, in which only a small number of ancestral alleles are combined during crossover. The Rexp/Robs index ratio of the primary isolate is 6.5 times greater than that of the secondary isolate. Recombination plays a key role in the mapping of pathogenic loci of complex diseases. The differences obtained in the recombination level reflect the need to address the demographic history of the populations in which experimental data are collected for mapping. Table 3.13 summarizes the

results of the study of genetic heterogeneity parameters of the examined primary and secondary isolates.

The table shows that primary isolates, in comparison to secondary isolates, have a high level of inbreeding, reduced clinical and genetic heterogeneity, and a smaller number of recombinations, which appear at the epidemiological level as large lifetime risk indicators of schizophrenia (and related spectrum disorders).

The results of genetic research in Dagestan isolates confirmed not only the existence of the genetic determination of the studied diseases but also its determination through alleles of identical origin, which express clinical phenotypes and are promoted by endogamy and inbreeding in combination with the founder effect.
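The chromosome 22 comparison between primary and secondary isolates rests on simple ratios of the totals in Table 3.12. As an illustrative check, with variable names chosen here for convenience, the figures quoted in the text can be recomputed as follows.

```python
# Totals for chromosome 22 from Table 3.12.
totals = {
    "primary": {"r_obs": 8, "r_exp": 78.2},
    "secondary": {"r_obs": 23, "r_exp": 34.2},
}

# Rexp/Robs index for each type of isolate.
index = {name: t["r_exp"] / t["r_obs"] for name, t in totals.items()}

# How many times fewer recombinants were observed in the primary isolate.
obs_ratio = totals["secondary"]["r_obs"] / totals["primary"]["r_obs"]

# Ratio of the two indices, computed from the rounded values quoted in
# the text and, for comparison, from the unrounded ones.
index_ratio_rounded = round(index["primary"], 1) / round(index["secondary"], 1)
index_ratio_exact = index["primary"] / index["secondary"]

print({name: round(value, 1) for name, value in index.items()})
print(round(obs_ratio, 1), round(index_ratio_rounded, 1), round(index_ratio_exact, 1))
```

The factor of 6.5 is obtained when the two indices are first rounded to 9.8 and 1.5; dividing the unrounded indices gives a slightly larger value, so the figure is best read as approximate.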

References

Ballantyne, K. N., Keerl, V., Wollstein, A., Choi, Y., Zuniga, S. B., Ralf, A., Vermeulen, M., de Knijff, P., & Kayser, M. (2011). A new future of forensic Y-chromosome analysis: Rapidly mutating Y-STRs for differentiating male relatives and paternal lineages. Forensic Science International: Genetics, 6(2), 208–218. doi:10.1016/j.fsigen.2011.04.017. Epub 2011 May 25.
Bertoncini, S., Bulayeva, K., Pagani, L., Ferri, G., Taglioli, L., Bulayev, O. A., Gurgenova, F. R., Semenov, I., Paoli, G., & Tofanelli, S. (2011). The dual origin of Tati-speakers from Dagestan as written in the genealogy of uniparental variants. Journal of Human Genetics. Published online in Wiley Online Library (wileyonlinelibrary.com).
Bertoncini, S., Bulayeva, K., Ferri, G., Pagani, L., Caciagli, L., & Taglioli, L. (2012). The dual origin of Tati-speakers from Dagestan as written in the genealogy of uniparental variants. American Journal of Human Biology, 24(4), 391–399.
Bulayev, O. A., & Bulayeva, K. B. (2001). Genetic-epidemiological study of cardio-vascular diseases in Daghestan highland isolates. In R. Gryglewski & P. Minuz (Eds.), Nitric Oxide: Basic research and clinical applications (NATO Science series, pp. 199–201). Amsterdam: IOS Press.
Bulayev, O. A., Gurgenov, F. R., Huseynov, U. M., & Bulayeva, K. B. (2011). Genes mapping of major recurrent depression in genetic isolates of Dagestan. Journal of Neurology and Psychiatry (named after Korsakov), 111(10), 62–69.
Bulayev, O. A., Pavlova, T. A., & Bulayeva, K. B. (2009). Role of inbreeding in aggregation of complex pathology. Genetics, 45(8), 1096–1104.
Bulayev, O. A., Spitcin, V. A., et al. (2008). Population approach to mapping genes of complex diseases. Medical Genetics, 4(3), 3–17.
Bulayeva, K. B. (1991). Genetic basis of human psychophysiology (p. 218). Moscow: Science.
Bulayeva, K. B. (2006). Overview of Genetic-epidemiology study in ethnically and demographically diverse isolates of Daghestan (Northern Caucasus, Russia). Croatian Medical Journal, 47(4), 641–648.
Bulayeva, K., Jorde, L., Ostler, C., et al. (2003a). Genetics and population history of Caucasus populations. Human Biology, 75(6), 837–853.
Bulayeva, K. B., Dubinin, N. P., Shamov, I. A., et al. (1985). Population genetics of Dagestan mountaineers. Genetics, 21(10), 1749–1758.
Bulayeva, K. B., Glatt, S. J., et al. (2007). Genome-wide linkage scan of schizophrenia: A cross-isolate study. Genomics, 89(2), 167–177.
Bulayeva, K. B., Jorde, L., Watkins, S., Ostler, C., Pavlova, T. A., Bulayev, O. A., Tofanelli, S., Paoli, G., & Harpending, H. (2006). Ethnogenomic diversity of Caucasus, Daghestan. American Journal of Human Biology, 18, 610–620.


Bulayeva, K. B., Pavlova, T. A., Kurbanov, R. M., et al. (2003b). Genetic and epidemiological studies in the mountainous Dagestan isolates. Genetics, 39(3), 413–422.
Bulayeva, K. B., Leal, S., Pavlova, T. A., et al. (2000). The ascertainment of schizophrenia pedigrees in Daghestan genetic isolates. Journal of Psychiatric Genetics, 5, 100–106.
Bulayeva, K. B., Leal, S. M., Pavlova, T. A., et al. (2005). Mapping genes of complex psychiatric diseases in Daghestan genetic isolates. American Journal of Medical Genetics Part B: Neuropsychiatric Genetics, 32(1), 76–84.
Bulayeva, K., Lencz, T., Glatt, S. J., Gurgenova, F., Takumi, T., & Bulayev, O. (2011). Genome-wide linkage scan of major depressive disorder in two Dagestan genetic isolates. Central European Journal of Medicine, 6(5), 616–624.
Bulayeva, K. B., Lencz, T., Takumi, T., Glatt, S. J., Gurgenova, F. R., Guseynova, U., et al. (2012). Mapping genes of early onset major depressive disorder in Dagestan genetic isolates. Turkish Journal of Psychiatry, 23(3), 161–170.
Bulayeva, K. B., Pavlova, T. A., Dubinin, N. P., et al. (1993). Phenotypic and genetic affinities among ethnic populations in Dagestan (Caucasus, USSR). A comparison of polymorphic, physical, neurophysiological and psychological traits. Annals of Human Biology (UK), 20(5), 455–467.
Bulayeva, K. B., Pavlova, T. A., Kurbanov, R. M., & Bulayev, O. A. (2002). Complex genetic diseases gene mapping in Dagestan isolates. Genetics, 38(11), 1539–1548.
Bulayeva, K. B., Pavlova, T. A., & Bulayev, O. A. (1997). Genetic polymorphism in the 3 populations of indigenous people of Dagestan. Genetics, 33(10), 1395–1405.
Burgarella, C., & Navascués, M. (2011). Mutation rate estimates for 110 Y-chromosome STRs combining population and father-son pair data. European Journal of Human Genetics, 19(1), 70–75. doi:10.1038/ejhg.2010.154. Epub 2010 Sep 8.
Caciagli, L., Bulayeva, K., Bulayev, O., Bertoncini, S., Taglioli, L., Pagani, L., Paoli, G., & Tofanelli, S. (2009). The key role of patrilineal inheritance in shaping the genetic variation of Dagestan highlanders. Journal of Human Genetics, 54, 689–694.
Cavalli-Sforza, L. L., & Bodmer, W. F. (1971). The genetics of human populations. San Francisco: Freeman. 974 p.
Dobzhansky, T. (1973). Genetic diversity and human equality (Vol. 12, p. 129). New York, NY: Basic Books.
Dubinin, N. P., & Bulaeva, K. B. (1982). Genetic bases of individuality in human populations. Doklady Akademii Nauk SSSR, 265(2), 470–473.
Dubinin, N. P., & Bulayeva, K. B. (1984). The comparative populational study of the genetic basis of the individual psychological differences. Psychologicheskii Jurnal (Journal of Psychology), 4, 95–108, Moscow: USSR Academy of Sciences.
Dubinin, N. P., Bulaeva, K. B., & Trubnikov, V. I. (1983). Variability and hereditability of neurodynamic and psychodynamic parameters in human populations. Russian J Genetika, 19(8), 1353–1363. Russian.
Ekelund, J., Lichtermann, D., Järvelin, M. R., & Peltonen, L. (1999, September). Association between novelty seeking and the type 4 dopamine receptor gene in a large Finnish cohort sample. American Journal of Psychiatry, 156(9), 1453–1455.
Enattah, N. S., Sahi, T., Savilahti, E., Terwilliger, J. D., Peltonen, L., & Järvelä, I. (2002). Identification of a variant associated with adult-type hypolactasia. Nature Genetics, 30(2), 233–237. Epub 2002 Jan 14.
Falconer, D. S. (1960). Introduction to quantitative genetics. Edinburgh, Scotland: Oliver and Boyd. 360 p.
Gindilis, V. M. (1979). Genetics of schizophrenic psychoses/Author’s abstract of doctor of biological sciences. Moscow: Medicine. 25 p.
Gindilis, V. M., Gainullin, R. G., & Shmaonova, L. M. (1989). Genetic and demographic patterns of distribution of various forms of endogenous psychoses. Genetics, 25(4), 734–743.
Gottesman, I. I., & Shields, J. (1972). Schizophrenia and genetics, A twin study vantage point. New York: Academic Press.
Harpending, H., & Eller, E. (1999). Human diversity and its history. In M. Kato (Ed.), Biodiversity (pp. 301–314). Tokyo: Springer.


Harpending, H., & Rogers, A. (2000). Genetic perspectives on human origins and differentiation. Annual Review of Genomics and Human Genetics, 1, 361–385.
Jorde, L., Rogers, A., & Bamshad, M. (1997). Microsatellite diversity and the demographic history of modern humans. Proceedings of the National Academy of Sciences of the United States of America, 94, 3100–3103.
Jorde, L. B. (2000). Linkage disequilibrium and the search for complex disease genes. Genome Research, 10, 1435–1444.
Karafet, T. M., Bulayeva, K. B., Bulayev, O. A., Gurgenova, F., Omarova, J., Yepiskoposyan, L., Savina, O. V., Veeramah, K. R., & Hammer, M. F. (2015). Extensive genome-wide autozygosity in the population isolates of Daghestan. EJHG. [Epub ahead of print]
Karafet, T. M., Bulayeva, K. B., Nichols, J., Bulayev, O. A., Gurgenova, F., Omarova, J., et al. (2016). Coevolution of genes and languages and high levels of population structure among the highland populations of Daghestan. Journal of Human Genetics, 61(3), 181–191.
Karafet, T. M., Mendez, F. L., Meilerman, M. B., Underhill, P. A., Zegura, S. L., & Hammer, M. F. (2008). New binary polymorphisms reshape and increase resolution of the human Y chromosomal haplogroup tree. Genome Research, 18(5), 830–838. doi:10.1101/gr.7172008.
Kayser, M., Kittler, R., Erler, A., Hedman, M., Lee, A. C., Mohyuddin, A., Mehdi, S. Q., Rosser, Z., Stoneking, M., Jobling, M. A., Sajantila, A., & Tyler-Smith, C. (2004). A comprehensive survey of human Y-chromosomal microsatellites. The American Journal of Human Genetics, 74(6), 1183–1197.
Kazima, B., Lesch, K.-P., Bulayev, O., Walsh, C., Glatt, S., Gurgenova, F., et al. (2015). Genomic structural variants are linked with intellectual disability. Journal of Neural Transmission, 122(9), 1289–1301.
Kruglyak, L. (1999). Genetic isolates: Separate but equal? Proceedings of the National Academy of Sciences of the United States of America, 96(4), 1170–1172.
Marchani, E. E., Watkins, W. S., Bulayeva, K., Harpending, H. C., & Jorde, L. B. (2008). Culture creates genetic structure in the Caucasus: Autosomal, mitochondrial, and Y-chromosomal variation in Daghestan. BMC Genetics, 9(47), 1–13.
Nasidze, I., Ling, E. Y., Quinque, D., Dupanloup, I., Cordaux, R., Rychkov, S., Naumova, O., Zhukova, O., Sarraf-Zadegan, N., Naderi, G. A., Asgary, S., Sardas, S., Farhud, D. D., Sarkisian, T., Asadov, C., Kerimov, A., & Stoneking, M. (2004). Mitochondrial DNA and Y-chromosome variation in the Caucasus. Annals of Human Genetics, 68(Pt 3), 205–221.
Nasidze, I., Schädlich, H., & Stoneking, M. (2003). Haplotypes from the Caucasus, Turkey and Iran for nine Y-STR loci. Forensic Science International, 137(1), 85–93.
Nei, M. (1978). Estimation of average heterozygosity and genetic distance from a small number of individuals. Genetics, 89(3), 583–590.
Novembre, J., Johnson, T., et al. (2008). Genes mirror geography within Europe. Nature, 456(7218), 98–101.
Peltonen, L. (2000). Positional cloning of disease genes: Advantages of genetic isolates. Human Heredity, 50(1), 66–75.
Peltonen, L., Jalanko, A., & Varilo, T. (1997). Molecular genetics of the Finnish disease heritage. Human Molecular Genetics, 8(10), 1913–1923.
Tofanelli, S., Ferri, G., Bulayeva, K., Caciagli, L., Onofri, V., Taglioli, L., et al. (2009). J1-M267 Y lineage marks climate-driven pre-historical human displacements. European Journal of Human Genetics, 17, 1520–1524.
Tsuang, M. T., & Faraone, S. V. (1995). The case for heterogeneity in the etiology of schizophrenia. Schizophrenia Research, 17(2), 161–175.
Xing, J., Watkins, W., et al. (2009). Fine-scaled human genetic structure revealed by SNP microarrays. Genome Research, 19(5), 815–825.
Wright, A. F., Carothers, A. D., et al. (1999). Population choice in mapping genes for complex diseases. Nature Genetics, 23(4), 397–404.
Zhivotovsky, L. A. (1984). Integration of polygenic systems in populations (p. 182). Moscow: Science.

Chapter 4 Mapping Genes of Schizophrenia in Selected Dagestan Isolates

4.1 Haplotype Analysis in Pedigrees Ascertained in the Isolates

Key methods for mapping genes of complex diseases include haplotype analysis of patients and healthy members of pedigrees and linkage analysis using nonparametric and parametric approaches. The next stages of our study, aimed at identifying candidate genes and the genomic mechanisms of pathogenesis, included detailed screening of linked regions using single nucleotide polymorphisms (SNPs) and structural variations of the genome in the form of CNVs and LOH. Indigenous people of Dagestani genetic isolates, who have lived in strict marital and geographical isolation over hundreds of generations, are suitable candidates for this mapping strategy. Ethnic groups and local populations of the region differ in the duration of their demographic history. Ethnic Kumyks and their local populations have lower levels of isolation and inbreeding than other, more ancient ethnic groups of the Dagestani mountain people, who traditionally have high levels of isolation and inbreeding (Gadzhiev 1971; Gadzhieva 1961; Bulayeva 1991). These population structures led to high genetic differentiation between the isolates and low genetic diversity within them (Bulayeva 1991; Bulayeva et al. 1993, 1996, 1997, 2003a, b).

We developed a new method of extended pedigree reconstruction in genetic isolates, drawing on our long-term experience of study in Dagestan. The method makes it possible to cover 300–700 members across 9–14 generations, which typically trace back to 2–6 ancestors. Depending on the total size of the isolate and the degree of family history (LMR), the number of living patients in pedigrees of the selected mountain isolates with high aggregation of the disease varies from 10 to 60. We often did not detect schizophrenia in neighboring mountain isolates; indeed, neighboring isolates may lack accumulation of any particular complex pathology (Bulayeva et al. 2000, 2003a, b, 2005).
© Springer International Publishing Switzerland 2016. K. Bulayeva et al., Genomic Architecture of Schizophrenia Across Diverse Genetic Isolates, DOI 10.1007/978-3-319-31964-3_4

Results of our epidemiology study indicated that the epidemiological index of lifetime morbid risk (LMR) for schizophrenia in Dagestani genetic isolates ranges from 0% (i.e., complete absence) to 5%, which is five times higher than the world population value of 1% (Bulayeva et al. 2000, 2002, 2003a, b, 2005). This aggregation of complex pathology, detected in certain Dagestani isolates in mountainous regions, reflects the specific demographic history of the studied isolate: a higher degree of genetic isolation over a longer demographic history produces a greater proportion of patients who inherited a haplotype with the pathogenic locus from a common ancestor. Such isolates could therefore make it possible to identify linkage between physical markers and the mutant gene involved in pathogenesis (Jorde et al. 2000, 2001).

Our study showed the value of haplotype analysis in mapping genes of a complex disease such as schizophrenia: it reveals considerably smaller regions, in the form of haplotype blocks in patients' genealogies, than linkage analysis does. Using haplotype and linkage analysis in combination is therefore important for mapping genes of complex disease. The results of this analysis clearly demonstrate locus and/or allele heterogeneity of schizophrenia between isolates and within pedigrees with common ancestors.

We performed haplotype analysis using the computer package SimWalk2, based on genotyped patients and healthy relatives. The program reconstructs the probable haplotypes of direct ancestors who are not available for genotyping. It is therefore possible to predict recombination events across generations; recombinations within haplotypes are marked with special icons. Figures 4.1 and 4.2 show genealogy fragments of isolate DGH005 with haplotypes on chromosomes 22 and 17. On chromosome 22, patients in this genealogy fragment differ from healthy subjects by the heterogeneous haplotype 6-6- in the 0–20 cM region from pter (D22S420 and D22S345 loci), which they received from a common ancestor, No. 25, eight generations ago (Fig. 4.1).
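The comparison described above, looking for a haplotype that patients share and healthy relatives lack, can be sketched in a few lines. This is only an illustration on a hypothetical pedigree fragment: the member labels, the haplotypes, and the helper functions are invented for the example, and the sketch does not reproduce the haplotype reconstruction that SimWalk2 performs.

```python
from collections import Counter

# Hypothetical pedigree fragment. Each member carries two haplotypes
# (one per chromosome copy), written as allele ranks at two adjacent STR loci.
members = {
    "P1": {"affected": True, "haplotypes": [("6", "6"), ("3", "5")]},
    "P2": {"affected": True, "haplotypes": [("6", "6"), ("4", "2")]},
    "P3": {"affected": True, "haplotypes": [("2", "5"), ("6", "6")]},
    "H1": {"affected": False, "haplotypes": [("3", "5"), ("4", "2")]},
    "H2": {"affected": False, "haplotypes": [("2", "5"), ("3", "7")]},
}


def carrier_counts(members, affected):
    """Count how many members of one affection status carry each haplotype."""
    counts = Counter()
    for member in members.values():
        if member["affected"] == affected:
            # set() so that a homozygous carrier is counted once.
            counts.update(set(member["haplotypes"]))
    return counts


def patient_specific_haplotypes(members):
    """Return haplotypes carried by every patient and by no healthy member."""
    in_patients = carrier_counts(members, affected=True)
    in_healthy = carrier_counts(members, affected=False)
    n_patients = sum(1 for m in members.values() if m["affected"])
    return [
        hap
        for hap, n in in_patients.items()
        if n == n_patients and hap not in in_healthy
    ]


shared = patient_specific_haplotypes(members)
print(shared)
```

In real pedigrees the criterion is rarely this strict, since recombination shortens the ancestral block in some carriers and some carriers are unaffected.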
Mental disorders in related members of the isolates can often be traced to particular direct ancestors of modern patients who, after simulation and reconstruction of haplotypes with SIMWALK, prove to be carriers of the haplotype themselves. All carriers of this haplotype are marked with points in Fig. 3.18; confirmed patients, diagnosed in psychiatric hospitals in Dagestan and in our expedition studies, are marked in solid color. A survey of relatives using the FIGS questionnaire indicates a substantial overlap between carriers of haplotype 6-6-, as reconstructed by computer simulation, and members with possible schizophrenia spectrum pathologies (tinted). All patients in the genealogy have common founders and usually are descendants of marriages between close relatives.

The chromosome 17 haplotype D17S1308-D17S1298 in the genealogy fragment from isolate DGH005 is homogeneous for alleles 44-44- in the 0–20 cM region from pter (Fig. 4.2). The current generation of patients inherited this haplotype block from a common ancestor who lived eight generations (240 years) ago (Fig. 4.2). The ancestors in this case possess the same haplotype, which points to numerous consanguineous marriages in previous generations. This is consistent with the well-known archeological antiquity of these isolates (about 5000 years) and with the previously identified high frequency of first-cousin marriages in ethnic Laks compared to other Dagestani ethnic groups (Gadzhiev 1971; Bulayeva 1991). The antiquity of this isolate, in combination with endogamy and inbreeding, contributed to gene pool homogeneity, which is particularly evident in this region of chromosome 17.

Fig. 4.1 Haplotypes of chromosome 22 in the genealogy fragment DGH005. The sequence of chromosome loci: D22S420, D22S345, D22S689, D22S685, D22S683, D22S445

Fig. 4.2 Haplotypes of chromosome 17 in the genealogy fragment DGH005. The sequence of loci: D17S1308, D17S1298, D17S974, D17S1303, D17S947, D17S2196, D17S1294

The average number of alleles on chromosome 17 in this isolate is 5.1, and the average heterozygosity is 0.585 ± 0.051, whereas in secondary isolate DGH034 these values are 6.2 and 0.692 ± 0.044, respectively. In SCZ cases from primary isolate DGH005, the frequency of allele 312 (rank = 4) of the D17S1308 locus is significantly higher (0.682) than in patients from secondary isolate DGH011 (0.382). The same is true for allele 4 of locus D17S1298. These results suggest that the risk alleles within the same STR loci of a haplotype block associated with the same clinical phenotype vary between genetic isolates.
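Allele frequencies and heterozygosity of the kind compared above are simple functions of the genotype counts. The sketch below uses hypothetical genotypes at a single STR locus; the expected heterozygosity applies the small-sample correction of Nei (1978), and it is an assumption of this example, not a statement from the text, that the chapter's values were computed in this way.

```python
from collections import Counter

# Hypothetical genotypes at one STR locus (allele sizes), one tuple per person.
genotypes = [(312, 312), (312, 308), (312, 316), (308, 312), (312, 312)]


def allele_frequencies(genotypes):
    """Relative frequency of each allele among all chromosome copies."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: count / total for allele, count in counts.items()}


def observed_heterozygosity(genotypes):
    """Share of individuals whose two alleles differ."""
    return sum(1 for a, b in genotypes if a != b) / len(genotypes)


def expected_heterozygosity(genotypes):
    """Unbiased gene diversity: 2n / (2n - 1) * (1 - sum of squared frequencies)."""
    n = len(genotypes)
    freqs = allele_frequencies(genotypes)
    return (2 * n / (2 * n - 1)) * (1 - sum(p * p for p in freqs.values()))


freqs = allele_frequencies(genotypes)
h_obs = observed_heterozygosity(genotypes)
h_exp = expected_heterozygosity(genotypes)
print(f"frequency of allele 312: {freqs[312]:.3f}")
print(f"observed heterozygosity: {h_obs:.3f}")
print(f"expected heterozygosity: {h_exp:.3f}")
```

Averaging such per-locus values over all genotyped loci of a chromosome gives summary figures comparable in form to those reported for DGH005 and DGH034.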


Fig. 4.3 Haplotype of chromosome 22 in the fragment of genealogy DGH064. Loci sequence (see in Fig. 3.18)

Haplotypes of chromosome 22 in patients from isolate DGH034, as in isolate DGH005, differ from those of healthy people in the 0–20 cM region from pter (Figs. 3.18, 4.3), suggesting a common physical localization of the pathogenic loci. Pedigrees from these two isolates differ by alleles of the same loci, reflecting allelic heterogeneity of the pathogenic loci: patients in isolate DGH034 predominantly have the heterogeneous haplotype 5-7- at the D22S420 and D22S345 loci, whereas in isolate DGH005 the haplotype at the same loci is 6-6- (Fig. 3.18). The bimodality of the haplotype in patients is due to allele 16 of the D22S683 locus, found in isolate DGH034. This allele is absent in healthy members of the genealogy (Fig. 4.2). In the fragment of the extensive genealogy in isolate 6043, patients differ from healthy subjects on chromosome 17 by the homogeneous haplotype 44-44 in the 0–20 cM region from pter (D17S1308 and, especially, D17S1298 loci) (Fig. 4.4), which the modern generations of patients inherited from common ancestors who lived seven generations (210 years) ago. According to members of the genealogy, both of these ancestors were cousins or siblings, which explains the similarity of their genomes (Fig. 4.4).


Fig. 4.4 Haplotypes of chromosome 17 in the fragment of genealogy DGH064. STR loci (see in Fig. 4.2)

In this genealogy fragment, 21 people were examined, including 9 registered patients. The ancestral haplotype block 44-44 is carried in full by 6 of the 9 patients, and these carriers display symptoms of paranoid schizophrenia. The remaining three carry a smaller haplotype block, most likely as a result of recombination events (Fig. 4.4). The chromosome 22 haplotype in isolate DGH022 did not differentiate between patients and healthy subjects. Analysis of chromosome 17 haplotypes in a genealogy fragment from isolate DGH022 found a match with the haplotypes of isolates DGH005 and DGH034 at allele 4 of the D17S1298 locus: patients in DGH022 had the 4-4- haplotype in the region of 10–20 cM from pter (D17S1298 and D17S974) (Fig. 4.5). At the D17S974 locus the haplotype is homozygous (-4-4-), and current patients received it from a common ancestor who lived five generations ago.

Fig. 4.5 Haplotypes of chromosome 17 in the genealogy fragment DGH022. The letter “Y” marks the genealogy members with genome-wide scanned microsatellites

Significant differences in allelic and locus haplotype homogeneity on chromosome 22 are found in isolate DGH011. In contrast to isolates DGH005, DGH022, and DGH034, patients in isolate DGH011 differ from healthy people by haplotype -5-20- in the region of 30–50 cM (D22S685 and D22S683 loci) (Fig. 4.6).


Fig. 4.6 Haplotypes of chromosome 22 in the fragment of genealogy DGH011. For sequence of loci, see in Fig. 3.18

Allelic and locus homogeneity is found between isolates DGH005, DGH022, and DGH034 on chromosome 17. In the three mentioned loci, isolate DGH034 (Fig. 4.6) has the block haplotype -3-3-4-, obtained from an ancestor who lived seven generations ago. Almost all current and possible patients in this genealogy carry haplotype -3-3-4- (Fig. 4.6). In the genealogy from isolate DGH022, patients carry haplotype -3-4- at the D6S1959 and D6S2439 loci in the same region, which distinguishes them from healthy members of the genealogy. In isolate DGH011, 7 out of 9 affected members have haplotype -4-5- at the same D6S1959 and D6S2439 loci, while the remaining two have haplotype -4-6- (Fig. 4.7). Healthy members of this genealogy do not have alleles 4 and 5. Haplotype analysis of the genealogy from isolate DGH005 revealed no specific haplotype blocks in patients.


Fig. 4.7 Haplotypes of chromosome 6 in genealogy fragment DGH064. The sequence of loci: D6S1959, D6S2439, D6S2427

We identified differences and similarities between populations on the chromosome 6 haplotype. Heterogeneity exists in healthy subjects in these isolates, in alleles 5, 3, and 4. Homogeneity, however, is established between isolates DGH022 and DGH034 in the haplotype block encompassing the D6S1959 locus.


4.2 Genome-Wide Nonparametric Linkage Analysis of Schizophrenia in Selected Isolates

Parametric linkage analysis is often used to measure the linkage of pathogenic loci of complex diseases. Such analysis requires an exact genetic model of inheritance to be specified. Model-free nonparametric linkage (NPL) analysis can also measure linkage and is often used at the first stage of a linkage search (Lander and Botstein 1986; Andrews et al. 1987; Giuffra and Kidd 1989; Barr et al. 1994; Sobel and Lange 1996; Kendler et al. 2000). Based on the results of the nonparametric analysis, an adequate inheritance model can be selected for subsequent parametric linkage analysis. Although nonparametric analysis is less powerful than the parametric method with a correctly formulated genetic model, nonparametric linkage analysis is advisable when the model is unknown (Risch and Giuffra 1992). Statistics obtained during the nonparametric analysis using SIMWALK2 contribute to the choice of a genetic model for further parametric analysis (for details, see Chap. 2).

Numerous linkage studies show that highly penetrant mutations that cause complex diseases such as schizophrenia are rare (Bray 2001). Despite the difficulty in identifying disease genes, the data suggest linkage for schizophrenia, in particular on chromosomal regions 22q11-q12, 6p24-p22, 8p22-p21, 6q, 13q14.1-q32, 5q21-q31, 10p15-p11, 1q21-q22, and 18p (Bray 2001). Our studies of Dagestan isolates also identified positive linkage signals in chromosomal regions 1q32, 2q24, 10p, and 22q, confirming the above results (Bulayeva et al. 2002, 2005). Those results, however, were obtained for selected chromosomal regions in several pedigrees of Dagestani isolates. We therefore conducted a systematic genome-wide analysis of genealogies from four genetic isolates, using the same clinical diagnostic methods and the same genomic scanning design (Table 4.1).
Nonparametric linkage analysis using SimWalk2 provided estimates of statistics A, B, C, D, and E. These five statistics assess the clustering among patients of alleles that descend from a common ancestor and determine the clinical phenotype.

Statistic A reflects the number of ancestral alleles involved in disease manifestation and is the most effective for identifying linkage with recessively inherited clinical phenotypes.

Statistic B reflects the maximum number of common alleles that patients receive from the common ancestor and is the most effective for identifying linkage with dominantly inherited clinical phenotypes.

Statistics C, D, and E indicate the presence of a limited number of alleles, derived from the common ancestor of the patients, that determine the specific disease.

Table 4.1 Nonparametric linkage with schizophrenia spectrum disorders in the genealogy of 4 genetic isolates
Column groups: Primary isolates; Secondary isolate
Columns: Map; DGH005 (Laks); DGH022 (Dargins); DGH064 (Tindals); DGH011 (Dargins)
1q32-q42: 1.3–1.24 (B, D)*; 1.3–1.5 (B, C, D, E)*; 1.5 (B, D, E)
2q21-q24: 1.3–1.2 (B, E)
3p13: 1.13 (B, D, E); 1.2–1.5 (B, D, E)
4p15.3: 1.5–1.7 (A, C)*
4q35: 1.43–1.5 (A, C)*
5q31-35: 1.7–1.4 (B, D, E)*
6p21-p24: 1.2 (A); 1.3–1.4 (A, C); 1.3 (A)
8p11-p23: 2.0–1.7 (B, D, E)*
10q22-25: 1.3 (A)
11p15: 1.3–1.03 (B, E); 1.3–1.1 (D, E)
12q24.2-24.3: 1.6 (B, E); 1.3–1.8 (A, D, E)
13q12: 1.13–1.22 (A, C)
17p11.q12: 1.3–1.4 (A, C); 1.6–2.3 (A, C, E); 1.3–1.2 (A, C); 1.2–1.4 (B, D, E)
18p11-q12.1: 1.2 (A, C); 1.2–1.3 (B, D, E); 1.5–1.7 (B, D, E)
22q11.23: 1.5–1.7 (A, C)*; 2.0–2.3 (A, C)*; 2.2 (B, D, E)
22q12.3
17 in 14 Chrs: 4; 5; 12; 7
Note: The threshold for a statistically significant linkage magnitude (P < 0.05) is 1.3

Our nonparametric linkage analysis results reveal interpopulation similarities and differences, both in the quantitative linkage values and in the recessive (A) or dominant (B) inheritance pattern of the pathogenic locus (Table 4.1). The 1q32-42 region shows linkage with schizophrenia and dominant inheritance of the pathogenic locus in the pedigrees of the three studied ethnic Dargin and Tindal isolates. Similar linkage and inheritance modes of pathogenic loci are also detected in ethnic Dargin and Tindal isolates in the 3p13-3q13 region. The 17p11.2-12 region has statistically significant linkage in all four isolates. Inheritance of the pathogenic locus is recessive in three isolates and dominant in the fourth. The genome-wide nonparametric analyses of the four isolates indicate that primary isolates generally have greater genetic homogeneity in the genomic loci linked with SCZ than secondary isolates. Of the 17 genomic loci linked with SCZ across all four genetic isolates, primary isolates had on average 4.5 linked loci, while the mean number of linked loci in secondary isolates is 9 (Table 3.13); that is, primary isolates showed fewer genomic regions linked with SCZ than secondary isolates.
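The significance threshold given in the note to Table 4.1 can be checked with a short calculation. The sketch below assumes that the tabulated linkage magnitudes are −log10 of the empirical p-value, which is how the nonparametric values are described later in the chapter. The helper names are illustrative, and reading statistic A as a recessive signal and statistic B as a dominant one is a simplification of the description above, not SimWalk2 output.

```python
import math

# Threshold quoted in the note to Table 4.1.
THRESHOLD = 1.3


def neg_log10(p_value: float) -> float:
    """Convert a p-value into a -log10(p) linkage magnitude."""
    if not 0 < p_value <= 1:
        raise ValueError("p-value must lie in (0, 1]")
    return -math.log10(p_value)


def inheritance_hint(statistics) -> str:
    """Rough reading of which NPL statistics were significant (illustrative rule)."""
    stats = set(statistics)
    if "A" in stats and "B" not in stats:
        return "recessive"
    if "B" in stats and "A" not in stats:
        return "dominant"
    return "undetermined"


# P = 0.05 corresponds to a magnitude of about 1.3.
print(f"-log10(0.05) = {neg_log10(0.05):.3f}")

# A trend at P = 0.10 stays below the threshold.
print(neg_log10(0.10) >= THRESHOLD)

# Statistic sets taken from the 17p row of Table 4.1.
print(inheritance_hint({"A", "C", "E"}))
print(inheritance_hint({"B", "D", "E"}))
```

On this reading, the 17p entries marked (A, C) or (A, C, E) point to recessive inheritance and the entry marked (B, D, E) to dominant inheritance, in line with the three-to-one split described in the text.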

4.3 Genome-Wide Parametric Linkage Analysis of Schizophrenia in Selected Isolates

Pedigrees from the isolates are then analyzed with the parametric method of genome-wide linkage search to specify the inheritance type of the pathogenic locus, its level of incomplete penetrance, and the degree of genetic heterogeneity of the disease locus 0.05. Table 4.2 presents the results of the parametric genome-wide linkage search in pedigrees from the isolates.

Table 4.2 Parametric linkage analysis with schizophrenia spectrum disorders in pedigrees of 4 genetic isolates
Isolates: DGH005 (Laks); DGH022 (Dargins); DGH064 (Tindals); DGH011 (Dargins)
Each entry: LOD, R/M or D/M, flanking loci (peak, cM)
1p13.3-q23.3: 1.73, D/M, D1S1653-D1S1679 (160); 1.87, D/M, D1S3723-D1S534 (148)
1p35.2-p36.1: 1.7, R/M, D1S552-D1S1622 (46)
2q36.3-q37.1: 1.5, D/M, D2S1363-D2S427 (225)
2p16.3-p23.2: 1.5, R/M, D2S1356-D2S1352 (62); 2.3, D/M, D2S405-D2S1788 (41); 3.1, D/M, D2S1788-D2S1356 (57)
3p22.3-p23: 1.8, D/M, D3S2432-D3S1768 (48)
3q28-q29: 1.9, D/M, D3S2418-D3S1311 (209.7)
4q35.1-q35.2: 2.3, R/M, D4S408-D4S1652 (188)
5p13.3-p14.3: 1.9, D/M, D5S1501-D5S1725 (94); 2.0, D/M, D5S2848-D5S1470 (52)
5q35.1-q35.2: 1.3, R/M, D5S1456-D5S211 (180)
6p21.2-p23: 2.3, D/M, D6S1959-D6S2439 (32); 3.0, D/M, D6S1959-D6S2439 (29); 4.3, D/M, D6S2439-D6S2427 (38); 2.4, D/M, D6S2434-D6S1959 (17)
8p23.1-p23.3: 1.56, R/M, D8S264-D8S277 (3); 1.6, R/M, D8S264-D8S277 (3)
9p21.3-p22.2: 2.6, R/M, D9S925-D9S1121 (12.4)
10q26.12-q26.3: 2.4, D/M, D10S1230-D10S1213 (136); 1.96, D/M, D10S1213-D10S1248 (153); 2.7, R/M, D10S1230-D10S1213 (136)
10p11.21-p11.23: 2.4, D/M, D10S1426-D10S1208 (56)
11p15.4-p15.5: 2.1, R/M, ATA34E08-D11S1392 (34)
11q23.1-q24.3: 2.7, R/M, D11S1998-D11S4464 (116); 2.1, R/M, D11S1998-D11S4464 (113)
12q24.23-q24.33: 1.6, D/M, D12S2078-D12S1045 (149); 3.1, R/M, D12S395-D12S2078 (141)
13p11-p12: 1.4, D/M, D13S787-D13S1493 (3)
17p12-p13.2: 2.5, R/M, D17S1298-D17S974 (13); 3.7, R/M, D17S1303-D17S947 (27.5); 1.97*, R/M, D17S974-D17S1303 (21); 3.2, D/M, D17S1294-D17S1293 (52)
18p11.31-q12.1: 1.5, R/M, D18S481-D18S976 (21); 1.98, R/M, D18S481-D18S976 (10.5); 3.00, D/M, D18S542-D18S877 (57)
19q13.31-q13.42: 1.92, R/M, D19S246-D19S589 (77); 1.6, R/M, D19S178-D19S246 (47); 1.6, R/M, D19S178-D19S246 (67); 2.23, R/M, D19S433-D19S245 (67)
21q22.13-q22.2: 2.5, D/M, D21S1440-D21S2055 (42); 2.5, D/M, D21S1440-D21S2055 (36)
22q11.2-q12.1: 3.2, D/M, D22S420-D22S345 (3); 4.6, D/M, D22S420-D22S345 (3)
22q12.3-q13.1: 2.9, D/M, D22S683-D22S445 (32); 3.5, D/M, D22S683-D22S445
LOD > 3.0: 1; 2; 2; 5
1.3 < LOD < 2.9: 8; 5; 13; 7
Total (25): 9; 7; 15; 12
Recessive inheritance of disease loci: 4 (44%); 3 (43%); 9 (60%); 4 (33%)
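The summary rows of Table 4.2 are internally consistent, which can be verified directly. A minimal sketch, using only the counts printed in the table; the dictionary names are illustrative.

```python
# Counts per isolate taken from the summary rows of Table 4.2.
significant = {"DGH005": 1, "DGH022": 2, "DGH064": 2, "DGH011": 5}   # LOD > 3.0
suggestive = {"DGH005": 8, "DGH022": 5, "DGH064": 13, "DGH011": 7}   # 1.3 < LOD < 2.9
total = {"DGH005": 9, "DGH022": 7, "DGH064": 15, "DGH011": 12}
recessive = {"DGH005": 4, "DGH022": 3, "DGH064": 9, "DGH011": 4}

# The two LOD classes add up to the reported total for every isolate.
totals_match = all(
    significant[name] + suggestive[name] == total[name] for name in total
)

# Share of linked loci with recessive inheritance, in whole percent.
recessive_share = {
    name: round(100 * recessive[name] / total[name]) for name in total
}

print(totals_match)
for name, share in recessive_share.items():
    print(f"{name}: {recessive[name]} of {total[name]} loci recessive ({share}%)")
```

The computed shares reproduce the percentages printed in the last row of the table.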

Primary Isolate DGH005 Parametric analysis in the ancient Lak isolate yielded nine genomic regions linked to schizophrenia spectrum disorders. A significant LOD was found in 22q11 (Table 4.2). We identified suggestive signals in the 1p12-q23, 2q24-q31, 3q28, 5q14, 10q26, 11p15, 7p13.2-q12, and 19q13 regions. Nonparametric analysis of the DGH005 pedigree on chromosome 17 did not show statistically significant linkage, although a trend (P = 0.10) was observed at the first two loci of the chromosome. The earlier haplotype analysis of the same loci showed haplotype homogeneity in patients (Table 3.8 and Fig. 3.11). Parametric analysis showed that this haplotype is linked with LOD = 2.5, which justifies the chosen model of recessive inheritance of the pathogenic locus with incomplete penetrance (Table 3.11). Four of the nine linked genomic loci were inherited recessively (Table 3.11).

Primary Isolate DGH022 In the pedigree of primary isolate DGH022, nonparametric linkage analysis found significant linkage in the 17p11.2-p12 region with the D17S1303 and D17S947 loci (Bulayeva et al. 2002, 2005). The values of −log10(p-value) in the nonparametric analysis for this region are 1.54–2.1 (for statistics A and C) (Table 3.10), which favors recessive inheritance of the pathogenic locus. Further parametric linkage analysis in pedigree DGH022 under a recessive inheritance model with incomplete penetrance raised the LOD to 3.73 (Table 3.11), confirming significant genetic linkage of the studied clinical phenotype with the D17S947 locus and supporting the chosen model of recessive inheritance of the pathogenic locus. In this isolate we also found significant linkage with the studied pathology in the 6p21-p23 region (LOD = 3.0) and suggestive linkage in the 1p13.3-q23.3, 2q36-q37, 18p11-p12, and 19q13 regions (Table 4.2).
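The LOD values discussed here are base-10 logarithms of a likelihood ratio, so a LOD of 3 corresponds to odds of 1000:1 in favor of linkage. The following is a deliberately simplified sketch for phase-known, fully informative meioses at a single recombination fraction; it is not the pedigree likelihood with incomplete penetrance used in the parametric analysis described in the text. The example counts are hypothetical, and the class boundaries follow the summary rows of Table 4.2, with LOD = 3.0 treated as significant as in the text (an assumption of this sketch).

```python
import math


def two_point_lod(recombinants: int, meioses: int, theta: float) -> float:
    """LOD = log10[ L(theta) / L(0.5) ] for phase-known informative meioses."""
    if not 0 < theta <= 0.5:
        raise ValueError("theta must lie in (0, 0.5]")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must be between 0 and meioses")
    non_recombinants = meioses - recombinants
    log_l_theta = (
        recombinants * math.log10(theta)
        + non_recombinants * math.log10(1 - theta)
    )
    log_l_null = meioses * math.log10(0.5)
    return log_l_theta - log_l_null


def classify(lod: float) -> str:
    """Classes used in the summary rows of Table 4.2 (boundary handling assumed)."""
    if lod >= 3.0:
        return "significant"
    if lod >= 1.3:
        return "suggestive"
    return "not linked"


# Hypothetical example: 1 recombinant in 12 informative meioses, theta = 0.1.
lod = two_point_lod(recombinants=1, meioses=12, theta=0.1)
print(f"LOD = {lod:.2f} ({classify(lod)})")

# LOD values reported in the text for 17p in DGH005 and DGH022.
for reported in (2.5, 3.73):
    print(reported, classify(reported), f"odds about {10 ** reported:.0f}:1")
```

In practice the LOD is maximized over the recombination fraction and computed on the whole pedigree by linkage software, so the closed form above only illustrates the scale of the statistic.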
Three of the seven linked pathogenic loci were inherited recessively (Table 3.11).

Secondary Isolate DGH034. Compared with the primary isolates above, the number of genetic linkages detected by both parametric and nonparametric analysis is higher in the pedigree of isolate DGH034, which supports higher genetic heterogeneity of the schizophrenia pathogenic loci in this isolate (Tables 3.10 and 3.11). The most notable LODs for this pedigree, 4.3 and 4.6, were established in the 6p22 and 22q11.23-q12.3 regions (Table 3.11). We obtained an unusual bimodal LOD curve at 22q in this isolate: the first peak, with LOD = 4.6, is located at 3 cM in the 22q11.23 region, and the second peak, with LOD = 2.9, is in the 22q12.3 region. This result agrees with the haplotype analysis, which showed bimodality of the haplotype in patients due to haplotype block 5-7- in the 22q11.23 region and allele 16 at the D22S683 locus (22q12.3). Clinical heterogeneity and related genetic heterogeneity in the pedigree of isolate DGH034 may explain the observed bimodality of haplotype and genomic linkage on chromosome 22. Our clinical studies, using the medical records of regional hospitals and the DIGS translated into Russian, showed that the clinical phenotype of patients in the pedigree of isolate DGH034 varies slightly. One phenotype displays clear symptoms of paranoid schizophrenia without obvious neurological comorbidities. Patients with this phenotype are characterized by haplotype 5-7- and LOD = 4.6, localized in 22q11.23. The other phenotype displays a combination of paranoid schizophrenia and varying degrees of intellectual disability (ID). Patients of this type demonstrated genotype 16 in the 22q12.3 region. Linkage analysis with the specified clinical diagnosis in a large pedigree branch from isolate DGH034 did not indicate significant linkage with chromosome 17. Given the bimodality that we found in clinical phenotypes and in the related linkages at 22q11.23-q12.3, we separated out a smaller part of the pedigree with clear symptoms of paranoid schizophrenia. Identifying smaller fragments in the pedigree required consideration of the haplotype analysis results (see Figs. 3.10, 3.13a, b). This process revealed a reliable linkage peak in the 17p11 region at the D17S1303 and D17S947 loci by both nonparametric and parametric analyses (Tables 3.10, 3.11).

Secondary Isolate DGH011. Isolate DGH011 is located in a relatively accessible high-plateau zone of Dagestan and is classified as a secondary isolate because it was formed approximately 700 years ago by migrants from several remote highland villages. Our study of demographic history and marriage structure showed three large kindreds in isolate DGH011 that resulted from migrations to this isolate from three highland villages. One kindred predominantly aggregates paranoid schizophrenia; another predominantly aggregates intellectual disability (ID). Our reconstructed extended pedigree showed that these two kindreds had limited numbers of marriages between them; we observed an increased number of inter-kindred marriages only in the last 3–4 generations. The symptoms of these two clinical phenotypes differ. Regional psychiatrists diagnose the ID phenotype during early childhood if a patient presents significant socialization and learning difficulties.
In the second kindred, the age of manifestation of schizophrenia was typical for Dagestani isolates; according to hospital records, schizophrenia manifested at 15–19 years in this kindred, with one case manifesting at 27 years. Despite its large population of residents from different highland villages, isolate DGH011 has a large number of consanguineous marriages, including first-cousin marriages. This isolate shows greater genetic heterogeneity, which is reflected in greater clinical heterogeneity and a broader range of clinical entities that burden the villagers. We described the results of linkage analyses and structural genomic variants found in ID patients of the other kindred elsewhere (Bulayeva et al. 2015). When members of these two kindreds married (inter-kindred marriage) in the presence of ID or schizophrenia aggregation, their offspring expressed a combination of both complex diseases. In total, our genome-wide linkage scan in this pedigree showed LOD ≥ 3 in five genomic regions (2p22-p21, 12q24.23-q24.32, 17p13-q12, 18p11-q12, and 22q12.3) of the 12 genomic regions linked with schizophrenia. Four of the 12 linked regions were inherited recessively (Table 3.11).

Parametric analysis of isolates DGH011 and DGH034 showed higher heterogeneity of the genomic loci linked to schizophrenia than in the primary isolates (Table 3.11). We detected the second linkage peak in isolate DGH034 in the same 22q12.3 region where we obtained the LOD peak in isolate DGH011. A detailed analysis of the clinical phenotypes in isolate DGH011 shows that some schizophrenia patients have cognitive impairments, as do patients with genotype 16 from isolate DGH034, and that these impairments were not diagnosed earlier in their medical history at regional hospitals.

The results obtained in our studies for some linked regions confirm previous publications on gene mapping of schizophrenia spectrum disorders. These studies, for instance, found that the 3p13-3q13, 5q31-35, and 11q23-24 regions we identified are associated with the dopamine receptor genes DRD3, DRD1, and DRD2 (Lu et al. 1996; Hill et al. 1998; Prasad et al. 2002; Golimbet et al. 1998, 2001). The 13q12 and 17p11.2-12 regions may be associated with genes that cause abnormalities in the serotonin pathway in schizophrenia patients (5HTR2A, 5HTT-SERT) (Nothen et al. 1993; Sidenberg et al. 1993; Erdman et al. 1996; Alfimova et al. 2003; Golimbet et al. 2003). Several studies found linkage of the 4q35 region mainly with schizoaffective and bipolar disorders (Blair et al. 2005; Pickard et al. 2004). The data support the existence of sub-telomeric deletions in this region, reflected in a complex clinical phenotype of mental retardation and schizoaffective disorders (Pickard et al. 2004). Previous studies show schizophrenia linkage with 8p11-23 and the SCZD6 locus localized in this region (Kendler et al. 1996, 2000; Blouin et al. 1998). It is possible that the established linkage is due to gene activity; confirming this, however, would require mutation screening of the linked region, and a number of studies have not supported linkage results in this region (Prasad et al. 2002). Linkage of the 17p region with schizophrenia spectrum disorders has been noted in only a few publications, including ours (Bulayeva et al. 2000; Freedman et al. 2001; Golimbet et al. 2004).
Our finding of linkage with schizophrenia, LOD > 3 at 17p11.2-q12, in one Dagestani genetic isolate was supported by publications on ethnic-racial groups of American, European, and African descent (Freedman et al. 2001; Owen et al. 2004). A LOD of 2.54 at 17q was found in schizophrenia patients with manifestation ages younger than 45 years (Cardno et al. 2001). The pathophysiological mechanisms behind the linkages identified in this region may be associated with abnormalities of neurotransmitters involved in serotonin metabolism in mood disorders, schizophrenia, and other mental disorders (Gelernter et al. 1995). Several groups of researchers proposed that schizophrenia genes are localized in the 22q11 region (Karayiorgou et al. 1994; Pulver et al. 1994; Eliez et al. 2001a, b; Glatt et al. 2003). Evidence of linkage in this region, however, was not significant and was not confirmed in other studies (Kalsi et al. 1995; Riley et al. 1996; Parsian et al. 1997). In these studies, schizophrenia spectrum disorders were found to be linked with the 22q region, where the COMT gene (116790) is located.

The remaining linkage results obtained from these pedigrees are important for further studies. Weak signals of possible linkage can be important for complex diseases such as schizophrenia. Such signals may be due to insufficient pedigree sample size, or they may require a deeper genome scan of the region concerned; in future studies we need to extend the pedigrees and/or analyze patient genomes in detail using a denser set of DNA markers. As was shown in Chap. 1, researchers rarely confirm the genomic regions found in other studies because of the high genetic heterogeneity of the disease itself and the presence of incomplete penetrance and phenocopies.
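The LOD values quoted throughout this section compare the likelihood of the data under linkage at recombination fraction θ with the likelihood under free recombination (θ = 0.5). As a rough illustration only, the sketch below computes a two-point LOD for the simplest textbook case: phase-known, fully informative meioses with complete penetrance. This is an assumption made for clarity and is not the model used in the analyses above, which allowed for incomplete penetrance in extended pedigrees; the function names `two_point_lod` and `max_lod` are hypothetical.

```python
import math


def two_point_lod(n_meioses: int, n_recombinants: int, theta: float) -> float:
    """LOD score at recombination fraction theta for phase-known meioses.

    LOD(theta) = log10( theta^k * (1 - theta)^(n - k) / 0.5^n )
    Simplifying assumption: every meiosis is fully informative.
    """
    n, k = n_meioses, n_recombinants
    if not 0 <= k <= n:
        raise ValueError("need 0 <= n_recombinants <= n_meioses")
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")

    def count_times_log10(count: int, prob: float) -> float:
        # Treat 0 * log10(0) as 0 so that theta = 0 with no recombinants works.
        if count == 0:
            return 0.0
        if prob == 0.0:
            return float("-inf")
        return count * math.log10(prob)

    log_l_linked = count_times_log10(k, theta) + count_times_log10(n - k, 1.0 - theta)
    log_l_unlinked = n * math.log10(0.5)
    return log_l_linked - log_l_unlinked


def max_lod(n_meioses: int, n_recombinants: int, steps: int = 50):
    """Grid search over theta in [0, 0.5]; returns (best_theta, best_lod)."""
    grid = [0.5 * i / steps for i in range(steps + 1)]
    scored = ((t, two_point_lod(n_meioses, n_recombinants, t)) for t in grid)
    return max(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Ten informative meioses, no recombinants: LOD = 10 * log10(2)
    print(round(two_point_lod(10, 0, 0.0), 2))
    print(max_lod(10, 1))
```

Under these assumptions, ten recombination-free meioses are just enough to reach the conventional significance threshold of LOD = 3, which gives a sense of why weak signals in small pedigrees may reflect limited sample size.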

4.4 Cross-Population Analysis of Genome-Wide Linkage Scans for Schizophrenia in Selected Isolates

Our linkage analysis of pedigrees found that 17 of the 22 autosomes carry linkage signals with schizophrenia in the four genetic isolates we studied. These linkages have different levels of confidence; in other words, regions on more than half of the human chromosomes may be involved in the pathogenesis of schizophrenia spectrum disorders. The results show a nearly twofold reduction in the number of linked loci, i.e., greater genomic homogeneity of pathogenic loci linked to schizophrenia, in demographically older primary isolates compared with demographically younger, genetically heterogeneous secondary isolates. Nine of the detected linkages are close to a significant level of linkage with LOD ≥ 3: 2p23.2-p16.3 (DGH011), 6p21.2-p22.3 (DGH034/DGH022), 12q24 (DGH011), 17p13.2-q12 (DGH022/DGH011), 18p11 (DGH011), and 22q11.23-q12.3 (DGH005/DGH034/DGH011). Ten of the 25 genomic regions linked with SCZ are replicated in two, three, or all four of the studied isolates, which suggests that the pathogenic loci revealed are invariant to ethnogenomic stratification for SCZ, that is, they demonstrate interpopulation genetic homogeneity of schizophrenia pathogenic loci (Table 4.3). The most reliable linkages across isolates, with high LOD and homogeneous location, are at 6p22-p23, 10q26, 17p12-q12, and 22q11.23-q12.3 (Table 4.3).

Our results of genome-wide linkage scans across diverse isolates with aggregation of SCZ demonstrate the degree of genetic heterogeneity of the studied disease: the clinical phenotype can segregate with one DNA marker in the pedigrees of some populations and with completely different markers in others.
Moreover, our results support an effect of isolation on the degree of genetic heterogeneity of this pathology at the interpopulation level, manifested in the number of genomic loci linked with the disease in primary and secondary isolates (Table 4.3). Greater heterogeneity in the gene pool of secondary isolates and a greater number of founders, compared with primary isolates, increase the number of pathogenic loci in patient genomes in secondary isolates. The Tindal isolate, DGH034, has a small ethnic population of approximately 3900 people and is considered a secondary isolate because of its forcible relocation in 1944 to the lowland area of Dagestan. More than 35% of the migrants died during the first


Table 4.3 Cross-isolate analysis of the results of parametric analysis of genomic linkage with schizophrenia in pedigrees from four ethnically divided genetic isolates

Isolate No. | Linked region | LODs (D/M or R/M)
DGH005/DGH022 | 1p13.3-q23.3 | 1.73 D/M–1.87 D/M
DGH011 | 1p35.2-p36.12 | 1.6 R/M
DGH005/DGH064/DGH011 | 2p16.3-p23.2 | 1.5 R/M–2.3 R/M–3.1 D/M
DGH022 | 2q36.3-q37.1 | 1.5 D/M
DGH064 | 3p22.3-p23 | 1.8 D/M
DGH005 | 3q28-q29 | 1.9 D/M
DGH064 | 4q35.1-q35.2 | 2.3 R/M
DGH005/DGH011 | 5q13.3-q14.3 | 1.9 D/M–2.0 D/M
DGH064 | 5q35.1-q35.2 | 1.3 R/M
DGH005/DGH022/DGH064/DGH011 | 6p21-p23.1 | 2.3 D/M–3.0 D/M–4.3 D/M–2.4 D/M
DGH064 | 8p23.1-p23.3 | 1.6 R/M
DGH064 | 9p21.3-p22.2 | 2.6 R/M
DGH005/DGH022/DGH064 | 10q26.12-q26.13 | 2.4 D/M–1.96 D/M–2.7 R/M
DGH011 | 10p11.21-p11.23 | 2.4 D/M
DGH005 | 11p15.4-p15.5 | 2.1 R/M
DGH011/DGH064 | 11q21.3-q24.3 | 2.7 R/M–2.1 R/M
DGH005/DGH011 | 12q24.23-q24.33 | 1.6 D/M–3.1 R/M
DGH064 | 13p11-p12 | 1.4 D/M
DGH005/DGH022/DGH064/DGH011 | 17p13.2-q12 | 2.5 R/M–3.7 R/M–1.97 R/M–3.2 D/M
DGH022/DGH064/DGH011 | 18p11-q12.1 | 1.5 R/M–2.0 R/M–3.0 D/M
DGH005/DGH022/DGH064/DGH011 | 19q13.31-q13.42 | 2.0 R/M–1.6 R/M–1.6 R/M–2.23 R/M
DGH064/DGH011 | 21q22.13-q22.2 | 2.5 D/M–2.5 D/M
DGH005/DGH064 | 22q11.2-q11.23 | 3.2 D/M–4.4 D/M
DGH011/DGH064 | 22q12.3 | 2.8 D/M–3.1 D/M

In total, 10 linked regions are reliable in 2–4 observed isolates; 3 are reliable in all 4 isolates.
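Cells in Table 4.3 pack several results into one string, such as "2.5 R/M–3.7 R/M–1.97 R/M–3.2 D/M". The following minimal sketch shows one way such cells could be split into (LOD, model) pairs and summarized. It uses only the three rows reported for all four isolates; the helper names and the choice of LOD ≥ 3.0 as the significance cut-off are assumptions made for illustration, not part of the original analysis.

```python
import re

# Three rows of Table 4.3 that list all four isolates (values copied from the table).
ROWS = {
    "6p21-p23.1": "2.3 D/M–3.0 D/M–4.3 D/M–2.4 D/M",
    "17p13.2-q12": "2.5 R/M–3.7 R/M–1.97 R/M–3.2 D/M",
    "19q13.31-q13.42": "2.0 R/M–1.6 R/M–1.6 R/M–2.23 R/M",
}

# A LOD value followed by the inheritance model: D/M (dominant) or R/M (recessive).
CELL = re.compile(r"(\d+(?:\.\d+)?)\s*([DR])/M")


def parse_cell(cell: str):
    """Return a list of (lod, model) tuples, model being 'D' or 'R'."""
    return [(float(lod), model) for lod, model in CELL.findall(cell)]


def summarize(cell: str, threshold: float = 3.0) -> dict:
    """Summarize one table cell; threshold is an assumed significance cut-off."""
    entries = parse_cell(cell)
    return {
        "n_isolates": len(entries),
        "max_lod": max(lod for lod, _ in entries),
        "n_significant": sum(1 for lod, _ in entries if lod >= threshold),
        "n_recessive": sum(1 for _, model in entries if model == "R"),
    }


if __name__ == "__main__":
    for region, cell in ROWS.items():
        print(region, summarize(cell))
```

A summary of this kind makes it easy to see, for example, that the 19q13.31-q13.42 signal is replicated in all four isolates under a recessive model without reaching LOD 3 in any of them.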

2–3 years of adaptation to the new environmental conditions. The second and third generations of surviving migrants had higher frequencies of interethnic (exogamous) marriage (see Chap. 3.1). Our previous studies show that the deaths among these 35% of migrants were selective. Adaptation to the new environmental conditions (new food, climate, water, and infections such as malaria and typhus) caused greater mortality in descendants of close consanguineous marriages, who have higher levels of homozygosity (Bulayeva et al. 1993, 2008). Our study of surviving migrants and of highlanders living in their historical environment showed that offspring of these consanguineous marriages have greater neurophysiological sensitivity, which reduces resistance to environmental factors and leads to greater rates of sickness and death (Bulayeva 1991; Bulayeva et al. 1993; Bulayeva and Pavlova 1993, 2008). These results, along with the increased rate of interethnic


Table 4.4 Candidate genes localized in genomic regions linked with schizophrenia spectrum disorders

Isolate No. | Linked region | Start (bp) | End (bp) | Genes
DGH005/DGH022 | 1p13.3-q23.3 | 107,554,572 | 162,462,074 | GSTM1, GSTM2, GSTM5, NTNG1, NBPF4, GJAS, RGS4, CDY8, NOSIAP2
DGH011 | 1p35.2-p36.1 | 19,166,961 | 30,411,242 | HTR1D, PLA2G2A, HTR6, LAPTM5, PIGV, MTHFR, AHDC1
DGH022 | 2q36.3-q37.1 | 226,929,588 | 232,306,614 | PID1, AGFG1, GIGYF2, C2orf83, SPHKAP, HTR2B
DGH005/DGH034/DGH011 | 2p16.3-p23.3 | 29,376,632 | 50,933,915 | TADA, OXER1, CRIM1, EGLN1, EF2B4, NRXN1, SLC3A1, LRPPRC, ALK, VIT, CDC42, SOS1
DGH034 | 3p22.3-p23 | 32,065,178 | 34,724,630 | GPD1L, CKLFSF6, CMTM6
DGH005 | 3q28-q29 | 192,216,869 | 197,118,299 | APOD, UTS2D, DLG1, TFRC, HRASLS, PAK2
DGH034 | 4q35.1-q35.2 | 184,740,153 | 190,843,373 | SLC25A4, MTRF1L, FRG1
DGH011/DGH005 | 5p13.3-p14.3 | 26,675,528 | 89,266,833 | CDH9, CDH6, ZFR*, MTMR12*, GOLPH3, HOMER1, MEF2C
DGH034 | 5q35.1-q35.2 | 168,932,510 | 173,348,552 | FOXI1, KCNIP1, SH3PXD2B**, DUSP1
DGH034/DGH011/DGH022/DGH005 | 6p21.2-p23 | 6,145,760 | 39,575,565 | NRN1, RPP21, HIST1H2BJ, DTNBP1, NOTCH4, DCDC2, HLA-DRB5, HLA-DRB6, HLA-DRB1, HLA-DQA1, HLA-DQB1
DGH034 | 8p23.1-p23.3 | 2,030,289 | 6,616,946 | CSMD1, MCPH1, ANGPT2, MYOM2
DGH034 | 9p21.3-p22.2 | 18,189,052 | 25,503,300 | ELAVL2, CDKN2A, CDKN2B
DGH005/DGH022/DGH034 | 10q26.12-q26.13 | 122,642,634 | 131,192,796 | FGFR2, ATE1, CPXM2, HMX3, HTRA1, ADAM12, DOCK1, GPR26
DGH011 | 10p11.23-p11.21 | 30,395,654 | 35,357,914 | ARHGAP12, REM, ZEB1, NRP1
DGH005 | 11p13-p14.3 | 25,836,984 | 34,740,296 | BDNF, KIF18A, DCDC5, WT1, PRRG4, FBXO3, TRIM44, ENF
DGH034/DGH011 | 11q23.3-q24.1 | 117,597,731 | 123,727,020 | SCZD2, GRIK4, SORL1, LR11, SORLA, KCNJ1, FLI1, DRD2, NCAM1, HTR3A, HTR3B, IL18, POU2AF1
(continued)


Table 4.4 (continued)

Isolate No. | Linked region | Start (bp) | End (bp) | Genes
DGH005/DGH011 | 12q24.23-q24.33 | 120,088,460 | 126,627,408 | CIT, DYNLL1, P2RX7, SBNO1, CDK2AP1, CCDC60, TBX3, TMEM132D
DGH005/DGH022/DGH034/DGH011 | 17p13.3-q12 | 3,566,529 | 32,660,503 | VPS53, ACACA, ITGAE, ODF4, GABARAP, ARRB2, DLG4, PER1, VAMP2, NDEL1, MYOCD, TMEM132E, SLC6A4, CCL2, PMP22, COX10
DGH022/DGH034/DGH011 | 18p11.31 | 2,966,133 | 5,349,199 | DLGAP1, TGIF1, MYOM1
DGH011 | 18q12.1-q12.3 | 26,625,044 | 38,249,329 | DSG3, NOL4, FHOD3, BRUNOL4, RIT2, CDH2
DGH005/DGH034/DGH022/DGH011 | 19q13.31-q13.42 | 44,305,515 | 53,906,790 | KLK8, CD33, APOE, SLC1A5, ATP1A3
DGH034/DGH011 | 21q22.13-q22.2 | 39,041,451 | 41,291,662 | KCNJ6, ETS2
DGH005/DGH034 | 22q11.1-q12.1 | 17,759,281 | 28,956,833 | CECR2, BSR, PRODH, DGCR2, DGCR14, DGCR5, DGCR8, HIR, COMT
DGH011/DGH034 | 22q12.3-q13.1 | 34,495,479 | 37,666,244 | LARGE, APOL1-APOL4, CSNK1E, RBM9, MYH9, IL2RB, CACNG2, SOX10, DRG1, DDX17, SYN3, YWHAH
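The start and end coordinates in Table 4.4 can be used to check whether a candidate gene lies inside a linked region. The sketch below does this for a few genes whose chromosomal positions are quoted in the text of this section. It assumes that the table and the gene positions refer to the same genome build, which the chapter does not state; the `Interval` type and the `locate` helper are hypothetical names introduced for illustration.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # bp
    end: int    # bp


def contains(region: Interval, gene: Interval) -> bool:
    """True if the gene lies entirely inside the linked region."""
    return (region.chrom == gene.chrom
            and region.start <= gene.start
            and gene.end <= region.end)


# Linked regions copied from Table 4.4.
LINKED = {
    "17p13.3-q12": Interval("chr17", 3_566_529, 32_660_503),
    "22q11.1-q12.1": Interval("chr22", 17_759_281, 28_956_833),
    "22q12.3-q13.1": Interval("chr22", 34_495_479, 37_666_244),
}

# Gene positions as quoted in the text of this section.
GENES = {
    "SLC6A4": Interval("chr17", 28_523_378, 28_538_442),
    "COMT": Interval("chr22", 19_948_722, 19_951_919),
    "CECR2": Interval("chr22", 17_956_628, 18_033_845),
    "IL2RB": Interval("chr22", 37_521_880, 37_545_962),
}


def locate(gene_name: str):
    """Names of the linked regions that fully contain the given gene."""
    gene = GENES[gene_name]
    return [name for name, region in LINKED.items() if contains(region, gene)]


if __name__ == "__main__":
    for name in GENES:
        print(name, locate(name))
```

With the values above, SLC6A4 falls in the chromosome 17 region, COMT and CECR2 in the first 22q region, and IL2RB in the second, in line with the gene lists in the table.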

marriages, have led to greater heterogeneity among the younger generation of surviving migrants. These studies show that the isolate is currently on the verge of collapse, and the results of parametric and nonparametric linkage analyses in this isolate should therefore be interpreted in light of the demographic events in its history. All four studied isolates demonstrated linkage at the 17p13.3-p11.2 region and linked disease loci with recessive inheritance (Table 4.4). Candidate genes located in the linked region are SLC6A4, CDK5R1, TMEM98, OMG, NF1, CCL2, ASIC2, PIP4K2B, TMEM132E, SLFN12L, MYO1D, and EFCAB5. Cross-population analysis of our genome-wide linkage scan results in SCZ pedigrees supports the reliability of the linkages on chromosome 17 in our ethnically diverse genetic isolates. The most important linkages, reliable in all four genetic isolates and with high LOD, are those established in the 6p21.2–p22.3, 17p11-q12, and 22q11.23-q12.3 regions (Figs. 4.8, 4.9, and 4.10). In all four isolates, the linked region (chr6: 19,812,095-39,575,565) contains a set of genes among which, according to


Fig. 4.8 Common genomic region 6p21.2–p22.3 that we found to be linked with schizophrenia (see Tables 4.2 and 4.3) in the pedigrees of the genetic isolates studied. Overall LOD = 5.3, α = 1

previous studies, candidates for the studied pathology are PRSS16 (chr6: 27,215,502-27,224,399), NOTCH4 (chr6: 32,163,725-32,165,371), PGBD1 (chr6: 28,249,314-28,270,326), HIST1H2BJ (chr6: 27,100,095-27,100,575), DCDC2 (chr6: 24,206,692-24,406,985), and RPP21 (chr6: 30,312,906-30,314,635) (Fig. 4.8, Table 4.4). Linkage was furthermore detected in the regions of the two major histocompatibility complex classes, MHC I and MHC II, confirming the findings of previous studies. Some studies, however, found antigen linkage with this region (Mayilyan et al. 2008; Glatt et al. 2008).

Genome linkage with the 1p12-q23.3, 2p24.2-p21, 10q26.12-q26.13, 11q21.3-q24.3, 18p11.31-q12.1, 19q13.31-q13.42, 21q22.13-q22.2, and 22q11.1-q12.3 regions was reproducible in different isolates. Population-specific genomic linkages may reflect peculiarities of ancestral genomes. In all four isolates, the linked region on chromosome 17 (highlighted) contains a set of genes of which, according to previous studies, candidate genes for the studied pathology are SLC6A4 (chr17: 28,523,378-28,538,442), PEMT (chr17: 17,408,877-17,494,994), SREBF1 (chr17: 17,720,862-17,740,325), SHMT1 (chr17: 18,243,719-18,266,856), and OMG (chr17: 29,622,027-29,623,349) (Figs. 4.11 and 4.12).

As previously established, we found the most significant linkage in the ethnic Dargin primary isolate, DGH022, at the 17p11.2-q12 region, most likely associated with the 5HTT-SERT (SLC6A4) gene and other candidate genes for the


[Plot for Fig. 4.9: LOD (vertical axis, 0–4) against map position in cM (horizontal axis, 0–200) for DGH005+DGH022 (D/M), DGH064 (R/M), and DGH011 (D/M). Genes annotated below the plot: ARHGAP12, REM, ZEB1, NRP1; FGFR2, ATE1, CPXM2, HMX3, HTRA1, ADAM12, DOCK1, GPR26.]

Fig. 4.9 Genomic region 10p11.23-p11.21 in isolate DGH011 and the 10q26.12-q26.13 region in isolates DGH005 and DGH022 were linked with schizophrenia, with dominant inheritance of disease loci. In isolate DGH034, the disease loci in the 10q26.12-q26.13 region were inherited recessively. LODs varied from 1.96 to 2.7 (see Table 3.12). Overall for the two isolates (DGH005 + DGH022), LOD = 5.3, α = 1

studied pathology (Fig. 4.12). The SLC6A4 protein-coding gene is responsible for transporting serotonin from synaptic spaces into presynaptic neurons, and abnormalities in this gene may underlie the pathophysiological mechanisms. Serotonin transporters are involved in the pathophysiology of some mental disorders, such as bipolar disorders, MDD, OCD, and SCZ (Lesch et al. 1994; Gelernter et al. 1995; Gutiérrez et al. 1998).

We found full genetic homogeneity in isolates DGH005 and DGH034 at 22q11.1-q12.1, both in the location of the linked genomic region and in the mode of transmission of the clinical phenotype loci, with significant linkage at LOD = 3.2–4.6. The linked region in isolate DGH034 is bimodal: the first linkage peak is observed in the 22q11.23 region with LOD = 4.4 and the second in the 22q12.3 region with LOD = 2.98 (Fig. 4.13). The second linkage peak in isolate DGH034 coincides with the unimodal linkage distribution in isolate DGH011 in the same 22q12.3 region with LOD = 3.1 (overlapping regions are highlighted). In isolate DGH011, the 22q12.3-q13.1 region is linked with SCZ with LOD = 3.5. Overall, at 22q11-q13 for all three kindreds, the LOD equals 8.6 for the first peak and 5.6


[Plot for Fig. 4.10: vertical axis 0–3, horizontal axis 0–140, with curves labeled 6005c11, 6034c11, and 6011c11. Genes annotated below the plot: BDNF, KIF18A, DCDC5, WT1, PRRG4, FBXO3, TRIM44; SCZD2, GRIK4, SORL1, LR11, SORLA, KCNJ1, FLI1, DRD2, NCAM1, HTR3A, HTR3B, IL18.]

Fig. 4.10 Genomic regions at 11p14.3-p13 in isolate DGH005 and at 11q23.1-q24.3 in isolates DGH034 and DGH011 linked with schizophrenia candidate genes. LODs varied from 2.1 to 2.7 (see Table 3.12)

for the second peak. The plausible linkage interval within the first peak harbors the genes CECR2, BSR, PRODH, DGCR2, DGCR14, DGCR5, DGCR8, HIR, and COMT, and the second linkage peak of DGH034/DGH011 harbors the genes LARGE, APOL1-APOL4, CSNK1E, RBM9, MYH9, IL2RB, CACNG2, SOX10, DRG1, DDX17, SYN3, and YWHAH. These genes were previously reported as associated or linked with cognitive impairment and with neurodevelopmental and neurodegenerative diseases. Mutations in the LARGE gene can cause two different forms of muscular dystrophy (http://omim.org/entry/603590). One form, a dystroglycanopathy (MDDG) previously called Walker–Warburg syndrome (WWS) or muscle-eye-brain disease (MEB), causes severe congenital brain and eye anomalies (type A6; MDDGA6, 613154). The other form, formerly known as congenital muscular dystrophy type 1D (MDC1D), is a less severe congenital disease with ID (type B6; MDDGB6; 608840). SYN3 (synapsin III) is another important gene in this genomic region associated with cognition and psychosis. SYN3, implicated in synaptogenesis and the modulation of neurotransmitter release, potentially plays a role in several neuropsychiatric diseases, as supported by previous association studies with SCZ, BPD, MS, and ADHD. The protein-coding gene TOM1 (Target of Myb Protein 1) is affiliated with the lncRNA class and is associated with BPD, cystic fibrosis, and chronic wasting disease. In the 22q12.2-q13.1 linked region, CNV analyses showed four deletions in the LARGE gene, with rs8141384 and rs8140012, that were common in ID-affected patients. Among the genes localized to the linked region, candidate genes for the studied pathology are COMT (chr22: 19,948,722-


Fig. 4.11 Genomic regions at 18p11.31 in isolates DGH022/DGH034 and at 18q12.1-q12.3 in isolate DGH011 linked with schizophrenia candidate genes. LODs varied from 1.5 to 3.0 (see Table 3.12)

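[Plot for Fig. 4.12: LOD (vertical axis, 0–6) against map position in cM (horizontal axis, 0–100), with curves for isolates 022, 005, 034, and 011 and for the combined 022+005+034 data. Genes annotated below the plot: VPS53, ACACA, ITGAE, ODF4, GABARAP, ARRB2, DLG4, PER1, VAMP2, NDEL1, MYOCD, SLC6A4, CCL2, PMP22, COX10, TMEM132E.]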

Fig. 4.12 In all studied isolates, genomic region 17p11-q12 is linked with schizophrenia spectrum disorders (DGH022/DGH005/DGH034/DGH011). LOD = 3.7 R/M–2.5 R/M–1.7 R/M–2.98 D/M, respectively (see Tables 4.2 and 4.3)


Fig. 4.13 In all studied isolates, genomic region 22q11.1-q12.3 is linked with schizophrenia spectrum disorders (DGH005/DGH034/DGH011). LOD = 3.2 D/M–4.4 D/M–3.1 D/M, respectively (see Tables 4.2 and 4.3)

19,951,919), CECR2 (chr22: 17,956,628-18,033,845), IL2RB (chr22: 37,521,880-37,545,962), and others (Table 4.4).

As discussed above, some researchers have confirmed linkage of these regions with schizophrenia, and a number of these studies identified candidate genes for schizophrenia. This primarily refers to genes responsible for neurochemical imbalances in schizophrenic psychoses; for example, some studies found that the 11p and 11q23 regions were associated with the DRD4 and DRD2 genes (Lu et al. 1996; Hill et al. 1998; Prasad et al. 2002). The 17p11.2-12 region may be associated with genes causing abnormalities in the serotonin pathway in SCZ patients (5HTR2A, 5HTT-SERT) (Nothen et al. 1993; Sidenberg et al. 1993; Erdman et al. 1996). As shown in previous chapters, other studies rarely support linkage to the same chromosomal regions. Researchers from different countries often cannot reproduce the linkages reported for the listed genomic regions. Given these difficulties in reproducing linkage of genomic loci with schizophrenia, it is interesting to compare the results of genomic marker linkage that we obtained in different isolates (Table 4.3). The linkages of schizophrenia with the D17S2196, D17S1294, D17S1303, and D17S947 loci found in ethnically and


Fig. 4.14 Differences between primary (PI) and secondary (SI) isolates in the percentage of meioses in chromosomes where we obtained significant linkage with SCZ, and in the rates of recombination events in the isolate pedigrees. Differences between the isolates are statistically significant (t = 2.3–7.6; p < 0.05–0.000)

geographically disparate isolates suggest the involvement of a 30 cM fragment of chromosome 17 in schizophrenia susceptibility. Nearly all studies find similarly extensive chromosomal regions linked to schizophrenia and other forms of mental disease (often more than 20–30 cM) (Prasad et al. 2002). In our isolates, a 30 cM fragment containing pathogenic loci might have existed in common in all indigenous Dagestani people (Bulayeva et al. 2003a, b; Marchani et al. 2008). Over its subsequent 8000–9000-year history, the population divided into endogamous communities, and recombination events allowed the haplotype block of pathogenic loci to segregate into different mono-ethnic isolates. Disease aggregation in these isolates occurred under the ancestral effect combined with traditional endogamy and inbreeding. Primary isolates are demographically older, with greater numbers of meioses and recombination events (Fig. 4.14). Using haplotype analysis, we evaluated the total number of meioses in chromosomes where we obtained significant linkage with SCZ and characterized recombination events in pedigrees with aggregation of SCZ (Fig. 4.14).

Historically young secondary isolates, which passed through fewer meioses and recombination events in their demographic history, showed larger genomic regions linked to SCZ, while the primary isolates accumulated more meioses and recombinants. The true number of recombinations may consequently be higher than observed, because some cannot be identified owing to the high genetic homogeneity of the studied isolates.
PIs have 1.8 times fewer genomic regions linked with SCZ, as well as smaller linked genomic regions, than SIs, which show higher genomic heterogeneity of the pathogenic loci linked with SCZ and larger linked genomic regions.


A similar LD pattern was also found in several other studies that used linkage disequilibrium in complex disease gene mapping (Jorde 2000; Zavattari et al. 2000). The success of this approach is exemplified by the mapping of the gene for the Mendelian disease diastrophic dysplasia (DTD). Traditional linkage analysis in genealogies from outbred populations localized the pathogenic genomic locus to a 2 Mb region. A subsequent analysis of LD in this genomic region, conducted in a Finnish isolate, narrowed the localized genomic region down to 40 kb (Hastbacka et al. 1994).

Notes: NPM, number of meioses in pedigrees with aggregation of SCZ reconstructed in every PI and SI isolate up to 13–14 generations; Rexp, expected recombination events; Robs, observed recombination events; Rexp_Robs, ratio of Rexp to Robs; R_0, number of cases without recombination; R-1, number of cases with one or more recombinations.

We believe that mapping genes in ancient isolates with a known common ancestral group provides an opportunity to trace how haplotype blocks carrying ancestral pathogenic loci segregate into smaller units with specific haplotypes, and/or into different isolates, in modern generations of patients from ethnically and geographically subdivided populations. The cause of linkage between clinical phenotypes and genetic markers in ancient, genetically homogeneous isolates is most likely physical linkage of the marker with the disease gene. This assumption requires further testing.

The results of linkage between schizophrenia and genome-wide scanned markers in isolates of varying demographic antiquity suggest that the genomic region linked with the pathogenic locus narrows as isolates age demographically: the greater the number of meioses and recombinations in the history of the isolate, the greater the probability of detecting physical linkage between pathogenic loci and the studied genomic markers.
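The narrowing of linked regions with demographic age can be illustrated with a standard no-interference (Haldane) model of recombination. The sketch below is an illustration under that assumption only; it is not the procedure used to obtain Rexp in Fig. 4.14, which the chapter does not describe, and the function names are hypothetical. It converts a map distance in cM into a recombination fraction, gives the expected number of recombinant meioses between two flanking markers, and gives the probability that a segment is transmitted through a chain of meioses without any crossover.

```python
import math


def haldane_theta(distance_cm: float) -> float:
    """Recombination fraction for a map distance, Haldane map function."""
    d = distance_cm / 100.0  # centimorgans to Morgans
    return 0.5 * (1.0 - math.exp(-2.0 * d))


def expected_recombinants(distance_cm: float, n_meioses: int) -> float:
    """Expected number of recombinant meioses between two flanking markers."""
    return n_meioses * haldane_theta(distance_cm)


def prob_segment_intact(distance_cm: float, n_meioses: int) -> float:
    """Probability of no crossover within the segment over n successive meioses.

    Under Haldane's model crossovers follow a Poisson process with mean d
    (in Morgans) per meiosis, so the chance of none in n meioses is exp(-n * d).
    """
    d = distance_cm / 100.0
    return math.exp(-n_meioses * d)


if __name__ == "__main__":
    # A 30 cM fragment, as discussed for chromosome 17 in the text.
    for n in (5, 10, 20, 40):
        print(n, round(prob_segment_intact(30.0, n), 4),
              round(expected_recombinants(30.0, n), 2))
```

Under this model the chance that a 30 cM block survives intact drops quickly as meioses accumulate, which is consistent with the expectation that older isolates carry shorter shared segments around a pathogenic locus.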

References

Alfimova, M. V., Golimbet, V. E., & Mityushina, N. G. (2003). Polymorphism of serotonin receptor (5-HTR2A) gene and productivity of speech associative processes at normal and schizophrenia. Molecular Biology, 37(1), 68–73. Russian.
Andrews, B., et al. (1987). A study of genetic linkage in schizophrenia. Psychological Medicine, 17(2), 363–370.
Barr, C. L., Kennedy, J. L., et al. (1994). Linkage study of a susceptibility locus for schizophrenia in the pseudoautosomal region. Schizophrenia Bulletin, 20(2), 277–286.
Blair, I. P., Badenhop, R. F., Scimone, A., et al. (2005). Association analysis of transcripts from the bipolar susceptibility locus on chromosome 4q35, exclusion of a pathogenic role for eight positional candidate genes. American Journal of Medical Genetics Part B: Neuropsychiatric Genetics, 134(1), 56–59.


Blouin, J. L., Dombroski, B. A., et al. (1998). Schizophrenia susceptibility loci on chromosomes 13q32 and 8p21. Nature Genetics, 20, 70–73.
Bray, J. D. (2001). Commentary on being positive about schizophrenia. Journal of Psychiatric and Mental Health Nursing, 8(3), 279–280.
Bulayeva, K. B. (1991). Genetic basis of human psychophysiology. Moscow: Science.
Bulayeva, K., Jorde, L., Ostler, C., et al. (2003a). Genetics and population history of Caucasus populations. Human Biology, 75(6), 837–853.
Bulayeva, K. B., Leal, S., Pavlova, T. A., et al. (2000). The ascertainment of schizophrenia pedigrees in Daghestan genetic isolates. Journal of Psychiatric Genetics, 5, 100–106.
Bulayeva, K. B., Leal, S. M., Pavlova, T. A., et al. (2005). Mapping genes of complex psychiatric diseases in Daghestan genetic isolates. American Journal of Medical Genetics Part B: Neuropsychiatric Genetics, 132(1), 76–84.
Bulayeva, K., Lesch, K.-P., Bulayev, O., Walsh, C., Glatt, S., Gurgenova, F., Omarova, J., Berdichevets, I., & Thompson, P. M. (2015, January 28). Genomic structural variation in linked with intellectual disability regions in a Dagestan genetic isolate. Journal of Neural Transmission. [Epub ahead of print]
Bulayeva, K., Marchani, E., Kurbatova, O. L., Watkins, S. W., Bulayev, O. A., & Harpending, H. C. (2008). Genetic bottleneck among Daghestan highlanders migrating to lowlands. CEJM, 8(4), 396–405.
Bulayeva, K. B., & Pavlova, T. A. (1993). Behavior genetic differences within and between defined human populations. Behavior Genetics, 23(5), 449–454.
Bulayeva, K. B., Pavlova, T. A., Charuhilova, S. M., et al. (1996). Genetic and demographic study of mountain populations of Dagestan and its migrants on plain. Interconnection of inbreeding, homozygosity and physiological sensitivity levels. Genetics, 32(1), 93–102. Russian.
Bulayeva, K. B., Pavlova, T. A., Dubinin, N. P., et al. (1993). Phenotypic and genetic affinities among ethnic populations in Dagestan (Caucasus, USSR). A comparison of polymorphic, physical, neurophysiological and psychological traits. Annals of Human Biology, 20(5), 455–467.
Bulayeva, K. B., Pavlova, T. A., Kurbanov, R. M., et al. (2003b). Genetic and epidemiological studies in the mountainous Dagestan isolates. Genetics, 39(3), 413–422.
Bulayeva, K. B., Pavlova, T. A., Kurbanov, R. M., & Bulaev, O. A. (2002, November). Mapping genes of complex diseases in genetic isolates of Dagestan [Article in Russian]. Genetika, 38(11), 1539–1548.
Bulayeva, K. B., Pavlova, T. A., & Bulayev, O. A. (1997). Genetic polymorphism in the 3 populations of indigenous people of Dagestan. Genetics, 33(10), 1395–1405. Russian.
Cardno, A. G., Holmans, P. A., Rees, M. I., et al. (2001). A genome-wide linkage study of age at onset in schizophrenia. American Journal of Medical Genetics, 105(5), 439–445.
Eliez, S., Antonarakis, S. E., Morris, M. A., et al. (2001a). Parental origin of the deletion 22q11.2 and brain development in velocardiofacial syndrome, a preliminary study. Archives of General Psychiatry, 58, 64–68.
Eliez, S., Blasey, C. M., et al. (2001b). Velocardiofacial syndrome: Are structural changes in the temporal and mesial temporal regions related to schizophrenia. The American Journal of Psychiatry, 158(3), 447–453.
Erdman, J., Shimron-Abarbanell, D., Rietschel, M., et al. (1996). Systematic screening for mutations in the human serotonin 2A (5HT2A) receptor gene identification of two naturally occurring receptor variants and association analysis in schizophrenia. Human Genetics, 97(5), 614–619.
Freedman, R., Leonard, S., Olincy, A., et al. (2001). Evidence for the multigenic inheritance of schizophrenia. American Journal of Medical Genetics, 105(8), 794–800.
Gadzhiev, A. G. (1971). Anthropology of small populations of Dagestan (p. 368). Makhachkala: Dagestan Branch of the USSR Academy of Science.
Gadzhieva, S. S. (1961). Kumiks. History and ethnography study. Moscow: USSR Academy of Sciences. 387 p.


Gelernter, J., Pakstis, A. J., & Kidd, K. K. (1995). Linkage mapping of serotonin transporter protein gene SLC6A4 on chromosome 17. Human Genetics, 95(6), 677–680.
Giuffra, L. A., & Kidd, K. K. (1989). Linkage analysis in psychiatry. International Review of Psychiatry, 1, 231–242.
Glatt, S. J., Faraone, S. V., & Tsuang, M. T. (2003). Association between a functional catechol O-methyltransferase gene polymorphism and schizophrenia: meta-analysis of case-control and family-based studies. The American Journal of Psychiatry, 160, 469–476.
Glatt, S. J., Lasky-Su, J. A., Zhu, S. C., Zhang, R., Zhang, B., Li, J., et al. (2008). Genome-wide linkage analysis of heroin dependence in Han Chinese: Results from wave two of a multi-stage study. Drug and Alcohol Dependence, 98(1–2), 30–34.
Golimbet, V. E., Aksenov, M. G., Abramova, L. I., et al. (1998). Association of allelic polymorphism of dopamine receptor with mental illness of schizophrenia spectrum and affective disorders. Zhurnal Nevrologii i Psikhiatrii Imeni S.S. Korsakova, 98(12), 32–35.
Golimbet, V. E., Alfimova, M. V., Mityushina, N. G., & Asanov, A. Y. (2003). The genes of serotonergic system, features of behavior and mental disorders. Journal of Medical Genetics, (7). Russian.
Golimbet, V. E., Alfimova, M. V., Shchebatykh, T. V., et al. (2004). Serotonin transporter polymorphism and depressive-related symptoms in schizophrenia. American Journal of Medical Genetics. Part B, Neuropsychiatric Genetics, 126(1), 1–7.
Golimbet, V. E., Shcherbatyh, T. W., Abramova, L. I., et al. (2001). The polymorphism of the serotonin transporter gene in families burdened with schizophrenia. Zhurnal Nevrologii i Psikhiatrii Imeni S.S. Korsakova, 101(10), 40–41.
Gutiérrez, M., Gibert, J., Bobes, J., Herráiz, M. L., & Fernández, A. (1998). [Risperidone in the treatment of acute exacerbation of schizophrenia symptoms]. Actas Luso Esp Neurol Psiquiatr Cienc Afines, 26(2), 83–89. Spanish.
Hastbacka, J., de la Chapelle, A., & Mahtani, M. M. (1994). The diastrophic dysplasia gene encodes a novel sulfate transporter: Positional cloning by fine-structure linkage disequilibrium mapping. Cell, 78(6), 1073–1087.
Hill, S. Y., Locke, J., & Zezza, N. (1998). Genetic association between reduced P300 amplitude and the DRD2 dopamine receptor A1 allele in children at high risk for alcoholism. Biological Psychiatry, 43, 40–51.
Jorde, L. B. (2000). Linkage disequilibrium and the search for complex disease genes. Genome Research, 10(10), 1435–1444.
Jorde, L. B., Kere, J., Nyman, D., & Erikson, A. W. (2000). Gene mapping in isolated populations: New role of old friends? Human Heredity, 50, 57–65.
Jorde, L. B., Watkins, W. S., & Bamshad, M. J. (2001). Population genomics: A bridge from evolutionary history to genetic medicine. Human Molecular Genetics, 10(20), 2199–2207.
Kalsi, G., Brynjolfsson, J., Butler, R., et al. (1995). Linkage analysis of chromosome 22q12-13 in a United Kingdom/Icelandic sample of 23 multiplex schizophrenia families. American Journal of Medical Genetics, 60, 298–301.
Karayiorgou, M., Kasch, L., Lasseter, V., et al. (1994). Report from the Maryland Epidemiology Schizophrenia Linkage Study: No evidence for linkage between schizophrenia and a number of candidate and other genomic regions using a complex dominant model. American Journal of Medical Genetics, 54, 345–353.
Kendler, K. S., MacLean, C. J., O’Neill, F. A., et al. (1996). Evidence for a schizophrenia vulnerability locus on chromosome 8p in the Irish study of high-density schizophrenia families. The American Journal of Psychiatry, 153, 1534–1540.
Kendler, K. S., Myers, J. M., et al. (2000). Clinical features of schizophrenia and linkage to chromosomes 5q, 6p, 8p, and 10p in the Irish study of high-density schizophrenia families. American Journal of Psychiatry, 157(3), 402–408.
Lander, E., & Botstein, D. (1986). Mapping complex genetic traits in humans: New methods using a complete RFLP linkage map. Cold Spring Harbor Symposia on Quantitative Biology, Pt.1, 49–69.


Lesch, K. P., Balling, U., Gross, J., Strauss, K., Wolozin, B. L., Murphy, D. L., & Riederer, P. (1994). Organization of the human serotonin transporter gene. Journal of Neural Transmission: General Section, 95(2), 157–162.
Lu, R.-B., Ko, H.-C., Chang, F.-M., et al. (1996). No association between alcoholism and multiple polymorphisms at the dopamine D2 receptor gene (DRD2) in three distinct Taiwanese populations. Biological Psychiatry, 39, 419–429.
Marchani, E. E., Watkins, W. S., Bulayeva, K., Harpending, H. C., & Jorde, L. B. (2008). Culture creates genetic structure in the Caucasus: Autosomal, mitochondrial, and Y-chromosomal variation in Daghestan. BMC Genetics, 9(47), 1–13.
Mayilyan, K. R., Weinberger, D. R., & Sim, R. B. (2008). The complement system in schizophrenia. Drug News and Perspectives, 21(4), 200–210. Review.
Nothen, M. M., Cichon, S., et al. (1993). Excess of homozygosis at the dopamine D3 receptor gene in schizophrenia not confirmed. Journal of Medical Genetics, 30(8), 708.
Owen, M. J., Williams, N. M., & O’Donovan, M. C. (2004). The molecular genetics of schizophrenia: New findings promise new insights. Molecular Psychiatry, 9(1), 14–27. Review.
Parsian, A., Suarez, B. K., Isenberg, K., et al. (1997). No evidence for a schizophrenia susceptibility gene in the vicinity near of IL2RB on chromosome 22. American Journal of Medical Genetics, 74, 361–364.
Pickard, B., Hollox, E., Malloy, M. P., et al. (2004). A 4q35.2 subtelomeric deletion identified in a screen of patients with co-morbid psychiatric illness and mental retardation. BMC Medical Genetics, 5, 1–7.
Prasad, S., Semwal, P., Deshpande, S., et al. (2002). Molecular genetics of schizophrenia: Past, present and future. Journal of Biosciences, 27, 35–52.
Pulver, A., Karayiorgou, M., Wolyniec, P., et al. (1994). Sequential strategy to identify a susceptibility gene for schizophrenia: Report of potential linkage on chromosome 22q12-q13.1: Part 1. American Journal of Medical Genetics, 54, 36–43.
Riley, B., Mogudi-Carter, M., Jenkins, T., & Williamson, R. (1996). No evidence for linkage of chromosome 22 markers to schizophrenia in southern African Bantu-speaking families. American Journal of Medical Genetics, 67, 515–522.
Risch, N., & Giuffra, L. (1992). Model misspecification and multipoint linkage analysis. Human Heredity, 42(1), 77–92.
Sidenberg, D. G., Bassett, A. S., Demchyshyn, L., et al. (1993). New polymorphism for the human serotonin 1D receptor variant (5-HT1D beta) not linked to schizophrenia in five Canadian pedigrees. Human Heredity, 43(5), 315–318.
Sobel, E., & Lange, K. (1996). Descent graphs in pedigree analysis: Applications to haplotyping, location scores, and marker sharing statistics. The American Journal of Human Genetics, 58, 1323–1337.
Zavattari, P., Lampis, R., & Mulargia, A. (2000). Confirmation of the DRB1-DQB1 loci as the major component of IDDM1 in the isolated founder population of Sardinia. Human Molecular Genetics, 9, 2967–2972.

Chapter 5 Common Structural Genomic Variants in Linked with SCZ Regions

5.1 Copy Number Variations and Runs of Homozygosity Analyses in Linked with SCZ Genomic Regions in Pedigrees of Selected Isolates

For this stage of research, our objective was to study copy number variation (CNV) and runs of homozygosity (ROH) polymorphisms in the genomes of patients from ethnically divided genetic isolates of Dagestan with high aggregations of SCZ. We studied genomic aberrations in genome-wide scans of patients (49) and healthy members (51) of extensive pedigrees collected in these isolates. After linkage analyses had identified specific chromosomal regions and certain genes had been identified as possible candidates, we used an exploratory examination of CNVs and ROHs based on microarray data to detect segmental variations within linked regions with LODs ≥ 2, the more robust of the linkages obtained in four ethnically and demographically subdivided isolates. A change in the chromosome segment that carries a pathogenic gene may cause the expressed pathology. Using the computer packages SVS and GTC, we identified structural variations in the genomic regions linked with the studied pathology based on genome-wide scanned SNPs. Copy number variations were identified if the deleted or duplicated region was at least 100 kb in size and contained at least 20 SNPs.

Progress in molecular genetic technology and the completion of the large international research program "Human Genome" have made evenly spaced, genome-wide multiallelic microsatellites and single nucleotide polymorphisms available. The Human Genome Project made it possible to detect interindividual differences at the nucleotide level caused by duplications or deletions. Combined with recent advances in genetic, statistical, and bioinformatic analysis methods, the availability of markers for genome-wide scans enabled linkage and structural genome-wide genetic variation scans (Weeks and Lange 1987; Goldgar et al. 1993; Kruglyak and Lander 1995; Kruglyak et al. 1996).

© Springer International Publishing Switzerland 2016. K. Bulayeva et al., Genomic Architecture of Schizophrenia Across Diverse Genetic Isolates, DOI 10.1007/978-3-319-31964-3_5

Normal genotypes have two copies of each chromosome, one from each parent (except the X chromosome in males, which usually has one copy). In certain cases, especially in certain diseases, segments or an entire chromosome may be present in more than two copies, in a single copy, or be absent. A CNV is a difference in the number of copies of a DNA segment between individual genomes (measured in bp). A CNV changes the number of copies of a particular gene and can therefore affect expression of the gene product. Structural variations, including CNV and loss of heterozygosity (LOH), account for most of the variation in the human genome and may cause polymorphisms or determine clinical phenotype through changes in genes. The size of CNV segments usually ranges from 1 kb to many megabases (1 kb = 1000 bp). Every human genome includes duplications and deletions that vary in size (from a few thousand to a few million nucleotide pairs); the segments may or may not include genes, and together they compose 12 % of the total genome. CNVs may be subject to selection in the evolution of human populations. SNP arrays, which register an extra copy gained through duplication or a copy lost through deletion, became the dominant technology in current human genome studies and disease gene mapping (Fig. 5.1). Figure 5.1 gives schematic views of duplications and deletions in a chromosome, where regions of DNA are lost or duplicated and, in consequence, express a clinical phenotype.

The study of structural variations in the genomes of psychiatric patients began only in recent years. Several publications identified rare structural variants, duplications and/or deletions of gene copies, that contribute to schizophrenia pathogenesis (Stefansson et al. 2008; Walsh et al. 2008; Kirov et al. 2008).
A study of patients with schizophrenia and schizoaffective disorders found deletions and duplications (CNVs) spanning thousands to millions of nucleotides in genomic sequences more often than in the control group. Patients with childhood-onset schizophrenia comprised a second group and had twice as many genomic abnormalities as healthy people (Walsh et al. 2008). CNVs can also occur de novo; researchers identified 15 de novo CNVs in 152 sporadic SCZ cases (about 10 %) and only two in 159 healthy controls (1.3 %). De novo mutations therefore occurred about eight times more often in patients with "sporadic" SCZ, a highly significant association (p = 0.00078). In 48 familial schizophrenia cases, researchers did not find de novo CNVs; they therefore may only be associated with sporadic schizophrenia. Stefansson et al. (2008) based their CNV search on the reduced reproductive capacity of patients with severe mental illness, which subjects the risk alleles for these diseases to negative selection. The study assumed that alleles conferring a significant risk of these serious diseases, being under negative selection, should be rare. Studies identified 66 de novo CNVs in 5000 patients with schizophrenia. Eight out of 66 CNVs were found in at least one patient, including three large deletions in 1q21.1, 15q11.2, 15q13.3, and 16p11.2 regions. The results provide firm and replicable evidence that genomes affected by neuropsychic disorders (schizophrenia, autism, MDP, and mental retardation) contain an increased number of structural genome variations (Stefansson et al. 2008; Walsh et al. 2008; Kirov et al. 2008).

In our study, we identified specific chromosomal regions through linkage analyses, found candidate genes, and then used an exploratory examination of ROH and CNV based on microarray data for mutation screening within regions linked to the clinical phenotypes, selecting regions with genomic homogeneity of linkage across the isolates and with LODs close to suggestive and significant levels. First, we performed a genome-wide scan for ROH and CN segment lengths among affected individuals; the results showed that mean ROH and CN segment lengths are larger among affected SCZ subjects than among healthy subjects (Fig 5.2a, b).

Given the nonuniform patterns of ROHs across the genome, we were able to uncover regions whose lengths differ between groups. The most significant differences in ROH segment size between the SCZ and healthy groups were obtained for chromosomes 2, 6, 12, 17, 18, 19, and 22, where repeated linked regions with SCZ were found (Tables 4.2, 4.4 and Fig 5.2a). The differences in ROH length between these groups are statistically significant (χ² = 176.7, df = 21, p < 0.001). The results indicate that ROH length on chromosomes with SCZ linkage is two times greater in patients than in healthy pedigree members. SCZ patients, unlike healthy subjects, also demonstrated larger stretches of CN segments (Fig 5.2b).

Data represented in Fig. 5.2 suggest higher rates (hot spots) of CNV polymorphisms in the genomes of patients compared to healthy subjects (N). Chromosomes 2, 6, 8, 9, 10, 17, and 19 in the genomes of SCZ patients show a high frequency (hot spots) of CNVs (Fig. 5.2), suggesting that CNVs play an important role in the pathogenesis of schizophrenia spectrum disorders. Thus, searches for CNVs in linked regions may lead to insights into the genomic mechanisms that cause schizophrenia.
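The de novo CNV comparison quoted above (15 of 152 sporadic cases versus 2 of 159 controls) can be checked with a small calculation. The sketch below is only an illustration: it assumes the reported p value came from a two-sided Fisher's exact test, which the text does not state, and the function and variable names are ours.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, row1)

    def prob(k):
        # Hypergeometric probability of k carriers in row 1, margins fixed.
        return comb(col1, k) * comb(total - col1, row1 - k) / denom

    p_observed = prob(a)
    k_min = max(0, col1 - row2)
    k_max = min(col1, row1)
    # Sum every table that is no more probable than the observed one.
    return sum(
        p for p in (prob(k) for k in range(k_min, k_max + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Counts taken from the text: carriers and non-carriers of de novo CNVs.
cases_with, cases_total = 15, 152
controls_with, controls_total = 2, 159

rate_ratio = (cases_with / cases_total) / (controls_with / controls_total)
p_value = fisher_exact_two_sided(
    cases_with, cases_total - cases_with,
    controls_with, controls_total - controls_with,
)
print(f"rate ratio = {rate_ratio:.1f}, two-sided p = {p_value:.5f}")
```

The rate ratio comes out close to eight, in line with the "eight times more often" statement, and the p value falls well below 0.01.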

Fig. 5.1 Schematic representation of the SNP (a) and CNV (b)—deletions and duplications in chromosome

[Fig. 5.2 plot: panels (a) and (b); y-axis in kb, 400000 to 1100000; x-axis chromosomes (Chrs) 1 to 22; series SCZ_LNG and N_LNG in panel (a), SCZ and N in panel (b)]

Fig. 5.2 Genome-wide segment lengths of ROH (a) and CN (b) among affected (SCZ) and healthy (N) pedigree members, summarized across the studied isolates. Stars mark chromosomes with linkage that was reliable across the isolates and with higher LOD values in our linkage analyses

Among CN variants, gains (microduplications) are about twice as frequent as microdeletions: of the 205 CNs we obtained in autosomes, 129 (63 %) are gains and 76 (37 %) are losses. Statistically significant gains with p ≤ 0.05 were found in chr 8, 9, and 17, and significant losses in chr 17 (Fig 5.3). Homozygous losses were observed 6 times (3 %) in SCZ cases and heterozygous losses 70 times (34 %), while homozygous gains were observed 110 times (54 %) and heterozygous gains 19 times (9 %). These results support our previous findings in affected ID and MDD cases from different genetic isolates of Dagestan (Bulayeva et al. 2011, 2012, 2015).
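The percentages in this paragraph are all taken relative to the 205 autosomal CN segments, and the subtype counts should add up to the 129 gains and 76 losses. A minimal sketch of that bookkeeping follows; it assumes that the homozygous-gain count is whatever remains of the 129 gains once the 19 heterozygous gains are subtracted, and the variable names are illustrative only.

```python
# Counts quoted in the text for autosomal CN segments in SCZ cases.
total_cn = 205
gains, losses = 129, 76
hom_loss, het_loss = 6, 70
het_gain = 19
# Assumption: homozygous gains are the remainder of the gain total.
hom_gain = gains - het_gain


def pct(count, total=total_cn):
    """Share of all CN segments, rounded to a whole percent."""
    return round(100 * count / total)


# Subtype counts must reproduce the gain and loss totals.
assert hom_loss + het_loss == losses
assert hom_gain + het_gain == gains
assert gains + losses == total_cn

for label, count in [
    ("gains", gains), ("losses", losses),
    ("homozygous losses", hom_loss), ("heterozygous losses", het_loss),
    ("homozygous gains", hom_gain), ("heterozygous gains", het_gain),
]:
    print(f"{label}: {count} ({pct(count)} %)")
```

Under that assumption the six percentages are 63, 37, 3, 34, 54, and 9 %, which are the values reported in the paragraph.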

[Fig. 5.3 plot: y-axis in %, -2 to 18; x-axis chromosomes (Chrs) 1 to 17, 19, 21, and 22; series Gain and Loss]

Fig. 5.3 Variation (in %) by chromosomes of CN gains and losses among observed SCZ patients: the differences between groups are statistically significant, χ² = 32.385, df = 19, p = 0.02833

Studying the isolate pedigrees makes it possible to distinguish inherited CNVs from de novo mutations, because all patients with a clinically homogeneous phenotype usually descend from a common ancestor. A copy number variant found in one patient but not detected in other affected members of the same ancestry is treated as a sporadic or de novo mutation, and its significance is verified against the Database of Genomic Variants (DGV, http://projects.tcag.ca/variation/; dgv18v6). Copy number variations not contained in the DGV can be identified as de novo variants. CNVs in genes associated with morphological and functional organization in specific parts of the brain support the relevance of CNVs to the pathogenesis of mental disorders, irrespective of their sporadic or inherited nature.

The level of heterozygosity varies with sex; the average heterozygosity in patients with schizophrenia is 24.375 in men and 25.635 in women, a statistically significant difference between the groups: t = 2.174, df = 16, p = 0.045. Differences between the sexes in the other parameters of genomic variation are statistically insignificant, so those parameters can be analyzed without division by sex; examples include the average number of variations per individual genome, the size of segments with copy number variation, and the number of markers in these segments.

Previous studies found 1.41 deletions and duplications per SCZ patient from European populations, compared to 0.99 CN per person in the control group (Walsh et al. 2008; Kirov et al. 2008). In our group of SCZ patients, the mean number of CN was 2.23 (Table 5.1), 1.6 times higher than the average value of 1.41 for similar patients in European populations. The experimental design of our study may contribute to the high average CN values in our patient sample. We purposefully chose a patient sample from genetically homogeneous isolates with high aggregations of schizophrenia; the European studies cited above used a plurality of heterogeneous European populations.


Table 5.1 Whole genome (autosomes) levels of heterozygosity and statistics of genome-wide copy number variations (CNV) in patients examined from genetic isolates, based on AFFX GTC evaluations of AFFX SNP 6.0 microarray data

                           CN              Size (Kb)           Number of markers
ID      Gender  Het_rate   Mean    SD      Mean    SD          Mean    SD
DGH01   Male    25.923     2.654   0.846   290.1   262.442     96.54   131.46
DGH02   Male    24.716     1.75    1.389   743.8   1341.95     748.1   1869
DGH03   Male    24.316     2.694   0.889   217.9   187.098     83.25   73.646
DGH04   Male    25.224     1.727   1.009   216.7   134.08      65.36   38.985
DGH05   Male    24.044     2.077   1.256   261.2   260.99      66.77   61.391
DGH06   Male    24.388     1.818   1.168   316.9   192.523     107.5   97.236
DGH07   Male    23.348     2.2     1.095   242.4   256.594     148.6   224.36
DGH08   Male    26.180     2.2     1.095   285.8   353.444     59.2    28.735
DGH09   Male    24.285     2.333   1.033   300.2   344.585     62.33   39.778
DGH10   Male    23.191     2.4     1.265   285.8   316.299     76.2    64.092
DGH11   Female  24.551     2.167   1.267   279.3   269.052     72.58   48.861
DGH12   Female  25.902     3.01    1.095   300.2   284.253     67.17   24.02
DGH13   Female  26.295     1.625   1.188   175.9   83.7606     60.88   38.461
DGH14   Female  25.648     2.333   1.323   474.8   458.939     181.4   304.08
DGH15   Male    24.591     2.5     1.309   170.5   88.4631     59.88   33.901
DGH16   Female  25.779     3.083   0.793   376.3   388.995     78.58   91.036
DGH17   Male    25.130     1.286   0.756   181.9   90.7605     79.43   30.386
DGH18   Male    21.544     2.167   1.267   311.4   195.45      114.8   71.529
Total average   24.723     2.322   1.126   290.8   363.329     113.1   383.02
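The sex comparison of heterozygosity reported in the text (t = 2.174, df = 16) can be recomputed from the Het_rate column of Table 5.1. The sketch below is a minimal illustration that assumes the reported statistic is a pooled-variance (Student) two-sample t-test, which is what df = 16 for 13 men and 5 women suggests; the function name is ours, not the authors'.

```python
from math import sqrt
from statistics import mean, variance

# Het_rate values copied from Table 5.1, grouped by the Gender column.
het_male = [25.923, 24.716, 24.316, 25.224, 24.044, 24.388, 23.348,
            26.180, 24.285, 23.191, 24.591, 25.130, 21.544]
het_female = [24.551, 25.902, 26.295, 25.648, 25.779]


def pooled_t(x, y):
    """Student's two-sample t statistic with pooled variance; returns (t, df)."""
    n1, n2 = len(x), len(y)
    df = n1 + n2 - 2
    # statistics.variance is the sample variance (n - 1 in the denominator).
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / df
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(y) - mean(x)) / standard_error, df


t_stat, df = pooled_t(het_male, het_female)
print(f"men {mean(het_male):.3f}, women {mean(het_female):.3f}, "
      f"t = {t_stat:.3f}, df = {df}")
```

With the table values, the group means and the t statistic agree with the figures quoted in the text to within rounding.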

The average CN number across whole-genome autosomes varies between individuals from 1.3 to 3.1, with average individual CN segment sizes ranging from 176 to 743.8 kb (Table 5.1). Within certain chromosomes, CN segment lengths varied; the minimal length we obtained was 101 kb (one affected subject, at 5q12) and the maximum was 4040 kb (one affected subject, at 8p23.2). Other maximum sizes ranged from 1015 to 1517 kb in a limited number of affected subjects, in regions on chromosomes 4, 14, 15, and 17.

Genome-wide studies of copy number variations in SCZ patients showed a significant increase of variations in patients when compared to the control sample: χ² = 133.1, df = 21, p < 0.001; Rs = 0.134, t = 4.75, p < 0.001. Relatively high genomic stability in examined patients, compared to healthy subjects, was found in chromosomes 2, 3, 5, 7, 11, and 13. Among affected cases, fewer CNs were found in the genomic regions linked with SCZ on chromosomes 18 and 20. This does not exclude, however, the possible presence in patients of variations smaller than 1 kb that affect functional or regulatory regions of specific genes involved in the pathogenesis of the studied pathology. Figure 5.4 compares the results of our CNV study in the 1q21.1 and 15q11.2 regions with the data from Stefansson et al. (2008).
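The segment sizes above reflect the calling criteria given at the start of this section (a size threshold of 100 kb and at least 20 SNPs per segment). The following sketch shows how such a filter and the copy-number states used in Sect. 5.2 (deletion = 0 or 1, duplication = 3 or 4) could be applied to segment records. It is not the SVS or GTC implementation: the `Segment` record, the thresholds read as minimums, and the example coordinates are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

MIN_SIZE_BP = 100_000   # assumed minimum segment size (100 kb)
MIN_MARKERS = 20        # assumed minimum number of SNPs in the segment

# Copy-number states as used in the text: 2 copies is the normal state.
STATE = {
    0: "homozygous loss",
    1: "heterozygous loss",
    3: "heterozygous gain",
    4: "homozygous gain",
}


@dataclass
class Segment:
    """Hypothetical record for one array-derived copy-number segment."""
    chrom: str
    start: int        # 1-based, inclusive
    end: int          # 1-based, inclusive
    copy_number: int
    n_markers: int

    @property
    def length_bp(self) -> int:
        return self.end - self.start + 1


def call_cnv(seg: Segment) -> Optional[str]:
    """Return the CNV class of a segment, or None if it is not called."""
    if seg.copy_number == 2:
        return None  # normal diploid state
    if seg.length_bp < MIN_SIZE_BP or seg.n_markers < MIN_MARKERS:
        return None  # too short or too sparsely covered to be reported
    return STATE.get(seg.copy_number)  # None for states outside 0-4


# Made-up segments, for illustration only.
examples = [
    Segment("5", 1, 101_000, 1, 25),
    Segment("8", 1, 4_040_000, 3, 900),
    Segment("2", 1, 60_000, 0, 40),
]
for seg in examples:
    print(seg.chrom, seg.length_bp, call_cnv(seg))
```

A filter of this kind would explain why no reported segment is shorter than about 100 kb, while variants below the threshold remain undetected.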


Fig. 5.4 CNV in 1q21.1 and 15q11.2 regions previously reported (Stefansson et al. 2008) (A1, B1) and in SCZ affected subjects from Dagestan genetic isolates (A2, B2). Segments with CNV and ROH were obtained in the same regions: in 1q21.1, among the genomes of 20 affected cases, we found segments with ROH in 13 genomes, deletions in 7, and gains in 3; in 15q11.2 we found segments with 7 ROH, 5 deletions, and 6 gains


Samples of 83 patients with childhood-onset schizophrenia (COS), a rare form of schizophrenia that manifests during the prepubertal period, revealed a twofold excess of CNVs in patients compared to the healthy control group (28 versus 13 %, p = 0.03). De novo CNVs furthermore appeared in 27 patients (Walsh et al. 2008).

Differences between genomic research technologies can make it difficult to interpret the biological significance of identified structural changes in the genome. A typical example is copy number variation of the CSMD1 gene. DGV shows 49 CNVs within CSMD1 (with an average size of 347 kb and a median size of 9560 bp), including 7 duplications from HapMap cell lines and 5 CNVs in one or more CSMD1 exons (12/49, 24.5 %). These results could suggest that the structure of CSMD1 varies frequently and that CSMD1 is therefore not necessarily involved in pathogenesis. A recent study (Shaikh et al. 2009), however, identified 507 CNVs within the CSMD1 region, with an average size of 7535 bp and a median size of 3445 bp. Only four of the CNVs identified in this region (0.8 %) include exon abnormalities; the study did not identify the extensive segment duplications reported in HapMap on the basis of cell cultures, which the authors attribute to an in vitro artifact produced in cell cultures. Similar discrepancies in the number and location of CNVs were established in other genomic regions containing candidate genes for various diseases, so the genomic technologies used must be analyzed carefully when interpreting the biological significance of the detected variations in gene copy number.

5.2 Effect of Inbreeding on CNV and ROH Segments Sizes (Kb) and on Marker Numbers

We analyzed the effect of inbreeding on CNV polymorphisms. The results show that patients with homozygous mutations (deletion = 0, duplication = 4) and lower inbreeding accumulation have significantly larger CNV segments and fewer markers in the segments than similar patients with homozygous mutations who have relatively high levels of inbreeding accumulation (Fisher's test for segment size, F = 18.2, p = 0.005; for the number of markers, F = 7.32, p = 0.03) (Fig. 5.5). The data suggest that homozygous copy number variations whose origin is influenced by inbreeding are significantly more likely to be intergenic and not to affect the function of genes involved in morphological and functional brain organization and the pathogenesis of schizophrenia. It is unclear whether this is typical for patients with schizophrenia or characteristic of the evolutionarily formed genomic mechanisms of the adaptive genetic structure of the studied isolates; the issue requires further investigation.

The Association Between Inbreeding Coefficient and CNV

The average inbreeding coefficient (F) in the group with homozygous CNV (deletion = 0; duplication = 4) is x1 = 0.02288, and in the group with heterozygous CNV (deletion = 1; duplication = 3) it is x2 = 0.0192. The difference is significant: t = 1.97, df = 192, p = 0.050 (Figs. 5.5, 5.6). The data indicate that the frequency of molecular homozygous mutations rises as the level of inbreeding of the examined individuals increases.

Fig. 5.5 The association between the levels of inbreeding and the average size of segments and the number of markers in patients with homozygous CNV (deletion = 0, duplication = 4)

[Fig. 5.6 plot: y-axis F, 0.008 to 0.036; x-axis CNV type (Homozygotic, Heterozygotic); markers show Mean, ±SE, and ±SD]

Fig. 5.6 The association between inbreeding coefficient and CNV: the frequency of homozygous variations of the number of copies is greater with higher inbreeding coefficients


5.3 Cross-Isolate Study of Structural Genomic Variations in Linked with Schizophrenia Regions

We also tested for the effect of common copy number variants (CNVs) and ROH among all affected cases using a set of AFFX SNP 6.0 markers. The results show that affected patients have 160 CNVs and 105 ROHs in genomic regions linked with schizophrenia (Table 5.2). The number of duplications in linked regions (90) is higher than the number of deletions (70), so duplications outnumber deletions in the linked regions by about 30 %.

In the SCZ-linked region 2p22.3-p21 in isolate DGH011 (LOD = 3.1), we found a deep homozygous deletion in the gene CRIM1 (Fig 5.7). CRIM1 (*606189, Cysteine-Rich Transmembrane BMP Regulator 1) encodes a transmembrane protein containing six cysteine-rich repeat domains and an insulin-like growth factor-binding domain. The encoded protein plays a role in tissue development through interacting with members of the transforming growth factor beta family, such as bone morphogenetic proteins (provided by RefSeq, Nov 2010). It was shown that CRIM1 may interact with growth factors implicated in motor neuron differentiation and survival (Kolle et al. 2000). We found no information on a role of CRIM1 in the pathogenesis of neurodevelopmental disease; apart from its association with a form of syndactyly (OMIM:227210), its function is still unknown.

Genomic region 6p21.2-p23, linked with SCZ, contains genes that span the major histocompatibility complex (MHCI, MHCII); we found significant linkages with SCZ in three of four genetic isolates. A study of CNV in the linked 6p21.2-p22.3 region, which has a highly reliable LOD = 4.3 in isolate DGH034 of ethnic Tindals that overlaps with LOD = 3.0 and LOD = 2.4 in isolates of ethnic Dargins, found deletions in a total of 6 patients and duplications in 3 patients at 6p21.32 (chr6: 32,506,692-32,805,565 = 298,874 bp). Within the linked region at 6p22.1, we obtained one CNV segment with a length of 290,719 bp, involving the extended MHCI region with the genes HIST1H2BJ, PRSS16, PGBD1, and NOTCH4 (Fig. 5.7). Six affected SCZ cases demonstrated deletions in this region and two others showed gains. In the MHCI genomic region, we also found a ROH segment common to all SCZ cases (chr6:26,689,985-29,642,806), 2,952,822 bp in length, that includes the genes HIST1H2BJ, PRSS16, PGBD1, and NOTCH4, as well as many other flanking genes (Fig. 5.7). Previous publications show the involvement of these genes and genomic variants within them in the pathogenesis of SCZ (Stefansson et al. 2009; International Schizophrenia Consortium 2008). Findings in the MHC region were shown to be consistent with an immune component to schizophrenia risk (Stefansson et al. 2009).

We found a second hot spot in the major histocompatibility complex class II region, MHCII (chr6: 32,510,228-32,742,465), 232,238 bp in length, with the genes HLA-DRB1-6, HLA-DQA1-2, HLA-DQB1, and TAP2, which are related to autoimmune disease. Seven SCZ cases demonstrated deletions in this region, and six other cases showed gains. No ROH was found in the MHCII region with CNV (Fig. 5.8).
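The segment lengths quoted above follow from inclusive genomic coordinates (end minus start plus one), and the statement that no ROH coincides with the MHCII CNV hot spot is an interval-overlap check. The sketch below illustrates both with the three chr6 intervals given in the text; the helper names and the parsing format are our own assumptions, not part of the authors' pipeline.

```python
import re


def parse_region(text):
    """Parse 'chr6: 32,506,692-32,805,565' into (chrom, start, end)."""
    match = re.fullmatch(r"chr(\w+):\s*([\d,]+)-([\d,]+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized region: {text!r}")
    chrom, start, end = match.groups()
    return chrom, int(start.replace(",", "")), int(end.replace(",", ""))


def length_bp(region):
    """Inclusive length of a (chrom, start, end) interval in base pairs."""
    _, start, end = region
    return end - start + 1


def overlaps(a, b):
    """True if two intervals lie on the same chromosome and share a base."""
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


# Intervals as written in the text.
cnv_6p21_32 = parse_region("chr6: 32,506,692-32,805,565")
roh_mhc1 = parse_region("chr6:26,689,985-29,642,806")
cnv_mhc2 = parse_region("chr6: 32,510,228-32,742,465")

for name, region in [("6p21.32 CNV", cnv_6p21_32),
                     ("MHCI ROH", roh_mhc1),
                     ("MHCII CNV", cnv_mhc2)]:
    print(f"{name}: {length_bp(region):,} bp")

print("ROH overlaps MHCII CNV:", overlaps(roh_mhc1, cnv_mhc2))
```

The same two helpers can be reused to compare any CNV segment with the ROH segments of the same chromosome.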

[email protected] 5.3 Cross-Isolate Study of Structural Genomic Variations in Linked with... 113

Table 5.2 Common number of CNV duplications, deletions, and ROH obtained in affected SCZ cases in linked genomic regions

Isolates | Linked region | # Gain Loss ROH | Genes
DGH005/DGH022 | 1p13.3-q23.3 | 6 6 5 | GSTM1, GSTM2, GSTM5, NTNG1, NBPF4, GJAS, RGS4, CDY8, NOSIAP2
DGH011 | 1p36.1-p35.2 | 4 2 11 | HTR1D, PLA2G2A, HTR6, LAPTM5, PIGV, MTHFR
DGH005 | 2q24.1-q32.1 | 9 | EXTL2P1, GORASP2, STK39, FIGN, ZNF804A
DGH022 | 2q36.3-q37.1 | 4 2 | PID1, AGFG1, GIGYF2, AGFG1, C2orf83, SPHKAP, GIGYF2, HTR2B
DGH011/DGH064 | 2p22.3-p21 | 2 4 5 | TADA, OXER1, CRIM1, EGLN1, EF2B4
DGH064 | 3p23-p22.3 | 3 10 | GPD1L, NR, CKLFSF6, CMTM6
DGH005 | 3q28-q29 | | APOD, UTS2D, DLG1, TFRC, HRASLS, PAK2
DGH064 | 4q35.1-q35.2 | 1 3 | SLC25A4, MTRF1L, FRG1, FAT1, AC093900.2
DGH011 | 5p14.1-p13.3 | 4 | CDH9, CDH6, ZFR*, MTMR12*, GOLPH3
DGH005 | 5q14.1-q14.3 | | HOMER1, MEF2C
DGH064 | 5q35.1-q35.2 | 7 | FOXI1, KCNIP1, SH3PXD2B**, DUSP1
DGH005/DGH064/DGH022/DGH011 | 6p21.2-p23 | 6 3 14 | PRSS16, PGBD1, TAP2, NOTCH4, DCDC2, DTNBP1, HLA-DRB5, HIST1H2BJ, SLC17A3, ZNF165, FKBP5, FKBP51
DGH064 | 8p23.3-p23.1 | 6 5 | CSMD1, MCPH1, ANGPT2, MYOM2
DGH064 | 9p22.2-p21.3 | 4 8 | ELAVL2, CDKN2A, CDKN2B, TEK
DGH005/DGH022/DGH064 | 10q26.12-q26.13 | 5 | FGFR2, ATE1, CPXM2, HMX3, HTRA1, ADAM12, DOCK1, GPR26, CPXM2
DGH011 | 10p11.23-p11.21 | 3 5 | ARHGAP12, REM, ZEB1, NRP, ITGB1
DGH005 | 11p14.3-p13 | 3 5 | BDNF, KIF18A, DCDC5, WT1, PRRG4, FBXO3, TRIM44, ENF, C11orf46
DGH064/DGH011 | 11q23.3-q24.1 | 6 3 | SCZD2, GRIK4, SORL1, LR11, SORLA, KCNJ1, FLI1, DRD2, NCAM1, HTR3A, HTR3B, IL18, POU2AF1, NRGN, KIRREL3
DGH011 | 12q24.23-q24.32 | 4 3 10 | CIT, DYNLL1, P2RX7, SBNO1, CDK2AP1, CCDC60, TBX3, TMEM132D
DGH005/DGH022/DGH064/DGH011 | 17p13.3-q12 | 5 8 3 | VPS53, ACACA, ITGAE, ODF4, GABARAP, ARRB2, DLG4, PER1, VAMP2, NDEL1, MYOCD, TMEM132E, SL6A4, CCL2, PMP22, COX10
DGH022/DGH064 | 18p11.31 | 4 1 | DLGAP1, TGIF1, MYOM1
DGH011 | 18q12.1-q12.3 | 4 8 14 | DSG3, NOL4, FHOD3, BRUNOL4, RIT2, CDH2, NOL2
DGH005/DGH064/DGH022 | 19q13.31-q13.42 | 2 | KLK8, CD33, APOE, SLC1A5, ATP1A3
DGH064/DGH011 | 21q22.13-q22.2 | 4 1 | KCNJ6, ETS2, HUNK
DGH005/DGH064 | 22q11.1-q12.1 | 7 6 | CECR2, BSR, PRODH, TXNRD2, DGCR2, DGCR14, DGCR5, DGCR8, DGCR8, HIR, COMT
DGH011/DGH064 | 22q12.3-q13.1 | 5 5 7 | LARGE, APOL1-APOL4, CSNK1E, RBM9, MYH9, IL2RB, CACNG2, SOX10, DRG1, DDX17, SYN3, YWHAH, APOBEC3D
In total | | 90 70 105 |

Notices: For total copy number variation in observed isolates linked regions, Levene homogeneity test F = 4.06, p < 0.0093; ROH consequently, F = 3.34, p = 0.017
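The totals row of Table 5.2 allows two quick arithmetic checks. The Python sketch below uses only numbers printed in the table; the variable and function names are illustrative. It first confirms that the counts printed in the individual rows add up to the three totals combined. Rows that print fewer than three numbers are kept exactly as printed, with no assumption about which column (Gain, Loss, or ROH) each number belongs to. It then computes the gain-to-loss ratio that the Conclusions report as about 1.3.

```python
# Counts printed in each of the 26 rows of Table 5.2, in printed order.
# Rows with fewer than three numbers are NOT assigned to specific columns.
ROW_COUNTS = [
    (6, 6, 5), (4, 2, 11), (9,), (4, 2), (2, 4, 5), (3, 10), (), (1, 3),
    (4,), (), (7,), (6, 3, 14), (6, 5), (4, 8), (5,), (3, 5), (3, 5),
    (6, 3), (4, 3, 10), (5, 8, 3), (4, 1), (4, 8, 14), (2,), (4, 1),
    (7, 6), (5, 5, 7),
]

# Totals from the last row of Table 5.2.
TOTAL_GAIN, TOTAL_LOSS, TOTAL_ROH = 90, 70, 105


def gain_to_loss_ratio(gain: int, loss: int) -> float:
    """Ratio of duplications to deletions, rounded to two decimals."""
    return round(gain / loss, 2)


# Sum of every printed cell, independent of column assignment.
cell_sum = sum(sum(row) for row in ROW_COUNTS)

if __name__ == "__main__":
    print(cell_sum, TOTAL_GAIN + TOTAL_LOSS + TOTAL_ROH)
    print(gain_to_loss_ratio(TOTAL_GAIN, TOTAL_LOSS))
```

The first check only shows that the printed cells and the totals are mutually consistent; it cannot recover the column of a number in a row where cells are missing.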

Of these genes, significant associations and/or linkages with SCZ, and associated genomic variants, were previously reported for HLA-DRB1 and TAP2 (transporter 2, ATP-binding cassette, subfamily B), which encodes a membrane protein responsible for the transport of peptides from the cytoplasm (Stefansson et al. 2009; International Schizophrenia Consortium 2008). Our LOD values are much higher than previously reported, and CNV and ROH in this linked region were established for the first time as common to SCZ-affected individuals in ethnically and demographically diverse genetic isolates. The genetic mechanisms of these linkages, and the role in SCZ pathogenesis of their common structural variations in the linked region with certain genes in MHCI and MHCII, need further investigation.

A positive association with the A9 allele of the HLA histocompatibility gene was previously identified (Tsuang and Faraone 1995). The A9 allele is also associated with autoimmune diseases. Such epigenetic factors have an additive negative effect on the expression of candidate genes for schizophrenia. In particular, it was shown that viral infection of pregnant women (diphtheria, pneumonia, and influenza) and infectious diseases of the central nervous system in children increase the risk of schizophrenia fivefold compared to healthy children (Jones and Cannon 1998; Torrey 1997). The increased risk of developing schizophrenia after an infectious disease may be due to cross-reactivity of antibodies produced against viral components, which leads to abnormalities in the development of the nervous system; the viral infection itself can also impair the development and functioning of the nervous system (Jones and Cannon 1998). In people without genetic susceptibility these factors may lead to sporadic forms of schizophrenia, and in people with genetic susceptibility they increase the risk of mental illness (Tsuang 1998).

Fig. 5.7 CNV found in the CRIM1 gene in the SCZ-linked region 2p22.3-p21 in isolate DGH011 (LOD = 3.1)

Fig. 5.8 CNVs established in 3 isolates with the schizophrenia-linked 6p21-p22 region, with highly reliable LOD = 4.3 (DGH034), 2.92 (DGH022), and 2.3 (DGH011). The linked region contains candidate genes for schizophrenia: NOTCH4, HLA-DRB1, TNXB, TAP2, etc. Genes localized in the linked region had deletions in eight patients and duplications in three patients

Data on the relationship of antigens of the HLA system with schizophrenia are inconsistent (Mitkevich 1981; Grow et al. 1979; McGuffin and Owen 1991). Samples from the Swedish and Czech populations showed a positive association of HLA-A9 with paranoid schizophrenia (Eberhard et al. 1975; Iványi et al. 1976; Ivanyi et al. 1978). It was also shown that there was a statistically significant association of certain antigens and HLA haplotypes with the remission type for paroxysmal forms of schizophrenia: a favorable outcome often occurs in patients with antigens A1 and B8, and especially with B8, A1, A2, and B12 antigen combinations in the same patient. The combination of A2 and B5 antigens is much more common in patients with schizophrenia with a poor outcome (Mitkevich 1981). A statistically significant increase in the frequency of HLA-A10, AH, and A29 and a statistically significant reduction in the frequency of HLA-A2 were identified in a group of schizophrenic patients compared to healthy controls (Erkan et al. 1996). In other studies, these results were not reproduced (Wright et al. 1996).

The genetic heterogeneity of a complex disease such as schizophrenia, and gene pool differences between the human populations from which study groups were sampled, can explain some of the inconsistency of results across clinical forms of schizophrenia. Mitkevich (1981), for instance, found that the continuous and paroxysmal forms of schizophrenia have significant associations with different HLA antigens: the continuous form is commonly associated with HLA-A10, while the paroxysmal form is more often associated with HLA-B12. A significant increase of HLA-A1 was found in patients with disorganized schizophrenia, while Cw4 was found in patients with paranoid schizophrenia. The possible involvement of these antigens as genetic markers of three pathogenic subgroups of schizophrenia was hypothesized in order to explain the relationships between HLA antigens and schizophrenia (Ivanyi et al. 2007). The authors report that the linked 6p22.2-p22.1 region (chr6:26,229,132-26,359,580; about 130 kb), containing 82 markers, had a duplication including the HIST1H10, HIST1H4F, HIST1H3G, HIST1H2BI, HIST1H4H, and CR593845 genes.
We established such a duplication in the same genomic region in two patients (and deletions in six); our finding most likely reflects the recently identified role, in the etiopathogenesis of mental disorders including SCZ, of epigenetic changes that alter the expression of genes (Tang et al. 2011). These epigenetic changes in patients with neurodegenerative diseases and SCZ are connected to histones, the structural proteins that “wrap” the DNA strand and are responsible for supercoiled DNA formation. In some young and elderly patients, some areas of brain cell DNA were shown to be in the packed state and therefore not accessible for reading of genetic information, owing to a disorder of the acetylation mechanisms. The authors believe that treatment with histone deacetylase inhibitors (drugs that inactivate the molecules that catalyze the removal of acetyl groups from histones) may prevent the progression of histone acetylation disorders and, consequently, the development of the epigenetic mechanisms of the disease (Tang et al. 2011). Further studies are needed to find connections between the revealed duplication in histone genes and these epigenetic mechanisms.

We previously reported (Bulayeva et al. 2012) an extremely large deletion (4040 kb) at 8p23 in the CSMD1 gene, found in the genome of one patient from isolate DGH005 diagnosed with schizoaffective disorder. CSMD1 is involved in the formation of neural networks. In our study, duplications of this gene were found in only two patients, whereas four patients with schizoaffective disorders had deletions. Moreover, the deletion in one patient was 40 Mb, and the remaining deletions and duplications were 120–560 bp in size. In the same region we found no ROH segment common to SCZ-affected individuals, apart from a short ROH segment in one patient. A duplication in 8p23.1-8p23.2 was previously found in a child and his mother with speech delay and autism diagnosed by ICD-10 (Glancy et al. 2009). It was shown that duplication of the CSMD1 gene and of the adjacent microcephaly gene MCPH1 significantly alters their function and leads to autism and mental retardation. These results demonstrate the significant role of CSMD1 and MCPH1 in the etiopathogenesis of SCZ, in which mutations are likely associated with impaired mechanisms of neural connections in brain function and with cognitive impairment. Our results confirm the deletions in the CSMD1 gene at 8p23 reported in a study of SCZ patients and healthy subjects from Norway and Denmark (Håvik et al. 2011). Other studies found LD in the same 8p23.1 region among patients with bipolar disorder and SCZ from Costa Rica (Walss-Bass et al. 2006).

A genetic basis for autism (autism spectrum disorder, ASD) has been shown in a number of studies; however, the main genes that determine the disease, as well as the molecular and genetic mechanisms of their functioning, are unknown. To address these challenges, CNVs associated with autism were studied throughout the genome in 28 children and 62 adults; 38 CNVs were identified, and deletions in 8p23.1 and 17p11.2 were significantly more frequent in the patients with autism than in the control group (Cho et al. 2009). The study showed that the 8p23.1 deletion occurred in the DEFENSIN gene, whereas deletions in the 17p11.2 region were intergenic, as the region does not contain genes. The authors conclude that the identified deletions reflect the molecular mechanisms of autism pathogenesis (Cho et al. 2009).
To determine whether the identified copy number variations (CNVs) are hereditary or sporadic, we studied patients from different parts of the extensive pedigrees who carry the same type of mutation linked to the 8p23 locus. The CNV-screened affected members of the DGH005 pedigree branch are marked by circles (Fig. 5.9). These patients all descend from common ancestors. Our analysis showed that patients in the current generations received a common haplotype block of 120 kb from the common ancestor, even though the CNV segment sizes in five SCZ members of the pedigree varied considerably. This CNV segment contains about 30 genes, of which CSMD1, MCPH1, MSRA, and DEFENSIN have been shown to be linked or associated with SCZ, autism, or mental retardation (intellectual disability). The family study was conducted for a number of deletions and duplications found in the linked regions. The results support the view that the identified CNVs are hereditary, not sporadic.

Fig. 5.9 Segments with copy number variations in the schizophrenia-linked 8p23 region. Five patients have segments with deletions and six patients have duplications. We found no ROH segment in the SCZ-linked region

Fig. 5.10 Recurrent CNVs found in five patients with a common ancestor within the linked 8p23 region

In the SCZ-linked region 9p22.2-p21.3 of isolate DGH034 (LOD = 2.6), we found homozygous deletions and heterozygous duplications in the ELAVL2 gene (Fig. 5.10). ELAVL2 (ELAV Like Neuron-Specific RNA Binding Protein 2) is a protein-coding gene associated with skin dermatitis. Recent GWA reports showed a nominal association of ELAVL2 with schizophrenia in a Chinese population (p = 0.026). The genetic mechanism of the ELAVL2 association is still under study. Our CNV results for ELAVL2 show that eight SCZ patients have deletions and four have gains. With the exception of three patients, all of these SCZ cases (nine) are from isolate DGH034, supporting the importance of interpopulation polymorphism in CNV (CNVP). No common ROH segments were obtained in the linked region 9p22.2-p21.3.

The linked regions of chromosome 10 (Table 5.2) did not show significant common CNV events or ROH segments. In the linked region 11q23.3-q24.2 of pedigrees from isolates DGH034 and DGH011, we found mutations within the KIRREL3 gene, mutations in which lead to autosomal dominant mental retardation (MRD4; OMIM 607761). Among severely affected SCZ cases with cognitive impairments from isolates DGH034 and DGH011, six carried a gain and three carried deletions.

The search for CNVs in the SCZ-linked 17p13-q12 region established a CNV “hot spot”: 9 of 19 schizophrenia patients carried CNVs (5 deletions and 4 duplications). A second “hot spot” of genomic instability was found in 17q21.31, confirming the findings of other researchers; 17q21.31 was also found to have more CNVs in healthy subjects (Fig. 5.5). The linked 17p13-q12 region (LOD = 3.7–2.4, total LOD = 5.3) contains the ARRB2, DLG4, TMEM132E, PER1, VAMP2, NDEL1, SLC6A4, and CCL2 genes, which several studies have shown to participate in the pathogenesis of schizophrenia, autism, and other psychopathology. The TMEM132E gene, linked with SCZ in our genetic isolates, is located in the 17q11.2 region and is involved in a heterozygous deletion. This deletion involves the NF1 gene, in addition to contiguous genes in its flanking regions, as observed in one patient with NF1 and 17q11.2 microdeletion syndrome [MIM:162200]. The NF1 microdeletion syndrome is often characterized by a more severe phenotype than that observed in the majority of NF1 patients.
Indeed, patients with the NF1 microdeletion often show variable facial dysmorphism, mental retardation, developmental delay, and an excessive number of neurofibromas. Abnormalities in monoamine metabolism are associated with the pathophysiology of SCZ, BPD, MDD, and suicide. The protein encoded by the SLC6A4 gene terminates the action of serotonin and recycles it in a sodium-dependent manner. A repeat length polymorphism in the promoter of SLC6A4 has been shown to affect the rate of serotonin uptake and may play a role in sudden infant death syndrome, aggressive behavior in Alzheimer’s patients, and depression susceptibility in people experiencing emotional trauma (provided by RefSeq, Jul 2008). No common ROH was obtained for the linked region.

Fig. 5.11 CNV (a, 10 patients) and ROH (b, 7 patients) “hot spots” obtained among SCZ cases in the ELAVL2 gene (9p21.3) confirm the genomic instability reported on the DGV site

Deletions in 22q11, of approximately 3 Mb, cause the 22q11 deletion syndrome (22q11DS), which includes velocardiofacial syndrome (VCFS) and DiGeorge syndrome. Patients with these syndromes show multiple morphological, cognitive, psychotic, and cardiovascular anomalies. About 25–30 % of these patients have severe symptoms of schizophrenia (Fig. 5.11). We found some SCZ patients with CNVs located close to VCFS phenotypic genes in isolates DGH005 and DGH034 (first peak). In seven of our patients we found a rare duplication of 153,280 bp in the CECR2 gene, located in the 22q11.21 region (1–9/100,000 in Europe). Cat Eye Syndrome (CES), or Schmid–Fraccaro syndrome, has autosomal dominant inheritance. CES is neonatal, and affected individuals have a normal life expectancy. The CECR2 gene has also been found to be associated with brain disorders such as exencephaly and neural tube defects and to be involved in the genetic pathway of VCFS (Fig. 5.12). Seven of our patients showed duplications in the SLC25A18 gene, located in 22q11.21 and associated with CES. These findings suggest that the duplications we found act through a gene dosage effect and play an important role in the etiopathogenesis of brain disorders. Such duplications, however, were detected in only 6 out of 20 patients, which leaves open the specific clinical and neurobiological mechanisms behind the SLC25A18 duplication association; we will analyze these in a further study. In the schizophrenia-linked 22q11.2-q22.2 region, we found common deletions in the FAMM108A1, GSTTP1, and GSTTP2 genes. We did not find any involvement of these genes in brain disease pathogenesis.

[Plot for Fig. 5.12: LOD score (vertical axis, 0–6) against map position in cM (horizontal axis, 0–120); curves for DGH022, DGH005, DGH064, and DGH022+DGH005+DGH064]

Fig. 5.12 CNV “hot spot” in the linked 17p11-p12 region and in 17q21.31

Fig. 5.13 Duplication in the CECR2 and SLC25A18 genes found in 7 SCZ patient genomes in isolates DGH005 and DGH034 at 22q11.2-q12.1. Six duplications and five deletions in the genes CACNG2, PVALB, and IFT27, as well as deletions in the genomes of six patients and duplications in three patients in the LARGE gene, were obtained within the second linkage peak in isolates DGH034 and DGH011 at 22q12.3

The 22q12.3-q13.1 region contains the second linkage peak in isolate DGH034 and the pedigree linkage from isolate DGH011 (Fig. 5.13). Overall at 22q12.3-q13.1, the LOD for both kindreds was 5.235 (peak = 27.9; flanking loci D22S685 and D22S683), and α = 1 confirms 100 % homogeneity of the linked genomic loci. The plausible linkage interval harbors the genes LARGE, APOL1-APOL4, CSNK1E, RBM9, MYH9, IL2RB, CACNG2, SOX10, DRG1, DDX17, SYN3, and YWHAH. These genes were previously reported as associated or linked with cognitive impairment and with neurodevelopmental and neurodegenerative diseases. In the linked region 22q12.2-q13.1, CNV analyses showed five common SCZ deletions in the LARGE gene, with markers rs8141384 and rs8140012 (Fig. 5.13). In the same linked region, we obtained six duplications and five deletions in the genes CACNG2, PVALB, and IFT27. The protein-coding gene CACNG2 (Calcium Channel, Voltage-Dependent, Gamma Subunit 2) is associated with autosomal dominant nonsyndromic intellectual disability, epilepsy, and SCZ. The protein encoded by CACNG2 is a type I transmembrane AMPA receptor regulatory protein (TARP). TARPs regulate both trafficking and channel gating of the AMPA receptors and are part of a diverse eight-member protein subfamily of the PMP-22/EMP/MP20 family. This gene is a susceptibility locus for schizophrenia (provided by RefSeq, Dec 2010). PVALB encodes a high-affinity calcium ion-binding protein that is structurally and functionally similar to calmodulin and troponin C and is thought to be involved in muscle relaxation (provided by RefSeq, Jul 2008). Diseases associated with PVALB include SCZ, as well as subcortical band heterotopia and gangliocytoma. IFT27 (Intraflagellar Transport 27) encodes a GTP-binding protein that is a core component of intraflagellar transport complex B, and its similarity to the Chlamydomonas protein indicates that IFT27 functions in cell cycle control. Alternative splicing of this gene results in multiple transcript variants (provided by RefSeq, Jan 2012). IFT27 is associated with Bardet–Biedl syndrome 19.

Seven patients from the studied isolates have ROH segments in the same 22q12.3-q13.1 region. Our data confirm several results from previous studies indicating that rare structural genetic variation (CNV), in the form of deletions or duplications, may significantly affect the pathogenesis of schizophrenia.

Several mechanisms, primarily associated with replication and recombination in meiosis, cause genomic aberrations. Locus-specific de novo mutations occur between 100 and 10,000 times more often in CNVs than in SNPs. Study of the CNV segments identified in the examined isolates showed that eight segment mutations in four patients were de novo CNVs (Table 5.3). Five of these eight mutations were found on the X chromosome. All of the patients carrying them were males with an XY genotype, and double or triple copies of X chromosome segments 121–184 kb in size could not affect the patient phenotype. Patient DGH004, who carried three of the five X chromosome mutations, had a familial connection, through the paternal grandmother’s generation, with an aggregation of infertility from a neighboring isolate. One of our isolate studies found a high aggregation of infertile males and females in a nuclear family of cousins; the descendants of the family included five daughters. Three of the daughters acquired a male phenotype during puberty and were subsequently diagnosed with pseudo-hermaphroditism by the Republican Hospital. Our genomic studies showed that all three sisters had an XXY genotype.

Table 5.3 De novo mutations detected in the studied patients from Dagestan isolates

Nos | CN | Loss_Gain | Locus | Size (Kb) | Number of markers | Start | End | Genes
DGH004 | 3 | Gain | 1q32.3 | 161 | 96 | 211149115 | 211310208 | VASH2, FLJ12505, ANGEL2, RPK118, RPS6KC1
DGH007 | 1 | Loss | 5q12.1 | 101 | 53 | 60246428 | 60347260 | NDUFAF2
DGH003 | 3 | Gain | 7q21.2 | 107 | 44 | 91913419 | 92020282 | ANKIB1
DGH007 | 2 | Gain | Xq21.31 | 121 | 32 | 91195197 | 91316310 | PCDH11X
DGH004 | 3 | Gain | Xp22.33 | 184 | 34 | 1479874 | 1663566 | PP1164, ASMTL, P2RY8, IL3RA, SLC25A6
DGH0024 | 3 | Gain | Xp22.33 | 128 | 10 | 1415800 | 1543540 | IL3RA, ASMTL, P2RY8
DGH004 | 3 | Gain | Xp22.33 | 124 | 13 | 109805 | 233595 | GTPBP6, PPP2R3B, PLCXD1
DGH004 | 2 | Gain | Xq24 | 149 | 86 | 118665907 | 118814421 | SEPT6
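The Size (Kb) column of Table 5.3 can be recomputed from the Start and End coordinates, and the counts given in the text (eight de novo CNVs in four patients, five of them on the X chromosome, three of those in patient DGH004) can be checked against the rows. The Python sketch below does both. The tuple layout and function name are illustrative choices. The end coordinate of the 5q12.1 loss is taken as 60347260, the value consistent with the listed size of 101 Kb; treat it as an assumption.

```python
# Rows of Table 5.3: (patient, locus, listed size in Kb, start, end).
TABLE_5_3 = [
    ("DGH004", "1q32.3", 161, 211149115, 211310208),
    ("DGH007", "5q12.1", 101, 60246428, 60347260),  # end coordinate assumed
    ("DGH003", "7q21.2", 107, 91913419, 92020282),
    ("DGH007", "Xq21.31", 121, 91195197, 91316310),
    ("DGH004", "Xp22.33", 184, 1479874, 1663566),
    ("DGH0024", "Xp22.33", 128, 1415800, 1543540),
    ("DGH004", "Xp22.33", 124, 109805, 233595),
    ("DGH004", "Xq24", 149, 118665907, 118814421),
]


def size_kb(start: int, end: int) -> int:
    """Segment size in Kb, rounded to the nearest whole Kb."""
    return round((end - start) / 1000)


# Recompute every size and compare with the listed value.
mismatches = [
    (patient, locus)
    for patient, locus, listed, start, end in TABLE_5_3
    if size_kb(start, end) != listed
]

x_linked = [row for row in TABLE_5_3 if row[1].startswith("X")]
patients = {row[0] for row in TABLE_5_3}
dgh004_on_x = [row for row in x_linked if row[0] == "DGH004"]

if __name__ == "__main__":
    print(mismatches)
    print(len(TABLE_5_3), len(patients), len(x_linked), len(dgh004_on_x))
```

An empty `mismatches` list means every listed size agrees with its coordinates to the nearest Kb.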

Fig. 5.14 Summary of the significant genome-wide linkages obtained in four genetic isolates (colored vertical lines), with the CNVs (deletions and gains) and ROH found in the linked regions. Results for the X and Y chromosomes are not presented

It is important to establish genomic regions reliably linked to SCZ that are common to diverse human populations. Each of these linked regions comprises 5–30 genes and requires more detailed analysis to solve the problem of “missing heritability,” whereby studies using association methods or linkage analysis can determine mechanisms of etiopathogenesis only 15–17 % of the time. A genome-wide scan examining 300,000 SNPs has been estimated to account for 45 % of the genetic variance. In this regard, we used the 500K SNP technique to perform a genome-wide scan of patients from four isolates for common structural variants in every isolate-specific linked genomic region. Together with the genome-wide scan, based on 400 microsatellites, of pedigree members ascertained in rare genetic isolates with high aggregation of complex diseases such as SCZ, this allowed us to gain insights into the genomic mechanisms of linked regions and to determine the presence of structural genomic variation within them.

Figure 5.14 summarizes the results obtained with this experimental design from patient genomes of different isolates, indicating linkages established on the basis of STR loci and duplications, deletions, and ROH established on the basis of SNPs. The data presented in Fig. 5.14 demonstrate population-specific and common genomic linkages and structural genome variations specific to schizophrenia spectrum disorders in different populations, and are analyzed in detail above.

Previous studies showed that, in sporadic cases of SCZ, familial (inherited) CNVs were found only 1.5 times more frequently (p = 0.049) than in the control group, and that sporadic CNVs were not always associated with SCZ pathogenesis. Our study supports ancestral effects of genetic drift and revealed that founders are the common ancestors of patients in genetic isolates with homogeneous complex disorders. This supports the presence of a common ancestor among these patients within isolates, from whom present-day patients inherit haplotype blocks carrying pathogenic factors. Our results furthermore support the greater genomic and clinical homogeneity of demographically old primary isolates compared to demographically young secondary isolates. Inbreeding affected both incidence and age of manifestation in primary and secondary isolates. Descendants of consanguineous marriages therefore have a higher risk of developing the disease than descendants of non-consanguineous marriages.

The genome-wide scan of linkages with SCZ in our four genetic isolates established linkages that are invariant to ethnogenomic subdivision and demographic age, as well as population-specific linkages related to ancestral effects and ethnogenomic subdivision. Common patterns of linkage, invariant to the ethnogenomic subdivision of isolates, rest on the fact that the complex structure of the human brain is strongly shaped by genetic influences that can lead to abnormal behaviour and disease (Hibar et al. 2015).
This common clinical nature of SCZ in different ethnic groups worldwide, established by Kraepelin, goes along with the interpopulation similarities and differences in genetic linkages and in structural genomic variants that we demonstrated among the indigenous ethnic populations of Dagestan. At the same time, the common genomic loci linked with the same brain pathology (SCZ, MDD, etc.) may result from an ancestral proto-population common to these ethnic groups, which, as established in our genetic studies of the origin of the indigenous ethnic groups of Dagestan, existed 8–10 thousand years ago (Caciagli et al. 2009; Marchani et al. 2008; Bulayeva et al. 2003). In its subsequent demographic history this proto-population differentiated into endogamous communities that, each in its own history within a geographically differentiated highland region, created their own unique languages and ethnic cultures.

The results establish the role of the ethnogenomic subdivision of human populations in the genetic heterogeneity of schizophrenia, which takes the form of different numbers and compositions of SCZ candidate genes localized to the linked regions. Genomic linkages established in schizophrenia patients, together with the candidate genes and structural genome variation (CNV and ROH) in the linked regions, are risk factors for schizophrenia and may in the future become an important area of clinical genome examination, aimed at determining the content of the particular deletions and duplications that cause schizophrenia. Variations reproduced in patients with schizophrenia from different ethnic divisions of populations, i.e., invariant to ethnogenomic subdivision, are of particular practical interest for the study of schizophrenia spectrum diseases, and the value of applying population-specific structural genome variations to molecular diagnosis is obvious. The genetic homogeneity and the founder effect in the demographically oldest primary isolates of different ethnicities generally facilitate the identification of complex disease genes and structural genome variations, as shown by the example of mapping SCZ genes, with significant savings of time and cost. We showed the same advantage of primary over secondary isolates in our study of early-onset major depressive disorder (Bulayeva et al. 2011, 2012).
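The LOD and α values quoted in this section (for example LOD = 5.235 with α = 1 at 22q12.3-q13.1) can be read against the standard definitions used in linkage analysis. The block below is a reference sketch only: the text does not state which formulas or software settings produced its values, so the admixture form of the heterogeneity LOD (HLOD) and the symbols Z_i, θ, and α as used here are assumptions. θ stands for the recombination fraction (or the map position in a multipoint analysis), and i indexes pedigrees.

```latex
% LOD score for pedigree i at recombination fraction \theta
Z_i(\theta) = \log_{10} \frac{L_i(\theta)}{L_i(\theta = 1/2)}

% Heterogeneity LOD under the admixture model;
% \alpha is the proportion of linked pedigrees
\mathrm{HLOD}(\alpha, \theta) =
  \sum_i \log_{10}\!\left[ \alpha \, 10^{Z_i(\theta)} + (1 - \alpha) \right]

% With \alpha = 1 (complete homogeneity) the HLOD reduces to the summed LOD
\mathrm{HLOD}(1, \theta) = \sum_i Z_i(\theta)

% A LOD of 5.235 corresponds to odds in favour of linkage of about
10^{5.235} \approx 1.7 \times 10^{5} : 1
```

Under these definitions, α = 1 means the data are best explained with every contributing pedigree linked to the locus, which is how the statement of 100 % homogeneity can be understood.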

References

Bulayeva, K., Jorde, L., Watkins, S., Bulayev, O., & Harpending, H. (2003). Genetics and history of caucasus populations. Human Biology, 75(6), 837–853.
Bulayeva, K. B., Lencz, T., et al. (2011). Genome-wide linkage scan of major depressive disorder in two Dagestan genetic isolates. Central European Journal of Medicine, 6(5), 616–624.
Bulayeva, K. B., Lencz, T., Takumi, T., Glatt, S. J., Gurgenova, F. R., Guseynova, U., et al. (2012). Mapping genes of early onset major depressive disorder in Dagestan genetic isolates. Turkish Journal of Psychiatry, 23(3), 161–170.
Bulayeva, K., Lesch, K.-P., Bulayev, O., Walsh, C., Glatt, S., Gurgenova, F., et al. (2015). Genomic structural variation in linked with intellectual disability regions in a Dagestan Genetic isolate. Journal of Neural Transmission, 122(9), 1289–1301.
Caciagli, L., Bulayeva, K., Bulayev, O., Bertoncini, S., Taglioli, L., Pagani, L., et al. (2009). The key role of patrilineal inheritance in shaping the genetic variation of Dagestan highlanders. Journal of Human Genetics, 54, 689–694.
Cho, S. C., Yim, S. H., Yoo, H. K., Kim, M. Y., Jung, G. Y., Shin, G. W., Kim, B. N., Hwang, J. W., Kang, J. J., Kim, T. M., & Chung, Y. J. (2009). Copy number variations associated with idiopathic autism identified by whole-genome microarray-based comparative genomic hybridization. Psychiatric Genetics, 19(4), 177–185.
Eberhard, G., Franzén, G., & Löw, B. (1975). Schizophrenia susceptibility and HL-A antigen. Neuropsychobiology, 1(4), 211–217.
Erkan, et al. (1996). HLA antigens in schizophrenia and mood disorders. Biological Psychiatry. Kalikala, Erzurum: Ataturk University.
Glancy, M., Barnicoat, A., et al. (2009). Transmitted duplication of 8p23.1-8p23.2 associated with speech delay, autism and learning difficulties. European Journal of Human Genetics, 17(1), 37–43.
Goldgar, D. E., Lewis, C. M., & Gholami, K. (1993). Analysis of discrete phenotypes using a multipoint identity-by-descent method: Application to Alzheimer’s disease. Genetic Epidemiology, 10(6), 383–388.
Grow, T. J., Tyrrell, D. A., Ferrier, I. N., Johnstone, E. C., Macmillan, J. F., Owens, D. G., & Parry, R. P. (1979). Virus-like particles in CSF in schizophrenia. Lancet, 2(8132), 35.
Håvik, B., Le Hellard, S., Rietschel, M., Lybak, H., Djurovic, S., Mattheisen, M., Muhleisen, T. W., Degenhardt, F., Priebe, L., Maier, W., Breuer, R., Schulze, T. G., Agartz, I., Melle, I., Hansen, T., Bramham, C. R., Nöthen, M. M., Stevens, B., Werge, T., Andreassen, O. A., Cichon, S., & Steen, V. M. (2011). The complement control-related genes CSMD1 and CSMD2 associate to schizophrenia. Biological Psychiatry, 70(1), 35–42. Epub 2011 Mar 24.
Hibar, D. P., Stein, J. L., Renteria, M. E., Arias-Vasquez, A., et al. (2015). Common genetic variants influence human subcortical brain structures. Nature, 520(7546), 224–229.
International Schizophrenia Consortium. (2008). Rare chromosomal deletions and duplications increase risk of schizophrenia. Nature, 455(7210), 237–241.
Iványi, D., Zemek, P., & Iványi, P. (1976). HLA antigens in schizophrenia. Tissue Antigens, 8(3), 217–220.
Ivanyi, D., Zemek, P., & Ivanyi, P. (1978). HLA antigens as possible markers of heterogeneity in schizophrenia. Journal of Immunogenetics, 5, 165–172.
Ivanyi, D., Zemek, P., & Ivanyi, P. (2007). HLA antigens as possible markers of heterogeneity in schizophrenia. International Journal of Immunogenetics, 5, 3.
Jones, P., & Cannon, M. (1998). The new epidemiology of schizophrenia. The Psychiatric Clinics of North America, 21(1), 1–25.
Kirov, G., Gumus, D., Chen, W., Norton, N., Georgieva, L., Sari, M., O’Donovan, M. C., Erdogan, F., Owen, M. J., Ropers, H. H., & Ullmann, R. (2008). Comparative genome hybridization suggests a role for NRXN1 and APBA2 in schizophrenia. Human Molecular Genetics, 17(3), 458–465.
Kolle, G., Georgas, K., Holmes, G. P., Little, M. H., & Yamada, T. (2000). CRIM1, a novel gene encoding a cysteine-rich repeat protein, is developmentally regulated and implicated in vertebrate CNS development and organogenesis. Mechanisms of Development, 90(2), 181–193.
Kruglyak, L., & Lander, E. S. (1995). High-resolution genetic mapping of complex traits. The American Journal of Human Genetics, 56(5), 12–23.
Kruglyak, L., Daly, M. J., Reeve-Daly, M. P., & Lander, E. S. (1996). Parametric and nonparametric linkage analysis: A unified multipoint approach. American Journal of Human Genetics, 58(6), 1347–1363.
Marchani, E. E., Watkins, W. S., Bulayeva, K., Harpending, H. C., & Jorde, L. B. (2008). Culture creates genetic structure in the Caucasus: Autosomal, mitochondrial, and Y-chromosomal variation in Daghestan. BMC Genetics, 9(47), 1–13.
McGuffin, P., & Owen, M. (1991). The molecular genetics of schizophrenia: An overview and forward view. European Archives of Psychiatry and Clinical Neuroscience, 240(3), 169–173.
Mitkevich, S. P. (1981). HLA antigens and schizophrenia. Zh Nevropatol Psikhiatr Im S S Korsakova, 81(7), 1016–1018. Russian.
Shaikh, T. H., Gai, X., Perin, J. C., et al. (2009). High-resolution mapping and analysis of copy number variations in the human genome. A data resource for clinical and research applications. Genome Research, 19(9), 1682–1690.
Stefansson, H., Rujescu, D., Cichon, S., Pietiläinen, O. P., Ingason, A., Steinberg, S., et al. (2008). Large recurrent microdeletions associated with schizophrenia. Nature, 455, 232–239.
Stefansson, H., et al. (2009). Common variants conferring risk of schizophrenia. Nature, 460, 744–747 (6 August 2009).
Tang, B., Dean, B., & Thomas, E. A. (2011). Disease- and age-related changes in histone acetylation at gene promoters in psychiatric disorders. Translational Psychiatry, 1, 64.
Torrey, E. F. (1997). Psychiatric survivors and nonsurvivors. Psychiatric Services, 48(2), 143.
Tsuang, M. T. (1998). Genetic epidemiology of schizophrenia: Review and reassessment. The Kaohsiung Journal of Medical Sciences, 14(7), 405–412. Review.
Tsuang, M. T., & Faraone, S. V. (1995). The case for heterogeneity in the etiology of schizophrenia. Schizophrenia Research, 17(2), 161–175. Review.
Walsh, T., McClellan, J., McCarthy, S., Addington, A., Pierce, S., Cooper, G., et al. (2008). Rare structural variants disrupt multiple genes in neurodevelopmental pathways in schizophrenia. Science, 320(5875), 539–43. doi:10.1126/science.1155174.
Walss-Bass, C., Montero, A. P., Armas, R., Dassori, A., Contreras, S. A., Liu, W., Medina, R., Levinson, D., Pereira, M., Atmella, I., NeSmith, L., Leach, R., Almasy, L., Raventos, H., & Escamilla, M. A. (2006). Linkage disequilibrium analyses in the Costa Rican population suggests discrete gene loci for schizophrenia at 8p23.1 and 8q13.3. Psychiatric Genetics, 16(4), 159–168.
Weeks, D. E., & Lange, K. (1987). Preliminary ranking procedures for multilocus ordering. Genomics, 1(3), 236–242.
Wright, P., Donaldson, P. T., Underhill, J. A., Doherty, D. G., Choudhuri, K., & Murray, R. M. (1996). Genetic association of the HLA DRB1 gene locus on chromosome 6p21.3 with schizophrenia. The American Journal of Psychiatry, 153, 1530–1533.

Conclusions

The most important original results of the study are as follows:

• The ethnogenomic stratification of the isolates is reflected in both common and population-specific genetic linkages with SCZ, as well as in the structural genomic variants (CNVs and ROHs) found in the SCZ-linked genomic regions.
• Of the 26 established genomic linkages with schizophrenia spectrum disorders, 10 were reproduced in the studied isolates, supporting the genomic homogeneity of pathogenic loci across the ethnically and genetically subdivided isolates. These ten shared linkages most likely trace back to a common proto-population of the indigenous Dagestani ethnic groups, whose existence is supported by our long-term population-genetic study of all indigenous ethnic groups of Dagestan.
• Some of the CNVs and ROHs identified in the linked regions are involved in the pathogenesis of schizophrenia spectrum disorders, either through gene dosage effects in functional regions of the genome or through disruption of the relationships between allelic states in regulatory regions of the genome.
• In the SCZ-linked genomic regions of patients from the studied isolates, duplications are about 1.3 times more frequent than deletions, in agreement with previous studies in the field. CNV frequency is 1.6 times higher in the studied patients than in healthy subjects.
• Our study opens up prospects for differentiating sporadic and familial forms of the detected CNVs and for further studies of their role in etiopathogenesis, which is important for the development of personalized medicine.


Appendix: List of Genome-Wide Scanned Loci and Markers in Studied Dagestan Genetic Isolates (Weber/CHLC 9.0 Markers)

Set 9  Chr  Locus      Marker       H     cM   Between loci  Allele size range
       1    D1S468     AFM280we5    0.76  4    0             173–191
X      1    D1S1612    GGAA3A07     0.83  16   12            94–130
X      1    D1S1597    GATA27E01    0.71  30   14            155–179
       1    D1S3669    GATA29A05    0.73  37   7             171–215
X      1    D1S552     GGAT2A07     0.72  45   8             244–260
       1    D1S1622    ATA20F08     0.72  57   11            252–270
       1    D1S3721    GATA129H04   0.88  73   16            204–256
X      1    D1S2134    GATA72H07    0.84  76   3             257–301
       1    D1S3728    GATA165C03   0.74  89   14            244–268
X      1    D1S1665    GATA61A06    0.74  102  13            219–239
       1    D1S1728    GATA109      0.67  109  7             158–174
       1    D1S551     GATA6A05     0.67  114  5             166–186
X      1    D1S1588    ATA2E04      0.68  126  12            118–139
       1    D1S1631    ATA29D04     0.77  137  11            129–156
       1    D1S3723    GATA176G01   0.84  140  4             148–206
X      1    D1S534     GATA12A07    0.83  152  11            196–212
       1    D1S1653    GATA43A04    0.68  164  12            100–116
X      1    D1S1679    GGAA5F09     0.84  171  7             148–168
       1    D1S1677    GGAA22G10    0.68  176  5             188–208
       1    D1S1589    ATA4E02      0.76  192  16            199–220
X      1    D1S518     GATA7C01     0.84  202  10            191–223
X      1    D1S1660    GATA48B01    0.78  212  10            226–250
       1    D1S1678    GGAA23C07    0.68  218  6             297–313
       1               GATA124F08   0.66  226  8             229–245
       1    D1S2141    GATA87F04    0.84  233  7             236–263
X      1    D1S549     GATA4H09     0.77  240  6             157–193
       1    D1S3462    ATA29C07     0.75  247  8             248–266
       1    D1S235     AFM203yg9    0.68  255  7             175–195
       1    D1S547     GATA4A09     0.79  268  13            282–308
X      1    D1S1609    GATA50F11    0.8   275  7             180–208
X      2    D2S2976    GATA165C07   0.85  4    4             202–254
       2    D2S2952    GATA116B01   0.77  18   14            177–209
       2    D2S1400    GGAA20G10    0.66  28   10            111–140
X      2    D2S1360    GATA11H10    0.86  38   11            136–176
       2    D2S405     GATA8F07     0.67  48   10            233–253
X      2    D2S1788    GATA86E02    0.89  56   8             159–211
       2    D2S1356    ATA4F03      0.76  64   9             237–252
       2    D2S1352    ATA27D04     0.68  74   9             113–128
X      2    D2S441     GATA8F03     0.75  87   13            127–159
       2    D2S1394    GATA69E12    0.7   91   4             162–174
       2    D2S1777    GATA71G04    0.65  99   9             196–208
       2    D2S1790    GATA88G05    0.8   103  4             280–328
X      2    D2S2972    GATA176C01   0.77  114  11            216–244
       2    D2S410     GATA4E11     0.8   125  11            156–182
X      2    D2S1328    GATA27A12    0.75  133  7             142–166
       2    D2S1334    GATA4D07     0.79  145  13            266–310
       2    D2S442     GATA8H05     0.65  147  2             196–208
X      2    D2S1399    GGAA20G04    0.8   152  5             137–173
       2    D2S1353    ATA27H09     0.8   165  12            138–159
       2    D2S1776    GATA71D01    0.72  173  8             288–308
X      2    D2S1391    GATA65C03    0.79  186  13            109–133
X      2    D2S1384    GATA52A04    0.8   200  14            141–161
       2    D2S2944    GATA30E06    0.8   210  10            108–136
       2    D2S434     GATA4G12     0.77  216  5             262–286
X      2    D2S1363    GATA23D03    0.79  227  11            172–192
       2    D2S427     GATA12H10    0.76  237  10            251–263
X      2    D2S2968    GATA178G09   0.72  252  15            171–195
       2    D2S125     AFM112yd4    0.82  261  9             88–100
X      3    D3S2387    GATA22G12    0.86  6    6             177–213
       3    D3S1304    AFM234tf4    0.8   22   17            253–269
X      3    D3S4545    GATA164B08   0.82  26   4             192–240
       3    D3S1259    AFM036yb8    0.84  37   10            184–206
       3    D3S3038    GATA73D01    0.78  45   8             187–219
X      3    D3S2432    GATA27C08    0.83  58   13            118–170
       3    D3S1768    GATA8B05     0.77  62   4             186–206
       3    D3S2409    ATA10H11     0.75  71   9             115–127
X      3    D3S1766    GATA6F06     0.76  79   8             208–232
       3    D3S4542    GATA148E04   0.78  90   11            236–260
       3    D3S2406    GGAT2G03     0.88  103  13            306–350
X      3    D3S4529    GATA128C02   0.72  112  10            147–167
       3    D3S2459    GATA68D03    0.84  119  7             175–203
       3    D3S3045    GATA84B12    0.82  124  5             176–208
X      3    D3S2460    GATA68F07    0.76  135  10            143–171
       3    D3S4523    ATA34G06     0.71  138  3             228–249
       3    D3S1764    GATA4A10     0.8   153  15            225–253
X      3    D3S1744    GATA3C02     0.8   161  8             131–167
       3    D3S1763    GATA3H01     0.78  177  16            260–280
       3    D3S3053    GATA92B06    0.72  182  5             226–238
X      3    D3S2427    GATA22F11    0.87  188  6             203–245
       3    D3S1262    AFM059xa9    0.8   201  13            112–126
       3    D3S2398    GATA6G12     0.79  209  8             266–298
X      3    D3S2418    ATA22E01     0.71  216  6             96–114
       3    D3S1311    AFM254ve1    0.83  225  9             134–152
X      4    D4S2366    GATA22G05    0.79  13   13            120–144
       4    D4S403     AFM157xg3    0.77  26   13            217–231
X      4    D4S2639    GATA90B10    0.85  33   8             160–192
       4    D4S2397    ATA27C07     0.78  43   9             126–144
       4    D4S2632    GATA72G09    0.81  51   8             122–190
X      4    D4S1627    GATA7D01     0.81  60   10            177–201
       4    D4S3248    GATA28F03    0.73  73   12            233–257
X      4    D4S2367    GATA24H01    0.78  78   6             127–147
       4    D4S3243    GATA10G07    0.66  88   10            162–174
       4    D4S2361    ATA2A03      0.74  93   5             149–164
X      4    D4S1647    GATA2F11     0.75  105  11            132–156
       4    D4S2623    GATA62A12    0.74  114  9             205–241
X      4    D4S2394    ATA26B08     0.79  130  16            235–256
       4    D4S1644    GATA11E09    0.72  143  13            186–206
X      4    D4S1625    GATA107      0.74  146  3             182–210
       4    D4S1629    GATA8A05     0.72  158  12            137–157
       4    D4S2368    GATA27G03    0.75  168  10            304–328
X      4    D4S2431    GGAA19H07    0.82  176  9             234–258
       4    D4S2417    GATA42H02    0.68  182  6             251–271
       4    D4S408     AFM165xc11   0.76  195  13            229–243
X      4    D4S1652    GATA5B02     0.82  208  13            136–148
       5    D5S2488    ATA20G07     0.74  0    0             230–245
X      5    D5S2849    GATA145D10   0.75  8    8             194–214
       5    D5S2505    GATA84E11    0.74  14   7             257–297
       5    D5S807     GATA3A04     0.76  19   5             168–208
       5    D5S817     GATA3E10     0.66  23   4             260–272
X      5    D5S2845    GATA134B03   0.66  36   13            147–163
       5    D5S2848    GATA145D09   0.68  40   4             217–237
       5    D5S1470    GATA7C06     0.82  45   5             173–197
X      5    D5S1457    GATA21D04    0.74  59   14            97–127
       5    D5S2500    GATA67D03    0.82  69   10            149–181
X      5    D5S1501    GATA52A12    0.78  85   16            98–110
       5    D5S1725    GATA89G08    0.77  98   13            188–212
       5    D5S1462    GATA3H06     0.77  105  7             217–241
       5    D5S1453    ATA4D10      0.76  115  9             139–163
X      5    D5S2501    GATA68A03    0.75  117  2             314–334
       5    D5S1505    GATA62A04    0.8   130  13            243–275
X      5    D5S816     GATA2H09     0.83  139  10            225–253
       5    D5S1480    ATA23A10     0.79  147  8             218–239
X      5    D5S820     GATA6E05     0.77  160  12            190–218
       5    D5S1471    GATA7H10     0.68  172  12            152–172
X      5    D5S1456    GATA11A11    0.78  175  3             191–211
       5    D5S211     Mfd154       0.72  183  8             186–204
X      5    D5S408     AFM164xb8    0.73  195  13            250–266
X      6    F13A1      SE30         0.78  9    9             179–227
       6    D6S2434    ATA50C05     0.67  25   16            224–236
       6    D6S1959    GATA29A01    0.65  34   9             182–198
X      6    D6S2439    GATA163B10   0.87  42   8             218–258
       6    D6S2427    GGAA15B08    0.77  54   12            197–229
X      6    D6S1017    GGAT3H10     0.68  63   9             151–171
       6    D6S2410    GATA11E02    0.72  73   10            236–252
X      6    D6S1053    GATA64D02    0.81  80   7             297–325
       6    D6S1031    ATA28B11     0.85  89   8             258–282
X      6    D6S1056    GATA68H04    0.85  103  14            241–273
       6    D6S1021    ATA11D10     0.73  112  9             141–156
X      6    D6S474     GATA31       0.77  119  6             151–167
       6    D6S1040    GATA23F08    0.75  129  10            257–285
       6    D6S1009    GATA32B03    0.8   138  9             237–273
X      6               GATA184A08   0.78  146  8             161–201
       6    D6S2436    GATA165G02   0.75  155  9             121–149
       6    D6S305     AFM242zg5    0.84  166  12            204–230
X      6    D6S1277    GATA81B01    0.72  173  7             282–306
X      6    D6S1027    ATA22G07     0.77  187  14            117–138
X      7    D7S3056    GATA24F03    0.73  7    7             168–192
       7    D7S513     AFM217yc5    0.82  18   10            173–201
       7    D7S3051    GATA137H02   0.75  29   12            146–182
       7    D7S1802    GATA41G07    0.73  33   4             177–201
X      7    D7S1808    GGAA3F06     0.78  42   9             252–276
       7    D7S817     GATA13G11    0.78  50   9             157–177
       7    D7S2846    GATA31A10    0.76  58   8             172–196
       7    D7S1818    GATA24D12    0.71  70   12            183–199
X      7    D7S3046    GATA118G10   0.81  79   9             318–346
       7    D7S2204    GATA73D10    0.81  91   12            217–269
X      7    D7S820     GATA3F01     0.83  98   7             204–240
       7    D7S821     GATA5D08     0.83  109  11            238–270
       7    D7S1799    GATA23F05    0.72  114  5             171–199
X      7    D7S3061    GGAA6D03     0.83  128  14            114–154
       7    D7S1804    GATA43C11    0.86  137  9             250–290
X      7    D7S1824    GATA32C12    0.82  150  13            163–203
       7    D7S2195    GATA112F07   0.79  155  5             237–289
X      7    D7S3070    GATA189C06   0.8   163  8             184–208
       7    D7S3058    GATA30D09    0.85  174  11            207–235
X      7    D7S559     Mfd265       0.81  182  8             196–216
X      8    D8S264     143xd8       0.83  1    1             121–145
       8    D8S277     198wd2       0.73  8    8             148–180
       8    D8S1130    GATA25C10    0.8   22   14            132–156
       8    D8S1106    GATA23D06    0.73  26   4             127–151
X      8    D8S1145    GATA72C10    0.73  37   11            261–289
       8    D8S136     cos140D4     0.88  44   7             70–90
       8    D8S1477    GGAA20C10    0.86  60   16            139–179
X      8    D8S1110    GATA8G10     0.77  67   7             262–286
X      8    D8S1113    GGAA8G07     0.81  78   11            217–245
       8    D8S1136    GATA41A01    0.72  82   4             241–261
       8    D8S2324    GATA14E09    0.75  94   12            196–212
X      8    D8S1119    ATA19G07     0.8   101  7             173–197
       8               GAAT1A4      0.66  110  9             140–156
X      8    D8S1132    GATA26E03    0.86  119  9             139–171
       8    D8S592     GATA6B02     0.67  125  6             150–162
       8    D8S1179    GATA7G07     0.82  135  10            162–194
X      8    D8S1128    GATA21C12    0.76  140  4             240–268
       8    D8S256     AFM073yb7    0.83  148  9             210–232
X      8    D8S373     UT721        0.78  164  16            194–218
X      9    D4S2624    GATA62F03    0.64  14   14            283–295
       9    D9S921     GATA21A06    0.88  22   8             175–232
X      9    D9S925     GATA27A11    0.82  32   10            167–199
       9    D9S1121    GATA87E02    0.73  44   12            184–216
       9    D9S1118    GATA71E08    0.81  58   14            141–177
X      9    D9S301     GATA7D12     0.8   66   8             209–241
       9    D9S1122    GATA89A11    0.71  76   10            190–210
X      9    D9S922     GATA21F05    0.78  80   4             251–267
       9    D9S257     AFM183xh10   0.89  92   12            259–285
       9    D9S910     ATA18A07     0.66  104  13            105–129
X      9    D9S938     GGAA22E01    0.79  111  6             405–421
       9    D9S930     GATA48D07    0.78  120  9             278–306
X      9    D9S934     GATA64G07    0.76  128  8             206–230
       9    D9S1825    AFM029xg1    0.79  136  8             127–145
       9    D9S2157    ATA59H06     0.82  147  10            259–286
X      9    D9S1838    AFM303zg9    0.82  164  17            159–175
X      10   D10S1435   GATA88F09    0.75  4    4             256–276
       10   D10S189    AFM063xf4    0.72  19   15            180–188
       10   D10S1412   ATA31G11     0.73  28   9             151–166
X      10   D10S1430   GATA84C01    0.88  33   5             148–168
       10   D10S1423   GATA70E11    0.74  46   13            218–238
X      10   D10S1426   GATA73E11    0.74  59   13            152–180
       10   D10S1208   ATA5A04      0.65  63   4             179–200
       10   D10S1221   ATA21A03     0.78  76   12            195–219
X      10   D10S1225   ATA24F10     0.76  81   5             181–193
       10              GATA121A08   0.73       8             184–200
X      10   D10S1432   GATA87G01    0.74  94   6             165–185
       10   D10S2327   GGAT1A4      0.66  101  7             204–228
       10   D10S2470   GATA115E01   0.81  113  12            243–271
X      10   D10S677    GGAA2F11     0.81  117  5             197–225
       10   D10S1239   GATA64A09    0.75  125  8             160–184
X      10   D10S1237   GATA48G07    0.82  135  9             400–436
       10   D10S1230   ATA29C03     0.74  143  8             114–138
X      10   D10S1213   GGAA5D10     0.8   148  5             93–133
X      10   D10S1248   GGAA23C05    0.75  165  17            241–261
       10   D10S212    AFM198zb4    0.71  171  6             189–201
X      11   D11S1984   GGAA17G05    0.79  2    2             166–206
       11   D11S2362   ATA33B03     0.81  9    7             209–230
X      11   D11S1999   GATA23F06    0.8   17   8             109–137
       11   D11S1981   GATA48E02    0.83  21   4             134–178
       11              ATA34E08     0.76  33   12            156–171
X      11   D11S1392   GATA6B09     0.77  43   10            200–220
X      11   D11S1985   GGAA5C04     0.87  58   15            234–286
       11   D11S2371   GATA90D07    0.67  76   18            193–213
X      11   D11S2002   GATA30G01    0.79  85   9             232–252
       11   D11S2000   GATA28D01    0.87  101  15            199–235
X      11   D11S1986   GGAA7G08     0.88  106  5             176–252
       11   D11S1998   GATA23E06    0.68  113  7             129–165
X      11   D11S4464   GATA64D03    0.78  123  10            225–249
       11   D11S912    AFM157xh6    0.81  131  8             101–123
X      11   D11S968    AFM109xc3    0.81  148  17            137–155
X      12   D12S372    GATA4H03     0.76  6    6             174–190
       12   D3S2395    GATA49D12    0.85  18   11            181–217
X      12   D12S391    GATA11H08    0.88  26   9             211–251
       12   D12S373    GATA6C01     0.76  36   10            208–224
X      12   D12S1042   ATA27A06     0.81  49   13            118–136
       12              GATA91H06    0.72  56   8             100–120
       12   D12S398    GGAT2G06     0.67  68   12            120–144
       12   D12S1294   GATA73H09    0.84  78   10            168–204
X      12   D12S375    GATA3F02     0.74  81   2             165–189
       12   D12S1052   GATA26D02    0.72  83   3             149–165
X      12   D12S1064   GATA63D12    0.82  95   12            173–197
       12   D12S1300   GATA85A04    0.63  104  9             115–135
X      12   PAH        PAH          0.8   109  5             229–257
       12   D12S2070   ATA25F09     0.79  125  16            86–104
X      12   D12S395    GATA4H01     0.76  137  12            223–243
       12   D12S2078   GATA32F05    0.81  150  13            250–282
X      12   D12S1045   ATA29A06     0.8   161  11            76–103
       12   D12S392    GATA13D05    0.79  166  5             136–153
X      13   D13S787    GATA23C03    0.72  9    9             251–267
       13   D13S1493   GGAA29H03    0.8   26   17            223–243
       13   D13S894    GATA86H01    0.64  33   7             180–200
X      13   D13S325    GATA6B07     0.8   39   6             195–231
       13   D13S788    GATA29A09    0.84  46   7             240–270
       13   D13S800    GATA64F08    0.75  55   10            295–319
X      13   D13S317    GATA7G10     0.79  64   9             175–199
       13   D13S793    GATA43H03    0.77  76   12            253–273
       13   D13S779    ATA26D07     0.71  83   7             180–198
X      13   D13S796    GATA51B02    0.8   94   11            148–168
X      13   D13S285    AFM309va9    0.81  111  17            92–106
X      14   D14S742    GATA74E02    0.72  12   12            395–415
       14   D14S1280   GATA31B09    0.7   26   13            289–301
       14   D14S608    GATA43H01    0.77  28   2             188–224
       14   D14S599    ATA29G03     0.74  41   13            84–99
X      14   D14S306    GATA4B04     0.79  44   3             190–210
X      14   D14S587    GGAA10C09    0.84  56   12            250–278
       14   D14S592    ATA19H08     0.68  67   11            222–240
       14   D14S588    GGAA4A12     0.67  76   9             117–141
       14   D14S53     Mfd190       0.71  86   11            135–161
X      14   D14S606    GATA30A03    0.73  92   5             266–282
       14              GATA193A07   0.77       4             339–375
X      14   D14S617    GGAA21G11    0.78  106  10            141–173
       14   D14S1434   GATA168F06   0.65  113  8             212–232
X      14   D14S1426   GATA136B01   0.78  126  13            133–157
X      15   D15S822    GATA88H02    0.77  12   12            258–306
       15   D15S165    AFM248vc5    0.79  20   8             184–208
X      15   ACTC       ACTC         0.87  31   11            68–92
X      15   D15S659    GATA63A03    0.84  43   12            174–206
X      15   D15S643    GATA50G06    0.86  52   9             195–223
       15   D15S1507   GATA151F03   0.73  60   8             204–220
       15   D15S818    GATA85D02    0.7   72   12            150–170
       15   D15S655    ATA28G05     0.72  83   11            234–252
X      15   D15S652    ATA24A08     0.8   90   7             288–309
       15   D15S816    GATA73F01    0.66  101  11            128–148
X      15   D15S657    GATA22F01    0.82  105  4             336–360
X      15   D15S642    GATA27A03    0.81  122  17            200–218
X      16              ATA41E04     0.69       11            121–139
       16   D16S748    ATA3A07      0.82  23   11            187–214
X      16   D16S764    GATA42E11    0.7   30   7             96–116
       16   D16S403    AFM049xd2    0.85  44   14            134–152
       16   D16S769    GATA71H05    0.69  51   7             258–270
X      16   D16S753    GGAA3G05     0.79  58   7             252–276
       16   D16S3396   ATA55A11     0.79  64   6             139–157
X      16   D16S3253   GATA22F09    0.71  72   8             167–187
       16              GATA67G11    0.79  81   9             262–286
X      16   D16S2624   GATA81D12    0.7   88   6             132–148
       16   D16S516    AFM350vd1    0.73  100  13            164–176
       16   D16S402    AFM031xa5    0.87  114  13            161–187
X      16   D16S539    GATA11C06    0.76  125  11            148–172
       16   D16S621    GATA71F09    0.76  130  6             239–263
X      17   D17S1308   GTAT1A05     0.66  1    1             304–316
       17   D17S1298   GAAT2C03     0.6   11   10            246–258
       17   D17S974    GATA8C04     0.64  22   12            201–217
X      17   D17S1303   GATA64B04    0.7   24   1             225–245
       17   D17S947    AFM290vc9    0.89  32   8             250–282
       17   D17S2196   GATA185H04   0.81  45   13            139–163
       17   D17S1294   GGAA9D03     0.68  51   6             248–272
X      17   D17S1293   GGAA7D11     0.83  56   6             262–290
       17   D17S1299   GATA25A04    0.73  62   6             188–208
       17   D17S2180   ATC6A06      0.67  67   5             116–128
X      17   D17S1290   GATA49C09    0.84  82   15            170–210
       17   D17S2193   ATA43A10     0.79  89   7             138–159
X      17   D17S1301   GATA28D11    0.65  100  11            147–163
       17   D17S784    AFM044xg3    0.77  117  17            226–238
X      17   D17S928    AFM217yd10   0.79  126  10            135–165
X      18              GATA178F11   0.82       3             370–398
       18   D18S481    AFM321xc9    0.76  7    4             183–203
       18   D18S976    GATA88A12    0.86  13   6             171–194
X      18   D18S843    ACT1A01      0.75  28   15            179–191
       18   D18S542    GATA11A06    0.79  41   13            178–198
       18   D18S877    GATA64H04    0.68  54   13            117–137
X      18   D18S535    GATA13       0.76  64   10            131–155
       18   D18S851    GATA6D09     0.73  75   10            256–276
X      18   D18S858    ATA23G05     0.75  80   5             193–211
       18   D18S1357   ATA7D07      0.79  89   8             126–147
X      18   D18S1364   GATA7E12     0.76  99   10            164–188
       18              ATA82B02     0.84  107  8             172–196
       18   D18S1371   GATA177C03   0.7   116  9             133–153
X      18   D18S844    ATA1H06      0.76  116  1             182–200
X      19   D19S591    GATA44F10    0.74  10   10            96–112
       19   D19S1034   GATA21G05    0.72  21   11            222–242
X      19   D19S586    GATA23B01    0.73  33   12            222–250
       19   D19S714    GATA66B04    0.79  42   9             224–256
X      19   D19S433    GGAA2A03     0.77  52   10            195–225
       19   D19S245    Mfd235       0.68  59   7             187–211
       19   D19S178    Mfd139       0.8   68   9             143–189
X      19   D19S246    Mfd232       0.82  78   10            185–233
       19   D19S589    GATA29B01    0.72  88   10            161–181
X      19   D19S254    Mfd238       0.75  101  13            110–150
       20   D20S103    AFM077xd3    0.71  2    2             92–106
X      20   D20S482    GATA51D03    0.68  12   10            151–167
       20   D20S851    AFMa218yb5   0.74  25   13            128–150
       20   D20S604    GATA81E09    0.72  33   8             131–147
X      20   D20S470    GGAA7E02     0.87  39   6             258–318
       20   D20S477    GATA29F06    0.71  48   8             240–268
       20   D20S478    GATA42A03    0.81  54   7             243–275
X      20   D20S481    GATA47F05    0.83  62   8             217–253
       20   D20S480    GATA45B10    0.76  80   18            284–308
X      20   D20S171    AFM046xf6    0.78  96   16            123–149
X      21   D21S1432   GATA11C12    0.63  3    3             127–155
       21   D21S1437   GGAA3C07     0.73  13   10            119–143
X      21   D21S2052   GATA129D11   0.77  25   12            121–153
       21   D21S1440   ATA27F01     0.74  37   12            157–175
X      21   D21S2055   GATA188F04   0.88  40   4             117–193
X      21   D21S1446   GATA70B08    0.69  58   17            209–223
X      22   D22S420    AFM217xf4    0.77  4    4             148–164
       22   D22S345    Mfd313       0.73  19   15            119–129
       22   D22S689    GATA21F03    0.76  29   9             202–230
X      22   D22S685    GATA6F05     0.79  32   4             172–196
X      22   D22S683    GATA11B12    0.9   36   4             168–214
       22   D22S445    GGAT3C10     0.65  46   10            110–130
X      X    DXS9900    GGAT3F08     0.75  0    0             158–178
       X    DXS9895    GATA124B04   0.75  9    9             145–161
       X    DXS9902    GATA175D03   0.7   22   13            170–186
X      X    DXS9896    GATA124E07   0.74  40   17            192–208
       X    DXS1068    AFM238yc11   0.79  53   13            245–259
X      X    DXS6810    GATA69C12    0.66  64   11            215–223
       X               GATA144D04   0.7   71   8             222–254
X      X    DXS7132    GATA72E05    0.73  83   12            283–299
       X    DXS6800    GATA31D10    0.76  93   10            197–221
X      X    DXS6789    GATA31F01    0.76  104  10            118–150
       X    DXS6797    GATA10C11    0.76  113  9             250–270
X      X               GATA172D05   0.81  116  3             110–134
       X               GATA165B12   0.82  133  17            197–215
       X    DXS1047    AFM150xf10   0.81  143  10            196–210
X      X    D3S2390    GATA31E08    0.73  154  11            226–254
       X    DXS7127    GATA100G03   0.76  165  11            236–260
X      X    DXYS154    SDF1         0.77  184  19            232–246
X      Y    DYS389     GATA30F10    0.66  0    0             248–256
       Y    DYS390     GATA31E10    0.76  0    0             205–221
       Y    DYS391     GATA32C10    0.56  0    0             285–293
X      Y               GGAAT1B07    0.6   0    0             179–194
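The relationship between the "cM" and "Between loci" columns can be illustrated with a minimal sketch. It assumes, as an interpretation of the table rather than something the source states, that "cM" is the cumulative map position of a marker and that "Between loci" is its distance from the preceding marker on the same chromosome. The function name `spacing_check` and the 1 cM rounding tolerance are hypothetical choices made for illustration; the rows are the first six chromosome 2 entries of the table.

```python
# First six chromosome 2 rows of the appendix table:
# (locus, map position in cM, listed "between loci" value in cM)
rows = [
    ("D2S2976", 4, 4),
    ("D2S2952", 18, 14),
    ("D2S1400", 28, 10),
    ("D2S1360", 38, 11),
    ("D2S405", 48, 10),
    ("D2S1788", 56, 8),
]


def spacing_check(rows, tolerance=1):
    """Compare each listed spacing with the difference of consecutive positions.

    The 1 cM tolerance is an assumption that allows for rounding of the
    published integer values.
    """
    report = []
    for (_, prev_cm, _), (locus, cm, listed) in zip(rows, rows[1:]):
        computed = cm - prev_cm
        consistent = abs(computed - listed) <= tolerance
        report.append((locus, computed, listed, consistent))
    return report


report = spacing_check(rows)
for locus, computed, listed, consistent in report:
    print(f"{locus}: computed {computed} cM, listed {listed} cM, consistent={consistent}")

# Mean inter-marker spacing over the checked intervals
mean_spacing = sum(computed for _, computed, _, _ in report) / len(report)
print(f"mean spacing: {mean_spacing:.1f} cM")
```

For these rows the computed and listed spacings agree to within 1 cM, and the mean spacing is about 10 cM. The same check could be applied chromosome by chromosome, skipping rows where the position is blank in the table.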
