Leeds Thesis Template
Total Page:16
File Type:pdf, Size:1020Kb
High-resolution characterization of genetic markers in the Arabian Peninsula and Near East Verónica Cristina Neves da Nova Fernandes Submitted in accordance with the requirements for the degree of Doctor of Philosophy The University of Leeds School of Biology Faculty of Biological Sciences November, 2013 - ii - - iii - I confirm that the work submitted is my own, except where work which has formed part of jointly-authored publications has been included. The contribution of the candidate and the other authors to this work has been explicitly indicated below. I confirm that appropriate credit has been given within the thesis where reference has been made to the work of others. I performed the study of the complete mitochondrial DNA sequencing of the three minor West Eurasian mitochondrial haplogroups, N1, N2 and X, in a collaborative study between University of Leeds, IPATIMUP (the second Institution where I am doing the PhD) and other laboratories. This study is already published in The American Journal of Human Genetics in which I am the first author.1 I performed the laboratory work and also the phylogeography and phylogenetic analyses. The follow authors: Alshamali F., Cherni L., Harich N. and Cerny V., provided the biological samples. Costa M. D., Pereira J. B., Soares P., helped me in the construction of an Excel database with control region (HVS-I and HVS-II) of the mitochondrial DNA sequences already published and deposited in GenBank or reported in the papers and Alves, M. constructed bioinformatic tools used in this study. Finally, Soares P., Richards M. B. and Pereira L., helped me in interpreting the results and drafting the manuscript. This copy has been supplied on the understanding that it is copyright material and that no quotation from the thesis may be published without proper acknowledgement. © 2013 The University of Leeds and Verónica Cristina Neves da Nova Fernandes - iv - - v - Acknowledgements Firstly I would like to express my sincere gratitude to my PhD supervisors, Martin Richards and Luisa Pereira for all the support that allowed this research to be undertaken. To Prof. Martin Richards I am truly grateful for his guidance, support and immense knowledge. I am really thankful for all Luisa Pereira's patience, motivation and enthusiasm. More than a supervisor she became a friend, being always present and guiding me in my research and helping me writing this thesis. I could not have imagined having a better supervisor and mentor. I would like to acknowledge the Fundação para a Ciência e a Tecnologia, who financially supported my PhD project through the scholarship SFRH/BD/61342/2009 and the project PTDC/CS-ANT/113832/2009. I would like to thank to my two host institutions: IPATIMUP, for the excellent laboratory conditions and staff and most important for the exceptional work environment that allowed me to work with truly friends; and the University of Leeds for all the experience that allowed me to grow not only intellectually but also socially, offering me new friendships and a lot of good moments. I also would like to thank to all my co-workers in the IPATIMUP and in the University of Leeds. Particularly I am very grateful to Pedro Soares for all the scientific advices and help during the statistical analyses; to Petr Triska for helping me with the Genome wide analyses; to Nuno Silva, Marco Alves and Bruno Cavadas for the computer skills; and to Maria Pala for all the laboratory help. I would like to acknowledge Farida Alshamali and Victor Cerny for the Dubai and Yemen samples that without that this research would be scientifically poorer. I am truly thankful to all my friends, for being always there cheering me up, and for all the unforgettable and wonderful moments. At last, but not the least, I would like to thank to my family, for encouraging me and supporting me spiritually throughout my life. - vi - - vii - Abstract In this work, I analysed the maternally transmitted mtDNA and biparentally inherited genome-wide polymorphisms to shed light on successive migrations into and from the Arabian Peninsula, at the crossroads between Africa, Europe and Asia. I focused on three main issues: 1- The first descendants of the out-of-Africa migration: My phylogeographic analysis on 385 complete mtDNA N1, N2 and X sequences showed their common origin in the Gulf Oasis region at ~57-65 ka. Instead of isolation, I identified a continuous gene flow between Arabia and Near East. Genome-wide data supported the strong clustering of Arabian and Near Eastern populations. 2- Major population expansions in the Arabian Peninsula: My data on the rare N(xR) lineages supported a continuous settlement of the Peninsula, while my founder analysis of other N lineages suggested Near Eastern lineages arriving continuously from the Late Glacial (31%-46%; some U and N1 lineages), Younger Dryas (24-28%; R0a and HV), Neolithic (20-25%; J and T; supported by my new complete 44 sequences) and till recently (10-15%; derived lineages). These results again challenge the hypothesis of long-term isolation between these two regions. 3- Genetic exchanges across the Red Sea: Phylogeographic analysis of L4 and L6 mtDNA complete sequences and HVS-I founder analysis from Africa into Arabia/Near East indicated that the Arab maritime dominance and slave trade (0.5- 2.5 ka) were the main contributors (~60-70%) to the African input, but the entrance began with the establishment of maritime networks in the Red Sea by 8 ka. Genome-wide analyses supported this recent introduction (ROLLOFF estimates of 30-40 generations) suggesting that Arabia has 6% eastern African and 3% western African input. The HVS-I founder analysis for the back-to-Africa migrations showed that the Late Glacial period dominated introductions into eastern Africa, while the Neolithic was more important for migrations towards North Africa. - viii - - ix - Table of Contents Acknowledgements .................................................................................................. v Abstract ................................................................................................................. vii Table of Contents ................................................................................................... ix List of Tables ........................................................................................................ xiii List of Figures ........................................................................................................ xv I. LIST OF ABBREVIATIONS .......................................................................... xxi II. INTRODUCTION ............................................................................................ 1 1. The study of human evolution ........................................................................ 2 1.1. Archaeology .......................................................................................... 2 1.2. Palaeontology ........................................................................................ 3 1.3. Climatology ............................................................................................ 4 1.4. Linguistics.............................................................................................. 6 1.5. Genetics ................................................................................................ 7 1.5.1. The DNA Molecule and the coding of information ................. 7 1.5.1.1. The autosomal nuclear genome .................................. 10 1.5.1.2. The non-recombining segments of the genome ........... 14 1.5.1.2.1. The mitochondrial DNA ............................................. 14 1.5.1.2.2. The Y chromosome .................................................. 19 2. Human population genetics .......................................................................... 22 2.1. Extant population analyses .................................................................. 22 2.1.1. Mitochondrial DNA diversity ............................................... 22 2.1.2. Y chromosome diversity ..................................................... 29 2.1.3. Genome wide-diversity ....................................................... 33 2.2. Ancient DNA Analyses......................................................................... 37 3. The origin of modern humans ....................................................................... 39 3.1. Multiregional model ............................................................................. 39 3.2. Out-of-Africa model ............................................................................. 40 3.2.1. Out-of-Africa routes ............................................................ 41 3.2.1.1. Northern route ............................................................. 41 3.2.1.2. Southern route ............................................................. 43 4. The Arabian Peninsula ................................................................................. 45 4.1. Geology ............................................................................................... 46 - x - 4.2. Climate ................................................................................................ 49 4.3. Population Distribution and Composition .............................................. 50 4.4. Linguistics and Geography ................................................................... 53 4.5. Archaeology ........................................................................................