
DENTAL MORPHOLOGY AND HAPLOTYPIC DIVERSITY IN THE MAJOR ETHNIC GROUPS OF SWAT AND DIR DISTRICTS

INAMULLAH

DEPARTMENT OF GENETICS HAZARA UNIVERSITY 2018

DENTAL MORPHOLOGY AND HAPLOTYPIC DIVERSITY IN THE MAJOR ETHNIC GROUPS OF SWAT AND DIR DISTRICTS

By

Inamullah

This research study has been conducted and reported in partial fulfillment of the requirements of the PhD degree in Genetics awarded by Hazara University, Mansehra.

Friday, 03 March 2017

DEPARTMENT OF GENETICS HAZARA UNIVERSITY MANSEHRA 2018

DENTAL MORPHOLOGY AND HAPLOTYPIC DIVERSITY IN THE MAJOR ETHNIC GROUPS OF SWAT AND DIR DISTRICTS

Submitted by INAMULLAH, PhD Scholar

Research supervisor PROF. DR. HABIB AHMAD, Vice Chancellor, Islamia College

Co-supervisor DR. BRIAN E. HEMPHILL, Associate Professor, Department of Anthropology, University of Alaska, Fairbanks, Fairbanks, AK 99775, United States

DEPARTMENT OF GENETICS HAZARA UNIVERSITY MANSEHRA 2018

AL QURAN

"O Mankind, we created you from a single pair of a male and a female, and made you into tribes and nations so that you may know each other (not that you despise each other). Verily, the most honored of you in the sight of Allah is he who is most righteous of you. Surely Allah is All-Knowing, All-Aware."

(Al-Hujurat, 49:13)

AUTHOR’S DECLARATION

I, Inamullah, hereby state that my PhD thesis titled “Dental morphology and haplotypic diversity in the major ethnic groups of Swat and Dir Districts” is my own work and has not been submitted previously by me for any degree from this University (Hazara University, Mansehra, Pakistan) or anywhere else in the country/world.

If my statement is found to be incorrect at any time, even after my graduation, the university has the right to withdraw my PhD degree.

Inamullah Date: 16-02-2018

Plagiarism Undertaking

I solemnly declare that the research work presented in the thesis titled “Dental morphology and haplotypic diversity in the major ethnic groups of Swat and Dir Districts” is solely my research work, with no significant contribution from any other person.

Small contributions/help, wherever taken, have been duly acknowledged, and the complete thesis has been written by me.

I understand the zero tolerance policy of the HEC and the University (Hazara University Mansehra) towards plagiarism. Therefore I, as the author of the above titled thesis, declare that no portion of my thesis has been plagiarized and that any material used as reference is properly referred/cited.

I undertake that if I am found guilty of any formal plagiarism in the above titled thesis, even after the award of the PhD degree, the university reserves the right to withdraw/revoke my PhD degree, and that the HEC and the University have the right to publish my name on the HEC/University website on which the names of students who submitted plagiarized theses are placed.

Student/Author Signature: ______

Name: Inamullah

ACKNOWLEDGMENTS

Although feelings are deep, unfortunately the words are too shallow. The names may be mentioned, but the extent and level of my gratitude are impossible to capture. All praise and thanks to Almighty Allah, the Omnipresent, Lord of the lords, who blessed me to complete this task within the specified time, the One and Only who made my dreams come true. "I can do everything through Him who gives me strength." I also offer humble words of respect and profound gratitude to the Holy Prophet Muhammad (Peace Be upon Him), the most perfect and glorious among all the creatures born on the surface of the earth, who was sent to enlighten our conscience and who is forever the city of knowledge for the whole of humanity.

Completion of this doctoral dissertation was possible with the support, help and inspiration of many people. It is a pleasure to convey my gratitude to them all in my humble acknowledgment. In the first place, I would like to express my sincere appreciation and gratitude to my research supervisor, Prof. Dr. Habib Ahmad, Vice Chancellor, Islamia College Peshawar, for his continuous support of my PhD dissertation and for his patience, motivation, enthusiasm, and immense knowledge. His guidance helped me throughout the research and the writing of this thesis. I could not have imagined having a better advisor and mentor for my PhD study. A person with an amicable and positive disposition, he always made himself available to clarify my doubts despite his busy schedule, and I consider it a great opportunity to have done my doctoral programme under his guidance and to have learned from his research expertise.

I also extend sincere thanks to my research co-supervisor, Dr. Brian E. Hemphill, Associate Professor, Department of Anthropology, University of Alaska, Fairbanks, United States, for his splendid guidance and assistance in the completion of this work. I am indebted to him for his advice, supervision, and crucial contribution, which made him the backbone of this research and of this dissertation.

I am grateful to Prof. Eske Willerslev, Director of the Centre for GeoGenetics, University of Copenhagen, Denmark, for giving me opportunities in his group and leading me to work on diverse and exciting projects.

I am also thankful to my lab colleagues and group members at the Centre for GeoGenetics, Denmark, namely Morten E. Allentoft, Martin Sikora, Ashot Margaryan and Constanza de la Fuente Castro, for their excellent guidance.

I am also thankful to Dr. Jill K. Olofsson, Department of Animal and Plant Sciences, University of Sheffield, for helping me with statistical data analysis.


I express my heartfelt gratitude especially to Dr. Muhammad Shahid Nadeem, Assistant Professor in the Department of Biochemistry, Faculty of Science, King Abdulaziz University, Saudi Arabia, who guided me with his intelligent ideas, thought-provoking discussions and comprehensive understanding, and helped me sail through the initial fumbling. His attitude of living every moment as it comes, making unexpected observations and converting them into new possibilities, correlating ideas and understanding the obvious has helped me come a long way and will always guide me in the future.

I would also like to express my appreciation to Dr. M. Ilyas, Director of the Centre for Human Genetics, Hazara University, Mansehra, for his help in statistical data analysis, sharing of scientific ideas, and his help in approaching scientific communities.

I am grateful for the funding source that allowed me to pursue my study: the Higher Education Commission (HEC) of Pakistan. The technical and generous financial support from the HEC-sponsored project (NRPU-20-1409), entitled “Ethnogenetic elaboration of KP through dental morphology and DNA analysis”, of the Department of Genetics, Hazara University Mansehra, is gratefully acknowledged.

I acknowledge the Secretary of Education KP, the Directorate of Schools and Colleges of Swat and Dir districts, and all the volunteers for their help and support in providing samples for this research. I take great pride in expressing my deepest gratitude to the Department of Genetics and everybody related to it; they were important to my comprehension of the work and boosted my self-confidence as I pursued my goal. Some faculty members of the department were kind enough to extend their help at various phases of this research whenever I approached them, and I hereby acknowledge all of them.

Members of the Human Genetics Lab deserve my sincerest thanks; their friendship and assistance have meant more to me than I could ever express. I could not have completed my work without the generous support of the participants of the Ethnogenetic project in the lab. I should also thank the Ethnogenetic project for allowing me to be part of a great professional community. I am indebted to my many student colleagues for providing an encouraging and pleasurable environment. My thanks go in particular to Dr. Muhammad Tariq, Assistant Professor at Islamia College University, Peshawar, Mr. Numan Fazal, Mr. Murad Ali, Mr. Muhammad Ismail Khan (Research Associates), Mr. Faridullah, Mr. Sheraz Khan and Mr. Zakaria Khan for their help and support during lab work. I would like to express my deep thanks to Miss Mehreen Amin and Miss Shakeela Umar, who made possible the collection of female samples for my PhD research. Special thanks to Dr. Nazia for her guidance, encouragement and support. I am also thankful to Dr. Khushi Muhammad for his valuable guidance and moral support during my stay at the Human Genetics Lab. I would like to record words of honour for all my fellows, colleagues and teachers, who shaped me with their vast knowledge.

I am very thankful to the Functional Genomics Lab in the Department of Genetics and all its members, especially Dr. Inamullah, Associate Professor, Dr. Ikram Muhammad, Dr. Israr Ahmad and Miss Nazish Gul (MPhil Scholar), for their instrumental support and help throughout my research work.

I would like to express my sincere thanks to FANTA's members (Mr. Ikram Muhammad, Muhammad Ali, Mr. Israr Ahmad and the special one among them all, Mr. M. Jawad) for their thoughtful guidance, insightful decisions and sincere encouragement during the critical situations of my PhD. Finally, I would also like to thank my family for the support they have provided me throughout my life. In particular, I must acknowledge my mother, my late father and my brothers, without whose love, encouragement and prayers I would not have finished this thesis.

Inamullah


Dedication

To My

Beloved Parents


CONTENTS

ACKNOWLEDGMENTS

CONTENTS

LIST OF TABLES

LIST OF FIGURES

ABBREVIATIONS

ABSTRACT

INTRODUCTION

1.1. Modern human history

1.1.1. Dispersal of anatomically modern humans

1.1.2. Time line and routes of Modern Human dispersal throughout the world

1.2. Pakistan

1.3. Study Area

1.3.1. District Swat

1.3.2. District Dir

1.4. The Pakhtuns

1.4.1. The Yousafzai

1.4.2. The Utmankheils

1.4.3. The Tarklanis

1.5. The Kohistani

1.6. The Gujars

1.7. The genetic characterization of humans

1.8. Dental Morphology/Dental Anthropology

1.8.1. The Birth of Dental Anthropology

1.8.2. Dental anthropology investigations in

1.8.3. Non-metric dental morphological traits

1.8.4. Basic terminology used in dental morphology

1.8.5. Analysis of dental morphology traits

1.9. Mitochondrial DNA (mtDNA)

1.9.1. MtDNA in human lineages

1.9.2. mtDNA Variation

1.10. The Y-chromosome

1.10.1. Phylogenetic Tree based on human Y-chromosome

1.10.2. Y-chromosomal haplogroup distribution across the globe

CHAPTER 2 MATERIALS AND METHODS

2.1. Sample collection for dental morphology study

2.1.1. Collection of dental Casts

2.1.2. Selection of volunteers

2.1.3. Biosafety Measures

2.1.4. Dental casting and labeling

2.1.5. Grading and scoring of dental morphology traits

2.2. Analyzing the DNA

2.2.1. Collection of saliva samples

2.2.2. Genomic DNA extraction

2.2.3. Screening of the purified gDNA

2.2.4. Agarose gel electrophoresis

2.3. Mitochondrial DNA characterization

2.3.1. PCR Amplification of target DNA

2.3.2. Thermocycling conditions for PCR

2.3.3. Visualization of the PCR Products

2.3.4. Elution of PCR Product

3.3. Y-chromosome analysis

3.3.1. Y-STR and Y-SNP datasets

3.1.2. Multiplex PCR profile

3.4. Statistical Analysis

3.4.1. Dental morphology Analysis

3.4.2. MtDNA Analysis

3.4.3. Y-STRs and Y-SNPs analysis

CHAPTER 3 RESULTS

3.1. Dental Morphology

3.1.1.1. Shovelling

3.1.1.2. Median Lingual ridge

3.1.1.3. Y-Groove Pattern

3.1.1.4. Hypocone

3.1.1.5. Metaconule

3.1.1.6. Major Cusp number

3.1.1.7. Entoconulid

3.1.1.8. Metaconulid

3.1.2. Mean Measure of Divergence

3.1.3. Living Northern Pakistanis Only

3.1.3.1. Neighbor-joining Cluster Analysis

3.1.3.2. Multidimensional Scaling: Kruskal’s Method

3.1.3.3. Multidimensional Scaling: Guttman’s Method

3.1.3.4. Principal Coordinate Analysis

3.1.4. Living Pakistanis Considered in Light of Living Peninsular Indians and Prehistoric Inhabitants of the Indus Valley and South-Central Asia

3.1.4.1. Neighbor-joining Cluster Analysis

3.1.4.2. Multidimensional Scaling: Kruskal’s Method

3.1.4.3. Multidimensional Scaling: Guttman’s Method

3.1.4.4. Principal Coordinate Analysis

3.2. Mitochondrial DNA analysis

3.2.1. Genomic DNA isolation

3.2.2. PCR amplification

3.2.3. MtDNA Haplogroups determination

3.2.3.1. MtDNA Haplogroups determination in the individuals of Gujars

3.2.3.2. MtDNA Haplogroups of the sampled Tarklani population of District Dir

3.2.3.3. MtDNA Haplogroup variation among the Utmankheil of District Dir

3.2.3.4. Haplogroups of the sampled Yousafzai of District Swat

3.2.3.5. Haplogroup distribution among the sampled Kohistanis of District Swat

3.2.4. Overall mtDNA haplogroup distribution among the five sampled ethnic groups of Swat and Dir districts

3.2.4.1. Diversity comparison among the five sampled ethnic groups of Swat and Dir Districts

3.2.4.2. Mitochondrial Genetic Differentiation

3.2.4.3. Multi Dimensional Scaling

3.2.4.4. Network Analysis based on mtDNA sequences

3.3. Y-chromosome STRs and Y-SNPs analysis

3.3.1. Multiplex performance

3.3.2. Genetic diversity

3.3.3. Genetic differentiation

3.3.4. Genetics, ethnicity and geography

3.3.5. Detailed analysis of two Y-chromosomal haplogroups

DISCUSSION

CONCLUSIONS

RECOMMENDATIONS

REFERENCES

APPENDIX I

APPENDIX II

APPENDIX III

APPENDIX IV


LIST OF TABLES

Table 1 Details of samples collected from Swat and Dir districts

Table 2 Details of the primer sequences used in the present study for the amplification of the target fragment of the mtDNA control region

Table 3 Components and concentration of PCR reaction mixture/sample

Table 4 Components and concentrations of the multiplex PCR reaction

Table 5 Cycling profile for multiplex PCR reaction

Table 6 Population samples included in the larger comparative analyses. Sample sizes and references to the original studies are shown.

Table 7 Details of the living/modern and prehistoric samples used in this study for comparative analysis

Table 8 Frequencies of dental traits among the five ethnic groups (%)

Table 9 Mean measure of divergence (MMD) distance matrix obtained from the pairwise group comparisons of the five populations and the other populations used in this study

Table 10 Statistical analysis of the Gujar sample from Swat

Table 11 Haplogroup frequencies and their respective variants found in the Gujar sample from Swat

Table 12 Diversity comparison of the sampled Gujar population from Swat with the other reported ethnic groups of Pakistan

Table 13 Statistical analysis of Tarklanis of District Dir

Table 14 Haplogroup frequencies and their respective variants found among the sampled Tarklanis of Dir District

Table 15 Diversity comparison of the sampled Tarklanis of District Dir with the other reported ethnic groups of Pakistan

Table 16 Statistical analysis of sampled Utmankheil population of District Dir

Table 17 Haplogroup frequencies and their respective variants in the Utmankheil sample from District Dir

Table 18 Diversity comparison of the sampled Utmankheil individuals of Dir District with the other reported ethnic groups of Pakistan

Table 19 Statistical analysis of Yousafzai of District Swat

Table 20 Haplogroup frequencies and their respective variants among the sampled Yousafzai individuals of

Table 21 Genetic diversity of the Yousafzai sample from District Swat in comparison to the other reported ethnic groups of Pakistan

Table 22 Statistical analysis of the Kohistani sample from District Swat

Table 23 Haplogroup frequencies and the respective variants of the Kohistanis sampled from Swat District

Table 24 The genetic diversity of the sampled Kohistani population from District Swat in comparison with the other reported ethnic groups of Pakistan

Table 25 MtDNA haplogroup frequency distribution in the five sampled populations of Dir and Swat Districts

Table 26 Haplogroup distribution among the individuals of Swat and Dir districts by associated geographic region of origin

Table 27 Genetic diversity in the mtDNA data within the five ethnic groups

Table 28 Pairwise Fst genetic distances (below the diagonal) and corresponding p-values (above the diagonal) between five ethnic groups from Swat and Dir districts based on mtDNA sequence data

Table 29 Genetic diversity in the Y-STR (27 loci) and frequencies of Y-SNP haplogroups within five ethnic groups from Dir and Swat Districts. The values for the Y-SNP haplogroups in brackets represent the 90% confidence interval.

Table 30 The genetic distances among the five ethnic groups, calculated as pairwise FST values based on 23 of the 27 STR loci. FST values are below the diagonal and the corresponding P-values above the diagonal.

Table 31A AMOVA results when population samples are grouped based on country of origin

Table 31B AMOVA results when population samples are grouped based on ethnicity


LIST OF FIGURES

Figure 1. The Multi-Regional hypothesis of modern human migration.

Figure 2. The representation of the Replacement hypothesis about human migration.

Figure 3. Diagram representing the Assimilation model.

Figure 5. Geographic location of Province of Pakistan.

Figure 6. Geographic distribution of primary languages spoken in KP, Pakistan.

Figure 7. Graphical representation of population samples from Swat and Dir districts.

Figure 8. The generally accepted genealogy of Pakhtun origin in KP, Pakistan.

Figure 9. Diagram representing the positional terms for the teeth and jaws.

Figure 10. Morphological traits of canines and incisors with respect to ASUDAS.

Figure 11. Morphological traits of premolars.

Figure 12. Morphological traits with respect to ASUDAS reference plaques (a) reference plaque representing the metacone in upper molars (b) reference plaque for scoring hypocone (c) reference plaque for metaconule (d) reference plaque for Carabelli’s trait (e) reference plaque for scoring parastyle.

Figure 13.1. Morphological traits (a) reference plaque for the anterior fovea in lower molars with an example to its right (b) reference plaque for the deflecting wrinkle with an example to its right side (c) reference plaque for the protostylid with an example to its right side.

Figure 13.2. Dental morphological traits and reference plaques (a) the Y- and X-pattern (b) reference plaque for the hypoconulid (c) reference plaque for the entoconulid (d) reference plaque for the metaconulid.

Figure 14. Diagrammatic view of human mtDNA.

Figure 15. Human migration and haplogroup distribution across the world.

Figure 16. mtDNA PhyloTree and partitioning scheme representing subtrees.

Figure 17. Modified structure of human Y chromosome.

Figure 18. Structure of the most recent and updated human Y-chromosome tree.

Figure 19. Geographic location of the study area. The colored circles represent the locations of villages where samples were collected.

Figure 20. Filling and signing of consent form, and cleaning of teeth by volunteer individuals.

Figure 21. Placement and removal of the alginate-filled tray from the subject’s mouth.

Figure 22. Pouring of diestone mixture into the alginate impression mold and labeling.

Figure 23. Scoring of dental morphology traits using the ASUDAS reference plaques.

Figure 24. Representation of thermocycling profile for PCR. Figure (A) represents PCR conditions for HVSI, while figure (B) represents PCR conditions for HVSII.

Figure 25. Frequencies of shovelling among living Pakistani ethnic groups, living ethnic groups from peninsular India and samples of the prehistoric inhabitants of the Indus Valley, South Central Asia (A) SHOVUI1 (B) SHOVUI2.

Figure 26. Frequencies of Median Lingual ridge among living Pakistani ethnic groups, living ethnic groups from peninsular India and samples of the prehistoric inhabitants of the Indus Valley, South Central Asia.

Figure 27. Frequencies of Y-Groove Pattern among living Pakistani ethnic groups, living ethnic groups from peninsular India and samples of the prehistoric inhabitants of the Indus Valley, South Central Asia.

Figure 28. Frequency distribution of hypocone (A) HYPOCONM1 (B) HYPOCONM2.

Figure 29. Frequencies of metaconule at upper molars (A) HYPOCONM1 (B) HYPOCONM2.

Figure 30. Frequency distribution of major cusp numbers at lower molars (CSPNLM) among all samples (A) Frequency of CSPNLM1 (B) Frequency of CSPNLM2.

Figure 31. Frequency distributions of entoconulid at lower molars (C6LM) among all samples included in this study (A) C6LM1 (B) C6LM2.

Figure 32. Frequency distributions of metaconulid at lower molars (C7LM) among all samples included in this study (A) C7LM1 (B) C7LM2.

Figure 33. Neighbor-joining cluster analysis of modern populations of northern Pakistan, peninsular Indian populations and their comparison to the major ethnic groups of Swat and Dir districts, Pakistan.

Figure 34. Multidimensional scaling (Kruskal's method) of the major ethnic groups residing in Swat and Dir districts in comparison with other living Pakistani and peninsular Indian ethnic groups.

Figure 35. Multidimensional scaling (Guttman’s method) of the major ethnic groups residing in Swat and Dir districts in comparison with other living Pakistani and peninsular Indian ethnic group samples.

Figure 36. Principal coordinate analysis of the major ethnic groups residing in Swat and Dir districts in comparison with other living Pakistani and peninsular Indians.

Figure 37. Neighbor-joining cluster analysis of the living Pakistani, other living and prehistoric inhabitants of the Indus Valley, South-Central Asia with the major ethnic groups from Swat and Dir districts.

Figure 38. Multidimensional scaling with Kruskal's method of Smith’s MMD pairwise distances among living Pakistani ethnic groups, living ethnic groups of peninsular India, prehistoric inhabitants of the Indus Valley, South Central Asia and the major living ethnic groups of Swat and Dir districts.

Figure 40. Principal coordinate analysis of living Pakistani, peninsular Indians, samples of prehistoric inhabitants of the Indus Valley and South Central Asia.

Figure 41. Photographs representing quality and concentration of gDNA (A) Agarose gel electrophoresis (B) electropherogram.

Figure 42. Agarose gel electrophoresis photograph of mtDNA control region (a) amplified PCR fragment of HVSI (b) amplified PCR fragment of HVSII.

Figure 43. Agarose gel electrophoresis photographs (a) eluted PCR products of mtDNA HVSI (b) eluted PCR products of mtDNA HVSII.

Figure 44. Graphical representation of mtDNA haplogroup frequencies present in the Gujar sample from District Swat.

Figure 45. Mega-haplogroup frequencies observed in the sample of Gujars from District Swat through mtDNA control region.

Figure 46. Distribution of Tarklani haplogroups by origin.

Figure 47. Graphical representation of haplogroup frequencies among the sampled Tarklani individuals from District Dir.

Figure 48. Haplogroup frequencies observed among the sampled Tarklani individuals from District Dir through mtDNA control region.

Figure 49. Graph representing haplogroup frequencies among the sampled Utmankheil individuals from District Dir.

Figure 50. The frequency of Mega-haplogroups observed among Utmankheils.

Figure 51. The distribution of Yousafzai haplogroups among the individuals sampled from District Swat by associated geographic origin.

Figure 52. The frequencies of mtDNA haplotypes of Yousafzai individuals sampled from District Swat with respect to their associated geographic origins.

Figure 53. The frequency of Mega-haplogroups observed in Yousafzai individuals from District Swat through mtDNA control regions.

Figure 54. Haplogroup distribution among the sampled Kohistanis from District Swat by associated geographic region of origin.

Figure 55. The frequencies of mtDNA haplotypes of Kohistanis sampled from District Swat with respect to their associated geographic regions of origin.

Figure 56. Mega-haplogroup distribution among the sampled Kohistani individuals from District Swat.

Figure 57. Mega-haplogroup distribution among members of the five sampled ethnic groups of Swat and Dir districts.

Figure 58. Haplogroup distribution among the individuals of the five sampled populations of Swat and Dir districts by associated geographic region of origin.

Figure 59. Distribution of mtDNA lineages (A) West Eurasian (B) South Asian (C) East Eurasian.

Figure 60. MDS plot of the five major ethnic groups of Swat and Dir districts derived from Fst genetic distances.

Figure 61. Network analysis of five population samples from Swat and Dir districts based on mtDNA sequence data.

Figure 62. An example of a typical electropherogram for the Y-STR multiplex reaction.

Figure 63. Multi Dimensional Scaling derived for the five major ethnic groups of Swat and Dir districts.

Figure 64. Median joining network based on the Y-STR haplotypes (23 loci) of the five population samples.

Figure 65. Multi-dimensional scaling analysis for 38 selected populations from the Indo-Pakistani sub-continent and neighboring countries.

Figure 66. Worldwide multi-dimensional scaling analysis of pairwise genetic distances, estimated as FST.

Figure 67. Y-chromosome haplogroup-specific networks based on Y-STR haplotypes (10 loci) with individuals assigned to (A) Y-SNP haplogroups G-Page94 and H1-M69 (B) Y-SNP haplogroup L1-M22(xM274).


ABBREVIATIONS

AMHs Anatomically Modern Humans

AWAm1 Awans collection from Mansehra District by BE Hemphill

AWAm2 Awans collection from Mansehra District by Nazia Sidiq

ChlMRG Early Chalcolithic Period collection from the archeological site of Mehrgarh (c. 4500 BC)

CHU Living tribal Chenchus from central Andhra Pradesh, India

DJR Djarkutan Period collection from the archeological site of Djarkutan (2000-1800 BC), Uzbekistan

GPD Low-status Dravidian-speaking Gompadhompti Madigas from southern Andhra Pradesh, India

GUJm2 Gujars collection from Mansehra District by Nazia Sadiq

GUJsw Gujars from Swat District (present study)

HAR Mature Period collection from the archeological site of Harappa (c. 2300-1800 BC), Province, Pakistan

INM Late Jorwe Period collection from the archeological site of Inamgaon (c. 1400 BC), Maharashtra, India

KARa Karlaars collection from District by Nazia Sadiq

KHO Khowars from Chitral City, Chitral District

KOHsw Kohistanis from Swat District (present study)

KUZ Kuzali Period collection from the site of Djarkutan (1800-1650 BC), Uzbekistan

MDK Living inhabitants of the village of Madak Lasht, Chitral District

MDA Living Madia Gond tribals from Eastern Maharashtra, India

MDS Multidimensional Scaling

MHR Living Indo-Aryan-speaking low-status Mahars from Western Maharashtra, India

MRT Living Indo-Aryan-speaking high-status Marathas from Western Maharashtra, India


MOL Molali Period collection from the site of Djarkutan (1650-1500 BC), Uzbekistan

NeoMRG Aceramic Neolithic Period collection from the site of Mehrgarh (c. 6000 BC), Baluchistan Province, Pakistan

PNT High-status Dravidian-speaking Pakanati Reddis from southern Andhra Pradesh, India

PCO Principal Coordinates Analysis

SAP Sapalli Period collection from the site of Sapalli tepe (c. 2200-2000 BC), Uzbekistan

SKH Iron Age collection from the site of Sarai Khola (c. 200 BC), Punjab Province, Pakistan

SWT Living Swatis collection from Dhodial and Baffa, Mansehra District by BE Hemphill

SYDm2 Syeds collection from Mansehra District by Nazia Sadiq

TANm2 Tanolis collection from Mansehra District by Nazia Sidiq

TMG Late Bronze/Early Iron Age collection from the site of Timargarha (1400-800 BC), Dir District, Pakistan

TRKd Tarklani from Dir District (present study)

WAKg Living Wakhis from Gulmit, Gilgit-Baltistan

WAKs Living Wakhis from Sost, Gilgit-Baltistan

MtDNA Mitochondrial DNA

CRS Cambridge Reference Sequence

HV Hypervariable

HVS Hypervariable Sequence

HVSI Hypervariable Segment I

HVSII Hypervariable Segment II


UTHd Utmankheil from Dir District (present study)

YSFsw Yousafzai from Swat District (present study)

GOP Government of Pakistan


ABSTRACT

The ethnic groups inhabiting Dir and Swat Districts of Khyber Pakhtunkhwa Province, Pakistan, are known to exhibit cultural and physical diversity. Genetic diversity among the people of this region, however, remains largely unknown. A study based on dental anthropology and molecular phylogenetics was conducted to elaborate the phenetic and molecular affinities among the major ethnic groups of the area. Morphological variants of the permanent tooth crown were recorded for phenotypic analyses, whereas mitochondrial DNA (mtDNA) and Y-chromosomal STRs/SNPs were used to assess maternal and paternal variation, respectively, within and among the Gujar, Kohistani, Tarklani, Utmankheil and Yousafzai tribes. Dental casts and oral swabs were collected from volunteers of all the tribes/ethnic groups. Morphological variants of the permanent tooth crown were scored from maxillary and mandibular dental casts in accordance with the Arizona State University Dental Anthropology System (ASUDAS). Two mitochondrial DNA control region segments, hypervariable segment I (HVSI) and hypervariable segment II (HVSII), together with 27 Y-STRs and 331 Y-SNPs, were used to explore molecular phylogenetic relationships. Dental casts were obtained from 823 healthy, unrelated individuals of the five ethnic groups of the two districts. The casts were analyzed for 14 tooth-trait combinations. The data were then compared with 27 samples encompassing 3,185 prehistoric and living individuals representing ethnic groups of the Hindu Kush-Karakoram highlands and Indus Valley of Pakistan, peninsular India, and Central Asia. Inter-sample affinities were computed with C.A.B. Smith’s pairwise Mean Measure of Divergence (MMD) statistic. Patterning of phenetic affinities was assessed with neighbor-joining cluster analysis (NJ), multidimensional scaling (MDS), and principal coordinate analysis (PCO). The results obtained vary with respect to the data reduction technique. Neighbor-joining cluster analysis groups the Gujars, Kohistanis and Utmankheils as possessing affinities to the ancient Harappan people of the Indus Valley, whereas the Yousafzais show affinities with ethnic groups of the Hindu Kush-Karakoram highlands. The Tarklanis exhibit no close affinities to Gujars, Kohistanis, Utmankheils or Yousafzais.
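
As a rough illustration of the MMD statistic named above, the following sketch computes a pairwise MMD from trait counts. It assumes the Anscombe angular transformation and the commonly used small-sample correction term; the abstract does not state which transformation was applied in this study. The function names and the trait counts are hypothetical and are not data from this thesis.

```python
import math


def anscombe_angle(k, n):
    """Angular transform of a trait frequency (k of n individuals express the trait).

    Assumption: Anscombe's form is used here; the Freeman-Tukey form is an
    equally common alternative.
    """
    return math.asin(1.0 - 2.0 * (k + 0.375) / (n + 0.75))


def mean_measure_of_divergence(sample_a, sample_b):
    """Pairwise MMD between two samples.

    Each sample is a list of (k, n) pairs, one pair per dental trait,
    with the traits listed in the same order in both samples.
    """
    if len(sample_a) != len(sample_b) or not sample_a:
        raise ValueError("samples must list the same non-empty set of traits")
    total = 0.0
    for (k1, n1), (k2, n2) in zip(sample_a, sample_b):
        diff = anscombe_angle(k1, n1) - anscombe_angle(k2, n2)
        # Subtract the expected sampling variance of the squared difference.
        total += diff ** 2 - (1.0 / (n1 + 0.5) + 1.0 / (n2 + 0.5))
    return total / len(sample_a)


# Hypothetical counts for two traits in two groups (not thesis data).
group_a = [(10, 50), (40, 50)]
group_b = [(40, 50), (10, 50)]
distance = mean_measure_of_divergence(group_a, group_b)
```

Because of the correction term, two samples with near-identical trait frequencies can yield a slightly negative MMD; such values are usually read as no detectable divergence.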


The mtDNA analysis yielded 126 haplotypes, of which 75 were unique and 51 were shared. The results further revealed that 45% of the individuals possess matrilineages of West Eurasian derivation, 36% of South Asian derivation and 6% of East Eurasian derivation, while lineages of other derivations occur at extremely low frequencies. The West Eurasian haplogroup R, found in 62% of individuals, was the most frequent haplogroup, followed by South Asian haplogroup M (32%) and East Eurasian haplogroup N (5%), while one individual was found to possess the African haplogroup L. The Y-STR analysis revealed 82 haplotypes, of which 75% were unique and 25% were shared, yielding a haplotypic diversity of 0.99. High and statistically significant levels of genetic differentiation were obtained in nine of the 10 pairwise comparisons (FST = 0.148-0.596), the exception being the contrast between Tarklanis and Yousafzais (FST = 0.008). Members of the Utmankheil, also considered a Pashtun tribe, were found not to be closely related to any of the other population samples (FST = 0.445-0.596). The high genetic differentiation was also visible in the Y-chromosomal SNPs, which showed very little overlap between the five population samples, except for Tarklanis and Yousafzais. When analyzed on a larger, continental scale, the paternal lineages of these five ethnic groups fall mostly outside the previously characterized Y-chromosomal gene pools of the Indo-Pakistani subcontinent. The findings presented here contribute towards the understanding of the genetic complexity exhibited by the apparently related ethnic groups residing in the northern parts of Pakistan. They provide a sound baseline for elaborating the historical profile and anthropological standing of the Pakistani people in the fast-approaching era of personal genomics and personalized medicine.
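
The haplotypic diversity reported above is commonly computed with Nei's unbiased estimator, H = n/(n - 1) × (1 - Σ p_i²), where n is the sample size and p_i is the frequency of the i-th haplotype. The following Python sketch shows the calculation under the assumption that this estimator was the one used; the function name and the haplotype labels are hypothetical and do not come from the study data.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity for a list of haplotype labels."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two individuals are required")
    counts = Counter(haplotypes)
    # Sum of squared haplotype frequencies
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical sample: five individuals, one haplotype shared by two of them
example = ["H1", "H1", "H2", "H3", "H4"]
hd_example = haplotype_diversity(example)

# When every individual carries a different haplotype, diversity is 1
hd_all_unique = haplotype_diversity(["H1", "H2", "H3", "H4"])
```

A value close to 1, such as the 0.99 reported for the Y-STR data, indicates that two randomly drawn individuals almost always carry different haplotypes.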


Chapter 1 INTRODUCTION

1.1. History of modern humans

Since the dawn of human civilization, people have asked where the human race came from, where it is going, who our closest relatives were, and what circumstances led to the evolution of Homo sapiens (H. sapiens). Scientists seek answers to these questions using the principles of evolution and molecular genetics (Whale, 2012; Stoneking, 2008). Humans are unique owing to: 1) an evolved intelligence, 2) hyperprosociality, and 3) a psychology for social learning (Marean et al., 2015). Ultimate explanations for these features are best sought through synthetic studies drawing on biology, genetics, anthropology and archaeology. The fossil record shows that the lineage leading to extant modern humans appeared between ∼300 and 100 thousand years ago (Poznik et al., 2013; Scally and Durbin, 2012; Endicott et al., 2010; Underhill and Kivisild, 2007). The fossil record also suggests that the world had a diverse set of hominin lineages between ∼800 and 40 thousand years ago (kya). These comprised a modern human lineage in Africa (e.g., Omo-Kibish No. 2, Ngaloba, Jebel Irhoud, Herto), at least one archaic African lineage, H. heidelbergensis, represented by a number of fossil specimens (e.g., Bodo, Kabwe, Elandsfontein, Saldanha), two archaic Eurasian lineages (Neanderthals and Denisovans), and a widespread archaic Eurasian lineage commonly referred to as Homo erectus, which shows considerable temporal and geographic variation (Meyer et al., 2014; Prufer et al., 2014; Mendez et al., 2013; Lachance et al., 2012; Hammer et al., 2011; Harvati et al., 2011).

Around 700 kya, and perhaps earlier, H. erectus in Africa gave rise to H. heidelbergensis, a species more similar to modern humans in terms of body symmetries, dental adaptations and cognitive factors (Rightmire, 2009).

Archaeological and DNA evidence suggests that H. sapiens evolved in Africa about 200 kya, probably from H. heidelbergensis (Rightmire, 2009; Relethford, 2008). H. heidelbergensis, often referred to as an "archaic" H. sapiens, was a dynamic big-game hunter, produced sophisticated tools, and by at least 400 kya had the ability to control fire (Roebroeks and Villa, 2011). Owing to its distinctive characteristics and the advent of effective hunting techniques, H. sapiens was able to flourish in sub-Saharan Africa, from which it dispersed to Eurasia, Australia, the Americas and Oceania (DeGiorgio et al., 2009).

1.1.1. Dispersal of modern man

Despite the broad consensus that Africa represents the main place of origin for Anatomically Modern Humans (AMHs), the routes by which humans dispersed from the continent remain a subject of considerable debate.

One of the most highly debated issues concerning the origins of modern humans arises from the fact that, roughly 100,000 years ago, the Old World was occupied by a morphologically diverse group of hominins. In Africa, as well as in the Middle East, there was H. sapiens; in Asia, Homo erectus; and in Europe, Homo neanderthalensis (Klein, 2008).


However, by about 30,000 years ago this taxonomic diversity had disappeared, and all that remained were anatomically and behaviorally modern humans (Johanson, 2001; Klein, 1999; Tattersall and Schwartz, 1999; Clark and Willermet, 1997; Stringer and McKie, 1996; Wolpoff and Caspari, 1996; Nitecki and Nitecki, 1994; Smith and Spencer, 1984). What is disputed is not whether modern humans evolved from earlier hominin species, nor from which archaic species they derived, but where, geographically, this occurred. Three hypotheses are currently popular among paleoanthropologists:

(i) The multi-regional hypothesis

(ii) The replacement hypothesis

(iii) The assimilation hypothesis.

Each of these hypotheses is based on fossil, archaeological, anthropological and genetic evidence (Stringer, 2002; Mellars, 2006; Schick and Tooth, 1994). An introduction to each hypothesis is given below:

The multi-regional hypothesis

Proponents of the multi-regional hypothesis (Fig. 1), or the Regional Continuity Model, suggest that Homo erectus migrated out of Africa to the various regions of the world nearly 2.0 million years ago (mya), where it gradually evolved into AMHs, giving rise to the current worldwide distribution (Wolpoff et al., 1984; Nei, 1995). For example, Asian H. erectus evolved into Asian modern humans, African H. erectus evolved into African modern humans, and so on. It has also been reported that the multi-regional model does not suggest parallel evolution, independent multiple origins or the simultaneous appearance of characteristics within different regions (Wolpoff et al., 2000). This hypothesis also states that the regional characteristics of modern humans can be traced back to H. erectus remains that date to nearly 1 mya (Nei, 1995).

Genomic studies have found no evidence of a Homo neanderthalensis mtDNA contribution to AMHs (Hodgson and Disotell, 2008). This conclusion rests on the much larger number of differences found between Neanderthal and modern human mtDNA than between any two modern human mtDNA sequences. However, some sequences attributed to the Homo erectus X chromosome have been identified in the genome of modern humans (Cox et al., 2008). This finding provides some genetic support for the multi-regional hypothesis.

Figure 1. The Multi-Regional hypothesis of modern human migration (Stoneking, 2008). [Diagram labels: Africa; Asia; Australasia; Europe]

It should be noted that, as presented here, the model does not allow for interbreeding between different H. erectus populations; instead, breeding took place mainly within isolated H. erectus populations. Hence, proponents of this hypothesis conclude that each inhabited region showed a continuous anatomical sequence leading to the development of modern humans, and that non-African populations exhibited no special African influence (Stringer, 2002).

The Replacement hypothesis

The Replacement hypothesis, or Out-of-Africa theory, is the primary alternative to the multi-regional hypothesis (Fig. 2).

Figure 2. The representation of the Replacement hypothesis about modern human migration (Stoneking, 2008). [Diagram labels: Africa; Asia; Australasia; Europe]

This hypothesis also posits an African origin. However, whereas proponents of the multi-regional hypothesis focus mainly on the dispersal of H. erectus rather than of AMHs, proponents of the replacement hypothesis suggest that modern humans originated from an African H. erectus population about 100,000-200,000 years ago (Nei, 1995), or perhaps ~150,000 years ago (Forster and Matsumura, 2005).

This indicates that modern humans first expanded within Africa, then migrated to the Middle East and onwards to other regions. Advanced genetic techniques have been used to test this hypothesis (Whale, 2012). DNA obtained from Africans, Asians, Australians, New Guineans and Europeans was analyzed for restriction fragment length polymorphisms, and it was concluded that the common ancestor of all modern humans lived in East Africa between 140 and 280 kya (Cann et al., 1987). mtDNA sequences obtained from chimpanzees and humans were used to determine the rate of mtDNA evolution, and the results indicated that the common ancestor of all modern humans dates to some 166-249 kya (Vigilant et al., 1991). Other studies also supported the Out of Africa hypothesis and dated the ancestor of modern humans to about 230-298 kya (Hasegawa and Horai, 1991; Ruvolo et al., 1993). The subsequent dispersal of AMHs out of Africa proceeded along a northern route (the Levant) or a southern route through the Horn of Africa. An initial single migration through the southern route has been supported by many researchers on the basis of an array of different data sets (Chandrasekar et al., 2009; Kumar et al., 2009; Hudjashov et al., 2007; Mellars, 2006; Forster and Matsumura, 2005; Macaulay et al., 2005; Kivisild et al., 1999). The Levantine migration appears to have had a lesser impact and to have occurred more recently, about 20-10 kya (Forster and Matsumura, 2005; Winters, 2011). A third migration has also been proposed, through the narrow Strait of Gibraltar from North Africa approximately 40-35 kya, when Neanderthals were still present in western Eurasia (Winters, 2011). Archaeological and molecular genetic evidence supports a single AMH origin in East Africa (Liu et al., 2006).

Research based on mtDNA and Y-chromosome variation also supports the out of Africa hypothesis. Y-chromosome and mtDNA haplogroup data show that Australian Aboriginals and Melanesians carry founder haplogroups (haplogroups N and M for mtDNA, and haplogroups F and C for the Y chromosome) that are associated with the initial movement out of Africa about 50-70 kya. Australian Aboriginals and the indigenous populations of Papua New Guinea and Melanesia are related to each other and were once settled together; they were later separated by the Timor Strait (Hudjashov et al., 2007).

The Assimilation model

The Assimilation Model (Figure 3) combines the two former theories, in that AMH “arose through the integration of an important African role with multiregional aspects” (Stringer, 2002). According to proponents of the assimilation model, Africa is the place of origin of AMHs; however, the model also suggests that migrations and the replacement of archaic populations played a pivotal role in the local evolution of various H. erectus populations into AMHs.


It has also been reported that the genomes of Eurasian AMHs share 1-4% of the Neanderthal genome, and that the Neanderthal genome shows a greater affinity to the genomes of European AMHs than to those of African AMHs (Green et al., 2010).

Previous studies also show that the Neanderthal genome is as similar to the genomes of French individuals as it is to East Asian (Han Chinese) and Papuan genomes. This pattern suggests that admixture between AMHs and Neanderthals occurred soon after modern humans dispersed out of Africa, but prior to the subsequent divergence of Europeans, East Asians and Papuans (Green et al., 2010).

Figure 3. Diagram representing the Assimilation model of the dispersal of anatomically modern humans (Stoneking, 2008). [Diagram labels: Africa; Asia; Australasia; Europe]

1.1.2. Time line and routes of Modern Human dispersal

Archaeological and genetic data suggest Africa as the home of AMHs (H. sapiens). Phylogeographic studies utilizing uniparental, non-recombining DNA, namely mtDNA and the male-specific region of the Y chromosome (MSY), have largely clarified the initial migration routes of anatomically modern humans (Underhill and Kivisild, 2007). The number of migration events of AMHs out of Africa is still debated, but studies based on uniparental markers suggest a single migration event (Oppenheimer, 2012; Underhill and Kivisild, 2007). A simplified sketch of the initial dispersal of AMHs is presented in Figure 4.

Figure 4. The main migration routes and timing of the migrations of humans out of Africa, adapted from Oppenheimer (2012). [Map labels: 150-100 kya; 60-50 kya; 40 kya; 35-25 kya; 20-15 kya; 12? kya; 11? kya]

Haak (2015) used ancient DNA to investigate a massive migration of humans from the steppe into Europe and found support for a steppe origin (the steppe hypothesis) of at least some of the Indo-European languages of Europe.

The timeline for the migration(s) of anatomically modern humans out of Africa is controversial. However, several lines of evidence indicate that modern humans migrated out of Africa some 100-72 kya and moved eastwards towards the Indian subcontinent via the Arabian Peninsula (Oppenheimer, 2012; Relethford, 2008). After the initial dispersal out of Africa, anatomically modern humans traveled southeast and reached Australia about 60-50 kya (Oppenheimer, 2012; Rasmussen et al., 2010; Relethford, 2008). The migration of anatomically modern humans to Europe from the Arabian Peninsula occurred approximately 40-50 kya (Soares et al., 2010; Relethford, 2008; Novelletto, 2007), and at about 40 kya Central Asia was settled by humans from Pakistan through the East Asian sea coast (Oppenheimer, 2012). Later, approximately 30-20 kya, populations from Central Asia migrated westward toward Europe and eastward into Beringia (Oppenheimer, 2012).

Most studies indicate that humans from Beringia reached Alaska approximately 20-15 kya (Oppenheimer, 2012; Raff et al., 2011; O'Rourke and Raff, 2010). However, an alternative hypothesis about human migration to the Americas states that a Pacific coastal route was used for migration from Siberia to South America, followed by a second migration across the Bering land bridge into North America (O'Rourke and Raff, 2010; Schurr and Sherry, 2004).

Most studies indicate that the last geographic region colonized by anatomically modern humans was Oceania, specifically Polynesia (Kayser et al., 2010). Some evidence has also been offered to suggest that anatomically modern humans migrated to Southeast Asia and Australia about 60-70 kya along the coasts of the Indian Ocean (Lahr and Foley, 1994).

It has also been reported that Pakistan was the first geographic region through which anatomically modern humans passed along this postulated southern coastal route (Wolpert, 2000; Qamar et al., 1999).

1.2. The Islamic Republic of Pakistan

Pakistan is home to more than 200 million people and at least 18 ethnic groups that speak more than 60 local languages, which have been assigned to a wide array of linguistic stocks including, but not limited to, Indo-Iranian, Indo-Aryan, Dardic, Tibeto-Burman and Dravidian (Grimes and Grimes, 2000; Newcomb, 1986). Pakistan occupies the eastern Hindu Kush, western Himalaya and southern Karakoram. These famous mountain ranges meet in Pakistan at Jaglot, near Gilgit. Pakistan lies at the crossroads of West Asia, Central Asia and South Asia (Ali et al., 2005). The region is marked by a high degree of ethnic diversity, which historically has been attributed, at least partially, to a long and dynamic history of repeated invasions by Aryans, Macedonians, Arabs, Mongols and others (Lapidus, 2002; Bernhard, 1983; Birdwood, 1959). It is also believed that the Makran coast of Pakistan and present-day Afghanistan likely served as a passage for human dispersal in prehistoric times, making the population dynamics of this region even more interesting (Derenko et al., 2013). Additionally, the Hindu Kush highlands served as a physical barrier that channeled trade along the “Silk Route”, which linked the Mediterranean Basin and West Asia to China for more than 16 centuries (Petraglia et al., 2012; Kuzmina, 2008; Elisseeff, 2001; Quintana-Murci et al., 1999). Furthermore, Pakistan is a South Asian country that hosted two well-known civilizations: the Indus Valley or Harappan civilization, which flourished between 2600 BC and 2000 BC (Kenoyer, 1998), and the Gandharan civilization, which peaked between 1500 and 1000 BC (Miller, 1985; Basham, 1963). It is therefore possible that the extant populations of the Hindu Kush highlands show traces of historic and even prehistoric gene flow from far distant human populations. Currently, Pakistan is divided into five provinces (Punjab, Sindh, Baluchistan, Gilgit-Baltistan and Khyber Pakhtunkhwa (KP)) and the Federally Administered Tribal Areas (FATA) (Fig. 5).


Figure 5. Geographic location of Khyber Pakhtunkhwa Province of Pakistan

Khyber Pakhtunkhwa, where the Pashtuns are in the majority, is situated in the northwestern part of Pakistan and is recognized as a heartland of civilization (Zwalf, 1996). About 300 historic sites have been identified in different areas of the province (Arif, 2014; Docherty, 2007). Remains of animals and humans, together with bronze coins recovered during excavations, have shed light on Gandhara culture (Dani, 1980). The Bronze Age coins, dated to about 3000 BC, were found to be associated with the Alxon Hunnic kings of Gandhara, Bactria and other dynasties. Pre-Harappan (4000 BC) sites were also discovered at Rehman Dheri, located on the trade route in KP that connects South Asia, eastern Iran, and southern and Central Asia (Khan, 2013a; et al., 1991). Furthermore, about 50,000 petroglyphs and inscriptions along the Karakoram Highway near Shatial and at Thag Nala near Astor, dating from the 5th to the 9th century BC (Khan, 2013b), attest to the movement of people from different regions of the world into Pakistan. The historical view has been that KP was inhabited by Indo-Aryans in 2000 BC (Renfrew, 1996). mtDNA haplotypic diversity in the populations of India, Afghanistan, Iran, Turkey and Central Asia indicates that genetic influx from the Fertile Crescent to the Indian subcontinent was more frequent than influx from East to West (Kaifu et al., 2015; Quintana-Murci et al., 2004). The historical record documents that KP was ruled by the Persians in 550 BC, the Macedonian dynasty in 330 BC, the Mauryan Empire in 322 BC, the Kushana monarchy in 250 BC and the Shahi realm in AD 1000; underwent a Ghaznavid invasion in AD 997; was incorporated into the Turk-Mongol Gurkani domain in AD 1200 and the Yuan Dynasty in AD 1271; and experienced influxes of Pashtuns beginning in the 16th century and in the 18th century (Marbaniang, 2015; Tamimi, 2009; Rome, 2008; Aslamkhan, 1996; Barth, 1956). The merging of foreign elements with the indigenous inhabitants produced a unique society and culture and a high level of diversity in the population of KP (Crews, 2015). Saraiki, Khowar, Gujri and Kohistani are among the primary languages spoken in different regions of the province (Cunliffe, 2015; Bouckaert et al., 2012), and their respective areas are illustrated in Figure 6.

The province comprises 27 districts, including Bannu, Buner, Peshawar, Abbottabad, Mansehra, Shangla, Swabi, Upper Dir, Lower Dir, Tank, Swat, Nowshera, Karak, Tor Ghar, Kohistan, Hangu, Haripur, Batagram, Lakki Marwat, Malakand, Chitral and D.I. Khan, with a population of approximately 31 million according to the most recent census, held in 2017 (GOP, 2002).

Figure 6. Geographic distribution of primary languages spoken in Khyber Pakhtunkhwa, Pakistan.

1.3. The Study Area

Swat and Dir districts were selected for the study of the dental morphology and molecular anthropology of their major ethnic groups. A brief account of both districts is provided below:


1.3.1. District Swat

Swat is a district located in the Khyber Pakhtunkhwa (KP) province of Pakistan with a population of around 1.26 million (according to the 1998 census; GOP, 2002). It is the largest of the Hindu Kush valleys and encompasses an area of some 6226 km² between 34° 30’ and 35° 55’ N latitude and 71° 45’ and 72° 50’ E longitude. The altitude of Swat ranges from 600 m in the south to more than 6000 m in the north, with the highest peak, Falaksair, attaining an elevation of 6261 m AMSL (Ali et al., 2012; Ahmad and Ahmad, 2003). The valley is bordered by Indus Kohistan and Shangla to the east, Chitral and Ghizer to the north, Buner and Malakand Agency to the south, and Dir to the west (GPO, 1998). The geographic position of District Swat is presented in Figure 5.

The Swat valley, bounded by the Hindu Raj mountains, occupies an important position among the Hindu Kush and Himalayan mountains of Pakistan and is famous for its natural resources and biodiversity (Ahmad et al., 2015). Archaeological evidence indicates that the valley was occupied in the prehistoric period, between 2400 and 2100 BC (Ali and Khan, 1991; Stacul, 1969). Several civilizations have passed through Swat in successive waves. Swat was first mentioned, as Suvastu, in the sacred book Rigveda, the religious account of the Aryans; the name means ‘good dwelling’ in Sanskrit. The Latin and Greek historiographers of Alexander’s army referred to the Swat Valley as Soastos, derived from the Vedic Swastu. In Buddhist literature, Swat is still named ‘Urgyan’ or ‘Orgayan’ (Tucci, 1958); both terms are phonetic versions of the Sanskrit word ‘Uddyana’. According to the Chinese travelers Fa Hien, Wicking, Hiuen Tsang and Song Yun, Swat remained under Gandharan control from the 5th to the 8th century AD (Hussain, 1962; Shah, 1940; McMahoon and Ramsy, 1901). Fa Hien, who explored Swat in 403 AD, called it Won Chang in Chinese, meaning ‘park’ in English. He also mentioned that the people of Swat spoke an Indo-Aryan language (McMahoon and Ramsy, 1901). Swat flourished for more than 1000 years under Buddhist and Brahminic rule, and carvings and inscriptions from this period survive on rocks, embroidery and woodwork all over the area. Ahmed and Sirajuddin (1996) maintain that the vegetational landscape of the area is predominantly Sino-Japanese in nature. The Aryans, alleged to be emigrants from Central Asia who likely spoke one or several proto-Indo-Iranian languages, took over the region from Iran to northwest Pakistan in the second millennium BC, and the mention of Suvastu (modern Swat) in the Rigveda attests to Aryan colonization of the Swat valley (Allchin and Allchin, 1982). In 327 BC, Alexander crossed the Hindu Kush, travelled towards Afghanistan and occupied Swat (Rome, 2008). As Greek power declined, Chandragupta Maurya attacked the Macedonians and occupied the whole of Punjab (Smith, 1914). In the Swat valley, Chandragupta Maurya established his stronghold at Mura Hill, in Malakand Agency, which remained the last stronghold of the Gujar dynasty in the Hindu Kush region (Anonymous, 1998; Ahmad et al., 2011). He expanded his empire and, during the reign of his grandson, Asoka the Great, Buddhism became predominant in Swat in the 3rd century BC (, 1997).


After the fall of the Mauryan Dynasty, the Bactrian Greeks took over the whole region, including Gandhara, Hunza and Swat. They retained Swat until the invasion of the Turk Shahis, who extended their kingdom of Kabul from the borders of Seistan to the north of Punjab during the 7th century AD; by 745 AD Swat was completely occupied (Rehman, 1979). After the downfall of the Turks, the Hindu Shahi dynasty established its rule in AD 822, which lasted until the 11th century AD (Rehman, 1993). Sultan Mahmud Ghaznavi (Mahmud of Ghazna) occupied the valley of Swat in the 11th century AD, defeating Raja Gira, whereupon the Pushto language and Islamic law were introduced. Later, the valley was occupied by, among others, Swati Pathans/Pashtuns (Swati, 1997). The Yousafzai Pashtuns/Pathans placed their mark on the valley in the 16th century, defeating the Swatis (Rome, 2008; Qasmi, 1939). Today the Swat valley is identified with three ethnic groups: Pashtuns/Pakhtuns, Gujars and Kohistanis (Barth, 1956).

1.3.2. District Dir

Dir District comprises hilly and mountainous terrain consisting of the main Dir Valley, several side valleys, narrow mountain gorges and part of the plains of the area. It has high peaks ranging from 4876 m in the northeast to 3048 m along the eastern boundary with Swat and the western boundary with Afghanistan (Rahatullah et al., 2011). The total area of Dir, when it was considered a single district, was 5284 km²; it is now divided into two separate districts (Lower Dir, 1,585 km², and Upper Dir, 3,699 km²) that lie in the Hindu Kush range between 71°50 and 71°83 E longitude and 35°10 and 35°16 N latitude (Ali et al., 2008). The 1998 census reported a total population of approximately 1.38 million (GOP, 2002). Along the western border, from north to south, stretches the mountain range known as the Koh-i-Hindu Raj (Fig. 1). To the east, from north to south, runs the mountain range of Swat and Dir, which serves as a boundary between the two districts and, in the north, separates Swat Kohistan from Dir Kohistan (Hazrat et al., 2007). The district is also bounded by Bajaur Agency to the west, Malakand District to the south and Chitral to the north (Figure 5). Dir was invaded by Alexander, Buddhists and Mughals, but the most important event was the settlement of the Yousafzai in the 16th century (Shah, 2013).

As in Swat, Pashtuns/Pakhtuns are the major ethnic group of District Dir, followed by Gujars and Kohistanis; the majority of the people speak Pashto, followed by Gojari and Kohistani (Bellow, 1994). A brief historical review of the selected ethnic groups of the study areas is given below; the sampled groups are shown in Figure 7.

Figure 7. Graphical representation of the present study population samples from Swat and Dir districts. [Diagram labels: Selected population for present study; Pashtuns (Utmankheil, Tarklani, Yousafzai); Gujars; Kohistanis]

1.4. The Pashtuns/Pakhtuns

Pashtuns are an Eastern-Iranian-speaking Afghan ethnic group with a widespread geographic distribution in the southern and eastern parts of Afghanistan, in the northwestern portion of Khyber Pakhtunkhwa and in Baluchistan province of Pakistan (Haber et al., 2012; Caroe, 1976). The terms Afghan, Pukhtun, Pathan and Pashtun are used synonymously in the literature (Glatzer, 1998). The origins of the Pashtuns are rather poorly understood, not only in terms of population genetics, but also in terms of history (Sabitov, 2011).

There are many hypotheses about the origins of, and inter-relationships among, the various ethnic groups subsumed under the more general term “Pashtun.” Some historians are of the opinion that Pashtuns are the descendants of Jews (Qamar et al., 2002; Caroe, 1958). Some European authors maintain that Pashtuns are a Caucasian ethnic group descended from Armenians, while perhaps the strongest argument is that Pashtun Afghans are of Aryan origin (Elphinstone, 2011; Mirabal et al., 2010; Robson and Lipson, 2002). Some genetic evidence also suggests a very close relationship between Ashkenazi Jews and Pashtuns (Bhatti et al., 2016a). It has also been reported that Pashtuns originated from Greeks (Firasat et al., 2007). Furthermore, the Pashtuns cannot be defined by their ethnicity alone; they are also defined by speaking Pukhto/Pashto and by practicing a set of traditional cultural values known as Pakhtunwali, also called Pukhto (Barfield, 2010; Coningham and Young, 2015; Bohner and Lucarini, 2015; Khan, 2008; Nusser and Dickore, 2002; Caroe, 1958). Among the ethnic subgroups of Pashtuns (Fig. 8), the Yousafzai, Tarklanis and Utmankheils were selected for this dissertation because these groups are representative of the study area.

The genealogy of the Pashtuns is summarized in Figure 8.

Figure 8. The generally accepted genealogy of the Pakhtuns of Khyber Pakhtunkhwa, Pakistan (modified from Caroe, 1958). [Genealogy labels: Afghanan; Qais Khalid Bin Waleed; Sarbani, Bitan, Ghurghakhti, Karlanri; Sharkbun, Hussainkhel, Krozai, Sanzarkheil, Karshbun, Dotani, Mattizai, Essakheil, Mehsuds, Ghoryakhel, Khattak, Sahak, Yasinzai, Stanikzai, Lodhi, Jadoon, Utmankheil, Musakheil, Tarklani, Ghilzai, Khatak, Yousafzai, Daudzai, Abdali, Alikozai, Achakazai]

1.4.1. The Yousafzai

The Yousafzai (literally “Sons of Joseph”) are a sub-tribe of Pashtuns found in the northern areas of KP, Pakistan (Tokayer, 2007). The Yousafzai have spread over a large area that stretches from the Bajaur Agency, contiguous with the Durand Line, to the easternmost reaches of Mansehra (Caroe, 1958). The Pakhtuns residing in Swat, Dir, Buner, Shangla, Mardan, Swabi and Malakand mainly belong to the Yousafzai sub-tribe (Barth, 1959).

Historical accounts indicate that the Yousafzai inhabited Kabul, along with other Pakhtun tribes like Muhammad and Mumand, but owing to clashes with the Mughal ruler, Mirza Alagh Beg, they migrated from Kabul to Peshawar at the end of the 15th century under the guidance of their leaders Ahmad and Sheikh Malli (Sirajuddin, 1970). After expelling the Delazak from Peshawar, the Yousafzai occupied the Mardan, Swabi, Swat, Buner, Dir and Bajaur valleys, pushing the native population into other areas such as Hazara or the inaccessible mountain gorges (Yasin, 2008; Barth, 1959; Caroe, 1958). Owing to their historic position among the Pashtuns, the Yousafzai are the most widely studied and recognized population in terms of tribal and clan structures, genetic profile, politics, history, language and marriage practices (Lindholm, 1982; Ahmed, 1976; Barth, 1959; Caroe, 1958). According to Ilyas et al. (2015), all populations share a similar demographic history between 1 million and 200 kyr ago. From 200 kyr to 20 kyr ago, the Pathans follow a trajectory similar to that of other Asian and European populations, with an inferred effective population size smaller than that of African populations, reflecting the out of Africa bottleneck. Over the last 20 kyr, the Pathans show an explosion in effective population size, contemporaneous with that of other Eurasian populations but much greater in magnitude. This very large effective population size likely reflects admixture between European and Asian lineages giving rise to modern Pathans rather than an actual increase in census size.

1.4.2. The Utmankheils

Utmankheils are a Pathan subtribe inhabiting a large area spread across the hills surrounding the valley of Peshawar, including the country west and southwest of the junction of the Swat and Panjkora (Dir) rivers, Bajaur, Malakand Agency and some parts of Mardan (Murray, 1899; FATA, 2010; International Crisis Group, 2006). The Utmankheil appear to have acted in concert with the Tarklani and Yousafzai in the campaigns referred to above, and to have settled in the country they currently occupy at about the same time as the conquest of Swat and Dir by the Yousafzai. The Utmankheil belong to the Karlanri branch of Pashtuns; within the Karlanri, the origin of the Utmankheil clan is debated, as its current sub-populations are said to descend from a child of unknown origin adopted by the Pashtuns (Barfield, 2010; Caroe, 1958). The Utmankheil are further divided into clans including Ismailzai, Bimmarai, Mandal, Muttakai, Sanizai, Aseel, Gorai, Boot and Shamozai (Yaad, 1986).

1.4.3. The Tarklanis

Tarklanis are a Pashtun clan found mainly in the Federally Administered Tribal Areas (FATA) of Pakistan, while a large number also reside in Afghanistan and in District Lower Dir of Khyber Pakhtunkhwa Province, Pakistan (Rehman et al., 2016; Caroe, 1958; FATA, 2010; Tareekh-e- kakzai, 1993; International Crisis Group, 2006). Tarklanis are further divided into four clans, including Mammund, Isazai and Ismailzai (Yaad, 1986). They came with the Yousafzai from central Afghanistan, replacing the Dilazak of the Peshawar valley, and moved towards Swat, Dir and Malakand Agency in the 15th century A.D., where they acquired separate ownership (Political and Secret Department, 1933). Among the Pashtun ethnic groups of the present study, the Utmankheils and Tarklanis have not been studied as widely as the Yousafzai; therefore, only a brief account is provided for these two of the three sampled populations.

1.5. The Kohistani

The word “Kohistan” literally means “the place of mountains.” Geographically, Kohistan is divided into three areas: Dir Kohistan, Indus Kohistan and Swat Kohistan, and the people living in all three of these regions are referred to as “Kohistanis” (Hamayun, 2005). Kohistanis speak an array of Dardic languages and practice a wide range of agricultural and transhumant herding subsistence strategies (Bangash, 2012; Barth, 1956). The Kohistanis are commonly thought to be the descendants of the ancient nomadic herders of the area, who were forced from the low-lying fertile plains of Dir and Swat into the mountainous highlands by Pashtun invaders from the west during the 16th century A.D. (Shah, 2013; Rome, 2008; Barth, 1956). Prior to the 15th or 16th centuries the Kohistanis were non-Muslim, but due to the influence of the Yousafzai immigrants they converted to Islam (Baart and Sagar, 2002). The population of Kohistanis in districts Swat and Dir is estimated to be between 60,000 and 70,000 individuals (Hamayun, 2005).

1.6. The Gujars

Gujars, who speak Gujari/Gojri (a lowland Indo-Aryan language), are an ethnic group found in northern India and the mountainous regions of northern Pakistan, northern Afghanistan and Kashmir (Grimes and Grimes, 2000; Lalata et al., 1971; Barth, 1956). The spelling of Gujar is not uniform, and the group may be referred to by any of the following: Gurjara, Gojar, Gujjar, Goojar, Gujar and Gurjjara. Gujars are ancient pastoralist and farming communities who herd livestock or dairy buffalo; most are settled or semi-settled agriculturalists who practice seasonal transhumance (Gooch, 1992; Barth, 1956). The pastoral Gujars, who speak Gojari along with other local languages, are said to be the descendants of the ancient Gurjaras. There are many hypotheses regarding the origin of the Gujars and their inter-relationships. Some anthropologists recognize the Gujars as Kushan, an Indo-Scythian tribe (Cunninghum, 1865). They are considered to be Central Asian in origin, from where they reached India along with the Huns in the 5th century A.D. and settled in Rajasthan. In the 16th century A.D. they moved from Rajasthan towards Himachal Pradesh, followed by Kashmir and Punjab. It has also been reported that the Gujars migrated from Georgia, also called Gurjistan (in Persian, Turkish and Arabic), through Afghanistan and reached India (Tyagi, 2009). Previous genetic work found the Gujars to be genetically closer to the pastoral, cattle-farming Gola ethnic group in India than to other Pakistani ethnic groups (Raza et al., 2013). Gujars all over the subcontinent claim to have been indigenous since time immemorial. Indeed, many Gujars also claim with confidence that they are Kshatriyas by origin, descendants of the Suryavanshi Kshatriyas (Sun Dynasty), and connect themselves with the Hindu deity Rama, without any trace of so-called foreign origin (Lalata et al., 1971).

According to the 1941 census report of India, the tribe called “Gurjaras” was established in the area near Mount Abu in Rajasthan around the 6th century A.D. The “Gurjaras” were Hindus at the time they first appeared in India and established their own kingdom in A.D. 640. It seems that the Gujars successfully resisted the Arab invasion from the north early in the eighth century A.D. It is alleged that about A.D. 750 the Chapa dynasty of the Gurjaras, which was in power for about 200 years, was displaced by the Pratiharas in A.D. 1000. They embraced Islam after being defeated by Mahmud of Ghazni, and their kingdom fully flourished during the reign of Akbar. In India the Hindu Gujars are assimilated into several other groups of Hinduism, while in Pakistan the Gujars are considered a tribe (Parishad and Bharatiya, 1996).

Today the Gujars are prominent in agriculture and urban professions, contribute substantially to the civil services, and occupy large tracts of land, especially in the northern parts of Pakistan and India. The population of Gujars in India is approximately 30 million, while in Pakistan it is about 33 million. Due to the lack of food and disasters caused by wars, the Gujars migrated northwards toward Kashmir and occupied many areas of the region, including Rajasthan, Gujarat and Kathiawar (Wreford, 1943). A portion of the migrating Gujars also moved to the northern areas of Pakistan, including Swat and Dir, some 400 years ago (Chauhan, 2001; Rome, 2008; Barth, 1956). Although the country is inhabited by a population of tremendous ethnic diversity, the diversity among the people of this region remains largely unknown genetically.

1.7. The genetic characterization of human populations

Genetic characterization of modern human populations is very important for investigating or confirming archeological, anthropological and other information related to human history, genetic polymorphisms, racial biases and medical relevance (Bodmer, 2015; Macaulay et al., 2005; Renfrew, 2000; Ingman et al., 2000; Excoffier and Langaney, 1989; Cann et al., 1987). The evaluation of molecular techniques used to study the genetic structure of human populations and the results obtained can yield much insight into human health and history (Bodmer, 2015). Previous studies have interpreted the presence of genetic sub-structure in human populations as the consequence of the migration patterns of subgroups and of genetic drift. Consequently, individuals of the same group are genetically more similar to each other than to individuals of another group (Henn et al., 2016; Novembre, 2011; Tishkoff et al., 2009; Jakobsson et al., 2008; Rosenberg et al., 2002; Cavalli-Sforza et al., 1994).

Genetic divergence in a population may occur due to non-random mating among isolated populations as well as the genomic diversity within and among populations, which is determined primarily by mutation and certain demographic factors like effective population size and the extent of migration (i.e., gene flow) among populations (Slatkin, 1987; Wright, 1951). Population subdivisions, expansion dynamics and migration patterns can be analyzed through the use of different molecular techniques (Risch et al., 2002). In addition to molecular evolution and genetic approaches to the origins and distribution of the human species across the globe, several other fields have been and remain actively engaged in elucidating human history and evolution.

The human story in the form of recorded text goes back only as far as 4,000 years. Historical linguistics and the languages spoken today hold evidence of their origins reaching back more than 10,000 years (Jobling et al., 2004). Archaeological evidence provides the ability to study human history, sometimes at great time depth, through the analysis of such physical remains as bones, teeth, stone tools, pottery, waste deposits, coins, inscriptions and dwellings left by members of past populations. Paleontology, however, provides a very deep ancestral record of human beings, while molecular anthropology is the most recent approach to reconstructing human history (Jobling et al., 2004; Cavalli-Sforza et al., 1994).

Genetic variation at the individual level not only yields insight into the past, but can also be used to shape the future with respect to possible ramifications in the field of medicine, prevention methods, disease susceptibility and response to drug treatment. Several studies have demonstrated individual differences in terms of disease risk and response to medicines (Bamshad et al., 2004; Jorde et al., 2001). Consequently, understanding the variation among members of different races at the genetic level is essential for the effective planning of prevention and treatment strategies.

At the beginning of the 20th century, genetic differentiation within and across the various major geographic groups of humanity was explored through ABO blood group patterning (Landsteiner, 1901). However, the importance of such genetic variation only became apparent when individual differences in proteins were systematically studied in the 1950s and 60s (Cavalli-Sforza et al., 1994). Genetic variation has been widely studied with the expansion of evolutionary genetics, the availability of analytical tools and more effective and economical means for DNA amplification (Jobling et al., 2004; Cavalli-Sforza et al., 1994). Recently, variation in uniparental markers found on the Y-chromosome and mtDNA has been studied to investigate the dispersal and origin of modern humans (Torroni et al., 2006; Forster, 2004; Jobling and Tyler-Smith, 2003). However, these studies usually focused on particular genes investigated for their influence on a specific phenotypic property or disease risk; therefore, the variation investigated would have been subject to selection pressures. The completion of the Human Genome Project and the advent of sequencing technologies, such as Sanger sequencing and Next Generation Sequencing (NGS), have given molecular geneticists access to large amounts of information within the genome as a database for investigating human evolution and diversification (Garrigan and Hammer, 2006; Margulies et al., 2005; Przeworski et al., 2000; Sanger et al., 1977). The information contained in mtDNA, Y-STRs and dental morphology/dental anthropology also provides very important tools for phylogenetic studies as well as for the investigation of human origins (Larmuseau et al., 2015; Nesheva, 2014; Bailey, 2002).

1.8. Dental Morphology/Dental Anthropology

Dental anthropology is the study of humans present and past from the evidence provided by teeth (Hillson, 1996). Teeth provide valuable evidence about prehistoric, historic and modern populations, not just in terms of the morphological features of the crown and root; teeth also have the potential to preserve high-quality DNA for molecular anthropological analyses (Damgaard et al., 2015; Higgins and Austin, 2013; Brook and Scheers, 2006).

Dental morphology is a field of study that arose in the 19th century and is used to register, analyze, interpret and understand all aspects of dental crown and root morphology that yield insight into human groups, their cultural activities, biological conditions and quality of life (Irish and Scott, 2016; Moreno et al., 2004; Carabelli, 1842). The traits present on human teeth are used for population-based studies, can serve as identification markers, and provide the basis for comparisons of genetic origin, thereby allowing the classification of human groups in taxonomic, phylogenetic and evolutionary categories by means of their frequency, expression of sexual dimorphism, bilateral symmetry and morphological characteristics (Rodriguez, 1999; Rodriguez, 2003). The fact that the morphological traits present on teeth are often preserved in good condition among post-industrial modern humans is due to the presence of enamel, which makes teeth resistant to unfavorable conditions for a long time (Moreno and Moreno, 2005).

These biological traits are expressed in humans and are transferred to subsequent generations in a manner much like other genetically controlled traits, such as blood groups, fingerprint patterns, skin colour and height. They are of varying utility in the reconstruction of phylogenetic relationships among various species, evolutionary changes in dentition and the impact of diet upon the dentition, and for estimating the degree of biological distance observed among various communities (Scott and Turner, 1997; Walimbe and Kulkarni, 1993).

Teeth have long been used by anthropologists for the reconstruction of life through the examination of pathological afflictions suffered by members of ancient populations, which shed light on the general health conditions, diet and even the social status of individuals (Hemphill, 2012; Eshed et al., 2006; Cucina and Tiesler, 2003; Hillson, 1979). Similarly, the status of dental eruption can be used to determine age at death for infants and juveniles, while both micro- and macroscopic tooth wear, when calibrated for local conditions, can be used to estimate adult age at death and to obtain information regarding the foods consumed (Teaford and Lytle, 1996; Smith, 1991). Teeth may also be used by the forensic anthropologist for the identification of individuals and in the study of human evolution and, most recently, certain dental traits have been used for the estimation of human ancestry (Edgar, 2013; Pretty and Sweet, 2001).

1.8.1. The Birth of Dental Anthropology

The history and origin of dental anthropology go back to the late 19th century, when researchers first focused on the teeth of mammals and reptiles and compared them to the human dentition (Osborn, 1888). The early researchers used teeth to categorize fossils, record pathological status, describe natural variations in human teeth and compare their presence and frequency in various populations distributed throughout the world (DeSantis, 2016; Drennan, 1929; Hellman, 1928; Gregory, 1926; Bolk, 1922; Gregory, 1922; Sullivan, 1920; Osborn, 1907; Owen, 1845). Georg von Carabelli (1842) was the first researcher to report and describe the presence of a small accessory cusp on the mesiolingual surface of the protocone of the maxillary molars of Europeans (Scott and Turner, 1997). This was given the name Carabelli’s trait and is recorded in most dental anthropological evaluations (Marado and Campanacho, 2013; Hsu et al., 1999; Reid et al., 1991; Hassanali, 1982; Townsend and Brown, 1981; Scott, 1980). Variations in enamel and root anatomy were also observed among various races (Hellman, 1928; Tomes, 1889; Flower, 1885; Owen, 1845). Ales Hrdlicka (1920) identified the shovel-shaped incisor, which plays a pivotal role in the classification system and is considered by researchers to be a basic dental morphological trait in the field of dental anthropology (Scott and Turner, 1997; Hrdlicka, 1920). Hrdlicka also observed similarities and variations in the level of shovelling expression in American Indian and Asian populations and its clear departure from that observed in African and European dentitions (Hrdlicka, 1924; Hrdlicka, 1920). The identification of stable morphological traits in canines, incisors, premolars and molars improved the analytical ability of dental morphology-based investigation of human biological differences (Dahlberg, 1945).

Recently, studies of dental variation in both hominins and modern humans have improved significantly. Dental anthropological studies have illuminated Plio- and Pleistocene hominin dental morphology (Gomez-Robles et al., 2008; Gomez-Robles et al., 2007; Bailey, 2004; Wood et al., 1988; Wood and Engleman, 1988; Wood and Uytterschaut, 1987; Wood et al., 1983; Wood and Abbott, 1983), provided new information in the study of Neanderthals (Bailey et al., 2011; Bailey, 2002), supported microwear-based investigations of dietary variability among hominins (Lucas et al., 2008; Scott et al., 2005; Teaford and Ungar, 2000), identified behavioral patterns and wear-related remodeling (Margvelashvili et al., 2013), and clarified the phylogenetic relationships of the newly discovered hominin Australopithecus sediba and other hominin species (Irish et al., 2013).

Research based on variations in dental development between modern humans and ancestral hominins has revealed new insights into dental relationships between these taxa, as well as new techniques to visualize internal and external dental structure from two-dimensional surfaces using low-magnification microscopy (DeSantis, 2016; Smith and Tafforeau, 2008).

Single and multiple dental morphological traits are commonly used to investigate different groups of human populations for phylogenetic relationships (Mihailidis et al., 2013; Matsumura et al., 2009; Townsend et al., 1990; Kieser, 1984; Mayhall et al., 1982; Scott and Dahlberg, 1982; Kaul and Prakash, 1981; Kieser and Preston, 1981; Townsend and Brown, 1981; Scott, 1980; Suzuki and Sakai, 1973). About 100 dental morphological trait combinations have been reported to date, and new traits continue to be added (Cunha et al., 2012). In 1990, a standardized methodology was introduced for dental morphological scoring and observation, following the techniques introduced by Hrdlicka (1920) and Dahlberg (1945) (Scott and Turner, 2008). A series of rank-scaled reference plaques for 36 dental non-metric traits was developed, called the Arizona State University Dental Anthropology System, or ASUDAS. These plaques were accompanied by a set of rules and guidelines for observers (Turner et al., 1991), which need to be followed carefully to minimize inter- and intraobserver error and ultimately maximize comparability.

1.8.2. Dental anthropology investigations in South Asia

To determine the biological affinities between prehistoric and living South Asian populations, it is important to understand the dispersal routes of early humans in South Asia, also called the Indo-Pak subcontinent. Dental morphological features are well suited to this purpose because, once they are expressed within a given tooth, they remain unaltered unless affected by pathological or physical damage.

Because dental features are moderately to highly heritable, these traits provide a reliable picture of the genetic relationships between past populations and may be used to test hypotheses about past human migration patterns within and across continents. Dental anthropology is a recently emerging field for exploring variation within and among the various populations of the Indo-Pak subcontinent (Hemphill, 2013; Hemphill, 2012; Hemphill, 2009a; Blaylock, 2008; Sharma, 1983; Kaul and Prakash, 1981; Sharma and Kaul, 1977). Variation in the frequency of non-metric dental traits of the permanent teeth has been used to determine biological distances among South Asian prehistoric skeletal series. Relevant studies have focused on early agriculturalist Chalcolithic groups of the Deccan Plateau (Lukacs, 1987), Chalcolithic and Neolithic samples from Mehrgarh, a site located in Baluchistan Province of Pakistan (Lukacs and Hemphill, 1991; Lukacs, 1986), and Iron Age series from Sarai Khola and Timargarha (Lukacs, 1983) and Parwak (Ali et al., 2005), located in northern Pakistan. The first descriptions of the dental morphology of early Holocene hunters focused on the site of Sarai Nahar Rai in the mid-Ganga Plain of North India, but the small sample size prevented assessment of biological relationships (Kennedy et al., 1986). Non-metric dental trait frequencies and inter-group bio-distances were reported from a nearby site known as Mahadaha (Lukacs and Hemphill, 1992). Researchers at the Anthropological Survey of India, the University of Chandigarh, and the University of Sri Venkateswara, Tirupati have also reported dental morphological trait frequencies from South Asian skeletal samples and living ethnic groups. The researchers from Chandigarh University worked on Jats (Kaul and Prakash, 1981), Tibetans (Sharma, 1983), Punjabis (Sharma and Kaul, 1977) and samples from Andhra Pradesh (Rami-Reddy, 1985). Hindu caste Vaghelia Rajputs and Garasias, as well as tribal Bhils, were reported from Gujarat, while caste Marathas and Mahars, along with tribal Madia Gonds and urban mixed-caste samples from the city of Pune, were reported from the State of Maharashtra in west-central India (Hemphill et al., 2000; Lukacs and Hemphill, 1992). Recently, 2,455 living individuals were also reported from samples of seven populations living in the northern areas of Pakistan, including the residents of Madak Lasht and Swatis (Hemphill et al., 2010; Hemphill, 2009b). Additional dental morphology studies have been conducted among the Khows of Chitral District (Hemphill et al., 2008) and the Awans of Mansehra District (Hemphill, 2012).

1.8.3. Non-metric dental morphological traits

Non-metric dental traits are morphological variants of the root and crown that vary among populations, and because of this variation researchers can use them to draw inferences about human ancestry (Maula, 1993).

Non-metric traits are usually scored in two ways: (i) traits such as groove patterns, accessory ridges, supernumerary cusps and roots are recorded as “presence-absence,” or (ii) as differences in form, such as curvature and angles (Scott and Turner, 1997; Hillson, 1996). When present, many of these traits vary in the degree to which a particular morphological structure is expressed (e.g. cusp or ridge size) (Scott and Turner, 1997).
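As an illustration of the first scoring approach, the sketch below converts rank-scale grades for a single trait into presence-absence calls and a trait frequency. It is a minimal sketch and not the procedure used in this study: the function name, the example grades and the breakpoint of grade 2 are all hypothetical, and in practice a breakpoint would be chosen separately for each trait.

```python
from typing import Optional, Sequence


def trait_frequency(grades: Sequence[Optional[int]], breakpoint: int) -> float:
    """Proportion of scorable teeth whose grade is at or above the breakpoint.

    A grade of None marks a tooth that is missing or too worn to score;
    such observations are excluded from the denominator.
    """
    scorable = [g for g in grades if g is not None]
    if not scorable:
        raise ValueError("no scorable observations for this trait")
    present = sum(1 for g in scorable if g >= breakpoint)
    return present / len(scorable)


# Hypothetical grades for one trait in ten individuals (illustration only).
example_grades = [0, 1, 3, None, 2, 0, 4, None, 1, 2]
freq = trait_frequency(example_grades, breakpoint=2)
print(f"trait frequency: {freq:.2f}")
```

With these hypothetical grades, eight teeth are scorable and four reach the breakpoint, giving a frequency of 0.50.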

1.8.4. Basic terminology used in dental morphology

Dental anthropologists use a set of basic terms when describing specific regions or expressions of the dentition. These terms help researchers orient themselves within the dentition and make it easy to describe morphological traits on a specific tooth. The terms are mesial: toward the anatomical midline, or the sagittal plane that runs between the two central incisors; distal: away from the midline; buccal: towards the cheek; labial: toward the lip; lingual: towards the tongue; and occlusal: the chewing surface of a tooth (Scott, 1997).

Figure 9. Diagram represents the positional terms for the teeth and jaws

1.8.5. Analysis of dental morphology traits

Dental morphological traits are analyzed with the help of an internationally recognized system called the Arizona State University Dental Anthropology System (ASUDAS), which features 36 rank-scale reference plaques that illustrate minimum, maximum and intermediate expressions of specific traits. The ASUDAS procedures also help to standardize the observation and scoring of more than 40 specific crown, root and intraoral osseous morphological traits of the human permanent dentition (Turner et al., 1991). The most frequently occurring dental morphological traits are described below. Winging is present in the central incisors of the maxilla and can be identified when the lateral margins of the antimeres are rotated labially (Enoki and Dahlberg, 1958) (Fig. 10a). The peg-shaped (reduced and cone-shaped) character is found in the upper lateral incisors and is very rare compared to the other tooth traits (Scott and Turner, 1997) (Fig. 10b).

Labial convexity of the upper incisors, mostly found in upper incisor 1 (UI1), is defined as the roundness of the labial surface of UI1 (Nichol et al., 1984; Scott and Turner, 1997). Shoveling (Figs. 10c & d) is found in canines and in upper and lower incisors with well-differentiated distal and mesial lingual ridges (Hrdlicka, 1920; Dahlberg, 1956; Scott and Turner, 1997). Double-shoveling occurs in canines, first premolars and upper and lower incisors, while UI1 is said to be the key tooth for this trait (Dahlberg, 1956) (Fig. 10e). The interruption groove (IG), also called the corono-radicular groove, appears in upper incisors and is relatively common in UI2 (Scott and Turner, 1997) (Fig. 10f). Tuberculum dentale, also known as the median lingual ridge, is present in upper canines and incisors (Nichol and Turner, 1986), as shown in Fig. 10g.

Figure 10. Morphological traits of canines and incisors with respect to ASUDAS. (a) Winging of central incisors (b) Peg-shaped upper lateral incisors (c) Reference plaque for shoveling UI1 (d) Shovel-shaped UI2 (e) Reference plaque for double-shoveling (f) Interruption groove in upper lateral incisors (g) Tuberculum dentale (median lingual ridge) for UI1 (h) Reference plaque for distal accessory ridge of upper canines (DAR UC).

The Bushman canine (canine mesial ridge) occurs in canines, especially in upper canines, and is said to be the combination of the mesial marginal ridge of the canine with a projection of the cingulum on the primary tubercle (Morris, 1975; Scott and Turner, 1997). The canine distal accessory ridge appears in upper and lower canines (Morris, 1975; Scott and Turner, 1997) (Fig. 10h).

The “Uto-Aztecan premolar,” also known as the disto-sagittal ridge (Fig. 11a), is found in the first maxillary premolars (Morris et al., 1978). Premolar mesial and distal accessory cusps (Fig. 11b) occur in the upper premolars (Turner, 1967). The number, conformation and position of the lingual cusps are assessed in the lower premolars (Fig. 11c) (Scott and Turner, 1997; Kraus and Furr, 1953; Pedersen, 1949).

Figure 11. Morphological traits of premolars (a) Reference plaque for Uto-Aztecan premolars (b) Premolar accessory cusps (c) Premolar lingual cusp.

The metacone, hypocone, metaconule, Carabelli and parastyle traits are the most common traits studied in the upper molars (Dahlberg, 1951; Turner, 1979; Scott and Turner, 1997; Harris, 1977; Bolk, 1916).

The metacone, or cusp 3, is a primary cusp of the upper molars (i.e. M1, M2 and M3) found in the distobuccal quadrant of the tooth (Fig. 12a). The metacone is almost uniformly fully developed on M1, shows some reduction on M2, and is often reduced or even absent on M3 (Hillson, 1996). The distolingual cusp found on the upper molars is the hypocone, or cusp 4 (Fig. 12b). This trait is most common on M1, while on M2 and M3 it may be reduced or sometimes absent (Hillson, 1996). The trait found in the distal fovea between the metacone and hypocone, on the distal marginal ridge of the upper molars, is known as the metaconule, or cusp 5 (Fig. 12c). Carabelli’s trait is an extra tubercular structure found at the base of the mesiolingual surface of cusp 1 (protocone) in upper M1, M2 and M3 (Fig. 12d). The parastyle (extra cusp) is a trait found on the buccal surface of the upper molars (Fig. 12e). It may be very small or may be expressed as a well-defined extra cusp, found mainly on M3 and rarely on M1 (Hillson, 1996). The parastyle is regarded by some dental anthropologists as one of the most important features in the field of dental anthropology (Scott and Turner, 1997).

Figure 12. Morphological traits with respect to ASUDAS reference plaques (a) reference plaque representing the metacone (cusp 3) in upper molars (b) reference plaque for scoring hypocone (cusp 4) in upper molars (c) reference plaque for metaconule (cusp 5) in upper molars (d) reference plaque for Carabelli’s trait (e) reference plaque for scoring parastyle in upper molars with a well-pronounced example to the right side of the scale.

The anterior fovea (precuspidal fossa) is a dental morphological trait found on the anterior occlusal surface of all three mandibular molars, but it is only scored on lower M1 (Turner et al., 1991; Hrdlicka, 1924). Its identification and scoring require well-experienced researchers (Scott and Turner, 1997). In some cases it forms a deep triangular fossa distal to the mesial marginal ridge (Fig. 13.1a). The deflecting wrinkle is expressed on the median occlusal ridge of the metaconid, which runs from the tip of the cusp toward the central fossa in lower molars M1, M2 and M3 (Scott et al., 1997) (Fig. 13.1b). The protostylid is an extra cusp or outgrowth present on the buccal surface of cusp 1 (protoconid) of the lower molars (Turner et al., 1991) (Fig. 13.1c).

Figure 13.1. Morphological traits (a) reference plaque for the anterior fovea in lower molars with an example to its right (b) reference plaque for the deflecting wrinkle with an example to its right side (c) reference plaque for the protostylid with an example to its right side.

The groove pattern is identified as the configuration of contacts among different cusps (Turner et al., 1991), which may take the form of the letter X or Y or the plus (+) mark (Fig. 13.2a). The Y-pattern is recognized as the connection between cusps 2 and 3, the X-pattern as the connection between cusps 1 and 4, and the + pattern as the connection of all four major cusps (Turner et al., 1991). The major cusp numbers are commonly reported in the lower molars (Gregory, 1916; Scott and Turner, 1997). Lower M1 most commonly has five major cusps, reported as mesio-buccal (protoconid), mesio-lingual (metaconid), disto-buccal (hypoconid), disto-lingual (entoconid) and distal (hypoconulid), while four or three cusps may also be found in lower molars. The hypoconulid, or cusp 5, is the distal cusp found on the occlusal surface of the lower molars (Fig. 13.2b). Its size can be scored in the absence of the entoconulid (Turner et al., 1991).

Figure 13.2. Dental morphological traits and reference plaques (a) the Y- and X-pattern (b) reference plaque for scoring the hypoconulid (cusp 5) (c) reference plaque for scoring the entoconulid (cusp 6) (d) reference plaque for scoring the metaconulid (cusp 7).
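The groove-pattern definitions given above can be expressed as a simple lookup. The sketch below is illustrative only: the function name and the representation of an observation as the set of major cusps that are in contact are assumptions made for this example and are not part of ASUDAS.

```python
from typing import Iterable

# Cusp contacts as described in the text: Y = cusps 2 and 3,
# X = cusps 1 and 4, + = all four major cusps.
GROOVE_PATTERNS = {
    frozenset({2, 3}): "Y",
    frozenset({1, 4}): "X",
    frozenset({1, 2, 3, 4}): "+",
}


def classify_groove_pattern(cusps_in_contact: Iterable[int]) -> str:
    """Return 'Y', 'X' or '+' for a recognised contact, else 'unscorable'."""
    return GROOVE_PATTERNS.get(frozenset(cusps_in_contact), "unscorable")


for observed in ({2, 3}, {1, 4}, {1, 2, 3, 4}, {1, 2}):
    print(sorted(observed), "->", classify_groove_pattern(observed))
```

Any contact that does not match one of the three definitions is returned as unscorable here rather than forced into a category; this is a design choice of the example, not a rule taken from the scoring system.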

Cusp 6, also called the tuberculum sextum or entoconulid, is found lingual to cusp 5 in the distal fovea of the lower molars (Fig. 13.2c); it is located between cusps 4 and 5 in lower M1, M2 and M3. Cusps 5 and 6 are graded in a similar way, in terms of size (Turner et al., 1991). The metaconulid, also known as the tuberculum intermedium or simply cusp 7, is situated between cusps 2 and 4 in the lingual groove of the lower molars (Fig. 13.2d). The key tooth for scoring cusp 7 is M1 of the lower jaw (Turner et al., 1991).

1.9. Mitochondrial DNA (mtDNA)

A typical somatic cell performs many complex metabolic processes that are specific to that cell, for example the synthesis of a specific protein required for a specific function, and cellular energy in the form of adenosine triphosphate (ATP) is required for these life activities (Guimaraes-Ferreira, 2014; Davey et al., 2002). The cell contains many organelles required for essential cellular functions. Among these organelles the nucleus and mitochondria are the most important. The DNA within the nucleus is called nuclear DNA, or the nuclear genome, while the DNA within the mitochondrion is called mitochondrial DNA (mtDNA). The mitochondrion synthesises some of its own proteins and, because it generates most of the cell’s ATP, is known as the powerhouse of the cell (Butler, 2005; Jobling et al., 2004; Holland and Parsons, 1999). The widely accepted endosymbiotic theory holds that the mitochondrion originated from a mutual symbiotic relationship between the cell and a bacterium, which eventually led to the integration of the bacterium to form the mitochondrion (van der Giezen, 2011; Jobling et al., 2004; Anderson et al., 1981).

Human mtDNA (Fig. 14) is a double-stranded circular molecule approximately 16,569 base pairs (bp) in length. It contains 37 genes, of which 13 code for proteins, 22 for transfer RNAs (tRNAs) and 2 for ribosomal RNAs (rRNAs) (Ebner et al., 2011), together with a non-coding region, the displacement loop (D-loop), also called the control region, which carries the regulatory sequences for the mtDNA origin of replication and the promoters for transcription (Chang et al., 2010; Taanman, 1999; Anderson et al., 1981). The protein-coding genes encode subunits of cytochrome b, cytochrome c oxidase, ATPase and NADH dehydrogenase (Liu et al., 2011; Mckenzie et al., 2010; Ketmaier and Bernardini, 2005).

Figure 14. Diagrammatic view of human mtDNA.

The major portion of the non-coding region of mtDNA is the D-loop of 1122 bp, which contains the hypervariable segments (HVS): HVS-I (nucleotide position [np] 16024-16383), HVS-II (np 57-372) and HVS-III (np 438-574) (Butler, 2012; Butler, 2005). These regions have mutation rates approximately ten times those observed in the coding sequence (1.64273 × 10⁻⁷ for HVS-I and 2.29640 × 10⁻⁷ for HVS-II) (Soares et al., 2009).
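As a quick check on the coordinates quoted above, the length of each hypervariable segment can be computed from its inclusive nucleotide positions. The following is a minimal Python sketch; the names `HV_SEGMENTS` and `segment_length` are illustrative and are not part of the original study.

```python
# Inclusive nucleotide positions (np) of the hypervariable segments,
# as quoted in the text (Butler, 2012; Butler, 2005).
HV_SEGMENTS = {
    "HVS-I": (16024, 16383),
    "HVS-II": (57, 372),
    "HVS-III": (438, 574),
}


def segment_length(start, end):
    """Length in bp of a segment whose start and end are both inclusive."""
    return end - start + 1


lengths = {name: segment_length(*span) for name, span in HV_SEGMENTS.items()}
total_hv = sum(lengths.values())

for name, bp in lengths.items():
    print(f"{name}: {bp} bp")
print(f"Total hypervariable sequence: {total_hv} bp")
```

With these coordinates the three segments together cover 813 bp of the 1122 bp D-loop.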

Mitochondria are present in several hundred copies per cell and their inheritance is unilineal via the maternal line (Butler, 2005; Jobling et al., 2004; Lightowlers et al., 1997; Robin and Wong, 1988). Paternal mtDNA has also been detected in human embryos up to the blastocyst stage (St John et al., 2000), but this phenomenon is very rare and its contribution is considered negligible (Kraytsberg et al., 2004).

1.9.1. MtDNA in human lineages

The Cambridge Reference Sequence (CRS), also called the original sequence of mtDNA, was obtained from the placenta of a European individual and therefore reflects the characteristics of a European mtDNA lineage (Achilli et al., 2004; Anderson et al., 1981).

MtDNA, along with autosomal DNA and Y-chromosomal DNA, has long been used in evolutionary biology, historical studies and population genetics (Kivisild, 2015; Cann et al., 1987; Wallace et al., 1985). Its high copy number per cell (Piko and Matsumoto, 1976; Michaels et al., 1982), lack of recombination, maternal inheritance (Kivisild, 2015; Hutchison, 1974) and high mutation rate (Brown et al., 1979) have made mtDNA a unique tool for human evolutionary studies and population genetics.

Phylogenetic study of mtDNA has played a central role in identifying the most recent common maternal ancestor of living humans, known as “mitochondrial Eve,” who lived in Africa around 124,000-157,000 years ago (Fu et al., 2013; Poznik et al., 2013). Her descendants subsequently dispersed to the rest of the Old World and eventually into the New World as well (Stewart, 2015; Behar et al., 2008; Torroni et al., 2006). The mtDNA migration pattern is illustrated in Figure 15.

Figure 15. Human migration and haplogroup distribution across the world (Stewart, 2015)

1.9.2. MtDNA Variation

Human mtDNA differs broadly across the globe, with populations of similar descent or geographical origin sharing many of the same characteristics. In some cases, these characteristics may indicate various historical events of the population including admixtures with other populations or migrations (Whale, 2012).


An mtDNA haplotype is the combination of polymorphisms that differ from the CRS; these are transmitted together from mother to offspring and are not affected by recombination. Thus, similar mitochondrial haplotypes share a set of common mutations and can be traced to a common maternal ancestor. Individuals from the same or similar populations may share the same mtDNA sequences (haplotypes), and related haplotypes can be clustered together to form haplogroups (Wallace et al., 1999). A haplogroup is defined by a set of slowly mutating markers shared by people of the same geographic region (Jobling et al., 2004). Haplogroups are mostly continent-specific and therefore shed light on modern human history and migration paths. MtDNA haplogroups are designated by letters of the Latin alphabet, and all of these letters except “O” have been utilized (van Oven and Kayser, 2009). Among these (Fig. 16), L1, L2, L3, L4, L5 and L6 represent African-specific mtDNA haplogroups and belong to the “L” clade (Behar et al., 2008).
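The idea of a haplotype as the set of differences from a reference sequence can be illustrated with a short Python sketch. It assumes that the two sequences are already aligned, of equal length and free of gaps. The function name `call_haplotype` and the 10-bp toy fragments are invented for illustration and are not real mtDNA data.

```python
def call_haplotype(reference, sample, start_np):
    """List substitutions in `sample` relative to `reference`.

    Both sequences are assumed to be pre-aligned, equal-length strings
    without gaps; `start_np` is the nucleotide position of the first base.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    variants = []
    for offset, (ref_base, obs_base) in enumerate(zip(reference, sample)):
        if ref_base != obs_base:
            variants.append(f"{ref_base}{start_np + offset}{obs_base}")
    return tuple(variants)


# Toy fragments, invented for illustration only.
toy_reference = "ACGTACGTAC"
toy_samples = {
    "sample_1": "ACGTACGTAC",
    "sample_2": "ACATACGTAC",
    "sample_3": "ACATACGTAC",
}

haplotypes = {
    name: call_haplotype(toy_reference, seq, start_np=16024)
    for name, seq in toy_samples.items()
}

# Individuals with an identical set of variants share a haplotype.
shared = {}
for name, variants in haplotypes.items():
    shared.setdefault(variants, []).append(name)
```

In this toy example `sample_2` and `sample_3` carry the same single substitution and therefore share one haplotype, while `sample_1` matches the reference.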

Figure 16. mtDNA PhyloTree and partitioning scheme representing subtrees (van Oven et al., 2015).


All of the non-African lineages are said to have originated from L3 about 60,000 to 70,000 years ago (Soares et al., 2012; Behar et al., 2008). The African haplogroups L1, L2 and L3 are also found in the Makrani population of Baluchistan province of Pakistan, among whom their frequencies range from 28% to 39.4% (Siddiqi et al., 2015; Quintana-Murci et al., 2004). Haplogroups M and N separated from haplogroup L3 about 77,000 years ago (Forster and Matsumura, 2005).

The M clade (including haplogroups C, D, E, G, Q and Z) is distributed in Asia, Indonesia, Australia and the Americas. Indeed, more than 70% of mtDNA lineages identified among the inhabitants of India belong to haplogroup M (Chandrasekar et al., 2009; Metspalu et al., 2004).

Haplogroup M lineages are also found among South Indian tribal and caste populations, and account for all but three lineages among the Chenchus (Kivisild et al., 2003).

Haplogroup M is also common in populations living in the southern region of the Makran coast of Pakistan and in northwest India, with frequencies of 30-35% (Quintana-Murci et al., 2004). On the other hand, haplogroup M is rare or absent among populations residing west of the Indus Valley, while it is found at frequencies of less than 12% among the populations of Central Asia (e.g., Uzbeks, Turkmen and Shugnan) (Quintana-Murci et al., 2004).

Haplogroup N is widely distributed in European and Oceanic populations in addition to Indians, Native Americans and Asians. The parahaplogroup N* of haplogroup N includes haplogroups A, I, S, W, X and Y. Clade R, which is very common in Europeans, is also included within clade N and is further divided into parahaplogroups R* and R0. R* is divided into haplogroups B, F, J, P and T, while R0 is further divided into HV, H, V and U, the latter of which includes haplogroup K (Fig. 16). Haplogroups U and M are found at similar frequencies in Asia, especially in India and Pakistan (Quintana-Murci et al., 2004). Haplogroup H is the most common and most recent haplogroup of Europe, with a frequency of 40-45% there, 20% in the Caucasus and 10% in Arabian Gulf populations (Heinz, 2015).

Haplogroup H has been reported in different populations from Pakistan, with frequencies of 28% among Sindhis, 26.3% among Brahuis, 20.5% among Baluchis, 13% among Hazaras and 12.3% among Burushos (Bhatti et al., 2016a; Szecsenyi-Nagy et al., 2014; Brandt et al., 2013; Mikkelsen et al., 2008; Quintana-Murci et al., 2004; Richards et al., 2000). Recently, the South Asian haplogroups M (28%) and R (8%) and the West Asian haplogroups U (17%), HV (15%), H (9%), K (8%), J (8%), W (4%), T (3%) and N (3%) were also found among members of the various Pashtun ethnic groups sampled in northern Pakistan (Bhatti et al., 2016b).

MtDNA variation can be studied by direct sequencing of the control region or via restriction fragment length polymorphisms (RFLPs). Haplogroups are distinguished by combinations of HVS-I and HVS-II polymorphisms and RFLPs (Schurr, 2004a).

MtDNA is a very useful tool for characterizing variation and for estimating the time elapsed since divergence on a branched tree using either coalescence or distance methods (Schurr, 2004b). It also plays a pivotal role in answering questions about local populations in addition to those related to human origins and evolution in general (Torres, 2016). Therefore, in the present study, mtDNA was selected to obtain more information about the local population samples from Swat and Dir districts of Pakistan.

1.10. The Y-chromosome

The human Y-chromosome (Fig. 17) is about 60 megabases (Mb) in length and is inherited uniparentally through the paternal line (Li et al., 2008; Jobling et al., 2004). The Y-chromosome carries the sex-determining region Y (SRY) gene, which initiates male sex determination by activating the SOX9 gene, which in turn triggers differentiation of the male sex glands (Jiang et al., 2013; Foster and Graves, 1994; Berta et al., 1990; Sinclair et al., 1990).

It also carries the male-specific region Y (MSY) and the pseudoautosomal regions (PARs) (Fig. 17). The MSY encompasses about 95% of the entire Y-chromosome and is composed of the euchromatic and some of the repeat-rich heterochromatic parts (Li et al., 2008; Skaletsky et al., 2003). The total size of the euchromatin on the Y-chromosome is approximately 23 Mb, of which 8 Mb is found on the Yp arm and 14.5 Mb on the Yq arm (Skaletsky et al., 2003).

Figure 17. Modified structure of human Y chromosome (Skaletsky et al., 2003; Olofsson, 2015).


The Y-chromosome also has three classes of euchromatic sequences known as X-transposed, X-degenerate and ampliconic (Skaletsky et al., 2003).

The ampliconic sequences are about 10.2 Mb in size, highly repetitive and polymorphic, and composed of 1 to 8 palindromes (Li et al., 2008; Skaletsky et al., 2003).

The euchromatin contains Long Interspersed Nuclear Elements 1 (LINE 1), which account for about 36% of the X-transposed sequences (Skaletsky et al., 2003).

About 400 kb of heterochromatin is present within the euchromatin, while the major part of the heterochromatin, about 35 Mb, lies on the distal long arm of the Y-chromosome (Hughes et al., 2012; Skaletsky et al., 2003). It has also been reported that tandem repeats present in the heterochromatic region have no transcription factors (Alechine et al., 2016; Skaletsky et al., 2003). Length polymorphism in the heterochromatin is responsible for human Y-chromosomal variation (Repping et al., 2006).

A total of 78 transcriptional units have been recognized in the modern MSY, and among these units 17 are present in single copies (Navarro-Costa, 2012; Navarro-Costa et al., 2010; Skaletsky et al., 2003). The majority of these genes are responsible for sex determination and sperm formation in the testis (Olofsson et al., 2015; Li et al., 2008).

The X-transposed regions carry TGIF2LY and PCDH11Y (also called Protocadherin 11 Y). TGIF2LY is testis specific, while PCDH11Y assists in brain development in the fetus (Navarro-Costa, 2012; Skaletsky et al., 2003). The X-degenerate region contains 27 single-copy genes or pseudogenes, which code for approximately 15 proteins. Among these 27, 13 show similarities with the exons and introns of the functional X homologue, while the remaining 14 are transcribed as functional genes and share similar, but non-identical, features with their X-linked counterparts. Furthermore, all twelve ubiquitously expressed MSY proteins are found within the X-degenerate region and are involved in sex determination and spermatogenesis (Skaletsky et al., 2003; Lahn and Page, 1997).

The ampliconic region has nine protein-coding gene families, each present in 2 to 35 copies, that are expressed only in the testes (Navarro-Costa, 2012; Navarro-Costa et al., 2010; Skaletsky et al., 2003). Approximately 75 non-coding genes are also identified in the ampliconic region, of which 65 are specific to MSY families and 10 are found as single copies (Skaletsky et al., 2003).

1.10.1. Phylogenetic Tree based on human Y-chromosome

The information found on the haploid Y-chromosome is widely used as a molecular marker in anthropology, genotyping, demography, genealogy, forensics, medicine and evolutionary studies (Van Oven et al., 2014). Single nucleotide polymorphisms (SNPs) and short tandem repeats (STRs) are two widely used markers present in the non-recombining region of the Y-chromosome (NRY) (Wang et al., 2015). Y-SNPs are slowly evolving markers with mutation rates of about 3 × 10⁻⁸ per nucleotide per generation (Xue et al., 2009). These markers are widely used to study the paternal relationship between individuals and among members of different populations (Van Oven et al., 2014; Underhill et al., 2000). The biallelic nature of Y-SNPs makes them an important tool for constructing phylogenetic trees that link all the human reference populations (Karafet et al., 2008; Consortium, 2002).

The first phylogenetic tree based on Y-SNPs was published in 2002 and was updated in 2003 (Consortium, 2002; Jobling and Tyler-Smith, 2003). A further revised tree was published in 2008 (Karafet et al., 2008). Since then, the Y-chromosome tree has been continuously updated and the most recent tree is publicly available at http://www.isogg.org/tree/. The main structure of the most recent and updated tree is shown in Figure 18.

Figure 18. Structure of the most recent and updated version of the human Y-chromosome phylogenetic tree (Van Oven et al., 2014).

The Y-chromosome haplogroups (Y-HGs) are named “A” to “T”, where Y-HG “A” represents the deepest root of the Y-chromosomal tree (Karafet et al., 2008).

Today the short revised nomenclature, in which the haplogroup or sub-haplogroup is followed by the defining marker (e.g., R-U106 or R1b-U106), is used rather than the previous, longer system (e.g., R1b1a2a1a1) (Olofsson, 2015).

The second class of mutations found in the non-recombining region of the Y-chromosome (NRY) consists of microsatellites, or short tandem repeats (STRs), which have repeat units of 2–6 base pairs (bp) (Willems et al., 2016; Roewer et al., 2001; Goldstein et al., 1996) and a mutation rate of 3.83 × 10⁻⁴ mutations per generation (mpg) (Willems et al., 2016). Y-STRs have a wide range of applications in forensics (criminal, rape and paternity cases), in the study of human history and migration patterns, and in the construction of phylogenetic trees that link human populations with each other to determine their genetic relatedness and possible origins (Kareem et al., 2015; Butler, 2011; Underhill and Kivisild, 2007).

Thus Y-STRs are ideal molecular markers: they are transferred from father to son without recombination, they have a high level of diversity, they are simple to genotype, they are sensitive to genetic drift and they permit the prediction of informative haplotypes (Kareem et al., 2015; Marjanovic and Primorac, 2013; Butler, 2012). A haplotype is the genetic information received from lineage markers such as Y-STRs (Butler, 2012).

The convergence of Y-STR haplotypes among different haplogroups compromises the accuracy of haplogroup prediction. Therefore, for samples with ambiguous Y-STR haplotypes, typing with Y-SNPs is a very promising method for confirming the haplogroup and resolving it more finely (Wang et al., 2013).

1.10.2. Y-chromosomal haplogroup distribution across the globe

Human migration may be inferred from the distribution of Y-chromosomal haplogroups. Variation in this distribution may be due to bottlenecks, founder effects and genetic drift occurring along the migration routes across different regions (Olofsson, 2015).

The Y-chromosome haplogroups (Y-HGs) A and B-M60 are reported to be very frequent in African populations (Gomes et al., 2010; Hammer et al., 2001; Underhill et al., 2001).

Populations residing in the Horn of Africa and in North Africa have very high frequencies of Y-HGs E-M96, E-M35, J-M304 and E-M81 (Trombetta et al., 2015; Bekada et al., 2013; Gomes et al., 2010; Sanchez et al., 2005; Hammer et al., 2001; Underhill et al., 2001). Y-HG E-M2 has also been reported in sub-Saharan African populations, with frequencies as high as 80% in West Africa and 60% in Central Africa (Trombetta et al., 2011).

Y-HG C-M130 is very common among the populations of Oceania and Asia (Stoneking and Delfin, 2010; Karafet et al., 2008; Hammer et al., 2001; Underhill et al., 2001). The occurrence of that sub-haplogroup among Native Americans confirms their Asian origins (Geppert et al., 2011; Karafet et al., 2008; Zegura et al., 2004).

Haplogroup D-M174 is more common in Japan and Central Asia, while O-M175, D-M174 and N-M231 are frequently distributed in East Asians (Zhong et al., 2011; Karafet et al., 2008). Individuals from Oceania and Indonesia are limited to Y-HG M-P256 and S-M230, respectively (Karafet et al., 2008; Hudjashov et al., 2007).

Haplogroups R-M207 and I-M170 are frequently distributed across Europe (Karafet et al., 2008; Rootsi et al., 2004; Rosser et al., 2000; Semino et al., 2000). Haplogroup R-M269 is found in Central and Western Europeans (Busby et al., 2012). Y-HG Q-M242 is very common in northern Eurasia and in some Siberian populations, while its sub-haplogroups are distributed with low frequencies across European, Middle Eastern and East Asian populations (Karafet et al., 2008).

The Eurasian Y-chromosomal lineages are common in the Indo-Pakistani subcontinent (Karafet et al., 2008; Sengupta et al., 2006). Y-HG R1a-M417 occurs widely throughout the Eurasian continent, especially among the populations of South and Central Asia (Underhill et al., 2015; Karafet et al., 2008; Novelletto, 2007; Sengupta et al., 2006; Rosser et al., 2000; Semino et al., 2000). Haplogroups H-M69 and R1a1-M17 are widely distributed in India and Pakistan, while haplogroup R1a1a-M17 is very common among the populations residing in the tribal areas of Khyber Pakhtunkhwa province of Pakistan (Lee et al., 2014; Trivedi et al., 2008).

Exploring the information contained in mtDNA, Y-STRs and tooth morphology is very important for phylogenetic studies, yet no such study has previously been conducted to investigate the relationships among the different ethnic groups of the Hindu Raj region.

Therefore, the current project was designed to characterize five populations (Yousafzai, Gujars, Tarklani, Kohistani, Utmankheil) residing in Swat and Dir districts through dental morphology, mtDNA and Y-STRs, with the following objectives.


Objectives

• To elaborate dental morphological variations among the major ethnic groups in Swat and Dir districts.

• Genealogical study of the ethnic groups in the area using mitochondrial hypervariable segments 1 and 2.

• Genetic characterization of Y-chromosomal STR haplotypes in individuals from Swat and Dir.

• Studying genetic diversity using human dental morphology.

• Statistical and bioinformatics analysis of the data produced.


Chapter 2

MATERIALS AND METHODS

Samples from five ethnically distinct populations, viz. Tarklani, Yousafzai, Kohistani, Gujar and Utmankheil, were collected from volunteers residing in different areas of Swat and Dir districts of Khyber Pakhtunkhwa, Pakistan. The sampling sites are presented in Fig. 19.

Figure 19. Geographic location of the study area. The colored circles represent the locations of the villages where samples were collected.

Members of three of these population samples (Tarklanis, Yousafzai and Utmankheils) are commonly recognized as sub-groups within the Pashtun ethnic group. Ethnicity was self-declared, and first-degree relatives were identified and excluded from the study. All participants gave their informed written consent after the aims and procedures of the study were explained to them. The present research was approved by the Institutional Bioethical Committee of Hazara University, Mansehra, Pakistan (Appendix-I).

2.1. Sample collection for dental morphology study

2.1.1. Collection of dental casts

A total of 823 dental casts were collected, with signed consent forms, from male and female Tarklani, Yousafzai, Kohistani, Gujar and Utmankheil volunteers of Swat and Dir districts (Table 1).

Table 1: Details of samples collected from Swat and Dir districts

S.No | Ethnic group | Sampling site | Male | Female | Total no. of casts
1 | Gujars | Gabral and Miandam, Swat | 85 | 80 | 165
2 | Kohistani | Bahrain and Kalam, Swat | 89 | 85 | 174
3 | Tarklani | Maidan, Dir | 75 | 75 | 150
4 | Utmankheil | Maidan, Dir | 75 | 75 | 150
5 | Yousafzai | Mingora, Swat | 94 | 90 | 184

The consent form was designed according to the guidelines of the Institutional Bioethical Committee of Hazara University (Appendix-II). All information about the volunteers regarding geography and ethnic affiliation was recorded for further analysis. Before sample collection, all participants were instructed to brush their teeth thoroughly, using toothpaste and brushes provided by the research team, so that the cavities of the teeth were free from food (Fig. 20). An optimized procedure was used to reduce the chances of gagging during sample collection.

Figure 20. Completion and signing of the consent form, and cleaning of teeth by volunteers.

2.1.2. Selection of volunteers

Volunteers for the present research were selected on the basis of ethnicity, relatedness and condition of teeth. Individuals between 12 and 22 years of age with teeth in good condition were considered for dental casting. The ethnicity of each volunteer was self-declared or was provided by the individual’s parents. Those who did not fulfill the selection criteria were not included in the study sample.

2.1.3. Biosafety Measures

Sterilized and autoclaved dental trays and alginate (Cavex CA37), which is widely used in dental anthropology research, were selected for dental casting. The alginate used for template preparation was easy to separate from the sample dental casts. A semi-fluid mixture of alginate and water was prepared in a rubber bowl and poured into an impression tray of the appropriate size. The alginate-filled impression tray was placed in the mouth for two minutes and then removed gently. After removal from the mouth, the tray was rinsed with water to remove saliva and so avoid bubbles and erosion of the impression (Fig. 21).

Figure 21. Placement and removal of the alginate-filled impression tray from the subject’s mouth.

2.1.4. Dental casting and labeling

The fine powder of diestone (DentAmerica, CA 91744, U.S.A) was mixed with water for pouring into the alginate impressions. Two mixtures of plaster (thin and thick) were prepared to obtain a good-quality dental cast. The thin mixture was poured into the alginate impression first, to avoid bubbles and allow better visualization of the traits, and the thick mixture was then added to make the cast stronger and more resilient to damage. The trays filled with diestone were kept in a sunny area to dry, and the casts were labeled carefully before being removed from the trays. The labeled casts were then removed from the trays and dried thoroughly. Tissue paper was wrapped around the dried casts to protect them from breakage, and the casts were stored for further analysis (Fig. 22).

Figure 22. Pouring of diestone mixture into the alginate impression mold and labeling of dental casts.

2.1.5. Grading and scoring of dental morphology traits

The Arizona State University Dental Anthropology System (ASUDAS) (Scott and Turner, 1997; Turner et al., 1991) was followed for the scoring and grading of dental morphological traits of the samples with the help of 23 reference plaster plaques (Turner et al., 1991) (Fig. 23). The dental morphology data derived from the five samples were converted into dichotomized (presence or absence) format for further analysis.
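The dichotomization step can be sketched in Python as follows. The trait names and breakpoints in `BREAKPOINTS` are hypothetical placeholders chosen for illustration; the breakpoints actually applied in this study are not stated in this section. Grades recorded as `None` stand for teeth on which the trait could not be scored.

```python
# Hypothetical breakpoints: a trait counts as "present" when its ASUDAS grade
# is at or above the listed value. These are placeholders, not the study's.
BREAKPOINTS = {"shoveling_UI1": 2, "carabelli_UM1": 2, "hypocone_UM2": 3}


def dichotomize(trait, grade):
    """Return 1 (present), 0 (absent) or None (trait not observable)."""
    if grade is None:
        return None
    return 1 if grade >= BREAKPOINTS[trait] else 0


def trait_frequency(trait, grades):
    """Return (number present, number scorable) for one trait in one sample."""
    scored = [dichotomize(trait, g) for g in grades if g is not None]
    return sum(scored), len(scored)


# Toy grades for six casts, invented for illustration.
toy_grades = [0, 1, 2, 3, None, 4]
present, scorable = trait_frequency("shoveling_UI1", toy_grades)
print(f"shoveling_UI1: {present}/{scorable} present")
```

Keeping the count of scorable individuals alongside the count of presences matters because later distance statistics depend on the sample size for each trait, not only on its frequency.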

Figure 23. Scoring of dental morphology traits using the ASUDAS reference plaques


2.2. Analyzing the DNA

The DNA was analyzed for both paternal and maternal lineages using saliva as a source. The methods used in the present study for the collection of saliva and DNA isolation are described below in detail.

2.2.1. Collection of saliva samples

Volunteers from the five selected populations (Tarklanis, Yousafzai, Utmankheil, Gujars and Kohistanis) were properly instructed before collection of saliva samples. To minimize the chance of contamination, each individual was given two to five minutes to clean their mouth thoroughly using a toothbrush provided by the research team (Fig. 20). After cleaning their mouth, the individuals were instructed to wait for two minutes to allow fresh epithelial cells to accumulate; a 5% sucrose solution was then given and the subject was instructed to keep it in their mouth for two minutes. The individuals were then asked to spit the solution into a sterile specimen collection cup, and the saliva was stored in styrofoam coolers in the field until it was delivered to the research laboratory. The samples were processed for DNA extraction immediately upon delivery to the lab.

2.2.2. Genomic DNA extraction

Good-quality genomic DNA (gDNA) was extracted from the human epithelial cells in the saliva using the optimized protocol established in our research lab by Akbar et al. (2015). The materials, chemicals and preparation of the stock solutions used in the present study are detailed in Appendix-III. A total of 2 ml of saliva was placed in a 2 ml centrifuge tube and centrifuged at 3578 ×g for two minutes to obtain a pellet of epithelial cells. Then 100 µl of cell lysis solution from stock (2 ml lysis buffer + 20 µl β-mercaptoethanol and 2 µl proteinase K) was added to the pellet, and the tube was vortexed until the pellet dissolved in the lysis solution. The sample tube was then kept in an incubator for one hour at 56ºC. Next, 600 µl of a 1:1 phenol and chloroform solution was added, and after gentle shaking the tube was incubated for 5 to 10 minutes at room temperature, followed by centrifugation at 5590 ×g for 12 minutes. After centrifugation, 500 µl of supernatant was carefully transferred to a sterile 1.5 ml tube. An equal volume (500 µl) of isopropanol was added to the supernatant and the tube was incubated for 20 minutes at -20ºC. After incubation, the sample was centrifuged at 5590 ×g for 10 minutes and the supernatant was discarded, while the pellet was washed with 70% ethanol by centrifugation at 3578 ×g for five minutes. The ethanol was carefully removed from the tube and the pellet was air dried. Finally, 50 µl of distilled water was added to the dried gDNA pellet and the tube was incubated for five minutes at 56ºC.

2.2.3. Screening of the purified gDNA

The concentration and quality of the purified gDNA were determined with a Qubit fluorometer (Invitrogen, Life Technologies, cat. number Q32857) using the Qubit dsDNA HR assay kit (Invitrogen, cat. number Q32854) and an Agilent 2200 TapeStation instrument using a genomic DNA ScreenTape assay, according to the instructions provided by the manufacturers. Traditional agarose gel electrophoresis (1% agarose gel) was also performed to determine the quality of the purified gDNA.


2.2.4. Agarose gel electrophoresis

A 1% agarose gel was prepared for gDNA quantification by adding one gram of agarose powder to 100 mL of TAE buffer and heating the mixture for one minute in a microwave oven. When the solution had cooled to 40-45ºC, 15 µL of ethidium bromide was added and mixed well by shaking. The solution was poured into a gel casting tray fitted with combs, taking care to avoid bubbles, and left at room temperature until solidified. The combs were removed from the solid gel, which was then placed in the electrophoresis equipment containing the required volume of TAE buffer. A total of 8 µL of sample (5 µL DNA + 3 µL loading dye) was loaded into each well of the agarose gel. A voltage of about 80-100 V was applied to the electrophoresis apparatus until the dye had migrated away from the wells. The presence and position of the DNA bands were visualized and photographed using a gel documentation system.

2.3. Mitochondrial DNA characterization

2.3.1. PCR Amplification of target DNA

The isolated gDNA was used as a template for the PCR amplification of the mtDNA control region. A fragment about 450 bp long at nucleotide position (np) 15974-16424, covering HVS-I, and a fragment about 550 bp long at np 07-557, covering HVS-II, were amplified using Taq DNA polymerase. Primers (Table 2) for the present study were designed from the Cambridge reference genome, accession No. NC_012920 (Andrews et al., 1999). The components of the PCR mixture used for amplification are given in Table 3 below.

Table 2: Details of the primer sequences used in the present study for the amplification of the target fragment of the mtDNA control region.

S.No. | Oligo name | Sequence (5'-3') | %GC | Tm
1 | HVS-1 (F) | CTCCACCATTAGCACCCAAAGCTAAG | 50 | 59.5
2 | HVS-1 (R) | GATATTGATTTCACGGAGGATGGTGGTC | 46 | 59.9
3 | HVS-2 (F) | AGGTCTATCACCCTATTAACCACTCACG | 46 | 60.0
4 | HVS-2 (R) | GGTGTCTTTGGGGTTTGGTTGGTTC | 52 | 59.3
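The %GC column of Table 2 can be reproduced directly from the primer sequences. The short Python sketch below does so; the function name `gc_percent` is illustrative, and the Tm column is not recomputed because the formula used to obtain it is not stated here.

```python
# Primer sequences copied from Table 2.
PRIMERS = {
    "HVS-1 (F)": "CTCCACCATTAGCACCCAAAGCTAAG",
    "HVS-1 (R)": "GATATTGATTTCACGGAGGATGGTGGTC",
    "HVS-2 (F)": "AGGTCTATCACCCTATTAACCACTCACG",
    "HVS-2 (R)": "GGTGTCTTTGGGGTTTGGTTGGTTC",
}


def gc_percent(sequence):
    """Percentage of G and C bases in a nucleotide sequence."""
    sequence = sequence.upper()
    gc_count = sequence.count("G") + sequence.count("C")
    return 100.0 * gc_count / len(sequence)


gc_table = {name: round(gc_percent(seq)) for name, seq in PRIMERS.items()}

for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, {gc_percent(seq):.1f}% GC")
```

Rounded to whole percentages, the computed values agree with those listed in the table.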

Table 3: Components and concentration of PCR reaction mixture/sample

S.No. | Reagent | Volume | Final concentration
1 | 10X Taq Buffer | 2.5 µL | 1X
2 | 2 mM dNTPs | 2.0 µL | 0.16 mM
3 | 25 mM MgCl2 | 2.0 µL | 2.0 mM
4 | 10 pM/µL F-Primer | 2.0 µL | 20 pM
5 | 10 pM/µL R-Primer | 2.0 µL | 20 pM
6 | Taq Polymerase (5 U/µL) | 0.5 µL | 2.5 U
7 | DNA template | 2.0 µL | 30 ng
8 | ddH2O | 12 µL | 12 µL
Final volume | | 25.0 µL |
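The final concentrations in Table 3 follow from the dilution relation C1 × V1 = C2 × V2. The Python sketch below re-derives them as a consistency check. It assumes that the primer entries written as "pM" denote picomoles (10 pmol/µL stock, 20 pmol per reaction), which is an interpretation rather than something stated in the table, and all variable names are illustrative.

```python
FINAL_VOLUME_UL = 25.0

# Volumes per reaction (µL), copied from Table 3.
volumes_ul = {
    "10X Taq buffer": 2.5,
    "dNTPs": 2.0,
    "MgCl2": 2.0,
    "F-primer": 2.0,
    "R-primer": 2.0,
    "Taq polymerase": 0.5,
    "DNA template": 2.0,
    "ddH2O": 12.0,
}


def final_concentration(stock, volume_ul, final_volume_ul=FINAL_VOLUME_UL):
    """C2 = C1 * V1 / V2, expressed in the same unit as `stock`."""
    return stock * volume_ul / final_volume_ul


total_volume = sum(volumes_ul.values())
buffer_x = final_concentration(10.0, volumes_ul["10X Taq buffer"])  # in "X"
dntp_mm = final_concentration(2.0, volumes_ul["dNTPs"])             # in mM
mgcl2_mm = final_concentration(25.0, volumes_ul["MgCl2"])           # in mM

# Amounts per reaction rather than concentrations.
primer_pmol = 10.0 * volumes_ul["F-primer"]    # assumes 10 pmol/µL stock
taq_units = 5.0 * volumes_ul["Taq polymerase"]  # 5 U/µL stock

print(total_volume, buffer_x, dntp_mm, mgcl2_mm, primer_pmol, taq_units)
```

Note that the last column of the table mixes true concentrations (buffer, dNTPs, MgCl2) with absolute amounts per reaction (primers, polymerase, template), which is why the sketch treats the two groups separately.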

2.3.2. Thermocycling conditions for PCR

The thermocycling conditions for HVS-I were as follows: initial denaturation at 95ºC for four minutes; 35 cycles of denaturation at 94ºC for 40 seconds, annealing at 56ºC for one minute and extension at 72ºC for one minute; and a final extension at 72ºC for five minutes. The conditions for HVS-II were the same, except that the annealing temperature was 55ºC (Fig. 24).

Figure 24. Representation of thermocycling profile for PCR. Figure (A) represents PCR conditions for HVSI, while figure (B) represents PCR conditions for HVSII.
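The thermocycling profile can be written down as a small data structure, which also gives the programmed hold time. The Python sketch below reads the description as an initial denaturation, then 35 cycles of denaturation, annealing and extension, then a final extension; that reading, and the function names, are assumptions made for illustration.

```python
def pcr_program(annealing_c, cycles=35):
    """Return the program as (step, temperature in ºC, seconds, repeats)."""
    return [
        ("initial denaturation", 95, 4 * 60, 1),
        ("denaturation", 94, 40, cycles),
        ("annealing", annealing_c, 60, cycles),
        ("extension", 72, 60, cycles),
        ("final extension", 72, 5 * 60, 1),
    ]


def hold_time_minutes(program):
    """Total programmed hold time in minutes, ignoring ramping between steps."""
    return sum(seconds * repeats for _, _, seconds, repeats in program) / 60


hvs1_program = pcr_program(annealing_c=56)
hvs2_program = pcr_program(annealing_c=55)

print(f"HVS-I:  {hold_time_minutes(hvs1_program):.1f} min")
print(f"HVS-II: {hold_time_minutes(hvs2_program):.1f} min")
```

Under this reading each program holds for 6140 seconds, a little over 102 minutes, and the two programs differ only in the annealing temperature.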


2.3.3. Visualization of the PCR Products

The amplified PCR products of the control region were run on 2% agarose gel and the corresponding bands were detected under UV using a gel documentation system. The sharp and good quality DNA fragments were selected for further cleaning.

2.3.4. Elution of PCR Product

The gel slices containing the desired PCR products were purified following the manual provided with the GeneAll Gel Extraction Spin/Vacuum (SV) Kit (Cat. No. 102-101). The kit contains GB solution for melting the gel, wash buffer and elution buffer. About 500 µl of GB buffer was added to the tube containing the fine slices of gel with the amplified PCR fragment, and the tube was vortexed and incubated for 10 minutes at 56°C. When the gel was fully dissolved, the mixture was transferred to an SV column and centrifuged for one minute at 8050 ×g, and the liquid was discarded from the collection tube. Then 500 µl of wash buffer (WB) was added to the SV column, which was centrifuged for 1.5 minutes at 8050 ×g. The liquid in the collection tube was discarded and the empty SV column was centrifuged again. The SV column was transferred to a new 1.5 ml Eppendorf tube; 50 µl of elution buffer was added and the column was incubated for two minutes at 56°C. After incubation, the tube with the SV column was centrifuged for 1.5 minutes at 9447 ×g, after which the SV column was discarded and the purified PCR products were checked on a 1.5% agarose gel. The PCR products confirmed by agarose gel electrophoresis were sent to Macrogen, Inc. (Seoul, South Korea) for sequence analysis. Sequencing was performed using the ABI PRISM® BigDye™ Terminator Cycle Sequencing Kit and sequences were analyzed on a 3730XL Genetic Analyzer (Applied Biosystems).

3.3. Y-chromosome analysis

The samples from Swat and Dir districts were analyzed for Y-chromosome characterization to assess the genetic diversity within and among these populations.

3.3.1. Y-STR and Y-SNP datasets

A total of 27 Y-STR loci (DYS19, DYS385, DYS389I, DYS389II, DYS390, DYS391, DYS392, DYS393, DYS437, DYS438, DYS439, DYS448, DYS449, DYS456, DYS458, DYS460, DYS481, DYS518, DYS533, DYS570, DYS576, DYS627, DYS635, Y GATA H4, DYF387S1; the two-copy markers DYS385 and DYF387S1 each count as two loci) were amplified with the Yfiler® Plus PCR Amplification Kit (ThermoFisher Scientific, Cat. No. 4484678). The PCR products were separated and evaluated according to the manufacturer’s protocols with the modifications of Olofsson et al. (2015). A portion of each DNA sample was diluted with Tris-ethylenediaminetetraacetic acid (TE) buffer so that 1.0 ng of total DNA was contained in a final volume of 10 µL of TE buffer, which was then added to the multiplex reaction mixture (Table 4).

Table 4. Components and concentrations of the multiplex PCR reaction.

S.No | Reaction component | Volume per reaction
1 | Master mix | 10.0 µL
2 | Primer set | 5.0 µL
3 | TE buffer (with 1.0 ng DNA) | 10 µL
Total volume | | 25 µL
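The dilution that places 1.0 ng of DNA in 10 µL of TE buffer depends on the measured concentration of each extract. A minimal Python sketch of that calculation follows; the function name and the example stock concentration of 2.0 ng/µL are hypothetical and serve only to illustrate the arithmetic.

```python
def dilution_volumes(stock_ng_per_ul, target_ng=1.0, final_volume_ul=10.0):
    """Return (µL of DNA stock, µL of TE buffer) for the diluted template."""
    if stock_ng_per_ul <= 0:
        raise ValueError("stock concentration must be positive")
    dna_ul = target_ng / stock_ng_per_ul
    if dna_ul > final_volume_ul:
        raise ValueError("stock is too dilute to reach the target amount")
    return dna_ul, final_volume_ul - dna_ul


# Hypothetical extract measured at 2.0 ng/µL.
dna_ul, te_ul = dilution_volumes(2.0)
print(f"{dna_ul} µL DNA + {te_ul} µL TE buffer")
```

For very concentrated extracts the required DNA volume becomes too small to pipette accurately, so in practice an intermediate dilution would usually be prepared first.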


3.3.2. Multiplex PCR profile

The annealing temperature for the multiplex PCR was kept at 61.5°C and the number of cycles was adjusted to 25-29 to amplify the 27 Y-STR loci (Table 5).

Table 5. Cycling profile for multiplex PCR reaction

Serial number | Operation | Temperature | Time | Cycles
1 | Initial denaturation | 95°C | 1 min | -
2 | Denaturation | 94°C | 4 s | -
3 | Annealing | 61.5°C | 1 min | (26-29)
4 | Extension | 60.0°C | 1 min | -
5 | Final extension | 60.0°C | 22 min | -
6 | Hold | 4°C | - | -

Electrophoresis was performed with 1 µL of the amplified products, 0.5 µL of GeneScan™ 600 LIZ® Size Standard v. 2.0 and 9.5 µL of deionized Hi-Di™ formamide, denatured at 95°C for three minutes. The fragments were read on an Applied Biosystems® 3500xL Genetic Analyzer (ThermoFisher Scientific) according to the manufacturer’s recommendations, except that the injection time was reduced from 24 sec to 12 sec. The electropherograms were analyzed using GeneMapper® ID-X v. 1.4 (Thermo Fisher Scientific, Waltham, MA, USA) and the allelic data were checked manually twice for accuracy.

Initial assignment of Y-chromosome haplogroups was also carried out using genotypes of the Y-chromosome SNPs included on the Infinium® OmniExpressExome-8 v.1.3 BeadChip array; genotyping was performed commercially by AROS Applied Biotechnology A/S, Denmark. A total of 1,641 Y-SNPs are included on the array, of which 1,226 passed genotyping filters (call rate ≥ 90%) in the sampled individuals.

3.4. Statistical Analysis

3.4.1. Dental morphology Analysis

The nonmetric data of the samples from Swat and Dir districts were examined with neighbor-joining cluster analysis (Saitou and Nei, 1987), multidimensional scaling (MDS) with Kruskal's (1964) method and with Guttman's (1968) coefficient of alienation, and principal coordinates analysis (PCA) (Gower, 1966). Pairwise distances between the samples were calculated with C.A.B. Smith's Mean Measure of Divergence (MMD) statistic for intergroup comparison. Samples of prehistoric and living individuals from South Asia, Central Asia and the northern areas of Pakistan were included alongside the present study samples for comparison.
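As an illustration of how a pairwise MMD value may be obtained from dichotomized trait counts, a minimal sketch follows. It assumes the widely used form of Smith's statistic with the Anscombe angular transformation and the small-sample correction term 1/(n + 0.5); the text does not state which transformation was applied, so this choice, the function names and the counts in the example are assumptions.

```python
import math


def anscombe_theta(k, n):
    """Anscombe angular transformation of k present out of n scored."""
    p = (k + 3.0 / 8.0) / (n + 3.0 / 4.0)
    return math.asin(1.0 - 2.0 * p)


def mean_measure_of_divergence(sample_a, sample_b):
    """MMD between two samples.

    Each sample is a list of (k, n) pairs, one per trait, where k is the
    number of individuals expressing the trait and n the number scored.
    """
    if len(sample_a) != len(sample_b) or not sample_a:
        raise ValueError("samples must list the same, non-empty set of traits")
    total = 0.0
    for (k1, n1), (k2, n2) in zip(sample_a, sample_b):
        diff = anscombe_theta(k1, n1) - anscombe_theta(k2, n2)
        # Squared angular difference minus the sampling-variance correction.
        total += diff ** 2 - (1.0 / (n1 + 0.5) + 1.0 / (n2 + 0.5))
    return total / len(sample_a)


# Hypothetical counts for two traits in two samples.
group_1 = [(50, 160), (33, 160)]
group_2 = [(67, 150), (47, 149)]
mmd = mean_measure_of_divergence(group_1, group_2)
```

Because of the correction term, two samples with identical frequencies yield a slightly negative MMD, which is conventionally read as no detectable divergence.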

3.4.2. MtDNA Analysis

All the raw sequences of the mtDNA control region obtained from Macrogen (Seoul, South Korea) were cleaned using Sequencher® version 5.4.6 (Gene Codes Corporation, http://www.genecodes.com). The cleaned sequences were aligned and compared with the rCRS using MAFFT software (version 7) (Katoh and Standley, 2013; Andrews et al., 1999; Anderson et al., 1981). The aligned sequences were further investigated for haplotype detection with MitoTool (Fan and Yao, 2011), HaploGrep (Kloss-Brandstätter et al., 2011) and Mitomaster (Brandon et al., 2009), using PhyloTree Build 16 (http://www.phylotree.org) as the classification tree to assess the quality of the mtDNA data (Van Oven and Kayser, 2009). The haplotypes were assigned to haplogroups according to PhyloTree (Van Oven, 2015) and published data (Van Oven et al., 2011; Behar et al., 2008; Metspalu, 2004). The population statistics, i.e. Genetic Diversity (GD), Power of Discrimination (PD) and Random Match Probability (RMP), were also calculated (Prieto et al., 2011; Tajima et al., 1989). Genetic distances between population samples were evaluated as pairwise FST, calculated on the basis of haplotype frequencies in Arlequin v. 3.5.1.2 (10,000 permutations; Excoffier and Lischer, 2010) together with other Pakistani population data (Bhatti et al., 2016a; Bhatti et al., 2016b; Siddiqi et al., 2015), and visualized through classical multidimensional scaling (MDS) in the statistical software R (v. 3.2.1). Median joining networks of haplotypes were constructed in the program Network v. 5.0.0.0 (http://www.fluxus-engineering.com).
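The three population statistics named above can be illustrated with a short sketch. It assumes the definitions commonly used in forensic haplotype studies (RMP as the sum of squared haplotype frequencies, PD as 1 − RMP, and GD as Nei's unbiased haplotype diversity); the exact formulas applied in this study are not stated in the text, and the function name and haplotype labels are hypothetical.

```python
from collections import Counter


def haplotype_statistics(haplotypes):
    """Return RMP, PD and GD for a list of haplotype labels."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two individuals are required")
    counts = Counter(haplotypes)
    # Sum of squared haplotype frequencies.
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return {
        "RMP": sum_sq,                       # random match probability
        "PD": 1.0 - sum_sq,                  # power of discrimination
        "GD": n / (n - 1) * (1.0 - sum_sq),  # unbiased haplotype diversity
    }


# Hypothetical sample of four individuals carrying three haplotypes.
stats = haplotype_statistics(["H1", "H1", "H2", "H3"])
```

Under these definitions, GD differs from PD only by the n/(n − 1) small-sample correction, so the two converge as sample size grows.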

3.4.3. Y-STRs and Y-SNPs analysis

Population genetic parameters were estimated from the Y-STR data for the five samples (Tarklani, Yousafzai, Utmankheil, Gujar and Kohistani) and for the combined set of all individuals, as previously described (Olofsson et al., 2015). Genetic distances between samples were evaluated as pairwise FST, calculated on the basis of haplotype frequencies in Arlequin v. 3.5.1.2 with 10,000 permutations (Excoffier and Lischer, 2010), and visualized through classical multidimensional scaling (MDS) in R (v. 3.2.1). Median joining networks of haplotypes were constructed in the program Network v. 5.0.0.0 (http://www.fluxus-engineering.com), and weights (1-5) were given to the included loci as previously reported (Olofsson et al., 2015).

We constructed two datasets of previously published Y-STR data to explore the patrilineal gene pool of Swat and Dir districts in a broader geographic and ethnographic context. One dataset encompassed 38 population samples specifically from the Indo-Pakistani subcontinent and Southwest Asia. The other dataset encompassed 54 worldwide population samples (including the five from the current study) from the Human Genome Diversity Project (HGDP) panel (Haber et al., 2012; Perveen et al., 2014; Lee et al., 2014; Roewer et al., 2009; Vermeulen et al., 2009; Rosenberg, 2006; Qamar et al., 2002; Cann et al., 2002). Details of all comparative samples included in this study are provided in Table 6. To facilitate the inclusion of previously published data from a large number of ethnic groups, the dataset was limited to the 10 Y-STR loci for which all samples had been characterized. The same package (Arlequin v. 3.5.1.2; Excoffier and Lischer, 2010) was used for the analyses of molecular variance (AMOVA) between all groups and for further groupings based on country of origin and ethnicity.

Following standard practice, the multi-copy loci in this kit (DYS385 and DYF387S1) and haplotypes with duplication events were excluded from the estimation of genetic distances (FST) and the construction of median joining networks. Furthermore, individuals with haplotypes displaying null or intermediate alleles were also excluded. As is standard for Y-STR analyses, the alleles of the DYS389II locus were converted to the DYS389B nomenclature by subtracting the repeat number of DYS389I from that of DYS389II.
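The exclusion and conversion rules described above can be sketched as follows. This is a minimal illustration under stated assumptions: haplotypes are represented as dictionaries of locus to allele, a null allele is represented as None, an intermediate allele as a non-integer number (e.g. 17.2), and a duplication as a tuple of alleles at a single-copy locus. The function name and the allele values in the example are hypothetical.

```python
MULTI_COPY_LOCI = {"DYS385", "DYF387S1"}


def to_analysis_haplotype(raw):
    """Return the haplotype used for FST and networks, or None if excluded."""
    # Drop the multi-copy loci before any further checks.
    kept = {locus: a for locus, a in raw.items() if locus not in MULTI_COPY_LOCI}
    for allele in kept.values():
        # Excludes null alleles (None), duplications (tuples) and
        # intermediate alleles (non-integer repeat numbers).
        if not isinstance(allele, (int, float)) or allele != int(allele):
            return None
    out = {locus: int(a) for locus, a in kept.items()}
    if "DYS389I" in out and "DYS389II" in out:
        # DYS389B = DYS389II - DYS389I; DYS389I itself is retained.
        out["DYS389B"] = out.pop("DYS389II") - out["DYS389I"]
    return out


# Hypothetical partial haplotype.
example = to_analysis_haplotype(
    {"DYS19": 14, "DYS389I": 13, "DYS389II": 29, "DYS385": (11, 14)}
)
```

The subtraction removes the DYS389I repeats that are counted again within the longer DYS389II amplicon, so that the two loci are not treated as carrying overlapping information.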

All the corresponding haplotypes observed in the present study were submitted to the Y-chromosomal Haplotype Reference Database (YHRD) (Willuweit and Roewer, 2015), with their respective accession numbers from YA004265 to YA004269.


Table 6. Population samples included in the larger comparative analyses. Sample sizes and references to the original studies are shown.

Population                          No. of individuals   References
Gujar (a)                           20                   This study
Kohistani (a)                       20                   This study
Tarklani (a)                        20                   This study
Utmankheil (a)                      20                   This study
Yousafzai (a)                       20                   This study
Iran-Ahvaz (b)                      46                   Roewer et al., 2009
Iran-Izeh (b)                       50                   Roewer et al., 2009
Iran-Rasht (b)                      46                   Roewer et al., 2009
Iran-Sari (b)                       46                   Roewer et al., 2009
Iran-Masal (b)                      18                   Roewer et al., 2009
Azerbaijan-Lenkoran (b)             47                   Roewer et al., 2009
Afghanistan-Baluch (b)              13                   Haber et al., 2012
Afghanistan-Hazara (b)              60                   Haber et al., 2012
Afghanistan-Pashtun (b)             48                   Haber et al., 2012
Afghanistan-Tajik (b)               56                   Haber et al., 2012
Afghanistan-Uzbek (b)               17                   Haber et al., 2012
Pakistan-Punjabi (b)                300                  Perveen et al., 2014
Pakistan-Pathan (b)                 270                  Lee et al., 2014
Pakistan-Baluch (BAL) (b)           59                   Qamar et al., 2002
Pakistan-Balti (BLT) (b)            13                   Qamar et al., 2002
Pakistan-Brahui (BRU) (b)           109                  Qamar et al., 2002
Pakistan-Burusho (BSK) (b)          94                   Qamar et al., 2002
Pakistan-Hazara (HZR) (b)           23                   Qamar et al., 2002
Pakistan-Kalash (KAL) (b)           44                   Qamar et al., 2002
Pakistan-Kashmiri (KSR) (b)         12                   Qamar et al., 2002
Pakistan-Makrani Baluch (MAKB) (b)  25                   Qamar et al., 2002
Pakistan-Negroid Makrani (MAKN) (b) 33                   Qamar et al., 2002
Pakistan-Parsi (PRS) (b)            89                   Qamar et al., 2002
Pakistan-Pathan (PKH) (b)           94                   Qamar et al., 2002
Pakistan- (SDH) (b)                 120                  Qamar et al., 2002
Adygei (c)                          7                    Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Balochi (a)                         25                   Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Bantu (c)                           19                   Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Basque (c)                          14                   Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Bedouin (c)                         26                   Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Biaka Pygmy (c)                     30                   Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Brahui (a)                          25                   Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Burusho (a)                         17                   Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Cambodian (c)                       6                    Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Colombian (c)                       5                    Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Dai (c)                             7                    Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Daur (c)                            7                    Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Druze (c)                           13                   Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
French (c)                          10                   Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Han (c)                             22                   Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Hazara (a)                          24                   Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Italian (c)                         6                    Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Japanese (c)                        19                   Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Kalash (a)                          20                   Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Karitiana (c)                       10                   Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Lahu (c)                            7                    Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Makrani (a)                         19                   Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Mandenka (c)                        15                   Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Mbuti Pygmy (c)                     11                   Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Melanesian (c)                      6                    Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Miao (c)                            7                    Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Mongola (c)                         6                    Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Mozabite (c)                        20                   Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Naxi (c)                            8                    Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Orcadian (c)                        7                    Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Oroqen (c)                          5                    Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Palestinian (c)                     16                   Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Papuan (c)                          10                   Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Pathan (a)                          17                   Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Pima (c)                            14                   Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Russian (c)                         15                   Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
San (c)                             7                    Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Sardinian (c)                       15                   Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
She (c)                             7                    Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Sindhi (a)                          19                   Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Surui (c)                           9                    Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Tu (c)                              7                    Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Tujia (c)                           9                    Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Tuscan (c)                          5                    Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Uygur (c)                           20                   Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Xibo (c)                            20                   Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Yakut (c)                           16                   Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Yi (c)                              7                    Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Yoruba (c)                          13                   Cann et al., 2002; Rosenberg, 2006; Vermeulen et al., 2009
Total                               2481

Groups marked with (a) were used both for the regional analyses (MDS, AMOVA) and for the worldwide analysis (MDS). Groups marked with (b) were used only for the regional analyses (MDS, AMOVA). Groups marked with (c) were used only for the worldwide MDS analysis.


Initial assignment of Y-chromosomal haplogroups was carried out using genotypes of the Y-SNPs included on the Infinium® OmniExpressExome-8 v.1.3 BeadChip array. A total of 1,641 Y-SNPs are included on the array, of which 1,226 passed genotyping filters (call rate ≥ 90%) among the individuals included in the study. These were intersected with the ISOGG Y-DNA SNP index (http://isogg.org/tree/index.html, version 10.103), resulting in a final set of 331 haplogroup-defining Y-SNPs.

Each individual was assigned to the most derived haplogroup for which the individual's genotype matched the derived allele. Markers in parentheses followed by an "x" indicate downstream markers for which the samples were typed but were found to be in the ancestral state.
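The filtering and assignment logic described above can be sketched in simplified form. This is an illustrative sketch under assumptions, not the pipeline used in the study: missing genotypes are represented as None, the marker table and the haplogroup parent map are small hypothetical stand-ins for the ISOGG index, and depth in the tree is taken as the number of steps to the root.

```python
def call_rate(calls):
    """Fraction of individuals with a non-missing genotype at one SNP."""
    return sum(c is not None for c in calls) / len(calls)


def passing_snps(calls_by_snp, min_rate=0.90):
    """Names of SNPs whose call rate meets the threshold."""
    return {snp for snp, calls in calls_by_snp.items() if call_rate(calls) >= min_rate}


def depth(haplogroup, parent):
    """Number of steps from a haplogroup up to the root of the tree."""
    steps = 0
    while haplogroup in parent:
        haplogroup = parent[haplogroup]
        steps += 1
    return steps


def most_derived(genotype, markers, parent):
    """Deepest haplogroup whose defining SNP shows the derived allele."""
    hits = [hg for snp, (hg, derived) in markers.items() if genotype.get(snp) == derived]
    return max(hits, key=lambda hg: depth(hg, parent)) if hits else None


# Hypothetical markers: SNP name -> (haplogroup, derived allele).
markers = {"snp1": ("R", "T"), "snp2": ("R1", "G"), "snp3": ("R1a", "A")}
parent = {"R1a": "R1", "R1": "R"}
sample = {"snp1": "T", "snp2": "G", "snp3": "C"}  # ancestral at snp3
assigned = most_derived(sample, markers, parent)
```

In this hypothetical example the individual is derived for the markers defining R and R1 but ancestral for the marker defining R1a, which corresponds to the "x" notation described in the text, i.e. R1 with R1a excluded.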

Chapter 3 RESULTS

The results obtained were analyzed and are arranged in this chapter under three subheadings: dental morphology, mitochondrial DNA and Y-chromosomal DNA analyses.

3.1. Dental Morphology

Fourteen maxillary and mandibular tooth-trait combinations were scored according to the ASUDAS. Seven maxillary and seven mandibular traits were selected for comparative analysis. The maxillary tooth-trait variables include shovelling of UI1 (SHOVUI1) and UI2 (SHOVUI2), hypocone on UM1 (HYPOUM1) and UM2 (HYPOUM2), median lingual ridge development or tuberculum dentale on UI1 (MLRUI1), and presence of the metaconule on UM1 (MTCLUM1) and UM2 (MTCLUM2). The mandibular variables include the Y-groove pattern on LM2 (YGRVLM2), major cusp number on LM1 (CSPNLM1) and LM2 (CSPNLM2), the entoconulid on LM1 (C6LM1) and LM2 (C6LM2), and the metaconulid on LM1 (C7LM1) and LM2 (C7LM2).

3.1.1. Dichotomized Individual Trait Frequencies

Dental trait frequencies and the corresponding sample sizes of the five ethnic groups residing in Swat and Dir districts of Khyber Pakhtunkhwa Province of Pakistan, together with the additional samples used for comparative analysis, are given in Appendix IV. These comparative samples include both living and prehistoric individuals (Table 7). The expression of dental traits shows marked variation when the traits are individually dichotomized into absence and presence only. The dental trait frequencies obtained from the five population samples of the present study are given in Table 8. The majority of these traits show moderate frequencies, while a few traits, i.e. median lingual ridge (MLRUI1), hypocone (HYPOUM1) and major cusp number (CSPNLM1), were observed with the highest frequencies.
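The dichotomization step can be illustrated with a minimal sketch. It assumes that each individual carries an ordinal ASUDAS grade for a trait (None where the tooth could not be scored) and that a trait counts as present at or above a chosen breakpoint; the breakpoint and the grades in the example are hypothetical and are not those used in this study.

```python
def presence_frequency(grades, breakpoint):
    """Percentage of scorable individuals graded at or above the breakpoint."""
    scored = [g for g in grades if g is not None]
    if not scored:
        raise ValueError("no scorable individuals for this trait")
    present = sum(g >= breakpoint for g in scored)
    return 100.0 * present / len(scored)


# Hypothetical grades for one trait; None marks an unscorable tooth.
grades = [0, 1, 2, 3, None, 0, 4]
frequency = presence_frequency(grades, breakpoint=2)
```

Unscorable teeth are left out of the denominator, which is why the sample size can differ from trait to trait within the same ethnic group.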

Table 7. Details of the living/modern and prehistoric samples used in this study for comparative analysis

Sample                                   Abb.      N
Northern Pakistan/Karakoram
  Khows                                  KHO       144
  Madaklasht                             MDK       185
  Wakhis (Gulmit)                        WAKg      162
  Wakhis (Sost)                          WAKs      146
Abbottabad and Mansehra
  Awans                                  AWAm2     93
  Syeds                                  SYD       65
  Gujars                                 GUJ       90
  Tanolis                                TAN       69
  Karlaars                               KAR       76
  Awans                                  AWA1      167
  Swatis                                 SWT       178
Western Indians (Maharashtra)
  Inamgaon                               INM       41
  Marathas                               MRT       198
  Mahars                                 MHR       195
  Madia Gonds                            MDA       169
Prehistoric Central Asia
  Djarkutan                              DJR       39
  Kuzali                                 KUZ       24
Prehistoric Indus valley
  Neo. Mehrgarh                          NeoMRG    49
  Chl. Mehrgarh                          ChlMRG    25
  Harappa                                HAR       33
  Timargarha                             TMG       25
  Sarai Khola                            SKH       15
South-Eastern Indians (Andhra Pradesh)
  Pakanati Red.                          PNT       182
  Gompad. Mad.                           GPD       178
  Chenchus                               CHU       194
Swat and Dir (Present study)
  Gujars                                 GUJsw     165
  Kohistan                               KOHsw     174
  Tarklani                               TRKd      150
  Utmankheil                             UTHd      150
  Yousafzai                              YSFsw     150

N = sample size; Abb. = abbreviation

Table 8. Frequencies of dental traits among the five ethnic groups (%).

Traits     GUJsw   KOHsw   TRKd    UTHd    YSFsw
SHOVUI1    31.25   33.33   44.72   40.88   29.28
SHOVUI2    20.63   17.83   31.58   22.88   15.00
MLRUI1     73.13   62.35   65.84   72.96   76.80
HYPOUM1    96.88   98.76   93.79   94.34   99.45
HYPOUM2    19.50   20.63   13.75   15.09   18.64
MTCLUM1    26.25   25.47   31.68   24.53   16.57
MTCLUM2    10.06   6.96    16.88   13.21   10.17
YGRVLM2    11.88   15.00   12.42   4.49    37.02
CSPNLM1    79.87   85.09   80.75   94.97   86.11
CSPNLM2    12.03   18.75   19.23   21.94   6.08
C6LM1      11.88   6.21    11.18   13.21   5.00
C6LM2      0.00    1.25    2.48    0.65    0.56
C7LM1      14.38   8.70    21.74   12.58   6.11
C7LM2      2.52    1.88    11.80   3.90    0.00

3.1.1.1. Shovelling

Shovelling was more frequent on upper incisor 1 (SHOVUI1), ranging from 29.28% to 44.72%, than on upper incisor 2 (SHOVUI2), ranging from 15.00% to 31.58%, in all five ethnic groups (Table 8). The frequency of SHOVUI1 was highest in the Tarklani (TRKd) sample, at 44.72% of individuals, followed by Utmankheil (UTHd) at 40.88%, Kohistani (KOHsw) at 33.33%, Gujar (GUJsw) at 31.25% and Yousafzai (YSFsw) at 29.28%. Among all the samples included in this analysis, shovelling of upper incisor 1 was most prevalent in the Neolithic inhabitants of Mehrgarh (NeoMRG), where it was observed in 64.29% of individuals, while no individual expressed this trait in the Sarai Khola (SKH) sample from the Indus Valley (Fig. 25A). High frequencies of this trait were also found in the foothill samples of Karlaars from Abbottabad (KARa; 57.72%) and Gujars from Mansehra (GUJm2; 57.42%), while its frequency in the remaining samples of this aggregate ranged from 17.73% to 37.76%. The Central Asian samples (DJR, KUZ, MOL, SAP) collectively show the lowest frequencies of SHOVUI1 compared to the other sample groups, ranging from 7.69% to 18.75%. Among the ethnic groups of West-Central India, the highest prevalence was found in the Madia Gonds (MDA), at 49.08%, and the lowest in the individuals of Inamgaon (INM), at 37.50%. Among the Hindu Kush highlanders, the Madaklasht (MDK) sample shows the highest frequency (40.78% of individuals), while the lowest frequency (19.62%) was recorded in the Wakhis (WAKs). The three living Dravidian ethnic groups of southeast India (PNT, CHU, GPD) show SHOVUI1 frequencies of 29.55% to 36%. YSFsw from Swat District, the AWAm1 foothill sample from Mansehra District and the Pakanati Reddis (PNT) from Andhra Pradesh in southeast India show similar frequencies of SHOVUI1, at about 29% (Fig. 25A).

The frequency of shovelling on upper incisor 2 (SHOVUI2) was 31.58% among the Tarklani (TRKd), followed by Utmankheil (UTHd) at 22.88%, Gujar (GUJsw) at 20.63%, Kohistani (KOHsw) at 17.83% and Yousafzai (YSFsw) at 15.00% (Table 8). When the frequency of SHOVUI2 was compared with the other populations included in this study, the Gujar sample from Swat District (GUJsw) was most similar to the Tanoli from Mansehra District (TANm2) and the KHO sample of the Hindu Kush highlands, all with frequencies of about 20% (Fig. 25B). The highest frequency of SHOVUI2 was observed among the Chalcolithic inhabitants of Mehrgarh (CHMRG), at 58.33% of individuals, followed by the KARa (57.72%) and GUJm2 (57.42%) foothill samples (Abbottabad and Mansehra), while this trait was not expressed in the Sarai Khola (SKH) sample of the Indus Valley (Fig. 25B).

[Figure 25 panels: (A) SHOVUI1; (B) SHOVUI2. Sample groups shown: Present Study (Dir and Swat), Foothill (Mansehra and Abbottabad), West-Central Indian, Southeast Indian, Central Asian, Hindu Kush, Indus Valley.]

Figure 25. Frequencies of shovelling (SHOVUI) among living Pakistani ethnic groups, living ethnic groups from peninsular India, samples of the prehistoric inhabitants of the Indus Valley and South Central Asia, and the major living ethnic groups of Swat and Dir districts. (A) SHOVUI1. (B) SHOVUI2.

3.1.1.2. Median Lingual ridge

Median lingual ridge (MLR) development, also called tuberculum dentale (TD), is a maxillary dental trait expressed lingually on the incisors (MLRU) and canine (MLRC). MLR on upper incisor 1 (MLRUI1) was highly expressed in the five population samples from Swat and Dir districts, ranging from 62.35% to 76.80% (Table 8). The highest frequency of MLRUI1/TDUI1 was observed in the Yousafzai (YSFsw) sample, at 76.80%, followed by Gujar (GUJsw) at 73.13%, Utmankheil (UTHd) at 72.96%, Tarklani (TRKd) at 65.84% and Kohistani (KOHsw) at 62.35% (Fig. 26).

[Figure 26 panel: MLRUI1. Sample groups shown: Present Study (Dir and Swat), Foothill (Mansehra and Abbottabad), West Central Indian, Southeast Indian, Central Asian, Hindu Kush, Indus Valley.]

Figure 26. Frequencies of median lingual ridge (MLR) among living Pakistani ethnic groups, living ethnic groups from peninsular India, samples of the prehistoric inhabitants of the Indus Valley and South Central Asia, and the major living ethnic groups of Swat and Dir districts.

It was also found that the Yousafzai of Swat District (YSFsw) have the highest frequency of MLRUI1 of all the samples, including the historic, prehistoric and living samples from Pakistan and peninsular India, as shown in Figure 26. The lowest frequencies of this trait occurred in the Kuzali (KUZ; 15.38%) and Djarkutan (DJR; 17.65%) samples of Central Asia, followed by the Awan (AWAm2; 17.73%) foothill sample from Mansehra District.

3.1.1.3. Y-Groove Pattern

The Y-groove pattern, or Y occlusal groove pattern, on lower molar 2 (YGRVLM2) ranged in frequency from a low of 4.49% in Utmankheil (UTHd) to a high of 37.02% in Yousafzai (YSFsw) among the five ethnic groups of Swat and Dir districts, while its prevalence was 15.00% in Kohistani (KOHsw), 12.42% in Tarklani (TRKd) and 11.88% in Gujar (GUJsw) (Fig. 27).

[Figure 27 panel: YGRVLM2. Sample groups shown: Present Study (Dir and Swat), Foothill (Mansehra and Abbottabad), West Central Indian, Southeast Indian, Central Asian, Hindu Kush, Indus Valley.]

Figure 27. Frequencies of the Y-groove pattern (YGRVLM2) among living Pakistani ethnic groups, living ethnic groups from peninsular India, samples of the prehistoric inhabitants of the Indus Valley and South Central Asia, and the major living ethnic groups of Swat and Dir districts.

The Pakanati Reddis (PNT) and Gompadhompti Madigas (GPD) of southeast India, the SKH sample of the Indus Valley and the Yousafzai of Swat District (YSFsw) show the highest YGRVLM2 frequencies, ranging from 35.73% to 40.61%, whereas the lowest expression, 4.49%, was found in the Utmankheil sample from Dir District (UTHd) (Fig. 27). The Kohistani sample of Swat District (KOHsw), the Gujar (GUJm2) and Tanoli (TANm2) foothill samples from Mansehra District and the Molali (MOL) sample of Central Asia are similar, with about 15% of individuals expressing the YGRVLM2 trait. The Gujar sample from Swat District (GUJsw), the Awan sample from Mansehra District (AWAm2) and the Wakhis from Gulmit (WAKg) have nearly the same frequency, ranging from 11.88% to 12.00% (Fig. 27).

3.1.1.4. Hypocone

The hypocone (cusp 4) was far more frequent on upper molar 1 (HYPOUM1), ranging from 93.79% to 99.45%, than on upper molar 2 (HYPOUM2), ranging from 13.75% to 20.63%, among the five ethnic groups of the present study (Table 8). The highest frequency of HYPOUM1 was found in YSFsw, at 99.45%, followed by KOHsw at 98.76%, GUJsw at 96.88%, UTHd at 94.34% and TRKd at 93.79% (Fig. 28A). HYPOUM1 was highly expressed in all sample groups, i.e. the Indus Valley, Hindu Kush, Central Asian, peninsular Indian (Southeast Indian and West Central Indian) and present study (Swat and Dir districts) samples, ranging from 65.85% to 100%, except the foothill samples, where its frequency ranged from 17.73% to 57.72% (Fig. 28A).


[Figure 28 panels: (A) HYPOUM1; (B) HYPOUM2. Sample groups shown: Present Study (Swat and Dir), Foothill (Mansehra and Abbottabad), West Central Indian, Southeast Indian, Central Asian, Hindu Kush, Indus Valley.]

Figure 28. Frequency distribution of the hypocone. (A) Hypocone on upper molar 1 (HYPOUM1). (B) Hypocone on upper molar 2 (HYPOUM2).

Among the samples of Swat and Dir districts, the frequency of the hypocone on upper molar 2 (HYPOUM2) was highest in Kohistani (KOHsw), at 20.63% of individuals, followed by Gujar (GUJsw) at 19.50%, Yousafzai (YSFsw) at 18.64%, Utmankheil (UTHd) at 15.09% and Tarklani (TRKd) at 13.75% (Table 8). Comparatively, the Central Asian samples show the highest expression of HYPOUM2 among the samples included in this study, ranging from 50% to 71.88%, whereas only one of the Indus Valley samples (CHMRG) and one of the southeast Indian samples (CHU) show a high prevalence, with frequencies of 55.56% and 42.78% respectively. No expression of HYPOUM2 was recorded in the Indus Valley sample from Timergara (TMG) or in the Inamgaon (INM) sample from West Central India. Overall, HYPOUM1 was more prevalent than HYPOUM2 in all samples included in this study (Fig. 28B).

3.1.1.5. Metaconule

The metaconule (MTCL; cusp 5) was scored on both upper molar 1 (MTCLUM1) and upper molar 2 (MTCLUM2). The results show that MTCLUM1 was more frequent, ranging from 16.57% to 31.68%, than MTCLUM2, ranging from 6.96% to 16.88% (Table 8). Among the samples from Swat and Dir districts, the frequency of MTCLUM1 was highest in the Tarklani (TRKd), at 31.68%, followed by Gujar (GUJsw) at 26.25%, Kohistani (KOHsw) at 25.47%, Utmankheil (UTHd) at 24.53% and Yousafzai (YSFsw) at 16.57% (Fig. 29A). The highest expression was observed in the samples of the Indus Valley, peninsular India and the present study (Swat and Dir districts), ranging from 14.63% to 46.15%, whereas one sample (SKH) among the Hindu Kush was also found with a high frequency, 33.3% (Fig. 29A). Similar expression of MTCLUM1 was found in three ethnic groups (GUJsw, UTHd and KOHsw) from Swat and Dir districts, one sample (Chenchus, CHU) from southeast India and two samples (NeoMRG and CHIMRG) from the Indus Valley, with frequencies ranging from 24% to 26%. The rest of the samples included in this analysis, i.e. the foothill, Central Asian and Hindu Kush (except SKH) samples, show less than 10% MTCLUM1 expression, as shown in Figure 29A.

[Figure 29 panels: (A) MTCLUM1; (B) MTCLUM2. Sample groups shown: Present Study (Swat and Dir), Foothill (Mansehra and Abbottabad), West Central Indian, Southeast Indian, Central Asian, Hindu Kush, Indus Valley.]

Figure 29. Frequencies of the metaconule on the upper molars. (A) Frequency distribution of the metaconule on upper molar 1 (MTCLUM1). (B) Frequency distribution of the metaconule on upper molar 2 (MTCLUM2).

The frequency of the metaconule on upper molar 2 (MTCLUM2) was 16.88% among the Tarklani (TRKd), followed by Utmankheil (UTHd) at 13.21%, Yousafzai (YSFsw) at 10.17%, Gujar (GUJsw) at 10.06% and Kohistani (KOHsw) at 6.96% (Table 8). The highest expression of this trait was found in the NeoMRG and CHIMRG samples of the Indus Valley, with frequencies of 40% and 33.33% respectively (Fig. 29B). The lowest frequency was found among the Kuzali (KUZ) of Central Asia, where the trait was expressed in just 4.1% of individuals, and it was completely absent in the prehistoric Indus Valley sample from Timergara (TMG) and in the Central Asian sample from Djarkutan (DJR) (Fig. 29B).

3.1.1.6. Major Cusp number

Major cusp number (CSPN) was higher in frequency on lower molar 1 (CSPNLM1), ranging from 79.87% to 94.97%, than on lower molar 2 (CSPNLM2), ranging from 6.08% to 21.94%, in all five populations sampled from Swat and Dir districts (Table 8). The frequency of CSPNLM1 was 94.97% in Utmankheil (UTHd), 86.11% in Yousafzai (YSFsw), 85.09% in Kohistani (KOHsw), 80.75% in Tarklani (TRKd) and 79.87% in Gujar (GUJsw) (Fig. 30A). Comparatively, no clear differences were found in the frequency of this trait among the samples used in this analysis (Fig. 30A).

CSPNLM2 was most frequently observed in the Utmankheil (UTHd) sample, in 21.94% of individuals, followed by Tarklani (TRKd) at 19.23%, Kohistani (KOHsw) at 18.75%, Gujar (GUJsw) at 12.03% and Yousafzai (YSFsw) at 6.08% (Fig. 30B).

This trait was markedly expressed in all three ethnic groups from southeast India, occurring in 37.21% of the Gompadhompti Madigas (GPD), 23.89% of the Pakanati Reddis (PNT) and 27.75% of the Chenchus (CHU), while it was completely absent in the prehistoric sample from Harappa (HAR). The most similar frequencies were found among the Gujar sample from Swat District (GUJsw), the Awan (AWAm1), Karlaars (KARa), Syed (SYDm2) and Tanoli (TANm2) samples from Mansehra District and the Khowar sample of the Hindu Kush highlands from Chitral District (KHO), ranging from 10.29% to 13.77% (Fig. 30B). The Yousafzai sample of Swat District (YSFsw) was most similar in CSPNLM2 expression to the Neolithic occupants of Mehrgarh (NeoMRG) and the Sarai Khola (SKH) sample of the Indus Valley, with the trait recorded in 6.08% to 6.67% of individuals (Fig. 30B).

[Figure 30 panels: (A) CSPNLM1; (B) CSPNLM2. Sample groups shown: Present Study (Dir and Swat), Foothill (Mansehra and Abbottabad), West Central Indian, Southeast Indian, Central Asian, Hindu Kush, Indus Valley.]

Figure 30. Frequency distribution of major cusp number on the lower molars (CSPNLM) among all samples. (A) Frequency of major cusp number on lower molar 1 (CSPNLM1). (B) Frequency of major cusp number on lower molar 2 (CSPNLM2).

3.1.1.7. Entoconulid

The entoconulid (cusp 6) was more prevalent on lower molar 1 (C6LM1) than on lower molar 2 (C6LM2) in the five ethnic groups of Swat and Dir districts (Table 8). The frequency of the entoconulid on lower molar 1 (C6LM1) was 13.21% among the Utmankheil (UTHd), followed by Gujar (GUJsw) at 11.88%, Tarklani (TRKd) at 11.18%, Kohistani (KOHsw) at 6.21% and Yousafzai (YSFsw) at 5.00% (Fig. 31A). The Chalcolithic sample from Mehrgarh (CHIMRG) shows the highest frequency (21.74%) of all the samples included in this analysis. The lowest non-zero frequency was observed among the Gujar from Mansehra District (GUJm2), where the trait was recorded in only 1.8% of individuals, while it was completely absent in the prehistoric Indus Valley sample from Timergara (TMG) and the Kuzali (KUZ) sample from Central Asia (Fig. 31A).

The entoconulid on lower molar 2 (C6LM2) was completely absent in Gujar (GUJsw), and its frequency in the Utmankheil (UTHd), Tarklani (TRKd), Kohistani (KOHsw) and Yousafzai (YSFsw) samples from Swat and Dir districts ranged from 0.56% to 2.48%, the lowest of the trait frequencies observed in this analysis (Table 8). Half of the samples included in this analysis express this trait at a very low frequency, ranging from 0.54% to 11.11%, while it was completely absent in the remaining half, i.e. the Gujar sample from Swat District (GUJsw), three samples (AWAm1, AWAm2, SWT) from Mansehra District, the prehistoric sample from Maharashtra (INM), the prehistoric south Central Asian samples (DJR, KUZ, MOL, SAP), two samples (KHO, WAKs) from the Hindu Kush and three samples (SKH, HAR, NeoMRG) from the Indus Valley (Fig. 31B).


[Figure 31 panels: (A) C6LM1; (B) C6LM2. Sample groups shown: Present Study (Dir and Swat), Foothill (Mansehra and Abbottabad), West Central Indian, Southeast Indian, Central Asian, Hindu Kush, Indus Valley.]

Figure 31. Frequency distributions of the entoconulid on the lower molars (C6LM) among all samples included in this study. (A) Frequency of the entoconulid on lower molar 1 (C6LM1). (B) Frequency of the entoconulid on lower molar 2 (C6LM2).

3.1.1.8. Metaconulid

The metaconulid on lower molar 1 (C7LM1) was recorded in 21.74% of the Tarklani (TRKd), followed by Gujar (GUJsw) at 14.38%, Utmankheil (UTHd) at 12.58%, Kohistani (KOHsw) at 8.70% and Yousafzai (YSFsw) at 6.11% in the samples collected from Swat and Dir districts (Table 8). When C7LM1 was compared across all of the samples included in this analysis, the highest frequency (24.62%) was observed among the Chenchus (CHU) of southeast India, while the Tarklani from Dir District (TRKd) had the second-highest frequency, 21.74% (Fig. 32A). The lowest non-zero frequency was observed among the inhabitants of Sapalli tepe (SAP), where the trait was recorded in only 2.63% of individuals, and it was completely absent in the Kuzali (KUZ) sample of Central Asia (Fig. 32A).

[Figure 32 panels: (A) C7LM1; (B) C7LM2. Sample groups shown: Present Study (Swat and Dir), Foothill (Mansehra and Abbottabad), West Central Indian, Southeast Indian, Central Asian, Hindu Kush, Indus Valley.]

Figure 32. Frequency distributions of the metaconulid on the lower molars (C7LM) among all samples included in this study. (A) Frequency of the metaconulid on lower molar 1 (C7LM1). (B) Frequency of the metaconulid on lower molar 2 (C7LM2).

The metaconulid at lower molar 2 (C7LM2) was observed in 11.80% of the Tarklani (TRKd), 3.90% of the Utmankheil (UTHd), 2.52% of the Gujar (GUJsw) and 1.80% of the Kohistani (KOHsw), while it was completely absent in the Yousafzai (YSFsw) sample from Swat and Dir districts (Table 8). The Tarklani sample from District Dir showed the highest frequency of all samples included in this analysis (11.80%), followed by the Gompadhompti Madigas (GPD) from southeast India at 10.98%, the prehistoric sample from Timergara (TMG) at 10%, and CHU (9.28%) and PNT (6.4%), both of which belong to the southeast Indian samples; in the remaining samples the trait was observed in less than 6% of individuals (Fig. 32B).

The lowest non-zero frequency was observed among the Marathas (MRT) of the west-central Indian samples, where the trait was recorded in only 0.51% of individuals. It was completely absent in the Yousafzai sample from District Swat (YSFsw), AWAm2 and TANm2 from District Mansehra, KUZ and SAP from Central Asia, WAKs from the Hindu Kush, and the SKH, ChlMRG, HAR and NeoMRG samples of the Indus Valley (Fig. 32B).

Furthermore, the dental morphology data obtained from the five population samples of Swat and Dir districts were compared with the other Pakistani, Central Asian and Indian (living and prehistoric) samples (Table 7), and the results were used for further analysis.

3.1.2. Mean Measure of Divergence

A mean measure of divergence (MMD) analysis was carried out to determine the patterns of affinity among the five population samples from Swat and Dir districts, the prehistoric inhabitants of the Indus Valley and South-Central Asia, living peninsular Indian ethnic groups and other ethnic groups of Pakistan (Table 8). The distance matrix values for each pairwise group comparison are given in Table 9 and were used for further analysis. High MMD values represent phenetic divergence between the paired samples, while low MMD values indicate phenetic similarity.
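As an illustration of how a pairwise MMD value and its standard deviation can be obtained from trait counts, the following is a minimal sketch of Smith's MMD. It assumes the Anscombe angular transformation and the usual small-sample correction term; the transformation actually used in this study is not stated in this section, and the function names and input layout are hypothetical.

```python
import math

def anscombe_theta(k, n):
    """Anscombe angular transformation of k trait expressions among n scored individuals."""
    p = (k + 3.0 / 8.0) / (n + 3.0 / 4.0)
    return math.asin(1.0 - 2.0 * p)

def correction(n1, n2):
    """Small-sample correction term for one trait in one pair of samples."""
    return 1.0 / (n1 + 0.5) + 1.0 / (n2 + 0.5)

def smith_mmd(sample_a, sample_b):
    """Mean measure of divergence between two samples.

    Each sample is a list of (k, n) pairs, one per trait, given in the same
    trait order (assumed input layout, not taken from the thesis).
    """
    if len(sample_a) != len(sample_b) or not sample_a:
        raise ValueError("samples must list the same non-empty set of traits")
    total = 0.0
    for (k1, n1), (k2, n2) in zip(sample_a, sample_b):
        diff = anscombe_theta(k1, n1) - anscombe_theta(k2, n2)
        total += diff * diff - correction(n1, n2)
    return total / len(sample_a)

def smith_mmd_sd(sample_a, sample_b):
    """Standard deviation of the MMD under the hypothesis of no divergence."""
    r = len(sample_a)
    total = sum(correction(n1, n2) ** 2
                for (_, n1), (_, n2) in zip(sample_a, sample_b))
    return math.sqrt(2.0 * total) / r
```

Because the correction term is subtracted for every trait, two samples with near-identical frequencies yield a slightly negative MMD under this formulation, which would be consistent with the negative entries that appear in Table 9.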

3.1.3. Living Northern Pakistanis Only

Inter-sample affinities based upon pairwise MMD values were examined with neighbor-joining cluster analysis (NJ), multidimensional scaling (MDS), and principal coordinate analysis (PCA).
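The proportion of variance attributed to each principal axis in the analyses below can be illustrated with a short sketch of classical principal coordinate analysis. This is not the software used in this study: it assumes NumPy, treats the MMD matrix as a distance matrix, and sets negative MMD values to zero, none of which is specified in this section. The function name is hypothetical.

```python
import numpy as np

def principal_coordinates(dist, n_axes=3):
    """Classical principal coordinate analysis of a symmetric distance matrix.

    Returns sample coordinates on the first n_axes axes and the share of the
    positive eigenvalue sum captured by each of those axes.
    """
    # Assumption: negative MMD values are treated as zero distances.
    d = np.clip(np.asarray(dist, dtype=float), 0.0, None)
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (d ** 2) @ centering  # double-centred matrix
    eigvals, eigvecs = np.linalg.eigh(gram)         # ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    positive = eigvals[eigvals > 1e-10 * eigvals[0]]
    explained = eigvals[:n_axes] / positive.sum()
    coords = eigvecs[:, :n_axes] * np.sqrt(np.clip(eigvals[:n_axes], 0.0, None))
    return coords, explained
```

Summing the first three entries of `explained` gives the kind of cumulative percentage reported for the three-axis plots in this chapter.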

Table 9: Mean measure of divergence (MMD) distance matrix obtained from the pairwise comparisons of the five population samples and the other samples used in this study. MMD values are given below the diagonal and their standard deviations (MMDsd) above the diagonal.

AWAm1 AWAm2 ChlMRG CHU DJR GPD GUJm2 GUJsw HAR INM KARa KHO KOHsw KUZ MDA MDK MHR MOL MRT NeoMRG PNT SAP SKH SWT SYDm2 TANm2 TMG TRKd UTHd WAKg WAKs YSFsw
AWAm1 --- 0.005 0.016 0.004 0.014 0.004 0.005 0.004 0.019 0.014 0.005 0.006 0.004 0.021 0.004 0.004 0.004 0.012 0.004 0.011 0.004 0.013 0.027 0.004 0.005 0.005 0.026 0.004 0.004 0.005 0.005 0.004
AWAm2 0.008 --- 0.017 0.004 0.015 0.004 0.005 0.004 0.019 0.014 0.005 0.006 0.004 0.021 0.004 0.004 0.004 0.012 0.004 0.011 0.004 0.014 0.027 0.004 0.005 0.005 0.026 0.044 0.004 0.005 0.005 0.004
ChlMRG 0.087 0.148 --- 0.016 0.026 0.016 0.017 0.016 0.031 0.026 0.017 0.017 0.016 0.033 0.016 0.016 0.016 0.023 0.016 0.023 0.016 0.025 0.039 0.016 0.017 0.017 0.037 0.016 0.016 0.017 0.016 0.016
CHU 0.051 0.069 0.049 --- 0.014 0.003 0.004 0.004 0.018 0.014 0.005 0.005 0.004 0.020 0.004 0.004 0.003 0.011 0.003 0.010 0.003 0.013 0.027 0.004 0.004 0.004 0.025 0.004 0.004 0.004 0.004 0.003
DJR 0.080 0.084 0.058 0.051 --- 0.014 0.015 0.014 0.029 0.024 0.015 0.015 0.014 0.031 0.014 0.014 0.014 0.022 0.014 0.021 0.014 0.023 0.037 0.014 0.015 0.015 0.036 0.014 0.014 0.015 0.014 0.014
GPD 0.052 0.081 0.067 0.004 0.087 --- 0.005 0.004 0.019 0.014 0.005 0.005 0.004 0.021 0.004 0.004 0.004 0.011 0.003 0.011 0.004 0.013 0.027 0.004 0.005 0.005 0.025 0.004 0.004 0.004 0.004 0.004
GUJm2 0.030 0.048 0.083 0.086 0.092 0.097 --- 0.005 0.019 0.015 0.006 0.006 0.005 0.021 0.005 0.005 0.004 0.012 0.004 0.011 0.004 0.014 0.027 0.005 0.005 0.005 0.026 0.005 0.005 0.005 0.005 0.004
GUJsw 0.019 0.032 0.069 0.051 0.127 0.061 0.066 --- 0.019 0.014 0.005 0.005 0.004 0.021 0.004 0.004 0.004 0.011 0.004 0.011 0.004 0.013 0.027 0.004 0.005 0.005 0.026 0.004 0.004 0.004 0.004 0.004
HAR 0.019 0.027 0.054 0.049 0.122 0.059 0.074 -0.016 --- 0.028 0.020 0.020 0.019 0.035 0.019 0.019 0.019 0.026 0.018 0.026 0.019 0.028 0.042 0.019 0.019 0.019 0.040 0.019 0.019 0.019 0.019 0.019
INM -0.004 0.031 0.088 0.065 0.129 0.049 0.028 0.022 0.016 --- 0.015 0.015 0.014 0.030 0.014 0.014 0.014 0.021 0.014 0.021 0.014 0.023 0.036 0.014 0.015 0.015 0.035 0.014 0.014 0.014 0.014 0.014
KARa 0.035 0.083 0.053 0.092 0.103 0.092 0.007 0.063 0.070 0.018 --- 0.006 0.005 0.022 0.005 0.005 0.005 0.012 0.005 0.012 0.005 0.014 0.028 0.005 0.006 0.006 0.026 0.005 0.005 0.005 0.005 0.005
KHO -0.007 0.001 0.072 0.033 0.058 0.046 0.037 0.007 0.011 0.018 0.049 --- 0.005 0.022 0.005 0.005 0.005 0.013 0.005 0.012 0.005 0.014 0.028 0.005 0.006 0.006 0.027 0.005 0.005 0.006 0.006 0.005
KOHsw 0.010 0.020 0.072 0.031 0.095 0.037 0.047 -0.003 -0.007 0.017 0.050 -0.001 --- 0.021 0.004 0.004 0.004 0.011 0.004 0.011 0.004 0.013 0.027 0.004 0.005 0.005 0.026 0.004 0.004 0.004 0.004 0.004
KUZ 0.052 0.044 0.058 0.050 -0.049 0.077 0.065 0.090 0.067 0.080 0.066 0.035 0.064 --- 0.021 0.021 0.021 0.028 0.020 0.027 0.021 0.030 0.043 0.021 0.021 0.021 0.042 0.021 0.021 0.021 0.021 0.021
MDA 0.031 0.049 0.087 0.039 0.127 0.032 0.027 0.047 0.034 0.004 0.046 0.039 0.029 0.098 --- 0.004 0.004 0.011 0.004 0.011 0.004 0.013 0.027 0.004 0.005 0.005 0.026 0.004 0.004 0.005 0.004 0.004
MDK 0.000 0.031 0.095 0.064 0.127 0.054 0.041 0.024 0.043 0.009 0.040 0.005 0.015 0.104 0.037 --- 0.004 0.011 0.004 0.011 0.004 0.013 0.027 0.004 0.005 0.005 0.026 0.004 0.004 0.004 0.004 0.004
MHR 0.016 0.035 0.097 0.053 0.152 0.047 0.045 0.017 0.006 -0.014 0.055 0.021 0.012 0.116 0.006 0.019 --- 0.011 0.003 0.011 0.004 0.013 0.027 0.004 0.004 0.005 0.025 0.004 0.004 0.004 0.004 0.004
MOL 0.055 0.059 0.021 0.040 -0.035 0.079 0.081 0.068 0.058 0.101 0.090 0.026 0.055 -0.035 0.112 0.092 0.113 --- 0.011 0.018 0.011 0.021 0.034 0.011 0.012 0.012 0.033 0.011 0.011 0.012 0.012 0.011
MRT 0.022 0.040 0.103 0.060 0.143 0.046 0.042 0.032 0.009 -0.014 0.045 0.034 0.020 0.096 0.003 0.032 -0.001 0.120 --- 0.010 0.003 0.013 0.027 0.004 0.004 0.004 0.025 0.004 0.004 0.004 0.004 0.003
NeoMRG 0.054 0.123 0.026 0.095 0.175 0.085 0.039 0.067 0.039 0.007 0.018 0.076 0.068 0.148 0.026 0.053 0.030 0.139 0.030 --- 0.011 0.020 0.034 0.011 0.012 0.012 0.032 0.011 0.011 0.011 0.011 0.011
PNT 0.031 0.064 0.063 0.015 0.108 0.005 0.095 0.030 0.015 0.027 0.080 0.029 0.020 0.086 0.031 0.037 0.026 0.084 0.027 0.061 --- 0.013 0.027 0.004 0.004 0.005 0.025 0.004 0.004 0.004 0.004 0.004
SAP 0.096 0.080 0.061 0.064 -0.041 0.106 0.118 0.112 0.101 0.142 0.135 0.061 0.092 -0.051 0.147 0.145 0.157 -0.039 0.154 0.204 0.121 --- 0.036 0.013 0.014 0.014 0.035 0.013 0.013 0.014 0.013 0.013
SKH 0.040 0.018 0.130 0.055 0.067 0.055 0.086 0.052 -0.001 -0.001 0.090 0.041 0.039 -0.026 0.039 0.094 0.037 0.064 0.009 0.113 0.038 0.046 --- 0.027 0.028 0.028 0.0481 0.027 0.027 0.027 0.027 0.027
SWT 0.000 0.029 0.070 0.038 0.098 0.038 0.057 0.014 0.024 0.019 0.056 -0.004 0.008 0.084 0.041 -0.002 0.020 0.062 0.037 0.059 0.018 0.111 0.079 --- 0.005 0.005 0.026 0.004 0.004 0.005 0.004 0.004
SYDm2 0.014 0.019 0.083 0.055 0.056 0.070 -0.003 0.045 0.042 0.023 0.020 0.014 0.025 0.028 0.024 0.033 0.037 0.044 0.035 0.062 0.068 0.075 0.047 0.040 --- 0.005 0.026 0.005 0.005 0.005 0.005 0.004
TANm2 0.013 -0.004 0.126 0.049 0.043 0.065 0.033 0.051 0.047 0.032 0.070 0.008 0.029 0.015 0.037 0.042 0.043 0.038 0.042 0.115 0.063 0.052 0.011 0.039 0.004 --- 0.026 0.005 0.005 0.005 0.005 0.005
TMG -0.014 -0.012 0.086 0.037 0.062 0.041 0.003 0.003 -0.022 -0.051 0.003 -0.007 -0.009 -0.005 -0.001 0.016 -0.009 0.040 -0.016 0.041 0.024 0.069 -0.057 0.021 -0.024 -0.018 --- 0.026 0.026 0.026 0.026 0.025
TRKd 0.037 0.066 0.049 0.048 0.150 0.055 0.051 0.007 0.001 0.016 0.042 0.032 0.012 0.113 0.029 0.036 0.016 0.093 0.029 0.030 0.034 0.148 0.073 0.032 0.042 0.073 -0.001 --- 0.004 0.004 0.004 0.004
UTHd 0.029 0.050 0.075 0.052 0.152 0.056 0.063 0.004 0.005 0.024 0.068 0.020 0.005 0.138 0.037 0.021 0.013 0.092 0.035 0.056 0.036 0.151 0.103 0.017 0.046 0.063 0.018 0.006 --- 0.004 0.004 0.004
WAKg 0.000 0.007 0.109 0.062 0.120 0.066 0.051 0.004 0.005 0.005 0.062 -0.005 0.002 0.084 0.042 0.004 0.012 0.071 0.029 0.077 0.041 0.116 0.052 0.003 0.028 0.024 -0.013 0.028 0.010 --- 0.005 0.004
WAKs 0.010 0.004 0.158 0.089 0.154 0.095 0.068 0.015 0.016 0.025 0.086 0.006 0.015 0.104 0.060 0.016 0.026 0.102 0.043 0.111 0.061 0.145 0.057 0.018 0.041 0.030 -0.003 0.045 0.028 -0.006 --- 0.004
YSFsw 0.008 0.036 0.092 0.063 0.118 0.058 0.081 0.015 0.012 0.025 0.067 0.008 0.015 0.088 0.060 0.014 0.030 0.083 0.036 0.072 0.023 0.125 0.057 0.003 0.063 0.057 0.028 0.047 0.038 0.015 0.024 ---

3.1.3.1. Neighbor-joining Cluster Analysis

Neighbor-joining cluster analysis revealed that the six samples located at the lower right are ethnic groups from peninsular India and show close affinities to one another (Fig. 33).

Figure 33. Neighbor-joining cluster analysis of modern populations of northern Pakistan, peninsular Indian populations and their comparison to the major ethnic groups of Swat and Dir districts, Pakistan.

As expected, the three Dravidian-speaking samples from Andhra Pradesh (CHU, GPD, PNT) exhibit closest affinities to one another, as do the three Indo-Aryan-speaking ethnic group samples from Maharashtra (MDA, MHR, MRT). The remaining samples fall into four aggregates. The first aggregate includes four of the five samples (GUJsw, KOHsw, TRKd, UTHd) collected from Swat and Dir districts, all of which show very close affinities to one another. The sample of Yousafzais from Swat falls into the second aggregate, in which all of the remaining members except one (SWT: Swatis from Mansehra District) are highland samples from either Chitral District (KHO, MDK) or Gilgit-Baltistan (WAKg, WAKs). Apart from YSFsw, the sample with closest affinities to these highland samples is the sample of Kohistanis from Swat (KOHsw). Reassuringly, the two samples of Wakhis (WAKg, WAKs) exhibit closest affinities to one another. The third aggregate has only two “core” members and one peripheral member. The “core” members are the Awans (AWAm2) and Tanolis (TANm2) from Mansehra District, while the peripheral member is the sample of Awans (AWAm1) also collected from Mansehra District. The fourth aggregate includes three members: Gujars (GUJm2) and Syeds (SYDm2) from Mansehra District, as well as the sample of Karlaars (KARa) collected from Abbottabad.

3.1.3.2. Multidimensional Scaling—Kruskal’s Method

Multidimensional scaling into three dimensions with Kruskal’s method was accomplished in seven iterations with a stress value of 0.0702 (a very good fit), accounting for 84.63% of the variance between samples. The three samples of Dravidian-speaking ethnic groups from southeast India are separated in the upper right of the array from all other samples (Fig. 34). The sample of Yousafzais (YSFsw) collected from Swat District is identified as possessing unexpectedly close affinities to the three Dravidian-speaking samples (PNT, CHU, GPD) from Andhra Pradesh in southeast peninsular India. The three ethnic groups from Maharashtra (MDA, MRT, MHR) are found in the lower left, and the Tarklanis from District Dir are interposed between the tribal Madia Gonds (MDA) of eastern Maharashtra and the high-status Marathas (MRT) from western Maharashtra (Pune).
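The stress value reported above for Kruskal’s method measures how far the distances in the fitted configuration depart from a monotone function of the observed dissimilarities. The following is a minimal sketch of Kruskal's stress-1 for a given configuration, with disparities obtained by monotone regression (pool-adjacent-violators). The software used in this study and its handling of ties are not stated here, so the function names and the simple tie handling are assumptions.

```python
import math
from itertools import combinations

def monotone_fit(values):
    """Least-squares non-decreasing fit to a sequence (pool-adjacent-violators)."""
    blocks = []  # each block holds [mean, count]
    for v in values:
        blocks.append([v, 1])
        # Merge backwards while the non-decreasing order is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            mean2, n2 = blocks.pop()
            mean1, n1 = blocks.pop()
            n = n1 + n2
            blocks.append([(mean1 * n1 + mean2 * n2) / n, n])
    fitted = []
    for mean, count in blocks:
        fitted.extend([mean] * count)
    return fitted

def kruskal_stress1(coords, dissim):
    """Stress-1 of a configuration against a symmetric dissimilarity matrix."""
    pairs = list(combinations(range(len(coords)), 2))
    # Order the sample pairs by observed dissimilarity.
    pairs.sort(key=lambda ij: dissim[ij[0]][ij[1]])
    dists = [math.dist(coords[i], coords[j]) for i, j in pairs]
    disparities = monotone_fit(dists)
    num = sum((d - dh) ** 2 for d, dh in zip(dists, disparities))
    den = sum(d * d for d in dists)
    return math.sqrt(num / den)
```

A configuration whose inter-point distances are in exactly the same rank order as the dissimilarities has a stress of zero under this definition; larger values indicate a poorer fit.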

Figure 34. Multidimensional scaling (Kruskal's method) of the major ethnic groups residing in Swat and Dir districts in comparison with other living Pakistani and peninsular Indian ethnic groups.

The sample of Utmankheils (UTHd) from Dir possesses closer affinities to the low-status Mahars (MHR) from western Maharashtra than to the high-status Marathas (MRT), who are found alongside them in western Maharashtra. In the foreground is an array of ethnic group samples from the northern foothills of the Indus Valley. These samples are the members of aggregates three and four described above for the neighbor-joining analysis. The most divergent are the Karlaars from Abbottabad, as well as the Gujars and Syeds, followed by the Tanolis and Awans from Mansehra District, respectively. Their only connection to the remaining samples is a very distant affinity between the two samples of Awans (AWAm2 and AWAm1) from Mansehra District.

The remaining samples may be identified as falling into two aggregates. The first is found in the middle-left of the array and is composed of the Yousafzais and Kohistanis from Swat, as well as the sample of Swatis from Mansehra. The second aggregate is found in the lower left and includes the two Wakhi samples from Gulmit (WAKg) and Sost (WAKs), the inhabitants of Madak Lasht (MDK) and the Khows (KHO) from Chitral District, as well as the sample of Awans (AWAm1) from Mansehra District.

The overall results obtained from Kruskal’s method may be summarized as follows: 1) the sample of Yousafzais from Swat (YSFsw) has close affinities to the Dravidian-speaking ethnic groups (CHU, GPD and PNT) of Andhra Pradesh, India; 2) the Gujar (GUJsw) and Kohistani (KOHsw) samples from Swat are marked by rather close affinities to an array of highland samples from Gilgit-Baltistan (WAKs, WAKg) and Chitral District (KHO, MDK), as well as to the sample of Swatis from Mansehra District; 3) the samples collected from Dir District (UTHd, TRKd) show affinity to west-central peninsular Indians (MRT, MDA, MHR) and apparently possess no affinities to the samples collected from Swat (GUJsw, KOHsw, YSFsw) or to the other samples from Pakistan included in this analysis.

3.1.3.3. Multidimensional Scaling—Guttman’s Method

Multidimensional scaling into three dimensions with Guttman’s method was accomplished in 11 iterations, with a stress value of 0.0906 (an extremely good fit), accounting for 92.05% of the variance between samples. In general, the patterning found in this array (Fig. 35) is similar to that described for MDS with Kruskal’s method.

Figure 35. Multidimensional scaling (Guttman’s method) of the major ethnic groups residing in Swat and Dir districts in comparison with other living Pakistani and peninsular Indian ethnic group samples.

Once again, the three Dravidian-speaking samples from southeast peninsular India (CHU, GPD, PNT) are clearly distinguished from all other samples, and on the left side of the array there is an aggregation of highland samples, which includes the Gujars (GUJsw), Kohistanis (KOHsw) and Yousafzais (YSFsw) from Swat, the inhabitants of Madak Lasht (MDK), the two Wakhi samples (WAKg, WAKs) and the Khows (KHO) of Chitral District. The array also includes three of the foothill samples: Swatis (SWT), Awans (AWAm1) and (possibly) the sample of Awans (AWAm2) from Mansehra District. The Tarklani (TRKd) and Utmankheil (UTHd) samples from Dir are once again closely associated with the three ethnic groups of west-central India (MHR, MRT, MDA), while the samples from Mansehra and Abbottabad districts (TANm2, SYDm2, and KARa) are surprisingly isolated from all of the other Pakistani population samples, except one of the Awan samples from Mansehra District (AWAm2).

3.1.3.4. Principal Coordinate Analysis

The first three principal axes generated by principal coordinate analysis (PCA) capture 83.92% of the total variance among samples (Fig. 36). The first axis accounts for 43.487% of the variance, the second 25.962%, and the third 14.468%. Axis 1 was elongated to reflect the fact that the greatest proportion of variance is accounted for by this axis. This plot shows many similarities to, but also some differences from, the results obtained by cluster analysis and multidimensional scaling. Occupying an isolated position in the lower right of the array, the three Dravidian-speaking ethnic groups from southeast peninsular India (CHU, GPD, PNT) are all identified as possessing closest affinities to one another and are segregated from all other samples. Likewise, the three Indo-Aryan-speaking samples from west-central peninsular India (MDA, MRT, MHR), located at the top of the array, are identified as possessing closer affinities to one another than to any of the other samples. Most of the highland samples aggregate together in the upper left side of the array; this aggregate includes the inhabitants of Madak Lasht (MDK), the Wakhi sample from Gulmit (WAKg), the Khow of Chitral District (KHO) and the Kohistanis (KOHsw) of Swat, along with the foothill samples of Awans (AWAm1) and Swatis (SWT) from Mansehra District. The Wakhis from Sost (WAKs) show no affinity to the other sample of Wakhis from Gulmit (WAKg). However, once again, all of the remaining samples from Mansehra District show closest, albeit distant, affinities to one another and stand apart from all other samples included in this analysis.

Intriguingly, the two samples from Dir, the Tarklanis (TRKd) and Utmankheils (UTHd), show little affinity to one another and act as phenetic “bridges” linking the highly divergent aggregates to the remaining samples. In the case of the former, it is the Indo-Aryan-speaking samples from west-central India (MRT, MHR, MDA), while for the latter it is the Dravidian-speaking samples from southeast India (CHU, GPD, PNT). This pattern suggests that the Tarklanis and Utmankheils do not have any particular affinities to any of the other samples and may be considered to represent phenetic isolates relative to the array of living South Asian ethnic groups encompassed by the current study. Hence, the overall results obtained from the PCA show that, within the present studied population samples from Dir and Swat districts, the Tarklanis and Utmankheils show some close affinities to one another, the Gujars and Kohistanis are marked by little affinity to one another, while the Yousafzai are highly isolated from the rest of the samples. This pattern also shows that the Kohistanis, Gujars and Utmankheils all exhibit moderate affinities to one another, with the Yousafzais more divergent. The Tarklanis appear to share no affinities with the other four sampled ethnic groups from Dir and Swat districts.

Figure 36. Principal coordinate analysis (PCA) of the major ethnic groups residing in Swat and Dir districts in comparison with other living Pakistani and peninsular Indian ethnic groups.

3.1.4. Living Pakistanis Considered in Light of Living Peninsular Indians and Prehistoric Inhabitants of the Indus Valley and South-Central Asia

3.1.4.1. Neighbor-joining Cluster Analysis

Neighbor-joining cluster analysis identifies six sample aggregates. Beginning at the extreme left is an aggregate of eight samples. These eight samples may be further divided into two sub-aggregates and a “bridge” sample (Fig. 37).

Figure 37. Neighbor-joining cluster analysis of the living Pakistani, other living and prehistoric inhabitants of the Indus Valley, South-Central Asia with the major ethnic groups from Swat and Dir districts.

The first sub-aggregate is composed of four samples. These include the two prehistoric samples from Mehrgarh (NeoMRG, ChlMRG) and two living samples, the Karlaars (KARa) from Abbottabad and the Gujars from Mansehra District (GUJm2). The second sub-aggregate is also composed of four samples. These include the three living Indo-Aryan-speaking ethnic groups from Maharashtra (MDA, MRT, MHR), which exhibit closest affinities to one another, followed by the prehistoric sample also from Maharashtra (INM). Occupying a position in between these two sub-aggregates is the sample of Tarklanis from Dir, which is identified as possessing somewhat closer affinities to the west-central peninsular Indian samples than to the living samples of Karlaars (KARa) and Gujars (GUJm2) from the foothills rimming the northern margin of the Indus Valley of Pakistan.

The second aggregate, found in the upper center of the array, is composed of the three living Dravidian-speaking ethnic groups from southeast peninsular India (CHU, GPD, PNT), which show closest affinities to one another and are distantly separated from all of the other samples included in this analysis.

The third aggregate, found in the upper right, includes the prehistoric sample from Harappa (HAR), two of the samples from Swat (KOHsw, GUJsw), and the sample of Utmankheils (UTHd) from Dir District. Intriguingly, it is the Kohistani sample from Swat (KOHsw) that links the members of this aggregate to the rest of the samples included in this analysis. The Utmankheils (UTHd) from Dir are identified as the most divergent, while the prehistoric sample (HAR) is interposed between the Kohistanis (KOHsw) and Gujars (GUJsw) from Swat.

The fourth aggregate, found in the center right of the array, encompasses six samples, all but one of which may be considered highland samples. The Khows (KHO) from Chitral District serve as the sample that links the members of this aggregate to the rest of the samples included in this analysis. The remaining samples are divided into two sub-aggregates. The first is composed of the two Wakhi samples (WAKg, WAKs), which show closest affinities to one another, while the second is composed of the inhabitants of Madak Lasht (MDK), the Yousafzai from Swat (YSFsw), and one of the samples of Awans (AWAm1) from Mansehra District.

The fifth aggregate, found in the lower center, encompasses five samples. These include the prehistoric sample from Timargara (TMG), three of the samples (SYDm2, AWAm2, TANm2) from Mansehra and the late prehistoric sample from Sarai Khola (SKH). Interestingly, it is the prehistoric sample from Timargara (TMG) in Dir that serves to link these samples to the rest of the samples included in this analysis (apart from the members of aggregate six), while it is the prehistoric sample from Sarai Khola (SKH) that serves as the link between the members of this aggregate and the members of aggregate six.

Aggregate six, which includes four samples found in the lower right of the array, is composed entirely of the prehistoric samples from south Central Asia (KUZ, MOL, DJR, SAP). These samples are strongly separated from all of the other samples included in this analysis, including the members of aggregate five. Of note, Sarai Khola (SKH) shows much closer affinities to the Syeds (SYDm2), Awans (AWAm2) and Tanolis (TANm2) from Mansehra District than to either Kuzali (KUZ), the phenetically most proximate of the south Central Asian samples, or the prehistoric Gandharan Grave Culture sample from Timargara (TMG).

Dimension 1 also serves to separate the members of sub-aggregate one (ChlMRG, NeoMRG, KARa, GUJm2) from all other samples in the lower right. Highland samples are found in the upper right and stand apart from the other samples by possessing high scores for Dimension 3. These include not only the two samples from Dir (TRKd, UTHd), which possess closest affinities to one another and to the sample of Yousafzais (YSFsw) from Swat District, but also the two Wakhi samples (WAKg, WAKs), which likewise express closest affinities to one another, the prehistoric sample from Timargara (TMG), the Khows (KHO) from Chitral District, the Kohistanis (KOHsw) from Swat, and the inhabitants of Madak Lasht (MDK), also from Chitral District. This aggregate also includes the sample of Swatis from Mansehra District and the prehistoric sample from Harappa (HAR). The Indo-Aryan-speaking ethnic groups (MRT, MHR, MDA) and the prehistoric inhabitants (INM) of west-central India are found in the lower right of the array.

3.1.4.2. Multidimensional Scaling—Kruskal’s Method

Multidimensional scaling into three dimensions with Kruskal’s method was accomplished in 10 iterations, with a stress value of 0.0686 (very good fit) accounting for 85.14% of the variance between samples. Dimension 1 provides a clear separation of the south Central Asian samples located in the upper left of the array from all other samples, with the late prehistoric sample from Sarai Khola (SKH) occupying the most proximate position to them (Fig. 38).


Figure 38. Multidimensional scaling with Kruskal's method of Smith’s MMD pairwise distances among living Pakistani ethnic groups, living ethnic groups of peninsular India and samples of the prehistoric inhabitants of the Indus Valley, South Central Asia and the major living ethnic groups of Swat and Dir districts.

However, the three living Dravidian-speaking ethnic groups (CHU, GPD, PNT) from southeast India do not exhibit any affinities to one another. Instead, the Gompadhompti Madigas (GPD) occupy a position in the lower right of the array with closest affinities to the high-status Marathas (MRT) of west-central India. The middle-status Pakanati Reddis (PNT) are found in the middle-right, with close affinities to the Gujars (GUJsw) from Swat and to the living inhabitants of Madak Lasht (MDK), while the tribal Chenchus (CHU) possess only distant affinities to the Khows (KHO) of Chitral District and to one of the samples of Awans (AWAm1) from Mansehra District.

3.1.4.3. Multidimensional Scaling—Guttman’s Method

Multidimensional scaling into three dimensions with Guttman’s method (Fig. 39) was accomplished in 11 iterations, with a stress value of 0.0522 (very good fit) accounting for 90.97% of the variance between samples.

Figure 39. Multidimensional scaling (Guttman’s method) of living Pakistani and peninsular Indian ethnic groups, prehistoric inhabitants of the Indus Valley and South Central Asia, as well as samples of the major ethnic groups from Swat and Dir districts.

The pattern is similar, but not identical, to that described above for MDS with Kruskal’s method. Once again, Dimension 1 serves to separate the four prehistoric samples from south Central Asia in the upper left of the array from all other samples (Fig. 39). However, in this case, the link between these samples and all others is the Tanoli (TANm2) sample from Mansehra District, rather than Sarai Khola (SKH), which stands out as an isolate from all other samples.

The members of sub-aggregate one of aggregate one identified in the neighbor-joining cluster tree are found in the lower left, but in this case, it is the Chalcolithic period sample (ChlMRG) from Mehrgarh that is identified as a distant outlier, while the earlier Neolithic (NeoMRG) occupants of this site are identified as possessing close phenetic affinities to the Karlaars (KARa) of Abbottabad District and the Gujars of Mansehra District (GUJm2). Once again, the three living Indo-Aryan-speaking ethnic groups from Maharashtra (MHR, MDA, MRT), as well as the prehistoric inhabitants of this same region of peninsular India (INM), occupy the lower right of the array and possess closest affinities to one another.

The highland samples are more widely dispersed and are divided into two groups. The first, possessing high values for Dimension 3, occupies the upper right of the array and includes the two Wakhi samples (WAKg, WAKs), which exhibit closest affinities to one another, the Yousafzai from Swat (YSFsw), the Utmankheils (UTHd) and Tarklanis (TRKd) from Dir, the prehistoric inhabitants of Timargara and the sample of Swatis (SWT) from Mansehra District. The second aggregate of highland samples, possessing lower values for Dimension 3, includes the Khows (KHO) of Chitral District, one of the samples of Awans (AWAm2) from Mansehra District, the Gujars (GUJsw) from Swat District, the prehistoric sample from Harappa (HAR), and the inhabitants of Madak Lasht (MDK). However, and rather troublingly, this aggregate also includes the tribal Chenchus (CHU) from Andhra Pradesh as well as the two other Dravidian-speaking samples (GPD, PNT) from southeast India.

3.1.4.4. Principal Coordinate Analysis

The first three principal axes generated by principal coordinate analysis (PCA) combine to capture 84.75% of the total variance among samples. This plot shows many similarities to the results obtained by neighbor-joining cluster analysis and by the two versions of multidimensional scaling. Found in the lower right of the array, the four prehistoric south Central Asian samples (SAP, DJR, MOL, KUZ) are once again clearly distinguished from all other samples (Fig. 40). The Neolithic inhabitants of Mehrgarh (NeoMRG), the Karlaars (KARa) from Abbottabad and the Gujars (GUJm2) of Mansehra District occupy an isolated position in the upper left, while the Chalcolithic (ChlMRG) inhabitants of Mehrgarh stand out as an isolate in the upper foreground of the array. Reassuringly, the three living Dravidian-speaking ethnic groups (GPD, CHU, PNT) of southeast peninsular India exhibit closest affinities to one another in the lower left of the array, while the three living Indo-Aryan-speaking ethnic groups (MDA, MRT, MHR) of west-central India, as well as the prehistoric inhabitants (INM) of this region, are tightly grouped together in the center-left and possess secondary affinities to the two samples (TRKd, UTHd) from Dir District.

Once again, the remaining samples can be divided into two aggregates. Members of the first aggregate, separated by higher scores for Axis 2, include the two Wakhi samples (WAKg, WAKs), one of the samples of Awans (AWAm1) from Mansehra District and the prehistoric Gandharan Grave Culture sample from Timargara (TMG). The second aggregate, with lower scores for Axis 2, includes the Khows (KHO) of Chitral District, the Kohistanis (KOHsw) from Swat, the prehistoric sample (HAR) from Harappa, the Swatis from Mansehra District (SWT), as well as the samples of Gujars (GUJsw) and Yousafzais (YSFsw) from Swat.

Figure 40. Principal coordinate analysis (PCA) of living Pakistani and peninsular Indian ethnic groups, samples of prehistoric inhabitants of the Indus Valley and South Central Asia, as well as samples of the major living ethnic groups of Swat and Dir districts.

3.2. Mitochondrial DNA analysis

3.2.1. Genomic DNA isolation

The gDNA isolated from saliva collected from individuals belonging to the five major ethnic groups (Tarklanis, Yousafzais, Kohistanis, Gujars, Utmankheils) of Swat and Dir districts, using a protocol established in our lab, was of good quality and quantity (Fig. 41).

Figure 41. Quality and concentration of gDNA. (A) Agarose gel electrophoresis. (B) Electropherogram.

3.2.2. PCR amplification

The gDNA was used to amplify the control region (HVSI, HVSII) of mtDNA by PCR using a set of primers and the amplified products were separated by electrophoresis on 1.5% agarose gel. The amplified fragments and their corresponding band sizes are given in Figure 42.

Figure 42. Agarose gel electrophoresis photographs of the mtDNA control region. (a) Amplified PCR fragment of HVSI. (b) Amplified PCR fragment of HVSII.

The amplified PCR products (HVSI, HVSII) were cleaned from the agarose gel for sequencing and very good results were obtained (Fig. 43).

Figure 43. Agarose gel electrophoresis photographs (a) eluted PCR products of mtDNA HVSI (b) eluted PCR products of mtDNA HVSII.

Furthermore, the sequencing results of the PCR products obtained from Macrogen, Inc. were converted to FASTA format and searched with BLAST against the National Center for Biotechnology Information (NCBI) database. Mismatched sequences were removed, while the best-matching and most accurate sequences were used for analysis.

3.2.3. MtDNA Haplogroups determination

The mtDNA sequences of the five sampled populations (Gujars=73, Tarklanis=62, Yousafzais=56, Utmankheils=70, and Kohistanis=37) were used to determine their respective mtDNA haplogroups, which are described below.

3.2.3.1 MtDNA Haplogroups determination in the individuals of Gujar population

A total of 73 samples of Gujars were analyzed for the mtDNA control region (HVSI, HVSII). A total of 46 different haplotypes were observed, among which 29 were unique and 17 were shared by more than one individual. Haplogroup M6, occurring in 7% of individuals, was the most frequent. For the Gujar sample, the mtDNA genetic diversity was 0.9223, the power of discrimination was 0.9097 and the random match probability was 0.0903 (Table 10).

Table 10. Statistical analysis of the Gujar sample from Swat

Population statistics

Total number of samples 73

No of haplotypes 46

No of unique haplotypes 29

Random match probability 0.0903

Power of discrimination 0.9097

Genetic diversity 0.9223
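The formulas behind these statistics are not stated in the text. A minimal sketch, assuming the definitions commonly used in forensic mtDNA studies: random match probability as the sum of squared haplotype frequencies, power of discrimination as its complement, and genetic diversity as Nei's unbiased estimator. The values in Table 10 are consistent with these relations (0.0903 + 0.9097 = 1, and 73/72 × 0.9097 ≈ 0.9223). The function name and the haplotype counts below are hypothetical and are not data from this study.

```python
def diversity_statistics(haplotype_counts):
    """Compute match probability, discrimination power and diversity.

    haplotype_counts: number of individuals carrying each distinct haplotype.
    Assumed definitions (not confirmed by the thesis):
      RMP = sum(p_i^2), PD = 1 - RMP, GD = n / (n - 1) * (1 - sum(p_i^2)).
    """
    n = sum(haplotype_counts)
    if n < 2:
        raise ValueError("at least two individuals are required")
    sum_sq = sum((count / n) ** 2 for count in haplotype_counts)
    return {
        "random_match_probability": sum_sq,
        "power_of_discrimination": 1 - sum_sq,
        "genetic_diversity": n / (n - 1) * (1 - sum_sq),
    }


# Hypothetical toy sample: five haplotypes carried by 3, 2, 1, 1 and 1 people.
stats = diversity_statistics([3, 2, 1, 1, 1])
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```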

The regional identification of haplogroups observed among Gujars is as follows: 42% South Asian, 37% West Eurasian, 11% East Eurasian, 4% Southeast Asian, 2.7% East Asian, 1.4% Eastern European, and 1.4% North Asian. The South Asian haplogroups include: M6 (7%), M30 (4%), M37 (4%), M5c (4%), M3 (2.7%), M3a (2.7%), M5 (2.7%), M52a (2.7%), R5a (2.7%), M30d (1.4%), M35b (1.4%), M3c (1.4%), M53 (1.4%), M54 (1.4%), M7c (1.4%) and R22 (1.4%). West Eurasian haplogroups include: H2a (4%), T2b (4%), H14a (2.7%), H5 (2.7%), K1a (2.7%), U7a (2.7%), H1 (1.4%), H1a (1.4%), H1e (1.4%), H3p (1.4%), N (1.4%), T (1.4%), T1a (1.4%), U2a (1.4%), U4a (1.4%), U5b (1.4%), U7 (1.4%), V9a (1.4%) and W3a (1.4%). East Eurasian haplogroups include: B4a (5%), D4b (1.4%), D4e (1.4%), D4g (1.4%) and D4p (1.4%). Southeast Asian haplogroups include: F1 (1.4%), G2b (1.4%) and S (1.4%). The East Asian component consists of haplogroup A (2.7%), the Eastern European component of H7i (1.4%) and the North Asian component of J (1.4%).

The observed haplogroup frequencies, their respective variants, and geographic position are provided in Figure 44 and Table 11.
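The regional percentages quoted for the Gujar sample can be re-derived from the Frequency and HGO columns of Table 11. The sketch below is only an illustration of that tally; the names `GUJAR_ROWS` and `regional_shares` are hypothetical, and the rows are transcribed from Table 11.

```python
from collections import Counter

# (haplogroup, number of individuals, origin code) transcribed from Table 11.
GUJAR_ROWS = [
    ("A", 2, "EA"), ("B4a", 4, "EEA"), ("D4b", 1, "EEA"), ("D4e", 1, "EEA"),
    ("D4g", 1, "EEA"), ("D4p", 1, "EEA"), ("F1", 1, "SEA"), ("G2b", 1, "SEA"),
    ("H1", 1, "WE"), ("H14a", 2, "WE"), ("H1a", 1, "WE"), ("H1e", 1, "WE"),
    ("H2a", 3, "WE"), ("H3p", 1, "WE"), ("H5", 2, "WE"), ("H7i", 1, "EEU"),
    ("J", 1, "NA"), ("K1a", 2, "WE"), ("M3", 2, "SA"), ("M30", 3, "SA"),
    ("M30d", 1, "SA"), ("M35b", 1, "SA"), ("M37", 3, "SA"), ("M3a", 2, "SA"),
    ("M3c", 1, "SA"), ("M5", 2, "SA"), ("M52a", 2, "SA"), ("M53", 1, "SA"),
    ("M54", 1, "SA"), ("M5c", 3, "SA"), ("M6", 5, "SA"), ("M7c", 1, "SA"),
    ("N", 1, "WE"), ("R22", 1, "SA"), ("R5a", 2, "SA"), ("S", 1, "SEA"),
    ("T", 1, "WE"), ("T1a", 1, "WE"), ("T2b", 3, "WE"), ("U2a", 1, "WE"),
    ("U4a", 1, "WE"), ("U5b", 1, "WE"), ("U7", 1, "WE"), ("U7a", 2, "WE"),
    ("V9a", 1, "WE"), ("W3a", 1, "WE"),
]


def regional_shares(rows):
    """Return {origin code: (individuals, percent of sample)}."""
    counts = Counter()
    for _haplogroup, individuals, region in rows:
        counts[region] += individuals
    total = sum(counts.values())
    return {region: (n, 100 * n / total) for region, n in counts.most_common()}


for region, (n, percent) in regional_shares(GUJAR_ROWS).items():
    print(f"{region:>3}: {n:2d} individuals ({percent:.1f}%)")
```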

Figure 44. Graphical representation of mtDNA haplogroup frequencies present in the Gujar sample from District Swat.

Table 11. Haplogroup frequencies and their respective variants found in the Gujar sample from Swat

S.No | Frequency | Variants | Hg | HGO
1 | 2 | A73G, T152C, A234G, A235G, A263G, C309CCT, T310C, AC523d, C560A, T16105C, C16115A, C16223T, C16290T, T16311C, G16319A, T16362C, T16413A | A | EA
2 | 4 | A73G, T195C, A263G, C309CCT, T310C, AC523d, C560A, A16100T, T16189C | B4a | EEA
3 | 1 | C16115A, C16223T, G16274A, A16307T, C16332T, C16355T, T16362C, A16367G, G16384T, A16387G | D4b | EEA
4 | 1 | T152C, T155A, A165T, A178T, C16083A, T16090A, A16100T, C16223T, G16274A, T16362C | D4e | EEA
5 | 1 | C151T, T152C, A263G, A290T, C298T, C308T, C315CC, G323T, C324T, C332T, T334TT, C340T, C349T, C356T | D4g | EEA
6 | 1 | A73G, T195C, C198T, A263G, C309CCT, T310C, T482C, T489C, AC523d, T16097C, G16110A, G16414A | D4p | EEA
7 | 1 | A16183C, T16189C, A16194C, T16195A, C16197G, C16201A, A16203AA, C16205A, T16209A, C16211A, C16214A, A16230T, C16234A, C16236A, T16243G, C16245G, A16258C, A16265C, A16269C, T16271A, T16276A, C16282A, A16293C, C16301A, T16304C, C16306A, T16308A, T16311A, C16313A, A16322T, C16332A, C16339A, A16340T, T16347C, C16358T, T16359C, A16367T, T16368G, T16372A | F1 | SEA
8 | 1 | G62GG, A73G, G184A, A200G, A263G, T310C, T310TTC, G380T, G389T, A396T, G410T, A425T, T430C, C445T, C465T, T16094C, T16117A, T16189C, C16192CT, A16194G, T16195G, A16212T, A16220C, C16223T, C16239G, C16245G, A16258C, A16265T, A16269G, T16276A, A16277C, A16285C, A16293T, C16294T, C16296T, A16305T, A16316G, A16326C, T16330G, A16333T, T16334A, G16346A, T16347C, A16351T, T16362C, A16367G, T16368G | G2b | SEA
9 | 1 | G53GC, A263G, T310C, T310TTC, T16154C, G16156C, C16159T, A16166C, T16189C, A16402C, T16413C | H1 | WE
10 | 2 | G71GG, T72G, A263G, C315CC, A16180C, C16256T, T16352C, G16414A | H14a | WE
11 | 1 | G92A, A111d, G124T, A126T, T131A, G184A, G185T, G187A, A200G, G203T, C231T, A241T, A248T | H1a | WE
12 | 1 | T89TT, C150T, A263G, C264T, A300G, T310A, G316C, C317CC, G329T, C330T, C332T, A339T, A351T, A357T, A360T | H1e | WE
13 | 3 | A263G, C315CC, T16075A, C16223T, C16234T, G16274A | H2a | WE
14 | 1 | G53GC, A263G, C315CC, T16154C, G16156C, A16166C, C16168A, C16169A, C16174A, C16222T, C16242T, G16273A, T16356C | H3p | WE
15 | 2 | A263G, C315CC, G366A, G389A, T408TT, A419C, A426T, A428T, C436A, C438T, A439T, A443T, C445A, A446T, C456T, C462T | H5 | WE
16 | 1 | G124T, T125A, T133A, A178T, A215d, G228A | H7i | EEU
17 | 1 | G184A, A191T, A200G, C222T, A240T, A263G, C295T | J | NA
18 | 2 | T63A, A73G, C150T, T199C, A263G, C315CC, G366T, C371T, G380A, T391A, A395T, A415T, A419T, T424TT, C438T, C441T, A443T, A451T, T452A, T453G, C459T, C462T, C467A, C476T, A478T, G16129A, T16224C, C16301T, A16312C, C16321T, C16328T | K1a | WE
19 | 2 | A73G, T195C, A263G, T310C, T310TTC, G366A, T414G, T482C, T489C, AC523d, A561C | M3 | SA
20 | 3 | A73G, T125C, T127C, T195A, A263G, C309CCT, T310C, T489C, AC523d, C560A, T16075A, A16078T, C16223T, C16234T, G16274A, G16414A | M30 | SA
21 | 1 | A73G, T195A, A263G, C315CC, T489C, AC523d, C16179d, C16223T, A16302G | M30d | SA
22 | 1 | T199C, A263G, A278T, A281T, A291T, C311T, G16096C, T16097C, C16223T, T16304C, T16362C | M35b | SA
23 | 3 | A73G, C151T, T152C, A263G, C309CCT, T310C, T489C, T16075A, C16085A, C16221T, C16223T | M37 | SA
24 | 2 | C194T, T195C, T204C, G260T, A263G, C271T, A272T, C273A, C315CC, A331T, C332A, C349T, A16074G, T16126C, C16192T, C16223T, A16312G | M3a | SA
25 | 1 | A73G, T195C, A263G, C309CCT, T310C, T482C, T489C, AC523d, T16126C, T16154C, C16223T, T16224C | M3c | SA
26 | 2 | G53GT, A73G, T195C, A263G, C309CCCT, T310C, T489C, C560A, G16129A, C16223T | M5 | SA
27 | 2 | A73G, C78CA, G79C, T195C, A237T, A263T, C268T, C269T, A281T, A287T, C16223T, C16266T, A16275G, C16327A, G16390A | M52a | SA
28 | 1 | T16154C, A16164C, A16165C, T16189C, C16192T, C16223T, C16294T, A16316G, T16362C, G16384A, T16386A | M53 | SA
29 | 1 | A73G, A263G, C315CC, T489C, C560A, A16070C, G16129A, C16223T, T16304C, T16325C, G16414A | M54 | SA
30 | 3 | A73G, C150T, A263G, C315CC, T489C, C560A, G16110A, C16111A, C16115A, G16118A, T16126C, G16129A, T16209C, C16223T, T16311C | M5c | SA
31 | 5 | G54GG, A73G, T152C, A214G, A263G, C315CC, C461T, T489C, AC523d, T16140A, T16152C, T16154C, A16155C, A16164C, A16165C, C16174G, C16223T, G16274A, T16323A, A16351T, T16362C, C16376T, G16384T, A16387G, C16404T | M6 | SA
32 | 1 | T16068C, A16070C, A16074G, A16078T, G16110A, T16117A, T16140d, T16144A, C16147T, A16158d, T16161A, A16171T, T16172C, A16182T | M7c | SA
33 | 1 | C16223T | N | WE
34 | 1 | A73G, A188G, C194T, T204C, G207A, A263G, G316C, C317CC, G329A, A339T, T344C, C353G, A358T | R22 | SA
35 | 2 | G85T, G94A, G107T, A111T, G124A, T146C, T152C, A210T, G229T, A263G, G275T, A278T, A281T, C299T, A301T, C307T, C312CT, C312T, T16094C, G16096C, T16097C, C16099T, C16266T, T16304C, T16311C, T16356C, C16393T, C16404T | R5a | SA
36 | 1 | T152C, A263G, C315CC, T455TT, A492T, A515T, CAC516d, C558T, A16066T, C16069T, A16070T, A16074G, C16176T, C16185T, C16223T, A16246T, A16309T, G16346T, C16348T, A16402T | S | SEA
37 | 1 | A87G, A263G, C315CC, T16126C, T16143G, C16151G, C16188T, T16189C, A16194G, A16207G, A16216T, C16234T, T16263C, A16277T, C16279T, A16284T, A16289T, C16294T, G16303T, C16321T, C16327T, C16337T, T16342A, A16343T, C16353T, A16367G, C16382T, T16386G, A16387G, C16393T, C16395G | T | WE
38 | 1 | A73G, T152C, T195C, A263G, C309CCT, T310C, C16174A, C16186T, T16189C, C16294T | T1a | WE
39 | 3 | A73G, A263G, C285T, T310C, G316C, C317CCCC, T321C, C324G, C332A, C343T, C362A, G366A, T372C, A379C, G380A, T383C, T391A, C394T, C404T, C411G, G429C, T430C, A432C, A448T, T460C, A464C, T471C, C473A, T474A, T482C, T489A, A492C, A523C, C527G | T2b | WE
40 | 1 | A73G, A183G, A188G, C194T, G207A, A263G, C315CC, G545A, G16110A, C16115A, G16129A, A16206C, T16362C | U2a | WE
41 | 1 | A73G, T99TT, G124T, T199C, A263G, A270T, C296T, A300T | U4a | WE
42 | 1 | A73G, C150T, A263G, C315CC, C560A, T16093C, T16094C, T16097C, G16110A, C16111A, C16115A, T16131G, C16270T, G16412C | U5b | WE
43 | 1 | C16114A, C16115A, A16309G, A16318T, A16416T | U7 | WE
44 | 2 | A73G, G94T, G97T, G103T, G121T, C151T, T152C, A183T, G187A, A189T, T208A, T233A, A243T, A249T, G260T, A263G, G275A, T16121C, T16126C, T16131G, T16263C, A16269G, T16288C, T16304A, A16309G, T16311A, A16318T, T16359C, T16362C, T16372C, T16396C | U7a | WE
45 | 1 | T119C, A189G, T195C, T204C, G207A, A263G, C315CC, C516T, C530T, T16093C, T16094C, T16097C, T16105C, G16213A, G16274A, G16319A, T16362C, G16390A | V9a | WE
46 | 1 | A73G, G143A, A189G, C194T, T195C, T199C, T204C, G207A, A263G, C315CC, C16223T, C16292T, G16414A | W3a | WE

Hg, haplogroup; HGO, haplogroup origin; EA, East Asian; SEA, Southeast Asian; WE, West Eurasian; EEU, Eastern European; NA, North Asian; SA, South Asian; EEA, East Eurasian.
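The entries in the Variants column follow a compact notation: reference base(s), position, then the observed base(s) or `d`. A minimal parsing sketch, assuming the conventional reading in which `A73G` is a substitution, `C309CCT` an insertion after the reference base and `AC523d` a deletion. This reading is an assumption rather than something the thesis defines, and the names `Variant` and `parse_variant` are hypothetical.

```python
import re
from typing import NamedTuple


class Variant(NamedTuple):
    reference: str   # reference base(s) as written before the position
    position: int    # position number as written in the table
    observed: str    # observed base(s), or "d" for a deletion
    kind: str        # "substitution", "insertion" or "deletion"


_PATTERN = re.compile(r"^([ACGT]+)(\d+)([ACGT]+|d)$")


def parse_variant(text):
    """Parse one variant string such as 'A73G', 'C309CCT' or 'AC523d'."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised variant: {text!r}")
    reference, position, observed = match.groups()
    if observed == "d":
        kind = "deletion"
    elif len(observed) > len(reference):
        kind = "insertion"
    else:
        kind = "substitution"
    return Variant(reference, int(position), observed, kind)


# First variants of row 2 (haplogroup B4a) in Table 11.
for item in "A73G, T195C, A263G, C309CCT, T310C, AC523d".split(","):
    print(parse_variant(item))
```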

The haplotypes of the Gujar sample were assigned to mega-haplogroups. The most common mega-haplogroup was R, which was found in 35 (48%) individuals, followed by M in 33 (45%) individuals and N in 5 (7%) individuals (Fig. 45).

Figure 45. Mega-haplogroup frequencies observed in the sample of Gujars from District Swat through mtDNA control region (HVSI and HVSII).

Comparing the genetic parameters of previously reported populations living in Pakistan with those of the Gujar sample of the present study, we found that the Gujars of Swat have a moderate number of unique haplotypes (29) relative to other sampled Pakistani ethnic groups (Table 12). This moderate number of unique haplotypes is reflected in the genetic diversity of the Gujar sample (0.922), which is lower than that of every other reported ethnic group from Pakistan except the Kalash, whose genetic diversity (0.851) is very low (Table 12).

Table 12. Diversity comparison of the sampled Gujar population from Swat with the other reported ethnic groups of Pakistan.

Parameters | Gujars (present study) | Mak | Sk | Pt | Bl | Br | Hz | Hb | Ks | Ps | Sd | Pk
No of samples | 73 | 100 | 85 | 230 | 39 | 38 | 23 | 44 | 44 | 44 | 23 | 100
No of haplotypes | 46 | 70 | 63 | 157 | 26 | 22 | 21 | 32 | 12 | 22 | 21 | 77
No of unique haplotypes | 29 | 54 | 58 | 128 | 18 | 15 | 19 | 25 | 5 | 12 | 19 | 63
Genetic diversity | 0.922 | 0.968 | 0.957 | 0.993 | 0.974 | 0.952 | 0.992 | 0.980 | 0.851 | 0.950 | 0.992 | 0.992

Mak, Makrani; Sk, Saraiki; Pt, Pakhtuns; Bl, Baluch; Br, Brahui; Hz, Hazara; Hb, Hunza Burusho; Ks, Kalash; Ps, Parsi; Sd, Sindhi; Pk, Pakistan Karachi.

3.2.3.2. MtDNA Haplogroups of the sampled Tarklani population of District Dir

A total of 62 unrelated Tarklanis from District Dir, Pakistan were sampled and analyzed by sequencing HVSI and HVSII. A total of 42 different haplotypes were observed, among which 31 were unique and 11 were shared by more than one individual. The corresponding mtDNA genetic diversity of the Tarklanis was 0.9449, the power of discrimination was 0.9297 and the random match probability was 0.0703 (Table 13).

Table 13. Statistical analysis of Tarklanis of District Dir

Population statistics

Total number of samples 62

No of haplotypes 42

No of unique haplotypes 31

Random match probability 0.0703

Power of discrimination 0.9297

Genetic diversity 0.9449

Out of the 42 haplotypes, 31 (74%) were observed once, 6 (14%) twice, 2 (5%) three times, 2 (5%) four times and 1 (2%) five times. The haplotype frequencies, their respective variants, and geographic association are provided in Table 14.
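The occurrence classes quoted above follow directly from the Frequency column of Table 14. A minimal sketch of that tally; the function name `occurrence_classes` is hypothetical, and the list simply encodes the 42 frequencies as reported in the text (31 ones, six twos, two threes, two fours and one five).

```python
from collections import Counter


def occurrence_classes(frequencies):
    """Map 'times observed' to (number of haplotypes, rounded percent)."""
    classes = Counter(frequencies)
    n_haplotypes = len(frequencies)
    return {
        times: (count, round(100 * count / n_haplotypes))
        for times, count in sorted(classes.items())
    }


# Frequency column of Table 14, one entry per haplotype.
tarklani_frequencies = [1] * 31 + [2] * 6 + [3] * 2 + [4] * 2 + [5]

print("individuals:", sum(tarklani_frequencies))
for times, (count, percent) in occurrence_classes(tarklani_frequencies).items():
    print(f"observed {times}x: {count} haplotypes ({percent}%)")
```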

Table 14. Haplogroup frequencies and their respective variants found among the sampled Tarklanis of Dir District

S.No | Frequency | Variants | Hg | HGO
1 | 1 | T146C, T152C, T195C, T16126C, G16129C, C16223T, T16298C, C16327T | C4b(C4b1) | NA
2 | 1 | G103d, T152C, T195C, A263G, C315CC, G380C | H11a | WE
3 | 1 | A73G, A263G, C315CC, G389C | H1a | WE
4 | 1 | A73G, A263G, A281T, A16070C, G16129A, C16259T | H1a(H1a3a3) | WE
5 | 1 | G62A, A73d, C86T, T152C, A263G, C309CCT, T310C, A16038C, T16046TCG, G16047C | H1c | WE
6 | 1 | T152C, T195C, A263G, C315CC, G329T, G366T, A402T, C404T, G413T, C427T, T430C, A432T, C433T, C434T, C435T, C436T, A439T, C445T, C456T, T474A, C16278T | H36 | WE
7 | 1 | C64T, T195C, A263G, T310C, C315CCCC, T321G, C340A, C362A, G366A, A376C, A379C, G380C, G389A, A16070C, C16072G, A16074G, T16093C, T16126C, T16362C | H57 | WE
8 | 1 | A263G, T310C, T310TTC, C16176T, C16184T, T16357C | H7a(H7a2) | WE
9 | 1 | T16131G, A16166T, C16185A, C16192G, C16205A, G16208A, T16209G, G16213C, A16219G, A16220G, C16223d, A16237G, A16253C, A16258C, C16261G, A16281d, C16286T, C16287T, T16288G, A16300T, C16301d, G16310C, C16313A | H82 | WE
10 | 1 | T152C, T195C, A263G, C315CC, T16090A | HV1b(HV1b3) | WE
11 | 1 | CAA242d, A263G, C295T, T310C, T310TTC, C462T, T489C, A515T, A16038AC, G16039A, G16042T, G16047GC | J1(J1) | NA
12 | 1 | A16210C, C16221G, C16222T, T16224C, C16261T, C16264G, A16265d, C16268T, C16279G, A16280G, C16282G, A16285G | J1b | NA
13 | 1 | T88TA, A263G, C271T, C295T, C315CC, C462T, T489C, A512G, T16126C, G16145A, C16222T, C16261T | J1b(J1b1b) | NA
14 | 1 | A263G, C315CC, G366T, C469T, C476T, T477A, A490T, A512T, C525T, G526T, C534T, A538T, A16074G, T16126C, T16172TT, A16181G, C16190T, G16196GG, T16209C, A16216AA, C16223CC, A16237AA, A16241AA, C16262CC, C16270CC, A16285AA, A16293C, A16293AAC, A16305AA, G16329GG, C16355CC | JT(JT) | NA
15 | 1 | A73G, C150T, T199C, A263G, C315CC, G366T, A374C, C375T, T16093C, G16129A, T16224C, C16278T, T16311C | K1a | WE
16 | 1 | G143A, T204C, T217C, A263G, C309CCT, T310C, T482C, T489C, T16126C, C16223T, C16260T, T16311C | M3a(M3a1) | SA
17 | 1 | A73G, A263G, C315CC, T489C, C511T, A16137G, C16223T, A16289G, C16360T | M65a | SA
18 | 1 | C16223T, T16231C, T16356C, T16362C | M6a(M6a1a) | SA
19 | 1 | A73G, C80CA, T125C, T127C, C128T, T146C, T195C, A263G, T310C, T310TTC, A339C | P3a | Australian
20 | 1 | C16083A, T16090A, T16094C, G16319A | P4a | Oceanian
21 | 1 | T195C, A263G, T310C, T310TTC, C411G, A432C, T460C, T471A, T528G, C16053G, T16093C, T16126C, A16318AA, T16362C, G16391GG, C16404CC | R0a(R0a2) | SA
22 | 1 | G109C, T152C, A263G, T310TTC, T310C, A341T, C362A, G366A, G389A, T391A, A402T, C411G, A415d, A419C, C420T, C427T, C445T, A446d, A448T, T460A, A464C, A470T, A472T, C473d, C476T, A479T, C491T, C16083A, T16097C, G16145A, T16304C, T16311C, T16356C | R5a(R5a2a) | SA
23 | 1 | C122T, T152C, T195C, G228A, A263G, C271G, A272d | R6a | SA
24 | 1 | A73G, C91CAG, G92A, T146C, A263G, C309CCT, T310C, G380C, G389A, C411G, T414A, G429A, C431T, C445T, T452A, A472T, C476A, A16051T, A16066T, A16070C, G16096C, C16099d, A16100T, A16103T, T16126C, T16131G, C16176T, A16177C, C16193T, C16205T, A16206G, C16214T, A16240G, A16241G, C16251A, C16256A, A16258G, C16261G, T16263C, A16265C, C16266T, A16293G, C16294T, C16296T, G16303A, C16313T, A16318G, C16327A, C16328G, A16333C, G16336T, A16340G, C16358A, G16384T, G16391A | T2 | WE
25 | 1 | G94C, G109C, A263G, T310C, T310TTC, T321G, A341T, C362A, G366A, G389A, C404T, A415T, C441T, C447T, A451T, T460C, T471A, T474A, A475T, C476A, C481T, T16126C, G16129C, T16154C, A16206C, A16230G, T16311C | U2a(U2a1a) | SA
26 | 1 | A73G, G81GG, T217C, A263G, T310C, C311CTCC, C327A, G329A, C340T, C353A, C362A, G366T, C369T, C378G, C382A, A16087T, T16093C, G16129C, T16136A, A16137T, T16143A, C16147d, A16149T, C16150G, A16155AC, G16156A, AAA16180d, T16189C, AT16194d, A16200G, C16201G, A16202T, C16223T, C16225A, A16226C, C16228T, A16247T, A16254C, A16265C, G16273A, C16278G, C16279G, A16289G, A16333G, C16339A, G16384C, A16402T | U2e(U2e1) | SA
27 | 1 | G79C, T152C, T217C, A263G, C308CT, C315CC, C340T, G16129C, G16145A, T16189C, A16194C, T16195C, C16197G, C16197CGT, C16201A, C16205A, T16209A, C16211A, C16214A, C16218A, C16228T, C16236A, C16239A, C16242G, T16243G, C16245A, A16258C, A16265C, C16270A, T16276A, C16282A, A16293C, C16296T, C16301A, T16311A, T16315A, A16322T, T16330A, C16332A, C16339A, A16340T, C16344A, C16348A, C16355T, G16361C, T16362C, A16367G, G16384A, T16386A, A16387G, G16391A, C16394T, C16395T | U2e(U2e1e) | SA
28 | 1 | G124T, T127C, T152C, T195C, A263G, G275A, C295T, C296T, A301T | U4a(U4a1b1) | WE
29 | 1 | G124T, C151T, T152C, A263G, T282A, A291T, C315CC, C317T, T321G, G329T, C332T, A339T, A341T, C356A, C362T, A365T, G366T, C371A, A374T, C386T, G389T, A16309G, A16318T | U7a | WE
30 | 1 | G143A, A189G, T192C, C194T, T195C, T196C, T204C, A263G, C315CC, A432T, C458T, T489TT, C494T, C496T, C509T, A521T, C527G, G529T, C530T, C535T, C16111T, C16185T, C16223T, C16262CC, C16268CC, C16286A, C16286CT, C16292T, G16319A, G16319GGC, T16325TT, T16347TT | W4a | WE
31 | 1 | C16192T, C16223T, C16292T | W6 | WE
32 | 2 | G109C, A263G, C315CC, C445A, T16126C, G16145A, C16222T, C16261T | H3p | WE
33 | 2 | T16304C | H5 | WE
34 | 2 | T72A, A73G, A77d, G81A, G103d, G109d, C110A, A123T, G124C, C132A, T135G, A176C, T177C, G187A, A201T, A202C, G203C, T16126C, G16145A, T16172C, C16261T | J1b(J1b1a1) | NA
35 | 2 | A73G, T152C, T195A, A263G, C315CC, A326C, T405A, T16189C | M30b | SA
36 | 2 | G16042T, T16046TCG, G16047C, T16126C, C16294T | T | WE
37 | 2 | G62GG, A73G, T152C, T195C, A263G, C315CC, T16126C, A16163G, C16186T, T16189C, C16294T | T1a(T1a1'3) | WE
38 | 3 | A73G, G109C, G121d, T195A, A218C, G247T, G260T, A263G, C16221T, C16223T, C16234T, T16362C | M30 | SA
39 | 3 | G109T, A193C, T195C, T224A, T236C, A263G, T310C, C311CTCC, G316A, A328C, C330A, T334A, C338T, C16355T | U4a(U4a2a) | WE
40 | 4 | A263G, T310C, T310TTC, G329A, C330A | H2a | WE
41 | 4 | A16051G, C16085T, A16100T, C16112G, A16113G, A16127d, A16132G, C16147d, C16148T, G16204A, A16206C, G16208A, T16209A, A16216G, A16227C, C16228T | U2a | WE
42 | 5 | G97C, T195C, A263G, C315CC, C338A, G366A, C375A, A376C, G380A, T16126C, C16185T, C16223T | M3 | SA

Hg, haplogroup; HGO, haplogroup origin; EA, East Asian; SEA, Southeast Asian; WE, West Eurasian; NA, North Asian; SA, South Asian; EEA, East Eurasian.

When haplogroups are considered by associated region, those of West Eurasia were the most common (54%), followed by South Asian (30%), North Asian (11%), Oceanian (2%) and Australian (2%) haplogroups (Fig. 46).

Figure 46. Distribution of Tarklani haplogroups by origin.

Among the West Eurasian haplogroups, haplotype frequencies are as follows: U2a (8%), H2a (6.5%), U4a (U4a2a) 5.0%, H5 (3.0%), H3p (3.0%), H11a (1.6%), H1a (1.6%), H1a (H1a3a3) 1.6%, H1c (1.6%), H36 (1.6%), H57 (1.6%), H7a (H7a2) 1.6%, H82 (1.6%), HV1b (HV1b3) 1.6%, K1a (1.6%), U4a (U4a1b1) 1.6%, W4a (1.6%), and W6 (1.6%).

South Asian haplogroups include: M3 (8.0%), M30 (5.0%), M30b (3.0%), M3a (M3a1) 1.6%, M65a (1.6%), M6a (M6a1a) 1.6%, R0a (R0a2+195) 1.6%, R5a (R5a2a) 1.6%, R6a (1.6%), U2a (U2a1a) 1.6%, U2e (U2e1) 1.6% and U2e (U2e1e) 1.6% (Fig. 47).


Figure 47. Graphical representation of haplogroup frequencies among the sampled Tarklani individuals from District Dir.

North Asian haplogroups include: J1b (J1b1a1) 3.0%, C4b (C4b1) 1.6%, J1 (1.6%), J1b (1.6%), J1b (J1b1b) 1.6% and JT (1.6%). The remaining West Eurasian haplogroups include: T (3%), T1a (T1a1'3) 3%, T2 (1.6%) and U7a (1.6%). The sole Australian and Oceanian haplogroups are P3a (1.6%) and P4a (1.6%), respectively, as shown in Figure 47.

The haplotypes found among the sampled Tarklani individuals from District Dir were assigned to mega-haplogroups. The most frequent is haplogroup R, which was found in 46 (74%) of individuals, followed by haplogroup M, which was found in 14 individuals (23%), followed by haplogroup N, which was only found in two of the sampled individuals (Fig. 48).

Figure 48. Haplogroup frequencies observed among the sampled Tarklani individuals from District Dir through mtDNA control region (HVSI, HVSII).
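The assignment to mega-haplogroups can be illustrated with a lookup on the first letter of each haplogroup label. This is a sketch, not the author's procedure: it assumes the standard mtDNA phylogeny, in which C, D, E, G, Q and Z fall within M; A, I, S, W, X and Y within N; and B, F, H, HV, J, K, P, T, U and V within R. Following the text, R is counted separately from N although it descends from N. The names below are hypothetical, and the rows are transcribed from Table 14.

```python
from collections import Counter

# First letter of a haplogroup label -> mega-haplogroup (assumed mapping).
MEGA_BY_INITIAL = {}
for mega, initials in {"M": "MCDEGQZ", "N": "NAISWXY", "R": "RBFHJKPTUV"}.items():
    for letter in initials:
        MEGA_BY_INITIAL[letter] = mega


def mega_haplogroup(label):
    """Return 'M', 'N' or 'R' for a haplogroup label, else 'other'."""
    return MEGA_BY_INITIAL.get(label[0].upper(), "other")


# (haplogroup, number of individuals) for the 42 rows of Table 14.
TARKLANI_ROWS = [
    ("C4b", 1), ("H11a", 1), ("H1a", 1), ("H1a", 1), ("H1c", 1), ("H36", 1),
    ("H57", 1), ("H7a", 1), ("H82", 1), ("HV1b", 1), ("J1", 1), ("J1b", 1),
    ("J1b", 1), ("JT", 1), ("K1a", 1), ("M3a", 1), ("M65a", 1), ("M6a", 1),
    ("P3a", 1), ("P4a", 1), ("R0a", 1), ("R5a", 1), ("R6a", 1), ("T2", 1),
    ("U2a", 1), ("U2e", 1), ("U2e", 1), ("U4a", 1), ("U7a", 1), ("W4a", 1),
    ("W6", 1), ("H3p", 2), ("H5", 2), ("J1b", 2), ("M30b", 2), ("T", 2),
    ("T1a", 2), ("M30", 3), ("U4a", 3), ("H2a", 4), ("U2a", 4), ("M3", 5),
]

mega_counts = Counter()
for label, individuals in TARKLANI_ROWS:
    mega_counts[mega_haplogroup(label)] += individuals

total = sum(mega_counts.values())
for mega, n in mega_counts.most_common():
    print(f"{mega}: {n} individuals ({100 * n / total:.0f}%)")
```

Under this mapping the tally matches the counts reported for the Tarklani sample (46, 14 and 2 individuals).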

The genetic parameters of the Tarklanis included in the present study were compared with those of previously reported ethnic groups of Pakistan. The comparative analysis revealed that the Tarklani sample has a moderate number of unique haplotypes (31), which is similar to most other ethnic groups of Pakistan, while the highest number (128) was observed among Pathans, a result that was likely a consequence of the large number of samples considered (Table 15). Furthermore, among the ethnic groups previously reported from Pakistan, the greatest genetic diversity (0.993) was observed in Pathans, followed by Sindhis (0.992), Hazaras (0.992), a mixed ethnicity sample from Karachi (0.992) and the Burusho of Hunza (0.980), while the sample of Tarklanis in the present study had a diversity value of 0.945, as summarized in Table 15.

Table 15. Diversity comparison of the sampled Tarklanis of District Dir with the other reported ethnic groups of Pakistan.

Parameters | Tarklani (present study) | Mak | Sk | Pt | Bl | Br | Hz | Hb | Ks | Ps | Sd | Pk
No of samples | 62 | 100 | 85 | 230 | 39 | 38 | 23 | 44 | 44 | 44 | 23 | 100
No of haplotypes | 42 | 70 | 63 | 157 | 26 | 22 | 21 | 32 | 12 | 22 | 21 | 77
Unique haplotypes | 31 | 54 | 58 | 128 | 18 | 15 | 19 | 25 | 5 | 12 | 19 | 63
Genetic diversity | 0.945 | 0.968 | 0.957 | 0.993 | 0.974 | 0.952 | 0.992 | 0.980 | 0.851 | 0.950 | 0.992 | 0.992

Mak, Makrani; Sk, Saraiki; Pt, Pakhtuns; Bl, Baluch; Br, Brahui; Hz, Hazara; Hb, Hunza Burusho; Ks, Kalash; Ps, Parsi; Sd, Sindhi; Pk, Pakistan Karachi.
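Because the raw number of unique haplotypes grows with sample size, one way to read Table 15 is to divide that number by the number of samples and to rank the groups by genetic diversity. The sketch below does only that; this normalisation is an illustrative assumption and not an analysis performed in the study, and the names `TABLE_15`, `unique_fraction` and `rank_by_diversity` are hypothetical.

```python
# population: (samples, haplotypes, unique haplotypes, genetic diversity),
# transcribed from Table 15.
TABLE_15 = {
    "Tarklani": (62, 42, 31, 0.945),
    "Mak": (100, 70, 54, 0.968),
    "Sk": (85, 63, 58, 0.957),
    "Pt": (230, 157, 128, 0.993),
    "Bl": (39, 26, 18, 0.974),
    "Br": (38, 22, 15, 0.952),
    "Hz": (23, 21, 19, 0.992),
    "Hb": (44, 32, 25, 0.980),
    "Ks": (44, 12, 5, 0.851),
    "Ps": (44, 22, 12, 0.950),
    "Sd": (23, 21, 19, 0.992),
    "Pk": (100, 77, 63, 0.992),
}


def unique_fraction(table):
    """Unique haplotypes per sampled individual, for each population."""
    return {pop: unique / samples for pop, (samples, _, unique, _) in table.items()}


def rank_by_diversity(table):
    """Population codes ordered from lowest to highest genetic diversity."""
    return sorted(table, key=lambda pop: table[pop][3])


fractions = unique_fraction(TABLE_15)
for pop in rank_by_diversity(TABLE_15):
    print(f"{pop:>8}: diversity {TABLE_15[pop][3]:.3f}, "
          f"unique per sample {fractions[pop]:.2f}")
```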

3.2.3.3. MtDNA Haplogroup variation among the Utmankheil of District Dir

Samples were obtained from 70 unrelated Utmankheil individuals from District Dir, Pakistan and analyzed by sequencing HVSI and HVSII. A total of 44 different haplotypes were observed, among which 33 were unique and 11 were shared by more than one individual. The corresponding mtDNA genetic diversity of the Utmankheil was 0.9118, the power of discrimination was 0.8982 and the random match probability was 0.1018 (Table 16).

Table 16. Statistical analysis of sampled Utmankheil population of District Dir

S. No Population statistics

1 Total number of samples 70

2 No of haplotypes 44

3 No of unique haplotypes 33

4 Random match probability 0.1018

5 Power of discrimination 0.8982

6 Genetic diversity 0.9118

Of the 44 haplotypes, 33 (75%) were observed once, 4 (9%) were observed twice, 4 (9%) were observed three times, 2 (5%) were observed four times and one haplotype (2%) was observed in nine individuals. The haplotype frequencies, their respective variants, and geographic association are provided in Table 17.

Table 17. Haplogroup frequencies and their respective variants in the Utmankheil sample from District Dir

S.No | Frequency | Variants | Hg | HGO
1 | 3 | A73G, A263G, C271T, C295T, C315CC, C462T, T489C, A512G, C16069T, C16072G, A16074G, T16126C, G16145A, C16222T, C16261T | J1b (J1b1b) | NA
2 | 1 | A215T, T233A, A249T, C253T, C258G, A263G, T265C, C268T, C269T, A270T, A272C, G275A, C296T, A297T, C298T, T310C, G316C, C317CCC, G329A, A337T, C343T, C362T, G366A, A16074G, C16192T, C16256T, C16270T, T16298C, T16368C | U5a | WE
3 | 1 | C64CG, A73G, C76d, G79C, G109C, C151T, C170T, G171d, A183G, A189T, A191d, A202G, A211T, A214T, T216A, T16093C, C16292T | R30b (R30b2) | SA
4 | 3 | A73G, C150T, T152C, A263G, C295T, C309CCT, T310C, T489C, G564A, C16069T, T16126C, C16193T, G16274A, C16278T | J2b (J2b1a) | SE
5 | 1 | A73G, T89TT, A263G, T310C, T310TTC, G16039A, G16042T, G16047GC, A16417d | H1e (H1e2c) | WE
6 | 2 | G92A, G94A, G107A, G109C, T115A, T119C, C122A, G124T, A126T, G136A, T161C, G163C, G171T, G187A, T204G, T206G, T220G, T226A, G229A, T250G, A263G, A302ACC, T310C, A376C, T407G, C420T, G429A, T460C, C16223T, C16234T | M49 | SEA
7 | 1 | A73G, C150T, T152C, A263G, C315CC, A16247G, A16254G | U5b (U5b2a1a2) | WE
8 | 1 | A73G, A263G, C315CC, T16126C, A16181G, T16209C, A16417d | JT | NA
9 | 3 | G68C, A73G, A232d, T254G, A263G, A278T, A291d, T310C, T310TTC, T321G, C332T, G366A, G380C, G389A, T401A, A415T, C420T, T460C | U4a (U4a2a) | WE
10 | 2 | A73G, T195C, G207A, A263G, C309CCT, T310C, T482C, T489C, G16023GA, G16047GC, T16126C, C16185T, C16223T, A16417d | M3 | SA
11 | 1 | A200G, A263G, C315CC, C16192T, T16243C, T16311C, T16368C | H3x | WE
12 | 1 | G124A, T152C, A153T, T157A, C170d, G171T, G187A, T16086G, T16093C, A16103G, T16126G, C16150G, C16223T, T16298C, C16327T | C4 | SEA
13 | 2 | A200G, A263G, C309CCCT, T310C, G366A, C411G | H1a (H1ah1) | WE
14 | 1 | A73G, A189G, T195C, G207A, A210G, A263G, C315CC, A365AA, A376C, A397AA, A402T, T16172C, C16201T, C16223T, A16265G, T16271C | N1a (N1a3a3) | WE
15 | 1 | A200G, A263G, T310C, G316C, C317CCGG, C320T, C324G, G329A, C330A, C332A, C338A, A341C, C345T, C362A, G366A, A16074G, T16092C, A16183C, T16189C, C16193CC, T16195G, C16214A, T16243G, A16258C, A16265C, T16276A, C16282A, A16293C, C16301A, C16306A, C16313A, A16322T, C16332A, C16339A, T16368G, T16372G, G16384A, T16386A, A16402C, T16413G | H7c (H7c4) | WE
16 | 1 | A263G, T310C, C16176T, C16184T, T16357C | H7a (H7a2) | WE
17 | 1 | C64T, C150T, A263G, C315CC, G366A, T16189C | H57 | WE
18 | 4 | T16094C, A16100T, T16124G, T16126C, T16152G, A16163G, C16186T, T16189C, T16195G, A16293C, C16294T, G16310A, T16315A, C16365G, A16367G, T16386G, G16391GG, A16402C, A16405C, T16413G | T1a | WEA
19 | 1 | A73G, T119C, A189G, T195C, T204C, G207A, A263G, C315CC, C16223T | W1 | WE
20 | 1 | T152C, T195C, A263G, C299A, A302C, C315CC, A373G, T480C, A16183C, T16189C, C16193CC, T16195G, C16201A, C16205A, C16214A, T16224C, C16228T, T16243G, A16258C, G16273A, G16274A, T16276A, C16282A, A16293C, T16297C, T16298C, T16311A, C16313A, A16322T, T16330A, T16334G, A16335G, A16340T, C16344A, T16362C, T16368G, T16372G, G16384A, T16386A, A16402C | HV0 | WE
21 | 1 | A73G, T146C, T152C, T195A, A263G, C309CCT, T310C, T489C, AC523d, G16042T, G16047GC, A16166C | M30c | SA
22 | 1 | T16126C, G16153A, C16294T, C16296T | T2e | WE
23 | 1 | G143A, T152C, G228GA, G228A, A234G, A249d, A263G, C315CC, C16083A, T16090A, C16111T, G16129A, T16304C, T16362C, A16420T | F1c(F1c1a) | SEA
24 | 1 | A73G, A263G, C315CC, A16066T, A16070C, A16120T, A16137G, C16174T, C16176T, T16209A, G16213T | D4h (D4h1) | CA
25 | 1 | T16126C, C16184T, C16223T, A16237G, G16273A, A16283C | M66 | SA
26 | 1 | C151T, T152C, A263G, C296A, A16265T, G16273T, A16280T, A16318C, C16332T, C16344T, T16372G, C16375T, C16376T, C16377T, C16382T, T16386A, C16395T, C16401T, A16405T, C16408A | H1c | WE
27 | 1 | G121T, T125d, G136T, A137T, T142C, T146C, T152C, A153C, T172C, A181T, G203A, A211T, C16266T, C16267T, A16305T, T16311C, G16329T, C16332T, C16339T, A16340T, A16350T | H1t (H1t1a1) | WE
28 | 9 | A73G, C150T, T195A, A263G, T310C, C315CCC, A339C, C356CC, A368AA, C375A, A385AA, A390AA, C16223T, C16234T, A16338T, A16405T | M30 | SA
29 | 1 | A73G, A82AA, G109A, T115C, C145CC, A156d, A175T, A193T, T195C, C16223T, C16234T, G16303T, T16304A, T16308A, A16318T, A16331T, C16344T, A16351T | M3d | SA
30 | 4 | A73G, A263G, C315CC, T471C, T16154C, A16206C, A16230G, T16311C, A16383T, G16384T, A16387T, C16419A | U2a (U2a1a) | WE
31 | 3 | A93G, A95C, T152C, A263G, C16256T, A16265T, C16270T, A16299T, A16300AT, C16313T, G16336T, C16337T, A16340T, A16351T, T16352C, C16365T, C16379T, G16384T, G16388A | H14a | WE
32 | 2 | A263G, C315CC, C456T, A16074G, T16304C | H5 | WE
33 | 1 | C91CA, C151T, T152C, A263G, C296A, T310C, T310TTC, C320T, C324G, A328C, C338A, C362A, G366A, C369A, C375d, A379C, T391A, C394T, A395C, T16126C, T16131G, A16166C, C16193T, A16194T, C16201T, G16204T, A16220C, C16221T, A16226T, C16228T, T16231C, C16239T, A16240T, G16255C, A16258T, C16260T, C16266T, A16272T | H10e (H10e1a) | WE
34 | 1 | A263G, T310C, G316C, C317CCCT, T321G, A326d, G329A, C338A, C343T, G347GG, C362A, G366A, T16090A | R22 | SEA
35 | 1 | A73G, T152C, T195A, A263G, C309CCT, T310C, T489C, AC523d | M30b | SA
36 | 1 | A73G, T199C, A263G, C309CCT, T310C, T489C, T16090A, T16094C, G16129A, C16223T | M33a | SA
37 | 1 | A73G, T125C, T127C, C128T, T146C, T195C, A263G, C309CCT, T310C, T489C, T16172C, A16180T, A16183C, T16189C | M12a | SA
38 | 1 | C96A, G109d, C110A, G143C, T146C, T152C, A153C, A165C, G171T, C182d, C190A, C16115A, T16209C, C16239T, A16318C, G16319A, A16322C, C16327A, C16332A, G16336T, A16349C, A16350C, T16352G, T16362C, A16367T, G16370T | HV1a | WE
39 | 1 | A73G, T152C, T195C, A263G, C309CCT, T310C | B4a (B4a1c3a) | EEA
40 | 1 | A73G, A263G, C315CC, T489C, A16100T, G16129A, T16189C | M1 | SA
41 | 1 | A73G, T152C, T199C, C315CC, T471C, C481T | U2d | SA
42 | 1 | A73G, T217C, A263G, T310C, C315CCCC, C327A, A331C, T16090A, T16093C, T16094C, G16129C | U2e | SA
43 | 1 | G62T, A73G, G81A, G92A, G109C, A116T, G124T, T146C, T152C, C170T, A193T, A234G, T239A, A249T | U2b | SA
44 | 1 | A200G, T204C, A249G, A263G, C309CCT, T310C, T489C | M40a | SA

Hg, haplogroup; HGO, haplogroup origin; SEA, Southeast Asian; WE, West Eurasian; EEA, East Eurasian; NA, North Asian; SA, South Asian; SE, Southern European; CA, Central Asian.

In the present study, West Eurasian haplogroups were identified in 47% of the sampled Utmankheil individuals, followed by South Asian (33%), Southeast Asian (7%), North Asian (6%), Southern European (4%), East Eurasian (1.4%), and Central Asian (1.4%) haplogroups (Fig. 49).

Figure 49. Graph representing haplogroup frequencies among the sampled Utmankheil individuals from District Dir.

The West Eurasian haplogroups include the following haplotypes: T1a (6%), U2a (U2a1a) 6%, J1b (J1b1b) 4%, U4a (U4a2a) 4%, H14a (4%), H1a (H1ah1) 3%, H5 (3%), JT (1.4%), U5a (1.4%), H1e (H1e2c) 1.4%, U5b (U5b2a1a2) 1.4%, H3x (1.4%), N1a (N1a3a3) 1.4%, H7c (H7c4) 1.4%, H7a (H7a2) 1.4%, H57 (1.4%), W1 (1.4%), HV0 (1.4%), T2e (1.4%), H1c (1.4%), H1t (H1t1a1) 1.4%, H10e (H10e1a) 1.4% and HV1a (1.4%).

The South Asian haplogroups include: M30 (13%), M3 (3%), R30b (R30b2) 1.4%, M30c (1.4%), M66 (1.4%), M3d (1.4%), M30b (1.4%), M33a (1.4%), M12a (1.4%), M1 (1.4%), U2d (1.4%), U2e (1.4%), U2b (1.4%) and M40a (1.4%). The Southeast Asian haplogroups include: M49 (3%), C4 (1.4%), F1c (F1c1a) 1.4% and R22 (1.4%). The Southern European component consists of J2b (J2b1a) 4%. The Central Asian component consists of D4h (D4h1) 1.4%, while the East Eurasian component consists of B4a (B4a1c3a) 1.4%. The haplotype frequencies within each haplogroup are given in Figure 49.

The haplotypes of the sampled Utmankheil individuals of District Dir were also assigned to mega-haplogroups. Among the mega-haplogroups, R was the most frequent, being found in 45 (64%) of the sampled individuals, followed by M in 23 individuals (33%) and N in two individuals (3%) (Fig. 50).


Figure 50. The frequency of Mega-haplogroups observed among Utmankheil individuals sampled from District Dir through mtDNA control region (HVSI, HVSII).

The genetic parameters of the Utmankheil sample in the current study were compared with those of previously reported ethnic groups of Pakistan. This comparative analysis revealed that the Utmankheil have a moderate number of unique haplotypes (33), which is consistent with most other ethnic groups of Pakistan, except Pathans, in which the high number of unique haplotypes (128) was likely due to the large number of samples used (Table 18). Furthermore, low genetic diversity (0.9118) was observed among the Utmankheil individuals relative to that observed among previously reported ethnic groups from Pakistan, except for the Kalash, who were found to have the least genetic diversity (0.851) of all the samples considered (Table 18).

Table 18. Diversity comparison of the sampled Utmankheil individuals of Dir District with the other reported ethnic groups of Pakistan.

Parameters | Utmankheil (present study) | Mak | Sk | Pt | Bl | Br | Hz | Hb | Ks | Ps | Sd | Pk
No of samples | 70 | 100 | 85 | 230 | 39 | 38 | 23 | 44 | 44 | 44 | 23 | 100
No of haplotypes | 44 | 70 | 63 | 157 | 26 | 22 | 21 | 32 | 12 | 22 | 21 | 77
Unique haplotypes | 33 | 54 | 58 | 128 | 18 | 15 | 19 | 25 | 5 | 12 | 19 | 63
Genetic diversity | 0.9118 | 0.968 | 0.957 | 0.993 | 0.974 | 0.952 | 0.992 | 0.980 | 0.851 | 0.950 | 0.992 | 0.992

Mak, Makrani; Sk, Saraiki; Pt, Pakhtuns; Bl, Baluch; Br, Brahui; Hz, Hazara; Hb, Hunza Burusho; Ks, Kalash; Ps, Parsi; Sd, Sindhi; Pk, Pakistan Karachi.

3.2.3.4. Haplogroups of the sampled Yousafzai of District Swat

A total of 56 unrelated Yousafzai individuals from District Swat, Pakistan were analyzed by sequencing HVSI and HVSII. The results revealed 39 different haplotypes, among which 30 were unique and nine were shared by more than one individual (Table 19). The corresponding mtDNA genetic diversity of the Yousafzai is 0.9392, the power of discrimination is 0.9237, and the random match probability observed is 0.0763 (Table 19).

Table 19. Statistical analysis of Yousafzai of District Swat

S. No Population statistics

1 Total number of samples 56

2 No of haplotypes 39

3 No of unique haplotypes 30

4 Random match probability 0.0763

5 Power of discrimination 0.9237

6 Genetic diversity 0.9392

Out of the 39 haplotypes, 30 (77%) were observed once, six (15%) were observed twice, one (3%) was observed three times, one (3%) was observed four times and one (3%) was observed six times. The haplotype frequencies, their respective variants, and their associated geographic origins are given in Table 20.

Table 20. Haplogroup frequencies and their respective variants among the sampled Yousafzai individuals of Swat District

S.No | Frequency | Variants | Hg | HGO
1 | 5 | A263G, T310C, C315CCCCC, C320T, C324G, A326C, C330A, C332A, C338A, C340A, C345T, A350C, C362A, G366A, A379C, G380C, T383C, C387CC, G389A, C394T, A402T, A16181T, A16183C, T16189C | B4a | EEA
2 | 6 | T152C, A219G, A235T, A238G, C273T, A274T, G275T, T282A, A284G, A297T, C299G, C16225G, A16237G, T16249G, A16252G, A16254G, G16255C, A16269G, C16270A, G16273C | H2a | WE
3 | 1 | A87G, T146C, T152C, A189G, T204C, A263G, C309CCT, T310C, C324T, A16165C, T16224C, T16311C, T16413G | K2a | WE
4 | 1 | C151T, T152C, T206G, A230AA, A263G, A276AA, T292A, A302AA, C315CC, C320T, G329A, C330A, A331AA, C349CC | R22 | SEA
5 | 2 | A87G, T152C, A263G, C309CCT, T310C, G366A, A464C, T482C, A488d, C494A, C505T, C518A, AC523d, C541A, G16096C, T16097C, A16203AA, C16266T, T16304C, T16311C | R5a | SA
6 | 1 | A87G, A263G, T310C, C315CC, G329A, C332A, C338A, C362A, T372C, A376C, A379C, G380A, T383C, G389A, T391A, C394T, A395G, T398A, T403A, C411G, A419C, G16129A | H27d | WE
7 | 1 | C151T, T152C, A263G, C303A, C315CC, T346C, A16070C | H1a | WE
8 | 1 | T16126C, C16193T, G16274A | J1d | WE
9 | 1 | G103d, C150T, T195A, A16070C, A16309G, A16318T, A16343G, T16362C | U3b | WE
10 | 2 | G16129A, G16213A, T16362C | H7h | WE
11 | 1 | A16120T, T16126C, T16131G, A16164T, A16170C, A16171G, T16178C, C16179A, A16194C, A16203C, G16204C, A16215C, A16216G, T16217G, C16218G, A16219G, A16220d, C16225G, A16226T, A16237G, A16247d, A16252G, C16268d, C16270G | H4a | WE
12 | 1 | A111d, T152C, A263G, A290C, A301G, C303T, C315CC, T16105C, C16108A, C16115A, A16309G, A16318T | U7 | WE
13 | 1 | C96A, A263G, C315CC, C324G, G366A, C375A, A379C, A388G, C394T, A402T, C411G, T414G, T416G, A419C | U2e | SA
14 | 1 | T204C, A234G, A263G, T310C, T310TTC, C371T, G16208A, C16223T | M14 | SA
15 | 1 | G103A, T195C, A263G, A286d, C315CC, T489C, C16223T, T16325C | D4k | CA
16 | 3 | T204C, T217C, A263G, T310C, C311CTCC, G366A, C375A, G389A, C411G, C16083A, C16099T, T16126C, C16223T, T16311C | M3a | SA
17 | 1 | C96A, G103A, A111d, A263G, C315CC | B5b | EEA
18 | 1 | T117A, G121C, G124T, T129C, A219C, T236A, A259C, G260C, A263G, A270G, G275T | U7a | WE
19 | 1 | C194T, G207A, A263G, C315CC, C514A | H3s | WE
20 | 1 | T154C, A263G, C315CC, A376C, G413A, C420T | R8b | WE
21 | 2 | T195A, A263G, C315CC, T489C, G16035C, A16037C, A16041T, G16042T, G16047GC, A16051G, C16223T | M30 | SA
22 | 2 | A263G, C315CC, A432C, A16070C, C16076A, T16086A, C16205T, A16206C, A16212C, A16216G, T16217G | U2a | SA
23 | 2 | A87G, T152C, T199C, T204C, A263G, C309CCCT, T310C, C16147G, T16172C, C16223T, C16248T, C16344T, C16355T | N1a | WE
24 | 1 | C96A, G109d, A111C, T146C, C150T, T152C, T195A, A263G, C315CC, A374T, T403A, T408A, G410C, C418G | M30c | SA
25 | 1 | G109C, C151T, T152C, A263G, C315CC, G16096C, T16097C, T16126C | T2b | WE
26 | 1 | G107GC, T195C, C198T, A263G, C315CC | HV0b | WE
27 | 2 | G109C, T152C, A200G, A263G, C315CC, C362A, C371T, C16234T, A16247G, T16304C, T16325G, G16391T | H5a | WE
28 | 1 | A200G, A263G, C16223T, C16234T, T16362C | M49 | SA
29 | 1 | T192d, T195C, A211T, A215G, C231A, T233A | U4b | WE
30 | 1 | T127C, T146C, T152C, T168C, A202d, T226A, G229A | L0d | AF
31 | 1 | G94C, A111d, T152C | D1j | EA
32 | 1 | C16104T, C16223T, T16231C, C16291T, G16319A | M6a | SA
33 | 1 | T16217C | HV2 | WE
34 | 1 | T88TA, C242T, A263G, C295T, C315CC, C462T, T489C, T16094C, T16126C | J1b | NA
35 | 1 | G92A, G94C, C105A, C112d, G121T, G124T, A176d, A218C, A219C, T220C, A234C, A249T, G255C, G260A, A263G | H3c | WE
36 | 1 | G16110A, C16223T, A16289G | M65a | SA
37 | 1 | C16099T, A16100T, C16179T, G16204A, C16221A | U2c | SA
38 | 1 | T16093C, T16094C, C16099T, A16100T, C16301A, G16303A, T16311C, T16330G, C16344A, T16347G, C16353T, C16358T, A16367C, C16375A, T16386d | H3v | WE
39 | 1 | A73G, A189G, C194T, T195C, T204C, G207A, A263G, T310C, C315CCC, C320T, C324G, A326C, G329A, A335T, C338A, G16129A, C16223T, T16249C, C16292T, C16419A | W+194 | WE

Hg, haplogroup; HGO, haplogroup origin; WE, West Eurasian; EEA, East Eurasian; NA, North Asian; SA, South Asian; SEA, Southeast Asian; AF, African; EA, East Asian; CA, Central Asian.

The Yousafzai haplogroups were grouped by associated geographic origin. Haplogroups associated with West Eurasian populations were the most common at 52%, followed by South Asian haplogroups at 29%, East Eurasian at 11%, and Southeast Asian, North Asian, African, East Asian and Central Asian haplogroups at 1.8% each (Fig. 51).

Figure 51. The distribution of Yousafzai haplogroups among the individuals sampled from District Swat by associated geographic origin.
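The percentages by geographic origin can be retraced from the frequency and HGO columns of Table 20. The following is a minimal sketch, assuming each individual is simply tallied under the origin code of its haplogroup; the names `TABLE20_ROWS` and `origin_shares` are illustrative and not part of the original analysis.

```python
from collections import Counter

# (origin code, number of individuals) for the 39 rows of Table 20, in row order.
TABLE20_ROWS = [
    ("EEA", 5), ("WE", 6), ("WE", 1), ("SEA", 1), ("SA", 2),
    ("WE", 1), ("WE", 1), ("WE", 1), ("WE", 1), ("WE", 2),
    ("WE", 1), ("WE", 1), ("SA", 1), ("SA", 1), ("CA", 1),
    ("SA", 3), ("EEA", 1), ("WE", 1), ("WE", 1), ("WE", 1),
    ("SA", 2), ("SA", 2), ("WE", 2), ("SA", 1), ("WE", 1),
    ("WE", 1), ("WE", 2), ("SA", 1), ("WE", 1), ("AF", 1),
    ("EA", 1), ("SA", 1), ("WE", 1), ("NA", 1), ("WE", 1),
    ("SA", 1), ("SA", 1), ("WE", 1), ("WE", 1),
]


def origin_shares(rows):
    """Return (total individuals, {origin: count}, {origin: percent})."""
    counts = Counter()
    for origin, n in rows:
        counts[origin] += n
    total = sum(counts.values())
    shares = {origin: 100.0 * n / total for origin, n in counts.items()}
    return total, dict(counts), shares


total, counts, shares = origin_shares(TABLE20_ROWS)
# Print origins from most to least frequent.
for origin, n in sorted(counts.items(), key=lambda kv: -kv[1]):
    print(f"{origin:>3}: {n:2d} individuals ({shares[origin]:.1f}%)")
```

Under this tally the 56 individuals split into 29 West Eurasian, 16 South Asian and 6 East Eurasian, with one individual in each of the remaining five categories.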

The West Eurasian haplogroups found among the sampled Yousafzai individuals include H2a (11%), H7h (3.6%), N1a (3.6%) and H5a (3.6%), while the remaining haplogroups, i.e. K2a, HV2, W+194, H3v, H3c, U4b, R8b, HV0b, T2b, H3s, U7a, U7, H4a, U3b, J1d, H1a and H27d, each occurred in only one individual (1.8%). The East Eurasian haplogroups include B4a (9%) and B5b (1.8%). The South Asian haplogroups include M3a (5.4%), R5a (3.6%), U2a (3.6%) and M30 (3.6%), while the remaining haplogroups (U2c, M6a, M49, M65a, M30c, M14, U2e) were each limited to one individual (1.8%). North Asian, East Asian, African, Central Asian and Southeast Asian haplogroups were each observed in only one individual (1.8%) in the Yousafzai sample (Fig. 52).

[Figure 52 panel labels: West Eurasian; East Eurasian; South Asian; North Asian, East Asian, African, Central Asian and Southeast Asian]

Figure 52. The frequencies of mtDNA haplotypes of Yousafzai individuals sampled from District Swat with respect to their associated geographic origins.

The haplotypes of the Yousafzai sample from District Swat were assigned to mega-haplogroups (Fig. 45). Mega-haplogroup R, observed in 40 (71%) of the sampled individuals, was the most frequent, followed by M in 12 (21%), N in 3 (6%) and L in 1 (2%) of the individuals (Fig. 53).

Figure 53. The frequency of Mega-haplogroups observed in Yousafzai individuals from District Swat through mtDNA control regions (HVSI, HVSII).

The genetic parameters of the sampled Yousafzai population were compared with those of the other reported Pakistani ethnic groups. This comparison revealed that the Yousafzai sample from Swat has a moderate number (30) of unique haplotypes, which is consistent with previously reported ethnic groups of Pakistan, except the Pathans, who have the highest number (128) of unique haplotypes among all the ethnic groups compared, including the Yousafzai of the present study (Table 21). The large number of unique haplotypes in the reported Pathan sample is likely a consequence of the large number of samples (230) considered. This high number of unique haplotypes resulted in high genetic diversity among Pathans (0.993), followed closely by Sindhis (0.992), Hazaras (0.992), a mixed-ethnicity sample from Karachi (0.992) and the Burushos of Hunza (0.980), in comparison to the Yousafzai (0.9392) of the present study (Table 21).

Table 21. Genetic diversity of the Yousafzai sample from District Swat in comparison to the other reported ethnic groups of Pakistan.

Parameters | Yousafzai (present study) | Mak | Sk | Pt | Bl | Br | Hz | Hb | Ks | Ps | Sd | Pk
No of samples | 56 | 100 | 85 | 230 | 39 | 38 | 23 | 44 | 44 | 44 | 23 | 100
No of haplotypes | 39 | 70 | 63 | 157 | 26 | 22 | 21 | 32 | 12 | 22 | 21 | 77
Unique haplotypes | 30 | 54 | 58 | 128 | 18 | 15 | 19 | 25 | 5 | 12 | 19 | 63
Genetic diversity | 0.9392 | 0.968 | 0.957 | 0.993 | 0.974 | 0.952 | 0.992 | 0.980 | 0.851 | 0.950 | 0.992 | 0.992

Mak, Makrani; Sk, Saraiki; Pt, Pakhtuns; Bl, Baluch; Br, Brahui; Hz, Hazara; Hb, Hunza Burusho; Ks, Kalash; Ps, Parsi; Sd, Sindhi; Pk, Pakistan Karachi.

3.2.3.5. Haplogroup distribution among the sampled Kohistanis of District Swat

A total of 37 unrelated Kohistani samples from District Swat were sequenced for the mitochondrial HVSI and HVSII regions, and 25 different haplotypes were identified. Of these 25 haplotypes, 15 were unique and 10 were shared by more than one individual. The corresponding mtDNA genetic diversity of this population was 0.9176, the power of discrimination was 0.8928 and the random match probability was 0.1072 (Table 22).

Table 22. Statistical analysis of the Kohistani sample from District Swat

S. No | Population statistics | Value
1 | Total number of samples | 37
2 | No of haplotypes | 25
3 | No of unique haplotypes | 15
4 | Random match probability | 0.1072
5 | Power of discrimination | 0.8928
6 | Genetic diversity | 0.9176

Of the 25 haplotypes, 15 (60%) were observed once, nine (36%) were observed twice and one (4%) was observed four times. The haplotype frequencies and their respective variants are provided in Table 23.
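The summary statistics reported for each sample are all functions of how often each haplotype was observed. The following is a minimal sketch under stated assumptions: random match probability is taken as the sum of squared haplotype frequencies, power of discrimination as its complement, and genetic diversity as Nei's unbiased estimator. These are common textbook definitions rather than formulas stated in this section, so the output is illustrative and is not expected to reproduce the tabulated values, which may rest on a different estimator or software. The function name `diversity_stats` is hypothetical.

```python
def diversity_stats(haplotype_counts):
    """Summarise a list holding the number of individuals per haplotype."""
    n = sum(haplotype_counts)                      # sample size
    rmp = sum((c / n) ** 2 for c in haplotype_counts)
    pd = 1.0 - rmp                                 # power of discrimination (assumed definition)
    nei = pd * n / (n - 1)                         # Nei's unbiased diversity (assumed definition)
    return {
        "samples": n,
        "haplotypes": len(haplotype_counts),
        "unique": sum(1 for c in haplotype_counts if c == 1),
        "random_match_probability": rmp,
        "power_of_discrimination": pd,
        "genetic_diversity": nei,
    }


# Kohistani distribution as described in the text:
# 15 haplotypes seen once, nine seen twice, one seen four times.
kohistani_counts = [1] * 15 + [2] * 9 + [4]
stats = diversity_stats(kohistani_counts)
for name, value in stats.items():
    print(name, value)
```

The counts recover the 37 samples, 25 haplotypes and 15 unique haplotypes listed for the Kohistani sample; the three derived statistics depend on the assumed definitions.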

Table 23. Haplogroup frequencies and the respective variants of the Kohistanis sampled from Swat District

S.No | Frequency | Variants | Hg | HGO
1 | 2 | G16129A, T16131G, C16242T, T16356C, A16417G | HV12b | WE
2 | 1 | T16126C, T16189C | J1c | WE
3 | 1 | G16096C, T16097C, A16098T, C16099A, G16129A, C16174T, A16177T, C16201A, T16209d, G16213A, C16218T | H5c | WE
4 | 2 | T65TT, T72C, A193T, A211T, A215T, A243T, G260T, A263T, C271T, G275T, A278T, A281T, C285T, A286T, C299T, C304T, C16107d, A16109G, T16126A, T16199G, A16215G, C16228T, C16232G, A16233C, A16235C, C16239G, A16241T, C16256A, C16262T, G16273T, A16277T, A16280T, C16282T | H2a | WE
5 | 1 | A16051G, C16056T, C16079T, A16081T, C16083d, T16092C, G16096T, A16098T, C16099d, A16113d, A16116T, G16118T, C16151T | U6a | WE
6 | 1 | G79C, T83A, G92A, A95T, C120G, G121T, T152A, T161C, A175T, A176T, T180G, A183C, C186G, G187A, G203T, G205A, G207T, G16156T, C16205A, G16208T, A16227T, C16282T, A16300T, C16301A, G16303T, T16304C, C16313d, C16364T, T16372A, G16384T, G16388A, T16397TT, C16410T, G16412T, G16414T | H5 | WE
7 | 2 | C16223T, C16292T, T16362G, A16383T, G16384d | W | WE
8 | 1 | A16058T, A16074T, A16078T, G16084T, C16085A, G16089d, A16149T, C16179T, C16193T, C16197T, G16204A, A16206T, A16207T, A16215T, A16219T, A16220T, A16227T, C16228T, A16230d, A16240T, A16241T, C16245A, G16255A, C16256A, C16259T | U4c | WE
9 | 1 | A16051G, T16086C, G16129A, A16206C, C16291T, T16362C | U2a | SA
10 | 1 | G16129A, C16239T, C16339T, A16343T, C16380T, A16387T, A16399T, A16415T, A16420T | H17c | WE
11 | 1 | G16084A, G16096C, T16097C, C16174d, A16200T, A16202T, G16204A, C16211T, C16214T, A16219T, C16236T, A16254T, C16260T | F2e | SEA
12 | 1 | A16070C, T16075A, T16097d, A16100T, G16129A, G16156T, T16161A, A16175T, A16181T, C16193T, A16194T, C16201T, G16204T, A16210T, C16211T, A16216T, C16218T, C16221T, A16226C, A16227T, C16228d, C16239T, A16241T, A16247T, G16255T | HV4a | WE
13 | 1 | G16129A, C16242T, G16273T, A16281T, A16312T, C16313A, A16333T, A16335T, G16336T, C16337T, A16338T, C16339T, G16361T, T16362TT | H1e | WE
14 | 2 | A16070C, G16145A, C16176T, C16223T, C16261T, C16291T, T16311C, A16383T, G16384T | M4 | SA
15 | 1 | T16102A, T16131d, C16225T, C16239T, A16252C, A16258T, C16264T, C16270T | H1b | WE
16 | 1 | C16069T, T16126C, C16218T | J1 | NA
17 | 1 | T16092C, C16188T, T16189C, A16207G, C16234T, A16314T, C16410T | H13a | WE
18 | 1 | A16037C, C16040T, T16050d, C16052A, C16053T, C16056A, C16057A, C16076A, C16083A, C16085G, T16086A, G16129A, T16131A, C16133G, T16161A | H1 | WE
19 | 4 | G54GG, A73G, T152C, A214G, A263G, C315CC, C461T, T489C, AC523d, T16140A, T16152C, T16154C, A16155C, A16164C, A16165C, C16174G, C16223T, G16274A, T16323A, A16351T, T16362C, C16376T, G16384T, A16387G, C16404T | M6 | SA
20 | 2 | A73G, T195C, A263G, T310C, T310TTC, G366A, T414G, T482C, T489C, AC523d, A561C | M3 | SA
21 | 2 | A87G, A263G, C315CC, T16126C, T16143G, C16151G, C16188T, T16189C, A16194G, A16207G, A16216T, C16234T, T16263C, A16277T, C16279T, A16284T, A16289T, C16294T, G16303T, C16321T, C16327T, C16337T, T16342A, A16343T, C16353T, A16367G, C16382T, T16386G, A16387G, C16393T, C16395G | T | WE
22 | 2 | G62GG, A73G, G184A, A200G, A263G, T310C, T310TTC, G380T, G389T, A396T, G410T, A425T, T430C, C445T, C465T, T16094C, T16117A, T16189C, C16192CT, A16194G, T16195G, A16212T, A16220C, C16223T, C16239G, C16245G, A16258C, A16265T, A16269G, T16276A, A16277C, A16285C, A16293T, C16294T, C16296T, A16305T, A16316G, A16326C, T16330G, A16333T, T16334A, G16346A, T16347C, A16351T, T16362C, A16367G, T16368G | G2a | SEA
23 | 2 | T152C, T155A, A165T, A178T, C16083A, T16090A, A16100T, C16223T, G16274A, T16362C | D4e | EEA
24 | 2 | A73G, T195C, C198T, A263G, C309CCT, T310C, T482C, T489C, AC523d, T16097C, G16110A, G16414A | D4p | EEA
25 | 1 | A73G, G143A, A189G, C194T, T195C, T199C, T204C, G207A, A263G, C315CC, C16223T, C16292T, G16414A | W3a | WE

Hg, haplogroup; HGO, haplogroup origin; WE, West Eurasian; EEA, East Eurasian; NA, Northern Asian; SA, South Asian; SEA, South East Asian.

The Kohistani haplogroups were grouped by associated geographic origin. The most common were of West Eurasian derivation at 54%, followed by South Asian at 24%, East Eurasian at 11%, Southeast Asian at 8% and North Asian at 3% of the sampled individuals (Fig. 54).

Figure 54. Haplogroup distribution among the sampled Kohistanis from District Swat by associated geographic region of origin

The most frequent haplogroup in the Kohistani sample was M6, which was observed in 11% of individuals. The West Eurasian haplogroups occurring with lesser frequency include T, HV12b, H2a and W, each at 5.4%. The remaining West Eurasian haplogroups, W3a, U6a, U4c, J1c, HV4a, H5c, H5, H1e, H1b, H17c, H13a and H1, each occurred with a frequency of 2.7%. South Asian haplogroups include M6 at 11%, M4 at 5.4%, M3 at 5.4% and U2a at 2.7%. The East Eurasian haplogroups include D4p and D4e, both of which occur with a frequency of 5.4%. The Southeast Asian haplogroups include G2a at 5.4% and F2e at 2.7%. The North Asian group was represented by a single haplogroup, J1, which occurs with a frequency of 2.7%, as shown in Figure 55.

Figure 55. The frequencies of mtDNA haplotypes of Kohistanis sampled from District Swat with respect to their associated geographic regions of origin.

These haplotypes were further assigned to mega-haplogroups. The most frequent among them was haplogroup R, which occurred in 20 (54%) individuals, followed by M in 14 (38%) and N in 3 (8%) individuals (Fig. 56).

Figure 56. Mega-haplogroup distribution among the sampled Kohistani individuals from District Swat.
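The mega-haplogroup counts follow from the haplogroup labels once each lineage is placed under its ancestral branch. The sketch below assumes the standard mtDNA phylogeny, in which M, N and R are nested macro-lineages outside the African L clades, and assigns each haplogroup by its leading letter. The lookup table `MACRO_BY_LETTER` is an illustration written for this example and is not the assignment tool used in the study.

```python
from collections import Counter

# Leading letter of a haplogroup label -> macro-haplogroup (assumed standard phylogeny).
MACRO_BY_LETTER = {
    "L": "L",
    **{letter: "M" for letter in "MCDEGQZ"},
    **{letter: "N" for letter in "NAIOSWXY"},
    **{letter: "R" for letter in "RBFHJKPTUV"},
}

# Haplogroup -> number of Kohistani individuals, taken from Table 23.
KOHISTANI = {
    "HV12b": 2, "J1c": 1, "H5c": 1, "H2a": 2, "U6a": 1, "H5": 1, "W": 2,
    "U4c": 1, "U2a": 1, "H17c": 1, "F2e": 1, "HV4a": 1, "H1e": 1, "M4": 2,
    "H1b": 1, "J1": 1, "H13a": 1, "H1": 1, "M6": 4, "M3": 2, "T": 2,
    "G2a": 2, "D4e": 2, "D4p": 2, "W3a": 1,
}


def macro_haplogroup(label):
    """Map a haplogroup label such as 'HV12b' to its macro-haplogroup."""
    return MACRO_BY_LETTER[label[0]]


macro_counts = Counter()
for label, n in KOHISTANI.items():
    macro_counts[macro_haplogroup(label)] += n

total = sum(macro_counts.values())
for macro, n in macro_counts.most_common():
    print(f"{macro}: {n} ({100 * n / total:.0f}%)")
```

With the Table 23 frequencies this tally gives 20 individuals under R, 14 under M and 3 under N.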

The genetic parameters of the Kohistani sample of the present study were compared with those of the previously reported ethnic groups of Pakistan. The comparative analysis revealed that the Kohistanis have a moderate number of unique haplotypes (15), which is consistent with the reported ethnic groups of Pakistan (Table 24). Furthermore, low genetic diversity (0.9176) was observed in this Kohistani sample relative to previously reported ethnic groups from Pakistan, with the exception of the Kalash, who have the least genetic diversity (0.851), as shown in Table 24.

Table 24. The genetic diversity of the sampled Kohistani population from District Swat in comparison with the other reported ethnic groups of Pakistan.

Parameters | Kohistanis (present study) | Mak | Sk | Pt | Bl | Br | Hz | Hb | Ks | Ps | Sd | Pk
No of samples | 37 | 100 | 85 | 230 | 39 | 38 | 23 | 44 | 44 | 44 | 23 | 100
No of haplotypes | 25 | 70 | 63 | 157 | 26 | 22 | 21 | 32 | 12 | 22 | 21 | 77
Unique haplotypes | 15 | 54 | 58 | 128 | 18 | 15 | 19 | 25 | 5 | 12 | 19 | 63
Genetic diversity | 0.9176 | 0.968 | 0.957 | 0.993 | 0.974 | 0.952 | 0.992 | 0.980 | 0.851 | 0.950 | 0.992 | 0.992

Mak, Makrani; Sk, Saraiki; Pt, Pakhtuns; Bl, Baluch; Br, Brahui; Hz, Hazara; Hb, Hunza Burusho; Ks, Kalash; Ps, Parsi; Sd, Sindhi; Pk, Pakistan Karachi.

3.2.4. Overall mtDNA haplogroup distribution among the five sampled ethnic groups of Swat and Dir districts

The mtDNA control region sequences of 298 individuals from Swat and Dir districts were collectively analyzed as one geographical population for haplogroup identification. The overall results revealed 126 haplotypes, of which 75 were unique and 51 were shared. The most common haplogroups across all five population samples were H2a and M30, each observed with a frequency of 5.37%, followed by U2a (4.36%), M3 (3.69%), B4a (3.02%), U4a (2.68%), T1a (2.35%), H1a (2.01%), R5a (1.68%), T (1.68%), H17c (1.34%), T2b (1.34%), U2e (1.34%), U7a (1.34%) and W (1.34%), while haplogroups H3p, J2b, K1a, M30b, M37, M49, M5c, N1a and R22 were each found with a frequency of 1.01%. The remaining haplogroups each accounted for less than 1.01%, as described in Table 25. The most frequently observed haplogroup among the Yousafzai and Tarklani individuals is H2a, while M30 is prevalent among Utmankheils and M6 was the most common haplogroup among Gujar and Kohistani individuals from Swat and Dir districts. The frequencies of each of the haplogroups identified in the five sampled populations are summarized in Table 25.

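Table 25. MtDNA haplogroup frequency distribution in the five sampled populations of Dir and Swat Districts.

S.No | Hg | Gujars (N=73) | Yousafzai (N=56) | Tarklani (N=62) | Kohistani (N=37) | Utmankheil (N=70) | Total (%)
1 | A | 2 | 0 | 0 | 0 | 0 | 0.67
2 | B4a | 3 | 5 | 0 | 0 | 1 | 3.02
3 | B5b | 0 | 1 | 0 | 0 | 0 | 0.34
4 | C4 | 0 | 0 | 0 | 0 | 1 | 0.34
5 | C4b | 0 | 0 | 1 | 0 | 0 | 0.34
6 | D1j | 0 | 1 | 0 | 0 | 0 | 0.34
7 | D4b | 1 | 0 | 0 | 0 | 0 | 0.34
8 | D4e | 1 | 0 | 0 | 2 | 0 | 1.01
9 | D4g | 1 | 0 | 0 | 0 | 0 | 0.34
10 | D4h | 0 | 0 | 0 | 0 | 1 | 0.34
11 | D4k | 0 | 1 | 0 | 0 | 0 | 0.34
12 | D4p | 1 | 0 | 0 | 2 | 0 | 1.01
13 | F1 | 1 | 0 | 0 | 0 | 0 | 0.34
14 | F1c | 0 | 0 | 0 | 0 | 1 | 0.34
15 | F2e | 0 | 0 | 0 | 1 | 0 | 0.34
16 | G2a | 0 | 0 | 0 | 2 | 0 | 0.67
17 | G2b | 1 | 0 | 0 | 0 | 0 | 0.34
18 | H1 | 1 | 0 | 0 | 1 | 0 | 0.67
19 | H10e | 0 | 0 | 0 | 0 | 1 | 0.34
20 | H11a | 0 | 0 | 1 | 0 | 0 | 0.34
21 | H13a | 0 | 0 | 0 | 1 | 0 | 0.34
22 | H14a | 2 | 0 | 0 | 0 | 0 | 0.67
23 | H17c | 0 | 0 | 0 | 1 | 3 | 1.34
24 | H1a | 1 | 1 | 2 | 0 | 2 | 2.01
25 | H1b | 0 | 0 | 0 | 1 | 0 | 0.34
26 | H1c | 0 | 0 | 1 | 0 | 1 | 0.67
27 | H1e | 1 | 0 | 0 | 1 | 1 | 1.01
28 | H1t | 0 | 0 | 0 | 0 | 1 | 0.34
29 | H27d | 0 | 1 | 0 | 0 | 0 | 0.34
30 | H2a | 4 | 6 | 4 | 2 | 0 | 5.37
31 | H36 | 0 | 0 | 1 | 0 | 0 | 0.34
32 | H3c | 0 | 1 | 0 | 0 | 0 | 0.34
33 | H3p | 1 | 0 | 2 | 0 | 0 | 1.01
34 | H3s | 0 | 1 | 0 | 0 | 0 | 0.34
35 | H3v | 0 | 1 | 0 | 0 | 0 | 0.34
36 | H3x | 0 | 0 | 0 | 0 | 1 | 0.34
37 | H4a | 0 | 1 | 0 | 0 | 0 | 0.34
38 | H5 | 2 | 0 | 2 | 1 | 2 | 2.35
39 | H57 | 0 | 0 | 1 | 0 | 1 | 0.67
40 | H5a | 0 | 2 | 0 | 0 | 0 | 0.67
41 | H5c | 0 | 0 | 0 | 1 | 0 | 0.34
42 | H7a | 0 | 0 | 1 | 0 | 1 | 0.67
43 | H7c | 0 | 0 | 0 | 0 | 1 | 0.34
44 | H7h | 0 | 2 | 0 | 0 | 0 | 0.67
45 | H7i | 1 | 0 | 0 | 0 | 0 | 0.34
46 | H82 | 0 | 0 | 1 | 0 | 0 | 0.34
47 | HV0 | 0 | 0 | 0 | 0 | 1 | 0.34
48 | HV0b | 0 | 1 | 0 | 0 | 0 | 0.34
49 | HV12b | 0 | 0 | 0 | 2 | 0 | 0.67
50 | HV1a | 0 | 0 | 0 | 0 | 1 | 0.34
51 | HV1b | 0 | 0 | 1 | 0 | 0 | 0.34
52 | HV2 | 0 | 1 | 0 | 0 | 0 | 0.34
53 | HV4a | 0 | 0 | 0 | 1 | 0 | 0.34
54 | J | 1 | 0 | 0 | 0 | 0 | 0.34
55 | J1 | 0 | 0 | 1 | 1 | 0 | 0.67
56 | J1b | 0 | 1 | 4 | 0 | 3 | 2.68
57 | J1c | 0 | 0 | 0 | 1 | 0 | 0.34
58 | J1d | 0 | 1 | 0 | 0 | 0 | 0.34
59 | J2b | 0 | 0 | 0 | 0 | 3 | 1.01
60 | JT | 0 | 0 | 1 | 0 | 1 | 0.67
61 | K1a | 2 | 0 | 1 | 0 | 0 | 1.01
62 | K2a | 0 | 1 | 0 | 0 | 0 | 0.34
63 | L0d | 0 | 1 | 0 | 0 | 0 | 0.34
64 | M1 | 0 | 0 | 0 | 0 | 1 | 0.34
65 | M12a | 0 | 0 | 0 | 0 | 1 | 0.34
66 | M14 | 0 | 1 | 0 | 0 | 0 | 0.34
67 | M3 | 2 | 0 | 5 | 2 | 2 | 3.69
68 | M30 | 3 | 2 | 3 | 0 | 8 | 5.37
69 | M30b | 0 | 0 | 2 | 0 | 1 | 1.01
70 | M30c | 0 | 1 | 0 | 0 | 1 | 0.67
71 | M30d | 1 | 0 | 0 | 0 | 0 | 0.34
72 | M33a | 0 | 0 | 0 | 0 | 1 | 0.34
73 | M35b | 1 | 0 | 0 | 0 | 0 | 0.34
74 | M37 | 3 | 0 | 0 | 0 | 0 | 1.01
75 | M3a | 2 | 3 | 1 | 0 | 0 | 2.01
76 | M3c | 1 | 0 | 0 | 0 | 0 | 0.34
77 | M3d | 0 | 0 | 0 | 0 | 1 | 0.34
78 | M4 | 0 | 0 | 0 | 2 | 0 | 0.67
79 | M40a | 0 | 0 | 0 | 0 | 1 | 0.34
80 | M49 | 0 | 1 | 0 | 0 | 2 | 1.01
81 | M5 | 2 | 0 | 0 | 0 | 0 | 0.67
82 | M52a | 2 | 0 | 0 | 0 | 0 | 0.67
83 | M53 | 1 | 0 | 0 | 0 | 0 | 0.34
84 | M54 | 1 | 0 | 0 | 0 | 0 | 0.34
85 | M5c | 3 | 0 | 0 | 0 | 0 | 1.01
86 | M6 | 5 | 0 | 0 | 4 | 0 | 3.02
87 | M65a | 0 | 1 | 1 | 0 | 0 | 0.67
88 | M66 | 0 | 0 | 0 | 0 | 1 | 0.34
89 | M6a | 0 | 1 | 1 | 0 | 0 | 0.67
90 | M7c | 1 | 0 | 0 | 0 | 0 | 0.34
91 | N | 1 | 0 | 0 | 0 | 0 | 0.34
92 | N1a | 0 | 2 | 0 | 0 | 1 | 1.01
93 | P3a | 0 | 0 | 1 | 0 | 0 | 0.34
94 | P4a | 0 | 0 | 1 | 0 | 0 | 0.34
95 | R0a | 0 | 0 | 1 | 0 | 0 | 0.34
96 | R22 | 1 | 1 | 0 | 0 | 1 | 1.01
97 | R30b | 0 | 0 | 0 | 0 | 1 | 0.34
98 | R5a | 2 | 2 | 1 | 0 | 0 | 1.68
99 | R6a | 0 | 0 | 1 | 0 | 0 | 0.34
100 | R8b | 0 | 1 | 0 | 0 | 0 | 0.34
101 | S | 1 | 0 | 0 | 0 | 0 | 0.34
102 | T | 1 | 0 | 2 | 2 | 0 | 1.68
103 | T1a | 1 | 0 | 2 | 0 | 4 | 2.35
104 | T2 | 0 | 0 | 1 | 0 | 0 | 0.34
105 | T2b | 3 | 1 | 0 | 0 | 0 | 1.34
106 | T2e | 0 | 0 | 0 | 0 | 1 | 0.34
107 | U2a | 1 | 2 | 5 | 1 | 4 | 4.36
108 | U2b | 0 | 0 | 0 | 0 | 1 | 0.34
109 | U2c | 0 | 1 | 0 | 0 | 0 | 0.34
110 | U2d | 0 | 0 | 0 | 0 | 1 | 0.34
111 | U2e | 0 | 1 | 2 | 0 | 1 | 1.34
112 | U3b | 0 | 1 | 0 | 0 | 0 | 0.34
113 | U4a | 1 | 0 | 4 | 0 | 3 | 2.68
114 | U4b | 0 | 1 | 0 | 0 | 0 | 0.34
115 | U4c | 0 | 0 | 0 | 1 | 0 | 0.34
116 | U5a | 0 | 0 | 0 | 0 | 1 | 0.34
117 | U5b | 1 | 0 | 0 | 0 | 1 | 0.67
118 | U6a | 0 | 0 | 0 | 1 | 0 | 0.34
119 | U7 | 1 | 1 | 0 | 0 | 0 | 0.67
120 | U7a | 2 | 1 | 1 | 0 | 0 | 1.34
121 | V9a | 1 | 0 | 0 | 0 | 0 | 0.34
122 | W | 0 | 1 | 0 | 2 | 1 | 1.34
123 | W1 | 0 | 0 | 0 | 0 | 1 | 0.34
124 | W3a | 1 | 0 | 0 | 1 | 0 | 0.67
125 | W4a | 0 | 0 | 1 | 0 | 0 | 0.34
126 | W6 | 0 | 0 | 1 | 0 | 0 | 0.34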
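The Total (%) column of Table 25 can be retraced from the per-population counts. The following is a small sketch, assuming the total is the row sum divided by the combined sample size of 298; the names `SELECTED_ROWS` and `total_percent` are illustrative.

```python
POPULATIONS = ("Gujars", "Yousafzai", "Tarklani", "Kohistani", "Utmankheil")
TOTAL_SAMPLES = 73 + 56 + 62 + 37 + 70  # combined sample size, 298

# A few rows of Table 25: haplogroup -> counts in the order of POPULATIONS.
SELECTED_ROWS = {
    "H2a": (4, 6, 4, 2, 0),
    "M30": (3, 2, 3, 0, 8),
    "U2a": (1, 2, 5, 1, 4),
    "M6": (5, 0, 0, 4, 0),
}


def total_percent(counts, n_total=TOTAL_SAMPLES):
    """Share of all sampled individuals carrying a haplogroup, in percent."""
    return round(100.0 * sum(counts) / n_total, 2)


percentages = {hg: total_percent(counts) for hg, counts in SELECTED_ROWS.items()}
for hg, pct in percentages.items():
    print(f"{hg}: {sum(SELECTED_ROWS[hg])} individuals, {pct}%")
```

For these four rows the computed shares agree with the tabulated totals (5.37%, 5.37%, 4.36% and 3.02%).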

These haplotypes were further analyzed for mega-haplogroup prediction. Mega-haplogroup R, which occurred among 186 (62%) of the sampled individuals, was found to be the most frequent. The other identified mega-haplogroups include "M", which was identified in 96 (32%) individuals, and "N", which occurred in 15 (5%) individuals, while mega-haplogroup "L" was found in only a single individual (0.34%) (Fig. 57).

Figure 57. Mega-haplogroup distribution among members of the five sampled ethnic groups of Swat and Dir districts.

The most frequent mtDNA lineages observed in the data from Dir and Swat Districts were West Eurasian, observed in 133 (45%) individuals, followed by South Asian, identified in 108 (36%) individuals, East Eurasian in 19 (6%), Southeast Asian in 12 (4%), North Asian in 13 (4%), Southern European in 3 (1%), East Asian in 3 (1%), Central Asian in 2 (0.8%), Eastern European in 2 (0.8%), African in 1 (0.34%), Australian in 1 (0.34%) and Oceanian in 1 (0.34%) individuals (Table 26 and Fig. 58).

Table 26. Haplogroups distribution among the individuals of Swat and Dir district by associated geographic region of origin.

S.No Haplogroup Origin Count %age

1 West Eurasian 133 45

2 South Asian 108 36

3 East Eurasian 19 6

4 South East Asian 12 4

5 Northern Asian 13 4

6 Southern Europe 3 1

7 East Asian 3 1

8 Central Asian 2 0.8

9 Eastern Europe 2 0.8

10 African 1 0.34

11 Australian 1 0.34

12 Oceanian 1 0.34


Figure 58. Haplogroup distribution among the individuals of the five sampled populations of Swat and Dir district by associated geographic region of origin.

In the present study, West Eurasian lineages occurred with high frequency, accounting for 54% of Tarklanis and 54% of Kohistanis, followed by Yousafzais at 52%, Utmankheils at 47% and Gujars at 37% (Fig. 59A).

The frequency of South Asian lineages observed among Gujar individuals was 42%, followed by Utmankheils at 33%, Tarklanis at 30%, Yousafzais at 29% and Kohistanis at 24% (Fig. 59B). Consequently, all of the sampled ethnic groups, except Gujars, are highly associated with West Eurasian haplogroups. In contrast, South Asian haplogroups were the most prevalent among Gujar individuals. The East Eurasian lineages vary in frequency from a high of 11% among Gujar, Yousafzai and Kohistani individuals to a low of 1% among Utmankheil individuals, while they were found to be completely absent among the Tarklanis (Fig. 59C).

Figure 59. Distribution of mtDNA lineages among the five ethnic groups sampled from Districts Swat and Dir (A) West Eurasian (B) South Asian (C) East Eurasian

3.2.4.1. Diversity comparison among the five sampled ethnic groups of Swat and Dir districts

The results obtained from the mtDNA control region (HVSI and HVSII) of the five ethnic groups (Gujars, Tarklanis, Utmankheils, Yousafzai and Kohistanis) were compared to each other for genetic variation (Table 27).

Table 27. Genetic diversity in the mtDNA data within the five ethnic groups

mtDNA haplotype (shared) | G | T | U | Y | K | Combined
1 (unique) | 29 | 31 | 33 | 30 | 15 | 138
2 | 10 | 6 | 4 | 6 | 9 | 35
3 | 5 | 2 | 4 | 1 | 0 | 12
4 | 1 | 2 | 2 | 1 | 1 | 7
5 | 1 | 1 | 0 | 0 | 0 | 2
More than 5 | 0 | 0 | 1 | 1 | 0 | 2
Number of haplotypes | 46 | 42 | 44 | 39 | 25 | 196
Sample size | 73 | 62 | 70 | 56 | 37 | 298
Unique haplotypes (proportion) | 0.40 | 0.50 | 0.47 | 0.54 | 0.40 | 0.46
Haplotype diversity | 0.92 | 0.94 | 0.91 | 0.94 | 0.91 | 0.93
Power of discrimination | 0.91 | 0.93 | 0.90 | 0.92 | 0.90 | 0.91
Random match probability | 0.0903 | 0.0703 | 0.1018 | 0.0763 | 0.1072 | 0.0892

G, Gujars; T, Tarklanis; U, Utmankheil; Y, Yousafzai; K, Kohistanis. Power of discrimination, power to differentiate between any two people chosen at random from the population; calculated as the ratio between the number of different haplotypes and the total number of haplotypes.

The proportion of distinct haplotypes relative to sample size was 63% among Gujar and Utmankheil individuals and 67% among Tarklanis and Kohistanis, while the highest proportion (70%) was scored among the individuals of the Yousafzai population (Table 27). The percentage of unique haplotypes varied between the five groups, ranging from a low of 40% among Gujars and Kohistanis to a high of 54% among Yousafzais. The percentage of unique haplotypes among Tarklanis and Utmankheils was 50% and 47%, respectively (Table 27).

As a result of the differences in unique haplotype frequency, haplotype diversity also varied between population samples, ranging from 0.91 among Kohistanis and Utmankheils to 0.94 among Tarklanis and Yousafzais, while Gujar individuals were marked by a moderate level of diversity (0.92) relative to the other samples included in the present study (Table 27). The power of discrimination was highest among Tarklani individuals (0.93), followed by Yousafzais (0.92) and Gujars (0.91), while it was similar among Utmankheils and Kohistanis (0.90) (Table 27). The lowest random match probabilities were observed among Tarklanis (0.0703) and Yousafzais (0.0763), in comparison to Gujars (0.0903), Utmankheils (0.1018) and Kohistanis (0.1072) (Table 27).

3.2.4.2. Mitochondrial Genetic Differentiation

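Based on the pairwise FST values, the highest genetic differentiation was observed between Kohistanis and Gujars (0.1102), and the lowest positive value between Yousafzais and Kohistanis (0.0029); the estimate for Tarklanis and Utmankheils was slightly negative (-0.0055), which is conventionally read as no detectable differentiation (Table 28).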

Table 28: Pairwise Fst genetic distances (below the diagonal) and corresponding p-values (above the diagonal) between five ethnic groups from Swat and Dir districts based on mtDNA sequence data.

 | Yousafzai | Gujar | Tarklani | Kohistani | Utmankheil
Yousafzai | * | 0.0090 | 0.0541 | 0.4505 | 0.2793
Gujar | 0.0398 | * | 0.0000 | 0.0541 | 0.0000
Tarklani | 0.0164 | 0.0621 | * | 0.1532 | 0.6306
Kohistani | 0.0029 | 0.1102 | 0.0415 | * | 0.1712
Utmankheil | 0.0040 | 0.0495 | -0.0055 | 0.0451 | *

* p < 0.05
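The pairwise comparisons in Table 28 can be read off programmatically. The sketch below only transcribes the tabulated FST and p-values and looks up the extremes and the pairs with p < 0.05; it does not recompute FST from sequence data, and the variable names are illustrative.

```python
def pair(a, b):
    """Order-independent key for a pair of populations."""
    return frozenset((a, b))


# Pairwise FST values (below the diagonal of Table 28).
FST = {
    pair("Gujar", "Yousafzai"): 0.0398,
    pair("Tarklani", "Yousafzai"): 0.0164,
    pair("Tarklani", "Gujar"): 0.0621,
    pair("Kohistani", "Yousafzai"): 0.0029,
    pair("Kohistani", "Gujar"): 0.1102,
    pair("Kohistani", "Tarklani"): 0.0415,
    pair("Utmankheil", "Yousafzai"): 0.0040,
    pair("Utmankheil", "Gujar"): 0.0495,
    pair("Utmankheil", "Tarklani"): -0.0055,
    pair("Utmankheil", "Kohistani"): 0.0451,
}

# Corresponding p-values (above the diagonal of Table 28).
P_VALUE = {
    pair("Yousafzai", "Gujar"): 0.0090,
    pair("Yousafzai", "Tarklani"): 0.0541,
    pair("Yousafzai", "Kohistani"): 0.4505,
    pair("Yousafzai", "Utmankheil"): 0.2793,
    pair("Gujar", "Tarklani"): 0.0000,
    pair("Gujar", "Kohistani"): 0.0541,
    pair("Gujar", "Utmankheil"): 0.0000,
    pair("Tarklani", "Kohistani"): 0.1532,
    pair("Tarklani", "Utmankheil"): 0.6306,
    pair("Kohistani", "Utmankheil"): 0.1712,
}

highest = max(FST, key=FST.get)
lowest = min(FST, key=FST.get)
lowest_positive = min((p for p in FST if FST[p] > 0), key=FST.get)
significant = {p for p, value in P_VALUE.items() if value < 0.05}

print("highest:", sorted(highest), FST[highest])
print("lowest:", sorted(lowest), FST[lowest])
print("lowest positive:", sorted(lowest_positive), FST[lowest_positive])
print("p < 0.05:", [sorted(p) for p in significant])
```

In this reading, all three comparisons with p < 0.05 involve the Gujar sample.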

3.2.4.3. Multi-Dimensional Scaling

An MDS plot was constructed from FST statistics calculated from the mtDNA control region sequences of the five sampled ethnic groups from Districts Swat and Dir and of 17 published samples from Pakistan (Bhatti et al., 2016; Bhatti and Aslamkhan, 2015; Siddiqi et al., 2015). Each dot in the MDS plot represents the group centroid for a specific sample (Fig. 60).

Figure 60: MDS plot of the five major ethnic groups of Swat and Dir districts derived from Fst genetic distances.

The results show that the Yousafzai, Utmankheil, Tarklani and Gujars cluster with other neighboring populations of Pakistan, while the Kohistanis are isolated from the rest of the samples.

3.2.4.4. Network Analysis based on mtDNA sequences

Haplotype networks were constructed using the mtDNA control region sequences obtained from Gujar, Utmankheil, Tarklani, Yousafzai and Kohistani individuals sampled from Swat and Dir districts, Pakistan. The orange colored nodes represent the Gujars, who form a star cluster in the network (Fig. 61).

Figure 61. Network analysis of five population samples from Swat and Dir districts based on mtDNA sequence data.

3.3. Y-chromosome STRs and Y-SNPs analysis

3.3.1. Multiplex performance

The 100 unrelated individuals in this study self-identify as members of one of three major ethnic groups: Pathans (Pashtuns), Kohistanis, or Gujars. The Pashtuns are further represented by individuals from three widely recognized paternally-based divisions, Tarklanis, Utmankheils, and Yousafzais. All of the samples were successfully genotyped for 27 Y-STR loci. The amplified products were electrophoresed and electropherograms were generated that could be interpreted easily (Fig. 62).

Figure 62. An example of a typical electropherogram for the Y-STR multiplex reaction used for the studied populations of Swat and Dir districts.

3.3.2. Genetic diversity

Analyses of the 27 Y-STR loci resulted in the identification of 82 haplotypes, of which 75 were unique (Table 29). The frequency of unique haplotypes varied between the five groups, from 100% (20 out of 20) among Kohistanis to 45% (9 out of 20) among Utmankheils. Seven haplotypes were shared between two to six individuals and all but two haplotypes were population-specific (Table 29). The non-population-specific haplotypes were shared between four and five individuals, respectively. These include a haplotype shared by three Yousafzai individuals and one Tarklani individual and a haplotype shared by four Gujars and one Kohistani individual. As a result of the differences in unique haplotype frequencies, haplotype diversity also varied between population samples, ranging from 1.00 among Kohistanis to 0.93 among Utmankheils (Table 29). The overall power of discrimination was relatively high (0.82) for the combined set of individuals but varied widely between the five ethnic groups, from relatively low (0.60) among Utmankheils to high (1.00) among Kohistanis.
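The Y-STR proportions in Table 29 can be retraced from the haplotype counts. The following is a minimal sketch, assuming that the power of discrimination is the number of distinct haplotypes divided by the sample size and that the unique-haplotype value is the number of singleton haplotypes divided by the sample size; both assumptions reproduce the tabulated values. The names `Y_STR_COUNTS` and `proportions` are illustrative.

```python
# Sample -> (sample size, number of haplotypes, unique haplotypes), from Table 29.
Y_STR_COUNTS = {
    "Kohistanis": (20, 20, 20),
    "Gujars": (20, 17, 16),
    "Yousafzais": (20, 17, 15),
    "Tarklanis": (20, 18, 17),
    "Utmankheils": (20, 12, 9),
    "Combined": (100, 82, 75),
}


def proportions(sample_size, n_haplotypes, n_unique):
    """Return (power of discrimination, unique-haplotype proportion)."""
    return round(n_haplotypes / sample_size, 2), round(n_unique / sample_size, 2)


results = {name: proportions(*counts) for name, counts in Y_STR_COUNTS.items()}
for name, (discrimination, unique) in results.items():
    print(f"{name}: discrimination {discrimination:.2f}, unique {unique:.2f}")
```

Because two haplotypes are shared across populations, the combined counts (82 haplotypes, 75 unique) are each two lower than the sums of the per-population counts (84 and 77).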

Information on Y-SNPs was used to assign a Y-chromosomal haplogroup (Larmuseau et al., 2015; Karafet et al., 2008) to each individual. A relatively large number of haplogroups was observed (Table 29) and the spectrum of these haplogroups was consistent with previous studies (Lee et al., 2014; Chennakrishnaiah et al., 2013; Zhao et al., 2009; Karafet et al., 2008; Sengupta et al., 2006; Kivisild et al., 2003; Qamar et al., 2002). However, 85% of the studied individuals carry one of four haplogroups (H1-M69, G2b-M283, L1-M22(xM274), and R1a-M417,Page7) and there are large differences in the frequencies of these four haplogroups between the five samples (Table 29).

Table 29: Genetic diversity in the Y-STR (27 loci) and frequencies of Y-SNP haplogroups within five ethnic groups from Dir and Swat Districts. The values for the Y-SNP haplogroups in brackets represent 90% confidence interval.

Y-STR haplotype Kohistanis Gujars Yousafzais Tarklanis Utmankheils Combined
1 (unique) 20 16 15 17 9 75
2 1 1 2
3 1 1 1 2
4 1 1a
5 1b
6 1 1
Number of haplotypes 20 17 17 18 12 82
Sample size 20 20 20 20 20 100
Unique haplotypes 1.00 0.80 0.75 0.85 0.45 0.75
Haplotype diversity 1.00 0.98 0.99 0.99 0.93 0.99
Power of discrimination 1.00 0.85 0.85 0.90 0.60 0.82

Y-SNP haplogroup Kohistanis Gujars Yousafzais Tarklanis Utmankheils Combined
G2a-L30(xL14, L13, M278) 1 (0.03-0.07) 1 (0.03-0.07) 2 (0.01-0.03)
G2b-M283 2 (0.07-0.13) 16 (0.77-0.83) 18 (0.17-0.19)
H1-M69 10 (0.46-0.54) 1 (0.03-0.07) 11 (0.10-0.12)
J2a-L25 2 (0.07-0.13) 2 (0.01-0.03)
J2b-M241 1 (0.03-0.07) 1 (0.03-0.07) 2 (0.01-0.03)
L1-M22(xM274) 1 (0.03-0.07) 11 (0.51-0.59) 1 (0.03-0.07) 13 (0.12-0.14)
O2-IMS-JST0213554(xP164) 1 (0.03-0.07) 1 (0.006-0.014)
Q-M242(xL56, L57, L214) 2 (0.07-0.13) 2 (0.01-0.03)
Q-L56,L57(xL54) 2 (0.07-0.13) 2 (0.01-0.03)
R-M207,M734,P224,P280(xM173) 1 (0.03-0.07) 2 (0.07-0.13) 1 (0.03-0.07) 4 (0.03-0.05)
R-M734,P224,P280(xM173) 1 (0.03-0.07) 1 (0.006-0.014)
R1a-M417,Page7 5 (0.21-0.29) 3 (0.12-0.18) 16 (0.77-0.83) 16 (0.77-0.83) 2 (0.07-0.13) 42 (0.40-0.44)

a Shared between three Yousafzai and one Tarklani individuals.
b Shared between four Gujar and one Kohistani individuals.

For example, haplogroup G2b-M283 occurs at very high frequency (0.80) among Utmankheils, but is completely absent among members of three of the other four population samples. In contrast, haplogroup R1a-M417,Page7 occurs among members of all five population samples, but its frequencies range from high (0.80) among Yousafzais and Tarklanis to low (0.10) among Utmankheils. Due to small sample sizes, the 90% confidence intervals are relatively large and overlap for some haplogroups (Table 29).

3.3.3. Genetic differentiation

The genetic distances between the five groups, estimated from the Y-STR markers as pairwise FST values, are mostly very large (ranging from 0.148 to 0.596) and highly significant, except for the pairwise comparison between Tarklanis and Yousafzais (Table 30 and Fig. 63).

Table 30. The genetic distances among the five ethnic groups, calculated as pairwise FST values based on 23 of the 27 STR loci. FST values below the diagonal and the corresponding P-values above the diagonal.

Gujars Kohistani Tarklanis Utmankheils Yousafzais

Gujars - 0.000±0.0005* 0.000±0.0005* 0.000±0.0005* 0.001±0.0005*

Kohistani 0.148 - 0.000±0.0005* 0.000±0.000* 0.000±0.0005*

Tarklanis 0.393 0.264 - 0.000±0.0005* 0.265±0.005

Utmankheils 0.508 0.445 0.596 - 0.000±0.0005*

Yousafzais 0.352 0.231 0.008 0.550 -

* Significant at the 0.05 significance level with correction for multiple testing (0.05/10 = 0.005)
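The corrected threshold in the footnote of Table 30 follows from the number of pairwise comparisons among five groups. The sketch below assumes a plain Bonferroni correction, as the footnote's arithmetic (0.05/10) suggests, and applies it to the central p-values transcribed from the table; the variable names are illustrative.

```python
from math import comb

N_GROUPS = 5
ALPHA = 0.05

# Central p-values from above the diagonal of Table 30 (uncertainty terms omitted).
P_VALUES = {
    ("Gujars", "Kohistani"): 0.000,
    ("Gujars", "Tarklanis"): 0.000,
    ("Gujars", "Utmankheils"): 0.000,
    ("Gujars", "Yousafzais"): 0.001,
    ("Kohistani", "Tarklanis"): 0.000,
    ("Kohistani", "Utmankheils"): 0.000,
    ("Kohistani", "Yousafzais"): 0.000,
    ("Tarklanis", "Utmankheils"): 0.000,
    ("Tarklanis", "Yousafzais"): 0.265,
    ("Utmankheils", "Yousafzais"): 0.000,
}

n_tests = comb(N_GROUPS, 2)          # number of unordered pairs of groups
threshold = ALPHA / n_tests          # Bonferroni-corrected significance level
not_significant = [p for p, value in P_VALUES.items() if value >= threshold]

print(f"{n_tests} comparisons, corrected threshold {threshold:.3f}")
print("not significant:", not_significant)
```

Only the Tarklani and Yousafzai comparison (p = 0.265) fails to reach the corrected threshold.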

Despite being considered different ethnic subgroups of Pashtuns, members of these two groups are not significantly different from each other genetically (FST = 0.008, p = 0.265).
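The significance threshold in Table 30 follows from a Bonferroni correction: five groups give 5 × 4 / 2 = 10 pairwise comparisons, so the nominal level of 0.05 is divided by 10. The sketch below applies that threshold to the P-values transcribed from the table. It is an illustration only; the function name `bonferroni_flags` is hypothetical and the standard errors of the P-values are ignored.

```python
# P-values transcribed from the upper triangle of Table 30.
p_values = {
    ("Gujars", "Kohistanis"): 0.000,
    ("Gujars", "Tarklanis"): 0.000,
    ("Gujars", "Utmankheils"): 0.000,
    ("Gujars", "Yousafzais"): 0.001,
    ("Kohistanis", "Tarklanis"): 0.000,
    ("Kohistanis", "Utmankheils"): 0.000,
    ("Kohistanis", "Yousafzais"): 0.000,
    ("Tarklanis", "Utmankheils"): 0.000,
    ("Tarklanis", "Yousafzais"): 0.265,
    ("Utmankheils", "Yousafzais"): 0.000,
}


def bonferroni_flags(p_values, n_groups, alpha=0.05):
    """Return the corrected threshold and a significance flag per pair."""
    n_tests = n_groups * (n_groups - 1) // 2  # number of pairwise comparisons
    threshold = alpha / n_tests
    flags = {pair: p < threshold for pair, p in p_values.items()}
    return threshold, flags


threshold, flags = bonferroni_flags(p_values, n_groups=5)
not_significant = [pair for pair, significant in flags.items() if not significant]
```

With the tabulated values, the only pair that fails the corrected threshold is Tarklanis versus Yousafzais.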

Multi-dimensional scaling (MDS) of the pairwise genetic distances, estimated as FST (27 Y-STR loci), was performed for the five samples in this study and yielded a stress value of 1.857914e-16. The Tarklanis and Yousafzais cluster together within the MDS plot, while the Utmankheils are isolated from the rest of the samples and occupy a position in the top right of the plot (Fig. 63).
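The pattern seen in the MDS plot can be anticipated directly from the distance matrix. The sketch below is not an MDS; it is a simple summary, under the assumption that the FST values in Table 30 are the distances underlying the plot, of each group's nearest neighbour and its mean distance to the other four groups. All variable names are illustrative.

```python
# Pairwise FST values transcribed from the lower triangle of Table 30.
fst = {
    ("Gujars", "Kohistanis"): 0.148,
    ("Gujars", "Tarklanis"): 0.393,
    ("Gujars", "Utmankheils"): 0.508,
    ("Gujars", "Yousafzais"): 0.352,
    ("Kohistanis", "Tarklanis"): 0.264,
    ("Kohistanis", "Utmankheils"): 0.445,
    ("Kohistanis", "Yousafzais"): 0.231,
    ("Tarklanis", "Utmankheils"): 0.596,
    ("Tarklanis", "Yousafzais"): 0.008,
    ("Utmankheils", "Yousafzais"): 0.550,
}


def distance(a, b):
    """Look up the symmetric distance between two groups."""
    if a == b:
        return 0.0
    return fst[(a, b)] if (a, b) in fst else fst[(b, a)]


groups = sorted({group for pair in fst for group in pair})

# Nearest neighbour of each group.
nearest = {
    g: min((h for h in groups if h != g), key=lambda h: distance(g, h))
    for g in groups
}

# Mean distance of each group to all other groups.
mean_distance = {
    g: sum(distance(g, h) for h in groups if h != g) / (len(groups) - 1)
    for g in groups
}
most_isolated = max(mean_distance, key=mean_distance.get)
```

Tarklanis and Yousafzais are each other's nearest neighbour, and the Utmankheils have the largest mean distance to the other groups, which is consistent with their positions in Fig. 63.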

Figure 63. Multi Dimensional Scaling (MDS) derived for the five major ethnic groups of Swat and Dir districts.

The genetic structure is also evident in the median joining network of Y-STR haplotypes, in which four distinct groups may be discerned; these are mainly explained by the haplogroup assignment of the individuals (Fig. 64 and Table 29).

Figure 64. Median joining network based on the Y-STR haplotypes (23 loci) of the five population samples. The circle sizes indicate the number of individuals with shared Y-STR haplotypes (smallest circles = one individual). The lengths of the connecting branches indicate the number of mutational steps separating the haplotypes (shortest branch lengths = one mutational step).

Members of the Tarklani and Yousafzai subgroups of Pashtuns are mostly found together, being separated by only a few mutational steps. This is in contrast to the Utmankheils and Gujars who, despite some outliers, form distinct groups separated by a large number of mutational steps. There are no shared haplotypes within the Kohistani group; hence they appear more scattered in the network. Nevertheless, the majority of haplotypes among Kohistanis are still found close together, in relative proximity to the Tarklani/Yousafzai aggregate (Fig. 64).
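The notion of "mutational steps" between two Y-STR haplotypes can be illustrated with a short sketch. It assumes a strict single-step mutation model in which each difference of one repeat unit at a locus counts as one step; network software may weight loci differently, so this is not necessarily the distance used to draw Fig. 64. The function name and the repeat counts below are hypothetical and are not data from this study.

```python
def mutational_steps(hap_a, hap_b):
    """Sum of absolute repeat-count differences across shared loci."""
    if hap_a.keys() != hap_b.keys():
        raise ValueError("haplotypes must be typed at the same loci")
    return sum(abs(hap_a[locus] - hap_b[locus]) for locus in hap_a)


# Hypothetical repeat counts at three Y-STR loci (invented for illustration).
hap_1 = {"DYS19": 15, "DYS390": 24, "DYS391": 10}
hap_2 = {"DYS19": 15, "DYS390": 25, "DYS391": 10}
hap_3 = {"DYS19": 13, "DYS390": 22, "DYS391": 11}

close_pair = mutational_steps(hap_1, hap_2)    # differ at one locus by one repeat
distant_pair = mutational_steps(hap_1, hap_3)  # differ at all three loci
```

In a network, haplotypes such as `hap_1` and `hap_2` would sit on adjacent nodes, whereas `hap_3` would be joined to them by a longer branch.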

3.3.4. Genetics, ethnicity and geography

We included population samples from a wider geographic range to examine the genetic variation in a broader context, but limited the data set to 22 Y-STR loci for the worldwide data set and 10 Y-STR loci for the closer look at the Indo-Pakistani sub-continent and Southwest Asia (Table 6). In the AMOVA analysis, c. 90% of the genetic variation among the 38 population samples from the Indo-Pakistani sub-continent and Southwest Asia occurs within populations (Table 31A).

Table 31A. AMOVA results when population samples are grouped based on country of origin

Source of variation              d.f.  Sum of squares  Variance component  Percentage of variance
Among groups                     3     86.835          0.04903 (Va)        1.53
Among populations within groups  34    531.079         0.25235 (Vb)        7.71
Within populations               1959  5747.079        2.93368 (Vc)        90.75
Total                            1996  6364.993        3.235206

Fixation index  Value    P-value
FSC (Va)        0.07921  0.00000 ±0.000005
FST (Vb)        0.09316  0.00000 ±0.000005
FCT (Vc)        0.01515  0.02713 ±0.00152

Country: Populations

Pakistan: Gujar, Pakistan_Pathan, Pakistan-KAL (Kalashas), Pakistan-HZR (Hazaras), Pakistan-BSK (Burusho), Pakistan-BRU (Brahuis), Pakistan-BLT (Baltis), Pakistan-BAL (Baluchis), Pakistan-KSR (Kashmiris), Pakistan-MAKB (Makrani-Baluch), Pakistan-MAKN (Makrani-Negroid), Pakistan-PRS (Parsis), Pakistan-PKH (Pathan), Pakistan-SDH (Sindhis), Kohistanis, Yousafzai, Tarklani, Utmankheil, Pakistan-Punjabi, Sindhi (HGDP), Pathan (HGDP), Makrani (HGDP), Kalash (HGDP), Hazara (HGDP), Burusho (HGDP), Brahui (HGDP), Balochi (HGDP)

Iran: Iran-Ahvaz, Iran-Izeh, Iran-Rasht, Iran-Sari, Iran-Masal

Azerbaijan: Azerbaijan-Lenkoran

Afghanistan: Afghanistan-Baluch, Afghanistan-Hazara, Afghanistan-Pashtun, Afghanistan-Tajik, Afghanistan-Uzbek

d.f., degrees of freedom; Va, Vb and Vc, variance components.

When grouping these population samples by country of origin, the genetic variation among countries only accounts for 1.5% of the variation, whereas 7.7% of the total variation is explained by differences between population samples within countries (Table 31A). However, when the 38 samples are instead grouped by ethnic relationships, differences between the ethnic groups account for 4.5% of the total variation, and the variation between population samples within ethnic groups accounts for 4.51% of the total variation (Table 31B).

The comparative AMOVA analysis based upon ethnicity (Table 31B) grouped the 30 relevant population samples into seven aggregates. The first may be designated as Baluchis and associated ethnic groups. This aggregate includes five samples: Afghan-Baluch, Pakistan-BAL (Baluchis), Pakistan-MAKB (Makrani-Baluch), Pakistan-MAKN (Makrani-Negroid) and Pakistan-BRU (Brahui). The second aggregate may be designated as Pathans. This aggregate also encompasses five samples: Tarklanis, Yousafzais, Afghanistan-Pashtuns, Pakistan-PKH (Pathans), and Pakistan-Pathans. The third group is the Utmankheils, whose separation from the other Pathan groups is justified by the results of the current study. The fourth aggregate may be designated as Iranians. This aggregate includes seven samples: Iran-Ahvaz, Iran-Izeh, Iran-Rasht, Iran-Sari, Iran-Masal, Azerbaijan-Lenkoran, and Pakistan-Parsi. The fifth aggregate may be designated as East Asian derived. This group includes four samples: Afghanistan-Hazaras, Pakistan-Hazaras, Afghanistan-Uzbeks, and Pakistan-BLT (Baltis). The sixth aggregate may be designated as lowland western Indians. The group includes three samples: Pakistan-Punjabis, Pakistan-SDH (Sindhis), and Gujars. The seventh aggregate may be designated as Northern Pakistani Highlanders. This aggregate includes four samples: Kohistanis, Pakistan-BSK (Burushos), Pakistan-KAL (Kalash), and Pakistan-KSR (Kashmiris).

Table 31B. AMOVA results when population samples are grouped based on ethnicity.

Source of variation              d.f.  Sum of squares  Variance component  Percentage of variance
Among groups                     6     331.036         0.14497 (Va)        4.50
Among populations within groups  31    286.878         0.14543 (Vb)        4.51
Within populations               1959  5747.079        2.93368 (Vc)        90.99
Total                            1996  6364.993        3.22408

Fixation index  Value    P-value
FSC (Va)        0.04723  0.00000 ±0.000005
FST (Vb)        0.09007  0.00000 ±0.000005
FCT (Vc)        0.04497  0.00000 ±0.000005

Ethnic group: Populations

Baluchi: Afghanistan-Baluch, Pakistan-BRU (Brahuis), Pakistan-MAKB (Makrani-Baluch), Pakistan-MAKN (Makrani-Negroid), Pakistan-BAL (Baluchis), Baluchi (HGDP), Brahui (HGDP), Makrani (HGDP)

Pathans: Tarklani, Yousafzai, Afghanistan-Pashtun (Pathans), Pakistan-PKH (Pathan/Pakhtuns), Pakistan_Pathan, Pathan (HGDP)

Utmankheils: Utmankheils

Iranians: Iran-Ahvaz, Iran-Izeh, Iran-Rasht, Iran-Sari, Iran-Masal, Azerbaijan-Lenkoran, Pakistan-PRS (Parsis)

Mongol-derived: Afghanistan-Hazara, Afghanistan-Tajik, Afghanistan-Uzbek, Pakistan-HZR (Hazaras), Pakistan-BLT (Baltis), Hazara (HGDP)

Lowland Western Indians: Pakistan-Punjabi, Pakistan-SDH (Sindhis), Gujars, Sindhi (HGDP)

Northern Pakistanis: Pakistan-BSK (Burusho), Pakistan-KAL (Kalashas), Pakistan-KSR (Kashmiris), Kohistanis, Burusho (HGDP), Kalash (HGDP)
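The percentages and fixation indices in Table 31B follow from the three variance components. The sketch below uses the standard definitions for a three-level hierarchical AMOVA (FCT = Va/VT, FSC = Vb/(Vb + Vc), FST = (Va + Vb)/VT, with VT = Va + Vb + Vc). It is an assumption that these are the definitions applied by the software used in this study, and the function name `amova_summary` is hypothetical.

```python
def amova_summary(va, vb, vc):
    """Percentages of variance and fixation indices from variance components.

    va: among groups; vb: among populations within groups;
    vc: within populations.
    """
    total = va + vb + vc
    return {
        "total": total,
        "pct_among_groups": 100 * va / total,
        "pct_among_pops_within_groups": 100 * vb / total,
        "pct_within_pops": 100 * vc / total,
        "FCT": va / total,          # groups relative to the total
        "FSC": vb / (vb + vc),      # populations relative to their group
        "FST": (va + vb) / total,   # populations relative to the total
    }


# Variance components from Table 31B (grouping by ethnicity).
table_31b = amova_summary(va=0.14497, vb=0.14543, vc=2.93368)
```

Because the tabulated variance components are themselves rounded, the recomputed values agree with the table to about four decimal places rather than exactly.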

Multi-dimensional scaling (MDS) based on pairwise genetic distances, estimated as FST (10 Y-STR loci), was performed for 38 selected population samples from the Indo-Pakistani sub-continent and neighboring countries and yielded a stress value of 0.1000367 (Fig. 65).

Figure 65. Multi-dimensional scaling (MDS) analysis for 38 selected populations from the Indo-Pakistani sub-continent and neighboring countries.

Despite the inclusion of 38 population samples from the Indo-Pakistani sub-continent and Southwest Asia, most of the genetic variation in the MDS is still defined by the five population samples from Dir and Swat Districts. Although this dataset was limited to 10 STR loci, very large genetic differences between the samples from Swat and Dir districts may still be observed. The genetic difference between the Gujars and Kohistanis, however, becomes non-significant when the resolution is reduced from 27 to 10 Y-STR markers.

Several specific observations may be noted. The Gujar sample and the Baluch ethnic groups from Afghanistan (Haber et al., 2012) both represent outliers that occupy the same area in the MDS plot (Fig. 65), whereas the Baluch sample from Pakistan (Qamar et al., 2002) occupies a more central position. The Kohistanis occupy a more central position within the MDS plot, adjacent to a large number of other sampled ethnic groups from the Indo-Pakistani sub-continent and Southwest Asia. Notably, the Utmankheil sample is separated by very large and highly significant genetic distances from all other groups, and on the MDS plot this sample occupies an isolated position distant from all other samples. The Pashtun groups Tarklanis and Yousafzais are marked by very similar genetic distances to all other groups included in this analysis (Fig. 65).

These results are generally mirrored when the MDS is constructed from the worldwide data set (Fig. 66).

Figure 66. Worldwide multi-dimensional scaling (MDS) analysis of pairwise genetic distances, estimated as FST (10 Y-STR loci), for 54 population samples (from HGDP), including the five population samples from Dir and Swat (stress value = 0.1583562). Axes: Coordinate 1 and Coordinate 2.

3.3.5. Detailed analysis of two Y-chromosomal haplogroups

To get a more detailed picture of the relationships between the five population samples from Dir and Swat Districts, I constructed haplotype (10 Y-STR loci) networks for individuals assigned to Y-SNP haplogroups (i) G-Page94 [G2a-L30(xL14, L13, M278) and G2b-M283], (ii) H1-M69, and (iii) L1-M22(xM274), and included previously published datasets from Pakistan (Qamar et al., 2002), Afghanistan (Haber et al., 2012), and the HGDP (Rosenberg, 2006) (Fig. 67).

Haplogroups G-Page94 and H1-M69 were combined in one network, as the Y-SNP typing of the previously published Pakistani population samples did not allow for the distinction between these two haplogroups (Qamar et al., 2002). Individuals representing these two Y-haplogroups are clearly separated from each other in the STR-network, thereby demonstrating concordance between the two datasets (Fig. 67A). Most of the Utmankheils possess haplogroup G-Page94 (more specifically, G2b-M283); they all cluster closely together, owing to highly similar Y-STR profiles, and with a couple of individuals from both Afghanistan and Pakistan (Fig. 67A). Only one Kohistani and one Gujar individual have a Y-SNP profile assigned to the G-Page94 haplogroup, and these two individuals share the same Y-STR haplotype, which is clearly separated from the haplotypes observed among the sampled Utmankheil individuals (Fig. 67A).

The Y-STR network with individuals assigned to SNP-haplogroup H1-M69 is more diffuse and many individuals are separated by a larger number of mutational steps. However, most of the Kohistanis are found within this network, and many of them cluster together, sharing the same Y-STR haplotype (Fig. 67A).

The network of STR-haplotypes assigned to SNP-haplogroup L1-M22(xM274) shows at least two defined groups of individuals (Fig. 67B). All but one of the Gujar individuals in this network share the same Y-STR haplotype, which is also shared by a single Kohistani individual; this sharing also holds when the haplotype is extended to the full 27 Y-STR loci (Table 29 and Fig. 64). Only a single Gujar individual is found in the other sub-group within the network.


Figure 67. Y-chromosome haplogroup-specific networks based on Y-STR haplotypes (10 loci) with individuals assigned to (A) Y-SNP haplogroups G-Page94 and H1-M69, and (B) Y-SNP haplogroup L1-M22(xM274). The circle sizes indicate the number of individuals that share the same Y-STR profile for these 10 loci. The smallest circles represent one individual. The lengths of the connecting branches indicate the number of mutational steps.

Chapter 4 DISCUSSION

The key features of modern humans appeared late in human evolution. Despite the broad consensus that Africa represents the main, if not nearly exclusive, place of origin for anatomically modern humans (AMHs), their patterns of dispersal out of Africa are still poorly understood and remain an active area of research. One of the most hotly debated issues concerning the origins of anatomically modern humans is the role played some 100,000 years ago by a morphologically diverse array of archaic hominins. In Africa and in the Middle East there were various transitional forms spanning late Homo heidelbergensis and H. sapiens; in Asia, Homo erectus; and in Europe, Homo neanderthalensis (Klein, 2008). However, by 30,000 years ago this taxonomic diversity had vanished and humans everywhere had evolved into the anatomically and behaviorally modern form (Klein, 1999; Tattersall and Schwartz, 1999; Clark and Willermet, 1997; Stringer and McKie, 1996; Wolpoff and Caspari, 1996; Nitecki and Nitecki, 1994; Smith and Spencer, 1984). Owing to advances in survival techniques, H. sapiens was able to flourish in Africa, whence it dispersed to Eurasia, Australia, the Americas and eventually Oceania (DeGiorgio et al., 2009), but the routes and patterns of these migrations are poorly understood. However, morphological, genetic and archaeological evidence suggests that the dispersal of AMHs occurred through the Levant and by the southern route from the Horn of Africa, through the Arabian Peninsula, into southern Asia (Reyes-Centeno et al., 2014; Fu et al., 2013; Liu and Zhao, 2006; Lahr and Foley, 1994).

The presence of stone tools in the Indo-Pakistani subcontinent (also called South Asia), specifically in the Soan Valley of Pakistan, suggests that humans appeared in the region at least 200,000–400,000 years ago (Wolpert, 2000) and thus are likely to have been associated with archaic Homo species. A report based on the fossil record suggests that modern humans inhabited Pakistan approximately 60,000–70,000 years ago (Hussain, 1997).

Geographically, Pakistan is bordered by the high mountains of the Karakoram, Himalaya and Hindu Kush ranges and by the Arabian Sea, and is situated at the crossroads of Asia, at the junction of West Asia, Central Asia and South Asia (Ali, 2005). This region has high ethnic diversity, which historically has been attributed, at least partially, to a long and dynamic history of repeated invasions by Aryans (Bernhard, 1983), Macedonians (Birdwood, 1959), Arabs, and Mongols (Lapidus, 2002). In addition, the Hindu Kush highlands served as a physical barrier to trade along the “Silk Route”, channeling the routes of communication that linked the populations of the Mediterranean Basin and West Asia to those of China for more than 16 centuries (Petraglia et al., 2012; Kuzmina 2007; Quintana-Murci et al., 1999). It is therefore possible that the extant populations of the Hindu Kush highlands show traces of historic, and even prehistoric, gene flow from far distant human populations.

Furthermore, Pakistan is a South Asian country that was home to two well-known civilizations: the Indus or Harappan civilization, which flourished between 2600 BC and 1900 BC, and the Gandhara civilization, dated to 1500 to 1000 BC (Kenoyer, 2005; Miller, 1985; Basham, 1963). It is believed that the southern coast of the Persian Gulf, the territory of present-day Afghanistan and the Makran Coast of Pakistan likely served as passages for human dispersal out of Africa in prehistoric times, making the population dynamics of this region even more interesting (Derenko et al., 2013; Underhill et al., 2001). Moreover, the migration and admixture of new populations and the exchange of cultural elements along these routes have made the Indo-Pakistani people more heterogeneous and diverse (Lukacs and Hemphill, 1991). For example, from the fourth century B.C. onwards, over a period of about 2000 years, different populations entered Pakistan and settled. These populations were the Greeks, Scythians, Parthians, Pahlavas, Kushans and the Indo-Aryans (Maloney, 1974; Thapar, 1969). The Huns arrived at about the close of the Gupta period (Ingalls, 1976). The Jews and Parsis came later via the western coast. Arabian Muslims, Persian Muslims, Turks and Afghans each came to the region in different waves and at different times. The Muslim immigration into India and Pakistan began even before the Arab invasions early in the 8th century A.D. and ended with the establishment of the Mughal Empire in the 16th century.

It is very difficult to assess how human groups and settlements were formed in prehistoric times: whether the inhabitants were indigenous or migrants from some other place and, if they migrated, what routes they followed. These are some of the big questions scientists seek to answer using the principles of evolution, dental anthropology and molecular genetics on the basis of past and present evidence (Whale, 2012; Stoneking, 2008).

The multicultural (Rose, 1911) and highly diverse population has made Pakistan an attractive country for the field of anthropology. In the study area, the Swat and Dir districts of Khyber Pakhtunkhwa, the major ethnic groups are Gujars, Kohistanis and Pashtuns; these groups are considered genetically isolated because they seldom intermarry and are hence highly endogamous (Glatzer 2002; Qamar et al., 1998; Caroe, 1992).

The skeletal remains recovered from the region have not been studied for ancient DNA (Kennedy, 2000). Although the genetic data available on Pakistani populations are very limited, they indicate differences between Pakistanis and other populations of the world. Most of the earlier studies treated Pakistani populations as a single entity, which is incorrect, because Pakistan is the home of 18 different ethnic groups (Grimes, 1992; Newcomb, 1986). Each ethnic group of Pakistan therefore ought to be studied separately. Recently, a few ethnic groups of the country have been studied, and such studies have demonstrated clear divergences among them (Lee et al., 2014; Mehdi et al., 1999; Qamar et al., 1999).

The demography and history of the different ethnic groups residing within the Indo-Pakistani subcontinent have been a subject of interest for years. Three models have been offered as a consequence of these investigations. The first is known as the Long-Standing Continuity Model. According to proponents of this model, modern humans migrated to South Asia some 62,000 to 75,000 years ago, commensurate with the initial dispersal of H. sapiens out of Africa. Proponents of this model claim that once H. sapiens arrived and settled in South Asia, the resident population of the subcontinent was not significantly influenced by subsequent gene flow from surrounding populations or large-scale migrations within the subcontinent (Krithika et al., 2009; Sahoo et al., 2006; Kennedy et al., 1984). Therefore, the pattern of affinities among members of the living ethnic groups and ancient inhabitants of South Asia is due to simple isolation-by-distance, both in time and space.

The second model is the Aryan Invasion Model. This model is predicated on the creation of war tools, the domestication of horses, and the invention of horse-drawn chariots, evidence for which has been found in Central Asia during the Bronze Age (Bryant and Bryant, 2001; Renfrew, 1987). The existence of Indo-Aryan languages in the northern two-thirds of the subcontinent, and the accounts in the RgVeda of invaders with war horses and the castles of the noseless Dasus, suggest that Central Asians invaded the northwestern region of the subcontinent during the mid-2nd millennium BC (Wheeler, 1968).

The third model is known as the Out of India Model. The creators of this model believe that the appearance of early agriculture and the presence of complex cities of the Indus Valley and Doab of north India (McAlpin, 1981) are the consequence of a proto-Elamo-Dravidian migration from southwestern Iran (i.e., Susa) into the subcontinent and Central Asia. However, proponents of this model are not in agreement as to when this dispersal event took place. Consequently, two versions of this model have been proposed. The first proposes that South Asia represents the true homeland of the Indo-European languages and that the dispersal of populations bearing these languages occurred in the 3rd millennium BC. The second version suggests that the entry of Indo-Aryan languages into the subcontinent occurred later, and is perhaps associated with the appearance of the Iron Age during the 1st millennium BC.

Exploring the information contained in mtDNA, Y-STRs, and tooth morphology is very important for phylogenetic studies; the current project was therefore designed to characterize the five ethnic groups (Yousafzai, Gujar, Tarklani, Kohistani, Utmankheil) residing in Swat and Dir districts through dental morphology, mtDNA and Y-STR analysis.

Dental morphology provides an assessment of variations in the cusps, ridges, grooves and root structures that can be used for the reconstruction of biological relationships among different populations (Hillson, 1996; Dahlberg, 1945; Pedersen, 1949; Moorrees, 1957). These variations are controlled by different genes and are only slightly affected by environmental factors (Scott and Turner, 1997). Dental traits exhibit significant differences in frequency among major geographic areas (Dahlberg, 1951; Dahlberg, 1945; Hrdlicka, 1920) and, in some cases, these differences are so obvious that Caucasoid, Mongoloid and African dental complexes can be easily differentiated (Buikstra et al., 1990; Haeussler, 1989; Mayhall et al., 1982).

The dental morphology component of the current project was designed to assess the nature of the biological affinities among members of the myriad ethnic groups of South Asia; the five samples from Swat and Dir districts examined in the present study are but a small component of that overall endeavor. The research also analyzed gene flow from the surrounding regions into the South Asian gene pool. Furthermore, the dental morphological data were used to explore the biological affinities between the living populations of northern Pakistan, including the samples of the present study, and the ancient inhabitants of the Indo-Pakistani subcontinent.

Inter-sample affinities based upon pairwise MMD values were examined with neighbor-joining cluster analysis (NJ), multidimensional scaling (MDS), and principal coordinate analysis (PCA).
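For orientation, a pairwise mean measure of divergence (MMD) between two samples can be sketched as follows. This is a minimal illustration, assuming the Freeman-Tukey angular transformation and the small-sample correction term 1/(n1 + 0.5) + 1/(n2 + 0.5) that is commonly paired with it; the exact variant used in this study is described in the methods, and may differ. The function names and the trait counts are hypothetical and are not data from this study.

```python
import math


def freeman_tukey_theta(k, n):
    """Angular transformation of a trait frequency (k present out of n scored)."""
    return 0.5 * math.asin(1 - 2 * k / (n + 1)) + 0.5 * math.asin(
        1 - 2 * (k + 1) / (n + 1)
    )


def mmd(sample_a, sample_b):
    """Mean measure of divergence between two samples.

    Each sample is a list of (k, n) pairs, one per dental trait, listed in
    the same trait order in both samples.
    """
    terms = []
    for (k1, n1), (k2, n2) in zip(sample_a, sample_b):
        difference = freeman_tukey_theta(k1, n1) - freeman_tukey_theta(k2, n2)
        correction = 1 / (n1 + 0.5) + 1 / (n2 + 0.5)
        terms.append(difference ** 2 - correction)
    return sum(terms) / len(terms)


# Hypothetical trait counts (present, scored) for three traits in two samples.
sample_a = [(10, 50), (25, 50), (40, 50)]
sample_b = [(40, 50), (25, 50), (10, 50)]

mmd_ab = mmd(sample_a, sample_b)
mmd_aa = mmd(sample_a, sample_a)  # identical samples: only the correction remains
```

Because of the correction term, two samples with identical trait frequencies yield a slightly negative MMD, which is conventionally interpreted as no divergence.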

If the “Long-Standing Continuity Model” is correct, i.e. the inhabitants of South Asia arrived and became established about 75,000 years ago and have since experienced no significant gene flow from neighboring populations or population movements within the sub-continent, the patterning of their biological affinities ought to be the consequence of regional and geographic proximity. Regional structure (peninsular Indians, inhabitants of the Indus Valley, Central Asians, Himalayan highlanders, inhabitants of the Hindu Kush, and inhabitants of the northern boundaries of the Indus Valley) and temporal provenience (prehistoric inhabitants of the Indus Valley, prehistoric Central Asians, and living populations) will then interact in patterning these biological affinities.

In contrast, if the Aryan Invasion Model is true and South Asia was invaded by Bronze Age Central Asians in the mid-2nd millennium BC, this event should be reflected by a biological discontinuity within the population of the Indus Valley commensurate with the dissolution of the Harappan civilization. Therefore, one ought to expect that all the post-Harappan populations are descendants of these Aryans from Central Asia. Moreover, if it is correct that the Indo-European languages dispersed in South Asia due to this Aryan invasion, which afterward spread to the Upper Doab of North India, this ought to be reflected by close biological affinities between members of North Indian ethnic groups and their alleged Central Asian ancestors. However, Dravidian-speaking groups from southeast India ought to show no genetic affinities with these Central Asian invaders and little affinity to their North Indian descendants. Finally, the ethnic groups residing in the Himalaya and Hindu Kush highlands, including members of the ethnic groups residing in the northern portion of Khyber Pakhtunkhwa (KP) and the foothills rimming the northern boundary of the Indus Valley, may have biological affinities to these mid-2nd millennium invaders from Central Asia. The populations of Swat and Dir districts studied here, however, exhibit no affinities to the Central Asian samples included in this analysis.

However, if the Out of India Model is correct, which states that the rise of complex cities in the Indus Valley indicates that Indo-European languages arose within the Indian subcontinent and then dispersed to surrounding regions of Central and Southwest Asia during the 3rd millennium BC, then the origin of South Asian populations ought to be attributed to long-term geographical isolation, thereby reducing the biological distances among the late Bronze Age Central Asians and post-Chalcolithic populations of the Indus Valley, North India and northern Pakistan.

Furthermore, the second version of the Out of India Model states that the expansion of complex cities led to the migration of people out of India, who populated the Upper Doab of North India and then moved into the neighboring areas of Central and Southwest Asia, but that this did not occur until the mid-1st millennium BC, and that the entry of Indo-Aryan languages into the subcontinent occurred later, perhaps in association with the appearance of the Iron Age during the 1st millennium BC. If this version is correct, the prehistoric Central Asian samples of the late Bronze Age, since they antedate this proposed migratory event, ought to show no affinities to any of the South Asian samples included in the present study, because the dispersal did not take place until the Iron Age.

Variation among the ethnic groups of Swat and Dir districts studied here and the other groups of northern Pakistan was examined with an array of data reduction techniques. The results were presented through neighbor-joining cluster analysis (NJ), multidimensional scaling (MDS), and principal coordinate analysis (PCA). The neighbor-joining cluster analysis reveals a fundamental split between peninsular Indians on the one hand and the ethnic groups from northern Pakistan on the other (Fig. 33). Intriguingly, the amount of diversity among the former appears greater than the diversity among the latter. Whether this is a reflection of reality, or is the consequence of the greater number of northern Pakistani samples or their assessment by a greater number of researchers, is unclear. On the other hand, the samples collected from Swat and Dir districts revealed close affinity with each other, except for the Yousafzai, who show affinity to the Swati sample from Mansehra, as shown in Figure 25.

The MDS with Kruskal’s and Guttman’s methods revealed that the Yousafzais from Swat possess affinities with the Dravidian-speaking ethnic groups from Andhra Pradesh in southeastern peninsular India, while the Gujar (GUJsw) and Kohistani (KOHsw) samples from Swat exhibit close affinities with the highland samples from Chitral and the Swatis of Mansehra District. The Utmankheils and Tarklanis of Dir District share close affinities with the ethnic groups from Maharashtra, located in West-Central peninsular India, and are distinctly separated from the other Pakistani samples included in this analysis.

The results obtained from PCA indicate that, within the population samples from Dir and Swat Districts studied here, the Tarklanis and Utmankheils from Dir show some affinities to one another, the Gujars and Kohistanis are marked by affinities to one another, while the Yousafzai are highly isolated from the rest of the samples phenetically. Furthermore, the Dravidian-speaking samples from southeast India (CHU, GPD, PNT) and the Indo-Aryan-speaking samples from west-central India (MDA, MRT, MHR) are segregated away from each other phenetically and are linked to the remaining samples by very distant affinities to the Utmankheil and Tarklani samples from Dir, respectively. Most of the highland samples (Madak Lasht, Wakhis from Gulmit, Khows, and Kohistanis) aggregate together along with the foothill samples of Awans (AWAm1) and Swatis (SWT) from Mansehra District. Perhaps all the members of this aggregate ought to be considered the highland aggregate. If so, then the Wakhi sample from Sost (WAKs), the Yousafzais from Swat (YSFsw) and even the second sample of Awans from Mansehra (AWAm2) would be considered members of this aggregate as well (Fig. 28).

An examination of the biological affinities of northern Pakistani ethnic groups in the context of living ethnic groups from peninsular India and prehistoric samples from the Indus Valley and southern Central Asia yields several consistent patterns. First, prehistoric south-Central Asians (DJR, SAP, KUZ, MOL) are clearly separated from all South Asian samples, both living and prehistoric. Second, peninsular Indian samples tend to be segregated from the Pakistani samples and tend to aggregate into separate groups by both region (Andhra Pradesh vs. Maharashtra) and language (Dravidian vs. Indo-Aryan). Intriguingly, the prehistoric sample from Maharashtra (INM) consistently exhibits closest affinities to the living ethnic groups (MRT, MHR, MDA) of this same region of India. This may reflect local population continuity since the 2nd millennium BC. Northern Pakistanis tend to aggregate into two groups. One appears to be largely composed of highland samples (KHO, MDK, WAKg, WAKs), possible highland groups (UTHd, YSFsw, KOHsw), as well as groups from the foothills (SWT, AWAm1). The other samples (SYDm2, AWAm2, TANm2) tend to occupy highly anomalous positions. These samples were scored by another researcher, and their anomalous positions may be a reflection of inter-observer differences in the scoring of the dental traits.

Among the five population samples studied from Swat and Dir districts, the neighbor-joining cluster analysis identifies Gujars, Kohistanis and Utmankheils as possessing affinities to the ancient Harappans of the Indus Valley and Yousafzais as having affinities to ethnic groups of the Hindu Kush-Karakoram highlands, while Tarklanis are identified as exhibiting no close affinities to any of the other samples from Dir and Swat Districts. The MDS identifies the Pashtun groups (YSFsw, UTHd, TRKd) as having closest affinities to one another, with Kohistanis somewhat divergent and Gujars aligning with the ancient Harappans. PCA identifies Kohistanis, Yousafzais, and Gujars as possessing affinities to one another. When such results are viewed together, the results obtained from dental morphology suggest that the immigrant Pashtun groups were small in number and appear to have intermarried extensively with members of the local ethnic groups they encountered, especially those occupying the highlands. However, the Kohistanis are not closely related to these immigrants, while the rather close affinities of Gujars to the sample of Harappans from Cemetery R37 attest to their Indus Valley origins.

Although Pakistan is inhabited by a population of tremendous ethnic diversity, the members of many of its ethnic groups have remained largely unstudied genetically; therefore the mtDNA control region of the five studied population samples from Dir and Swat Districts was analyzed for genetic characterization.

MtDNA haplogroups have a distinctive geographic distribution among the world’s populations. The sequences of mtDNA haplogroups differ from one another because of polymorphic sites, or nucleotide variation, in the control region. Studies based upon HVSI and HVSII of mtDNA have helped to explore the genetic legacy of some Indian and Pakistani populations (Bhatti et al., 2016; Kivisild et al., 2003; Roy et al., 2003; Macaulay et al., 1999).

In the present study, a total of 126 different haplotypes were identified, among which the frequency of unique haplotypes was found to be 63% among Gujars, 67% among Tarklanis, 63% among Utmankheils, 70% among Yousafzais and 67% among Kohistanis. The proportions of unique haplotypes reported for other populations of Pakistan are 91% among Orakzais of Hazara, 77% among Makranis, 74% among Saraikis, 72% among Burushos, 68% among Pathans, 66% among Baluchis, 60% among Bangashis, 58% among Brahuis, 56% among Sindhis, 52% among , 50% among Parsis, 36% among and 27% among Kalashas (Bhatti et al., 2016a; Hayat et al., 2015; Saddiqi et al., 2014; Rakha et al., 2011; Quintana-Murci et al., 2004). The difference in haplotype frequencies between the five ethnic groups from Swat and Dir districts and the other reported ethnic groups from Pakistan is due to differences in sample size.

The number of unique haplotypes identified in the five studied population samples from Swat and Dir districts was 75, with unique haplotypes accounting for 63% among Gujars, 74% among Tarklanis, 75% among Utmankheils, 77% among Yousafzais and 60% among Kohistanis. Such values are consistent with those reported for Burushos (78%), Hazaras (76%), Makranis (76%), Baluchis (69%) and Brahuis (68%) among the other reported populations of Pakistan, but unique haplotypes were found to be more common among Saraikis (92%), Sindhis (90%) and Pathans (81%) (Hayat et al., 2015; Saddiqi et al., 2014; Rakha et al., 2011; Quintana-Murci et al., 2004).
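
The tallies discussed above can be illustrated with a short Python sketch. It is only a minimal sketch under stated assumptions, not the procedure used in this study: it assumes that a "unique" haplotype is one carried by exactly one individual in a sample and that the proportion is expressed relative to the number of distinct haplotypes. The function name and the toy haplotype labels are hypothetical and are not data from this thesis.

```python
from collections import Counter

def unique_haplotype_summary(haplotypes):
    """Summarize a list of per-individual haplotype labels.

    Assumption (not taken from the thesis): a haplotype is 'unique'
    when exactly one individual in the sample carries it.
    """
    counts = Counter(haplotypes)          # haplotype label -> number of carriers
    n_distinct = len(counts)              # number of different haplotypes
    n_unique = sum(1 for c in counts.values() if c == 1)
    pct_unique = 100.0 * n_unique / n_distinct if n_distinct else 0.0
    return {
        "individuals": len(haplotypes),
        "distinct": n_distinct,
        "unique": n_unique,
        "pct_unique": pct_unique,
    }

# Toy labels (hypothetical, for illustration only)
sample = ["hapA", "hapA", "hapB", "hapC", "hapD", "hapD"]
summary = unique_haplotype_summary(sample)
print(summary)
```

Using the sample size rather than the number of distinct haplotypes as the denominator would give a different percentage, so the chosen definition should be stated whenever such values are compared across studies.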

The results obtained in the current study suggest that members of the five ethnic groups from Swat and Dir districts have experienced substantial admixture, in which their mtDNA reflects: i) a high proportion of West Eurasian lineages; ii) a moderate to high proportion of South Asian lineages; iii) a low proportion of East Eurasian/East Asian, Southeast Asian and North Asian lineages; and iv) a small fraction of Southern European, Central Asian, Eastern European, African, Australian and Oceanic lineages.

The phylogenetic analysis revealed that the Indian and Pakistani populations share high frequencies of West Eurasian mtDNA haplogroups (Bhatti et al., 2016a; Kivisild et al., 1999), which are also very frequent in the present study, accounting for 45% of individuals in the population samples from Swat and Dir districts. The frequency of West Eurasian lineages is 54% among Tarklanis, 54% among Kohistanis, 52% among Yousafzais, 47% among Utmankheils and 37% among Gujars. Similar frequencies, of around 55%, were reported among Pathans, in contrast to a much lower 26% among Makranis of Pakistan (Siddiqi et al., 2015; Rakha et al., 2011). This low frequency of West Eurasian haplogroups in the Makrani population is due to the fact that they are of African ancestry (Siddiqi et al., 2015). Furthermore, the frequency of West Eurasian haplogroups among ethnic groups of Indian Punjabis was reported to range from 40% to 50%, with around 30% among Kashmiris and Gujaratis, while the lowest proportions of West Eurasian lineages were reported among ethnic groups residing in West Bengal, Indian caste populations and in some Indian states such as Uttar Pradesh, Kerala, Maharashtra and Tamil Nadu (Ahmed, 2014; Metspalu et al., 2004; Kivisild et al., 2003). A greater proportion of West Eurasian lineages was reported among the major ethnic groups of Afghanistan, with frequencies of 40% among Hazaras, 89% among Tajiks, 74% among Baluchis and 64% among Pashtuns (Whale, 2012). The presence of West Eurasian lineages at high frequencies suggests that past gene flow into this region likely occurred from the west through Iran or possibly from the north through Central Asia (Quintana-Murci et al., 2004), as a result of invasions by different groups in the past (McElreavey et al., 2005).

South Asian lineages are the second most prevalent, accounting for 30% of the lineages found among the individuals of the five ethnic group samples from Swat and Dir districts. Frequencies were highest among Gujars at 42%, followed by Utmankheils at 33%, Tarklanis at 30%, Yousafzais at 29% and Kohistanis at 24%. The proportion of South Asian lineages among individuals of other reported Pakistani ethnic groups ranges from a high of 48% among Sindhis, to 39.1% among Pathans, 36% among Pashtuns and 29.4% among Saraikis, to a low of 24% among Makranis (Bhatti et al., 2016a; Bhatti et al., 2016b; Hayat et al., 2015; Saddiqi et al., 2014; Rakha et al., 2011).

Low frequencies of South Asian lineages have been reported by other researchers among the major ethnic groups of Afghanistan, ranging from 15% among Hazaras, to 13.3% among Baluchis and 7.1% among Pashtuns, to complete absence among Tajiks (Whale, 2012). A comparison of the frequency of South Asian lineages among Afghan Pathans (7.1%) with the frequencies among the Tarklanis (30%), Utmankheils (33%) and Yousafzais (29%) from the present study, as well as among Pathans (36%; Bhatti et al., 2016b) and Pashtuns (29.4%; Rakha et al., 2011) from Pakistan, suggests that there has been considerable gene flow between these immigrant groups and the local, indigenous ethnic groups they encountered once they arrived in Pakistan.

The complete dataset revealed that only 7% of the lineages found among members of the five ethnic groups of Swat and Dir districts are associated with populations of East Eurasia/East Asia. Frequencies range from a high of 12.8% among Yousafzais, to 11% among both Gujars and Kohistanis and 1.4% among Utmankheils, to complete absence among Tarklanis. Frequencies of East Eurasian/East Asian lineages previously reported by other researchers among Pakistani ethnic groups range from a high of 35% among Hazaras of Baluchistan, to 9% among Saraikis, to 6.9% among Burushos of Hunza, to 5.2% among Pathans, to a low of 2% among Makranis (Bhatti et al., 2016a; Saddiqi et al., 2014; Rakha et al., 2011; Quintana-Murci et al., 2004).

East Eurasian/East Asian haplogroups have also been reported in the major ethnic groups of Afghanistan, where frequencies range from a high of 37.5% among Hazaras, to 31% among Uzbeks, 14.3% among Pashtuns and 13.4% among Baluchis, to 10.5% among Tajiks. In addition, it has been reported that 37% of lineages observed among Turkmens from Turkmenistan are of East Eurasian/East Asian derivation (Whale, 2012; Quintana-Murci et al., 2004).

Our examination of haplotypes among the five sampled ethnic groups reveals that about 4% of these lineages are of Southeast or North Asian origin. These lineages range from a high of 8% among Kohistanis and 7% among Utmankheils, while East Eurasian/East Asian lineages were scarce or uncommon among Gujars, Tarklanis and Yousafzais. Negligible frequencies of Southern European, Central Asian, Eastern European, African, Australian and Oceanic lineages were found among members of the five sampled ethnic groups of Swat and Dir districts.

A majority of individuals in current human populations outside Africa possess mtDNA lineages that can be assigned to mega-haplogroups M, R and N, all of which are believed to be derived from African lineage L3. Among the 298 individuals from the five sampled ethnic groups of Swat and Dir districts, the frequencies of mega-haplogroups R, M, N and L were 62%, 32%, 5% and 0.34%, respectively (Fig. 49). The highest frequencies of mega-haplogroup R occur among Tarklanis at 74%, followed by Yousafzais at 71%, Utmankheils at 64%, Kohistanis at 54% and Gujars at 48%. Mega-haplogroup R has been previously reported by other researchers among other ethnic groups of Pakistan, with frequencies ranging from a high of 63.4% among Pathans of Districts Mardan and Charsada (Tabassum et al., 2016), followed by 61.3% among Pathans residing within the Federally Administered Tribal Areas (Rakha et al., 2011), 30.8% among Baluchis (Whale, 2012), 17.3% among Hazaras, 16.89% in the Hazarwal population of Hazara Division (Akbar et al., 2016), 9.1% among Makranis, 8.7% among Hazaras of Baluchistan, 8% among Pashtuns, 7.7% among Baluchis, 7.9% among Brahuis, 6.9% among Sindhis (Bhatti et al., 2016a, 2016b), and 2.3% among the Burushos of Hunza (Quintana-Murci et al., 2004). This haplogroup has also been previously reported by other researchers among the major ethnic groups of Afghanistan, where mtDNA lineages corresponding to mega-haplogroup R were found to range from a high of 28.6% among Pashtuns, to 20% among Uzbeks and 15.8% among Tajiks, to a low of 7.5% among Hazaras. In India, frequencies of mtDNA belonging to mega-haplogroup R range from a high of 31% among Koyas, to 8.8% among Gujaratis and 8.77% among Tamils, to 1% among Chenchus (Ranaweera et al., 2014; Kivisild et al., 2003; Quintana-Murci et al., 2004).
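
As a quick arithmetic check on the mega-haplogroup frequencies given above for the 298 sampled individuals, a frequency of 0.34% corresponds to a single individual. The short Python sketch below makes this explicit. The helper name is hypothetical, and the count of one individual is inferred here from the reported percentage rather than stated in the text.

```python
def percent(count, total):
    """Return count as a percentage of total."""
    return 100.0 * count / total

n_total = 298  # sampled individuals from Swat and Dir districts (from the text)

# Assumption: one carrier of mega-haplogroup L, inferred from the reported 0.34%
l_percent = round(percent(1, n_total), 2)
print(l_percent)  # 0.34
```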

Lineages belonging to mega-haplogroup M among members of the five sampled ethnic groups of Dir and Swat Districts occurred with highest frequency among Gujars (45%), followed by Kohistanis (38%), Utmankheils (33%), Tarklanis (23%) and Yousafzais (21%). Frequencies of lineages within mega-haplogroup M among ethnic groups of Pakistan previously reported by other researchers range from a high of 33% among Baluchis, to 30.9% among Pathans resident within the Federally Administered Tribal Areas, to 30.4% among Sindhis, to 28% among Pashtuns, to 26.8% among Pashtuns of Charsada and Mardan Districts, to 22.7% among Burushos from Hunza, to 21.78% in the Hazarwal population of Hazara Division, to 13% among Hazaras, to a low of 9.1% among Makranis (Sadia et al., 2016; Rakha et al., 2011; Whale, 2012; Nazia et al., 2016; Bhatti et al., 2016a; Bhatti et al., 2016b; Quintana-Murci et al., 2004). The proportion of mtDNA lineages assignable to mega-haplogroup M among members of Afghan ethnic groups reported by other researchers ranges from a high of 15% among Hazaras, to 13.3% among Baluchis and 7.1% among Pashtuns, to complete absence among Tajiks (Whale, 2012). Lineages assignable to mega-haplogroup M are predominant among ethnic groups of peninsular India, occurring in 60-70% of the population there and in 26-64% across the Indian sub-continent (Chandrasekar et al., 2009; Metspalu et al., 2004; Quintana-Murci et al., 2004; Kivisild et al., 1999).

Lineages attributable to mega-haplogroup N were found among 5% of the 298 sampled individuals from Dir and Swat Districts. Frequencies were highest for Kohistanis (8%), Gujars (7%), and Yousafzais (6%), while frequencies were much lower among Tarklanis and Utmankheils, at 3% each. The frequency of mtDNA lineages attributable to mega-haplogroup N among other Pakistani ethnic groups reported by other researchers ranges from a high of 15.56% in the Hazarwal population of Hazara Division, to 8.6% among Pashtuns from Districts Charsada and Mardan, to 7.8% among Pathans residing within the Federally Administered Tribal Areas, 6.9% among Sindhis, 5.2% among Baluchis, 3% among Pashtuns from Khyber Pakhtunkhwa and Makranis of Sindh, 2.6% among Brahuis, to a low of 2.3% among the Burushos of Hunza (Tabassum et al., 2016; Akbar et al., 2016; Bhatti et al., 2016a; Bhatti et al., 2016b; Whale, 2012; Rakha et al., 2011; Quintana-Murci et al., 2004). Its prevalence among the major ethnic groups of Afghanistan has been reported by other researchers as ranging from a high of 10.5% among Tajiks to 7.5% among Hazaras, with an overall frequency of 5.9% in the Afghan population as a whole (Whale, 2012). The prevalence of M, N and R lineages within the five population samples of the present study from Swat and Dir districts, other Pakistani populations and the neighboring populations of Afghanistan and India may indicate that these areas were among the first places where humans settled after their dispersal from Africa (Chandrasekar et al., 2009).

No common haplotypes were observed among members of the five sampled ethnic groups of Dir and Swat Districts, but various specific haplotypes were identified, among which H2a was the most prevalent, being found in 26.6% of individuals, followed by M30 (25.3%), U2a (21.4%), M3 (19%), M6 (17.6%), B4a (15.9%), J1b (12.5%), U4a (12.2%), T1a (10.3%), T (10%), M3a (9.74%), W (8.6%), R5a (7.94%), H17c (7%), D4p (6.8%), U2e (6.4%), U7a (6.14%), T2b (5.8%), HV12b, H1e (5.4%), G2a (5.4%), M4 (5.4%), D4e (5%) and N1a (5%). Among these haplotypes, H2a was observed to be the most common among Yousafzai and Tarklani individuals, M6 was the most common among Gujar and Kohistani individuals, while M30 was the most common haplotype observed among Utmankheil individuals.

Haplotype H2a is predominant in European and West Eurasian populations (Brotherton et al., 2013; Loogvali et al., 2004), M6 is frequently reported in the Indus Valley (Metspalu et al., 2004), M30 is India-specific (Maji, 2009), while U2a is restricted to South Asia (Quintana-Murci et al., 2004). The prevalence of haplotype H2a among Yousafzais and Tarklanis suggests that the maternal gene pools of these two populations are derived from West Eurasian populations. The predominance of haplotype M6 among Gujars and Kohistanis may indicate maternal gene flow from ethnic groups occupying the Indus Valley, while the high prevalence of haplotype M30 among Utmankheils suggests a general South Asian influence on their maternal gene pool. The MDS graphs depict Kohistanis as clear outliers relative to the four other sampled ethnic groups of Dir and Swat Districts. Such results may be a consequence of the fact that Kohistanis are highly endogamous and genetically isolated relative to the other sampled ethnic groups included in this study.

Our analysis of patrilineal genetic diversity among members of the five sampled ethnic groups of Dir and Swat Districts yielded several interesting insights. First, the level of Y-STR haplotype diversity within each ethnic group was found to be generally high and comparable to average global values (Purps et al., 2014), except for the Utmankheil sample, which displays less diversity and fewer unique haplotypes (Table 29). Second, the five groups are marked by an extreme level of genetic differentiation, both among themselves (Table 30, Fig. 63) and in relation to other population groups in the region (Fig. 65). Based on all 27 loci, the average FST between these five ethnic groups is very high at 0.34 (Table 30), with an extreme FST of 0.60 observed between Tarklanis and Utmankheils (Table 30). The mid-range FST values (0.1-0.3) found between some of the ethnic groups (i.e., Gujar–Kohistani, Tarklani–Kohistani, Yousafzai–Kohistani) are comparable to genetic distances reported previously between population groups from the Indo-Pakistani sub-continent (Perveen, 2014; Seema et al., 2011; Alam et al., 2010) and the Middle East (Triki-Fendri, 2015). However, the extreme genetic distances (FST > 0.35) observed in several of the pairwise comparisons are unusual and higher than those observed between most human populations, even those occupying different continents (Purps et al., 2014). Small sample sizes can inflate genetic distances and, with just 20 sampled individuals from each group, the FST values should be interpreted with some caution. However, we note that such extreme genetic distances have been observed previously between other ethnic groups living in relative proximity (Zeng et al., 2014) when they have experienced prolonged and severe genetic isolation coupled with long-standing endogamy (Zeng et al., 2014; Roewer, 2013; Gaikwad et al., 2006). As such, it is perhaps not unexpected to observe such great genetic distances between the ethnic groups of Swat and Dir districts given their isolated residential localities, their cultural preferences for endogamous marriages, and their differences in subsistence practices, lifestyles and languages, especially among Pashtuns, Gujars and Kohistanis (Barth, 1956). The high differentiation could be a result of male founder effects (see below) and might not be mirrored in genome-wide autosomal data, but further studies are needed to clarify this. Nevertheless, our results indicate that isolated lifestyles and cultural preferences can have a very large impact on genetic distances between populations residing in close geographic proximity.
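
To make the notion of an average pairwise FST concrete, the following Python sketch averages the values over all distinct pairs of groups; with five groups there are ten such pairs. This is only an illustrative sketch, not the software or procedure used in this study. The function name, the group labels and the FST values in the example are hypothetical placeholders and are not taken from Table 30.

```python
from itertools import combinations
from math import comb

def mean_pairwise_fst(groups, fst):
    """Average FST over all unordered pairs of groups.

    `fst` maps a frozenset of two group labels to their pairwise FST.
    """
    pairs = list(combinations(groups, 2))
    total = sum(fst[frozenset(pair)] for pair in pairs)
    return total / len(pairs)

# Five groups give C(5, 2) = 10 pairwise comparisons
n_pairs_five_groups = comb(5, 2)

# Hypothetical three-group example (placeholder values, not study data)
toy_groups = ["A", "B", "C"]
toy_fst = {
    frozenset(("A", "B")): 0.10,
    frozenset(("A", "C")): 0.20,
    frozenset(("B", "C")): 0.30,
}
toy_mean = mean_pairwise_fst(toy_groups, toy_fst)
print(n_pairs_five_groups, round(toy_mean, 2))
```

Because the mean is taken over only ten comparisons, a single extreme pair can shift it noticeably, which is one more reason to read the average alongside the individual pairwise values.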

The genetic distinction between members of these ethnic groups is further underscored by differences in haplogroup frequencies (Table 29). The only haplogroup shared by members of all five population samples is R1a-M417,Page7, which is not surprising as this haplogroup occurs widely throughout the Eurasian continent, especially among populations found in Central Asia and the Indo-Pakistani sub-continent (Pamjav et al., 2012; Karafet et al., 2008; Novelletto, 2007; Sengupta et al., 2006).

It is widely recognized that cultural factors, such as language and group associations, can sometimes play a role in shaping the genetic structure among human populations, especially those found in remote areas where populations are small and isolated physically (Ayub et al., 2009; Gaikwad et al., 2006). The AMOVA results confirm that this is also the case for the Indo-Pakistani sub-continent, where 4.1% of the genetic variation is explained by ethnicity whereas only 1.6% is explained by origin. Members of the studied ethnic groups were found to be more similar genetically to population samples assigned to their respective ethnicity than to population samples obtained in the same geographic location (Figure 65, Table 30).

Unlike Gujars, Kohistanis and especially Utmankheils, Tarklanis and Yousafzais cannot be differentiated from each other genetically with the 23 analyzed Y-STR markers (FST = 0.008, Table 30), and the SNP data show that the majority of these individuals carry variants of haplogroup R1a-M417,Page7 that are intermingled in a loosely defined group in the network (Fig. 64). This haplogroup is common today among Europeans, Central Asians, and many of the ethnic groups of South Asia (Sengupta et al., 2006; Kivisild et al., 2003; Jobling and Tyler-Smith, 2003). Recent studies have dissected the R1a-M417,Page7 haplogroup in greater detail (Kivisild et al., 2015; Pamjav et al., 2012). It is reasonable to hypothesize that the Pakistani individuals from this study assigned to haplogroup R1a-M417,Page7 belong to one of the sub-haplogroups of R1a-Z95, such as R1a-Z2125, R1a-M560, or R1a-M780 (Underhill et al., 2015). According to the local people, Tarklanis and Yousafzais are distinct subgroups of Pashtuns (Fig. 8), but several studies have suggested that there are cultural and linguistic similarities between them (Caroe, 1992; Khan, 2008), which are clearly mirrored in our genetic data. The results suggest that both historic and current gene flow between members of these sub-groups (i.e., patrilineal clans) prevails despite their current residency in remote areas of the Hindu Kush-Hindu Raj highlands. In addition, neither of these two populations was significantly different from Pashtun individuals from Afghanistan after Bonferroni correction (Fig. 65).
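
The Bonferroni correction mentioned above can be sketched in a few lines of Python: each p-value is compared against the significance level divided by the number of tests. This is a minimal sketch rather than the analysis pipeline used in this study. The function name and the example p-values are hypothetical, and the significance level of 0.05 is an assumption, since the text does not state the level used.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag which tests remain significant after Bonferroni correction.

    Each p-value is compared against alpha / m, where m is the number of
    tests. alpha=0.05 is an assumed default, not a value from the thesis.
    """
    m = len(p_values)
    if m == 0:
        return []
    threshold = alpha / m
    return [p < threshold for p in p_values]

# Hypothetical p-values from four pairwise comparisons (not study data)
toy_p_values = [0.001, 0.02, 0.04, 0.30]
flags = bonferroni_significant(toy_p_values)
print(flags)  # [True, False, False, False]
```

In this toy example the threshold is 0.05 / 4 = 0.0125, so comparisons with p-values of 0.02 and 0.04, which would pass an uncorrected 0.05 cutoff, are no longer counted as significant.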

Utmankheils are also considered Pashtuns (Fig. 8), but with FST distances ranging between 0.45 and 0.60 (23 loci) from the other four population samples from Dir and Swat (Table 30) and distances ranging between 0.21 and 0.56 (10 loci) to populations from the Indo-Pakistani sub-continent and Southwest Asia, they are genetically very different from any other sample included in this study (Fig. 65). This is also reflected in the haplogroup networks, where most Utmankheils form a very distinct cluster within haplogroup G-Page94 (Figs. 64 and 67, Table 29). This haplogroup is common in the Caucasus but is also found in medium to low frequencies in the Middle East and southern Europe (Nijjar, 2008; Kivisild et al., 2003). Consequently, the Utmankheils can be considered a genetic outlier within the Indo-Pakistani sub-continent or even Eurasia (Fig. 66), at least in regard to the Y-chromosome. Such results suggest that they either have a different genetic origin than the members of the other Pashtun sub-tribes included here or that the Utmankheil male lineage has been subjected to severe genetic drift, due to a male founder effect or genetic bottleneck followed by isolation. The latter scenario is perhaps supported by the lower genetic diversity observed among Utmankheils relative to that seen among members of the other groups (Table 29). These results are particularly interesting and are consistent with the suggestion that members of the current Utmankheil clan are all descendants of a single adopted son of unknown origin (Barfield, 2010; Caroe, 1992). This could explain the apparent genetic isolation of the Utmankheil male lineage, although the presence of other Y-SNP haplogroups in the population sample (Table 29) indicates that at least some male-mediated gene flow must have occurred in either ancient or recent times or that the bottleneck was not quite as dramatic as proposed (i.e., one male). We note that our findings do not question the ethnic descriptions of the Utmankheils as a sub-ethnic group of the Pashtuns, but rather underline the fact that close cultural associations may easily arise without a closely shared genetic history.

The Gujar population sample is also highly differentiated genetically but shares relatively close affinities with Baluchi population samples from Afghanistan and Pakistan (Fig. 65). This observation could support previously suggested cultural connections, such as a shared transhumant lifestyle and marriages (Adamec, 2011; Nijjar, 2008; Barth, 1956), between Gujar and Baluch populations despite rather profound linguistic differences (Grierson, 1903-1928; Strand, 1973; Morgenstierne, 1932). The high proportion of individuals sharing haplogroup L1-M22(xM274) could again be the result of strong genetic drift. This haplogroup is today found in West Asia and the Indo-Pakistani sub-continent (Kivisild et al., 2003; Jobling and Tyler-Smith, 2003). Although speculative, the data could also indicate recent gene flow between Gujars and Kohistanis, which may be due to a type of symbiotic relationship that arose between members of these two ethnic groups, since they share haplotypes within haplogroups H1-M69, G2a-L30(xL14, L13, M278), and L1-M22(xM274) (Table 29 and Figure 67B). Haplogroup L1-M22(xM274) is found in low frequency among Kohistanis but is the most frequent haplogroup among Gujars, and thus recent paternal gene flow from Gujars to Kohistanis can be hypothesized. However, more data are needed to clarify this.

In contrast to the other four ethnic groups included in this study, Kohistanis are more genetically diverse and not significantly different from a wide array of population samples from the Indo-Pakistani sub-continent (Table 29, Figs. 65 and 67A). However, the exact relationship within haplogroup H1-M69 (the most frequent haplogroup among Kohistanis) between Kohistanis and members of the other ethnic groups of Pakistan and Afghanistan is unclear. This is possibly because the individuals assigned to the network in which haplogroup H1-M69 is included may encompass a large range of (sub-)haplogroups, depending on the subset of Y-SNPs characterized in individual studies. The result could suggest that the term “Kohistani” may have less biological meaning than the other ethnic group identifiers. After all, Gujar refers to a specific caste of herders, while Tarklani, Utmankheil and Yousafzai refer to patrilineal clans. Kohistani merely refers to a resident of a particular region, which may impose no specific demands with regard to suitable marriage partners (at least not to the degree seen in the other four ethnic groups), and Kohistanis are therefore found to be genetically admixed in our study.

Conclusions

In the current doctoral thesis, a total of 14 tooth-trait combinations defined by the Arizona State University Dental Morphology System were investigated in 823 individuals belonging to five ethnic groups (Gujars, Kohistanis, Yousafzais, Tarklanis and Utmankheils) residing in Dir and Swat Districts. Gujars, Kohistanis and Utmankheils tended to exhibit affinities to the Chalcolithic era inhabitants of Harappa, located within the Indus Valley. Yousafzais were found to exhibit close affinities to ethnic groups residing within the Hindu Kush-Karakoram highlands, while Tarklanis were found to possess no close affinities to members of the other samples included in the analysis. These results were confirmed by multidimensional scaling. Principal coordinate analysis yielded a somewhat different picture. In this case Kohistanis, Yousafzais, and Gujars were identified as possessing affinities to one another. It was concluded from the results of the dental morphology analysis that the immigrant Pashtun groups (i.e., Tarklanis, Utmankheils, Yousafzais) were likely small in number and, upon their arrival in Pakistan, intermarried extensively with members of local groups, especially those occupying these highlands. On the other hand, Kohistanis are not closely related to these immigrants but share affinities with other highland ethnic groups, while the affinities of Gujars provide a clue to their Indus Valley origin. In short, all three ethnic groups, i.e., Pashtuns, Gujars and Kohistanis, have retained their distinctiveness.

For molecular characterization, the five sampled ethnic groups were screened for mtDNA haplogroups. The high frequency of West Eurasian haplogroups among four of the ethnic groups (Yousafzais, Kohistanis, Utmankheils and Tarklanis) reveals that these populations have greater affinity with the West Eurasian gene pool and are also closely related to each other, while the presence of South Asian lineages among Gujar individuals confirms their affinities to ethnic groups of the Indus Valley and beyond in peninsular India, as attested by the results of dental morphology. The occurrence of lineages assignable to maternal mega-haplogroup R among members of the five sampled ethnic groups of Dir and Swat Districts also confirms that the inhabitants of northern Pakistan share their gene pool with West Eurasians and Europeans. This genetic influx might be due to the Neolithic and Paleolithic dispersal of populations from West Eurasia to South Asia through Iran and along the Arabian Sea coast. Our results show that most of the ethnic groups exhibit a high proportion of individuals possessing a West Eurasian haplogroup, followed by those possessing haplogroups of South Asian origin. East Asian, Southeast Asian, Southern European and Central Asian lineages are all quite rare in the maternal gene pools of the five sampled ethnic groups of Dir and Swat Districts.

We have also characterized the genetic diversity of paternal lineages for members of the same five ethnic groups residing within the mountainous Dir and Swat Districts of the Khyber Pakhtunkhwa Province, Pakistan. With the exception of Tarklanis and Yousafzais, we have documented extreme levels of genetic differentiation of the male lineages between the groups. Such differences suggest either a lack of shared ancestry, perhaps due to several distinct ancient or historical migrations into this region, and/or bottlenecks and isolation events resulting in severe genetic drift in the local male gene pools. The Y-STR data presented here do not offer sufficient resolution to investigate these scenarios further, but the results provide a strong impetus to resolve the demographic history of this region with genome-scale analyses.

In concurrence with previous studies, we conclude that ethnicity provides a more accurate predictor of genetic associations than simple geographic propinquity. However, our data also illustrate a clear exception in that Utmankheils are not related to the other Pashtun groups analyzed. Thus, either the cultural association is a more recent phenomenon not explained by shared ancestry, or a founder event, such as a putative adoption among the Utmankheils followed by strong genetic drift, simply erased the genetic links but not the cultural ones. The overall results also indicate that these populations are strongly associated with West Eurasian and South Asian gene pools.

Recommendations

We analysed non-metric dental traits in the present study; further analysis based on odontometrics should also be carried out to provide a clearer picture of the five population samples from Swat and Dir districts. A cohort-based study of all the ethnic groups in Khyber Pakhtunkhwa Province, Afghanistan and other adjoining areas is recommended to provide insight into the overall patterns of biological affinities among members of these ethnic groups. The data produced provide a sound baseline for elaborating the historical profile and anthropological standing of the Pakistani people and for developing a sound database for personal genomics and personalized medicine.

REFERENCES

Achilli, A., C. Rengo, C. Magri, V. Battaglia, A. Olivieri, R. Scozzari, F. Cruciani, M. Zeviani, E. Briem, V. Carelli and P. Moral. 2004. The molecular dissection of mtDNA haplogroup H confirms that the Franco-Cantabrian glacial refuge was a major source for the European gene pool. Am J Hum Genet. 75(5): 910-918.

Adamec, L.W. 2011. Historical dictionary of Afghanistan: Scarecrow Press.

Ahmad, H. and Sirajuddin 1996. Ethnobotanical profile of Swat. Proc. First. Train. Workshop Ethnob. Appl. Conserve. Islamabad pp 202-206.

Ahmad, H., M. Ozturk, W. Ahmad and M. S. Khan. 2015. Status of Natural Resources in the Uplands of the Swat Valley Pakistan: In Climate Change Impacts on High-Altitude Ecosystems. Springer International Publishing. pp. 49-98.

Ahmad, H. and R. Ahmad. 2003. Agroecology and biodiversity of the catchment area of Swat River. The Nucleus 40: 67-76. 271.

Ahmad, H., I. Hussain, I. Ahmad and F. Rahman. 2001. Historical overview and ethnography of Swat Valley. Museum Bull. XIV: 23-27.

Ahmed, A. 1976. Millennium and Charisma among Pathans: A Critical Essay in Social Anthropology. Routledge and Kegan Paul, London.

Ahmed, M and A. Sirajuddin. 1996. Ethnobotanical profile of Swat. In Proceeding of first training workshop on Ethnobotany and its application to conservation. Islamabad, Pakistan.

Ahmed, M. 2014. Ancient Pakistan – an Archaeological History. Amazon, Create Space Independent Publishing Platform.

Akbar, N., H. Ahmad, M. S. Nadeem, N. Ali and M. Saadiq. 2015. An Efficient Procedure for DNA Isolation and Profiling of the Hyper Variable MtDNA Sequences. J Life Sci. 9: 530-534.

Alam, S., E. M. Ali, A. Ferdous, T. Hossain, M. M. Hasan and S. Akhteruzzaman. 2010. Haplotype diversity of 17 Y-chromosomal STR loci in the Bangladeshi population. Forensic Sci Int-Gen. 4(2):59-60.

Alechine, E., W. Schempp and D. Corach. 2016. Characterization of the AZF region of the Y chromosome in Native American haplogroup Q. J Sci Hum Art. 3(4): 7-58.

Ali, H., H. Ahmad, K.B. , M. Yousaf, B. Gul and I. Khan. 2012. Trade potential and conservation issues of medicinal plants in District Swat, Pakistan. Pak J Bot. 44(6): 1905-1912.

Ali, H., J. Shah and A. K. Jan. 2008. Medicinal value of family Ranunculaceae of Dir District, Pakistan. Pak J Bot. 39(4): 1037-1044.

Ali, I., M. Zahir and M. Qasim. 2005. Archaeological survey of district Chitral. Frontier Archaeology. 3: 91.

Ali, U and M.A. Khan. 1991. Origin and diffusion of settlements in Swat valley. Pak J Geogr. 1(1): 97-115.

Anderson, S., A. T. Bankier, B. G. Barrell, M. H. L. de Bruijin, A. R. Coulson, J. Drouin, I. C. Eperon, D. P. Nierlich, B. A. Roe, F. Sanger, P. H. Schrier, A. J. H. Smith, R. Staden and I. G. Young. 1981. Sequence and organization of human mitochondrial genome. Nature. 290: 457-465.

Andrews, R. M., I. Kubacka, P. F. Chinnery, R. N. Lightowlers, D. M. Turnbull and N. Howell. 1999. Reanalysis and revision of the Cambridge reference sequence for human mitochondrial DNA. Nat Genet. 23: 147.

Anonymous. 1998. District census report Malakand Agency-1998: Population Census Organization, Government of Pakistan, Islamabad.

Arif, M. 2014. An overview of archaeological research in Gandhara and its adjoining regions (Colonial and Post Colonial Period). J Asian Civil. 37:73–78.

Aslamkhan, M. 1996. Sapta Sindhvas: the Land of Seven Rivers. Lahore Mus Bull. 9: 59–67.

Ayub, Q and C. Tyler-Smith. 2009. Genetic variation in South Asia: assessing the influences of geography, language and ethnicity for understanding history and disease risk. Brief Funct Genomic Proteomic. 8: 395-404.

Baart, J. L and Z. M. Sagar. 2002. The Gawri language of Kalam and Dir Kohistan. Pp. 1-22.

Bailey, S. E. 2002. A closer look at Neanderthal postcanine dental morphology: the mandibular dentition. Anat Rec. 269(3): 148-156.

Bailey, S. E. 2004. A morphometric analysis of maxillary molar crowns of Middle-Late Pleistocene hominins. J Hum Evol. 47(3): 183-198.

Bailey, S. E., M. M. Skinner and J. J. Hublin. 2011. What lies beneath? An evaluation of lower molar trigonid crest patterns based on both dentine and enamel expression. Am J Phys Anthropol. 145(4): 505-518.

Bamshad, M., S. Wooding, B. A. Salisbury and J. C. Stephens. 2004. Deconstructing the relationship between genetics and race. Nat Rev Genet. 5(8): 598-609.

Bangash, S. 2012. Socio-economic conditions of post-conflict Swat: a critical appraisal. J Peace Dev II FATA Research Centre, Islamabad.

Barfield, T. 2010. Afghanistan: A cultural and political history. Princeton University Press.

Barth, F. 1956. Ecologic relationships of ethnic groups in Swat, North Pakistan. Am Anthropol. 58: 1079-1089.

Barth, F. 1959. Political Leadership among Swat Pathans. The Athlone Press, London.

Basham, A. L. 1963. The Wonder That Was India. Orient Longmans Limited, New Delhi.

BBC. 2010. Afghanistan Country Profile. Retrieved November 10th, 2010, from http://news.bbc.co.uk/1/hi/world/south_asia/country_profiles/1162668.stm

Behar, D., E. Metspalu, T. Kivisild, S. Rosset and S. Tzur. 2008. Counting the founders: the matrilineal genetic ancestry of the Jewish Diaspora. PLoS One. 3(4): e2062.

Bekada, A., R. Fregel, M. V. Cabrera, M. J. Larruga, J. Pestano, S. Benhamamouch and M. A. Gonzalez. 2013. Introducing the Algerian mitochondrial DNA and Y-chromosome profiles into the North African landscape. PLoS One. 8(2): e56775.

Bellew, H. W. 1994. A General Report on the Yousafzais. 1864; 3rd edn. Lahore: Sang-e-Meel Publications. p. 97.

Bernhard, W. 1983. Ethnogenesis of South Asia with special reference to India. Anthropol Anz. 93-110.

Berta, P., R. J. Hawkins, H. A. Sinclair, A. Taylor, L. B. Griffiths, N. P. Goodfellow and M. Fellous. 1990. Genetic evidence equating SRY and the testis-determining factor. Nature. 348: 448-450.

Bhatti, S., M. Aslamkhan, M. Attimonelli, S. Abbas and H. H. Aydin. 2016a. Mitochondrial DNA variation in the Sindhi population of Pakistan. Aust J Forensic Sci. 48: 115–130.

Bhatti, S., M. Aslamkhan, S. Abbas, M. Attimonelli, H. H. Aydin and S. M. E. de Souza. 2016b. Genetic analysis of mitochondrial DNA control region variations in four tribes of Khyber Pakhtunkhwa, Pakistan. Mitochondr DNA. 1-11.

Birdwood. 1959. A History of the Pathans: Review. Geogr J. 125: 414-416.

Blaylock, S. R. 2008. Are the Kho an indigenous population of the Hindu Kush? A dental morphology investigation. Unpublished Master’s Thesis. California State University, Bakersfield.

Bodmer, W. 2015. Genetic characterization of human populations: from ABO to a genetic map of the British people. Genetics. 199(2): 267-279.

Bohner, J and V. Lucarini. 2015. Prevailing climatic trends and runoff response from Hindukush-Karakoram-Himalaya, upper Indus basin. arXiv preprint arXiv:1503.06708.

Bolk, L. 1916. Problems of human dentition. Am J Anthropol. 19: 91-148.

Bolk, L. 1922. On the relationship between reptilian and mammalian teeth. Odontological Essays IV. J Anat. 56: 107-136.

Bouckaert, R., P. Lemey, M. Dunn, S. J. Greenhill, A. V. Alekseyenko, A. J. Drummond, R. D. Gray, M. A. Suchard and Q. D. Atkinson. 2012. Mapping the origins and expansion of the Indo-European language family. Science. 337:957–960.

Brandon, M. C., E. Ruiz-Pesini, D. Mishmar, V. Procaccio, M. T. Lott, K. C. Nguyen, S. Spolim, U. Patil, P. Baldi and D. C. Wallace. 2009. Mitomaster: a bioinformatics tool for the analysis of mitochondrial DNA sequences. Hum Mutat. 30(1): 1-6.

Brandt, G., W. Haak, C. J. Adler, C. Roth, A. Szecsenyi-Nagy, S. Karimnia, S. Moller-Rieker, H. Meller, R. Ganslmeier, S. Friederich, V. Dresely, N. Nicklisch, J. K. Pickrell, F. Sirocko, D. Reich, A. Cooper and K. W. Alt. 2013. Ancient DNA reveals key stages in the formation of Central European mitochondrial genetic diversity. Science. 342: 257–261.

Bridge and R. Allchin. 1982. The Rise of Civilization in India and Pakistan. Cambridge University Press. p. 306.

Brook, A. H and M. Scheers. 2006. Variations of Tooth Root Morphology in a Romano-British Population. Dent Anthropol. 19(2): 33-38.

Brotherton, P., W. Haak, J. Templeton, G. Brandt, J. Soubrier, C. J. Adler, S. M. Richards, C. Der Sarkissian, R. Ganslmeier, S. Friederich and V. Dresely. 2013. Neolithic mitochondrial haplogroup H genomes and the genetic origins of Europeans. Nat Commun. 4: 1764.

Brown, W. M., M. George and C. A. Wilson. 1979. Rapid evolution of animal mitochondrial DNA. Proc Natl Acad Sci. 76: 1967–71.

Bryant, E and E. F. Bryant. 2001. The quest for the origins of Vedic culture: the Indo-Aryan migration debate. Oxford University Press.

Buikstra, J. E., S. R. Frankenberg and L. W. Konigsberg. 1990. Skeletal Biological Distance Studies in American Physical Anthropology: Recent Trends. Am J Anthropol. 82: 1-7.

Busby, G. B., F. Brisighelli, P. Sanchez-Diz, E. Ramos-Luis, C. Martinez-Cadenas, G. M. Thomas, G. D. Bradley, L. Gusmao, B. Winney, W. Bodmer and M. Vennemann. 2012. The peopling of Europe and the cautionary tale of Y chromosome lineage R-M269. Proc R Soc B. 279(1730): 884-892.

Butler, J. 2005. Forensic DNA Typing: Biology, Technology, and Genetics of STR Markers. 2nd Edition. London. Elsevier Academic Press.

Butler, J. M. 2011. Y-chromosomal DNA testing. In: Advanced topics in forensic DNA typing: Methodology, London: Academic Press. 371–403.

Butler, J. M. 2012. Advanced Topics in Forensic DNA Typing: Methodology. Elsevier Academic Press. ISBN 978-0-12-374513-2.

Cann, H. M., C. De Toma, L. Cazes, F. M. Legrand, V. Morel, L. Piouffre, J. Bodmer, F. W. Bodmer, B. Bonne-Tamir, A. Cambon-Thomsen and Z. Chen. 2002. A human genome diversity cell line panel. Science. 296(5566): 261-262.

Cann, R., M. Stoneking and A. Wilson. 1987. Mitochondrial DNA and Evolution. Nature. 325: 31-36.

Carabelli, G. 1842. Anatomie des Mundes. Braumüller und Seidel, Wien press.

Caroe, O. 1958. The Pathans. Oxford University Press, London.

Caroe, O. 1976. The Pathans: 550 B.C.–A.D. 1957. Oxford University Press, London.

Caroe, O. 1992. The Pathans (1958). Karachi: Oxford University Press.

Cavalli-Sforza, L. L., P. Menozzi, and A. Piazza. 1994. The History and Geography of Human Genes. Princeton. Princeton University Press.

Chandrasekar, A., S. Kumar, J. Sreenath, N. B. Sarkar, P. B. Urade, S. Mallick, S. S. Bandopadhyay, P. Barua, S. S. Barik, D. Basu, U. Kiran, P. Gangopadhyay, R. Sahani, R. V. B. Prasad, S. Gangopadhyay, R. G. Lakshmi, R. R. Ravuri, K. Padmaja, N. P. Venugopal, M-B. Sharma and R. V. Rao. 2009. Updating Phylogeny of Mitochondrial DNA Macrohaplogroup M in India: Dispersal of Modern Human in South Asian Corridor. PLoS One. 4: e7447.

Chang, X., Z. Wang, P. Hao, Y. Y. Li and Y. X. Li. 2010. Exploring mitochondrial evolution and metabolism organization principles by comparative analysis of metabolic networks. Genome. 95(6): 339-344.

Chauhan, R. A. H. 2001. A short history of the Gurjars: past and present/by Rana Ali Hasan Chauhan.

Chennakrishnaiah, S., D. Perez, T. Gayden, L. Rivera, M. Regueiro and J. R. Herrera. 2013. Indigenous and foreign Y-chromosomes characterize the Lingayat and Vokkaliga populations of Southwest India. Gene. 526(2): 96-106.

Clark, G. A. and C. M. Willermet (eds). 1997. Conceptual Issues in Modern Human Origins Research. Transaction Publishers, New York.

Coningham, R and R. Young. 2015. The archaeology of South Asia: from the Indus to Asoka, c. 6500 BCE–200 CE: Cambridge University Press.

Consortium, Y. C. 2002. A nomenclature system for the tree of human Y-chromosomal binary haplogroups. Genome Res. 12: 339-348.

Cox, M., F. Mendez, T. Karafet, M. Pilkington, S. Kingan, G. Destro-Bisol, B. Strassmann and M. Hammer. 2008. Testing for Archaic Hominin Admixture on the X Chromosome: Model Likelihood for the Modern Human RRM2P4 Region from Summaries of Genealogical Topology Under the Structural Coalescent. Genetics 178: 427-437.

Crews, R. D. 2015. Afghan Modern: the history of a global nation. Cambridge, MA: Harvard University Press.

Cucina, A and V. Tiesler. 2003. Dental caries and antemortem tooth loss in the Northern Peten area, Mexico: A biocultural perspective on social status differences among the classic Maya. Am J Phys Anthropol. 122: 1-10.

Cunha, C., A. M. Silva, J. D. Irish, G. R. Scott, T. Tome and J. Marquez. 2012. Hypotrophic roots of the upper central incisors - a proposed new discrete dental trait. Dent Anthropol. 25(1): 8-14.

Cunliffe, B. 2015. By Steppe, Desert, and Ocean: The Birth of Eurasia. Oxford, United Kingdom: Oxford University Press.

Cunningham, A. 1865. Archaeological Survey Report of India, 1862-63. Delhi: Indological Book House. II. p. 73.

Dahlberg, A. A. 1945. The changing dentition of man. J Am Dent Assoc. 32: 676-690.

Dahlberg, A. A. 1951. “The dentition of the American Indian,” in Papers on the Physical Anthropology of the American Indian. Edited by W.S. Laughlin. New York: Viking Fund. 138-176.

Dahlberg, A. A. 1956. Materials for the establishment of standards for classification of tooth characteristics, attributes, and techniques in morphological studies of the dentition. Chicago: University of Chicago Press.

Damgaard, P. B., A. Margaryan, H. Schroeder, L. Orlando, E. Willerslev and E. M. Allentoft. 2015. Improving access to endogenous DNA in ancient bones and teeth. Sci Rep-UK. 5: 11184.

Dani, A. H. 1980. North-West Frontier Burial Rites in their wider archaeological setting. In: Loofs-Wissowa HHE, editor. The Diffusion of Material Culture (Asian and Pacific Archaeology Series 9). Manoa: University of Hawaii. p. 121–150.

Davey, M. J., D. Jeruzalmi, J. Kuriyan and M. O'Donnell. 2002. Motors and switches: AAA+ machines within the replisome. Nat Rev Mol Cell Bio. 3(11): 826-835.

DeGiorgio, M., M. Jakobsson and N. A. Rosenberg. 2009. Out of Africa: modern human origins special feature: explaining worldwide patterns of human genetic variation using a coalescent-based serial founder model of migration outward from Africa. Proc Natl Acad Sci. 106: 16057-16062.

Derenko, M., B. Malyarchuk, A. Bahmanimehr, G. Denisova and M. Perkova. 2013. Complete mitochondrial DNA diversity in Iranians. PLoS One. 8: e80673.

DeSantis, L. R. 2016. Dental microwear textures: reconstructing diets of fossil mammals. Surf Topogr: Metro Prop. 4(2): 023002.

Docherty, P. 2007. The Khyber Pass: a history of empire and invasion. New York: Union Square Press.

Drennan, M. R. 1929. The dentition of a Bushman tribe. Ann S Afr Mus. 24: 61-88.

Durrani, F., I. Ali and G. Erdosy. 1991. Further excavations at Rehman Dheri. Ancient Pakistan. 7:61–151.

Ebner, S., R. Lang, E. Mueller, W. Eder, M. Oeller, A. Moser, J. Koller, B. Paulweber, J. Mayr, W. Sperl, B. Kofler. 2011. Mitochondrial Haplogroups, Control Region Polymorphisms and Malignant Melanoma: A Study in Middle European Caucasians. PLoS One. 6:e27192.

Edgar, H. J. 2013. Estimation of ancestry using dental morphological characteristics. J Forensic Sci. 58: S3-S8.

Elphinstone, M. 2011. Account of the Kingdom of Caubul, and its dependencies in Persia, Tartary, and India: comprising a view of the Afghan Nation, and a history of the Dooraunee Monarchy. Cambridge University Press.

Endicott, P., S. Y. W. Ho and C. Stringer. 2010. Using genetic evidence to evaluate four palaeoanthropological hypotheses for the timing of Neanderthal and modern human origins. J Hum Evol. 59: 87–95.

Enoki, K and A. A. Dahlberg. 1958. Rotated maxillary central incisors. Orth J Jpn. 17: 157-159.

Eshed, V., A. Gopher and I. Hershkovitz. 2006. Tooth wear and dental pathology at the advent of agriculture: New evidence from the Levant. Am J Phys Anthropol. 130: 145-159.

Excoffier, L and A. Langaney. 1989. Origin and differentiation of human mitochondrial DNA. Am J Hum Genet. 44: 73–85.

Excoffier, L and H. E. Lischer. 2010. Arlequin suite ver 3.5: a new series of programs to perform population genetics analyses under Linux and Windows. Mol Ecol Resour. 10: 564-567.

Fan, L and Y. G. Yao. 2011. MitoTool: A web server for analysis and retrieval for mitochondrial DNA sequence variations. Mitochondrion. 2: 351–6.

FATA. 2010. Federally administered tribal area. (http://www.fata.gov.pk/).

Firasat, S., S. Khaliq, A. Mohyuddin, M. Papaioannou, C. Tyler-Smith, P. A. Underhill and Q. Ayub. 2007. Y-chromosomal evidence for a limited Greek contribution to the Pathan population of Pakistan. Eur J Hum Genet. 15(1): 121-126.

Flower, W. H. 1885. On the size of teeth as a character of race. Jour Roy Anthrop Inst. 14: 183-186.

Forster, P and S. Matsumura. 2005. Did Early Humans Go North or South? Science. 308: 965-966.

Forster, P. 2004. Ice Ages and the mitochondrial DNA chronology of human dispersals: a review. Philos Trans R Soc Lond B Biol Sci. 359: 255-64.

Foster, J. W and A. J. Graves. 1994. An SRY-related sequence on the marsupial X chromosome: implications for the evolution of the mammalian testis-determining gene. P Natl Acad Sci. 91: 1927-1931.

Fu, Q., A. Mittnik, P. L. Johnson, K. Bos, M. Lari, R. Bollongino, C. Sun, L. Giemsch, R. Schmitz, J. Burger and A. M. Ronchitelli. 2013. A revised timescale for human evolution based on ancient mitochondrial genomes. Curr Biol. 23(7): 553-559.

Gaikwad, S., T. Vasulu and V. Kashyap. 2006. Microsatellite diversity reveals the interplay of language and geography in shaping genetic differentiation of diverse Proto Australoid populations of west central India. Am J Phys Anthropol. 129: 260-267.

Garrigan, D and M. F. Hammer. 2006. Reconstructing human origins in the genomic era. Nat Rev Genet. 7: 669-80.

Geppert, M., M. Baeta, C. Nunez, B. Martínez-Jarreta, S. Zweynert, O. W. V. Cruz, F. González-Andrade, J. González-Solorzano, M. Nagy and L. Roewer. 2011. Hierarchical Y-SNP assay to study the hidden diversity and phylogenetic relationship of native populations in South America. Forensic Sci Int-Gen. 5(2): 100-104.

Glatzer, B. 1998. Being Pashtun-being Muslim: concepts of person and war in Afghanistan. Essays on South Asian Society: Culture and Politics II. 83-94.

Glatzer, B. 2002. The Pashtun tribal system. Concept of tribal society. 5: 265-282.

Goldstein, D. B., L. A. Zhivotovsky, K. Nayar, A. R. Linares, L. L. Cavalli-Sforza and W. M. Feldman. 1996. Statistical properties of the variation at linked microsatellite loci: implications for the history of human Y chromosomes. Mol Biol Evol. 13 (9): 1213-1218.

Gomes, V., P. Sanchez-Diz, A. Amorim, A. Carracedo and L. Gusmao. 2010. Digging deeper into East African human Y chromosome lineages. Hum Genet. 127: 603-613.

Gomez-Robles, A., M. Martinon-Torres, J. M. B. de Castro, L. Prado, S. Sarmiento and J. L. Arsuaga. 2008. Geometric morphometric analysis of the crown morphology of the lower first premolar of hominins, with special attention to Pleistocene Homo. J Hum Evol. 55(4): 627-638.

Gomez-Robles, A., M. Martinon-Torres, J. M. B. de Castro, A. Margvelashvili, M. Bastir, J. L. Arsuaga, A. Perez-Perez, F. Estebaranz and L. M. Martinez. 2007. A geometric morphometric analysis of hominin upper first molar shape. J Hum Evol. 53(3): 272-285.

Gooch, P. 1992. Transhumant pastoralism in Northern India: the Gujar case. Nomadic Peoples. pp.84-96.

Government of Pakistan. 2002. District census report Swat: Population Census Organization, Government of Pakistan, Islamabad.

Gower, J. C. 1966. Some distance properties of latent root and vector methods used in multivariate analysis. Biometrika. 53: 325-338.

Green, R., J. Krause, W. A. Briggs, T. Maricic, U. Stenzel, M. Kircher, N. Patterson, H. Li, W. Zhai, M. Hsi-Yang Fritz, F. N. Hansen, Y. E. Durand, A-S. Malspinas and D. J. Jensen. 2010. A Draft Sequence of the Neanderthal Genome. Science. 328: 710-722.

Gregory, W. K. 1916. Studies in the evolution of the Primates. Part I. Cope-Osborn theory of trituberculy and the ancestral molar patterns of the Primates. Part II. Phylogeny of recent and extinct anthropoids, with special reference to the origin of man. Bull Am Mus Nat Hist. 35: 259-355.

Gregory, W. K. 1922. The origin and evolution of the human dentition. Baltimore: Williams and Wilkins.

Gregory, W. K. 1926. Paleontology of the human dentition. Am J Phys Anthropol. 9: 401-426.

Grierson, G.A. 1903-1928. Linguistic Survey of India (11 volumes). Calcutta: Office of the Superintendent Government Printing press, India.

Grimes, B. F. 1992. Ethnologue: languages of the world. Summer Institute of Linguistics, Dallas.

Grimes, B. F. and J. E. Grimes. 2000. Languages of the world. 14th ed. Ethnologue (1). Dallas: SIL International.

Guimaraes-Ferreira, L. 2014. Role of the phosphocreatine system on energetic homeostasis in skeletal and cardiac muscles. Einstein. 12(1): 126-131.

Guttman, L. 1968. A general nonmetric technique for finding the smallest coordinate space for a configuration of points. Psychometrika. 33(4): 469-506.

Haak, W., I. Lazaridis, N. Patterson, N. Rohland, S. Mallick, B. Llamas, G. Brandt, S. Nordenfelt, E. Harney, K. Stewardson and Q. Fu. 2015. Massive migration from the steppe was a source for Indo-European languages in Europe. Nature. 522(7555): 207-211.

Haber, M., D. E. Platt, M. A. Bonab, S. C. Youhanna, D. F. Soria-Hernanz, B. Martínez-Cruz, B. Douaihy, M. Ghassibe-Sabbagh, H. Rafatpanah, M. Ghanbari, J. Whale, O. Balanovsky, R. S. Wells, D. Comas, C. Tyler-Smith and P. A. Zalloua. 2012. Afghanistan’s ethnic groups share a Y-chromosomal heritage structured by historical events, Genographic Consortium. PLoS One. 7: e34288.

Haber, M., M. Mezzavilla, Y. Xue and C. Tyler-Smith. 2016. Ancient DNA and the rewriting of human history: be sparing with Occam's razor. Genome Biol. 17: 1–8.

Haeussler, A. M. 1989. Morphological and Metrical Comparison of San and Central Sotho Dentitions from Southern Africa. Am J Anthropol. 78: 115-122.

Hamayun, M. 2005. Ethnobotanical profile of Utror and Gabral valleys, district Swat, Pakistan. Ethnobotanical Leaflets. (1): 1-37. (http://www.siu.edu/~ebl/).

Hammer, M. F., A. E. Woerner, F. L. Mendez, J. C. Watkins and J. D. Wall. 2011. Genetic evidence for archaic admixture in Africa. P Natl Acad Sci. 108: 15123–28.

Hammer, M. F., M. T. Karafet, J. A. Redd, H. Jarjanazi, S. Santachiara-Benerecetti, H. Soodyall and L. S. Zegura. 2001. Hierarchical patterns of global human Y-chromosome diversity. Mol Biol Evol. 18: 1189-1203.

Harris, E. F. 1977. Anthropologic and genetic aspects of the dental morphology of Solomon Islanders, Melanesia. PhD Dissertation, Arizona State University, Tempe.

Harvati, K., C. Stringer, R. Grun, M. Aubert, P. Allsworth-Jones and C. A. Folorunso. 2011. The Later Stone Age calvaria from Iwo Eleru, Nigeria: morphology and chronology. PLoS One. 6: e24024.

Hasegawa, M. and S. Horai. 1991. Time of the Deepest Root for Polymorphism in Human Mitochondrial DNA. J Mol Evol. 32: 37-42.

Hassanali, J. 1982. Incidence of Carabelli's trait in Kenyan Africans and Asians. Am J Phys Anthropol. 59(3): 317-319.

Hayat, S., T. Akhtar, M. H. Siddiqi, A. Rakha, N. Haider, M. Tayyab, G. Abbas, A. Ali, S. Y. A. Bokhari, M. A. Tariq. 2015. Mitochondrial DNA control region sequences study in Saraiki population from Pakistan. Leg Med (Tokyo). 17:140–144.

Hazrat, A., J. Shah, M. Ali and I. Iqbal. 2007. Medicinal value of Ranunculaceae of Dir Valley. Pak J Bot. 39(4): 1037-1044.

Heinz, T. M. 2015. Genetic ancestry of the Bolivian population: PhD thesis. Universidad de Santiago de Compostela.

Hellman, M. 1928. Racial characters in human dentition part I. A racial distribution of the Dryopithecus pattern and its modifications in the lower molar teeth of man. Proc Am Philos Soc. 67(2): 157-174.

Hemphill, B. E. 2009a. Bioanthropology of the Hindu Kush High Lands: A Dental Morphology Investigation. Pak Herit. 1: 19-36.

Hemphill, B. E. 2009b. The Swatis of Northern Pakistan-emigrants from Central Asia or colonists from peninsular India? A dental morphometric investigation. Am J Phys Anthropol. pp 147-147.

Hemphill, B. E. 2013. Grades, gradients, and geography: a dental morphometric approach to the population history of South Asia. In: Scott GR, Irish JD, editors. Anthropological Perspectives on Tooth Morphology: Genetics, Evolution, Variation. Cambridge: Cambridge University Press. pp. 341-387.

Hemphill, B. E., I. Ali, A. Hameed. 2010. Dental Anthropology of the Madaklasht I: A description and Analysis of Variation in Morphological Features of the Permanent Tooth Crown. Pak Herit. 2: 1-33.

Hemphill, B. E., J. R. Lukacs and S. R. Walimbe. 2000. Ethnic Identity, Biological History and Dental Morphology: Evaluating the Indigenous Status of Maharashtra’s Mahars. Antiquity. 74: 671-681.

Hemphill, B.E. 2012. Tooth Size, Crown Complexity, and the Utility of Combining Archaeologically-derived Samples with Living Samples for Reconstruction of Population History. Pak Herit. 4: 49-85.

Hemphill, B.E., I. Ali, S. Blaylock and N. Willits. 2008. Are the Kho an Indigenous Population of the Hindu Kush?: A Dental Morphometric Approach. In: M. Tosi and D. Frenez (eds.), South Asian Archaeology 2008, Volume I. Oxford: Archaeopress-BAR, pp. 127-137.

Hemphill, B.E. 2012. The Awans of Northern Pakistan: Emigrants from Central Asia, Arabs from Western Afghanistan, or Colonists from Peninsular India? A Dental Morphometric Investigation. Am J Phys Anthropol (Suppl 54): 163.

Henn, B. M., L. R. Botigue, S. Peischl, I. Dupanloup, M. Lipatov, B. K. Maples and L. Excoffier. 2016. Distance from sub-Saharan Africa predicts mutational load in diverse human genomes. Proc Nat Acad Sci USA. 113(4): E440-E449.

Higgins, D and J. J. Austin. 2013. Teeth as a source of DNA for forensic identification of human remains: a review. Sci Justice. 53(4): 433-441.

Hillson, S. 1979. Diet and Dental Disease. World Archeol. 2: 147-162.

Hillson, S. 1996. Dental Anthropology, Cambridge. Cambridge University Press.

Hodgson, J and T. Disotell. 2008. No Evidence of a Neanderthal Contribution to Modern Human Diversity. Genome Biol. 9(2): 206.

Holland, M. M and T. J. Parsons. 1999. Mitochondrial DNA sequence analysis - validation and use for forensic casework. Forensic Sci Rev. 11: 21-50.

Hrdlicka, A. 1920. Shovel-shaped teeth. Am J Anthropol. 3: 429-465.

Hrdlicka, A. 1921. Further studies of tooth morphology. Am J of Phys Anthropol. 4(2): 141-176.

Hrdlicka, A. 1924. New data on the teeth of early man and certain fossil European apes. Am J Anthropol. 7: 109-132.

Hsu, J. W., P. Tsai, T. H. Hsiao, H. P. Chang, L. M. Lin, K. Liu, H. S. Yu and D. Ferguson. 1999. Ethnic dental analysis of shovel and Carabelli’s traits in a Chinese population. Aust Dent J. 44(1): 40-45.

Hudjashov, G., T. Kivisild, P. Underhill, P. Endicott, J. Sanchez, A. Lin, P. Shen, P. Oefner, C. Renfrew, R. Villems, P. Forster. 2007. Revealing the prehistoric settlements of Australia by Y chromosome and mtDNA Analysis. Proc Natl Acad Sci USA. 104: 8726-8730.

Hughes, J. F., H. Skaletsky, G. L. Brown, T. Pyntikova, T. Graves, S. R. Fulton, S. Dugan, Y. Ding, J. C. Buhay, C. Kremitzki and Q. Wang. 2012. Strict evolutionary conservation followed rapid gene loss on human and rhesus Y chromosomes. Nature. 483: 82-86.

Hussain, A. A. 1962. The Story of Swat as told by the Founder Miangul Abdul Wadud Badshah Sahib to Muhammad Arif Khan. Ferozsons Ltd, Peshawar.

Hussain, J. 1997. A history of the peoples of Pakistan towards independence. Oxford University Press, Karachi, Pakistan.

Hutchison, C. A., J. E. Newbold, S. S. Potter and M. H. Edgell. 1974. Maternal inheritance of mammalian mitochondrial DNA. Nature. 251:536–8.

Ilyas, M., J. S. Kim, J. Cooper, Y. A. Shin, H. M. Kim, Y. S. Cho, S. Hwang, H. Kim, J. Moon, O. Chung and J. Jun. 2015. Whole genome sequencing of an ethnic Pathan (Pakhtun) from the north-west of Pakistan. BMC Genomics. 16(1): 172.

Ingalls, D. H. 1976. Kalidasa and the Attitudes of the Golden Age. J Am Ori Soc. pp. 15-26.

Ingman, M., H. Kaessmann, S. Paabo and U. Gyllensten. 2000. Mitochondrial genome variation and the origin of modern humans. Nature. 408: 708–713.

International Crisis Group. 2006. Pakistan's Tribal Areas: Appeasing the Militants. International Crisis Group.

Irish, J. D and G. R. Scott. 2016. Crown wear: identification and categorization. In: A Companion to Dental Anthropology (ed). New York: Wiley Blackwell. pp. 415-432.

Irish, J.D., D. Guatelli-Steinberg, S.S. Legge, D.J. de Ruiter and L.R. Berger. 2013. Dental morphology and the phylogenetic “place” of Australopithecus sediba. Science. 340(6129): 1233062.

Jakobsson, M., S. W. Scholz, P. Scheet, J. R. Gibbs, J. M. VanLiere, H. C. Fung and Z. A. Szpiech. 2008. Genotype, haplotype and copy-number variation in worldwide human populations. Nature. 451: 998-1003.

Jiang, T., C. C. Hou, Y. Z. She and X. W. Yang. 2013. The SOX gene family: function and regulation in testis determination and male fertility maintenance. Mol Biol Rep. 40: 2187-2194.

Jobling, M. A. and C. Tyler-Smith. 2003. The human Y chromosome: an evolutionary marker comes of age. Nat Rev Genet. 4: 598-612.

Jobling, M. A., M. E. Hurles and C. Tyler-Smith. 2004. Human evolutionary genetics: Origins, Peoples and Disease. New York, Garland Publishing.

Johanson, D. 2001. Origins of Modern Humans: Multiregional or Out of Africa. American Institute of Biological Sciences.

Jorde, L. B., W. S. Watkins and M. J. Bamshad. 2001. Population genomics: a bridge from evolutionary history to genetic medicine. Hum Mol Genet. 10: 2199-207.

Kaifu, Y., M. Izuho, T. Goebel, H. Sato and A. Ono. 2015. Emergence and diversity of modern human behavior in Paleolithic Asia. College Station, TX: Texas A&M University Press.

Karafet, T. M., F. L. Mendez, M. B. Meilerman, P. A. Underhill and S. L. Zegura. 2008. New binary polymorphisms reshape and increase resolution of the human Y chromosomal haplogroup tree. Genome Res. 18: 830-838.

Kareem, M. A., O. A. Hussein and H. I. Hameed. 2015. Y-chromosome short tandem repeat, typing technology, locus information and allele frequency in different population: A review. Afr J Biotechnol. 14(27): 2175-2178.

Katoh, K and D. M. Standley. 2013. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol. 30(4): 772-780.

Kaul, V and S. Prakash. 1981. Morphological features of Jat dentition. Am J of Phys Anthropol. 54(1): 123-127.

Kennedy, K. A. R., N. C. Lovell and C. B. Burrow. 1986. Mesolithic Human Remains from the Gangetic Plains: Sarai Nahar Rai (Occasional Papers and Theses of the South Asia Program number 10), Cornell University, Ithaca.

Kennedy, K. A. R., J. Chiment, T. Disotell and D. Meyers. 1984. Principal-components Analysis of Prehistoric South Asian Crania. Am J Phys Anthropol. 64(2): 105-118.

Kenoyer, J. M. 2005. Culture Change during the Late Harappa Period at Harappa. In: Bryant, E.F. and Patton, L.L., (Ed.). The Indo-Aryan Controversy: Evidence and Inference in Indian History (pp. 21-49). London: Routledge.

Ketmaier, V and C. Bernardini. 2005. Structure of the mitochondrial control region of the Eurasian Otter (Lutra lutra; Carnivora, Mustelidae): Patterns of genetic heterogeneity and implications for conservation of the species in Italy. J Hered. 96(4): 318-328.

Khan, F. 2013b. Recent discovery of Petroglyphs at Parwak, District Chitral, Pakistan. J Asian Civil. 36:101–109.

Khan, M. N. 2013a. The early arrival of Muslims in Ancient Gandhara study based on numismatic evidence from Kashmir Smast. Gandharan Stud. 7:115–119.

Khan, T. M. 2008. The Tribal Areas of Pakistan, a Contemporary Profile. Sang-e-Meel Publications.

Khattak, M. H. K. 1997. Buner, The forgotten Part of Ancient Uddiyana. Noble Art Press Karachi. p.43

Kieser, J. A and C. B. Preston. 1981. The dentition of the Lengua Indians of Paraguay. Am J Phys Anthropol. 55(4): 485-490.

Kieser, J. A. 1984. Dental morphology of a Griqua skeletal population. Anthropol Anz. 93-99.

Kivisild, T. 2015. Maternal ancestry and population history from whole mitochondrial genomes. Investig Genet. 6(1): 3.

Kivisild, T., J. M. Bamshad, K. Kaldma, M. Metspalu, E. Metspalu, M. Reidla, S. Laos, J. Parik, W. S. Watkins, M. E. Dixon, S. S. Papiha, S. S. Mastana, R. M. Mir, V. Ferak and R. Villems. 1999. Deep Common Ancestry of Indian and Western-Eurasian Mitochondrial DNA Lineages. Curr Biol. 9: 1331-1334.

Kivisild, T., S. Rootsi, M. Metspalu, S. Mastana and K. Kaldma. 2003. The genetic heritage of the earliest settlers persists both in Indian tribal and caste populations. Am J Hum Genet. 72: 313-332.

Klein, R. 1999. The Human Career: Human Biological and Cultural Origins. University of Chicago Press.

Klein, R. G. 2008. Out of Africa and the Evolution of Human Behaviour. Evol Anthropol. 17:267-281.

Kloss-Brandstatter, A., D. Pacher, S. Schonherr, H. Weissensteiner, R. Binna, G. Specht and F. Kronenberg. 2011. HaploGrep: a fast and reliable algorithm for automatic classification of mitochondrial DNA haplogroups. Hum Mutat. 32(1): 25–32.

Kraus, B. S and M. Fur. 1953. Lower first premolars. J Dent Res. 32: 554-564.

Kraytsberg, Y., M. Schwartz, A. T. Brown, K. Ebralidse, S. W. Kunz, A. D. Clayton, J. Vissing and K. Khrapko. 2004. Recombination of human mitochondrial DNA. Science. 304(5673): 981.

Krithika, S., S. Maji and T. S. Vasulu. 2009. A microsatellite study to disentangle the ambiguity of linguistic, geographic, ethnic and genetic influences on tribes of India to get a better clarity of the antiquity and peopling of South Asia. Am J Phys Anthropol. 139(4): 533-546.

Kruskal, J. B. 1964. Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika. 29(1): 1-27.

Kumar, S., R. Reddy Ravuri, P. Koneru, P. B. Urade, N. B. Sarkar, A. Chandrasekar and R. V. Rao. 2009. Reconstructing Indian-Australian Phylogenetic Link. Evol Biol. 9: 173-177.

Kuzmina, E. E and V. H. Mair. 2008. The prehistory of the Silk Road: University of Pennsylvania Press.

Kayser, M. 2010. The human genetic history of Oceania: near and remote views of dispersal. Curr Biol. 20(4): 194-201.

Lachance, J., B. Vernot, C. C. Elbers, B. Ferwerda, A. Froment, M. J. Bodo, G. Lema, W. Fu, B. T. Nyambo, R. T. Rebbeck and K. Zhang. 2012. Evolutionary history and adaptation from high-coverage whole-genome sequences of diverse African hunter-gatherers. Cell. 150: 457–69.

Lahn, B. T and C. D. Page. 1997. Functional coherence of the human Y chromosome. Science. 278(5338): 675-680.

Lahr, M. M and R. Foley. 1994. Multiple dispersals and modern human origins. Evol Anthropol: Issues, News, and Reviews. 3(2): 48-60.

Lalata., Prasada and Pandeya. 1971. Sun-worship in ancient India. Motilal Banarasidass. p. 245.

Landsteiner, K. 1901. Ueber Agglutinationserscheinungen normalen menschlichen Blutes. Wien Klin Wchnschr. 14: 1132-1134.

Lapidus, I. M. 2002. A history of Islamic societies: Cambridge University Press.

Larmuseau, M. H., A. Van Geystelen, M. Kayser, M. van Oven and R. Decorte. 2015. Towards a consensus Y-chromosomal phylogeny and Y-SNP set in forensics in the next-generation sequencing era. Forensic Sci Int-Genet. 15: 39-42.

Lee, E. Y., J. K. Shin, A. Rakha, E. J. Sim, J. M. Park, Y. N. Kim, I. W. Yang and Y. H. Lee. 2014. Analysis of 22 Y chromosomal STR haplotypes and Y haplogroup distribution in Pathans of Pakistan. Forensic Sci Int-Genet. 11: 111-116.

Li, Z., C. J. Haines and Y. Han. 2008. “Micro-deletions” of the human Y chromosome and their relationship with male infertility. J Genet Genomics. 35(4): 193-199.

Lightowlers, R., P. Chinnery, D. Turnbull and N. Howell. 1997. Mammalian mitochondrial genetics: heredity, heteroplasmy and disease. Trends Genet. 13(11): 450-5.

Lindholm, C. 1982. Generosity and Jealousy: The Swat Pukhtun of Northern Pakistan. Columbia University Press, New York.

Liu, H. Y., C. P. Liao, T. K. Chuang and C. M. Kao. 2011. Mitochondrial targeting of human NADH dehydrogenase (ubiquinone) flavoprotein 2 (NDUFV2) and its association with early-onset hypertrophic cardiomyopathy and encephalopathy. J Biomed Sci. 18(1): 1.

Liu, H., F. Prugnolle, A. Manica and F. Balloux. 2006. A Geographically Explicit Genetic Model of Worldwide Human-Settlement History. Am J Hum Genet. 79: 230-237.

Liu, N and H. Zhao. 2006. A non-parametric approach to population structure inference using multilocus genotypes. Hum Genomics. 2(6): 253.

Loogvali, E. L., U. Roostalu, B. A. Malyarchuk, M. V. Derenko, T. Kivisild, E. Metspalu, K. Tambets, M. Reidla, H. V. Tolk, J. Parik and E. Pennarun. 2004. Disuniting uniformity: a pied cladistic canvas of mtDNA haplogroup H in Eurasia. Mol Biol Evol. 21(11): 2012-2021.

Lucas, P. W., J. P. Constantino and A. B. Wood. 2008. Inferences regarding the diet of extinct hominins: structural and functional trends in dental and mandibular morphology within the hominin clade. J Anat. 212: 486–500.

Lukacs, J. R., B. E. Hemphill and S. R. Walimbe. 1998. Are Mahars Autochthones of Maharashtra? Dental Morphology and Population History in South Asia. Human Dental Development, Morphology, and Pathology: A Tribute to Albert A. Dahlberg. pp. 119–53.

Lukacs, J. R and B. E. Hemphill. 1991. Dental anthropology of prehistoric Baluchistan: a morphometric approach to the peopling of South Asia. In: Kelley, M.A., Larsen, C.S. (Eds.), Advances in Dental Anthropology. Wiley-Liss, New York. pp. 77–119.

Lukacs, J. R. 1986. Dental morphology and odontometrics of early agriculturalists from Neolithic Mehrgarh, Pakistan. In: Russell, D. E., J. P. Santoro, D. Sigogneau-Russell (Eds.). Teeth Revisited: Proceedings of the VIIth International Symposium on Dental Morphology. Mémoires du Muséum national d'Histoire naturelle (série C). Paris. 53: 285–303.

Lukacs, J. R. 1987. Biological relationships derived from morphology of permanent teeth: recent evidence from prehistoric India. Anthrop Anz. 45: 97–116.

Lukacs, J. R and B. E. Hemphill. 1992. Chapter V: Dental Anthropology. In: Kennedy, K.A.R., et al. (Eds.), Human Skeletal Remains from Mahadaha: A Gangetic Mesolithic Site. South Asia Occasional Papers and Theses, No. 11. Cornell University, Ithaca, pp. 157–270.

Lukacs, J. R. 1983. Dental anthropology and the origins of two Iron Age populations from northern Pakistan. Homo Gottingen. 34(1): 1-15.

Macaulay, V., C. Hill, A. Achilli, C. Rengo, D. Clarke, W. Meehan, J. Blackburn, O. Semino, R. Scozzari, F. Cruciani, A. Taha, N. Kassim Shaari, J. Maripa Raja, P. Ismail, Z. Zainuddin, W. Goodwin, D. Bulbeck, H. J. Bandelt, S. Oppenheimer, A. Torroni and M. Richards. 2005. Single, Rapid Coastal Settlement of Asia Revealed by Analysis of Complete Mitochondrial Genomes. Science. 308: 1034-1036.

Macaulay, V., M. Richards, E. Hickey, E. Vega, F. Cruciani, V. Guida, R. Scozzari, B. Bonne-Tamir, B. Sykes and A. Torroni. 1999. The emerging tree of west Eurasian mtDNAs: a synthesis of control-region sequences and RFLPs. Am J Hum Genet. 64: 232-249.

Maji, S., S. Krithika and T. S. Vasulu. 2009. Phylogeographic distribution of mitochondrial DNA macrohaplogroup M in India. J Genet. 88(1): 127-139.

Maloney, C. 1974. Peoples of South Asia. Holt, Rinehart and Winston, New York.

Marado, L. M and V. Campanacho. 2013. Carabelli’s trait: definition and review of a commonly used dental non-metric variable. Cadernos do Geevh. 2(1): 24-39.

Marbaniang, D. 2015. History of Hinduism: Pre-vedic and Vedic Age. Raleigh, NC: Lulu.com.

Marean, C. W. 2015. An Evolutionary Anthropological Perspective on Modern Human Origins. Annu Rev Anthropol. 44: 533–56.

Margulies, M., M. Egholm, W. E. Altman, S. Attiya, J. S. Bader, L. A. Bemben, J. Berka, M. S. Braverman, Y. J. Chen, Z. Chen and S. B. Dewell. 2005. Genome sequencing in microfabricated high-density picolitre reactors. Nature. 437(7057): 376-380.

Margvelashvili, A., C. P. E. Zollikofer, D. Lordkipanidze, T. Peltomaki and M. S. Ponce de Leon. 2013. Tooth wear and dentoalveolar remodeling are key factors of morphological variation in the Dmanisi mandibles. Proc Natl Acad Sci USA. 110(43): 17278-17283.

Marjanovic, D and D. Primorac. 2013. Forensic Genetics: Theory and Application, 2nd ed. [Bosnian] Sarajevo: Lelo, d.o.o.

Matsumura, H., H. Ishida, T. Amano, H. Ono and M. Yoneda. 2009. Biological affinities of Okhotsk-culture people with East Siberians and Arctic people based on dental characteristics. Anthropol Sci. 117(2): 121-132.

Maula, E. 1993. Hampaat-menneisyyden tietopankki. Hammaslaakarilehti. 7(93): 416-418.

Mayhall, J. T., S. R. Saunders and P. L. Belier. 1982. The dental morphology of North American whites: a reappraisal. In: Teeth: form, function, and evolution. New York: Columbia University Press. p. 245-258.

McAlpin, D.W. 1981. Proto-Elamo-Dravidian: The evidence and its implications. Transactions of the American Philosophical Society. 71(3): 1-155.

McElreavey, K and L. Quintana-Murci. 2005. A population genetics perspective of the Indus Valley through uniparentally-inherited markers. Ann Hum Biol. 32: 154–162.

McKenzie, M and T. M. Ryan. 2010. Assembly factors of human mitochondrial complex I and their defects in disease. IUBMB Life. 62(7): 497-502.

McMahon, A. H and G. D. A. Ramsay. 1901. Report on the tribes of Dir, Swat and Bajour together with the Utman-Khel and Sam Ranizai. Saeed Book Bank, Peshawar.

Mehdi, S., R. Qamar, Q. Ayub, S. Khaliq and A. Mansoor. 1999. The Origins of Pakistani Populations. Genomic Diversity: Springer. pp. 83-90.

Mellars, P. 2006. Going East: New Genetic and Archaeological Perspectives on the Modern Human Colonization of Eurasia. Science. 313: 796-800.

Mendez, F. L., T. Krahn, B. Schrack, A. M. Krahn, K. R. Veeramah, E. A. Woerner, M. L. F. Fomine, N. Bradman, G. M. Thomas, M. T. Karafet and F. M. Hammer. 2013. An African American paternal lineage adds an extremely ancient root to the human Y chromosome phylogenetic tree. Am J Hum Genet. 92: 454–59.

Metspalu, M., T. Kivisild, E. Metspalu, J. Parik, G. Hudjashov, K. Kaldma, P. Serk, M. Karmin, D. M. Behar, M. T. P. Gilbert, P. Endicott, S. Mastana, S. S. Papiha, K. Skorecki, A. Torroni and R. Villems. 2004. Most of the extant mtDNA boundaries in South and Southwest Asia were likely shaped during the initial settlement of Eurasia by anatomically modern humans. BMC Genet. 5: 26.

Meyer, M., Q. Fu, A. Aximu-Petri, I. Glocke and B. Nickel. 2014. A mitochondrial genome sequence of a hominin from Sima de los Huesos. Nature. 505:403–6.

Michaels, G. S., W. W. Hauswirth and J. P. Laipis. 1982. Mitochondrial DNA copy number in bovine oocytes and somatic cells. Dev Biol. 94: 246–51.

Mihailidis, S., G. Scriven, M. Khamis and G. Townsend. 2013. Prevalence and patterning of maxillary premolar accessory ridges (MxPARs) in several human populations. Am J Phys Anthropol. 152(1): 19-30.

Mikkelsen, M., E. Rockenbauer, E. Sorensen, M. Rasmussen, C. Borsting and N. Morling. 2008. A Mitochondrial DNA SNP Multiplex Assigning Caucasians into 36 Haplo- and Subhaplogroups. Forensic Sci Int-Genet. 1: 287-289.

Miller, D. 1985. Ideology and the Harappan civilization. J Anthropol Archaeol. 4: 34-71.

Mirabal, S., T. Varljen, T. Gayden, M. Regueiro, S. Vujovic, D. Popovic, M. Djuric, O. Stojkovic and J. R. Herrera. 2010. Human Y-chromosome short tandem repeats: a tale of acculturation and migrations as mechanisms for the diffusion of agriculture in the Balkan Peninsula. Am J Phys Anthropol. 142: 380–390.

Moorrees, C. F. A. 1957. The Aleut dentition. Cambridge, Mass. Harvard University Press.

Moreno, F., M. S. Moreno, A. C. Diaz, A. E. Bustos and V. J. Rodriguez. 2004. Prevalence and variability of eight non-metric dental traits in students of Cali, Colombia. Col Med. 35: 17-23.

Morgenstierne, G. 1932. Report on a linguistic mission to north-western India: Indus Publication.

Morris, D. H. 1975. Bushmen maxillary canine polymorphism. S Afr J Sci. 71: 333-335.

Morris, D. H., A. A. Dahlberg and S. Glasstone-Hughes. 1978. The Uto-Aztecan premolar: The anthropology of a dental trait. In: P. M. Butler, K. A. Joysey (eds.), Development, Function and Evolution of the Teeth. London: Academic Press. pp. 69-79.

Murray, J. W. 1899. A Dictionary of the Pathan Tribes on the North-west Frontier of India. Office of the Superintendent, Government Print, India.

Navarro-Costa, P. 2012. Sex, rebellion and decadence: the scandalous evolutionary history of the human Y chromosome. Biochim Biophys Acta. 1822: 1851-1863.

Navarro-Costa, P., E. C. Plancha and J. Goncalves. 2010. Genetic dissection of the AZF regions of the human Y chromosome: thriller or filler for male (in)fertility? J Biomed Biotechnol. 2010: 1-18.

Nei, M. 1995. Genetic Support for the out-of-Africa theory of human evolution. Proc Natl Acad Sci USA. 92: 6720-6722.

Nesheva, D. V. 2014. Aspects of ancient mitochondrial DNA analysis in different populations for understanding human evolution. Balkan J Med Genet. 17(1): 5-14.

Newcomb, L. 1986. The Islamic Republic of Pakistan: country profile. Int Demogr. 5: 1-8.

Nichol, C. R and C. G. Turner. 1986. Intra- and interobserver concordance in observing dental morphology. Am J Phys Anthropol. 69: 299-315.

Nichol, C. R., C. G. Turner and A. A. Dahlberg. 1984. Variation in the convexity of the human maxillary incisor labial surface. Am J Phys Anthropol. 63: 361-370.

Nijjar, B. S. 2008. Origins and History of Jats and Other Allied Nomadic Tribes of India: 900 BC-1947 AD: Atlantic Publishers and Dist.

Nitecki, M. H and D. V. Nitecki (eds). 1994. Origins of Anatomically Modern Humans. New York. Plenum Press.

Novelletto, A. 2007. Y chromosome variation in Europe: Continental and local processes in the formation of the extant gene pool. Ann Hum Biol. 34: 139-172.

Novembre, J and S. Ramachandran. 2011. Perspectives on human population structure at the cusp of the sequencing era. Ann Rev Genom Hum G. 12: 245-274.

Nusser, M and W. B. Dickore. 2002. A tangle in the triangle: vegetation map of the eastern Hindukush (Chitral, northern Pakistan). Erdkunde. 56:37–59.

Olofsson, J. 2015. Forensic and Population Genetic Studies of the Human Y Chromosome: PhD Thesis. Faculty of Health and Medical Sciences, University of Copenhagen.

Olofsson, J. K., H. S. Mogensen, A. Buchard, C. Børsting and N. Morling. 2015. Forensic and population genetic analyses of Danes, Greenlanders and Somalis typed with the Yfiler® Plus PCR amplification kit. Forensic Sci Int-Genet. 16: 232-236.

Olofsson, J. K., V. Pereira, C. Børsting and N. Morling. 2015. Peopling of the North Circumpolar Region–insights from Y chromosome STR and SNP typing of Greenlanders. PloS one. 10: e0116573.

Oppenheimer, S. 2012. Out-of-Africa, the peopling of continents and islands: tracing uniparental gene trees across the map. Phil Trans R Soc B. 367: 770-784.

O'Rourke, D.H and A. J. Raff. 2010. The human genetic history of the Americas: the final frontier. Curr Biol. 20(4): 202-207.

Osborn, H. F. 1888. The evolution of mammalian molars to and from the tritubercular type. Am Nat. 22(264): 1067-1079.

Osborn, H. F. 1907. Evolution of mammalian molar teeth to and from the triangular type. New York: Macmillan.

Owen, R. B. 1845. Odontography or a treatise on the comparative anatomy of the teeth: their physiological relations, mode of development and microscopic structure in vertebrate animals. London: Hyppolyte Bailliere.

Pamjav, H., T. Feher, E. Nemeth and Z. Padar. 2012. Brief communication: New Y‐chromosome binary markers improve phylogenetic resolution within haplogroup R1a1. Am J Phys Anthropol. 149: 611-615.

Parishad and G. Bharatiya. 1996. Gurjara aura Unaka Itihasa me Yogadana Vishaya para Prathama Itihasa Sammelana. The Packard Humanities Institute. pp. 34–65. Retrieved 2007-05-31.

Pedersen, P. O. 1949. The East Greenland Eskimo dentition: numerical variations and anatomy, a contribution to comparative ethnic odontography. Copenhagen: Medd Gronl. 142: 1-244.

Perveen, R., Z. Rahman, M. S. Shahzad, M. Israr and M. Shafique. 2014. Y-STR haplotype diversity in Punjabi population of Pakistan. Forensic Sci Int-Gen. 9: e20.

Petraglia, M. D., A. Alsharekh, P. Breeze, C. Clarkson and R. Crassard. 2012. Hominin dispersal into the Nefud desert and Middle Palaeolithic settlement along the Jubbah palaeolake, northern Arabia. PLoS One. 7: e49840.

Piko, L and L. Matsumoto. 1976. Number of mitochondria and some properties of mitochondrial DNA in the mouse egg. Dev Biol. 49:1–10.

Political and Secret Department. 1933. Who is who in Dir, Swat and Bajour Agency. Government of India Press, New Delhi. p. 1-30. (www.Mahraka.com).

Poznik, G. D., B. M. Henn, M. C. Yee, E. Sliwerska, G. M. Euskirchen, A. A. Lin, M. Snyder, L. Quintana-Murci, J. M. Kidd, P. A. Underhill and C. D. Bustamante. 2013. Sequencing Y chromosomes resolves discrepancy in time to common ancestor of males versus females. Science. 341: 562–65.

Pretty, I. A and D. Sweet. 2001. A look at forensic dentistry part 1: the role of teeth in determination of human identity. Brit Dent J. 190: 359-365.

Prieto, L., B. Zimmermann, A. Goios, A. Rodriguez-Monge, G. G. Paneto, C. Alves, A. Alonso, C. Fridman, S. Cardoso, G. Lima and J. M. Anjos. 2011. The GHEP–EMPOP collaboration on mtDNA population data—A new resource for forensic casework. Forensic Sci Int-Genet. 5(2): 146-151.

Prufer, K., F. Racimo, N. Patterson, F. Jay, S. Sankararaman, S. Sawyer, A. Heinze, G. Renaud, H. P. Sudmant, C. De Filippo and H. Li. 2014. The complete genome sequence of a Neanderthal from the Altai Mountains. Nature. 505: 43–49.

Przeworski, M., R. R. Hudson and A. Di Rienzo. 2000. Adjusting the focus on human variation. Trends Genet. 16: 296-302.

Purps, J., S. Siegert, S. Willuweit, M. Nagy and C. Alves. 2014. A global analysis of Y-chromosomal haplotype diversity for 23 STR loci. Forensic Sci Int-Genet. 12: 12-23.

Qamar, R., Q. Ayub, A. Mohyuddin, A. Helgason, K. Mazhar, A. Mansoor, T. Zerjal, C. Tyler-Smith and S. Q. Mehdi. 2002. Y-chromosomal DNA variation in Pakistan. Am J Hum Genet. 70: 1107–1124.

Qamar, R., Q. Ayub, S. Khaliq, A. Mansoor, T. Karafet, Q. S. Mehdi and F. M. Hammer. 1999. African and Levantine origins of Pakistani YAP+ Y chromosomes. Hum Biol. 71: 745–755.

Qasmi, A. G. 1939. Tarikh-i-Riyasat-i-Swat. Peshawar: Hamidia Press. 27-29.

Quintana-Murci, L., O. Semino, H. J. Bandelt, G. Passarino and K. McElreavey. 1999. Genetic evidence of an early exit of Homo sapiens sapiens from Africa through eastern Africa. Nat Genet. 23: 437-441.

Quintana-Murci, L., R. Chaix, R.S. Wells, D.M. Behar, H. Sayar, R. Scozzari, C. Rengo, N. Al-Zahery, O. Semino, A. S. Santachiara-Benerecetti and A. Coppa. 2004. Where west meets east: the complex mtDNA landscape of the southwest and Central Asian corridor. Am J Hum Genet. 74(5): 827-845.

Raff, J. A., A. D. Bolnick, J. Tackney and H. D. O'Rourke. 2011. Ancient DNA perspectives on American colonization and population history. Am J Phys Anthropol. 146: 503-514.

Rahatullah, F. Haq, S. A. K. Saeed and S. Rehman. 2011. Diversity and distribution of ladybird beetles in District Dir Lower, Pakistan. Int J Biodivers Conserv. 3(12): 670-675.

Rami-Reddy, V. 1985. Dental eruption of India. Dental anthropology: Applications and methods. New Delhi: Inter-India Publications. p. 55-73.

Ranaweera, L., S. Kaewsutthi, A. W. Tun, H. Boonyarit, S. Poolsuwan and P. Lertrit. 2014. Mitochondrial DNA history of Sri Lankan ethnic people: their relations within the island and with the Indian subcontinental populations. J Hum Genet. 59(1): 28-36.

Rasmussen, M., Y. Li, S. Lindgreen, J. S. Pedersen, A. Albrechtsen, I. Moltke, M. Metspalu, E. Metspalu, T. Kivisild, R. Gupta, M. Bertalan, K. Nielsen, M. T. Gilbert, Y. Wang, M. Raghavan and P. F. Campos. 2010. Ancient human genome sequence of an extinct Palaeo-Eskimo. Nature. 463(7282):757-62.

Raza, A., S. Firasat, S. Khaliq, A. Abid, S. S. Shah, Q. S. Mehdi and A. Mohyuddin. 2013. HLA class I and II polymorphisms in the Gujar population from Pakistan. Immunol Invest. 42(8): 691-700.

Rehman, A. 1979. “The Last Two Dynasties of Shahis”. Center for the Study of the Civilization of Central Asia, Quaid-e-Azam University, Islamabad. pp. 3–4.

Rehman, A. 1993. “Date of Overthrow of Laghman: The Last Turkishahi Ruler of Kabul”. Lahore Museum Bulletin, Lahore. 11. pp. 29–31.

Rehman, A. U and S. Malik. 2016. Evaluation of Tribal Diversity of Pashtuns of Bajaur Agency, North-West Pakistan, on the Basis of Allelic Polymorphisms at ABO and Rh Loci. Pak J Zool. 48(3): 697-702.

Reid, C., J. F. V. Reenen and H. T. Groeneveld. 1991. Tooth size and the Carabelli trait. Am J Phys Anthropol. 84(4): 427-432.

Relethford, J. H. 2008. Genetic evidence and the modern human origins debate. Heredity. 100: 555-563.

Renfrew, C. 1987. Archaeology and Language: the Puzzle of Indo-European Origins. Jonathan Cape. London.

Renfrew, C. 1996. Language families and the spread of farming. In: Harris D, editor. The origins and spread of agriculture and Pastoralism in Eurasia. Washington, DC: Smithsonian Institution Press. 70–92.

Renfrew, C. 2000. America past, America present: genes and languages in the Americas and beyond. The McDonald Institute for Archaeological Research, Cambridge.

Repping, S., K. S. van Daalen, G. L. Brown, M. C. Korver, J. Lange, D. J. Marszalek, T. Pyntikova, F. van der Veen, H. Skaletsky, C. D. Page and S. Rozen. 2006. High mutation rates have driven extensive structural polymorphism among human Y chromosomes. Nat genet. 38: 463-467.

Reyes-Centeno, H., S. Ghirotto, F. Detroit, D. Grimaud-Hervé, G. Barbujani and K. Harvati. 2014. Genomic and cranial phenotype data support multiple modern human dispersals from Africa and a southern route into Asia. Proc Natl Acad Sci. 111(20): 7248-7253.

Richards, M., V. Macaulay, E. Hickey, E. Vega, B. Sykes, V. Guida, C. Rengo, D. Sellitto, F. Cruciani, T. Kivisild, R. Villems, M. Thomas, S. Rychkov, O. Rychkov, Y. Rychkov, M. Gölge, D. Dimitrov, E. Hill, D. Bradley, V. Romano, F. Cali, G. Vona, A. Demaine, S. Papiha, C. Triantaphyllidis, G. Stefanescu, J. Hatina, M. Belledi, A. Di Rienzo, A. Novelletto, A. Oppenheim, S. Norby, N. Al-Zaheri, S. Santachiara-Benerecetti, R. Scozzari, A. Torroni and H-J. Bandelt. 2000. Tracing European Founder Lineages in the Near Eastern mtDNA Pool. Am J Hum Genet. 67: 1251-1276.

Rightmire, G. P. 2009. Out of Africa: modern human origins special feature: middle and later Pleistocene hominins in Africa and Southwest Asia. Proc Natl Acad Sci USA. 106: 16046-16050.

Risch, N., E. Burchard, E. Ziv and H. Tang. 2002. Categorization of humans in biomedical research: genes, race and disease. Genome Biol. 3: 310-318.

Robin, E. D and R. Wong. 1988. Mitochondrial DNA Molecules and Virtual Number of Mitochondria per Cell in Mammalian Cells. J Cell Physiol. 136: 507-513.

Robson, B. and J. Lipson. 2002. The Afghans: Their history and culture. Cultural Orientation Resource Center, Center for Applied Linguistics.

Rodriguez, C. D. 2003. Antropologia dental prehispanica: variacion y distancias biológicas en la poblacion enterrada en el cementerio prehispanico de Obando, Valle del Cauca, Colombia entre los siglos VIII y XIII d.C. Miami. Syllaba Press.

Rodriguez, J. V. 1999. Advances of dental anthropology in Colombia. National University of Colombia (UNAL). Bogota.

Roebroeks, W and P. Villa. 2011. On the earliest evidence for habitual use of fire in Europe. Proc Natl Acad Sci USA. 108(13): 5209-5214.

Roewer, L., M. Krawczak, S. Willuweit, M. Nagy, C. Alves, A. Amorim, K. Anslinger, C. Augustin, A. Betz, E. Bosch and A. Caglia. 2001. Online reference database of European Y-chromosomal short tandem repeat (STR) haplotypes. Forensic Sci Int. 118: 106-113.

Roewer, L., M. Nothnagel, L. Gusmao, V. Gomes and M. Gonzalez. 2013. Continent-wide decoupling of Y-chromosomal genetic variation from language and geography in native South Americans. PLoS Genet. 9: e1003460.

Roewer, L., S. Willuweit, M. Stoneking and I. Nasidze. 2009. A Y-STR database of Iranian and Azerbaijanian minority populations. Forensic Sci Int-Genet. 4: e53-e55.

Rome, S. I. 2008. Swat State (1915-1969) From Genesis to Merger: An Analysis of Political, Administrative, Socio-Political, and Economic Developments. Karachi: Oxford University Press.

Rootsi, S., T. Kivisild, G. Benuzzi, H. Help, M. Bermisheva, I. Kutuev, L. Barac, M. Pericic, O. Balanovsky, A. Pshenichnov and D. Dion. 2004. Phylogeography of Y-chromosome haplogroup I reveals distinct domains of prehistoric gene flow in Europe. Am J Hum Genet. 75(1): 128-137.

Rose, H. A., MacLagan and D. Edward. 1911. A Glossary of the Tribes and Castes of the Punjab and North-West Frontier Province II. Lahore: Samuel T. Weston at the Civ. and Mili. Gaz. Press, Pp. 272–273.

Rosenberg, N. A. 2006. Standardized subsets of the HGDP‐CEPH Human Genome Diversity Cell Line Panel, accounting for atypical and duplicated samples and pairs of close relatives. Ann Human Genet. 70: 841-847.

Rosenberg, N. A., J. K. Pritchard, J. L. Weber, H. M. Cann, K. K. Kidd, L. A. Zhivotovsky and M.W. Feldman. 2002. Genetic structure of human populations. Science. 298: 2381-2385.

Rosser, Z. H., T. Zerjal, M. E. Hurles, M. Adojaan and D. Alavantic. 2000. Y-chromosomal diversity in Europe is clinal and influenced primarily by geography, rather than by language. Am J Hum Genet. 67: 1526-1543.

Roy, S., C. M. Thakur and P. P. Majumder. 2003. Mitochondrial DNA variation in ranked caste groups of Maharashtra (India) and its implication on genetic relationship and origins. Ann Hum Biol. 30: 443–454.

Ruvolo, M., S. Zehr, M. von Dornum, D. Pan, D. Chang and J. Lin. 1993. Mitochondrial COII Sequences and Modern Human Origins. Mol Biol Evol. 10: 1115-1135.

Sabitov, Z. 2011. The origin of the Pashtuns (Pathans). Russ J Genet Geneal. 2(1): 60-63.

Sahoo, S., A. Singh, G. Himabindu, J. Banerjee, T. Sitalaximi, S. Gaikwad, R. Trivedi, P. Endicott, T. Kivisild, M. Metspalu and R. Villems. 2006. A prehistory of Indian Y chromosomes: evaluating demic diffusion scenarios. Proc Natl Acad Sci USA. 103(4): 843-848.

Saitou, N and M. Nei. 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol. 4(4): 406-425.

Sanchez, J. J., C. Hallenberg, C. Borsting, A. Hernandez and N. Morling. 2005. High frequencies of Y chromosome lineages characterized by E3b1, DYS19-11, DYS392-12 in Somali males. Eur J Hum Genet. 13: 856-866.

Sanger, F., S. Nicklen and A. R. Coulson. 1977. DNA sequencing with chain-terminating inhibitors. Proc Natl Acad Sci USA. 74: 5463-5467.

Scally, A and R. Durbin. 2012. Revising the human mutation rate: implications for understanding human evolution. Nat Rev Genet. 13: 745–53.

Schick, K. D and P. N. Toth. 1994. Making silent stones speak: Human evolution and the dawn of technology. Simon and Schuster.

Schurr, T. G. 2004a. The peopling of the New World: perspectives from molecular anthropology. Annu Rev Anthropol. 33: 551–583.

Schurr, T. G. 2004b. Molecular genetic diversity in Siberians and Native Americans suggests an early colonization of the New World. In: Madsen DB (ed) Entering America: Northeast Asia and Beringia before the last glacial maximum. Salt Lake City: University of Utah Press. pp. 187–238.

Schurr, T. G and T. S. Sherry. 2004. Mitochondrial DNA and Y chromosome diversity and the peopling of the Americas: evolutionary and demographic evidence. Am J Hum Biol. 16(4): 420-439.

Scott, G. R and C. G. Turner. 1997. The Anthropology of Modern Human Teeth. Dental Morphology and its Variation in Recent Human Populations. Cambridge: Cambridge University Press.

Scott, G. R. 1980. Population variation of Carabelli's Trait. Hum Biol. 52(1): 63-78.

Scott, R. S., P. S. Ungar, T. S. Bergstrom, C. A. Brown, F. E. Grine, M. Teaford and A. Walker. 2005. Dental microwear texture analysis shows within-species diet variability in fossil hominins. Nature. 436(7051): 693-695.

Seema, N. P., A. Geetha and C. Jagannath. 2011. Y-short tandem repeat haplotype and paternal lineage of the Ezhava population of Kerala, south India. Croat Med J. 52: 344-350.

Semino, O., G. Passarino, P. J. Oefner, A. A. Lin and S. Arbuzova. 2000. The genetic legacy of Paleolithic Homo sapiens sapiens in extant Europeans: a Y chromosome perspective. Science. 290: 1155-1159.

Sengupta, S., L. A. Zhivotovsky, R. King, S. Mehdi and C. A. Edmonds. 2006. Polarity and temporality of high-resolution y-chromosome distributions in India identify both indigenous and exogenous expansions and reveal minor genetic influence of Central Asian pastoralists. Am J Hum Genet. 78: 202-221.

Shah, G. A. 2013. Administration of Dir under Nawab Shah Jehan. Pak Ann Res J. 1(49): 121-138.

Shah, T. V. 1940. Ancient India. From 900 B.C. to 100 A.D. vol III and IV. Shashikant and Co. Baroda, India.

Sharma, J. C and V. Kaul. 1977. Dental morphology and odontometry in Panjabis. J Ind Anthropol Soc. 12: 213–226.

Sharma, J. C. 1983. Dental morphology and odontometry of the Tibetan immigrants. Am J Phys Anthropol. 61: 495–505.

Siddiqi, M. H., T. Akhtar, A. Rakha, G. Abbas, A. Ali, N. Haider, A. Ali, S. Hayat, S. Masooma, J. Ahmad and A. M. Tariq. 2015. Genetic characterization of the Makrani people of Pakistan from mitochondrial DNA control-region data. Leg Med. 17(2): 134-139.

Sinclair, A. H., P. Berta, S. M. Palmer, R. J. Hawkins, L. B. Griffiths, J. M. Smith, W. J. Foster, M. A. Frischauf, R. Lovell-Badge and N. P. Goodfellow. 1990. A gene from the human sex-determining region encodes a protein with homology to a conserved DNA-binding motif. Nature. 346: 240-244.

Sirajuddin, S. 1970. Sarguzasht-e-Swat. Lahore: Al-Hamra Academy. p. 53.

Skaletsky, H., T. Kuroda-Kawaguchi, J. P. Minx, S. H. Cordum, L. Hillier, G. L. Brown, S. Repping, T. Pyntikova, J. Ali, T. Bieri, and A. Chinwalla. 2003. The male-specific region of the human Y chromosome is a mosaic of discrete sequence classes. Nature. 423: 825-837.

Slatkin, M. 1987. Gene flow and the geographic structure of natural populations. Science. 236: 787–792.

Smith, B. H. 1991. Standards of human tooth formation and dental age assessment. In: M. Kelley and C. Larsen (eds), Advances in Dental Anthropology. New York: Wiley-Liss. 19: 143-168.

Smith, F. H and F. Spencer (eds.). 1984. The Origins of Modern Humans: A World Survey of the Fossil Evidence. New York. Liss.

Smith, T. M and P. Tafforeau. 2008. New visions of dental tissue research: tooth development, chemistry, and structure. Evol Anthropol. 17(5): 213-226.

Smith, V. A. 1914. The Early History of India from 600 B.C. to the Muhammadan Conquest: Including the Invasion of Alexander the Great. Clarendon Press. 3.p 117.

Soares, P., F. Alshamali, J. B. Pereira, V. Fernandes, N. M. Silva, C. Afonso, M. D. Costa, E. Musilova, V. Macaulay, M. B. Richards, V. Cerny and L. Pereira. 2012. The Expansion of mtDNA Haplogroup L3 within and out of Africa. Mol Biol Evol. 29: 915–27.

Soares, P., L. Ermini, N. Thompson, M. Mormina, T. Rito, A. Rohl, A. Salas, S. Oppenheimer, V. Macaulay and M. Richards. 2009. Correcting for Purifying Selection: An Improved Human Mitochondrial Molecular Clock. Am J Hum Genet. 84: 740-759.

St John, J., D. Sakkas, K. Dimitriadi, A. Barnes, V. Maclin, J. Ramey, C. Barratt and C. De Jonge. 2000. Failure of elimination of paternal mitochondrial DNA in abnormal embryos. Lancet. 355(9199): 200.

Stacul, G. 1969. Excavation near Ghaligai (1968) and Chronological Sequence of Protohistorical Cultures in the Swat Valley (West Pakistan). East and West. 19: 44-91.

Stewart, J. B and F. P. Chinnery. 2015. The dynamics of mitochondrial DNA heteroplasmy: implications for human health and disease. Nat Rev Genet. 16(9): 530-542.

Stoljarova, M., J.L. King, M. Takahashi, A. Aaspollu and B. Budowle. 2016. Whole mitochondrial genome genetic diversity in an Estonian population sample. Int J Legal Med. 130:67–71.

Stoneking, M and F. Delfin. 2010. The human genetic history of East Asia: weaving a complex tapestry. Curr Biol. 20(4): 188-193.

Stoneking, M. 2008. Human origins. The molecular perspective. EMBO Rep. 9(1): 46–50.

Strand, R. F. 1973. Notes on the Nūristāni and Dardic Languages. J Am Orient Soc: 297-305.

Stringer, C and R. McKie. 1996. African Exodus: The Origins of Modern Humanity. New York: Henry Holt.

Stringer, C. 2002. Modern Human Origins: Progress and Prospects. Phil Trans R Soc B. 357: 563-579.

Sullivan, L. R. 1920. Differences in the pattern of the second lower molar tooth. Am J Phys Anthropol. 3: 255-257.

Suzuki, M and T. Sakai. 1973. Occlusal surface pattern of the lower molars and the second deciduous molar among the living Polynesians. Am J Phys Anthropol. 39(2): 305-315.

Swati, M. F. 1997. Recent Discovery of Buddhist Sites in the Swat Valley. Athariyat (Archaeology): A Research Bulletin of the National Heritage Foundation Peshawar, Pakistan. 1: 151-84.

Szecsenyi-Nagy, A., G. Brandt, J. Jakucs, B. G. Mende, E. Banffy and K. W. Alt. 2014. Ancient mitochondrial and Y chromosomal DNA reveals the western Carpathian Basin as a corridor of the Neolithic expansion. (Presentation) ISBA6, Basel, Switzerland 27-29th

Taanman, J. W. 1999. The mitochondrial genome: structure, transcription, translation and replication. BBA-Bioenergetics. 1410(2): 103-123.

Tabassum, S., M. Ilyas, I. Ullah, M. Israr and H. Ahmad. 2017. A comprehensive Y-STR portrait of Yousafzai’s population. Int J Leg Med. pp. 1-2.

Tajima, F. 1989. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics. 123: 585–95.

Tamimi, M. J. 2009. Hinduism in South Asia: Myth and Reality. Centre for South Asian Studies, University of the Punjab, Quaid-e-Azam Campus, Lahore. 24: 221–241.

"Tareekh-e-" (a.k.a. "Hidayat Afghani Tareekh-e-Kakazai") (Originally published May 1933).

Tattersall, I and J. H. Schwartz. 1999. “Hominids and hybrids: The place of Neanderthals in human evolution.” Proc Natl Acad Sci USA. 96: 7117-7119.

Teaford, M and J. Lytle. 1996. Brief Communication: Diet induced changes in rates of human tooth microwear: A case study involving stoneground maize. Am J Phys Anthropol. 100: 143-147.

Teaford, M. F and S. P. Ungar. 2000. Diet and the evolution of the earliest human ancestors. Proc Natl Acad Sci. 97: 13506-11.

Thapar, R. 1969. A History of India. Part 1. Baltimore: Penguin Press.

Tishkoff, S. A., F. A. Reed, F. R. Friedlaender, C. Ehret, A. Ranciaro, A. Froment, J. B. Hirbo, A. A. Awomoyi, M. J. Bodo, O. Doumbo and M. Ibrahim. 2009. The genetic structure and history of Africans and African Americans. Science. 324: 1035-44.

Tokayer, R. M. 2007. Mystery of the . Nihon-Yudaya, Huuin no Kodaishi. (http://www.moshiach.com/features/tribes/afghanistan.php).

Tomes, C. S. 1889. A manual of dental anatomy: human and comparative. London: J and A. Churchill.

Torres, J. B. 2016. A history of you, me, and humanity: mitochondrial DNA in anthropological research. AIMS Genet. 3(2): 146-156.

Torroni, A., A. Achilli, V. Macaulay, M. Richards and H. J. Bandelt. 2006. Harvesting the fruit of the human mtDNA tree. Trends Genet. 22: 339–345.

Townsend, G and T. Brown. 1981. The Carabelli trait in Australian aboriginal dentition. Arch Oral Biol. 26(10): 809-814.

Townsend, G., H. Yamada and P. Smith. 1990. Expression of the entoconulid (sixth cusp) on mandibular molar teeth of an Australian Aboriginal population. Am J Phys Anthropol. 82(3): 267-274.

Triki Fendri, S., P. Sanchez Diz, D. Rey Gonzalez, I. Ayadi and A. Carracedo. 2015. Paternal lineages in Libya inferred from Y chromosome haplogroups. Am J Phys Anthropol. 157: 242-251.

Trivedi, R., S. Sahoo, A. Singh, G. Bindu, J. Banerjee, M. Tandon, S. Gaikwad, R. Rajkumar, T. Sitalaximi, R. Ashma and N. B. G. Chainy. 2008. Genetic imprints of Pleistocene origin of Indian populations: A comprehensive phylogeographic sketch of Indian Y-chromosomes. Int J Hum Genet. 8(1-2): 97-118.

Trombetta, B., E. D’Atanasio, A. Massaia, M. Ippoliti, A. Coppa, F. Candilio, V. Coia, G. Russo, M. J. Dugoujon, P. Moral and N. Akar. 2015. Phylogeographic refinement and large scale genotyping of human Y chromosome haplogroup E provide new insights into the dispersal of early pastoralists in the African continent. Genome Biol Evol. 7(7): 1940-1950.

Trombetta, B., F. Cruciani, D. Sellitto and R. Scozzari. 2011. A new topology of the human Y chromosome haplogroup E1b1 (E-P2) revealed through the use of newly characterized binary polymorphisms. PLoS One. 6(1): e16073.

Tucci, G. 1958. Preliminary report on an archaeological survey in Swat. East and West. 9(4): 279-328.

Turner, C. G. II. 1967. The dentition of Arctic peoples, PhD Dissertation, Madison, University of Wisconsin.

Turner, C. G., C. R. Nichol and G. R. Scott. 1991. Scoring procedures for key morphological traits of the permanent dentition: the Arizona State University Dental Anthropology System. In: Kelly MA, and Larsen CS (eds.). Advances in Dental Anthropology. New York: Wiley-Liss. p 13-31.

Tyagi, V. P. 2009. Martial races of undivided India. New Delhi: Kalpaz Publications.

Underhill, P. A and T. Kivisild. 2007. Use of Y-chromosome and mitochondrial DNA population structure in tracing human migrations. Annu Rev Genet. 41: 539–64.

Underhill, P. A., G. D. Poznik, S. Rootsi, M. Jarve, A. A. Lin, J. Wang, B. Passarelli, J. Kanbar, N. M. Myres, R. J. King and J. Di Cristofaro. 2015. The phylogenetic and geographic structure of Y-chromosome haplogroup R1a. Eur J Hum Genet. 23(1):124-31.

Underhill, P. A., P. Shen, A. A. Lin, L. Jin, G. Passarino, H. W. Yang, E. Kauffman, B. Bonne-Tamir, J. Bertranpetit, P. Francalacci, M. Ibrahim and T. Jenkins. 2000. Y chromosome sequence variation and the history of human populations. Nat Genet. 26: 358–361.

Underhill, P. A., G. Passarino, A. A. Lin, P. Shen, M. Mirazon Lahr, A. R. Foley, J. P. Oefner and L. L. Cavalli-Sforza. 2001. The phylogeography of Y chromosome binary haplotypes and the origins of modern human populations. Ann Hum Genet. 65: 43-62.

van der Giezen, M. 2011. Mitochondria and the rise of eukaryotes. J Bioscience. 61(8): 594-601.

van Oven, M. 2015. PhyloTree Build 17: Growing the human mitochondrial DNA tree. Forensic Sci Int-Genet. 5: e392-e394.

van Oven, M. and M. Kayser. 2009. Updated comprehensive phylogenetic tree of global human mitochondrial DNA variation. Hum mutat. 30(2): e386-e394.

Van Oven, M., A. Geystelen, M. Kayser, R. Decorte and M.H. Larmuseau. 2014. Seeing the wood for the trees: a minimal reference phylogeny for the human Y chromosome. Hum mutat. 35(2): 187-191.

Vermeulen, M., A. Wollstein, K. van der Gaag, O. Lao, Y. Xue. 2009. Improving global and regional resolution of male lineage differentiation by simple single-copy Y-chromosomal short tandem repeat polymorphisms. Forensic Sci Int-Gen. 3: 205-213.

Vigilant, L., M. Stoneking, H. Harpending, K. Hawkes and A. Wilson. 1991. African Populations and the Evolution of Human Mitochondrial DNA. Science. 253:1503-1507.

Walimbe, S. R and S. S. Kulkarni. 1993. Biological Adaptations in Human Dentition: An Odontometric Study on Living and Archaeological Populations in India (Vol. 1). Deccan College Post Graduate and Research Institute, Pune, India.

Wallace, D. C., K. Garrison and W. C. Knowler. 1985. Dramatic founder effects in Amerindian mitochondrial DNAs. Am J Phys Anthropol. 68(2): 149-155.

Wallace, D., M. Brown and M. Lott. 1999. Mitochondrial DNA variation in human evolution and disease. Genetics. 238(1): 211-30.

Wang, C. C., L. X. Wang, R. Shrestha, S. Wen, M. Zhang, X. Tong, L. Jin and H. Li. 2015. Convergence of Y chromosome STR haplotypes from different SNP haplogroups compromises accuracy of haplogroup prediction. J Genet Genomics. 42(7):403-407.

Wang, C. C., X. L. Wang, R. Shrestha, S. Wen, M. Zhang, X. Tong, L. Jin and H. Li. 2013. Convergence of Y chromosome STR haplotypes from different SNP haplogroups compromises accuracy of haplogroup prediction. arXiv preprint arXiv [q-bio.PE]:1310.5413.

Whale, J.W. 2012. Mitochondrial DNA analysis of four ethnic groups of Afghanistan: PhD Thesis. Portsmouth: University of Portsmouth.

Wheeler, M. 1968. The Indus Civilization. 3rd Edition. Cambridge: Cambridge University Press.

Willems, T., M. Gymrek, D. G. Poznik, C. Tyler-Smith and Y. Erlich. 2016. Population-Scale Sequencing Data Enable Precise Estimates of Y-STR Mutation Rates. Am J Hum Genet. 98(5): 919-933.

Winters, C. 2011. The Gibraltar Out of Africa Exit for Anatomically Modern Humans. Webmed Central. 2(10):WMC002319.

Wolpert, S. 2000. A new history of India. Oxford University Press, New York

Wolpoff, M. H and R. Caspari. 1996. Race and Human Evolution: A Fatal Attraction. New York. Simon and Schuster.

Wolpoff, M. H., X. Z. Wu, and A. G. Thorne. 1984. Modern Homo sapiens Origins: A General Theory of Hominid Evolution Involving the Fossil Evidence from East Asia. In: Smith, F.H. and Spencer, F. (eds.). The Origins of Modern Humans. pp. 411-483.

Wolpoff, M., J. Hawks and R. Caspari. 2000. Multiregional, Not Multiple Origins. Am J Phys Anthropol. 112: 129-136.

Wood, B. A and C. A. Engleman. 1988. Analysis of the dental morphology of Plio-Pleistocene hominids. V. Maxillary postcanine tooth morphology. J Anat. 161: 1-35.

Wood, B. A and H. Uytterschaut. 1987. Analysis of the dental morphology of Plio-Pleistocene hominids. III. Mandibular premolar crowns. J Anat. 154: 121-156.

Wood, B. A and S. A. Abbott. 1983. Analysis of the dental morphology of Plio-Pleistocene hominids. I. Mandibular molars: crown area measurements and morphological traits. J Anat. 136: 197-219.

Wood, B. A., S. A. Abbott and S. H. Graham. 1983. Analysis of the dental morphology of Plio-Pleistocene hominids. II. Mandibular molars--study of cusp areas, fissure pattern and cross sectional shape of the crown. J Anat. 137: 287-314.

Wood, B. A., S. A. Abbott and H. Uytterschaut. 1988. Analysis of the dental morphology of Plio-Pleistocene hominids. IV. Mandibular postcanine root morphology. J Anat. 156: 107-139.

Wreford, R. G. 1941. Census of India. Vol. XXII, Jammu and Kashmir. Ranbir Government Press, 1943.

Wright, S. 1951. The genetical structure of populations. Ann Eugen. 15: 323–354.

Xue, Y., Q. Wang, Q. Long, L. B. Ng, H. Swerdlow, J. Burton, C. Skuce, R. Taylor, Z. Abdellah, Y. Zhao, Asan and D. G. MacArthur. 2009. Human Y chromosome base-substitution mutation rate measured by direct sequencing in a deep-rooting pedigree. Curr Biol. 19: 1453–1457.

Yaad, L. 1986. Pukhtana Qabily Opejanai. p.86 (www.Khyber.Org).

Yasin, H. M. 2008. Social Welfare Program in the Former State of Swat (The Paradise Lost). The Dialogue. 4(3): 1-32.

Zegura, S. L., M. T. Karafet, A. L. Zhivotovsky and F. M. Hammer. 2004. High-resolution SNPs and microsatellite haplotypes point to a single, recent entry of Native American Y chromosomes into the Americas. Mol Biol Evol. 21: 164-175.

Zeng, Z., R. Garcia-Bertrand, S. Calderon, L. Li and M. Zhong. 2014. Extreme genetic heterogeneity among the nine major tribal Taiwanese island populations detected with a new generation Y23 STR system. Forensic Sci Int-Gen. 12: 100-106.

Zhao, Z., F. Khan, M. Borkar, R. Herrera and S. Agrawal. 2009. Presence of three different paternal lineages among North Indians: a study of 560 Y chromosomes. Ann Hum Biol. 36: 46-59.

Zhong, H., H. Shi, B. X. Qi, Y. Z. Duan, P. P. Tan, L. Jin, B. Su and Z. R. Ma. 2011. Extended Y chromosome investigation suggests postglacial migrations of modern humans into East Asia via the northern route. Mol Biol Evol. 28(1): 717-727.

Zwalf, W. 1996. A Catalogue of the Gandhāra Sculpture in the British Museum, vol. I, British Museum Press, London

APPENDIX I

APPENDIX II

APPENDIX III

STOCK REAGENTS

Phenol:Chloroform Mixture (1:1)

For each sample, 200 μL of phenol and 200 μL of chloroform were used.

Lysis Buffer

500 mM Tris-base

250 mM EDTA

5% SDS

Proteinase K, 75 μg/mL of lysis solution

β-mercaptoethanol (14.4 M), 1 μL/mL of lysis solution
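As a worked example of the per-millilitre additions listed for the lysis buffer, the sketch below scales proteinase K and β-mercaptoethanol to an arbitrary volume of lysis solution. The function name and the example volume are illustrative assumptions, not part of the original protocol, and the final β-mercaptoethanol molarity ignores the small volume contributed by the additive itself.

```python
# Illustrative helper (hypothetical name); the rates are those given in the recipe.
PROTEINASE_K_UG_PER_ML = 75.0   # ug of proteinase K per mL of lysis solution
BME_UL_PER_ML = 1.0             # uL of beta-mercaptoethanol per mL of lysis solution
BME_STOCK_MOLAR = 14.4          # molarity of the beta-mercaptoethanol stock


def lysis_additives(volume_ml):
    """Return additive amounts for `volume_ml` of lysis solution."""
    if volume_ml <= 0:
        raise ValueError("volume_ml must be positive")
    bme_ul = BME_UL_PER_ML * volume_ml
    # C1*V1 = C2*V2, with V2 approximated by the lysis volume (1 mL = 1000 uL);
    # the trailing factor converts mol/L to mmol/L.
    bme_final_mm = BME_STOCK_MOLAR * bme_ul / (volume_ml * 1000.0) * 1000.0
    return {
        "proteinase_k_ug": PROTEINASE_K_UG_PER_ML * volume_ml,
        "bme_ul": bme_ul,
        "bme_final_mM": bme_final_mm,
    }


if __name__ == "__main__":
    # The 10 mL volume is an assumption chosen only for illustration.
    print(lysis_additives(10.0))
```

For 10 mL of lysis solution this gives 750 μg of proteinase K and 10 μL of β-mercaptoethanol, corresponding to roughly 14.4 mM β-mercaptoethanol in the final mix.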

50X TAE buffer

M Tris-HCl pH8

0.5 M EDTA

Make up to 1 L with dH2O and autoclave

Bromophenol blue dye

50 ml dH2O

50 g sucrose

1.86 g EDTA

0.1 g bromophenol blue

Dissolve

Adjust volume to 100 ml with dH2O, stir overnight, and adjust pH to 8.0

Filter through Whatman filter paper

Store at room temperature

10 mg/ml Ethidium bromide (EtBr)

Add 1 g of ethidium bromide to 100 ml of ddH2O

Stir for several hours until completely dissolved

Store wrapped in aluminum foil at 4˚C
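The stated stock strength can be checked against the recipe with a one-line calculation: 1 g in 100 ml is 1000 mg / 100 ml = 10 mg/ml. The sketch below shows this check and the inverse calculation for a different batch volume; the function names and the 50 ml example are hypothetical and serve only as illustration.

```python
def stock_concentration_mg_per_ml(mass_g, volume_ml):
    """Concentration (mg/ml) of a stock made by dissolving mass_g in volume_ml."""
    if volume_ml <= 0:
        raise ValueError("volume_ml must be positive")
    return mass_g * 1000.0 / volume_ml


def mass_needed_g(target_mg_per_ml, volume_ml):
    """Mass (g) of solute required for volume_ml of stock at target_mg_per_ml."""
    if volume_ml <= 0 or target_mg_per_ml < 0:
        raise ValueError("volume must be positive and concentration non-negative")
    return target_mg_per_ml * volume_ml / 1000.0


if __name__ == "__main__":
    # Recipe values: 1 g ethidium bromide in 100 ml ddH2O.
    print(stock_concentration_mg_per_ml(1.0, 100.0))
    # Hypothetical smaller batch of the same 10 mg/ml stock.
    print(mass_needed_g(10.0, 50.0))
```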

1kb size standard

285 μl 1kb ladder (cat# DM001)

143 μl Ficoll dye

2400 μl 1X TE
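The size standard recipe can be scaled to a different batch volume by keeping the component proportions fixed. The sketch below is a minimal illustration under two assumptions: the TE volume is taken as 2400 μl (the digits are spaced in the original), and the helper name and target volume are hypothetical.

```python
# Component volumes in microlitres, as listed in the recipe.
SIZE_STANDARD_UL = {
    "1kb ladder": 285.0,
    "Ficoll dye": 143.0,
    "1X TE": 2400.0,  # assumption: the spaced digits in the source mean 2400
}


def scale_recipe(recipe_ul, target_total_ul):
    """Scale every component so the volumes sum to target_total_ul."""
    total = sum(recipe_ul.values())
    if total <= 0 or target_total_ul <= 0:
        raise ValueError("volumes must be positive")
    factor = target_total_ul / total
    return {name: volume * factor for name, volume in recipe_ul.items()}


if __name__ == "__main__":
    print(sum(SIZE_STANDARD_UL.values()))          # total volume of the full batch
    print(scale_recipe(SIZE_STANDARD_UL, 1414.0))  # hypothetical half batch
```

The full batch sums to 2828 μl, so a half batch of 1414 μl needs 142.5 μl of ladder, 71.5 μl of Ficoll dye and 1200 μl of TE.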

APPENDIX IV

Dental morphological trait frequencies of all population samples used in this study

ABB.   TRAIT     P   N     F    ABB.   TRAIT     P   N     F    ABB.   TRAIT     P   N     F

INM    SHOVUI1   9  24 37.50    DJR    SHOVUI1   3  16 18.75    KARa   SHOVUI1  86 149 57.72
-      SHOVUI2   4  19 21.05    -      SHOVUI2   8  22 36.36    -      SHOVUI2  75 151 49.67
-      MLRUI1   14  25 56.00    -      MLRUI1    3  17 17.65    -      MLRUI1   53 115 46.09
-      HYPOUM1  27  41 65.85    -      HYPOUM1  30  30 100.0    -      HYPOUM1 132 143 92.31
-      HYPOUM2   0  20  0.00    -      HYPOUM2  21  32 65.63    -      HYPOUM2   9  89 10.11
-      MTCLUM1   6  41 14.63    -      MTCLUM1   1  29  3.45    -      MTCLUM1  20 138 14.49
-      MTCLUM2   3  20 15.00    -      MTCLUM2   0  32  0.00    -      MTCLUM2   4  76  5.26
-      YGRVLM2   7  24 29.17    -      YGRVLM2  11  35 31.43    -      YGRVLM2  38 116 32.76
-      CSPNLM1  32  39 82.05    -      CSPNLM1  20  21 95.24    -      CSPNLM1 113 147 76.87
-      CSPNLM2   4  24 16.67    -      CSPNLM2   2  36  5.56    -      CSPNLM2  16 124 12.90
-      C6LM1     4  37 10.81    -      C6LM1     1  20  5.00    -      C6LM1     4 106  3.77
-      C6LM2     0  24  0.00    -      C6LM2     0  36  0.00    -      C6LM2     1  87  1.15
-      C7LM1     2  36  5.56    -      C7LM1     1  32  3.13    -      C7LM1     7 109  6.42
-      C7LM2     1  25  4.00    -      C7LM2     1  39  2.56    -      C7LM2     1  89  1.12

MHR    SHOVUI1  77 186 41.40    KUZ    SHOVUI1   1  13  7.69    SYDm2  SHOVUI1  54 143 37.76
-      SHOVUI2  22 181 12.15    -      SHOVUI2   5  14 35.71    -      SHOVUI2  46 140 32.86
-      MLRUI1  106 177 59.89    -      MLRUI1    2  13 15.38    -      MLRUI1   34 103 33.01
-      HYPOUM1 163 195 83.59    -      HYPOUM1  23  23 100.0    -      HYPOUM1 150 153 98.04
-      HYPOUM2  10 164  6.10    -      HYPOUM2  11  22 50.00    -      HYPOUM2  14 113 12.39
-      MTCLUM1  43 191 22.51    -      MTCLUM1   2  21  9.52    -      MTCLUM1  12 152  7.89
-      MTCLUM2  33 153 21.57    -      MTCLUM2   1  24  4.17    -      MTCLUM2   6  99  6.06
-      YGRVLM2  30 161 18.63    -      YGRVLM2   5  15 33.33    -      YGRVLM2  14 105 13.33
-      CSPNLM1 170 192 88.54    -      CSPNLM1  10  15 66.67    -      CSPNLM1 126 146 86.30
-      CSPNLM2  30 178 16.85    -      CSPNLM2   1  14  7.14    -      CSPNLM2  19 138 13.77
-      C6LM1    13 191  6.81    -      C6LM1     0  14  0.00    -      C6LM1     3 101  2.97
-      C6LM2     3 174  1.72    -      C6LM2     0  15  0.00    -      C6LM2     0  95  0.00
-      C7LM1    25 191 13.09    -      C7LM1     0  18  0.00    -      C7LM1    11 104 10.58
-      C7LM2     3 177  1.69    -      C7LM2     0  18  0.00    -      C7LM2     2  96  2.08

MDA    SHOVUI1  80 163 49.08    MOL    SHOVUI1   4  25 16.00    TANm2  SHOVUI1  31 149 20.81
-      SHOVUI2  23 161 14.29    -      SHOVUI2  11  27 40.74    -      SHOVUI2  21 148 14.19
-      MLRUI1   60 153 39.22    -      MLRUI1    9  23 39.13    -      MLRUI1   40 125 32.00
-      HYPOUM1 155 169 91.72    -      HYPOUM1  41  41 100.0    -      HYPOUM1 143 149 95.97
-      HYPOUM2  10 153  6.54    -      HYPOUM2  23  37 62.16    -      HYPOUM2  14  79 17.72
-      MTCLUM1  36 156 23.08    -      MTCLUM1   3  39  7.69    -      MTCLUM1   4 147  2.72
-      MTCLUM2  34 138 24.64    -      MTCLUM2   3  37  8.11    -      MTCLUM2   4  69  5.80
-      YGRVLM2  31 133 23.31    -      YGRVLM2   5  33 15.15    -      YGRVLM2  17 106 16.04
-      CSPNLM1 149 161 92.55    -      CSPNLM1  29  33 87.88    -      CSPNLM1 130 148 87.84
-      CSPNLM2  32 158 20.25    -      CSPNLM2   2  35  5.71    -      CSPNLM2  14 109 12.84
-      C6LM1    12 158  7.59    -      C6LM1     3  33  9.09    -      C6LM1     6 126  4.76
-      C6LM2     5 152  3.29    -      C6LM2     0  35  0.00    -      C6LM2     0 114  0.00
-      C7LM1    27 165 16.36    -      C7LM1     2  39  5.13    -      C7LM1    14 126 11.11
-      C7LM2     7 158  4.43    -      C7LM2     1  36  2.78    -      C7LM2     0 112  0.00

MRT    SHOVUI1  81 198 40.91    CHU    SHOVUI1  64 194 32.99    GUJsw  SHOVUI1  50 160 31.25
-      SHOVUI2  24 194 12.37    -      SHOVUI2  33 191 17.28    -      SHOVUI2  33 160 20.63
-      MLRUI1   95 194 48.97    -      MLRUI1   88 194 45.36    -      MLRUI1  117 160 73.13
-      HYPOUM1 170 197 86.29    -      HYPOUM1 192 193 99.48    -      HYPOUM1 155 160 96.88
-      HYPOUM2   4 179  2.23    -      HYPOUM2  80 187 42.78    -      HYPOUM2  31 159 19.50
-      MTCLUM1  56 193 29.02    -      MTCLUM1  50 191 26.18    -      MTCLUM1  42 160 26.25
-      MTCLUM2  32 169 18.93    -      MTCLUM2  32 183 17.49    -      MTCLUM2  16 159 10.06
-      YGRVLM2  51 181 28.18    -      YGRVLM2  51 182 28.02    -      YGRVLM2  19 160 11.88
-      CSPNLM1 166 195 85.13    -      CSPNLM1 188 192 97.92    -      CSPNLM1 127 159 79.87
-      CSPNLM2  28 192 14.58    -      CSPNLM2  53 191 27.75    -      CSPNLM2  19 158 12.03
-      C6LM1    17 194  8.76    -      C6LM1    13 186  6.99    -      C6LM1    19 160 11.88
-      C6LM2     5 191  2.62    -      C6LM2     1 186  0.54    -      C6LM2     0 157  0.00
-      C7LM1    15 198  7.58    -      C7LM1    48 195 24.62    -      C7LM1    23 160 14.38
-      C7LM2     1 197  0.51    -      C7LM2    18 194  9.28    -      C7LM2     4 159  2.52

PNT    SHOVUI1  52 176 29.55    GPD    SHOVUI1  63 175 36.00    KOHsw  SHOVUI1  54 162 33.33
-      SHOVUI2  27 177 15.25    -      SHOVUI2  22 174 12.64    -      SHOVUI2  28 157 17.83
-      MLRUI1  110 177 62.15    -      MLRUI1   85 176 48.30    -      MLRUI1  101 162 62.35
-      HYPOUM1 177 182 97.25    -      HYPOUM1 177 178 99.44    -      HYPOUM1 159 161 98.76
-      HYPOUM2  40 170 23.53    -      HYPOUM2  55 170 32.35    -      HYPOUM2  33 160 20.63
-      MTCLUM1  58 182 31.87    -      MTCLUM1  49 178 27.53    -      MTCLUM1  41 161 25.47
-      MTCLUM2  34 168 20.24    -      MTCLUM2  36 168 21.43    -      MTCLUM2  11 158  6.96
-      YGRVLM2  67 165 40.61    -      YGRVLM2  65 166 39.16    -      YGRVLM2  24 160 15.00
-      CSPNLM1 173 181 95.58    -      CSPNLM1 169 171 98.83    -      CSPNLM1 137 161 85.09
-      CSPNLM2  43 180 23.89    -      CSPNLM2  64 172 37.21    -      CSPNLM2  30 160 18.75
-      C6LM1    22 182 12.09    -      C6LM1    21 169 12.43    -      C6LM1    10 161  6.21
-      C6LM2     5 179  2.79    -      C6LM2     5 172  2.91    -      C6LM2     2 160  1.25
-      C7LM1    31 181 17.13    -      C7LM1    23 172 13.37    -      C7LM1    14 161  8.70
-      C7LM2    11 182  6.04    -      C7LM2    19 173 10.98    -      C7LM2     3 160  1.88

KHO    SHOVUI1  33 122 27.05    AWAm1  SHOVUI1  48 162 29.63    TRKd   SHOVUI1  72 161 44.72
-      SHOVUI2  24 121 19.83    -      SHOVUI2  36 161 22.36    -      SHOVUI2  48 152 31.58
-      MLRUI1   77 127 60.63    -      MLRUI1   96 163 58.90    -      MLRUI1  106 161 65.84
-      HYPOUM1 134 136 98.53    -      HYPOUM1 158 167 94.61    -      HYPOUM1 151 161 93.79
-      HYPOUM2  15  61 24.59    -      HYPOUM2  14 112 12.50    -      HYPOUM2  22 160 13.75
-      MTCLUM1   9 133  6.77    -      MTCLUM1   8 160  5.00    -      MTCLUM1  51 161 31.68
-      MTCLUM2   4  51  7.84    -      MTCLUM2  10 102  9.80    -      MTCLUM2  27 160 16.88
-      YGRVLM2  16  86 18.60    -      YGRVLM2  37 136 27.21    -      YGRVLM2  20 161 12.42
-      CSPNLM1 111 128 86.72    -      CSPNLM1 144 162 88.89    -      CSPNLM1 130 161 80.75
-      CSPNLM2  10  80 12.50    -      CSPNLM2  14 136 10.29    -      CSPNLM2  30 156 19.23
-      C6LM1     4 129  3.10    -      C6LM1     7 162  4.32    -      C6LM1    18 161 11.18
-      C6LM2     0  85  0.00    -      C6LM2     0 138  0.00    -      C6LM2     4 161  2.48
-      C7LM1    12 129  9.30    -      C7LM1    12 165  7.27    -      C7LM1    35 161 21.74
-      C7LM2     1  90  1.11    -      C7LM2     4 142  2.82    -      C7LM2    19 161 11.80

SKH    SHOVUI1   0   9  0.00    MDK    SHOVUI1  73 179 40.78    UTHd   SHOVUI1  65 159 40.88
-      SHOVUI2   0   9  0.00    -      SHOVUI2  38 173 21.97    -      SHOVUI2  35 153 22.88
-      MLRUI1    2   9 22.22    -      MLRUI1  125 178 70.22    -      MLRUI1  116 159 72.96
-      HYPOUM1  11  14 78.57    -      HYPOUM1 178 181 98.34    -      HYPOUM1 150 159 94.34
-      HYPOUM2   2  13 15.38    -      HYPOUM2  15 150 10.00    -      HYPOUM2  24 159 15.09
-      MTCLUM1   3   9 33.33    -      MTCLUM1   5 178  2.81    -      MTCLUM1  39 159 24.53
-      MTCLUM2   2  14 14.29    -      MTCLUM2  17 147 11.56    -      MTCLUM2  21 159 13.21
-      YGRVLM2   5  14 35.71    -      YGRVLM2  38 144 26.39    -      YGRVLM2   7 156  4.49
-      CSPNLM1   9  15 60.00    -      CSPNLM1 158 176 89.77    -      CSPNLM1 151 159 94.97
-      CSPNLM2   1  15  6.67    -      CSPNLM2  36 160 22.50    -      CSPNLM2  34 155 21.94
-      C6LM1     1  14  7.14    -      C6LM1     9 177  5.08    -      C6LM1    21 159 13.21
-      C6LM2     0  15  0.00    -      C6LM2     2 165  1.21    -      C6LM2     1 154  0.65
-      C7LM1     1  15  6.67    -      C7LM1    10 176  5.68    -      C7LM1    20 159 12.58
-      C7LM2     0  15  0.00    -      C7LM2     1 163  0.61    -      C7LM2     6 154  3.90

TMG    SHOVUI1   1   7 14.29    SWT    SHOVUI1  59 177 33.33    YSFsw  SHOVUI1  53 181 29.28
-      SHOVUI2   2   7 28.57    -      SHOVUI2  36 176 20.45    -      SHOVUI2  27 180 15.00
-      MLRUI1    3   8 37.50    -      MLRUI1  131 181 72.38    -      MLRUI1  139 181 76.80
-      HYPOUM1  17  22 77.27    -      HYPOUM1 177 180 98.33    -      HYPOUM1 180 181 99.45
-      HYPOUM2   0  13  0.00    -      HYPOUM2  26 120 21.67    -      HYPOUM2  33 177 18.64
-      MTCLUM1   4  19 21.05    -      MTCLUM1  14 179  7.82    -      MTCLUM1  30 181 16.57
-      MTCLUM2   0  13  0.00    -      MTCLUM2  16 112 14.29    -      MTCLUM2  18 177 10.17
-      YGRVLM2   3  18 16.67    -      YGRVLM2  41 149 27.52    -      YGRVLM2  67 181 37.02
-      CSPNLM1  19  25 76.00    -      CSPNLM1 162 173 93.64    -      CSPNLM1 155 180 86.11
-      CSPNLM2   3  17 17.65    -      CSPNLM2  27 150 18.00    -      CSPNLM2  11 181  6.08
-      C6LM1     0  22  0.00    -      C6LM1     8 172  4.65    -      C6LM1     9 180  5.00
-      C6LM2     1  18  5.56    -      C6LM2     0 154  0.00    -      C6LM2     1 180  0.56
-      C7LM1     2  24  8.33    -      C7LM1    20 176 11.36    -      C7LM1    11 180  6.11
-      C7LM2     2  20 10.00    -      C7LM2     1 165  0.61    -      C7LM2     0 180  0.00

NeoMRG SHOVUI1  18  28 64.29    WAKg   SHOVUI1  37 145 25.52    GUJm2  SHOVUI1  89 155 57.42
-      SHOVUI2  17  37 45.95    -      SHOVUI2  26 144 18.06    -      SHOVUI2  53 151 35.10
-      MLRUI1   15  26 57.69    -      MLRUI1  100 145 68.97    -      MLRUI1   48 143 33.57
-      HYPOUM1  35  42 83.33    -      HYPOUM1 133 144 92.36    -      HYPOUM1 151 157 96.18
-      HYPOUM2   2  41  4.88    -      HYPOUM2  13 111 11.71    -      HYPOUM2   9 103  8.74
-      MTCLUM1   7  28 25.00    -      MTCLUM1  13 141  9.22    -      MTCLUM1   7 155  4.52
-      MTCLUM2  10  25 40.00    -      MTCLUM2  10 102  9.80    -      MTCLUM2   9  95  9.47
-      YGRVLM2  12  37 32.43    -      YGRVLM2  12 102 11.76    -      YGRVLM2  15  99 15.15
-      CSPNLM1  39  43 90.70    -      CSPNLM1 120 141 85.11    -      CSPNLM1 125 152 82.24
-      CSPNLM2   3  49  6.12    -      CSPNLM2  20 114 17.54    -      CSPNLM2  11 120  9.17
-      C6LM1     3  37  8.11    -      C6LM1     5 140  3.57    -      C6LM1     2 110  1.82
-      C6LM2     0  44  0.00    -      C6LM2     1 115  0.87    -      C6LM2     1  90  1.11
-      C7LM1     4  40 10.00    -      C7LM1    10 143  6.99    -      C7LM1    10 110  9.09
-      C7LM2     0  43  0.00    -      C7LM2     1 113  0.88    -      C7LM2     1  90  1.11

ChlMRG SHOVUI1  13  25 52.00    WAKs   SHOVUI1  31 158 19.62    AWAm2  SHOVUI1  25 141 17.73
-      SHOVUI2  14  24 58.33    -      SHOVUI2  20 156 12.82    -      SHOVUI2  12 143  8.39
-      MLRUI1   14  25 56.00    -      MLRUI1  112 160 70.00    -      MLRUI1   57 120 47.50
-      HYPOUM1  22  22 100.0    -      HYPOUM1 159 162 98.15    -      HYPOUM1 144 147 97.96
-      HYPOUM2  10  18 55.56    -      HYPOUM2   4 108  3.70    -      HYPOUM2  14 109 12.84
-      MTCLUM1   5  19 26.32    -      MTCLUM1  10 158  6.33    -      MTCLUM1   5 146  3.42
-      MTCLUM2   6  18 33.33    -      MTCLUM2   7  99  7.07    -      MTCLUM2   6 102  5.88
-      YGRVLM2   6  22 27.27    -      YGRVLM2  12 117 10.26    -      YGRVLM2  15 125 12.00
-      CSPNLM1  20  23 86.96    -      CSPNLM1 127 159 79.87    -      CSPNLM1 125 151 82.78
-      CSPNLM2   2  24  8.33    -      CSPNLM2  18 120 15.00    -      CSPNLM2  11 124  8.87
-      C6LM1     5  23 21.74    -      C6LM1     4 158  2.53    -      C6LM1     4 126  3.17
-      C6LM2     2  18 11.11    -      C6LM2     0 120  0.00    -      C6LM2     0 125  0.00
-      C7LM1     3  25 12.00    -      C7LM1    18 158 11.39    -      C7LM1    10 127  7.87
-      C7LM2     0  24  0.00    -      C7LM2     0 120  0.00    -      C7LM2     0 128  0.00

HAR    SHOVUI1   2  15 13.33    SAP    SHOVUI1   2  19 10.53
-      SHOVUI2   4  16 25.00    -      SHOVUI2   5  17 29.41
-      MLRUI1    8  12 66.67    -      MLRUI1    4  17 23.53
-      HYPOUM1  16  16 100.0    -      HYPOUM1  36  36 100.0
-      HYPOUM2   2  18 11.11    -      HYPOUM2  23  32 71.88
-      MTCLUM1   6  13 46.15    -      MTCLUM1   3  37  8.11
-      MTCLUM2   4  16 25.00    -      MTCLUM2   2  34  5.88
-      YGRVLM2   3  31  9.68    -      YGRVLM2   7  38 18.42
-      CSPNLM1  17  20 85.00    -      CSPNLM1  22  28 78.57
-      CSPNLM2   0  33  0.00    -      CSPNLM2   2  41  4.88
-      C6LM1     1  20  5.00    -      C6LM1     3  25 12.00
-      C6LM2     0  28  0.00    -      C6LM2     0  40  0.00
-      C7LM1     1  22  4.55    -      C7LM1     1  38  2.63
-      C7LM2     0  28  0.00    -      C7LM2     0  43  0.00
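The F column of the table is consistent with a percentage computed from the other two columns, F = 100 × P / N, rounded to two decimals (for example, 9 / 24 gives 37.50). The sketch below reproduces that arithmetic for a few rows. Reading P as the number of individuals showing the trait and N as the number of individuals scored is an assumption, since the table header does not define the columns, and the function name is hypothetical.

```python
def trait_frequency(present, scored):
    """Percentage frequency of a trait, rounded to two decimals.

    Returns None when no individuals could be scored.
    """
    if scored == 0:
        return None
    if not 0 <= present <= scored:
        raise ValueError("present must lie between 0 and scored")
    return round(100.0 * present / scored, 2)


# A few (sample, trait, P, N, F) rows copied from the table for checking.
ROWS = [
    ("INM", "SHOVUI1", 9, 24, 37.50),
    ("DJR", "SHOVUI1", 3, 16, 18.75),
    ("KARa", "SHOVUI1", 86, 149, 57.72),
    ("DJR", "HYPOUM1", 30, 30, 100.0),
]

if __name__ == "__main__":
    for sample, trait, p, n, f_reported in ROWS:
        print(sample, trait, trait_frequency(p, n), f_reported)
```

The same function can be run over every row as a consistency check on the tabulated frequencies.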
