Genetic Polymorphism of Mitochondrial Control Region As a Tool for Forensic Discrimination and Comparison of Its Diversity Among Various Ethnic Groups of Pakistan

GENETIC POLYMORPHISM OF MITOCHONDRIAL CONTROL REGION AS A TOOL FOR FORENSIC DISCRIMINATION AND COMPARISON OF ITS DIVERSITY AMONG VARIOUS ETHNIC GROUPS OF PAKISTAN

A THESIS SUBMITTED TO UNIVERSITY OF HEALTH SCIENCES IN FULFILMENT OF THE

DOCTOR OF PHILOSOPHY

IN

DISCIPLINE OF HUMAN GENETICS AND MOLECULAR BIOLOGY

By

Dr. Shahzad Bhatti

JULY, 2015

UNIVERSITY OF HEALTH SCIENCES - PAKISTAN

CERTIFICATE

It is hereby certified that this thesis is based on the results of experiments carried out by Mr. Shahzad Bhatti and that it has not previously been presented for a PhD degree.

Mr. Shahzad Bhatti has done his research work under my supervision. He has fulfilled all the requirements and is qualified to submit the accompanying thesis for the degree of Doctor of Philosophy.

PROF. DR. MUHAMMAD ASLAMKHAN

M.Sc. (), D.Sc. (Mainz, Germany), FFBSP, FRSTM & FPAMS,

Professor & Supervisor

Department of Human Genetics and Molecular Biology

University of Health Sciences

Lahore, Pakistan.

ACKNOWLEDGMENT

All praises and thanks to the grace of ALLAH ALMIGHTY, WHO is the ultimate source of all knowledge to mankind. He bestowed man with intellectual power and understanding and gave him spiritual insight, enabling him to discover his “Self”, know his Creator through His wonders and conquer nature. I bow in obeisance before my Lord, WHO bestowed on me the fortitude and impetus to accomplish this task and to elucidate a drop of the already existing ocean of knowledge, and WHO made me reach the present pedestal of knowledge with the quality of doing something adventurous, novel, thrilling, sensational and path-breaking.

Next to all, His Messenger HAZRAT MUHAMMAD (Peace Be upon Him), Who is an eternal torch of guidance and fountain of knowledge for humanity, and Who led mankind out of the depths of evil and darkness. It is indeed my honor and pleasure to thank Prof. Dr. Muhammad Aslamkhan, University of Health Sciences, Lahore, for providing me this opportunity to work in Human Genetics and Molecular Biology and for supervising me in solving problems. His kind behavior, coupled with a friendly attitude towards science, was a source of inspiration for me. The work presented in this manuscript was accomplished under his enthusiastic guidance, sympathetic attitude and intellectual supervision. I express sincere appreciation, from the core of my heart, to Dr. MH Qazi, Vice Chancellor, The University of Lahore, for his guidance, assistance and encouragement and for providing research facilities in the IMBB Lab during my research work. Without his help I would have been unable to complete my research work. I have no words of thanks for his sincere attitude. With profound gratitude and a deep sense of devotion and obligation, I wish to especially recognize the guidance and encouragement extended by my wife, SANA Abbas, whose continuous support pulled me through the tough hours of my research. Last but not least, I am enormously grateful to all those who taught me even a single word of knowledge. I am greatly thankful to Mr. Shahid (lab attendant) for his assistance and continuous encouragement during the span of my study.

SHAHZAD BHATTI

ACRONYMS

Abbreviation    Details
ATP             Adenosine triphosphate
bp              Base pair
C-stretch       Cytosine stretch
ddNTP           Dideoxynucleotides
ddNTP’s         Dideoxynucleotides
D-loop          Displacement loop
DNA             Deoxyribonucleic acid
GD              Genetic diversity
HVS-I           Hypervariable segment I
HVS-II          Hypervariable segment II
HVS-III         Hypervariable segment III
mRNA            Messenger RNA
mtDNA           Mitochondrial DNA
OXPHOS          Oxidative phosphorylation
P               Probability value
PCR             Polymerase chain reaction
rCRS            Revised Cambridge reference sequence
RFLP            Restriction fragment length polymorphism
RMP             Random match probability
RNA             Ribonucleic acid
rRNA            Ribosomal ribonucleic acid
SNP             Single nucleotide polymorphism
STR             Short tandem repeats
tRNA            Transfer ribonucleic acid
MYA             Million years ago
KYA             Kilo years
AMHs            Anatomically modern humans
ybp             Years before present
kya             Kilo years ago

TABLE OF CONTENTS

CERTIFICATE
ACKNOWLEDGMENT
ACRONYMS
LIST OF FIGURES
LIST OF TABLES
ABSTRACT
INTRODUCTION
REVIEW ABOUT PAKISTAN
SUBJECTS AND METHODS
RESULTS
DISCUSSION
REFERENCES

LIST OF FIGURES

No. Title

1.1 Map of Lake Turkana in Kenya
1.2 Multiregional and out of Africa model
1.3 Route Trajectory followed by the Migrant of Africa
1.4 Demonstrating Out of Africa Route
1.5 Mitochondrial DNA showing different Coding and Non coding regions
1.6 Updated comprehensive phylogenetic tree of global human mitochondrial DNA
1.7 The deeply embedded autochthonous clades of M, N and R, which predominantly inhabit South Asia, and other lineages shared with East Asians and West Eurasians
1.8 Map of Pakistan
1.9 Indo-Gangetic Plain
1.10 Map of Sistan or Helmand River Drainage Basin
1.11 Shows the Indus civilization. The archeological sites are marked with red dots.
1.12 Excavated sites of Harappa (A) and Moenjodaro (B)
1.13 Scanned Route of Indo-Aryan in Indian Subcontinent
1.14 Alexandria on Indus situated at the junction of Indus and Acesines
1.15 Illustration of Map of Pakistan showing administrative divisions and Provinces
1.16 Illustration of Map of Pakistan showing population per square meter in the Provinces.
1.17 Social structuring of South Asian population
3.1 DNA quantification from blood samples
3.2 PCR amplification of HVI & HVII segment of mitochondrial control region
3.3 PCR amplification of HVIII segment of mitochondrial control region
3.4 Sequence of HVI segment of mitochondrial control region and its dendrogram
3.5 Sequence of HVII segment of mitochondrial control region and its dendrogram
3.6 Sequence of HVIII segment of mitochondrial control region and its dendrogram
3.7 Dendrograms of HVI segments of mtDNA in Pakistani population
3.8 Dendrograms of HVII segments of mtDNA in Pakistani population
3.9 Dendrograms of HVIII segments of mtDNA in Pakistani population
3.10 Most common transition was c → t at position 16223bp in HVI region
3.11 Most common transition was a → g at position 263bp in HVIII segment
3.12 Most common transition was t → c at position 489bp of HVIII region
3.13 C-stretch heteroplasmy in HVII region
3.14 Deletion of “C” in control region of mtDNA
3.15 A network relating haplogroup M of Pakistani caste and tribal populations. Circle areas are proportional to the haplotype frequencies and variant nucleotides are numbered and shown along the links between haplotypes.
3.16 Reduced median networks of haplogroup N based on Pakistani caste and tribal population. The numbers along the branches denote mutations with reference to CRS (Anderson et al., 1981), while circle areas are proportional to the haplotype frequency and reticulations indicate parallel mutational pathways.
3.17 A phylogenetic network of haplogroup R of Pakistani caste and tribal populations estimated by reduced median method. The size of each node represents the frequency of each haplotype and the identities of mutations that define major haplogroup subsets are portrayed along selected internodes.
3.18 Major subsets of haplogroup U in Pakistani caste and tribal population are exhibited in reduced median networks. The frequency of each haplotype is shown by nodal size and variant nucleotides are enumerated and displayed along the links between haplotypes.
3.19 A reduced median network of haplogroup J in Pakistani caste and tribal populations. Nodal size indicates the haplotype frequency, reticulation indicates multiple mutations and variant nucleotides are listed along the links between haplotypes.
3.20 Phylogeny of haplogroups W and T estimated by reduced median networks of Pakistani caste and tribal populations. The size of each node is directly proportional to the frequency of haplotypes and variant nucleotides are enumerated and shown along the links between haplotypes.
3.21 A network relating haplogroup H in Pakistani caste and tribal populations. Circle areas are proportional to the haplotype frequencies and variant nucleotides are numbered and revealed along the links between haplotypes.
3.22 An un-rooted neighbor joining network of 22 populations based on genetic distances among caste communities; the inset indicates the relationship for the major population groups.
3.23 Graphical presentation of major haplogroups percentage in 22 populations of Pakistan
3.24 MDS plot of 22 castes of Pakistani population, based on Fst distances.
3.25 Electropherograms of multiplex A & B for individual of Haplogroup M
3.26 Primer extension assay of multiplex A and B that exhibited Haplogroup R
3.27 Electropherograms of multiplex A and B for individual of Haplogroup N
3.28 Electropherograms of multiplex A and B for individual of Haplogroup H
3.29 Electropherograms of multiplex A and B for individual of Haplogroup J
3.30 Electropherograms of multiplex A and B for individual of Haplogroup U
3.31 Electropherograms of multiplex A and B for individual of Haplogroup T
3.32 Mismatch distribution in Rajput population
3.33 Haplotype distance matrix between/within the 22 ethnic groups of Pakistani population.
3.34 Matrix of pairwise Fst
3.35 Average number of pairwise differences within and between population
3.36 Haplotype frequencies in 22 isonym populations of Pakistan
3.37 Expected heterozygosity in 22 ethnic groups of Pakistan
3.38 Number of alleles at different loci in 22 isonym groups of Pakistan
3.39 Molecular diversity indexes of 22 isonym populations of Pakistan.
3.40 Co-ancestry coefficient matrix
3.41 Slatkin’s linearized Fst’s
3.42 Detection of loci under selection from genome scans based on Fst

LIST OF TABLES

No. Title

1.1 Population of Pakistan by Administrative Units according to 1998 Census
1.2 Climatic and Geographical regions of Azad Jammu and Kashmir
1.3 District wise population density in Azad Kashmir.
1.4 Population of Balochistan by Districts according to 1998 Census
1.5 Population of KPK (NWFP & FATA) by Districts according to 1998 Census
1.6 Population of Punjab by Districts according to 1998 Census.
1.7 Population of by Districts according to 1998 Census
2.1 Set of Primers for Amplification of Control region.
2.2 Haplogroups specific SNPs
2.3 Primers for Multiplex assay A & B
2.4 Detail of primers for SBE assay A & B
3.1 Names and number of subhaplogroups found in 22 ethnic groups of Pakistan
3.2 Quantification of DNA of probands from 22 different ethnic/isonym groups from four provinces of Pakistan
3.3 Frequencies of haplogroup M in 22 populations of Pakistan
3.4 Frequency distribution of haplogroup N in 22 ethnic groups
3.5 Frequency distribution of haplogroup R in 22 ethnic groups of Pakistan
3.6 Frequency distribution of haplogroup U in 22 ethnic groups of Pakistani population
3.7 Frequency distribution of haplogroup J in different ethnic groups of Pakistan
3.8 Frequency distribution of haplogroups T and W in different ethnic groups in Pakistan
3.9 Frequency distribution of haplogroup H in different ethnic groups of Pakistan
3.10 AMOVA of 22 populations of Pakistan
3.11 Diversity and demographic parameters inferred from control region of mtDNA
3.12 Molecular diversity indices
3.13 Neutrality tests for 22 populations of Pakistan

ABSTRACT

Because of its geostrategic position at the crossroads of Asia, Pakistan has played a pivotal role in successive human migratory events, both prehistoric and historic. This human movement became possible through an ancient overland network of trails called “The Silk Route”, linking Asia Minor, the Middle East, China, Central Asia and Southeast Asia. The present study was designed to investigate the mitochondrial control region in 500 unrelated individuals from 22 ethnic groups of the Pakistani population, in order to address the genetic diversity, affiliations and origin of the castes of Punjab and the tribes of Baluchistan, KPK and Sindh. The study revealed high genetic diversity in the Pakistani population, comparable to that of other Central Asian, Southeast Asian and European populations. Sequence analysis identified 412 haplotypes, each defined by a particular set of nucleotides (ignoring the C insertions around positions 309 and 315), in the 22 ethnic groups. Of these sequences, 65% were observed once, 11% twice, 8% thrice, 5% four times and 2.2% five times. The most common South Asian haplogroups observed were M (46%), N (7%) and R (13%), while the West Eurasian haplogroups were U (18%), H (5%), J (4%), W (3%) and T (2%). The mean number of pairwise differences ranged from 5.2 ± 2.8 to 12.5 ± 6.2. The random match probability between two unrelated individuals was between 0.01 and 0.06%. Genetic diversity was 0.991 to 0.999, with nucleotide diversity ranging from 0.0089 to 0.0142 for the whole control region in the Pakistani population. The pattern of genetic variation and heterogeneity was further examined through multidimensional scaling and phylogenetic analysis. The results revealed that Pakistani ethnic groups are a composite mosaic of West Eurasian ancestry of numerous geographic origins. They received substantial gene flow during different invasive movements and have a high element of Western provenance.
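As an illustration of how a random match probability and a genetic (haplotype) diversity of the kind quoted above are commonly derived from haplotype counts, a minimal Python sketch follows. It assumes the usual estimators: the sum of squared haplotype frequencies for the match probability, and Nei's unbiased estimator for diversity. The function name `match_statistics` and the toy haplotype labels are hypothetical; they are not data or software from this study.

```python
from collections import Counter


def match_statistics(haplotypes):
    """Return (random match probability, haplotype diversity) for a sample.

    `haplotypes` is a list with one haplotype label per individual.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(haplotypes)
    # Sum of squared haplotype frequencies: the chance that two
    # individuals drawn at random carry the same haplotype.
    rmp = sum((c / n) ** 2 for c in counts.values())
    # Nei's unbiased estimator of haplotype (gene) diversity.
    diversity = (n / (n - 1)) * (1.0 - rmp)
    return rmp, diversity


# Hypothetical toy sample (assumption, not study data): ten individuals,
# two of whom share haplotype "H1".
sample = ["H1", "H1", "H2", "H3", "H4", "H5", "H6", "H7", "H8", "H9"]
rmp, gd = match_statistics(sample)
print(f"RMP = {rmp:.4f}, GD = {gd:.4f}")
```

In a sample where most haplotypes are observed only once, as reported here, the match probability stays low and the diversity approaches 1.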

INTRODUCTION

For many years, scientists, geneticists, biologists and anthropologists, along with the rest of us as curious humans, have asked: who were the earliest inhabitants of Earth? What was our fate? Where did we originate? Evolutionary scientists such as Charles Darwin, Alfred Russel Wallace and Thomas Huxley have helped to clarify these questions. Early hominids diverged from our closest relatives, the apes, approximately 5-7 MYA (million years ago). This is evident from the fossil record and from protein comparisons between Asian and African apes and humans (Stoneking, 2008). Hominid remains dated after 4.2 MYA that do not match the features of the genus Homo are grouped in the genus that preceded it, Australopithecus (Jobling et al., 2013). Fossil remains of this genus have been detected only in Africa, and the oldest among them is Australopithecus anamensis, which was discovered at Lake Turkana (figure 1.1), near the localities of Kanapoi and Allia Bay in northern Kenya (Bloszies et al., 2015). The fossil evidence assembled at this site comprises skeletal material, including fragments of skull, mandible, premolar, molar, tibia and humerus (Cartmill and Smith, 2009), and belongs to the era known as the Lower Pliocene (3.9-4.2 MYA), as described by Jobling (2001).

Another species, Australopithecus afarensis, whose males were of smaller stature than those of A. anamensis, has been discovered in the Eastern Rift Valley, at Hadar in Ethiopia and in Tanzania (De Sousa et al., 2012). A species contemporaneous with A. afarensis was A. bahrelghazali. It is close to its eastern relatives and was found at Koro Toro, central Chad, about 2,500 km west of the Eastern Rift Valley, and is dated to about 3.0-3.5 MYA (Jobling et al., 2013). The two species are differentiated from each other on the basis of salient features such as the mandibular symphysis, the right and left halves of the mandible, and lower premolars with three roots.

Figure 1.1: Map of Lake Turkana in Kenya Source: (http://www.zehabesha.com/kenya-ethiopia-security-teams-to-meet-amid-anxiety-on-lake-turkana/)

Immediately after this era, there appears a transition between the hominid genus Australopithecus and the emerging genus Homo, dated to approximately 2.5 MYA and designated as nascent Homo, with a larger brain and smaller cheek teeth (Jobling et al., 2013; Strait et al., 2015). Homo habilis has an Australopithecus-like brain and a Homo-like face; however, this is disputed, with a claim that a sister species, H. rudolfensis, has such features (Smith and James, 2013).

In addition, controversy surrounds the Homo species with reference to H. erectus and H. ergaster (Bolus, 2015). The only distinction is that H. erectus is sometimes used for non-African individuals while H. ergaster refers to African individuals, but there is actually no difference between the two (Jobling et al., 2013).

The oldest H. erectus, dated to approximately 1.8 to 1.9 MYA, was found near Lake Turkana in Kenya, along with other fossil remnants dated to 1.7 to 1.8 MYA. Nevertheless, it was the only hominid found outside Africa, in Indonesia and China. The only difference between African and Asian H. erectus was body size, as Asian H. erectus had a larger body than its counterpart. This strengthened the idea of its migration out of Africa, owing to a high tolerance of adverse environments (Jobling et al., 2013). The limbs and teeth of H. erectus are the same size as those of AMHs (anatomically modern humans), but the brain is smaller (Bruner et al., 2015). The other two ancestors of AMHs were Homo heidelbergensis (1 MYA) and Homo neanderthalensis, or Neanderthal (250 KYA), with a larger brain. Fossils of Neanderthals were found in Western Asia and Europe, including France, Germany, Israel and Iraq (Jobling et al., 2013).

Evolution of Modern Human:

The evolution of hominid species into AMHs is explained by three hypotheses: i) the Multiregional hypothesis, ii) the Replacement or Out of Africa hypothesis and iii) the Assimilation hypothesis (Ambrose, 2001). However, in the modern era the emphasis has been much more on genetic data and on archaeological and anthropological evidence than on geographical distribution (Stringer, 2002; Mellars, 2006).

According to the Multiregional hypothesis, AMHs evolved (1 MYA) from H. erectus that had migrated out of Africa to various regions of the world (Nei, 1989), giving rise to our worldwide distribution; e.g., Asian H. erectus evolved into Asian modern humans and African H. erectus into African modern humans (D'Errico et al., 2015). Nevertheless, Wolpoff et al. (1984) opposed that hypothesis, arguing that this model does not suggest parallel evolution. However, independent multiple origins of AMHs that remained unchanged since the time of their ancestors more than 1 MYA seem doubtful (Nei, 1989). Furthermore, X-chromosomal sequences attributed to H. erectus have been found in modern humans (Hammer et al., 2011), giving genetic support to the multiregional hypothesis.

The Out of Africa hypothesis supports the African origin of AMHs from African H. erectus, not from Asian H. erectus, about 100,000-200,000 years ago (Nei, 1989) or 150 KYA. Moreover, AMHs remained in Africa and later migrated to the Middle East and the rest of the world (Forster et al., 2002; Ermini et al., 2015). This migration is supported by mitochondrial DNA (mtDNA) and Y-chromosomal data, both of which are uniparentally inherited (Cann et al., 1987).

Figure 1.2: Multiregional and out of Africa model (Campbell and Loy, 2000) Source: http://blogs.discovermagazine.com/gnxp/tag/out-of-africa

The archaeological and fossil records and the genetic evidence also support the migration of anatomically modern humans out of Africa (200,000 ybp, years before present), but the migration route is not properly understood. Further migration out of Africa is thought to have commenced about 55,000 to 85,000 ybp (Yotova et al., 2007). Moreover, the megadroughts between 135,000 and 75,000 ybp coincided with the timing of the migration of modern humans out of Africa (Scholz et al., 2007). It is now established that Africa is the source of the migration of modern humans because of its higher genetic variation compared with any other world population (Kivisild et al., 1999; Macaulay et al., 1999; Forster et al., 2002; Mellars, 2006; Hudjashov et al., 2007; Campbell and Tishkoff, 2008; Chandrasekar et al., 2009; Kumar et al., 2009; Pugach & Stoneking, 2015).

Migration does not always follow a symmetrical pattern, especially when gender is considered: men by nature seem to be more migratory than women. This gender bias may appear in the form of genetic and genomic variation that is slowly channelled through the father or mother and transmitted to the daughter or son (Hugo, 2009).

The route followed by the migrants out of Africa is the “southern exit route”: from the Horn of Africa, through the mouth of the Red Sea (Majumder, 2010), along the coastal route to southern Asia, with colonization of the Asian subcontinent, especially the Middle East, Iran, Pakistan and Southwest Asia, and finally Australia (Oppenheimer, 2012). Moreover, the southwestern Asian corridor that extends between Anatolia and the Iranian Plateau and leads to the Indo-Gangetic plain (Pakistan) is characterized by a meshwork of various genetic and anthropological boundaries with numerous diversified languages, such as the Sino-Tibetan, Turkish and Indo-European languages (Oppenheimer, 2012).

Figure 1.3: Route Trajectory followed by the Migrant of Africa Source: http://forums.bharat-rakshak.com/viewtopic.php?p=1326608

This route is further verified by the presence of the M and N haplogroups (descendants of the African L3 haplogroup), which have a large number of lineages in South Asia. The N lineage is much older in South Asia than in Europe and the Middle East (Richard et al., 2006), and the most likely date of dispersal through the southern exit route is about 70,000-80,000 years ago (Kivisild, 2015).

The geographical evidence in support of this route has been scanty, owing to the submergence of coastlines by a rapid rise in sea levels. However, recent evidence suggests that human migration from Africa followed the “North exit route” through the Nile Valley into Central Asia and then spread through South Asia (Lahr and Foley, 1998; Fregel et al., 2015). South Asia is considered a paragon for geneticists and evolutionary biologists, as it holds 4,635 anthropologically well-defined populations, of which 532 are tribes, and they vary from each other with respect to culture, social norms, dress, food and habits (Thangaraj et al., 2003).

Figure 1.4: Demonstrating Out of Africa Route (Richard et al., 2006) Source: http://r2dnainfo.blogspot.com/2011/11/out-of-india-after-africa-nat-geo-now.html

The Assimilation hypothesis is a mixture of the previously described hypotheses (Stringer, 2002). According to this model, AMHs came out of Africa, and the model also proposes the evolution of various H. erectus populations into AMHs (Hershkovitz et al., 2015). This idea was defended by Prüfer et al. (2014), who observed that the Neanderthal genome shares greater affinity with AMHs of Eurasia than with those of Africa.

Mitochondrial DNA:

A typical somatic cell consists of a central nucleus and the surrounding cytoplasm, bounded by the cell membrane. The nucleus is the center of the cell and controls all of its trafficking through the transcription and translation of functional and structural proteins. The cytoplasm contains many small organelles, such as the endoplasmic reticulum, mitochondria, ribosomes and Golgi complex. The endoplasmic reticulum and ribosomes are involved in protein synthesis, while the Golgi complex is involved in packaging proteins for transport within and out of the cell. The mitochondrion, however, is a double-membrane-bounded structure called the powerhouse of the cell (Berdanie, 2005). Mitochondria were first recognized as discrete organelles in 1840 by Richard Altman, and the organelle was named the mitochondrion by Ernster and Schatz (1981). Almost 100 years later, in 1948, mitochondria were isolated by the zonal centrifugation technique. They are about 1 to 10 micrometers long and 0.1 to 0.5 µ in diameter, and are central to energy production in the cell. Mitochondrial DNA (mtDNA) consists of two strands: a heavy strand, with a high content of purine bases, and a light strand, with a high content of pyrimidine bases.

It is considered that the mitochondrion was once a bacterium that was engulfed by a eukaryotic cell and became part of it; this theory has not been refuted (Starr et al., 2015).

Mitochondrial DNA is maternally inherited and haploid in nature. The human mitochondrial genome was first sequenced in 1981, and since then hundreds of sequences have been determined in different regions of the world. Mitochondrial DNA is a small, circular molecule of 16,569 bp (base pairs). Its noncoding portion contains the control region, of 1150 bp, which contains an origin of replication, origins of transcription and translation, and supplementary transcription and replication control elements. The structure of mitochondrial DNA is shown in figure 1.5.

Mitochondrial DNA consists of 37 genes that code for 13 proteins, 22 tRNAs and two rRNAs, with a small noncoding region designated the control region or D-loop (Capt et al., 2015). Moreover, the sequence variability of the highly polymorphic mtDNA is concentrated in the control region segments HVI, HVII and HVIII.

Figure 1.5: Mitochondrial DNA showing different Coding and Non coding regions (Wan-ru et al., 2007)

Because sequence variation is concentrated in these segments, they are widely used for evolutionary as well as forensic identification: HVI corresponds to (15975 to 16420), HVII to (08 – 429) and HVIII to (4362 – 599) (Anderson et al., 1981). Paternal inheritance has also been observed in the blastocysts of some abnormal embryos (St John et al., 2000), but generally that contribution is insignificant (0.7%) (Zsurka et al., 2005). A typical ovum contains 100,000 mtDNA genomes, whereas a spermatocyte has 100 mtDNA genomes (Chen et al., 1995).

From Orthodox to Haploid Markers: Datum Folder of South Asia

Karl Landsteiner discovered the ABO blood groups in 1901, and they were the first markers used to evaluate genetic variation in modern humans (Landsteiner et al., 1925). Later, Cavalli-Sforza and his coworkers (1967) reconstructed the prehistory of human beings by the use of allelic distributions and phylogenetic analysis of fifteen Asian populations. The advent of the DNA era made it possible to compare DNA fragments among individuals and to trace their origin to the most recent common ancestor (MRCA). Since mtDNA is an important tool in population genetics, earlier mtDNA analysis was done with 14 restriction enzymes, e.g., AccI at sites 14465 and 15254, HinfI at site 12308, NlaIII at site 4216 and MseI at site 14766 (Torroni et al., 1992), which was later transferred to MspI, both targeting the sequence at site C:CGG (Kivisild et al., 1999; Quintana-Murci et al., 2004). More recently, restriction enzymes have been employed in the mtDNA coding region in the Pakistani Pathan ethnic group to determine the diagnostic SNPs for haplogroups that were already defined by partial sequencing of the hypervariable regions HVI, HVII and HVIII (Rakha et al., 2011).

Macro Haplogroup Classification: Dissecting the South Asian Genetics:

Gender variability is crucial where evolutionary events are concerned, since migration to new geographical areas is contingent on these differential rates. The mtDNA of a maternal lineage changes over time through the accumulation of mutations: the greater the time elapsed since the common ancestor, the greater the number of mutations. In addition, earlier changes remain embedded in the mtDNA segment that carries later changes, serving as a signature called a haplotype (Nordberg, 1997; Emery et al., 2015). These haplotypes are clustered on the basis of similarity in their genetic material into groups called haplogroups, which are named with letters of the alphabet, e.g., L0 (the oldest group), L3, M, N and R. The genealogical and phylogenetic relationship of these haplogroups is shown in figure 1.6:

Figure 1.6: Global Phylogenetic tree of mtDNA showing evolutionary haplogroups (van Oven and Kayser, 2009).

Each haplogroup is defined by specific genetic variants. Additional variants within the haplogroup assist in inferring its age: the fewer the variations among the DNA variants, the younger the haplogroup, and vice versa (Kingman, 2000). If the amount of variation accumulated per year remains constant, then one can roughly estimate the age of a particular haplogroup. These SNP variants carry the footprints of our ancestral lineage (Kingman, 2000).
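The clock argument above can be made concrete with a small Python sketch. It takes the mean number of mutations separating sampled lineages from the founder haplotype (often called the rho statistic) and multiplies it by an assumed time per mutation. This is one common approach and not necessarily the one applied in this thesis; the function name `rho_age`, the toy mutation counts and the figure of 20,000 years per mutation are hypothetical placeholders.

```python
def rho_age(mutation_counts, years_per_mutation):
    """Estimate haplogroup age from mean mutational distance to the founder.

    mutation_counts: one integer per sampled lineage, giving the number of
    mutations that separate it from the founder haplotype.
    years_per_mutation: assumed constant clock rate (years per mutation).
    """
    if not mutation_counts:
        raise ValueError("no lineages supplied")
    # rho: average number of mutations accumulated since the founder.
    rho = sum(mutation_counts) / len(mutation_counts)
    # Under a constant rate, elapsed time scales linearly with rho.
    return rho * years_per_mutation


# Hypothetical values for illustration only (not study data).
counts = [0, 1, 1, 2, 3, 2]
YEARS_PER_MUTATION = 20_000  # placeholder rate, an assumption
age = rho_age(counts, YEARS_PER_MUTATION)
print(f"estimated age: {age:.0f} years")
```

A haplogroup whose lineages sit few mutations from the founder yields a small rho and therefore a young age estimate, which is the relationship the paragraph describes.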

The matrilineal lineages of South Asia mainly comprise several aboriginal, deeply rooted lineages derived directly from the basal node M (Metspalu et al., 2004). Recent knowledge shows that macro-haplogroup M of mtDNA and haplogroup C of the Y chromosome provide significant evidence that the “Out of Africa” migration of modern humans followed the southern route to Australia (ArunKumar et al., 2015). All mtDNA lineages present between South Asia and Oceania are rooted directly in the two clades of non-African origin, M and N, as described in figure 1.7 (Thangaraj et al., 2006).

Figure 1.7: The deeply embedded autochthonous clades of M, N and R, which predominantly inhabit South Asia, and other lineages shared with East Asians and West Eurasians (Gounder Palanichamy et al., 2004).

Molecular-clock estimates from genetic evidence suggest that the coastal dispersal took place approximately 66,000 ybp, across the border of the Subcontinent to Oceania (Macaulay et al., 2005; Chan et al., 2015). M, N and R (a sub-haplogroup of N) were the founder haplogroups of the first humans who entered the Indian subcontinent. These haplogroups diverged rapidly once they reached the Indian subcontinent, and their descendants formed a diversified and spectacular genetic makeup in this region in the form of the sub-haplogroups HV, J, T and B (Maji et al., 2009), as shown in figure 1.7.

Roychoudhury et al. (2000) examined the “Out of Africa” migration of modern humans through sequence analysis of 23 ethnic groups of different geographic, linguistic and demographic origin, which shared a small number of haplotypes. The results indicated that females who arrived in the earlier wave of the “Out of Africa” migration of modern humans were possibly the founders of Indian populations, and that ethnic diversity might be the result of demographic expansion and geographical dispersion.

The frequency of haplogroup M was high in tribal as well as Dravidian populations of southern India, while the northern Indian populations exhibited higher frequencies of haplogroup U than of haplogroup M, which indicated Caucasoid admixture in them. Tentatively, sharing of haplotypes between Indian and South Asian (Pakistan) populations was observed exclusively. Moreover, there was subtle evidence that South Asians were peopled by two waves of migration, one originating from India and the other from Southern China. These outcomes were inferred in the light of previous genomic and historical studies.

Kivisild et al. (2003) explored the variation in mtDNA, the Y chromosome and one autosomal marker. They compared the results for six caste groups and two tribal populations (Chenchus and Koyas) in India. Tribal group phylogenies showed a high frequency of M and N (Indian-specific haplogroups), while Y-chromosomal analysis showed the H, L and R2 haplogroups in both caste and tribal groups; in particular, R1a (previously associated with the Aryan invasion) was found at a high frequency (26%) in the Chenchu tribe. Furthermore, they proposed that southern and western Asians are the ancestors of the Indian tribal and caste populations, influenced by external gene flow.

Al-Zahery et al., (2003) used mtDNA and the Y chromosome as uniparental markers to explain haplogroup diversity in Iraq. The most numerous mtDNA haplogroups were H, J, T and U, while the Y-chromosomal haplogroups were J(xM172) and J-M172, thought to originate from Eastern Eurasia and to prevail throughout western Eurasia. Moreover, Y-chromosomal analysis indicated that the spread of genes was male-dominated rather than female-dominated.

Comas et al., (2004) worked on the mtDNA HVI segment of 12 Central Asian populations. They found that the most prevalent haplogroups were those common in East Asians, such as M and its subclades, whereas South Asian-specific haplogroups were present in small fractions. West Eurasian haplogroups, including N and its subclades, were also seen. It was proposed that the genetic diversity found in Central Asia was due to admixture of Eastern and Western Eurasian populations.

Metspalu et al., (2004) scrutinized the mtDNA control and coding regions of the Indian population (including caste and tribal populations from all over India and a portion of the Iranian population). Results showed a high frequency of haplogroup M and its subclades, i.e., 40% in Gujrat (Punjab) and 65% in the southern states. The frequency of haplogroup W was higher in a caste group (15%) than in the tribal population (8%). The most frequent West Eurasian haplogroup was U, which existed throughout the Indian subcontinent and included subclades U2i, U2b and U2c. In addition, haplogroup U7 was present at about 9% and 12% in Gujrat and Punjab in India, respectively. Over 90% of the haplogroups found in Iran were HV, T, J, N1, N2 and X, which are frequently found in West Eurasia. Moreover, haplogroup U was found at 29% and its sub-haplogroup U7 at about 9.4%. In contrast, haplogroup M was found at a very low frequency (5.3%) in the Iranian population.

Gounder-palanichamy et al., (2004) addressed the controversy over macrohaplogroup N lineages between India and western Eurasia. To resolve this discrepancy, control-region-based N haplogroups were selected across India for complete mtDNA sequencing. They identified the unique haplogroup N5 and the subclade R30, which was further divided into R8 and R7. They also identified and reconfirmed pre-existing haplogroups such as U (U2a, U2b and U2c) and R (R5 and R6).

Watkins et al., (2005) investigated mitochondrial DNA variation in 9 tribes and 8 caste groups of South Asia. A high percentage of haplogroup M indicated a strong affinity to East Eurasians in the Austro-Asiatic tribes of South Asia, while haplogroup U showed a high affinity to Western Eurasians in all caste groups. Higher heterozygosity was observed in the northern and eastern tribal populations (0.69 to 0.74) than in South Indians (0.54 to 0.69). Estimated pairwise distances showed that Indian tribes were most closely related to Indian castes, but geographic separation did not predict the distance between tribes and castes.

Rajkumar and his coworkers (2005) analyzed complete human mtDNA sequences to determine the phylogeny and antiquity of different lineages belonging to haplogroups L, M and N in India. The phylogenetic analysis revealed two lineages, M30 and M31, defined by transitions at 12007 and 5319, respectively. Phylogenetically, M30 includes M18, which characterizes a potential new sub-lineage with substitutions at 16223 and 16300. M30 is subdivided into M30a, defined by the motif 15431 and 195A. The age of M30 was calculated at about 33,042 ybp, signifying a more recent expansion time than that of M2, i.e., 49,686 ybp. The M31 clade incorporates the M6 lineage along with the earlier defined M3 and M4 lineages. A new lineage was also identified in the ancient haplogroup M branch, defined by the motif 16223 and 16325.
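
The idea of defining a lineage by a set of diagnostic positions can be illustrated with a minimal sketch. It assumes a hypothetical motif table that contains only the two defining positions quoted above (12007 for M30 and 5319 for M31); a real classifier would use a full, curated phylogeny, and the function name and sample variants here are purely illustrative.

```python
# Hypothetical motif table for illustration only: it holds just the two
# defining positions quoted in the text (12007 for M30, 5319 for M31).
DEFINING_POSITIONS = {
    "M30": {12007},
    "M31": {5319},
}


def candidate_haplogroups(sample_variants, motifs=DEFINING_POSITIONS):
    """Return the haplogroups whose defining positions all occur in the sample.

    sample_variants: iterable of variant positions observed in one mtDNA sample.
    """
    observed = set(sample_variants)
    # A haplogroup is a candidate when its motif is a subset of the observed variants.
    return sorted(name for name, positions in motifs.items() if positions <= observed)


# Made-up sample carrying the 12007 transition alongside other variants.
print(candidate_haplogroups([73, 263, 12007, 16223]))
```

A subset test is used so that additional private variants in a sample do not prevent a match; a missing defining position does.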

Achilli et al., (2005) pointed out that the Saami (Scandinavia) and Berbers (North Africa) have a high affinity towards haplogroup U, which spread as diverse subclades after emerging from Africa. Surprisingly, the Saami and Berber lineages belonged to a very young branch, approximately 9,000 ybp old. From this, a link was inferred between Berbers and hunter-gatherers in South West Europe.

Sun et al., (2006) addressed discrepancies in the AMH “Out of Africa” theory. A comparative study was designed between South East Asia, India and Oceania at a fine level of resolution. Whole mtDNA sequencing of haplogroup M samples was done to determine the autochthonous M lineages of the Indian subcontinent. Consequently, ambiguous D-loop-based haplogroups, such as M2 to M6, were redefined.

In addition, seven unique haplogroups of M, ranging from M34 to M40, were classified. It was inferred that modern humans dispersed along the Asian coast after emerging from Africa.

Thanseem et al., (2006) proposed that the bulk of Indian maternal lineages, in both Dravidian (local inhabitants) and Indo-European (West Eurasian) speakers, were genetically fairly indistinguishable, whereas male-mediated migration was greater in the late Pleistocene period. They studied mtDNA and Y-chromosomal markers in 3 tribal populations (Pardhan, Naikpod and Andh) and observed that the most common haplogroup was M, i.e., 67% of all subjects. The second most prevalent haplogroup was R (18%), compared with U (10%), in all tribal populations. Y-SNP data also provided genetic evidence that the lower castes originated from tribes in the Indian subcontinent. It has been proposed that, with the dispersal of Neolithic agriculturalists, the lower castes emerged from the lower hierarchical division of tribal groups, well before the Aryan invasion of the Indian subcontinent. Furthermore, Indo-Europeans established themselves as the upper castes, which emerged from the higher hierarchical division within the tribes. This was supported by the finding that no remarkable differences exist between the tribal populations and castes of India. The paternal lineages (Y lineages) of the lower castes exhibited considerably closer affinities to the tribal populations than to the upper castes.

Thangaraj et al., (2006) published results in favor of the rapid dispersal theory along the Asian coast. To support this argument, they selected 11 whole mtDNA sequences and 2231 M lineage sequences based on control region sequencing. They demarcated M41 as a unique haplogroup by whole mtDNA sequencing and reviewed the pre-existing classification of haplogroups M3, M18 and M31. On the basis of the results, they inferred that the Indian subcontinental mtDNA pool comprises deeply rooted autochthonous lineages of macrohaplogroup M, which originated in South Asia, probably in the Indian subcontinent. These deeply rooted autochthonous clades were spread widely all over the Indian subcontinent and were not language specific. In addition, they reconstructed the phylogenetic tree by reanalysis of the Andamanese-specific lineage M31 and found two additional sub-branches, M31a1 and M31a2.

Kumar et al., (2008) suggested that the Austro-Asiatic speakers of central Asia, the Dravidians, and the tribes of the southern and eastern regions are the modern representatives of the earliest settlers of the Indian subcontinent. They studied the phylogeographic expansion and diversity indices of haplogroup M2 and revealed that the M2 lineage was subdivided into M2a and M2b among the tribes of different geographical regions. The comparatively high frequency of M2 represented the phylogenetic signature of early settlers in the Indian subcontinent.

Eaaswarkhanth et al., (2010) estimated the contribution of West Asian and Arabian admixture to Indian Muslims. They showed that Indian Muslim populations derive their major genetic input from geographically close non-Muslim populations, although admixture from sub-Saharan African (L0a2a2), Arabian (M2) and West Asian (U, T, W) lineages was also observed among Indian Muslims. Moreover, they concluded that Islam spread in the Indian subcontinent mainly from Iran and Central Asia, rather than directly from the Arabian Peninsula.

Sultana et al., (2014) demonstrated that the mtDNA control region and three coding regions exhibited high genetic diversity and a low random match probability in the Bengali population. According to their study, haplogroup M and its subclades of South Asian origin were the most frequent, followed by haplogroups N and R of East Eurasian origin, A4, and U2b (West Eurasian); the reported frequencies were 47.1%, 36.2%, 12.5%, 3% and 1.0%, respectively. The low-frequency haplogroups M21 (0.9%) and M45 (0.8%) indicated a relationship between Bangladesh, South Asia and South East Asia.
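
Haplogroup frequencies of the kind reported in the studies above are simple proportions of assigned samples. The following minimal sketch shows one way such percentages could be tallied; the function name and the toy list of assignments are assumptions made for illustration and are not data from any cited study.

```python
from collections import Counter


def haplogroup_frequencies(assignments):
    """Return the percentage frequency of each haplogroup label.

    assignments: list with one haplogroup label per sampled individual.
    """
    counts = Counter(assignments)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one sample is required")
    return {hg: 100.0 * n / total for hg, n in counts.items()}


# Toy example (made-up labels): two M samples, one N and one R.
freqs = haplogroup_frequencies(["M", "M", "N", "R"])
print(freqs)
```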

Mitochondrial DNA: Ideal Marker for Forensic Phylogenetics:

An important role of mtDNA is the identification of biological remains through the analysis and interpretation of hair, bone, blood, teeth and minute or degraded DNA samples.

Mitochondrial DNA is particularly useful because of its matrilineal inheritance, which makes it an important genomic indicator for investigation and identification in cases of missing persons. A proband’s mother and all other maternally related family members share the same mtDNA as the subject in question. Consequently, samples from the mother’s side can be used to identify the missing person (Bisbing and Richard, 1982).

Today, for identification purposes, biological evidence samples are analyzed through various PCR-based DNA typing tests. Allele-specific PCR-based DNA typing tests basically target specific markers of the nuclear genome (Wilson et al., 1995).

Nuclear genes amplified by the polymerase chain reaction (PCR), however, are present as a single copy inside the nucleus of the cell, while mtDNA is present in thousands of copies in each human cell (Allen et al., 1998; Lemnrau et al., 2015). PCR amplification requires an intact fragment of the target DNA region and fails in its absence. The numerous copies of mitochondrial DNA are therefore significant in getting results from degraded samples. The abundance of mtDNA also allows results to be obtained from samples with a low DNA quantity (Carracedo et al., 2000). The control region of mtDNA, which does not encode any gene, has a higher mutation rate than the rest of the mtDNA (Cui et al., 2015). This high mutation rate in the control region is a significant tool for differentiating between individuals who are not maternally related. Owing to this capability, degraded samples and samples with a small quantity of DNA can be analyzed by mtDNA analysis, which occupies a major niche in forensic science (Melton and Nelson, 2001; Quispe-Tintaya et al., 2015).

For forensic purposes, analysis of the control region (HVI, HVII and HVIII) of mtDNA is a focus of interest (Holland and Gordan, 2015) and is done by sequence-based methods, such as direct DNA sequencing (Quispe-Tintaya et al., 2015) and pyrosequencing (Bintz et al., 2014), and by probe-based methods (Miller et al., 2014), such as conventional dot-blot, Luminex hybridization (Beebe et al., 2015), reverse dot-blot (Derda et al., 2015) or linear array (Garbriel et al., 2003).

Recently, to support mtDNA forensic casework and its legal implementation in various countries or geographic regions, numerous mtDNA control region databases have been favored because they are largely free of errors (Budowle et al., 1999; Hong et al., 2015). However, the majority of the published mtDNA sequence data relates to the HVI and HVII segments. Moreover, the lack of relevant database information limits the use and the strength of mtDNA evidence. To overcome this problem, mtDNA sequence databases need to be generated to extend mtDNA typing capability (Hong et al., 2015). It is also important to mention that additional population data on an ethnic basis helps to increase the size of existing databases (Imaizumi et al., 2002).

In addition, mtDNA has an advantage over nuclear DNA: from a phylogenetic point of view, it provides a highly resolved intra-specific tree and offers new biogeographic and historical insights from genomic study. Different populations and ethnic groups and their origins have been compared by generating phylogenetic trees from the HVI, HVII and HVIII domains (Maruyama et al., 2003; Simão et al., 2015).

Bisbing and his coworkers (1982) presented a key advantage of employing mtDNA in forensic applications, i.e., the high copy number of DNA molecules in the cell. Nuclear autosomal loci employed in forensic studies are present in only two copies per cell, whereas mtDNA is found in approximately 500-2000 copies in a single mammalian cell. They also showed that mtDNA survives in highly degraded samples that would not otherwise provide a nuclear DNA profile. Samples typically submitted for DNA examination include degraded bloodstains, bones, saliva, fingernails and hair shafts.

Mitochondrial DNA typing of hair shafts is a particularly significant application, as shed hairs are commonly found as sources of evidence. Presently, forensic hair comparison of evidentiary and reference specimens is based on morphological characteristics. These examinations tend to be subjective, relying on the judgment and skill of the examiner. In an effort to reduce the subjectivity associated with hair analysis, a growing number of crime labs are moving towards mtDNA testing. Furthermore, mtDNA can distinguish between hairs that cannot be excluded as coming from the same individual (Parson et al., 2015).

Alvarez et al., (2007) examined population data for the mtDNA control region and determined the relative distribution of the haplogroups identified in the samples. Profiles of Spanish individuals were analyzed to determine genetic variants (SNPs) and to characterize haplogroups. Haplogroup H was the most frequent haplogroup observed in the Spanish population, whereas West Eurasian haplogroups such as U, T and J were also abundantly present, with an incidence of 6.4%.

Parson et al., (2007) verified mtDNA analysis as a fundamental approach in missing-person identification. Mitochondrial DNA gives reliable results in degraded samples that contain very minute quantities of DNA and have been badly damaged by harsh environments.

Mabuchi et al., (2007) worked out a reliable process to determine genetic variability in the D-loop of mtDNA. They identified SNPs in three regions of the D-loop, HVI (80), HVII (37) and HVIII (14), excluding C-stretch variants. Genetic diversity was about 0.998, suggesting a high power of discrimination for mtDNA in the samples. The frequent haplogroups were M7a1 (13.7%), M7b2 (8.9%) and D4a (9.7%).

Andrea et al., (2007) scrutinized genetic variation in mitochondrial DNA, providing unique evidence about diversity in an Afro-descendant Amazon population. They classified 133 haplotypes on the basis of 97 SNPs, in which 9 samples shared 3 or more SNPs. The genetic diversity was about 0.998 ± 0.0016 and the random match probability (RMP) was about 1.2% in the total population.
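
The forensic statistics quoted in these studies can be computed from a list of observed haplotypes. The sketch below uses estimators that are commonly applied to mtDNA data: the random match probability as the sum of squared haplotype frequencies, Nei's unbiased haplotype diversity, and the power of discrimination as one minus the RMP. It is an assumption that the cited studies used exactly these formulas, and the function name and toy input are illustrative only.

```python
from collections import Counter


def forensic_statistics(haplotypes):
    """Compute RMP, haplotype diversity and power of discrimination.

    haplotypes: list with one haplotype label (or sequence string) per individual.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two samples are required")
    counts = Counter(haplotypes)
    # Sum of squared haplotype frequencies.
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return {
        "rmp": sum_sq,                               # random match probability
        "diversity": n / (n - 1) * (1.0 - sum_sq),   # Nei's unbiased estimator
        "power_of_discrimination": 1.0 - sum_sq,
    }


# Toy example (made-up labels): four individuals, three distinct haplotypes.
print(forensic_statistics(["A", "A", "B", "C"]))
```

When almost every individual carries a distinct haplotype, the sum of squared frequencies approaches 1/n, so the RMP is low and the diversity approaches 1, which is the pattern the studies above report.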

Alshamali, (2008) analyzed the entire control region in the Dubai population and determined the prevalent haplogroups, such as A, B, C, H and J. The RMP for the whole D-loop was 1 in 141, about 156 samples were singletons, and heteroplasmy was observed in 5.6%.

Lehocky et al., (2008) determined the polymorphic positions and their frequencies in a population of Slovakia. Sequence assessment of 284 samples gave a genetic variation of 0.0997 and an RMP of 0.06%.

Wang et al., (2012) observed a high frequency of East Eurasian paternal descent in Nepal, suggesting direct migration of haplogroups from Tibet or an origin in Northeast India, where an exclusively East Eurasian maternal lineage has been identified. This indicated that Tibet or North India was most likely the homeland of the East Eurasian lineages in Nepal. The idea is further strengthened by linguistic, archeological and Y-chromosome analyses, which more strongly suggest that East Asian genetic lineages entered through the Himalayas approximately 6 thousand years ago (kya).

Tipirisett et al., (2014) proposed that West Eurasian clades and subclades had their origin within India and were at the root of the origin of the caste system and language. They identified two haplogroups, HV14a1 and U1a1a4, that originated approximately 10.5-17.9 thousand years ago. In addition, U7 sub-haplogroups including U7a1, U7a2b, U7a3, U7a6, U7a7 and U7c provided essential clues to the genetic variation of the caste system that might have been influenced by the Indo-Aryan migration.

Sultana et al., (2014) analyzed mtDNA from a region of Bangladesh and identified 14 diverse haplotypes with high diversity (0.8475 ± 0.13406), a low random match probability and a mean pairwise difference of 9.698 ± 1.8658. The nucleotide difference (95% CL 9.67-9.69) contrasted with values reported for Northeast Asian and various other populations, indicating its usefulness for forensic applications.
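
The pairwise difference statistic mentioned above is the average number of nucleotide positions at which two sequences from the sample differ. A minimal sketch follows, assuming pre-aligned sequences of equal length; the function name and the short toy sequences are illustrative and are not taken from the cited study.

```python
from itertools import combinations


def mean_pairwise_difference(sequences):
    """Average number of differing sites over all pairs of aligned sequences."""
    if len(sequences) < 2:
        raise ValueError("at least two sequences are required")
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned to the same length")
    # Count mismatching positions for every unordered pair of sequences.
    diffs = [
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(sequences, 2)
    ]
    return sum(diffs) / len(diffs)


# Toy alignment of three made-up four-base sequences.
print(mean_pairwise_difference(["ACGT", "ACGA", "TCGA"]))
```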

Chaitanya et al., (2014) argued, on the basis of biogeographic and etymological history, that mtDNA used to identify the matrilineal lineages of unknown suspects is more informative than autosomal short tandem repeats (STRs). They developed six multiplex genotyping assays targeting 62 mtDNA SNPs that define the major haplogroups present in America, Africa, Western and Eastern Eurasia, Oceania and Australia, and that serve for biogeographic ancestry prediction. The assay was sensitive enough to produce results from as little as 1 pg of input DNA gathered at a crime scene from degraded samples of blood, semen, hair and saliva. With this validated tool, it was possible to ascertain the matrilineal biogeographic origin of unidentified individuals at the continental level.

Ovchinnikov, (2014) reported that the Tajik mtDNA pool is a mixture of western Eurasian (62.6%) and eastern Eurasian (26.4%) haplogroups. It exhibited 90 different haplotypes with high genetic diversity (0.999 ± 0.022) and moderate nucleotide diversity (0.014 ± 0.007). Further, Tajik mtDNA showed a low RMP (0.111) and high discrimination power (0.988).

REVIEW ABOUT PAKISTAN:

Pakistan is a rich mixture of landscapes, varying from hills to forests and plains to deserts. It has a wide range of plateaus and extends from the coastal areas of the Arabian Sea in the south to the Karakoram mountains in the north (Khan, 1991). Geographically, Pakistan is sandwiched between the Indian and Eurasian tectonic plates: Sindh and Punjab are situated on the northwestern corner of the Indian plate, while Balochistan and the major portion of Khyber-Pakhtunkhwa lie on the Eurasian plate, which consists of the Iranian plateau, a small part of the Middle East and Central Asia. Gilgit-Baltistan and Azad Kashmir are located on the edge of the Indian plate near Central Asia (Geo, 2014), as shown in the map of Pakistan in figure 1.8.

Figure 1.8: Map of Pakistan Source: http://owl-and-mouse.com/online-atlas/Asia/pakistan-map.htm

The total land border of Pakistan is 6,774 kilometers (km) long. Afghanistan lies to the northwest (2,430 km long border), Iran to the west (909 km long border), the People’s Republic of China to the north (523 km long border) and India to the east (Kureshy et al., 1977; Smyth, 2010). Pakistan faces regional disputes and tense relations with neighboring countries, especially with India over Kashmir and over the Durand Line with . However, the Khyber Pass and Bolan Pass, located on the western border, form a central hub between Central Asia and South Asia (Ahmad et al., 1966), as shown in figure 1.8.

The Northern Altitudes:

The Northern Altitudes comprise the Hindu Kush, Himalayas and Karakoram Range, including K2 (8,611 meters), the second highest peak in the world. The area also includes many high hills, ranging from 4,500 to 6,500 meters, that restrict the movement of intruders (Robinson 1989; Abbasi et al., 2015).

South Asian Region:

It is also known as the Indo-Gangetic Plain or Indus-Ganga Plain, a highly fertile land of about 630 million acres covering most of northern and eastern India and Pakistan (Kazmi, 1984), as shown in figure 1.9.

Figure 1.9: Indo-Gangetic Plain Source: http://aciar.gov.au/files/mn-158/s3_3-gangetic-plain-punjab.html

Geographically, this is the flood plain of the Indus and Ganga-Brahmaputra river systems. The rivers rise in the Himalaya Mountains (Lacau et al., 2011), from the west of Jammu and Kashmir (disputed area) to Assam in the east, and gradually flow through northern and eastern India. This area is about 700,000 km2 and varies in width; the major rivers are the Ganga and Indus along with their tributaries, the Gomati, Beas, Ravi, Chambal, Sutlej and Chenab (Geddes, 1960). This is the place that gave birth to the South Asian culture called the Indus Valley Civilization. It is also known as Sapta Sindhvas or the “Seven Rivers” land (Bag, 2015). The Gangetic and “Seven Rivers” plain was originally the Tethys Sea, bordered on the north by Angaraland, i.e., the plateau of Siberia, and on the south by Gondwanaland, i.e., the plateaus of the Deccan and Arabia. According to the Rig Veda, this region was known as “Aryavarta”, meaning Land of the Aryans (Frawley, 1994).

Sistan Basin:

This is a closed drainage basin that retains its water, which does not flow out as a river but drains into lakes; it covers a large area of southern and western Afghanistan and southeastern Iran (Tirrul et al., 1983). Archeologically, the Sistan basin was inhabited by ancient cultures; Kang and Zaranj were two major cultures that flourished in this area, and the remnants of an irrigation system, including canals, are still observable in the Dasht-e-Margo and Chakhansur areas of Afghanistan (Fischer, 1968), as shown in figure 1.10.

History of External Invaders on South Asian Territory:

More than one and a half billion people with diverse genetic makeup and culture inhabit the Indian subcontinent. Three major potential sources contribute to the diversity of its gene pool. The first is an old Paleolithic element that may have almost vanished nowadays (Ayub et al., 2015). The second was Neolithic migrants from the eastern horn of the Fertile Crescent, proto-Dravidians who introduced the caste system that is hierarchically present in the Indian subcontinent (Village, 2015). The third source was the Austro-Asiatic and Tibeto-Burman peoples inhabiting East Asia (Cavalli-sforza et al., 2001; Schliesinger, 2015).

Figure 1.10: Map of Sistan or Helmand River Drainage Basin Source: https://en.wikipedia.org/wiki/Sistan_Basin#/media/File:Helmandrivermap.png

During the Pleistocene, five glacial stages are identified between 600,000 and 11,000 ybp (Rhodin et al., 2015). The chronology of plains and mountain glaciation has been studied more extensively in Europe than in Asia, but a parallel glaciation was also observed in the Potohar Plain (Khan et al., 2010), i.e., the area around Islamabad (capital of Pakistan). Many important fossils have been discovered in this plain, especially the pre-human form Ramapithecus punjabicus, which is of substantial importance in the study of human origins (Greenfield, 1979). Its absolute age is 9 to 12 million years, and it forms a substantial link with the ancestors of mankind. The Potohar Plain became the site of a Stone Age culture called the “Soan Culture”, adjacent to Rawalpindi (a city of Pakistan), from about 480,000 to 11,000 ybp (Paterson, 1962; Sørensen, 2015).

The period from 11,000 to 8,000 ybp is regarded as a dark age in the history of man in Pakistan because of the fight for survival against intruders coming from the west, who left their marks in the form of diverse cultures, such as the Middle Stellenbosch, Upper Stellenbosch, Upper Acheulian (Mgeladze et al., 2015) and Upper Clacton (Conard, 2015), which were found in the Potohar region of Pakistan (Aslamkhan, 1996).

17,000-5500 BC

Evidence for early Neolithic settlements has been found at Mehrgarh, a site of 500 acres near the Kachi plain (Roustae et al., 2015), at the base of the Bolan Pass, to the west of the Bolan River in Balochistan and to the west of the Indus valley (Possehl and Gregory, 1997). Presently, this area lies between the cities of , Kalat and Sibi. Three main areas of the Mehrgarh site were excavated: MR3, representing the early Neolithic period I; MR2, the Chalcolithic period III; and MR1, the late Chalcolithic periods V to VII (Jarrige and Jean-Francois 1981; Dibyopama et al., 2015).

The early Neolithic period was dated on the basis of radiocarbon dating and copper beads containing cotton fibers, unearthed from the graves, which indicated that period I began prior to 6,000 BC (Moulherat, 2002).

Figure 1.11: The Indus civilization. The archeological sites are marked with red dots. Source: http://taapworld.wikispaces.com/Indus+River

6000-1500 BC:

The Indus Valley civilization was one of the ancient civilizations; it covered an area of about 500,000 square miles and spread across 6 countries: Pakistan, India, Iran, Afghanistan, Russia and China (Rao, 1978). The mature phase of this culture was at about 2500 BC; however, an early primitive culture was already present in 5,000 BC. Out of 350 to 450 archaeological sites (Valentine et al., 2015), Moenjodaro and Harappa were the two principal sites, situated near . Harappa was the first site to be excavated, explored in 1920, and the civilization is sometimes referred to as the Harappan civilization (Dibyopama et al., 2015). The distance between the two sites is about 350 miles (Mughal, 1973). The architecture of the Indus civilization shows a splendid level of civic planning, spectacular consistency and excellent symmetry (Ratnagar, 1981).

Moenjodaro in particular was a uniquely modern town of the Indus civilization; the archeological record indicates that this town was rebuilt nine times (Jarrige and Jean-Francois, 1981; Shephard, 2015). On the basis of social hierarchy, Moenjodaro was divided into two parts, upper and lower. The upper part was forty feet high, served as a kind of bastion for protection and housed the elite class, while the lower part was for the working class (Possehl and Gregory 1982).

Figure 1.12: Excavated sites of Harappa (A) and Moenjodaro (B) Source: http://www.ancient.eu/india

The cause of the downfall of this civilization is unknown, but most archeologists suggest that it was due to adverse environmental conditions and climate change. Northern hemisphere records suggest that the Holocene Epoch, which began after the end of the Paleolithic ice age (12,000 to 11,500 ybp), showed increased variability in solar radiation per millennium. The most drastic change occurred from 5.5 to 4.2 kyr before present, when several shifts towards dry conditions were noted (Sirocko et al., 1996; Ortiz et al., 2000; Gasse, 2000). This was the time when the Mesopotamian civilization (an amalgamation of varied cultures in Iran, Syria and Turkey) was on the verge of collapse because of persistent drought, and at the same time the Indus civilization was in transition from an organized urban phase to a post-urban phase, accompanied by migration towards the southeast (Staubwasser et al., 2003). Such climate change and persistent drought are regarded as the actual reason for the downfall of this highly organized civilization (Singh et al., 1990; von Rad et al., 1999; Enzel et al., 1999; Weiss et al., 1993).

Nevertheless, the Indus Valley civilization (Harappa/Moenjodaro) introduced advanced cultural norms in agriculture and trade with Middle Eastern peoples. However, it could not withstand the barbaric invaders coming from the west. From this point on, the Indus Valley region (Pakistan) has a continuous history of invasion from the west, which is described below in sequence:

1500-50 BC:

Another disaster came to this region in the form of the Aryans, who came from Central Asia and also played a significant role in the downfall of the Indus civilization (Shaffer, 1984; Parpola, 2015). There were two successive Aryan invasions of the subcontinent, dated to 1500 BC and 1400 BC, and the Aryans settled in Sapta Sindhvas (the land of the Seven Rivers).

Figure 1.13: Scanned Route of Indo-Aryans in the Indian Subcontinent Source: http://webcache.googleusercontent.com

According to Hindu mythology, the first Aryan invasion was carried out under the Aryan god called Indra (Deva of rain and thunderstorms), who was a mighty warrior and was believed to rule the sky and clouds (Danino, 2006). The second invasion was launched later, in 1400 BC, by Bharata from Afghanistan. Havoc and terror continued throughout this period and changed the destiny of the Indus civilization (Klostermaier 1998). In the political hierarchy of the Aryans, the highest political unit was known as the “Jana”, and the chief of the Jana was the “Rajan”, who was helped by a priest called the “Purohita” and advised by assemblies called the Sabha and Samiti (Anand et al., 2015).

According to Vedic literature, there was a famous war between the Dasa (local peoples called Dravidians), described as dark, short and flat-nosed, and the Aryans, described as tall and fair, and firm in speech, belief and practice (Bryant and Edwin, 2004). The purpose of this war was to capture the treasures of the local inhabitants. The Aryans annihilated the local tribes and took captives back to the West, where they are known as Gypsies (Lal, 1962; Sharma, 2005). The Aryans founded the ethnic base of the Aryana state, encompassing Iran, Afghanistan and part of Pakistan. Currently, evidence of their culture is scanty (Satpathy, 2015). Nevertheless, they left the Vedic literature, with evidence of caste groups (Frawley, 2001).

600-185 BC:

The oldest civilization found in this era was Gandhara, which comprised the valley, , Sawat, Dir, Malakand and Bajuaur agencies in KPK (including the FATA region) and Punjab (Taxila) in Pakistan, up to Jalalabad in Afghanistan (Farooq et al., 2015). This was the core area of the Gandhara civilization, which later spread eastwards to Japan and Korea (Bartel, 1979).

Taxila was the main center of the Gandhara civilization and is about 3,000 years old. The civilization reached its peak in 2 BC, when Buddhism was designated as the state religion; it prevailed in this region for more than 1,000 years, up to 10 AD (Hasan, 2005). The Gandhara civilization had not only a cultural but also a spiritual influence, and introduced “Gandhara art” throughout the world as shown below.

328 - 311 BC:

During this time Alexander the Great conquered Aryana and left a deep impression on its culture. Alexander was born in Macedon in approximately 356 BC; at the age of twenty he became emperor after the death of his father (Alexander, 2015), proved himself a great warrior and conquered most of the world in a short span of twelve years (M'Crindle and John Watson 1816). He was a pupil of Aristotle, who gave him an interest in medicine, science and philosophy. In 333 BC Alexander fought against the emperor Darius III, marched towards Egypt and won victories over Assyria and Babylonia (Iraq), then moved toward Central Asia and married a Central Asian woman in Bactria (Lyons, 2015). In 326 BC Alexander turned toward the Indian subcontinent and fought an epic battle against King Porus (ruler of Punjab), and later showed kindness and courtesy to him by appointing him a satrap of his own kingdom (Tarn, 2003).

Alexander believed in cultural and racial assimilation; he forgave those who surrendered and, after they took an oath, made them part of his army. In this way his army became an amalgamation of different cultures and diverse ethnic peoples (McCrindle et al., 1896). A large number of his troops settled in the Indus region when he returned to his homeland, and thus played a vital role in the diversification of the South Asian gene pool. Behind his success lay the education he received from his outstanding teacher Aristotle, which was reflected in his sound decisions (Gabriel, 2015). Alexander founded 70 cities, including Alexandria on the Indus (Pakistan), Alexandropolis (Afghanistan) and Alexandria (Egypt) (Stein, 2014).

100-5 AD:

The nomads of Eurasia known as Scythians or Sakas were an admixture of Europoid peoples, chiefly Andronovo; Maues and Azes were among their foremost kings (Marácz, 2015). They occupied and ruled over Siestan (Sakistan) and Taxila (Pakistan) (Abetekov and Abetekov, 1994).

Figure 1.14: Alexandria on Indus situated at the junction of Indus and Acesines Source: https://en.wikipedia.org/wiki/Alexandria_on_the_Indus

5-75 AD:

Another wave of invaders from Central Asia was the Parthians, under great kings such as Gondophares, Abdagases and Orthagnes (Iravani 2015). They banished the Sakas from Siestan and Gandhara and destroyed the remnants of the Indo-Greeks. The Pathans of Gandhara are probably descended from the Indo-Parthians (Mukherjee, 1969).

75-230 AD:

In this era Sapta Sindhvas was invaded by patrons of Buddhism. They expanded across the eastern boundary, and in the west they maintained good relations with the Roman Empire. It was the third great Scythic invasion of Aryana (Khademi, 2003).

360-575 AD:

The White Huns, also called Ephthalites, were of Turko-Mongol origin; they spread across Asia like a thunderstorm and destroyed the Kushana civilization (Biswas, 1971; Marácz, 2015). They left behind a dark age and contributed much to the genetic diversity of South East Asia. In 443 AD the Hunas, or Sweta Hunas, attacked Iran and then India (Thakur 1967; Ruangsup, 2015). They were belligerent invaders led by the brutal chieftains Toramana and Mihirakula, and they devastated the north Indian states. They traversed the Hindukush, occupied the Kabul valley and Gandhara, and left an impression of brutality on the Indian subcontinent (Maenchen-Helfen, 1973).

650-850 AD:

This was the era when the Prophet of Islam, Muhammad (SAW), completely transformed the fate of Arabia. He preached to the people for a span of twenty-three years and reformed the Arabs of that time. After the death of the Prophet, his followers took the message of Islam to every corner of the world (Ikram, 1989). Traders from Arabia used southern India for trade; after becoming Muslims, they preached to the local peoples and converted them to Islam (Wink, 2002). The Umayyad Caliph Walid bin Abdul Malik nominated Hijjaj bin Yousef as administrator of the Eastern province. This was the period when Sindh was under the rule of Raja Dahir, who treated the people inhumanely (Lal, 1984). In 712 AD Hijjaj bin Yousef sent his nephew Muhammad bin Qasim, then seventeen years old, to invade Sindh. Muhammad bin Qasim defeated and killed Raja Dahir, and Sindh came under Muslim rule (Hoodbhoy, 1985; Khan et al., 2015). The Arab army occupied Sindh (a province of Pakistan) and the lower Indus Valley and integrated them into the Arab domain (Schimmel, 1982). Consequently, Sindh not only became an Islamic outpost but was also established as an Indo-Muslim state with direct trading links to the Middle East. At the beginning of this era, most of the large Turkish tribes embraced Islam and moved through the north-west from Afghanistan and Iran, finally reaching the Indian subcontinent (Gib 1923).

750-950 AD:

The Turki Shahis invaded the northern region of the Indus Valley. These aggressive invaders established their Turk kingdom in Delhi, which further enabled Persian and Afghan Muslim invaders to spread across Pakistan, replacing Indo-European languages (Keay, 2011). Within the next few decades the Muslim kingdom expanded its rule “East to Bengal” and “South to Deccan” and remained predominant in the Indian subcontinent until 750 AD (Chandra, 2004).

Furthermore, there was an extensive expansion of Islam, accompanied by mercenaries and businessmen who came with Mahmud Ghaznavi, who ruled the northern region (Barnhill, 2015). Mahmud attacked India 17 times to destroy the power of the Hindu Rajas; after defeating Tarnochalpal, he occupied Punjab. The most far-reaching impact of Mahmud’s expeditions was the implementation of Muslim rule and the establishment of an Islamic society under the supervision of the great scholars of Islam.

Later the Mughals, descendants of Mongols such as Genghiz Khan, also moved westward and invaded an immense region including Pakistan, India and Iran, reaching as far as Turkey, and contributed to the aggressive diffusion pattern of mtDNA (Cavalli-Sforza et al., 1994). The British were the last invaders; they came from the West in 1857. They kept the Indian subcontinent as a colony and overthrew the Muslim kingdom. They gathered the resources of the land and sent them to their homeland (Bellman 1997).

In addition, the Arabian Peninsula may have served as a hub for human migration, and vast genetic variability of African and Eurasian origin is often observed there. Besides the contribution of Arabia, Iran has also had a great influence as a credible genetic source of the Pakistani population (Metspalu, 2004).

Classical genetic studies further strengthened this concept and revealed that most Muslim populations are similar to the neighboring non-Muslim populations, particularly local Hindus, except for northern and northwestern Pakistanis, likely because of genetic lineages of external origin (Papiha, 1996).

Administrative Units of Pakistan:

The administrative units of Pakistan comprise five provinces, Balochistan, Gilgit Baltistan, Khyber Pakhtunkhwa, Punjab and Sindh (according to the recent division), and a group of federally administered tribal areas, as shown in the map in figure 1.15.

Table 1.1: Population of Pakistan by Administrative Units according to 1998 Census.

Figure 1.15: Illustration of Map of Pakistan showing an administrative division and Provinces Source: http://nationalheritage.gov.pk/provinces.html


Figure 1.16: Illustration of Map of Pakistan showing a population per square meter in the Provinces. Source: https://en.wikipedia.org/wiki/Pakistan

Azad Jammu and Kashmir:

Azad Jammu and Kashmir (AJK) is one of the administrative territories of Pakistan. It is located west of the part of Jammu and Kashmir occupied by India, which has been the subject of a long-running conflict between Pakistan and India. It covers 5134 square miles and lies within an orogenic belt. The southern side is a plain area (Kotli, Mirpur and Bhimber), while the northern side comprises hilly areas such as Neelum, Muzaffarabad and Hattian. The area is full of natural beauty, with dense forests, lakes and streams. The main rivers are the Poonch, Neelum and Jhelum.

Table 1.2: Climatic and geographical regions of Azad Jammu and Kashmir.

Source: https://en.wikipedia.org/wiki/Azad_Kashmir

Population:

Azad Jammu and Kashmir had a population of 2.973 million according to the 1998 census (the last census conducted in Pakistan). The population is predominantly Muslim, with a rural-to-urban ratio of 22:3. The population density is 320 persons per square kilometer, and the literacy rate is moderate to high (55% to 70%).


Table 1.3: District wise population density in Azad Kashmir.

Source: Provincial Census Report 1998

Languages:

Urdu is an official language, other languages are Gojri, Kashmiri and Pahari just like

Pothwari and .

Castes:

The major castes are Kashmiri, Gujjar, Sudhan, Rajput and Jat, but no census has been carried out since 1998.

Balochistan:

Balochistan is one of the four provinces of Pakistan and is situated in the southwestern region of the country; the provincial capital is Quetta. It is bordered by Afghanistan to the north and northwest, Iran to the west and the Arabian Sea to the south, while Punjab, Sindh, Khyber Pakhtunkhwa and FATA lie to the east and northeast. Balochistan’s economy depends upon natural resources such as natural gas, and Gwadar port is expected to become a backbone of the Pakistani economy. The province constitutes approximately 43% of the total area of the country and had a population of roughly 6.6 million according to the 1998 census.

The southern region is Makran, the central region is Kalat, and in the north are the Sulaiman mountains and the Bolan Pass, which leads to Afghanistan.

Table 1.4: Population of Balochistan by Districts according to 1998 Census.

Languages:

The major languages are Balochi, Hazaragi, Dehwari, Brahui, and Saraiki.

Major Castes:


Bugti:

They are a Baloch tribe, a clan of the Rind tribe, living between Jacobabad and Sibi. According to old Balochi poetry, they are descended from Qureshi Arabs and are descendants of Hazrat Amir Hamza, but there is no genetic evidence for this (Kumar 2008).

Laghari:

They are one of the oldest Baloch tribes, living in Baluchistan, Sindh and Punjab. According to Bellew and Henry (1891), the Laghari came into the region along with the Arab conquerors and are a clan of the Rind tribe.

Lashari:

They are of Arab origin and live in the Pakistani province of Baluchistan (Lashar), Oman, Kuwait, Saudi Arabia and Qatar; they are a clan of the Rind tribe (Kumar 2008).

Mazari:

The word Mazari is derived from the Balochi word “Mazar”, which means tiger. They are one of the oldest Baloch tribes and migrated to Bampur and then Iranian Baluchistan (Bellew and Henry 1891).

KPK (NWFP & FATA):

Khyber Pakhtunkhwa (KPK), formerly known as NWFP, is situated on the Eurasian land plate and the Iranian plateau in the northwestern region of the country. Peshawar is the capital of KPK. The province is bordered by Gilgit-Baltistan to the northeast, Azad Kashmir to the east, Punjab to the southeast and Afghanistan to the west. KPK is the third largest province of Pakistan by size and contributes about 10.5% of the total GDP of the country. It is the major site of Gandhara art and was a destination of invaders such as the Persians, Greeks, Shahis, Ghaznavi, Mughals and Sikhs. Climatically, KPK is divided into two zones: the northern zone, extending from the ranges of the Hindu Kush to the Peshawar basin, is cold and snowy, while the southern zone, extending to the Derajat basin, is hot in summer and cold in winter. The major rivers are the Zhob, Kabul, Gomal, Kurram, Bara, Panjkora and Swat. The population of KPK (formerly NWFP) according to the 1998 census is given in the table below.

Table 1.5: Population of KPK (NWFP & FATA) by Districts according to 1998 Census.

Languages:

Urdu, Pashto, Hindko, Saraiki, Khowar, Punjabi, Kohistani, Gojri and Dari are the common languages of KPK.

Major Castes of KPK (NWFP):

Bangash:

They derive their name from the mountain range of Koh-e-Sulleman. They are one of the most prominent Pashtun tribes of eastern Afghanistan and northern Pakistan. In the fifteenth century they settled in the Kurram Valley and merged with the Karlani Pashtuns. A limited number of their descendants settled in Jammu and Kashmir, northern India, and Muzaffarabad (Azad Kashmir) in Pakistan (Balland, 2010).

Khattak:

They are a Pashtun tribe living along the western bank of the Indus River, from the north as far uphill as Dher, Sher Garh and neighboring Malakand, and the district of . They originally migrated from Ghani and Logar in present-day Afghanistan (Wise, 1960).

Orakzai:

They are a Pashtun tribe and belong to the valley of KPK. Orakzai means “The Lost Son” (Wrak Zoi). They are divided into sub-clans: the Akhel, Ismailzai, Massuzai, Alisherzai, and Muhammad . They inhabit , bounded by the Khyber Agency in the northwest and northeast. Many Orakzai have settled in Thall, Parachinar, Peshawar, and because of limited resources and barren land (Gait and Edward, 1902).

Yusufzai:

They are a sub-clan of the great Sarban tribe of the Pashtuns and inhabit KPK (particularly the Sawat and Mardan districts), FATA and some eastern parts of Afghanistan. Their origin is not known, but in 330 BC Alexander the Great mentioned this tribe as “Isapzais”, and it was later also mentioned by Baber in the 16th century (Marten et al 2009).

Mahsuds:

They are a Karlani Pashtun tribe and inhabit the South Waziristan Agency. They have also settled in Afghanistan, primarily in the Charkh District of Logar province and in Kunduz province (Kumar 2008).


Wazirs:

They are on the border of Afghanistan and consist of two sub-clans

Utmanzai and both have settled in the North and South .

According to some Western ethnologists, they may be of Rajput origin, mixed with Indian and some non-Indian ethnicities, probably Tatar and Scythian (Bellew and Henry 1891).

Punjab:

The word Punjab means “The Land of Five Rivers”. Geographically, Khyber

Pakhtunkhwa is located in its north, Azad Kashmir in north east, India in south east,

Sindh in the south west and Balochistan in the West. Lahore is the capital of Punjab.

This province is one of the most populous areas: approximately 65% of the total population of Pakistan lives in Punjab, which is the home of the Indus, Chenab, Beas, Sutlej and River. In the past it was part of the Indus valley, Indo-Aryan, Gandhara, Macedonian, Kushan and Parthian civilizations. In Punjab the caste system is more powerful than the tribal system and does not allow inter-ethnic marriages, which leads to less admixture between the different ethnic groups of the population.

Caste System and its Impression on gene pool of South Asia:

South Asia offers a key to understanding the numerous factors that shape a geographically vast and highly stratified society. The South Asian population is rich in languages and socio-economic cultures, fostered by diverse norms. Above all, the diverse origins and ethnic separation within the subcontinent make it of real interest to study the genetic similarities between different ethnic groups.

Table 1.6: Population of Punjab by Districts according to 1998 Census.

The development of distinct linguistic geography and the rigorous practice of endogamy across all social ranks led to the emergence of population-specific social traditions. These variant population groups provide a sound basis for understanding genetic characters and prehistoric colonization in an ecological area that resulted in increased interaction among different population groups. Generally, human beings cluster themselves into small divisions in such a way that the clusters rarely exchange genes because of geographical and cultural barriers, a practice known as “endogamy”, which results in a highly diverse genetic population. South Asian endogamy is distinctive because of its exceptional barriers, including social norms, traditions, and cultural and religious boundaries, as explained by Chaubey (2010) and illustrated in figure 1.17.

Figure 1.17: Social structuring of the South Asian population: A) in castes, consanguineous marriages are not prevalent and boundaries are rigid; B) consanguineous marriages are prevalent; C) consanguineous marriages are prevalent to some extent. Spouses are shown as solid circles and bidirectional lines represent their movement, whereas dotted circles and bidirectional lines portray permeable boundaries.


Rao and his coworkers (2009) described South Asia as the orthodox land of castes. Caste is a social organization that derives its sanction from the Hindu religion. Rig Vedic society was based on the Chaturvarna doctrine, i.e., Brahmin, Kshatriya, Vaishya and Sudra. The Brahmins are the religious leaders and perform ritual activities. The Kshatriyas are placed next as sovereigns, warriors and defenders, whereas the Vaishyas became farmers, artists and merchants. These three castes are called “Dwij”. The fourth class, the Sudra, falls into the lowest category in the hierarchy and works as laborers.

According to Thapar (1966), besides the Vedic literature, the Puranas and Upanishads give evidence of such structuring. He suggested that Vedic class ordering was based on the profession of individuals. The caste system is a much-debated topic in South Asia; it has been studied extensively, but differing interpretations have made it a contested subject. Each caste is subdivided into multiple Gotras (endogamous branches) that forbid consanguineous marriages, as described in figure 1.17 (Gupta 2006).

Overall, the caste system in South Asia creates cultural barriers and divides the population into numerous endogamous pockets that restrain the movement of spouses, resulting in restricted gene flow outside the caste groups (Bamshad et al., 2001).

Major Castes in Punjab:

Owing to the influence of Indian culture, people in Pakistan have retained the names of many Hindu castes even after becoming Muslims. This system was reinforced after the partition of the Subcontinent in 1947, when a large number of people from all over India migrated to Pakistan. This has also resulted in the admixture of locals and migrants.

The major castes are detailed below.

Arain:

They are mainly a caste of Punjab province, with a small portion present in Sindh.

They are assumed to have originated from the Indo-Aryans. The Aryans divided the Indians into castes in order to enhance their own eminence; the original caste system was initiated when they migrated from the north of India. The Arain are believed to have originated from the Vaisyas, considered a middle class (Cassan, 2010). Surprisingly, one person in twenty-five is discriminated against on the basis of caste. In rural areas of Pakistan caste discrimination is greater than in urban regions, especially in elections, intermarriage and kinship.

Gujar:

They are assumed to be descendants of aristocratic Eurasian peoples, comprising Indo-Scythians, Georgians and the Khazars of the Caspian Sea. According to one hypothesis, the Gujar are Turko-Iranian tribes that merged with Indo-Aryan groups during the Scythian invasions of South Asia and settled primarily in Gujrat, Punjab and Kashmir. It is believed that Dadda, the founder of the Gurjara-Pratihara dynasty, established a state at Nandipur (named Nandol in old literature) in the 7th century (Tufail, 2012).

The Gujar caste came into modern Pakistan and northern India at the time of the Huna invasions of the region. From the 8th to the 12th century they were classified in the Varna system as Kshatriya, and many of them later converted to Islam during Muslim rule in South Asia. In the Himalayan areas, the Gujar are regarded as a significant and historic tribe that governed many states in northern India for hundreds of years and left traces in the Himalayan ranges that could not be erased even after thousands of years (Bittles, 2004).

Some geographical and archaeological evidence suggests that the word Gujar originated from Gurjara and appeared as “Gurjiya or Georgia” (the Persian name for Georgia), indicating that the Gujar tribe is of partly Central Asian origin. According to Dr. Huthi, they settled here to save themselves from Timur, who maintained a reign of terror over them. They called themselves by the Persian word for “Georgian”, which later probably changed into “Gujar” (Sofi, 2011).

In contrast, a report by the Tribal Research and Cultural Foundation, a national organization working on the Gujjars of India, stated that the Gujar race originated in Turkey in the third millennium BC, from the word “Ger”, which stood for a leading ethnic group that later became the Gujar. The bulk of the Gujar live in Pakistan, while India has the second largest Gujar population. Most are Muslims and speak mother tongues called “Gujarati” or “Gujari”, which are very similar to Marwari or Rajasthani (Nongbri and Tiplut, 2013).

Jat:

The Jat are a paradigmatic example of a community. According to the Varna system, the Jat were considered “Kshatriyas or degraded Kshatriyas”; they have an Indo-Scythian origin and inhabited the lower Indus river region of Sindh. They belong to the agricultural communities of North India and of Punjab in Pakistan (Chakravarti, 2003). The Arab writers described the Jat clan as “Zat” in the Arids. Between the eleventh and sixteenth centuries, Jat herders migrated up along the river valleys into the Punjab and were the pioneers of ploughing there (Bayly, 2001). It has been noted that castes in the subcontinent are a complex blend of natives (Dravidians) and Aryans. The sub-clan in the Varna system is called a Jati. Each Jati member can marry only within the Jati and not outside the caste. The large castes at that time were Madhara, Jartika and Kokiya. It is believed that the Madahra were the ancestors of the Jat (Ahmed 2009).

Kakazai:

They are also known as Loye a division of clan and a part of the tribe, and they inhabit Bajaur agency and Punjab in Pakistan. They came to South Asia with Mahmud Ghaznavi but originated in the Laghman province of eastern Afghanistan (Khursheed, 2007).

Rajput:

There is no doubt that the people known as Aryans reached India around 1500 BC. The migration occurred in successive waves, and they settled in the Indo-Gangetic plains. They forced the aboriginal Dravidian speakers towards southern India and Sri Lanka and established the Indo-Aryans as a superior class of people. This idea is further strengthened by several studies (Bamshad et al., 2001; Basu et al., 2003; Cordaux et al., 2004; Sahoo et al., 2006; Thanseem et al., 2006). They reported that the caste populations are somewhat closer to Central Asians (Aryan invasion) and clearly diverged from tribal populations. The Aryans were fair-skinned and overran the dark-skinned Dravidian civilization (Gupta, 2001). It may be noted in this context that the term Varna in the Rig Veda means skin color or stratified order, and each order had a color emblem of its own, characterized by the phases of the sun. The rising sun, the most magnificent of all, was red, and this color was assigned to the ruling Kshatriyas. Brahmans were given white because it was thought to be the color of the sun at noon. Vaishyas were yellow and Sudras were blue, considered the hues of the setting sun (Asopa, 1976). These Aryans married local women, and the resulting admixture of the gene pool led to the emergence of other castes such as the Rajput and Jat from the Kshatriyas (Chattopadhyaya 1976). At present, Rajputs are found in Rajasthan, Gujrat, Uttar Pradesh, Punjab (Pakistan), Sindh, etc.

Sayyid:

They are Arabs who inhabited the Arabian Peninsula and are descendants of a branch of the Banu Hashim, which belongs to the tribe of Quresh. There is evidence that frequent movements between Africa and Asia took place through the Arabian Peninsula. Anthropological, archaeological and genetic evidence strongly supports the migration of modern humans out of Africa by a southern route through the Arabian Peninsula (Kazuo, 2012).

Sindh:

Sindh is the fourth province of Pakistan, situated in the southern part of the country. It is bounded by Balochistan in the west, by Gujrat and Rajasthan in the east, and by the Arabian Sea in the south. The capital of Sindh is Karachi, which is the economic backbone of Pakistan (Chandio, 2009). Historically, Sindh was home to the Mehrgarh Civilization, Indus Valley Civilization, Macedonian Civilization, Ancient Egyptian Civilization and Mesopotamian Civilization. The population of Sindh was 30.4 million according to the 1998 census of Pakistan, as shown in table 1.7.

Table 1.7: Population of Sindh by Districts according to 1998 Census.


Languages:

The major languages are Sindhi, Urdu, Pashto and Punjabi.

Major Castes of Sindh:

Bijarani:

They are a clan of the Buledi tribe living in Sindh and Baluchistan, Pakistan. It has been noted that the Bijarani roamed Aleppo (northern Syria) in the second century after Hijra and then, passing south of the Caspian Sea, colonized the hilly areas of Baluchistan. When the Bijarani entered Makran, they settled in the village of Buleda (Makran division of Baluchistan). After a few centuries the Buledi tribes started migrating from Buleda to Bolan and then to the upper parts of Sindh, especially Jacobabad, Tangwani, Kandhkot and Kashmore (Captain Postans, 1844).

Chandio:

They are a sub-clan of the Hooth tribe. They left Kalat (Harboe) and Bolan (Dhadar) because of a war of racial extermination between the Rind and Lashari and settled in the Koh-e-Sulleman area of D.G. Khan in 1600 AD. The Chandio tribe is spread throughout Pakistan, with the largest concentration in Sindh (Asimov, 1998).

Ghallu:

They are believed to be descendants of King Ghallu Singh of Rajasthan and live in the south-west corner of Sindh and the south of Punjab; their major languages are Sindhi, Saraiki and Balochi (Singh et al, 2001).

Khoso:

They are a Baloch tribe of Pakistan living in Sindh and Baluchistan. Balochi history indicates that they are one of the forty tribes that have survived since the 11th century AD (Captain Postans, 1844).

Nasrani:

They are the oldest Christian community, originating from the evangelistic activity of Saint Thomas in the 1st century, and are found mostly in Kerala, India and in Sindh, Pakistan. In ancient times the Nasrani were part of the “Church of the East”, based in Persia (Fahlbusch, 2005). In the 8th century they were constituted as the Ecclesiastical Province of India. The Nasrani represent a single ethnic group that developed from East Syrian and Jewish influences mixed with local social customs and indigenous Indian and European peoples (Neil and Stephen, 2004).

Mitochondrial DNA studies in Pakistan:

Despite the large and varied populations of Pakistan, investigative studies of mtDNA are scanty. The initial work was done by Quintana-Murci et al. (2004), who studied 23 populations, including 100 samples from South East Pakistan and samples from Pakistani Baloch (67), Brahui (58), Parsis (45), Sindhi (91), Karachi Pakistani (77), Pathan (86), Makrani (73), Hunza Burusho (73) and Kalash (25). The observed genetic diversity was 0.974 in Balochi, 0.978 in Lur, 0.952 in Brahui, 0.943 in Parsis, 0.975 in Makrani, 0.980 in Hunza Burusho and 0.830 in Kalash. The most prevalent haplogroups in the Balochi population were L3d (2.6%), M (33%), N1a (2.6%), R2 (7.7%), H (26.3%), U (15%) and J (7.9%). In Sindhi, the most prevalent haplogroups were M (54.5%), U4 (19%) and T (11%). In Karachi Pakistani, the frequencies were M (47%), H (12%) and U (16%), while Pathan carried haplogroups M (30%), N (10%), R (4%), U (16%) and J (15%). The populations inhabiting the west of the Indus Valley comprised West Eurasian lineages frequently, East Eurasian lineages moderately and South Asian lineages predominantly.

Cordaux et al. (2004) analyzed 14 bi-allelic markers and five short tandem repeats of the Y chromosome, together with HVI mtDNA sequence variation, in North East Indians and Pashtoons.

It was suggested that northeast India was the gateway linking the Indian subcontinent to East Asia and South East Asia. Northern Indians also exhibited markedly reduced diversity of the Y chromosome and prominently high diversity of mtDNA, which provided solid evidence for genetic discontinuity between the northern Indian groups and the rest of the South Asian and East Asian groups, at least within the past era. It was therefore noteworthy that the northeast Indian gateway acted as a topographical barrier rather than as a corridor for human migration between the northern Indians and East Asians.

Rakha et al. (2011) generated a genetic profile of the Pathan ethnic group of KPK and FATA in Pakistan and reported eight major haplogroups. These were divided into the South Asian-specific haplogroup (M), the East Asian haplogroup (N1) and West Eurasian haplogroups (R, J, T and U). Overall, 193 haplotypes were found, defined by 215 SNPs. Compared with other populations and ethnic groups, the Pathans showed high genetic diversity and a low match probability, indicating that for forensic identification the Pathans should be treated as a distinct endogamous ethnic group rather than pooled into the general population of Pakistan.

Whale (2012) worked on the Balochi and Pashtoon populations and found 37.5% East Asian haplogroups and 64.3% West Eurasian haplogroups. Mitochondrial DNA analysis showed that the Hazara of Afghanistan possess a large East Asian haplogroup contribution of about 37.5%, while the Baluch, , and Tajiks possess a much smaller contribution of less than 14.3%.

Siddiqi et al. (2015) studied Makrani individuals living in Pakistan and found 70 different haplotypes, of which 54 were unique and 16 were shared by more than two individuals. They reported a high power of discrimination (0.9592), with 28% African, 26% West Eurasian, 24% South Asian and 2% East Asian haplogroups.

Hayat et al. (2015) studied the mtDNA control region of 85 unrelated individuals from the Saraiki population of Sindh, Punjab, NWFP and Balochistan. They found 63 different haplotypes, of which 58 were unique and the remainder were shared by more than one individual. The most frequent haplogroups were W6 (12.9%), M2a1a1 (1.1%), M4 (1.1%), M18a (2.3%), HV2a (1.1%), R31 (1.1%) and U7 (7%). The genetic diversity was 0.9570 and the power of discrimination was 0.945.
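The genetic diversity and power of discrimination quoted in these studies are computed from haplotype frequencies. The cited studies' formulas are not given here, so the sketch below assumes the definitions commonly used in forensic mtDNA work: random match probability as the sum of squared haplotype frequencies, power of discrimination as one minus that sum, and Nei's unbiased genetic diversity as n/(n-1) times the same quantity. The function name and the haplotype labels are hypothetical toy data, not values from any cited study.

```python
from collections import Counter


def haplotype_statistics(haplotypes):
    """Return (random match probability, power of discrimination,
    genetic diversity) for a list of haplotype labels, one per individual.

    Assumed definitions (not stated in the text above):
      RMP = sum(p_i^2), PD = 1 - RMP, GD = n / (n - 1) * (1 - RMP)
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two samples are needed")
    counts = Counter(haplotypes)
    # Sum of squared relative frequencies over all observed haplotypes
    sum_sq = sum((count / n) ** 2 for count in counts.values())
    rmp = sum_sq
    power = 1.0 - sum_sq
    diversity = n / (n - 1) * (1.0 - sum_sq)
    return rmp, power, diversity


# Hypothetical toy sample: 10 individuals, 7 distinct haplotypes
toy_sample = ["HT1"] * 3 + ["HT2"] * 2 + ["HT3", "HT4", "HT5", "HT6", "HT7"]
rmp, power, diversity = haplotype_statistics(toy_sample)
print(f"RMP={rmp:.4f} PD={power:.4f} GD={diversity:.4f}")
```

Because the diversity estimate applies the n/(n-1) correction, it is always slightly higher than the power of discrimination computed from the same sample.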

Ilyas et al. (2015) observed 3.8 million SNPs and 0.5 million indels in the first ethnic Pathan male genome sequenced by next-generation sequencing, belonging to haplogroup H2 and Y-haplogroup L1. A total of 129,441 SNPs were unique, falling in 5,344 genes; these might mark a high risk for diseases and possibly affect drug metabolism. The genome further showed an admixture of European and Asian lineages in this geographical region.

In the light of the above discussion, the present bifocal project was designed to identify the haplogroups in the Pakistani caste and tribal system through a comprehensive analysis of mtDNA variation among different isonym groups.

Aim and Objectives:

The main aim of this study was to contribute towards an mtDNA database of the Pakistani population. The specific objectives were:

1. To identify SNPs by sequencing the control region of mtDNA and to document their frequency distribution in the major ethnic groups of the Pakistani population.

2. To study SNP genotypes for the detection of major haplogroups in isonym groups with intra-caste marriages.

3. To establish the phylogenetic association of different ethnic groups in Pakistan.

SUBJECTS AND METHODS:

Study design: Descriptive population Genetic study

Setting: Department of Human Genetics & Molecular Biology, University of Health Sciences, Lahore, and University of Lahore, Pakistan.

Sample Size: Calculated sample size 500 (Baluchistan = 25, KPK (NWFP) = 65, Punjab = 280, Sindh = 115 and FATA = 15).

Sampling techniques: Representative random sampling according to Population density (Baluchistan= 5%, KPK=13%, Punjab=56%, Sindh= 23% and FATA= 3%).
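The proportional allocation described above can be sketched as a short calculation. This is an illustrative sketch only, not the procedure used in the study: it assumes the stated percentage shares are applied directly to the calculated sample size of 500, and the names `TOTAL_SAMPLES`, `POPULATION_SHARE_PERCENT` and `allocate_samples` are hypothetical.

```python
# Calculated sample size and population shares as stated in the text
TOTAL_SAMPLES = 500
POPULATION_SHARE_PERCENT = {
    "Baluchistan": 5,
    "KPK (NWFP)": 13,
    "Punjab": 56,
    "Sindh": 23,
    "FATA": 3,
}


def allocate_samples(total, share_percent):
    """Split a total sample size across regions in proportion to their
    percentage share (assumption: shares are whole percentages)."""
    if sum(share_percent.values()) != 100:
        raise ValueError("percentage shares must sum to 100")
    # Integer arithmetic avoids floating-point rounding surprises
    return {region: total * pct // 100 for region, pct in share_percent.items()}


allocation = allocate_samples(TOTAL_SAMPLES, POPULATION_SHARE_PERCENT)
for region, quota in allocation.items():
    print(f"{region}: {quota}")
```

Under these assumptions the computed quotas match the stated sample sizes (25, 65, 280, 115 and 15).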

Sample selection: Different ethnic groups in Pakistan. The samples were collected from different isonym ethnic communities in different districts of the four provinces of Pakistan. For this purpose, major ethnic groups were selected from each province according to population density.

Baluchistan: 25 sample

They were divided into Bugti 5, Laghari 5, Lashri 5 and Mazari 10.

FATA: sample 15

They were divided into 10 and Wazirs 5.

Khyber Pakhtunkhwa (NWFP): 65 sample

They were divided into 20, 20, Orakzai 20, and 5.

Punjab: 280 sample

Arain 50, Gujar 30, Jat 50, 50, Rajput 50, and Sayyid 50, sample was collected.

Sindh: 115 sample

Bijarani 20, Chandio 20, Ghallu 20, Khoso 20, Naserani 20, and Solangi 15 samples was collected.


Inclusion criteria:

Unrelated individuals from different isonym ethnic communities in different districts of the four provinces of Pakistan were included.

Exclusion criteria: None.

Blood was drawn in disposable syringes and transported in thermal flasks containing ice to keep the temperature at up to 4°C. In the laboratory, the blood samples were stored at -20°C until mtDNA extraction.

Sample Collection:

This prospective study was approved by the Institutional Ethical Committee of the University of Health Sciences in accordance with the Declaration of Helsinki. 10 ml of blood was drawn from each individual into 15 ml culture tubes containing 100 µl of 0.5 M EDTA. Informed written consent was obtained from all participants.

DNA Extraction:

DNA was extracted by the phenol-chloroform method (Sambrook et al., 1989) as follows:

5 ml of frozen blood was thawed at 37°C for 5 minutes before mtDNA extraction. Samples were mixed with 30-50 ml of Tris-EDTA (TE) buffer and centrifuged at 4000 rpm for 30 min. The supernatant was discarded and the pellet was broken up by gentle tapping; 45 ml of TE buffer was then added, and the mixing and centrifugation steps were repeated until the pellet became light pink. The supernatant was discarded and 6 ml of TNE buffer (Tris-HCl 10 mM, EDTA 2 mM, NaCl 400 mM) was added. 10% SDS and 50 µg of proteinase K were added to the vial, which was incubated overnight at 37°C in a shaker. After digestion, the pellet was checked. For complete protein precipitation, supersaturated NaCl was added, and the tubes were shaken vigorously, placed immediately on ice and centrifuged again. The supernatant was decanted into an Eppendorf tube, phenol-chloroform-isoamyl alcohol was added in a ratio of 1:3, and the mixture was centrifuged. The supernatant was carefully transferred, absolute alcohol was added, and the sample was centrifuged at 4000 rpm for 15 min. The DNA pellet was air dried, 1.5 ml of TE buffer was added, and the sample was left overnight. The sample was then centrifuged briefly, and the DNA was transferred to properly labeled 2.0 ml autoclaved tubes and stored at -20°C.

Note: All centrifugations in subsequent steps should be performed at 25°C.

Quantification of DNA:

The quantity of the extracted DNA was determined as follows:

1) Spectrophotometry

2) Horizontal Electrophoresis

1) Spectrophotometry:

Spectrophotometry was done through spectrophotometer.

Procedure:

Both the upper and lower pedestals of the instrument were cleaned with lens-cleaning tissue moistened with deionized water. TE solution (1 µl) was loaded onto the lower pedestal as the blank, the sample chamber was closed by carefully lowering the swing arm, and the “Blank” button on the screen was clicked. After the blank measurement, both pedestals were wiped with lens-cleaning tissue. Each sample (1 µl) was then loaded, the sample name was typed in, and the “Measure” button on the measurement screen was selected. The 260/280 ratio displayed on the screen was recorded and used to check DNA purity.
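As an illustration of how a 260/280 reading can be turned into a purity check, the Python sketch below labels a reading as low, acceptable or high. The acceptance window of 1.7 to 2.0 is an assumption made for this example (a ratio near 1.8 is commonly taken to indicate reasonably pure DNA); the thesis does not state a cut-off. The three readings are taken from Table 3.2.

```python
# Minimal sketch: flag DNA samples by their A260/A280 purity ratio.
# The 1.7-2.0 window is an illustrative assumption, not a threshold from the thesis.
ACCEPT_MIN = 1.7
ACCEPT_MAX = 2.0


def purity_flag(ratio_260_280, low=ACCEPT_MIN, high=ACCEPT_MAX):
    """Return 'low', 'acceptable' or 'high' for a 260/280 absorbance ratio."""
    if ratio_260_280 < low:
        return "low"         # often attributed to protein or phenol carry-over
    if ratio_260_280 > high:
        return "high"        # often attributed to RNA carry-over
    return "acceptable"


# Readings copied from Table 3.2 (sample ID: 260/280 ratio).
readings = {"RJ_01": 1.65, "RJ_04": 1.75, "GJ_267": 2.09}
flags = {sample: purity_flag(ratio) for sample, ratio in readings.items()}
print(flags)
```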


2) Horizontal Electrophoresis:

A 0.8% agarose gel was prepared in 100 ml of 1X TBE buffer with 10 µl of ethidium bromide and run at 60 V for 1.5 hours.

Mitochondrial DNA Amplification Primer:

The mitochondrial DNA control region was amplified by PCR. The primers used for amplification and sequencing were designed in this study and are given in Table 2.1.

Table 2.1: Set of Primers for Amplification and Sequencing of Control region.

S. No  Primer Name  5'-3' Sequence              PCR  Sequencing
1      F15976       TCCACCATTAGCACCCAAAG        Yes  Yes
2      R16444       GAGTAGCACTCTTGTGCGGG        No   Yes
3      F16398       ACCACCATCCTCCGTGAAATC       No   Yes
4      R-133        AGATACTGCGACATAGGGTGC       No   Yes
5      F8           GGTCTATCACCCTATTAACCAC      No   Yes
6      R429         CTGTTAAAAGTGCATACCGCC       No   Yes
7      F362         CAAAGAACCCTAACACCAGC        No   Yes
8      F599         TTGAGGAGGTAAGCTACATA        No   Yes
9      R638         GGACCAAACCTATTTGTTTATGGG    Yes  Yes

Mitochondrial DNA Amplification procedure:

The PCR amplification volume was 25 µl, containing 2.5 µl of 10X PCR buffer, 4 µl of MgCl2 (15 mM), 2.5 µl of dNTPs (25 mM), 1 µl (10 pmol) each of forward and reverse primer, 1 µl of AmpliTaq (2 U/µl), 5 µl of input DNA and double-distilled H2O to make up the volume to 25 µl. The samples were amplified (40 cycles) under the following conditions:
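The volume of water needed to reach 25 µl is not stated explicitly, but it follows from the listed components. The Python sketch below works it out and also gives the final primer concentration, assuming that each primer contributes 10 pmol in total to the reaction (one reading of the recipe above). The function names are illustrative only.

```python
# Reaction components per 25 µl PCR, as listed in the text (volumes in µl).
COMPONENTS_UL = {
    "10X PCR buffer": 2.5,
    "MgCl2 (15 mM)": 4.0,
    "dNTPs (25 mM)": 2.5,
    "forward primer": 1.0,
    "reverse primer": 1.0,
    "AmpliTaq (2 U/µl)": 1.0,
    "input DNA": 5.0,
}
FINAL_VOLUME_UL = 25.0


def water_volume(components, final_volume):
    """Volume of water (µl) required to bring the reaction to its final volume."""
    used = sum(components.values())
    if used > final_volume:
        raise ValueError("components exceed the final reaction volume")
    return final_volume - used


def final_primer_concentration_um(pmol, final_volume_ul):
    """Final concentration in µM; pmol per µl is numerically equal to µM."""
    return pmol / final_volume_ul


water_ul = water_volume(COMPONENTS_UL, FINAL_VOLUME_UL)
# Assumption: 10 pmol of each primer is present in the whole reaction.
primer_um = final_primer_concentration_um(10, FINAL_VOLUME_UL)
print(water_ul, primer_um)
```

Under these assumptions the listed reagents occupy 17 µl, leaving 8 µl of water, and each primer is at 0.4 µM in the final reaction.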

Product Gel:

After PCR amplification, 2 µl of each sample mixed with 1.5 µl of 6X loading dye was run on a 2% agarose gel stained with ethidium bromide (5 mg/ml), alongside a 100 bp molecular ladder, at 80 V for 1 hour and 20 minutes to confirm the correct product size.

Purification of PCR Product:

The PCR product was purified as follows:

Ethanol precipitation was used to remove surplus primers and unincorporated dNTPs from the PCR products. 64 µl of 95% ethanol and 16 µl of PCR water were added to each 22 µl reaction mixture (final ethanol concentration 62%). The samples were vortexed for 10 seconds at 25°C and then spun at 15000 rpm for 15 minutes. After the supernatant was discarded, the pellets were washed with 70% ethanol. The pellets were dried and re-suspended in 15 µl of low TE buffer (Tris-HCl 10 mM, EDTA 0.2 mM). 2 µl of the suspension was run on a 1.5% agarose gel to confirm the presence of a single clean PCR product and to estimate its concentration.

Mitochondrial control region sequencing PCR procedure:

Big Dye Reaction was used to sequence the PCR products. The sequencing PCR had a total volume of 10 µl, containing 1.5 µl of 5X sequencing PCR buffer, 3.2 pmol of forward or reverse primer (in separate reactions), 0.8 µl of dye for the mitochondrial control region, 15 pg of input DNA and distilled water to make up the reaction volume to 10 µl. The samples were amplified (35 cycles) under the following conditions:

Purification of sequencing product:

40 µl of 75% ethanol was added to precipitate the sequencing product, and the mixture was vortexed. It was then centrifuged at 15000 rpm for 15 minutes at 4°C. The supernatant was discarded by inverting the Eppendorf tube, and the remaining pellet was air dried. The washed pellet was dissolved in formamide. The mixture was denatured at 95°C for 5 minutes and immediately placed on ice. The sample was then run on the genetic analyzer.

The sequences were analyzed manually in Chromas version 2.5.1. All sequences were compared against the revised Cambridge reference sequence (Anderson et al., 1981) using SeqScape version 4.2. Any change in the control region was detected by sequencing both the sense and antisense strands. Sequences were aligned using CLUSTAL W, and haplogroups were assigned to all samples according to Thangaraj et al., 2006.
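The comparison against a reference sequence amounts to walking along an alignment and recording every position where the sample differs. The Python sketch below shows the idea on toy, pre-aligned strings; it is not the SeqScape or CLUSTAL W procedure used in the study, the sequences and start position are invented for illustration, and the function name `call_substitutions` is hypothetical.

```python
def call_substitutions(reference, sample, start_position):
    """List substitutions between two pre-aligned sequences of equal length.

    Positions are numbered along the reference, starting at start_position.
    Gap columns ('-') and ambiguous sample bases ('N') are not reported.
    """
    if len(reference) != len(sample):
        raise ValueError("aligned sequences must have the same length")
    variants = []
    position = start_position
    for ref_base, obs_base in zip(reference.upper(), sample.upper()):
        if ref_base == "-":
            # Insertion relative to the reference: reference numbering does not advance.
            continue
        if obs_base not in ("-", "N") and obs_base != ref_base:
            variants.append(f"{ref_base}{position}{obs_base}")
        position += 1
    return variants


# Toy example (hypothetical sequences, not real mtDNA data).
toy_reference = "ACCTGA"
toy_sample = "ACTTGG"
toy_variants = call_substitutions(toy_reference, toy_sample, 100)
print(toy_variants)
```

Each reported item gives the reference base, the reference position and the observed base, which is the form in which substitutions such as C to T at position 16223 are described in the Results.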

Mitochondrial DNA coding region SNP analysis: To determine the major mtDNA haplogroups M, N, R, T, U, J, H and W, twelve coding-region SNPs were analyzed using a multiplex SBE reaction.

Multiplex-PCR: The 12 selected SNPs were divided into two multiplex reactions, each consisting of 1× buffer, ddNTPs 250 μmol/L, MgCl2 2 mmol/L, one unit of Taq enzyme, optimum concentrations of forward (F) and reverse (R) primers and 10 ng of genomic DNA, so that the final volume was 720 μl; amplification was run for 32 cycles under the conditions given below.

Mitochondrial DNA SNP Multiplex Assay:

ExoSAP enzyme was used to clean the PCR product, which was incubated at 37°C for 20 min. To denature the enzyme, it was further incubated at 80°C for 15 min. To determine the major haplogroups, multiplex assays A and B were used, the details of which are given below:

Table 2.2: Haplogroup specific SNPs

Table: 2.3 Primer detail for Multiplex assay A


Table 2.4: Detail of primer for SBE assay B

PCR Amplification Multiplex B and Single Base Extension (SBE) Multiplex B (5' aspecific tail of each SBE primer in lower case)

Site   Primer  PCR primer sequence (5'-3')  Conc (µM)  Size (bp)  SBE primer sequence (5'-3')                 Conc (µmol)  Tm (°C)
769    F       GAACAAGCATCAAGCACGCA         0.5        189        act(gact)10 CGTTTTGAGCTGCATTG               2            59.2
       R       ACGCCGGCTTCTATTGACTT         0.5
1243   F       AATCGATAAACCCCGATCAA         0.05       93         actgact CGATCAACCTCACCACC                   0.15         59.8
       R       TGCGCTTACTTTGTAGCCTTC        0.05
1736   F       TGACCGCTCTGAGCTAAACC         0.05       115        t(gact)2gac TCAATTTCTATCGCCTATACTTTAT       0.15         59.1
       R       TTGCGCCAGGTTTCAATTTCT        0.05
10400  F       AGCCCTAAGTCTGGCCTATGA        0.2        206        ct(gact)5gac CGTTTTGTTTAAACTATATACCAATTC    0.3          59
       R       GGGAGGATATGAGGTGTGAGC        0.2
12705  F       CAACCCAAACAACCCAGCTC         0.3        239        t(gact)6g TTAATCAGTTCTTCAAATATCTACTCAT      0.15         59
       R       AATTCCTACGCCCTCTCAGC         0.3
15301  F       GTCCCACCCTCACACGATTC         0.2        127        ct(gact)4 ATTCTTTACCTTTCACTTCATCTT          0.15         58.9
       R       TGGGAGGTGATTCCTAGGGG         0.2

Multiplex Single Base Extension (SBE):

SBE was performed in a total volume of 5 μL, comprising 2 μL of SNaPshot enzyme mixture, 2 μL of cleaned PCR product and 1 μL of extension primers (0.05 μmol/L), for 25 cycles.

Analysis on Genetic Analyzer:

10 μL of formamide and 1 μL of cleaned product were mixed with 0.3 μL of 120 LIZ standard and heated at 95°C for 6 minutes. The raw data were collected and analyzed with GeneMapper software v 3.2 (Applied Biosystems, USA).

Phylogenetic Reconstruction:

The mtDNA haplogroup classification was based on PhyloTree Build 16 (19 February 2014), http://www.phylotree.org/. The phylogenetic tree was constructed from median-joining networks using Network 4.61.3, available at http://www.fluxus-engineering.com/ (Bandelt et al., 1999), and was checked manually to resolve homoplasies. PHYLIP version 3.6.6 was used to draw an unrooted phylogenetic tree of the 22 populations based on genetic distances among caste communities (Felsenstein, 1988). To examine the association between the 22 castes and tribes of the Pakistani population, multidimensional scaling (MDS) was performed using XLSTAT software (Cordaux et al., 2003).

Statistical analysis:

Statistical analysis was performed with Arlequin v 3.5.2.1, an integrated software package for population genetics data analysis. The parameters included in this analysis were genetic diversity, mean and standard deviation of expected heterozygosity, number of haplotypes, random match probability and power of discrimination (Excoffier et al., 2005).
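For orientation, the forensic parameters named above can be computed from haplotype counts using their commonly used definitions: random match probability as the sum of squared haplotype frequencies, power of discrimination as one minus that sum, and genetic (haplotype) diversity as n(1 - sum of squared frequencies)/(n - 1). The Python sketch below applies these definitions to a toy list of haplotype labels. It is an illustration under those assumptions, not a reproduction of Arlequin's implementation, and the labels are invented.

```python
from collections import Counter


def forensic_parameters(haplotypes):
    """Compute simple forensic statistics from a list of haplotype labels."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two individuals are required")
    counts = Counter(haplotypes)
    # Sum of squared haplotype frequencies.
    sum_sq = sum((count / n) ** 2 for count in counts.values())
    return {
        "n_haplotypes": len(counts),
        "random_match_probability": sum_sq,
        "power_of_discrimination": 1.0 - sum_sq,
        # Unbiased haplotype diversity with the n / (n - 1) sample-size correction.
        "genetic_diversity": n * (1.0 - sum_sq) / (n - 1),
    }


# Toy data: four individuals, three distinct haplotypes (hypothetical labels).
toy_sample = ["Ht1", "Ht1", "Ht2", "Ht3"]
params = forensic_parameters(toy_sample)
print(params)
```

With many haplotypes observed only once, the sum of squared frequencies becomes small, which is why a high genetic diversity goes together with a low random match probability.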


RESULTS

Complete mtDNA control region sequences were analyzed for 500 unrelated Pakistani individuals from 22 ethnic groups (Arain, Bangash, Bijarani, Bugti, Chandio, Ghallu, Gujar, Jat, Kakazai, Khattak, Khoso, Laghari, Lashari, Mahsuds, Mazari, Nasrani, Orakazai, Rajput, Solangi, Sayyid, Yusufzai and Wazirs) from 4 provinces, namely Baluchistan, Khyber Pakhtunkhwa (including FATA), Punjab and Sindh. We found 8 major haplogroups (M, N, R, D, U, T, W and H); the detailed haplogroup classification is given in Table 3.1.

On the basis of sequence analysis, 412 haplotypes, each defined by a particular set of nucleotides, were found (ignoring the C insertions around positions 309 and 315). Of these sequences, 65% were observed once, 11% twice, 8% thrice, 5% four times and 2.2% five times. The most common South Asian haplogroups observed were M (46%), N (7%) and R (13%), while the West Eurasian haplogroups were U (18%), H (5%), J (4%), W (3%) and T (2%) in the 22 ethnic groups. Disregarding the length variation at positions 309 and 315, the random match probability between two unrelated individuals was found to be 0.01 to 0.06%. Genetic diversity ranged from 0.991 to 0.999, while nucleotide diversity fell between 0.0089 and 0.0142. The mean number of pairwise differences ranged from 5.2 ± 2.8 to 12.5 ± 6.2, whereas expected heterozygosity ranged from 0.119 ± 0.11 to 0.459 ± 0.093 in the 22 ethnic groups.

Based on AMOVA, the majority of variation was observed within populations (intra-population variation, 95.67%), while variation between groups was 1.45% and variation among populations within groups was 2.88%. The comparatively high average nucleotide diversity and low random match probability observed in the control region of mtDNA in the 22 ethnic groups support its forensic use. A multiplex assay was developed to score the 12 coding-region SNPs for defining major South East Asian haplogroups.

Table 3.1: Major Haplogroups found in 22 ethnic groups of Pakistan.

Haplotype  Number    Haplotype  Number    Haplotype  Number
M2         16        M46        3         H11        1
M3         49        M49        13        W1         5
M4         8         M51        1         W3         2
M5         32        M52        6         W5         1
M6         5         M57        2         W7         1
M7         5         R31        7         N1         3
M12        3         R7         16        N2         2
M14        1         R8         13        N5         2
M18        10        R9         3         N7         1
M24        1         U2         43        N9         13
M30        9         U3         1         N10        3
M33        14        U4         2         N11        4
M34        3         U5         14        R2         15
M35        6         U6         4         R5         4
M37        1         U7         20        R6         3
M38        2         U8         5         J1         6
M40        5         H6         6         J2         11
M41        1         H2         11        JT         1
M43        1         HV2        11        T2         3
M45        7         HV6        1         T3         1

Quantification of DNA:

DNA was quantified in Tris-EDTA solution by spectrophotometry, as shown in Table 3.2.

Table 3.2: Quantification of DNA of 22 different ethnic/isonym groups from four provinces of Pakistan.

Columns (repeated three times per row): Sample (n), 260/280 ratio, 260/230 ratio, DNA quantity (ng/µl)

RJ_01 1.65 1.96 80.7 JT_51 1.75 1.99 177.8 AY_101 1.23 1.98 68.9 RJ_02 1.66 1.99 89.7 JT_52 1.71 1.98 125 AY_102 1.82 1.91 112.3 RJ_03 1.69 2.03 59.9 JT_53 1.8 1.99 258.3 AY_103 1.76 1.89 125.2 RJ_04 1.75 2.07 132.7 JT_54 1.78 2.03 181.8 AY_104 1.72 1.99 128.6 RJ_05 1.78 2.06 102.1 JT_55 1.8 2.01 248.5 AY_105 1.73 2.01 157.9 RJ_06 1.79 1.85 114.7 JT_56 1.79 2.09 186.4 AY_106 1.75 2.08 254 RJ_07 1.6 1.91 41.7 JT_57 1.83 2.07 358 AY_107 1.74 2.09 213 RJ_08 1.76 1.96 80.3 JT_58 1.81 2.01 199.6 AY_108 1.79 2.01 65.3 RJ_09 1.8 1.98 103.1 JT_59 1.83 2 257.6 AY_109 1.79 2.1 158.9 RJ_10 1.89 2.05 100.6 JT_60 1.8 1.98 148.5 AY_110 1.72 2.03 127.3 RJ_11 1.75 1.99 69 JT_61 1.82 1.99 149.7 AY_111 1.69 2.03 174.3 RJ_12 1.76 1.93 66 JT_62 1.75 1.92 83.8 AY_112 1.65 2.07 157.8 RJ_13 1.78 1.98 281.1 JT_63 1.82 1.92 257.3 AY_113 1.78 1.98 123.6 RJ_14 1.75 2.01 281.1 JT_64 1.81 1.96 149.3 AY_114 1.73 1.92 125.9 RJ_15 1.57 2.09 94.1 JT_65 1.81 1.98 279.5 AY_115 1.77 1.99 15.9 RJ_16 1.57 2.07 76.8 JT_66 1.84 1.95 237.5 AY_116 1.8 1.96 124.7 RJ_17 1.7 2.09 229.5 JT_67 1.82 1.96 164.5 AY_117 1.78 1.98 127.1 RJ_18 1.76 2.07 312.7 JT_68 1.81 1.99 194.2 AY_118 1.76 1.99 254.7 RJ_19 1.73 2.05 212.8 JT_69 1.84 2.12 132.2 AY_119 1.75 1.64 241.9 RJ_20 1.75 2.08 282.8 JT_70 1.79 1.98 151.5 AY_120 1.7 2.08 254.7 RJ_21 1.65 2.03 71.4 JT_71 1.83 2.04 86.2 AY_121 1.82 2.01 214.7 RJ_22 1.8 2.05 452.4 JT_72 1.85 2.04 280.6 AY_122 1.74 1.98 283.9 RJ_23 1.75 1.99 162.9 JT_73 1.52 2.09 670.9 AY_123 1.74 1.93 147.8 RJ_24 1.58 1.96 138.5 JT_74 1.81 2.02 207.5 AY_124 1.646 1.99 152.7 RJ_25 1.7 1.99 193.8 JT_75 1.8 2.07 190.8 AY_125 1.65 2.09 122.2 RJ_26 1.82 1.98 558.5 JT_76 1.8 2 137.6 AY_126 1.66 2.03 185.3 RJ_27 1.76 2.03 229.1 JT_77 1.79 1.98 249.1 AY_127 1.69 2.09 197.8 RJ_28 1.72 2.08 237.6 JT_78 1.84 2.09 174.7 AY_128 1.75 2.01 127.1 RJ_29 1.73 2.06 258.6 JT_79 1.83 2.14 527 AY_129 1.78 2.11 198.9 RJ_30 1.75 2.09 192.8 JT_80 1.78 1.88 133.2 AY_130 1.79 2.13 83.8 RJ_31 1.74 2.08 168.2 JT_81 1.8 1.96 
246 AY_131 1.6 2.15 257.3 RJ_32 1.79 1.99 262.9 JT_82 1.8 1.93 125.7 AY_132 1.76 2.15 149.3 RJ_33 1.79 1.98 250.9 JT_83 1.83 2.05 161.8 AY_133 1.8 2.16 279.5 RJ_34 1.72 2.01 135.9 JT_84 1.78 1.96 224.4 AY_134 1.89 2.51 237.5 RJ_35 1.69 2.09 66.9 JT_85 1.76 1.78 22.7 AY_135 1.75 1.98 164.5 RJ_36 1.65 2.03 48.4 JT_86 1.63 2.03 24 AY_136 1.76 1.99 194.2 RJ_37 1.78 1.98 121.1 JT_87 1.77 2.03 23.7 AY_137 1.78 1.96 132.2 RJ_38 1.73 1.99 147.6 JT_88 1.54 1.98 6.6 AY_138 1.75 1.93 151.5 RJ_39 1.77 1.93 177 JT_89 1.46 1.93 13.5 AY_139 1.57 1.97 86.2 RJ_40 1.8 1.98 116.2 JT_90 1.8 1.96 22.7 AY_140 1.57 2.03 280.6 RJ_41 1.78 1.96 172.9 JT_91 1.75 1.88 22.4 AY_141 1.7 2.08 670.9 RJ_42 1.76 1.96 147.8 JT_92 1.66 1.99 6.7 AY_142 1.76 2.1 207.5 RJ_43 1.75 1.92 204.4 JT_93 1.82 1.97 10.2 AY_143 1.73 2.03 190.8 RJ_44 1.7 2.01 110.6 JT_94 1.84 2.1 153.5 AY_144 1.75 2.18 137.6 RJ_45 1.82 2.07 111.9 JT_95 1.73 1.99 65.1 AY_145 1.65 2.16 249.1 RJ_46 1.74 2.09 114.5 JT_96 1.85 2.14 176.5 AY_146 1.8 2.01 254.1 RJ_47 1.74 2.07 135.8 JT_97 1.42 1.24 15.5 AY_147 1.75 2.07 235.1 RJ_48 1.646 2.01 79.4 JT_98 1.81 2.11 177.1 AY_148 1.58 2.06 23.9 RJ_49 1.77 2.03 227.6 JT_99 1.85 2.25 244.5 AY_149 1.7 2.01 18.9 RJ_50 1.8 2.07 317.6 JT_100 1.81 2.09 131.8 AY_150 1.6 2 42.7

Table 3.2 (continued)

Columns (repeated three times per row): Sample (n), 260/280 ratio, 260/230 ratio, DNA quantity (ng/µl)

SD_151 1.8 2 36.2 KZ_201 1.83 2.01 237.6 GJ_251 1.82 1.66 52.9 SD_152 1.8 1.91 123.6 KZ_202 1.81 2.07 258.6 GJ_252 1.75 1.68 36.9 SD_153 1.79 1.98 125 KZ_203 1.82 1.95 192.8 GJ_253 1.92 1.68 45.9 SD_154 1.84 2.09 124 KZ_204 1.75 1.97 168.2 GJ_254 1.96 1.99 78.9 SD_155 1.83 2.14 258 KZ_205 1.82 1.93 262.9 GJ_255 1.81 1.96 68.9 SD_156 1.78 1.88 321 KZ_206 1.81 1.98 250.9 GJ_256 1.84 1.98 45.1 SD_157 1.8 1.96 254.9 KZ_207 1.81 2.09 135.9 GJ_257 1.82 2.08 78.9 SD_158 1.8 1.93 124.9 KZ_208 1.84 2.01 66.9 GJ_258 1.81 2.09 258.1 SD_159 1.83 2.05 11.2 KZ_209 1.82 2.03 48.4 GJ_259 1.84 2.07 25.9 SD_160 1.78 1.96 112.5 KZ_210 1.81 1.94 121.1 GJ_260 1.79 1.96 66.3 SD_161 1.76 1.78 236.9 KZ_211 1.84 2.12 147.6 GJ_261 1.83 1.98 77.9 SD_162 1.63 1.39 129 KZ_212 1.79 1.99 177 GJ_262 1.85 1.89 78.4 SD_163 1.77 1.51 129.7 KZ_213 1.83 2.09 192.8 GJ_263 1.99 1.99 96.3 SD_164 1.54 1.04 128.1 KZ_214 1.85 1.98 168.2 GJ_264 1.85 1.98 89.9 SD_165 1.46 1.48 142.6 KZ_215 1.52 2 262.9 GJ_265 1.96 1.91 78.9 SD_166 1.8 1.19 147.2 KZ_216 1.81 2.02 250.9 GJ_266 2.08 2.01 99.9 SD_167 1.75 1.88 151.3 KZ_217 1.76 2.06 258.3 GJ_267 2.09 2.09 369.2 SD_168 1.66 1.73 11.1 KZ_218 1.72 2.07 123.3 GJ_268 2.01 2.01 77.5 SD_169 1.82 1.64 258 KZ_219 1.73 2.04 365.3 GJ_269 1.96 2.09 98.9 SD_170 1.84 2.1 78.9 KZ_220 1.75 2.07 125.8 GJ_270 1.93 2.03 78.1 SD_171 1.73 1.85 49.9 KZ_221 1.74 2.05 325 GJ_271 1.99 1.98 96.9 SD_172 1.85 2.14 236.5 KZ_222 1.79 2.07 239.4 GJ_272 2.03 1.75 22.9 SD_173 1.42 2.01 125 KZ_223 1.79 2.03 96.3 GJ_273 1.88 1.52 66.9 SD_174 1.81 2.11 147 KZ_224 1.72 1.96 89.3 GJ_274 2.05 1.99 79.5 SD_175 1.75 1.98 49.9 KZ_225 1.69 1.99 45.6 GJ_275 1.56 1.99 22.8 SD_176 1.71 1.96 55.9 KZ_226 1.65 1.92 256.3 GJ_276 2.03 1.66 56.7 SD_177 1.8 2.01 85.3 KZ_227 1.78 2.08 365.9 GJ_277 2.07 1.52 87.9 SD_178 1.78 2.09 96.3 KZ_228 1.73 2.05 36.9 GJ_278 2.06 1.68 258 SD_179 1.8 2.15 68.9 KZ_229 1.77 2.07 152.9 GJ_279 1.96 1.68 26.9 SD_180 1.79 2.04 289.3 KZ_230 1.77 2.03 12.9 GJ_280 1.95 1.96 128.9 SD_181 
1.83 2.09 259.3 KZ_231 1.54 2.07 125.3 NS_281 1.81 1.96 96.9 SD_182 1.81 2.07 233.6 KZ_232 1.46 2.03 23.6 NS_282 1.89 2.01 99.7 SD_183 1.83 2 56.9 KZ_233 1.83 2.09 37.9 NS_283 1.56 2.09 235 SD_184 1.8 1.98 69.9 KZ_234 1.75 1.88 125.7 NS_284 1.99 2.09 68 SD_185 1.82 1.97 93.3 KZ_235 1.66 2.03 136 NS_285 1.96 1.57 98.7 SD_186 1.75 2.05 89.6 KZ_236 1.82 1.99 127 NS_286 1.99 2.06 36 SD_187 1.82 1.98 256.3 KZ_237 1.84 2.11 12.3 NS_287 2.01 1.82 78 SD_188 1.81 1.93 269.3 KZ_238 1.73 1.85 177.8 NS_288 1.98 2.05 97.5 SD_189 1.81 1.92 45.9 KZ_239 1.85 2.14 125 NS_289 1.96 2.01 36 SD_190 1.84 1.93 59.9 KZ_240 1.42 1.24 258.3 NS_290 1.99 1.96 98 SD_191 1.82 1.94 188.1 KZ_241 1.81 2.11 181.8 NS_291 2.05 1.91 125.7 SD_192 1.81 1.98 244.2 KZ_242 1.85 2.25 248.5 NS_292 2.04 1.99 369.2 SD_193 1.84 2.12 299 KZ_243 1.81 2.09 186.4 NS_293 2.09 1.96 124.8 SD_194 1.79 1.96 211.2 KZ_244 1.65 1.39 358 NS_294 2.05 1.89 483.9 SD_195 1.83 1.98 45.6 KZ_245 1.66 2.07 199.6 NS_295 2.02 1.93 127.5 SD_196 1.85 2.07 78.2 KZ_246 1.69 2.01 257.6 NS_296 2.03 1.59 154.9 SD_197 1.52 2.09 36.9 KZ_247 1.75 2.09 148.5 NS_297 20.7 1.72 52.8 SD_198 1.81 2.02 89.7 KZ_248 1.78 1.62 149.7 NS_298 1.7 1.75 16.9 SD_199 1.53 1.99 98.7 KZ_249 1.79 2.01 83.8 NS_299 1.82 1.52 48.9 SD_200 1.63 1.96 96.9 KZ_250 1.69 1.99 59.6 NS_300 1.74 2.01 87.9

Table 3.2 (continued)

Columns (repeated three times per row): Sample (n), 260/280 ratio, 260/230 ratio, DNA quantity (ng/µl)

GH_301 1.75 2.01 194.2 KS_351 1.78 1.85 174.7 BG_401 1.46 2.09 257.6 GH_302 1.71 2.07 132.2 KS_352 1.87 1.91 527 BG_402 1.83 1.85 148.5 GH_303 1.89 1.99 151.5 KS_353 1.81 1.97 133.2 BG_403 1.75 1.92 149.7 GH_304 1.78 1.97 86.2 KS_354 1.83 1.74 246 BG_404 1.66 2.03 83.8 GH_305 1.85 1.89 280.6 KS_355 1.78 2.51 125.7 BG_405 1.82 1.99 257.3 GH_306 1.79 1.85 670.9 KS_356 1.76 1.99 161.8 BG_406 1.84 1.98 149.3 GH_307 1.83 1.98 207.5 KS_357 1.63 1.67 224.4 BG_407 1.73 2.09 279.5 GH_308 1.81 1.96 190.8 KS_358 1.77 1.68 22.7 BG_408 1.85 1.99 237.5 GH_309 1.83 1.99 137.6 KS_359 1.54 2.01 24 BG_409 1.42 2.03 164.5 GH_310 1.86 2.05 249.1 KS_360 1.46 1.91 23.7 BG_410 1.81 1.96 132.7 GH_311 1.82 1.93 174.7 CN_361 1.88 2.09 6.6 BG_411 1.85 1.99 102.1 GH_312 1.75 1.98 527 CN_362 1.75 1.98 13.5 BG_412 1.81 1.98 114.7 GH_313 1.82 1.93 133.2 CN_363 1.66 2.08 22.7 BG_413 1.65 1.96 41.7 GH_314 1.81 1.99 246 CN_364 1.82 2.03 22.4 BG_414 1.66 1.95 80.3 GH_315 1.81 2.03 125.7 CN_365 1.84 1.57 6.7 BG_415 1.69 1.93 103.1 GH_316 1.84 2.08 161.8 CN_366 1.73 1.89 10.2 OK_416 1.75 1.87 100.6 GH_317 1.82 2 224.4 CN_367 1.85 2.01 153.5 OK_417 1.78 1.56 69.1 GH_318 1.81 2.06 22.7 CN_368 1.42 2.09 65.1 OK_418 1.79 1.99 66.2 GH_319 1.84 1.99 162.9 CN_369 1.81 1.89 176.5 OK_419 1.69 2.03 281.1 GH_320 1.79 1.97 138.5 CN_370 1.85 1.92 15.5 OK_420 1.23 1.93 281.1 BJ_321 1.83 1.93 193.8 CN_371 1.81 1.91 41.7 OK_421 1.98 1.98 94.1 BJ_322 1.85 1.82 558.5 CN_372 1.74 1.99 80.3 OK_422 1.32 1.96 236.2 BJ_323 1.52 2.07 229.1 CN_373 1.63 1.67 103.1 OK_423 1.23 2.03 123.5 BJ_324 1.81 2 237.6 CN_374 1.29 1.91 100.6 OK_424 1.33 1.85 147.3 BJ_325 1.8 2.2 258.6 CN_375 1.69 1.6 69 OK_425 1.39 1.95 125.3 BJ_326 1.75 1.91 192.8 CN_376 1.56 1.99 66 OK_426 1.45 1.96 129.7 BJ_327 1.76 1.72 168.2 CN_377 1.59 1.72 281.1 OK_427 1.25 1.93 174.6 BJ_328 1.78 1.97 262.9 CN_378 1.78 1.75 281.1 OK_428 1.98 2.09 155.3 BJ_329 1.75 1.92 250.9 CN_379 1.67 1.99 94.1 OK_429 1.77 1.99 144.5 BJ_330 1.57 1.98 135.9 CN_380 1.66 1.93 76.8 
OK_430 1.23 1.99 178.6 BJ_331 1.57 1.96 66.9 SL_381 1.64 2.09 229.5 OK_431 1.32 2.09 133.2 BJ_332 1.77 1.99 48.4 SL_382 1.65 2.97 312.7 OK_432 1.45 1.91 123.9 BJ_333 1.76 2.07 121.1 SL_383 1.56 2.04 212.8 OK_433 1.55 1.91 158.6 BJ_334 1.73 2.06 22.3 SL_384 1.83 1.93 282.8 OK_434 1.75 1.96 145.6 BJ_335 1.75 2.09 598.1 SL_385 1.74 2.97 71.4 OK_435 1.32 1.93 148.9 BJ_336 1.65 1.99 265.3 SL_386 1.78 1.65 452.4 KT_436 1.88 2.03 147.6 BJ_337 1.84 2.01 136.3 SL_387 1.72 2.06 162.9 KT_437 1.89 1.89 152.8 BJ_338 1.75 2 134.2 SL_388 1.89 1.95 138.5 KT_438 1.99 2.07 159.3 BJ_339 1.58 1.98 126.9 SL_389 1.66 1.89 193.8 KT_439 1.89 1.85 147.3 BJ_340 1.7 1.96 36.9 SL_390 1.69 1.93 558.5 KT_440 1.55 2.08 185.3 KS_341 1.82 1.97 265.3 SL_391 1.25 1.93 229.1 KT_441 1.86 2.07 178.9 KS_342 1.76 1.96 98.3 SL_392 1.23 1.95 237.6 KT_442 1.98 2.1 152.3 KS_343 1.64 1.94 72.8 SL_393 1.36 2.09 258.6 KT_443 1.63 2.14 225.3 KS_344 1.23 1.93 45.1 SL_394 1.52 1.93 152.3 KT_444 1.56 2.16 278.9 KS_345 1.32 2.01 58.3 SL_395 1.59 1.96 12.6 KT_445 1.67 2.01 245.9 KS_346 1.76 2.08 26.9 BG_396 1.69 1.98 28.9 KT_446 1.69 2.03 298.3 KS_347 1.83 2.09 45.1 BG_397 1.65 1.93 126.3 KT_447 1.75 1.95 310.6 KS_348 1.88 2.1 86.9 BG_398 1.56 1.96 145.3 KT_448 1.66 2.02 54.9 KS_349 1.22 1.96 452.3 BG_399 1.89 2.09 127.9 KT_449 1.45 2.07 52.6 KS_350 1.88 1.86 85.9 BG_400 1.58 1.99 145.7 KT_450 1.52 1.99 54.1

Table 3.2 (continued)

Columns (repeated twice per row): Sample (n), 260/280 ratio, 260/230 ratio, DNA quantity (ng/µl)

KT_451 1.11 1.93 45 BT_476 1.63 1.99 153.3 KT_452 1.57 1.86 23.3 BT_477 1.45 1.96 186.7 KT_453 1.88 1.85 12.3 BT_478 1.65 1.98 12.6 KT_454 1.76 2.09 48.4 BT_479 1.96 2.03 18.9 KT_455 1.45 2.01 25.6 BT_480 1.93 1.96 20.9 YF_456 1.96 2.1 48.9 LS_481 1.53 2.01 45.7 YF_457 1.77 1.95 45.3 LS_482 1.68 2.09 42.3 YF_458 1.47 1.88 74.6 LS_483 1.59 2.01 49.8 YF_459 1.77 2.15 49.3 LS_484 1.66 2.04 56.7 YF_460 1.75 1.95 76.7 LS_485 1.48 2.09 89.9 MZ_461 1.63 1.95 63.9 MS_486 1.58 2.07 262.9 MZ_462 1.86 1.96 78.2 MS_487 1.76 1.85 250.9 MZ_463 1.42 1.93 87.3 MS_488 1.89 2.07 135.9 MZ_464 1.52 1.98 25.6 MS_489 1.85 2.09 66.9 MZ_465 1.46 1.95 45.9 MS_490 1.63 2.03 48.4 MZ_466 1.83 2.03 41.7 MS_491 1.77 2.07 121.1 MZ_467 1.63 1.96 65.3 MS_492 1.23 2.04 147.6 MZ_468 1.78 1.89 36.9 MS_493 1.32 2.07 177 MZ_469 1.65 1.99 12.9 MS_494 1.45 2.03 116.2 MZ_470 1.77 1.96 42.6 MS_495 1.55 1.99 172.9 LG_471 1.66 1.75 25.3 WZ-496 1.75 1.98 147.8 LG_472 1.56 2.09 89.9 WZ- 498 1.32 1.93 204.4 LG_473 1.85 1.98 16.9 WZ-499 1.36 1.94 110.6 LG_474 1.36 2.07 152.3 WZ-500 1.32 1.85 120.69 LG_475 1.96 2.08 178.9

AY (Arain), BG (Bangash), BJ (Bijarani), BT (Bugti), CN (Chandio), GH (Ghallu), GJ (Gujar), JT (Jat), KZ (Kakazai), KT (Khattak), KS (Khoso), LG (Laghari), LS (Lashari), MS (Mahsuds), MZ (Mazari), NS (Nasrani), OK (Orakazai), RJ (Rajput), SL (Solangi), SD (Sayyid), YF (Yusufzai), WZ (Wazirs).

Horizontal electrophoresis:

To check the quality of the extracted DNA, randomly selected samples were analyzed by horizontal electrophoresis. Standards of 25 ng and 50 ng DNA were used for comparison with the samples run in different wells of the agarose gel. Illumination of the DNA bands showed their quality, as shown in Figure 3.1.

Lanes (left to right): 50 ng, 25 ng, 80 ng, 100 ng, 90 ng, 100 ng, 70 ng, 80 ng, 100 ng, 120 ng

S1 = 50 ng of standard DNA; S2 = 25 ng of standard DNA; C1-8 = sample DNA

Figure 3.1: DNA quantification from blood

Amplification through PCR:

The entire control region was amplified by PCR. To confirm amplification, the product was run on an agarose gel with a 1 kb standard ladder (Figures 3.2 and 3.3), which verified the correct amplification of the required region of mtDNA.

Ladder bands marked: 1000 bp and 500 bp

L = Standard ladder (1 kb); S1-8 = Amplified samples of 1150 bp; C = Positive control

Figure 3.2: PCR amplification of mitochondrial control region

Ladder bands marked: 1000 bp and 500 bp

M = Standard ladder (1 kb); S1 to S8 = Amplified samples

Figure 3.3: PCR amplification of 1150bp segment of mitochondrial control region

Mitochondrial Control Region Sequencing PCR and Sequence Analysis:

The mitochondrial control region was amplified for sequencing PCR with the forward and reverse primers listed in Table 2.1. The control region of mtDNA was analyzed between positions 16024 to 16569 and 73 to 576. Nucleotide polymorphism data are presented in Figures 8, 9 and 10. Nucleotide substitution (46%) was the most common change, compared with insertion (30%) and deletion (7.6%). The most common transition was C to T, and the most frequent transversion was A to T. The insertions observed were additions of C residues.

Figure 3.4: Sequence of fragment 16024 to 16569 bp of mtDNA and its dendogram

Figure 3.5: Sequence of fragment 8 to 420 bp of mtDNA and its dendogram

Figure 3.6: Sequence of fragment 362 to 576 bp of mtDNA and its dendogram

Sequencing of 16024 to 16569 bp Segment:

Sequence analysis of the 16024 to 16569 bp segment was done to identify mitochondrial variants in the population of Pakistan (Figure 3.7).

Figure 3.7: Dendograms of 16024 to 16569 bp segment of mtDNA in Pakistani population

Sequencing results of 8 to 420 bp Segment:

The 8 to 420 bp segment was sequenced to identify the different variants in the population of Pakistan, as shown in Figure 3.8.

Figure 3.8: Dendograms of 8 to 420 bp segment of mtDNA in Pakistani population.

Sequencing results of 362 to 576 bp Segment:

Sequencing results of the 362 to 576 bp fragment in the Pakistani population are shown in different dendograms (Figure 3.9).

Figure 3.9: Dendograms of 362 to 576 bp segments of mtDNA in Pakistani population

Data Analysis:

Transitions:

A transition is the change of a purine to a purine or a pyrimidine to a pyrimidine. The most prevalent nucleotide substitution was found at position 16223 bp (C to T).
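The distinction between the two substitution classes can be expressed as a simple rule: a change within the purines (A, G) or within the pyrimidines (C, T) is a transition, and a change between the two classes is a transversion. The Python sketch below encodes this rule; the function name is illustrative, and the example substitutions are ones reported in this chapter.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref_base, alt_base):
    """Classify a single-nucleotide substitution as a transition or a transversion."""
    ref, alt = ref_base.upper(), alt_base.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("reference and alternate bases are identical")
    # Same chemical class (both purines or both pyrimidines) means a transition.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# Substitutions reported in the text: (position, reference base, observed base).
examples = [(16223, "C", "T"), (263, "A", "G"), (16220, "A", "T"), (16209, "A", "C")]
classified = {pos: classify_substitution(ref, alt) for pos, ref, alt in examples}
print(classified)
```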

Normal sequence having “CCC”

Figure 3.10: Most common transition was c → t at position 16223bp in mtDNA control region

Figure 3.11: Most common transition was a → g at position 263bp in mtDNA control region.

Figure 3.11a: Most common transition was a → g at position 73bp in mtDNA control region.

Figure 3.11b: Most common transition was t → c at position 146C and 152C in mtDNA control region

Figure 3.11c: Most common transition was t → c at position 152C and g → a at position 195bp in mtDNA control region.

Figure 3.11d: Most common transition was a → g at position 193 in mtDNA control region.

Figure 3.11e: Most common transition was c → t at position 194 and t → c at position 246 in mtDNA control region.

Figure 3.11f: Most common transition was t → c at position 204, g → a at position 207 and a → g at position 263 in mtDNA control region.

Figure 3.11g: Most common transition was c → t at position 151 and t → c at position 152 in mtDNA control region.

Figure 3.11h: Most common transition was t → c at position 16311 and g → a at position 16319 in mtDNA control region.

Figure 3.11i: Most common transition was c → t at position 16223 and c → t at position 16234 in mtDNA control region.

Figure 3.11j: Most common transition was t → c at position 16126 and g → a at position 16145 in mtDNA control region.

Figure 3.11k: Most common transition was c → t at position 16234, 16239 and 16278 in mtDNA control region.

Figure 3.11ℓ: Most common transition was c → t at position 16294 and a → g at position 16309 in mtDNA control region.

Figure 3.11m: Most common transition was g → a at position 16129 in mtDNA control region.

Figure 3.11n: Most common transition was t → c at position 16217 in mtDNA control region.

Figure 3.11o: Most common transition was c → t at position 16354 in mtDNA control region.

Normal sequence having “T”

Figure 3.12: Most common transition was t → c at position 489bp.

Transversion:

Transversions are substitutions of a purine for a pyrimidine or vice versa. They were prevalent in the control region of mtDNA. An A to T transversion was noted at position 16220 bp, as shown in Figure 3.13.

Normal sequence

Figure 3.13: Transversion a → t at position 16220bp in mtDNA control region

Figure 3.12a: Most common transversion was a → t at position 16318 in mtDNA control region.

Figure 3.12b: Most common transversion was a → t at position 16318 and transitions were t → c at position 16311 and t → c at position 16325 in mtDNA control region.

Figure 3.12c: Most common transversion was a → c at position 16209 in mtDNA control region.

Insertions:

The most common insertions observed in the control region were C-stretch sequence variants at positions 309-315 bp, known as length heteroplasmy. In some samples, the resolution of the dendogram was lost due to C-stretch heteroplasmy, especially when sequenced with the reverse primer, whereas the forward primer gave high-quality results.


Length heteroplasmy (LH) and Point mutation heteroplasmy (PMH)

The most common type of heteroplasmy observed in the mitochondrial control regions was length heteroplasmy (LH). The subjects of this condition have multiple sets of the mtDNA genome that vary in length, usually cytosine (C) residue on the light strand (homopolymeric tract) while, the heavy strand has a complementary base.

Primarily, LH may result in a failure to read or interpret downstream data precisely, because it disrupts the sequential rhythm of the replication fork (de Camargo et al., 2011; Irwin et al., 2009; Ramos et al., 2013). There are five common sites where C-stretch (poly-cytosine) residues have been observed in the control region of mtDNA. These sites are 16189, 303-315, 568-573, and 514-524 (AC residues). In addition, point mutation heteroplasmy (PMH), or sequence heteroplasmy, was also observed; it is marked by the occurrence of different nucleotides at a single position on both the L and H strands of mtDNA (Al-Rashedi et al., 2016; Gardner et al., 2015). Moreover, it has been observed that purine heteroplasmy is less common than pyrimidine heteroplasmy in different non-pathological haplotypes (Jakupciak et al., 2008; Ramos et al., 2013). These mitochondrial hotspots were positioned at nucleotides 16093, 16189, 16234, 73, 309-315, 514-524 and 568-573.
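As an illustration of how the sites listed above might be used to flag a position during data review, the following Python sketch checks a nucleotide position against the length heteroplasmy tracts named in the text and the point heteroplasmy positions named in the captions of Figures 3.13c to 3.13f. The names `LENGTH_HETEROPLASMY_SITES`, `POINT_HETEROPLASMY_SITES` and `heteroplasmy_types` are hypothetical, and the lookup is a simplification rather than the procedure followed in this study.

```python
from typing import List

# Length heteroplasmy tracts as listed in the text (inclusive ranges).
LENGTH_HETEROPLASMY_SITES = [(16189, 16189), (303, 315), (568, 573), (514, 524)]
# Point mutation heteroplasmy positions from Figures 3.13c to 3.13f.
POINT_HETEROPLASMY_SITES = {73, 16093, 16189, 16234}


def heteroplasmy_types(position: int) -> List[str]:
    """Return which kinds of heteroplasmy hotspot cover a given position."""
    kinds = []
    if any(start <= position <= end for start, end in LENGTH_HETEROPLASMY_SITES):
        kinds.append("length")
    if position in POINT_HETEROPLASMY_SITES:
        kinds.append("point")
    return kinds


for pos in (310, 73, 16189, 100):
    print(pos, heteroplasmy_types(pos))
```

Position 16189 is reported under both categories in this section, so the sketch returns both labels for it.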

Figure 3.13a: Length heteroplasmy at position 303-315

Figure 3.13b: Length Heteroplasmy at position 568-573

Figure 3.13c: Point mutation heteroplasmy at position 73


Figure 3.13d: Point Mutation Heteroplasmy at position 16093

Figure 3.13e: Point mutation heteroplasmy at position 16189

Figure 3.13f: Point Mutation Heteroplasmy at position 16234

Deletions:

A deletion of a single C residue was located at 523 bp and 524 bp of the mtDNA control region in the population of Pakistan.

Figure 3.14: Length heteroplasmy at position 514-524; deletion of AC repeats at position 523-524 bp.


Normal sequence having “CC”

Figure 3.14a: Deletion of “C” in control region of mtDNA

Distribution of Haplogroup Profile:

On the basis of mutational analysis of the control region of mtDNA relative to the reference sequence, individuals were categorized into specific haplogroups and monophyletic clades. The haplogroups fall into three major clades, designated L, M and N, each with its own worldwide geographical distribution. The Indian subcontinent was a place of early settlement with a diverse genetic history, and haplogroups M and N are frequently distributed among its different populations.

The oldest macro-haplogroup is L, which is found almost entirely in Africa, predominantly in sub-Saharan African populations. It mainly comprises the monophyletic clades L0 and L1 to L3; of these, L3 has two major subclades, M and N, which entered the Indian subcontinent approximately 60,000 ybp and later spread to South East Asia and Australia.
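The relationship between these clades can be written down as a small parent map. The Python sketch below is a deliberately flattened illustration of the description above: it places L0 to L3 directly under L (L2 is assumed from the phrase "L1 to L3") and M and N under L3. The names `PARENT` and `lineage` are hypothetical, and the map is not a substitute for a full mtDNA phylogeny.

```python
# Simplified parent map following the description in the text.
PARENT = {
    "L0": "L",
    "L1": "L",
    "L2": "L",  # assumed from "L1 to L3"
    "L3": "L",
    "M": "L3",
    "N": "L3",
}


def lineage(clade: str) -> list:
    """Trace a clade back to the root macro-haplogroup."""
    path = [clade]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


print(lineage("M"))  # ['M', 'L3', 'L']
print(lineage("N"))  # ['N', 'L3', 'L']
```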

First, the Indian subcontinent became a hot spot for mitochondrial DNA divergence due to the Pleistocene population expansion. Approximately 60% of the overall human population lived in that region around 38,000 years ago. Second, the salient migratory events at several different time periods along the Silk Route between the widespread empire of China and India contributed much to the population diversity.

Third, the “Fertile Crescent” (the area between Southeast Anatolia and the Zagros mountains) had a history of urban civilization as early as 3,000 ybp (years before present) and served as a channel for human migration between Mesopotamia and the Iranian Plateau, including the north-western part of Pakistan.


The haplogroups observed belong mainly to lineages that can be ascribed to South Asia, West Eurasia and East Eurasia. The East Eurasian component consisted mainly of haplogroups N, N9 and N11, while the South Asian component was represented by major haplogroups of the M lineage: M3, M5, M18, M30 and M40.

Table 3.2. Haplogroup distribution in 22 ethnic groups of the Pakistani population:

Sample Coading region np 16024n - 16569 np 001-437 np 362-576 SL_1,6, 12 10400T 12705T 16126C 16193T 16223T 73G 263G 315.1C 482C 489C 523d 524d M3 RJ_39 10400T 12705T 16223T 16311C 73G 195A 146C 263G 309.1C 315.1C M30c SD_4 10400T 12705T 16223T 16311C 73G 195A 146C 263G 309.1C 315.1C 489C M30c SD_6 10400T 12705T 16093 16223T 16311C 16319A 73G 195C 246C 263G 309.1C 315.1C 482C 489C M38c CN_2,8,10 10400T 12705T 16126C 16193T 16223T 73G 207A 263G 315.1C 489C M3a MZ_4 10400T 12705T 16126C 16189C 16179T 16294T 16223T 73G 152C 263G 309.1C 315.1C 482C 489C 523d 524d M3c1b SD_14 10400T 12705T 16179T 16463G 16223T 73G 263G 309.1C 315.1C 482C 489C M40 KZ_4 10400T 12705T 16179T 16319A 16463G 16223T 16356C 73G 263G 309.1C 315.1C 489C M40a SD_3 10400T 12705T 16189C 16223T 16311C 73G 146C 263G 309.1C 315.1C 489C M45 RJ_34 10400T 12705T 16189C 16223T 16311C 73G 146C 263G 309.1C 315.1C 489C M45 KS_1 10400T 12705T 16093C 16234T 16223T 73G 263G 309.1C 315.1C 489C 523d 524d M 16129A 16182G 16183C 16193.1C 16234T 16290T KS_3 M12 10400T 12705T 16362C 16223T 73G 263G 309.1C 315.1C 489C NS_13 10400T 12705T 16182C 16234T 16290T 16223T 73G 263G 309.1C 315.1C 489C M12 KT_10 10400T 12705T 16183C 16234T 16290T 16223T 73G 263G 309.1C 315.1C 489C 309.2C 315.1C M12 KZ_10 10400T 12705T 16126T 16234T 16290T 16223T 73G 263G 309.1C 315.1C 489C 523d 524d M12 AY_28 10400T 12705T 16093C 16223T 16318T 16325C 73G 93G 194T 246C 263G 309.1C 315.1C 489C M18a AY_31 10400T 12705T 16093C 16223T 16318T 16325C 73G 93G 194T 246C 263G 309.1C 315.1C 489C 523d 524d M18a BJ_8, 20 10400T 12705T 16126C 16223T 16318T 16311C 16325C 73G 93G 152C 194T 246C 263G 309.1C 315.1C482C 489C 523d 524d M18a AY_11, 40, 47, 49 10400T 12705T 16126T 16223T 16318T 16325C 73G 93G 194T 246C 263G 309.1C 315.1C 489C M18a RJ_32 10400T 12705T 16223T 16311C 16318T 16325C 73G 152C 246C 263G 309.1C 315.1C 489C 523d 524d M18b GJ_40 10400T 12705T 16183C 16234T 16223T 16278T 16311C 73G 195A 263G 309.1C 315.1C 489C 523d 524d M30b CN_4 10400T 12705T 
16126C 16223T 16311C 73G 195A 246C 146C 263G 309.2C 315.1C 489C 513A 523d 524d M30c CN_14,17 10400T 12705T 16189C 16223T 16311C 73G 195A 146C 263G 309.1C 315.1C 482C 489C M30c CN_5, 16 10400T 12705T 16129A 16234T 16223T 16311C 152C 195A 246C 73G 263G 309.1C 315.1C 482C 489C 523d 524d M30e GH_15 10400T 12705T 16126C 16234T 16223T 16311C 73G 146C 152C 195A 263G 309.1C 315.1C 489C 513A M30e KT_8, 14 10400T 12705T 16126C 16234T 16223T 16311C 73G 146C 152C 195A 263G 309.1C 315.1C 489C M30e KZ_12 10400T 12705T 16234T 16223T 16311C 73G 152C 195A 263G 309.1C 315.1C 489C M30e GJ_8, 10400T 12705T 16093C 16169T 16223T 16294T 16311C 73G 195C 263G 309.1C 315.1C 489C M33a1a RJ_42 10400T 12705T 16093 16223T 16311C 16319A 73G 199C 263G 309.1C 315.1C 482C 489C M35a1a KZ_8 10400T 12705T 16185t 16126C 16193t 16223T 73G 482C 207A 263G 315.1C 482C 489C M3a+204 KZ_7 10400T 12705T 16185t 16126C 16193t 16223T 73G 482C 207A 263G 315.1C 482C 489C M3b RJ_37 10400T 12705T 16145A 16223T 16311C 73G 152C 263G 315.1C 489C 523d 524d M4 KZ_1 10400T 12705T 16145A 16179T 16319A 16223T 16294T 16356C 73G 263G 309.1C 315.1C 489C 523d 524d M40a SD_2 10400T 12705T 16182G 16234T 16153A 16223T 73G 263G 309.1C 315.1C 447G 489C 523d 524d M49a SD_8 10400T 12705T 16234T 16153A 16223T 16278T 73G 263G 309.1C 315.1C 489C 513A M49a SD_12 10400T 12705T 16234T 16153A 16223T 16278T 16311C 73G 146C 152C 263G 309.1C 315.1C 489C 523d 524d M49a AY_33 10400T 12705T 16234T 16223T 16278T 16311C 73G 146C 152C 263G 309.1C 315.1C 489C M49c SD_1 10400T 12705T 16234T 16223T 16278T 16311C 73G 146C 152C 263G 309.1C 315.1C 489C 523d 524d M49c SD_5 10400T 12705T 16234T 16223T 16278T 16311C 73G 146C 152C 263G 309.1C 315.1C 489C 523d 524d M49c SD_9 10400T 12705T 16234T 16223T 16278T 16311C 73G 146C 152C 263G 309.1C 315.1C 489C 523d 524d M49c GH_10,19 10400T 12705T 16126C 16129A 16223T 73G 152C 263G 309.1C 315.1C 447G 489C 523d 524d M5 KS_9 10400T 12705T 16129A 16182C 16223T 73G 263G 309.1C 315.1C 489C M5 GJ_11, 14, 36, 15, 5010400T 
12705T 16129A 16182C 16223T 16519C 73G 195C 263G 309.2C 315.1C 489C 523d 524d M5 GJ_6 10400T 12705T 16169T 16182C 16193.1C 16223T 16438A 16519C 73G 152C 263G 309.1C 315.1C 489C M52b1 KZ_13 10400T 12705T 16275G 16390A 16223T 16438A 73G 152C 263G 309.1C 315.1C 489C 523d 524d M52b1 BJ_12 10400T 12705T 16126C 16129A 16192 16223T 16291T 73G 152C 263G 309.1C 315.1C 482C 489C 523d 524d M5a1 NS_1, 6, 10 10400T 12705T 16129A 16169T 16223T 16291T 73G 195C 263G 309.1C 315.1C 489C M5a1 NS_16 10400T 12705T 16193C 16129A 16291T 16223T 16319A 73G 263G 309.2C 315.1C 489C M5a1 NS_17 10400T 12705T 16129A 16192T 16223T 73G 152C 263G 309.1C 315.1C 334C 489C 523d 524d M5a1


16093 16183C 16129A 16291T 16129A 16223T NS_18 M5a1 10400T 12705T 16291T 16311C 73G 263G 309.1C 315.1C 489C KZ_9 10400T 12705T 16129A 16291T 16223T 73G 152c 263G 309.1C 315.1C 489C 523d 524d M5a1 BJ_5 10400T 12705T 16129A 16223T 16231C 16356C 16362C 73G 263G 309.1C 315.1C 461T 489C 514d 515d M6a1a NS_7 10400T 12705T 16129A 16184T 16256G 16223T 16362C 16311C 73G 152T 195C 263G 309.2C 315.1C 461T 489C M6b KZ_11 10400T 12705T 16126C 16184T 16256g 16223T 16362C 73G 263G 309.1C 315.1C 489C M6B AY_22 10400T 12705T 16145A 16184T 16223T 73G 152C 263G 315.1C 489C 523d 524d M7b2 RJ_50 10400T 12705T 16125A 16184T 16223T 73G 146C 152C 263G 315.1C 489C M7b2 SD_29 12308G 16256T 16309G 16318T 73G 151T 152C 263G 315.1C ??? BG_14 16129A 16182C 16223T 16362C 73G 263G 152C 489C 309.2C 315.3C D4a BT_9 7028C 16182C 263G 309.2C 315.3C H1 CN_13 7028C 16162G 16311C 152C 195C 73G 263G 309.2C 315.1C H1a BT_11 7028C 16193.1C 16183C 16162G 16357C 73G 263G 309.1C 315.1C H1a MZ_2, MZ_8 7028C 16162G 16223T 16311C 73G 309.5C 315.7C 523d 524d H1a KS_17 7028C 16162G 16209C 16291T 16292T 73G 263G 309.1C 315.1C 523d 524d H1a1 BG_6 7028C 16162G 16182C 16209C 16357C 16519C 73G 263G 309.1C 315.3C H1a1 BG_9 7028C 16051G 16162G 16357C 73G 263G 309.3C 315.3C H1a3 BT_7, BT_10 7028C 16051G 16162G 16298C 16357C 16223T 73G 263G 309.1C 315.1C H1a3 SL_11, 14 7028C 16270T 16357C 263G 309.1C 315.2C H1ba SL_3, 5, 15 7028C 16193.1C 73G 263G H1e2c BT_8 7028C 16354T 73A 263G 309.1C 315.1C H2a1 MZ_1 7028C 16311C 16223T 152C 263G 309.3C 315.2C 523d 524d H2b BG_12 16093C 16298C 16519C 72C 263G 309.4C 315.5C 523d 524d HV0 BG_2,22 16126C 16298C 16519C 72C 263G 309.1C 315.1C Hv0a1 BG_15 16067T 16183C 16519C 263G 309.2C 315.3C 523d 524d HV1 BJ_4 16217C 16325C 16311T 73G 152C 263G 309.2C 315.1C 523d 524d HV2 KS_18 16217C 16325C 73G 152C 263G HV2 SL_7 16217C 16182C 16362C 73G 152C 195C 263G 309.2C 315.3C 523d 524d HV2 BG_5,23 16217C 16182C 16183C 73G 152C 195C 263G 309.1C 315.3C HV2 BT_3, BT_13 16093C 16129A 16217C 16223T 73G 
152C 195C 263G 309.1C 315.1C HV2 LG_4 16217C 16182C 16183C 16223T 73G 152C 195C 263G 309.1C 315.3C HV2 LS_5, LS_7, LS_9 16189C 16223T 16278T 16217C 16294T 73G 152C 195C 263G HV2 KS_13 16214T 16217C 16335G 72C 73G 152C 195C 246C 263G 309.1C 315.1C HV2a GJ_22, 30 16182C 16214T 16217C 16335G 16223T 73G 152C 195C 246C 263G 309.1C 315.1C HV2a CN_7 10398G 16069T 16126C 16145A 16260T 73G 146C 263G 295T 309.1C 315.1C 462T 489C 523d 524d J1 KT_5,16 10398G 16069T 16126C 16193.1C 16519C 73G 263G 295T 462T 489C 309.1C 315.1C 523d 524d J1 BT_4 10398G 16069T 16126C 16223T 73G 263G 295T 309.2C 315.3C 462T 489C 595T J1 GH_9 10398G 16069T 16126C 16145A 16222T 16261T 16292T 73G 152C 263G 295T 309.1C 315.1C 462T 489C J1b 16069T 16126C 16182C 16145A 16222T 16261T KT_1, 11, 13, 15 J1b 10398G 16519C 73G 263G 309.1C 315.1C 295T 462T 489C 523d 524d GJ_20 10398G 16069T 16126C 16183C 16223T 16519C 73G 242T 263G 309.1C 315.1C 462T 489C J1b1a KT_2 10398G 16069T 16126C 16375T 73G 150T 152C 195C 263G 295T 309.3C 315.3C489C J2a BT_5, BT_14 10398G 16069T 16126C 16223T 16375T 73G 150T 152C 195C 263G 295T 309.3C 315.3C489C J2a1 KT_3 10398G 16126C 16182C 16519C 73G 263G 309.1C 315.1C 523d 524d JT BT_2 16126C 16182C 16223T 16519C 73G 263G 523d 524d JT BG_8,24 16129A 16182C 16224C 16234T 16311C 16519C K1a1b1a 73G 114T 152C 263G 497T KT_12, 16, 16093C 16172C 16224C 16234T 16311C 16519C K1a1b1a 21,23,24 73G 114T 263G 309.1C 315.2C 497T AY_34, 39, 45 10400T 12705T 16223T 16372C 73G 263G 315.1C 489C M BJ_2, 16, 17 10400T 12705T 16182C 16223T 73G 263G 315.1C 489C M CN_12 10400T 12705T 16126C 16183C 16189C 16223T 73G 263G 489C 523d 524d M GJ_19 10400T 12705T 73G 263G 309.1C 315.1C 489C M RJ_01,2,7,8,12,15 10400T 12705T 16193T 16223T 73G 263G 315.1C 489C M

120 RJ_01,2,7,8,12,15 10400T 12705T 16193T 16223T 73G 263G 315.1C 489C M RJ_40 10400T 12705T 16223T 16372C 73G 263G 489C M RJ_35 10400T 12705T 16129A 16223T,16311C 73G 263G 489C M10A1+16129 KT_4 10400T 12705T 16183C 16318T 16223T 16519C 73G 246C 263G 309.1C 315.1C 489C 518d 519d M18 BT_12 10400T 12705T 16311C 16318T 16223T 16519C 246C 263G 309.1C 315.1C 489C M18 GJ_23 10400T 12705T 16183C 16223T 16311C 16318T 16519C 73G 93G 194T 246C 263G 309.2C 315.3C 489C M18a RJ_30, 31 10400T 12705T 16223T 16311C 16318T 73G 263G 309.1C 315.1C 489C M18C SD_11 10400T 12705T 16223T 16270T 16274A 16292T 16319A 16352C 73G 263G 309.1C 315.1C 447G 489C M2a1 KZ_2 10400T 12705T 16223T 16270T 16274A 16292T 16319A 16352C 73G 204C 263G 315.1C 489C M2a1 SD_18 10400T 12705T 16223T 16270T 16311C 16319A 16352C 16362C 73G 195C 204C 263G 309.1C 315.1C M2a1a CN_9 10400T 12705T 16129A 16223T 16240C 16274A 16311C 16319A 73G 263G 315.1C 447G 489C M2a2 CN_15 10400T 12705T 16223T 16274A 16311C 16319A 73G 146C 263G 309.1C 315.1C 447G 489C 575T M2a3 CN_3 10400T 12705T 16223T 16265C 16274A 16311C 16319A 73G 146C 263G 315.1C 447G 489C M2a3a CN_18 10400T 12705T 16129A 16126C 16223T 16217C 73G 263G 309.1C 315.2C 482C 489C 523d 524d M3 GH_16 10400T 12705T 16111T 16126C 16223T 16362C 73G 263G 30.2C 315.1C 482C 489C M3 KS_11 10400T 12705T 16093C 16126C 16223T 73G 263G 309.1C 315.1C 482C 489C 513A M3 KT_6 10400T 12705T 16126C 16223T 16311C 16356C 73G 263G 309.3C 315.1C 482C 489C M3 KT_9, 20 10400T 12705T 16111T 16126C 16223T 16362C 16519C 73G 263G 30.2C 315.1C 309.2C 315.3C 482C 489C M3 LS_1, LS_6 16126C 16223T 16379T LS_10 10400T 12705T 73G 263G 309.1C 315.1C 482C 489C M3 GJ_3, 28,29,39,48 10400T 12705T 16129A 16126C 16182C 16223T 73G 146C 152C 263G 309.1C 315.1C 482C 489C M3 RJ_20 10400T 12705T 16234T 16223T 73G 152C 195A 263G 309.1C 315.1C 489C 523d 524d M30e KZ_5 10400T 12705T 16172C 16223T 16294T 16311C 73G 263G 309.1C 315.1C 489C M33a RJ_33 10400T 12705T 16169T 16172C 16223T 16311C 73G 263G 489C M33A2 
RJ_36 10400T 12705T 16172C 16223T 16311C 73G 263G 489C M33A2'3 GJ_1 10400T 12705T 16182C 16223T 16324C 16325C 16327T 16362C 73G 263G 150T 152C 309.2C 315.3C 489C M33b RJ_48 10400T 12705T 16223T 16324C 73G 195C 228A 263G 489C M33B RJ_49 10400T 12705T 16223T 16324C 16362C 73G 195C 228A 263G 489C M33B2 SD_16 10400T 12705T 16223T 16324C 16362C 73G 195C 228A 263G 489C M33B2 SD_13 10400T 12705T 16111T 16223T 16292T 16294T 16309G 16362C 73G 263G 309.1C 315.1C 489C M33c GH_12 10400T 12705T 16093C 16152C 16223T 16246G 16319A 73G 146C 152C 199C 263G 309.1C 315.1C 482C 489C M35a1a GH_8 10400T 12705T 16126C 16223T 16356C 16362C 73G 263G 309.1C 315.1C 482C 489C M3a GH_13 10400T 12705T 16126C 16223T 16311C 16356C 73G 263G 309.3C 315.1C 482C 489C M3a NS_4, 14 10400T 12705T 16126C 16223T 16271C 16379T 73G 263G 309.1C 315.1C 482C 489C M3a RJ_21, 24, 29 10400T 12705T 16126C 16234T 16223T 73G 263G 309.1C 315.1C 482C 489C 523d 524d M3a RJ_22 10400T 12705T 16126C 16234T 16223T 73G 482C 263G 315.1C 489C 523d 524d M3a+204 LG_5, LG_10, M3a1 LG_14, LG_15 10400T 12705T 16126C 16223T 16519C 73G 263G 204C 315.1C 482C 489C KS_8 10400T 12705T 16126C 16183C 16223T 73G 150T 152C 263G 315.1C 482C 489C 523d 524d M3a2a KS_16 10400T 12705T 16093C 16126C 16223T 73G 150T 152C 263G 315.1C 482C 489C M3a2a GJ_25, 44 10400T 12705T 16126C 16193.1C 16223T 16311C 16519C 73G 150T 195C 263G 309.3C 315.2C 482C 489C M3a2a RJ_26 10400T 12705T 16126C 16193T 16223T 73G 482C 263G 315.1C 489C 523d 524d M3b BG_13, 19 10400T 12705T 16126C 16189C 16223T 16294T 73G 263G 309.1C 315.1C 482C 489C M3C1 LS_2, LS_8 10400T 12705T 16126C 16182C 16189C 16294T 16223T 73G 152C 263G 309.1C 315.1C 482C 489C 523d 524d M3c1 GJ_21, 35 10400T 12705T 16093C 16126C 16182C 16189C 16223T 73G 152C 263G 309.1C 315.2C 482C 489C M3c1 SD_10 10400T 12705T 16223T 16189C 16294T 73G 152C 234G 263G 309.1C 315.1C 482C 489C M3c1 SD_7 10400T 12705T 16223T 16189C 16294T 16124C 73G 152C 234G 263G 309.1C 315.1C 482C 489C M3c1b1 BG_1 10400T 12705T 16126C 
16193.1C 16223T 16154 73G 263G 152C 309.1C 315.2C 482C 489C M3C2 AY_25, 48 10400T 12705T 16145A 16155T 16220T 16223T 16261T 16311C 73G 152C 263G 309.1C 315.1C 482C 489C M4 CN_11 10400T 12705T 16145A 16155T 16223T 16261T 16311C 73G 152C 263G 309.1C 315.1C 482C 489C M4 RJ_28 10400T 12705T 16223T 16327T 16330C 73G 263G 315.1C 489C M41 RJ_18 10400T 12705T 16223T 16189C 73G 146C 263G 315.1C 489C M45 RJ_41 10400T 12705T 16189C 16223T 73G 146C 263G 489C M45


RJ_10 10400T 12705T 16223T 16362C 73G 152C 146C 263G 309.1C 315.1C 489C M46 RJ_27 10400T 12705T 16140C 16223T 16362C 73G 146C 152C 263G 489C M46 KZ_6 10400T 12705T 16223T 16233G 16126C 16234T 73G 263G 315.1C 489C M49 RJ_05 10400T 12705T 16234T 16153A 16223T 73G 263G 309.1C 315.1C 447G 489C M49a AY_26, 38, 43 10400T 12705T 16169C 16129A 16223T 73G 263G 309.1C 315.1C 482C 489C M5 BJ_18 10400T 12705T 16223T 16274A 16129A 152C 73G 263G 309.1C 315.1C 482C 489C 523d 524d M5 CN_1 10400T 12705T 16126C 16129A 16223T 16270T 16311C 16515C 73G 195C 204C 152C 263G 309.1C 315.1C 482C 489C M5 MZ_10 10400T 12705T 16129A 16183C 16223T 73G 263G 309.4C 315.3C 489C 523d 524d M5 RJ_03,4,16,19 10400T 12705T 16129A 16223T 73G 263G 315.1C 489C M5 SD_15 10400T 12705T 16129A 16223T 16270T 16311C 73G 195C 204C 152C 263G 309.1C 315.1C 489C M5 KZ_3 10400T 12705T 16129A 16144C 16223T 16270T 16274A 73G 204C 263G 315.1C 489C M5 RJ_38 10400T 12705T 16275G 16390A 16223T 16327A 16438A 73G 152C 263G 309.1C 315.1C 489C M52B1 RJ_25 10400T 12705T 16223 16294 73G 146C 189 263G 489C M57B LG_6, LG_13 10400T 12705T 16129A 16182C 16291T 16223T 73G 263G 309.1C 315.1C 489C 523d 524d M5A1 MZ_3 10400T 12705T 16129A 16220T 16223T 16291T 16519C 73G 263G 309.1C 315.1C 489C M5a1 RJ_09 10400T 12705T 16129A 16223T 16291T 73G 263G 309.1C 315.1C 489C M5a1 RJ_23 10400T 12705T 16193 16144A 16223 73G 234G 263G 489C M5a2a SD_17 10400T 12705T 16193 16144A 16223 73G 234G 263G 309.1C 315.1C 489C M5a2a AY_4 10400T 12705T 16129A 16223T 16261T 73G 150T 263G 309.1 315.1C 489C M5C1 GJ_9, 32,33,37 10400T 12705T 16129A 16182C 16223T 73G 150T 263G 309.1C 315.1C 489C 575T M5C1 RJ_11 10400T 12705T 16188T 16223T 16231C 73G 146C 263G 309.1C 315.1C 489C M6a1b SD_19 12705T 15301G 16086C 16223T 16335G 16362C 73G 263G 309.1C 315.1C N NS_15, 19, 20 12705T 15301G 16146G 16172C 16223T 16319A 16362C 73G 263G 309.1C 315.1C N10 GJ_17 12705T 15301G 16069T 16193.1C 16172C 16189C 16223T 73G 150T 152C 199C 263G 309.2C 315.1C N10b NS_2 12705T 15301G 
16223T 16355T 16362C 16311C 73G 195C 263G 309.1C 315.1C 523d 524d N11a AY_3 12705T 15301G 16189C 16223T 16355T 73G 146C 152C 195C 263G 309.1C 315.1C N11a1 AY_15 12705T 15301G 16111T 16129A 16223T 16257A16261T 73G 150T 263G 309.1C 315.1C 489C N1a AY_19 12705T 15301G 16223T 16362C 73G 146C 207A 263G 309.1C 315.1C N1a1 KT_17 12705T 15301G 16176G 16390A 16223T 73G 152C 263G 309.1C 315.2C 523d 524d N1b KS_7, 19 12705T 15301G 16145A 16176G 16223T 16390A 73G 152C 204C 263G 309.1C 315.1C N1b1 BG_ 16,25 12705T 15301G 16176G 16390A 16145A 16223T 73G 152C 263G 309.1C 315.2C N1b1 AY_27 12705T 15301G 16220T 16223T 73G 263G 315.C N2 KS_2 12705T 15301G 16182C 16223T 73G 189G 263G 309.2C 315.1C N2 SD_20 12705T 15301G 6111T 16192T 16223T 16311C 73G 263G 309.1C 315.1C 456T N5 SD_22 12705T 15301G 16189C 16223T 16292T 73G 263G 309.1C 315.1C N9b KZ_14 12705T 15301G 16189C 16223T 16362C 73G 94A 263G 309.1C 315.1C N9B1C GJ_12 12705T 15301G 16189C 16223T 16294T 73G 150T 199C 204C 263G 309.1C 315.1C N9b2 KZ_15 12705T 15301G 16172C 16189C 16223T 16294T 73G 146C 152C 263G 309.1C 315.1C N9b2 NS_12 12705T 15301G 16187T 16189C 16223T 16294T 16309G 16379T 73G 263G 309.1C 315.1C N9b2a SD_21 12705T 15301G 16051G 16184T 16189C 16223T 16294T 16309G 73G 263G 309.1C 315.1C N9b2a SD_23 12705T 15301G 16189C 16223T 16292T 16294T 16309G 73G 263G 309.1C 315.1C N9b2a LS_3 12705T 16071T 16193T 73G 152C 263G 309.1C 315.1C R2 KZ_21 12705T 16129A 16223T 16262T 16071T 73G 152C 263G 315.1C R2 BT_1, BT_15 12705T 16071T 16172C 16182C 16183C 16223T 73G 152C 195C 263G 357G 309.3C 315.3C R2+195 SD_24 12705T 16071T 16293G 16311C 73G 195C 204C 152C 263G 309.1C 315.1C R2+195 KT_19 12705T 16071T 16223T 73G 152C 195C 263G 309.1C 315.1C R2a LG_2 12705T 16071T 16223T 73G 150T 152C 195C 263G 309.1C 315.1C 523d 524d R2a MZ_5 12705T 16183C 16071T 16223T 73G 152C 195C 263G 309.2C 315.2C 523d 524d R2a LS_4 12705T 16071T 73G 195C 263G 309.1C 315.1C 523d 524d R2a BJ_15 12705T 16071T 146C 152C 73G 16223C 263G 513A 523d 524d R2b


LG_1, LG_9, R2b LG_11 12705T 16071T 16193.1C 16223T 73G 146C 152C 195C 263G 309.1C 315.1C 523d 524d AY_7, 37, 46 12705T 16209C 16223T 73G 152C 263G 315.1C R30a1b KZ_16 12705T 16209C 73G 145T 263G 315.1C R30a1b AY_20, 42 12705T 16362C 16223T 73G 263G 309.1C R31 AY_30 12705T 16183G 16223T 16189C 16362C 73G 263G 309.1C 315.1C R31 GJ_27, 34, 43,49 12705T 16189C 16223T 16362C 16519C 73G 263G 309.1C 315.3C R31 GJ_41 12705T 16223T 16172C 16183C 16362C 73G 152C 263G 315.1C R31a BJ_19 12705T 16093C 16182C 16304C 16311C 73G 263G 309.1C 315.1C 523d 524d R5 AY_32 12705T 16223T 16220T 16266T 16304C 73G 152C 263G 309.1C 315.1C R5a BJ_3 12705T 16223T 16266T 16304C 73G 152C 263G 309.1C 315.1C R5a KT_18 12705T 16266T 16304C 16524G 16311C 73G 263G 309.3C 315.1C R5a KZ_24 12705T 16266T 16304C 16311C 16355T 16356C 16093C 73G 152C 263G 309.1C R5a2 KS_4 12705T 16182G 16304C 16311C 16356C 73G 152C 263G 309.2C 315.3C 523d 524d R5a2a BJ_1, 10 12705T 16129A 16362C 16266T 73G 195C 263G 309.1C 315.1C R6 AY_12 12705T 16129A 16223T 16266T 16362C 73G 146C 195C 263G 309.1C R6+16129 GH_7 12705T 16129A 16093C 16189C 16266T 16311C 16362C 73G 195C 228A 195C 263G 309.1C 315.1C R6a AY_16 12705T 16129A 16223T 16318G 16320T 16362C 73G 228A 263G 309.1C 315.1C R6a1 AY_23 12705T 16129A 16223T 16266T 16318G 73G 195C 228A263G 309.1C 315.1C R6A1 KZ_19 12705T 16311C 16319A 16362C 73G 263G R7 KZ_22 12705T 16274A 16319A 16362C 73G 263G 309.1C 315.1C R7 AY_1, 9 12705T 16146G 16223T 16311C 16319A 16362C 73G 195C 263G 309.1C 315.1C 523d 524d R7a1a NS_11 12705T 16146G 16260T 16261T 16319A 16362C 73G 152C 263G 309.1C 315.1C R7a1a KZ_2 12705T 16145A 16146G 16260T 16261T 16311C 16319A 16362C73G 146C 263G 309.1C 315.1C R7a1a KZ_20 12705T 16145A 16260T 16261T 16311C 16319A 16362C 16390A73G 263G 309.1C 315.1C R7a1b2 GJ_5,7, 31, 38 12705T 16183C 16223T 16274A 16260T 16319A 16362C 73G 195C 263G 309.1C 315.2C R7a'b GJ_13 12705T 16193.1C 16223T 16260T 16319A 16362C 16311C 73G 146C 263G 309.1C 315.1C R7b KZ_26 12705T 16145A 
16260T 16261T 16311C 16319A 16362C 73G 146C 263G 309.1C 315.1C R7b GJ_10,47 12705T 16129A 16223T 16260T 16311C 16319A 16362C 73G 146C 152C 195C 263G 309.1C R7b1a KZ_18 12705T 16145A 16221T 16260T 16261T 16311C 16319A 16362C73G 146C 152C 263G 309.1C 315.1C R7b1a AY_18, 24 12705T 16223T 16294T 16311C 16355T 73G 152C 195C 263G 309.1C 315.1C R8 KZ_25 12705T 16294T 16311C 16355T 73G 152C 195C 263G 309.1C 315.1C R8 AY_29 12705T 16153A 16223T 16302G 73G 263G 309.1C 315.1C 456T R8A1A1B SD_25 12705T 16153A 16302G 73G 263G 315.1C 456T R8A1A1B KZ_17 12705T 16209C 16254G 16292T 73G 145T 152C 195C 263G 315.1C R8a1a3 KS_14 131368A 16126C 16189C 16294T 73G 152C 195C 263G 309.1C 315.1C 489C T1 BG_17 131368A 16126C 16163G 16182C 16189C 16294T 73G 263G 309.3C 315.3C T1 BG_4 131368A 16126C 16183C 16163G 16189C 16294T 73G 263G 309.2C 315.1C 309.3C 315.1C T1 BT_6 131368A 16126C 16163G 16189C 16223T 16294T 73G 263G 309.2C 315.1C T1 MZ_9 131368A 16126C 16163G 16189C 16223T 16294T 73G 263G 309.1C 315.2C T1 GJ_24 131368A 16126C 16186T 16189C 16223T 16294T 73G 146C 263G 309.2C 315.3C T1a BG_7 131368A 16126C 16294T 16296T 16519C 73G 263G 309.1C 315.1C T2 AY_2 12308G 16051G 16183C 16223T 16519C 73G 263G 309.1C 315.1C U2 SL_13 12308G 16051G 16172C 16182C 16230G 73G 263G 315.1C U2 BG_18,20 12308G 16051G 16193.1C 16519C 195C 73G 263G 309.2C 315.4C U2 MZ_7 12308G 16051G 73G 195C 263G 309.2C 315.4C 523d 524d U2 SD_33 12308G 16051G 16193A 16172C 16192T 16209C 73G 146C 263G 315.1C U2 SD_35 12308G 16051G 16172C 16206G 16230G 73G 263G 315.1C U2 KZ_31 12308G 16051G 16193A 16172C 16192T 16209C 73G 146C 263G 315.1C U2 AY_10 12308G 16051G 16129A 16223T 73G 263G 152C 315.1C U2+152 AY_6 12308G 16051G 16206C 16223T 73G 195C 263G 315.1C U2a


AY_35 12308G 16051G 16206G 16223T 73G 263G 315.1C U2a GH_4,20 12308G 16051G 16192T 16206C 16223T 16362C 73G 189G 263G 309.2C 315.1C U2a AY_13 12308G 16051G 16223T 16145C 73G 146C 263G 309.1C 315.1C U2b CN_6 12308G 16051G 16189C 16270T 16274A 73G 146C 263G 309.1C 315.1C U2b CN_19, 20 12308G 16193C 16051G 16172C 16192T 73G 146C 234G 263G 39.1C 315.1C 523d 524d U2b GH_14,18 12308G 16051G 16206C 16223T 16311C 73G 146C 263G 309.3C 315.1C U2b KS_6 12308G 16051G 16193.1C 16356C 16362C 73G 146C 152C 263G 309.1C 315.1C U2b KS_20 12308G 16051G 16189C 16274A 16311C 73G 146C 263G 309.1C 315.1C U2b KT_7, 22,25 12308G 16051G 16182C 16311C 73G 146C 263G 309.3C 315.1C U2b MZ_6 12308G 16051G 16223T 16334C 16350G 16372C 73G 146C 152C 263G 309.1C 315.1C U2b SD_37 12308G 16051G 16206G 16260T 73G 146C 152C 263G 309.1C 315.1C U2b KZ_30 12308G 16051G 16126C 73G 146C 152C 263G 309.1C U2B NS_5, 8, 9 12308G 16051G 16126C 16168T 16182C 16234T 73G 146C 152C 263G 309.1C 315.1C 523d 524d U2b1 KZ_35 12308G 16051G 16126C 16168T 73G 146C 152C 263G 309.1C 315.1C U2b1 16051G 16192T 16209C 16239T 16352C 16353T SL_2 U2b2 12308G 16311C 73G 152C 234G 263G 309.1C 315.1C GJ_16 12308G 16051G 16209C 16223T 16352C 16519C 73G 146C 263G 309.1C 315.1C U2b2 SD_26 12308G 16051G 16192T 16209C 16239T 16352C 16353T 16311C73G 234G 263G 309.1C 315.1C U2b2 SD_31 12308G 16051G 16209C 16239T 16353T 16362C 73G 146C 152C 263G 309.1C 315.1C U2b2 SD_36 12308G 16051G 16192T 16209C 16239T 16352C 16353T 16362C73G 146C 152C 234G 263G 309.1C U2b2 KZ_29 12308G 16051G 16209C 16239T 16353T 16362C 73G 146C 152C 263G U2b2 KZ_34 12308G 16051G 16126C 16234T 73G 146C 152C 263G 309.1C 315.1C U2c 16098T 16051G 16179T 16240C 16223T 16234T KS_12 U2c1a 12308G 16372C 73G 146C 152C 263G 309.1C SL_10 12308G 16051G 161931.C 16126C 16179T 16240C 16234T 73G 152C 263G 309.1C 315.1C U2c1a SD_27 12308G 16051G 16179T 16234T 16240C 16242T 16362C 73G 152C 263G 309.1C U2c1a KZ_28 12308G 16051G 16126C 16179T 16240C 73G 152C 263G 309.1C 315.1C U2c1a SD_38 
12308G 16051G 16218T 16234T 16362C 73G 152C 263G 315.1C U2cd AY_8 12308G 16051G 16129C 16234T 16189C 16223T 16362C 73G 152C 263G 309.1C U2E KZ_32 12308G 16051G 16129C 16189C 16362C 73G 152C 263G 309.1C U2E SD_40 12308G 16343G 16356C 16392A 73G 146C 150T 234G 263G 309.1C 315.1C U3a1c LG_7, LG_12 12308G 16192T 16223T 16256T 16270T 73G 263G 309.2C 315.3C U5a LG_3 12308G 16192T 16223T 16256T 16270T 16399G 73G 263G 309.2C 315.1C 523d 524d U5a1 SL_8, 9 12308G 16093C 16182C 16189C 16192T 16270T 16372C 73G 150T 263G 309.1C 315.1C U5b1b SD_34 12308G 16172C 16189C 16192T 16209C 16270T 16311C 73G 150T 263G 309.1C 315.1C U5b1c SD_39 12308G 73G 146C 150T 234G 263G 309.1C 315.1C U5b2a1a2 BJ_11,13 12308G 16051G 16189C 16192T 16270T 16304C 16335G 73G 150T 195C 228A 263G 309.2C 315.1C U5b3 BJ_9 12308G 16129A 16192T 16270T 16304C 16311C 73G 150T 152C 189G 228A 263G 309.1C 315.1C573T U5b3b1 AY_5 12308G 16223T 16318T 16309G 73G 152C 263G 309.1 315.1C U7 AY_14, 36,41, 50 12308G 16223T 16309G 16318T 16256T 73G 152C 263G 309.1C 315.1C U7 GH_1 12308G 16051G 16093C 16309G 16318T 16342C 73G 150T 152C 263G 309.1C 315.1C U7 SL_4 12308G 16126C 16183C 16309G 16318T 73G 263G 150T 152C 309.1C 315.1C 523d 524d U7 LG_8 12308G 16183C 16318T 16309G 16223T 16519C 73G 152C 263G 309.1C 315.3C U7 GJ_4, 26,42,45,46 12308G 16223T 16256T 16309G 16318T 16362C 16519C 73G 146C 152C 263G 309.1C U7 SD_28 12308G 16120G 16256T 16309G 16318T 73G 152C 263G309.1C U7 SD_30 12308G 16256T 16309G 16318T 73G 152C 263G 309.1C 315.1C U7 SD_32 12308G 16318T 16309G 73G 263G U7 KZ_27 12308G 16318T 16309G 73G 263G 309.1C 315.1C U7 KZ_33 12308G 16309G 16318T 16256T 73G 152C 263G 309.1C 315.1C U7 GH_2 12308G 16309G 16318T 16355T 73G 263G 151T 152C 309.1C 315.1C 523d 524d U7a BG_3, 11 12308G 16093C 16309G 16318T 73G 151T 152C 263G 309.1C 315.1C U7a KS_5 12308G 16126C 16151T 16271C 16309G 16318T 16355T 73G 146C 151T 152C 263G 309.1C 315.1C U7a4


BJ_7, 14 12308G 16093C 16051 16271C 16309G 16318T 16355T 73G 263G 150T 152C 309.1C 315.1C 523d 524d U7b1 GH_3 12308G 16069T 16126C 16172C 16271C 16309G 16318T 73G 146C 151T 152C 263G 309.1C 315.1C U7b1 GJ_2 12308G 16189C 16223T 16234T 16311C 16519C 73G 151T 263G 309.2C 315.1C U8C AY_21 1243C 16223T 16292T 16311C 16362C 73G 189G 195C 204C 263G 309.1C W KS_15 1243C 16223T 16292T 16311C 16362C 73G 189G 195C 204C 207A 263G 309.1C W NS_3 1243C 16129A 16172C 16223T 16292T 73G 189G 195G 204C 207A 263G 315.1C W BJ_6 1243C 16193C 16295T 16223T 16292T 16319A 73G 189G 195C 204C 207A 263G 309.3C 315.1C523d 524d W1e1 GH_5,6,11 1243C 16129A 16172C 16223T 16292T 16320T 73G 189G 195C 204C 207A 263G 309.1C 315.1C W1g KS_10 1243C 16129A 16145A 16172C 16223T 16292T 73G 189G 195C 204C 207A 263G 309.1C 523d 524d W1h AY_17 1243C 16223T 16292T 16295T 16320T 73G 194T 195C 199C 204C 263G 309.1C 315.1C W3B1 GH_17 1243C 16172C 16223T 16292T 16362C 73G 189G 194T 195C 204C 207A 263G 30.1C 315.1C W5a BG_10,21 1243C 16192T 16292T 16223T 16325C 16519C 73G 263G 189G 194T 195C 204C 207A 309.1C 315.1C W6 GJ_18 1243C 16182C 16192T 16193.1C 16223T 16292T 16325C 73G 195C 204C 207A 263G 309.1C 315.1C W6

Abbreviations: AY (Arain), BG (Bangash), BJ (Bijarani), BT (Bugti), CN (Chandio), GH (Ghallu), GJ (Gujar), JT (Jat), KZ (Kakazai), KT (Khattak), KS (Khoso), LG (Laghari), LS (Lashari), MH (Mahsuds), MZ (Mazari), NS (Nasrani), OK (Orakazai), RJ (Rajpot), SL (Solangi), SD (Sayyid), YZ (Yusufzai), WZ (Wazirs)
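The variant notation used in Table 3.2 follows a compact convention: a position followed by a base denotes a substitution (16223T), a position with a decimal suffix denotes an insertion after that position (309.1C), and a position followed by "d" denotes a deletion (523d). The Python sketch below shows one way such tokens could be parsed. The names `Variant` and `parse_variant` are hypothetical and the regular expression reflects only the patterns visible in the table; tokens that lack a base, such as a bare "16093", are rejected rather than guessed.

```python
import re
from typing import NamedTuple, Optional


class Variant(NamedTuple):
    position: int
    kind: str                  # "substitution", "insertion" or "deletion"
    base: Optional[str]        # variant base, None for deletions
    ins_index: Optional[int]   # 1 for 309.1C, 2 for 309.2C, None otherwise


_TOKEN = re.compile(r"(\d+)(?:\.(\d+))?([ACGTacgt]|d)")


def parse_variant(token: str) -> Variant:
    """Parse one control-region variant token such as '16223T' or '523d'."""
    match = _TOKEN.fullmatch(token.strip())
    if match is None:
        raise ValueError(f"unrecognised variant token: {token!r}")
    pos, ins, symbol = match.groups()
    if symbol == "d":
        if ins is not None:
            raise ValueError(f"deletion with insertion index: {token!r}")
        return Variant(int(pos), "deletion", None, None)
    if ins is not None:
        return Variant(int(pos), "insertion", symbol.upper(), int(ins))
    return Variant(int(pos), "substitution", symbol.upper(), None)


for tok in ("16223T", "309.1C", "523d"):
    print(parse_variant(tok))
```

Lower-case bases that appear in a few table entries (for example 152c) are normalised to upper case in this sketch.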


South Asian macrohaplogroup M and its distribution in Pakistani Ethnic Groups:

The monophyletic clade of haplogroup M is widely distributed among most East, South and North Asians, and especially among the different ethnic groups and tribes of the Pakistani population. Its frequency was 47% among the castes of Punjab and the tribes of Sindh, Baluchistan and KPK. The origin of macro-haplogroup M in the Indian subcontinent is controversial: some authors suggested a South Asian origin, while others suggested that haplogroup M migrated from the Indian subcontinent to Africa (Quintana-Murci et al., 2004). The major haplogroups identified among Pakistani ethnic groups are M and its sub-clades; their frequencies are shown in Table 3.3. The distribution of haplogroup M and its sub-clades in different ethnic groups of Baluchistan, KPK, Punjab and Sindh is shown in Figure 3.15.

Table 3.3: Frequencies of haplogroup M in 22 populations of Pakistan

Haplotypes AY BG BJ BT CN GH GJ JT KS KT KZ LG LS MS MZ NS OK RJ SD SL WZ YF Total
M 2 4 0 0 1 0 1 0 1 0 0 0 0 1 0 0 1 7 0 0 0 0 18
M2 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 3
M2a1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 2
M2a1a 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 2
M2a2 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 2 0 0 0 0 0 4
M2a3 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 2
M2a3a 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 2
M2b4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1
M3 0 1 0 0 1 1 1 0 1 0 0 0 1 0 0 0 5 0 0 0 0 0 11
M3a 0 0 0 0 2 1 0 0 0 0 0 0 0 1 0 0 0 3 0 1 0 0 8
M3a1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 2
M3a+204 0 2 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 4
M3a2a 0 1 0 0 0 0 1 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 4
M3b 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 1 0 0 0 0 3
M3c 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
M3c1 0 1 0 0 0 0 0 0 0 1 0 0 2 0 0 0 0 0 1 0 1 1 7
M3c1a 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1
M3c1b 0 1 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 1 4
M3c1b1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1
M3c2 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 3
M4 1 0 0 0 1 0 0 4 0 0 0 0 0 0 0 0 1 1 0 0 0 0 8
M5 0 0 0 0 0 1 1 4 1 1 1 0 0 0 0 1 0 4 1 0 1 0 16
M5a1 0 0 1 0 0 0 0 0 0 0 1 0 0 0 1 2 0 1 0 0 0 0 6
M5a1b 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1
M5a2a 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 3
M5C1 1 0 0 0 0 0 0 2 0 1 0 0 0 0 0 0 0 0 0 0 1 0 5
M5d 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1
M6a1a 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
M6a1b 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 2
M6b 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 2
M7b2 1 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 1 0 0 0 0 5
M10A1+16129 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1
M12 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 1 0 0 0 0 0 0 3
M14 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
M18a 3 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 6
M18b 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 2
M18c 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 2
M24 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1
M30b 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
M30c 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 5
M30e 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 3
M33a 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1
M33A2'3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1
M33a1a 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2
M33a2a 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 2
M33b 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 2
M33B2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 2
M33c 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1
M34 2 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3
M35a 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2
M35a1a 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 4
M37 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 3
M38 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
M38c 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1
M40 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1
M40a 0 0 0 0 0 0 1 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 3
M41 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1
M43a1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1
M45 1 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0 0 5
M45 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 2
M46 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 1 0 0 3
M49 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1
M49a 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 1 3 0 0 0 6
M49c 3 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 3 0 0 0 7
M51b1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1
M52a 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2
M52b1 0 0 0 0 0 0 1 1 0 0 1 0 0 0 0 0 0 1 0 0 0 0 4
M57B 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 2
Total 16 14 4 0 15 6 12 27 6 8 13 0 3 3 2 10 14 40 18 5 4 2 222

129

Figure 3.15: A network relating haplogroup M of Pakistani caste and tribal populations. Circle areas are proportional to haplotype frequencies, and variant nucleotides are shown along the links between haplotypes.


Distribution of West Eurasian haplogroup N in Pakistani population:

The Eurasian mitochondrial pool derives from one major haplogroup, N, which is thought to have migrated along the northern route (the “one haplogroup, one migration” model). The West Eurasian macro-haplogroup N is the ancestor of numerous haplogroups found in Europe, the Middle East, Asia and the Americas. The most common haplogroups seen in this study are N1, N2, N5, N9, N10 and N11; their frequencies are summarized in Table 3.4, and the network analysis is shown in Figure 3.16.


Figure 3.16: Reduced median network of haplogroup N based on Pakistani caste and tribal populations. The numbers along the branches denote mutations relative to the CRS (Anderson et al 1981). Circle areas are proportional to haplotype frequency, and reticulations indicate parallel mutational pathways.

Table 3.4: Frequency distribution of haplogroup N in 22 ethnic groups

Haplotypes AY BG BJ BT CN GH GJ JT KS KT KZ LG LS MS MZ NS OK RJ SD SL WZ YF Total
N 2 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 2 1 0 0 0 7
N1a 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
N1a1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
N1b1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1
N2 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 2
N5 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 2
N7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1
N9a1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
N9b 1 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 4
N9b1c 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 2
N9b2 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 2
N9b2a 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 2 0 0 0 4
N10 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 1 0 0 0 0 0 0 3
N11a 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 2
N11a1 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 2
Total 4 0 1 0 0 0 4 9 2 0 3 0 0 0 0 5 0 2 5 0 0 0 35


East Asian haplogroup R and dispersal of its sub-clades in Pakistani ethnic groups:

Haplogroup R is a descendant (sub-clade) of haplogroup N and is the most common haplogroup in East Asia. The overwhelming bulk of the R lineages (67%) in the Pakistani population has a clear European and West Asian provenance; the Paleolithic and Neolithic expansions of Caucasian populations, which reached South Asia via the Iranian Plateau and possibly via Arabian Sea maritime routes, could be the basis of this genetic influx (Stoljarova et al 2016). At the same time, some of the mtDNA R lineages are either undergoing their main expansion in this region or are autochthonous to Pakistan. Among them is subclade R2, which is more pronounced in Southern Pakistan and India and less frequent in most of the adjacent regions of the Iranian Plateau, Central Asia, the Arabian Peninsula, the Near East and the Caucasus (Costa et al 2013). Among the Pakistani ethnic groups, its percentage was much higher in Sindh and Punjab than in KPK. The major haplogroups found are R2, R5, R7, R8 and R30. The network analysis shown in Figure 3.17 revealed its origin from haplogroup N.

Existence of U lineage and its distribution in Pakistani ethnic groups:

Haplogroup U descended from haplogroup R approximately 55,000 years ago and was found to be the oldest haplogroup, with a high percentage in Punjab and KPK and a lower percentage in the Baluchistan tribal groups. Haplogroups U2, U6 and U7 were predominant in this study. The network analysis is shown in Figure 3.18 and the frequency distribution in Table 3.6.


Figure 3.17: A phylogenetic network of haplogroup R in Pakistani caste and tribal populations, estimated by the reduced median method. Node size represents the frequency of each haplotype, and the mutations that define major haplogroup subsets are portrayed along selected internodes.


Table 3.5: Frequency distribution of haplogroup R in 22 ethnic groups of Pakistan

Haplotypes AY BG BJ BT CN GH GJ JT KS KT KZ LG LS MS MZ NS OK RJ SD SL WZ YF Total
R2 1 1 0 0 0 0 0 0 0 0 1 0 1 1 0 0 0 0 0 0 0 1 6
R2a 0 0 0 0 0 0 0 0 0 0 0 1 1 0 2 0 0 0 0 0 0 0 4
R2b 0 0 0 0 0 0 0 0 0 0 0 1 0 1 1 0 0 0 0 0 0 0 3
R2+195 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 2
R5 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
R5a 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
R5a2 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1
R5a2a 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1
R6 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
R6a 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 2
R6a1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
R31 2 0 0 0 0 0 1 3 0 1 0 0 0 0 0 0 0 0 0 0 0 0 7
R6+16129 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
R6A1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
R7 0 0 0 0 0 0 0 1 0 0 2 0 0 0 0 0 0 1 0 0 0 1 5
R7a'b 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2
R7a1a 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 3
R7a1b2 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 2
R7b 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1
R7b1a 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 2
R7b2 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
R8 3 0 1 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 6
R8a1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
R8a1a1b 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 3
R8a1a3 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1
R8b 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
R9 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
R9b2 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 2
Total 12 2 5 1 0 2 5 6 1 1 10 2 2 2 3 1 0 3 3 0 0 2 63


The coalescence age of haplogroup U in Europe is about 51,000–67,000 ybp, and the haplogroup is widely disseminated in Europe and the Near East (Roostalu et al 2007). However, this sparse yet extensive distribution, taken together with the presence in native South Asians of the deep allied haplogroup U2 (which lacks a characteristic transversion at position 16129), suggests that haplogroup U2 is very deep-rooted and divergent, and was most likely an early lineage of the super-haplogroup U in South Asia (Barcaccia et al 2015).

Figure 3.18: Major subsets of haplogroup U in Pakistani caste and tribal populations, shown as a reduced median network. Node size reflects the frequency of each haplotype, and variant nucleotides are numbered and displayed along the links between haplotypes.


Table 3.6: Frequency distribution of haplogroup U in 22 ethnic groups of the Pakistani population

Haplotypes AY BG BJ BT CN GH GJ JT KS KT KZ LG LS MS MZ NS OK RJ SD SL WZ YF Total
??? 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1
U 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
U2 0 0 0 0 0 0 1 0 0 0 1 0 0 0 1 0 1 0 2 1 0 0 7
U2a 2 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 1 0 0 0 0 0 5
?(U2b) 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1
?(U2b1) 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1
U2+152 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 2
U2b 1 0 0 0 2 1 0 0 1 1 1 0 0 0 1 0 0 1 1 0 0 0 10
U2b1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 2
U2b2 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 3 1 0 0 5
U2c 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1
U2c1a 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 1 0 0 3
U2cd 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 2
U2E 1 0 0 0 0 0 0 2 0 0 1 0 0 0 0 0 0 0 0 0 0 0 4
U3a1c 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1
U4 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2
U5a1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1
U5b 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 3
U5b1c 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1
U5b1d1a 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1
U5b2a1a2 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 3
U5b3 0 0 2 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 3
U5b3b1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
U5B3F 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
U6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1
U6a7b1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
U6b1a 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2
U7 2 0 0 0 0 1 1 0 0 1 2 0 0 0 0 0 1 0 3 1 1 0 13
U7a4 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 2
U7b1 0 0 1 0 0 2 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 5
U8b1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1
U8b1a1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
U8c 2 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3
Total 11 2 5 0 2 5 4 6 5 5 9 1 0 2 2 1 4 4 15 6 1 1 91


Mitochondrial haplogroups J, K, T and W:

Haplogroup J descends from haplogroup JT, which also gave rise to haplogroup T. The estimated age of haplogroup J is nearly 45,000 years. Besides the aforementioned haplogroups, analysis of haplogroups J and K (the latter descended from U8) revealed an interesting geographical and chronological distribution in Pashtuns. The presence of the principal subclade K1a1b1a at 2% in Bangash and 5% in Khattak, along with 4% J1 in Khattak, reflected a deep admixture of Jewish and European ancestry. It is worth mentioning that the expansion of the Jewish community, especially the Ashkenazi, across the Roman Empire and the Iranian Plateau is reported to have been under way from about 1000 BC (Behar et al 2004). However, on account of their expulsion from Western Europe during the fifteenth century, they subsequently dispersed into Eastern Europe and radiated as far as Afghanistan, North West Pakistan and India. Yet the Pakistani and Indian Jewish communities are an early branch of the Jewish Diaspora, with several unique socio-cultural features, a complex history and minor Middle East-specific ancestry components, as shown in Figure 3.19. The haplogroups determined in this study are J1, J2 and their subclades, as shown in the network analysis, while the frequencies of each haplotype are given in Table 3.7.


Figure 3.19: A reduced median network of haplogroup J in Pakistani caste and tribal populations. Node size represents haplotype frequency, reticulations indicate multiple mutations, and variant nucleotides are listed along the links between haplotypes.

Table 3.7: Frequency distribution of haplogroup J in different ethnic groups of Pakistan

Haplogroups AY BG BJ BT CN GH GJ JT KS KT KZ LG LS MH MZ NS OK RJ SD SL WZ YF Total
J 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 2
J1 0 0 0 0 0 0 0 0 0 4 1 0 0 0 0 0 1 0 1 0 0 0 7
J1+16193 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
J1b 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1
J1d1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1
J2 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 2
J2a 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
J2a1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 2
J2a2a 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
J2a2a2 0 0 1 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 3
J2b 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1
J2B1A6 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1
JT 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
Total 0 0 3 1 0 3 0 0 0 2 6 0 0 0 0 0 2 0 3 0 0 0 24


Figure 3.20: Phylogeny of haplogroups W and T, estimated by constructing reduced median networks of Pakistani caste and tribal populations. The size of each node is directly proportional to the frequency of the haplotype, and variant nucleotides are enumerated and shown along the links between haplotypes.


Table 3.8: Frequency distribution of haplogroups T and W in different ethnic groups in Pakistan

Haplotypes AY BG BJ BT CN GH GJ JT KS KT KZ LG LS MS MZ NS OK RJ SD SL WZ YF Total
T1 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 2
T1a+152 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1
T2b 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1
T2b4a 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1
T2b4c 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1
T2D1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1
T3 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
Total 0 1 0 0 0 1 0 0 1 0 4 0 0 0 0 0 0 0 1 0 0 0 8

Haplotypes AY BG BJ BT CN GH GJ JT KS KT KZ LG LS MH MZ NS OK RJ SD SL WZ YF Total
W 1 0 0 0 0 0 0 0 1 1 0 0 0 0 0 1 0 0 1 0 0 0 5
W5a 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
W3a1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1
W+194 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 2
W1e1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
W1g 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2
W1h 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 2
W3B1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
W7 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1
Total 2 0 1 0 0 4 0 0 2 2 2 0 0 0 0 1 0 0 2 0 0 0 16


Figure 3.21: A network relating haplogroup H of Pakistani caste and tribal populations. Circle areas are proportional to haplotype frequencies, and variant nucleotides are shown along the links between haplotypes.


Table 3.9: Frequency distribution of haplogroup H in different ethnic groups of Pakistan

Haplotypes AY BG BJ BT CN GH GJ JT KS KT KZ LG LS MS MZ NS OK RJ SD SL WZ YF Total
H1a 0 0 1 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 3
H1a1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
H1ba 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1
H1e2c 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1
H2a2a 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1
HV1b3 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1
HV2 0 0 0 1 0 0 0 1 1 1 0 1 1 1 0 0 0 0 0 1 0 0 8
HV2a 0 1 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 3
HV6 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
H11a2 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2
HV17 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
H17c 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
H42a 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
Total 3 1 1 1 3 0 2 1 3 1 0 1 1 1 3 0 0 0 0 3 0 0 25


Unrooted Neighbor Joining Network (UNJN):

An unrooted neighbor-joining network was drawn on the basis of Fst distances to portray the relationships among the 22 ethnic groups of Pakistan. The tribal populations of Baluchistan, KPK and Sindh and the castes of Punjab clustered together, and the populations were separated by statistically sound branches. Close relationships were observed between Lashari and Bangash, Laghari and Kakazai, and Khoso and Mazari, whereas Khattak and Ghallu were outliers sharing the fewest haplotypes, as shown in Figure 3.22.

Figure 3.22: A UNJN of 22 populations based on Fst distances among caste communities; the inset indicates the relationships among the major population groups.
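The pair-selection step of neighbor joining can be sketched on a small distance matrix. The example below is only an illustration: it uses the five provincial Fst values from the lower triangle of Table 3.9a, not the 22-group matrix behind Figure 3.22, and the function name `first_nj_pair` is a hypothetical helper. It computes the standard criterion Q(i,j) = (n-2)·d(i,j) - r_i - r_j and reports the pair that would be joined first.

```python
# Pairwise Fst among the five provinces (lower triangle of Table 3.9a).
labels = ["Baluchistan", "Punjab", "Sindh", "NWFP", "FATA"]
lower = [
    [],
    [0.02508],
    [0.02102, 0.01663],
    [0.04737, 0.02009, 0.01400],
    [0.01007, 0.03806, 0.04893, 0.04448],
]

n = len(labels)
# Expand the lower triangle into a full symmetric matrix.
d = [[0.0] * n for _ in range(n)]
for i in range(n):
    for j in range(i):
        d[i][j] = d[j][i] = lower[i][j]


def first_nj_pair(dist, names):
    """Return (Q, name_i, name_j) for the pair minimising the NJ criterion.

    Q(i, j) = (n - 2) * d(i, j) - r_i - r_j, where r_i is the row sum of i.
    Hypothetical helper for illustration; it performs only the first join.
    """
    size = len(dist)
    r = [sum(row) for row in dist]
    best = None
    for i in range(size):
        for j in range(i + 1, size):
            q = (size - 2) * dist[i][j] - r[i] - r[j]
            if best is None or q < best[0]:
                best = (q, names[i], names[j])
    return best


q_min, a, b = first_nj_pair(d, labels)
print(a, b, round(q_min, 5))
```

A full neighbor-joining run would replace the joined pair by a new node, recompute the distances and repeat until three nodes remain; only the first step is shown here.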


Figure 3.23: Graphical presentation of the percentages of major haplogroups in 22 populations of Pakistan
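As a rough cross-check of such percentages, the grand totals in the haplogroup frequency tables can be converted to shares. This is a minimal sketch that assumes only the haplogroups tabulated in this chapter (the haplogroup M table and Tables 3.4 to 3.9) are counted, so the shares are relative to those 484 samples and need not equal the values plotted in Figure 3.23.

```python
# Grand totals taken from the "Total" rows of the haplogroup tables.
totals = {"M": 222, "N": 35, "R": 63, "U": 91, "J": 24, "T": 8, "W": 16, "H": 25}

grand = sum(totals.values())
# Share of each haplogroup among the tabulated samples, in percent.
shares = {hg: 100.0 * count / grand for hg, count in totals.items()}

for hg, pct in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{hg}: {pct:.1f}%")
```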


Multidimensional Scaling (MDS):

MDS analysis based on the Fst distance matrix was performed to elaborate the relationships among the 22 ethnic groups of the Pakistani population. The MDS plot revealed that Chandio and Solangi were clear outliers, separated from the rest of the groups, and held the lowest overall genetic diversity owing to shared haplotypes and high consanguinity. Moreover, comparison with other populations showed that the Pashtun samples fitted within the South Asian cluster (Figure 3.24b). The frequency and distribution of haplogroups in Pashtuns was striking, mainly because of the high occurrence of West Eurasian and the low frequency of South Asian haplogroups.

Figure 3.24: MDS plot of 22 ethnic groups of the Pakistani population. AY (Arain), BG (Bangash), BJ (Bijarani), BT (Bugti), CN (Chandio), GH (Ghallu), GJ (Gujar), JT (Jat), KZ (Kakazai), KT (Khattak), KS (Khoso), LG (Laghari), LS (Lashari), MH (Mahsuds), MZ (Mazari), NS (Nasrani), OK (Orakazai), RJ (Rajpot), SL (Solangi), SD (Sayyid), YZ (Yusufzai), WZ (Wazirs).
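How an Fst matrix becomes a two-dimensional plot can be sketched with classical (Torgerson) MDS. This is an assumption-laden illustration: the thesis does not state which MDS variant or software produced Figure 3.24, and the reported stress values suggest an iterative method, so the classical variant is swapped in here for brevity. The input is the five-province block of Table 3.9a, and `top_eigenpair` is a hypothetical helper that uses power iteration so that no external library is needed.

```python
import math

# Pairwise Fst among the five provinces (lower triangle of Table 3.9a).
labels = ["Baluchistan", "Punjab", "Sindh", "NWFP", "FATA"]
lower = [
    [],
    [0.02508],
    [0.02102, 0.01663],
    [0.04737, 0.02009, 0.01400],
    [0.01007, 0.03806, 0.04893, 0.04448],
]
n = len(labels)
d = [[0.0] * n for _ in range(n)]
for i in range(n):
    for j in range(i):
        d[i][j] = d[j][i] = lower[i][j]

# Double-centre the squared distances: B = -1/2 * J * D^2 * J.
sq = [[d[i][j] ** 2 for j in range(n)] for i in range(n)]
row_mean = [sum(row) / n for row in sq]
grand_mean = sum(row_mean) / n
b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
      for j in range(n)] for i in range(n)]


def top_eigenpair(m, iters=2000):
    """Power iteration for the dominant eigenpair of a symmetric matrix."""
    size = len(m)
    v = [float(i + 1) for i in range(size)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iters):
        w = [sum(m[i][j] * v[j] for j in range(size)) for i in range(size)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    mv = [sum(m[i][j] * v[j] for j in range(size)) for i in range(size)]
    lam = sum(v[i] * mv[i] for i in range(size))  # Rayleigh quotient
    return lam, v


# Extract two axes, deflating the matrix after each one.
coords = [[0.0, 0.0] for _ in range(n)]
work = [row[:] for row in b]
for axis in range(2):
    lam, v = top_eigenpair(work)
    scale = math.sqrt(max(lam, 0.0))  # negative eigenvalues carry no real axis
    for i in range(n):
        coords[i][axis] = scale * v[i]
    for i in range(n):
        for j in range(n):
            work[i][j] -= lam * v[i] * v[j]

for name, (x, y) in zip(labels, coords):
    print(f"{name}: ({x:.4f}, {y:.4f})")
```

Because the configuration is centred on the origin, only the relative positions of the points are meaningful; the axes themselves have no direct biological interpretation.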

Table 3.9a. Pairwise FST distances of mtDNA control region sequences among five provinces of Pakistan

Eurasian populations 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
1 Baluchistan - 0.00257 0.00446 0.00030 0.23978 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000
2 Punjab 0.02508 - 0.00000 0.00000 0.00347 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000
3 Sindh 0.02102 0.01663 - 0.00099 0.00059 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000
4 NWFP 0.04737 0.02009 0.01400 - 0.00356 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000
5 FATA 0.01007 0.03806 0.04893 0.04448 - 0.00020 0.00000 0.00168 0.00010 0.00000 0.00000 0.00000 0.00059 0.00000 0.00059 0.00000 0.00000 0.00000 0.00000 0.00010 0.00000
6 Portugal 0.16963 0.12951 0.12983 0.12314 0.15442 - 0.00040 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000
7 Barcelona 0.14558 0.11264 0.11580 0.11779 0.13449 0.03735 - 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000
8 Pakistan 0.03615 0.03428 0.04266 0.04181 0.03847 0.09616 0.06472 - 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000
9 Iran 0.11892 0.09687 0.10441 0.10585 0.09250 0.08952 0.07601 0.04758 - 0.01554 0.02594 0.06950 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000
10 Azerbaijan 0.11314 0.11634 0.11334 0.10900 0.09479 0.12944 0.09725 0.04855 0.02096 - 0.02069 0.53005 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000
11 Turkey 0.08991 0.09617 0.09611 0.10552 0.08402 0.11264 0.07434 0.03294 0.01649 0.01940 - 0.45887 0.00000 0.00010 0.00010 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000
12 Georgia 0.10046 0.11435 0.11087 0.12059 0.09156 0.13011 0.08931 0.04748 0.01293 -0.00153 -0.00019 - 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000
13 Kazakhstan 0.04430 0.03201 0.04377 0.04248 0.05143 0.10761 0.07763 0.01986 0.06608 0.06629 0.04101 0.06703 - 0.00000 0.00267 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000
14 Russia 0.09922 0.10342 0.09704 0.12104 0.13334 0.12561 0.09439 0.04445 0.08184 0.09669 0.05839 0.08151 0.05793 - 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000
15 Uzbekistan 0.03505 0.02698 0.03681 0.03991 0.04351 0.09142 0.06196 0.01041 0.05121 0.05418 0.03515 0.05379 0.00346 0.03889 - 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000
16 Korea 0.07956 0.05165 0.06814 0.05579 0.07711 0.14333 0.11520 0.05808 0.11148 0.10681 0.09137 0.11319 0.02056 0.11370 0.03526 - 0.00000 0.00000 0.00000 0.00000 0.00000
17 Vietnamese 0.09698 0.09458 0.09208 0.08118 0.10014 0.15529 0.11833 0.07016 0.11673 0.09465 0.09434 0.10736 0.04641 0.10911 0.05205 0.03305 - 0.68825 0.00000 0.00000 0.02010
18 Laos 0.10482 0.09909 0.09909 0.08570 0.10493 0.15477 0.12039 0.07477 0.11742 0.09344 0.09708 0.10894 0.05016 0.11034 0.05476 0.03685 -0.00093 - 0.00010 0.00000 0.00941
19 Malaysia 0.07369 0.06110 0.06286 0.05394 0.07825 0.14908 0.11755 0.05571 0.10708 0.09202 0.08922 0.10161 0.03177 0.10720 0.03663 0.02414 0.01793 0.02010 - 0.00000 0.00000
20 Myanmar 0.08872 0.07393 0.08133 0.07379 0.09205 0.14609 0.11533 0.06030 0.10791 0.09369 0.08911 0.10741 0.03526 0.10637 0.04202 0.03787 0.01815 0.01833 0.02287 - 0.00010
21 Thailand 0.08367 0.07726 0.07717 0.06990 0.08913 0.13473 0.09863 0.05625 0.09944 0.08263 0.07838 0.09520 0.03339 0.09269 0.03760 0.03110 0.00450 0.00548 0.01585 0.01002 -

The lower-left triangle gives the pairwise FST values between populations; the upper-right triangle gives the corresponding FST p-values.


Table 3.9b. Pairwise FST distances of mtDNA control region sequences among the Pakistani ethnic groups

Populations 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
1 Bijrani - 0.000 0.000 0.000 0.0000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
2 Chandio 0.020 - 0.00040 0.000 0.000 0.0000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
3 Ghullu 0.011 0.004 - 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.001
4 Khoso 0.031 0.015 0.065 - 0.00000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.002
5 Nasrani 0.041 0.015 0.067 0.089 - 0.014 0.024 0.036 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
6 Solangi 0.035 0.011 0.098 0.077 0.009 - 0.020 0.335 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
7 Sindhi 0.004 0.009 0.009 0.008 0.010 0.009 - 0.000 0.0.1 0.001 0.000 0.000 0.001 0.001 0.002 0.000 0.000 .0001 0.000
8 Pathan 0.092 0.028 0.084 0.123 0.026 0.089 0.011 - 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.001 0.001
9 Baluchi 0.026 0.040 0.089 0.052 0.084 0.099 0.030 0.014 - 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.005
10 Brahui 0.093 0.045 0.079 0.011 0.071 0.012 0.021 0.016 0.012 - 0.002 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.001
11 Hazara 0.032 0.047 0.098 0.015 0.098 0.023 0.040 0.035 0.042 0.049 - 0.000 0.000 0.000 0.000 0.000 0.000 0.002 0.000
12 Hunza Burusho 0.043 0.012 0.121 0.055 0.096 0.075 0.020 0.009 0.021 0.019 0.038 - 0.000 0.000 0.000 0.000 0.000 0.001 0.002
13 Kalash 0.126 0.158 0.128 0.145 0.120 0.149 0.156 0.119 0.123 0.141 0.132 0.130 - 0.000 0.000 0.000 0.000 0.000 0.000
14 Makrani 0.115 0.112 0.110 0.104 0.120 0.128 0.109 0.030 0.035 0.032 0.050 0.028 0.139 - 0.000 0.000 0.020 0.000 0.001
15 Parsi 0.015 0.043 0.140 0.019 0.059 0.018 0.029 0.013 0.029 0.028 0.040 0.016 0.144 0.039 - 0.000 0.009 0.0001 0.002
16 Karachi 0.019 0.023 0.018 0.012 0.015 0.007 0.014 0.013 0.004 0.021 0.016 0.019 0.122 0.100 0.017 - 0.000 0.001 0.004
17 Gujrati 0.032 0.021 0.058 0.029 0.042 0.009 0.018 0.010 0.016 0.008 0.012 0.017 0.129 0.111 0.027 0.015 - 0.000 0.000
18 Pathan (P) 0.045 0.032 0.097 0.098 0.065 0.011 0.018 0.003 0.026 0.019 0.051 0.005 0.121 0.041 0.026 0.012 0.0031 - 0.000
19 Saraiki 0.052 0.045 0.088 0.056 0.012 0.001 0.014 0.012 0.007 0.015 0.019 0.012 0.131 0.051 0.011 0.010 0.029 0.014 -

The lower-left triangle gives the pairwise FST values between populations; the upper-right triangle gives the corresponding FST p-values.


Figure 3.24a: Multidimensional scaling plot of FST genetic distances based on mtDNA control region sequences of the five provinces of Pakistan and world populations (Supplementary Table S2). South East Asia is represented by triangles, Europe by open circles, Central Asia by closed squares, Sindh by closed diamonds, West Asia by closed squares and Barcelona by closed circles (stress = 0.072).


The Baluchi ethnic groups, such as Laghari, Lashari, Bugti and Mazari, have a close relationship with other Pakistani populations. These findings support the hypothesis of a common genetic ancestry of these populations regardless of ethnicity. This is perhaps because of the strategic location and the harsh environmental conditions of this region, which could have enabled the formation of social organization among expanding populations and helped to uphold the genetic boundaries within groups, which have settled over time into diverse ethnicities. Moreover, the more detailed genetic structure shown in Fig. 3.24c indicates significant affinity and closer correlation with previously studied Pakistani populations than with the other Central Asian and European populations (Fig. 3.24b).

Figure 3.24b: Multidimensional scaling plot based on the FST genetic distances of mtDNA control region sequences of Sindhi ethnic groups and sixteen world populations (Supplementary Table S1): South East Asia (Thailand, Myanmar, Vietnamese, Laos, Malaysia); North East Asia (Korea); South West Asia (Turkey); West Asia (Iran) and Europe (Russia, Georgia, Barcelona, Portugal). Sindhi ethnic groups are represented by triangles and other populations by circles (stress = 0.133).


Figure 3.24c: Phylogenetic tree based on pairwise Fst distances between 22 ethnic groups in Pakistan and neighboring populations of Central Asia, Europe and Southeast Asia. The evolutionary history was inferred using the neighbor-joining method based on control region sequences. The optimal tree, with a sum of branch lengths of 0.055324, is shown.


Coding region SNP analysis and targeted haplogroup designation:

A multiplex assay of coding region SNPs was developed to score twelve SNPs for determining the prevalent haplogroups of South Asia. Figure 3.25 shows the electropherograms used to score the 12 SNPs in eight different samples. In all samples, the haplogroup information from the multiplex assay confirmed the haplogroup status obtained from control region sequencing. Control region data proved useful for assigning haplogroups; however, a few samples, such as those with haplogroups M, M5, M35, M37, M38 and M40, were assigned on the basis of coding region SNPs.

The selected SNPs were combined into two multiplex assays that covered almost all the haplogroups present in the Pakistani population. Multiplex “A” was designed to target haplogroups H, J, U and T, while multiplex “B” targeted M, N and R.

Haplogroup M:

Figure 3.25: Electropherograms of multiplexes A and B obtained from samples belonging to haplogroup M. The plots show nucleotide size (X-axis) relative to the 120 LIZ size standard (Applied Biosystems) and relative fluorescence units (RFUs, Y-axis).

Haplogroup R:

Figure 3.26: Electropherograms of multiplexes A and B obtained from samples belonging to haplogroup R. The plots show nucleotide size (X-axis) relative to the 120 LIZ size standard (Applied Biosystems) and relative fluorescence units (RFUs, Y-axis).

Haplogroup N:

Figure 3.27: Electropherograms of multiplexes A and B obtained from samples belonging to haplogroup N. The plots show nucleotide size (X-axis) relative to the 120 LIZ size standard (Applied Biosystems) and relative fluorescence units (RFUs, Y-axis).

Haplogroup H:

Figure 3.28: Electropherograms of multiplexes A and B obtained from samples belonging to haplogroup H. The plots show nucleotide size (X-axis) relative to the 120 LIZ size standard (Applied Biosystems) and relative fluorescence units (RFUs, Y-axis).

Haplogroup J:

Figure 3.29: Electropherograms of multiplexes A and B obtained from samples belonging to haplogroup J. The plots show nucleotide size (X-axis) relative to the 120 LIZ size standard (Applied Biosystems) and relative fluorescence units (RFUs, Y-axis).

Haplogroup U:

Figure 3.30: Electropherograms of multiplexes A and B obtained from samples belonging to haplogroup U. The plots show nucleotide size (X-axis) relative to the 120 LIZ size standard (Applied Biosystems) and relative fluorescence units (RFUs, Y-axis).

Haplogroup T:

Figure 3.31: Electropherograms of multiplexes A and B obtained from samples belonging to haplogroup T. The plots show nucleotide size (X-axis) relative to the 120 LIZ size standard (Applied Biosystems) and relative fluorescence units (RFUs, Y-axis).

Statistical Analysis:

Arlequin statistical software (3.5.2.1) was used for the analysis of molecular variance (AMOVA), to gain insight into the degree of variation that makes up the genetic structure of the caste and tribal populations of Pakistan. The majority of the variation was observed within populations (intra-population variation, 95.67%), while variation among the different ethnic groups accounted for only 1.45%, as shown in Table 3.10.

Table 3.10: AMOVA of 22 populations of Pakistan
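The relationship between these percentages and the AMOVA fixation indices can be sketched as follows. The sketch assumes a standard three-level hierarchy (among groups, among populations within groups, within populations) and, since the text quotes only two of the three components, it further assumes that the remaining 2.88% is the among-populations-within-groups component. The resulting indices are therefore illustrative and are not values reported in Table 3.10.

```python
# Percentages of variation quoted in the text.
pct_among_groups = 1.45
pct_within_pops = 95.67
# ASSUMPTION: the remainder is variation among populations within groups.
pct_among_pops_within_groups = 100.0 - pct_among_groups - pct_within_pops

# Standard AMOVA fixation indices expressed with variance percentages.
phi_ct = pct_among_groups / 100.0
phi_sc = pct_among_pops_within_groups / (
    pct_among_pops_within_groups + pct_within_pops
)
phi_st = (pct_among_groups + pct_among_pops_within_groups) / 100.0

print(f"Phi_CT = {phi_ct:.4f}")
print(f"Phi_SC = {phi_sc:.4f}")
print(f"Phi_ST = {phi_st:.4f}")
```

Under these assumptions the three indices satisfy (1 - Phi_ST) = (1 - Phi_SC)(1 - Phi_CT), which is a convenient consistency check on any AMOVA table.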

Genetic diversity and demographic parameters:

To date, very few studies have been done on the population of Pakistan. The South Asian region where Pakistan lies has been a major corridor for human migration and a trade route throughout history. Population diversity is analogous to expected heterozygosity, which gives the probability that two randomly selected sequences differ. The genetic diversity calculated for the mitochondrial control region is given in Table 3.11. Most of the ethnic groups showed similar sequence diversity, with the Mazari showing the lowest (0.991) and the Gujjar, Bijarani and Wazirs the highest (0.998) genetic diversity. The lower level of diversity in the Mazari ethnic group was also evident in the low MPD (5.73), as shown in Table 3.11. This may be because the populations in Baluchistan have remained inside their own ethnic groups, with no admixture from displaced populations in the past. Their high sharing of haplotypes leads to a high random match probability and thus to a bottleneck of mtDNA haplotypes. Furthermore, significantly negative values for both Tajima’s D and Fu’s Fs neutrality tests strongly indicated a population expansion (Table 3.13). Moreover, under the demographic expansion model the raggedness index (r) was less than 0.05 in all groups, providing congruent genetic evidence for population expansion in all caste and tribal groups.
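The link between haplotype sharing, genetic diversity and random match probability can be made concrete with a short sketch. It uses Nei's unbiased estimator h = n/(n-1) · (1 - Σp²) and takes the random match probability as Σp². The sample below is hypothetical and the function name is an illustrative helper; the values in Table 3.11 were produced by the thesis software and are not reproduced here.

```python
from collections import Counter


def diversity_and_rmp(haplotypes):
    """Return (haplotype diversity, random match probability) for a sample.

    Diversity uses Nei's unbiased estimator; RMP is the sum of squared
    haplotype frequencies. Hypothetical helper for illustration only.
    """
    n = len(haplotypes)
    counts = Counter(haplotypes)
    sum_p2 = sum((c / n) ** 2 for c in counts.values())
    h = n / (n - 1) * (1.0 - sum_p2)
    return h, sum_p2


# Hypothetical sample of five individuals, two of whom share a haplotype.
sample = ["hapA", "hapA", "hapB", "hapC", "hapD"]
h, rmp = diversity_and_rmp(sample)
print(f"haplotype diversity = {h:.3f}, RMP = {rmp:.3f}")
```

The more individuals share a haplotype, the larger Σp² becomes, so diversity falls and the random match probability rises, which is the pattern the text describes for groups with high haplotype sharing.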


Table 3.11: Diversity and demographic parameters inferred from the control region of mtDNA

Columns, in order: population, code, linguistic affiliation, N, mismatch distribution mean, mismatch distribution variance, genetic diversity, nucleotide diversity, MPD, expected heterozygosity, sum of squares, RMP, Fu's Fs, r.

PUNJAB (castes)
Arain AY Indo-European 50 8.51 8.28 0.996 ± 0.004 0.0008 ± 0.0046 6.50 ± 2.1 0.119 ± 0.110 0.04 0.021 -25.81 0.014
Gujar GJ Indo-European 30 10.36 9.22 0.998 ± 0.008 0.0107 ± 0.0056 6.36 ± 2.8 0.161 ± 0.121 0.06 0.032 -23.71 0.012
Jat JT Indo-European 50 8.66 8.66 0.995 ± 0.004 0.0089 ± 0.0046 6.65 ± 2.0 0.121 ± 0.106 0.03 0.025 -25.79 0.011
Kakazai KZ Indo-European 50 11.84 11.77 0.996 ± 0.004 0.0122 ± 0.00627 7.84 ± 2.4 0.149 ± 0.127 0.04 0.029 -25.46 0.015
Rajpot RJ Indo-European 50 8.65 11.03 0.997 ± 0.006 0.0089 ± 0.0046 6.64 ± 2.0 0.139 ± 0.129 0.03 0.028 -25.79 0.014
Sayyid SD Indo-European 50 11.68 9.59 0.993 ± 0.004 0.0121 ± 0.0061 6.51 ± 2.2 0.128 ± 0.110 0.02 0.021 -24.48 0.010

SINDH
Bijarani BJ Indo-European 20 12.37 10.97 0.998 ± 0.015 0.0128 ± 0.0067 6.13 ± 2.5 0.193 ± 0.121 0.02 0.052 -12.65 0.019
Chandio CN Indo-European 20 11.97 11.72 0.997 ± 0.158 0.0124 ± 0.0065 5.96 ± 2.51 0.202 ± 0.121 0.04 0.056 -10.91 0.016
Ghallu GH Indo-European 20 12.74 12.25 0.994 ± 0.015 0.0132 ± 0.0069 6.74 ± 2.9 0.231 ± 0.144 0.01 0.055 -14.42 0.012
Khoso KS Indo-European 20 11.41 8.54 0.992 ± 0.015 0.0118 ± 0.0062 7.40 ± 3.40 0.183 ± 0.120 0.01 0.051 -14.30 0.017
Nasrani NS Indo-European 20 9.70 5.56 0.993 ± 0.015 0.0100 ± 0.0053 5.70 ± 4.6 0.186 ± 0.114 0.01 0.05 -6.67 0.016
Solangi SL Indo-European 15 12.36 8.33 0.992 ± 0.024 0.0128 ± 0.0068 5.36 ± 2.91 0.220 ± 0.118 0.01 0.061 -9.34 0.015

NWFP
Bangash BG Indo-European 20 11.76 10.30 0.994 ± 0.155 0.0122 ± 0.0064 5.66 ± 2.56 0.183 ± 0.118 0.04 0.056 -18.05 0.012
Khattak KT Indo-European 20 13.73 14.49 0.992 ± 0.015 0.0142 ± 0.0074 5.72 ± 2.43 0.196 ± 0.117 0.03 0.057 -12.87 0.013
Orakazai OK Indo-European 20 12.02 11.47 0.994 ± 0.017 0.0124 ± 0.0065 5.81 ± 2.67 0.207 ± 0.115 0.01 0.056 -8.06 0.019
Yusufzai YF Indo-European 5 12.00 6.44 0.999 ± 0.126 0.01244 ± 0.0079 6.00 ± 2.55 0.444 ± 0.084 0.01 0.250 -0.10 0.02

BALUCHISTAN
Bugti BT Indo-European 5 8.20 5.29 0.999 ± 0.126 0.0085 ± 0.0055 8.20 ± 4.5 0.482 ± 0.101 0.04 0.221 -0.61 0.013
Laghari LG Indo-European 5 7.00 5.11 0.999 ± 0.126 0.0072 ± 0.0048 7.00 ± 3.9 0.437 ± 0.080 0.03 0.211 -0.83 0.016
Lashari LS Indo-European 5 12.40 7.82 0.999 ± 0.126 0.01286 ± 0.0082 6.42 ± 0.08 0.459 ± 0.093 0.02 0.255 -0.06 0.017
Mazari MZ Indo-European 10 9.73 11.93 0.991 ± 0.044 0.0100 ± 0.0057 5.26 ± 2.8 0.286 ± 0.115 0.01 0.111 -5.56 0.012

FATA
Mahsuds MH Indo-European 10 12.58 29.20 0.997 ± 0.054 0.01304 ± 0.0072 12.57 ± 3.2 0.339 ± 0.141 0.01 0.120 -1.11 0.011
Wazirs WZ Indo-European 5 8.00 12.67 0.999 ± 0.126 0.00829 ± 0.0054 8.00 ± 4.48 0.444 ± 0.085 0.01 0.212 -0.64 0.014

MPD = mean pairwise difference, N = number of samples, RMP = random match probability.


Table 3.12: Molecular diversity indices

Population  No. of Transitions  No. of Transversions  No. of Substitutions  No. of Indels  Theta_S ± SD  Theta_pi ± SD

ARAIN 64 5 69 4 14.9 ± 4.4 8.50 ± 4.4

BANGASH 51 7 58 8 15.7 ± 5.6 11.7 ± 6.9

BIJRANI 59 3 62 2 17.4 ± 6.2 12.3 ± 6.5

BUGTI 15 0 15 2 7.2 ± 3.9 8.2 ± 5.3

CHANDIO 41 10 51 8 14.3 ± 5.1 11.9 ± 6.3

GHULLU 49 3 52 4 14.3 ± 5.1 12.7 ± 6.6

GUJAR 55 8 63 5 14.8 ± 4.9 10.3 ± 5.4

JAT 64 5 69 6 14.9 ± 4.4 8.6 ± 4.5

KAKAZAI 67 8 75 9 15.8 ± 4.7 11.8 ± 6.04

KHATTAK 54 8 62 8 17.4 ± 6.2 13.7 ± 7.1

KHOSO 54 5 59 6 16.6 ± 6.7 11.4± 6.00

LAGHARI 15 1 16 0 7.6 ± 4.1 7.00 ± 4.6

LASHARI 24 1 25 2 12 ± 6.3 12.4 ± 7.9

MAHSUDS 31 2 33 5 11.3 ± 4.8 12.5 ± 7.00

MAZARI 28 1 29 5 10.2 ± 4.7 9.7 ± 5.5

NASRANI 8 2 50 3 13.8 ± 4.9 9.7 ± 5.1

ORAKZAI 47 8 55 6 15.5 ± 5.5 6.01.0 ± 4.3

RAJPOT 50 6 56 10 12.2 ± 3.7 8.6 ± 7.4

SAYYID 79 10 89 6 19.1 ± 5.6 11.6 ± 5.9

SOLANGI 43 7 50 6 15.3 ± 5.8 12.36 ± 6.6

WAZIRS 14 1 15 3 7.2 ± 3.9 8.00 ±5.2

YUSUFZAI 23 1 24 3 11.5 ± 6.6.0 12.0 ± 7.2

Mean 44.31 4.63 48.95 5.04 13.64 ± 5.11 10.80 ± 5.96

S.D 18.3 3.2 20.8 2.5 3.35 ± 0.81 1.90 ± 1.00
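
The two theta columns of Table 3.12 can be illustrated with a minimal Python sketch. It assumes that Theta_S is Watterson's estimator (number of segregating sites S divided by the harmonic sum over n - 1) and that Theta_pi is the mean number of pairwise differences; the thesis does not spell out either formula, so both definitions and all function names below are assumptions made for illustration. Under this assumption, Bugti (S = 15 in Table 3.13, N = 5) gives 7.2, which appears consistent with the tabulated Theta_S.

```python
from itertools import combinations


def watterson_a(n_samples):
    """Normalizing constant a_n = sum of 1/i for i = 1 .. n-1."""
    return sum(1.0 / i for i in range(1, n_samples))


def watterson_theta(segregating_sites, n_samples):
    """Theta_S (assumed here to be Watterson's estimator): S / a_n."""
    if n_samples < 2:
        raise ValueError("need at least two sequences")
    return segregating_sites / watterson_a(n_samples)


def theta_pi(sequences):
    """Mean number of differences over all pairs of aligned sequences."""
    pairs = list(combinations(sequences, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    total = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs)
    return total / len(pairs)


if __name__ == "__main__":
    # Illustrative call with S = 15 and N = 5 (values taken from the tables).
    print(round(watterson_theta(15, 5), 2))
    # Toy alignment, not thesis data.
    print(theta_pi(["AAAA", "AAAT", "AATT"]))
```

The toy alignment in the usage lines is invented; only the S and N values are taken from the tables.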


Table 3.13: Neutrality tests for 22 populations of Pakistan

Chakraborty's test: Obs. Heterozygosity, Exp. no. of alleles. Tajima's D test: S, Pi (π), Tajima's D, Tajima's D p-value.

Population  Obs. Heterozygosity  Exp. no. of alleles  S  Pi (π)  Tajima's D  Tajima's D p-value
ARAIN 0.0016 16.8 67 8.5 -1.736 0.016
BANGASH 0 12 56 11.7 -1.51 0.05
BIJRANI 0 12.2 62 12.3 -1.27 0.08
BUGTI 0 4.09 15 8.2 -0.2 0.48
CHANDIO 0 12 51 11.9 -1.09 0.14
GHULLU 0 12.3 51 12.7 -0.69 0.25
GUJAR 0 14.4 59 10.3 -1.3 0.07
JAT 0 17 67 8.6 -1.7 0.01
KAKAZAI 0 19.9 71 11.8 -1.2 0.1
KHATTAK 0 12.6 62 13.7 -1.3 0.08
KHOSO 0 11.8 59 11.4 -1.5 0.03
LAGHARI 0 3.9 16 7 -0.64 0.35
LASHARI 0 4.3 25 12.4 -0.49 0.4
MAHSUDS 0.022 7.5 32 12.5 -0.28 0.4
MAZARI 0 7.1 29 9.7 -0.99 0.18
NASRANI 0 11.1 49 9.7 -1.3 0.08
ORAKZAI 0 12 55 12 -1.3 0.08
RAJPOT 0.002 16.9 55 8.6 -1.7 0.02
SAYYID 0 19.8 86 11.6 -1.5 0.03
SOLANGI 0 10.1 50 12.36 -1.3 0.09
WAZIRS 0 4.08 15 8 -0.4 0.45
YUSUFZAI 0 4.3 24 12 -0.46 0.43
Mean 0.001 11.2 48 10.8 -1.1 0.17
S.D 0.0004 5.1 20 1.908 0.498 0.165

Mismatch distribution:

The mismatch distribution is the distribution of the observed number of differences between pairs of haplotypes. Figure 3.32 shows the unimodal distribution, consistent with spatial expansion, of the Rajput caste. The X-axis gives the number of differences between pairs of haplotypes and the Y-axis their frequencies; the solid line denotes the observed frequency and the dashed lines show the 95% confidence intervals (α = 0.05).
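
As an illustration of the definition above, the following is a minimal Python sketch that tabulates a mismatch distribution from aligned sequences. The function names and the toy sequences are hypothetical and are not taken from the thesis; the software actually used for figure 3.32 may apply additional steps (for example, fitting an expansion model and simulating confidence intervals) that are not reproduced here.

```python
from collections import Counter
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def mismatch_distribution(sequences):
    """Relative frequency of each number of pairwise differences.

    Returns a list whose index d holds the fraction of sequence pairs
    that differ at exactly d positions.
    """
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(hamming(a, b) for a, b in combinations(sequences, 2))
    n_pairs = sum(counts.values())
    return [counts.get(d, 0) / n_pairs for d in range(max(counts) + 1)]


if __name__ == "__main__":
    toy = ["AAAA", "AAAT", "AATT", "TTTT"]  # invented example data
    for d, freq in enumerate(mismatch_distribution(toy)):
        print(d, round(freq, 3))
```

A single peak in the resulting list corresponds to the unimodal shape described in the text, whereas several separated peaks would correspond to a multimodal distribution.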

Figure 3.32: Mismatch distribution in Rajput population


Haplotype Distance Matrix (HDM):

The HDM (figure 3.33) shows the number of variant alleles between each pair of haplotypes within and between the populations. The black solid lines separate the haplotypes of different populations. The lower left of the graph shows the distance matrix computed between the populations, while the upper left and lower right show the distance matrices within the populations. The color-code legend is given on the right side of the graph.

Figure 3.33: Haplotype distance matrix between/within the 22 ethnic groups of Pakistani population.

Fst Distance Matrix:

The fixation index Fst was measured on the basis of differentiation in single nucleotide polymorphisms. Genetic differentiation was measured within and between populations as described by Hudson et al., (1992), using the formula

$$F_{ST} = 1 - \frac{H_w}{H_b}$$

where $H_w$ is the mean number of differences between sequences sampled from the same population and $H_b$ is the mean number of differences between sequences from two different populations. The average pairwise difference was calculated as the sum of the pairwise differences divided by the number of pairs within the population. The pairwise Fst was used as a short-term genetic distance, with a slight transformation to linearize the distances with divergence time between the populations (Excoffier et al., 2005).
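
The formula above can be sketched in a few lines of Python. This is a hedged illustration only: the function names are hypothetical, the sequences are invented, and H_w is taken here as the unweighted average of the two within-population means, which is an assumption (the software used in the thesis may weight populations differently).

```python
from itertools import combinations, product


def n_differences(a, b):
    """Number of differing positions between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b))


def mean_within(population):
    """Mean pairwise difference among sequences of one population."""
    pairs = list(combinations(population, 2))
    return sum(n_differences(a, b) for a, b in pairs) / len(pairs)


def mean_between(pop_a, pop_b):
    """Mean pairwise difference between sequences of two populations."""
    pairs = list(product(pop_a, pop_b))
    return sum(n_differences(a, b) for a, b in pairs) / len(pairs)


def pairwise_fst(pop_a, pop_b):
    """Fst = 1 - H_w / H_b for one pair of populations."""
    # Assumption: H_w is the simple average of the two within-population means.
    h_w = (mean_within(pop_a) + mean_within(pop_b)) / 2.0
    h_b = mean_between(pop_a, pop_b)
    if h_b == 0:
        raise ValueError("H_b is zero; Fst is undefined")
    return 1.0 - h_w / h_b


if __name__ == "__main__":
    pop_a = ["AAAA", "AAAT"]  # invented example data
    pop_b = ["TTTT", "TTTA"]
    print(round(pairwise_fst(pop_a, pop_b), 3))
```

Note that when the within-population mean exceeds the between-population mean, the ratio H_w / H_b is greater than one and the estimate becomes negative.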

Figure 3.34 shows the pairwise Fst values between each pair of populations. The color-coded legend of Fst values is presented on the right side of the graph.

Figure 3.34: Matrix of pairwise Fst


Average pairwise difference within & between Populations:

The mean pairwise difference is denoted by π. In figure 3.35 the upper half of the matrix (green) shows the average number of pairwise differences between each pair of populations and the lower half (blue) shows the corrected pairwise differences between the populations, while the diagonal (orange) shows the average pairwise differences within each population. The color-code legends are presented on the right side of the graph.

Figure 3.35: Average sum of pairwise differences within and between 22 ethnic groups of Pakistan

Haplotype frequencies in 22 ethnic populations of Pakistan:

The dominant haplotypes in different populations and their comparison with each other are shown in figure 3.36.

Figure 3.36: Haplotype frequencies in 22 isonym populations of Pakistan


Expected Heterozygosity in 22 ethnic populations of Pakistan:

Expected heterozygosity in individuals for a particular locus in all populations is shown in figure 3.37.

Figure 3.37: Expected heterozygosity in 22 ethnic groups of Pakistan

Number of Alleles at different loci:

The number of alleles in 22 populations studied at different loci is shown in figure 3.38.

Figure 3.38: Number of alleles at different loci in 22 isonym groups of Pakistan


Molecular Diversity Indexes:

The values and standard deviations of four different diversity indexes in the 22 isonym populations of Pakistan are shown in figure 3.39. Solid lines of different colors show the values of the different diversity indexes and dashed lines of the same color show the standard deviations. The Y-axis shows the value of “θ” and the X-axis represents each population.

Figure 3.39: Molecular diversity indexes of 22 isonym populations of Pakistan.

Divergence time:

Divergence time was estimated under the assumption that all populations diverged from an ancestral population of size “N” some “T” generations in the past and have remained isolated from each other ever since. The sizes of the daughter populations differ, but they sum to the size of the ancestral population. As the graphs (figures 3.40 and 3.41) show, the average number of pairwise differences between and within populations was used to estimate the divergence time, scaled by the mutation rate. The graphs revealed inter- and intra-population diversities with large variance, which indicates short divergence times. The average diversity within the populations was larger than that observed between the populations, which may lead to negative divergence time estimates.

Figure 3.40: Co-ancestry coefficient matrix


Figure 3.41: Slatkin’s linearized Fst values

Detection of loci under selection:

To examine the distribution of Fst against heterozygosity within populations, a graph was plotted (figure 3.42) in which the observed loci are presented as small circles and the dashed lines represent simulated data. Loci significant at the 5% level are indicated by blue circles, and loci significant at the 1% level by red circles.

Figure 3.42: Detection of loci under selection from genome scans based on Fst


DISCUSSION

Pakistan is a historiographical and cultural wonderland, mainly because of its geostrategic location in the Indian subcontinent. Geographically, it is situated at the crossroads of Asia, at the junction of the Middle East, Central Asia, and Southeast Asia. This unique landscape served as a primal highway for human movement through an important core trading route, “The Silk Route”, that linked Asia and the Mediterranean Basin (Quintana-Murci et al., 1999). Moreover, the southern coast of the Persian Gulf, the Makran coast of Pakistan and the territory of Afghanistan served as conduits for human dispersal (Derenko et al., 2013). However, climatic changes over time have had a great impact on human migration and turned regions that once served as gateways into barriers, for instance the Strait of Hormuz (Cadenas et al., 2008).

First, the Indian subcontinent became a hot spot of mitochondrial DNA divergence because of the Pleistocene population expansion; approximately 60% of the overall human population lived in that region around 38,000 years ago (Atkinson et al., 2008).

Second, migratory events at several different time periods along the Silk Route between the widespread empires of China and India contributed much to population diversity (Kong et al., 2003). Third, the “Fertile Crescent” (the area between Southeast Anatolia and the Zagros mountains) had a history of urban civilization as early as 3,000 ybp (years before present) and served as a channel for human migration between Mesopotamia and the Iranian Plateau, including the north-western part of Pakistan (Di Cristofaro et al., 2013).

Pakistan is one of the South Asian countries teeming with archaeological remains of bygone eras, dating from the Lower Paleolithic “Stone Age Culture” (3.3 million years ago) explored in Punjab Province, through the Neolithic (7000-2000 BC) Mehrgarh civilization excavated in Baluchistan Province, to the Iron Age (1200 BC) that bloomed in Khyber Pakhtunkhwa (KPK) Province (Ahmed, 2014).

The present study was designed to carry out a comprehensive survey of the control region of mtDNA in ethnic / caste / isonym populations of Pakistan, to determine the distribution of various haplogroups, genetic diversity, ethnolinguistic relationships and matrilineal complexity. The sequence data correspond to 500 individuals belonging to 22 different ethnic groups of Pakistan. The mean number of nucleotide substitutions was 48.95 ± 20.8, with transitions (44.31 ± 18.3), transversions (4.63 ± 3.2) and indels (5.04 ± 2.5) observed in the different caste and tribal populations of Pakistan. A total of 412 unique haplotypes were observed. It was evident from the present study that most of the haplotypes were singletons, as described by Metspalu et al., (2004) and Shoo et al., (2006).

Haplotype Distribution:

About 25.6% of the haplotypes were shared by more than one individual and 75.4% were present only as singletons (Fig 3.36). The maximum number of shared haplotypes, five, was found between Arain, Sayyid and Solangi. In Baluchistan about 20 haplotypes were observed, of which 12 were unique. Quintana-Murci et al., (2004), however, observed a higher number in the Balochi population, about 26 haplotypes, of which 18 were unique. Moreover, Siddiqi et al., (2015) observed 70 different haplotypes in the Balochi (Makrani) people, of which 54 were unique and sixteen were shared by more than one individual. In contrast, Whale, (2012) found 12 haplotypes in the Balochi population, all of which were unique.


In KPK (NWFP and FATA) 49 haplotypes were found, of which 27 were unique and 10 were shared by more than one individual. However, Rakha et al., (2011) observed 157 haplotypes in KPK, of which 128 were unique. Quintana-Murci et al., (2004) reported 39 haplotypes, 35 of them unique, whereas Cordaux et al., (2003) reported 36 haplotypes, of which 30 were unique, among Pashtuns.

In Punjab (Pakistan) 237 haplotypes were identified, of which 153 were unique, whereas Kivisild et al., (1999) noted 92 haplotypes in the Punjabi population, of which 85 were unique. In Sindh 93 haplotypes were found in our study, of which 40 were unique. Hayat et al., (2015), in contrast, observed 63 different haplotypes in the Sindhi population, of which 58 were unique and 5 were shared by more than one individual. Furthermore, Quintana-Murci et al., (2004) identified 21 haplotypes in the Sindhi population, of which 19 were unique.

It is noteworthy that when the sample size for an observed population increases, the opportunity for multiple polymorphic variations and combinations also increases, which increases the number of haplotypes within the observed population. For shorter mtDNA sequences, however, haplotype diversity decreases as the sample size increases, and vice versa.

Haplotype Diversities:

The twenty-two ethnic groups exhibited high haplotype diversity; the lowest value was observed in the Balochi ethnic group of Mazari (0.991) and the highest in Wazirs (0.999) and Yusufzai (0.999). The high haplotype diversity in these ethnic groups was due to the relatively small number of samples studied and the few shared haplotypes.
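
The effect of a small sample with few shared haplotypes can be illustrated with a short Python sketch. It assumes Nei's unbiased estimator of haplotype (gene) diversity, h = n / (n - 1) × (1 - Σ p_i²); the thesis does not state which estimator was used, so this choice, the function name and the toy labels are assumptions made for illustration.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity for a list of haplotype labels."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    # Relative frequency of each distinct haplotype in the sample.
    freqs = [count / n for count in Counter(haplotypes).values()]
    return n / (n - 1) * (1.0 - sum(p * p for p in freqs))


if __name__ == "__main__":
    # Five samples, all with different haplotypes (invented labels).
    print(haplotype_diversity(["h1", "h2", "h3", "h4", "h5"]))
    # Four samples, two of which share a haplotype.
    print(haplotype_diversity(["h1", "h1", "h2", "h3"]))
```

With this estimator, any sample in which every individual carries a different haplotype reaches the maximum value regardless of sample size, which is one way to read the very high values reported for the groups with only five samples.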

In Baluchistan Province, where various indigenous tribes are present, Bugti, Laghari and Lashari showed a high index of genetic variability (0.999), whereas Mazari showed the least (0.991). In contrast, Quintana-Murci et al., (2004) observed lower genetic variability (0.974) in the inhabitants of Baluchistan, without mentioning their tribal affinities, such as Brahui 0.952, Lur 0.978 and Makrani 0.975. The same was the case for other neighboring populations such as Hunza Burusho (Hunza population) 0.980, Klasians 0.830, Kurdish (Iraq) 0.972 and Shugnan (Iran) 0.985. Our Balochi population, by contrast, is in conformity with the Afghani Baloch (0.991), Persians (0.992), Turkish (0.992) and Caucasus (0.992). However, Siddiqi et al., (2015) found a genetic diversity of 0.968 in the Makrani population of Baluchistan (100 samples), which was less than the 0.975 reported by Quintana-Murci et al., (2004) (sample size 75). This difference may be due to the sample size and collection sites.

The Central Asian populations exhibited genetic diversities from 0.984 to 0.995, such as Uzbek (0.991), Turkmen (0.989), Kazakh (0.990), Kirghiz (0.991) and Uighur (0.995), in agreement with our results (Comas et al., 1998). The European populations demonstrated somewhat lower haplotype diversity, ranging from 0.936 (Danes) to 0.984 (Bavarians), and East Asian populations ranged from 0.947 (Ainu) to 0.993 (Chinese) (Comas et al., 1998).

Northern Asian populations exhibited genetic diversity ranging from 0.990 to 0.993, such as Kurds (0.993), Buryats (0.990), Mongolians (0.991) and Khamnigans (0.990), which is the same as we observed in the Balochi ethnic group; however, Koreans (0.984), East Evenks (0.902), West Evenks (0.953), Yakuts (0.898), Shors (0.839), Telenghits (0.986), Chukchi (0.781) and Teleuts (0.980) have moderate to high genetic diversity (Derenko et al., 2007). Northeast Asians such as Koreans (0.998), Chinese (0.999), Mongolians (0.999), Manchurian (0.997), Han (1.000), Thai (1.000) and Vietnamese (0.991) presented high genetic diversity (Jin et al., 2009). Moreover, Sri Lankan populations such as Malay (0.989) and Vedda (0.942) showed less genetic diversity than Balochi (0.991), Sinhalese (0.998), Sri Lankan Tamils (0.998) and Sri Lankan Muslims (0.999) (Ranasinghe et al., 2015).

The Hindu Kush mountain range, the Indus valley plain and the Iranian Sijistan plateau form a triangle inhabited by an assorted group of tribesmen scattered over its entire length and base. These tribesmen, known as “Pashtuns”, are the largest tribal conglomeration in the world, with populations in Afghanistan and Pakistan. They have an Indo-European origin, with genetic diversities of Bangash (0.994), Khattak (0.992), Orakazai (0.994) and Yusufzai (0.999). Quintana-Murci et al., (2004) found a genetic diversity of 0.993 in the Pashtun. In KPK, where the female population is smaller than the male population, and because of cultural customs, many people cannot afford to take a wife from their own tribe and marry outside it, which brings in genetic diversity.

Our results were further supported by Cordaux et al., (2003) and Rakha et al., (2011), who reported genetic diversities in Pashtuns of 0.994 and 0.993, respectively. Other South Asian populations showed similar genetic diversity, such as Afghani Baloch (0.990), Hazara (0.997), Afghani Pashtuns (1.000) and Tajiks (0.980) (Whale, 2012).

The same trend of genetic diversity has been seen in the Central Asian populations of Uzbek (0.991), Kirghiz (0.991) and Uighur (0.995), in agreement with our results, but with lower diversity in Turkmen (0.989) and Kazakh (0.990) (Comas et al., 1998).

Iranian populations showed comparatively higher genetic diversity, e.g., Persian (0.999), Qashqais (0.996) and Azeris (1.000), than the KPK tribal population. The higher genetic diversity in Iran may be due to free intermarriage with other ethnic groups that have similar linguistic and religious bonds (Derenko et al., 2013).

In Punjab the ethnic groups, especially Arain, Jat, Gujar, Rajpot, Sayyid and Kakazai, have an Indo-European origin. High genetic diversity was exhibited by Gujar (0.998) and Rajput (0.997), which is in agreement with Kivisild et al., (2003) and Quintana-Murci et al., (2004). Similarly, neighboring populations such as Kshatriya (0.997), Hindu Rajput (0.997) and Brahmin Rajput (0.985) exhibited similar genetic diversity (Metspalu et al., 2004). However, Andhra Brahmin Rajput (0.985), Karnataka Brahmin Rajput (0.986), Maharashtra Brahmin Rajput (0.899), Uttar Pradesh Brahmin Rajput (0.989) and West Bengal Brahmin Rajput (0.977) showed a lower degree of genetic diversity.

In the present study Sayyid showed a genetic diversity of 0.993, which is much higher than that observed in Shia Sayyid (0.917), Sunni Sayyid (0.883) and Iranian Shia Sayyid (0.928) (Eaaswarkhanth et al., 2010). This may be due to the admixture of Sayyid with other ethnic groups: they bring in females from other groups but do not give their own females to non-Sayyid. It is evident that Sayyid are not part of the local population.

Historically, Pakistani Sayyid arrived with the military invasions of Muhammad bin Qasim and Mahmud Ghazni for the establishment of the Muslim empire. There was subsequent migration of mercenaries, businessmen and Islamic scholars (mostly Sayyid) from Arabia, Iran and the Middle East, who settled in different regions of the Indian subcontinent (Schimmel, 1982). The imprints of Muslim scholars are still present in the form of the graves of Sufi saints (all are Hashmi Sayyid) in different territories of Pakistan and India. Predominantly (990-1077 CE), Sayyid Abdul Rahman Jelani (1025 to 1141 CE), Abdul Qadir Jilani (1077-1166 CE), Moinuddin Chisti (1141-1230 CE), Baba Fakruddin (1169-1295 CE), Lal Shahbaz Qalander (1177-1274) etc. are the renowned Muslim Sayyid scholars who lived in the Indian subcontinent. Admixture of Sayyid increased with the partition of Pakistan and India: during the migration, many non-Sayyid communities claimed to be Sayyid to gain a prestigious position in society (Abun-Nasr, 2013).

Kakazai of the Punjab showed a genetic diversity (0.996) very close to that of Uighur (0.995), whereas Turkmen (0.989) and Kazakh (0.990) of Central Asia showed lower genetic diversity (Comas et al., 1998).

In Sindh Province the lowest genetic diversity was shown by Solangi (0.992) and Khoso (0.992), and the highest by Bijarani (0.998), Chandio (0.997) and Ghallu (0.994), in contrast to Hayat et al., (2015), who observed a genetic diversity of 0.957 in the Saraiki population of Sindh.

Overall the haplotype diversity was significantly higher (Z = 3.69, P < 0.01) in the Punjab caste groups than in the tribal system of KPK, FATA and Balochistan (Z = 2.35, P < 0.01). These observations are consistent with those of Rakha et al., (2011) in their population study of Pathans in KPK and FATA. The genetic configuration was further supported by the analysis of mean pairwise differences. The Pashtun ethnic groups of KPK showed a mean pairwise difference (5.53) very close to that observed by Whale, (2012) in Afghan Pashtuns (5.51); this suggests that the Pashtun have a Central Asian origin with an admixture of West Eurasians.

The tribal populations of Sindh (Nasrani, Ghallu, Khoso and Bijarani) have a mean pairwise difference (6.21) similar to that (6.29) observed in the Hazara population of Afghanistan (Whale, 2012). This might be explained by the numerous settlement events of Afghan migrants. First, Shah Waliullah Dehlvi (1703-1762) invited Ahmed Shah Abdali (ruler of Afghanistan), who invaded the Indian subcontinent and rescued the Muslims from British rule. He settled his people in different regions of the subcontinent, which played a key role in population admixture (Sluglett, 2015). Second, refugees entered Pakistan during the Afghan-Russian war: more than 20 million Afghan refugees entered Pakistan, settled in different regions of the country and contributed to reshaping the mtDNA landscape (Grau, 2015).

When we considered neutrality indices such as Fu's Fs statistic and Tajima’s D, they gave negative values; negative Tajima’s D indicates an excess of rare segregating sites and negative Fs indicates an excess of rare haplotypes compared with the neutral model. The highest values of Tajima’s D were observed in Mahsuds (-0.28), Lashari (-0.49), Laghari (-0.64) and Mazari (-0.99), and the lowest in Arain (-1.73), Gujar (-1.74), Khoso (-1.59) and Rajpot (-1.78). This reflects the demographic expansion history of Pakistan, in which different invaders at different time periods contributed to the mtDNA pool of South Asia. The negative values of Fs, which differ significantly from zero, also support demographic expansion. Studies have shown that genetic influx from the Fertile Crescent to the Indian subcontinent was more frequent than from east to west. Accordingly, Pakistan, throughout its extensive history, came under the sovereignty of Persian influence (550 BC), the Macedonian dynasty (330 BC), the Mauryan Empire (322 BC), the Kushana monarchy (250 BC), the Kabul Shahi realm (1000 AD), the Ghaznavid invasion (997 AD), the Gurkani domain of the Turko-Mongols (1200 AD), the Yuan Dynasty (1271 AD) and the British Empire (18th century). The unification of these conglomerates introduced a unique but complex socio-cultural structure and a strong tribal system in Pakistan. The MDS and phylogenetic analyses exhibit the same pattern, in which the tribe and caste populations clustered together, indicating their close relationship and population admixture. The mismatch distribution was calculated for all caste and tribal populations and was unimodal in each case. This was interpreted as a sign of demographic expansion; no multimodal distribution was observed. Moreover, the value of the raggedness index was in every case lower than 0.05, which suggests demographic expansion in all ethnic groups of Pakistan.

Haplogroups of mtDNA:

Haplogroup M:

There is a native proverb: “Every two miles the water changes, every four miles the speech.” The people of the Indian subcontinent have 120 languages and 220 mother tongues, which reflects the diverse demography and ethno-linguistic culture of the subcontinent. In the present study macro-haplogroup M was observed more frequently in Punjab (45%) and KPK (43.6%) than in Baluchistan (20%) and Sindh (31.3%), while Rakha et al., (2011) observed 39.9% M haplogroup in KPK. Our result for the Punjabi population is similar to that of Kivisild et al., (2003), who observed the M haplogroup in about 46% of the Punjabi population. In contrast, the Saraiki population of Pakistan showed a lower M haplogroup frequency (10.5%) and a high frequency of West Eurasian haplotypes (Hayat et al., 2015). This suggests that the genetic mosaic of the Saraiki population received more influence from West Eurasian invaders coming from Central Asia and Europe. Other Pakistani populations showed moderate to high M haplogroup frequencies, such as Balochi (33.3%), Brahui (21.1%), Nasrani (54.5%), Sindhi (30.4%), Karachi (Pakistan) (47.0%), Pashtun of NWFP (29.5%), Hunzaid people (22.7%) and Gujarati people (44.1%) (Quintana-Murci et al., 2004). The frequency of the M haplogroup observed by Kivisild et al., (1999) was about 55% in the Punjabi population, 38.0% in the Pakistani population and 26% in the Kashmiri population.

Neighboring populations such as the Hazara, Baloch and Pashtun populations of Afghanistan revealed haplogroup M frequencies of about 15%, 13.3% and 7.1%, respectively, whereas Tajik, Irani, Turkmen, Koremian Uzbek, Mazandaran, Iranian Kurds, Bukharin Arabs and Turkish showed a haplogroup M frequency of 0% (Whale, 2012; Quintana-Murci et al., 2004). This indicates that they all have West Eurasian and East Asian origins, with no ancestry contributed by South Asians. In the present study the most distinct haplogroups are M2, M3, M4, M5, M6, M18, M30, M33, M35, M37, M45, M49 and M52.

Haplogroup M2 is the most ancient haplogroup in the Indian subcontinent, characterized by the motif 447G and 16270T-16319A, and about one tenth of the M haplogroup lineages fall into M2 (Kivisild et al., 2003). In the present study M2 was most prevalent in the KPK ethnic groups Orakazai (6.1%) and Khattak (4.6%). Similar results were reported by Rakha et al., (2011) in Pashtuns, who have 4.7% M2 haplogroup. Sindhi groups, however, showed lower frequencies, such as Chandio (2.6%) and Solangi (0.8%), similar to that observed by Quintana-Murci et al., (2004) in the Sindhi population (2.64%).

In Punjab, haplogroup M2 was found at minor frequencies, such as Kakazai (0.35%) and Sayyid (0.7%). This is in agreement with Kivisild et al., (2003) and Hayat et al., (2015), who reported M2 frequencies of about 1% in the Punjabi population and 1.1% in the Saraiki population of Pakistan, respectively. In contrast, Indian populations showed very high frequencies of the M2 haplogroup, such as Maharashtr Andh (26%), Karnatak-Betta Kuruba (64%), Kacahri (26%), Orissa (15.6%), Kahodi (29%), Madia (32%) and South Indians (20.48%) (Chandrasekar et al., 2009; Rani et al., 2010).

The M2 lineage is the oldest M lineage found in the Indian subcontinent, with an approximate age of 64,000 to 70,000 ybp. This period is considered the Lower Paleolithic era of the Stone Age “Soan Culture” that was predominant in the subcontinent (Sørensen, 2015). The presence of the M2 haplogroup in Ethiopia led to the idea that M2 originated in East Africa approximately 60,000 ybp and later migrated towards Asia (Quintana-Murci et al., 1999).

Pagani et al., (2015) suggested that the “Out of Africa” migration took place through the northern route. However, Gronau et al., (2011) estimated through Bayesian inference that the South African (San) population diverged from other human populations approximately 108,000 to 157,000 ybp.

Haplogroup M3 was defined by the motif 16126, 482 and 489 and was the most abundant subclade of the M haplogroup in KPK (26%), Baluchistan (16%), Punjab (4.5%) and Sindh (9.4%). The highest frequencies were observed in KPK, in Bangash (10.7%) and Orakazai (9.2%), and the lowest in the Punjabi Sayyid (0.7%), Jat (0.35%) and Gujar (1.07%). Among Balochi populations, Solangi exhibited 0.8% and Ghallu, Chandio and Khoso showed 2.6%. Rakha et al., (2011) observed an M3 frequency of about 9.5% in the Pashtun of KPK, while Quintana-Murci et al., (2004) reported a frequency of about 12.5% in Pashtun. This is supported by the idea of Cordaux et al., (2004) that the paternal ancestry of the Indian subcontinent is more closely related to Central Asians, its people being descendants of Central Asian and Indo-European migrants (Pereltsvaig and Martin, 2015). In Indian tribes, however, Thanseem et al., (2006) reported that M3 was found at fairly high frequencies (17%) in the Pardhan and Naikpod tribes. M3 was concentrated in northwestern India (22%), Rajput (14%) and Brahmin (16%), as mentioned by Metspalu et al., (2004).

The coalescence time of M3 was estimated to be 27,100 ± 10,200 and 16,400 ± 6100 ybp (Thangaraj et al., 2006). This era corresponds to the Middle Stellenbosch, Upper Clacton, Upper Stellenbosch and Upper Acheulian cultures, which are found in the Potohar region of Pakistan and contributed to the amalgamation of different cultures in South Asia (Aslamkhan, 1996).

Whale, (2012) worked on Afghani Pashtun and Afghani Balochi and found haplogroup M3 frequencies of 13.5% and 12.9%, respectively. Tajik and Turkic peoples have a null frequency of the M3 haplogroup, while the Kyrgyz have a frequency of about 10.2%; this was a result of admixture in the Kyrgyz through gene flow from Eastern to Western Eurasia during the Paleolithic period (Palstra et al., 2015).


Haplogroup M4 was defined by motif 16311 of the control region, with an estimated age of about 2500,700 ± 8100 ybp calculated by Thangaraj et al., (2006). Haplogroup M4 showed the lowest frequencies in Jat (0.6%) and Rajput (0.4%) of Punjab, Pakistan. This haplogroup has also been reported among the Dhangars Jat and Chitpavan Brahman (Rajput) of Maharashtra state, India, at a frequency of about 4.2% (Gaikwad and Kashyap, 2005).

The coalescence time of the M5 haplogroup, as calculated by Thangaraj et al., (2006), is about 52,000 ± 14,600 ybp. Haplogroup M5 was defined by motif 16129 and was observed at 6.4% among the ethnic groups of Punjab, 7.8% in the tribal groups of Sindh, 7.6% in KPK and 1.1% in Baluchistan. Basu et al., (2003) and Sun et al., (2006) reported haplogroup M5 frequencies in the Chaturvedi (Brahman Rajput) population of Uttar Pradesh (2.1%), Rajput of Bihar (2.9%) and Rajput of Karnataka (1.1%). It was predicted that migrants did not only follow the southern route out of Africa, but that inland dispersal also existed and was adopted by anatomically modern humans to enter and colonize the interior of East Asia (Li et al., 2015). Haplogroup M5 is also prevalent in three leading caste populations of Maharashtra, namely Maratha Rajput (4.9%), Desasth Brahmin Rajput (2.45%) and Chitpavan Brahmin Rajput (3.5%) (Gaikwad and Kashyap, 2005).

Phylogeographic analysis has suggested that the M5 haplogroup is the most abundant in all Roma (Gypsy) populations, including Hungarian, Bulgarian, Iberian and Balkan Roma (ranging from 6 to 29%). These constitute a founder population that spread across West Asia and whose provenance might be traced back to the Indian subcontinent, particularly Pakistan (Malyarchuk et al., 2008). Most of the sequences reported in the Roma are rare in the European population, which implies that most of the M5 lineages already existed in the Roma before their arrival in Europe.

Under these suppositions, Punjab might be considered a putative homeland of some lineages of the Roma diaspora (Mendizabal, 2008). This finding is in agreement with previous demographic, cultural and linguistic evidence, especially the genetic hint provided by the identification of a specific private mutation in the Roma that is shared by the Jat caste in Punjab, Pakistan (Ali et al., 2009). This suggests that the Jat contributed some of the maternal lineages of the Roma population, who followed a migration route through Persia, Armenia, Greece and the Slavic-speaking regions of the Balkans and finally reached Europe (Fraser, 1995).

Rakha et al., (2011) observed 4.1% M5 in the Pashtun of KPK, whereas Hayat et al., (2015) reported a high frequency (15.29%) in the Saraiki population of Pakistan.

Haplogroup M6a was identified by the motif 16231-16362 and was mostly observed in ethnic groups of the Punjab. It is infrequent, present at approximately 1% in each of the Rajput and Kakazai.
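
Motif-based assignment of the kind described for M6a can be sketched in a few lines. This is only an illustration, assuming that each sample is represented by the set of control region positions at which it differs from the reference sequence; the table and function names are hypothetical, and only the M6a motif (16231-16362) is taken from the text. A match should be read as a candidate assignment, not a confirmed haplogroup.

```python
# Hypothetical motif table. Only the M6a motif (16231-16362) comes from the text;
# further entries would have to be added from a curated phylogeny.
MOTIFS = {"M6a": frozenset({16231, 16362})}


def candidate_haplogroups(variant_positions, motifs=MOTIFS):
    """Return the haplogroups whose motif positions all occur in the sample."""
    observed = set(variant_positions)
    return sorted(name for name, motif in motifs.items() if motif <= observed)


# A sample carrying both motif positions is flagged as a candidate M6a.
print(candidate_haplogroups([16223, 16231, 16362]))  # ['M6a']
# A sample carrying only one of the two positions is not.
print(candidate_haplogroups([16223, 16231]))  # []
```

The subset test treats the motif as necessary but not sufficient, which is why the result is called a candidate.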

Haplogroup M7 is mainly present in northeast Asian populations, but was found at low frequency in the Jat (0.07%) and Rajput (0.35%). This agrees with Rakha et al. (2011), who observed 0.4% in KPK, and Hayat et al. (2015), who observed 0% in the Saraiki population of Pakistan. It is, however, more frequently observed in the Mongolian (2.1%) and Korean (10.3%) populations (Jin et al., 2009). It has been noted that Mongolians have a close genetic affinity with the Rajput of Punjab. The Mughal, an ethnic group of Pakistan, are predominantly direct descendants of the Mongols (Dashtseren, 2015).

Haplogroup M18, a sub-lineage of M30, was found in the Arian (1.07%) and Jat (1%), whereas M30 was found at 0.07% in the Rajput and 0.35% in the Sayyid. Rakha et al. (2011) observed M30 at 5.6% in KPK, and Hayat et al. (2015) observed 2.2% in the Saraiki population of Pakistan.

In addition, M34, M35, M38, M40, M41, M49 and M52 were present at very low frequency in our study. It is noteworthy that the classification of these haplogroups requires further confirmation by sequencing of coding region variants, as control region sequencing alone is insufficient to assign an exact haplogroup to these lineages.

Haplogroup N

Haplogroup N derived from the African lineage L3, with an estimated age of about 64.6 ± 608 kilo years (Kong et al., 2003). Its age was about 49,200 to 75,000 ybp in West Eurasia, 71,200 ybp in South Asia and 58,200 ybp in East Asia (Palanichamy et al., 2015). Haplogroup N is considered a southwest Eurasian lineage and is the ancestor of numerous haplogroups found in Europe, the Middle East, Asia and the Americas (Kivisild et al., 1999; Nasidze et al., 2008). Haplogroup N was present in 10.7% of the Punjabi population, 2.9% of the Sindhi population and 0% in KPK; a low frequency of 0.8% was observed by Rakha et al. (2011) in KPK Pashtuns. Hayat et al. (2015) reported a frequency of about 1.1% for N in the Saraiki population.

Further, it is found at 5% in the Iraqi population, at 23 to 44% in different Iranian populations, and at 10% in each of Israel, Syria and Jordan (Nasidze et al., 2008; Tian et al., 2015; Juhász et al., 2015). In Central Asia, haplogroup N was about 2.3% in Tajiks and 20% in Iranians of the south Caspian region. Overall, a greater frequency was observed in West Eurasian populations than in South Asian and Central Asian populations, and the highest frequency was observed in the Australian population (28.5%).

It is noteworthy that the coalescence age of haplogroup N represents the oldest divergence, at around 76 kilo years, with a spread into Central China and Inner Mongolia. There is genetic evidence that the peopling of Sahul (Australia) was the result of an earlier migration from East Africa through the Indian subcontinent along the “Southern Exit Route”, and that its people show a close genetic affinity with the people of the Indian subcontinent. Recently, Fregel and coworkers (2015) reinforced the existence of an additional northern route “Out of Africa” through the Levantine corridor that did not involve the Arabian Peninsula or the Indian subcontinent. This deduction was based on phylogeographical, archaeological and anthropological evidence for mtDNA haplogroup N (Van der Made, 2011; Larrasoaña et al., 2013; Xing et al., 2015). This route is further supported by the Aterian stone industry, which expanded throughout North Africa and the Sahara desert, from the Nile Valley outward into the Levant (Scerri, 2012).

In addition to these observations, a very close genetic affinity between Aterian and Levantine (early Homo sapiens) skulls was reported by Holliday (2014). Furthermore, the archaeological record, together with biogeographical, genetic and bioclimatic evidence, suggests that early anatomically modern humans originated in Africa, but that their nurseries might have been first in South Siberia (Dryomov et al., 2015) and the North West China core, and later in Southeast Asia.

Haplogroup R:

Haplogroup R (a sub-clade of N) embraces the majority of the West Asian and European haplogroups, U (18%), T (2%), W (3%), J (4%), R0 (13%) and H (5%), which have different frequencies in Pakistani ethnic groups. The overwhelming bulk of the R lineages (67%) in the Pashtun have a clear European and West Asian provenance. Paleolithic and Neolithic expansions of Caucasians that reached South Asia via the Iranian plateau, and possibly via Arabian Sea maritime routes, could be the basis of this genetic influx. At the same time, some mtDNA R lineages either underwent their main expansion in this region or are autochthonous to Pakistan. Among them is subclade R2, which is more pronounced in southern Pakistan and India and less frequent in most of the adjacent regions: the Iranian Plateau, Central Asia, the Arabian Peninsula, the Near East and the Caucasus. Haplogroup R2 was observed at 8% in the Balochi population and at lower frequency in Punjab (0.7%) and KPK (3%), while Rakha et al. (2011) noted a higher frequency in KPK Pashtuns (6%). Quintana-Murci et al. (2004) observed 7.5% in the Karachi population of Pakistan, 7.7% in the Balochi population and 0% in the Sindhi population, which is very close to our observations.

Kivisild et al. (1999) found haplogroup R2 at about 5.0% in the Pakistani population, 3.5% in the Kashmiri population and 2.1% in a Punjabi population of Pakistan. Sharma et al. (2009) observed R2 in Maharashtra Brahmins (3.33%) and Punjabi Brahmins (3.39%), and at much higher frequency in Gujarati Brahmins (9.38%) and Jammu-Kashmir Gujjars (8.16%).

Haplogroup U:

Haplogroup U is not considered East Eurasian, as it is believed to be a West Eurasian-specific haplogroup. Surprisingly, it is the second most common haplogroup in the Indian subcontinent, as it is in Europe, with an estimated age of 51,000 to 67,000 years (Kivisild et al., 2009). Nevertheless, there is a profound difference in the demographic spread of the U clusters (U2a, U2b and U2c) between Europe and Indo-Pakistan. U2 is predominantly a sub-cluster specific to Indo-Pakistan rather than Europe, and it differs from the western Eurasian type by a transversion at nucleotide position 16129 (Quintana-Murci et al., 2004).

The South Asian influence is primarily characterized by a nodal type of the autochthonous haplogroup U and its three sister sub-clades (U2a, U2b and U2c), which are distributed across the Indo-Pakistan region and are completely restricted to South Asia, being absent from Europe, the Iranian plateau and Central Asian populations (Quintana-Murci et al., 2004). This idea is further strengthened by the hypothesis of two waves of migration to India: a southern migration of the ancestors of haplogroup M (Mongols), and a migration of the ancestors of haplogroup U (Caucasians), as haplogroup U is completely absent in the Mongol population. Nevertheless, there is an extensive difference in the demographic spread of the U clusters (U2a, U2b and U2c) between Europe and Indo-Pakistan.

U2a, U2b and U2c are predominantly sub-clusters specific to South Asia. In the Sindhi population, U2a was predominant in the Ghallu (1.7%), whereas U2b was found in the Nasrani (2.6%) and at 1.7% in each of the Ghallu, Khoso and Chandio. Additionally, U2c was prevalent in the Solangi (1.7%) and Khoso (1.7%) (Fig. 3), similar to the description by Quintana-Murci et al. (2004). Haplogroup U2 differs from the European and western Eurasian types by a transversion at the basal nucleotide position 16129. In addition, haplogroup U7 also provides essential clues to the genetic variation in the Sindhi population, where it was present at a somewhat higher frequency (2.14%). In Baluchistan, U2 was present at 12% (Mazari 8% and Bugti 4%); in Punjab its frequency was 8.2% (Kakazai 2.5%, Sayyid 2.8%, Arian 1.42% and Rajput 1.48%).
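
The regional U2 figures quoted for Baluchistan and Punjab equal the sums of the per-group values. A short check, assuming that each per-group figure is a share of the whole regional sample and using only the percentages given in the text (the function and variable names are illustrative):

```python
def regional_total(group_percentages):
    """Sum per-group haplogroup percentages into a regional total (2 decimals)."""
    return round(sum(group_percentages.values()), 2)


# Per-group U2 percentages as quoted in the text.
punjab_u2 = {"Kakazai": 2.5, "Sayyid": 2.8, "Arian": 1.42, "Rajput": 1.48}
baluchistan_u2 = {"Mazari": 8.0, "Bugti": 4.0}

print(regional_total(punjab_u2))       # 8.2
print(regional_total(baluchistan_u2))  # 12.0
```

Rounding to two decimals only suppresses floating-point noise; it does not change the reported values.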

The overall frequencies of haplogroups U5 and U6 were about 3.2% and 1%, respectively, in the Sindhi and Balochi populations of Pakistan. By comparison, Quintana-Murci et al. (2004) noted frequencies of U5 and U6 in Pakistani Balochi and Sindhi populations of about 2.6% and 8.7%.

Another subclade of U is U7, present in 4.06% of the overall population of Pakistan. Unlike U2, it is also present in Europe and Central Asia (Comas et al., 1998). The coalescence age of the U7 sub-cluster is about 32,000 ± 5,500 ybp (Forster et al., 1996), which makes it younger than U2.

Haplogroup H

Haplogroup H was prevalent at 10% across the tribal groups overall, which differs from West Asia (45%) and the Near East (25%). The frequency of haplogroup H was 5% in the Mahsud, 2% in the Bangash and 3% in the Orakzai, whereas neighboring populations such as the Iranian (14.3%), Uzbek (21.4%), Turkmen (22%) and Tajik (29.5%) showed high frequencies. Other Pakistani populations exhibited a very high frequency of haplogroup H, i.e., Sindhi (28%), Brahui (26.3%) and Balochi (20.5%), but a moderate frequency in the Burusho (12.3%) and Hazara (13%) populations (Quintana-Murci et al., 2004). Notably, under these suppositions it is hard to distinguish sequential gene flow or expansions at the population level, because the most recent migration (British colonization in the 18th century) could have carried both early and derived lineages, which contributed considerably to the gene pool of the Pashtun.

Besides the aforementioned haplogroups, analysis of the specific haplogroups J and K (descended from U8) revealed an interesting geographical and chronological distribution in the Pashtun. The presence of the principal subclade K1a1b1a at 2% in the Bangash and 5% in the Khattak, along with J1b at 4% in the Khattak, reflects a deep conglomeration of Jewish and European ancestry. It is worth mentioning that the expansion of the Jewish community, especially the Ashkenazi, across the Roman Empire and the Iranian Plateau dates to around 1000 BC. However, following their expulsion from Western Europe during the fifteenth century, they dispersed into Eastern Europe and radiated as far as Afghanistan, North West Pakistan and India. Yet the Pakistani and Indian Jewish communities are an early branch of the Jewish Diaspora, with several unique socio-cultural features, a complex history and minor Middle East-specific ancestry components.

Designation of Haplogroups on the basis of coding region SNP analysis:

A multiplex assay was developed in the mtDNA coding region to define the major South Asian haplogroups M, N and R and the West Eurasian haplogroups H, J, U and T (Van Oven et al., 2009) in two successive genotyping assays, A and B. Individuals of known haplogroup were selected on the basis of control region sequencing and subjected to further analysis of the coding region. It was noteworthy that the size determined by the automated sequencer and the real size of the DNA product were slightly dissimilar. Several factors affect the mobility of the DNA product, such as the size of the product, differences in electrophoretic mobility and the fluorescent dye used to label the extended primer. The mobility of a fragment labeled with the green dye (dR6G) is slower than that of a fragment labeled with the blue dye (dR110), as seen for 10400 C/T. Moreover, the mobility of fragments labeled with the red dye (dROX) is slightly slower than that of the same fragment labeled with the black dye (dTAMRA), as seen for 12507 T/C; this agrees with Rakha et al. (2011).

In addition, some fragments showed stronger fluorescent signals than others in the electropherograms, which is due to the minisequencing chemistry of the added ddNTPs. In general, the blue dye gave stronger and higher signal peaks than any other dye. Multiplex A resolved 85% of the samples, as they carried major haplogroups of South Asian origin; the rest were analyzed with Multiplex B.
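
The two-step workflow, in which samples not resolved by Multiplex A are passed on to Multiplex B, can be sketched as a simple triage. This is a minimal illustration and not the genotyping software used in the study. It assumes that the outcome of assay A is recorded per sample as a haplogroup label, or as `None` when assay A made no assignment; the function name, sample IDs and calls below are hypothetical.

```python
def triage(multiplex_a_calls):
    """Split samples into those resolved by assay A and those forwarded to assay B.

    multiplex_a_calls maps a sample ID to the haplogroup assigned by assay A,
    or to None when assay A did not assign one (hypothetical convention).
    """
    resolved = {s: hg for s, hg in multiplex_a_calls.items() if hg is not None}
    forward_to_b = [s for s, hg in multiplex_a_calls.items() if hg is None]
    total = len(multiplex_a_calls)
    fraction_resolved = len(resolved) / total if total else 0.0
    return resolved, forward_to_b, fraction_resolved


# Made-up example: three of four samples are resolved by assay A.
calls = {"S1": "M", "S2": None, "S3": "R", "S4": "M"}
resolved, forward_to_b, fraction = triage(calls)
print(forward_to_b)  # ['S2']
print(fraction)      # 0.75
```

Recording the resolved fraction alongside the forwarded list makes it easy to compare a run against the 85% reported for Multiplex A.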

CONCLUSION

The successive growth, distribution and dispersal of Indo-Pak haplogroups since the Neolithic age was one of the foremost events that shaped the historical migrations of South Asia. However, the ancestral dispersal route of the autochthonous inhabitants of Indo-Pak is still unclear. Our findings suggest an in situ origin of the haplogroup M haplotypes found in the Ethiopian and Kenyan populations, and the West Eurasian haplogroup U haplotypes found in the Pakistani population support the original dispersal of anatomically modern humans out of Africa through the southern route. It was also evident that the Indian subcontinent was a major corridor for the successive migration of different populations between Africa, Western Asia and South Asia.

High-resolution genetic studies have revealed challenges, such as endogamous practices, language shifts and sex-specific admixture, that need further genetic characterization. These migrations resulted in the identification of specific South Asian haplotypes for forensic discrimination. In mass disasters and crime scene investigations, a large number of SNPs makes it possible to distinguish every population involved, but certain unexplored lineages in the Pakistani population remain to be explored and incorporated.

References

ABBASI, A. M., SHAH, M. H. & KHAN, M. A. 2015. Pakistan and Pakistani Himalayas. Wild Edible Vegetables of Lesser Himalayas. Springer.

ABETEKOV, A. & ABETEKOV, H. 1994. Ancient Iranian nomads in western central Asia. History of Civilisations of Central Asia, 2, 23-33.

ABUN-NASR, J. M. 2013. Muslim communities of grace: the Sufi brotherhoods in Islamic religious life, Columbia University Press.

ACHILLI, A., RENGO, C., BATTAGLIA, V., PALA, M., OLIVIERI, A., FORNARINO, S., MAGRI, C., SCOZZARI, R., BABUDRI, N. & SANTACHIARA-BENERECETTI, A. S. 2005. Saami and Berbers—an unexpected mitochondrial DNA link. The American Journal of Human Genetics, 76, 883-886.

AHMAD, K. S. & FORD, R. 1966. A geography of Pakistan, Oxford University Press.

AHMED, M. 2009. Caste System in the Sub-Continent: A comparative study. AL-SIYASA – A JOURNAL OF POLITICS, SOCIETY & CULTURE, 29.

AHMED, M. 2009. Local-Bodies or Local Biradari System: An Analysis of the Role of Biradaries in the Local Bodies System of the Punjab. Pakistan Journal of History and Culture, 30.

AHMED, M. 2014. Ancient Pakistan: an Archaeological History, Amazon.

ALEXANDER, R. 2015. Solar Heroes and Sun Gods. Myths, Symbols and Legends of Solar System Bodies. Springer.

ALI, M., MCKIBBIN, M., BOOTH, A., PARRY, D. A., JAIN, P., RIAZUDDIN, S. A., HEJTMANCIK, J. F., KHAN, S. N., FIRASAT, S. & SHIRES, M. 2009. Null mutations in LTBP2 cause primary congenital glaucoma. The American Journal of Human Genetics, 84, 664-671.

ALLEN, M., ENGSTRÖM, A.-S., MEYERS, S., HANDT, O., SALDEEN, T., VON HAESELER, A., PÄÄBO, S. & GYLLENSTEN, U. 1998. Mitochondrial DNA sequencing of shed hairs and saliva on robbery caps: sensitivity and matching probabilities. Journal of forensic sciences, 43.

ALSHAMALI, F., BRANDSTÄTTER, A., ZIMMERMANN, B. & PARSON, W. 2008. Mitochondrial DNA control region variation in Dubai, United Arab Emirates. Forensic Science International: Genetics, 2, e9-e10.

ALVAREZ, J. C., JOHNSON, D. L., LORENTE, J. A., MARTINEZ-ESPIN, E., MARTINEZ-GONZALEZ, L. J., ALLARD, M., WILSON, M. R. & BUDOWLE, B. 2007. Characterization of human control region sequences for Spanish individuals in a forensic mtDNA data set. Legal Medicine, 9, 293-304.

AL-ZAHERY, N., SEMINO, O., BENUZZI, G., MAGRI, C., PASSARINO, G., TORRONI, A. & SANTACHIARA-BENERECETTI, A. 2003. Y-chromosome and mtDNA polymorphisms in Iraq, a crossroad of the early human dispersal and of post-Neolithic migrations. Molecular phylogenetics and evolution, 28, 458-472.

ALZUALDE, A., IZAGIRRE, N., ALONSO, S., ALONSO, A., ALBARRÁN, C., AZKARATE, A. & DE LA RÚA, C. 2006. Insights into the “isolation” of the Basques: mtDNA lineages from the historical site of Aldaieta (6th–7th centuries AD). American journal of physical anthropology, 130, 394-404.

AMBROSE, S. H. 2001. Paleolithic technology and human evolution. Science, 291, 1748-1753.

ANAND, B. 2015. “The Speaker”: India: the journey of its civilization.

ANDERSON, S., BANKIER, A. T., BARRELL, B. G., DE BRUIJN, M., COULSON, A. R., DROUIN, J., EPERON, I., NIERLICH, D., ROE, B. A. & SANGER, F. 1981. Sequence and organization of the human mitochondrial genome.

ANGLEBY, H. 2005. Analysis of domestic dog mitochondrial DNA sequence variation for forensic investigations.

ARUNKUMAR, G., TATARINOVA, T. V., DUTY, J., ROLLO, D., SYAMA, A., ARUN, V. S., KAVITHA, V. J., TRISKA, P., GREENSPAN, B. & WELLS, R. S. 2015. Genome-wide signatures of male-mediated migration shaping the Indian gene pool. Journal of human genetics.

ASIMOV, M. & BOSWORTH, C. 1998. History of Civilizations of Central Asia, Volume IV, The Age of Achievement: AD 750 to the End of the Fifteenth Century, Part One, The Historical, Social and Economic Setting. Paris.

ASIMOV, M. S. 1992. Description of the Project. History of Civilizations of Central Asia, 11.

ASLAMKHAN, M. 1996. Sapta Sindhvas: The Land of Seven Rivers. Lahore Mus. Bull, 9, 59-67.

ASOPA, J. N. 1976. Origin of the Rajputs, Delhi: Bharatiya Publishing House.

ATKINSON, Q. D., GRAY, R. D. & DRUMMOND, A. J. 2008. mtDNA variation predicts population size in humans and reveals a major Southern Asian chapter in human prehistory. Molecular biology and evolution, 25, 468-474.

AYRES, A. 2008. Language, the nation, and symbolic capital: The case of Punjab. The Journal of Asian Studies, 67, 917-946.

AYUB, Q., MEZZAVILLA, M., PAGANI, L., HABER, M., MOHYUDDIN, A., KHALIQ, S., MEHDI, S. Q. & TYLER-SMITH, C. 2015. The Kalash Genetic Isolate: Ancient Divergence, Drift, and Selection. The American Journal of Human Genetics, 96, 775-783.

AZIZ, K. K. 2007. A Journey into the Past.

BAG, A. 2015. Early System of Nakṣatras, Calendar and Antiquity of Vedic & Harappan Traditions. Indian Journal of History of Science, 50, 1-25.

BALLAND, D. 2010. AFGHANISTAN x. Political History. Encyclopædia Iranica. Retrieved, 08-22.

BAMSHAD, M., KIVISILD, T., WATKINS, W. S., DIXON, M. E., RICKER, C. E., RAO, B. B., NAIDU, J. M., PRASAD, B. R., REDDY, P. G. & RASANAYAGAM, A. 2001. Genetic evidence on the origins of Indian caste populations. Genome research, 11, 994-1004.

BANDELT, H.-J., FORSTER, P. & RÖHL, A. 1999. Median-joining networks for inferring intraspecific phylogenies. Molecular biology and evolution, 16, 37-48.

BARBARO, A. & CORMACI, P. 2009. DNA typing from lipstick prints left on the skin. Forensic Science International: Genetics Supplement Series, 2, 125-126.

BARNHILL, J. 2015. GHAZNAVIDS. Encyclopedia of World Trade: From Ancient Times to the Present, 408.

BARTEL, B. 1979. A discriminant analysis of Harappan civilization human populations. Journal of Archaeological Science, 6, 49-61.

BARCACCIA, G., GALLA, G., ACHILLI, A., OLIVIERI, A. & TORRONI, A. 2015. Uncovering the sources of DNA found on the Turin Shroud. Scientific reports, 5.

BASU, A., MUKHERJEE, N., ROY, S., SENGUPTA, S., BANERJEE, S., CHAKRABORTY, M., DEY, B., ROY, M., ROY, B. & BHATTACHARYYA, N. P. 2003. Ethnic India: a genomic view, with special reference to peopling and structure. Genome research, 13, 2277-2290.

BAYLY, S. 2001. Caste, society and politics in India from the eighteenth century to the modern age, Cambridge University Press.

BEEBE, N. W., RUSSELL, T., BURKOT, T. R. & COOPER, R. D. 2015. Anopheles punctulatus Group: Evolution, Distribution, and Control. Annual review of entomology, 60, 335-350.

BEHAR, D. M., HAMMER, M. F., GARRIGAN, D., VILLEMS, R., BONNE-TAMIR, B., RICHARDS, M., GURWITZ, D., ROSENGARTEN, D., KAPLAN, M. & DELLA PERGOLA, S. 2004. MtDNA evidence for a genetic bottleneck in the early history of the Ashkenazi Jewish population. European Journal of Human Genetics, 12, 355-364.

BELLEW, H. W. 1891. An Inquiry Into the Ethnography of Afghanistan: Prepared for and Presented to the 9th International Congress of Orientalists (London, Sept. 1891), Oriental Univ. Inst.

BELLMAN, J. 1997. Indian Resonances in the British Invasion, 1965-1968. Journal of Musicology, 116-136.

BENECKE, M., KNOPF, M., VOLL, W., OESTERREICH, W., JACOBI, Y. & EDELMANN, J. 1998. Short tandem repeat (STR) locus HUMD8S306 in a large population sample from Germany. Electrophoresis, 19, 2396-2397.

BERDANIER, C. D. 2005. Introduction to mitochondria. OXIDATIVE STRESS AND DISEASE, 16, 1.

BERNDL, K., HATTSTEIN, M., KNEBEL, A. & UDELHOVEN, H.-J. 2005. National Geographic visual history of the world, National Geographic Society.

BIDNER, C. & ESWARAN, M. 2015. A gender-based theory of the origin of the caste system of India. Journal of Development Economics, 114, 142-158.

BINTZ, B. J., DIXON, G. B. & WILSON, M. R. 2014. Simultaneous Detection of Human Mitochondrial DNA and Nuclear‐Inserted Mitochondrial‐origin Sequences (NumtS) using Forensic mtDNA Amplification Strategies and Pyrosequencing Technology. Journal of forensic sciences, 59, 1064-1073.

BISBING, R. E. 1982. The forensic identification and association of human hair. Forensic science handbook, 1, 390-428.

BISWAS, A. 1971. The political history of the Hunas in India, Munshiram Manoharlal Publishers.

BITTLES, A., SULLIVAN, S. & ZHIVOTOVSKY, L. 2004. Consanguinity, caste and deaf-mutism in Punjab, 1921. Journal of biosocial science, 36, 221-234.

BLOSZIES, C., FORMAN, S., WRIGHT, D. & HILDEBRAND, E. 2015. Water level history for Lake Turkana, Kenya in the past 15,000 years and a variable transition from the African Humid Period to Holocene aridity. Global and Planetary Change.

BOLUS, M. 2015. Dispersals of Early Humans: Adaptations, Frontiers, and New Territories. Handbook of Paleoanthropology, 2371-2400.

BRADLEY, D. G. & MAGEE, D. A. 2006. Genetics and the origins of domestic cattle. Documenting domestication: new genetic and archaeological paradigms, 317-328.

BRUNER, E., GRIMAUD-HERVÉ, D., WU, X., DE LA CUÉTARA, J. M. & HOLLOWAY, R. 2015. A paleoneurological survey of Homo erectus endocranial metrics. Quaternary International, 368, 80-87.

BRYANT, E. & PATTON, L. 2004. The Indo-Aryan controversy: evidence and inference in Indian history, Routledge.

BUDOWLE, B., ALLARD, M. W., FISHER, C. L., ISENBERG, A. R., MONSON, K. L., STEWART, J. E., WILSON, M. R. & MILLER, K. W. 2002. HVI and HVII mitochondrial DNA data in Apaches and Navajos. International journal of legal medicine, 116, 212-215.

BUTLER, J. M. 2005. Forensic DNA typing: biology, technology, and genetics of STR markers, Academic Press.

C, M. G., MITRA, B., ZHANG, C.-L., DEBNATH, M., LI, G.-M., WANG, H.-W., AGRAWAL, S., CHAUDHURI, T. K. & ZHANG, Y.-P. 2015. West Eurasian mtDNA lineages in India: an insight into the spread of the Dravidian language and the origins of the caste system. Human genetics, 134, 637-647.

CADENAS, A. M., ZHIVOTOVSKY, L. A., CAVALLI-SFORZA, L. L., UNDERHILL, P. A. & HERRERA, R. J. 2008. Y-chromosome diversity characterizes the Gulf of Oman. European Journal of Human Genetics, 16, 374-386.

CAMPBELL, B. G. & LOY, J. 2000. Humankind emerging, Allyn & Bacon.

CAMPBELL, M. C. & TISHKOFF, S. A. 2008. African genetic diversity: implications for human demographic history, modern human origins, and complex disease mapping. Annual review of genomics and human genetics, 9, 403.

CANN, R. L. 1987. In search of Eve. The Sciences, 27, 30-37.

CAPT, C., PASSAMONTI, M. & BRETON, S. 2015. The human mitochondrial genome may code for more than 13 proteins. Mitochondrial DNA, 1-4.

CARRACEDO, A., BÄR, W., LINCOLN, P., MAYR, W., MORLING, N., OLAISEN, B., SCHNEIDER, P., BUDOWLE, B., BRINKMANN, B. & GILL, P. 2000. DNA commission of the international society for forensic genetics: guidelines for mitochondrial DNA typing. Forensic Science International, 110, 79-85.

CARTMILL, M. & SMITH, F. H. 2009. The human lineage, John Wiley & Sons.

CASSAN, G. 2010. British law and caste identity manipulation in colonial India: the Punjab Alienation of Land Act. Paris School of Economics Working paper.

CAVALLI-SFORZA, L. L. & EDWARDS, A. W. 1967. Phylogenetic analysis. Models and estimation procedures. American journal of human genetics, 19, 233.

CAVALLI-SFORZA, L. L., MENOZZI, P. & PIAZZA, A. 1994. The history and geography of human genes, Princeton university press.

CHABOT, C. & ALLEN, L. 2009. Global population structure of the tope (Galeorhinus galeus) inferred by mitochondrial control region sequence data. Molecular Ecology, 18, 545-552.

CHAITANYA, L., VAN OVEN, M., WEILER, N., HARTEVELD, J., WIRKEN, L., SIJEN, T., DE KNIJFF, P. & KAYSER, M. 2014. Developmental validation of mitochondrial DNA genotyping assays for adept matrilineal inference of biogeographic ancestry at a continental level. Forensic Science International: Genetics, 11, 39-51.

CHAKRABORTY, U. 2003. Gendering caste through a feminist lens, Popular Prakashan.

CHAN, E. K., HARDIE, R.-A., PETERSEN, D. C., BEESON, K., BORNMAN, R. M., SMITH, A. B. & HAYES, V. M. 2015. Revised timeline and distribution of the earliest diverged human maternal lineages in southern Africa. PloS one, 10.

CHANDIO, N. & ANWAR, M. 2009. IMPACTS OF CLIMATE ON AGRICULTURE AND IT’S CAUSES: A CASE STUDY OF TALUKA KAMBER, SINDH, PAKISTAN. Sindh University Research Journal (Science Series), 41, 59-64.

CHANDRA, S. 2004. Medieval India: From Sultanat to the Mughals-Delhi Sultanat (1206-1526)-Part One, Har-Anand Publications.

CHANDRASEKAR, A., KUMAR, S., SREENATH, J., SARKAR, B. N., URADE, B. P., MALLICK, S., BANDOPADHYAY, S. S., BARUA, P., BARIK, S. S. & BASU, D. 2009. Updating phylogeny of mitochondrial DNA macrohaplogroup m in India: dispersal of modern human in South Asian corridor. PloS one, 4, e7447.

CHATTOPADHYAYA, B. 1976. Origin of the Rajputs: the political, economic and social processes in early medieval Rajasthan. Indian historical review, 3, 59.

CHAUBEY, G. 2010. The demographic history of India: A perspective based on genetic evidence.

CHAUBEY, G., METSPALU, M., KIVISILD, T. & VILLEMS, R. 2007. Peopling of South Asia: investigating the caste–tribe continuum in India. Bioessays, 29, 91-100.

CHEN, F., DENG, Y., DANG, Y., ZHANG, B., MU, H., YU, X., LI, L., YAN, C. & CHEN, T. 2008. Genetic polymorphism of mitochondrial DNA HVS-I and HVS-II of Chinese Tu ethnic minority group. Journal of Genetics and Genomics, 35, 225-232.

CHEN, Y.-S., TORRONI, A., EXCOFFIER, L., SANTACHIARA-BENERECETTI, A. S. & WALLACE, D. C. 1995. Analysis of mtDNA variation in African populations reveals the most ancient of all human continent-specific haplogroups. American journal of human genetics, 57, 133.

CHRISTIAN, D. 2015. The Cambridge World History: Volume 1, Introducing World History, to 10,000 BCE, Cambridge University Press.

COMAS, D., CALAFELL, F., MATEU, E., PÉREZ-LEZAUN, A., BOSCH, E., MARTÍNEZ-ARIAS, R., CLARIMON, J., FACCHINI, F., FIORI, G. & LUISELLI, D. 1998. Trading genes along the silk road: mtDNA sequences and the origin of central Asian populations. The American Journal of Human Genetics, 63, 1824-1838.

COMAS, D., PLAZA, S., WELLS, R. S., YULDASEVA, N., LAO, O., CALAFELL, F. & BERTRANPETIT, J. 2004. Admixture, migrations, and dispersals in Central Asia: evidence from maternal DNA lineages. European Journal of Human Genetics, 12, 495-504.

CONARD, N. J. 2015. Cultural Evolution During the Middle and Late Pleistocene in Africa and Eurasia. Handbook of Paleoanthropology, 2465-2508.

CONSORTIUM, H. P.-A. S. 2009. Mapping human genetic diversity in Asia. Science, 326, 1541-1545.

CORDAUX, R., AUNGER, R., BENTLEY, G., NASIDZE, I., SIRAJUDDIN, S. M. & STONEKING, M. 2004. Independent origins of Indian caste and tribal paternal lineages. Current Biology, 14, 231-235.

CORDAUX, R., SAHA, N., BENTLEY, G. R., AUNGER, R., SIRAJUDDIN, S. & STONEKING, M. 2003. Mitochondrial DNA analysis reveals diverse histories of tribal populations from India. European Journal of Human Genetics, 11, 253-264.

CORDAUX, R., WEISS, G., SAHA, N. & STONEKING, M. 2004. The northeast Indian passageway: a barrier or corridor for human migrations? Molecular Biology and Evolution, 21, 1525-1533.

COSTA, M. D., PEREIRA, J. B., PALA, M., FERNANDES, V., OLIVIERI, A., ACHILLI, A., PEREGO, U. A., RYCHKOV, S., NAUMOVA, O. & HATINA, J. 2013. A substantial prehistoric European ancestry amongst Ashkenazi maternal lineages. Nature Communications, 4.

CUI, Y., SONG, L., WEI, D., PANG, Y., WANG, N., NING, C., LI, C., FENG, B., TANG, W. & LI, H. 2015. Identification of kinship and occupant status in Mongolian noble burials of the Yuan Dynasty through a multidisciplinary approach. Philosophical Transactions of the Royal Society of London B: Biological Sciences, 370, 20130378.

DANINO, M. 2006. Genetics and the Aryan Debate. ARYAN INVASION THEORY, 42.

DASHTSEREN, B. 2015. The International Position of Mongolia: A Historical Overview. Mongolian Journal of International Affairs, 4, 48-53.

DE SOUSA, A. & CUNHA, E. 2012. Hominins and the emergence of the modern human brain. Prog. Brain Res, 195, 293-322.

DERDA, M., SOLARCZYK, P., CHOLEWIŃSKI, M. & HADAŚ, E. 2015. Genotypic characterization of amoeba isolated from Acanthamoeba keratitis in Poland. Parasitology research, 114, 1233-1237.

DERENKO, M., MALYARCHUK, B., BAHMANIMEHR, A., DENISOVA, G., PERKOVA, M., FARJADIAN, S. & YEPISKOPOSYAN, L. 2013. Complete mitochondrial DNA diversity in Iranians.

DERENKO, M., MALYARCHUK, B., GRZYBOWSKI, T., DENISOVA, G., DAMBUEVA, I., PERKOVA, M., DORZHU, C., LUZINA, F., LEE, H. K. & VANECEK, T. 2007. Phylogeographic analysis of mitochondrial DNA in northern Asian populations. The American Journal of Human Genetics, 81, 1025-1041.

D'ERRICO, F. & BANKS, W. E. 2015. Tephra studies and the reconstruction of Middle-to-Upper Paleolithic cultural trajectories. Quaternary Science Reviews, 118, 182-193.

DIBYOPAMA, A., KIM, Y. J., OH, C. S., SHIN, D. H. & SHINDE, V. 2015. Human Skeletal Remains from Ancient Burial Sites in India: With Special Reference to Harappan Civilization. Korean Journal of Physical Anthropology, 28, 1-9.

DI CRISTOFARO, J., PENNARUN, E., MAZIÈRES, S., MYRES, N. M., LIN, A. A., TEMORI, S. A., METSPALU, M., METSPALU, E., WITZEL, M. & KING, R. J. 2013. Afghan Hindu Kush: where Eurasian sub-continent gene flows converge. PloS one, 8, e76748.

DRYOMOV, S. V., NAZHMIDENOVA, A. M., SHALAUROVA, S. A., MOROZOV, I. V., TABAREV, A. V., STARIKOVSKAYA, E. B. & SUKERNIK, R. I. 2015. Mitochondrial genome diversity at the Bering Strait area highlights prehistoric human migrations from Siberia to northern North America. European Journal of Human Genetics.

EAASWARKHANTH, M., HAQUE, I., RAVESH, Z., ROMERO, I. G., MEGANATHAN, P. R., DUBEY, B., KHAN, F. A., CHAUBEY, G., KIVISILD, T. & TYLER-SMITH, C. 2010. Traces of sub-Saharan and Middle Eastern lineages in Indian Muslim populations. European Journal of Human Genetics, 18, 354-363.

EDWARDS, A., HAMMOND, H. A., JIN, L., CASKEY, C. T. & CHAKRABORTY, R. 1992. Genetic variation at five trimeric and tetrameric tandem repeat loci in four human population groups. Genomics, 12, 241-253.

EGLAR, Z. 2010. A Punjabi Village in Pakistan–Perspectives on Community, Land, and Economy, Oxford University Press, Pakistan (October 2010).

ELLIOTT, H. R., SAMUELS, D. C., EDEN, J. A., RELTON, C. L. & CHINNERY, P. F. 2008. Pathogenic mitochondrial DNA mutations are common in the general population. The American journal of human genetics, 83, 254-260.

EMERY, L. S., MAGNAYE, K. M., BIGHAM, A. W., AKEY, J. M. & BAMSHAD, M. J. 2015. Estimates of Continental Ancestry Vary Widely among Individuals with the Same mtDNA Haplogroup. The American Journal of Human Genetics, 96, 183-193.

ENZEL, Y., ELY, L., MISHRA, S., RAMESH, R., AMIT, R., LAZAR, B., RAJAGURU, S., BAKER, V. & SANDLER, A. 1999. High-resolution Holocene environmental changes in the Thar Desert, northwestern India. Science, 284, 125-128.

ERMINI, L., DER SARKISSIAN, C., WILLERSLEV, E. & ORLANDO, L. 2015. Major transitions in human evolution revisited: A tribute to ancient DNA. Journal of human evolution, 79, 4-20.

ERNSTER, L. & SCHATZ, G. 1981. Mitochondria: a historical review. The Journal of cell biology, 91, 227s-255s.

EXCOFFIER, L., LAVAL, G. & SCHNEIDER, S. 2005. Arlequin (version 3.0): an integrated software package for population genetics data analysis. Evolutionary bioinformatics online, 1, 47.

FAHLBUSCH, E. & BROMILEY, G. W. 2005. The encyclopedia of Christianity, Wm. B. Eerdmans Publishing.

FAROOQ, M., HASSAN, M. & GULL, F. 2015. Mycobial Deterioration of Stone Monuments of Dharmarajika, Taxila. J Microbiol Exp, 2, 00036.

FELSENSTEIN, J. 1988. Phylogenies from molecular sequences: inference and reliability. Annual review of genetics, 22, 521-565.

FISCHER, L. 1968. Land and People. Afghanistan. Springer.

FORSTER, P., CALÌ, F., RÖHL, A., METSPALU, E., D’ANNA, R., MIRISOLA, M., DE LEO, G., FLUGY, A., SALERNO, A. & AYALA, G. 2002. Continental and subcontinental distributions of mtDNA control region types. International journal of legal medicine, 116, 99-108.

FORSTER, P., HARDING, R., TORRONI, A. & BANDELT, H.-J. 1996. Origin and evolution of Native American mtDNA variation: a reappraisal. American journal of human genetics, 59, 935.

FRASER, A. 1995. The gypsies, Wiley-Blackwell.

FRAWLEY, D. 1994. The myth of the Aryan invasion of India.

FRAWLEY, D. 2001. The Rig Veda and the History of India: Rig Veda Bharata Itihasa, Aditya Prakashan.

FREGEL, R., CABRERA, V., LARRUGA, J. M., ABU-AMERO, K. K. & GONZÁLEZ, A. M. 2015. Carriers of Mitochondrial DNA Macrohaplogroup N Lineages Reached Australia around 50,000 Years Ago following a Northern Asian Route. PloS one, 10, e0129839.

GABRIEL, M. N., CALLOWAY, C. D., REYNOLDS, R. L. & PRIMORAC, D. 2003. Identification of human remains by immobilized sequence-specific oligonucleotide probe analysis of mtDNA hypervariable regions I and II. Croatian medical journal, 44, 293-298.

GABRIEL, R. A. 2015. The Madness of Alexander the Great: And the Myth of Military Genius, Pen and Sword.

GAIKWAD, S. & KASHYAP, V. 2005. Molecular insight into the genesis of ranked caste populations of western India based upon polymorphisms across non-recombinant and recombinant regions in genome. Genome biology, 6, P10.

GAIT, E. A. 1902. Census of India, 1901, Office of the Superintendent of Government Printing, India.

GASSE, F. 2000. Hydrological changes in the African tropics since the Last Glacial Maximum. Quaternary Science Reviews, 19, 189-211.

GAZDAR, H. & MALLAH, H. B. 2011. Class, caste and housing in rural Punjab–the untold story of the Marla schemes. CSP Research Report 12, Brighton: IDS.

GEDDES, A. 1960. The alluvial morphology of the Indo-Gangetic Plain: Its mapping and geographical significance. Transactions and Papers (Institute of British Geographers), 253-276.

GEO, P. 2014. Geography of Pakistan. geography.

GEPPERT, M., AYUB, Q., XUE, Y., SANTOS, S., RIBEIRO-DOS-SANTOS, Â., BAETA, M., NÚÑEZ, C., MARTÍNEZ-JARRETA, B., TYLER-SMITH, C. & ROEWER, L. 2015. Identification of new SNPs in native South American populations by resequencing the Y chromosome. Forensic Science International: Genetics, 15, 111-114.

GIBB, H. A. R. 1923. The Arab Conquests in Central Asia.

GILL, P. 2002. Role of short tandem repeat DNA in forensic casework in the UK- past, present, and future perspectives. Biotechniques, 32, 366-385.

GOEBEL, T. 2006. ANTHROPOLOGY: The Missing Years for Modern. Science, 10.

GOUNDER PALANICHAMY, M., SUN, C., AGRAWAL, S., BANDELT, H.-J., KONG, Q.-P., KHAN, F., WANG, C.-Y., CHAUDHURI, T. K., PALLA, V. & ZHANG, Y.-P. 2004. Phylogeny of mitochondrial DNA macrohaplogroup N in India, based on complete sequencing: implications for the peopling of South Asia. The American Journal of Human Genetics, 75, 966-978.

GRAU, L. W. 2015. Securing the Borders of Afghanistan During the Soviet-Afghan War. The Journal of Slavic Military Studies, 28, 414-428.

GREENFIELD, L. O. 1979. On the adaptive pattern of “Ramapithecus”. American Journal of Physical Anthropology, 50, 527-548.

GRONAU, I., HUBISZ, M. J., GULKO, B., DANKO, C. G. & SIEPEL, A. 2011. Bayesian inference of ancient human demography from individual genome sequences. Nature genetics, 43, 1031-1034.

GROUCUTT, H. S., SHIPTON, C., ALSHAREKH, A., JENNINGS, R., SCERRI, E. M. & PETRAGLIA, M. D. 2015. Late Pleistocene lakeshore settlement in northern Arabia: Middle Palaeolithic technology from Jebel Katefeh, Jubbah. Quaternary International.

GUPTA, D. Caste, race, politics. SEMINAR-NEW DELHI-, 2001. MALYIKA SINGH, 33-40.

GUPTA, S. The Homeland of the Early Rigvedic Rishis: The Saraswati Basin in Haryana. Vedic culture and its continuity: proceedings of national seminar, 2006. Egully. com, 47.

GUTALA, R., CARVALHO-SILVA, D. R., JIN, L., YNGVADOTTIR, B., AVADHANULA, V., NANNE, K., SINGH, L., CHAKRABORTY, R. & TYLER-SMITH, C. 2006. A shared Y-chromosomal heritage between Muslims and Hindus in India. Human genetics, 120, 543-551.

HAMMER, M. F., WOERNER, A. E., MENDEZ, F. L., WATKINS, J. C. & WALL, J. D. 2011. Genetic evidence for archaic admixture in Africa. Proceedings of the National Academy of Sciences, 108, 15123-15128.

HASAN, S. K. 2005. Hellenistic influence on Gandhara art in Pakistan. Journal of the Pakistan Historical Society, 53, 33.

HAYAT, S., AKHTAR, T., SIDDIQI, M. H., RAKHA, A., HAIDER, N., TAYYAB, M., ABBAS, G., ALI, A., BOKHARI, S. Y. A. & TARIQ, M. A. 2014. Mitochondrial DNA control region sequences study in Saraiki population from Pakistan. Legal Medicine.

HAYAT, S., AKHTAR, T., SIDDIQI, M. H., RAKHA, A., HAIDER, N., TAYYAB, M., ABBAS, G., ALI, A., BOKHARI, S. Y. A. & TARIQ, M. A. 2015. Mitochondrial DNA control region sequences study in Saraiki population from Pakistan. Legal Medicine, 17, 140-144.

HEDMAN, M., BRANDSTÄTTER, A., PIMENOFF, V., SISTONEN, P., PALO, J., PARSON, W. & SAJANTILA, A. 2007. Finnish mitochondrial DNA HVS-I and HVS-II population data. Forensic science international, 172, 171-178.

HERSHKOVITZ, I., MARDER, O., AYALON, A., BAR-MATTHEWS, M., YASUR, G., BOARETTO, E., CARACUTA, V., ALEX, B., FRUMKIN, A. & GODER-GOLDBERGER, M. 2015. Levantine cranium from Manot Cave (Israel) foreshadows the first European modern humans. Nature, 520, 216-219.

HOLLAND, M. M. & LAUC, G. 2014. Forensic Aspects of mtDNA Analysis. Forensic DNA Applications: An Interdisciplinary Perspective, 85.

HOLLIDAY, T. W. 2014. Neanderthals and Their Contemporaries. Encyclopedia of Global Archaeology. Springer.

HONG, S. B., KIM, K. C. & KIM, W. 2015. Population and forensic genetic analyses of mitochondrial DNA control region variation from six major provinces in the Korean population. Forensic Science International: Genetics, 17, 99-103.

HOODBHOY, P. & NAYYAR, A. H. 1985. Rewriting the history of Pakistan. Islam, politics and the state, 175.

HUDJASHOV, G., KIVISILD, T., UNDERHILL, P. A., ENDICOTT, P., SANCHEZ, J. J., LIN, A. A., SHEN, P., OEFNER, P., RENFREW, C. & VILLEMS, R. 2007. Revealing the prehistoric settlement of Australia by Y chromosome and mtDNA analysis. Proceedings of the National Academy of Sciences, 104, 8726-8730.

HUDSON, R. R., SLATKIN, M. & MADDISON, W. 1992. Estimation of levels of gene flow from DNA sequence data. Genetics, 132, 583-589.

IKRAM, S. M. 1989. History of Muslim civilization in India and Pakistan: a political and cultural history, Institute of Islamic Culture.

ILYAS, M., KIM, J.-S., COOPER, J., SHIN, Y.-A., KIM, H.-M., CHO, Y. S., HWANG, S., KIM, H., MOON, J. & CHUNG, O. 2015. Whole genome sequencing of an ethnic Pathan (Pakhtun) from the north-west of Pakistan. BMC genomics, 16, 172.

IMAIZUMI, K., PARSONS, T. J., YOSHINO, M. & HOLLAND, M. 2002. A new database of mitochondrial DNA hypervariable regions I and II sequences from 162 Japanese individuals. International journal of legal medicine, 116, 68-73.

IRAVANI, H. 2015. A Comparative Study of the Dome and Ogive in Eastern and Western Architecture. Journal of Selcuk University Natural and Applied Science, 1-10.

IRWIN, J. A., IKRAMOV, A., SAUNIER, J., BODNER, M., AMORY, S., RÖCK, A., O’CALLAGHAN, J., NURITDINOV, A., ATAKHODJAEV, S. & MUKHAMEDOV, R. 2010. The mtDNA composition of Uzbekistan: a microcosm of Central Asian patterns. International journal of legal medicine, 124, 195-204.

IRWIN, J. A., SAUNIER, J. L., BEH, P., STROUSS, K. M., PAINTNER, C. D. & PARSONS, T. J. 2009. Mitochondrial DNA control region variation in a population sample from Hong Kong, China. Forensic Science International: Genetics, 3, e119-e125.

JARRIGE, J.-F. 1981. Economy and society in the Early Chalcolithic/Bronze Age of Baluchistan: New perspectives from recent excavations at Mehrgarh. South Asian Archaeology, 1979, 93-114.

JARRIGE, J.-F. 1982. Excavations at Mehrgarh: their significance for understanding the background of the Harappan civilization. Harappan Civilization, 79-84.

JARRIGE, J.-F. & MEADOW, R. 1980. The antecedents of civilization in the Indus Valley. Scientific American, 243, 122-133.

JEFFREYS, A. J., WILSON, V. & THEIN, S. L. 1985. Hypervariable 'minisatellite' regions in human DNA. Nature, 67-73.

JIN, H.-J., TYLER-SMITH, C. & KIM, W. 2009. The peopling of Korea revealed by analyses of mitochondrial DNA and Y-chromosomal markers. PloS one, 4, e4210.

JOBLING, M., HURLES, M. & TYLER-SMITH, C. 2013. Human evolutionary genetics: origins, peoples & disease, Garland Science.

JOBLING, M. A. 2001. In the name of the father: surnames and genetics. TRENDS in Genetics, 17, 353-357.

JOBLING, M. A. & GILL, P. 2004. Encoded evidence: DNA in forensic analysis. Nature Reviews Genetics, 5, 739-751.

JORDE, L. B., BAMSHAD, M. & ROGERS, A. R. 1998. Using mitochondrial and nuclear DNA markers to reconstruct human evolution. Bioessays, 20, 126-136.

JUHÁSZ, Z., FEHÉR, T., NÉMETH, E. & PAMJAV, H. 2015. mtDNA analysis of 174 Eurasian populations using a new iterative rank correlation method. Molecular Genetics and Genomics, 1-17.

KAZMI, A. 1984. Geology of the Indus delta. Marine Geology and Oceanography of Arabian Sea and Coastal Pakistan. Van Nostrand Reinhold, New York, 71, 161-180.

KAZUO, M. 2012. Sayyids and Sharifs in Muslim Societies: the living links to the Prophet, Routledge, USA and Canada.

KEAY, J. 2011. India: A History. Revised and Updated, Grove/Atlantic, Inc.

KHADEMI NADOOSHAN, F. 2003. Kushana in Central Asia. The International Journal of Humanities, 10, 1-8.

KHAN, A. & UD DIN, N. Indian Muslim Freedom Fighters Based in Afghanistan and Soviet Russia.

KHAN, F. K. 1991. A geography of Pakistan. Oxford, Karachi.

KHAN, S. R. & KHAN, S. R. 2009. Assessing Poverty-Deforestation Links: Evidence from Swat, Pakistan. Ecological Economics, 68, 2607-2618.

KHAN, H. A., KIDWAI, R. A. & KHAN, M. A. By an Arab army led Muhammad bin Qasim, For it to become the easternmost province of Umayyad Caliphate. In 10 th century, added Punjab to Ghazniad Empire. Eventually this led to the establishment of the Delhi Sultanate.

KHAN, M. A. & ARIF, S. 2010. Pre-Historic Rock Shelters and Caves in Capital Territory Islamabad and District Rawalpindi. Journal of Asian Civilizations, 33, 52.

KHAN, N., BENSON, J., MACLEOD, R. & KINGSTON, H. 2010. Developing and evaluating a culturally appropriate genetic service for consanguineous South Asian families. Journal of community genetics, 1, 73-81.

KINGMAN, J. F. 2000. Origins of the coalescent: 1974-1982. Genetics, 156, 1461-1463.

KIVISILD, T. 2015. Maternal ancestry and population history from whole mitochondrial genomes. Investigative genetics, 6, 3.

KIVISILD, T., BAMSHAD, M. J., KALDMA, K., METSPALU, M., METSPALU, E., REIDLA, M., LAOS, S., PARIK, J., WATKINS, W. S. & DIXON, M. E. 1999. Deep common ancestry of Indian and western-Eurasian mitochondrial DNA lineages. Current Biology, 9, 1331-1334.

KIVISILD, T., ROOTSI, S., METSPALU, M., MASTANA, S., KALDMA, K., PARIK, J., METSPALU, E., ADOJAAN, M., TOLK, H.-V. & STEPANOV, V. 2003. The genetic heritage of the earliest settlers persists both in Indian tribal and caste populations. The American Journal of Human Genetics, 72, 313-332.

KLOSTERMAIER, K. K. 1998. Questioning the aryan invasion theory and revising ancient Indian history, na.

KONG, Q.-P., SUN, C., WANG, H.-W., ZHAO, M., WANG, W.-Z., ZHONG, L., HAO, X.-D., PAN, H., WANG, S.-Y. & CHENG, Y.-T. 2011. Large-scale mtDNA screening reveals a surprising matrilineal complexity in east Asia and its implications to the peopling of the region. Molecular biology and evolution, 28, 513-522.

KONG, Q.-P., YAO, Y.-G., SUN, C., BANDELT, H.-J., ZHU, C.-L. & ZHANG, Y.-P. 2003. Phylogeny of East Asian mitochondrial DNA lineages inferred from complete sequences. The American Journal of Human Genetics, 73, 671-676.

KUMAR, R. 2008. Encyclopaedia of untouchables ancient, medieval and modern, Gyan Publishing House.

KUMAR, S., PADMANABHAM, P., RAVURI, R. R., UTTARAVALLI, K., KONERU, P., MUKHERJEE, P. A., DAS, B., KOTAL, M., XAVIOUR, D. & SAHEB, S. 2008. The earliest settlers' antiquity and evolutionary history of Indian populations: evidence from M2 mtDNA lineage. BMC evolutionary biology, 8, 230.

KUMAR, S., RAVURI, R. R., KONERU, P., URADE, B., SARKAR, B., CHANDRASEKAR, A. & RAO, V. 2009. Reconstructing Indian-Australian phylogenetic link. BMC evolutionary biology, 9, 173.

KURESHY, K. & AHMAD, K. S. U. 1977. A geography of Pakistan, Oxford University Press Karachi.

LACAU, H., BUKHARI, A., GAYDEN, T., LA SALVIA, J., REGUEIRO, M., STOJKOVIC, O. & HERRERA, R. J. 2011. Y-STR profiling in two Afghanistan populations. Legal Medicine, 13, 103-108.

LADD, C., LEE, H. C., YANG, N. & BIEBER, F. R. 2001. Interpretation of complex forensic DNA mixtures. Croatian medical journal, 42, 244-246.

LAHR, M. M. & FOLEY, R. 1994. Multiple dispersals and modern human origins. Evolutionary Anthropology: Issues, News, and Reviews, 3, 48-60.

LAHR, M. M. & FOLEY, R. A. 1998. Towards a theory of modern human origins: geography, demography, and diversity in recent human evolution. Yearbook of physical anthropology, 41, 137-176.

LAL, C. 1962. Gipsies: forgotten children of India, Publications Division, Ministry of Information and Broadcasting.

LAL, K. S. 1984. Early Muslims in India, Books & Books.

LANDSTEINER, K. 1900. Zur Kenntnis der antifermentativen, lytischen und agglutinierenden Wirkungen des Blutserums und der Lymphe. Zbl Bakt, 27, 357-362.

LANDSTEINER, K. & MILLER JR, C. P. 1925. Serological studies on the blood of the primates: II. The blood groups in anthropoid apes. The Journal of experimental medicine, 42, 853.

LARRASOAÑA, J. C., ROBERTS, A. P. & ROHLING, E. J. 2013. Dynamics of green Sahara periods and their role in hominin evolution. PloS one, 8, e76514.

LEHOCKÝ, I., BALDOVIČ, M., KÁDAŠI, Ľ. & METSPALU, E. 2008. A database of mitochondrial DNA hypervariable regions I and II sequences of individuals from Slovakia. Forensic Science International: Genetics, 2, e53-e59.

LEMNRAU, A., BROOK, M. N., FLETCHER, O., COULSON, P., JONES, M., TOMCZYK, K., ASHWORTH, A., SWERDLOW, A., ORR, N. & GARCIA-CLOSAS, M. 2015. Mitochondrial DNA copy number in peripheral blood cells and risk of developing breast cancer. Cancer Research, canres. 1692.2014.

LEWIN, R. 1987. The unmasking of mitochondrial Eve. Science, 238, 24-26.

LI, Y.-C., WANG, H.-W., TIAN, J.-Y., LIU, L.-N., YANG, L.-Q., ZHU, C.-L., WU, S.-F., KONG, Q.-P. & ZHANG, Y.-P. 2015. Ancient inland human dispersals from Myanmar into interior East Asia since the Late Pleistocene. Scientific reports, 5.

LUDOLPH, K. A. Z. W. T. M. C. 1968. Land and People. Afghanistan. Springer.

LYONS, J. D. 2015. Alexander the Great and Hernán Cortés: Ambiguous Legacies of Leadership, Lexington Books.

MABUCHI, T., SUSUKIDA, R., KIDO, A. & OYA, M. 2007. Typing the 1.1 kb control region of human mitochondrial DNA in Japanese individuals. Journal of forensic sciences, 52, 355-363.

MACAULAY, V., HILL, C., ACHILLI, A., RENGO, C., CLARKE, D., MEEHAN, W., BLACKBURN, J., SEMINO, O., SCOZZARI, R. & CRUCIANI, F. 2005. Single, rapid coastal settlement of Asia revealed by analysis of complete mitochondrial genomes. Science, 308, 1034-1036.

MACAULAY, V., RICHARDS, M., HICKEY, E., VEGA, E., CRUCIANI, F., GUIDA, V., SCOZZARI, R., BONNÉ-TAMIR, B., SYKES, B. & TORRONI, A. 1999. The emerging tree of West Eurasian mtDNAs: a synthesis of control-region sequences and RFLPs. The American Journal of Human Genetics, 64, 232-249.

MAENCHEN-HELFEN, O. 1973. The world of the Huns: studies in their history and culture, Univ of California Press.

MAJI, S., KRITHIKA, S. & VASULU, T. 2009. Phylogeographic distribution of mitochondrial DNA macrohaplogroup M in India. Journal of genetics, 88, 127-139.

MAJUMDER, P. P. 2010. The human genetic history of South Asia. Current Biology, 20, R184-R187.

MALHOTRA, A. & MIR, F. 2012. Punjab Reconsidered: History, Culture, and Practice, Oxford University Press.

MALYARCHUK, B. A., PERKOVA, M., DERENKO, M., VANECEK, T., LAZUR, J. & GOMOLCAK, P. 2008. Mitochondrial DNA variability in Slovaks, with application to the Roma origin. Annals of human genetics, 72, 228-240.

MARÁCZ, L. 2015. The Huns in Western Consciousness: Images, Stereotypes and Civilization. Вестник Томского государственного университета, 17.

MARTEN, K., JOHNSON, T. H. & MASON, M. C. 2009. Misunderstanding Pakistan's Federally Administered Tribal Area? International Security, 33, 180-189.

MARUYAMA, S., MINAGUCHI, K. & SAITOU, N. 2003. Sequence polymorphisms of the mitochondrial DNA control region and phylogenetic analysis of mtDNA lineages in the Japanese population. International journal of legal medicine, 117, 218-225.

MCCRINDLE, J. W., RUFUS, Q. C. & JUSTINUS, M. J. 1896. The Invasion of India by Alexander the Great as Described by Arrian, Q. Curtius, Diodoros, Plutarch and Justin: Being Translations of Such Portions of the Works of These and Other Classical Authors as Describe Alexander's Campaigns in Afghanistan, the Punjâb, Sindh, Gedrosia and Karmania, A. Constable and Company.

MCELREAVEY, K. & QUINTANA-MURCI, L. 2005. A population genetics perspective of the Indus Valley through uniparentally-inherited markers. Annals of human biology, 32, 154-162.

MCLEOD, J. 2015. The history of India, ABC-CLIO.

M'CRINDLE, J. W. 1816. The invasion of India by Alexander the Great, Cosmo Publications.

MELLARS, P. 2006. Going east: new genetic and archaeological perspectives on the modern human colonization of Eurasia. Science, 313, 796-800.

MELTON, T. & NELSON, K. 2001. Forensic mitochondrial DNA analysis: two years of commercial casework experience in the United States. Croatian medical journal, 42, 298-303.

MENDIZABAL, I., SANDOVAL, K., BERNIELL-LEE, G., CALAFELL, F., SALAS, A., MARTÍNEZ-FUENTES, A. & COMAS, D. 2008. Genetic origin, admixture, and asymmetry in maternal and paternal human lineages in Cuba. BMC evolutionary biology, 8, 213.

MENDIZABAL, I., VALENTE, C., GUSMÃO, A., ALVES, C., GOMES, V., GOIOS, A., PARSON, W., CALAFELL, F., ALVAREZ, L. & AMORIM, A. 2011. Reconstructing the Indian origin and dispersal of the European Roma: a maternal genetic perspective. PloS one, 6, e15988.

METSPALU, M., KIVISILD, T., METSPALU, E., PARIK, J., HUDJASHOV, G., KALDMA, K., SERK, P., KARMIN, M., BEHAR, D. M. & GILBERT, M. T. P. 2004. Most of the extant mtDNA boundaries in south and southwest Asia were likely shaped during the initial settlement of Eurasia by anatomically modern humans. BMC genetics, 5, 26.

MGELADZE, A. & MONCEL, M.-H. 2015. The Acheulean in the South Caucasus (Georgia): Koudaro I and Tsona lithic assemblages. Quaternary International.

MILLER, A., HOLMBERG, A., LUKACS, M., CASULLI, A., DEPLAZES, P. & JUREMALM, M. 2014. A semi-automated magnetic capture probe based DNA extraction and real-time PCR method applied in the Swedish surveillance of Echinococcus multilocularis in red fox (Vulpes vulpes) faecal samples.

MOULHERAT, C., TENGBERG, M., HAQUET, J.-F. & MILLE, B. T. 2002. First evidence of cotton at Neolithic Mehrgarh, Pakistan: analysis of mineralized fibres from a copper bead. Journal of Archaeological Science, 29, 1393-1401.

MUGHAL, M. R. 1973. Present state of research on the Indus Valley Civilization, Department of Archaeology and Museums, Ministry of Education and Culture, Government of Pakistan.

MUKHERJEE, B. N. 1969. An Agrippan Source: A Study in Indo-Parthian History, Calcutta: Pilgrim Publishers.

MULLIS, K., FALOONA, F., SCHARF, S., SAIKI, R., HORN, G. & ERLICH, H. 1992. Specific enzymatic amplification of DNA in vitro: the polymerase chain reaction. Biotechnology Series, 17-17.

NAYRES, A. 2008. Language, the nation, and symbolic capital: The case of Punjab. The Journal of Asian Studies, 67, 917-946.

NANAVUTTY, P. 1977. The Parsis, New Delhi: National Book Trust, India.

NASIDZE, I., LING, E., QUINQUE, D., DUPANLOUP, I., CORDAUX, R., RYCHKOV, S., NAUMOVA, O., ZHUKOVA, O., SARRAF‐ZADEGAN, N. & NADERI, G. 2004. Mitochondrial DNA and Y‐chromosome variation in the Caucasus. Annals of human genetics, 68, 205-221.

NASIDZE, I., QUINQUE, D., OZTURK, M., BENDUKIDZE, N. & STONEKING, M. 2005. MtDNA and Y‐chromosome Variation in Kurdish Groups. Annals of Human Genetics, 69, 401-412.

NASIDZE, I., QUINQUE, D., RAHMANI, M., ALEMOHAMAD, S. A. & STONEKING, M. 2008. Close Genetic Relationship Between Semitic‐speaking and Indo‐European‐speaking Groups in Iran. Annals of human genetics, 72, 241-252.

NEI, M. & JIN, L. 1989. Variances of the average numbers of nucleotide substitutions within and between populations. Molecular Biology and Evolution, 6, 290-300.

NEILL, S. 2004. A history of Christianity in India: The beginnings to AD 1707, Cambridge University Press.

NEW, W. S. Encyclopedia> Blood transfusion.

NILSSON, M., ANDRÉASSON-JANSSON, H., INGMAN, M. & ALLEN, M. 2008. Evaluation of mitochondrial DNA coding region assays for increased discrimination in forensic analysis. Forensic Science International: Genetics, 2, 1-8.

NONGBRI, T. 2013. Tribe, caste and the indigenous challenge in India. Indigeneity In India, 75.

NORDBORG, M. 1997. Structured coalescent processes on different time scales. Genetics, 146, 1501-1514.

OPPENHEIMER, S. 2012. Out-of-Africa, the peopling of continents and islands: tracing uniparental gene trees across the map. Philosophical Transactions of the Royal Society B: Biological Sciences, 367, 770-784.

OPPENHEIMER, S. 2012. A single southern exit of modern humans from Africa: Before or after Toba? Quaternary International, 258, 88-99.

ORTIZ, J., GUILDERSON, T. & SARNTHEIN, M. 2000. Coherent high- and low-latitude climate variability during the Holocene warm period. Science, 288, 2198-2202.

OVCHINNIKOV, I. V., MALEK, M. J., DREES, K. & KHOLINA, O. I. 2014. Mitochondrial DNA variation in Tajiks living in Tajikistan. Legal Medicine, 16, 390-395.

PAGANI, L., SCHIFFELS, S., GURDASANI, D., DANECEK, P., SCALLY, A., CHEN, Y., XUE, Y., HABER, M., EKONG, R. & OLJIRA, T. 2015. Tracing the Route of Modern Humans out of Africa by Using 225 Human Genome Sequences from Ethiopians and Egyptians. The American Journal of Human Genetics.

PALANICHAMY, M. G., MITRA, B., ZHANG, C.-L., DEBNATH, M., LI, G.-M., WANG, H.-W., AGRAWAL, S., CHAUDHURI, T. K. & ZHANG, Y.-P. 2015. West Eurasian mtDNA lineages in India: an insight into the spread of the Dravidian language and the origins of the caste system. Human genetics, 134, 637-647.

PALSTRA, F. P., HEYER, E. & AUSTERLITZ, F. 2015. Statistical inference on genetic data reveals the complex demographic history of human populations in Central Asia. Molecular biology and evolution, 32, 1411-1424.

PAPIHA, S. 1996. Genetic variation in India. Human biology, 607-628.

PARPOLA, A. 2015. The Roots of Hinduism: The Early Aryans and the Indus Civilization, Oxford University Press, USA.

PARSON, W. & DÜR, A. 2007. EMPOP—a forensic mtDNA database. Forensic Science International: Genetics, 1, 88-92.

PARSON, W., HUBER, G., MORENO, L., MADEL, M.-B., BRANDHAGEN, M. D., NAGL, S., XAVIER, C., EDUARDOFF, M., CALLAGHAN, T. C. & IRWIN, J. A. 2015. Massively parallel sequencing of complete mitochondrial genomes from hair shaft samples. Forensic Science International: Genetics, 15, 8-15.

PARSON, W., PARSONS, T., SCHEITHAUER, R. & HOLLAND, M. 1998. Population data for 101 Austrian Caucasian mitochondrial DNA d-loop sequences: application of mtDNA sequence analysis to a forensic case. International Journal of Legal Medicine, 111, 124-132.

PASSARINO, G., SEMINO, O., BERNINI, L. F. & SANTACHIARA-BENERECETTI, A. S. 1996. Pre-Caucasoid and Caucasoid genetic features of the Indian population, revealed by mtDNA polymorphisms. American journal of human genetics, 59, 927.

PATERSON, T. T. & DRUMMOND, H. 1962. Soan, the Palaeolithic of Pakistan, Department of Archaeology, Government of Pakistan.

PEOPLE, J. Jat people.

PERELTSVAIG, A. & LEWIS, M. W. 2015. The Indo-European Controversy, Cambridge University Press.

POSSEHL, G. L. 1982. Harappan civilization: A contemporary perspective, Aris & Phillips.

POSSEHL, G. L. 1997. The transformation of the Indus civilization. Journal of World Prehistory, 11, 425-472.

POSSEHL, G. L., RAVAL, M. H. & CHITALWALA, Y. 1989. Harappan civilization and Rojdi, Brill Archive.

POSTANS, C. 1844. Routes Through Kach'hí Gandává. And an Account of the Belúchí and Other Tribes in Upper Sind'h and Kach'hí. Journal of the Royal Geographical Society of London, 193-218.

PRÜFER, K., RACIMO, F., PATTERSON, N., JAY, F., SANKARARAMAN, S., SAWYER, S., HEINZE, A., RENAUD, G., SUDMANT, P. H. & DE FILIPPO, C. 2014. The complete genome sequence of a Neanderthal from the Altai Mountains. Nature, 505, 43-49.

PUGACH, I. & STONEKING, M. 2015. Genome-wide insights into the genetic history of human populations. Investigative genetics, 6, 6.

QUINTANA-MURCI, L., CHAIX, R., WELLS, R. S., BEHAR, D. M., SAYAR, H., SCOZZARI, R., RENGO, C., AL-ZAHERY, N., SEMINO, O. & SANTACHIARA-BENERECETTI, A. S. 2004. Where west meets east: the complex mtDNA landscape of the southwest and Central Asian corridor. The American Journal of Human Genetics, 74, 827-845.

QUINTANA-MURCI, L., SEMINO, O., BANDELT, H.-J., PASSARINO, G., MCELREAVEY, K. & SANTACHIARA-BENERECETTI, A. S. 1999. Genetic evidence of an early exit of Homo sapiens sapiens from Africa through eastern Africa. Nature genetics, 23, 437-441.

QUISPE-TINTAYA, W., WHITE, R. R., POPOV, V. N., VIJG, J. & MASLOV, A. Y. 2015. Rapid Mitochondrial DNA Isolation Method for Direct Sequencing. Mitochondrial Medicine: Volume I, Probing Mitochondrial Function, 89-95.

RAJKUMAR, R., BANERJEE, J., GUNTURI, H. B., TRIVEDI, R. & KASHYAP, V. 2005. Phylogeny and antiquity of M macrohaplogroup inferred from complete mt DNA sequence of Indian specific lineages. BMC evolutionary biology, 5, 26.

RAKHA, A., SHIN, K.-J., YOON, J. A., KIM, N. Y., SIDDIQUE, M. H., YANG, I. S., YANG, W. I. & LEE, H. Y. 2011. Forensic and genetic characterization of mtDNA from Pathans of Pakistan. International journal of legal medicine, 125, 841-848.

RAMANA, G. V., SU, B., JIN, L., SINGH, L., WANG, N., UNDERHILL, P. & CHAKRABORTY, R. 2001. Y-chromosome SNP haplotypes suggest evidence of gene flow among caste, tribe, and the migrant Siddi populations of Andhra Pradesh, South India. European Journal of Human Genetics, 9, 695-700.

RANASINGHE, R., TENNEKOON, K. H., KARUNANAYAKE, E. H., LEMBRING, M. & ALLEN, M. 2015. A study of genetic polymorphisms in mitochondrial DNA hypervariable regions I and II of the five major ethnic groups and Vedda population in Sri Lanka. Legal Medicine.

RANI, D. S., DHANDAPANY, P. S., NALLARI, P., GOVINDARAJ, P., SINGH, L. & THANGARAJ, K. 2010. Mitochondrial DNA haplogroup ‘R’ is associated with Noonan syndrome of south India. Mitochondrion, 10, 166-173.

RAO, R. P., YADAV, N., VAHIA, M. N., JOGLEKAR, H., ADHIKARI, R. & MAHADEVAN, I. 2009. Entropic evidence for linguistic structure in the Indus script. Science, 324, 1165-1165.

RAO, S. R. 1973. Lothal and the Indus civilization, New York: Asia Publishing House.

RATNAGAR, S. 1981. Encounters, the westerly trade of the Harappa civilization, Oxford University Press, USA.

RHODIN, A. G., THOMSON, S., GEORGALIS, G. L., KARL, H., DANILOV, I. G. & TAKAHASHI, A. 2015. TURTLE EXTINCTIONS WORKING GROUP.

RIBEIRO-DOS-SANTOS, Â. K. C., CARVALHO, B. M., FEIO-DOS-SANTOS, A. C. & DOS SANTOS, S. E. B. 2007. Nucleotide variability of HV-I in Afro-descendents populations of the Brazilian Amazon Region. Forensic science international, 167, 77-80.

RICHARD, C., PENNARUN, E., KIVISILD, T., TAMBETS, K., TOLK, H.-V., METSPALU, E., REIDLA, M., CHEVALIER, S. & GIRAUDET, S. 2007. An mtDNA perspective of French genetic variation. Annals of human biology, 34, 68-79.

RICHARDS, M., BANDELT, H.-J., KIVISILD, T. & OPPENHEIMER, S. 2006. A model for the dispersal of modern humans out of Africa. Human Mitochondrial DNA and the Evolution of Homo sapiens. Springer.

ROBINSON, F. 1989. The Cambridge Encyclopedia of India, Pakistan, Bangladesh, Sri Lanka, Nepal, Bhutan and the Maldives, Cambridge University Press Cambridge.

RÖCK, A. W., DÜR, A., VAN OVEN, M. & PARSON, W. 2013. Concept for estimating mitochondrial DNA haplogroups using a maximum likelihood approach (EMMA). Forensic Science International: Genetics, 7, 601-609.

ROSENBERG, N. A. & NORDBORG, M. 2002. Genealogical trees, coalescent theory and the analysis of genetic polymorphisms. Nature Reviews Genetics, 3, 380-390.

ROUSTAEI, K., MASHKOUR, M. & TENGBERG, M. 2015. Tappeh Sang-e Chakhmaq and the beginning of the Neolithic in north-east Iran. Antiquity, 89, 573-595.

ROYCHOUDHURY, S., ROY, S., BASU, A., BANERJEE, R., VISHWANATHAN, H., USHA RANI, M., SIL, S. K., MITRA, M. & MAJUMDER, P. P. 2001. Genomic structures and population histories of linguistically distinct tribal groups of India. Human genetics, 109, 339-350.

ROYCHOUDHURY, S., ROY, S., DEY, B., CHAKRABORTY, M., ROY, M., ROY, B., RAMESH, A., PRABHAKARAN, N., RANI, U. & VISHWANATHAN, H. 2000. Fundamental genomic unity of ethnic India is revealed by analysis of mitochondrial DNA. Current Science, 79, 1182-1192.

ROOSTALU, U., KUTUEV, I., LOOGVÄLI, E., METSPALU, E., TAMBETS, K., REIDLA, M., KHUSNUTDINOVA, E., USANGA, E., KIVISILD, T. & VILLEMS, R. 2007. Origin and expansion of haplogroup H, the dominant human mitochondrial DNA lineage in West Eurasia: the Near Eastern and Caucasian perspective. Molecular biology and evolution, 24, 436-448.

RUANGSUP, W. 2015. The Emergence and Development of Brahmanism in Thailand with Special Reference to Iconography of Brahmanical Deities.

SAHOO, S. & KASHYAP, V. 2006. Phylogeography of mitochondrial DNA and Y‐Chromosome haplogroups reveal asymmetric gene flow in populations of Eastern India. American journal of physical anthropology, 131, 84-97.

SAIKI, R. K. 1990. Amplification of genomic DNA. PCR protocols: A guide to methods and applications, 2, 13-20.

SAMBROOK, J., FRITSCH, E. F. & MANIATIS, T. 1989. Molecular cloning, Cold spring harbor laboratory press New York.

SATPATHY, B. B. Politico-Social and Administrative History of Ancient India. Age, 38, 49.

SCERRI, E. A new stone tool assemblage revisited: reconsidering the ‘Aterian’ in Arabia. Proceedings of the Seminar for Arabian Studies, 2012. 357-370.

SCHIMMEL, A. 1982. Islam in India and Pakistan, Brill.

SCHLIESINGER, J. 2015. Ethnic Groups of Laos Vol 3: Profile of Austro-Thai-Speaking Peoples, Booksmango.

SCHOLZ, C. A., JOHNSON, T. C., COHEN, A. S., KING, J. W., PECK, J. A., OVERPECK, J. T., TALBOT, M. R., BROWN, E. T., KALINDEKAFE, L. & AMOAKO, P. Y. 2007. East African megadroughts between 135 and 75 thousand years ago and bearing on early-modern human origins. Proceedings of the National Academy of Sciences, 104, 16416-16421.

SCHURR, T. G., BALLINGER, S. W., GAN, Y.-Y., HODGE, J. A., MERRIWETHER, D. A., LAWRENCE, D. N., KNOWLER, W. C., WEISS, K. M. & WALLACE, D. C. 1990. Amerindian mitochondrial DNAs have rare Asian mutations at high frequencies, suggesting they derived from four primary maternal lineages. American journal of human genetics, 46, 613.

SHAFFER, J. G. 1984. The Indo-Aryan invasions: cultural myth and archaeological reality. The People of South Asia. Springer.

SHARMA, A. 2005. Dr. BR Ambedkar on the Aryan invasion and the emergence of the caste system in India. Journal of the American Academy of Religion, 73, 843-870.

SHARMA, G., TAMANG, R., CHAUDHARY, R., SINGH, V. K., SHAH, A. M., ANUGULA, S., RANI, D. S., REDDY, A. G., EAASWARKHANTH, M. & CHAUBEY, G. 2012. Genetic affinities of the central Indian tribal populations. PloS one, 7, e32546.

SHARMA, S., RAI, E., SHARMA, P., JENA, M., SINGH, S., DARVISHI, K., BHAT, A. K., BHANWER, A., TIWARI, P. K. & BAMEZAI, R. N. 2009. The Indian origin of paternal haplogroup R1a1* substantiates the autochthonous origin of Brahmins and the caste system. Journal of human genetics, 54, 47-55.

SHEPHARD, R. J. 2015. Examples of Early City Life from Ancient Assyria, Babylon, Egypt, Israel, India and China: Health as a Gift of the Gods. An Illustrated History of Health and Fitness, from Pre-History to our Post-Modern World. Springer.

SIDDIQI, M. H., AKHTAR, T., RAKHA, A., ABBAS, G., ALI, A., HAIDER, N., ALI, A., HAYAT, S., MASOOMA, S. & AHMAD, J. 2014. Genetic characterization of the Makrani people of Pakistan from mitochondrial DNA control-region data. Legal Medicine.

SIDDIQI, M. H., AKHTAR, T., RAKHA, A., ABBAS, G., ALI, A., HAIDER, N., ALI, A., HAYAT, S., MASOOMA, S. & AHMAD, J. 2015. Genetic characterization of the Makrani people of Pakistan from mitochondrial DNA control-region data. Legal Medicine, 17, 134-139.

SIDDIQI, M. U. A. & SAJID, H. U. RISE AND REBIRTH OF NON-STATE ACTORS IN SWAT: A PHENOMENOLOGICAL STUDY OF SHARIAH AS A POPULAR DEMAND.

SIMÃO, F., COSTA, H. A., DA SILVA, C. V., RIBEIRO, T., PORTO, M. J., SANTOS, J. C. & AMORIM, A. 2015. Genetic portrait of Lisboa immigrant population from Angola with mitochondrial DNA. Forensic Science International: Genetics, 15, 33-38.

SINGH, D. P. 2010. Indian cultural values and ethos explained for the decision makers. International Journal of Indian Culture and Business Management, 3, 592-606.

SINGH, G., WASSON, R. & AGRAWAL, D. 1990. Vegetational and seasonal climatic changes since the last full glacial in the Thar Desert, northwestern India. Review of Palaeobotany and Palynology, 64, 351-358.

SINGH, P., SINGH, M., GERDES, U. & MASTANA, S. S. 2001. Apolipoprotein E polymorphism in India: high APOE*E3 allele frequency in Ramgarhia of Punjab. Anthropologischer Anzeiger, 27-34.

SINGH, S. & GAUR, I. D. 2009. Sufism in Punjab: Mystics, Literature and Shrines, Aakar Books.

SIROCKO, F., GARBE-SCHÖNBERG, D., MCINTYRE, A. & MOLFINO, B. 1996. Teleconnections between the subtropical monsoons and high-latitude climates during the last deglaciation. Science, 272, 526-529.

SLUGLETT, P. & CURRIE, A. 2015. Atlas of Islamic History, Routledge.

SMITH, F. H. & AHERN, J. C. 2013. The origins of modern humans: biology reconsidered, John Wiley & Sons.

SMYTH, M. & PIPES, M. S. I. U. 2010. Pakistan? qsrc= 3044.

SOARES, P., ERMINI, L., THOMSON, N., MORMINA, M., RITO, T., RÖHL, A., SALAS, A., OPPENHEIMER, S., MACAULAY, V. & RICHARDS, M. B. 2009. Correcting for purifying selection: an improved human mitochondrial molecular clock. The American Journal of Human Genetics, 84, 740-759.

SOFI, U. J. Paradox of Tribal Development: A Case of Gujars and Bakarwals of Jammu & Kashmir (India).

SØRENSEN, P. 2015. An Evaluation of the Early Soan Chronology. Studia Orientalia Electronica, 50, 257-272.

SPATE, O. H. K. & LEARMONTH, A. T. A. 1972. India and Pakistan: land, people and economy.

ST JOHN, J., SAKKAS, D., DIMITRIADI, K., BARNES, A., MACLIN, V., RAMEY, J., BARRATT, C. & DE JONGE, C. 2000. Failure of elimination of paternal mitochondrial DNA in abnormal embryos. The Lancet, 355, 200.

STARR, C., TAGGART, R., EVERS, C. & STARR, L. 2015. Biology: The unity and diversity of life, Cengage Learning.

STAUBWASSER, M., SIROCKO, F., GROOTES, P. & SEGL, M. 2003. Climate change at the 4.2 ka BP termination of the Indus valley civilization and Holocene south Asian monsoon variability. Geophysical Research Letters, 30.

STEIN, M. A. 2014. On Alexander's track to the Indus, Cambridge University Press.

STONEKING, M. 2008. Human origins. EMBO reports, 9, S46-S50.

STOLJAROVA, M., KING, J. L., TAKAHASHI, M., AASPÕLLU, A. & BUDOWLE, B. 2016. Whole mitochondrial genome genetic diversity in an Estonian population sample. International journal of legal medicine, 130, 67-71.

STRAIT, D., GRINE, F. E. & FLEAGLE, J. G. 2015. Analyzing Hominin Phylogeny: Cladistic Approach. Handbook of Paleoanthropology, 1989-2014.

STRINGER, C. 2002. Modern human origins: progress and prospects. Philosophical Transactions of the Royal Society B: Biological Sciences, 357, 563-579.

SULTANA, G. N. N., TULI, J. F., BEGUM, R. & TAMANG, R. 2014. Mitochondrial DNA Control Region Variation from Bangladesh: Sequence Analysis for the Establishment of a Forensic Database. Forensic Medicine and Anatomy Research, 2, 95.

SUN, C., KONG, Q.-P., GOUNDER PALANICHAMY, M., AGRAWAL, S., BANDELT, H.-J., YAO, Y.-G., KHAN, F., ZHU, C.-L., CHAUDHURI, T. K. & ZHANG, Y.-P. 2006. The dazzling array of basal branches in the mtDNA macrohaplogroup M from India as inferred from complete genomes. Molecular biology and evolution, 23, 683-690.

TAMANG, R., SINGH, L. & THANGARAJ, K. 2012. Complex genetic origin of Indian populations and its implications. Journal of biosciences, 37, 911-919.

TAMURA, K., DUDLEY, J., NEI, M. & KUMAR, S. 2007. MEGA4: molecular evolutionary genetics analysis (MEGA) software version 4.0. Molecular biology and evolution, 24, 1596-1599.

TARN, W. W. & TARN, W. W. 2003. Alexander the Great: Volume 2, Sources and Studies, Cambridge University Press.

TETZLAFF, S., BRANDSTÄTTER, A., WEGENER, R., PARSON, W. & WEIRICH, V. 2007. Mitochondrial DNA population data of HVS-I and HVS-II sequences from a northeast German sample. Forensic science international, 172, 218-224.

THAKUR, U. 1967. The Hūṇas in India, Chowkhamba Sanskrit Series Office.

THANGARAJ, K., CHAUBEY, G., SINGH, V. K., VANNIARAJAN, A., THANSEEM, I., REDDY, A. G. & SINGH, L. 2006. In situ origin of deep rooting lineages of mitochondrial Macrohaplogroup 'M' in India. BMC genomics, 7, 151.

THANGARAJ, K., SINGH, L., REDDY, A. G., RAO, V. R., SEHGAL, S. C., UNDERHILL, P. A., PIERSON, M., FRAME, I. G. & HAGELBERG, E. 2003. Genetic affinities of the Andaman Islanders, a vanishing human population. Current Biology, 13, 86-93.

THANSEEM, I., THANGARAJ, K., CHAUBEY, G., SINGH, V. K., BHASKAR, L. V., REDDY, B. M., REDDY, A. G. & SINGH, L. 2006. Genetic affinities among the lower castes and tribal groups of India: inference from Y chromosome and mitochondrial DNA. BMC genetics, 7, 42.

TIAN, J.-Y., WANG, H.-W., LI, Y.-C., ZHANG, W., YAO, Y.-G., VAN STRATEN, J., RICHARDS, M. B. & KONG, Q.-P. 2015. A genetic contribution from the Far East into Ashkenazi via the ancient Silk Road. Scientific reports, 5.

TIPIRISETTI, N. R., GOVATATI, S., PULLARI, P., MALEMPATI, S., THUPURANI, M. K., PERUGU, S., GURUVAIAH, P., RAO, L., DIGUMARTI, R. R. & NALLANCHAKRAVARTHULA, V. 2014. Mitochondrial control region alterations and breast cancer risk: a study in South Indian population. PloS one, 9, e85363.

TIRRUL, R., BELL, I., GRIFFIS, R. & CAMP, V. 1983. The Sistan suture zone of eastern Iran. Geological Society of America Bulletin, 94, 134-150.

TISHKOFF, S. A., REED, F. A., FRIEDLAENDER, F. R., EHRET, C., RANCIARO, A., FROMENT, A., HIRBO, J. B., AWOMOYI, A. A., BODO, J.-M. & DOUMBO, O. 2009. The genetic structure and history of Africans and African Americans. Science, 324, 1035-1044.

TORRONI, A., ACHILLI, A., MACAULAY, V., RICHARDS, M. & BANDELT, H.-J. 2006. Harvesting the fruit of the human mtDNA tree. TRENDS in Genetics, 22, 339-345.

TUFAIL, M. 2012. Impact of the Unrest on the Livelihoods of the Gujjars and Bakkarwals of Jammu and Kashmir. International Journal of Social Science Tomorrow, 1.

TZEN, J. M., HSU, H.-J. & WANG, M.-N. 2008. Redefinition of hypervariable region I in mitochondrial DNA control region and comparing its diversity among various ethnic groups. Mitochondrion, 8, 146-154.

UNDERHILL, P. A., PASSARINO, G., LIN, A. A., SHEN, P., MIRAZON LAHR, M., FOLEY, R. A., OEFNER, P. J. & CAVALLI-SFORZA, L. L. 2001. The phylogeography of Y chromosome binary haplotypes and the origins of modern human populations. Annals of human genetics, 65, 43-62.

VALENTINE, B., KAMENOV, G. D., KENOYER, J. M., SHINDE, V., MUSHRIF-TRIPATHY, V., OTAROLA-CASTILLO, E. & KRIGBAUM, J. 2015. Evidence for Patterns of Selective Urban Migration in the Greater Indus Valley (2600-1900 BC): A Lead and Strontium Isotope Mortuary Analysis.

VAN DER MADE, J. 2011. Biogeography and climatic change as a context to human dispersal out of Africa and within Eurasia. Quaternary Science Reviews, 30, 1353-1367.

VAN OVEN, M. & KAYSER, M. 2009. Updated comprehensive phylogenetic tree of global human mitochondrial DNA variation. Human mutation, 30, E386-E394.

VIDYARTHI, L. P. & RAI, B. K. 1977. The tribal culture of India, Concept Publishing Company.

VILLAGE, S. 2015. THE NEOLITHIC TRANSITION: HUNTING-GATHERING TO. Asia in Western and World History: A Guide for Teaching, 217.

VON RAD, U., SCHAAF, M., MICHELS, K. H., SCHULZ, H., BERGER, W. H. & SIROCKO, F. 1999. A 5000-yr record of climate change in varved sediments from the oxygen minimum zone off Pakistan, Northeastern Arabian Sea. Quaternary Research, 51, 39-53.

WAINSCOAT, J. 1986. Human evolution. Out of the garden of Eden. Nature, 325, 13-13.

WALLACE, D. C., BROWN, M. D. & LOTT, M. T. 1999. Mitochondrial DNA variation in human evolution and disease. Gene, 238, 211-230.

WANG, H.-W., LI, Y.-C., SUN, F., ZHAO, M., MITRA, B., CHAUDHURI, T. K., REGMI, P., WU, S.-F., KONG, Q.-P. & ZHANG, Y.-P. 2012. Revisiting the role of the Himalayas in peopling Nepal: insights from mitochondrial genomes. Journal of human genetics, 57, 228-234.

WATKINS, W., PRASAD, B., NAIDU, J., RAO, B., BHANU, B., RAMACHANDRAN, B., DAS, P., GAI, P., REDDY, P. & REDDY, P. 2005. Diversity and divergence among the tribal populations of India. Annals of human genetics, 69, 680-692.

WEISS, H., COURTY, M.-A., WETTERSTROM, W., GUICHARD, F., SENIOR, L., MEADOW, R. & CURNOW, A. 1993. The genesis and collapse of third millennium north Mesopotamian civilization. Science, 261, 995-1004.

WHALE, J. W. 2012. Mitochondrial DNA analysis of four ethnic groups of Afghanistan. University of Portsmouth.

WILCOX, P. 1986. Rome's Enemies (3): Parthians and Sassanid Persians, Osprey Publishing.

WILSON, M., POLANSKEY, D., BUTLER, J., DIZINNO, J., REPLOGLE, J. & BUDOWLE, B. 1995. Extraction, PCR amplification and sequencing of mitochondrial DNA from human hair shafts. Biotechniques, 18, 662-669.

WINK, A. 2002. Al-Hind: The Slave Kings and the Islamic conquest, 11th-13th centuries, Brill.

WISE, T. 1960. OLAF CAROE. The Pathans: 550 BC-AD 1957. Pp. xxii, 521. New York: St. Martin's Press, 1959. $12.50. The ANNALS of the American Academy of Political and Social Science, 327, 189-190.

WOLPOFF, M. H., WU, X. & THORNE, A. G. 1984. Modern Homo sapiens origins: a general theory of hominid evolution involving the fossil evidence from East Asia. The origins of modern humans: a world survey of the fossil evidence, 6, 411-483.

WONG, Z., WILSON, V., PATEL, I., POVEY, S. & JEFFREYS, A. 1987. Characterization of a panel of highly variable minisatellites cloned from human DNA. Annals of human genetics, 51, 269-288.

XING, S., MARTINÓN‐TORRES, M., BERMÚDEZ DE CASTRO, J. M., WU, X. & LIU, W. 2015. Hominin teeth from the early Late Pleistocene site of Xujiayao, Northern China. American journal of physical anthropology, 156, 224-240.

YANG, Y., ZHANG, P., HE, Q., ZHU, Y., YANG, X., LV, R. & CHEN, J. 2011. A new strategy for the discrimination of mitochondrial DNA haplogroups in Han population. Journal of forensic sciences, 56, 586-590.

YEAR, L. & EXCAVATING, A. 2004. Was North Africa the launch pad for modern human migrations? Science, 16, 369.

YOTOVA, V., LEFEBVRE, J.-F., KOHANY, O., JURKA, J., MICHALSKI, R., MODIANO, D., UTERMANN, G., WILLIAMS, S. M. & LABUDA, D. 2007. Tracing genetic history of modern humans using X-chromosome lineages. Human genetics, 122, 431-443.

ZHU, Z.-Y., DENNELL, R., HUANG, W.-W., WU, Y., RAO, Z.-G., QIU, S.-F., XIE, J.-B., LIU, W., FU, S.-Q. & HAN, J.-W. 2015. New dating of the Homo erectus cranium from Lantian (Gongwangling), China. Journal of human evolution, 78, 144-157.

ZIMMERMANN, B., BRANDSTÄTTER, A., DUFTNER, N., NIEDERWIESER, D., SPIROSKI, M., ARSOV, T. & PARSON, W. 2007. Mitochondrial DNA control region population data from Macedonia. Forensic Science International: Genetics, 1, e4-e9.

ZLOJUTRO, M., TARSKAIA, L. A., SORENSEN, M., SNODGRASS, J. J., LEONARD, W. R. & CRAWFORD, M. H. 2008. The origins of the Yakut people: evidence from mitochondrial DNA diversity. International Journal of Human Genetics, 8, 119.

ZSURKA, G., KRAYTSBERG, Y., KUDINA, T., KORNBLUM, C., ELGER, C. E., KHRAPKO, K. & KUNZ, W. S. 2005. Recombination of mitochondrial DNA in skeletal muscle of individuals with multiple mitochondrial DNA heteroplasmy. Nature genetics, 37, 873-877.
