Are the Hazara an Indigenous Population of Gilgit-Baltistan: an Odontometric Examination of Their Origins and Interactions Using Modern and Archaeological Samples

Written by Amanda M. Camp Spring 2013

A Thesis submitted to the Anthropology Program, School of Social Sciences and Education, California State University, Bakersfield, in Partial Fulfillment for the Degree of Master of Arts

Are the Hazara an indigenous population of Gilgit-Baltistan: An odontometric examination of their origins and interactions using modern and archaeological populations

By Amanda M. Camp

This thesis has been accepted on behalf of the Anthropology Program by their supervisory committee:

Dr. Brian E. Hemphill, Committee Chair

Dr. Robert M. Yohe II

Dr. Roger Peck

For my dust-bowl migrant grandparents, who were harvest gypsies so their descendants didn’t have to be. I would not be who I am today without them. I carry you in my heart.

Acknowledgements

The author would like to extend special thanks to all of our colleagues in Pakistan who facilitated this research and to those who participated and collaborated in the collection of the dental casts: Vice Chancellor (Dr.) Ihsan Ali, Professor (Dr.) Habib Ahmad, Dr. Rachel Jack, Mr. Sajid ul-Ghafoor, Mr. Muhammad Zahir Khan, and the students of Gilgit University. In addition, thanks must be extended to my committee

chair Dr. Brian Hemphill for encouraging me to take on this research, facilitating data

collection in Pakistan, assisting with the statistical analyses, and overall support

throughout this process. Thanks also to Drs. Roger Peck and Robert M. Yohe II for

agreeing to serve on my thesis committee and for supporting me with many graciously

given words of wisdom. To my dear husband, Casey, thank you so much for all your

encouragement along the way; I could not have done it without you. I must give special

thanks to my Grandma Camp for a lifetime of love and encouragement, to my sister

Angie for never giving up hope and becoming the miracle she is today, and to my

parents, for their support and motivation every step of the way. Grateful

acknowledgement is also given to California State University Bakersfield’s Student

Research Scholars program and the Ronald E. McNair Post-baccalaureate

Achievement program for their financial support of this research. The greatest thanks must be extended to the people of northern Pakistan who so graciously and enthusiastically participated in this research, drew me pictures and gave gifts, and whose sweet smiling faces will never be forgotten. I will carry them in my heart forever.

Abstract

The primary goal of this research is to test four current models for South Asian population history with tooth-size allocation analyses and to assess the degree and patterning of sex dimorphism based on tooth size among the Hazara. Odontometric data collected from a sample (n=202) of the Hazara population residing in Skardu, Gilgit-Baltistan, northern Pakistan are compared to samples of 27 living and prehistoric populations from Central Asia, the Indus Valley, northern Pakistan and peninsular India to test biological origin theories of the Hazara and to further elucidate the events surrounding the peopling of the Indian subcontinent. Maximum mesiodistal and buccolingual measurements were obtained for all permanent teeth except third molars in accordance with standardized methods. Statistical techniques were employed to measure the degree of both inter- and intra-observer error, to assess the potential influence of dental asymmetry, and to evaluate the degree of sex dimorphism. For comparative purposes, individual measurements were scaled against the geometric mean to control for sex dimorphism and evolutionary tooth size reduction. Intersample differences in tooth size allocation were assessed using pairwise squared Euclidean distances. These distances were used as the basis for determining the patterning of phenetic affinities among samples through hierarchical cluster analysis, neighbor-joining tree cluster analysis, multidimensional scaling using both Guttman’s (1968) and Kruskal’s (1964) methods, and principal co-ordinates analysis. Results of the assessment of inter- and intra-observer error indicate minimal influence, rendering few variables statistically significantly different at an alpha level of 0.05.
Three of 28 variables (lower second molar in the mesiodistal dimension, upper first molar in the mesiodistal dimension, and the lower first molar in the buccolingual dimension) were found to differ significantly in the inter-observer error tests, while four variables (lower second molar in the buccolingual dimension, lower first molar in the buccolingual dimension, upper first molar in the buccolingual dimension, and the upper third premolar in the buccolingual dimension) were found to differ significantly in the intra-observer error tests. Results also indicate a low occurrence of bilateral asymmetry among the Hazara that does not preferentially affect either sex. Assessment of sexual dimorphism indicates male Hazara posterior dentition is significantly larger in the buccolingual dimension for both dental arcades and the mesiodistal dimension of all canines. Furthermore, results of the statistical analyses employed to assess biological affinity identify the Hazara as outliers to all other samples. Therefore, the Hazara do not appear to be biologically related to any of the comparative samples included in this study. Such results corroborate genetic studies which indicate that the Hazara are an intrusive non-South Asian population living among other Pakistani highlanders in Gilgit-Baltistan.
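The size-scaling and distance steps summarized above can be illustrated with a minimal sketch. It assumes one common reading of the procedure, in which each measurement is divided by the geometric mean of all measurements in the same profile; the sample names and crown diameters are hypothetical values chosen for illustration and are not data from this study.

```python
import math

def geometric_mean(values):
    """Geometric mean of positive measurements, computed via the mean of logs."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

def scale_by_geometric_mean(measurements):
    """Divide each measurement by the geometric mean of the whole profile."""
    gm = geometric_mean(measurements)
    return [m / gm for m in measurements]

def squared_euclidean(a, b):
    """Sum of squared differences between two equally long profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Hypothetical mean crown diameters (mm); NOT data from this study.
samples = {
    "sample_A": [8.6, 6.8, 7.7, 10.2],
    "sample_B": [9.03, 7.14, 8.085, 10.71],  # sample_A enlarged by 5 percent
    "sample_C": [8.4, 7.1, 7.2, 10.9],       # similar size, different allocation
}

# Scaling removes overall size, leaving only relative allocation.
scaled = {name: scale_by_geometric_mean(vals) for name, vals in samples.items()}

# Pairwise squared Euclidean distances between the scaled profiles.
names = sorted(scaled)
distances = {
    (p, q): squared_euclidean(scaled[p], scaled[q])
    for i, p in enumerate(names)
    for q in names[i + 1:]
}
```

In this sketch, sample_A and sample_B differ only in overall size, so their distance after scaling is effectively zero, whereas sample_C differs in how size is apportioned among the variables and therefore remains distinct. A matrix of such distances is the kind of input that cluster analysis and multidimensional scaling operate on.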

Table of Contents

List of Figures
List of Tables
Introduction
Ethno-History of the Hazara
Hazara Ethnic Origin Theories
  Descendants of Genghis Khan’s Army
  Descendants of Kushans
Dental Ontogeny
Odontometric Heritability
Biological Distance
Previous Dental Studies in South Asia
Materials and Methods
Comparative Samples
  Living Inhabitants of Pakistan and India
  Prehistoric Inhabitants of the Indus Valley
  Prehistoric Inhabitants of Central Asia
Research Questions
Models and Expectations
  Long-Standing Continuity Model (LSCM)
    Expectations
  Aryan Invasion Model (AIM)
    Expectations
  Early Entrance Model (EEM)
    Expectations
  Historic Era Influences Model (HEIM)
    Expectations
Current Statistical Analyses
  Intra-observer error Analysis
  Inter-observer error Analysis
  Asymmetry Analyses
  Sexual Dimorphism Analyses
  Biological Distance Analyses
    Multidimensional Scaling Analyses
    Principal Co-ordinates Analysis
    Cluster Analyses
Discussion
  Assessment of Intra- and Inter-observer Error
  Patterns of Dental Asymmetry
  Patterns of Sexual Dimorphism
  Patterns of Biological Affinities
    The Long-Standing Continuity Model
    The Aryan Invasion Model
    The Early Entrance Model
    The Historic Era Influence Model
Odontometric Studies and Genetic Research
Conclusion
References
Appendices
  A. Descriptive Statistics of the Hazara
  B. Euclidean Distance Matrix

List of Figures

Figure 1. Image of Hazara individuals depicting Asiatic facial characteristics (Bacon, 1951).

Figure 2. Location of samples used in this analysis. The red squares indicate the locations of living population samples, while archaeologically derived samples are indicated by blue dots.

Figure 3. Graphic depiction of the Long-Standing Continuity Model. The locations of Indo-Aryan language speakers are represented by the purple circle. The Dravidian language speakers are represented by the orange circle.

Figure 4. Graphic depiction of the Aryan Invasion Model. The purple arrows depict the movement of Aryans from Central Asia into peninsular India.

Figure 5. Graphic depiction of the Early Entrance Model. Orange arrows represent the early movement of Proto-Elamo-Dravidian speakers while the purple arrows indicate the later wave of Indo-Aryan speakers.

Figure 6. Graphic depiction of the Historic Era Influence Model. The green lines indicate the separation of the Pakistani highlanders from peninsular Indians.

Figure 7. Bar Graph of Percentages of Statistically Significantly Different Mesiodistal Sexually Dimorphic Dimensions among the Hazara

Figure 8. Bar Graph of Percentages of Statistically Significantly Different Buccolingual Sexually Dimorphic Dimensions among the Hazara

Figure 9. Line Chart Plotting Percentages of Sex Dimorphism for the Right-Side Mandibular Dentition

Figure 10. Line Chart Plotting Percentages of Sex Dimorphism for the Left-Side Mandibular Dentition

Figure 11. Line Chart Plotting Percentages of Sex Dimorphism for the Right-Side Maxillary Dentition

Figure 12. Line Chart Plotting Percentages of Sex Dimorphism for the Left-Side Maxillary Dentition

Figure 13. Results Obtained from Guttman’s Method for Multidimensional Scaling

Figure 14. Results Obtained from Kruskal’s Method for Multidimensional Scaling

Figure 15. Results Obtained from the Principal Co-ordinates Analysis

Figure 16. Results of the Hierarchical Cluster Analysis using Ward’s (1963) Linkage

Figure 17. Results produced by the Neighbor-joining Tree Cluster Analysis

Figure 18. Multidimensional scaling presentation of weighted population pairwise values of short-tandem-repeat loci variation with haplogroups, from Qamar and coworkers (2002:1117)

List of Tables

Table 1. Samples used in the Tooth Size Allocation Analysis

Table 2. Statistical Assessment of Intra-observer Error via Two-tailed Distribution Paired Student’s t-tests. The statistically significantly different values (p<0.05) are highlighted in yellow.

Table 3. Statistical Assessment of Inter-observer Error via Two-tailed Distribution Paired Student’s t-tests. The statistically significantly different values (p<0.05) are highlighted in yellow.

Table 4. Results of the two-tailed distribution paired Student’s t-tests for asymmetry assessment among male Hazara. The statistically significant values (p<0.05) are highlighted in yellow.

Table 5. Results of the two-tailed distribution paired Student’s t-tests for asymmetry assessment among female Hazara. The statistically significant values (p<0.05) are highlighted in yellow.

Table 6. Results of the directional asymmetry assessment among the Hazara with instances where the left antimere is larger (positive values) highlighted in yellow.

Table 7. Assessment of Sexual Dimorphism for the Left-side Variables of the Hazara sample, showing the results of the two-tailed distribution Student’s t-tests with both assumed homoscedasticity and heteroscedasticity. The values that are statistically significantly different (p<0.05) are highlighted in yellow.

Table 8. Assessment of Sexual Dimorphism for the Right-side Variables of the Hazara sample, showing the results of the two-tailed distribution Student’s t-tests with both assumed homoscedasticity and heteroscedasticity. The statistically significant values (p<0.05) are highlighted in yellow.

Table 9. Values for each Dimensional Axis Generated by Guttman’s (1968) Method for Multidimensional Scaling

Table 10. Values for each Dimensional Axis Generated by Kruskal’s Method for Multidimensional Scaling

Introduction

Researchers have attempted to discern the population history of ethnic groups

found in Gilgit-Baltistan1, Pakistan using historical evidence, archaeological

methods, osteological analysis, and genetic studies. However, such studies have been

extremely limited, both geographically and temporally. Consequently, despite their

efforts, such researchers are often left with many unanswered questions. The

population histories of ethnic groups in this region of the world have captured the

interest of many researchers because they occupy what has often been referred to as

“the crossroads of Asia” facilitating contact between the peoples of India, Central Asia,

and western China (Fairservis 1995). There have been some biological assessments of

DNA variation among living individuals, but such studies cannot determine with

sufficient accuracy the timing of past population interaction events (Qamar et al. 2002;

Quintana-Murci et al. 2001, 2004). Other equally stymieing problems plague

archaeological excavations and analyses, for inferences about population interaction

patterns based on the presence of certain artifacts often conflate indirect diffusion of

goods and ideas with actual physical contact between members of geographically

disparate populations (for a discussion of such attempts that are not always scientifically

sound see Hemphill 1997). This thesis seeks to address this deficiency through an

assessment of permanent tooth size allocation, a phenomenon known to be under

moderate to strong genetic control (Townsend 1978b). This study uses scientific investigation coupled with historical evidence to illuminate the intricacy of biological

1 The former Northern Areas of Pakistan was formally renamed Gilgit-Baltistan by the Gilgit-Baltistan Empowerment and Self Governance Order 2009, an act of the Pakistan Parliament in August 2009 that replaced the earlier “Northern Areas” Legal Framework Order 1994.

interactions, both temporally and geographically, that have previously been subject to

indirect extrapolations from genetic variation among living individuals, from often reified

archaeological assemblages designated as “cultures,” and from regional oral traditions

that often reflect little more than mythology.

This study provides the first opportunity to examine Hazara2 biological affinities

from the standpoint of dental variation. As such, it offers an opportunity to determine

whether this system of biological variation yields results that are concordant with those obtained by an array of recent genetic investigations regarding this ethnic group. The

methodologies employed in recent DNA analyses are extremely powerful for

determining levels of similarity and differences between populations, but they are of

insufficient specificity for determining exactly when contacts and subsequent gene flow

occurred between populations to the degree needed to test various models based upon

interpretations of the archaeological record. Using teeth to study regional population variation is highly advantageous because teeth can provide both synchronic and

diachronic perspectives concurrently when incorporating dental data from ancient and

extant groups (Scott and Turner 1997). Dental variations can be easily assessed

among both the living and the dead because of easy access of the oral cavity among

living individuals and, because of the resiliency of enamel against taphonomic

processes, a high frequency of preservation of teeth in the archaeological record.

Dental metric and morphological variations are known to be genetically determined and inherited, as discussed in more detail below. As such, they offer

2 The term Hazara is used here and throughout this thesis despite the fact that the ethnic group that is the focus of this study self-identifies as “Chengazis.” The two terms may be considered synonymous, and because the term “Hazara” is more widely known and used, it is retained here.

an excellent opportunity to test if and when there were major introductions of foreign genes into the resident South Asian gene pool.

Ethno-History of the Hazara

The majority of the Hazara are inhabitants of Hazarajat, a region that encompasses central Afghanistan and areas of Pakistan, Tajikistan and the Xinjiang Uyghur Autonomous Region of western China, located north of the Hindu Kush. The 1979 Soviet invasion of Afghanistan left civil unrest in the wake of its occupation and the Hazara residing there were virtually free of governmental

oversight (Middleton 1995:115). Later political upheavals forced some of the Hazara to

leave central Afghanistan and emigrate to the northern reaches of Khyber Pakhtunkhwa

Province and Gilgit-Baltistan, Pakistan, often in areas of high elevation with rugged

mountainous terrain (Middleton 1995). Despite the terrain, the Hazara are found in

dense populations. The climate in both Afghan and Pakistani regions of the greater

Hazarajat is marked by harsh winters and short summers. In Afghanistan, rough estimates of Hazara population size range from 1 to 1.5 million, while estimates of Hazara population size in Pakistan range from 17,000 to 70,000.

The name “Hazara” reflects a Mongol-Persian blend: it is derived from the word meaning “thousand” in Farsi and is thought to be the Persian equivalent of the Mongol word of the same meaning, minggan (Middleton 1995). This word also describes a fighting unit represented by a kinship group that provided a thousand horsemen and has become synonymous with “tribe” (Middleton 1995). This meaning subsequently evolved into

“mountain tribe” and after the 15th century was used to describe a specific group of

people (Middleton 1995). Their traditional language, Hazaragi, is of the Indo-Iranian language family but is marked by numerous Mongolian inclusions and loanwords

(Middleton 1995).

Bacon (1951) notes that the Hazara have a strong “Mongoloid” appearance that makes it easy to distinguish them visually from neighboring populations (Fig. 1). Most have broad faces with high cheek bones, flat nasal bridges, narrow eyes due to epicanthic folds, scant facial hair, and are of rather short stature (Bacon 1951). Not

surprisingly, the Hazara have been described as being of Mongoloid appearance by

other researchers (Middleton 1995; Thesiger 1955).

Figure 1. Image of Hazara individuals depicting Asiatic facial characteristics (Bacon, 1951).

Traditionally nomads, the Hazara herd ovicaprids and horses, supplemented by agricultural reliance on mixed grains, including wheat and barley, with the addition of fava beans (Middleton 1995). According to Thesiger (1955: 314), the Hazara of Afghanistan live in mud and stone houses built on the valley floors with farming and

irrigation systems constructed on the hillsides above the villages. The villages contain

watch towers with the chiefs living in fort-like rectangular structures with courtyards at

their center (Thesiger 1955). The Hazara kinship system is endogamous, being

organized in lineages with descent traced through the male line with preferential

marriages to their patrilateral first cousins (Middleton 1995). Such ethnographic information suggests that Hazara marital practices have discouraged gene flow, acting as a genetic isolating mechanism allowing very little admixture with neighboring groups.

Contemporary Hazara are predominantly Shi’a Muslims, a source of contention with the majority of Afghans and Pakistanis, who identify as Sunni Muslims.

Hazara Ethnic Origin Theories

Descendants of Genghis Khan’s Army

It is commonly asserted that the Hazara are descendants of the army of Genghis

Khan, who marched into the area during the 13th century. These Mongol families

settled and remained long after the dissolution of the Mongol empire during the 14th

century and they subsequently adopted local customs and converted to the local religion, Islam. An ethnographic account of the Hazara of Hazarajat was conducted by

Bacon, who asserts the Hazara have no precise traditions regarding their origins and only a few members are even familiar with the name ‘Genghis Khan’ (Bacon 1951:232).

Elphinstone (1842 in Bacon 1951) had similar findings, and concluded that the Hazara have no account of their own origins. A previous ethnographic account documents the origin of the Hazara according to a chief of the Turbat-Jam region who claimed the present population had belonged to one of the largest tribes of the ‘Moghuls’ that

rebelled against ‘Chingiz Khan’s’ orders for their removal from the Moghulistan area to

the Kohistan region of Kabul (Elias 1989 in Bacon 1951). According to oral tradition, the ruler died as this order was being carried out, just after the Hazara had crossed the Oxus.

One of ‘Chingiz Khan’s’ descendants forced some of the Hazara to complete their move

to the designated area, but the rest escaped to Badghis Province, Afghanistan.

Historic accounts note Mongol troops operating south of the Hindu Kush during the winter of AD 1222-23. Unsuccessful in their attempt to return to Mongolia through Tibet, the Mongol armies returned to Peshawar and subsequently proceeded north across the Hindu Kush (Bacon 1951). According to Bacon (1951: 236), an

Iranian campaign was undertaken in AD 1224. A number of cities in the northern areas

were destroyed as a result of this campaign, but there is no historical documentation

that any of the army remained south of the Oxus River after the return of ‘Chinggis

Khan’ to Mongolia and his subsequent death in AD 1227. The Mongol Empire was

divided amongst his four sons but none of the empire was located south of the Oxus

River. The steppe country north of the Oxus River was ruled by his son Chagatai, who

made numerous raids into the Khorasan region of north-central Afghanistan and

crossed the Hindu Kush at various intervals between AD 1282 and 1306, including nine

expeditions across the Indus River (Bacon 1951). According to Bacon (1951), the

Mongols were unable to occupy that portion of northwestern India but they did manage

to secure areas within what was later known as Hazarajat with 50,000 men who were

accompanied by their families and livestock. Thus, it is possible that the Hazara are

descendants of these Chagataian Mongol troops and their families who entered the

Hazarajat region at various times between AD 1229 and AD 1447 (Bacon 1951: 238,

241-2). An ethnographic account of the Hazara by Thesiger (1955:313) claims that the tribe descends from those left by Jagatai, ‘Jinghis Khan’s’ son, or by his grandson Mangu, to guard the lands acquired during the Mongol invasion.

This origin theory is consistent with recent DNA research identifying Hazara

origins as East Asian, which would help explain the physical appearance of the Hazara

people as they exhibit many characteristics similar to East Asians (Hunley et al. 2009,

Bellew 1979). This theory would also account for some of the Hazara tribal names,

which appear to reflect Mongol names. For example, the tribe of Tulai Khan was

named after the Mongol army general and youngest son of Genghis Khan, Tolui. An

alternative theory is that the Hazara represent descendants of a Uyghur Turkic

population that arrived much earlier to Afghanistan (Bellew 1979). Barthold (1928 in

Bacon 1951) asserts that this group fought both for and against the forces of Genghis

Khan’s armies and suggests that the remnants of this population were integrated into

what is now known as the Hazara. However, this alternative explanation fails to account

for the East Asian physical characteristics and possible genetic admixture.

Descendants of Kushans

There are competing theories that the Hazara are descendants of the population belonging to the Kushan Empire, which existed during the first three centuries AD. The

Kushan Empire extended north of the Karakoram Mountains to include the Tarim Basin

of Xinjiang and southward to include the Indus and Gangetic Valleys of northern

peninsular India. These people are famous for constructing the Buddhas of Bamiyan.

The Bamiyan Valley, a basin bordered by steep cliffs within the Hazarajat region of central Afghanistan, served as a stopping point along a branch of the Silk Route. According to the UNESCO World Heritage Site description, the

Bamiyan Valley contains artistic motifs in edifices and cave walls that integrate many cultural influences into the Gandharan school of Buddhist art (UNESCO 2003).

According to these theories, the Hazara were the Buddhist monks who lived in the city of Bamiyan and occupied the area when it became a center of Buddhism approximately 2,000 years ago. If this is the case, then these Buddhist monks may have come from Gandhara, which seems to have been centered on the Peshawar Valley, Taxila and the valley immediately north of the Indus Valley proper; as such, they should be biologically similar to northern Pakistani groups such as the Swati. Alternatively, given that the Peshawar Valley represents a northwestern extension of the Indus Valley, it may be that these Buddhist monks possess affinities to earlier populations of the northern Indus Valley and adjacent regions, such as Timargarha, Sarai Khola and, perhaps, even Harappa itself. However, Bamiyan was ransacked by Genghis Khan’s military forces during the siege of AD 1221, a historic event that could have initiated genetic admixture of the two populations, as many posit that the Hazara are representative of both populations. Barthold (1928 in Bacon 1951) asserts that this group fought both for and against the forces of Genghis Khan’s armies and suggests that the remnants of this population were integrated into what is now known as the Hazara. Nevertheless, many of these origin theories are based on oral tradition and local mythology, not scientific investigation of biological origins. For this reason, the origin of the Hazara is the topic of this thesis.

Dental Ontogeny

“The link between gene and character is ontogeny” – P.M. Butler (1982:45)

The above statement by Butler emphasizes the intervening factors that shape the phenotypic manifestation of dental traits and the allocation of tooth size throughout the dentition, namely that “ontogeny is intermediary between genotype and phenotype” (Scott and Turner 1997: 86). It is important to understand the ontogeny of teeth, as well as the genetics that control them, in order to better grasp the causal factors of dental morphometric variation. Therefore, the odontogenesis of both crowns and roots will be discussed first. This is followed by a consideration of the robusticity of genetic control and of the environmental factors that can cause development to deviate from it, since together these produce the phenotypic expressions that can be measured and scored by the dental anthropologist.

The initiation and development of tooth germs begin six weeks after fertilization. The development of teeth is a consequence of the interaction between the ectoderm and mesoderm layers of the embryo (Scott and Turner 1997). The lining of the embryo’s mouth consists of a layer of epithelial cells with a layer of mesenchymal cells underneath it. These oral mesenchyme cells are derived from the neural crest and are often referred to as ectomesenchyme (Scott and Turner 1997). The mesenchyme cells begin to differentiate and proliferate within the parabolic-shaped zone of the developing maxillary process and mandibular arch where epithelial cells subsequently grow to form the primary epithelial band (Hillson 1996). This band divides into the vestibular lamina and the dental lamina. The epidermal growth factor (EGF) mRNA is responsible for

inducing the formation of dental lamina, for without it tooth germs will not form

(Kronmiller et al. 1991). It is along the edge of the dental lamina that small swellings representing enamel organs develop. Twenty enamel organs for the deciduous dentition form by the tenth week. Enamel organs for the permanent dentition begin to appear around the sixteenth week after fertilization. They are eventually responsible for forming the enamel of the permanent tooth crowns.

Six morphological stages are used by oral biologists to describe tooth germ development: (1) dental lamina; (2) bud; (3) cap; (4) early bell; (5) late bell; and (6) enamel and dentine matrix formation (Scott and Turner 1997). The cap stage is marked by the formation of a unilateral hollowing of the enamel organ bud that is filled and covered with mesenchyme cells (Hillson 1996). The mesenchyme cells within the structure are known as the dental papilla and later form the dentine, the follicle, and the cement. An enamel matrix is formed by the enamel organ differentiating a layer of epithelial cells. During the bell stage, the hollow deepens and the pattern of folds that define the shape of the crown occurs. The cells of the enamel epithelium divide throughout the structure, except where the cusps of the crowns are going to develop.

Small clusters of cells stop dividing at these locations and the epithelium “buckles into folds as the cells in between continue to divide” (Hillson 1996: 119). The locations of cell division cessation begin to differentiate into odontoblasts and eventually ameloblasts. The odontoblasts secrete a predentine matrix. Ameloblasts are responsible for the secretion of an enamel matrix. A structure known as Hertwig’s sheath forms at the rim of the tooth germ, which will serve as the location of the future cervical edge of the crown (Hillson 1996). This is a cuff of cells that extends from the

rim of the tooth germ in the form of a tube that is responsible for the shape of the tooth root. The enamel is then deposited in layers forming the base of the cusps and ridges that become progressively wider until they coalesce (Hillson 1996). The occlusal surface forms and is followed by the development of overlapping sleeve-like layers that form the sides of the crown (Hillson 1996). Predentine is simultaneously deposited in the inner region of the initial enamel matrix and forms the walls and the roof of the pulp chamber. These layers continue to be deposited down the shaft of the root and eventually form the floor of the pulp chamber, roots, and lastly the cuspal apices.

Teeth are but one example of a meristic series, which are segmented and repeated structures found in both invertebrates and vertebrates (Scott and Turner

1997). These repeated structures, known as metameres, exhibit duplication with variation that is expressed along a gradient, such that within a given developmental field adjacent teeth are most similar to one another (Weiss 1990). These gradients are divided into different morphological classes based on crown form (Scott and Turner

1997). There are competing views on the nature of the gradients and dental fields (see

Butler 1937, 1939, Osborn 1978). The morphogenetic fields are represented by four zones of different morphological classes of teeth, each possessing one tooth, known as the ‘key’ tooth, that is considered to be the most stable member against developmental and evolutionary changes (Dahlberg 1945). This idea is supported by Alvesalo and Tigerstedt (1974), who analyzed odontometric variation within 90 sets of siblings and concluded that, since the coefficient of variation was higher in distal than in mesial members within each tooth group, these distal members were subject to greater influence from non-genetic (environmental) factors.
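The comparison of relative variability described above rests on the coefficient of variation, the standard deviation expressed as a percentage of the mean. The following minimal sketch shows the calculation for a mesial and a distal member of one tooth group. The diameters are hypothetical values invented for illustration; they are not drawn from Alvesalo and Tigerstedt (1974) or from any other cited study.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation expressed as a percentage of the mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)

# Hypothetical mesiodistal diameters (mm); NOT data from any cited study.
mesial_member = [10.4, 10.6, 10.5, 10.7, 10.3]
distal_member = [9.6, 10.4, 9.1, 10.8, 9.9]

cv_mesial = coefficient_of_variation(mesial_member)
cv_distal = coefficient_of_variation(distal_member)

# A higher value for the distal member indicates greater relative variability.
distal_more_variable = cv_distal > cv_mesial
```

Because the coefficient of variation is standardized by the mean, it permits teeth of different absolute size to be compared on a common scale, which is why it is suited to contrasting members within a tooth group.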

The perspective of dental ontogeny posited by Osborn (1978) states that the odontogenesis of tooth buds within each morphological class (incisors, canines, and molars) begins with pre-programmed ectomesenchymal clone cells that grow and migrate distally within the morphogenetic fields. The tooth buds become enveloped by zones of inhibition preventing other tooth buds from developing until the clone has migrated sufficiently to its predetermined location (Townsend et al. 2009). Gradients in the phenotypic expression of genetically coded shape characteristics and sizes form as a result of the difference in time allowed for the cellular division of each tooth bud within the same morphogenetic field or class (Osborn 1978)3.

The homeobox theory is yet another model used to explain odontogenesis. Homeobox genes regulate gene expression during embryonic development, encoding groups of transcription factors whose various combinations specify the patterning of individual structures (Thesleff 1995). Thesleff (1995) focused on homeobox-containing genes responsible for the patterning of the head and facial structures of Drosophila and found the function of the Msx-1 homeobox gene to be necessary for tooth development. Sharpe (1995) provides evidence for the highly conservative nature of the genetic homeobox using the similarities in the genetic control of embryonic development between Drosophila and mammalian species, and thus lays the foundation for the application of the homeobox theory to human odontogenesis.

A new perspective on dental ontogeny has been offered by Townsend and coworkers (2009), who blend what began as competing theories into a mutually supporting amalgamation of Butler’s initial morphogenetic field concepts (1939), Dahlberg’s ‘key’ or ‘pole’ tooth concept (1945, 1951), the clone theory posited by Osborn (1978), and the concept of odontogenic homeobox gene influence (Thesleff 1995; Sharpe 1995; Mitsiadis and Smith 2006), a theory referred to as cooperative genetic interaction (Mitsiadis and Smith 2006). This new theory is based on the discovery of signaling molecules and the expression of homeobox genes in neural crest derived ectomesenchyme during dental development that ultimately leads to the establishment of morphogenetic fields (Townsend et al. 2009: 35). Line (2001: 36) emphasizes that morphogenetic fields are not restricted to single gene expression but rather these genetic influences are modulated by epigenetic effects. While studying the morphogenetic fields associated with the MSX1 and PAX9 genes, Line (2001: 36) found they were neither limited to a single tooth class nor did they follow a simple gradient pattern, a phenomenon he interpreted as resulting from the interaction between different signaling molecules.

3 It should be pointed out that Osborn (1978) and various proponents of Butler’s field theory blithely assume that the expression of crown traits and the overall conservatism of the crown are greatest for the key tooth within each morphogenetic field and taper off among the distal members. However, this has been shown to be a frequently violated assumption (Hemphill 2011).

The idea of cooperative genetic interaction initially proposed by Mitsiadis and Smith (2006: 37) incorporates the ‘clones’ of neural crest derived cells, the homeobox containing genes in the mesenchyme, and the aforementioned signaling molecules released by the oral epithelium in explaining dental ontogeny. Other researchers, such as Kondo and coworkers (2005) and Takahashi and coworkers (2007), assert that the increased variation in the ‘distal’ tooth relative to the ‘key’ or ‘pole’ tooth arises because these ‘distal’ teeth spend a longer amount of time in a soft tissue phase prior to mineralization. As a result, these ‘distal’ teeth experience a greater opportunity for epigenetic and/or environmental factors to influence their phenotypic expression. If this is true, then distal teeth within a field should always be more variable than their “key” counterparts4.

The controlling mechanisms for dental development are complex and hard to fit into just one theory. Recently, researchers (Harris and Harris 2007, Hemphill 2013b) have demonstrated population-specific grades and gradients within morphogenetic fields that are not in strict accordance with the expectations of Butler’s, Dahlberg’s, Osborn’s or even Townsend and coworkers’ models for dental development. Evidence for this is provided by Harris and Harris (2007), who assessed population-specific differences in the mesial-distal crown size gradient of teeth within each morphogenetic field (incisors, premolars, and molars) using a worldwide collection of 107 samples, which they divided into seven geographic-racial groups, and found statistically significant correlations among gradients of different tooth types using one-way factorial ANOVA and Tukey’s HSD post hoc tests. These researchers found Caucasians to possess the steepest gradients and aboriginal Australians to have the shallowest gradients. Harris and Harris (2007: 14) posit that the biochemical nature of morphogenetic fields is responsible for the degree of “steepness.” That is, they suspect the steepness of a crown size gradient is a reflection of how sharply the molecular gradient drops with the distance between the mesial, or key tooth, and the distal tooth within a field and also the timing of signaling molecules.

4 Again, the situation appears more complex, for Hemphill and Hlusko (2013) found multiple violations of the dictum that later developing distal members of morphogenetic fields are inherently more variable than their mesial metameres. These violations occurred in all multiple-member morphogenetic fields of both dental arcades.

Dental morphology analyses also reveal population-specific differences within morphogenetic fields. Evidence for this is provided by Hemphill (2011), who compared 205 individuals from Madaklasht, Pakistan to both living and prehistoric individuals from Central Asia, Pakistan and India using univariate and multivariate analyses of 17 morphological tooth characteristics, eight of which involved assessments of the same trait on both ‘key’ and distal members of the same morphogenetic field. Hemphill (2011: 45) found four traits that conformed to the expectation of key teeth possessing higher trait prevalence than distal members and four that did not conform to this expectation. The four ‘nonconformist’ traits include shoveling and median lingual ridge development on the maxillary incisors, presence of the metaconule on the maxillary molars, and the presence of the metaconulid on the mandibular molars. Such results suggest a more complex relationship than the previously proposed key tooth-distal tooth dichotomy in phenotypic expression of morphological traits (pp. 46-47). The results of this study demonstrate the importance of using both ‘key’ teeth and distal members when analyzing trait frequencies and, furthermore, demonstrate how the presence of population-specific gradients in trait expression within morphogenetic fields is informative in biological distance analyses (p. 48). The stability of tooth form and the underlying causes of its variation are important to understand because they serve as principles that lay the foundation of this scientific investigation.

Odontometric Heritability

Dental development is influenced by a number of factors. In order to use

differences in dental variation to infer patterns of relatedness between populations, it is

important to know how much variation is due to genetic heritability and how much is a

product of environmental influences. It is important to keep in mind, however, that the

human dentition is highly integrated and represents a strongly canalized developmental

system (Saunders and Mayhall 1982). Despite this, some environmental changes can

alter the normal ontogenetic course of dental development. Kollar and Kerley (1979)

demonstrate the importance of timing and context in tooth formation through the study

of isolated enamel organ epithelial and dental papilla cells. Kronmiller and coworkers (1992), using mouse embryos, noted that retinoids, the active agents of vitamin A, produce a combination of supernumerary, fused, and missing teeth when introduced during the initial stage of development. When introduced at a later stage, they produce changes in enamel histodifferentiation, as noted by Glasstone (1979), who used rabbit molar germs.

The mechanisms of genetic canalization responsible for dental ontogeny work

well to produce teeth and their structures despite possible environmental effects and

chromosomal abnormalities (Scott and Turner 1997:128). The potential introduction of

odontogenetic polymorphisms caused by mutations is under strong selective pressures

(Scott and Turner 1997). Teeth, from a phylogenetic viewpoint, are under very strong

control. Human teeth have evolutionarily conservative components that are nearly identical in cows, pigs, and mice. These include such things as the amino acid sequences for amelogenin, which is the main protein in the soft enamel matrix (Ten Cate 1994). Indeed, the general tooth crown form for hominoid primates has remained relatively stable for more than 30 million years (Scott and Turner 1997:128).

Strict governance and control over the entire dentition by genes was asserted by Kraus and Furr (1953). However, according to Garn and coworkers (1968) and Lavelle (1972), who looked at generational differences between fathers vs. sons and mothers vs. daughters, tooth size can exhibit a degree of plasticity. Garn and coworkers (1968), using chi-square analysis with the mesiodistal diameters of 46 fathers and 49 sons and 34 mothers and 51 daughters, found significantly larger teeth in the second generation samples, more often in males than females. These researchers concluded that such results were either a reflection of increased nutritional status or were a consequence of genetic drift involving the X-chromosome. Lavelle (1972) found secular trends among Caucasoid, Mongoloid and Negroid descent groups using the mesiodistal and buccolingual diameters, dental arch dimensions and osteometric data of 240 individuals from 60 families. Lavelle (1972) employed multivariate canonical analyses based upon squared generalized distances and found a low degree of correlation between dimensions of parents and offspring.

Despite these previous assertions, Garn and Bailey (1977:82) conclude that dental development is under stronger genetic control than most other calcified tissues.

Townsend and Brown (1978b) found the heritability of odontometric variation to be approximately 64% among Australian Aborigine half-siblings, which is much lower than an early estimate by Garn and coworkers (1965) who estimated a heritability value of nearly 90%, a difference likely attributable to differing methodologies. Nevertheless, it is because of this strong genetic factor that odontometrics can be used to infer biological

relatedness of populations. This is summarized by Moorrees (1962), who posits that

using dental variation to determine ‘race’ can only be accomplished because of the

widely recognized fact that tooth form and size are genetically determined.

It is only when phenotypes possess a strong heritable component, such as dental morphological traits and odontometric characteristics, that they can be used to accurately assess population affinity. Not only are the aforementioned phenotypic expressions highly heritable but they can both be used to elucidate similar affinities.

According to Hemphill (2013b: 317), dental morphology trait frequencies and

odontometric tooth size allocation assessments produce similar, although distinct,

results and can be used in tandem with one another because ultimately they are the

consequence of the same differentiating process. Hemphill (2013: 371) utilized a matrix

correlation test between the triangular matrix of Smith’s MMD values obtained from

dental morphology trait frequencies and the triangular matrix of squared Euclidean

distances obtained from geometrically scaled mesiodistal tooth lengths and

buccolingual tooth breadths from different living samples of Khowars, various ethnic

groups from western India and prehistoric groups from Central Asia and found the

matrices to be significantly correlated.
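A matrix correlation test of this kind can be sketched as follows. The text does not state which test was used, so the example assumes a Mantel-style permutation test, which is one common way to compare two triangular distance matrices. The two matrices are hypothetical values invented for illustration and the function names are assumptions; nothing here reproduces Hemphill's data or results.

```python
import random
from statistics import mean


def upper_triangle(matrix):
    """Flatten the above-diagonal cells of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mantel_test(d1, d2, permutations=999, seed=0):
    """Correlate two distance matrices; assess significance by permuting samples."""
    rng = random.Random(seed)
    n = len(d1)
    observed = pearson(upper_triangle(d1), upper_triangle(d2))
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)
        # Relabel the samples of d2 (rows and columns together), then re-correlate.
        permuted = [[d2[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(upper_triangle(d1), upper_triangle(permuted)) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical distances among four samples (illustrative only, not real data).
morphology = [[0.00, 0.10, 0.40, 0.50],
              [0.10, 0.00, 0.35, 0.45],
              [0.40, 0.35, 0.00, 0.15],
              [0.50, 0.45, 0.15, 0.00]]
odontometric = [[0.0, 1.2, 3.9, 5.1],
                [1.2, 0.0, 3.1, 4.4],
                [3.9, 3.1, 0.0, 1.6],
                [5.1, 4.4, 1.6, 0.0]]

r, p = mantel_test(morphology, odontometric)
print(f"matrix correlation r = {r:.3f}, permutation p = {p:.3f}")
```

Rows and columns are permuted together because the cells of a distance matrix are not independent observations. With only four samples there are just 24 possible relabelings, so the permutation p-value cannot become very small; real analyses involve many more samples.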

A number of studies can be used to support the previously posited claims of

odontometric heritability. Using monozygotic twins allows researchers to study the

effects of developmental and environmental noise on the same genome and such

studies consistently demonstrate the robusticity of genetic control over dental

morphometrics. Lundström (1963) used crown morphology to assess zygosity in 124

twin pairs from Michigan and New York and correctly diagnosed 66 of the 72

monozygotic pairs (91.7%) and 51 of the 52 dizygotic pairs (98.1%), with an overall

precision rate of 94.4%, through a comparison of observed differences in cusp number, fissure patterns, crown form, and lingual variations of the anterior teeth. The results of

Lundström (1948, 1954, 1955, and 1967) also support these findings, for he concluded

that accurate prediction of monozygotic and dizygotic twins can be achieved using

dental morphological frequency data.

Goldberg’s (1929) initial study demonstrated the claims posited by previous

investigations. Horowitz and coworkers (1958) analyzed the heritability of tooth size

using mesiodistal diameters of anterior teeth in twins and concluded that zygosity can

be predicted accurately with the canine exhibiting the least measurable genetic

variability while the lateral incisors exhibited the most. Menezes and coworkers’ (1974)

odontometric study using triplets further supports the conclusions of Horowitz and

coworkers (1958). Studies have also been undertaken to determine the extent to which

the X-chromosome is responsible for the heritability of odontometric similarity. Garn

and coworkers (1965) investigated the contribution of the X-chromosome to the

heritability of tooth size with Pearson’s mean product-moment correlations and

determined that it was partially responsible for the size of the mesiodistal dimension.

Conversely, Townsend and Brown (1978a) used the same methodology as Garn and

coworkers (1965) in an investigation of X-chromosome and odontometric correlation but

they found no supporting evidence among full and half-sibling Australian Aborigines.

Nevertheless, the overall findings of Garn and coworkers’ (1965) study are further

supported by Lewis and Grainger’s (1967) investigation of X-chromosome odontometric

influence using parent-child pairs.

Bulmer (1970) and Smith (1974, 1975) developed a formula to estimate heritability of tooth size using intra-class correlation coefficients of monozygotic and dizygotic twins. Potter and Nance (1976) and Mizoguchi (1977) utilized this formula to advance a claim of rather low heritability values for the mesiodistal dimensions of the maxillary canines and first premolars, as well as for the mandibular lateral incisors, premolars and first molars. Given the results of the predictability of zygosity studies, it appears that either Bulmer and Smith’s formula is not a good indicator of heritability or there are other intervening factors not accounted for. Potter and coworkers (1976) conducted a follow-up study to determine whether odontometric heritability is under independent genetic control with respect to its genetic determinants and found a pleiotropic effect of independent genes or groups of genes. They further found that the maxillary and mandibular dentitions are determined independently from one another.

Nevertheless, according to Harris (2003), absolute variation by dental arcade represents only 6.9% of the total variation based on the results of multivariate analyses using mesiodistal and buccolingual measurements of 100 American white and 100 American black individuals.

Variation in asymmetry, population-specific sex dimorphism, and other minor differences within individuals can be influenced by environmental factors (Scott and Turner 1997). Fluctuating asymmetry refers to characteristics that share the same

genetic coding for both right and left sides but are still not symmetric. Teeth in the

corresponding quadrants of both upper and lower jaws are “symmetrical structures that

exhibit mirror imagery” (Scott and Turner 1997:96). Deviations from the genetically

coded instructions are attributed to developmental disturbances. Tooth size fluctuations

are attributed to metabolic stress induced by nutritional deprivation, pathological

affliction, or inbreeding depression and therefore may result in some degree of

fluctuating asymmetry between antimeres (Scott and Turner 1997). The potentially

measurable difference produced by fluctuating asymmetry could have positive

implications, serving as a useful tool for dental anthropologists in identifying similarities

and differences between samples. However, fluctuating asymmetry of dental

characteristics, including overall size, is relatively low (Scott and Turner 1997). In

addition, human crown dimensions exhibit low levels of sex dimorphism with males

exhibiting teeth that are 2-6% larger than females (Scott and Turner 1997). This has

been demonstrated by a number of dental researchers including Moorrees (1957), Garn

and coworkers (1964, 1966), and Mizoguchi (1988). The fact that sex determination

standards developed for one specific population work poorly for predicting sex in other

populations suggests that even the expression of sex dimorphism in tooth size among

modern humans is population-specific and may have important implications for forensic

and biodistance analyses. Statistical measures will be employed to rule out any variation found within this study that could be caused by such factors.

Biological Distance

A more comprehensive knowledge of biological distance and its statistical measurement is necessary to better understand the methods and theories utilized in this research. Using biological data to infer relationships among different populations elucidates patterns of population movements, mixtures, and modifications (Scott and Turner 1997). In order to assess scientifically how populations are related, multiple quantitative variables must be assessed simultaneously, thereby producing a single value that summarizes the overall difference between the populations or groups in question (Scott and Turner 1997: 255). These values, known as distance statistics, are relative measures of relationship based on shared affinities. Most dental anthropologists use distance values that represent measures of dissimilarity. A pairwise distance coefficient of 0.0 indicates that the two samples being measured have identical trait frequencies or patterning of tooth size. Therefore, as dissimilarities increase the distance coefficients increase. When these pairwise distance values are arrayed in matrix form, the smallest values represent the groups with the greatest degree of similarity and the larger values are associated with groups that are more divergent.
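The logic of a pairwise dissimilarity matrix can be illustrated with a minimal sketch. The example assumes squared Euclidean distance as the distance statistic, which is only one of several possible choices, and the sample names and values are hypothetical placeholders rather than data from any study discussed here.

```python
from itertools import combinations


def squared_euclidean(a, b):
    """Sum of squared differences between two equal-length variable profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def distance_matrix(samples):
    """Build a symmetric matrix (dict of dicts) of pairwise distances."""
    return {p: {q: squared_euclidean(samples[p], samples[q]) for q in samples}
            for p in samples}


def closest_pair(matrix):
    """Return the two sample names separated by the smallest distance."""
    return min(combinations(matrix, 2), key=lambda pair: matrix[pair[0]][pair[1]])


# Hypothetical size-scaled tooth dimensions for three samples (illustrative only).
samples = {
    "Sample A": [1.02, 0.98, 1.10],
    "Sample B": [1.01, 0.99, 1.09],
    "Sample C": [0.90, 1.05, 1.20],
}

matrix = distance_matrix(samples)
for name, row in matrix.items():
    print(name, [round(value, 4) for value in row.values()])
print("Most similar pair:", closest_pair(matrix))
```

Each sample has a distance of 0.0 to itself, and the pair with the smallest off-diagonal value is read as the most similar.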

A number of different statistical techniques have been employed with the same basic function, which is to determine the relative degree of pairwise similarities and dissimilarities (Scott and Turner 1997). Such an approach is supported by Cavalli-Sforza and coworkers (1994:30) who state, “In general, the distances calculated by different formulas are always highly correlated.” An initial attempt to develop a distance statistic was made by Pearson in 1926. Known as the ‘coefficient of racial likeness,’ this multivariate statistic sought to determine whether the overall distance between two groups was statistically significant. Deficiencies with this statistic, such as the lack of adjustment for correlated variables, were addressed by Mahalanobis’ (1936) generalized distance (D²) statistic (Scott and Turner 1997). In the 1950s Penrose developed size and shape distance statistics (Scott and Turner 1997).

The observed between-group differences are usually assessed using these traditional distance statistics and analyzed using multivariate techniques, such as

various forms of cluster analysis, principal coordinates, principal components,

multidimensional scaling, canonical variables, discriminant functions and factor analysis

(Howells 1989). Use of multiple groups results in large matrices that are difficult to use

for pattern evaluation. It is because of this that graphical methods have been

developed that reduce the complexities of the matrices into two or three dimensions

(see Sokal and Sneath 1963). Cluster analysis is a popular method for reducing

distance values to two dimensions in the form of dendrograms. In these graphical

depictions, the groups that have similar trait frequencies or patterning in tooth size

cluster together on one branch. Principal components analysis, factor analysis, and

multidimensional scaling can provide X, Y, and Z values along orthogonal axes for each

group in a given analysis. These values can then be plotted in either two (X,Y) or three

(X,Y,Z) dimensions thereby providing visual representation of inferred biological

distance with the groups possessing small pairwise distance values having similar

coordinates and hence plot closely together (Scott and Turner 1997).
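The way cluster analysis reduces a distance matrix to a dendrogram can be sketched as follows. The example assumes average-linkage (UPGMA) agglomerative clustering, which is only one of the "various forms of cluster analysis" mentioned above and is not necessarily the form used in any study cited here. The sample labels and distances are hypothetical.

```python
from itertools import combinations


def average_linkage(c1, c2, dist):
    """Mean of all pairwise distances between the members of two clusters."""
    return sum(dist[a][b] for a in c1 for b in c2) / (len(c1) * len(c2))


def upgma(dist):
    """Repeatedly merge the two closest clusters; return the merge history."""
    clusters = [(name,) for name in dist]
    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: average_linkage(pair[0], pair[1], dist))
        height = average_linkage(c1, c2, dist)
        merges.append((c1, c2, height))
        # Replace the two merged clusters with their union.
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Hypothetical pairwise distances among four samples (illustrative only).
dist = {
    "A": {"A": 0.0, "B": 1.0, "C": 8.0, "D": 9.0},
    "B": {"A": 1.0, "B": 0.0, "C": 7.0, "D": 10.0},
    "C": {"A": 8.0, "B": 7.0, "C": 0.0, "D": 2.0},
    "D": {"A": 9.0, "B": 10.0, "C": 2.0, "D": 0.0},
}

merges = upgma(dist)
for left, right, height in merges:
    print(f"join {left} + {right} at height {height}")
```

Each merge corresponds to one node of the dendrogram, and the height at which two branches join reflects the average distance between their members, so samples with similar patterning join early on a shared branch.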

Geneticists and statisticians have also developed other techniques for comparing

population genotypic and phenotypic frequencies, including variants of the chi-square

statistic, angular transformations of frequencies, and kinship coefficients (Constandse-Westermann 1972, Weiner and Huizinga 1972). According to Livingstone (1991), there

are a few assumptions that underlie biological distance statistics. First, it is assumed

that a small inter-group distance value is indicative of a close biological relationship and

recent common ancestry while larger values represent a more distant relationship.

Second, it is assumed that the between-group divergence is due to the stochastic

processes of genetic drift and founder effect, while gene flow leads to a lessening of biological distance between participating groups (Livingstone 1991). It is assumed that none of the traits are subjected to natural selection and that none of the traits are correlated, at least to a significant degree, with one another5. Lastly, it is assumed that

these distance values become more reliable at assessing relatedness when based on

many variables rather than just a few and the traits used are equally weighted

(Livingstone 1991). Many of these techniques will be employed in this biodistance

analysis of the Hazara.

5 In this regard biodistance analyses based upon nonmetric morphological traits of the permanent tooth crown differ markedly from biodistance analyses based upon the allocation of permanent tooth size. In the former, one is dealing with differences in trait frequencies and hence a suite of nonparametric statistical techniques, including Smith’s (1962) Mean Measure of Divergence, are used to derive the triangular matrix of pairwise distances between samples. By contrast, tooth size allocation analyses depend upon differing degrees of inter-trait correlations within and between members of various morphogenetic fields. It is these correlations that permit increasing amounts of the overall variation between samples to be captured by cluster analyses and by the orthogonal vectors generated by multidimensional scaling and principal coordinates analysis.

Previous Dental Studies in South Asia

Dental anthropological studies have been conducted over the course of many years, often to differentiate between species of primates or characterize genetically different human populations. Researchers have used both dental metric and morphological data to create odontographies, using indices and trait frequencies to elucidate biological affinities and population genetic histories as early as the late 19th century. Some of the earliest research involving South Asian populations focused on characterizing regional populations of the world (see Flower 1885, De Terra 1905). The potential of dental characteristics to determine the relatedness of groups continued to be exploited by researchers into the 20th century, expanding the knowledge of genetically determined traits and the validity of dental metric and morphology in diagnosing biological affinities. Hrdlicka (1920) first noticed the high frequency of a morphological characteristic correlating to specific populations when examining the central incisors of Native Americans in 1907. Hellman (1928) analyzed molar cusp and groove pattern frequencies of different human populations. Despite Hellman (1928) using the Dryopithecus pattern to gauge racial evolutionary progression, he was, nevertheless correlating cusp pattern with racial likeness and observing the different frequencies of expression for a specific genetically determined trait. As such, Hellman’s effort to categorize closely related populations represents an early effort that demonstrated the potential of dental morphology for determining biological affinities between human populations.

An example of an early attempt to employ dental morphology in the study of the biological affinities of a South Asian population is Bowles’ (1943) study of dental characteristics among the Munda, an Austroasiatic-speaking non-caste tribal group found in east-central India. Another early study was undertaken by Tratman (1950), who analyzed crown and root characteristics of South Indian Tamils in comparison to groups from Malaysia and China. Banerjee (1967) analyzed skeletal material from West Bengal and provided a description of root morphology and the frequency of congenitally missing teeth. Joshi and coworkers (1972) observed the frequency of Carabelli’s trait among 489 Gujarati children from Ahmedabad to determine sex-based morphological differences. An investigation of Punjabi morphometric and occlusal patterns was conducted by Sharma and Kaul (1977) using plaster casts of 80 individuals. In addition, Lukacs (1983) described the dental remains of 16 individuals recovered from early Neolithic levels at Mehrgarh, Baluchistan, and reported on caries rates, dental metrics, and morphological features such as shovel-shaped incisors, metaconule frequencies and the occurrence of Carabelli’s trait.

One of the first studies that utilized dental characteristics to analyze population interactions and to determine the degree of biological relatedness of different South Asian populations is Lukacs (1977), who used dental morphological data from individuals belonging to a series of Hindu caste groups from Maharashtra and Bengal and demonstrated the presence of an east-west cline in dental characteristics, from a more European pattern in the west to a more Asiatic pattern in the east. Further studies of South Asian gene flow patterns were conducted by Lukacs and Hemphill (1991), whose dental analyses of prehistoric populations concluded there were at least two western population

incursions into the South Asian subcontinent. More recent studies of the biological

affinities and population genetic histories of South Asians in comparison to prehistoric

Central Asians and living and prehistoric South Asian populations have been

conducted using both odontometric and dental morphological analyses with numerous

studies conducted by Hemphill (see Hemphill 1991, 2008, 2009, 2010, 2011, 2012,

2013a, 2013b, Hemphill and coworkers 1992a, 1992b, 2000, 2013) and others using

data he facilitated and/or directly participated in the collection of (Barton and Hemphill

2012, Blaylock 2008, Blaylock and Hemphill 2007, Guzman and Hemphill 2012, 2013,

O’Neill 2013, O’Neill and Hemphill 2009, 2010, 2012, Willis and Hemphill 2008, Willis

2010, Willits and Hemphill 2007). For example, Blaylock (2008) described the biological relatedness of the Khowar, a Pakistani population, using frequencies of dental morphological characteristics in comparison to regional living and archaeological samples and found the Khowar share close biological affinities to prehistoric Central Asians. In addition, Willis (2010) used odontometric data to elucidate the biological affinities of the Burusho using 284 plaster casts from Gilgit, Pakistan. The results indicate they have remained genetically isolated, showing equidistance from living Pakistani populations and prehistoric Central Asian populations (Willis 2010).

Barton and Hemphill (2012) focus on the Yashkun, a Dardic-speaking ethnic group of northern Pakistan, using odontometric data collected from 163 individuals in comparison to 22 samples of prehistoric and living individuals from Pakistan, peninsular India, Central Asia, and the Iranian Plateau. The researchers assert that the results confirm that the Yashkun are living descendants of a common, indigenous population of the Hindu Kush and Karakoram highlands. This is based on the results obtained from the neighbor-joining tree cluster analysis and principal co-ordinates analysis, which demonstrated closely shared biological affinities between the Burushos and Shins and furthest affinities to other northern Pakistani groups.

Guzman and Hemphill (2012) focus on the Baltis, a Tibeto-Burman speaking ethnic group who reside in northern Pakistan, using odontometric data collected from 180 Balti individuals in comparison to 21 samples of prehistoric and living individuals from Pakistan, peninsular India, Central Asia, and the Iranian Plateau. The results of the neighbor-joining cluster analysis and principal co-ordinates analysis indicate the Baltis are likely descendants of Tibetan populations and not an indigenous population of northern Pakistan, as they are positioned as outliers, occupying an isolated phenetic position in the arrays. Guzman and Hemphill (2013) furthered investigations of the Baltis by testing whether geographically distinct members of this self-identified ethnic group share closest biological affinities to one another, thus validating the use of these social constructs as units for biological analyses. The researchers used 194 Balti individuals from Partuk and 217 Balti individuals from Khaplu in comparison to 24 other samples from Pakistan, peninsular India, and prehistoric Central Asia. The biological meaningfulness is confirmed by the results of the neighbor-joining tree cluster analysis, principal co-ordinates analysis, and multidimensional scaling, which indicate the two geographically distinct Balti samples do exhibit closest biological affinities to one another.

O’Neill and Hemphill (2009) use dental data to describe the biological affinities and population origins of the Wakhi, lending support, in part, to the theory that they are refugees from the Wakhan Corridor of Afghanistan. The results of the statistical

analyses demonstrate none of the living Pakistani highland groups share close affinities

to one another. In addition, no close affinities were demonstrated to any of the living or

prehistoric samples south and west of Pakistan with the exception of the distant

phenetic ties demonstrated between Altyn Depe in Turkmenistan and the living Wakhi

(O’Neill and Hemphill 2009). These researchers interpret such results as possibly

implicating Turkic-Pamiri origins for the earliest Wakhi settlers of the Wakhan Valley.

O’Neill and Hemphill (2009) conclude that the Wakhi most likely represent recent immigrants to Pakistan with linguistic affiliations to populations like the Madaklasht and the Khowar, but whose biological affinities to these groups range from very distant (Madaklasht) to moderate (Khowar).

In addition, O’Neill (2013) analyzed the biological affinities of the Wakhi and the

Shin using odontometric data. O’Neill (2013) concluded that these ethnic groups from

Gilgit-Baltistan do not share close affinities and hence do not have similar origins as ethnic groups (Khowars, Madaklasht) from the Hindu Kush highlands of Chitral. O’Neill

(2013) also found both Shina groups included in the study have similar biological

affinities despite dialectal differences or differences in geographic locality. O’Neill

(2013) concluded that, “ethnic classifications based on linguistic familiarity have

biological meaning, and therefore are appropriate and meaningful when used properly

in demographic studies” (O’Neill 2013: 143).

Materials and Methods

The basis for this investigation is a series of plaster casts of the permanent

dentition of living Hazara teenagers and young adults residing in Skardu and other areas of Gilgit-Baltistan. These casts were collected by Pakistani colleagues in collaboration with, and under the supervision of, Dr. Brian Hemphill and myself in Gilgit during June

2007 and by Hemphill and his Pakistani colleagues in Skardu in August 2008 with the

informed consent of all participants. The projected goal was to collect one hundred

(100) male and one hundred (100) female dental casts. The subjects were usually

within the age range of 12-16 years old. A total of 202 Chengazi individuals

representing the Hazara were cast. This specific age range is necessary because it is important for all of the permanent dentition to be fully erupted, with the exception of the third molars. Furthermore, the religious practice of purdah, which prohibits girls older than 16 from participating, defined the maximum age limit of this study.

Measurements of the buccolingual breadths and mesiodistal lengths are used to determine overall tooth dimensions. These measurements are then standardized against the geometric mean to remove the effects of overall size. The allocation of permanent tooth size across the dentition is the unit of comparison for this analysis.
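The size standardization described above can be sketched in a few lines of Python. This is a minimal illustration only: it assumes the scaling is applied to each individual's set of measurements, the variable labels (for example "UI1_MD") are hypothetical, and the values are invented rather than drawn from the Hazara casts.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive numbers, computed on the log scale."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def size_adjust(measurements):
    """Divide every dimension by the geometric mean of all dimensions.

    `measurements` maps a hypothetical variable label (tooth + dimension)
    to a crown diameter in millimetres.
    """
    gm = geometric_mean(list(measurements.values()))
    return {label: value / gm for label, value in measurements.items()}


# Invented values for illustration only; not data from this study.
individual = {"UI1_MD": 8.6, "UI1_BL": 7.1, "UM1_MD": 10.2, "UM1_BL": 11.3}
scaled = size_adjust(individual)

# A uniformly 10% larger dentition yields the same scaled profile.
larger = {label: value * 1.10 for label, value in individual.items()}
scaled_larger = size_adjust(larger)
```

Because both dentitions reduce to the same scaled profile, what remains after scaling is the relative apportionment of size among teeth, which is the unit of comparison named above.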

The measurements of the Hazara sample are compared to fifteen living South Asian and Pakistani groups and twelve archaeologically-derived samples from prehistoric sites

located in Central Asia, the Indus Valley of Pakistan, peninsular India, and the Iranian

Plateau. Statistical assessment of dental metric variation includes hierarchical cluster

analysis, neighbor-joining tree cluster analysis (Saitou and Nei, 1987), multidimensional

scaling using Guttman’s (1968) and Kruskal’s (1964) method, and principal co-ordinates

analysis (Gower 1966). Examination of intra-observer (n = 25) and inter-observer error

(n = 35) is based upon repeated assessment of randomly selected casts. The

differences, if any, within the various dental metrics at different localities within the

population will also be assessed. Antimeres of left and right sides are tested for any

statistically significant differences to determine whether data from right and left sides

may be pooled without introducing bias in order to create larger sample sizes and

results that are more statistically robust. Also, a subset of complete measurements is

used to assess overall tooth size allocation based upon sex. If no differences are

found, these samples can be pooled to increase sample size and therefore increase the

statistical robustness of the results.
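The logic of the antimere comparison can be illustrated with a simple paired test. The specific test used is not named here, so the sketch below substitutes an exact sign-flip permutation test on left-minus-right differences as a stand-in. The function name and all measurements are hypothetical.

```python
from itertools import product


def paired_sign_flip_p(left, right, tol=1e-9):
    """Exact two-sided sign-flip permutation test on paired differences.

    Under the null hypothesis of no side difference, each left-minus-right
    difference is equally likely to be positive or negative, so every
    sign pattern is enumerated (2**n patterns; only practical for small n).
    """
    diffs = [l - r for l, r in zip(left, right)]
    observed = abs(sum(diffs))
    patterns = list(product((1, -1), repeat=len(diffs)))
    extreme = sum(
        1
        for signs in patterns
        if abs(sum(s * d for s, d in zip(signs, diffs))) >= observed - tol
    )
    return extreme / len(patterns)


# Invented mesiodistal lengths (mm) for one tooth class on eight casts.
right_side = [8.6, 8.4, 8.9, 8.5, 8.7, 8.3, 8.8, 8.6]
left_same = list(right_side)                   # perfectly symmetric
left_larger = [v + 0.5 for v in right_side]    # systematic side bias

p_symmetric = paired_sign_flip_p(left_same, right_side)
p_biased = paired_sign_flip_p(left_larger, right_side)
```

A large p-value, as in the symmetric case, would be consistent with pooling the two sides, whereas a small one, as in the biased case, would argue against it. The same paired logic does not apply to the comparison between sexes, which involves independent groups.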

Comparative Samples

The comparative odontometric materials in this analysis are from several

different sources reflecting pertinent locations and time periods. There are data from 15

living groups, representing the Hindu Kush and Karakoram highlands of the Chitral

District and western half of Gilgit-Baltistan, the Karakoram and Himalayan highlands of

the eastern half of Gilgit-Baltistan, northwestern and southeast India as well as 12

archaeological samples from the Indus Valley, central Asia, and western India that range in antiquity from the aceramic Neolithic to the Early Iron Age. Table 1 specifies

the samples and sizes, their abbreviations, approximate antiquity, and the geographic

region of the archaeological samples that will be used for comparative purposes. Figure

2 provides an illustration of the geographic location of these samples. Considered as a

whole, this statistical assessment of tooth size allocation includes data from 202 Hazara

individuals along with 2,972 comparative individuals, for a total of 3,174 individuals.

Table 1. Samples used in the Tooth Size Allocation Analysis

SAMPLE                  NUMBER  ABB.    DATE             REGION
Vaghela Rajputs         190     RAJ     Living           NW Peninsular India
Garasias                207     GRS     Living           NW Peninsular India
Bhils                   208     BHI     Living           NW Peninsular India
Chenchus                196     CHU     Living           SE Peninsular India
Gompadhompti Madigas    177     GPD     Living           SE Peninsular India
Pakanati Reddis         184     PNT     Living           SE Peninsular India
Khowar                  104     KHO     Living           Hindu Kush Highlands
Burusho                 295     BUR     Living           Hindu Kush Highlands
Hazara                  202     HAZ     Living           Hindu Kush Highlands
Wakhi (Gulmit)          166     WAKg    Living           Hindu Kush Highlands
Wakhi (Sost)            170     WAKs    Living           Hindu Kush Highlands
Shin (Astor)            170     SHIa    Living           Hindu Kush Highlands
Shin (Other)            100     SHIo    Living           Hindu Kush Highlands
Madaklasht              191     MDK     Living           Hindu Kush Highlands
Swati                   190     SWT     Living           Hindu Kush Highlands
Inamgaon                41      INM     1600-700 B.C.    W Peninsular India
Neolithic Mehrgarh      49      NeoMRG  6000 B.C.        Indus Valley
Chalcolithic Mehrgarh   25      ChlMRG  4500 B.C.        Indus Valley
Harappa                 33      HAR     2300-1700 B.C.   Indus Valley
Sarai Khola             15      SKH     200-100 B.C.     Indus Valley
Timargarha              25      TMG     1400-850 B.C.    Indus Valley
Djarkutan               39      DJR     2100-1950 B.C.   Central Asia
Kuzali                  24      KUZ     1950-1800 B.C.   Central Asia
Molali                  41      MOL     1800-1650 B.C.   Central Asia
Sapalli Tepe            43      SAP     2300-2150 B.C.   Central Asia
Altyn Depe              25      ALT     2500-2200 B.C.   Central Asia
Geoksyur                64      GKS     3500-3000 B.C.   Central Asia

TOTAL                   3174
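As a simple arithmetic check, the sample sizes in Table 1 can be tallied to confirm the totals cited in the text. The sketch below only transcribes the table; the dictionary names are illustrative.

```python
# Sample sizes transcribed from Table 1 (abbreviation -> n).
living = {
    "RAJ": 190, "GRS": 207, "BHI": 208, "CHU": 196, "GPD": 177,
    "PNT": 184, "KHO": 104, "BUR": 295, "HAZ": 202, "WAKg": 166,
    "WAKs": 170, "SHIa": 170, "SHIo": 100, "MDK": 191, "SWT": 190,
}
archaeological = {
    "INM": 41, "NeoMRG": 49, "ChlMRG": 25, "HAR": 33, "SKH": 15,
    "TMG": 25, "DJR": 39, "KUZ": 24, "MOL": 41, "SAP": 43,
    "ALT": 25, "GKS": 64,
}

total = sum(living.values()) + sum(archaeological.values())
comparative = total - living["HAZ"]
print(total, comparative)  # 3174 2972
```

The tally reproduces the 3,174 individuals overall and the 2,972 comparative individuals once the 202 Hazara are set aside.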

Figure 2. Location of samples used in this analysis. The red squares indicate the locations of living population samples, while archaeologically derived samples are indicated by blue dots.

Living Inhabitants of Pakistan and India

Samples of living individuals from the Hindu Kush Highlands of northern Pakistan

include Indo-Aryan-speaking Khowars (KHO) from the village of Buni in Chitral District,

Khyber Pakhtunkhwa, as well as the Burushaski speaking Burusho (BUR) and the

Hazaragi-speaking Hazara (HAZ) of Gilgit-Baltistan. Other Pakistani highlander groups

include the Indo-European-speaking Wakhi6 (WAKg and WAKs) and the Dardic-speaking Shin (SHIa and SHIo). These two groups have been separated by the location in which they were collected. The Wakhi sample was collected from Gulmit

(WAKg) and Sost (WAKs), whereas the Shin samples were collected from the regions

of Astore (SHIa), Gilgit and Haramosh (SHIo). Other highland samples include the

Indo-Iranian-speaking inhabitants of Madak Lasht village (MDK) located in the

Karakoram highlands in the Mansehra District and the Swati (SWT), an ethnic group

believed to be descendants of Pashtun-speaking settlers of the Swat Valley that

relocated to the Hazara hills of Mansehra District between AD 1500 and AD 1700

(Schofield 2003).

Samples of living individuals inhabiting districts in southern and central Andhra

Pradesh in the southeastern portion of peninsular India include high-status caste Hindu

Pakanati Reddis (PNT), low-status caste Hindu Gompadhompti Madigas (GPD), and

non-caste tribal Chenchus (CHU). Members of all three ethnic groups speak Telegu, a language classified within the Dravidian family of languages. Samples of living northwestern peninsular Indians are from Gujarat and include high-status caste Vaghela Rajputs (RAJ), low-status caste Garasias (GRS) and tribal Bhils (BHI) (Lukacs and Hemphill 1993), all of whom speak Indo-Aryan languages.

6 This is a bit of an oversimplification. Wakhi is one of several languages that are members of the Pamir language group spoken by a number of ethnic groups within the Gorno-Badakhshan Autonomous Province of eastern Tajikistan and the Badakhshan Province of northeastern Afghanistan. In Pakistan, Wakhi is spoken in the five most northerly valleys: Hunza, Gojal, Ishkoman, Gupis, and Yarkhun. O’Neill’s (2013) samples were obtained from Hunza (WAKg), where many now speak Burushaski, and Gojal (WAKs), where Wakhi still remains the primary language.

Prehistoric Inhabitants of the Indus Valley

The sample of the prehistoric inhabitants of Inamgaon (INM) comes from the west-central portion of peninsular India, dates to the post-Harappan Jorwe Period, and is

included with the Indus Valley samples because of the many similarities exhibited

between them (Lukacs 1985a). Prehistoric Indus Valley samples include the

skeletonized remains found at the archaeological site of Mehrgarh, located on the North

Kachi Plain, west of the Indus Valley in Baluchistan. These include individuals

recovered from levels dating to aceramic Neolithic (NeoMRG) (Lukacs 1986) and

Chalcolithic (ChlMRG) periods (Lukacs and Hemphill 1991). In addition, the

skeletonized remains of individuals recovered from the archaeological site of Harappa,

located east of the Indus River in Punjab Province and dating to the Mature Phase of

the Harappan Civilization (HAR) will be used as a comparative sample in this study

(Hemphill et al. 1991). Other prehistoric Indus Valley samples include individuals

recovered from Early Iron Age deposits at the site of Sarai Khola (SKH) (Lukacs 1983)

and the Late Bronze/Early Iron Age Gandharan Grave Culture sample recovered from

Timargarha (TMG) (Lukacs 1983).

Prehistoric Inhabitants of Central Asia

Prehistoric Central Asian samples from southern Uzbekistan include individuals

recovered from the two urban centers of the Bactria-Margiana Archaeological

Complex, Djarkutan and Sapalli Tepe (Hiebert 1994). Burials recovered from Djarkutan

have been assigned to three time periods based upon their associated artifacts. These

periods include the Djarkutan Period (DJR: 2000-1800 BC), the Kuzali Period (KUZ:

1800-1650 BC), and the Molali Period (MOL: 1650-1500 BC). The sample from Sapalli Tepe (SAP) slightly predates (2200-2000 BC) the earliest human remains recovered from Djarkutan. Samples recovered from archaeological sites located in Turkmenistan include individuals recovered from the Namazga Period V (2500-2300 BC) occupation of the urban center of Altyn Depe (ALT) located in the northern foothills of the Kopet

Dagh Mountains of south-central Turkmenistan and individuals recovered from the

Namazga Period III (3500 – 3000 BC) occupation of Geoksyur (GKS) (Kohl 1985), located in the desiccated Tedjen River delta of southeastern Turkmenistan.

Research Questions

The present research aims to determine the most likely biological origins of the

Hazara through an analysis of tooth size allocation throughout the permanent dentition

and analyze the influence of sexual dimorphism among the Hazara. This will be

accompanied by testing the pattern of phenetic affinities possessed by the Hazara to

samples of other ethnic groups of varying temporal depth from Central Asia, the Hindu

Kush/Karakoram highlands, the Indus Valley of Pakistan and peninsular India. The results will be compared to the patterns predicted by four current models for

the biological history of ethnic groups of greater South Asia: the Long-Standing

Continuity Model (LSCM), the Aryan Invasion Model (AIM), the Early Entrance Model

(EEM) and the Historic Era Influences Model (HEIM). The LSCM, AIM and EEM reflect competing theories of the population history of the Indian subcontinent. The HEIM is a population history theory incorporating historic period migrations into the explanatory equation that may include a possible migration event from East Asia as the causal factor responsible for the Hazara’s historic region of occupation.

The current evidence and research leaves several questions which this research seeks to elucidate:

1) Are the Hazara the result of long-standing continuity of indigenous

occupation?

2) Are the Hazara descendants of Indo-Aryan-speaking Central Asians who

emigrated across the Hindu Kush Mountains during the mid-2nd millennium BC?

3) Are the Hazara descendants of a wave of proto-Dravidian speakers into South

Asia from proto-Elamitic Iran during the 5th millennium BC?

4) Are the Hazara descendants of historically recent immigrants, in the form of either: a) refugees escaping persecution in Afghanistan who fled to the northwestern periphery of South Asia, or b) the descendants of Genghis Khan’s army, whose foreign genes represent an intrusion into the resident South Asian gene pool?

Models and Expectations

The four South Asian population history models used as the basis of analysis for this study are not meant to be considered mutually exclusive. As O’Neill (2013) asserts, different models may be applicable to different populations; therefore, the results will not be forced to support a single model but may be interpreted as supporting or refuting multiple models to varying degrees. As its name implies, the LSCM holds that the Indian subcontinent has been isolated from any significant population incursions since the initial dispersal of modern Homo sapiens out of Africa during the Late Pleistocene (60,000 BP) (Kennedy et al. 1984). If this model holds true, groups closest both temporally and in geographic proximity should be most similar in the patterning of tooth size allocation. Based on

Barbujani and Sokal (1990), who demonstrated language to be a potential facilitator and barrier to gene flow in European populations, these South Asian groups could also be separated biologically based on linguistic affiliation. Therefore, Indo-Aryan-speakers and Dravidian-speakers may show separation from one another while following a pattern of isolation-by-distance geographically. This linguistically modified version of the model is illustrated in Figure 3. Archaeological analyses, dental investigations and genetic studies provide evidentiary support for this model (Kennedy et al. 1984; Hemphill et al.

1992; Hemphill 1997; Majumdar 1998; Quintana-Murci et al. 2004).

Figure 3. Graphic depiction of the Long-Standing Continuity Model. The location of Indo-Aryan language speakers in the Indian subcontinent is represented by the purple circle. The Dravidian language speakers are represented by the orange circle.

Proponents of the AIM maintain that the initial entrance of Central Asians to the

Indian subcontinent occurred during the mid-2nd millennium BC, spreading both Indo-

Aryan languages and Brahmanic Hinduism throughout South Asia. This model is

illustrated by Figure 4. Proponents of this model commonly attribute the source

population of this emigration to either the inhabitants of the urban centers of the

Bactria-Margiana Archaeological Complex (BMAC), members of the Andronovo

culture, or populations of the semi-nomadic Vakhsh and Beshkent cultures (Erdosy

1995; Parpola 1995), all of which are found in southern Central Asia in what are today the countries of Turkmenistan, Uzbekistan, Tajikistan and Afghanistan. Erdosy

(1995) believes that the collapse of urban civilizations, along with an increase in

population size and the introduction of new crops and horses, are responsible for the

beginning of Aryan culture.

Figure 4. Graphic depiction of the Aryan Invasion Model. The purple arrows depict the movement of Aryans from Central Asia into peninsular India, with the two on the left representing Renfrew’s (1987) and Sarianidi’s (1999) assertions.

Proponents of the EEM claim that one, and perhaps two, population incursions occurred after the Neolithic but prior to the dawn of the Christian Era into the Indian subcontinent. The first is maintained to reflect the entrance of proto-Dravidian-speakers during the 5th millennium BC, while the second may have involved the entry of Indo-

Aryan-speakers during the 2nd millennium BC. This model is illustrated by Figure 5.

Some researchers propose that the spread of proto-Dravidian speakers from Elamitic

Iran can be traced through their language and the presence of more developed agricultural methods of production (Fairservis 1975; McAlpin 1981; Fairservis and

Southworth 1989; Hemphill et al. 1991; Hemphill & Lukacs 1993) while others (Fuller

2003) maintain a wholly South Asian origin for Dravidian languages and the populations who spoke these languages suggesting independent centers of plant domestication within peninsular India based on archeobotanical evidence.

Proponents of the HEIM maintain that members of some of the current ethnic groups of northern and western Pakistan (Chitral District, FATA [Federally Administered

Tribal Areas], Gilgit-Baltistan, Baluchistan), as well as members of some of the living ethnic groups of the hill states of northeastern India (Arunachal Pradesh, Assam,

Manipur, Meghalaya, Mizoram, Nagaland, Tripura) are descendants of immigrants who entered the peripheries of South Asia during the protohistoric and historic periods.

Given that many of the Hazara left Afghanistan to escape persecution and are now mostly living in refugee camps, this model seems plausible (Blaylock and Hemphill

2007; Hemphill 2008, 2009), for historically documented population movements and dental analyses provide evidence that such movements can and have taken place

(Hemphill et al. 2009; O’Neill and Hemphill 2009). This model is illustrated by Figure 6.

It may also be subsumed under the HEIM that the currently mythical origins of the

Hazara do, in fact, have biological validity, with the Hazara owing their origins to soldiers of Genghis Khan (Bellew 1979). Recent genetic studies have lent support to this assertion, demonstrating genetic similarities between the Hazara and East Asian populations (Qamar et al. 2002; Zerjal et al. 2003; Quintana-Murci et al. 2004; Hunley et al. 2009). These four models and their supporting evidence will now be addressed specifically.

Figure 6. Graphic depiction of the Historic Era Influences Model. The green lines indicate the separation of the Pakistani highlanders from peninsular Indians.

Long-Standing Continuity Model (LSCM)

The LSCM is one possible explanation for the origins of the Hazara. Evidence supporting this model is three-fold, for archaeological, dental, and genetic studies all

lend credence to this theory regarding the peopling of the Indian subcontinent. Hemphill

(1997) used discriminant function analysis, canonical discriminant function analysis, and

bootstrap analysis to compare features of male crania from Sapalli Tepe, Djarkutan, and

Tepe Hissar to test whether the presence of foreign burial objects in Period III

inhumations at Tepe Hissar, located in northern Iran, was indicative of intrusive Oxus

Civilization colonists or was the product of importation by local residents. The analysis

revealed a biological separation between all three groups studied, with the greatest

separation occurring between the two Central Asian samples and the Iranian sample.

Hemphill (1997) attributed the cause of this separation to genetic isolation over an

extended period of time. Such results led him to assert that although goods from

Central Asia were being traded in a large network that linked BMAC populations to

members of bordering groups, genes were not (Hemphill 1997). Logically, if such was

the case at Tepe Hissar on the Iranian Plateau, the same may be also true in

northwestern Pakistan where similar objects attributed to the BMAC have been

recovered.

Kennedy and co-workers (1984) used principal components analysis of cranial

metric data and interpreted their results as providing evidence for long-standing

continuity. Using the homogeneity found within the South Asian samples as evidence

for regional continuity and interpreted as a reflection of isolation-by-distance, Kennedy

and co-workers (1984) asserted that this effect was increased by three factors:

population size, local population stress and short marital distances. Thus, these

researchers concluded that there is evidence for long-standing genetic continuity of

South Asians both temporally and geographically.

A previous odontometric study conducted by Hemphill and coworkers (1992)

provides further support for the LSCM. These researchers looked at five living groups from India: Chenchus, Madigas, Pakanati Reddis, West Bengalis, and Maharashtrans. They found that each group could be distinguished by differences in the patterning of tooth size allocation and that there was a strong separation between Indo-European-

speakers (West Bengalis and Maharashtrans) and Dravidian-speakers (Chenchus,

Madigas, Pakanati Reddis), as well as between local non-caste (Chenchus) and caste members (Madigas, Pakanati Reddis). This study demonstrated that although

geographic distance is one of the strongest genetic barriers, cultural practices, such as

the caste system and linguistic affiliation, also exhibit strong influences upon the

patterning of gene flow (Hemphill et al. 1992). Genetic research also supports the

LSCM. Such research includes a study conducted by Majumdar (1998). This research

supports the idea of genetic continuity and isolation stating that “Arab-Indian”

haplotypes at the β-globin gene cluster were present in the genetic sequence of all

Indian groups (Majumdar 1998). In addition, Mehra (2010) found the human leukocyte

antigen (HLA) allele families possess several alleles that are ‘unique’ to the Indian

subcontinent, such as the A*0211, which is almost completely absent in Caucasoid and

Oriental groups (Mehra 2010) thus demonstrating unique genetic qualities possessed

by the Indian subcontinent that lends support to the idea of genetic continuity in South

Asia.

Further support for the LSCM is provided by Quintana-Murci and coworkers

(2004), whose study of mtDNA led them to posit that South Asian populations exhibit signs of an in situ differentiation of deep-rooting lineages with a distribution that is limited to within the region (Quintana-Murci et al. 2004:837). However, there are marked differences between the results obtained from mtDNA and those obtained from Y-chromosome variation. The Y-chromosome evidence provided by Qamar and coworkers (2002) shows a marked divergence between the Hazara and the surrounding

Pakistani groups thereby providing evidence that runs counter to Quintana-Murci and coworkers’ (2004) findings.

It should be noted, however, that in contrast to Qamar and coworkers’ (2002) findings, Metspalu and coworkers (2011) provide evidence for genetic similarities between the Hazara and other Central Asians. These researchers conducted a principal components analysis (PCA) using single nucleotide polymorphism (SNP) markers from 1310 individuals from 112 populations located in various geographic locations around the world, of which 30 were obtained from different Indian ethnic groups. Their analysis yielded a pattern of clinal variation stretching from Europe to southern India, with the Hazara exhibiting evidence of substantial admixture with

Central Asian populations.

Further support for the long-standing genetic continuity of South Asia with differences between Central Asians and South Asians is provided by Sengupta and coworkers (2006). These researchers found very little Central Asian genetic admixture among South Asian populations when using 69 Y-chromosome binary-HG composition markers and 10 microsatellite markers from 728 South Asian samples representing 36

populations from six geographic locations (Sengupta et al. 2006). In addition, they

found that the pattern of regional differentiation among South Asian populations predates

the purported Indo-Aryan invasion of the mid-2nd millennium BC, for the dates obtained

from the accumulated microsatellite variation found within most Indian haplogroups

exceeded 10,000-15,000 years (Sengupta et al. 2006). Based on the genetic admixture

distribution of haplogroup J2a among Indian populations, these researchers postulate

that the Dravidian languages originated in India (Sengupta et al. 2006). This assertion

is based on its presence in upper-caste Dravidian and Indo-Aryan-speakers and its

absence among members of southern Indian tribes, middle and lower castes.

Expectations

If the LSCM is true, the Hazara should be most similar in the patterning of tooth size allocation to groups closest to them geographically and temporally, following a pattern of isolation-by-distance. That is, the Hazara should show closest population affinities with other Pakistani highland groups followed by their alleged ancestors, the prehistoric occupants of the Indus Valley. Furthermore, the Hazara should show no affinities to any Central Asian groups included in this study. Support for this model will suggest that the Hazara are long-term indigenous residents of Northern

Pakistan.
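The expectation above amounts to a predicted ordering of distances from the Hazara. A minimal sketch of how such an ordering could be read from size-adjusted profiles follows. Euclidean distance is assumed purely for illustration, the function names are hypothetical, and the three-variable profiles are invented; they are not results of this study.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_by_affinity(target, profiles):
    """Return sample labels sorted from nearest to farthest from `target`."""
    others = [label for label in profiles if label != target]
    return sorted(
        others, key=lambda label: euclidean(profiles[target], profiles[label])
    )


# Invented size-adjusted profiles (three variables per sample).
profiles = {
    "HAZ": [0.95, 1.02, 1.10],
    "BUR": [0.96, 1.01, 1.09],   # another highland sample
    "HAR": [0.93, 1.04, 1.12],   # prehistoric Indus Valley
    "SAP": [0.88, 1.08, 1.17],   # prehistoric Central Asia
}

ranking = rank_by_affinity("HAZ", profiles)
```

With these invented numbers the ordering is highland, then Indus Valley, then Central Asia, which is the pattern the LSCM would predict. An ordering in which Central Asian samples fell nearest the Hazara would run counter to it.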

Aryan Invasion Model (AIM)

The AIM is another possible explanation for the ethnic origins of the Hazara.

Proponents of the AIM maintain that the initial entrance of Central Asians into the Indian subcontinent occurred during the mid-2nd millennium BC, bringing both Indo-Aryan languages and Brahmanic Hinduism to the Indus Valley and Upper Doab and subsequently spreading Vedic culture throughout South Asia (Erdosy 1989; Parpola 1995). This model uses the earliest of the Vedic texts, the Rg Veda, as evidentiary support; some maintain that it was composed by Aryan invaders into South Asia. It was noted by Sir William

Jones in 1788 that the Sanskrit language, in which these texts are written, possesses numerous similarities to both Greek and Latin (Poliakov 1974).

Archaeological evidence has also been used to support this model. Sarianidi

(1999) proposes that the commonality of artifacts found in both BMAC and South Asian assemblages can be used as evidence for biological affinities between Bronze Age

Central Asians and the inhabitants of South Asia. However, as demonstrated by

Hemphill (1999), the presence of similar artifacts may provide evidence for similar or shared technology but that does not necessarily mean the same biological group of people were manufacturing such items. Caution should always be taken with this line of reasoning. Nevertheless, Sarianidi (1999) proposes a population migration from western central Asia through modern-day northern Pakistan and then south into India.

Ecological evidence suggests that there was a change in climate that led to aridization of Bactria and Margiana during the 2nd millennium BC that affected agricultural production (Sarianidi 1999). However, the evidence for increased aridity in Bactria and

Margiana during the 2nd millennium BC is only suggestive, not definitive. According to

Sarianidi (1999), one possible route taken in search of suitable farmland is the Elamitic route. Such a route involved travel east from Anatolia toward the Persian Gulf to Elam, through Iran and the BMAC before heading south to India (Sarianidi 1999). However, this potential route seems illogical unless the emigration occurred prior to the 2nd

millennium BC or if evidence can be provided demonstrating why the Anatolian

populations would first emigrate southward to Elam and then northward to Bactria and

Margiana only to emigrate once again southward across the Hindu Kush into the Indus

Valley. The second route, similar to Renfrew’s (1987) “Neolithic Arya” hypothesis, calls for an initial eastward movement through south-central Asia to Bactria and Margiana, then a southward spread onto the Iranian Plateau and subsequently into India.

Renfrew’s “Neolithic Arya” hypothesis also calls for a direct movement of Neolithic farming populations southeastwards across the Iranian Plateau, over the Tobar Kakar

Mountains into the North Kachi Plain leading to the establishment of Mehrgarh.

Renfrew (1987) interprets evidence of agricultural development in South Asia as indication of the entrance of western Eurasian farming populations into the Indian subcontinent.

Numerous genetic studies have been conducted on various living populations within this geographic area utilizing classic serological, mitochondrial DNA (mtDNA), as well as biparental and uniparental (Y-chromosome) nuclear DNA. According to Cavalli-

Sforza and coworkers (1994), who constructed a population tree based on 54 classic serological markers, the Hazara fall within a West Eurasian cluster that also contains northern Caucasoids. Such genetic evidence, like that proposed by Battacharyya and coworkers (1999), has also lent support for the AIM. These researchers analyzed Y-

chromosome polymorphisms of South Asians and determined that, despite Y-

chromosome heterogeneity, they were most similar genetically to Europeans based on

the YAP element (Battacharyya et al. 1999). This is similar to the findings of Majumdar

(1998) who, using allele frequencies at 10 loci among different geographically located

groups, found a genetic separation displayed on the single-linkage dendrogram between South Indians and the inhabitants of North India, regardless of whether they were found in centrally located Madhya Pradesh, Gujarat or eastern Bengal. Such results lend support to linguistically-based genetic discontinuity, thereby making a population migration/invasion from the north a possibility. Long and coworkers’ (1990, in Hemphill 1991) study also supports the idea that South Asians are more closely related to Europeans than to East Asians. Wainscoat and coworkers (1986, in Hemphill

1991) through the analysis of nuclear DNA similarly concluded that South Asians are

more closely related to European Caucasoids than to Melanesian, Polynesian, and

Southeast Asian populations. Additional evidence for this model includes an

odontometric study conducted by Hemphill (1991), which concludes that South Asians

are closest to Caucasians in biological affinity and found that both possess broad

anterior teeth that exhibit wide buccolingual dimensions relative to their mesiodistal

lengths. Hemphill (1991) also purports that Southeast Asians and Northeast Asians are

similar in tooth size to South Asians and Caucasians.

Tartaglia and coworkers (1995) used genetic evidence to compare various South

Asian groups and found statistically significant heterogeneity among the markers with a

geographic patterning within the distribution. They concluded that the genetic

frequencies and markers of North Indians correlate with those of European groups, a

correlation they interpret as indicative of a demic diffusion of Central Asian genes into

South Asian populations (Tartaglia et al. 1995).

Bamshad and coworkers (1998) used mtDNA and Y-chromosome data from 250 individuals belonging to 12 Telegu-speaking caste populations from northeastern

Andhra Pradesh in southern India to conduct a neighbor-joining cluster analysis. They found that cultural factors, such as the Hindu caste system, have influenced gene flow that has led to a stratification of mtDNA distances between castes. They further found that this stratification correlates to social rank, in which Dravidian-speaking upper caste Hindus show closer affinities to Europeans than those belonging to castes of lower social status.

Further corroborating this claim, Bamshad and coworkers’ (2001) analysis of

possible genetic differences correlating to caste levels using mtDNA and Y-

chromosome polymorphisms found that upper caste members share closer affinities to

Eastern Europeans than to Asians. Their research indicates that upper caste members

are more similar to Eastern Europeans than lower caste members because upper caste

members show higher frequencies of the haplotypes belonging to the West Eurasian

haplogroups than do lower caste members (Bamshad et al. 2001). Despite the mtDNA

frequency differences by caste, paternally inherited Y-chromosome variation is more

similar to Europeans than Asians for all caste members regardless of social status

(Bamshad et al. 2001).

Thus, both mtDNA and Y-chromosome analyses indicate closest shared

biological affinity to West Eurasian populations, particularly Eastern Europeans. These

researchers interpret their findings as supporting the AIM (Bamshad et al. 2001).

However, it should be noted that Metspalu and coworkers (2011:10) provide evidence that

runs counter to the claims of the AIM by demonstrating that the haplotype diversity of

the k5 and k6 single nucleotide polymorphisms present within the Indian populations

long predates the purported Indo-Aryan invasion of the mid-2nd millennium BC.

Expectations

If the AIM is true, the Hazara and other Pakistani highland ethnic groups should

be most similar in the patterning of tooth size allocation to their presumed ancestral

groups from Central Asia, particularly the occupants of the BMAC urban centers of

southern Uzbekistan (Sapalli tepe and Djarkutan), followed by secondary affinities to

Central Asian samples of greater antiquity, believed to be ancestral to the BMAC

populations of Sapalli tepe and Djarkutan, such as Geoksyur (Hemphill 1999) and

Altyn depe (Hiebert 1994), as well as Indo-Aryan-speaking groups occupying northern India (BHI, GRS, RAJ), who are maintained to be the descendants of these Central Asian Indo-Aryan-speaking invaders. However, linguistic association suggests that the Hazara, who speak an Indo-Iranian language, may not possess such affinities. Furthermore, similar population affinities between samples from the BMAC (DJR, ALT, GKS, SAP) and the post-Harappan Indus Valley samples (TMG, SKH) will also add support to this model by implicating a Bronze Age invasion of Central Asians into the Indian subcontinent. Any potential links between prehistoric Central Asian samples and peninsular Indians may also lend some support to this model.

Early Entrance Model (EEM)

Proponents of the EEM assert that between the late-6th and early-5th millennia

BC a proto-Dravidian-speaking population emigrated from the Elamitic region of

southwestern Iran eastward across the Hindu Kush and Toba Kakar Mountains into

South Asia, spreading their language and more developed agricultural technology as

they traveled eastward and southward (Hemphill & Lukacs 1993; Hemphill et al. 1991).

This model encompasses two potential scenarios. The first calls for a limited biological

intrusion of Dravidian-speakers to South Asia that was initially limited to that portion of

the subcontinent located west of the Indus River with a subsequent migration of these

proto-Dravidian-speakers southward and eastward into peninsular India, perhaps due to

the entrance of Indo-Aryan speakers from Central Asia during the mid-2nd millennium

BC (Quintana-Murci et al. 2004). The second calls for a more substantial

intrusion of proto-Dravidian-speakers into the Indus Valley and beyond into peninsular

India with no subsequent significant influx of Indo-Aryan speakers during the 2nd

millennium BC (Hemphill 1991; Hemphill et al. 1992a; Southworth 1995).

Evidentiary support for the EEM model can be found from linguistic studies, archaeological investigations, dental and skeletal analyses, and genetic analyses.

Additional linguistic evidence offered in support of the EEM is provided by

McAlpin (1975, 1981) who proposed that Proto-Elamo-Dravidian-speakers entered the subcontinent from southwestern Iran during the 5th millennium BC. McAlpin (1975)

demonstrated multiple shared affinities between Elamite and Dravidian languages

suggesting they possess a common etymological origin. McAlpin (1975) found at least

20% of Dravidian and Elamite vocabulary are cognates and also posited that both

languages possess similar second-person pronouns and parallel case endings, exhibit

identical derivatives, and contain similar abstract nouns. Furthermore, McAlpin (1981)

argues that phonological correspondences between Elamite and Proto-Dravidian occur

quite regularly, for they each have similar neutral vowel patterning, and he also believes

the root structure of the Proto-Elamo-Dravidian is similar to Proto-Dravidian (pp. 83, 88,

93). McAlpin (1981:135) believes demonstrating that Elamite and Dravidian languages share a common etymological origin verifies the existence of Dravidians in the Indus

Valley during the time of the Harappan Civilization, although he admits that this evidence is only circumstantial. Southworth (1995) uses “speech communities” as units of analysis and based on the evidence of substratum influences found within the Indo-

Aryan languages, concludes there was an adoption of an Indo-Aryan language

throughout the northern region of peninsular India. Southworth (1995: 264, 274) posits

that Dravidian-speakers and Indo-Aryan-speakers were in contact with each other as

early as the 2nd millennium BC and then merged into a singular cultural complex

during the late 2nd and early 1st millennium BC; this culture complex was dominated by an

Indo-Aryan language with multiple local Dravidic inclusions.

Linguistic evidence for this model is also offered by Parpola (1988), whose study

focused on the linguistic similarities among Central Asia, Pakistan, and India and who

proposes that it was the influence of the Megalithic culture, brought by horse-riding

Cimmerians from Kuban who migrated to Iran, that spread a later form of a West

Aryan/Old Iranian language to India. The majority of these linguistic isolates were easily

assimilated into the preexisting proto-Dravidian-speaking population due to their small

population size (Parpola 1988).

Linguistic and textual evidence further supporting the EEM is provided by Witzel

(1995), who suggests the inhabitants of the Indian subcontinent were speakers of proto-

Mundic languages. Witzel (1995) suggests that the recitation of the Rg Veda, a Bronze

Age compilation of multiple texts with dates ranging from 1900 to 1200 BC, was precise and exact, thereby yielding a modern-day “tape recording” of the earliest form of the text that is extremely useful to the study of South Asian history (1995:3, 91).

Through the use of loan words, structural borrowing, place names, and hydronymy,

Witzel (1995) argues that the Rg Veda offers evidence clearly suggesting not only the interaction of the Vedic Sanskrit speakers with Dravidian and Mundic speakers but also the acculturation of these groups long before the text was composed (1995:99, 108).

The evidence provided by the study of river names suggests an “almost complete Indo-Aryanisation in northern India,” which Witzel (1995) attributes to an acculturative process rather than to political upheaval or the influence of a lingua franca, a dominant language used to facilitate trade (1995:106, 111). Witzel (1995) argues that the appearance of “Aryan” kings with non-Indo-Aryan names provides additional evidence of a long period of acculturation (1995:108).

Furthermore, place names and cultural loan words such as “brick” and “wheat” suggest a Western influence that might be traced to the language spoken by the inhabitants of the Bactrian-Margianan urban centers of south Central Asia (Witzel

1995:104). Witzel (1995) thus suggests the possibility of the acculturation or

“Aryanisation” of the Turkmenian-Bactrian area before their entrance into South Asia

(1995:113). Witzel (1995) argues that the Indo-Aryans exhibited somatic characteristics similar to the ancient populations of the Turanian/Iranian/Afghan areas due to genetic

admixture occurring prior to their arrival to the Indian subcontinent (1995:113). Parpola

(1995) also suggests an acculturation event, calling for a kind of Kulturkugel in which all of the trappings of BMAC urban culture are adopted by Andronovo groups as they held the BMAC urban centers subject to their control.

Biological evidence that may lend support to the EEM is provided by Hemphill and Lukacs (1990), who support claims of shared biological affinity between South

Asian and Near Eastern populations and reject previous claims, like those of Cappieri

(1959, 1969, 1970, in Hemphill 1991), of homogeneity among South Asian populations based upon the results of their mean measure of divergence analysis of dental morphology trait frequencies (Hemphill and Lukacs 1990). Further biological evidence that may also lend support to the EEM is demonstrated by Hemphill and coworkers

(1991) using dental morphology trait frequencies, craniometric variation and cranial nonmetric data. The results of their principal components analysis of craniometric data yield the greatest distances between the prehistoric northern Pakistani samples (Harappa,

Timargarha) and the southern Pakistani sample from Mohenjo-daro. Furthermore, the cluster analysis using cranial non-metric data of South Asian samples and individuals from the Near East and Asia shows divergence between the Harappans and the inhabitants of Sarai Khola, suggesting a biological discontinuity in the Indus Valley after the end of the Harappan Civilization (1750 BC) but before the Early Iron Age at Sarai

Khola (200 BC) (Hemphill et al. 1991:173). These researchers conclude that the biological data does not support Renfrew’s (1987) Neolithic Arya Hypothesis of an introduction of Indo-European speakers into South Asia with the development of agriculture but instead supports regional continuity of the inhabitants of the Indus Valley

within a 2000 year time period with some interactions with populations living on the

Iranian Plateau (Hemphill et al. 1991). It should be noted that the Hazara are found to the northeast of Timargarha in rugged, potentially isolating, terrain and therefore may not show close biological affinities to the aforementioned populations.

Further support for the EEM model is provided by Lamberg-Karlovsky (1994).

Archaeological evidence of influences from the Near East in Central Asia, Baluchistan, and the Iranian Plateau beginning as early as the 4th millennium BC is proposed by

Lamberg-Karlovsky, who asserts the directionality of gene flow is from the Near East via the Iranian Plateau to Central Asia, West Asia and the Indus Valley. He suggests a slow and continuous spread of subsistence patterns and various technologies that may correlate to gene flow via population migration. Lamberg-Karlovsky (1994) argues for characteristically similar archaeological sequencing between the Near East and Central

Asia including shared metallurgy and pottery innovations evident in Central Asia during the 4th and 5th millennia BC with identical subsistence patterns found during the 7th millennium BC at sites located in the Near East. However, given the two to three thousand year hiatus, this conclusion seems unlikely.

Francfort (1994) offers a different patterning of cultural assimilation but with similar results by studying the Central Asian dimension of symbolic systems in Bactria and Margiana, which he interprets as evidence of Syro-Hittite and Elamitic mythology present in the Oxus Civilization. Francfort’s reading of the archaeological record suggests that cultural influences from Turkmenistan, the Indus area, and Iran are evident in the Bronze Age civilizations of Bactria and Margiana as early as 2500 BC.

Francfort (1994) interprets this to suggest that the Oxus Civilization is a conglomeration of population migrations and cultural influences.

Analyses of skeletal material also lend support to the EEM model through various types of techniques and data, including craniometric measurements and assessments of non-metric cranial trait frequencies. When looking at the broad categorization of shared biological affinity, there are only small differences between Caucasoid and Mongoloid types, as demonstrated by Howells (1973, 1976, in Hemphill 1991) using cluster and discriminant function analyses; this could be interpreted as supporting the directionality of gene flow indicative of this model. This is further supported by Lynch

(1989, in Hemphill 1991) who analyzed phylogenetic hypotheses under the assumption of neutral quantitative variation and proposed that the distance observed between

Caucasoid and Mongoloid types is the result of genetic drift and mutation. The results obtained by Berry and Berry (1967, in Hemphill 1991) from their analysis of non-metric cranial trait variation further demonstrate similarities and differences between Indian populations and populations of various world regions. With regard to Punjabis, Berry and Berry showed that they are very different from both North and South American

Indians, peoples of Western Europe, and both ancient and modern Palestinian samples.

Instead they were found to be similar to both an ancient Egyptian sample and a modern

Burmese sample. These findings are further extrapolated by Pal and coworkers (1988, in Hemphill 1991) who determined through a study of nonmetric cranial variations that modern Gujaratis are distinct from ancient and modern Palestinians and Australian

Aborigines phenetically. They have closer phenetic similarities to both ancient

Egyptians and modern Burmese than to modern American Caucasians and modern

Punjabis (Pal et al. 1988, in Hemphill 1991); however, this conclusion seems unlikely.

Dental metric analyses and morphological investigations provide substantial evidence of biological relationships that can be used to support the EEM model. One example is Lukacs (1977, in Hemphill 1991), who used dental morphology data to discern the patterning of dental variation among Indian populations. The dental trait patterns showed that those traits characteristic of Europeans occurred more often among a sample of mixed caste Maharashtrans from the western region of India, while those traits with a more East Asian dental pattern were found among a mixed caste urban Bengali sample from northeast India. Lukacs interpreted this pattern as indicative of gene flow from West to East; however, one should be cautious, as this evidence could also be interpreted as lending support to the HEIM rather than the EEM.

Hemphill and Lukacs (1990, in Hemphill 1991) posit the biological evidence of South

Asia exhibits interactions and gene flow indicative of sporadic and limited Western population incursions rather than supporting the hypothesis of long-standing population isolation.

Hemphill and coworkers (1992a) further support this model through the analysis of odontometric variation in the northwestern portion of India where they sought to discern possible biological interrelationships among and between modern groups of Bhils,

Garasias and Rajputs within the context of phenetic affinities to other South Asian groups.

Using cluster analysis and principal components analysis, it was determined that differential tooth size was not due solely to the effects of sexual dimorphism. These analyses also indicated that regardless of geographic locality there is a distinct

difference in tooth size allocation between non-caste tribe members and members of

Hindu castes, a clear separation in tooth size allocation patterns between Dravidian-speakers and Indo-European speakers, as well as differences between ethnic groups of northwestern, northeastern, and western India (Hemphill et al. 1992a).

Genetic analyses also provide biological evidence to support the EEM model.

Quintana-Murci and coworkers (2001) use Y-chromosome haplogroups to trace

Southwest Asian population movements and they conclude there was a demic diffusion

of early farmers and pastoral nomads into India originating in Central Asia. Based on

the frequency patterning and geographical cline of Y-chromosome haplogroups HG-3

and HG-9, Quintana-Murci and coworkers conclude that Indo-European speakers

migrated from southwestern Iran and Central Asia into India. Substantial evidence for

affinities shared between South Asian and West Asian populations can be found

using such analyses, and these lend further support to this model (Bamshad et al., 1998,

1998; Bhattacharyya et al., 1999; Majumdar, 1998; Passarino et al., 1996).

Cavalli-Sforza and coworkers (1988, in Hemphill 1991) posit that the peoples of

South Asia share closest biological affinities with Caucasoid groups lending support to

the proposed West Eurasian origins of both Dravidian- and Indo-Aryan-speaking

populations of South Asia. These researchers also suggest that the northern

populations of India are genetically different from those of South India. This distinction

is also supported by Passarino and coworkers (1996), who use mtDNA polymorphisms

to predict West and East Asian gene flow into the Indian subcontinent. Their genetic

evidence shows a clustering of the Indian samples with samples from western Asia;

nevertheless, Passarino and coworkers assert that one still cannot deduce the origins of

the initial population movement into the Indian subcontinent (Passarino et al. 1996:930).

Passarino and coworkers found that the Indian populations possessed a genetic marker

specific to East Asians, haplogroup M, with frequencies exhibiting a positive cline from

north to south, suggesting a small level of admixture; the highest frequency was

observed in Andhra Pradesh at 74%, while the lowest frequency was observed in the Punjab at

27%. The idea behind this is that the Indian subcontinent had limited gene flow

from outside into North India with little to no gene flow reaching into South Indian

populations. These researchers suggest that the initial introduction of East Asian genes

dates to either 41,724-55,000 or 30,250-60,500 years ago, which is likely a reflection of the initial dispersal of anatomically modern humans into Asia from Africa (Passarino et al. 1996).

Metspalu and coworkers (2011) provide further genetic support for the EEM.

They used SNP markers obtained from 1310 individuals of 112 populations located in various geographic locations around the world, of which 30 are different

Indian ethnic groups, to perform a principal components analysis. Based on the patterning of pairwise genetic distances, they concluded that the Indian ethnic groups are more similar genetically to West Eurasians than to East Eurasians, with these similarities long predating the purported Indo-Aryan invasion of the mid-2nd millennium

BC. This patterning is further supported by Nei and Roychoudhury (1982, in Hemphill

1991) who posit that the peoples of South Asia are more similar genetically to West

Asian and European Caucasoid populations than to East Asian Mongoloid populations.

Hemphill and coworkers (1991) may provide biological evidence that further supports the EEM by demonstrating that the skeletal evidence from Harappa shows closest

biological affinities to Near Eastern populations. This could suggest that Dravidian

speakers were at Harappa during the 3rd millennium BC, which supports the idea of an

early migration of peoples from Elamitic Iran into the Indian subcontinent. However, the

close phenetic affinities shared between the prehistoric inhabitants of Harappa and the

Chalcolithic occupants of Mehrgarh could mean that Dravidian-speakers were

already present in the Indus Valley by the early 5th millennium BC. Furthermore, it

should be noted that the aforementioned claims only provide support for the EEM if a

direct link between Dravidian speakers and southwestern Iran can be demonstrated.

However, Fuller (2003) provides evidence that runs counter to this premise, for he

claims that Dravidian languages originated entirely in South Asia rather than in

southwestern Iran. Fuller (2003) uses archeobotanical evidence to claim that

independent centers of plant domestication occurred within peninsular India. He thus

rejects the notion that Dravidian languages spread eastward as a result of a migration of

some alleged proto-Elamo-Dravidian-speaking farmers from southwestern Iran. Instead,

he asserts that the Dravidian family of languages is a secondary consequence of the

independent development of plant domestication within peninsular India.

Expectations

If the EEM is correct, then the Hazara should have distant but equal biological affinities with groups from prehistoric Central Asia, living northwestern South Asia, and

the latest samples from the Indus valley. The Central Asian groups are represented by

the Geoksyur (GKS), Altyn depe (ALT), Sapalli tepe (SAP) and Djarkutan (DJR, KUZ,

MOL). The living northwestern Indian groups include the Indo-Aryan-speaking high-

status caste Vaghelia Rajputs (RAJ), low-status caste Garasias (GRS) and tribal Bhils

(BHI) while the post-Harappan samples from the Indus Valley include the Iron Age Sarai

Khola (SKH) sample and the Late Bronze/Early Iron Age Gandharan Grave Culture

sample recovered from Timargarha (TMG). If the EEM is correct, then the Hazara will

also show the furthest relatedness to the prehistoric and living Dravidian-speakers.

These would include southeast Indian groups, such as the high-status caste Hindu

Pakanati Reddis (PNT), low-status caste Hindu Gompadhompti Madigas (GPD), and non-caste tribal Chenchus (CHU); but only if later gene flow from Central Asia upon

Dravidian-influenced populations of the greater Indus Valley and surrounding highland

regions did not occur. However, it should also be noted that the EEM model does not

factor in the possible influence of later genetic admixture; therefore, close biological

affinity to the aforementioned groups is not refuting evidence for this model but may simply be the result of a dynamic population history.

Further support for this model will be found if the Dravidian-speaking ethnic

groups from southeastern India share affinities to Chalcolithic era samples from the

Indus Valley, such as Mehrgarh (ChlMRG) and Harappa (HAR). Therefore, two

breaks in the Indus Valley biological continuity may be demonstrated. These include

one break between the Neolithic and Chalcolithic occupations of Mehrgarh, and, for

the version of the EEM that includes the occurrence of an Indo-Aryan invasion from

Central Asia, a break between the Late Chalcolithic inhabitants of Harappa and the

later prehistoric and historic occupants of the Indus Valley.

Historic Era Influences Model (HEIM)

Proponents of the HEIM propose that many of the ethnic populations located at

the periphery of the Indian subcontinent are the product of recent migrations.

Evidentiary support for this model can be found from historically documented population movements and dental analyses. Historically documented population movements cannot be ignored when dealing with attempts to reconstruct the population history of the Indian subcontinent because such movements have the potential to greatly impact populations, especially at the local group level. Hemphill and coworkers (2009) proposed this model as an explanation for the origins of the Khowar, the numerically dominant ethnic group of Chitral District, Khyber Pakhtunkhwa, northern Pakistan.

O’Neill and Hemphill (2009) used odontometric analyses and determined that the

Wakhi, an ethnic group who claim to be refugees from the Wakhan Corridor of

Afghanistan, also conform to the expectations of this model (Sidky 1995). It is possible

the Hazara will also follow this model since they also claim to be Afghan refugees. The

individuals in question may, in fact, be the result of a population movement that took

place during the historic period. The subsequent genetic impacts of this model may

explain any deviation from the expectations of other origin models.

Evidence lending support to the HEIM may be found within the far-reaching,

influential reign of Genghis Khan (AD 1162-1227). Genghis Khan and his male relatives

established the largest land empire in history, conquering numerous populations, and

often subsequently slaughtering them (Morgan 1986). They left behind many

descendants who ruled China and areas north of the Great Wall for several centuries

after the dissolution of the Mongol empire as a political unit (Morgan 1986). The

influences of the Mongol empire in the form of conquered populations and subsequent

gene flow may be far-reaching. The vast region ruled by the Mongol empire included

parts of Central Asia, India, Pakistan, and Iran (Cavalli-Sforza et al. 1994). The troops

of Genghis Khan’s army were often left in detachments of a thousand in the newly

conquered areas and serve as the posited origins of the Hazara population according to proponents of this model. Evidentiary support can be found in genetic studies,

craniometric analyses, and archaeological data.

Zerjal and coworkers (2003), who analyzed Y-chromosome differences in a survey of DNA variation in Asia, found a high frequency of a cluster of closely related lineages, a star-cluster, whose origins may be traced back to Mongolia approximately 1,000 years ago. This was deduced based on the degree of observed genetic variation assessed with mutation and population processes, with model parameters consisting of both constant population size and exponentially increasing population size (Zerjal et al. 2003). The boundaries of the Mongol empire at the time of

Genghis Khan’s death correspond to the regions where populations possess the highest frequencies of the star-cluster. With the exception of the Hazara, whose profiles lie within the star-cluster at high frequency, the star-cluster is not found in other Pakistani populations. Thus, Zerjal and coworkers conclude the high frequency presence of this

star-cluster, coupled with its absence in neighboring Pakistani populations, provides

strong evidence that supports the Hazara oral tradition that members of Genghis Khan’s army

are responsible for their origins. However, the validity of these models for making such

assertions is somewhat questionable, since a possibly spurious relationship among these

variables is not addressed; for example, it remains unclear when the star-cluster chromosomes became present

in the Hazara and how the genetic admixture occurred. A correlation between high frequencies of star-cluster chromosomes within the Hazara and an origin among Mongolians does not equate to causation; perhaps the star-cluster was present among other Pakistani groups and subsequently selected against, or perhaps its absence in these Pakistani groups is the result of a genetic isolating mechanism that did not affect the Hazara to the same degree.

Hunley and coworkers’ (2009) study of globally distributed gene patterning offers additional support for the HEIM model. These researchers compared the pattern of neutral genetic variation predicted by a coalescent-based simulation approach to the observed pattern estimated from neutral autosomal microsatellites among 1,032 individuals from 53 populations worldwide (Hunley et al. 2009:35). Outlier populations for specific regions were assessed and the Hazara stood out against the other Western

Eurasian populations by exhibiting greater genetic affinities to East Asians, which likely reflects their East Eurasian ancestry.

Qamar and coworkers (2002) also assert that the Hazara of Pakistan have

Mongol origins. Their analysis of Y-chromosome variation in Pakistani populations typed 18 binary polymorphisms and 16 multiallelic, short-tandem-repeat loci obtained from the non-recombining portion of the Y-chromosome of 718 individuals. They used principal components analysis of haplogroup frequencies, analysis of molecular variance, and median-joining network construction. The Y-chromosome binary polymorphisms showed shared haplogroups 1 and 9 for all Pakistani populations, including the Hazara, but the Hazara lacked haplogroup 3, which was present in all other examined populations (Qamar et al. 2002:1111). They also did not possess

haplogroup 28, which was found in almost all other Pakistani ethnic groups sampled.

The differences observed in the Hazara population were confirmed by principal components analysis of the binary marker frequencies which indicated that all populations show a striking overall resemblance to one another, except the Hazara

(Qamar et al. 2002:1114). The admixture estimates provide additional evidence for an external contribution to the Hazara (Qamar et al. 2002:1112). The results obtained from multidimensional scaling of weighted population pairwise values of short-tandem-repeat loci variation within haplogroups, represented as Wright’s FST (ΦST), show clear divergence with the Hazara exhibiting the most significantly different population pairwise

ΦST values, with a coefficient of determination of 0.81. The results of this study reveal an extreme genetic disconnect between the Hazara and other Pakistani populations, which, when coupled with the possession of East Asian haplogroups, provides strong evidence of an East Eurasian origin. Qamar and coworkers (2002) assert that the

Hazara of Pakistan used in this analysis represent individuals who emigrated from

Afghanistan to the Khurram Valley towards the end of the 19th century and whose oral traditions also claim a Mongolian origin.

Quintana-Murci and coworkers (2004) provide further support for the HEIM utilizing mtDNA collected from 910 individuals of 23 populations from the southwestern

Asian corridor, including the Hazara of the former Northwest Frontier Province of

Pakistan (Khyber Pakhtunkhwa). These researchers used lineage geographical distribution and spatial analysis of molecular variance and found that the eastern

Eurasian-specific lineages were either completely absent or found in very low frequencies in all populations within the Anatolian/Caucasus region, the Iranian plateau,

and the Indus Valley, except the Hazara, who exhibited a frequency of 35% (Quintana-Murci et al. 2004:834). A principal components analysis placed the Hazara in an intermediate position between populations from Central Asia on the one hand and those from the

Indus Valley on the other (Quintana-Murci et al. 2004:835). Two significantly differentiated population clusters reveal a division between Anatolian/Caucasus and Iranian plateau groups on the one hand and all groups from Central Asia and the Indus Valley on the other. When three groups are employed, the Hazara emerge as their own distinct group. Furthermore, the mtDNA and Y-chromosomal data are in concordance. Both the presence and time depth of Y-chromosome haplogroup C* (xC3c) possessed by the Hazara, coupled with its complete absence in neighboring populations, are suggestive of Genghis Khan’s and

his male relatives’ genetic legacy (Qamar et al. 2002; Zerjal et al. 2003). Importantly,

the mtDNA results obtained by Quintana-Murci and coworkers (2004:840) suggest that not only are males of East Asian origin responsible for the population origins of the

Hazara, but females are as well.

Zerjal and coworkers (2002) provide evidence for an East Eurasian genetic influence in Central Asia when analyzing 16 Y-chromosomal microsatellites and 16 binary markers of 408 individuals from 15 Central Asian populations. These researchers used the autocorrelation index for DNA Analysis, analysis of molecular variance, multidimensional scaling and bootstrapping analysis and found an east-west gradient of Y-chromosome variation. These researchers also noted high levels of intergroup variation in Y-chromosome differentiation patterns between geographically close populations, coupled with low intra-group variation within other populations, suggesting

either multiple recent genetic bottlenecks or founder events thus emphasizing the importance of recent historic era genetic admixture events (Zerjal et al. 2002).

Hemphill (1999) analyzed the biological affinities and the likely origins of Bronze

Age Bactrians from the Oxus Civilization urban centers of Sapalli tepe and Djarkutan, whose origins are in question given their stratigraphic position immediately above sterile soil. The sample included 657 adults from Central Asia, Iran, and the Indus Valley.

Statistical techniques employed include Mahalanobis generalized distance to assess craniometric differences and cluster analyses, multidimensional scaling, and principal coordinates analysis to assess phenetic affinities. The biological affinities assessed in this analysis suggest that a possible shift in interregional contacts by the populations of the Oxus Civilization occurred around 2000 BC, which is suspected to involve populations of western China (Hemphill 1999:188). Several lines of archaeological evidence support this claim. Excavations at Sapalli tepe revealed silk remains in four graves as well as hundreds of millet seeds suggesting contact between populations of western China and the north Bactrian oasis (Askarov 1974, 1977, 1981 in Hemphill

1999). Furthermore, the Bactrian bronzes were produced by alloying copper and tin, with the best source of tin lying to the east within the Ferghana Valley, which again may be indicative of contact with western China (Hemphill 1999:188). Contact with western

China and the Indus Valley appears to have continued after 1800 BC based on an analysis of craniometric variation among the inhabitants of Yanbulaq, located in eastern

Xinjiang, but seems to have shifted again to western Central Asia later, perhaps during the 1st millennium BC (Hemphill 2013a). It is possible, then, that the Hazara may show some affinity to some Central Asian populations, although Hemphill (1997) has demonstrated that trade networks do not equate to gene transmission.

Expectations

If the HEIM is correct, the Hazara should represent recent immigrants into northern Pakistan, with all Pakistani groups showing varied biological affinities to one another with no shared affinities to prehistoric Indus Valley groups. Furthermore, if the Hazara represent an intrusive population to northern Pakistan, it is possible they may not share biological affinities with any of the comparative samples, thus representing an outlier to all populations from Central Asia, Pakistan, and peninsular India.

Current Statistical Analyses

A number of statistical analyses must be employed prior to conducting biodistance analyses. These include determining the amount of intra- and inter-observer error, as well as tests for sex dimorphism and asymmetry (Butler 1939, Dahlberg 1945). Intra-observer error is assessed through the re-measurement of 25 casts selected randomly from the Hazara sample, and inter-observer error is assessed by measuring individuals previously measured by Hemphill, as he is responsible for measuring the majority of the comparative samples used in this analysis. For the inter-observer tests, a sample of casts collected from the inhabitants of Madaklasht (n=35) was used. The Madaklasht casts were chosen using a random number chart so as not to introduce biases; 35 individuals were chosen using this methodology, with the most complete specimens used to produce a more accurate assessment of any potential inter-observer error present. Intra- and inter-observer error and the degree of asymmetry between antimeres are assessed for each dimension with paired-samples t-tests.
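As an illustration only, the paired-samples t statistic used for the error and asymmetry tests can be sketched as follows. The measurements are hypothetical values invented for the example, and the function name `paired_t` is an assumption rather than part of any software used in this study. The sketch returns only the t statistic and degrees of freedom, and compares the statistic against the tabulated two-tailed critical value for alpha = 0.05.

```python
import math
from statistics import mean, stdev


def paired_t(first, second):
    """Return (t statistic, degrees of freedom) for a paired-samples t-test."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length series with at least two pairs")
    # Work on the pairwise differences (second bout minus first bout).
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical repeated measurements (mm) of one dimension on five casts.
bout_1 = [9.10, 8.95, 10.30, 9.40, 6.00]
bout_2 = [9.15, 9.05, 10.28, 9.52, 6.06]

t_stat, df = paired_t(bout_1, bout_2)

# Tabulated two-tailed critical value of t for alpha = 0.05 and df = 4.
T_CRIT_DF4 = 2.776
significant = abs(t_stat) > T_CRIT_DF4
```

With 25 or 35 casts, as in the tests described above, the critical value would be taken from a t table for 24 or 34 degrees of freedom.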

In addition, before conducting biodistance analyses the raw tooth sizes

are scaled against the geometric mean by sample and by sex to correct for sex dimorphism and evolutionary tooth size reduction (see Jungers et al. 1995).

Pairwise differences in tooth size allocation by sample are assessed with squared Euclidean distances. The diagonal matrix of pairwise squared Euclidean distances is then submitted to hierarchical cluster analysis with Ward’s (1963) method, neighbor-joining tree cluster analysis (Saitou and Nei 1987), multidimensional scaling with Guttman and Lingos’ coefficient of alienation (Guttman 1968) and with Kruskal’s method using stress formula No. 1 (Kruskal 1964), and principal co-ordinates analysis (Gower 1966). The aforementioned statistical analyses are performed with NTSYS 2.1, Systat 11, and Phylip-3.69.
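The scaling and distance steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure as run in NTSYS, Systat, or Phylip: each sample is represented by a vector of mean crown diameters, each value is divided by the geometric mean of that vector, and the sample names and measurements are hypothetical.

```python
import math


def geometric_mean_scale(sizes):
    """Divide each measurement by the geometric mean of the whole vector."""
    log_mean = sum(math.log(x) for x in sizes) / len(sizes)
    gm = math.exp(log_mean)
    return [x / gm for x in sizes]


def squared_euclidean(a, b):
    """Sum of squared differences between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def distance_matrix(profiles):
    """Pairwise squared Euclidean distances keyed by (sample, sample)."""
    names = list(profiles)
    return {(p, q): squared_euclidean(profiles[p], profiles[q])
            for p in names for q in names}


# Hypothetical mean crown diameters (mm) for three made-up samples.
raw = {
    "A": [10.0, 9.0, 6.0, 7.0],
    "B": [11.0, 9.9, 6.6, 7.7],  # 10% larger than A, same proportions
    "C": [10.0, 8.0, 6.5, 7.5],  # different proportions
}

scaled = {name: geometric_mean_scale(values) for name, values in raw.items()}
d = distance_matrix(scaled)
```

Samples A and B differ only in overall size, so their scaled profiles coincide and their distance is effectively zero, whereas C differs in how tooth size is apportioned and retains a positive distance from both.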

Individual teeth will be referred to by class, position in the dental arcade, and jaw. Tooth class will be represented as follows: incisors = I, canines = C, premolars = P, molars = M. Maxillary and mandibular teeth will be distinguished as upper (U) and lower (L), respectively. For example, the right mandibular first molar is abbreviated LRM1, and the left maxillary second incisor is designated ULI2. When distinguishing between the two forms of measurement (buccolingual and mesiodistal), the dimension will be abbreviated and cited at the end of the tooth description. For example, the buccolingual dimension of the upper left canine will be designated ULCBL, whereas the designation LLM1MD refers to the mesiodistal dimension of the lower left first molar.
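The abbreviation scheme just described can be expressed as a small parser. This is a sketch that follows only the convention stated in the preceding paragraph (jaw, then side, then tooth class and position, then dimension). The function name `parse_tooth_code` is hypothetical, and codes that omit the side, such as LM2MD, are deliberately rejected.

```python
import re

ARCADE = {"U": "maxillary", "L": "mandibular"}
SIDE = {"L": "left", "R": "right"}
TOOTH_CLASS = {"I": "incisor", "C": "canine", "P": "premolar", "M": "molar"}
DIMENSION = {"MD": "mesiodistal", "BL": "buccolingual"}

# Jaw, side, tooth class, optional position digit, optional dimension.
PATTERN = re.compile(r"^([UL])([LR])([ICPM])(\d?)(MD|BL)?$")


def parse_tooth_code(code):
    """Split a code such as 'LLM1MD' into its named components."""
    match = PATTERN.match(code)
    if match is None:
        raise ValueError(f"unrecognized tooth code: {code!r}")
    arcade, side, tooth_class, position, dimension = match.groups()
    return {
        "arcade": ARCADE[arcade],
        "side": SIDE[side],
        "class": TOOTH_CLASS[tooth_class],
        "position": int(position) if position else None,  # canines have none
        "dimension": DIMENSION[dimension] if dimension else None,
    }


lower_right_m1 = parse_tooth_code("LRM1")
upper_left_canine_bl = parse_tooth_code("ULCBL")
```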

Intra-observer error Analysis

Two-tailed paired-samples Student’s t-tests with an alpha level of 0.05 were conducted to determine whether statistically significant differences exist between separate bouts of measurement by the current researcher, and thus whether the current researcher produces consistent results. The intra-observer error tests yielded reasonably consistent results (see Table 2). These tests were based on measurements of 25 randomly selected dental casts from the Hazara collection that had been measured by the current researcher approximately six months earlier. Out of 28 total dimensions, four (14.3%) demonstrated significant differences (LM2BL, LM1BL, UM1BL, and UP3BL). These differences are not viewed as damaging to the current study because they introduce little bias into the overall analysis of tooth-size allocation. Overall, the results of the intra-observer error analysis indicate that the current researcher’s measurements are consistent, and that measurements are taken according to the same criteria during different bouts of data collection.

Table 2. Statistical Assessment of Intra-observer Error via Two-tailed Distribution Paired Student’s t-tests. Statistically significant differences (p<0.05) are marked with an asterisk (*).

Tooth &      Trial 1   Trial 2   Mean
Dimension    mean      mean      difference   p-value
LM2MD        9.02      9.11      0.09         0.063
LM2BL        8.87      9.07      0.27         0.018 *
LM1MD        10.29     10.38     0.09         0.075
LM1BL        9.33      9.61      0.4          0.010 *
LP4MD        6.02      6.05      0.03         0.650
LP4BL        7.07      7.14      0.07         0.061
LP3MD        6.15      6.18      0.03         0.390
LP3BL        6.55      6.71      0.16         0.063
LCMD         5.95      6.03      0.08         0.085
LCBL         6.05      6.25      0.10         0.073
LI2MD        5.22      5.27      0.05         0.130
LI2BL        4.93      4.98      0.05         0.066
LI1MD        4.75      4.74      0.01         0.820
LI1BL        4.56      4.59      0.03         0.468
UM2MD        8.54      8.69      0.16         0.200
UM2BL        9.35      9.38      0.03         0.078
UM1MD        9.60      9.61      0.01         0.960
UM1BL        9.60      10.17     0.57         0.001 *
UP4MD        5.83      5.81      0.02         0.710
UP4BL        7.73      7.80      0.07         0.086
UP3MD        6.13      6.17      0.04         0.620
UP3BL        7.46      7.67      0.21         0.020 *
UCMD         6.77      6.87      0.1          0.065
UCBL         6.58      6.63      0.05         0.110
UI2MD        5.92      6.01      0.09         0.078
UI2BL        4.89      5.29      0.4          0.079
UI1MD        7.46      7.67      0.21         0.081
UI1BL        5.53      5.86      0.28         0.061

Inter-observer error Analysis

Two-tailed paired-samples Student’s t-tests with an alpha level of 0.05 were conducted for each dental measurement to determine whether a statistically significant difference exists between the measurements taken by Hemphill and those taken by the current researcher. A paired-samples t-test, often referred to as a ‘repeated measures’ Student’s t-test, was chosen because the same group of casts was measured twice.

These inter-observer error tests were based on repeated measurement of 35 dental casts randomly selected from the Madaklasht dental series. Three of the 28 variables (10.7%; LM2MD, UM1MD, and LM1BL) yielded significant differences (see Table 3). The p-value for LP4MD (0.053) is barely greater than the chosen alpha level of 0.05, indicating that this dimension approaches statistical significance. These differences are likely due to partial eruption and gingival overlay, which lead to disagreements between the two observers as to whether a tooth can be measured and, if so, where exactly the edge of the tooth lies. Imperfections in the dental casts may likewise lead to disagreement over whether a tooth is sufficiently preserved or erupted for its measurement to be meaningful. Overall, the results indicate that the current researcher measures consistently with Hemphill during data collection.

Table 3. Statistical Assessment of Inter-observer Error via Two-tailed Distribution Paired Student’s t-tests. Statistically significant differences (p<0.05) are marked with an asterisk (*).

Tooth &      Trial 1   Trial 2   Mean
Dimension    mean      mean      difference   p-value
LM2MD        9.42      9.59      0.17         0.006 *
LM2BL        9.62      9.68      0.06         0.089
LM1MD        10.48     10.56     0.08         0.067
LM1BL        9.72      9.87      0.15         0.002 *
LP4MD        6.03      6.15      0.12         0.053
LP4BL        7.68      7.76      0.08         0.117
LP3MD        6.44      6.50      0.06         0.121
LP3BL        7.13      7.23      0.1          0.066
LCMD         6.14      6.20      0.06         0.147
LCBL         6.79      6.87      0.08         0.097
LI2MD        5.41      5.48      0.07         0.205
LI2BL        5.84      5.93      0.09         0.082
LI1MD        4.84      4.9       0.06         0.092
LI1BL        5.46      5.53      0.07         0.072
UM2MD        9.19      9.21      0.02         0.088
UM2BL        10.47     10.53     0.06         0.080
UM1MD        9.44      9.61      0.17         0.018 *
UM1BL        10.80     10.88     0.08         0.079
UP4MD        5.84      5.89      0.05         0.070
UP4BL        8.61      8.70      0.09         0.114
UP3MD        6.35      6.42      0.07         0.210
UP3BL        8.58      8.73      0.15         0.099
UCMD         7.11      7.2       0.09         0.217
UCBL         7.53      7.64      0.11         0.087
UI2MD        6.11      6.2       0.09         0.120
UI2BL        6.07      6.21      0.14         0.069
UI1MD        8.03      8.08      0.05         0.342
UI1BL        6.97      7.08      0.11         0.102

Asymmetry Analyses

Asymmetry is generally categorized as either directional or fluctuating. Directional asymmetry (DA) is defined as the tendency for consistently greater development of one side of a paired bilateral structure over the other within a population. Fluctuating asymmetry (FA) consists of small random differences between antimeres. Research indicates FA varies in expression among prehistoric and contemporary populations (Doyle and Johnston 1977, DiBennardo and Bailit 1978, Harris and Nweeia 1980, Townsend 1981, Mizoguchi 1986, Kieser et al. 1986a, Kieser and Groeneveld 1988), between dental arcades, and within morphogenetic fields (Bailit et al. 1970, Townsend and Brown 1980, Kieser et al. 1986b).

Asymmetry can be calculated in several different ways, including the methods used by Harris and Nweeia (1980), the Euclidean map distance used by Kieser and Groeneveld (1988), the method of Guatelli-Steinberg and coworkers (2006), and the method used by Hoover and coworkers (2005). A series of paired-samples Student’s t-tests was conducted using an alpha of 0.05 to determine whether any statistically significant differences exist between right and left antimeres among the Hazara. It is important to determine whether significant differences exist in order to be able to substitute missing left-side values with corresponding right-side values when conducting the statistical tests used for assessing biological distance.

The paired-samples Student’s t-tests indicate that a few variables possess p-values less than 0.05 for both males and females (Tables 4 and 5). Three of 28 dimensions (10.7%) are significantly different among Hazara males: LP3BL, UM1MD, and UP3BL. The p-value for LP4BL (0.052) is just barely higher than the alpha level. Three dimensions (LM2MD, LP3BL, and UCMD) are also statistically significantly different among Hazara females, but only one (LP3BL) is shared with Hazara males. Overall, the results indicate a rather low occurrence of asymmetry, but asymmetry affects the dental arcades differently in males than in females. For Hazara females, two of the three dimensions that differ significantly between right and left sides occur in the mandible. By contrast, among Hazara males two of the three significant differences between antimeres occur in the maxilla. An equal number of mesiodistal and buccolingual dimensions were found to differ significantly, suggesting similar plasticity rates during ontogenesis. Given the few statistically significant differences, asymmetry is unlikely to affect the results of the biological distance analyses.

Table 4. Results of the two-tailed distribution paired Student’s t-tests for asymmetry assessment among male Hazara. Statistically significant values (p<0.05) are marked with an asterisk (*).

Tooth &      N                Minimum   Maximum
Dimension    Pairs   Mean     Value     Value     DF (n-1)   p-value
LLM2MD       70      9.515    7.44      10.85     69         0.512
RLM2MD       70      9.556    7.08      11.02     69
LLM2BL       64      9.187    7.33      11.41     63         0.796
RLM2BL       64      9.202    7.72      11.13     63
LLM1MD       71      10.458   8.72      11.96     70         0.274
RLM1MD       71      10.368   6.64      11.9      70
LLM1BL       71      9.601    6.76      10.9      70         0.237
RLM1BL       71      9.716    6.97      11.4      70
LLP4MD       81      6.282    5.17      8.06      80         0.140
RLP4MD       81      6.462    5.03      11.58     80
LLP4BL       81      7.261    5.76      8.69      80         0.052
RLP4BL       81      7.485    5.98      11.11     80
LLP3MD       83      6.237    5.35      7.47      82         0.274
RLP3MD       83      6.295    5.2       9.1       82
LLP3BL       82      6.796    5.4       8.42      81         0.037 *
RLP3BL       82      6.933    5.04      8.78      81
LLCMD        74      6.076    4.77      6.96      73         0.730
RLCMD        74      6.103    5.08      10.45     73
LLCBL        74      6.010    4.43      8.39      73         0.636
RLCBL        74      6.048    3.91      8.72      73
LLI2MD       67      5.345    4.12      6.68      66         0.877
RLI2MD       67      5.354    4.55      6.83      66
LLI2BL       80      4.873    3.39      6.68      79         0.355
RLI2BL       80      4.809    3.03      7.57      79
LLI1MD       65      4.800    3.81      5.65      64         0.409
RLI1MD       65      4.778    3.85      5.78      64
LLI1BL       68      4.713    3.06      7.14      67         0.777
RLI1BL       68      4.702    3.34      7.13      67
LUM2MD       75      8.685    6.96      10.88     74         0.644
RUM2MD       75      8.650    6.81      10.98     74
LUM2BL       76      9.614    7.64      11.68     75         0.244
RUM2BL       76      9.700    8.03      11.73     75
LUM1MD       80      10.027   8.34      11.89     79         0.030 *
RUM1MD       80      9.849    7.31      11.37     79
LUM1BL       81      10.006   8.66      12.17     80         0.177
RUM1BL       81      10.073   8.76      11.96     80
LUP4MD       83      5.869    4.68      7.68      82         0.499
RUP4MD       83      5.929    4.63      10.51     82
LUP4BL       85      7.968    5.78      9.57      84         0.090
RUP4BL       85      8.083    6.45      11.89     84
LUP3MD       86      6.156    4.92      7.28      85         0.794
RUP3MD       86      6.166    5.26      7.33      85
LUP3BL       87      7.791    6.11      9.79      86         0.009 *
RUP3BL       87      7.939    5.8       9.66      86
LUCMD        82      7.034    5.94      8.34      81         0.270
RUCMD        82      6.973    4.09      8.08      81
LUCBL        74      6.537    4.57      8.99      73         0.452
RUCBL        74      6.594    4.61      8.87      73
LUI2MD       62      6.105    4.67      7.96      62         0.513
RUI2MD       62      6.059    3.78      8.35      62
LUI2BL       69      4.776    3.69      7.02      68         0.591
RUI2BL       69      4.814    3.47      7.33      68
LUI1MD       65      7.807    5.05      8.81      64         0.460
RUI1MD       65      7.774    5.3       9.45      64
LUI1BL       75      5.487    3.51      7.83      74         0.503
RUI1BL       75      5.459    3.12      8.41      74

Table 5. Results of the two-tailed distribution paired Student’s t-tests for asymmetry assessment among female Hazara. Statistically significant values (p<0.05) are marked with an asterisk (*).

Tooth &      N                Minimum   Maximum
Dimension    Pairs   Mean     Value     Value     DF (n-1)   p-value
LLM2MD       69      8.830    7.71      10.42     68         0.032 *
RLM2MD       69      8.965    7.65      10.62     68
LLM2BL       52      8.766    7.67      9.97      51         0.254
RLM2BL       52      8.682    7.63      9.81      51
LLM1MD       77      10.003   6.38      11.48     76         0.874
RLM1MD       77      10.016   6.24      11.43     76
LLM1BL       74      9.243    7.48      10.36     73         0.514
RLM1BL       74      9.309    7.03      10.28     73
LLP4MD       94      6.146    4.82      10.62     93         0.226
RLP4MD       94      6.266    5.21      10.38     93
LLP4BL       92      6.987    5.52      9.93      91         0.073
RLP4BL       92      7.121    5.77      10        91
LLP3MD       101     6.182    5.18      7.26      100        0.160
RLP3MD       101     6.116    4.66      7.92      100
LLP3BL       100     6.461    5.42      7.67      99         0.004 *
RLP3BL       100     6.602    4.9       8.34      99
LLCMD        93      5.806    4.5       6.87      92         0.715
RLCMD        93      5.822    4.71      7.21      92
LLCBL        95      6.014    4.57      7.28      94         0.437
RLCBL        95      5.971    4.35      7.24      94
LLI2MD       70      5.244    4.26      6.37      69         0.928
RLI2MD       70      5.240    4.16      6.4       69
LLI2BL       89      5.098    3.62      6.58      88         0.256
RLI2BL       89      5.041    3.9       7.16      88
LLI1MD       74      4.790    3.94      7.42      73         0.669
RLI1MD       74      4.774    3.4       7.32      73
LLI1BL       88      4.765    3.86      6.2       87         0.283
RLI1BL       88      4.799    3.6       6.42      87
LUM2MD       64      8.511    7.07      10.24     63         0.238
RUM2MD       64      8.397    6.78      10.51     63
LUM2BL       65      9.290    7.92      10.98     64         0.274
RUM2BL       65      9.359    8.02      11.19     64
LUM1MD       82      9.447    8.04      11.39     81         0.821
RUM1MD       82      9.433    7.98      11.42     81
LUM1BL       82      9.609    4.77      11.91     81         0.334
RUM1BL       82      9.691    8.22      11.63     81
LUP4MD       97      5.797    4.66      7.6       96         0.380
RUP4MD       97      5.854    4.39      8.81      96
LUP4BL       98      7.603    5.45      9.03      97         0.089
RUP4BL       98      7.742    5.9       10.29     97
LUP3MD       97      6.055    4.71      7         96         0.764
RUP3MD       97      6.071    4.35      8.92      96
LUP3BL       97      7.359    5.79      9.18      96         0.451
RUP3BL       97      7.404    5.33      9.26      96
LUCMD        92      6.672    5.28      8.18      91         0.002 *
RUCMD        92      6.774    5.42      8.04      91
LUCBL        85      6.625    5.18      7.58      84         0.128
RUCBL        85      6.712    5.13      7.99      84
LUI2MD       73      5.764    4.17      7.95      72         0.624
RUI2MD       73      5.792    4.39      7.85      72
LUI2BL       79      4.927    3.77      7.6       78         0.413
RUI2BL       79      4.994    4.08      7.67      78
LUI1MD       83      7.364    5.14      8.93      82         0.868
RUI1MD       83      7.370    4.81      8.89      82
LUI1BL       92      5.710    3.98      8.17      92         0.071
RUI1BL       92      5.611    3.6       8.1       92

For those variables found to differ significantly between right and left antimeres, DA was assessed to determine which antimere was larger. DA was calculated according to the standards used by Harris and Nweeia (1980); that is, by subtracting the crown size of the right antimere from that of the left (d = L – R) in the mesiodistal and buccolingual dimensions. As such, a negative value indicates the right antimere is larger than the left. Results of the DA assessment indicate that most statistically significant differences are a result of the right antimere being larger than the left (Table 6). The sole exception is the UM1MD dimension among males, for which the left side is larger than the right.
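The sign convention for d can be illustrated with a short sketch. The two pairs of input values are the male sums reported in Table 6; the function names are hypothetical and are not part of the method of Harris and Nweeia (1980).

```python
def directional_asymmetry(left, right):
    """Return d = L - R; a negative value means the right antimere is larger."""
    return left - right


def larger_side(d):
    """Name the larger antimere implied by the sign of d."""
    if d > 0:
        return "left"
    if d < 0:
        return "right"
    return "neither"


# Male sums from Table 6: UM1MD (left 802.13, right 787.92)
# and lower P3BL (left 557.27, right 568.48).
d_um1md = directional_asymmetry(802.13, 787.92)
d_lp3bl = directional_asymmetry(557.27, 568.48)

um1md_side = larger_side(d_um1md)
lp3bl_side = larger_side(d_lp3bl)
```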

Table 6. Results of the directional asymmetry assessment among the Hazara. Instances where the left antimere is larger (positive values) are marked with an asterisk (*).

Females
Tooth & Dimension   Sum       d
LUCMD               6.672     -0.101
RUCMD               6.773
LLP3BL              646.07    -14.15
RLP3BL              660.22
LLM2MD              609.29    -9.3
RLM2MD              618.59

Males
Tooth & Dimension   Sum       d
LUP3BL              677.81    -12.87
RUP3BL              690.68
LUM1MD              802.13    14.21 *
RUM1MD              787.92
LLP3BL              557.27    -11.21
RLP3BL              568.48

Sexual Dimorphism Analyses

Evaluating the degree of sexual dimorphism present among the Hazara is important before conducting biological distance analyses. Student’s t-tests with a two-tailed distribution were conducted for all dimensions in both dental arcades, once with assumed homoscedasticity and again with assumed heteroscedasticity, using an alpha of 0.05. If these two assumptions produced conflicting results, a test for the homogeneity of variance would be employed to determine which test proved more valid.

The magnitude of sex dimorphism was also calculated for each dimension. This measure is calculated by subtracting the female mean for a dimension from the corresponding male mean, dividing the difference by the female mean, and multiplying by 100 to generate a percentage: ((m-f)/f)*100.
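As a worked check of this formula, the sketch below applies it to two pairs of left-side means reported in this chapter (mandibular M2MD and mandibular CMD). The function name is hypothetical.

```python
def percent_sex_dimorphism(male_mean, female_mean):
    """Return ((m - f) / f) * 100 for one tooth dimension."""
    return (male_mean - female_mean) / female_mean * 100


# Left-side means (mm) from the sex dimorphism results:
# lower M2MD: males 9.566, females 8.885
# lower CMD:  males 6.109, females 5.805
lm2md = percent_sex_dimorphism(9.566, 8.885)
lcmd = percent_sex_dimorphism(6.109, 5.805)
```

Rounded to three decimals, these reproduce the tabulated values of 7.665% and 5.237%.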

The results of the Student’s t-tests indicate that multiple variables possess p-values less than 0.05 for both right and left sides (Tables 7 and 8). Nine of the 28 dimensions (32.14%) differ significantly among variables belonging to the left side of the maxillary dental arcade. Of these, five are mesiodistal dimensions (M2, M1, C, I2, and I1) and four are buccolingual (M2, M1, P4, and P3). Eight dimensions (28.57%) differ significantly among variables belonging to the right side of the maxillary dental arcade; four are mesiodistal dimensions (M1, C, I2, and I1) and four are buccolingual (M1, M2, P4, and P3).

The left side of the mandibular dental arcade has seven of the 28 dimensions (25%) possessing p-values less than alpha. Of these, three are mesiodistal dimensions (M2, M1, and C) and four are buccolingual (M2, M1, P4, and P3). Furthermore, nine of the 28 dimensions (32.14%) belonging to the right side of the mandibular dental arcade are statistically significantly different. These variables include five mesiodistal dimensions (M2, M1, P3, C, and I2) and four buccolingual dimensions (M2, M1, P4, and P3). Overall, all molars and premolars for both dental arcades and right and left sides show significant sexual dimorphism in the buccolingual dimension, whereas all canines show sexual dimorphism in the mesiodistal dimension.

Left-Side Measurements

                              Assumed            Assumed
Tooth &                       Homoscedasticity   Heteroscedasticity   % Sex
dimension   Sex   N    Mean   p-value            p-value              Dimorphism
LM2MD       M     79   9.566    0.000            0.000                7.665
            F     87   8.885
LM2BL       M     77   9.226    0.000            0.000                6.303
            F     74   8.679
LM1MD       M     75   10.454   0.000            0.000                4.363
            F     84   10.017
LM1BL       M     78   9.598    0.000            0.000                4.303
            F     83   9.202
LP4MD       M     86   6.296    0.157            0.150                2.092
            F     102  6.167
LP4BL       M     86   7.272    0.001            0.001                4.558
            F     101  6.955
LP3MD       M     88   6.211    0.699            0.701                0.420
            F     103  6.185
LP3BL       M     87   6.778    0.000            0.000                5.020
            F     103  6.454
LCMD        M     84   6.109    0.000            0.000                5.237
            F     99   5.805
LCBL        M     85   6.015    0.967            0.968                0.083
            F     99   6.010
LI2MD       M     77   5.343    0.097            0.097                2.298
            F     79   5.223
LI2BL       M     85   5.074    0.055            0.059                4.020
            F     95   4.870
LI1MD       M     74   4.820    0.598            0.593                0.795
            F     82   4.782
LI1BL       M     76   4.767    0.376            0.397                1.909
            F     94   4.676
UM2MD       M     79   8.705    0.025            0.024                3.767
            F     80   8.389
UM2BL       M     81   9.640    0.003            0.003                3.634
            F     87   9.302
UM1MD       M     83   10.027   0.000            0.000                6.252
            F     90   9.437
UM1BL       M     82   10.005   0.001            0.001                4.327
            F     91   9.590
UP4MD       M     87   5.871    0.434            0.431                1.033
            F     99   5.811
UP4BL       M     89   7.970    0.000            0.000                4.896
            F     99   7.598
UP3MD       M     89   6.161    0.274            0.274                1.232
            F     102  6.086
UP3BL       M     89   7.803    0.000            0.000                5.990
            F     100  7.362
UCMD        M     86   7.027    0.000            0.000                5.653
            F     100  6.651
UCBL        M     78   6.572    0.628            0.646                0.867
            F     98   6.515
UI2MD       M     67   6.089    0.002            0.002                6.210
            F     83   5.733
UI2BL       M     73   4.978    0.079            0.067                4.339
            F     92   4.762
UI1MD       M     74   7.852    0.000            0.000                6.280
            F     88   7.388
UI1BL       M     80   5.716    0.170            0.142                3.072
            F     94   5.506

Right-Side Measurements

                              Assumed            Assumed
Tooth &                       Homoscedasticity   Heteroscedasticity   % Sex
dimension   Sex   N    Mean   p-value            p-value              Dimorphism
LM2MD       M     75   9.582    0.000            0.000                6.550
            F     75   8.993
LM2BL       M     72   9.169    0.000            0.000                5.829
            F     66   8.664
LM1MD       M     80   10.387   0.001            0.001                4.057
            F     85   9.982
LM1BL       M     78   9.716    0.000            0.000                4.676
            F     84   9.282
LP4MD       M     83   6.470    0.128            0.135                3.437
            F     98   6.255
LP4BL       M     83   7.506    0.003            0.003                5.570
            F     95   7.110
LP3MD       M     84   6.298    0.026            0.027                2.976
            F     101  6.116
LP3BL       M     84   6.940    0.001            0.001                5.120
            F     100  6.602
LCMD        M     79   6.097    0.001            0.002                4.765
            F     96   5.813
LCBL        M     77   6.044    0.496            0.518                1.324
            F     96   5.965
LI2MD       M     72   5.359    0.048            0.050                2.643
            F     86   5.221
LI2BL       M     82   5.055    0.061            0.069                4.332
            F     95   4.836
LI1MD       M     72   4.794    0.749            0.741                0.503
            F     89   4.728
LI1BL       M     70   4.770    0.459            0.486                1.664
            F     92   4.808
UM2MD       M     77   8.626    0.166            0.161                2.228
            F     70   8.438
UM2BL       M     80   9.679    0.006            0.006                3.408
            F     71   9.360
UM1MD       M     85   9.828    0.000            0.000                4.398
            F     92   9.414
UM1BL       M     86   10.071   0.000            0.000                4.341
            F     94   9.652
UP4MD       M     86   5.916    0.682            0.685                0.801
            F     101  5.869
UP4BL       M     86   8.068    0.008            0.008                3.956
            F     102  7.761
UP3MD       M     87   6.165    0.276            0.266                1.431
            F     98   6.078
UP3BL       M     88   7.936    0.000            0.000                6.925
            F     100  7.422
UCMD        M     85   6.965    0.011            0.012                3.063
            F     95   6.758
UCBL        M     85   6.714    0.383            0.388                1.683
            F     88   6.604
UI2MD       M     79   6.028    0.025            0.025                4.653
            F     87   5.760
UI2BL       M     77   4.945    0.686            0.686                1.052
            F     85   4.893
UI1MD       M     78   7.746    0.001            0.001                4.931
            F     92   7.382
UI1BL       M     77   5.617    0.263            0.233                3.027
            F     96   5.447

The percentage of sex dimorphism associated with significantly different variables in the Hazara sample (Figures 7 and 8) ranges from a low of 2.976% (LRP3MD) to a high of 7.665% (LLM2MD). Interestingly, the mesiodistal dimensions show the greatest range of variation in both tooth types and percentage values, exhibiting both the lowest and highest percentages produced by significant dimensions. Significant differences appear more patterned for buccolingual dimensions, affecting only the distal dentition for both arcades and sides and ranging from 3.408% to 6.303%. All canines exhibit significant levels of sex dimorphism in the mesiodistal dimension, ranging from 3.063% (RCMD) to 5.653% (LCMD) for the maxillary dental arcade and from 4.765% (RCMD) to 5.237% (LCMD) for the mandibular dental arcade.

[Bar graph: Mesiodistal Tooth Dimension. Series: % on Right Side, % on Left Side. Teeth plotted: LM2, LM1, LP3, LC, LI2, UI1, UI2, UC, UM1, UM2.]

Figure 7. Bar Graph of Percentages of Statistically Significantly Different Mesiodistal Sexually Dimorphic Dimensions among the Hazara.


[Bar graph: Buccolingual Tooth Dimension. Series: % on Right Side, % on Left Side. Teeth plotted: LM2, LM1, LP4, LP3, UP3, UP4, UM1, UM2.]

Figure 8. Bar Graph of Percentages of Statistically Significantly Different Buccolingual Sexually Dimorphic Dimensions among the Hazara.

Line charts graphing all variables, regardless of their statistical significance, illustrate the overall pattern of the distribution of sexual dimorphism across each arcade. These distributions are presented in Figures 9 through 12, divided by side and arcade.

Overall, sex dimorphism appears to affect both right and left sides similarly for both arcades in the buccolingual dimension, with slight variations in the graphed pattern caused by LP4, UM1, and UM2. Variations in pattern for mesiodistal lengths are caused by LP3 and by the left side of the maxilla, which shows larger values for the first molar, canine, and both incisors but follows the same general trend by tooth type.

Turning to differences observed between the arcades, the right lateral incisors, fourth premolars, and second molars appear to be more dimorphic in the buccolingual dimension for the mandible, while the right second molar, both premolars, and canine are more dimorphic mesiodistally in the mandible and both right incisors are more dimorphic in the maxilla. For the left side of the dental arcades, the second molars are more dimorphic for the mandible in the buccolingual dimension, while the third premolars are more dimorphic for the maxilla in the same dimension. Conversely, in the mesiodistal dimension the left second and first molars show an inverse relationship: the left second molar displays more dimorphism in the mandible than in the maxilla, while the first molar displays less dimorphism in the mandible than in the maxilla.

Figure 9. Line Chart Plotting Percentages of Sex Dimorphism for the Right-Side Mandibular Dentition.


Figure 10. Line Chart Plotting Percentages of Sex Dimorphism for the Left-Side Mandibular Dentition.

Figure 11. Line Chart Plotting Percentages of Sex Dimorphism for the Right-Side Maxillary Dentition.


Figure 12. Line Chart Plotting Percentages of Sex Dimorphism for the Left-Side Maxillary Dentition.


Biological Distance Analyses

Multidimensional Scaling Analyses

Two methods were applied when conducting multidimensional scaling

analyses: Guttman and Lingos’ coefficient of alienation (1968) and Kruskal’s

(1964) stress formula No. 1. These data reduction techniques yield X, Y, and Z

values along orthogonal axes for each group that have been plotted to provide visual

representation of inferred biological distance. As noted earlier, in such plots, groups

possessing small pairwise distance values have similar coordinate values and therefore

plot closely together (Scott and Turner 1997). Conversely, groups that possess little to

no affinities to other groups included in the analysis will be plotted according to unique

values that serve to isolate them spatially from other groups.

Guttman’s Method

The results of multidimensional scaling into three dimensions with

Guttman’s coefficient of alienation (1968) are illustrated in Figure 13. The three

dimensional matrix produced by this statistical technique yielded a stable solution

after 112 iterations with a stress of 0.082 and explains 97.7% of the observed variance among samples. The values for each dimension are provided in Table 9.


Figure 13. Results Obtained from Guttman’s Method for Multidimensional Scaling.

Table 9. Values for each Dimensional Axis Generated by Guttman’s (1968) Method for Multidimensional Scaling.

                                         Dimensional Axes
Sample Names            Abbreviations    One      Two      Three
Khower                  KHO              0.601    -0.293   0.035
Geoksyur                GKS              1.196    0.133    0.109
Altyn Depe              ALT              0.208    -1.424   0.526
Pakanati Reddis         PNT              -0.433   0.081    0.117
Gompadhompti Madigas    GPD              -0.655   0.087    0.135
Chenchus                CHU              0.034    -0.334   0.636
Vaghela Rajputs         RAJ              -0.453   -0.195   0.311
Gariasias               GRS              -0.498   -0.091   0.520
Bhils                   BHI              -0.651   0.151    0.513
Neolithic Mehrgarh      NeoMRG           -0.005   0.371    0.101
Chalcolithic Mehrgarh   ChlMRG           -0.257   0.450    -0.334
Harappa                 HAR              -0.198   0.858    0.200
Timargarha              TMG              0.382    1.020    0.322
Sarai Khola             SKH              0.569    0.546    0.559
Inamgaon                INM              -0.949   1.034    -0.045
Djarkutan               DJR              0.697    -0.365   -0.308
Kuzali                  KUZ              0.927    0.667    -0.655
Molali                  MOL              0.872    0.204    -0.702
Sapalli Tepe            SAP              0.981    0.091    -0.120
Madaklasht              MDK              0.099    0.022    -0.025
Swati                   SWT              -0.108   -0.214   0.034
Wakhi (Gulmit)          WAKg             0.121    -0.520   -0.109
Wakhi (Sost)            WAKs             0.225    -0.435   -0.106
Hazara                  HAZ              -2.687   -0.173   -0.625
Burusho                 BUR              -0.157   -0.495   -0.207
Shin (Astor)            SHIa             0.099    -0.733   -0.352
Shin (Other)            SHIo             0.041    -0.444   -0.529

The results of this analysis identify the Hazara (HAZ) as a peripheral outlier

located in the far left of the array, with only distant affinities to the three

peninsular Indian samples from Gujarat (BHI, GRS, RAJ). This isolated phenetic

position suggests the Hazara represent a distinctly different population

biologically from all of the South Asian and Central Asian samples included in

this analysis. Such phenetic isolation may be the consequence of either a recent

migration into the northern portion of Pakistan, as maintained by the Hazara, or

the presence of some extremely effective culturally-based genetic isolating mechanism. By dramatic contrast, all other Pakistani highlander groups plot closely together in the center-right of the array, indicating fairly close biological affinities. Closest phenetic affinities occur between the two Wakhi samples (WAKg, WAKs), with the Wakhis from Sost and the Shin sample from Astore (SHIa) also plotted closely together. The Shin sample from Gilgit and Haramosh (SHIo) is linked to the Shin sample collected from Astore (SHIa), but it is pulled upward and to the left into a unique phenetic position away from the other highlander samples. Swatis (SWT) and the inhabitants of Madaklasht (MDK) share close affinities to one another, but they possess very different secondary affinities. Swatis (SWT) are linked to the other highland groups via the Burusho (BUR), while the Madaklasht possess secondary affinities to prehistoric inhabitants of the Indus Valley, particularly to the earliest sample from the aceramic Neolithic levels at Mehrgarh (NeoMRG). The Khowars (KHO) occupy a phenetic space between the Wakhis from Sost (WAKs) and the two earliest samples of prehistoric Central Asians, Sapalli tepe (SAP) and the Djarkutan period occupants of Djarkutan (DJR).
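The contrast between the Hazara and the closely grouped highlander samples can be illustrated by computing straight-line distances between points in the three-dimensional solution. The sketch below uses only a subset of the coordinates reported in Table 9, chosen for the example, and is not a substitute for the full analysis.

```python
import math

# Coordinates (Dimension One, Two, Three) copied from Table 9 for a
# subset of samples.
coords = {
    "HAZ": (-2.687, -0.173, -0.625),
    "BHI": (-0.651, 0.151, 0.513),
    "GRS": (-0.498, -0.091, 0.520),
    "RAJ": (-0.453, -0.195, 0.311),
    "WAKg": (0.121, -0.520, -0.109),
    "WAKs": (0.225, -0.435, -0.106),
}


def distance(a, b):
    """Euclidean distance between two samples in the 3-D configuration."""
    return math.dist(coords[a], coords[b])


# Distance between the two Wakhi samples.
wakhi_gap = distance("WAKg", "WAKs")

# Smallest distance from the Hazara to any other sample in this subset.
nearest_to_haz = min(distance("HAZ", name) for name in coords if name != "HAZ")
```

Within this subset, the two Wakhi samples lie roughly 0.13 units apart, while no sample lies within 2 units of the Hazara.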

With the exception of the Namazga V period sample from Altyn depe

(ALT), which is found in a highly isolated position in the lower center of the

array, the remaining prehistoric samples from Central Asia occupy positions on

the extreme right side of the array. The temporally distinct sample from Djarkutan,

dating to the Molali (MOL) Period, occupies a phenetic space somewhat displaced to

the forefront relative to the samples from Sapalli tepe (SAP) and the temporally

distinct sample from the Djarkutan Period (DJR), while the Kuzali (KUZ) Period sample

occupies a somewhat isolated phenetic position near the phenetic position occupied by

Pakistani highland samples, but with a lower score on the Y-axis (Dimension Two), thereby offsetting it to the forefront. The Namazga III Period inhabitants of Geoksyur

(GKS) show closest affinities with the earliest Bactrian sample, Sapalli tepe (SAP), but

with the highest score for any of the samples for the Z-axis (Dimension Three), the

sample from Geoksyur occupies an isolated position in the extreme upper right of the

array.

Located in the center of the array, all of the peninsular Indian samples exhibit close phenetic affinities to one another, with one exception, the Chenchus (CHU), a Dravidian-speaking tribal sample that occupies an isolated phenetic position in the lower center of the array. The remaining peninsular Indian samples occupy phenetic positions that are in accordance with language and social position (i.e., Hindu caste vs. tribal status). The Dravidian-speaking low-status caste Hindu Gompadhompti Madigas (GPD) share closest affinities to the sample of Dravidian-speaking high-status caste Hindu Pakanati Reddis (PNT). Interestingly, these Dravidian-speaking, high-status caste Hindu Pakanati Reddis (PNT) share secondary affinities to the Indo-Aryan-speaking, high-status caste Vaghela Rajputs (RAJ). Vaghela Rajputs are most closely related to the Indo-Aryan-speaking low-caste sample of Garasias (GRS), who are also positioned proximally to the Indo-Aryan-speaking Bhil tribals (BHI).

The prehistoric samples from the Indus Valley show reasonable regional continuity, for all are located in the upper center of the array. With a single exception, the

Chalcolithic inhabitants of Mehrgarh, the Indus Valley samples are arranged in

chronological order. That is, the Neolithic sample from Mehrgarh (NeoMRG) links to

the sample of Late Chalcolithic occupants of Harappa (HAR), which links to the Late

Bronze/Early Iron Age Gandharan Grave Culture sample from Timargarha (TMG),


which in turn links to Sarai Khola (SKH), the latest of the prehistoric Indus Valley

samples. The sample collected from Timargarha (TMG) shares distant affinities with

the Iron Age Sarai Khola (SKH) sample. The Early Chalcolithic inhabitants of Mehrgarh,

while they share most proximate affinities with the earlier Neolithic inhabitants of this

site, possess phenetic affinities that are not particularly close and they depart along a

unique vector away from all other prehistoric samples from the Indus Valley. This

phenetic isolation of Chalcolithic Mehrgarh may reflect the arrival of a new population

into the North Kachi Plain during the fifth millennium BC. The prehistoric peninsular

Indian inhabitants of Inamgaon (INM) show distant affinities to the Late Chalcolithic

sample from Harappa (HAR), occupying a relatively isolated space in the upper left of

the array.

Kruskal’s Method

The results of multidimensional scaling with Kruskal and Lingos’ (1964) stress formula No. 1 are provided in Figure 14. The reduction of the matrix of pairwise distances into three dimensions by this technique was obtained in 26 iterations with a stress of 0.064 and explains 98.1% of the observed variance among samples. The values for each dimension are provided in Table 10.
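For readers unfamiliar with the stress statistic, the following is a minimal sketch of Kruskal's stress formula 1. It assumes the common definition in which stress is the square root of the summed squared differences between the configuration distances and the fitted disparities, divided by the summed squared configuration distances. The function name and the toy values are hypothetical and are not drawn from the thesis data.

```python
import math

def kruskal_stress1(distances, disparities):
    """Kruskal's stress formula 1 for paired lists of values.

    distances   -- inter-point distances in the reduced configuration
    disparities -- fitted (monotone-regressed) values for the same pairs
    """
    if len(distances) != len(disparities):
        raise ValueError("inputs must have the same length")
    num = sum((d - dh) ** 2 for d, dh in zip(distances, disparities))
    den = sum(d ** 2 for d in distances)
    return math.sqrt(num / den)

# Hypothetical toy values, for illustration only (not thesis data).
config_d = [1.0, 2.0, 3.0]
fitted_d = [1.1, 1.9, 3.0]
stress = kruskal_stress1(config_d, fitted_d)
```

Under Kruskal's own rough guidelines, a stress near 0.05 is usually described as a good fit, so a value such as the 0.064 reported here would generally be read as falling between good and fair.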


Figure 14. Results Obtained from Kruskal’s Method for Multidimensional Scaling.

Table 10. Values for each Dimensional Axis Generated by Kruskal’s Method for Multidimensional Scaling.

                                        Dimensional Axes
Sample Name             Abbreviation    One       Two       Three
Khowar                  KHO              0.578    -0.275     0.062
Geoksyur                GKS              1.049     0.254     0.554
Altyn Depe              ALT              0.157    -1.351    -0.684
Pakanati Reddis         PNT             -0.449     0.085    -0.016
Gompadhompti Madigas    GPD             -0.641     0.101     0.039
Chenchus                CHU             -0.047    -0.310    -0.613
Vaghela Rajputs         RAJ             -0.502    -0.170    -0.149
Garasias                GRS             -0.578    -0.056    -0.317
Bhils                   BHI             -0.731     0.173    -0.304
Neolithic Mehrgarh      NeoMRG          -0.029     0.375     0.032
Chalcolithic Mehrgarh   ChlMRG          -0.224     0.432     0.418
Harappa                 HAR             -0.184     0.801    -0.376
Timargarha              TMG              0.359     1.009    -0.348
Sarai Khola             SKH              0.474     0.791     0.136
Inamgaon                INM             -0.913     1.064     0.124
Djarkutan               DJR              0.628    -0.305     0.498
Kuzali                  KUZ              1.176     0.393    -0.515
Molali                  MOL              1.164     0.018    -0.044
Sapalli Tepe            SAP              0.984     0.120     0.173
Madaklasht              MDK              0.133     0.032    -0.019
Swati                   SWT             -0.110    -0.213     0.012
Wakhi (Gulmit)          WAKg             0.095    -0.532     0.110
Wakhi (Sost)            WAKs             0.222    -0.427     0.061
Hazara                  HAZ             -2.771    -0.187     0.260
Burusho                 BUR             -0.109    -0.525     0.196
Shin (Astor)            SHIa             0.150    -0.764     0.255
Shin (Other)            SHIo             0.121    -0.532     0.453
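As a rough reading aid for Table 10, the sketch below computes each sample's Euclidean distance from the centroid of the three-dimensional configuration. This is an illustration only and not the procedure used in the thesis: the affinities discussed in the text are read from a minimum spanning tree, and distances in the reduced configuration only approximate the original pairwise distances. The variable and function names are hypothetical.

```python
import math

# Coordinates copied from Table 10 (Dimension One, Two, Three).
coords = {
    "KHO": (0.578, -0.275, 0.062), "GKS": (1.049, 0.254, 0.554),
    "ALT": (0.157, -1.351, -0.684), "PNT": (-0.449, 0.085, -0.016),
    "GPD": (-0.641, 0.101, 0.039), "CHU": (-0.047, -0.310, -0.613),
    "RAJ": (-0.502, -0.170, -0.149), "GRS": (-0.578, -0.056, -0.317),
    "BHI": (-0.731, 0.173, -0.304), "NeoMRG": (-0.029, 0.375, 0.032),
    "ChlMRG": (-0.224, 0.432, 0.418), "HAR": (-0.184, 0.801, -0.376),
    "TMG": (0.359, 1.009, -0.348), "SKH": (0.474, 0.791, 0.136),
    "INM": (-0.913, 1.064, 0.124), "DJR": (0.628, -0.305, 0.498),
    "KUZ": (1.176, 0.393, -0.515), "MOL": (1.164, 0.018, -0.044),
    "SAP": (0.984, 0.120, 0.173), "MDK": (0.133, 0.032, -0.019),
    "SWT": (-0.110, -0.213, 0.012), "WAKg": (0.095, -0.532, 0.110),
    "WAKs": (0.222, -0.427, 0.061), "HAZ": (-2.771, -0.187, 0.260),
    "BUR": (-0.109, -0.525, 0.196), "SHIa": (0.150, -0.764, 0.255),
    "SHIo": (0.121, -0.532, 0.453),
}

def euclidean(a, b):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Centroid of the configuration (mean of each dimension).
n = len(coords)
centroid = tuple(sum(c[i] for c in coords.values()) / n for i in range(3))

# Distance of every sample from the centroid, and the most remote sample.
from_centroid = {name: euclidean(xyz, centroid) for name, xyz in coords.items()}
farthest = max(from_centroid, key=from_centroid.get)
```

On these coordinates the most remote sample is the Hazara, which is consistent with their description as an extreme peripheral outlier.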

Once again, the results indicate that the Hazara (HAZ) sample is an

extreme peripheral outlier located in the far left of the array, sharing only very

distant affinities to the sample of Dravidian-speaking low-status caste Hindu

Gompadhompti Madigas (GPD) instead of the Indo-Aryan-speaking Bhil tribals (BHI) as indicated in the previous plot. The other Pakistani highlander groups are plotted in the right front of the array. Somewhat different from the results produced by Guttman’s (1968) method, Kruskal’s method identifies the two

Wakhi samples (WAKg and WAKs) as sharing closest affinities with one another, as do the two Shin samples (SHIa, SHIo). Of the highland samples, the Khowars share closest affinities to prehistoric Central Asians, but this time, closest affinities are with the earliest of these prehistoric Central Asian


samples, Sapalli tepe (SAP). The Burusho (BUR) sample occupies a

somewhat isolated position intermediate between the Wakhi sample from

Gulmit (WAKg) on the one hand, and Swatis (SWT) on the other. Swatis and the inhabitants of Madaklasht (MDK) occupy more distant phenetic positions relative to other Pakistani highlander samples in the array yielded by Kruskal’s method than in the array yielded by Guttman’s. Nevertheless, the two samples are identified as possessing the same secondary affinities to non-highlander samples. That is, the inhabitants of Madaklasht are identified as possessing rather distant affinities to the Neolithic inhabitants of Mehrgarh, while Swatis are identified as possessing rather distant affinities to high-status Hindu caste Vaghelia Rajputs from Gujarat.

The Central Asian sample from Altyn depe (ALT) is plotted within the same general phenetic space as the other Central Asian samples on the far right, but the minimum spanning tree identifies its closest affinities as occurring with the Burusho (BUR), rather than with any of the other samples from Central Asia. Such results suggest that Altyn depe represents a phenetic outlier with little to no affinities to the other samples included in this analysis. The temporally distinct samples from

Djarkutan (DJR, MOL and KUZ), the Namazga III Period inhabitants of Geoksyur

(GKS), and the Bactrian sample collected from Sapalli tepe (SAP) all share the

same affinities illustrated previously by Guttman’s method.

The living peninsular Indian samples are plotted closely together, in front of the prehistoric Indus Valley samples and to the left of the Pakistani highlanders and prehistoric Central Asians. Once again, the Dravidian-speaking low-status

caste Hindu Gompadhompti Madigas (GPD) share closest affinities to the sample of Dravidian-speaking high-status caste Hindu Pakanati Reddis (PNT). Pakanati

Reddis (PNT) share secondary affinities to the Indo-Aryan-speaking, high-status caste

Vaghelia Rajputs (RAJ). The Vaghelia Rajputs (RAJ) sample is, once again, most

closely related to the Indo-Aryan-speaking low-caste sample of Garasias (GRS). The

Garasias (GRS) also share affinities to the Indo-Aryan-speaking Bhil tribals (BHI). In a

marked departure from the results obtained with Guttman’s (1968) method, Dravidian-

speaking tribal Chenchus are not identified as an isolated peripheral outlier. Instead,

Kruskal’s method identifies them as possessing somewhat distant affinities to the

Garasias (GRS), a low-status Indo-Aryan-speaking Hindu caste of Gujarat.

The results of Kruskal’s (1964) method are similar to the results produced by Guttman’s (1968) method with regard to the prehistoric samples from the

Indus Valley. Once again, Indus Valley samples are arranged in chronological order, with one notable exception, the Early Chalcolithic inhabitants of Mehrgarh

(ChlMRG). The Neolithic inhabitants of Mehrgarh (NeoMRG) share distant affinities to the Late Chalcolithic sample from Harappa (HAR), which links to the Late

Bronze/Early Iron Age Gandharan Grave Culture sample collected from Timargarha

(TMG), which in turn links to Sarai Khola (SKH), the latest of the prehistoric Indus

Valley samples. Intriguingly, Sarai Khola occupies an isolated phenetic position in the

upper right of the array. The prehistoric peninsular Indian inhabitants of Inamgaon

(INM), once again, show distant affinities to the Harappa (HAR) sample, occupying a relatively isolated space in the upper far left portion of the array.


Principal Co-ordinates Analysis

Principal co-ordinates analysis is used to provide a check on the results obtained by the other data reduction techniques. Principal co-ordinates analysis yields three co-ordinate axes that combine to account for 88% of the variance among samples. A plot of the scores for the first three co-ordinate axes, with a minimum spanning tree imposed, is provided in Figure 15.

Figure 15. Results Obtained from the Principal Co-ordinates Analysis.
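The core step of principal co-ordinates analysis can be sketched as follows, assuming the classical (Gower) formulation in which the squared distance matrix is double-centred and the resulting matrix is then eigen-decomposed. The eigen-decomposition itself is omitted here, and the function names and toy distances are hypothetical rather than taken from the thesis.

```python
def double_center(dist):
    """Gower's double-centring of a symmetric distance matrix.

    Returns B = -1/2 * J * D^2 * J, whose eigenvectors (scaled by the
    square roots of their eigenvalues) give the principal co-ordinates.
    """
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row = [sum(r) / n for r in sq]   # row means (equal to column means)
    grand = sum(row) / n             # grand mean
    return [[-0.5 * (sq[i][j] - row[i] - row[j] + grand) for j in range(n)]
            for i in range(n)]

def variance_explained(eigenvalues, k=3):
    """Share of the positive eigenvalue total carried by the first k axes."""
    pos = sorted((e for e in eigenvalues if e > 0), reverse=True)
    return sum(pos[:k]) / sum(pos)

# Hypothetical toy example: three points lying on a line at 0, 3 and 4.
toy = [[0, 3, 4],
       [3, 0, 1],
       [4, 1, 0]]
B = double_center(toy)
```

A figure such as the 88% reported for the first three axes is typically a ratio of the kind returned by `variance_explained`, although the exact convention applied by the software used in the thesis is not stated in this section.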


Once again, the Hazara (HAZ) sample is located in the far left of the array, completely isolated from all of the other samples included in this analysis.

Instead of showing distant relations to living peninsular Indian samples, as was observed in the plots yielded by multidimensional scaling, the Hazara are identified here as possessing extremely peripheral affinities to the Chalcolithic inhabitants of Mehrgarh (ChlMRG). The prehistoric Indus Valley samples form an aggregate located centrally and in the forefront of the array. Similar to the results of the multidimensional scaling analyses, the Late Jorwe period

sample from west-central peninsular India (INM) shows a distant affinity to the Late

Chalcolithic sample from Harappa (HAR), occupying a relatively isolated position in the lower left of the array towards the front. The Late Bronze/Early Iron Age Gandharan

Grave Culture sample collected from Timargarha (TMG) is, once again, located in the

phenetic space between the Late Chalcolithic sample from Harappa (HAR) and the

sample from Iron Age Sarai Khola (SKH), showing slightly closer affinities to Sarai

Khola (SKH) than to Harappa (HAR). Sarai Khola (SKH) occupies a phenetic position

intermediate between Timargarha (TMG) on the one hand and the sample from the

Central Asian BMAC urban center of Sapalli tepe (SAP) on the other.

All prehistoric Central Asian samples exhibit fairly close affinities to one

another and occupy the far right portion of the array, with one exception, the

Namazga V period sample from Altyn depe (ALT), which is located in the upper-center of the array, exhibiting apparent close affinities to the highland sample of Burushos (BUR). The temporally distinct sample from Djarkutan dating to the

Djarkutan Period (DJR), once again, shows distant affinities to the Pakistani highland

sample of Khowars (KHO). With the exception of the Hazara (HAZ), samples of living

Pakistani highlanders occupy the upper-center of the array. Within this regional aggregate, patterns of phenetic affinity are far different from those identified by multidimensional scaling. Unlike multidimensional scaling, which shows the two Wakhi samples (WAKg and WAKs) as possessing closest affinities to one another, as do the samples of Shin, principal co-ordinates analysis identifies Wakhis from Sost (WAKs) as possessing closest affinities to Shin from Gilgit and Haramosh (SHIo), while Wakhis from Gulmit (WAKg) are identified as possessing closest affinities to Shin from Astore

(SHIa). Even more striking are the phenetic affinities identified for Swatis (SWT) and for the inhabitants of Madaklasht (MDK). Both multidimensional scaling plots identified these two samples as possessing closest affinities to one another, but principal co-ordinates analysis identifies them as the two highland groups with the greatest phenetic distance from one another. The Swatis (SWT) are identified as possessing rather close affinities to the Burusho (BUR) on the one hand and distant affinities to Dravidian-speaking high-status Pakanati Reddis (PNT) from Andhra

Pradesh on the other. By contrast, the inhabitants of Madaklasht (MDK) are identified as possessing rather distant affinities to Khowars (KHO).

Peninsular Indian samples are found to the left of the Pakistani highlander groups, in the upper region of the array. Once again, the Dravidian-speaking low-status caste Hindu Gompadhompti Madigas (GPD) share closest affinities to the Dravidian-speaking high-status caste Hindu Pakanati Reddis (PNT), but in opposition to the results obtained by multidimensional scaling, the Gompadhompti Madigas (GPD) are also identified as possessing close affinities to the Indo-Aryan-speaking Bhil tribals (BHI). The Garasias (GRS) also share affinities to the Dravidian-speaking high-status caste Hindu Pakanati Reddis (PNT) on the one hand and to high-status Indo-Aryan-speaking Vaghelia Rajputs (RAJ) on the other. Dravidian-speaking tribal Chenchus (CHU) are identified as occupying an isolated position in the upper-center of the array with only very distant affinities to Vaghelia Rajputs (RAJ).


Cluster Analyses

Hierarchical Cluster Analysis with Ward’s Linkage

Hierarchical cluster analysis using Ward’s (1963) linkage yields a dendrogram in which the samples are patterned geographically and temporally. The Hazara are identified as a distinct outlier with very distant and peripheral affinities to living peninsular Indian populations and prehistoric Indus Valley samples. Prehistoric Central Asian samples and living Pakistani highlander samples each cluster first among themselves and then secondarily with each other, while the living peninsular Indian and prehistoric Indus Valley samples likewise cluster first among themselves and then secondarily with each other (Figure 16).

The only two exceptions to this geographic patterning are: 1) the southeastern peninsular Indian sample of living Chenchu tribals (CHU), which shows closest affinities to living Pakistani highlander Swatis (SWT) and the Indo-Iranian-speaking inhabitants of Madaklasht (MDK); and 2) the living Pakistani highlander Hazara (HAZ) group, who do not share close affinities to any of the comparative samples. Thus, with only two exceptions, the results of the hierarchical cluster analysis yield four geographico-temporal aggregates that may be identified as: prehistoric Central Asians, prehistoric inhabitants of the Indus Valley, living peninsular Indians, and living Pakistani highlanders.


Figure 16. Results of the Hierarchical Cluster Analysis using Ward’s (1963) Linkage.
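The merging rule behind Ward's linkage can be sketched as follows. At each step the two clusters whose union produces the smallest increase in the within-cluster sum of squares are joined. This is a minimal illustration on hypothetical two-dimensional points; the thesis analysis was presumably run on the full set of tooth size variables in a statistical package, and the function names here are assumptions.

```python
def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    na, nb = len(a), len(b)
    dims = range(len(a[0]))
    ca = [sum(p[i] for p in a) / na for i in dims]   # centroid of a
    cb = [sum(p[i] for p in b) / nb for i in dims]   # centroid of b
    sq = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return na * nb / (na + nb) * sq

def ward_cluster(points):
    """Agglomerate until one cluster remains; return the merge history."""
    clusters = {name: [xy] for name, xy in points.items()}
    history = []
    while len(clusters) > 1:
        keys = list(clusters)
        # Evaluate every pair and take the cheapest merge.
        pairs = [(ward_cost(clusters[a], clusters[b]), a, b)
                 for i, a in enumerate(keys) for b in keys[i + 1:]]
        cost, a, b = min(pairs)
        history.append((a, b, cost))
        clusters[a + "+" + b] = clusters.pop(a) + clusters.pop(b)
    return history

# Hypothetical toy samples forming two tight pairs far apart.
points = {"A": (0.0, 0.0), "B": (0.0, 1.0), "C": (5.0, 0.0), "D": (5.0, 1.0)}
history = ward_cluster(points)
```

In the toy data the two tight pairs merge first and are joined to each other only at a much higher cost, which is the same "first among themselves, then secondarily with each other" pattern that a dendrogram displays.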

All of the prehistoric Central Asian samples group together, except for the

Namazga V period sample from Altyn depe (ALT), which is the westernmost sample included in this analysis. Instead, the Mid- to Late Bronze Age inhabitants of this urban center found on the Kopet Dagh foothill plain of southern Turkmenistan do not share close affinities to any of the samples and are marked by only distant affinities to living

Pakistani highlander groups. Such affinities corroborate the results obtained from previous craniometric studies (Hemphill, 2013; Hemphill and Mallory, 2004), which


suggest the inhabitants of this site were involved in trade and exchange networks with populations to the south and to the west (such as Tepe Hissar), rather than with populations located to the east in Bactria or to the southeast in the Indus Valley (Barton and Hemphill 2011). One sub-clade within this Central Asian aggregate includes two temporally distinct samples from Djarkutan dating to the Molali (MOL) and Kuzali

Periods (KUZ). Another sub-clade is composed of the Middle Bronze Age Namazga III inhabitants of Geoksyur (GKS), located in the Tedjen River Delta of southeastern Turkmenistan, and the sample recovered from Sapalli tepe (SAP), the oldest of the samples from southern Uzbekistan. These two samples share secondary and equal affinities to the Djarkutan Period sample from Djarkutan (DJR) and to the sample of living northern Pakistani, Indo-Aryan-speaking Khowars (KHO) from Chitral District.

All living northern Pakistani samples group together, except for the Khowars

(KHO) who share closest affinities with the Djarkutan Period sample from Djarkutan.

Similar to results found in previous studies (O’Neill 2013), close affinities are demonstrated between the two Wakhi samples (WAKg and WAKs) with secondary affinities between both Wakhi samples and the Shin sample from Astore. The Shin samples collected from Gilgit and Haramosh share closest affinities to the Burushaski-speaking Burusho (BUR) and secondary affinities to the Shin sample collected from

Astore and the Wakhi (WAKg and WAKs).

Keeping in mind the previously mentioned exception of the sample of tribal

Chenchus, other living samples from peninsular India form a sub-clade that reflects both linguistic affinities and geographic distance. That is, the Dravidian-speaking high-status caste Hindu Pakanati Reddis (PNT) sample shares closest affinities to the


Dravidian-speaking low-status caste Hindu Gompadhompti Madigas (GPD) sample

while the Indo-Aryan-speaking low-caste sample of Garasias (GRS) from western India

and the Indo-Aryan-speaking high-status caste Vaghelia Rajputs (RAJ) share closest

affinities to one another. Indo-Aryan-speaking Bhil tribals (BHI) share secondary

affinities to both the high-status caste Vaghelia Rajputs (RAJ) and the low-caste

Garasias (GRS) sample.

Overall continuity is demonstrated among the prehistoric Indus Valley samples.

The prehistoric inhabitants of Inamgaon (INM), which is located in peninsular India,

share closest affinities to the Late Chalcolithic period sample from Harappa (HAR).

These two groups form a sub-clade that shows secondary affinities to the sub-clade

formed by the Neolithic (NeoMRG) and Chalcolithic (ChlMRG) inhabitants of Mehrgarh.

The Late Bronze/Early Iron Age Gandharan Grave Culture sample from Timargarha

(TMG) and Iron Age Sarai Khola (SKH) samples share closest affinities with each

other.

Neighbor-joining Cluster Analysis

The results of neighbor-joining cluster analysis illustrate relationships between

samples that are similar, but not identical, to those produced by the hierarchical cluster

analysis using Ward’s linkage (Figure 17). All prehistoric Central Asian samples form a

regional aggregate, except Altyn Depe (ALT), which stands apart as a distant outlier.

Similar to hierarchical cluster analysis, the two latest samples from Djarkutan, Molali

(MOL) and Kuzali Period (KUZ), share closest affinities to one another, while the

earliest sample from Bactria, Sapalli tepe (SAP), has closest affinities to the Namazga

III Period inhabitants of Geoksyur (GKS). This latter relationship was also found in an earlier craniometric investigation, which led Hemphill (1999) to posit that the initial settlers of the north Bactrian oasis may have been refugees from the desiccated Tedjen River Delta. The Djarkutan period (DJR) sample from Djarkutan shares affinities with the Indo-Aryan-speaking Khowars (KHO) from northern Pakistan.
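The first step of neighbor-joining can be sketched as follows, assuming the standard Q-criterion, in which the pair with the lowest value of Q(i, j) = (n - 2) d(i, j) - sum of row i - sum of row j is joined first. The distance matrix is a hypothetical five-sample toy example and the function name is an assumption; neither is taken from the thesis.

```python
def nj_first_join(names, d):
    """Return (Q, name_i, name_j) for the pair neighbor-joining merges first."""
    n = len(names)
    totals = [sum(row) for row in d]   # summed distance of each sample to all others
    best = None
    for i in range(n):
        for j in range(i + 1, n):
            q = (n - 2) * d[i][j] - totals[i] - totals[j]
            if best is None or q < best[0]:
                best = (q, names[i], names[j])
    return best

# Hypothetical toy distance matrix (symmetric, zero diagonal).
names = ["a", "b", "c", "d", "e"]
d = [[0, 5, 9, 9, 8],
     [5, 0, 10, 10, 9],
     [9, 10, 0, 8, 7],
     [9, 10, 8, 0, 3],
     [8, 9, 7, 3, 0]]
q, left, right = nj_first_join(names, d)
```

Note that the pair joined first (a and b) is not the pair with the smallest raw distance (d and e), because the criterion also weighs how far each sample lies from all the others. This is one reason a neighbor-joining tree can differ in detail from a Ward's linkage dendrogram built from the same data.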

Figure 17. Results produced by the Neighbor-joining Tree Cluster Analysis.

Neighbor-joining cluster analysis depicts a pattern of affinities among Pakistani highlander samples that is similar to that obtained from hierarchical cluster analysis with Ward’s (1963) method. The Khowars (KHO) are identified as an outlier to this regional aggregate. Swatis (SWT) and the inhabitants of Madaklasht (MDK) show closest affinities to one another and only possess distant affinities to the other groups from the Karakoram and Hindu Kush highlands. Close affinities are demonstrated between the two geographically distinct Wakhi samples (WAKg and WAKs) and the

Shin sample from Astore (SHIa), while the Shin sample collected from Gilgit and

Haramosh (SHIo) shares closest affinities with the Burushaski-speaking Burusho (BUR) sample, both of which have secondary affinities to the two Wakhi samples and the Shin sample from Astore (SHIa). Chenchu tribals (CHU) are identified as an outlier with no close affinities to any of the other samples included in this analysis. The Hazara (HAZ) sample is also identified as an outlier to all northern Pakistani groups, with very distant affinities to the Late Chalcolithic sample from Harappa (HAR).

Located in the upper right of the array, five of the six peninsular Indian samples

exhibit closer affinities to one another than to samples from the other regions of South

Asia. Within this aggregate, the Dravidian-speaking caste Hindu samples from Andhra

Pradesh (PNT and GPD) show closest affinities to one another, as do the three Indo-

Aryan-speaking samples from Gujarat, located in northwestern India (RAJ, GRS and

BHI). Once again, the prehistoric Indus Valley samples form a regional aggregate, in

which the two earliest samples (NeoMRG, ChlMRG) share closest affinities to one

another, while affinities among the three later samples (HAR, TMG, SKH) are more


diffuse. The Late Jorwe period sample from west-central India (INM) is distantly associated with these prehistoric samples from the Indus Valley.


Discussion

Assessment of Intra- and Inter-observer Error

Before patterns of dental asymmetry, sexual dimorphism or biological affinities

could be assessed, the potential confounding factors of inter- and intra-observer error

had to be determined. The results of inter- and intra-observer error tests indicate little to

no significant influence upon the results of this research, for only a few variables show statistically significant differences. The inter-observer error tests indicate that three of the 28 (10.7%) metric variables considered in this analysis differ significantly between Hemphill and me. These variables are LM2MD, UM1MD and LM1BL. Intra-observer tests indicate a slightly higher rate of discordance, for four of the 28 (14.3%) variables were found to differ significantly between the two measurement sessions. These are LM2BL, LM1BL, UM1BL, and UP3BL. The higher rate of intra-observer error is believed to reflect the current researcher’s learning curve, as all Hazara dental casts were measured prior to conducting the measurements involved in the inter-observer error tests.

Intriguingly, while the disagreement between Hemphill and me yielded no evident pattern by jaw or dimension, all involved the molar teeth. By contrast, significant mensurational disagreements between my two bouts of measuring did yield a consistent pattern: all involved buccolingual dimensions. I was surprised by this finding, for when I was measuring I was especially concerned about the accuracy of the mesiodistal dimensions, especially for the premolar teeth, since it was often difficult to place the points in the interstitial spaces between adjacent teeth. Yet,

none of the mesiodistal dimensions yielded a significant difference between

measurement sessions. I suspect that significant differences arose between bouts for the buccolingual dimensions because the calipers were held at slightly different orientations to the occlusal surfaces of the teeth during the two measurement sessions. Overall, however, the rates of discordance yielded by these analyses of inter- and intra-observer error fall within the range of previous analyses (O’Neill, 2013; Willis, 2010) and therefore did not compromise additional analyses.

Accordingly, patterns of dental asymmetry, sexual dimorphism, and biological affinities, and how they relate to current genetic studies, are discussed below.

Patterns of Dental Asymmetry

The results of the statistical analyses employed to discern patterns of dental

asymmetry have many implications. First, the pattern of dental asymmetry expression

suggests similar plasticity rates during ontogenesis for both mesiodistal and

buccolingual dimensions. Similar findings were reported by Harris and Nweeia (1980)

who analyzed 57 Ticuna Indians of Colombia. With an F value of 0.02 when pooled across tooth type, sex and arcade, the results of the one-way analysis of variance (ANOVA) indicate that there is no difference in the magnitude of fluctuating asymmetry between mesiodistal and buccolingual dimensions. However, the results of three-way ANOVAs of the mesiodistal and buccolingual dimensions show significant differences for all variables in the mesiodistal dimension, whereas only one variable, tooth type, is significant in the buccolingual dimension, suggesting that the buccolingual dimension exhibits less asymmetry (Harris and Nweeia 1980:136-7).

Second, although the occurrence of asymmetry does not preferentially affect tooth dimensions (mesiodistal or buccolingual), it does preferentially affect different dental arcades for males than for females. Laboratory experiments conducted by Siegel and Doyle (1975) suggest the dentition of each dental arcade responds differently to stress. The results of their analyses indicated the upper molars exhibit more asymmetry in the buccolingual dimension whereas the mandibular dentition shows more variability in the mesiodistal dimension when exposed to the same stress.

The results of the current study also suggest different plasticity rates during odontogenesis for the maxilla and mandible distinguish males and females. For Hazara females, the majority of dimensions that differ significantly between right and left sides occur in the mandible while the majority of differences between antimeres among

Hazara males occur in the maxilla, only adhering partially to the pattern found by other researchers (Harris and Nweeia 1980). Harris and Nweeia (1980) found the maxillary dental arcade exhibited more asymmetry than the mandibular in both male and female

Ticuna Indians in the mesiodistal dimension. The same was demonstrated to be true for females regarding the buccolingual dimension; however, the males showed slightly higher variability for the mandible in the buccolingual dimension (Harris and Nweeia 1980:138).

Third, a formula to determine directional asymmetry indicates that all significantly different variables show the right antimere to be larger than the left, with the exception of the UM1MD dimension among males. This is similar to the results obtained by Guatelli-Steinberg and coworkers (2006), who analyzed dental casts of 469 individuals belonging to the 1950s population of Gullah African Americans living in South Carolina and found the right side to be larger than the left in those variables found to be statistically significantly different when conducting Student’s t-tests. UM1MD appears anomalous in nature. Overall, given the few statistically significant differences (three each for males and females), it is very unlikely that asymmetry adversely affected the results of the biological distance analyses conducted during this research.

The pattern of dental asymmetry also has repercussions regarding the genetic canalization of males and females. Garn and coworkers (1965, 1966) discussed the presence of sexual dimorphism in dental asymmetry with regard to both dental metrics

(Garn et al., 1965) and morphology (Garn et al., 1966), suggesting that the paired X-chromosome confers greater dimensional control during ontogenesis. This implies that females are better buffered against deviations from side-to-side identicality than males.

Instead, the results of the asymmetry assessment obtained in the current study indicate rather low and, despite being different in expression, equally occurring rates of asymmetry among male and female members of the Hazara ethnic group.

The finding of parity in the expression of dental asymmetry across the two sexes among the Hazara may reflect either of two possible scenarios. First, these results may indicate that stress during the development of the permanent dentition was inflicted upon male and female members of the Hazara equally and is expressed equally due to a similar degree of genetic canalization among members of both sexes. Such a scenario would then demonstrate that, despite the differences in socio-cultural norms for males and females among the Hazara, neither sex is preferentially exposed to pathological afflictions, nutritional deprivation, or other stress-inducing factors that cause deviations from the genetically-coded instructions for development and result in


asymmetry in tooth size between antimeres. The degree of genetic canalization for males and females remains uncertain, with some researchers emphasizing similarities (Lau et al. 1989) while others assert sex-based differences (Alvesalo and Varrela 1991). For example, Lau and coworkers (1989) demonstrated that the structural gene for amelogenin, the main protein in the soft enamel matrix, is located on both the X and Y chromosomes. By contrast, Alvesalo and Varrela (1991) assert that the X and Y chromosomes have different roles in the process, stating that the X chromosome exerts its primary influence on enamel while the Y chromosome promotes both enamel and dentine growth. However, according to Scott and Turner (1997), despite the involvement of sex chromosomes in dental development, the effects do not amount to a meaningful influence over phenotypic expression, as the sex-based differences are often inconsistent and of low magnitude when present (Scott and Turner 1997:109).

Alternatively, these results may indicate that, due to the cultural norms exhibited by the Hazara, males and females are exposed to different degrees of metabolic stress, but because one is more genetically canalized (supposedly, females; Garn et al., 1966;

Stini et al., 1969, 1972; Townsend and Brown, 1980; Stinson, 1985; Malina et al., 1985;

Lukacs and Hemphill, 1993) and therefore can withstand more developmental noise during odontogenesis than the other, the expression of asymmetry is equalized, despite the degree of exposure to stress during development being higher for one than the other. One could infer the cultural norms of the Hazara mimic those of some other

Pakistani and South Asian groups (Fikree et al., 2004; Qadir et al., 2011), who preferentially value males over females. Such differential valuation of little boys over little girls is reflected most dramatically by higher rates of female infanticide relative to


infanticide of males, but more often by a phenomenon known as “daughter neglect,” often arising from cultural customs, such as bridal dowries, that distribute resources (i.e., wealth) outside family units, thereby discouraging an adequate use of resources on daughters as they are already an expense (Miller 1984). Preferential treatment of males vs. females is often reflected in skeletal samples as a disproportionate occurrence of metabolic stress indicators exhibited by females in comparison to males from the same population, including linear enamel hypoplasia and pitting, dental caries and subsequent antemortem tooth loss. For example, an increased rate of carious lesions in females in comparison to males from the Bronze Age Harappan sample was reported by Lukacs (1992). Under such a scenario, one could conclude that Hazara girls are exposed to greater amounts of stress during odontogenesis, but because they are more highly canalized genetically, the result is a parity in the phenotypic expression of dental asymmetry as observed in this study.

The second of these scenarios seems unlikely. Hemphill, during the collection of these dental casts, saw no evidence that would suggest any of the typical symptoms of daughter neglect, for he found no evidence of dramatic differences in stature, body mass index, or linear enamel hypoplasia prevalence between the Hazara boys and girls that are the subject of the current study (Hemphill, personal communication, 2013).

However, it would be more meaningful to test these potential implications scientifically than to speculate based on cultural inferences and assumptions, and such tests could be conducted in future research to expand the understanding of dental asymmetry exhibited by the Hazara. In order to assess these implications statistically, the variance of the variables that differ significantly between males and females should be examined, using the Shapiro-Wilk test to assess normality within each variable and Bartlett’s test for homogeneity of variance between the sexes; less variance should be expressed among those more genetically canalized.
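The proposed comparison of variances could be sketched as follows, assuming the standard form of Bartlett's test statistic, which is referred to a chi-square distribution with k - 1 degrees of freedom for k groups. The measurements are hypothetical values invented for illustration, not Hazara data, and the function name is an assumption.

```python
import math
from statistics import variance

def bartlett_statistic(*groups):
    """Bartlett's test statistic for equality of variances across groups."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    total = sum(sizes)
    variances = [variance(g) for g in groups]   # sample variances (n - 1)
    # Pooled variance, weighted by each group's degrees of freedom.
    pooled = sum((n - 1) * v for n, v in zip(sizes, variances)) / (total - k)
    numerator = (total - k) * math.log(pooled) - sum(
        (n - 1) * math.log(v) for n, v in zip(sizes, variances))
    correction = 1 + (sum(1 / (n - 1) for n in sizes)
                      - 1 / (total - k)) / (3 * (k - 1))
    return numerator / correction

# Hypothetical crown diameters in mm (illustration only).
males = [10.1, 10.4, 9.8, 10.9, 10.3, 9.6]
females = [9.9, 10.0, 10.1, 9.8, 10.2, 10.0]
stat = bartlett_statistic(males, females)

# With two groups there is 1 degree of freedom; 3.84 is the usual
# chi-square critical value at alpha = 0.05.
variances_differ = stat > 3.84
```

Because Bartlett's test is sensitive to departures from normality, checking each variable with a normality test first, as the text proposes, is a sensible precaution.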

Patterns of Sexual Dimorphism

The results of the statistical tests for assessment of sexual dimorphism yield

significant differences with males possessing larger teeth than females. All left and right

molars and premolars for both dental arcades exhibit significant sexual dimorphism in

the buccolingual dimension whereas sex dimorphism in the mesiodistal dimension is

limited to right and left mandibular and maxillary canines. The percentages of sex dimorphism associated with statistical significance range from a low of 2.98%

(LRP3MD) to a high of 7.67% (LLM2MD) and are visually represented in Figures 7 and

8 while all variables, regardless of their statistical significance, are illustrated in Figures 9 through 12. Such percentages largely conform to the accepted 2-6% range of dimorphism established by other researchers (Scott and Turner 1997; Garn et al. 1965, 1966;

Mizoguchi 1988) with the Hazara exhibiting just slightly higher percentages for two variables, URP3BL (6.93%) and LLM2MD (7.67%).
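As a worked illustration of how such percentages can be derived and compared with the 2-6% range, the sketch below assumes one common formulation of percent sexual dimorphism, (male mean - female mean) / female mean x 100. That formula, the function names, and the sample measurements are assumptions made for illustration; only the three percentages checked at the end are taken from the text above.

```python
from statistics import mean


def percent_dimorphism(male_values, female_values):
    """Percent by which the male mean exceeds the female mean (assumed formula)."""
    male_mean = mean(male_values)
    female_mean = mean(female_values)
    return (male_mean - female_mean) / female_mean * 100.0


def within_reference_range(pct, low=2.0, high=6.0):
    """True if a dimorphism percentage falls inside the cited 2-6% range."""
    return low <= pct <= high


# Invented measurements (mm), used only to exercise the formula.
example = percent_dimorphism([10.7, 10.7], [10.0, 10.0])
print(f"example dimorphism: {example:.2f}%")

# Percentages reported in the text for three variables.
reported = {"LRP3MD": 2.98, "URP3BL": 6.93, "LLM2MD": 7.67}
for variable, pct in reported.items():
    print(variable, pct, within_reference_range(pct))
```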

Patterns of Biological Affinities

The primary goal of this research has been to analyze and interpret the dynamic population history of the Hazara and other ethnic groups found in the Hindu

Kush and Karakoram highland regions of Khyber Pakhtunkhwa Province and Gilgit-Baltistan, Pakistan. This research provided the first opportunity to determine

whether the patterning of tooth size allocation of the permanent dentition among the

Hazara yields results that are concordant with those obtained by an array of recent

genetic investigations. In addition, this research lends further support to the biological

significance of the ethnic classifications for the self-identifying, geographically distinct

Wakhi and Shin ethnic groups who occupy the Karakoram highlands of Gilgit-Baltistan.

The results of the current analyses demonstrate that the Wakhi samples from

Gulmit and Sost (WAKg and WAKs) consistently exhibit closer affinities to one another than to any of the other groups considered in this study. These close affinities are demonstrated strongly by three of the five statistical methods employed. Both methods employed for multidimensional scaling plot the two

Wakhi samples closest together, as does the dendrogram produced by the hierarchical cluster analysis. These results demonstrate that these self-identifying ethnic groups are meaningful biological entities for reconstruction of population history.

Further support for the ethnic classifications of these groups being biologically meaningful is lent by the biological affinities of the Shin samples (SHIa, SHIo). The results of the current analyses demonstrate that the Shin sample collected from Astore and the sample collected from Gilgit and Haramosh share closest affinities to one another using Kruskal’s (1964) method for multidimensional scaling. The other statistical methods employed also show shared affinities, but none as close as those illustrated by

Kruskal’s (1964) method. The closely shared biological affinities illustrated by the results of this current study demonstrate that the addition of more samples (HAZ) to the biological distance analyses employed does not alter the primary findings of O’Neill (2013) and, as such, increases the robustness of the results obtained by tooth size allocation analysis.

The four most current models of South Asian population history were tested using biodistance analyses of patterning of tooth size in the permanent dentition

in conjunction with historical, archaeological, osteological and genetic evidence.

Through this research, aspects of large-scale inter-regional contacts among post-

Pleistocene populations of the Indus Valley, peninsular India, and regionally adjacent

Bronze Age samples from southern Central Asia and their biological consequences

among subsequent populations within South Asia were explored. The results of the

biological distance analyses and how they relate to the expectations of the four South

Asian population history models used in this study are addressed below.

The Long-Standing Continuity Model

Proponents of the Long-Standing Continuity Model (LSCM) maintain that South

Asian ethnic groups are a product of an initial dispersal of modern Homo sapiens out of

Africa some 60,000-100,000 years ago. These emigrants spread across Central and

Eastern Eurasia by one if not two routes of dispersal (Wells et al. 2001, Zerjal et al.

2002). One of these routes, the Southern Route of Dispersal, resulted in the initial

introduction of modern humans into the Indian subcontinent during the Pleistocene

(Atkinson et al., 2008; Forster and Matsumura, 2005; Macauley et al., 2005; Mellars,

2006; Qamar et al., 2002; Underhill et al., 2001; but see Lahr and Foley, 1994; Kong et

al., 2006; Zhong et al., 2010) and was subsequently followed by a long-standing

continuity of these groups (Metspalu et al. 2004; Sengupta et al. 2006). If the LSCM is


true, it is expected that the biological affinities of these groups will reflect a pattern of isolation-by-distance, in which the populations closest to one another both temporally and geographically possess the closest affinities to one another. The LSCM is only partially supported by the results of this study. The results of the statistical analyses provide evidence discounting the claim that no significant movements of outside populations into the

Indian subcontinent have occurred during the past 60,000 years (Kennedy 1999, also see O’Neill and Hemphill 2008), for all of the biological distance analyses employed consistently display shared affinities between the prehistoric Central Asian sample dating to the Djarkutan Period from Djarkutan (DJR) and living Pakistani Khowars

(KHO) from Chitral District, a finding that is inconsistent with long-standing local continuity among all groups. Furthermore, all the statistical analyses indicate no shared affinities between the Hazara (HAZ) and the other samples of highlander Pakistani ethnic groups included in the current study. Nevertheless, the results do yield regional aggregates consistent with the notion that geographic proximity encourages localized gene flow, producing a pattern of biological differentiation that accords with isolation-by-distance. However, this is not the entire picture, for exceptions to this patterning among Pakistani highlanders include the Khowars (KHO), the Hazara (HAZ), and the Madaklasht (MDK). In addition, the peninsular tribal Chenchus (CHU) do not follow the isolation-by-distance patterning, nor do the Late Jorwe inhabitants of Inamgaon (INM).
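The isolation-by-distance expectation discussed above can be expressed as a correlation between a matrix of geographic distances and a matrix of biological distances, with a permutation (Mantel-type) test for significance. The sketch below is a hedged illustration only: this study did not necessarily use such a test, the five sample positions and the distance values are invented, and `mantel` is a hypothetical helper name.

```python
import itertools
import math
import random


def upper_triangle(matrix, order):
    """Pairwise values above the diagonal after relabelling samples by `order`."""
    pairs = itertools.combinations(range(len(order)), 2)
    return [matrix[order[i]][order[j]] for i, j in pairs]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def mantel(geo, bio, permutations=999, seed=0):
    """Correlation of two distance matrices with a one-sided permutation p-value."""
    identity = list(range(len(geo)))
    geo_vec = upper_triangle(geo, identity)
    observed = pearson(geo_vec, upper_triangle(bio, identity))
    rng = random.Random(seed)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # relabel the samples of one matrix at random
        if pearson(geo_vec, upper_triangle(bio, order)) >= observed:
            hits += 1
    return observed, hits / (permutations + 1)


# Invented positions (arbitrary units) for five hypothetical samples.
positions = [0.0, 1.0, 2.0, 5.0, 9.0]
geo = [[abs(p - q) for q in positions] for p in positions]
# Invented biological distances that scale exactly with geographic distance.
bio = [[0.1 * d for d in row] for row in geo]

r, p = mantel(geo, bio)
print(f"matrix correlation r={r:.3f}, permutation p={p:.3f}")
```

Samples that behave as exceptions to the pattern would lower the correlation, which is one way the departures noted for particular groups could be quantified.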


The Aryan Invasion Model

Proponents of the Aryan Invasion Model (AIM) assert that Indo-Aryan languages were first introduced to

South Asia by Aryan invaders from Central Asia, particularly Bactria and Margiana,

during the second millennium BC, whose descendants subsequently spread Vedic

culture throughout peninsular India (Erdosy 1995; Parpola 1995). If true, the post-

Mature Phase inhabitants of the Indus Valley ought to share close biological affinities to

this intrusive population. Furthermore, Indo-Aryan-speaking ethnic groups of peninsular

Indian populations should also reflect the biological impact of these Aryan invaders.

With the sole exception of the results obtained from the principal co-ordinates analysis,

none of the other analyses lend any support for a scenario that calls for a Bronze Age

invasion of Central Asians into the Indian subcontinent during the latter half of the second

millennium BC. To be sure, the results obtained by principal co-ordinates analysis

show shared, but distant, biological affinities between the Central Asian sample

recovered from Sapalli tepe (SAP) and the latest of the Indus Valley prehistoric

samples, Sarai Khola (SKH), but none of the other analyses employed in this study link

these two samples or any other prehistoric Central Asian samples to prehistoric Indus

Valley samples. However, one must remember that these samples are separated in

time by some two millennia. It would be incongruous for such connections to be

genuine when there is no indication of any affinities between the sample from Sapalli

tepe and the much more temporally proximate Gandharan Grave Culture sample from

Timargarha, which Dani (1967) identified as indicative of these very Central Asian

invaders. Instead, the results of both multidimensional scaling methods and both cluster analyses reveal an extreme disconnect between these aggregates, who consistently


occupy completely different phenetic spaces in the arrays (multidimensional scaling) or are placed in sub-clades of completely different aggregates within dendrograms (cluster analyses). Consequently, the anomalous results of the principal co-ordinates analysis in linking the samples from Sapalli tepe and Sarai Khola are considered spurious and should not be interpreted as lending support for the wholesale migration of Central

Asians into the Indus Valley called for by the proponents of this model.

In addition, the only link between any Central Asian sample and peninsular

Indians is the distant phenetic affinity between the Central Asian outlier, Altyn depe

(ALT), and the peninsular Indian outlier, tribal Chenchus (CHU). Because these results are not consistent, because the Chenchu speak a Dravidian language rather than an Indo-Aryan language, and because there is no other indicator of any cultural similarity between the food-producing, stock-raising, highly urbanized Bronze Age inhabitants of Altyn depe (Masson, 1988) and the nomadic tribal Chenchus, who until recently subsisted by means of hunting and gathering (Fürer-Haimendorf, 1943), this tenuous connection should not be interpreted as supporting evidence for this model. Instead, this similarity is merely a statistical consequence of these data reduction techniques attempting to accommodate two samples that stand as distant outliers to all other samples included in this analysis. In dramatic contrast, the shared affinities between the

Khowars (KHO) and prehistoric Central Asians are consistently produced in the statistical analyses employed in this study as well as those of others (Blaylock,

2008; Hemphill et al., 2013; O’Neill, 2013) thus suggesting validity to the


affiliation between this living ethnic group of the Hindu Kush highlands and the prehistoric inhabitants of the north Bactrian oasis.

The Early Entrance Model

Proponents of the Early Entrance Model (EEM) claim that members of Proto-

Elamitic populations from southwestern Iran emigrated to the Indus Valley of Pakistan

at some point between the fifth and seventh millennia BC (Fairservis and Southworth

1989; Southworth 1995). If this model is true, then a biological continuity between

prehistoric Indus Valley populations and populations of southwestern Iran should be

demonstrated. Furthermore, proponents of this model maintain that this proto-

Dravidian population residing within the Indus Valley antecedent to and

contemporaneous with the Harappan Civilization, was the source population for living

Dravidian-speaking populations of southeastern peninsular India (McAlpin 1981;

Witzel 1999). If proto-Elamo-Dravidian-speaking populations were in the Indus Valley,

this population may also have given rise to some of the ethnic groups

found today in the Hindu Kush and Karakoram highlands of northern Pakistan.

In order for this model to be supported, one and perhaps two breaks in the

biological continuity of Indus Valley populations must be demonstrated. The first

should signal the initial appearance of these proto-Elamo-Dravidian-speakers into the

Indus Valley at some point between the fifth and seventh millennia BC. As such, one

ought to expect strong differences between the aceramic Neolithic inhabitants of

Mehrgarh, who antedate any such immigration event, and the Chalcolithic inhabitants

of this site who lived there subsequent to this alleged migration. A second possible


break in Indus Valley biological continuity may have occurred in the mid-second millennium BC, signaling the arrival of Indo-European-speaking populations from Central

Asia.

The prehistoric samples from the Indus Valley show reasonable continuity and, for the most part, phenetic distances correspond to their chronological order. Nevertheless, there is evidence to support the first of the breaks in biological continuity of the Indus Valley called for by proponents of the

EEM. A break between the Neolithic and Chalcolithic occupations of Mehrgarh

(NeoMGR and ChlMGR) is illustrated in the array resulting from both methods used for multidimensional scaling and by principal co-ordinates analysis. The Early

Chalcolithic inhabitants of Mehrgarh still share most proximate affinities with the earlier Neolithic inhabitants of this site, but they possess phenetic affinities that place them on a unique vector in a phenetic space somewhat isolated from the other prehistoric samples from the Indus Valley. As stated previously, this phenetic isolation of Chalcolithic Mehrgarh may reflect the arrival of a new population into the

North Kachi Plain during the fifth millennium BC, perhaps even the Proto-Elamo-

Dravidian-speaking migrants this model posits.

The second break that may or may not have occurred according to proponents of the EEM does not, for the most part, appear to be supported by the statistical results. The sample of Late Chalcolithic occupants of Harappa (HAR) links to the Late Bronze/Early Iron Age Gandharan Grave Culture sample from

Timargarha (TMG), which in turn links to Sarai Khola (SKH) in the array produced by both methods for multidimensional scaling thereby following the pattern of

affinities expected under conditions of biological continuity over time during the

last several millennia BC. However, the dendrogram produced by the hierarchical

cluster analysis using Ward’s (1963) linkage does illustrate a break in continuity

between these samples, placing the sample from Harappa (HAR) in a different sub-

clade than the samples from Timargarha (TMG) and Sarai Khola (SKH).

However, this could be a result of the analysis being forced to find pairwise splits

regardless of whether they truly reflect the biological relatedness between samples.
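The caveat about forced pairwise splits can be illustrated with a minimal sketch of agglomerative clustering under Ward's minimum-variance criterion. The four two-dimensional points are invented (they are not measurements from this study), and `ward_merges` is a hypothetical helper written for illustration, not the software used for the analyses reported here.

```python
def ward_merges(points):
    """Agglomerate points with Ward's criterion; return each merge and its cost."""
    # Each cluster stores (member indices, centroid coordinates).
    clusters = {i: ([i], list(p)) for i, p in enumerate(points)}
    next_key = len(points)
    merges = []
    while len(clusters) > 1:
        best = None
        keys = sorted(clusters)
        for ai in range(len(keys)):
            for bi in range(ai + 1, len(keys)):
                members_a, centroid_a = clusters[keys[ai]]
                members_b, centroid_b = clusters[keys[bi]]
                na, nb = len(members_a), len(members_b)
                squared = sum((x - y) ** 2 for x, y in zip(centroid_a, centroid_b))
                # Increase in within-cluster sum of squares if a and b are merged.
                cost = na * nb / (na + nb) * squared
                if best is None or cost < best[0]:
                    best = (cost, keys[ai], keys[bi])
        cost, key_a, key_b = best
        members_a, centroid_a = clusters.pop(key_a)
        members_b, centroid_b = clusters.pop(key_b)
        na, nb = len(members_a), len(members_b)
        centroid = [(na * x + nb * y) / (na + nb) for x, y in zip(centroid_a, centroid_b)]
        clusters[next_key] = (members_a + members_b, centroid)
        next_key += 1
        merges.append((sorted(members_a), sorted(members_b), cost))
    return merges


# Three tightly grouped hypothetical samples (0-2) and one distant outlier (3).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0)]
merges = ward_merges(points)
for left, right, cost in merges:
    print(left, "+", right, f"cost={cost:.2f}")
```

The procedure always produces one fewer merge than there are samples, so even a distant outlier is eventually joined to some cluster. The cost of the merge, not the mere fact of joining, indicates how weak that link is.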

Further support for this model could be found if the Dravidian-speaking ethnic

groups from southeastern India exhibit affinities to Chalcolithic era samples from the

Indus Valley. However, none of the analyses show any evidence of shared biological affinities between Dravidian-speaking South Asian Gompadhompti

Madigas (GPD), Pakanati Reddis (PNT) or tribal Chenchus (CHU) to Chalcolithic era samples from the Indus Valley sites of Mehrgarh (ChlMGR) or Harappa

(HAR), thereby refuting the expectations of this model.

Still further, if the EEM is true, biodistance analyses should show Pakistani highlanders to possess distant but equal biological affinities to prehistoric Central Asian samples, living northwestern

Indian groups, and post-Harappan Indus Valley groups. Although the multidimensional scaling methods plot Pakistani highlanders in a phenetic space between prehistoric Central Asian samples (GKS, ALT, DJK, KUZ, and MOL) and living northwestern Indian groups (RAJ, GRS, and BHI), such results are inconsistent.

That is, Guttman’s (1968) method shows the northern Pakistani Swatis (SWT) as

sharing affinities to both the Indo-Aryan-speaking northwestern Indian Vaghelia

Rajputs (RAJ) and the Dravidian-speaking southeastern Indian Chenchu (CHU), with


greater distance from the Chenchu. Although Kruskal’s (1964) method also

indicates shared affinities between Swatis (SWT) and Vaghelia Rajputs (RAJ), neither

the two cluster analyses nor the principal co-ordinates analysis corroborates this shared

affinity.

The Historic Era Influence Model

Proponents of the Historic Era Influence Model (HEIM) maintain that the members of

some of the current ethnic groups of northern Pakistan are descendants of immigrants

who entered South Asia during the protohistoric and historic periods (Hemphill et al.

2009; O’Neill and Hemphill 2009; O’Neill 2012). If the HEIM is true, then these groups

will show either varied or no biological affinities to one another, and no affinities to prehistoric Indus Valley groups. The results of the biological distance analyses strongly support the HEIM. Hierarchical and neighbor-joining cluster analyses identify the

Hazara as a distant outlier to all of the Central and South Asian samples included in this analysis. Such results not only suggest limited to no gene flow between the Hazara and the living ethnic groups of the Hindu Kush and Karakoram highlands residing within close geographic proximity to them, but these results also indicate that the Hazara are not the descendants of prehistoric Central Asians occupying the North Bactrian Oasis of southern Uzbekistan (Sapalli tepe, Djarkutan), the desiccated Tedjen Oasis of southeastern Turkmenistan (Geoksyur), or the Kopet Dagh foothill plain of south-central

Turkmenistan (Altyn depe), nor are they descendants of the prehistoric inhabitants of the

Indus Valley (Mehrgarh, Harappa, Sarai Khola), the Swat Valley (Timargarha), or even peninsular India (Inamgaon). Their depiction as complete outliers in the three-dimensional array yielded by multidimensional scaling using both Guttman’s (1968) and

Kruskal’s (1964) methods, and in the triaxial plot produced by principal co-ordinates analysis further corroborates and demonstrates that the Hazara share no biological affinities to other living ethnic groups of the Hindu Kush and Karakoram highlands, to prehistoric inhabitants of southern Central Asia or populations of the Indus Valley, or to living ethnic groups of west-central and southeastern peninsular India.

Nevertheless, these statistical analyses do yield sub-clade differences in the

Pakistani highlander groups that largely conform to their relative geographic proximity, in which the occupants of the Gilgit-Baltistan area consistently cluster more closely together

(WAKg, WAKs, SHIa and SHIo) than they do with the two ethnic groups from the Chitral

District of Khyber Pakhtunkhwa (KHO, MDK). These results are similar to those obtained by O’Neill (2013) and support the HEIM by demonstrating a discontinuity in the biological affinities among most Pakistani highlander groups that is likely the result of recent historic population movements and interactions. The results of this study and others (Blaylock 2008; O’Neill 2013; Hemphill et al. 2013) consistently demonstrate that the Khowars (KHO) share affinities with prehistoric

Central Asians. The result of the principal co-ordinates analysis, which alone places the inhabitants of the valleys of Chitral at a great phenetic distance, is viewed as anomalous and does not refute the close relationship indicated by the results obtained by the other four analyses encompassed by this research. Likewise, the result of the principal co-ordinates analysis that places the Wakhi and Shin samples at a greater phenetic distance than displayed by the other methods is also considered anomalous.


Odontometric Studies and Genetic Research

The results of these odontometric analyses corroborate the genetic studies that

also found the Hazara to be outliers to all other Pakistani populations including Zerjal

and coworkers’ (2003) study, which analyzed Y-chromosome variation in Asia and found a cluster of closely related lineages, referred to as a star-cluster, present among the Hazara at a high frequency but not found in other Pakistani

populations. Another example of genetic differences among the Hazara is the mtDNA

analysis conducted by Quintana-Murci and coworkers (2004) who found eastern

Eurasian-specific lineages at a frequency of 35% among the Hazara, but either completely absent or present at very low frequencies in all Central Asian and Indus

Valley populations (Quintana-Murci et al. 2004:834). Still further, Hunley and coworkers’

(2009) comparison of the pattern of neutral genetic variation predicted by a coalescent-

based simulation approach to the observed pattern estimated from neutral autosomal

microsatellites produced outlier populations that were assessed specifically. Once

again, the Hazara stood out as an anomaly relative to Western Eurasian populations in

a fashion remarkably similar to the results obtained by these odontometric analyses.

Indeed, in their study of Y-chromosome binary polymorphisms Qamar and

coworkers (2002) found the Hazara to be outliers relative to other Pakistani populations,

for the Hazara were found to lack some haplogroups that all other Pakistani groups

possessed. Moreover, their principal components analysis of these binary markers

revealed a striking overall resemblance among all Pakistani groups—except the

Hazara, who were, once again, outliers to the other populations (Qamar et al.

2002:1114). The results obtained from multidimensional scaling of weighted population


pairwise values of short-tandem-repeat loci variation within haplogroups also showed a

clear divergence of the Hazara, who exhibited the most significantly different population

pairwise values (Figure 18; Qamar et al. 2002:1117). Thus, these studies demonstrate an extreme genetic disconnect between the Hazara and other Pakistani populations, very similar to the results of this odontometric study. The similarities found between these genetic analyses and the odontometric analyses of the Hazara conducted in this research support the validity of the use of dental metric data to discern biological affinities and relatedness of different populations.

Figure 18. Multidimensional scaling presentation of weighted population pairwise values of short-tandem-repeat loci variation within haplogroups, from Qamar and coworkers (2002:1117).


Conclusion

In conclusion, this research adds to the existing data available for Pakistani highlanders, refining our understanding of population dynamics at this important ancient crossroads at the western margin of the “roof of the world.” The results of this odontometric research indicate the Hazara represent foreigners who introduced non-local genes into the resident South Asian gene pool, supporting the HEIM and in agreement with recent genetic studies (Hunley and coworkers 2009; Quintana-Murci and coworkers 2004; Zerjal and coworkers 2003; Qamar and coworkers 2002). This study identifies the Hazara as a genetically distinct ethnic group living in the northern area of Pakistan. Their position as complete outliers in the hierarchical and neighbor-joining tree analyses, in the multidimensional scaling using both Guttman’s (1968) and Kruskal’s (1964) methods, and in the array illustrated by the principal co-ordinates analysis indicates that the Hazara do not share biological affinities to other

Pakistani highlander groups, prehistoric inhabitants of Central Asia or the

Indus Valley, or living peninsular Indian populations. The results only lend partial support for the LSCM because some groups do follow a pattern of isolation-by-distance.

The results of the study also lend partial support to the EEM by demonstrating a break between the Neolithic and Chalcolithic occupations of Mehrgarh (NeoMGR and

ChlMGR), while only the affinity between the Pakistani Khowars

(KHO) and one of the temporally distinct samples from Djarkutan

(DJR) lends any support to the AIM.

However, the question remains: with whom do the Hazara share close biological affinities? In order to test the Genghis Khan origin story preserved in their oral

tradition, East Asian or other sinodontic comparative dental samples must be employed. Examples of such comparative samples have been found but have limited utility. That is, only males are represented in the proposed population samples, which decreases the usable sample sizes and hence the robustness of the statistical findings. Furthermore, statistical tests measuring the influence of inter-observer error could not be conducted. The sinodontic populations that could potentially be used to further examine the Hazara population’s history include the Anyang Chinese series, the

Urga Mongolian collection, and the Chifeng Chinese series.

The archaeologically derived Anyang Chinese series was recovered from the

Shang sites of Henan Province, China, and dates to the Yin (Shang) period (1500-1027

BC) (Institute of History and Institute of Archaeology 1982). This collection is housed at the Academia Sinica of the Republic of China in Taipei and is represented by 21 male individuals (Matsumura 1994). The early historic Chifeng Chinese sample includes 38 male individuals dating from 1027-200 BC and is from Inner Mongolia, China (Miyake et al. 1938). The Urga Mongolian series is represented by 132 male individuals originating from Ulan Bator, Mongolia, and is currently housed at the National Museum of Natural History, Smithsonian Institution (Matsumura 1995:237). This series was originally collected by Hrdlicka in the early 1930s and is said to represent an early modern Mongolian population. No data representing the females of these populations were available. The measurements of these samples were obtained from Hirofumi

Matsumura (personal communication, 2010). If biological distance analyses were conducted using sinodontic samples, like those previously described, it would be


possible to elucidate the potential East Asian origins of the Hazara ethnic group in northern Pakistan.


References

Alvesalo, L., Tigerstedt, P.M.A. (1974). Heritabilities of human tooth dimensions. Hereditas 77:311-318.

Atkinson Q.D., Gray R.D., Drummond A.J. (2008). MtDNA variation predicts population size in humans and reveals a major Southern Asian chapter in human prehistory. Molecular Biology and Evolution 25(2):468-474.

Bacon, E. (1951). The Inquiry into the History of the Hazara Mongols of Afghanistan. Southwestern Journal of Anthropology 7:230-247.

Bailit, H.L., Workman, P.L., Niswander, J.D. and C.J. MacLean (1970). Dental Asymmetry as an Indicator of Genetic and Environmental Conditions in Human Populations. Human Biology 42:626-638.

Bamshad, M.J., Watkins, W.S., Dixon, M.E., Jorde, L.B., Bhaskara Rao, B., Naidu, J.M., Ravi Prasad, B.V., Rasanayagam, A., Hammer, M.F. (1998). Female gene flow stratifies Hindu castes. Nature 395:651-652.

Bamshad, M., Kivisild, T., Watkins, W.S., Dixon, M.E., Ricker, C.E., Rao, B.B., Naidu, J.M., Ravi Prasad, B.V., Govinda Reddy, P., Rasanayagam, A., Papiha, S.S., Villems R., Redd, A.J., Hammer, M.F., Nguyen, S.V., Carroll, M.L., Batzer, M.A., and Jorde L.B. (2001). Genetic evidence on the origins of Indian caste populations. Genome Res 11:994-1004.

Barbujani, G., and R.R. Sokal. (1990). Zones of sharp genetic change in Europe are also linguistic boundaries. Proc. Natl. Acad. Sci. 87:1816–19.

Barton, A.M., Hemphill, B.E. (2011). A Craniometric Investigation of Biological Contacts between Populations of the Iranian Plateau and Central Asia during the Last Three Millennia B.C. American Journal of Physical Anthropology (Suppl. 52):82-83.

(2012). An Odontometric Investigation of Biological Affinities of the Yashkuns of Northern Pakistan. American Journal of Physical Anthropology (Suppl 54): 91.

Bellew, H.W. (1979). The races of Afghanistan. Sang-e-Meel Publications, Lahore, Pakistan.

Blaylock, S.R. (2008). Are the Koh an indigenous population of the Hindu Kush?: A dental morphology investigation. Masters Thesis. California State University, Bakersfield.


Blaylock, S., Hemphill, B.E. (2007). Are the Koh Indigenous Inhabitants of the Hindu Kush? II. A Dental Morphology Investigation. American Journal of Physical Anthropology (Suppl. 44): 76.

Bulmer, M.G. (1970). The Biology of Twinning in Man. Oxford, Clarendon.

Butler, P.M. (1937). Studies of the mammalian dentition. I. The teeth of Centetes ecaudatus and its allies. Proceedings of the Zoological Society of London B107:103-132.

(1939). Studies of the mammalian dentition. Differentiation of the post-canine dentition. Proceedings of the Zoological Society of London B109:1-36.

(1982). Some problems of the ontogeny of tooth patterns. In Teeth: Form, Function, and Evolution, ed. B. Kurten, pp. 44-51. New York: Columbia University Press.

Constandse-Westermann, T.S. (1972). Coefficients of Biological Distance. Oosterhout N.B., The Netherlands: Anthropological Publications.

Dahlberg, A.A. (1945). The changing dentition of man. Journal of the American Dental Association 32:679-690.

(1951). The Dentition of the American Indian. In: Laughlin WS, editor. The Physical Anthropology of the American Indian. New York: Viking Fund Inc., pp 138–76.

Dani, A.H. (1967). Timargarha and Gandharan Grave Culture. Ancient Pakistan 3:1-407.

Doyle, W.J. and O. Johnston (1977). On the Meaning of Increased Fluctuating Asymmetry: A Cross Populational Study. American Journal of Physical Anthropology 46:127-134.

Erdosy, G. (1989). Ethnicity in the Rigveda and its Bearing on the question of Indo-European Origins. South Asian Stud 5:35-47.

(1995). Language, material culture and ethnicity: Theoretical perspectives. In: Erdosy G, editor. The Indo-Aryans of Ancient South Asia. Berlin: Walter de Gruyter, pp 1-31.

Elphinstone, M. ([1814] 1972). An Account of the Kingdom of Caubul. 3d ed. New intro. by Sir Olaf Caroe. Karachi, Pakistan: Oxford University Press.

Fairservis, W.A. (1975). The roots of ancient India. Chicago: University of Chicago Press.

(1995). Central Asia and the Rigveda: The archaeological evidence. In: Erdosy G, editor. The Indo-Aryans of Ancient South Asia. Berlin: Walter de Gruyter, p 206-212.

Fairservis, W.A., Southworth, F.C. (1989). Linguistic archaeology and the Indus Valley culture. In Old Problems and New Perspectives in the Archaeology of South Asia, ed. J.M. Kenoyer. Madison: Wisconsin Archaeological Reports No. 2, pp. 133–141.

Fikree, F., Pasha, O. (2004). Role of gender in health disparity: the South Asian context. BMJ, 328:823-826.

Forster, P., Matsumura, S. (2005). Did early humans go north or south? Science 308(5274):965-966.

Francfort, H.P. (1994). The Central Asian dimension of the symbolic systems in Bactria and Margiana. Antiquity 68(259):406-418.

Fuller, D. (2003). An archaeological perspective on Dravidian historical linguistics: archaeological crop packages, livestock and Dravidian vocabulary. In Examining the Farming/Language Dispersal Hypothesis, eds. P. Bellwood and C. Renfrew. Cambridge: McDonald Institute for Archaeological Research, pp. 191-213.

Garn, S.M., Lewis, A.B. and Kerewsky, R.S. (1964). Sex difference in tooth size. Journal of Dental Research 43:306.

(1965). Size interrelationships of the mesial and distal teeth. Journal of Dental Research 44:350-354.

(1966). Extent of sex influence on Carabelli’s polymorphism. Journal of Dental Research 45:1823.

Garn, S.M., Lewis, A.B. and Walenga, A. (1968). Evidence for a secular trend in tooth size over two generations. Journal of Dental Research 47:503.

Garn, S.M. and Bailey, S.M. (1977). The symmetrical nature of bilateral asymmetry (δ) of deciduous and permanent teeth. Journal of Dental Research 56:1422.

Glasstone, S. (1979). Tissue culture of the development of teeth and jaws. OSSA 6:89-104.

Goldberg, S. (1929). Biometrics of identical twins from a dental viewpoint. Journal of Dental Research 9:363-409.


Gower, J.C. (1966). Some Distance Properties of Latent Root and Vector Methods used in Multivariate Analysis. Biometrika 53: 325-338.

Guatelli-Steinberg, D., Sciulli, P., and H. Edgar (2006). Dental Fluctuating Asymmetry in the Gullah: Tests of Hypotheses Regarding Developmental Stability in Deciduous vs. Permanent and Male vs. Female Teeth. American Journal of Physical Anthropology 129: 427-434.

Guttman, L. (1968). A general nonmetric technique for finding the smallest coordinate space for a configuration of points. Psychometrika 33:469-506.

Guzman, M.C., and B.E. Hemphill (2012). An Odontometric Investigation of the Biological Origins of the Baltis: A Tibeto-Burman speaking Population of Northern Pakistan. American Journal of Physical Anthropology (Suppl 54): 157.

(2013). Are Socioethnic Groups Biologically Meaningful Entities? A Tooth Size Allocation Analysis of the Baltis of Northern Pakistan. American Journal of Physical Anthropology (Suppl. 56):139.

Harris, E.F. (2003). Where’s the variation? Variance Components in Tooth Sizes of the Permanent Dentition. Dental Anthropology 16(3): 84-94.

Harris, E.F., Harris, J.T. (2007). Racial differences in tooth crown size gradients within morphogenetic fields. Revista Estomatologia 15(2):1-16.

Harris, E.F., Nweeia, M. (1980). Dental Asymmetry as a Measure of Environmental Stress in the Ticuna Indians of Colombia. American Journal of Physical Anthropology 53:133-142.

Hemphill, B.E. (1991). Tooth size apportionment among contemporary Indians: An analysis of caste, language, and geography, a dissertation. University Microfilms International, Ann Arbor.

(1999). Adaptations and affinities of Bronze Age Bactrians: IV. A craniometric examination of the origins of Oxus Civilization populations. American Journal of Physical Anthropology 108: 173-192.

(2001). Do Foreign Artifacts Mean Foreign People? A Skeletal Technique for Analyzing Anomalous Burials in a Bronze Age Cemetery. In: M. Taddei and G. De Marco (eds.), South Asian Archaeology 1997. Rome: Istituto Italiano per L’Africa e L’Oriente, pp. 409-435.

(2008). Are the Inhabitants of Madaklasht an Emigrant Persian Population in Northern Pakistan?: A Dental Morphometric Investigation. American Journal of Physical Anthropology (Suppl. 46): 115.

(2009). The Swatis of Northern Pakistan—Emigrants from Central Asia or Colonists from peninsular India?: A Dental Morphometric Investigation. American Journal of Physical Anthropology (Suppl. 48):147.

(2010). Dental Anthropology of the Madaklasht I: A Description and Analysis of Variation in Morphological Features of the Permanent Tooth Crown. Pakistan Heritage 2: 1-36.

(2011). Dental Anthropology of the Madaklasht II: A Comparative Analysis of Morphological Variation-Are the Madaklasht an Intrusive Population in Northern Pakistan? Pakistan Heritage 3:1-78.

(2012). The Awans of Northern Pakistan: Emigrants from Central Asia, Arabs from Western Afghanistan, or Colonists from Peninsular India? A Dental Morphometric Investigation. American Journal of Physical Anthropology (Suppl 54): 163.

(2013a) A View to the North: Biological Interactions across the Inter-Montane Borderlands during the Last Two Millennia BC. In: D. Frenez and M. Tosi (eds.), South Asian Archaeology 2007, Volume I. Oxford: Archaeopress, BAR International Series No. 2454, pp. 117-126.

(2013b) Grades, Gradients and Geography: A Dental Morphometric Approach to the Population History of South Asia. In: G.R. Scott and J.D. Irish (eds.), Anthropological Perspectives on Tooth Morphology: Genetics, Evolution, Variation. Cambridge: Cambridge University Press, pp. 341-387.

Hemphill, B.E., Ali, I., Blaylock, S., Willits, N. (2013) Are the Kho an Indigenous Population of the Hindu Kush?: A Dental Morphometric Approach. In: D. Frenez and M. Tosi (eds.), South Asian Archaeology 2007, Volume I. Oxford: Archaeopress, BAR International Series No. 2454, pp. 127-137.

Hemphill, B.E., Hlusko, L.J. (2013). Tansies in the Field: An Odontometric Assessment of Orthodox Perspectives on Dental Field Theory, Ontogenetic Canalization, and Sex Dimorphism. American Journal of Physical Anthropology (Suppl. 56):146-147.

Hemphill, B.E., Lukacs, J.R., Kennedy, K.A.R. (1991). Biological Adaptations and Affinities of Bronze Age Harappans. In: R. Meadow (ed.), Harappa Excavations 1986-1990: A Multidisciplinary Approach to Third Millennium Urbanism. Madison: Prehistory Press. pp. 137-182.

Hemphill, B.E., Lukacs, J.R., Joshi, M.R., Lal, R.B. (1992a). Odontometric variation in North West India: Biologic interrelationships among Bhils, Garasia and Rajputs. Indian Journal of Physical Anthropology and Human Genetics 18(1):1-48.

Hemphill, B.E., Lukacs, J.R., Reddy, V.R. (1992b). Tooth Size Apportionment in Modern India: Factors of caste, language, and geography. Journal of Human Ecology 2:231-253.

Hemphill, B.E., Lukacs, J.R., Walimbe, S.R. (2000). Ethnic Identity, Biological History and Dental Morphology: Evaluating the Indigenous Status of Maharashtra's Mahars. Antiquity 74: 671-681.

Hemphill, B.E. and Mallory, J.P. (2004) Horse-Mounted Invaders from the Russo-Kazakh Steppe or Agricultural Colonists from Western Central Asia? A Craniometric Investigation of the Bronze Age Settlement of Xinjiang. American Journal of Physical Anthropology 124(3):199-222.

Hiebert, F.T. (1994). Origins of the Bronze Age Civilization in Central Asia. Cambridge, Mass.: Peabody Museum of Archaeology and Ethnology, American School of Prehistoric Research Bulletin No. 42.

Hillson, S. (1996). Dental Anthropology. Cambridge: Cambridge University Press.

Horowitz, S.L., Osborne, R.H., de George, F.V. (1958). Hereditary factors in tooth dimensions, a study of the anterior teeth in twins. Angle Orthodontist 28:87-93.

Howells, W.W. (1973). Cranial Variation in Man. A Study By Multivariate Analysis of Patterns of Differences Among Recent Human Populations. Papers of the Peabody Museum of Archaeology and Ethnology, 67:1-259.

(1976). Explaining modern man: evolutionists versus migrationists. Journal of Human Evolution 5:245-495.

(1989). Skull shapes and the Map: Craniometric Analysis in the Dispersion of Modern Homo. Papers of the Peabody Museum of Archaeology and Ethnology, Harvard University, Vol. 79, Cambridge, Mass: Harvard University.

Hunley, K., Healy, M., and J. Long (2009). The Global Pattern of Gene Identity Variation Reveals a History of Long-Range Migrations, Bottlenecks, and Local Mate Exchange: Implications for Biological Race. American Journal of Physical Anthropology 139:35-46.

Kennedy, K.A.R., Chiment, J., Disotell, T., Meyers, D. (1984). Principal-Components analysis of Prehistoric South Asian Crania. American Journal of Physical Anthropology 64:105-118.

Kennedy, K. A. (1999). Paleoanthropology of South Asia. Evolutionary Anthropology: Issues, News, and Reviews, 8(5):165-185.

Kieser, J.A. and H.T. Groeneveld (1988). Fluctuating Odontometric Asymmetry in an Urban South African Black Population. Journal of Dental Research 67:1200-1205.

Kieser, J.A., Groeneveld, H.T. and C.B. Preston (1986a). Fluctuating Odontometric Asymmetry in the Lengua Indians of Paraguay. Annals of Human Biology 13:489-498.

(1986b). Fluctuating Dental Asymmetry as a Measure of Odontogenic Canalization in Man. American Journal of Physical Anthropology 71:437-444.

Kohl, P.L. (1985). Recent Research in Central Asia. American Antiquity 50:789-795.

Kollar, E.J. and Kerley, M.A. (1979). Odontogenesis: Interaction between isolated enamel organ epithelium and dental papilla cells. OSSA 6:163-170.

Kondo, S., Townsend, G., Yamada, H. (2005). Sexual dimorphism of cusp dimensions in human maxillary molars. Am J Phys Anthropol 128:870–7.

Kong, Q., Bandelt, H., Sun, C., Yao, Y., Salas, A., Achilli, A., Wang, C., Zhong, L., Zhu, C., Wu, S., Torroni, A., Zhang, Y. (2006). Updating the East Asian mtDNA phylogeny: a prerequisite for the identification of pathogenic mutations. Human Molecular Genetics 15(13):2076-2086.

Kraus, B.S. and Furr, M.L. (1953). Lower first premolars. Part I. A definition and classification of discrete morphological traits. Journal of Dental Research 32:554-564.

Kronmiller, J.E., Uphold, W.B. and Kollar, E.J. (1991). EGF antisense oligodeoxynucleotides block murine odontogenesis in vitro. Developmental Biology 147:485-488.

Kronmiller, J.E., Uphold, W.B. and Kollar, E.J. (1992). Alteration of murine odontogenic patterning and prolongation of expression of epidermal growth factor mRNA by retinol in vitro. Archives of Oral Biology 37:129-138.

Kruskal, J.B. (1964). Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika 29:115-129.

Lahr, M.M., Foley, R. (1994). Multiple dispersals and modern human origins. Evolutionary Anthropology 3:48-60.

Lamberg-Karlovsky, C.C. (1994). The Bronze Age khanates of Central Asia. Antiquity 68(259):398-405.

Lau, E., Mohandas, T., Shapiro, L., Slavkin, H., and M. Snead (1989). Human and mouse amelogenin gene loci are on the sex chromosomes. Genomics 4:162-8.

Lavelle, C.L.B. (1972). Secular trends in different racial groups. Angle Orthodontist 42:19-25.

Lewis, D.W., Granger, R.M. (1967). Sex-linked inheritance of tooth size. Archives of Oral Biology 12:539-544.

Line, S.R.P. (2001). Molecular morphogenetic fields in the development of human dentition. J Theor Biol 211:67–75.

Livingstone, F.B. (1991). Phylogenies and the forces of evolution. American Journal of Human Biology 3:83-89.

Lukacs, J.R. (1977). Anthropological Aspects of Dental Variation in North India: a Morphometric Analysis. Unpublished Ph.D. thesis, Cornell University, Ithaca, New York.

(1983). Dental anthropology and the origins of two Iron Age populations from Northern Pakistan. Homo 34:1-15.

(1985). Tooth size variation in prehistoric India. American Anthropologist 87:811–825.

(1986). Dental morphology and odontometrics of early agriculturalists from Neolithic Mehrgarh, Pakistan. In Teeth Revisited: Proceedings of the VIIth International Symposium on Dental Morphology, eds. D.R. Russell, J.-P. Santoro, and D. Sigogneau-Russell. Paris: Memoires du Museum National d’Histoire Naturelle, pp. 285–303.

(1992). Dental paleopathology and agricultural intensification in South Asia: new evidence from Bronze Age Harappa. American Journal of Physical Anthropology 87(2): 133-150.

Lukacs, J.R., Hemphill, B.E. (1991). The dental anthropology of prehistoric Baluchistan: a morphometric approach to the peopling of South Asia. In Recent Advances in Dental Anthropology, eds. M.A. Kelly and C.S. Larsen. New York: Alan R. Liss, pp. 77–119.

(1993). Odontometry and Biological Affinity in South Asia: Analysis of Three Ethnic Groups from Northwest India. Human Biology, 65(2): 279-325.

Lundstrom, A. (1948). Tooth Size and Occlusion in Twins. New York, Karger.

(1954). Intermaxillary tooth width ratio, tooth alignment and occlusion. Acta Odontologica Scandinavica 12:910-916.

(1955). The significance of genetic and nongenetic factors in the profile and facial skeleton. American Journal of Orthodontics 41: 410-416.

(1963). Tooth morphology as the basis for distinguishing monozygotic and dizygotic twins. American Journal of Human Genetics 15:34-43.

(1967). Genetic aspects of variation in tooth width based on symmetry and twin studies. Hereditas 57:403-409.

Lynch, M. (1989). Phylogenetic hypotheses under the assumption of neutral quantitative variation. Evolution 43:1-17.

Macaulay, V., Hill, C., Achilli, A., Rengo, C., Clarke, D., Meehan, W., Blackburn, J., Semino, O., Scozzari, R., Cruciani, F., Taha, A., Shaari, N., Raja, J., Ismail, P., Zainuddin, Z., Goodwin, W., Bulbeck, D., Bandelt, H., Oppenheimer, S., Torroni, A., Richards, M. (2005). Single, rapid coastal settlement of Asia revealed by analysis of complete mitochondrial genomes. Science 308(5724):1034-1036.

Mahalanobis, P.C. (1936). On the generalized distance in statistics. Proceedings of the National Institute of Science, India 2:49-55.

Majumder, P.P. (1998). People of India: Biological diversity and affinities. Evol Anthropol 6(3):100-110.

Malina, R. M., Little, B. B., Buschang, P. H., DeMoss, J., & Selby, H. A. (1985). Socioeconomic variation in the growth status of children in a subsistence agricultural community. American Journal of Physical Anthropology 68(3):385-391.

Masson, V.M. (1988) Altyn depe. Philadelphia: University of Pennsylvania Museum of Archaeology and Anthropology, Monograph No. 55.

McAlpin, D. (1975). Elamite and Dravidian, Further Evidence of Relationships. Current Anthropology 16(1):105-115.

(1981). Proto-Elamo-Dravidian: The Evidence and its Implications. Transactions of the American Philosophical Society 71(3):1-155.

Mehra, N.K. (2010). Defining genetic architecture of the populations in the Indian subcontinent: Impact of human leukocyte antigen diversity studies. Indian J Hum Genet 16(3):105-107.

Mellars, P.A. (2006). Going east: new genetic and archaeological perspectives on the modern human colonization of Eurasia. Science 313(5788):796-800.

Metspalu, M., Kivisild, T., Metspalu, E. et al. (2004). Most of the extant mtDNA boundaries in South and Southwest Asia were likely shaped during the initial settlement of Eurasia by anatomically modern humans. BMC Genetics 5:26.

Metspalu, M., Romero, I.G., Yunusbayev, B., Chaubey, G., Mallick, C.B., Hudjashov, G., Nelis, M., Mägi, R., Metspalu, E., Remm, M., Pitchappan, R., Singh, L., Thangaraj, K., Villems, R. and T. Kivisild (2011). Shared and unique components of human population structure and genome-wide signals of positive selection in South Asia. American Journal of Human Genetics 89(6):731-744.

Middleton, J., Rassam, A. (1995). Encyclopedia of World Cultures, Volume 9-Africa and the Middle East. G.K. Hall and Co., Boston, Massachusetts.

Miller, B. D. (1984). Daughter neglect, women's work, and marriage: Pakistan and Bangladesh compared. Medical Anthropology 8(2):109-126.

Mitsiadis T.A., Smith M.M. (2006). How do genes make teeth to order through development? J Exp Zool (Mol Dev Evol) 306B:177–82.

Mizoguchi, Y. (1986). Correlated Asymmetries Detected: The Tooth Crown Diameters of Human Permanent Teeth. Bulletin of the National Science Museum, Tokyo, Series D (Anthropology) 12:15-45.

(1988). Degree of bilateral asymmetry of nonmetric tooth crown characters quantified by the tetrachoric correlation method. Bulletin of the National Science Museum, Tokyo, Series D (Anthropology) 14:29-49.

Moorrees, C.F.A. (1957). The Aleut Dentition: A Correlative Study of Dental Characteristics in an Eskimo People. Cambridge: Harvard University Press.

Moorrees, C.F.A. (1962). Genetic considerations in dental anthropology. In Genetics and Dental Health, ed. C.J. Witkop, Jr., pp. 101-112. New York: McGraw Hill.

Morgan, D. (1986). The Mongols. Blackwell Publishers, Oxford.

O’Neill, P.W. and Hemphill, B.E. (2008). Dental Fluctuating Asymmetry among Pakistani Highlanders. Unpublished manuscript on file at the Center for South Asian Research, California State University, Bakersfield.

(2009). Considerations for the Population History of the Wakhan Corridor: An Odontometric Investigation of Wakhi Biological Affinity and Diachronic Analysis of Biological Interaction between Northern Pakistan and South Asia. American Journal of Physical Anthropology (Suppl. 48): 203.

(2012) Odontometric Investigation among Three Ethnolinguistic Groups from the Rugged Mountain Highlands of Gilgit-Baltistan, Pakistan: Testing Historical Hypotheses with Tooth Size Allocation Analysis. American Journal of Physical Anthropology (Suppl 54): 228.

O’Neill, P.W. (2013). Tracing Shina and Wakhi Origins: Are The Ethnic Classifications Commonly Used in Demographic Studies Biologically Meaningful? Masters Thesis. California State University, Bakersfield.

Osborn, J.H. (1978). Morphogenetic gradients: fields versus clones. In Development, Function, and Evolution of Teeth, eds. P.M. Butler and K.A. Joysey, pp. 171-201. New York: Academic Press.

Parpola, A. (1988). The Coming of the Aryans to Iran and India and the Cultural and Ethnic Identity of the Dāsas. Studia Orientalia 64:195-302.

Parpola, A. (1995). Formation of the Aryan Branch of Indo-European. In: R. Blench and M. Spriggs (eds.), Language and Archaeology, Vol. 3: Combining Archaeological and Linguistic Aspects of the Past. London: Routledge, pp. 1-27.

Passarino G., Semino O., Bernini L.F., and Santachiara-Benerecetti A.S. (1996). Pre-Caucasoid and Caucasoid genetic features of the Indian population, revealed by mtDNA polymorphisms. American Journal of Human Genetics 59:927-934.

Pearson, K. (1926). On the coefficient of racial likeness. Biometrika 18:105-117.

Poliakov L. (1974). The Aryan Myth. Basic Books, New York.

Potter R.H.Y., Nance W.E. (1976). A twin study of dental dimensions. I. Discordance, asymmetry and mirror imagery. American Journal of Physical Anthropology 44:391-396.

Potter R.H.Y., Nance W.E., Yu P., Davis W.B. (1976). A twin study of dental dimensions. II. Independent genetic determinants. American Journal of Physical Anthropology 44:397-412.

Qadir, F., Khan, M., Medhin, G. and M. Prince (2011). Male gender preference, female gender disadvantage as risk factors for psychological morbidity in Pakistani women of childbearing age - a life course perspective. BMC Public Health, 11:745.

Qamar R. et al. (2002). Y-chromosomal DNA variation in Pakistan. American Journal of Human Genetics 70:1107-1124.

Quintana-Murci L. et al. (2001). Y-Chromosome lineages trace diffusion of people and languages in Southwestern Asia. American Journal of Human Genetics 68:537-542.

Quintana-Murci L. et al. (2004). Where West meets East: The complex mtDNA landscape of the Southwest and Central Asian Corridor. American Journal of Human Genetics 74:827-845.

Renfrew, C. (1987). Archaeology and Language: The Puzzle of Indo-European Origins. New York: Cambridge University Press.

Saitou, N. and Nei, M. (1987). The neighbor-joining method: a new method for reconstructing evolutionary trees. Mol Biol Evol 4:406-425.

Sarianidi, V. (1999). Near Eastern Aryans in Central Asia. J. Indo-Eur. Stud. 27(3-4):295-326.

Saunders, S.R. and Mayhall, J.T. (1982). Developmental patterns of human dental morphological traits. Archives of Oral Biology 27:45-49.

Scott, G.R. and Turner II, C.G. (1997). The Anthropology of Modern Human Teeth: Dental Morphology and its Variation in Recent Human Populations. Cambridge: Cambridge University Press.

Schofield, V. (2003). Afghan Frontier: Feuding and Fighting in Central Asia. Tauris Parke Paperbacks: I.B. Tauris and Co. Ltd. New York, NY.

Sengupta, S., Zhivotovsky, L.A., King, R., Mehdi, S.Q., Edmonds, C.A., Chow, C.E., Lin, A.A., Mitra, M., Sil, S.K., Ramesh, A., Usha Rani, M.V., Thakur, C.M., Cavalli-Sforza, L.L., Majumder, P.P., and P.A. Underhill (2006). Polarity and temporality of high-resolution Y-chromosome distributions in India identify both indigenous and exogenous expansions and reveal minor genetic influence of Central Asian pastoralists. American Journal of Human Genetics 78(2):202-221.

Sharpe, P.T. (1995). Homeobox genes and orofacial development. Connect Tissue Res 32:17–25.

Sidky, H. (1995). Hunza: an Ethnographic Outline. Illustrated Book Publisher, Jaipur.

Siegel, M.I., and W.J. Doyle (1975). The differential effects of prenatal and postnatal audiogenic stress on fluctuating asymmetry. J. Exp. Zool. 191:211-214.

Smith, C. (1972). Coefficients of biological distance. Ann Hum Genet 36:241-245.

(1974). Concordance in twins: methods and interpretations. American Journal of Human Genetics 26:454-464.

(1975). Quantitative inheritance. In: Textbook of Human Genetics, eds. G Fraser and O Mayo, pp 382-441. Oxford, Blackwell.

Sokal, R.R. and Sneath, P.H.A. (1963). Principles of Numerical Taxonomy. San Francisco: W.H. Freeman.

Southworth, F.C. (1995). Reconstructing social context from language: Indo-Aryan and Dravidian prehistory. In Erdosy, G. (ed.), The Indo-Aryans of Ancient South Asia. Berlin: Walter de Gruyter, pp. 258-277.

Stini, W. A. (1969). Nutritional stress and growth: sex difference in adaptive response. American Journal of Physical Anthropology 31(3): 417-426.

(1972). Reduced sexual dimorphism in upper arm muscle circumference associated with protein‐deficient diet in a South American population. American Journal of Physical Anthropology 36(3): 341-351.

Stinson, S. (1985). Sex differences in environmental sensitivity during growth and development. Yrbk Phys Anthropol 28:123–147.

Takahashi M, Kondo S, Townsend G, Kanazawa E (2007). Variability in cusp size of human maxillary molars, with particular reference to the hypocone. Archs Oral Biol 52:1146–54.

Tartaglia, M., Scacchi R., Corbo R.M., Pompei F., Rickards O., Ciminelli B.M., Sangatramani T., Vyas M., Dash S., and Modiano G. (1995). Genetic heterogeneity among the Hindus and their relationships with other “Caucasoid” populations: New data on Punjab-Haryana and Rajasthan Indian states. American Journal of Physical Anthropology 98:257-273.

Ten Cate, A.R. (1994). Oral Histology: Development, Structure, and Function. 4th edn. St. Louis: Mosby.

Thesiger, Wilfred (1955). The Hazara of Central Afghanistan. The Geographical Journal 121(3):312-319.

Thesleff, Irma (1995) Homeobox genes and growth factors in regulation of craniofacial and tooth morphogenesis. Connect Tissue Res 53 (3): 129-134.

Townsend, G.C. (1981). Fluctuating Asymmetry in Deciduous Dentition of Australian Aborigines. Journal of Dental Research 60:1849-1857.

Townsend, G.C. and T. Brown (1978a). Inheritance of tooth size in Australian Aborigines. American Journal of Physical Anthropology 48:305-314.

(1978b). Heritability of permanent tooth size. American Journal of Physical Anthropology 49:497-502.

(1980). Dental Asymmetry in Australian Aboriginals. Human Biology 52: 661-673.

Townsend G., Harris E., Lesot H., Clauss F., and Brook A. (2009). Morphogenetic fields within the human dentition: A new, clinically relevant synthesis of an old concept. Archives of Oral Biology 54:34-44.

Underhill, P.A., Passarino, G., Lin, A., Shen, P., Lahr, M., Foley, R., Oefner, P., Cavalli-Sforza, L. (2001). The phylogeography of Y chromosome binary haplotypes and the origins of modern human populations. Annals of Human Genetics 65:43-62.

UNESCO World Heritage Centre (2003) Advisory Body Evaluation; Bamiyan Valley, Afghanistan. http://whc.unesco.org/archive/advisory_body_evaluation/208rev.pdf

Ward, J.H. (1963). Hierarchical grouping to optimize an objective function. Journal of the American Statistical Association 58:236-244.

Weiner, J.S. and Huizinga, J. (Eds.) (1972). The Assessment of Population Affinities in Man. Oxford: Clarendon Press.

Weiss, K.M. (1990). Duplication with variation: metameric logic in evolution from genes to morphology. Yearbook of Physical Anthropology 33:1-23.

Wells, R., Yuldasheva, N., Ruzibakiev, R., Underhill, P., Evseeva, I., Blue-Smith, J., Jin, L., Su, B., Pitchappan, R., Shanmugalakshmi, S., Balakrishnan, K., Read, M., Pearson, N., Zerjal, T., Webster, M., Zholoshvili, I., Jamarhashvili, E., Gambarov, S., Nikbon, B., Dostiev, A., Aknazarov, O., Zalloua, P., Tsoy, I., Kitaev, M., Mirrakhimov, M., Chariev, A., Bodmer, W. (2001). The eurasian heartland: a continental perspective on Y-chromosome diversity. Proceedings of the National Academy of Sciences USA 98(18):10244-10249.

Willis, C.A., Hemphill, B.E. (2008). Are the Burusho an indigenous population of the Northern Areas, Pakistan? An odontometric investigation. American Journal of Physical Anthropology 135:223.

Willis, C.A. (2010). Are the Burusho an indigenous population of the Northern Areas, Pakistan?: A comparison of biological affinities across modern and ancient populations. Masters Thesis. California State University, Bakersfield.

Willits, N., Hemphill, B.E. (2007). Are the Kho Indigenous Inhabitants of the Hindu Kush? I. An Odontometric Investigation. American Journal of Physical Anthropology (Suppl. 44): 251.

Witzel, M. (1995). Early Indian History: Linguistic and Textual Parametres. In Erdosy, G. (ed.), The Indo-Aryans of Ancient South Asia. Berlin: Walter de Gruyter, pp. 85-125.

Zerjal, T., Xue, Y., Bertorelle, G., Wells, S., Bao, W., Zhu, S., Qamar, R., Ayub, Q., Mohyuddin, A., Fu, S., Li, P., Yuldasheva, N., Ruzibakiev, R., Xu, J., Shu, Q., Du, R., Yang, H., Hurles, M., Robinson, E., Gerelsaikhan, T., Dashnyam, B., Mehdi, S., and C. Tyler-Smith (2003). The Genetic Legacy of the Mongols. American Journal of Human Genetics 72:717-721.

Zerjal, T., Wells, R.S., Yuldasheva, N. and Ruzibakiev, R. (2002). A genetic landscape reshaped by recent events: Y-chromosomal insights into Central Asia. American Journal of Human Genetics 71:466-482.

Zhong, H., Shi, H., Qi, H., Duan, Z., Tan, P., Jin, L., Su, B., Ma, R. (2010). Extended Y chromosome investigation suggests postglacial migrations of modern humans into East Asia via the northern route. Molecular Biology and Evolution 28(1):717-727.

Appendices

A. Descriptive Statistics of the Hazara

Left-Side Measurements

Tooth &                           Minimum   Maximum   Standard
dimension   Sex   N     Mean      Value     Value     Deviation
LM2MD       M     79    9.566     7.44      10.92     0.746
LM2MD       F     87    8.885     7.71      10.58     0.613
LM2BL       M     77    9.226     7.33      11.41     0.736
LM2BL       F     74    8.679     7.18      9.97      0.611
LM1MD       M     75    10.454    8.72      11.96     0.576
LM1MD       F     84    10.017    6.38      11.48     0.779
LM1BL       M     78    9.598     6.76      10.90     0.665
LM1BL       F     83    9.202     6.03      10.36     0.660
LP4MD       M     86    6.296     5.17      8.06      0.547
LP4MD       F     102   6.167     4.82      10.62     0.669
LP4BL       M     86    7.272     5.76      8.69      0.593
LP4BL       F     101   6.955     5.52      9.93      0.647
LP3MD       M     88    6.211     4.68      7.47      0.482
LP3MD       F     103   6.185     5.18      7.26      0.434
LP3BL       M     87    6.778     5.40      8.42      0.599
LP3BL       F     103   6.454     5.42      7.67      0.516
LCMD        M     84    6.109     4.77      7.48      0.442
LCMD        F     99    5.805     4.5       6.87      0.442
LCBL        M     85    6.015     4.43      8.39      0.833
LCBL        F     99    6.010     4.57      7.28      0.566
LI2MD       M     77    5.343     4.12      6.68      0.459
LI2MD       F     79    5.223     4.09      6.37      0.437
LI2BL       M     85    5.074     3.39      6.68      0.818
LI2BL       F     95    4.870     3.62      6.58      0.590
LI1MD       M     74    4.820     3.81      5.88      0.388
LI1MD       F     82    4.782     3.94      7.42      0.491
LI1BL       M     76    4.767     3.06      7.14      0.811
LI1BL       F     94    4.676     3.53      6.20      0.522
UM2MD       M     79    8.705     6.96      11.18     0.922
UM2MD       F     80    8.389     6.83      10.24     0.826
UM2BL       M     81    9.640     7.64      11.68     0.768
UM2BL       F     87    9.302     7.92      10.98     0.673
UM1MD       M     83    10.027    8.34      11.89     0.597
UM1MD       F     90    9.437     8.04      11.39     0.669
UM1BL       M     82    10.005    8.66      12.17     0.681
UM1BL       F     91    9.590     4.77      11.91     0.914
UP4MD       M     87    5.871     4.68      7.68      0.492
UP4MD       F     99    5.811     4.66      7.60      0.536
UP4BL       M     89    7.970     5.78      9.57      0.623
UP4BL       F     99    7.598     5.45      9.03      0.632
UP3MD       M     89    6.161     4.92      7.28      0.470
UP3MD       F     102   6.086     4.71      6.62      0.470
UP3BL       M     89    7.803     6.11      9.79      0.746
UP3BL       F     100   7.362     5.79      9.18      0.642
UCMD        M     86    7.027     5.84      8.34      0.478
UCMD        F     100   6.651     5.28      8.18      0.495
UCBL        M     78    6.572     4.57      8.99      0.956
UCBL        F     98    6.515     5.15      7.58      0.587
UI2MD       M     67    6.089     4.67      7.96      0.705
UI2MD       F     83    5.733     4.17      7.95      0.674
UI2BL       M     73    4.978     3.69      7.02      0.816
UI2BL       F     92    4.762     3.77      8.59      0.751
UI1MD       M     74    7.852     5.05      9.07      0.664
UI1MD       F     88    7.388     5.14      8.93      0.751
UI1BL       M     80    5.716     3.51      8.58      1.223
UI1BL       F     94    5.506     3.98      8.17      0.770

B. Euclidean Distance Matrix

Samples: KHO ALT GPD GKS PNT GRS CHU RAJ BHI KUZ MDK SWT WKG INM DJR MOL SAP NRG CRG HAR TMG SKH WKS HAZ BUR SHA

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2.799 3.6569 3.5043 2.4119 2.7268 2.485 3.5102 3.6494 2.2411 2.7222 2.4069 2.1502 2.582 3.1369 3.0199 3.162 3.386 3.7545 3.6667 3.9156 2.7042 3.5773 3.3313 3.3366 2.2956 1.8737 1.8981 1.9688 5.1366 3.2215 4.5865 3.2754 3.9087 3.9438 1.0041 2.5374 3.0452 3.6278 4.008 2.9445 3.9307 3.7751 1.9411 2.007 2.6148 1.4003 2.9533 3.9247 3.6273 1.3181 1.7067 2.2302 3.3446 4.2101 3.8958 2.1067 2.0903 2.842 1.9209 1.4349 3.3067 2.9719 4.5553 2.2085 3.6897 2.2389 2.8974 3.9463 2.0806 3.3413 3.5118 3.7058 3.2382 3.1372 2.0455 4.2344 2.9317 1.6496 2.3849 4.3969 2.5929 1.9741 2.311 4.5072 2.7799 2.1018 2.2553 2.6077 3.2411 1.6106 2.566 2.3651 1.9997 3.6817 2.7086 2.6701 2.2772 3.5059 1.9245 2.9416 2.0446 3.024 2.3371 2.6444 2.3929 3.6073 2.5407 2.8045 2.8427 4.4113 2.8779 3.6798 3.4189 3.452 2.5277 3.8905 3.0994 3.3313 3.4794 3.5175 2.5159 4.2846 2.6387 2.9725 2.352 3.565 2.6741 3.8517 3.2152 2.6843 3.1254 3.1459 2.847 1.4587 2.1559 1.4712 4.4839 1.9622 4.6835 2.6608 5.2407 2.6441 3.7691 2.6793 1.9408 2.7159 3.3386 3.0003 1.8766 4.3434 3.7976 4.1323 3.9477 3.8546 3.3245 3.113 3.6785 4.0982 3.3853 3.381 3.9158 3.9014 2.9016 3.4111 3.611 4.0797 3.7944 4.2443 2.746 3.8805 3.812 4.3043 2.8018 2.8281 3.9562 3.0788 2.5765 3.1037 2.7607 3.2319 3.5264 3.6239 3.1889 3.6518 3.8807 3.6272 3.6027 3.3395 3.0662 2.8892 3.1738 4.4018 2.5146 4.6654 4.6937 2.8585 2.3134 2.3667 2.5432 1.9712 2.7502 2.9746 3.0227 3.5813 3.4955 3.8627 4.2252 3.939 2.19 2.4328 4.3034 2.3128 2.4785 2.4737 2.8458 3.5564 2.7624 2.515 2.8699 3.2947 2.3923 2.8415 2.9117 3.0819 2.102 2.5801 1.6631 2.843 2.2883 2.6016 3.4186 2.7229 3.4012 4.9418 2.79 3.1176 4.5489 3.3364 3.3144 3.299 3.5526 3.4344 3.3046 3.5716 3.2084 2.9783 3.3781 3.0906 2.6213 2.2302 3.1346 3.0863 2.3245 2.8622 2.3153 1.8933 6.3432 3.1378 7.2133 2.8755 6.1185 2.7659 4.8646 2.9864 4.7814 2.5596 5.6234 2.4543 4.7962 2.7025 4.8344 2.9067 4.329 2.5168 5.1034 2.7747 4.9033 3.6045 5.4615 3.5586 6.2891 3.2008 6.5887 4.2515
4.7038 2.1408 6.2689 3.5363 7.0783 2.8027 2.4631 6.808 2.0927 7.0297 1.7145 5.5774 0.8539 5.0046 5.2222 5.112 2.4769 3.4257 3.2898 2.9761 3.2094 2.8275 2.8675 3.2131 3.6193 3.0942 3.2398 4.0834 4.3166 3.9706 4.5692 2.4776 3.7967 3.1841 3.2247 2.3383 1.8273 1.1168 1.4914 5.4184 1.9206
