
J. Anim. Breed. Genet. ISSN 0931-2668

ORIGINAL ARTICLE

Thoroughbred racehorse mitochondrial DNA demonstrates closer than expected links between maternal genetic history and pedigree records

M.A. Bower1, M. Whitten2,*, R.E.R. Nisbet3, M. Spencer4, K.M. Dominy5, A.M. Murphy6, R. Cassidy7, E. Barrett1, E.W. Hill8 & M. Binns2,†

1 McDonald Institute for Archaeological Research, University of Cambridge, Cambridge, UK
2 Department of Veterinary Basic Sciences, Royal Veterinary College, London, UK
3 Sansom Institute for Health Research, University of South Australia, Adelaide, SA, Australia
4 School of Environmental Sciences, University of Liverpool, Liverpool, UK
5 Department of Medical and Molecular Genetics, School of Medicine, King’s College London, Guy’s Hospital, London, UK
6 Department of Veterinary Basic Sciences, Royal Veterinary College, London, UK
7 Department of Anthropology, Goldsmiths College, London, UK
8 School of Agriculture, Food, Science and Veterinary Medicine, University College Dublin, Dublin, Ireland

Keywords
General Studbook; maternal lineage; mitochondrial DNA; pedigree records; thoroughbred.

Correspondence
M.A. Bower, McDonald Institute for Archaeological Research, University of Cambridge, Downing Street, Cambridge CB2 3ER, UK. Tel: 01223 339330; Fax: 07990 514733; E-mail: [email protected]

Current Address:
*Max Planck Research Group on Comparative Population Linguistics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
†Equine Analysis Systems, Midway, KY 40347, USA

Received: 30 March 2012; accepted: 16 June 2012

Summary

The potential future earnings and therefore value of Thoroughbred horses untested in the racing arena are calculated based on the performance of their forebears. Thus, lineage is of key importance. However, previous research indicates that maternally inherited mitochondrial DNA (mtDNA) does not correspond to maternal lineage according to recorded pedigree, casting doubt on the veracity of historic pedigrees. We analysed mtDNA of 296 Thoroughbred horses from 33 maternal lineages and identified an interesting trend. Subsequent to the founding of the Thoroughbred breed in the 16th century, well-populated maternal lineages were divided into sub-lineages. Only six in 10 of the Thoroughbreds sampled shared mitochondrial haplotype with other members of their maternal lineage, despite having a common maternal ancestor according to pedigree records. However, nine in 10 Thoroughbreds from the 103 sub-lineages sampled shared mtDNA with horses of their maternal pedigree sub-lineage. Thus, Thoroughbred maternal sub-lineage pedigree represents a more accurate breeding record than previously thought. Errors in pedigrees must have occurred largely, though not exclusively, at sub-lineage foundation events, probably due to incomplete understanding of modes of inheritance in the past, where maternal sub-lineages were founded from individuals that were related, but not by female descent.

© 2012 Blackwell Verlag GmbH • J. Anim. Breed. Genet. (2012) 1–9 doi:10.1111/j.1439-0388.2012.01018.x

Introduction

Horse racing is one of the most popular sports worldwide. The majority of horse races involve the Thoroughbred breed, founded in the British Isles in the 16th and 17th centuries. Native British mares were crossed with imported Middle Eastern stallions to give rise to the elite Thoroughbred we know today (Bower et al. 2011). Extensive pedigree records have been kept of all Thoroughbreds since the establishment of the breed to inform breeding choices (Huggins 2000). These records are held in the General Studbook (GSB), the first edition of which was published at the end of the 18th century (Weatherby 1791). The GSB represents the oldest and most complete such record in the world. However, recent scientific research comparing the mitochondrial DNA (mtDNA) of Thoroughbred horses has called into question the veracity of the pedigrees recorded in the GSB (Hill et al. 2002; Harrison & Turrion-Gomez 2006) and thus the validity of the General Studbook as a record of genetic inheritance. Mitochondrial D-loop DNA is highly variable and has been used extensively for phylogenetic reconstruction of domestic horse origins (Jansen et al. 2002; McGahern et al. 2006; Lei et al. 2009), including the Thoroughbred (Bower et al. 2011). However, it is also a useful tool in pedigree reconstruction (Bowling et al. 2000; Luís et al. 2006; Álvarez et al. 2012).

Because mitochondrial DNA is maternally inherited, it follows that horses from the same maternal lineage according to pedigree records should share the same mitochondrial sequence. An analysis of 100 Thoroughbreds showed that eight of 19 maternal lineages contained horses with more than one mitochondrial haplotype, despite being related by direct female descent according to the General Studbook (Hill et al. 2002). Subsequently, in a much larger study, Harrison & Turrion-Gomez (2006) identified 28 of 33 Thoroughbred lineages that had horses with mtDNA different to that of the founding female of their maternal lineage, suggesting extensive errors in pedigree records. This has been attributed to confusion of mares at the foundation stages of the breed and variability in how accurately or ably early breeding records were kept, prior to the formalization of the breeding record in 1791, although some are attributable to more recent recording events (19th century – 1980).

Subsequent to the founding of the Thoroughbred breed, well-populated maternal lineages were divided into sub-lineages (e.g. Lowe Family 1a, 1b etc., Lowe 1913). A sub-lineage was formed where a lineage branched from multiple maternally related horses (e.g. multiple daughters of a given mare). Being a sub-section of a maternal lineage, sub-lineage horses should share mtDNA, not only with their sub-lineage founder, but also with their original maternal lineage founder, or tap-root mare. However, if horses share mtDNA with members of their maternal sub-lineage but not their maternal lineage, this would indicate that sub-lineages represent multiple foundation events involving horses not related by female descent.

We hypothesized that the majority of the errors in the GSB pedigree records occurred at sub-lineage founding events. To investigate this, we surveyed 296 Thoroughbred horses (the largest Thoroughbred mtDNA data set available in the public domain to date). We compared mtDNA and sub-lineage pedigrees to investigate whether errors in Thoroughbred pedigrees occurred at maternal sub-lineage foundation events. If maternal sub-lineage is a more accurate indicator of maternal genetic ancestry than lineage alone, it would suggest that the GSB is a more accurate record of past breeding choices than previously demonstrated, if only at the sub-lineage level.

Materials and methods

Mitochondrial DNA sequences for n = 296 Thoroughbred horses were included in this study and comprised n = 196 Thoroughbred horse mtDNA sequences generated in the study and n = 100 mtDNA sequences from the study of Hill et al. (2002). Maternal lineages and sub-lineages were numbered following the Lowe Family Figure system (Lowe 1913; Bobinski 1953). Pedigrees were traced back to the founding mare of the maternal lineage of each individual (Bobinski 1953; Weatherby 2010). Close relatives (less than three generations) and individuals for whom pedigree records were not available were excluded (n = 10).

DNA isolation, amplification and sequencing

Whole genomic DNA was isolated from pulled hairs according to Allen et al. (1998) and from blood samples using a Nucleon DNA Extraction Kit (Amersham Biosciences UK Limited, Little Chalfont, Buckinghamshire, UK) following the manufacturer’s instructions.

Amplification of the equine mitochondrial D-loop was performed by PCR. Reactions were performed in 12.5 µl consisting of 1 µl of DNA extract, 1× HiSpec additive (Bioline, UK), 1× PCR buffer (Bioline UK Limited, London, UK), 2.41 mM MgCl2, 25 µM each of dNTP and 0.25 U Immolase hot start DNA polymerase (Bioline). 0.4 µM of each primer was used. Forward primer: 5′ACCCTGGTCTTGTAAACCAG, reverse primer: 5′TGGTTGCTGATGCGGA (Hill et al. 2002). Thermocycling conditions were as follows: initial activation at 94°C for 10 min was followed by 39 cycles of 94°C for 1 min, 52°C for 1 min and 72°C for 1 min. Sequencing reactions were carried out in both directions using the Big Dye® Terminator v3.1 cycle sequencing kit (Applied Biosystems®, Life Technologies Ltd., Paisley, UK) and internal sequencing primer 5′GTTATGTGTGAGCATGGGC. Sequencing products were analysed on an ABI 3100 Genetic Analyzer, and base calling was performed using ABI Prism® AB DNA Sequencing Analysis Software v5.1.1 (Applied Biosystems®). All PCR amplicons were sequenced in two directions, sequences were proof read and compared, and contigs were constructed using Mega 4 (Kumar et al. 2004).
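The comparison of reads sequenced in both directions can be illustrated with a minimal Python sketch. It assumes that both reads have already been trimmed to the same region, and it only checks for exact agreement between strands. The function names and the toy reads are illustrative assumptions and are not the software used in the study, where contigs were assembled in MEGA.

```python
# Illustrative sketch only: function names and toy reads are assumptions,
# not the software or data used in the study.

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def disagreements(forward_read: str, reverse_read: str) -> list[int]:
    """List 0-based positions at which the two reads disagree.

    The reverse read is reverse-complemented first so that both reads
    are on the same strand. Ambiguous calls (N) are not counted.
    """
    reoriented = reverse_complement(reverse_read)
    if len(forward_read) != len(reoriented):
        raise ValueError("reads must be trimmed to the same length")
    return [
        i
        for i, (f, r) in enumerate(zip(forward_read.upper(), reoriented))
        if f != r and "N" not in (f, r)
    ]


# The forward primer sequence from the text serves as a toy read here.
forward = "ACCCTGGTCTTGTAAACCAG"
reverse = reverse_complement(forward)  # a perfectly matching reverse-strand read
print(disagreements(forward, reverse))  # []
```

Any position returned by `disagreements` would be a candidate for manual inspection of the trace, which is the proofreading step described above.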


All unique sequences were re-amplified and resequenced using high-fidelity Taq polymerase (Platinum Taq HiFi; Invitrogen, Life Technologies Ltd., Paisley, UK) to confirm haplotype signature. Sequences were deposited in GenBank (EU580148–EU580172).

Phylogenetic analyses

Sequences were aligned in MEGA 4 (Tamura et al. 2007) using the CLUSTAL W algorithm (Higgins et al. 1994) and truncated to 343 base pairs for analysis to be compatible with the Hill et al. (2002) data.

Maximum likelihood phylogeny

PHYML version 2.4.4 for LINUX (Guindon & Gascuel 2003) was used to estimate maximum likelihood trees, with an initial BIONJ tree (Gascuel 1997). The HKY + gamma substitution model with four gamma categories (Hasegawa et al. 1985) was used following hierarchical likelihood ratio tests using MODELTEST 3.7 (Posada & Crandall 1998) with PAUP 4.0b10 for UNIX (Swofford 2003). Transition/transversion ratios and the shape parameter for the gamma distribution were estimated from the data, both topology and edge lengths were optimized, and 1000 bootstrap replicates were run to estimate our uncertainty about the maximum likelihood tree, given the methods used and the size of the data set (Felsenstein 2004).

Median joining networks

Median joining networks were inferred using the program NETWORK 4.5 (www.fluxus-engineering.com), and default settings were chosen (r = 2 and e = 0) (Bandelt et al. 1999). Preliminary trials indicated a mutational hotspot (mitochondrial D-loop, nucleotide position 15 585, as hypothesized by Jansen et al. 2002), which we excluded. Median joining networks were inferred with nodes proportional to the frequency of each given haplotype and the branch lengths proportional to the genetic distance between the nodes. Trees and networks were rooted with equid species following Lei et al. (2009).

Results

We analysed mtDNA sequence data from n = 296 Thoroughbred horses comprising n = 196 Thoroughbred mitochondrial D-loop sequences generated in this study and n = 100 mtDNA sequences from Hill et al. (2002), representing the 33 major maternal lineages in the current Thoroughbred population, sub-divided into 103 sub-lineages. In some cases, individual sub-lineages are under-represented in the data; therefore, where sample number is below n = 3, results are not discussed; however, low sample number sub-lineages also followed the observed pattern discussed later.

In total, 25 unique mitochondrial sequences were identified in the 296 Thoroughbred horses. This is an increase of 10 haplotypes from those previously reported (Hill et al. 2002; Harrison & Turrion-Gomez 2006). Forty-two polymorphic sites were identified comprising three indels and 39 transitions (Table 1).

Ninety-two per cent of the Thoroughbreds we sampled shared mitochondrial haplotype with horses of their maternal pedigree sub-lineage, that is, they had identical mtDNA to the sub-lineage founding mare. Of the sub-lineages sampled (n = 103, 3 individuals or more: n = 76), only 10 (13%) shared mitochondrial haplotype with a sub-lineage that was not their own (Sub-lineages 1n, n = 4; 2f, n = 7; 8c, n = 5; 19c, n = 3; 11g, n = 4; 20, n = 3; 23a, n = 3; 23b, n = 3; and 9b, n = 6). In contrast, only 61% shared mitochondrial haplotype with members of their overall maternal lineage, despite having a common maternal ancestor according to pedigree records, that is, the remainder did not have identical mtDNA to the lineage founding mare (tap-root mare). Therefore, sub-lineage is a more accurate record of genetic history than lineage. Many of the discrepancies between pedigree and genetic lineage occurred at the foundation of pedigree sub-lineages.

Median joining networks (Figure 1) and maximum likelihood (ML) trees (Figure 2) were inferred. Details of the proportion of lineages and sub-lineages present within haplogroups are in Table 2. Haplotype sharing occurred around a group of 135 horses forming five clusters of unique sequences separated by a single nucleotide each (Figure 1), which include Lineages 9 and 12 (n = 13, n = 16); 10 and 14 (n = 18, n = 4); 3 (n = 14); 1, 2 and 8 (n = 7, n = 32, n = 16); 7 and 22 (n = 9, n = 11). AMOVA and principal component analysis of Thoroughbred mtDNA in comparison with 2000 Eurasian domestic horse mtDNA sequences show that Thoroughbred maternal lineages encompass the entire range of Eurasian horse populations (Bower et al. 2011). Thus, the chance of two founder females sharing mtDNA is low, as they were drawn from a highly diverse population. Instead, our data suggest that these lineages had closely related foundation mares.

Although all members of a given lineage were expected to have identical mtDNA, only Lineage 25


Table 1 Variable nucleotide positions in a 343-bp fragment of mtDNA D-loop in the 33 major Thoroughbred maternal families. Where possible, haplotypes are related to those identified by Hill et al. (2002), indicated in column 2. Reference sequence from GenBank X79547.

Columns: maternal lineage or sub-lineage; Hill et al. (2002) haplotype; nucleotide state at each variable position; Jansen et al. (2002) haplotypes.

Variable positions: 15465 15478 15479 15486 15494 15495 15496 15510 15521 15526 15532 15534 15538 15540 15542 15563 15574 15584 15585 15596 15597 15601 15602 15603

Ref: CCCATCA-GTCCAACAGCGAATCT

Variable positions (continued): 15604 15617 15635 15649 15650 15651 15659 15666 15684 15703 15709 15718 15720 15737 15770 15771 15806 15807 15810 15825 15826 15827

Ref (continued): GTCAAGTGCTCCATCCCCACAA

Rows: Sub-lineage 9a; Lineage 6; Lineages 10, 14, 42; Lineages 3 and 15; Lineages 2, 8, 16; Lineages 9 and 12; Lineage 1; Lineage 25; Lineages 4, 11, 13; Lineage 19; Lineage 5; Sub-lineage 5e; Lineages 6, 20, 23; Lineage 20; Lineage 23; Lineages 2 and 18; Lineages 7, 17, 22; Sub-lineage 8c; Sub-lineage 11f; Sub-lineage 12b; Lineage 21; Lineage 26; Lineage A4; Lineage B3; Lineage B4.

[Table body: the per-row nucleotide states and the Hill et al. (2002) and Jansen et al. (2002) haplotype assignments are not recoverable from the extracted text.]
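A table of variable positions such as Table 1 can be derived from an alignment by finding the columns at which sequences differ and then writing each haplotype relative to a reference, with a dot for the reference state. The following Python sketch shows the idea on a short toy alignment. The sequences and function names are hypothetical and are not the study data or the authors' software.

```python
from collections import Counter

# Illustrative sketch only: toy sequences and function names are assumptions.


def variable_sites(alignment: list[str]) -> list[int]:
    """Return the 0-based columns at which aligned sequences differ."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("sequences must be aligned to equal length")
    return [i for i in range(length) if len({seq[i] for seq in alignment}) > 1]


def dot_row(reference: str, seq: str, sites: list[int]) -> str:
    """Render a sequence at the variable sites, '.' meaning reference state."""
    return "".join("." if seq[i] == reference[i] else seq[i] for i in sites)


reference = "CCATCAGT"                                        # hypothetical reference
samples = ["CCATCAGT", "CCACCGGT", "CCACCGGT", "TCATCAGT"]  # hypothetical horses

sites = variable_sites([reference] + samples)
haplotypes = Counter(samples)  # identical sequences collapse into one haplotype

print(sites)  # [0, 3, 5]
for haplotype, count in haplotypes.items():
    print(dot_row(reference, haplotype, sites), count)
```

In this sketch a gap character would be treated like any other state, so an indel column is reported as variable in the same way as a substitution.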

Figure 1 A median joining network of mitochondrial D-loop sequences from 296 Thoroughbred horses. The area of nodes is proportional to the frequency of the haplotype; numbers on the branches represent the number of nucleotide substitutions between nodes. Details of the proportion of pedigree lineage and sub-lineage horses present in each node are in Table 2.
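The branch labels in a network of this kind count nucleotide differences between haplotypes. A minimal Python sketch of that count for every pair of haplotypes is given below. It covers only the pairwise distance, not the median joining algorithm itself, and the haplotype names and sequences are hypothetical.

```python
from itertools import combinations

# Illustrative sketch only: haplotype names and sequences are assumptions.


def pairwise_differences(a: str, b: str) -> int:
    """Count the positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b))


haplotypes = {
    "hapX": "ACGTACGT",
    "hapY": "ACGTACGA",
    "hapZ": "GCGTATGA",
}

distances = {
    (first, second): pairwise_differences(haplotypes[first], haplotypes[second])
    for first, second in combinations(haplotypes, 2)
}

for pair, distance in distances.items():
    print(pair, distance)
```

With these toy sequences, hapX and hapY differ at one position, hapY and hapZ at two, and hapX and hapZ at three.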

Figure 2 A maximum likelihood tree of mitochondrial D-loop sequences from 296 Thoroughbred horses using the HKY + gamma model of evolution, with 1000 bootstrap replicates, rooted with hemionus and Equus . Many zero-length terminal edges in the maximum likelihood tree are present, as expected if two horses shared the same maternal DNA. A mutation in the offspring of one sister but not in the offspring of the other sister would result in zero-edge branching.

(n = 6) formed a discrete phylogenetic group. Four lineages (3, n = 20; 6, n = 14; 9, n = 25; and 23, n = 12) were split into multiple, phylogenetically distinct groups. Split lineages were separated by a minimum of six nucleotide positions, suggesting that de novo mutations were not responsible for this


Table 2 Pedigree sub-lineage of 296 Thoroughbred horses which share maternal genetic lineage as defined in Figures 1 and 2

Molecular genetic lineage    Pedigree sub-lineage Thoroughbreds that share mitochondrial haplotype

Lineage 1                    01e (2) 01k (5) 01l (3) 01m (4) 01n (1) 01s (1) 01t (1) 16 (1) A1 (1)
Lineages 2, 8 and 16         01p (2) 01n (3) 01u (2) 02d (4) 02e (3) 02f (5) 02i (11) 02n (3) 02o (4) 02s (2) 06e (2) 08a (9) 08c (4) 08d (2) 08h (1) 16a (1) 16c (1) 16g (1) 16h (1) 20 (1) 52 (1)
Lineages 3 and 18            03c (4) 18 (2) 18a (1) A48 (1)
Lineages 3 and 15            03b (3) 03d (3) 03e (2) 03g (6) 03l (1) 03o (1) 15a (1) 19c (1)
Lineages 4, 11 and 13        04c (4) 04d (3) 04j (2) 04k (3) 04l (2) 04r (2) 11 (1) 11a (1) 11d (1) 11f (1) 11g (1) 13a (1) 13b (1) 13c (1)
Lineage 5                    05g (3) 05h (1)
Sub-lineage 5e               05e (1)
Lineage 6                    06a (3)
Lineages 6, 20 and 23        06b (2) 06d (4) 06f (3) 20 (2) 23 (7) 23a (2) 23b (1)
Lineages 7, 17 and 22        7 (4) 07a (2) 07f (3) 17b (2) 22 (2) 22a (7) 22b (1) 22d (1)
Sub-lineage 8c               08c
Lineages 9 and 12            02f (2) 09b (3) 09c (10) 12c (10) 12d (5) 12f (1) A29 (1)
Lineage 9                    9 (4) 09a (3) 09e (1) 09f (1)
Lineages 10, 14 and 42       09b (3) 10a (8) 10c (8) 11g (3) 14a (1) 14b (1) 14c (1) 14f (1) 42 (1)
Sub-lineage 11f              11f (1)
Sub-lineage 12b              12b (1)
Lineage 19                   5 (3) 19 (2) 19b (4) 19c (1)
Lineage 20                   02a (2) 19c (1) 20 (1) 20a (1) 20c (1) 20d (1)
Lineage 21                   21a (9) 23a (1)
Lineage 23                   23b (2)
Lineage 25                   25 (6)
Lineage 26                   26 (1)
Lineage A4                   A4 (3)
Lineage B3                   B3 (3)
Lineage B4                   B4 (3)

The number of individuals is given in brackets. Molecular genetic lineages are numbered according to Lowe Family Number (Lowe 1913). A complete sample list, including maternal sub-lineage founding dam and sub-lineage foundation date, is available in Table S1.
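The contrast between agreement at lineage level and at sub-lineage level can be illustrated with a short Python sketch. It assumes one simple definition of agreement, namely the fraction of horses that carry the most common haplotype within their pedigree group. That definition, the function name and the toy records are assumptions made for illustration and are not necessarily the calculation behind the percentages reported in the text.

```python
from collections import Counter, defaultdict

# Illustrative sketch only: the concordance definition (modal haplotype per
# group), the function name and the toy records are assumptions.


def concordance(records: list[dict], key: str) -> float:
    """Fraction of horses carrying the modal haplotype of their group."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record["haplotype"])
    matching = sum(
        Counter(haplotypes).most_common(1)[0][1] for haplotypes in groups.values()
    )
    return matching / len(records)


# Hypothetical horses: pedigree lineage, pedigree sub-lineage, mtDNA haplotype.
records = [
    {"lineage": "3", "sub_lineage": "3b", "haplotype": "E"},
    {"lineage": "3", "sub_lineage": "3b", "haplotype": "E"},
    {"lineage": "3", "sub_lineage": "3d", "haplotype": "E"},
    {"lineage": "3", "sub_lineage": "3c", "haplotype": "X"},
    {"lineage": "3", "sub_lineage": "3c", "haplotype": "X"},
    {"lineage": "1", "sub_lineage": "1k", "haplotype": "H"},
    {"lineage": "1", "sub_lineage": "1n", "haplotype": "F"},
    {"lineage": "1", "sub_lineage": "1n", "haplotype": "F"},
]

print(concordance(records, "lineage"))      # 0.625
print(concordance(records, "sub_lineage"))  # 1.0
```

In the toy data every sub-lineage is internally uniform while both lineages are mixed, which mirrors the pattern described for the real data set: grouping by sub-lineage gives higher agreement with mtDNA than grouping by lineage.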

pattern. Therefore, these lineages must have had multiple founding mares. The exception is Lineage 8, where a single mismatched individual is separated from the group by a base change at nucleotide position 15 585. This is a highly mutable position (Jansen et al. 2002) and likely represents a de novo mutation.

Discussion

Nine of 10 Thoroughbreds from the sub-lineages we sampled shared mtDNA with horses of their maternal pedigree sub-lineage, whereas only six in 10 shared mitochondrial haplotype with other members of their maternal lineage, despite having a common maternal ancestor according to pedigree records.

Our data show multiple instances of pedigree lineage in conflict with genetic lineage. However, our data also show that pedigree sub-lineage is a more accurate representation of genetic lineage. There are two possible explanations: significant errors were introduced into the GSB when it was first collated from the records of private Thoroughbred breeders in 1791, or the GSB is an accurate record, but historic misunderstandings of genetics gave rise to these conflicts.

It is probable that errors were introduced in the GSB when it was first collated. Many lineages show unnamed mares in their pedigrees, which may have led to confusion in recording descent in early historic periods (Prior 1924). The diversity of founding females may have been reduced by a poor understanding of biological modes of inheritance during breed development, for example, maternal half-sisters or cousins mistakenly thought to represent different maternal lineages. Alternatively, the same mare could have founded two lineages, possibly at two different studs. Historic records show that horses’ names were often changed when they were sold (Vamplew 1976). Thus, pedigree records would show lineages founded by different mares when in fact they were founded by the same individual.

However, over the past 350 years, maternal lineages grew to sufficient size to require division into sub-lineages. One hundred and three sub-lineages are represented in our data, with sub-lineage foundation dates between 1714 and 1963 (Table S1). The foundation of a sub-lineage represents the initiation of a new breeding line. For example, one mare (1901, Lineage 14c) was the founding matriarch of a new sub-lineage of The Oldfield Mare’s lineage (founded in the late 17th century, Lineage 14) (Bobinski 1953). We propose that a majority of the disagreements between pedigree heritage and genetic heritage arose at the foundation of these sub-lineages.

Maternal sub-lineages could have been mistakenly created out of horses related, but not by female descent, for example, fraternal sisters or cousins. If this were the case, sub-lineage horses would share mitochondrial haplotype, but may not share mtDNA with the lineage from which the sub-lineage originally derived. Our data support this. For example, the majority of Lineage 3 (Sub-lineages 3b, n = 3; 3d, n = 3; and 3e, n = 2) shared mitochondrial haplotype. In contrast, Sub-lineage 3c (n = 4) was separated from the rest of Lineage 3 by 11 nucleotide positions. Thus, Sub-lineage 3c had an alternative foundation mare to that of the rest of Lineage 3. Seven mismatched Lineage 1 horses (Sub-lineages 1n, n = 3; 1p, n = 2; and 1u, n = 2) share mitochondrial haplotype with Lineage 2 (n = 32). Therefore, Sub-lineages n, p and u of Lineage 1 were founded from a Lineage 2 mare or vice versa. Because both lineages trace pedigrees back to a mare called Web (1808), this is the probable source of the error.

Our data can resolve conflicts in recorded pedigrees. For example, it has been hypothesized by Thoroughbred historians that the Lineage 4 founder, Layton (Violet) Barb Mare, was the same horse that founded Family 2, Mr. Burton’s Violet Barb Mare (Vamplew & Kay 2005). Our data do not support this.

A proportion of the American Thoroughbreds exported from the UK during the late 17th and early 18th centuries cannot be traced back to the GSB because their pedigree records were destroyed during the American Civil War (1861–1865) (Harrison 1933). Our data show that American Lineage A1 (n = 1) and A4 (n = 3) derive from UK Lineage 1 and therefore trace their ancestry to Tregonwell’s Natural Barb Mare (approximately 1657). American Lineage A4 is separated from UK Lineage 1 by a single nucleotide polymorphism, which represents a de novo mutation at position 15 574. This mutation has become fixed in the American Thoroughbred population some time in the past 300 years, because no horses with the British Lineage 1 haplotype were identified.

Conclusions

As both maternal lineage and mitochondrial DNA are inherited down the female line, it should be possible to compare studbook records and biological models of breed history using mtDNA. This is particularly pertinent to Thoroughbred horses as young horses are assessed for future ability and breeding potential, and therefore financial value, based on membership of maternal lineages. Membership of a particular lineage can command significant fees. The potential future earnings and therefore the value of foals and yearlings untested in the racing arena are calculated based on the performance of their forebears. Thus, lineage is of key importance.

Our data show that only 61% of the maternal lineages we sampled share mitochondrial haplotype with other members of their lineage. Although some mismatches can be attributed to de novo mutations in mtDNA, these cannot account for the vast majority of the differences. However, 92% of horses share mitochondrial haplotype with members of their sub-lineage. The close agreement of maternal sub-lineage and mitochondrial haplotype is consistent with inter-lineage mismatches occurring at or around sub-lineage founding events. Thus, the most accurate designation of the maternal origins of an individual Thoroughbred horse lies in its sub-lineage designation.

Acknowledgements

The authors wish to thank the Animal Health Trust, Prof. Martin Jones, Prof. Christopher J. Howe and the Department of Biochemistry, University of Cambridge, Prof. Graeme Barker and the members of the Glyn Daniel Laboratory for Archaeogenetics. We gratefully acknowledge the provision of samples by Professor W.R. Allen at the Thoroughbred Breeders’ Association Equine Fertility Unit. This research was funded by the Horserace Betting Levy Board, the McDonald Institute for Archaeological Research, The Isaac Newton Trust, and the Leverhulme Trust.

References

Allen M., Engström A.S., Meyers S., Handt O., Saldeen T., von Haeseler A., Pääbo S., Gyllensten U. (1998) Mitochondrial DNA sequencing of shed hairs and saliva on robbery caps: sensitivity and matching probabilities. J. Forensic Sci., 43, 453–464.

Álvarez I., Fernández I., Lorenzo L., Payeras L., Cuervo M., Goyache F. (2012) Founder and present maternal diversity in two endangered Spanish horse breeds assessed via pedigree and mitochondrial DNA information. J. Anim. Breed. Genet., 129, 271–279.

Bandelt H., Forster P., Röhl A. (1999) Median-joining networks for inferring intraspecific phylogenies. Mol. Biol. Evol., 16, 37–48.

Bobinski K. (1953) Family Tables of Racehorses. Waterlow & Sons, London.

Bower M., Campana M., Whitten M., Edwards C.J., Jones H., Barrett E., Cassidy R., Nisbet R., Hill E.W., Howe C., Binns M.M. (2011) The cosmopolitan maternal heritage of the Thoroughbred racehorse breed shows a significant contribution from British and Irish Native mares. Biol. Lett., 7, 316–320.

Bowling A.T., Del Valle A., Bowling M. (2000) A pedigree-based study of mitochondrial D-loop DNA sequence variation among Arabian horses. Anim. Genet., 31, 1–7.

Felsenstein J. (2004) PHYLIP (Phylogeny Inference Package) Version 3.6. Distributed by the author.

Gascuel O. (1997) BIONJ: an improved version of the NJ algorithm based on a simple model of sequence data. Mol. Biol. Evol., 14, 685–695.

Guindon S., Gascuel O. (2003) PHYML: a simple, fast and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst. Biol., 52, 696–704.

Harrison F. (1933) The Background of the American Stud Book. Old Dominion Press, Richmond, Virginia.

Harrison S.P., Turrion-Gomez J.L. (2006) Mitochondrial DNA: an important female contribution to thoroughbred racehorse performance. Mitochondrion, 6, 53–63.

Hasegawa M., Kishino H., Yano T. (1985) Dating the human-ape split by a molecular clock of mitochondrial DNA. J. Mol. Evol., 22, 160–174.

Higgins D., Thompson J., Gibson T., Thompson J.D., Higgins D.G., Gibson T.J. (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res., 22, 4673–4680.

Hill E.W., Bradley D.G., Al-Barody M., Ertugrul O., Splan R.K., Zakharov I., Cunningham E.P. (2002) History and integrity of thoroughbred dam lines revealed in equine mtDNA variation. Anim. Genet., 33, 287–294.

Huggins M. (2000) Flat Racing and British Society, 1790–1914: A Social and Economic History. Portland or Frank Cass, London.

Jansen T., Forster P., Levine M., Oelke H., Hurles M., Renfrew C., Weber J., Olek K. (2002) Mitochondrial DNA and the origins of the domestic horse. Proc. Nat. Acad. Sci., 99, 10905–10910.

Kumar S., Tamura K., Nei M. (2004) MEGA3: Integrated software for Molecular Evolutionary Genetics Analysis and sequence alignment. Brief. Bioinform., 5, 150–163.

Lei C., Su R., Bower M., Edwards C., Wang X., Weining S., Liu L., Xie W., Li F., Liu R., Zhang Y., Zhang C., Chen H. (2009) Multiple maternal origins of native modern and ancient horse populations in China. Anim. Genet., 40, 933–944.

Lowe B. (1913) Breeding Racehorses by the Figure System. The Field and Queen (Horace Cox) Ltd., London.

Luís C., Bastos-Silveira C., Costa-Ferreira J., Cothran E., Oom M. (2006) A lost maternal lineage in the . J. Anim. Breed. Genet., 123, 399–402.

McGahern A., Bower M., Edwards C., Brophy P., Sulimova G., Zakharov I., Vizuete-Forster M., Levine M., Li S., MacHugh D., Hill E. (2006) Evidence for biogeographic patterning of mitochondrial DNA sequences in Eastern horse populations. Anim. Genet., 37, 494–497.

Posada D., Crandall K.A. (1998) Modeltest: testing the model of DNA substitution. Bioinformatics, 14, 817–818.

Prior C.M. (1924) Early Records of the Thoroughbred Horse. Sportsman Office, London.

Swofford D.L. (2003) PAUP*: Phylogenetic Analysis using Parsimony (*and other methods) Version 4. Sinauer Associates, Sunderland, Massachusetts.

Tamura K., Dudley J., Nei M., Kumar S. (2007) MEGA4: molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. Mol. Biol. Evol., 24, 1596–1599.

Vamplew W. (1976) The Turf: a Social and Economic History of Horse Racing. Allen Lane, London.

Vamplew W., Kay J. (2005) Encyclopedia of British Horseracing. Routledge, London.

Weatherby J. (1791) An Introduction to a General Stud Book. Weatherby and Sons, London.

Weatherby J. (2010) General Stud Book Volume 46. Weatherby and Sons, London.

Supporting Information

Additional Supporting Information may be found in the online version of this article:

Table S1 The composition of the dataset analysed in this study, including maternal lineage (Lowe Sub-family designation), sub-lineage founding dam, date of foundation of sub-lineage and number of individuals.

Please note: Wiley-Blackwell are not responsible for the content or functionality of any supporting materials supplied by the authors. Any queries (other than missing material) should be directed to the corresponding author for the article.
