
WILDCAT SCORING FOR CONSERVATION BREEDING UNDER THE SCOTTISH CONSERVATION ACTION PLAN


Dr. Helen Senn1, Dr. Rob Ogden

Subjected to academic review and approved by Scottish Wildcat Conservation Action Plan Steering Group May 2015

Citation: Senn HV and Ogden R, Wildcat Hybrid Scoring For Conservation Breeding under the Scottish Wildcat Conservation Action Plan (2015), Royal Zoological Society of Scotland, May 2015

Cover image: Peter Cairns, northshots.com

1 Communicating author [email protected]

About Scottish Wildcat Action

The Scottish wildcat is one of Scotland's most elusive and endangered mammals. Often referred to as the ‘Tiger of the Highlands’, it is one whose image we recognise instantly. Striking, handsome and powerful, it is the very essence of a wild predator living by stealth and strength.

We have come to the stage where urgent action is needed to save Scotland's remaining wildcats. We have given ourselves just six years to halt the decline. Scottish Wildcat Action is one of the most ambitious conservation projects ever undertaken in Scotland, with over 20 organisations, community groups and landowners coming together to tackle the decline of Scottish wildcats.

The work is a key part of delivering the national Scottish Wildcat Conservation Action Plan, and involves both in situ and ex situ conservation activities, including targeted effort in six priority areas, monitoring and surveying work, and a conservation breeding programme (based at RZSS Highland Wildlife Park in Kingussie).

Project partners


Executive Summary

1. This document sets out the genetic system for determining hybridisation, and how it should be integrated with morphological data, in putative specimens of the wildcat (Felis silvestris) destined to be brought into a conservation breeding programme overseen by the Royal Zoological Society of Scotland (RZSS) as part of the Scottish Wildcat Conservation Action Plan (SNH 2013).
2. Wildcats hybridise with the domestic cat and produce fertile offspring.
3. In the absence of whole genome sequencing, a sample of genetic markers is capable of estimating the extent of hybridism in an individual with an associated degree of confidence. Wildcat pelage provides further assessments of wildcat ancestry.
4. The test system devised by RZSS to select the cats for conservation breeding with the greatest available proportion of wildcat ancestry employs 35 nuclear SNP DNA markers and one mitochondrial marker in combination with pelage assessments. The genetic test is based on one of the more powerful (83 SNP) tests currently available, developed in Switzerland, and has the advantage of generating data that can be compared to datasets for wildcats across Europe. It produces very similar estimates of hybridism to the Swiss test, with a slightly lower degree of confidence, but is faster and more cost-effective to run.
5. Testing of cat samples collected from across Scotland indicates that there is a complete genetic continuum between wild and domestic cat genetic types in the wild/feral population. From the relatively limited sampling to date, the wild-living cat population is a hybrid swarm, i.e. most individuals demonstrate some level of hybridisation.
6. Any method of choosing “wildcats” for a conservation breeding programme needs to decide on a cut-off between wildcat and domestic cat types.
7. We suggest that, as a general principle, we choose cats for which we have 95% confidence, based on their genetic scores, that they are closer to wildcat than a first generation backcross to wildcat. A first generation backcross to wildcat is a cat where one of its four grandparents is a domestic cat and the remaining three are wildcats.
8. Based on the limited evidence currently available, there does not always appear to be good correspondence between the genetic test and the commonly used phenotypic test (pelage score). This is likely because hybridisation in Scotland has been occurring for a long time and the phenotypic traits are under the control of very few genes that do not match the areas of the genome examined in the genetic test. However, this correspondence requires further analysis from a larger sample of individuals with a range of phenotypes and genotypes.
9. We propose that genetic and phenotypic tests are used as separate, independent lines of evidence when decisions are made about cats. Only where the two lines of evidence corroborate each other should we choose cats for the conservation breeding programme.


Contents

About Scottish Wildcat Action
Executive Summary
Purpose
History of wildcat hybrid testing
Principle of hybrid testing
Current test
mtDNA marker
Nuclear markers
Background justification to the test
Theoretical limits to the test
Reference data and choice of loci
The performance of statistical methods (STRUCTURE) and the empirical limits to the test
Discussion of limitations of the test
Decisions for hybrid cut-off criteria
Nuclear genetic criteria at the 35 Locus test
Power of mtDNA test
Pelage Scores
The test decision
The Principle of the test
The details
Pelage Scoring Criteria
Genetic Scoring Criteria
Combined pelage and genetic scoring decision matrix
References
Appendices
Appendix 1: Allele frequencies at the 35 SNPs in the 82 individual test dataset
Appendix 2: STRUCTURE output at different values of K in the 82 individual test dataset
Appendix 3: Test data
Appendix 4: Physical validation of the test
Appendix 5: Standard Test protocol at RZSS
Appendix 6: Nuclear SNP assay information and reorder numbers
Appendix 7: SNP assay clustering


Purpose

This protocol is designed to set out the genetic system for determining hybridisation in putative specimens of the wildcat (Felis silvestris) destined to be brought into a conservation breeding programme overseen by the Royal Zoological Society of Scotland (RZSS)2 as part of the Scottish Wildcat Conservation Action Plan (SNH 2013). Wildcats hybridise with the domestic cat (Felis catus) and produce fertile offspring. Genetic tests are capable of determining the extent of hybridism in an individual with a degree of confidence. This document discusses the protocol used at RZSS as part of the Conservation Breeding Strategy of the Scottish Wildcat Conservation Action Plan (SNH 2013) and gives a critical evaluation of the limitations of its power.

In line with the brief from the Scottish Wildcat Conservation Action Plan (SNH 2013), the working assumptions throughout this document are that:

1. The situation for the Scottish Wildcat is so critical, in terms of low numbers and high level of genetic introgression from domestic cat, that taking some animals from the wild into a conservation breeding programme is a conservation solution that is going to be of net benefit to the Scottish Wildcat (as opposed to managing the situation in the wild alone, or doing nothing).
2. The Scottish Wildcat is a distinct entity with biodiversity and cultural dimensions that is worth conserving (versus favouring conservation solutions that involve wildcats from mainland Europe). The approach is also based on the assumption that wildcats with a high proportion of wildcat ancestry can still be found in the wild in Scotland.
3. We are seeking to protect a distinct group of cats that look like wildcats and contain a large proportion of wildcat genes, but may not all be genetically “pure” wildcats.

History of wildcat hybrid testing

A large number and variety of tests have previously been employed to assay for wildcat hybridisation, both within Scotland (Daniels et al. 2001; Beaumont et al. 2001; SNH survey of 2013/14 (Commissioned report 768)) and in other populations worldwide (Pierpaoli et al. 2003; Randi 2008; Oliveira et al. 2008; Driscoll et al. 2011; McEwing et al. 2011; Nussberger et al. 2013; Mattucci et al. 2013; Witzenberger & Hochkirch 2014; Nussberger, Wandeler, & Camenisch 2014; Le Roux et al. 2014). A comprehensive review of this information up to the year 2013 is found in Neaves & Hollingsworth (2013). The important point to note here is that data from different test systems are not necessarily comparable. In the case of microsatellite systems, which can be subjective to score, even comparisons of the same system between different labs may be difficult and require the sharing of reference samples for calibration, which is not always possible.

With this in mind, the test that is used at the Royal Zoological Society of Scotland is based on a system of SNP (Single Nucleotide Polymorphism) markers that was originally developed by the laboratory of Prof. Lukas Keller at the University of Zurich, Switzerland, on populations of cats found in the Swiss Jura (Nussberger et al. 2013; Nussberger, Wandeler, & Camenisch 2014; Nussberger, Wandeler, Weber, et al. 2014). This system is currently one of the most extensive wildcat testing systems available. Basing the RZSS method on this system has a number of advantages:

1. It enables the data generated in Scotland to be compared directly to data held for animals from mainland Europe (and therefore direct genetic comparisons to be drawn). This enables the situation in Scotland to be ground-truthed against other populations that are being studied.
2. A system based on SNPs (as opposed to microsatellites) removes issues of subjectivity and inter-lab calibration, making the test and resulting data more easily accessible to other institutions in the future.

2 RZSS Wildcat Conservation Breeding for Release Protocols (2014)

The test presented here is an expanded version of the test developed (also from the Swiss test, above) and run by RZSS in the SNH survey of 2013/14 (Commissioned report 768). Data generated during the 2013/14 survey is directly comparable to data generated in this expanded version of the test (at 12/14 original loci). The latest test will, however, give an increased level of confidence in the estimates of hybridisation beyond that used in the SNH survey of 2013/14.

Principle of hybrid testing

The principle of DNA hybrid testing is to survey the genome of an individual and estimate what proportion has been inherited from each parent species (its hybrid score). At a conceptual level this approach is relatively easy to understand; in practice, however, a number of issues complicate the analysis, making hybridisation a very difficult genetic phenomenon to assay.

1. By definition, hybridising species are closely related and therefore much of their genome will be genetically indistinguishable. Thus the first step is generally to try to find genetic markers that differentiate between the parent species and to use this marker set to assay for hybridisation. The reliability of the marker set is therefore intrinsically linked to the quality of the reference data used to generate it. Since it is not always easy to find reliable reference individuals that definitely do not have hybrid ancestry, this can be a complicating factor that introduces a level of uncertainty. The larger the number of sites in the genome (markers) we can use to examine the issue of hybridisation, the less reliant we are on any one particular marker and possible associated anomalies in the reference datasets.

2. As introgression progresses through the generations by backcrossing, the proportion of the genome that has introgressed in any one individual reduces by, on average, ½ every generation (although there is considerable variation surrounding this). This means that the more distant the hybrid ancestry of an individual is, the more difficult it is to detect. Very large numbers of genetic markers are required to detect distant hybrid ancestry reliably and to estimate accurately the proportion of the genome that is introgressed3. This means that it is much harder to understand situations where hybridisation has been happening between the parent species for many generations. In the Scottish Wildcat, hybridisation has been occurring for hundreds if not thousands of years, potentially since the domestic cat arrived on mainland Britain (Neaves & Hollingsworth 2013).

3. Large numbers of markers are costly and time-consuming to run and some methods require high quality DNA.

Thus any hybrid test is a balance between the ideal (a large number of markers, ideally whole genome data4) and the necessary practical restrictions of running the test.
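The halving described in point 2 can be written down explicitly. The following is only a sketch under the simplifying assumption of repeated backcrossing to pure wildcat with unlinked markers, a scenario that need not hold in a hybrid swarm. The symbols n, f_n and h_n are introduced here for illustration and are not notation used elsewhere in this document.

```latex
% n   = number of backcross generations to wildcat after the F1 (n = 0 is the F1)
% f_n = expected proportion of the genome that is of domestic cat origin
% h_n = probability that a given diagnostic marker still carries one domestic allele
\begin{align}
  f_n &= \left(\tfrac{1}{2}\right)^{n+1}, &
  h_n &= 2 f_n = \left(\tfrac{1}{2}\right)^{n}
\end{align}
% Example: n = 1 (first generation backcross) gives f_1 = 0.25 and h_1 = 0.5;
%          n = 3 gives f_3 = 0.0625 and h_3 = 0.125.
```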

Current test

The current test consists of mitochondrial and nuclear DNA markers used in tandem to infer the hybrid ancestry of any given cat.

mtDNA marker

The current test includes a single mtDNA marker that distinguishes between wildcat and domestic cat at the mitochondrial genome (i.e. it only distinguishes maternal lineages; for the limitations of this see later). Details of this test have been published in McEwing et al. (2011).

Nuclear markers

The current test includes a panel of 35 nuclear SNPs that appear to be highly discriminatory between a reference dataset of wild and domestic cats. This test is an expanded version of the 14 SNP test utilised in the SNH survey of 2013/14 (Commissioned report 768) and is based on the markers in Table 1 of Nussberger et al. (2013). The 35 markers were chosen as a compromise between the ideal requirement (to have many markers for a hybridism test) and cost/speed considerations for running the test.

Table 1: Details of the SNP Panel

RZSS_SNPID | Allele 1 (Vic) | Allele 2 (Fam) | Genetic Location | The 12 markers included in the original panel of 14 in SNH survey of 2013/14 (Comm. Rep. 768)5 | Present on Swiss panel (Nussberger et al. pers. com.)
SNP001 | G | C | A1_214461789 | - | Yes
SNP012 | G | T | A3_90799249 | - | Yes
SNP014 | A | G | B1_123418311 | - | Yes
SNP016 | A | G | B1_20092839 | - | Yes
SNP019 | G | A | B2_11748866 | - | Yes
SNP026 | G | C | B3_75494376 | - | Yes
SNP030 | A | G | B4_45476816 | Yes | Yes
SNP044 | A | G | E3_12301230 | - | Yes
SNP045 | C | T | F1_24323263 | - | Yes
SNP047 | G | C | F2_7927040 | - | Yes
SNP048 | C | T | A3_51056949 | Yes | Yes
SNP050 | G | A | C1_223335334 | - | Yes
SNP058 | G | A | D1_126067118 | Yes | Yes
SNP060 | A | T | D1_128802001 | - | Yes
SNP062 | G | T | D2_88876341 | - | Yes
SNP084 | A | G | D4_103411241 | - | Yes
SNP098 | G | A | E1_47901546 | - | Yes
SNP101 | C | T | B4_143164026 | - | Yes
SNP102 | C | T | C2_142858667 | Yes | Yes
SNP114 | G | A | A2_62528310 | Yes | Yes
SNP115 | G | A | A2_63544109 | Yes | Yes
SNP127 | C | T | B3_132539085 | - | Yes
SNP129 | G | A | B4_96741303 | Yes | Yes
SNP133 | G | A | C1_163375181 | - | Yes
SNP143 | C | T | F2_29878116 | Yes | Yes
SNP146 | C | T | A1_214220499 | - | Yes
SNP148 | G | A | A2_120724549 | Yes | Yes
SNP155 | A | C | B2_129152112 | Yes | Yes
SNP166 | G | A | B3_147841323 | - | Yes
SNP176 | T | C | C1_112821482 | - | Yes
SNP178 | G | A | C1_189621758 | Yes | Yes
SNP187 | C | G | D3_49022779 | - | Yes
SNP190 | C | G | D3_88773687 | - | Yes
SNP195 | A | G | E2_33320051 | - | Yes
SNP196 | T | A | E2_50523470 | - | Yes

3 Once hybridisation in a population has progressed to such an extent that the majority of the population consists of hybrids of some kind (i.e. a hybrid swarm), this simple scenario of backcrossing to pure animals no longer holds true; however, animals with distant hybrid ancestors will
4 Or lots of markers with linkage information to examine genomic blocks of introgression.
5 An additional two markers were also used for this original panel of 14 that are no longer used: SNP153, which was dropped by the Swiss, and SNP038, which did not convert to the new TaqMan assay chemistry (see below).

The allele frequencies found at these loci are detailed in Appendix 1.

Background justification to the test

The investigations that led to the final choice of SNP marker panel were as follows.

Theoretical limits to the test

A test with a given number of SNP markers has a fixed theoretical limit to its power. The power of a 35 SNP test, assuming that the markers chosen are indeed truly discriminant between wild and domestic cats, is as follows:


Table 2: Theoretical power of the 35 SNPs test used to test wildcats

Category | Average % genome of domestic (% wildcat) | Average number of alleles of domestic cat using test | Probability of misdiagnosing as a pure wildcat individual by chance if test is perfectly discriminatory (calculated from eq. 2 of Boecklen & Howard (1997))
Pure wildcat | 0% (100%) | 0 | n/a
F1 | Exactly 50% (50%) | 35 (one of each chromosome pair) | 0
Bx1 wildcat | 25% (75%) | 17.5 | 0.000000
Bx2 wildcat | 12.5% (87.5%) | 8.75 | 0.000042
Bx3 wildcat | 6.25% (93.75%) | 4.375 | 0.009339
Bx4 wildcat | 3.125% (96.88%) | 2.1875 | 0.104471
Bx5 wildcat | 1.5625% (98.44%) | 1.0938 | 0.329162
Bx6 wildcat | 0.78125% (99.22%) | 0.5469 (beyond functional limit of test) | 0.576262
Bx7 wildcat | 0.390625% (99.61%) | 0.2734 (beyond functional limit of test) | 0.759943

Table 2 illustrates the theoretical maximal power of the test. Roughly speaking, as a best case scenario, this test will reliably distinguish pure wildcats from 1st to 3rd generation backcrosses (<1% error rate). For comparison, Table 3 lists the theoretical limits of marker panels of other sizes.
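The figures in Table 2 can be checked with a few lines of arithmetic. The sketch below assumes that every locus is perfectly diagnostic and inherited independently, and that an nth generation backcross carries a domestic allele at any one locus with probability 0.5^n. Under those assumptions the chance of seeing no domestic allele at any of L loci is (1 - 0.5^n)^L, which reproduces the tabulated probabilities. Whether this is exactly the form of eq. 2 in Boecklen & Howard (1997) is an assumption here, and the function names are illustrative only.

```python
def het_probability(generation: int) -> float:
    """Chance that one diagnostic locus carries a domestic allele.

    generation = 0 is an F1; generation = n is the nth backcross to wildcat.
    """
    return 0.5 ** generation


def expected_domestic_alleles(n_loci: int, generation: int) -> float:
    """Average number of domestic alleles seen across the panel."""
    return n_loci * het_probability(generation)


def prob_looks_pure(n_loci: int, generation: int) -> float:
    """Chance that a backcross shows no domestic allele at any locus."""
    return (1.0 - het_probability(generation)) ** n_loci


# Recompute the Bx1 to Bx7 rows for the 35 SNP panel.
for gen in range(1, 8):
    print(
        f"Bx{gen}: {expected_domestic_alleles(35, gen):.4f} domestic alleles, "
        f"P(misdiagnosed as pure) = {prob_looks_pure(35, gen):.6f}"
    )
```

Because the loci are treated as independent and perfectly diagnostic, these values are best-case figures rather than empirical error rates.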

Table 3: Theoretical power of tests with varying numbers of SNPs

Number of perfectly discriminatory loci | Limit of test (ancestral mixing at which there is < 5% probability of confusing this category with pure animal; actual probability given in brackets)
14 | 1st Generation backcross (0.000061)
20 | 2nd Generation backcross (0.003171)
24 | 3rd Generation backcross (0.040569)
30 | 3rd Generation backcross (0.018207)
36 | 3rd Generation backcross (0.008171)
48 | 4th Generation backcross (0.045156)
83 | 4th Generation backcross (0.004716)
96 | 5th Generation backcross (0.047460)
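The same reasoning can be turned around to ask how many perfectly discriminatory loci are needed before a given backcross generation is confused with a pure animal less than 5% of the time. This is a sketch only, using the independence assumption and the expression (1 - 0.5^n)^L, which is taken here as an assumed reading of the calculation behind Table 3. The function names are hypothetical.

```python
def prob_looks_pure(n_loci: int, generation: int) -> float:
    """Chance that an nth generation backcross shows no domestic allele."""
    return (1.0 - 0.5 ** generation) ** n_loci


def min_loci_needed(generation: int, alpha: float = 0.05) -> int:
    """Smallest panel size for which the confusion probability is below alpha."""
    n_loci = 1
    while prob_looks_pure(n_loci, generation) >= alpha:
        n_loci += 1
    return n_loci


for gen in range(1, 6):
    needed = min_loci_needed(gen)
    print(f"Bx{gen}: at least {needed} loci "
          f"(P = {prob_looks_pure(needed, gen):.6f})")
```

Under these assumptions at least 23, 47 and 95 loci are needed for 3rd, 4th and 5th generation backcrosses respectively, which is in line with the 24, 48 and 96 locus rows of Table 3.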


Although the best case scenario for the 35 SNP test is to distinguish up to 3rd generation backcrosses, the true power of the test may be somewhat lower than this due to unavoidable uncertainties surrounding the reference data used to generate the test and due to ancestral polymorphisms in the markers.

Reference data and choice of loci

The 14 SNPs from the original SNH survey of 2013/14 (Commissioned report 768), 12 of which are used here, were chosen based on segregation between a test panel of 10 wildcats, consisting of five high pelage scoring individuals from Scotland (NMS accession numbers 1958.8, 1947.13(2), 1931.59, mw1947.133, mw1947.131) and five individuals from Germany, and a test panel of 4 domestic cats of Scottish (n=3) and German (n=1) origin. The high pelage scoring Scottish cats had scores ranging from 18 to 21 on the 7 Pelage Score (7PS; Kitchener et al. 2005). See the SNH survey of 2013/14 (Commissioned report 768) for further details.

The additional 23 SNPs were selected using a panel of 82 individuals that were run at the Keller lab (University of Zurich, Switzerland) for 83 SNPs previously shown to be highly diagnostic between Swiss wild and domestic cats (Nussberger et al. 2013; Nussberger, Wandeler, & Camenisch 2014; Nussberger, Wandeler, Weber, et al. 2014). This panel of 82 individuals consisted of 6 Swiss reference cats (2x wildcat from the Swiss Jura6, 2x domestic cat, 1x F1, 1x backcross to wild, all from Switzerland) and 76 samples collected from the wild and captivity in Scotland. 38 samples in this dataset had previously been run for the SNH survey of 2013/14 (Commissioned report 768) and acted as positive controls within the Scottish dataset7, alongside the Swiss reference cats that had previously been run by the Swiss lab. A full list of these samples with their locality data and pelage scores can be found in Appendix 3.

Table 4: Summary of test data of the 82 individuals used for developing the new panel of 35 SNPs

General Location | Number of cats
Captive | 16
E_Cairngorms | 13
N_Cairngorms | 28
N_Inverness | 10
S_Cairngorms | 4
W_Coast | 3
Swiss_Bx | 1
Swiss_dom | 2
Swiss_F1 | 1
Swiss_wild | 2
Scotland Unknown | 1
England | 1

6 The two wildcats are wild individuals (roadkills), a male WK28 from Kleinlützel (2008) and a female WK56 from Oberbuchsiten (2002), both in canton Solothurn (Beatrice Nussberger pers. com.)
7 At the 13 loci in common between the two methods

Analysis of reference data; the status quo in Scotland and what follows from it

A principal component analysis (PCA) of the Swiss 83 SNP dataset (conducted in Genalex 6.5) revealed no distinct separation of wildcats in Scotland from domestic cats. Although this dataset is not large (see Table 4), it suggests that it would be difficult to draw a dividing line between the two populations (see also Daniels et al. 2001; Neaves & Hollingsworth 2013). From this we make the following statements:

1: Wildcats in Scotland appear to form a hybrid swarm with domestic cats. This can be seen in contrast to the situation of hybridisation between red deer and sika deer (Cervus) in Scotland, where low level hybridisation has not, as yet, resulted in complete genetic and phenotypic mixing of the two species into a “hybrid swarm” other than in localised regions on the Kintyre Peninsula (Senn & Pemberton 2009; Senn, Barton, et al. 2010; Senn, Swanson, et al. 2010).

2: Any programme to bring “wildcats” into a conservation breeding programme will have to set a threshold (based on judgement rather than a clear biological distinction) that balances the wish to preserve the genetic diversity encapsulated in the apparently non-pure wildcats from Scotland as part of a Scottish Wildcat Conservation Breeding Programme against the desire not to be too inclusive of domestic cat genes (and associated traits). Set the purity bar too high and the risk is that good wildcat genes are excluded, ‘good-ish’ cats are excluded as hybrids, and the population accepted into captivity is so small that it will experience a high level of inbreeding. Set the bar too low and we end up breeding something that is only slightly better than the situation in the wild.

3: In setting such a cut-off we will also need to be aware of the uncertainty around any estimate that the genetic data produces, given the limitations of hybridisation assays discussed above (inherent power of any test, uncertainty surrounding reference data etc.).


Figure 1: A Principal Component Analysis (PCA) of 82 cats (for details see Table 4 & Appendix 3). Each point represents a single cat scored at 83 SNP DNA markers. The proximity of the cats to each other on the plot represents their genetic similarity. The percentage of variation shown by each of the components is: PCA1: 34.28%; PCA2: 5.46%. The Swiss reference cats fall out as expected across the cluster and can be used to benchmark other cats; for example, WCQ0114 has high genetic similarity to the Swiss reference F1 hybrid cat.


Use of reference data to choose 35 markers for the test panel

The reference data matrix (82 individuals x 83 SNPs) was used to choose the additional markers for the panel. To do this, the dataset was analysed using STRUCTURE 2.3.4 (Pritchard et al. 2000; Falush et al. 2003; Pritchard 2010). STRUCTURE is a piece of population genetic software that can be considered the “gold standard” for population assignment and admixture (hybridisation) analysis. It is commonly used in similar analyses and uses a Bayesian clustering algorithm to statistically assign an individual to each of a number of clusters based on the available genetic data. The model makes use of the observation that true populations show two properties in their genetic data, 1: “Hardy-Weinberg Equilibrium” and 2: “Linkage Equilibrium”, and STRUCTURE seeks to sort the individuals into the genetic clusters that best conform to these properties (see references above for more details).

The following (standard) model was chosen: 500,000 burn-in, 1,000,000 MCMC reps, Admixture model (infer alpha), Correlated allele frequencies model (Lambda = 1). Null allele frequencies were estimated simultaneously using the RECESSIVEALLELES=1 option and by setting dummy values at each locus (see STRUCTURE manual). This was done since the presence of null alleles has the potential to distort estimates of hybridisation (Senn & Pemberton 2009a). The model was run for K=2 (since we are investigating a hybridising scenario between two populations8). Three replicates of the analysis were run to ensure stability of the results. The two genetic clusters generated by STRUCTURE were assumed to represent domestic and wild genetic populations, as benchmarked against the Swiss reference samples. Pelage data was not taken into account during this analysis.

The Q-hat scores from STRUCTURE (estimates of the posterior probability of a cat belonging to wildcat) were examined9 and the dataset was divided into two sets using the following criteria. Set 1 (the “domestic” set): cats with scores of Q <0.25, consisting of two Swiss domestic cats and eight feral cats collected from across Scotland. Set 2 (the “wild” set): cats with scores of Q >0.75, consisting of the two Swiss wildcats and 35 wild-living cats from Scotland and captive wildcats from across the UK. This equates to approximately judging “1st generation backcross to wild and better” against “1st generation backcross to domestic and worse”.
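The partition into the two reference sets can be sketched as follows. This is a minimal illustration, assuming the Q-hat scores have already been exported from STRUCTURE into a mapping from cat ID to score. The function name, the example IDs and the example scores are hypothetical and are not taken from the test data.

```python
from typing import Dict, List, Tuple


def split_reference_sets(
    q_scores: Dict[str, float],
    domestic_below: float = 0.25,
    wild_above: float = 0.75,
) -> Tuple[List[str], List[str]]:
    """Split cats into the "domestic" set (Q < 0.25) and the "wild" set (Q > 0.75).

    Cats with intermediate scores belong to neither set and are left out.
    """
    domestic = sorted(cat for cat, q in q_scores.items() if q < domestic_below)
    wild = sorted(cat for cat, q in q_scores.items() if q > wild_above)
    return domestic, wild


# Hypothetical scores, for illustration only.
example_scores = {"cat_A": 0.03, "cat_B": 0.48, "cat_C": 0.91, "cat_D": 0.75}
domestic_set, wild_set = split_reference_sets(example_scores)
print(domestic_set)  # ['cat_A']
print(wild_set)      # ['cat_C']
```

Both thresholds are strict inequalities, so a cat scoring exactly 0.25 or 0.75 falls into neither set.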

Pairwise Fst, a measure of population differentiation, was calculated for all SNPs across these two sets using Genepop 4.2. The loci were ranked according to Fst. SNPs with the highest values of Fst were taken to show the highest level of genetic differentiation between the two groups (i.e. to have the greatest power to distinguish between the two). The loci were then chosen according to these approximate criteria: the highest ranking SNPs not on the same chromosome as another SNP already in the panel were selected first. When the chromosome choices had been depleted, the highest ranking loci not within 1,000,000 bp of other loci on the panel were selected. This ensured that loci showing high levels of differentiation were chosen, but that they were also not tightly linked on the genome (discussed later). The position of each locus on the domestic cat genome is given in Table 1.

8 A graph of LnD(P) at other values of K can be found in Appendix 2
9 The scores for individual cats can be found in Appendix 3.

The performance of statistical methods (STRUCTURE) and the empirical limits to the test

The performance of STRUCTURE at various numbers of loci was evaluated by running the dataset of 82 individuals through STRUCTURE according to the parameters above, using panels of different sizes. These were:

1. The original panel of 13 SNPs10.
2. The final choice panel of 35 SNPs (which contained 12 of the original SNPs).
3. Panels of 24 and 30 SNPs (to further explore the degree of confidence with fewer markers), based on similar choice criteria to the ones used for the final panel of 35 (i.e. high Fst and not closely located on chromosomes).
4. The full panel of 83 SNPs.

For each of these datasets STRUCTURE was used to allocate a hybrid score for each individual and to calculate a 90% posterior probability interval (confidence interval) around the score. A comparison of the extremes of the test (13 versus 83 loci) shows that the hybrid categories allocated are generally similar, although there is appreciable variation in scores between the two tests (Figure 2).

10 14 original loci minus the locus dropped by the Swiss research group. Since this locus had been dropped it was not possible to make any comparisons.
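Panels of different sizes chosen under the stated criteria (rank by Fst, prefer chromosomes not yet represented, then require more than 1,000,000 bp between loci on the same chromosome) could be assembled along the following lines. This is a sketch of those approximate criteria only, not the procedure actually run. The `Locus` type, the `choose_panel` function and the example loci with their Fst values are hypothetical.

```python
from typing import List, NamedTuple


class Locus(NamedTuple):
    name: str
    chromosome: str
    position: int  # base-pair position on the chromosome
    fst: float     # differentiation between the "wild" and "domestic" sets


def choose_panel(loci: List[Locus], panel_size: int,
                 min_gap_bp: int = 1_000_000) -> List[Locus]:
    """Greedy choice: high Fst first, avoiding tightly linked loci."""
    ranked = sorted(loci, key=lambda locus: locus.fst, reverse=True)
    panel: List[Locus] = []
    used_chromosomes = set()

    # Pass 1: take the best-ranked locus from each chromosome.
    for locus in ranked:
        if len(panel) == panel_size:
            return panel
        if locus.chromosome not in used_chromosomes:
            panel.append(locus)
            used_chromosomes.add(locus.chromosome)

    # Pass 2: fill remaining places by rank, skipping any locus that lies
    # within min_gap_bp of a locus already chosen on the same chromosome.
    for locus in ranked:
        if len(panel) == panel_size:
            break
        if locus in panel:
            continue
        too_close = any(
            chosen.chromosome == locus.chromosome
            and abs(chosen.position - locus.position) <= min_gap_bp
            for chosen in panel
        )
        if not too_close:
            panel.append(locus)
    return panel


# Hypothetical loci, for illustration only.
candidates = [
    Locus("locus_a", "A1", 1_000_000, 0.90),
    Locus("locus_b", "A1", 1_500_000, 0.85),
    Locus("locus_c", "B1", 5_000_000, 0.80),
    Locus("locus_d", "A1", 9_000_000, 0.70),
]
panel_names = [locus.name for locus in choose_panel(candidates, 3)]
print(panel_names)  # ['locus_a', 'locus_c', 'locus_d']
```

In this example `locus_b` is skipped despite its high Fst because it lies within 1,000,000 bp of `locus_a`, so the lower ranked but more distant `locus_d` is taken instead.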

Figure 2: (Above) relationship between the test at the original 13 SNPs and the entire Swiss panel of 83 SNPs. The correlation is statistically significant; however, the variance between the results of the different tests is still quite large. For example, a cat (orange arrow) given a score of ~40% (~0.4) wildcat on the 13 SNP test is given a score of >60% (>0.6) wildcat on the 83 SNP test. For the arguments given above, we assume that the 83 SNP test has an inherently higher level of reliability than the test with a lower number of markers. (Below) the relationship is tightened if we compare 83 against 35 markers.

We make the assumption, for the argument given above, that the value produced by the 83 SNP test is closer to the true hybrid score. In order to understand this better, it is informative to look at the confidence intervals surrounding the hybrid scores; which decrease with increasing number of loci:

Table 5: The average width of the 90% posterior probability (confidence) interval surrounding each hybrid score in the test panel of 82 cats. Confidence intervals decrease with increasing numbers of loci i.e. our confidence in the test increases with increasing numbers of loci.

Panels | Average width of 90% posterior probability confidence interval around hybrid score | % relative to 83 panel width
Original 13 | 0.2892 | 208%
24 | 0.2062 | 148%
30 | 0.1835 | 132%
Final 35 | 0.1757 | 126%
83 | 0.1389 | 100%

It can be seen that the width of the probability interval decreases with an increasing number of SNPs. A decrease in the number of SNPs from 83 to 35 increases the probability interval to 126% of the 83 SNP panel width. A plot of the data (Figure 3) reveals that the degree of uncertainty changes across the range of hybrid scores. Using 35 loci the average level of uncertainty around a hybrid score is 0.1757. This increases to approximately 0.2 at hybrid scores of 0.75 (75% wildcat) and to 0.225 at a score of 0.5 (50%, i.e. F1). Using 35 loci reduces the average level of uncertainty by about 40% in comparison to using 13 loci (0.1757 versus 0.2892). Figure 3 illustrates the confidence interval in the different datasets.
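The relative widths quoted in Table 5 follow directly from the average interval widths. The short sketch below recomputes them. The widths are copied from Table 5, while the dictionary and function names are illustrative only.

```python
# Average widths of the 90% posterior probability interval, copied from Table 5.
average_width = {
    "Original 13": 0.2892,
    "24": 0.2062,
    "30": 0.1835,
    "Final 35": 0.1757,
    "83": 0.1389,
}


def relative_width(panel: str, reference: str = "83") -> float:
    """Interval width of one panel as a percentage of a reference panel."""
    return 100.0 * average_width[panel] / average_width[reference]


for panel in average_width:
    print(f"{panel}: {relative_width(panel):.0f}% of the 83 SNP panel width")

# Width of the 35 SNP interval relative to the original 13 SNP interval.
final_vs_original = relative_width("Final 35", reference="Original 13")
print(f"Final 35 is {final_vs_original:.0f}% as wide as Original 13")
```

On these averages the 35 SNP interval is roughly 61% as wide as the 13 SNP interval.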

Figure 3: The uncertainty surrounding the hybrid score (y-axis: width of the 90% posterior probability interval surrounding Qhat) against the hybrid score (x-axis: Qhat, the estimated probability of being a wildcat) for each cat in the test panel of 82 cats, for each panel of loci (13, 24, 30, 35, 83).


Discussion of limitations of the test

Circularity and limit of reference data

There is clearly potential for circularity given the limited number of animals used as reference individuals to design this test and the limited amount of knowledge that we have on the current situation in Scotland more generally. We have sought to mitigate this in the following ways:

1. Inclusion of reference animals from geographically distant areas (mainland Europe, Scotland).
2. Two separate methods of choosing loci: one using pelage characteristics in a smaller number of animals (for the first 12 markers), and one relying only on inherent patterns of genetic differentiation11 in a larger, geographically diverse dataset (for the additional 23 markers). These two methods were performed using two different datasets as a starting point (i.e. they are independent).
3. No assumption in the STRUCTURE test that alleles at the markers are diagnostic of “wildcat” or “domestic” populations. The STRUCTURE model can handle the possibility of ancestral polymorphism.
4. Additionally, the original test panel of cats from Germany and Scotland used to generate the 14 SNP test was re-examined at (1) the final panel of loci and (2) the final panel of loci minus the original 14 loci. This provided confirmation that the test results of the new SNP panel place these reference cats in the same category (Table 6).

11 Primarily (admixture) linkage disequilibrium, implemented through the STRUCTURE model.

Table 6: Details of the reference samples used to choose the original panel of 14 SNPs and their scores using the new SNP test. The new test contains 35 SNPs; however, the original cats (below) were only scored for 33 of these 35 SNPs because the Swiss test panel changed between the two studies and the original panel did not contain SNP044 and SNP045. In addition, the cats were also analysed using only the 21 new loci that did not overlap with the original panel of 14. In both cases the resulting hybrid scores place the cats in the correct category, i.e. wildcat or domestic.

RZSS_ID Other 7PS Q LBQ12 UBQ13 Q (21) LBQ11 UBQ12 Function in Accession No (33) (33) (33) (21) (21) original test panel WCQ0348 0.978 0.936 1 0.956 0.884 0.998 German WC WCQ0349 0.985 0.947 1 0.965 0.897 0.998 German WC WCQ0350 0.977 0.936 1 0.965 0.897 1 German WC WCQ0351 0.987 0.952 1 0.982 0.933 1 German WC WCQ0352 0.968 0.906 1 0.922 0.814 0.998 German WC WCQ0353 0.026 0 0.074 0.028 0 0.093 German DC WCQ0354 0.059 0.008 0.126 0.058 0 0.157 Scottish DC WCQ0355 0.087 0.019 0.166 0.113 0.011 0.235 Scottish DC WCQ0356 0.125 0.052 0.212 0.167 0.043 0.301 Scottish DC WCQ0364 1958.8 19 0.983 0.942 1 0.956 0.876 1 Scottish WC

WCQ0365 1947.13(2) 18 0.992 0.968 1 0.986 0.947 1 Scottish WC

WCQ0366 1931.59 21 0.991 0.964 1 0.983 0.936 1 Scottish WC

WCQ0367 mw1947.133 21 0.979 0.936 1 0.953 0.875 0.998 Scottish WC

WCQ0368 mw1947.131 19 0.967 0.92 0.998 0.938 0.861 0.989 Scottish WC

Null alleles

Null alleles are alleles that fail to amplify at a locus because of a mutation in the primer binding site. Individuals homozygous for a null allele will appear to have a failed genotype, and individuals heterozygous for a null allele will be scored as homozygotes. Null alleles can therefore interfere with the determination of hybridism, as false homozygote scores at SNPs with a null allele can inflate or deflate estimates (Senn & Pemberton 2009). For this reason, the STRUCTURE analysis is used to estimate null allele frequency jointly with the assignment analysis. The null allele frequency estimates from this analysis can be found in Appendix 1. Estimates range from 0.2-4.8% across both populations. The following loci had an estimated null allele frequency of >2% in the wildcat population cluster: SNP048, SNP196; and the following in the domestic cat cluster: SNP012, SNP058, SNP114, SNP115, SNP143, SNP190, SNP196. The likely presence of null alleles does not, however, appear to have a large effect on the estimation of hybridism, as shown by a comparison of Q-hat hybrid scores generated in STRUCTURE with and without the null allele estimation model (Figure 4).

12 Lower boundary of the 90% confidence interval for Q (see later)
13 Upper boundary of the 90% confidence interval for Q (see later)


Figure 4: Performance of STRUCTURE model with and without the option to simultaneously estimate null allele frequency on hybrid score (above) and confidence interval surrounding the hybrid score (below). The impact of null alleles on hybrid score estimation and its confidence is negligible.
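The way a null allele produces false homozygotes can be illustrated with a minimal sketch. It assumes Hardy-Weinberg proportions within a cluster, which is an assumption made here for illustration and not a description of how STRUCTURE estimates null allele frequency; the function name is hypothetical. The example frequencies are the wildcat-cluster estimates for SNP048 from Appendix 1.

```python
def apparent_genotype_frequencies(p_a, p_b, p_null):
    """Expected frequencies of observed genotype calls at a biallelic SNP
    with a non-amplifying (null) allele, assuming Hardy-Weinberg proportions
    (an illustrative assumption, not part of the original analysis)."""
    if abs(p_a + p_b + p_null - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    return {
        # true homozygotes plus allele/null heterozygotes scored as homozygotes
        "AA": p_a ** 2 + 2 * p_a * p_null,
        "BB": p_b ** 2 + 2 * p_b * p_null,
        # true heterozygotes are scored correctly
        "AB": 2 * p_a * p_b,
        # null/null individuals appear as failed genotypes
        "failed": p_null ** 2,
    }


# SNP048, wildcat cluster (Appendix 1): C 0.115, T 0.837, Null 0.048
p_c, p_t, p_null = 0.115, 0.837, 0.048
freqs = apparent_genotype_frequencies(p_c, p_t, p_null)

# Share of individuals expected to be scored as homozygous while actually
# carrying one null allele (roughly 9% for these estimates).
false_homozygotes = 2 * p_c * p_null + 2 * p_t * p_null
print(freqs, round(false_homozygotes, 4))
```

Under these assumptions, even a null allele frequency below 5% produces a noticeable share of false homozygote calls, which is why the effect is estimated jointly in the assignment analysis and checked against a run without the null allele model.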


Physical linkage

There is a potential issue with physical linkage amongst the chosen 35 markers. One of the assumptions of the STRUCTURE model is that the markers behave genetically as “unlinked”. Where markers are situated on the same chromosome, this assumption is technically violated. The more closely linked the markers are, the less likely they are to be separated by recombination in a given time period, and the greater the breach of this assumption. The more markers that are used, the more likely this assumption is to be breached. Where markers are assumed to be independent but are in fact not, estimates of hybridism may be skewed. Map distances between the markers, ordered along chromosomes, are shown in Table 7. As a rough guide, within the human genome markers situated more than 100,000,000 base pairs apart are likely to recombine each generation (the rate is approximately 0.01 crossovers per million bp) and so essentially behave independently. Comparison of physically linked markers in Table 7 shows that most are more closely situated than this.

To examine the possible effect of linkage on estimates of hybridisation, the linkage model in STRUCTURE (Falush et al. 2003) was employed. The linkage model essentially states that linked markers are more likely to come from the same population and weights this likelihood by the distance between them. The linkage model was run using the distances in Table 7 with the same parameters used for the other analyses (see above), including the null allele model. Linkage does not have any appreciable effect on the estimates of hybrid score or its confidence (Figure 5). The linkage model also does not appear to improve the estimates (i.e. it does not reduce the confidence interval).

Table 7: Physical distances between the panel of 35 SNPs. A distance of -1 denotes the first (or sole) marker on a chromosome.

Order of markers in cat genome   Chromosome and position (bp) on domestic cat genome   Physical distance (bp) from previous marker
SNP146                           A1_214220499                                          -1
SNP001                           A1_214461789                                          241,290
SNP048                           A3_51056949                                           -1
SNP114                           A2_62528310                                           11,471,361
SNP115                           A2_63544109                                           1,015,799
SNP012                           A3_90799249                                           27,255,140
SNP148                           A2_120724549                                          29,925,300
SNP019                           B2_11748866                                           -1
SNP016                           B1_20092839                                           8,343,973
SNP014                           B1_123418311                                          103,325,472
SNP155                           B2_129152112                                          5,733,801
SNP026                           B3_75494376                                           -1
SNP127                           B3_132539085                                          57,044,709
SNP166                           B3_147841323                                          15,302,238
SNP030                           B4_45476816                                           -1
SNP129                           B4_96741303                                           51,264,487
SNP101                           B4_143164026                                          46,422,723
SNP176                           C1_112821482                                          -1
SNP133                           C1_163375181                                          50,553,699
SNP178                           C1_189621758                                          26,246,577
SNP050                           C1_223335334                                          33,713,576
SNP102                           C2_142858667                                          -1
SNP058                           D1_126067118                                          -1
SNP060                           D1_128802001                                          2,734,883
SNP062                           D2_88876341                                           -1
SNP187                           D3_49022779                                           -1
SNP190                           D3_88773687                                           39,750,908
SNP084                           D4_103411241                                          -1
SNP098                           E1_47901546                                           -1
SNP195                           E2_33320051                                           -1
SNP196                           E2_50523470                                           17,203,419
SNP044                           E3_12301230                                           -1
SNP045                           F1_24323263                                           -1
SNP047                           F2_7927040                                            -1
SNP143                           F2_29878116                                           21,951,076
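The rough guide of about 0.01 crossovers per million bp can be turned into a quick calculation for the distances in Table 7. The sketch below is illustrative only: converting expected crossovers into a recombination fraction with Haldane's map function is an assumption introduced here (it ignores crossover interference) and is not part of the STRUCTURE linkage analysis described in the text. Function names are hypothetical.

```python
import math

# Rough guide quoted in the text: about 0.01 crossovers per million bp.
CROSSOVERS_PER_MB = 0.01


def expected_crossovers(distance_bp):
    """Expected number of crossovers per generation between two markers."""
    return CROSSOVERS_PER_MB * distance_bp / 1_000_000


def recombination_fraction(distance_bp):
    """Haldane's map function, r = 0.5 * (1 - exp(-2d)), with d in Morgans.
    Using this function is an assumption made for illustration."""
    d = expected_crossovers(distance_bp)
    return 0.5 * (1.0 - math.exp(-2.0 * d))


# A few inter-marker distances (bp) taken from Table 7.
distances = {
    "SNP146-SNP001": 241_290,
    "SNP058-SNP060": 2_734_883,
    "SNP016-SNP014": 103_325_472,
}

for pair, bp in distances.items():
    # r close to 0.5 means the markers behave as effectively unlinked.
    print(pair, round(expected_crossovers(bp), 4), round(recombination_fraction(bp), 4))
```

Under these assumptions only the most widely separated pair approaches the recombination fraction of 0.5 expected for independent markers, which is consistent with the decision to test the effect of linkage explicitly.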


Figure 5: Performance of STRUCTURE model with and without the linkage model on hybrid score (above) and confidence interval surrounding the hybrid score (below). The impact of linkage on hybrid score estimation and its confidence is negligible.


Discussion of other software for estimating hybridism

New Hybrids (Anderson & Thompson 2002) has previously been used to assign hybrid categories to wildcat data (Nussberger et al. 2013; the SNH survey of 2013/14 (Commissioned report 768)). It uses a Bayesian model similar to that of STRUCTURE to assign the animals in the dataset to discrete categories (e.g. Wildcat, Domestic, F1, F2, Bx1 etc.). Although its result appears conceptually simpler to understand than the Q-hat value provided by STRUCTURE, it is a less appropriate test for the scenario in Scotland. The reason is that the “hybrid swarm” pattern of hybridisation found in Scotland seems likely to be old and is therefore generating complex hybrids. We can imagine a scenario where F1 are mating with Bx2, F2 are mating with Bx5 etc. In other words, the cats do not conform to the simple categories proposed by the New Hybrids model. This means that cats are often assigned to multiple categories with low probability and are prone to “jumping” category when different reference data are used in the analysis (data not shown here). The simple estimate of the proportion of the genome that is introgressed, which is essentially what STRUCTURE provides (see footnote 14), is likely to be more accurate and is actually simpler to interpret.

Decisions for hybrid cut-off criteria

Nuclear genetic criteria at the 35 Locus test

Each hybrid estimate has an associated level of uncertainty (Figure 3). It is important to take this uncertainty into account when setting a cut-off for levels of hybridisation to inform conservation breeding. Given that the data form a genetic continuum (Figure 1), we state that the cut-off values for deciding whether an individual cat meets the genetic criteria for breeding shall be:
“Wildcat”: an animal scored on the 35 loci test whose lower bound of the 90% confidence interval (LBQ) is greater than 0.75.
“Certain not wildcat”: an animal scored on the 35 loci test whose upper bound of the 90% confidence interval (UBQ) is less than 0.75.
“Cat of uncertain genetic status”: any cat that falls between the above definitions, i.e. lower bound (LBQ) <0.75 and upper bound (UBQ) >0.75.
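The three genetic categories can be expressed as a short function. This is a minimal sketch: the function name is hypothetical, and the treatment of a bound exactly equal to 0.75 (classed here as uncertain) is an assumption, since the definitions only cover values strictly above or below the threshold. The example intervals are 83-locus values from Appendix 3, used purely to illustrate the rule.

```python
THRESHOLD = 0.75  # cut-off defined in the text


def genetic_category(lbq, ubq, threshold=THRESHOLD):
    """Classify a cat from the bounds of the 90% confidence interval for Q.

    A bound exactly equal to the threshold falls into the uncertain class;
    that edge case is an assumption, not stated in the source."""
    if lbq > ubq:
        raise ValueError("lower bound must not exceed upper bound")
    if lbq > threshold:
        return "Wildcat"
    if ubq < threshold:
        return "Certain not wildcat"
    return "Cat of uncertain genetic status"


# Illustrative (LBQ, UBQ) pairs taken from Appendix 3 (83-locus scores).
examples = {
    "WCQ0047": (0.878, 0.975),
    "WCQ0073": (0.004, 0.094),
    "WCQ0218": (0.662, 0.813),
}
for cat_id, (lbq, ubq) in examples.items():
    print(cat_id, genetic_category(lbq, ubq))
```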

14 Formally the posterior probability of belonging to a given cluster.


The value of 75% (0.75) is chosen because it represents the proportion of the genome that would be “wildcat” in a first generation backcross to wildcat, Bx1wildcat (i.e. a cat with one domestic grandparent). The cut-off allows us to select cats that we are approximately 95% confident (see footnote 15) are better than a first generation backcross.
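The arithmetic behind the 0.75 threshold and the 95% figure can be sketched as follows. The function names are hypothetical, and the expected proportions assume that each parent contributes exactly half of the genome on average, ignoring the variation between individuals that arises from recombination.

```python
def expected_wildcat_proportion(n_backcrosses):
    """Expected wildcat share of the genome after n successive backcrosses
    to pure wildcat, starting from an F1 (n = 0). Each backcross halves the
    remaining domestic share on average."""
    return 1.0 - 0.5 ** (n_backcrosses + 1)


def one_tailed_confidence(two_tailed_level):
    """Confidence that the true value lies above the lower bound of a
    symmetric two-tailed interval."""
    return 1.0 - (1.0 - two_tailed_level) / 2.0


for n in range(4):
    label = "F1" if n == 0 else f"Bx{n}"
    print(label, expected_wildcat_proportion(n))

# The 90% interval leaves 5% in each tail, so a cat whose lower bound
# exceeds 0.75 is above the threshold with about 95% confidence.
print(round(one_tailed_confidence(0.90), 2))
```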

Given the high degree of introgression found in the Scottish wildcat (see Figure 1), this is considered to be the best value to balance stringency, leniency and the inherent uncertainty in the genetic test (see above). The threshold could be increased if more animals with a high proportion of wildcat ancestry are found to exist. This cut-off is illustrated graphically in Figure 6. With differing numbers of loci this cut-off would have the following implications for the wild and captive cats in the test dataset (Table 8). It can be seen from the table that increasing the number of loci, in general, decreases the number of cats that we are uncertain about; however, under this cut-off system, using the full 83 loci brings few benefits over the 35-locus panel.

Table 8: Decisions made on cats in the test dataset using the cut-off detailed above, with datasets of different numbers of loci.

Panel                    Cats      “Certain not wildcat”   “Cat of uncertain genetic status”   “Wildcat”
13 Loci                  wild      2                       14                                  17
13 Loci                  captive   4                       6                                   23
83 Loci                  wild      11                      17                                  5
83 Loci                  captive   5                       6                                   22
Actual panel, 35 Loci    wild      11                      16                                  6
Actual panel, 35 Loci    captive   5                       6                                   22

15 95% because the 90% confidence interval is two-tailed.


Figure 6: Hybrid scores for individual cats scored at 35 loci in the test dataset. Cats are ordered along the bottom of the graph. Points represent the hybrid score of an individual cat. Lines represent the 90% confidence interval. Cats in green are “Good wildcat”, cats in red are “Certain not good wildcat”, and cats in grey are “Cat of uncertain genetic status”.


Power of mtDNA test:

Mitochondrial tests alone are not suitable for determining hybridisation since they only provide information regarding the female-to-female lineage (matriline). Essentially, this test provides information on only one of the 2^n possible ancestors that an individual has n generations back. In conjunction with nuclear markers it is useful for assaying recent hybridisation, but it becomes harder to interpret the older the hybridisation events are. Data generated as part of this report show that the power of this test is likely to be very low, because even cats that are genetically “good” at nuclear loci have mitochondrial DNA introgression (Figure 7). This suggests either that the mitochondrial haplotype is not fixed between these two species (ancestral polymorphism) or that introgression has been happening for a long time between wild and domestic cats in Scotland. Therefore we state here:
1. That this test is only used to provide corroborating evidence in the event that a decision has to be made about a “cat of uncertain genetic status” that has to go to committee (see The test decision).
Within the conservation breeding programme, the ultimate aim is to breed this trait out. The best strategy for doing this (i.e. in either the short or the longer term) will be established according to its impact on other genetic factors (i.e. inbreeding) within the captive breeding pool and is beyond the scope of this document.
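The point about the matriline can be made concrete with a small calculation. This is an illustrative sketch with hypothetical function names; it counts pedigree positions and so ignores the possibility that the same individual appears more than once in a pedigree.

```python
def ancestors_at_generation(n):
    """Number of ancestral positions in the pedigree n generations back."""
    return 2 ** n


def matriline_share(n):
    """Fraction of those ancestors that the mitochondrial haplotype
    reports on: only the single female-to-female line."""
    return 1 / ancestors_at_generation(n)


for n in (1, 2, 5, 10):
    print(n, ancestors_at_generation(n), matriline_share(n))
```

Ten generations back, for example, the mitochondrial marker reflects one ancestor out of 1024, which is why it is treated only as corroborating evidence.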


Figure 7: Hybrid score (Q-hat, at 83 loci) for wild and captive cats. Blue bars represent the hybrid score obtained from STRUCTURE. Red stars represent the domestic mitochondrial haplotype. It can be seen that cats with a high value of Q (i.e. very wildcat) have mitochondrial DNA introgression. This suggests either that the mitochondrial haplotype is not fixed between these two species (ancestral polymorphism) or that introgression has been happening for a long time.


Pelage Scores

We also investigated the relationship between pelage score (7PS) and DNA hybrid score. The cats detailed here are cats in the test dataset (Appendix 3) from the National Museum of Scotland (NMS) and the SNH survey of 2013/14 (Commissioned report 768). The cats were scored by Dr Andrew Kitchener and Charlotte Wagener (NMS). There is a weak positive relationship between the two types of marker for the range of cats that have been investigated at 83 loci (Figure 8). A previous survey that used 9 microsatellite markers (SNH Commissioned report no. 356, 2010), the survey by Beaumont et al. (2001), and the meta-analysis presented by Neaves & Hollingsworth (2013) have drawn similar conclusions.

Figure 8: Relationship between hybrid score (at 83 loci) and pelage score (7PS) for the 47 cats for which both types of data exist. There is a weak positive correlation between the two. A larger sample size is required to investigate the relationship between the two scores properly, emphasising the need for ongoing scientific research.

The explanation for this weak correspondence is likely to include the following reasons:

Phenotypic traits measured in wildcats are likely to be under the control of a small number of genes. The genetic test surveys a number of different genes that are probably not in tight linkage with the genes that control these phenotypic traits.

In recent hybrids we would expect the correspondence between phenotypic traits and hybrid score to be reasonable (due to linkage disequilibrium); however, in a situation of complex ancient hybridisation this relationship is likely to be broken up as small chunks of domestic cat genome enter the wildcat population carrying single genes that exert a large effect on phenotype (either as dominants or recessives). In the case of recessives, as the gene pool shrinks they are more likely to be expressed as homozygotes.

As yet, we are not in a technical position to understand the genes that control wildcat phenotype across the wide suite of genes for pelage, behaviour, physiology etc. that distinguish wildcats from their domestic relatives.

It should be noted that pelage scores are only one subset of the phenotypic traits that make a wildcat a wildcat. The pelage score is, however, a trait that is easy to measure, is unlikely to be subject to much environmental variation and is of importance to the general public (see footnote 16).

The lack of correspondence between genetic and phenotypic pelage characteristics should perhaps not be viewed as a problem; instead it is a logical consequence of a situation involving hybridisation over a number of generations. It is not unlikely that single genes may have a considerable effect on phenotype: mutations in the human melanocortin 1 receptor gene (MC1R), which when homozygous confer not only red hair colour but also pale skin, are a good illustration of this. In animals, a small number of genes has been implicated in coat colour variation in a large number of mammals, including house mice (Mus musculus), Soay sheep (Ovis aries), dogs (Canis familiaris) and other domesticated animals (Schmutz & Berryere 2007; Gratten et al. 2010; Cieslak et al. 2011). Under the very reasonable assumption that pelage characteristics are indeed under genetic control, we should view them as additional independent genetic markers.

Thus nuclear loci and pelage traits should be taken as independent lines of evidence when making decisions about whether or not a cat is a wildcat and a suitable candidate for conservation breeding.

16 See also objectives of Scottish Wildcat Conservation Action Plan (SNH 2013).


The test decision

The principle of the test

The principle of the test is that genetic and pelage traits are taken as independent lines of evidence. They are scored blind with respect to one another.

The details

Pelage Scoring Criteria

Standardised images of each individual will be taken whilst under anaesthesia according to the Photography Protocol (section 4.3). These will be sent to NMS for pelage scoring by trained personnel. The individual undertaking the scoring will not be made aware of the identity (trap location or captive collection, for example) of the cat to be scored, or of any genetic screening result. The findings will be considered in conjunction with results from the genetic screening on an individual basis, but as a general principle cats scoring <16 will be rejected, whilst those scoring >18 will be included in the conservation breeding programme subject to their genetic score. Those individuals scoring from 16 to 18 will require further consideration of their suitability (see detailed decision matrix below).

Genetic Scoring Criteria

Genetic screening of cats will occur through a DNA test consisting of 35 SNP markers. DNA extraction from blood and SNP genotyping will be conducted at the WildGenes laboratory at RZSS. Repeat analyses of each cat will be run, alongside positive and negative controls. Hybrid scores will be estimated using the programme STRUCTURE run against a large standard reference dataset of wildcats, domestic cats and hybrids. Again, all scoring will be done blind, with scorers not having any prior data on the trapping location or images of the individual being tested. The findings will be considered in conjunction with results from the pelage scoring on an individual basis, but as a general principle:
“Wildcat, PASS” (an animal scored on the 35 loci test whose lower bound (LBQ) of the 90% confidence interval surrounding Qhat is greater than 0.75) will be included in the conservation breeding programme.
“Certain not Wildcat, FAIL” (an animal scored on the 35 loci test whose upper bound (UBQ) of the 90% confidence interval surrounding Qhat is less than 0.75) will be rejected.
“Cat of uncertain genetic status, UNCERTAIN” (any cat that falls between the above definitions, i.e. lower bound <0.75, upper bound >0.75) will require further consideration as per the decision matrix (below).


Combined pelage and genetic scoring decision matrix:

1. Pelage and genetic scores will be generated independently and compared against this matrix:

MATRIX 1
                          Genetic Criteria (35 SNP)
Pelage Criteria (7PS)     FAIL (UBQ<0.75)    UNCERTAIN (UBQ>0.75 / LBQ<0.75)    PASS (LBQ>0.75)
<16                       ✖                  ✖                                  ✔*
16-18                     ✖                  ? Go to sub-matrix 2               ✔
>18                       ✖                  ✔                                  ✔

✔ Accept into breeding programme
✔* Monitor, breed for 1 generation and evaluate pelage of offspring
✖ Reject

2. Cats with an intermediate matrix score (falling into the central cell marked “?” above) will be further assessed against a refined matrix, as follows:

SUB-MATRIX 2
                          Genetic Criteria (35 SNP)
Pelage Criteria (7PS)     UNCERTAIN in Matrix 1 & Q <0.75    UNCERTAIN in Matrix 1 & Q ≥ 0.75
16                        ✖                                  ? Go to committee
17                        ✖                                  ✔
18                        ? Go to committee                  ✔

✔ Accept into breeding programme
✖ Reject
? Go to committee decision

3. Following the results of the refined matrix assessment, any cats that are still considered to be intermediate “?” (above) will be evaluated individually by a panel of three assessors from the National Museum of Scotland, the Royal Zoological Society of Scotland and Scottish Natural Heritage.
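The two matrices and the committee step can be read as a single decision procedure, sketched below. This is an illustration only, not an official implementation: the function names and outcome strings are hypothetical, and pelage scores are assumed to be whole numbers (scores recorded as, for example, “12+” would need separate handling).

```python
def matrix_1(pelage, genetic):
    """Matrix 1: pelage is the 7PS score; genetic is 'FAIL', 'UNCERTAIN'
    or 'PASS' from the 35 SNP test."""
    if genetic not in ("FAIL", "UNCERTAIN", "PASS"):
        raise ValueError("unknown genetic category")
    if genetic == "FAIL":
        return "reject"
    if genetic == "PASS":
        # Low pelage with a genetic pass: accept, but monitor the offspring.
        return "accept and monitor offspring" if pelage < 16 else "accept"
    # genetic == "UNCERTAIN"
    if pelage < 16:
        return "reject"
    if pelage > 18:
        return "accept"
    return "sub-matrix 2"


def sub_matrix_2(pelage, q):
    """Sub-matrix 2: only for cats that are UNCERTAIN with pelage 16 to 18."""
    high_q = q >= 0.75
    if pelage == 16:
        return "committee" if high_q else "reject"
    if pelage == 17:
        return "accept" if high_q else "reject"
    if pelage == 18:
        return "accept" if high_q else "committee"
    raise ValueError("sub-matrix 2 applies only to pelage scores 16 to 18")


def decide(pelage, genetic, q):
    """Combine both matrices; 'committee' means individual evaluation
    by the panel of three assessors."""
    outcome = matrix_1(pelage, genetic)
    if outcome == "sub-matrix 2":
        return sub_matrix_2(pelage, q)
    return outcome


print(decide(17, "UNCERTAIN", 0.80))  # accept
print(decide(16, "UNCERTAIN", 0.80))  # committee
print(decide(15, "PASS", 0.90))       # accept and monitor offspring
```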

Process documentation: The genetic and pelage scoring results and final decision on each cat will be documented in a standardised case file for each individual.

Process review: Test selection criteria will be reviewed after the first captive season, through a review of case files, by the Scottish Wildcat Conservation Action Plan Steering Group.


References

Anderson EC, Thompson EA (2002) A model-based method for identifying species hybrids using multilocus genetic data. Genetics, 160, 1217–1229.

Beaumont M, Barratt EM, Gottelli D et al. (2001) Genetic diversity and introgression in the Scottish wildcat. Molecular ecology, 10, 319–36.

Boecklen W, Howard D (1997) Genetic analysis of hybrid zones: numbers of markers and power of resolution. Ecology, 78, 2611–2616.

Cieslak M, Reissmann M, Hofreiter M, Ludwig A (2011) Colours of domestication. Biological reviews of the Cambridge Philosophical Society, 86, 885–99.

Daniels MJ, Beaumont MA, Johnson PJ et al. (2001) Ecology and genetics of wild-living cats in the north-east of Scotland and the implications for the conservation of the wildcat. Journal of Applied Ecology, 38, 146–161.

Driscoll C, Yamaguchi N, O’Brien SJ, Macdonald DW (2011) A suite of genetic markers useful in assessing wildcat (Felis silvestris ssp.)-domestic cat (Felis silvestris catus) admixture. The Journal of heredity, 102 Suppl , S87–90.

Falush D, Stephens M, Pritchard JK (2003) Inference of population structure using multilocus genotype data: linked loci and correlated allele frequencies. Genetics, 164, 1567–87.

Falush D, Stephens M, Pritchard JK (2007) Inference of population structure using multilocus genotype data: dominant markers and null alleles. Molecular ecology notes, 7, 574–578.

Gratten J, Pilkington JG, Brown EA et al. (2010) The genetic basis of recessive self-colour pattern in a wild sheep population. Heredity, 104, 206–14.

Kitchener AC, Yamaguchi N, Ward JM, Macdonald DW (2005) A diagnosis for the Scottish wildcat (Felis silvestris): a tool for conservation action for a critically-endangered felid. Animal Conservation, 8, 223–237.

Mattucci F, Oliveira R, Bizzarri L et al. (2013) Genetic structure of wildcat (Felis silvestris) populations in Italy. Ecology and Evolution, 3, 2443–2458.

McEwing R, Kitchener AC, Holleley C, Kilshaw K, O’Donoghue P (2011) An allelic discrimination SNP assay for distinguishing the mitochondrial lineages of European wildcats and domestic cats. Conservation Genetics Resources, 4, 163–165.

Neaves, L.E. & Hollingsworth, P.M. 2013. The Scottish wildcat (Felis silvestris); A review of genetic information and its implications for management. Conservation Genetic Knowledge Exchange, Royal Botanic Gardens,


Nussberger B, Greminger MP, Grossen C, Keller LF, Wandeler P (2013) Development of SNP markers identifying European wildcats, domestic cats, and their admixed progeny. Molecular ecology resources, 13, 447–60.

Nussberger B, Wandeler P, Camenisch G (2014) A SNP chip to detect introgression in wildcats allows accurate genotyping of single hairs. European Journal of Wildlife Research, 60, 405–410.

Nussberger B, Wandeler P, Weber D, Keller LF (2014) Monitoring introgression in European wildcats in the Swiss Jura. Conservation Genetics, 15, 1219–1230.

Oliveira R, Godinho R, Randi E, Alves PC (2008) Hybridization versus conservation: are domestic cats threatening the genetic integrity of wildcats (Felis silvestris silvestris) in Iberian Peninsula? Philosophical transactions of the Royal Society of London. Series B, Biological sciences, 363, 2953–61.

Pierpaoli M, Biro ZS, Herrmann M et al. (2003) Genetic distinction of wildcat (Felis silvestris) populations in Europe, and hybridization with domestic cats in Hungary. Molecular Ecology, 12, 2585–2598.

Pritchard JK (2010) Documentation for STRUCTURE software: Version 2.3.

Pritchard JK, Stephens M, Donnelly P (2000) Inference of population structure using multilocus genotype data. Genetics, 155, 945–59.

Randi E (2008) Detecting hybridization between wild species and their domesticated relatives. Molecular ecology, 17, 285–93.

Le Roux JJ, Foxcroft LC, Herbst M, MacFadyen S (2014) Genetic analysis shows low levels of hybridization between African wildcats (Felis silvestris lybica) and domestic cats (F. s. catus) in South Africa. Ecology and Evolution, n/a–n/a.

Schmutz SM, Berryere TG (2007) Genes affecting coat colour and pattern in domestic dogs: a review. Animal genetics, 38, 539–49.

Senn HV, Barton NH, Goodman SJ et al. (2010) Investigating temporal changes in hybridization and introgression in a predominantly bimodal hybridizing population of invasive sika (Cervus nippon) and native red deer (C. elaphus) on the Kintyre Peninsula, Scotland. Molecular ecology, 19, 910–24.

Senn HV, Pemberton JM (2009) Variable extent of hybridization between invasive sika (Cervus nippon) and native red deer (C. elaphus) in a small geographical area. Molecular ecology, 18, 862–76.

Senn HV, Swanson GM, Goodman SJ, Barton NH, Pemberton JM (2010) Phenotypic correlates of hybridisation between red and sika deer (genus Cervus). The Journal of animal ecology, 79, 414–25.

Witzenberger KA, Hochkirch A (2014) The genetic integrity of the ex situ population of the European wildcat (Felis silvestris silvestris) is seriously threatened by introgression from domestic cats (Felis silvestris catus). PloS one, 9, e106083.


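Appendices

Appendix 1: Allele frequencies at the 35 SNPs in the 82 individual test dataset

Each row gives one locus followed by its three alleles (two nucleotide alleles and the null allele). Each allele is followed by three values, in this order: Estimate of Ancestral Allele Freq, Estimate in Domestic Cat, Estimate in Wildcat.

SNP001 G 0.488 0.966 0.08 C 0.376 0.03 0.918 Null 0.136 0.003 0.002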

SNP012 G 0.428 0.736 0.265 T 0.426 0.237 0.73 Null 0.146 0.028 0.005

SNP014 A 0.341 0.949 0.003 G 0.496 0.048 0.992 Null 0.163 0.004 0.005

SNP016 A 0.495 0.957 0.178 G 0.359 0.029 0.82 Null 0.146 0.014 0.002

SNP019 G 0.308 0.867 0.002 A 0.523 0.12 0.991 Null 0.17 0.013 0.007

SNP026 G 0.401 0.986 0.006 C 0.43 0.011 0.991 Null 0.169 0.003 0.003

SNP030 A 0.339 0.937 0.003 G 0.493 0.058 0.99 Null 0.169 0.005 0.007

SNP044 A 0.406 0.623 0.06 G 0.465 0.374 0.935 Null 0.129 0.004 0.005

SNP045 C 0.529 0.11 0.989 T 0.31 0.887 0.002 Null 0.162 0.004 0.009

SNP047 G 0.33 0.779 0.009 C 0.514 0.216 0.98 Null 0.156 0.006 0.011

38

SNP048 C 0.451 0.95 0.115 T 0.346 0.04 0.837 Null 0.203 0.01 0.048

SNP050 G 0.491 0.98 0.125 A 0.337 0.01 0.859 Null 0.172 0.01 0.016

SNP058 G 0.421 0.654 0.326 A 0.422 0.325 0.659 Null 0.156 0.021 0.015

SNP060 A 0.373 0.977 0.003 T 0.448 0.016 0.993 Null 0.179 0.006 0.004

SNP062 G 0.37 0.971 0.003 T 0.442 0.019 0.99 Null 0.189 0.009 0.007

SNP084 A 0.397 0.978 0.01 G 0.423 0.015 0.984 Null 0.18 0.008 0.006

SNP098 G 0.4 0.931 0.026 A 0.453 0.065 0.97 Null 0.148 0.004 0.005

SNP101 C 0.508 0.993 0.076 T 0.339 0.005 0.917 Null 0.154 0.003 0.007

SNP102 C 0.348 0.955 0.001 T 0.468 0.035 0.994 Null 0.185 0.01 0.005

SNP114 G 0.484 0.929 0.245 A 0.35 0.041 0.749 Null 0.167 0.03 0.006

SNP115 G 0.486 0.931 0.242 A 0.341 0.033 0.751 Null 0.172 0.036 0.006

39

SNP127 C 0.384 0.966 0.007 T 0.451 0.029 0.99 Null 0.164 0.004 0.003

SNP129 G 0.403 0.992 0.004 A 0.417 0.005 0.991 Null 0.18 0.003 0.005

SNP133 G 0.475 0.911 0.236 A 0.384 0.078 0.761 Null 0.14 0.011 0.003

SNP143 C 0.468 0.936 0.105 T 0.363 0.036 0.888 Null 0.168 0.028 0.006

SNP146 C 0.476 0.948 0.079 T 0.388 0.047 0.919 Null 0.136 0.005 0.002

SNP148 G 0.339 0.93 0.001 A 0.48 0.053 0.996 Null 0.181 0.017 0.003

SNP155 A 0.516 0.985 0.196 C 0.323 0.007 0.795 Null 0.161 0.008 0.01

SNP166 G 0.497 0.959 0.19 A 0.366 0.036 0.807 Null 0.137 0.006 0.002

SNP176 T 0.397 0.99 0.004 C 0.412 0.004 0.988 Null 0.191 0.006 0.008

SNP178 G 0.441 0.992 0.018 A 0.377 0.003 0.973 Null 0.182 0.004 0.009

SNP187 C 0.47 0.95 0.092 G 0.375 0.044 0.898 Null 0.155 0.005 0.01

40

SNP190 C 0.319 0.843 0.003 G 0.481 0.103 0.993 Null 0.2 0.054 0.004

SNP195 A 0.38 0.988 0.003 G 0.415 0.007 0.982 Null 0.205 0.006 0.015

SNP196 T 0.506 0.976 0.333 A 0.317 0.013 0.647 Null 0.177 0.011 0.02

Appendix 2: STRUCTURE output at different values of K in the 82 individual test dataset

Appendix 3: Test data


Columns (one cat per row; blank cells are omitted from a row): ID, Other_ID, Q (83 loci), LBQ, UBQ, mtDNA type, Pelage score (7PS; scored by Andrew Kitchener/Charlotte Wagener, NMS), Sex, Locality description

rHK60 0.065 0.018 0.12 SWISS DOMESTIC REF
rHK64 0.058 0.014 0.112 SWISS DOMESTIC REF
rWK24 0.736 0.661 0.807 SWISS BxWC REF
rWK28 0.989 0.966 1 SWISS WILDCAT REF
rWK54 0.558 0.478 0.637 SWISS F1 Ref REF
rWK56 0.992 0.974 1 SWISS WILDCAT REF
WCQ0047 PH04.96 0.931 0.878 0.975 wild 18 F A832
WCQ0052 R171.99 0.898 0.842 0.946 wild 18 F Scotland, Argyll, Ardslignish
WCQ0053 R172.99 0.93 0.88 0.972 wild 12+ F Scotland, Ross-shire, Black Isle, Munlochy, On A832 Eof
WCQ0073 2/408 0.042 0.004 0.094 domestic 8 F Garve, Inchbae
WCQ0097 PH23.12 0.046 0.006 0.099 domestic F Glenlivet estate
WCQ0098 PH24.12 0.503 0.42 0.586 domestic 11 F Glenlivet estate
WCQ0099 GH22.12 0.089 0.038 0.147 domestic 16 F Portlethen/Aberdeenshire
WCQ0100 GH37.12 0.317 0.24 0.398 wild 12 F Castle Grant\Strathspey
WCQ0104 PH13.12 0.477 0.394 0.561 wild F Road B976
WCQ0105 R103.03 0.424 0.342 0.507 wild M A947, Parkside, near Oldmeldrum
WCQ0107 RL45/97 0.659 0.576 0.739 domestic F Grantown-on-Spey
WCQ0110 GH10.12 0.854 0.786 0.915 wild M Between Kinveachy and Rattray/Strathspey
WCQ0114 GH40.12 0.544 0.458 0.629 domestic 12 F Dunbeath/
WCQ0116 GH42.12 0.544 0.462 0.625 wild M Tarras Woodland\Moray
WCQ0118 GH44.12 0.546 0.462 0.629 wild 14 F East Lodge, Balavil Estate/Badenoch
WCQ0119 GH45.12 0.441 0.358 0.524 wild F B9119 near Wester Tulloch/Aberdeenshire
WCQ0132 PH125.13 0.828 0.756 0.894 wild 14 F na- captive
WCQ0137 PH30.11 0.528 0.445 0.611 domestic M Road A944 Delhand Bridge
WCQ0155 GH05.10 0.926 0.861 0.979 domestic 12 F Port Lympne WAP
WCQ0157 GH07.10 0.943 0.893 0.984 domestic 17 M 90/92 L
WCQ0158 GH08.10 0.92 0.869 0.963 wild 12 M Kinveachy Junction
WCQ0159 GH09.10 0.868 0.799 0.93 wild M Ballintean, Glen Feshie
WCQ0160 GH10.10 0.882 0.822 0.934 wild 13 M Lochranza, Cullicudden, Culbokie
WCQ0161 GH14.10 0.668 0.585 0.749 wild M Drumtochty Glen, Auchenblae
WCQ0165 GH27.10 0.072 0.026 0.127 domestic 9 M Nethy Bridge
WCQ0166 GH28.10 0.333 0.252 0.416 wild F na- captive
WCQ0167 GH29.10 0.974 0.924 0.999 domestic M na- captive
WCQ0168 GH31.10 0.406 0.323 0.49 domestic 10 M A944 near Strathdon
WCQ0170 GH38.10 0.437 0.356 0.519 wild F na- captive
WCQ0171 GH39.10 0.51 0.427 0.593 wild M Laggantrygonn Cemetry
WCQ0172 GH40.10 0.22 0.149 0.295 wild 13 M Upcott Grange Farm, Devon
WCQ0208 PH114.13 0.719 0.639 0.795 wild 14 F Auchleven
WCQ0209 GCB 0.685 0.605 0.761 wild 16 F Gartly, Aberdeenshire
WCQ0210 GH10.12 0.854 0.786 0.915 wild M Between Kinveachy and Rattray/Strathspey
WCQ0211 GH16.10 0.905 0.85 0.952 wild 13 F A837 Lochinver-Inchnadamph, Assynt
WCQ0212 GH31.12 0.804 0.731 0.872 wild 17 M A957 Slug Road, Rickarton, Stonehaven
WCQ0213 GH4.10 0.807 0.731 0.878 wild 15 F Rymore, Tulloch, Nethybridge
WCQ0214 GH6.10 0.904 0.846 0.954 wild 13 Banffshire, Ordiquill, near Cornhill
WCQ0215 GH8.12 0.359 0.278 0.442 wild 12 F Between Newtonmore and Kingussie
WCQ0216 MOR-CB = MO 0.659 0.58 0.736 wild 15 F , Lochaber
WCQ0217 P2M1 0.915 0.852 0.969 wild 13+ na-captive
WCQ0218 PH18.12 0.74 0.662 0.813 wild 13 M Road A9 at pass of Binnam
WCQ0221 WCT-T1 0.3 0.223 0.379 wild 17 Near Wick- Bridge of gillock
WCQ0222 G-CC 0.754 0.679 0.824 wild 12 Gartly, Aberdeenshire
WCQ0223 GCD 0.54 0.456 0.622 wild 15 M Gartly, Aberdeenshire
WCQ0224 GCE 0.801 0.731 0.866 wild 8 F Gartly, Aberdeenshire
WCQ0225 GH17.10 0.697 0.618 0.772 wild 16 A939 Grantown-on-Spey to Tomintoul rd, about 5 mile south of Grantowen-on-Spey
WCQ0226 GH25.12 0.588 0.504 0.67 wild M Drumochter Pass/Badenoch
WCQ0227 GH36.12 0.267 0.192 0.346 wild 13 F A95 South of Grantown-on-Spey
WCQ0228 GH60.10 0.22 0.149 0.295 wild 14 M Nethy Bridge
WCQ0229 MOR-CA = MOR- 0.403 0.321 0.487 wild 11 F Morvern, Lochaber
WCQ0230 PH27.12 0.404 0.322 0.488 wild 15 M Drumtochty,Aberdeenshire
WCQ0231 PH31.12 0.477 0.394 0.561 wild 17 M Drumtochty, Aberdeenshire
WCQ0234 ANG-CA =ANG J 0.597 0.515 0.678 wild 7 F Angus
WCQ0235 GH23.12 0.361 0.282 0.442 wild 12 M Gartly, near Huntly
WCQ0236 GH24.12 0.476 0.395 0.557 wild 12 F Blacklunans, Glenshee/Perthshire
WCQ0243 P1F1 0.935 0.866 0.989 domestic 19 F na- captive
WCQ0244 P1M1 0.862 0.787 0.932 domestic 18 M na-captive
WCQ0245 P2F1 0.913 0.84 0.978 domestic 19 F na- captive
WCQ0246 BO-CA = SBO-B 0.556 0.473 0.638 domestic 14 M Strathbogie, Aberdeenshire
WCQ0247 WCT-CPL (Diesel) 0.75 0.675 0.821 domestic Male Aviemore
WCQ0248 WCT-T2 0.918 0.847 0.981 domestic 17 Roadside- Moy Bridge Brahan Estate A835
WCQ0249 GCA 0.581 0.497 0.663 domestic 10 M Gartly Aberdeenshire
WCQ0250 Kitten' 0.665 0.584 0.743 domestic Glen Muick, Aberdeenshire.
WCQ0251 Scaniport 0.187 0.119 0.26 domestic M Scaniport, Invernesshire
WCQ0252 SBO-CC = SBO-C 0.505 0.421 0.588 domestic 15 M Strathbogie, Aberdeenshire
WCQ0253 CV1 0.185 0.12 0.256 domestic F Ben Wyvis area
WCQ0255 GH30.12 0.375 0.294 0.459 domestic 18 M Kaims of Airlie
WCQ0256 GH34.12 0.313 0.236 0.393 domestic 12 M Dava moor
WCQ0335 Cama 0.913 0.843 0.972 wild Female na- captive
WCQ0336 Edana 0.886 0.758 0.987 domestic Female na- captive
WCQ0337 Sid 0.924 0.853 0.979 domestic Male na- captive
WCQ0338 Finn 0.849 0.758 0.93 domestic Male na- captive
WCQ0339 Muira 0.886 0.816 0.947 domestic Female na- captive
WCQ0340 Iona 0.863 0.787 0.933 domestic Female na- captive
WCQ0341 Forba 0.841 0.762 0.914 domestic Male na- captive
WCQ0342 Alvie 0.885 0.82 0.943 wild na- captive
WCQ0343 Kendra 0.928 0.869 0.977 wild na- captive
WCQ0344 Iona 0.953 0.899 0.993 wild na- captive
WCQ0345 Garton 0.945 0.89 0.989 wild na- captive


Appendix 4: Physical validation of the test

This section details how the test was validated in the laboratory at RZSS. The SNP marker assays were ordered from Applied Biosystems as Custom TaqMan® Probes. The alleles that correspond to given dyes can be found in Table 1. Probe information, including reordering numbers, can be found in Appendix 6.

The probes were tested on a dataset of reference individuals from the WildGenes laboratory, RZSS. These consisted of 45 wild and domestic type cats, including 4 individuals previously part of the test dataset generated on the Swiss system of 83 SNPs (WCQ0073, WCQ0105, WCQ0118, WCQ0227) and 12 internal controls that were repeated twice (denoted r in Table 9). A single sample of Felis margarita (SCA098) was also included, as this species is commonly processed in the WildGenes laboratory and is closely related. Two non-template (negative) controls were included, as is standard.

Table 9 Samples used for the test verification process

NCT NCT SCA098 WCQ0073 WCQ0105 WCQ0118 WCQ0227 WCQ0358 WCQ0362 WCQ0382 WCQ0383 WCQ0384 WCQ0385 WCQ0386 WCQ0387 WCQ0388 WCQ0389 WCQ0390 WCQ0391 WCQ0392 WCQ0393 WCQ0399 WCQ0400 WCQ0427 WCQ0428 WCQ0428r WCQ0429 WCQ0429r WCQ0430 WCQ0430r WCQ0431 WCQ0431r WCQ0432 WCQ0432r WCQ0433 WCQ0433r WCQ0434 WCQ0434r WCQ0435 WCQ0435r WCQ0436 WCQ0436r WCQ0437 WCQ0437r WCQ0438 WCQ0438r WCQ0439 WCQ0439r

Samples were run according to the conditions detailed in the standard test protocol (Appendix 5). Plots of all the SNPs can be found in Appendix 7. Results were as follows:

- 100% of the genotypes that were called matched with the internal positives.

- 100% of the genotypes that were scored matched between the data generated by the Swiss system and the data generated by RZSS.

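The two concordance figures reported for the validation run can be sketched as a simple comparison of genotype calls. This is a minimal illustration only: the function name, the dictionary layout keyed by (sample, SNP), and the genotype values are assumptions made for the example and are not taken from the RZSS data.

```python
def concordance(calls_a, calls_b):
    """Count matching genotypes between two sets of calls.

    Each argument maps (sample_id, snp_id) -> genotype string such as "AG",
    or None where no call was made. Only genotypes called in both sets
    are compared. Returns (matches, compared).
    """
    matches = 0
    compared = 0
    for key, genotype_a in calls_a.items():
        genotype_b = calls_b.get(key)
        if genotype_a is None or genotype_b is None:
            continue  # uncalled in one set: excluded from the comparison
        compared += 1
        # A genotype is an unordered pair of alleles, so "AG" equals "GA".
        if sorted(genotype_a) == sorted(genotype_b):
            matches += 1
    return matches, compared


# Hypothetical calls for an internal control and its repeat (values invented).
original = {
    ("WCQ0428", "SNP001"): "AG",
    ("WCQ0428", "SNP012"): "CC",
    ("WCQ0429", "SNP001"): None,
}
repeat = {
    ("WCQ0428", "SNP001"): "GA",
    ("WCQ0428", "SNP012"): "CC",
    ("WCQ0429", "SNP001"): "AA",
}

matches, compared = concordance(original, repeat)
print(f"{matches} of {compared} called genotypes matched")
```

The same function could be applied to the comparison between the Swiss and RZSS systems by passing the two systems' calls for the shared individuals.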

Appendix 5: Standard Test protocol at RZSS

Laboratory amplification of 35 SNP probes using the StepOne real-time PCR machine

- Using filtertips and standard laboratory protection & hygiene throughout

- Extract two replicates of each sample from 100µl of EDTA blood using the standard Fuji film protocol. Extract one blank alongside.

- Set up the plate of samples:

- Quantify samples using the QUBIT dsDNA BR assay.

- Dilute samples to 10ng/µl using ddH2O.

-Make the mastermix using the following recipe:

PER SAMPLE:

5µl TaqMan GTXpress MasterMix

0.25µl TaqMan Probe

3.75µl ddH2O

-Label the reaction plate on the side only (do not write on the top of the plate).

- Pipette 1µl of DNA into the appropriate well. Remember to use four positive controls and two non-template controls per SNP: one of 1µl ddH2O and the other of 1µl extraction control.

- See below for an example of plate set-up for one sample, testing six SNPs.

             1          2               3             4             5             6             7          8
SNP001   A   NTC ddH2O  NTC extraction  Run control1  Run control2  Run control3  Run control4  Sample1.1  Sample1.2
SNP012   B   NTC ddH2O  NTC extraction  Run control1  Run control2  Run control3  Run control4  Sample1.1  Sample1.2
SNP014   C   NTC ddH2O  NTC extraction  Run control1  Run control2  Run control3  Run control4  Sample1.1  Sample1.2
SNP016   D   NTC ddH2O  NTC extraction  Run control1  Run control2  Run control3  Run control4  Sample1.1  Sample1.2
SNP019   E   NTC ddH2O  NTC extraction  Run control1  Run control2  Run control3  Run control4  Sample1.1  Sample1.2
SNP026   F   NTC ddH2O  NTC extraction  Run control1  Run control2  Run control3  Run control4  Sample1.1  Sample1.2

- Add 9µl of the mastermix to each well (use 8µl if using 2µl of DNA) and seal the plate with MicroAmp optical adhesive lids or MicroAmp optical 8-strip caps.

- Briefly spin the plate to ensure all the samples are at the bottom of the wells.

The PCR can be run either on the StepOne or on a thermocycler. The endpoint read must be done on the StepOne.

PCR Conditions:

Stage                 Step                        Temp   Time (StepOne)
Holding               DNA polymerase activation   95˚C   20 sec
Cycling (40 cycles)   Denature                    95˚C   3 sec
                      Anneal/Extend               60˚C   30 sec

Use a post read temperature of 25˚C.
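The per-sample recipe and the dilution step lend themselves to a short calculation. The sketch below scales the three mastermix components to a number of wells and works out a dilution to 10 ng/µl with C1V1 = C2V2. The function names, the optional pipetting overage, and the example stock concentration are assumptions for illustration and are not part of the RZSS protocol.

```python
# Per-reaction volumes in microlitres, as given in the recipe (9 µl in total).
PER_REACTION_UL = {
    "TaqMan GTXpress MasterMix": 5.0,
    "TaqMan probe": 0.25,
    "ddH2O": 3.75,
}


def mastermix_volumes(n_reactions, overage=0.0):
    """Scale the recipe to n_reactions wells.

    overage is a hypothetical allowance for pipetting loss
    (0.1 means 10% extra); the protocol itself does not specify one.
    """
    scale = n_reactions * (1 + overage)
    return {name: round(vol * scale, 2) for name, vol in PER_REACTION_UL.items()}


def dilution_volumes(stock_ng_per_ul, final_volume_ul, target_ng_per_ul=10.0):
    """Return (dna_ul, water_ul) needed to reach the target concentration."""
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is already below the target concentration")
    dna_ul = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return dna_ul, final_volume_ul - dna_ul


# One SNP row of the example plate has 8 wells:
# 2 non-template controls, 4 run controls and 2 sample replicates.
print(mastermix_volumes(8))

# Hypothetical stock of 50 ng/µl diluted to 10 ng/µl in 20 µl.
print(dilution_volumes(50.0, 20.0))
```

Because each SNP uses its own probe, the scaling applies per SNP row rather than to the whole plate.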

Data analysis

- The StepOne machine is set up to automatically call genotypes as laid out in Table 1.

- Data is imported into Excel.

- Run control genotypes are compared to reference data.

- The replicates of the target animal are compared to ensure identical genotypes.

- Data is analysed in STRUCTURE using the following (standard) model: 500,000 burn-in, 1,000,000 MCMC iterations, Admixture model (infer alpha), Correlated allele frequencies model (Lambda = 1). Null allele frequencies are estimated simultaneously using the RECESSIVEALLELES=1 option and by setting dummy values at each locus (see STRUCTURE manual). The model is run at K=2 for 3 replicates.

- Qhat values of the reference data set are compared to known Qhat values for this data set.

- Mean values of Q, UBQ and LBQ are taken.

- The final value for the cat is compared against the decision matrix.
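The averaging step can be sketched as follows, assuming the means are taken across the 3 STRUCTURE replicates for a single cat. The function name, the dictionary layout and the example values are hypothetical; the comparison against the decision matrix is not shown because the matrix is defined elsewhere in the document.

```python
from statistics import mean


def summarise_replicates(replicates):
    """Average Q, LBQ and UBQ over replicate STRUCTURE runs for one cat.

    replicates is a list of dicts, one per run, each holding the point
    estimate "Q" and the lower and upper bounds "LBQ" and "UBQ".
    """
    return {
        field: round(mean(run[field] for run in replicates), 3)
        for field in ("Q", "LBQ", "UBQ")
    }


# Invented values for three replicate runs of one animal.
runs = [
    {"Q": 0.74, "LBQ": 0.66, "UBQ": 0.81},
    {"Q": 0.75, "LBQ": 0.67, "UBQ": 0.82},
    {"Q": 0.76, "LBQ": 0.68, "UBQ": 0.83},
]

print(summarise_replicates(runs))
```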


Appendix 6: Nuclear SNP assay information and reorder numbers

ROYAL ZOOLOGICAL SOCIETY SCOTLAND 5556093 19-JAN-15 4332077 Custom Taqman(R) SNP Genotyping Assay Service AHS1P79 96-position tube rack v1 1382223 10-digit barcoded tube 0168673674 A01 40x SNP195_F TCCTGTCTGGCCAGTCTTCTT 36 SNP195_R CCTGCATCCACTGCTTTATAAGGT 36 SNP195_V VIC ATGAACCAATCTCTCCCC 8 NFQ SNP195_M FAM TATGAACCAATCCCTCCCC 8 NFQ SNP195

ROYAL ZOOLOGICAL SOCIETY SCOTLAND 5556093 19-JAN-15 4332077 Custom Taqman(R) SNP Genotyping Assay Service AHUAOEH 96-position tube rack v1 1382223 10-digit barcoded tube 0168673673 A02 40x SNP048_F TGTGTAGGTCATCCAGAGCTTTCTA 36 SNP048_R GGCATGCAATTAGAAGACATCTATCTC 36 SNP048_V VIC CAGCCTGGCCCCTTA 8 NFQ SNP048_M FAM CAGCCTGGTCCCTTA 8 NFQ SNP048

ROYAL ZOOLOGICAL SOCIETY SCOTLAND 5556093 19-JAN-15 4332077 Custom Taqman(R) SNP Genotyping Assay Service AHVJMKP 96-position tube rack v1 1382223 10-digit barcoded tube 0168673672 A03 40x SNP058_F CAGGACAGGCATGCTTCCA 36 SNP058_R AAAATGCCCAAGAGACTGATTCCT 36 SNP058_V VIC AACATCAATGATCTGTCACAG 8 NFQ SNP058_M FAM ACATCAATGATCTGTCATAG 8 NFQ SNP058

ROYAL ZOOLOGICAL SOCIETY SCOTLAND 5556093 19-JAN-15 4332077 Custom Taqman(R) SNP Genotyping Assay Service AHWSKQX 96-position tube rack v1 1382223 10-digit barcoded tube 0168673671 A04 40x SNP146_F AGCTCTGTCGCTCCTCACT 36 SNP146_R GACCAGCCACTAGAGAATTGTCATA 36 SNP146_V VIC TGTTCTTCTTGTGGACAGTG 8 NFQ SNP146_M FAM TGTTCTTCTTGTAGACAGTG 8 NFQ SNP146

ROYAL ZOOLOGICAL SOCIETY SCOTLAND 5556093 19-JAN-15 4332077 Custom Taqman(R) SNP Genotyping Assay Service AHX1IW5 96-position tube rack v1 1382223 10-digit barcoded tube 0168673670 A05 40x SNP176_F CACTGGCACTTGCTGTTATCAAAT 36 SNP176_R GCTTGGTAACTTTTGATTGAATGACTGA 36 SNP176_V VIC CTTCCTGATACATCTTATC 8 NFQ SNP176_M FAM TTCCTGATACACCTTATC 8 NFQ SNP176


ROYAL ZOOLOGICAL SOCIETY SCOTLAND 5556093 19-JAN-15 4332077 Custom Taqman(R) SNP Genotyping Assay Service AHZAG3D 96-position tube rack v1 1382223 10-digit barcoded tube 0168673669 A06 40x SNP190_F CAGTGTCTGTCTGGCCATCATTATT 36 SNP190_R AGAGCTGCTGGTCTCCTCAT 36 SNP190_V VIC TCCAATCCTATCGCACTCA 8 NFQ SNP190_M FAM CTCCAATCCTATCCCACTCA 8 NFQ SNP190

ROYAL ZOOLOGICAL SOCIETY SCOTLAND 5556093 19-JAN-15 4332077 Custom Taqman(R) SNP Genotyping Assay Service AH0JE9L 96-position tube rack v1 1382223 10-digit barcoded tube 0168673668 A07 40x SNP026_F GGAGGCGGAGACAATTAGCA 36 SNP026_R ACACTGTTTACCTTGCGTACTGA 36 SNP026_V VIC CCTTGGAAACCCCTAAGAT 8 NFQ SNP026_M FAM CTTGGAAACCGCTAAGAT 8 NFQ SNP026

ROYAL ZOOLOGICAL SOCIETY SCOTLAND 5556093 19-JAN-15 4332077 Custom Taqman(R) SNP Genotyping Assay Service AH1SDFT 96-position tube rack v1 1382223 10-digit barcoded tube 0168673656 A08 40x SNP030_F CCAGATGTGTGTGATACTTAGTCCATT 36 SNP030_R CTCACAGACCAATCTTGTCTCCTTTA 36 SNP030_V VIC AATTTCCTTCTCTAGTCATTT 8 NFQ SNP030_M FAM TCCTTCTCTGGTCATTT 8 NFQ SNP030

ROYAL ZOOLOGICAL SOCIETY SCOTLAND 5556093 19-JAN-15 4332077 Custom Taqman(R) SNP Genotyping Assay Service AH21BL1 96-position tube rack v1 1382223 10-digit barcoded tube 0168673657 A09 40x SNP166_F GGACAAAGACGCAGAGGAGTTTT 36 SNP166_R GTAAATAGATCACTGTGCCAGGACAT 36 SNP166_V VIC ATGCCCCTTCGTCCTAG 8 NFQ SNP166_M FAM ATGCCCCTTTGTCCTAG 8 NFQ SNP166

ROYAL ZOOLOGICAL SOCIETY SCOTLAND 5556093 19-JAN-15 4332077 Custom Taqman(R) SNP Genotyping Assay Service AH399R9 96-position tube rack v1 1382223 10-digit barcoded tube 0168673658 A10 40x SNP050_F AGAAAAAATAACAAAAGCAGCCACTGA 36 SNP050_R CGGTAAGAGTACAGCGAATGTGTT 36 SNP050_V VIC CAACCTTACAGAAATC 8 NFQ SNP050_M FAM CCAACCTTATAGAAATC 8 NFQ SNP050

ROYAL ZOOLOGICAL SOCIETY SCOTLAND 5556093 19-JAN-15 4332077 Custom Taqman(R) SNP Genotyping Assay Service AH5I7YH 96-position tube rack v1 1382223 10-digit barcoded tube 0168673659 A11 40x SNP098_F TGGGAAGACCAAGCAAGGG 36 SNP098_R CTCCCCTCAAGACCTCTCCTA 36 SNP098_V VIC TGTTCCTCTAAGCTTACTTC 8 NFQ SNP098_M FAM TGTTCCTCTAAGTTTACTTC 8 NFQ SNP098

ROYAL ZOOLOGICAL SOCIETY SCOTLAND 5556093 19-JAN-15 4332077 Custom Taqman(R) SNP Genotyping Assay Service AH6R54P 96-position tube rack v1 1382223 10-digit barcoded tube 0168673660 A12 40x SNP001_F CTGCTACACAATAACACACATGCAT 36 SNP001_R GAATTTACTGCATATCCCCCACTACA 36 SNP001_V VIC CAAAGTTTGAAGGATTTC 8 NFQ SNP001_M FAM AAAGTTTGAACGATTTC 8 NFQ SNP001

ROYAL ZOOLOGICAL SOCIETY SCOTLAND 5556093 19-JAN-15 4332077 Custom Taqman(R) SNP Genotyping Assay Service AH704AX 96-position tube rack v1 1382223 10-digit barcoded tube 0168673661 B01 40x SNP127_F CCAGAGAGCTGCCCAACATTT 36 SNP127_R GGACACGTAGGATCAGCTCATG 36 SNP127_V VIC TGGAAGGACGCCTCTT 8 NFQ SNP127_M FAM TGGAAGGACACCTCTT 8 NFQ SNP127

ROYAL ZOOLOGICAL SOCIETY SCOTLAND 5556093 19-JAN-15 4332077 Custom Taqman(R) SNP Genotyping Assay Service AH892G5 96-position tube rack v1 1382223 10-digit barcoded tube 0168673662 B02 40x SNP038_F GGGACCTTTGACCTTACATTGGTAT 36 SNP038_R AGGGTCCTCCATGTCCCAATATAT 36 SNP038_V VIC CTTTTCTAGGCACGAAGAC 8 NFQ SNP038_M FAM TCTAGGCGCGAAGAC 8 NFQ SNP038

ROYAL ZOOLOGICAL SOCIETY SCOTLAND 5556093 19-JAN-15 4332077 Custom Taqman(R) SNP Genotyping Assay Service AHABHMY 96-position tube rack v1 1382223 10-digit barcoded tube 0168673663 B03 40x SNP114_F CTCAGAAACCTCGCCATCCA 36 SNP114_R TGGTGGAATTATTTCATTAGAAGAGGCTTT 36 SNP114_V VIC TAGTGCCGCATCCTT 8 NFQ SNP114_M FAM CTAGTGCCACATCCTT 8 NFQ SNP114

ROYAL ZOOLOGICAL SOCIETY SCOTLAND 5556093 19-JAN-15 4332077 Custom Taqman(R) SNP Genotyping Assay Service AHBKFS6 96-position tube rack v1 1382223 10-digit barcoded tube 0168673664 B04 40x SNP143_F GTCTTGAGGCAGAGAACATTTGG 36 SNP143_R CACAAGGCCTAGTCTTTAGATAATTTTCAGA 36 SNP143_V VIC CAGTATTTCACGGTATACC 8 NFQ SNP143_M FAM CAGTATTTCACAGTATACC 8 NFQ SNP143

ROYAL ZOOLOGICAL SOCIETY SCOTLAND 5556093 19-JAN-15 4332077 Custom Taqman(R) SNP Genotyping Assay Service AHCTDZE 96-position tube rack v1 1382223 10-digit barcoded tube 0168673665 B05 40x SNP084_F GGCTAGGATTTGGTCTTTGCATAGT 36 SNP084_R CAAGAAGAACTATCCTGATGTGGGAAA 36 SNP084_V VIC TGTATTCAGTGTCTGTATCT 8 NFQ SNP084_M FAM ATTCAGTGCCTGTATCT 8 NFQ SNP084

ROYAL ZOOLOGICAL SOCIETY SCOTLAND 5556093 19-JAN-15 4332077 Custom Taqman(R) SNP Genotyping Assay Service AHD2B5M 96-position tube rack v1 1382223 10-digit barcoded tube 0168673666 B06 40x SNP019_F CGAGCAAGAGAAAGATGGTTAAGAGT 36 SNP019_R GGAGCATTTTAGGATTTTTTTGTGTATCG 36 SNP019_V VIC CTCTAAGACGCAACCTA 8 NFQ SNP019_M FAM ATCTCTAAGACACAACCTA 8 NFQ SNP019

ROYAL ZOOLOGICAL SOCIETY SCOTLAND 5556093 19-JAN-15 4332077 Custom Taqman(R) SNP Genotyping Assay Service AHFBABU 96-position tube rack v1 1382223 10-digit barcoded tube 0168673667 B07 40x SNP045r_F CCTCTACTGAGGGTTCCAAATGG 36 SNP045r_R GTCTGCAGATGTTGGGAAAGGA 36 SNP045r_V VIC AGTCTCCCACTGCAGTC 8 NFQ SNP045r_M FAM AGTCTCCCATTGCAGTC 8 NFQ SNP045r

ROYAL ZOOLOGICAL SOCIETY SCOTLAND 5556093 19-JAN-15 4332077 Custom Taqman(R) SNP Genotyping Assay Service AHGJ8H2 96-position tube rack v1 1382223 10-digit barcoded tube 0168673655 B08 40x SNP016_F AGTTTGACAAGTATAATTAAAGCTCCCTATG 36 SNP016_R CCTGCTTGGAATGAGAGAGATAGGA 36 SNP016_V VIC CTCTCCCCAATCATAC 8 NFQ SNP016_M FAM CTCTCCCCAGTCATAC 8 NFQ SNP016

ROYAL ZOOLOGICAL SOCIETY SCOTLAND 5556093 19-JAN-15 4332077 Custom Taqman(R) SNP Genotyping Assay Service AHHS6OA 96-position tube rack v1 1382223 10-digit barcoded tube 0168673654 B09 40x SNP044r_F ACTGTTTGGCATTGGCTTTTCC 36 SNP044r_R GCCTCAAATTCTTGGGCTCTGT 36 SNP044r_V VIC TTGCCTCCAAATGGA 8 NFQ SNP044r_M FAM TGCCTCCGAATGGA 8 NFQ SNP044r

ROYAL ZOOLOGICAL SOCIETY SCOTLAND 5556093 19-JAN-15 4332077 Custom Taqman(R) SNP Genotyping Assay Service AHKA20Q 96-position tube rack v1 1382223 10-digit barcoded tube 0168673652 B10 40x SNP014_F CATTCCCAATCTTCCTCTTTCCTGAA 36 SNP014_R CTGCTAGTGGGAAAAGAAACTGAGA 36 SNP014_V VIC AACTCTCAAATCTATTACTTC 8 NFQ SNP014_M FAM CTCTCAAATCTGTTACTTC 8 NFQ SNP014


ROYAL ZOOLOGICAL SOCIETY SCOTLAND 5556093 19-JAN-15 4332077 Custom Taqman(R) SNP Genotyping Assay Service AHLJ06Y 96-position tube rack v1 1382223 10-digit barcoded tube 0168673651 B11 40x SNP062_F CTCTTGTGGACACCCACCAA 36 SNP062_R GGCATTTCTTAGGAATCCAGATGTGT 36 SNP062_V VIC ACCTACTGTTTGGTAGGCA 8 NFQ SNP062_M FAM ACCTACTGTTTTGTAGGCA 8 NFQ SNP062

ROYAL ZOOLOGICAL SOCIETY SCOTLAND 5556093 19-JAN-15 4332077 Custom Taqman(R) SNP Genotyping Assay Service AHMSZC6 96-position tube rack v1 1382223 10-digit barcoded tube 0168673650 B12 40x SNP101_F TGTTCAATTCTCTGAGGCTTTCTGG 36 SNP101_R GGTGTCTTCTAGGGTTATGGCAAA 36 SNP101_V VIC TAGCCCTACAAAATGCCTCAG 8 NFQ SNP101_M FAM AGCCCTACAAAATACCTCAG 8 NFQ SNP101

ROYAL ZOOLOGICAL SOCIETY SCOTLAND 5556093 19-JAN-15 4332077 Custom Taqman(R) SNP Genotyping Assay Service AHN1XJE 96-position tube rack v1 1382223 10-digit barcoded tube 0168673649 C01 40x SNP102_F AAATAATGGCTCAGGTGCCTCTAC 36 SNP102_R GGCTAATTCTGTTTCTGTTCTCCCAAT 36 SNP102_V VIC CCCTTTGTCCACCTTT 8 NFQ SNP102_M FAM ACCCTTTGTCTACCTTT 8 NFQ SNP102

ROYAL ZOOLOGICAL SOCIETY SCOTLAND 5556093 19-JAN-15 4332077 Custom Taqman(R) SNP Genotyping Assay Service AHPAVPM 96-position tube rack v1 1382223 10-digit barcoded tube 0168673648 C02 40x SNP115_F CACATCAAAGCTCAGGTGAAACATT 36 SNP115_R CCGATCTCCACTGCAAATTCACT 36 SNP115_V VIC CAACAACAATTCTGTATCGTG 8 NFQ SNP115_M FAM CAACAACAATTCTATATCGTG 8 NFQ SNP115

ROYAL ZOOLOGICAL SOCIETY SCOTLAND 5556093 19-JAN-15 4332077 Custom Taqman(R) SNP Genotyping Assay Service AHQJTVU 96-position tube rack v1 1382223 10-digit barcoded tube 0168673647 C03 40x SNP129_F CAGGAGCTCCCCTAAAACTGAAATA 36 SNP129_R GCCTTCTCTTCCTGTCTCCAAATAA 36 SNP129_V VIC CTGGCTAGTGAAGAAA 8 NFQ SNP129_M FAM CTGGCTAATGAAGAAA 8 NFQ SNP129

ROYAL ZOOLOGICAL SOCIETY SCOTLAND 5556093 19-JAN-15 4332077 Custom Taqman(R) SNP Genotyping Assay Service AHRSR12 96-position tube rack v1 1382223 10-digit barcoded tube 0168673646 C04 40x SNP155_F CGGAGCAAACAGTCAATAACCAGTA 36 SNP155_R CCAAGTGCCATTAAGCAGCAAT 36 SNP155_V VIC ATTATGTTTCTAAACCCC 8 NFQ SNP155_M FAM ATTATGTTTCTAACCCCC 8 NFQ SNP155

ROYAL ZOOLOGICAL SOCIETY SCOTLAND 5556093 19-JAN-15 4332077 Custom Taqman(R) SNP Genotyping Assay Service AHS1P8A 96-position tube rack v1 1382223 10-digit barcoded tube 0168673645 C05 40x SNP178_F GCTCGACTTCCTATCAAAACCAAAA 36 SNP178_R CCAATTACAGGCTTGCATTTCTTGT 36 SNP178_V VIC CCTTGTCAGCGTCGAGAT 8 NFQ SNP178_M FAM CCTTGTCAGCATCGAGAT 8 NFQ SNP178

ROYAL ZOOLOGICAL SOCIETY SCOTLAND 5556093 19-JAN-15 4332077 Custom Taqman(R) SNP Genotyping Assay Service AHUAOEI 96-position tube rack v1 1382223 10-digit barcoded tube 0168673644 C06 40x SNP187_F CACTGAGGCCCAAGCAAGA 36 SNP187_R CCCCACCACTCCCTAATGTC 36 SNP187_V VIC CCTACTCTGAACTGCCTGTG 8 NFQ SNP187_M FAM CCTACTCTGAACTCCCTGTG 8 NFQ SNP187

ROYAL ZOOLOGICAL SOCIETY SCOTLAND 5556093 19-JAN-15 4332077 Custom Taqman(R) SNP Genotyping Assay Service AHVJMKQ 96-position tube rack v1 1382223 10-digit barcoded tube 0168673632 C07 40x SNP196_F GCTGTCCTGAGAGTAAAATTCAACTG 36 SNP196_R AGTATATGAGAGGTATTGAAGTAGCCTTT 36 SNP196_V VIC CTACTGTTGACTTCCC 8 NFQ SNP196_M FAM CTACTGTTGTCTTCCC 8 NFQ SNP196

ROYAL ZOOLOGICAL SOCIETY SCOTLAND 5556093 19-JAN-15 4332077 Custom Taqman(R) SNP Genotyping Assay Service AHWSKQY 96-position tube rack v1 1382223 10-digit barcoded tube 0168673633 C08 40x SNP060n_F ACACACACTCAAAGGACAAACAACT 36 SNP060n_R CCTGGTGTACCCCACTCATG 36 SNP060n_V VIC CTTCACCCCAAGGTTAG 8 NFQ SNP060n_M FAM CTTCACCCCATGGTTAG 8 NFQ SNP060n

ROYAL ZOOLOGICAL SOCIETY SCOTLAND 5556093 19-JAN-15 4332077 Custom Taqman(R) SNP Genotyping Assay Service AHX1IW6 96-position tube rack v1 1382223 10-digit barcoded tube 0168673634 C09 40x SNP047n_F TTCTCCATACTGGATTTTGGCACAA 36 SNP047n_R GTTTCCATACCTTCAACTAACTCGAGAT 36 SNP047n_V VIC CTTTTTTTGACACCTGTTTAC 8 NFQ SNP047n_M FAM TTTTTGACACGTGTTTAC 8 NFQ SNP047n

ROYAL ZOOLOGICAL SOCIETY SCOTLAND 5556093 19-JAN-15 4332077 Custom Taqman(R) SNP Genotyping Assay Service AHZAG3E 96-position tube rack v1 1382223 10-digit barcoded tube 0168673635 C10 40x SNP133n_F AGATTAGTGATTCTCAAAAAGGGAAGCA 36 SNP133n_R GCTTTAAACACCTTGCTCAGGAGAT 36 SNP133n_V VIC CAACCCGTGGGTATC 8 NFQ SNP133n_M FAM ACAACCCATGGGTATC 8 NFQ SNP133n

ROYAL ZOOLOGICAL SOCIETY SCOTLAND 5556093 27-JAN-15 4332077 Custom Taqman(R) SNP Genotyping Assay Service AHRSR11 96-position tube rack v1 1384653 10-digit barcoded tube 0168673430 A01 40x SNP148_F CTTTTGGGTACATTTGCTTCACTGAA 36 SNP148_R TGGTGCTGGAGAACTTGGAATG 36 SNP148_V VIC TTGGGTAGTCAGGCAACT 8 NFQ SNP148_M FAM TGGGTAGTCAGACAACT 8 NFQ SNP148

ROYAL ZOOLOGICAL SOCIETY SCOTLAND 5556093 27-JAN-15 4332077 Custom Taqman(R) SNP Genotyping Assay Service AHI14UI 96-position tube rack v1 1384653 10-digit barcoded tube 0168673429 B01 40x SNP012_F CTTGGTTACCTCTGGGAGACC 36 SNP012_R CCTGGGTAACAGTTTGACCTGATTT 36 SNP012_V VIC TGGACATTCATTTAGTCATGC 8 NFQ SNP012_M FAM TGGACATTCATTTATTCATGC 8 NFQ SNP012
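Each record above lists a VIC-labelled and a FAM-labelled probe that differ at the SNP position. As an illustrative aid for reading the records, the sketch below recovers the two alleles by comparing the probe sequences. It assumes the two probes target the same strand and differ by a single base once a small start-position shift is allowed for; the function name and the `max_shift` parameter are hypothetical. Which allele corresponds to which cat type is given in Table 1 and is not inferred here.

```python
def probe_difference(vic_probe, fam_probe, max_shift=3):
    """Return (vic_base, fam_base) at the single position where the probes differ.

    The probes can start a base or two apart, so small shifts of the FAM
    probe relative to the VIC probe are tried, smallest first. Returns None
    if no shift gives exactly one mismatch.
    """
    shifts = sorted(range(-max_shift, max_shift + 1), key=abs)
    for shift in shifts:
        mismatches = []
        for i, vic_base in enumerate(vic_probe):
            j = i + shift  # position in the FAM probe aligned with vic_probe[i]
            if 0 <= j < len(fam_probe) and vic_base != fam_probe[j]:
                mismatches.append((vic_base, fam_probe[j]))
        if len(mismatches) == 1:
            return mismatches[0]
    return None


# Probe sequences copied from the SNP195 and SNP048 records above.
print(probe_difference("ATGAACCAATCTCTCCCC", "TATGAACCAATCCCTCCCC"))
print(probe_difference("CAGCCTGGCCCCTTA", "CAGCCTGGTCCCTTA"))
```

For repetitive sequences more than one shift could give a single mismatch, so the result should be checked against the assay documentation rather than taken as definitive.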


Appendix 7: SNP assay clustering

[Figures: cluster plots for each assay: SNP001, SNP012, SNP014, SNP016, SNP019, SNP026, SNP030, SNP044, SNP045, SNP047, SNP048, SNP050, SNP058, SNP060, SNP062, SNP084, SNP098, SNP101, SNP102, SNP114, SNP115, SNP127, SNP129, SNP133, SNP143, SNP146, SNP148, SNP155, SNP166, SNP176, SNP178, SNP187, SNP190, SNP195, SNP196, WCID]