RECONSTRUCTING THE MOLECULAR PHYLOGENY OF GIANT SENGIS (GENUS RHYNCHOCYON)

A Thesis submitted to the faculty of San Francisco State University In partial fulfillment of the requirements for the Degree

Master of Science

In

Biology: Ecology, Evolution, and Conservation Biology

by

Elizabeth Jane Carlen

San Francisco, California

August 2015

Copyright by Elizabeth Jane Carlen 2015

CERTIFICATION OF APPROVAL

I certify that I have read Reconstructing the Molecular Phylogeny of Giant Sengis (genus Rhynchocyon) by Elizabeth Jane Carlen, and that in my opinion this work meets the criteria for approving a thesis submitted in partial fulfillment of the requirement for the degree Master of Science in Biology: Ecology, Evolution, and Conservation Biology at San Francisco State University.

Research Fellow, California Academy of Sciences

RECONSTRUCTING THE MOLECULAR PHYLOGENY OF GIANT SENGIS (GENUS RHYNCHOCYON)

Elizabeth Jane Carlen San Francisco, California 2015

Giant sengis (genus Rhynchocyon), also known as giant elephant-shrews, are approximately 500 g forest-floor mammals that range from Central to East Africa.

Previous work on giant sengis has focused primarily on pelage color, pelage pattern, and the geographic distributions of the groups. Because there is complex phenotypic variation and large geographic ranges within some species, I chose to use genetic work to evaluate the phylogeny and classification of the genus. Genetic data were used to investigate the four currently recognized species (R. chrysopygus, R. cirnei, R. petersi, and R. udzungwensis) and seven of the eight currently recognized subspecies (R. cirnei cirnei, R. cirnei macrurus, R. cirnei reichardi, R. cirnei shirensis, R. cirnei stuhlmanni, R. p. petersi, and R. p. adersi). I used DNA extracted from fresh and historical museum samples to analyze approximately 4,700 nucleotides (2,685 bases of mitochondrial DNA and 2,019 bases of nuclear DNA) and reconstruct a molecular phylogeny. I also investigated and genetically confirmed the identity of Rhynchocyon sp. sequences published on GenBank, and suggest that the captive Rhynchocyon populations of North American zoos are R. p. adersi. My analyses confirm the current morphological classification, with each currently recognized species forming a monophyletic clade. My phylogeny suggests that hybridization among taxa is not widespread in Rhynchocyon, that the recently reported sengi from the Boni forest of Northern Kenya is genetically similar to R. chrysopygus, and that the subspecies R. c. stuhlmanni should be elevated to full species.

I certify that the Abstract is a correct representation of the content of this thesis.

ACKNOWLEDGEMENTS

I would like to thank my thesis committee, J. Dumbacher, G. Rathbun, and D. Blackburn, for their support and feedback. This work was financially supported by the California Academy of Sciences, the Biology Department at San Francisco State University, the Graduate Student Council in Biology at San Francisco State University, the Society for the Study of Evolution, and the Society for Integrative and Comparative Biology. G. Rathbun particularly helped with facilitating the inclusion of the Boni Rhynchocyon. B.R. Agwanda of the National Museums of Kenya captured and prepared the Boni Rhynchocyon. S. Andanje of the Kenya Wildlife Service personally imported the Boni Rhynchocyon tissue into the United States of America. S. Musila of the National Museums of Kenya encouraged the inclusion of the Boni Rhynchocyon in this analysis. C. Sabuni collected several tissues of R. p. petersi for this project while working on his dissertation. K. Consolate collected tissues of R. c. stuhlmanni. W. Stanley at the Field Museum of Natural History, N. Duncan at the American Museum of Natural History, and J. Chupasko at the Museum of Comparative Zoology provided museum samples for this project. F. Catzeflis, L. Herwig, G. Rathbun, and J. Dumbacher helped determine the vouchers for the Douady et al. (2003) specimen. W. Wendelen provided a photograph of specimens from within the R. c. macrurus cline, including specimens collected in Chingulungulu, Tanzania (Douady et al. 2003). M. Omura provided photographs of MCZ43732. K. Lengel and S. Eller, at the Philadelphia Zoo, and P. Riger, at the Houston Zoo, helped track down information about the Rhynchocyon zoo specimens. K. Hildebrandt at the Museum of the North, University of Alaska, Fairbanks, A. Sellas at the California Academy of Sciences, and O. Carmi at the California Academy of Sciences helped teach me laboratory techniques. L. E. Olson provided financial support to travel to the University of Alaska Fairbanks to repeat DNA extraction and amplification in an alternate ancient DNA lab (NSF DEB-1120904 to LEO). R. Bell and E. Stanley helped with phylogenetic analysis. M. Bernal helped with PopART analysis. W. B. Simison provided guidance throughout this project and endless moral support. Finally, I would like to thank my parents, who have always supported my decisions. To everyone who supported this project and me, “thank you.”

TABLE OF CONTENTS

List of Tables

List of Figures

List of Appendices

Introduction

Methods

Specimens

Laboratory Methods

Alignment and Analysis

Results

Discussion

Clarifying the Taxonomic Status of Current Sequences

Origins of Captive Populations

Species Diagnosis

Current Taxonomic Status of Rhynchocyon

Conclusions

References

Appendices

LIST OF TABLES

Table

1. Data for specimens used for DNA sequencing
2. Primers used for DNA amplification and sequencing
3. Best fit models for loci sequenced
4. Distance matrix for comparing Smit et al. (2011) sequences
5. Distance matrix for 12s 16s mitochondrial sequences

LIST OF FIGURES

Figures

1. Rhynchocyon cirnei subspecies ranges
2. Geographic range of the genus Rhynchocyon
3. MrBayes phylogram of Rhynchocyon 12s 16s mitochondrial region
4. TCS allelic networks for Rhynchocyon nuclear loci IRBP and vWF
5. Nucleotide alignment of Smit et al. (2011) sequences and primers
6. Nucleotide alignment of Smit et al. (2011) sequences and Homo sapiens
7. MrBayes cladogram for Rhynchocyon 12s 16s mitochondrial region
8. Type locality for Rhynchocyon cirnei hendersoni

LIST OF APPENDICES

Appendix

1. Rhynchocyon color plates
a. Rhynchocyon chrysopygus
b. Rhynchocyon cirnei cirnei
c. Rhynchocyon cirnei macrurus
d. Rhynchocyon cirnei reichardi
e. Rhynchocyon cirnei shirensis
f. Rhynchocyon cirnei stuhlmanni
g. Rhynchocyon petersi petersi
h. Rhynchocyon udzungwensis
i. Comparison of multiple Rhynchocyon specimens
2. Additional field data from unvouchered specimens
3. Fresh and historical DNA extraction methods

Introduction

Since Carolus Linnaeus developed the binomial nomenclature system of classification, one of the tasks of scientists has been to organize organisms into taxonomic groups. Mayr and Bock (2002) go as far as saying organisms must be ordered before they can be studied and understood. This task is essential to understanding evolution because similarities among organisms (physiological, behavioral, morphological, developmental, etc.) can be a reflection of their descent (Hennig 1965).

Modern systematists demand that the taxonomy of a group reflect its evolutionary history. However, certain characters can be misleading, causing scientists to erroneously group and name organisms. Using the most obvious characters may not accurately estimate the degree of species relatedness because convergent evolution can cause distantly related organisms to appear similar. For example, European moles, golden moles, and marsupial moles share similar body shapes specialized for their fossorial lifestyle, and elongation of the body evolved in two unrelated lineages of salamanders, Lineatriton and Oedipina. In contrast, adaptive radiation can cause closely related organisms to appear unrelated (Schluter 2000). For example, the approximately 1,600 described species of cichlid fishes in the African Rift lakes appear morphologically unrelated. Since the advent of DNA sequencing, the strength and objectivity of molecular characters, combined with molecular phylogenetics, have helped scientists resolve the evolutionary relationships of many groups (Kress et al. 2002, Pellegrino et al. 2001, Zahiri et al. 2011). As the cost of DNA sequencing and computational analysis has fallen, molecular phylogenetic analysis has aided the understanding of evolutionary relatedness among organisms.

Here, I use DNA sequences to reconstruct a phylogeny for the giant sengis (genus Rhynchocyon), also known as giant elephant-shrews. Their English name, elephant-shrew, comes from their trunk-like snout and their once-presumed relationship with shrews. Recently, biologists have moved towards calling this group ‘sengis,’ which is a Kiswahili name (Rathbun & Kingdon 2006).

Because sengis are superficially similar in ecology and morphology to true shrews, hedgehogs, and moles, they were originally placed in the order Insectivora. However, it was clear from nearly the beginning that Insectivora was a ‘dumping ground’ for many groups of unknown phylogenetic affinity, and this resulted in various taxonomic reshuffling to accommodate sengis (Rathbun 2009). In the late 20th century, as molecular phylogenetics became more common, Springer et al. (1997) showed that sengis were actually more closely related to golden moles, tenrecs, aardvarks, hyraxes, elephants, and manatees. This new clade was given the name ‘Afrotheria’ to reflect the likely African origin, and the rank of superorder, sometimes referred to as a supercohort (Stanhope et al. 1998a). Subsequent molecular studies have confirmed the grouping of Afrotheria (Scally et al. 2002; Springer et al. 1999; Stanhope et al. 1998b; van Dijk et al. 2001).

The 19 extant species of sengis (Order: Macroscelidea, Family: Macroscelididae) are restricted to the African continent and form two well-defined sub-families, the soft-furred sengis (Macroscelidinae), with 15 extant species in three genera (Elephantulus, Macroscelides, and Petrodromus) and the giant sengis (Rhynchocyoninae) with four extant species in one genus (Rhynchocyon). While much molecular phylogenetic work has been done on the relationships of soft-furred sengis (Douady et al. 2003, Dumbacher et al. 2012, Dumbacher et al. 2014, Smit et al. 2007, Smit et al. 2008, Smit et al. 2011), comparatively little molecular work has been done on the phylogenetic relationships of giant sengis (Lawson et al. 2013, Smit 2008).

Giant sengis are the largest of all the sengis, ranging from approximately 300 g to 700 g. Giant sengis are quadrupedal mammals with long legs for their body size and a long tail that is sparsely haired. Their long snout can twist and probe in search of invertebrates, which giant sengis feed on. Giant sengis are largely diurnal, and species studied to date are facultatively monogamous, with males and females marking and defending territories (FitzGibbon 1997, Rathbun 2009).

In the 65 years between 1847 and 1912, ten species and four subspecies of giant sengis were described, all in the genus Rhynchocyon. Corbet and Hanks (1968) conducted the most thorough modern taxonomic revision of these 14 taxa and accepted only three species: R. chrysopygus, R. petersi, and R. cirnei. Corbet and Hanks (1968) further recognized two R. petersi subspecies (R. p. adersi and R. p. petersi) and six R. cirnei subspecies (R. c. cirnei, R. c. shirensis, R. c. reichardi, R. c. hendersoni, R. c. macrurus, and R. c. stuhlmanni). Rhynchocyon c. shirensis was a new subspecies, while the others were previously described as full species. Although Corbet and Hanks (1968) placed stuhlmanni as a subspecies of R. cirnei, they noted that R. c. stuhlmanni could arguably be a full species based on its short nasals, all-white tail (appendix 1, figure F), and its geographic range, which is isolated in the Congo Basin (figure 1).

Corbet and Hanks (1968) used pelage coloration, pelage pattern, and species distribution to inform their revision. The well-defined golden-rumped sengi, R. chrysopygus, has a distinctive golden-straw colored patch of fur on its rump with a surrounding rufous pelage (appendix 1, figure A) and lives in the low-canopy coastal forests of Kenya (figure 2). The black and rufous sengi, R. petersi, has rufous-tinged fur on its head with a black back and rump pelage (appendix 1, figure G), and lives in the evergreen forests of eastern Tanzania and southeastern Kenya (figure 2). Rhynchocyon petersi is further split into two subspecies, R. p. petersi from mainland Tanzania and Kenya, and R. p. adersi from the islands of Mafia and Zanzibar off the coast of Tanzania. Corbet and Hanks (1968) placed all other giant sengis in R. cirnei. Most R. cirnei have a head with brown-tinged fur and dorsal pelage with a pattern of spots and dark lines on a yellowish-brown or rufous background (appendix 1, figures B-F). Two R. cirnei subspecies, R. c. macrurus and R. c. stuhlmanni, show great variation in pelage color, with dark forms in the eastern portion of their respective ranges, and light forms in the western portion of their ranges (appendix 1, figure I). Rhynchocyon cirnei lives in a mosaic of habitats including lowland forests, montane forests, and riparian thickets, in Central and East Africa. Specifically, R. cirnei ranges between the Congo and Ubangi rivers in the Democratic Republic of Congo, along the Rift Valley in Malawi, Zambia, and Tanzania, and north of the Zambezi River Basin into Tanzania (figure 1).

In 2008, Rovero et al. described a fourth species, R. udzungwensis, which has a grey forehead, black rump pelage, and grizzled yellow-orange-rufous pelage on its sides (appendix 1, figure H), and occurs in two evergreen forests in the Udzungwa Mountains in Tanzania (figure 2). Andanje et al. (2010) reported a giant sengi with an unusual morphology from the Boni and Dodori National Reserves on the northern coast of Kenya. This sengi has a face with grizzled yellow-brown fur, dark maroon back pelage, and black rump fur, most closely resembling R. udzungwensis in coloration (see photos in Andanje et al. 2010). Andanje et al. (2010) did not provide a taxonomic rank or formal description for the ‘Boni sengi,’ and to date its taxonomic status has not been determined.

The Corbet and Hanks (1968) revision was followed by a different taxonomic interpretation by Jonathan Kingdon (1974) in his comprehensive series on East African mammals. Kingdon (1974) considered the giant sengis a single species, R. cirnei, with R. chrysopygus and R. petersi treated as ‘incipient species.’ Kingdon (1974) based this argument on the variation of dorsal pelage checkering present in all Rhynchocyon, and attributed the clinal pelage differences to hybridization among the multiple ‘incipient species’. In his volume on the mammals of Africa, Kingdon (2013) revised his taxonomic treatment of giant sengis, accepting the three species recommended by Corbet and Hanks (1968), with the addition of R. udzungwensis.

However, there is reason to be concerned about taxonomic classifications that are based primarily on pelage colors and patterns. One concern is that pelage characters can be unreliable phylogenetic characters. By using multiple, independently evolving characters in a phylogenetic matrix, it is less likely that one single character will grossly mislead the phylogenetic reconstruction.

In the first molecular phylogenetic study to contain Rhynchocyon, Douady et al. (2003) included a single specimen in their analysis of the role of the Sahara in the diversification of Macroscelidea. The Rhynchocyon specimen sequenced by Douady et al. (2003) was not identified to species and a voucher was not cited, although a collection locality in southeastern Tanzania was given in Douady’s thesis (2001).

In a second important study, Smit et al. (2011) included Rhynchocyon in their study of the phylogenetic relationships of Macroscelididae. Smit et al. (2011) sequenced approximately 2,000 bases of the mtDNA gene fragments 12S rRNA, valine tRNA, and 16S rRNA (12s 16s) from three historical Rhynchocyon specimens: R. chrysopygus, R. c. reichardi, and R. p. petersi from the Natural History Museum in London. Based on their phylogenetic analysis, Smit et al. (2011) proposed that R. petersi and R. cirnei were sister species, and R. chrysopygus was sister to the R. petersi - R. cirnei clade. Smit et al. (2011) also identified the Douady et al. (2003) sequence as R. chrysopygus. However, based on the collection locality of Douady’s (2001) tissue in southeastern Tanzania, I question Smit et al.’s (2011) identification.

The third key molecular study including Rhynchocyon is a paper by Lawson et al. (2013), who used molecular phylogenetics to study the interspecific relationship of R. udzungwensis and R. c. reichardi. Lawson et al. (2013) collected Rhynchocyon samples from four forests, including the contact zone between R. udzungwensis and R. c. reichardi. They analyzed three mitochondrial loci (ND2, D-loop, 12S) and two nuclear loci (ENML, vWF) and found the individual nuclear gene trees strongly supported the monophyly of R. udzungwensis. However, their analysis of concatenated mitochondrial loci supported paraphyly of R. udzungwensis. Due to the mixing of mitochondrial alleles in their phylogeny, Lawson et al. (2013) concluded that ancient hybridization occurred between R. c. reichardi and R. udzungwensis. They attributed the admixture to ancient hybridization, and not current hybridization, because of the monophyly of the nuclear loci and because they did not find morphological hybrids. It is unclear if historical introgression is widespread in Rhynchocyon, and Lawson et al. (2013) conclude that a robust multilocus matrix, with population-level sampling and introgression analysis, is needed to investigate the evolutionary history of this group.

The objective of my research was to generate DNA sequence data and use it to reconstruct a molecular phylogeny for the genus Rhynchocyon. With these data, I would then be able to determine the taxonomic designation of Douady et al.'s (2003) GenBank sequences, determine the subspecies designation of the captive R. petersi population, test for hybridization among Rhynchocyon taxa, and test the Corbet and Hanks (1968) Rhynchocyon taxonomic designations.

Methods

Specimens

Table 1 is a complete list of specimens sampled. I obtained fresh tissue preserved in alcohol from specimens in the mammalogy collections at the California Academy of Sciences (CAS) and the Field Museum of Natural History (FMNH). Unvouchered fresh tissue was also collected for this project; see appendix 2 for available data on these individuals. For taxa for which I had no fresh tissue, I sampled dried tissue from museum study skins. Additionally, I incorporated GenBank sequences from Douady et al. (2003) and Smit et al. (2011) and analyzed these sequences along with the sequences I generated for this study.

Laboratory Methods

Three independently segregating loci were chosen for genetic analysis based on previous work done with the family Macroscelididae (Douady 2001, Douady et al. 2003, Dumbacher et al. 2014, Lawson et al. 2013, Smit et al. 2011, Springer et al. 1997). I sequenced 2,685 bases from a mitochondrial region that includes genes for 12s ribosomal RNA, valine transfer RNA, and 16s ribosomal RNA (12s 16s), 976 bases of the nuclear locus inter-photoreceptor retinoid-binding protein exon 1 (IRBP), and 1,043 bases of the nuclear locus von Willebrand factor exon 28 (vWF).

I extracted DNA from approximately 25 mg of fresh tissue (previously stored in ethanol and frozen at -80°C until extraction) using a DNeasy Blood and Tissue extraction kit (Qiagen, Venlo, Limburg, Netherlands). A working aliquot of supernatant was stored at -20°C, with the remainder stored at -80°C in the Center for Comparative Genomics Cryogenic Collection at CAS. Further details regarding fresh tissue DNA extraction can be found in appendix 3.

Polymerase chain reaction (PCR) was performed on DNA extracted from fresh tissues using multiple primer sets. A complete list of primers used for amplification can be found in table 2. For DNA extractions from fresh tissue, I performed PCR in 25 µL reactions with the following reagent final concentrations: 1X Invitrogen Buffer (Life Technologies, South San Francisco, California, USA), 1.5 mM magnesium chloride, 0.4 µM forward primer, 0.4 µM reverse primer, 0.2 mM deoxyribonucleotides, and 0.2 units of Invitrogen Taq per sample (Life Technologies, South San Francisco, California, USA). A total of 2 µL DNA extract was added to each PCR tube. Tubes were spun and placed in a MyCycler™ Thermal Cycler (BioRad Laboratories, Inc., Hercules, California, USA).

During PCR, an initial denaturation took place at 94°C for 2 minutes, followed by 32-35 cycles of denaturation, annealing, and extension. Denaturation took place at 90-94°C for 1-2 minutes. Primer annealing took place at 50°C for 1 minute for 12s 16s, and at 55°C for 30 seconds for both IRBP and vWF. Extension was performed at 72°C for 2 minutes. A final extension period of 10 minutes at 72°C was performed at the end of the 32-35 cycles.
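As a quick sanity check on the fresh-tissue cycling conditions above, the programmed hold times can be tallied. The Python sketch below uses only the step durations stated in the text for the 12s 16s locus. The function name is illustrative, and ramp time between temperatures is ignored, so real run times would be somewhat longer.

```python
def programmed_minutes(initial, denature, anneal, extend, cycles, final):
    """Total programmed hold time in minutes, ignoring ramping between steps."""
    return initial + cycles * (denature + anneal + extend) + final


# 12s 16s program from the text: 2 min initial denaturation, 32-35 cycles of
# 1-2 min denaturation, 1 min annealing, and 2 min extension, then a 10 min
# final extension.
shortest = programmed_minutes(2, 1, 1, 2, cycles=32, final=10)
longest = programmed_minutes(2, 2, 1, 2, cycles=35, final=10)
print(f"12s 16s program: {shortest}-{longest} minutes of hold time")
```

The same function can be reused for the nuclear loci by passing 0.5 minutes for the annealing step.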

For historical museum specimens that did not have frozen tissue, I sampled approximately 25 mg of dried tissue from the hind foot or the dorsal incision of the dried specimen, and extracted this historical DNA in a dedicated ancient DNA laboratory at the California Academy of Sciences (CAS) or the University of Alaska Fairbanks, Museum of the North (UAF). For a subset of historical samples, extraction and PCR were repeated in both laboratories. This provided an independent replication for those individuals. Historical DNA was extracted using a standard phenol chloroform extraction (at CAS) or the DNA IQ™ Tissue and Hair Extraction Kit (Promega Bio Systems, Sunnyvale, California, USA) (at CAS and UAF). Extractions were stored at -20°C in a dedicated ancient DNA freezer until PCR was performed. A detailed description of methods used for historical DNA tissue extraction can be found in appendix 3.

For DNA extracts from historical specimens, PCR was performed using multiple overlapping primer sets per locus, with each primer set targeting 100 to 250 bases (Olson & Hassanin 2003). I designed primers based on sequences from fresh tissue using Primer3 v4.0.0 (Untergasser et al. 2012) in Sequencher v5.3 (Gene Codes Corporation, Ann Arbor, MI, USA) or Geneious v7.1.4 (Kearse et al. 2012).
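The overlapping-amplicon strategy described above can be illustrated with a short Python sketch that tiles a target region with short windows. The 200-base amplicon length and 50-base overlap are assumptions chosen for illustration within the stated 100 to 250 base range. They are not the values used in this study, and real primer placement would also depend on Primer3 design constraints.

```python
def tile_amplicons(region_length, amplicon_length=200, overlap=50):
    """Return (start, end) windows that cover a region with overlapping amplicons.

    Coordinates are 0-based and end-exclusive. The amplicon length and
    overlap defaults are hypothetical illustration values.
    """
    if region_length <= 0:
        raise ValueError("region_length must be positive")
    if not 0 <= overlap < amplicon_length:
        raise ValueError("overlap must be non-negative and shorter than the amplicon")
    step = amplicon_length - overlap
    windows = []
    start = 0
    while True:
        end = min(start + amplicon_length, region_length)
        windows.append((start, end))
        if end == region_length:
            break
        start += step
    return windows


# Tile the 2,685-base 12s 16s mitochondrial region reported in the text.
windows = tile_amplicons(2685)
print(len(windows), "amplicons; last window:", windows[-1])
```

Shorter amplicons with generous overlap are a common choice for degraded historical DNA, at the cost of more PCR reactions per locus.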

For ancient PCR reactions, I used AmpliTaq Gold® (Life Technologies, South San Francisco, CA, USA) for standard reactions, and Platinum® Taq DNA Polymerase High Fidelity (Life Technologies, South San Francisco, CA, USA) for samples that were more difficult to amplify. For the AmpliTaq Gold® method, 25 µL reactions were performed and final PCR concentrations were: 1X AmpliTaq Gold® Buffer, 2 mM magnesium chloride, 1 µM each forward and reverse primers, 1 mM deoxyribonucleotides, 1 mg/mL bovine serum albumin, 0.5 units AmpliTaq Gold® polymerase, and 2 µL DNA extract per reaction. The PCR program included an initial denaturation at 94°C for 9 minutes, followed by 55 cycles of denaturation at 94°C for 1 minute, primer annealing at 50°C-65°C for 30 seconds, and extension at 72°C for 1 minute. A final extension period of 4 minutes at 72°C was performed at the end of the 55 cycles. For the Platinum® Taq DNA Polymerase High Fidelity method, 15 µL reactions were performed and final PCR concentrations were: 1X High Fidelity Buffer, 0.6 mM magnesium sulfate, 0.4 µM each forward and reverse primers, 0.25 mM deoxyribonucleotides, 1.66 mg/mL bovine serum albumin, 0.15 units Platinum® Taq DNA Polymerase High Fidelity, and 0.5 µL of extracted DNA per reaction. The PCR program included an initial denaturation at 94°C for 2 minutes, followed by 60 cycles of denaturation at 94°C for 20 seconds, primer annealing at 52°C for 30 seconds, and extension at 68°C for 1 minute. A final extension period of 10 minutes at 68°C was performed at the end of the 60 cycles.

PCR products for fresh and historical samples were visualized on a 1% agarose gel stained with ethidium bromide or GelRed™ (Biotium, Inc., Hayward, California, USA), and product size was confirmed using a standard ladder. Primers and unincorporated nucleotides were eliminated using USB ExoSAP-IT (Affymetrix Inc., Santa Clara, California, USA) or a DNA Clean & Concentrator™-5 (Zymo Research Corporation, Irvine, California, USA). Amplicons were sequenced using BigDye Terminator version 3.1 cycle sequencing chemistry (Life Technologies, South San Francisco, California, USA). Amplicons were visualized on an ABI 3130 Genetic Analyzer (Life Technologies, South San Francisco, California, USA) located at CAS's Center for Comparative Genomics. Samples that were extracted and amplified at UAF were purified and sequenced at the High Throughput Genomics Center, Seattle, WA (http://www.htseq.org/).

Alignment and Analysis

Because of a higher likelihood of contamination, all amplicons from historical DNA were checked for contamination using the blastn, megablast, and discontiguous megablast algorithms for the nucleotide Basic Local Alignment Search Tool (BLAST) on the National Center for Biotechnology Information (NCBI) website. For sequences from fresh tissue, the entire sequence was BLASTed to check for contamination. Sequences were assembled and edited in Geneious v7.1.4 (Kearse et al. 2012). I removed all primers prior to assembly, and created consensus sequences for each individual. For heterozygotes at nuclear loci, both alleles were phased and given unique names (e.g. allele 1, allele 2). Assembled sequences, Rhynchocyon sequences downloaded from GenBank, and outgroup sequences were aligned in Geneious using the MAFFT v7.017 alignment plugin (Katoh et al. 2002). Alignments were checked by eye and exported for analysis. Duplicate haplotypes or allele sequences from multiple individuals were identified and eliminated using FaBox DNAcollapser v1.41 (Villesen 2007).
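The idea behind collapsing duplicate haplotypes can be sketched in a few lines of Python. This is a minimal illustration of the concept, not a reimplementation of FaBox DNAcollapser. The sample names and sequences are hypothetical, and the sketch assumes that sequences are already aligned and that only exact matches (ignoring letter case) count as duplicates.

```python
from collections import defaultdict


def collapse_haplotypes(sequences):
    """Group sample names by identical sequence.

    `sequences` maps a sample name to its aligned sequence. The result maps
    each unique (upper-cased) sequence to the list of samples that share it.
    """
    groups = defaultdict(list)
    for name, seq in sequences.items():
        groups[seq.upper()].append(name)
    return dict(groups)


# Hypothetical toy data: ind1 and ind2 carry the same haplotype.
toy = {"ind1": "ACGTACGT", "ind2": "acgtacgt", "ind3": "ACGTACGA"}
haplotypes = collapse_haplotypes(toy)
for seq, names in haplotypes.items():
    print(seq, "shared by", ", ".join(names))
```

Keeping the list of sample names for each unique sequence preserves the haplotype frequencies needed later for allele networks.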

Each of the three independently segregating loci (12s 16s, IRBP, and vWF) was analyzed independently and as part of concatenated datasets. There has been a debate over analyses based on individual or concatenated gene matrices (Degnan and Rosenberg 2006, Doyle 1992, Gatesy and Springer 2014, Maddison 1997, Szollosi et al. 2014). However, because I am trying to assess the relationship of close relatives, I analyzed each locus individually, to specifically test for introgression and conflicting signal, which may be ignored by concatenated matrices.
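For readers unfamiliar with concatenated matrices, the following Python sketch shows one common way to build one from per-locus alignments. The sample names, toy sequences, and the use of `?` to pad samples that lack a locus are assumptions for illustration. The text does not state how the concatenated datasets in this study were assembled.

```python
def concatenate_loci(loci):
    """Join per-locus alignments into one matrix, padding missing samples.

    `loci` is a list of dicts mapping sample name to aligned sequence.
    Samples absent from a locus receive '?' for every column of that locus.
    """
    samples = sorted(set().union(*loci))
    matrix = {sample: "" for sample in samples}
    for locus in loci:
        lengths = {len(seq) for seq in locus.values()}
        if len(lengths) != 1:
            raise ValueError("all sequences within a locus must have equal length")
        length = lengths.pop()
        for sample in samples:
            matrix[sample] += locus.get(sample, "?" * length)
    return matrix


# Hypothetical toy loci: sample B has no nuclear sequence.
mito = {"A": "ACGT", "B": "ACGA"}
nuclear = {"A": "GG"}
combined = concatenate_loci([mito, nuclear])
print(combined)
```

Because concatenation forces all loci onto a single tree, conflicting signal between a mitochondrial and a nuclear locus is easier to spot when each dictionary is also analyzed on its own.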

Phylogenetic analyses were run using both maximum likelihood and Bayesian approaches. First, Nexus files were imported into PAUP* v4.0b10 (Swofford 2003) and sequences were partitioned into transfer RNAs and ribosomal RNAs for the mitochondrial region, and into codon positions for the nuclear loci. I used MrModelTest v2.3 (Nylander 2004) and the Akaike information criterion (Akaike 1974) to assess the rate-specific model of evolution for each partition (table 3). I performed Bayesian analysis using MrBayes v3.1.2 (Ronquist and Huelsenbeck 2003). Bayesian analysis was run for 10 million Markov Chain Monte Carlo (MCMC) generations, sampling trees and parameters every 1,000th generation. The first 25% of the generations sampled were discarded as burn-in. I performed maximum-likelihood analysis using Random Axelerated Maximum Likelihood (RAxML) v7.2.6 (Stamatakis 2006). The number of bootstrap replicates was determined using the automatic bootstrapping criteria in RAxML. To test the robustness of the results and the impact of missing data, I removed any individuals with over 50% missing data from the matrix, and repeated the analysis on the reduced matrix. All analyses were performed on the phylocluster at CAS. Support for each node was estimated using Bayesian posterior probabilities in MrBayes (Ronquist and Huelsenbeck 2003) and bootstrap analysis in RAxML (Stamatakis 2006). I created a genetic distance matrix for the 12s 16s region in Geneious v7.1.4 (Kearse et al. 2012) by subtracting the percent identity provided in the multiple alignment from 100 and averaging individuals across taxa. Nuclear loci were visualized as unrooted TCS allele networks (Clement et al. 2000) using PopART v1 (Leigh et al. 2015).
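The distance calculation described above (100 minus percent identity, averaged over individuals for each pair of taxa) can be sketched in Python as follows. The sample names and sequences are hypothetical. The sketch also assumes that only columns where both sequences have an unambiguous base are compared, which may differ from how Geneious treats gaps and ambiguity codes.

```python
from collections import defaultdict
from itertools import combinations


def percent_identity(a, b):
    """Percent of compared columns that match, skipping gaps and ambiguities."""
    pairs = [(x, y) for x, y in zip(a, b) if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable columns")
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def mean_taxon_distances(seqs, taxon_of):
    """Average (100 - percent identity) over all individual pairs between taxa."""
    distances = defaultdict(list)
    for s1, s2 in combinations(sorted(seqs), 2):
        t1, t2 = taxon_of[s1], taxon_of[s2]
        if t1 == t2:
            continue  # only between-taxon comparisons are averaged here
        key = tuple(sorted((t1, t2)))
        distances[key].append(100.0 - percent_identity(seqs[s1], seqs[s2]))
    return {key: sum(vals) / len(vals) for key, vals in distances.items()}


# Hypothetical toy alignment with two individuals of taxon X and one of taxon Y.
seqs = {"x1": "ACGTACGTAC", "x2": "ACGTACGTAT", "y1": "ACGTACGTTT"}
taxon_of = {"x1": "X", "x2": "X", "y1": "Y"}
result = mean_taxon_distances(seqs, taxon_of)
print(result)
```

Note that when one taxon has far fewer sequenced bases than the others, its distances rest on fewer compared columns and are correspondingly less reliable.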

Results

Final aligned sequence length for 12s 16s equaled 2,685 nucleotide base pairs representing 48 specimens across 10 giant sengi taxa and 3 outgroup taxa. Final aligned sequence length for IRBP equaled 976 base pairs representing 45 specimens across 8 giant sengi taxa. Final aligned sequence length for vWF equaled 1,043 base pairs representing 45 specimens across 8 giant sengi taxa.

Bayesian analysis and maximum likelihood analysis of the mitochondrial locus 12s 16s (figure 3) recovered similar trees with consistent support for nodes. I considered node support as significant if the Bayesian posterior probability (Bpp) was above 0.95 and the maximum likelihood bootstrap (mlb) support was above 90. In my 12s 16s tree (figure 3), there was good phylogenetic resolution at the species level and even the subspecies level.

In the mitochondrial phylogeny (figure 3), R. chrysopygus is sister to all other Rhynchocyon (Bpp=1/mlb=100). This tree shows strong support for the reciprocal monophyly and sister relationship of R. petersi and R. udzungwensis (Bpp=0.99/mlb=91). Within the R. cirnei clade there is strong support for R. c. stuhlmanni as sister to all other R. cirnei lineages (Bpp=1/mlb=94), and strong support for R. c. reichardi as a distant sister to the R. c. macrurus-R. c. shirensis-R. c. cirnei clade (Bpp=1/mlb=99). There is also strong support for the monophyly of the two R. cirnei taxa (R. c. reichardi and R. c. cirnei) that contain five or more individuals sampled. This phylogram also shows strong support for R. c. reichardi from Malawi (MCZ 43732) as sister to R. c. reichardi from Tanzania (FMNH samples) (Bpp=1/mlb=100). This tree shows the two R. c. macrurus specimens cluster together; however, there is weak support for this branch (Bpp=0.76/mlb=70). This analysis shows R. p. adersi clustering within the greater R. p. petersi species complex, and sister to the two specimens from the Houston Zoo (CAS MAM 28767 and CAS MAM 29516) (Bpp=1/mlb=88).
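The significance thresholds stated earlier in the Results (Bpp above 0.95 and mlb above 90) can be applied mechanically to the support values reported in this paragraph. The Python sketch below does so. The node labels are shorthand written for this example, while the numeric values are those given in the text.

```python
BPP_MIN = 0.95  # Bayesian posterior probability threshold from the text
MLB_MIN = 90    # maximum likelihood bootstrap threshold from the text


def is_significant(bpp, mlb):
    """A node counts as significant only if both measures exceed their thresholds."""
    return bpp > BPP_MIN and mlb > MLB_MIN


# (Bpp, mlb) pairs as reported for the 12s 16s tree; labels are shorthand.
nodes = {
    "R. chrysopygus sister to all other Rhynchocyon": (1.0, 100),
    "R. petersi + R. udzungwensis": (0.99, 91),
    "R. c. stuhlmanni sister to other R. cirnei": (1.0, 94),
    "R. c. reichardi sister to macrurus-shirensis-cirnei": (1.0, 99),
    "R. c. reichardi Malawi + Tanzania": (1.0, 100),
    "R. c. macrurus specimens": (0.76, 70),
    "R. p. adersi + Houston Zoo specimens": (1.0, 88),
}

significant = [label for label, (bpp, mlb) in nodes.items() if is_significant(bpp, mlb)]
for label, (bpp, mlb) in nodes.items():
    flag = "significant" if label in significant else "not significant"
    print(f"{label}: Bpp={bpp}, mlb={mlb} -> {flag}")
```

Requiring both measures to pass is a conservative rule: the R. p. adersi and Houston Zoo node has a posterior probability of 1 but falls just short on bootstrap support.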

The genetic distance matrix for the 12s 16s region shows percent differences between 0.1% and 8.8% for Rhynchocyon taxa (table 5). The highest within-Rhynchocyon differences are between R. c. shirensis and other taxa. However, I am skeptical of these values because R. c. shirensis has approximately 1,400 fewer bases than other taxa. Excluding R. c. shirensis, all other Rhynchocyon taxa have mean genetic distances ranging from 0.2% to 3.7%. Rhynchocyon udzungwensis and R. petersi, which are sister to each other, have a mean genetic difference of 1.1%. Rhynchocyon c. stuhlmanni, which is sister to all other R. cirnei, has a mean genetic distance with R. c. reichardi of 2.8%, with R. c. macrurus of 2.5%, with R. c. shirensis of 7.8%, and with R. c. cirnei of 3.1%.

Nuclear loci IRBP and vWF were chosen based on previous work with Macroscelididae (Douady et al. 2003, Dumbacher et al. 2014, Lawson et al. 2013, Smit et al. 2011, Springer et al. 1997). However, these loci exhibited low variation among the samples analyzed and I recovered only four single nucleotide polymorphisms (SNPs) across the 976 IRBP bases and nine SNPs across the 1,043 vWF bases. Nuclear loci IRBP and vWF are shown in individual allelic networks (figure 4).

Within the IRBP allele network, R. petersi and all but one R. cirnei subspecies

sampled cluster together (n=36). The exception to this is R. c. stuhlmanni, which is separated from the rest of the R. cirnei subspecies by three steps. Additionally, R.

chrysopygus and R. udzungwensis share an allele, and this shared allele differs by one

nucleotide change from the R. c. stuhlmanni allele and the unique R. chrysopygus allele.

The vWF locus contains more phylogenetic structure. In the vWF network there

is only one allele that is present in multiple taxa. This allele is the most common allele

overall (n=31), and it is shared by R. p. petersi, the R. petersi Houston zoo specimens

(CAS MAM 28767 and CAS MAM 29516), and R. c. cirnei. Four taxa, R. c. reichardi,

R. c. stuhlmanni, R. chrysopygus, and R. udzungwensis, appear relatively distinct; each taxon has unique alleles that are at least two steps from those of the nearest taxon. Allele networks for both nuclear loci show the distinctness of R. c. stuhlmanni, R. chrysopygus, and R. udzungwensis.

When comparing the analyses of the mitochondrial region and the nuclear loci, it is interesting to note the concordance among the loci for most taxa. Discordance among the analyses comes from the placement of R. c. cirnei and R. p. petersi, which share an allele in the nuclear analyses of IRBP and vWF (figure 4) but fall in distant clades in the mitochondrial analysis of 12s 16s (figure 3). In the mitochondrial analysis R. p. petersi is more closely related to R. udzungwensis than to any other taxon.

I also sequenced tissue from one Rhynchocyon specimen collected in the Boni forest; however, these data are not presented in any of the figures. My analysis shows the Boni specimen is 100% identical to R. chrysopygus for the mitochondrial locus 12s 16s and for the nuclear locus vWF. For the nuclear locus IRBP, the Boni individual was heterozygous, with one allele matching an R. chrysopygus allele, while the other allele was new to my analysis and one change different from the allele shared by R. udzungwensis and R. chrysopygus. Thus, the tissue I sequenced and analyzed from the Boni population is indistinguishable from R. chrysopygus.

Discussion

The phylogeny I present in this study is the first molecular phylogeny of Rhynchocyon to include all four currently recognized species, and all recognized subspecies, except for R. c. hendersoni. With nearly complete taxon sampling, my analysis confirms that earlier taxonomists, using pelage color, pelage patterns, and geographic range, accurately inferred phylogenetic relationships within Rhynchocyon (Corbet and Hanks 1968, Rovero et al. 2008). Below I discuss my findings in relation to previous work on Rhynchocyon and highlight new information about this group.

Clarifying the Taxonomic Status of Ambiguous GenBank Sequences

In 2003, Douady et al. published three Rhynchocyon sp. sequences on GenBank without identifying the species that was sequenced. I sought to determine the species of the Rhynchocyon published by Douady et al. (2003) for three reasons. First, these sequences have been used in other studies (e.g. O'Leary et al. 2013) and continue to be useful in research, including my study of Rhynchocyon phylogenetics. Second, Rhynchocyon tissue has become increasingly difficult to export, and thus these sequences are likely to remain valuable for some time. Third, I wanted to correct the record because other studies have incorrectly claimed that these sequences are from R. chrysopygus (Smit et al. 2011).

The three sequences published on GenBank by Douady et al. (2003) represent the following loci and corresponding accession numbers: 12s 16s (AY310880), IRBP (AY310894), and vWF (AY310887). Because all sequences share an extraction number (CJD-2003), I assume they were amplified from the same extraction, which presumably came from a single tissue sample and therefore from the same specimen. In his dissertation, Douady (2001) lists two tissues as the source of genetic data for his Rhynchocyon sp. (tissue numbers T-1853, T-1854), from the collection of Francois Catzeflis at the Universite Montpellier, France, and provides the collection locality for both tissues as Chingulungulu (Tanzania). Douady (2001) does not report which tissue was sequenced and posted on GenBank, so I must assume that the GenBank sequences could come from either of the tissues listed.

Correspondence with Francois Catzeflis, who provided the tissues to Douady, linked the tissues (T-1853, T-1854) with the original specimen collectors, Herwig Leirs and Walter Verheyen. Correspondence with Herwig Leirs provided a specific collection locality for both tissues (10°44’S, 38°33’E), collection dates for the two specimens (29 July 1987 and 30 July 1987, respectively), and the location of the museum vouchers (unpublished correspondence, F. Catzeflis and H. Leirs). Both specimens reside in the collection at the Royal Museum of Central Africa in Tervuren, Belgium. Specimen number 96.037-M-5388 is associated with tissue T-1853, and specimen number 96.037-M-5390 is associated with tissue T-1854.

Now that the specimen voucher and associated date are linked to two specimens

from the same locality, the Rhynchocyon sp. specimen (Douady et al. 2003) can be

assigned to R. c. macrurus based on four pieces of evidence. First, the collection locality

for the Douady et al. (2003) sequence is well within the R. c. macrurus range, and outside

of the range of other Rhynchocyon taxa. The Chingulungulu collection locality is over

700 kilometers south of the southernmost range of R. chrysopygus. Second, photos of the voucher specimens’ pelage coloration and pattern are consistent with those of inland forms of R. c. macrurus, having orange rufous sides with a yellow wash and two distinct black dorsal stripes that fade into a dark rump. Most noticeably, these specimens clearly lack the maroon body and golden rump characteristic of R. chrysopygus. Third, the museum catalog at the Royal Museum of Central Africa in Tervuren, Belgium identifies the specimens as R. cirnei. No subspecies identification is given in the museum catalog.

Fourth, the placement of the Douady et al. (2003) sequence on my 12s 16s mitochondrial tree shows that the Douady et al. (2003) specimen clusters with R. c. macrurus (specimen FMNH 88204) (Bpp=0.76/mlb=70), and well within the R. cirnei clade (figure 3). The need to identify the species associated with the Rhynchocyon sequence in Douady et al. (2003) reiterates the necessity of identifying specimens and their associated vouchers when posting sequences on GenBank. Additionally, I was unable to sequence nuclear DNA from R. c. macrurus (FMNH 88204); however, because I know the Douady et al. (2003) sequence is R. c. macrurus, I was able to include this sequence as the sole representative of R. c. macrurus in my IRBP and vWF allele networks.

Smit et al. (2011) suggested that the sequences from Douady et al. (2003) are R. chrysopygus based solely on greatest similarity with R. chrysopygus in their mitochondrial 12s 16s tree of R. chrysopygus, R. cirnei, and R. petersi. Given my investigation, which shows that the Douady et al. (2003) sequences are R. c. macrurus, I further studied the Douady et al. (2003) and Smit et al. (2011) sequences to explore the sources of the contradiction in the species identification. Smit et al. (2011) sequenced DNA from three historical study skins and posted the sequences on GenBank: R. chrysopygus (EU136154), R. cirnei (EU13615), and R. petersi (EU136152). After aligning these sequences with mine, derived from fresh tissue of the same species, I found unusual insertions, deletions, and regions of especially high divergence between the Smit et al. (2011) data, the Douady et al. (2003) data, and my data.

I investigated these anomalies by carrying out three procedures. First, I mapped the PCR primers used by Smit et al. (2011) to their data. Because the Rhynchocyon tissues sequenced by Smit et al. (2011) were from historical specimens, multiple primer sets were required to amplify the approximately 2,000 bases of their published 12s 16s mitochondrial region. When I mapped the primers and study skin sequences, I noticed that some of the primers sat on top of each other, meaning the amplicons do not overlap (figure 5). Therefore, it is impossible to generate the whole sequence as reported using their method, and gaps of unknown nucleotides (n’s) should be included where only primers existed in the sequence. If these primer regions are not replaced by n’s, portions of the published sequence reflect the primer sequence and not the target individual’s sequence (Olson and Hassanin, 2003).
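The recommendation to replace primer-derived bases with n’s can be sketched in a few lines. This is an illustration only, not the workflow used in this study (the primer mapping here was done by alignment): the function names reverse_complement and mask_primers, the exact-match search (which ignores degenerate primer bases and mismatches), and the toy sequence and primers are all assumptions.

```python
def reverse_complement(seq):
    """Reverse complement of an unambiguous lowercase/uppercase DNA string."""
    complement = {"a": "t", "c": "g", "g": "c", "t": "a", "n": "n"}
    return "".join(complement[base] for base in reversed(seq.lower()))


def mask_primers(sequence, primers):
    """Replace every exact primer site in `sequence` with n's.

    Each primer is searched as written and as its reverse complement,
    because reverse primers anneal to the opposite strand. Exact string
    matching is a simplifying assumption for this sketch.
    """
    seq = sequence.lower()
    masked = list(seq)
    for primer in primers:
        for site in (primer.lower(), reverse_complement(primer)):
            start = seq.find(site)
            while start != -1:
                masked[start:start + len(site)] = "n" * len(site)
                start = seq.find(site, start + 1)
    return "".join(masked)


# Hypothetical amplicon: forward primer site + target bases + reverse primer site.
amplicon = "acgtgc" + "atttag" + "cgatca"
toy_primers = ["acgtgc", "tgatcg"]  # second primer is the reverse primer, 5' to 3'
masked_amplicon = mask_primers(amplicon, toy_primers)  # "nnnnnnatttagnnnnnn"
```

After masking, only the bases between primer sites remain as evidence about the target individual; where two primer sites abut with no overlap from a neighboring amplicon, the masked stretch stays as n’s in the assembled sequence.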

Second, by aligning the primers I was able to identify each independently amplified PCR product. I BLASTed each amplicon against non-redundant sequence databases on the National Center for Biotechnology Information (NCBI) website and noticed that some of the amplicons for the R. petersi and R. cirnei sequences returned the closest matching identity to Homo sapiens. To further test for contamination, I aligned the Smit et al. (2011) sequences, the Douady et al. (2003) sequence, a Homo sapiens 12s 16s sequence (GenBank accession number KM986533), and the primers that Smit et al. (2011) used to amplify the 12s 16s locus. I noted the amplicons that matched Homo sapiens from the sequences published by Smit et al. (2011). I assigned these amplicons arbitrary names (regions A, B, and C) and used these regions to create distance matrices in Geneious v7.1.4 (table 4). Table 4 shows that at region A, the Smit et al. (2011) R. petersi (EU136153) sequence is 6% different from the banked Homo sapiens sequence, and is 22% different from Douady et al.’s (2003) Rhynchocyon sequence (AY310880), 20% different from Smit et al.’s (2011) R. chrysopygus sequence (EU136153), and 21% different from Smit et al.’s (2011) R. cirnei sequence (EU136154), meaning that at region A, the Smit et al. (2011) R. petersi sequence is more similar to the human sequence than to the other Rhynchocyon sequences. I found a similar pattern for R. petersi (EU136153) at region B (percent differences equal 4%, 30%, 30%, and 33%, respectively) and at region C (percent differences equal 6%, 22%, 21%, and 14%, respectively).

Third, during my analysis I noticed that two of the Smit et al. (2011) sequences (EU136154 and EU13615) contained multiple ambiguity codes. Because there is only one copy of mitochondrial genes, ambiguity codes should not be present or should be very rare artifacts of PCR or somatic mutation. The presence of ambiguity codes might alternatively signal poor quality sequence or multiple sequences due to contamination. Rhynchocyon chrysopygus (EU136152) had two ambiguous bases and R. cirnei (EU136154) had 28 ambiguous bases. I compared these ambiguous bases with the Homo sapiens sequence and the Douady et al. (2003) sequence and noticed the ambiguous bases occurred where human sequences differed from Rhynchocyon. The ambiguous bases matched the nucleotides present in both sequences (figure 6). For example, if a base was coded as Y (meaning either C or T) in the Smit et al. (2011) GenBank sequence, the T matched Douady et al.’s (2003) Rhynchocyon sequence and the C matched the Homo sapiens sequence. This indicates sequence from Rhynchocyon and an underlying (likely human) contaminant in the same read.
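The ambiguity-code comparison described above can be expressed as a small check over three aligned sequences. This is a sketch under stated assumptions, not the procedure used to produce figure 6: the function name mixed_signal_sites and the toy sequences are hypothetical, and the only external fact relied on is the standard IUPAC nucleotide code table. A site is flagged when the query carries an ambiguity code, the two reference sequences differ at that site, and the code covers both reference bases, which is the pattern expected if two templates were read together.

```python
# Standard IUPAC nucleotide ambiguity codes and the bases each one covers.
IUPAC = {
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def mixed_signal_sites(query, ref_a, ref_b):
    """List sites where an ambiguity code in `query` spans both references.

    All three sequences must come from the same alignment. Returns tuples
    of (position, code, ref_a_base, ref_b_base) with 0-based positions.
    """
    if not (len(query) == len(ref_a) == len(ref_b)):
        raise ValueError("sequences must be aligned to the same length")
    hits = []
    triples = zip(query.upper(), ref_a.upper(), ref_b.upper())
    for pos, (q, a, b) in enumerate(triples):
        if q in IUPAC and a != b and a in IUPAC[q] and b in IUPAC[q]:
            hits.append((pos, q, a, b))
    return hits


# Hypothetical toy alignment (not real data).
toy_query = "ACYGTRAN"   # published sequence with ambiguity codes
toy_sengi = "ACTGTAAC"   # stands in for a Rhynchocyon reference
toy_human = "ACCGTGAC"   # stands in for a Homo sapiens reference
toy_hits = mixed_signal_sites(toy_query, toy_sengi, toy_human)
# [(2, 'Y', 'T', 'C'), (5, 'R', 'A', 'G')]
```

The final N in the toy query is not flagged because the two references agree there, so it carries no information about a mixture of the two templates.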

Based on these three pieces of evidence (primer overlap, distance matrices suggesting chimeric sequences, and numerous ambiguity codes in mitochondrial sequences), I conclude that the Smit et al. (2011) Rhynchocyon sequences are likely chimeric sequences that contain both sengi and non-sengi DNA regions. Until tissue from the specimens used by Smit et al. (2011) can be amplified and sequenced in an independent laboratory, these sequences and the resulting phylogeny should be regarded with skepticism; I therefore did not include the Smit et al. (2011) sequences in my final analyses.

Origins of Captive Populations

Captive populations of organisms have great value for educating the public about the conservation of a species and its habitat. In addition, captive animals provide an ex situ conservation population that could be used to repopulate a wild population if it were to decline or become extinct. Zoos and aquariums keep studbooks, which are records of the origins, genealogical history, and fates of each captive individual. Studbook information allows zoo curators to minimize inbreeding, enhance the genetic diversity of the population, and ensure that taxa do not hybridize.

While it is known who imported the captive population of R. petersi into the

United States, unfortunately the collection locality is unknown (unpublished correspondence, K. Lengel, P. Riger, and S. Eller). Based on pelage coloration, the captive giant sengi population is R. petersi, but it is not known which subspecies was imported, R. p. petersi from mainland Tanzania and southeastern Kenya, or R. p. adersi from the islands of Mafia and Zanzibar off the coast of Tanzania. If the captive sengis become important to a conservation plan that involves reintroductions, it would be necessary to know where the captives originated. Moreover, if additional R. petersi were brought into captivity, it would be important to know their subspecies designation to prevent hybridization.

In an attempt to determine the taxonomic designation of the captive population, I analyzed the DNA of two specimens from the Houston Zoo that died and were prepared as study skins at CAS (CAS MAM 28767 and CAS MAM 29516). I found that the two specimens cluster with R. p. adersi on the 12s 16s mitochondrial phylogram (figure 3). I am unable to confirm the clustering of the zoo specimens and R. p. adersi at the nuclear loci because I was unable to sequence nuclear DNA from R. p. adersi. However, analysis of the 12s 16s mitochondrial locus suggests that the zoo specimens were exported from Zanzibar Island or Mafia Island. Since I have only one R. p. adersi sample, and this sample falls within the R. p. petersi clade, I regard these results as preliminary.

Species Diagnosis

The biological species concept defines species as populations of organisms that actually or potentially interbreed (Mayr 1942). Kingdon (1974) suggested Rhynchocyon was one species with some populations hybridizing. While this idea is not accepted by sengi biologists, the question remained: could Rhynchocyon be one large species complex?

To further complicate matters, Lawson et al. (2013) concluded that historical introgression occurred between R. c. reichardi and R. udzungwensis where the distributions of the two taxa meet in the Udzungwa Mountains of Tanzania, calling into question the genetic boundaries of these two taxa. However, I found no evidence of introgression between these two taxa (R. c. reichardi and R. udzungwensis) or between any other Rhynchocyon taxa. The differences between our two studies are likely explained by the differences in geographical sampling. Lawson et al. (2013) sampled extensively across a narrow range, targeting the contact zone of R. c. reichardi and R. udzungwensis. I sampled shallowly across a broad range, away from contact zones. Therefore, if introgression occurred at contact zones, I am less likely to have detected it. Moreover, even though historical introgression occurred between R. c. reichardi and R. udzungwensis (Lawson et al. 2013), my data suggest that widespread gene flow and panmixia did not occur in Rhynchocyon.

Furthermore, my 12s 16s phylogeny shows that R. c. reichardi and R. udzungwensis are not sister taxa, with R. udzungwensis being more closely related to R. petersi than to the R. cirnei clade (figure 7). Additionally, my analyses of nuclear loci show no evidence of R. udzungwensis sharing alleles with either R. c. reichardi or R. petersi (figure 4). However, hybrids have frequently been observed between non-sister species (Scribner et al. 2001). Hybridization of non-sister taxa has been documented in Heliconius butterflies (Dasmahapatra et al. 2007), bats (Larsen et al. 2010), chipmunks (Good et al. 2003) and birds (McKitrick and Zink 1988). Often these hybrids occur when non-sister species have overlapping ranges or historical contact zones, such as the contact zone between R. udzungwensis and R. c. reichardi. Moreover, these hybrids may not be morphologically distinguishable from the parent species, especially if the introgression occurred historically. Lawson et al. (2013) noted that none of their specimens were phenotypically hybrids, though genetic analysis revealed evidence of hybridization. Because hybridization is not widespread, future studies looking for evidence of Rhynchocyon introgression should sample heavily in areas where historical contact between species may have occurred.

Current Taxonomic Status of Rhynchocyon taxa

Rhynchocyon cirnei hendersoni is the only Rhynchocyon taxon not represented in this study. For this study I requested tissues from six specimens cataloged as R. c. hendersoni (AMNH 81331, AMNH 81332, MCZ 43731, MCZ 43732, MCZ 43734, and MCZ 43737) and attempted to amplify and sequence DNA from all six samples. Unfortunately, I was able to successfully sequence DNA from only one sample, MCZ 43732. However, I conclude that MCZ 43732 is R. c. reichardi, and not R. c. hendersoni, based on two pieces of evidence. First, the collection locality for MCZ 43732 is well outside the R. c. hendersoni subspecies range and inside the range of R. c. reichardi. Both Ansell (1964) and Corbet and Hanks (1968) state that R. c. hendersoni is known only from its type locality near Livingstonia, Malawi. The collection locality of MCZ 43732, on the Vipya Plateau, is approximately 200 kilometers south of the type locality (figure 8). Moreover, Ansell (1964) writes that Lawrence and Loveridge, who collected MCZ 43732, misidentified specimens from the Vipya Plateau as R. c. hendersoni. Second, the dorsal coloration of MCZ 43732 is inconsistent with the original description of R. c. hendersoni and consistent with the description of R. c. reichardi (appendix 1, figure I). The original description for R. c. hendersoni describes the type specimen with a dark head with grizzled blackish fur, dark ears, and black forefeet and hindfeet (Thomas 1902). Corbet and Hanks (1968) note that the overall tone of R. c. hendersoni is very dark (appendix 1, figure I). The specimen in question has light brown fur on its head with flecks of black, light brown ears, and brown forefeet and hindfeet. The overall coloration of the specimen in question is light, with an undertone of yellow coming through. Based on these two pieces of evidence, the collection locality for the specimen and the dorsal coloration, I conclude that MCZ 43732 is R. c. reichardi. Thus, despite my efforts, my phylogeny does not include R. c. hendersoni, and this enigmatic taxon requires additional attention.

In their revision of Macroscelididae, Corbet and Hanks (1968) describe a new subspecies, R. c. shirensis, based on a grizzled black and cream dorsal pelage, grey-based contour hairs, dorsal spots that were darker than those of R. c. cirnei, and other pelage and dental characters (see Corbet and Hanks 1968 for summary and appendix 1, figure E). Corbet and Hanks (1968) define the range of R. c. shirensis as southern Malawi (figure 1). I have only one sample of R. c. shirensis in my mitochondrial analysis and no samples in my nuclear analyses; however, in the 12s 16s phylogeny R. c. shirensis is within the R. cirnei clade (figure 3) and does not cause any taxa to be paraphyletic. Therefore, I recommend continuing to treat R. c. shirensis as a subspecies of R. cirnei.

Along with the six R. cirnei subspecies, Corbet and Hanks (1968) proposed a potential seventh subspecies based on a single specimen collected in northeastern Mozambique. Corbet and Hanks (1968) remarked that this specimen might be an intermediate between R. c. cirnei and R. c. macrurus based on tail coloration. To investigate this potential subspecies, Coals and Rathbun (2013) collected eight Rhynchocyon specimens from northeastern Mozambique. In their study, Coals and Rathbun (2013) compared their specimens with two R. c. cirnei topotypes and found that there was variation in pelage coloration throughout the range. Coals and Rathbun (2013) concluded that the giant sengis in northeastern Mozambique are R. c. cirnei and further speculated that with genetic work R. c. cirnei and R. c. shirensis would be found to belong to the same taxon. Because I do not have tissue from R. c. cirnei topotypes, and I have only one R. c. shirensis specimen in my analysis, I am unable to assess the genetic relationship of R. c. cirnei from the type locality and the specimens from northeastern Mozambique, or further speculate on the genetic relationship of R. c. cirnei and R. c. shirensis.

In 2010, Andanje et al. suggested a potentially new species of Rhynchocyon from the Dodori and Boni National Reserves on the northern coast of Kenya. A voucher specimen was collected and placed at the National Museums of Kenya (NMK169427), and tissue from this voucher was sent to CAS by the Kenya Wildlife Service. The sequences that I obtained from the tissue were identical to R. chrysopygus at 12s 16s, vWF, and one of two alleles at IRBP. These data suggest that the tissue I sequenced is genetically very similar to, or perhaps a form of, R. chrysopygus. This is surprising given the very different pelage color and patterns between these two allopatric forms (Coals & Rathbun, 2013). Moreover, I have determined that dorsal pelage patterning and coloration are valid phylogenetic markers for all other Rhynchocyon taxa. Because my results are based upon a single tissue specimen collected by others, I am reluctant to draw any conclusions regarding this specimen and the sequences without examining the voucher and knowing more about how the tissue was collected. More data should be collected and analyzed before any conclusions can be made about the taxonomic status of this morphologically unique giant sengi.

It has been proposed that R. c. stuhlmanni could be elevated to full species based on the short nasal bones, all-white tail, and isolated range in the Congo Basin (Corbet and Hanks 1968, Corbet 1970). My molecular data also suggest that R. c. stuhlmanni could be elevated to full species. The 12s 16s phylogeny (figure 3) shows strong support for R. c. stuhlmanni as a distinct lineage that is sister to all other R. cirnei subspecies. The mean distance matrix for 12s 16s (table 5) shows R. c. stuhlmanni as at least 2% divergent from other R. cirnei, while the remaining R. cirnei subspecies show among-subspecies divergences between 1% and 1.6%.

Moreover, the nuclear allele networks (figure 4) show additional support for the uniqueness of R. c. stuhlmanni and support elevating it to full species. In both the IRBP and vWF allele networks, R. c. stuhlmanni has a unique allele that is not shared by any other taxa. Furthermore, the R. c. stuhlmanni allele in the IRBP network is three steps away from the other R. cirnei subspecies, and closer to an allele shared by R. chrysopygus and R. udzungwensis. Thus R. c. stuhlmanni is morphologically, geographically, and genetically distinct from the other R. cirnei.

Conclusions

The phylogeny and allele networks shown here present a comprehensive evaluation of the systematic relationships of the genus Rhynchocyon. Here I conclude that the unidentified Rhynchocyon sequences from Douady et al. (2003) on GenBank are from R. c. macrurus and that Smit et al.’s (2011) Rhynchocyon sequences are contaminated (likely with human DNA). I speculate that the North American captive Rhynchocyon population is R. p. adersi. Moreover, my analysis shows that hybridization is not widespread in Rhynchocyon. Furthermore, I conclude that the tissue sample that I received from the Boni sengi is genetically very similar to R. chrysopygus. Based on genetic, morphological, and geographic evidence, I recommend provisionally treating R. c. stuhlmanni as a distinct species (R. stuhlmanni); however, a broader study of the R. stuhlmanni population is recommended. Rhynchocyon stuhlmanni shows a dorsal pelage color cline, similar to R. c. macrurus, and the sequences in my study come from two specimens collected 42 kilometers apart and are not representative of the entire cline. A study that spans the entire geographic range of the species, sampling dark, medium, and light individuals, would test whether my sequences are consistent with the entire R. stuhlmanni population.

Although the Corbet and Hanks (1968) taxonomy has been used for many years, this is the first time that I can conclude with high confidence that the taxonomy accurately reflects the evolutionary history of Rhynchocyon. Moreover, this study confirms that the Rhynchocyon dorsal pelage color and dorsal pelage patterns, in conjunction with geographic range, have been useful for resolving taxonomic and phylogenetic relationships.

Based on my genetic analysis I recommend the following taxonomic treatment for giant sengis:

Class: Mammalia Linnaeus, 1758

Superorder: Afrotheria Stanhope et al., 1998

Order: Macroscelidea Butler, 1956

Family: Macroscelididae Bonaparte, 1838

Subfamily: Rhynchocyoninae

Genus: Rhynchocyon Peters, 1847

Rhynchocyon cirnei Peters, 1847

Rhynchocyon cirnei cirnei Peters, 1847

Rhynchocyon cirnei shirensis Corbet & Hanks, 1968

Rhynchocyon cirnei reichardi Reichenow, 1886

Rhynchocyon cirnei hendersoni Thomas, 1902

Rhynchocyon cirnei macrurus Gunther, 1881

Rhynchocyon stuhlmanni Matschie, 1893

Rhynchocyon petersi Bocage, 1880

Rhynchocyon petersi petersi Bocage, 1880

Rhynchocyon petersi adersi Dollman, 1912

Rhynchocyon chrysopygus Gunther, 1881

Rhynchocyon udzungwensis Rathbun & Rovero, 2008

Table 1: Data for specimens used for DNA sequencing. Museum numbers for vouchered specimens and field numbers for unvouchered specimens are listed, and an asterisk (*) denotes field number. Museum codes are as follows: AMNH=American Museum of Natural History, BMNH=Natural History Museum London, CAS MAM=California Academy of Sciences, FMNH=Field Museum of Natural History, MCZ=Museum of Comparative Zoology, MTSN=Museo Tridentino di Scienze Naturali, and RMCA=Royal Museum of Central Africa. § denotes sequence was downloaded from GenBank. † denotes sequence is from historical DNA.

taxon | voucher/field specimen number | GenBank accession number (12s16s) | GenBank accession number (IRBP) | GenBank accession number (vWF) | collection locality
E. edwardii | unknown | AY310885§ | AY310899§ | AY310892§ | South Africa
M. micus | CAS MAM 27997 | KF895104§ | KF742665§ | KF742645§ | Khorixas District, Kunene Region, Namibia; -20.7266, 14.1283
P. tetradactylus | unknown | AY310883§ | AY310897§ | AY310890§ | Chingulungulu, Tanzania
R. chrysopygus | CAS MAM 24525 | KT348460† | KT348366† & KT358508† | KT358505† & KT358506† | Gedi National Monument, Kilifi District, Kenya; -3.3097, 40.0182
R. chrysopygus | CAS MAM 24526 | KT348461† | none | none | Gedi National Monument, Kilifi District, Kenya; -3.3097, 40.0182
R. chrysopygus | FMNH 153106 | KT348462† | none | none | Mombasa, Kilifi District, Kenya; -4.05, 39.6667
R. c. cirnei | CAS MAM 29344 | KT348463 | KT348372 | KT348411 | Mareja Reserve, Mozambique; -12.8436, 40.1617
R. c. cirnei | CAS MAM 29345 | KT348466 | KT348405 | KT348412 | Mareja Reserve, Mozambique; -12.8483, 40.1649
R. c. cirnei | CAS MAM 29351 | KT348470 | KT348406 & KT348407 | KT348413 | Mareja Reserve, Mozambique; -12.8440, 40.1609
R. c. cirnei | CAS MAM 29352 | KT348468 | KT348375 | KT348414 | Mareja Reserve, Mozambique; -12.8420, 40.1637
R. c. cirnei | CAS MAM 29353 | KT348469 | KT348376 | KT348415 | Mareja Reserve, Mozambique; -12.8452, 40.1615
R. c. cirnei | CAS MAM 29355 | KT348464 | KT348377 | KT348423 | Mareja Reserve, Mozambique; -12.8420, 40.1637
R. c. cirnei | CAS MAM 29357 | KT348465 | KT348378 | KT348416 | Mareja Reserve, Mozambique; -12.8429, 40.1623
R. c. cirnei | CAS MAM 29358 | KT348467 | KT348379 | KT348417 | Mareja Reserve, Mozambique; -12.8450, 40.1614
R. c. macrurus | RMCA 96.037-M-5388 or RMCA 96.037-M-5390 | AY310880§ | AY310894§ | AY310887§ | Chingulungulu region, Tanzania; -10.44, 38.33
R. c. macrurus | FMNH 88204 | KT348471† | none | none | Mihuru, Newala District, Mtwara Region, Tanzania; -10.6667, 39.5
R. c. reichardi | FMNH 171474 | KT348474 | KT348400 | KT348452 | Mbizi Mts, Mbizi Forest Reserve, vicinity of Mazumba Hill, Sumbawanga District, Rukwa Region, Tanzania
R. c. reichardi | FMNH 171617 | KT348475 | KT348380 | KT348448 & KT348451 | Mbizi Mts, vicinity of Mazumba, Sumbawanga District, Rukwa Region, Tanzania
R. c. reichardi | FMNH 177823 | KT348476 | none | KT348449 | Mahale Mts, Mahale National Park, 0.5 km NW Nkungwe Hill summit, Kigoma District, Kigoma Region, Tanzania; -6.1043, 29.7790
R. c. reichardi | FMNH 178010 | KT348477 | KT348381 | KT348450 | Mahale Mts, Mahale National Park, 0.5 km NW Nkungwe Hill summit, Kigoma District, Kigoma Region, Tanzania; -6.1043, 29.7790
R. c. reichardi (labeled R. c. hendersoni) | MCZ 43732 | KT348473† | KT348404† | KT348447† | Vipya Plateau, Malawi
R. c. shirensis | AMNH 161777 | KT348472† | none | none | Mlanje Plateau, Malawi
R. c. stuhlmanni | M300* | KT348478 | KT348409 | KT348453 | Democratic Republic of the Congo; 0.0131, 25.5565
R. c. stuhlmanni | MK001* | none | KT348409 | KT348454 | Democratic Republic of the Congo; 0.2946, 25.2917
R. petersi spp. | CAS MAM 28767 | KT348479 | KT348382 | KT348424 | Houston Zoo, Houston, Texas, United States of America
R. petersi spp. | CAS MAM 29516 | KT348480 | KT348383 | KT348425 | Houston Zoo, Houston, Texas, United States of America
R. p. adersi | MCZ 22829 | KT348481† | none | none | Nyanga Id., Zanzibar, Tanzania
R. p. petersi | FMNH 151213 | KT348482 | KT348384 | KT348418 | South Pare Mts, Chome Forest Reserve, 5.5 km S Bombo, near Kanza Village, Kilimanjaro Region, Tanzania; -4.32, 38
R. p. petersi | FMNH 151214 | KT348483 | KT348401 | KT348419 | South Pare Mts, Chome Forest Reserve, 7 km S Bombo, Kilimanjaro Region, Tanzania; -4.33, 38
R. p. petersi | FMNH 161311 | KT348485 | KT348385 | KT348420 | Nguru Mts, Manyangu Forest Reserve, near Disango, Morogoro District, Morogoro Region, Tanzania; -6.04, 37.5467
R. p. petersi | FMNH 161312 | KT348486 | KT348386 | KT348427 | Nguru Mts, Manyangu Forest Reserve, near Disango, Morogoro District, Morogoro Region, Tanzania; -6.04, 37.5467
R. p. petersi | FMNH 192684 | KT348484 | KT348402 | KT348422 | North Pare Mts, Minja Forest Reserve, Mwanga District, Kilimanjaro Region, Tanzania; -3.5815, 37.6773
R. p. petersi | FMNH 161395 | KT348501 | KT348373 & KT358507 | KT348421 | Nguru Mts, Manyangu Forest Reserve, near Disango, Morogoro District, Morogoro Region, Tanzania; -6.04, 37.5467
R. p. petersi | RP15* | none | KT348387 | KT348428 | Zaraninge Forest, Tanzania; -6.1367, 38.6055
R. p. petersi | TA1812* | none | KT348374 | none | Zaraninge Forest, Tanzania; -6.1055, 38.6158
R. p. petersi | TA1818* | KT348494 | KT348408 | KT348436 | Askari Forest, Tanzania; -5.9955, 38.7607
R. p. petersi | TA1833* | KT348487 | KT348388 | KT348429 | Zaraninge Forest, Tanzania; -6.1056, 38.6167
R. p. petersi | TA1835* | KT348495 | KT348389 | KT348430 | Zaraninge Forest, Tanzania; -6.1126, 38.6211
R. p. petersi | TZ22766* | KT348488 | KT348390 | KT348431 | Gendagenda Forest, Tanzania; -5.5759, 38.6423
R. p. petersi | TZ22767* | KT348499 | KT348391 | KT348437 & KT348442 | Gendagenda Forest, Tanzania; -5.5639, 38.6502
R. p. petersi | TZ22769* | KT348498 | KT348392 | KT348426 & KT348443 | Gendagenda Forest, Tanzania; -5.5871, 38.6395
R. p. petersi | TZ22770* | KT348500 | KT348393 | KT348432 | Gendagenda Forest, Tanzania; -5.5871, 38.6404
R. p. petersi | TZ22774* | KT348491 | KT348394 | KT348433 | Kwamsisi Forest, Tanzania; -5.8909, 38.5928
R. p. petersi | TZ22775* | KT348489 | KT348403 | KT348434 | Kwamsisi Forest, Tanzania; -5.8921, 38.5938
R. p. petersi | TZ22776* | KT348492 | KT348395 | KT348438 & KT348444 | Kwamsisi Forest, Tanzania; -5.8921, 38.5939
R. p. petersi | TZ22778* | KT348493 | KT348396 | KT348439 | Kwamsisi Forest, Tanzania; -5.8938, 38.5949
R. p. petersi | TZ22779* | KT348496 | KT348397 | KT348440 & KT348445 | Kwamsisi Forest, Tanzania; -5.8937, 38.5944
R. p. petersi | TZ22783* | KT348490 | KT348398 | KT348435 | Gendagenda Forest, Tanzania; -5.601, 38.6468
R. p. petersi | TZ22811* | KT348497 | KT348399 | KT348441 & KT348446 | Kwamsisi Forest, Tanzania; -5.8723, 38.5726
R. udzungwensis | CAS MAM 28043 | KT348503 | KT348368 | KT348455 | Udzungwa Mountains, Ndundulu Forest, Tanzania; -7.8045, 36.5059
R. udzungwensis | CAS MAM 28318 | KT348504 | KT348369 | KT348456 | Udzungwa Mountains, Ndundulu Forest, Tanzania; -7.7944, 36.4919
R. udzungwensis | FMNH 194127 | KT348506 | KT348370 | KT348457 | Udzungwa Mountains, Ndundulu Forest, Tanzania; -7.8045, 36.5059
R. udzungwensis | MTSN 6000 | KT348505 | KT348371 | KT348458 | Udzungwa Mountains, Ndundulu Forest, Tanzania; -7.8036, 36.5059
R. udzungwensis | BMNH 2007.7 | KT000011 | KT000020 | KF202173 | Udzungwa Mountains, Ndundulu Forest, Tanzania; -7.8045, 36.5059

Table 2: Primers used for DNA amplification and sequencing. * denotes primer was used for original PCR amplification; all primers were used for sequencing. All primers are read in the 5' to 3' direction.

locus | primer name | primer sequence | reference
12s16s | rRNA-aF* | aaagcaaarcactgaaaatgcytagatg | Douady 2001
12s16s | rRNA-aR | caaactgggattagataccccactat | Douady 2001
12s16s | rRNA-bF | catctggcctacacccagaag | Douady 2001
12s16s | rRNA-bR | gcagccatcaattaagaaagcgttaaag | Douady 2001
12s16s | rRNA-cF | gacgagaagaccctatggagc | Douady 2001
12s16s | rRNA-cR | cgattatgcaacaggctcctctag | Douady 2001
12s16s | rRNA-dF | gaatctttcatctttcccttacggtac | Douady 2001
12s16s | rRNA-dR | gtgggcatccgttctgatataagct | Douady 2001
12s16s | rRNA-eF | ctccgaggtcaccccaacc | Douady 2001
12s16s | rRNA-eR* | tgttaaggagaggatttgaacctctg | Douady 2001
12s16s | 12s16s_285F* | ayttcgtgccagccacc | this work
12s16s | 12s16s_316R* | tgtycgtatgaccgcgg | this work
12s16s | 12s16s_665F* | ccgccatcttcagcaaa | this work
12s16s | 12s16s_768R* | agcccattagtttccatca | this work
12s16s | 12s16s_892F* | ccgtcaccctcctcaa | this work
12s16s | 12s16s_964R* | cgacttgtctcctcttgtg | this work
12s16s | 12s16s_1128F* | catttayactataaagtataggag | this work
12s16s | 12s16s_1322R* | ctcgtctggtttcgggg | this work
IRBP | IRBP445* | aaccttacacaggaggaactgct | Douady et al. 2003
IRBP | IRBP913 | gccctggacctccagaagctgaggatagg | Douady et al. 2003
IRBP | IRBP1451* | agggcttgctctgctggag | Douady et al. 2003
IRBP | IRBP_76F* | gcgcaggtatcccaca | this work
IRBP | IRBP_220R* | agaaaattctcctaagcc | this work
IRBP | IRBP_175F* | ccccagctgttcattgg | this work
IRBP | IRBP_343R* | cctccttgccaacatgg | this work
IRBP | IRBP_317F* | gcagagaaatccatgttgg | this work
IRBP | IRBP_557R* | ctgtgtccaggtcattgg | this work
IRBP | IRBP_520F* | gcttcctccacccaga | this work
IRBP | IRBP_769R* | ctggaccttgcctcagg | this work
IRBP | IRBP_726F* | tgtcagcactgtatctct | this work
IRBP | IRBP_973R* | ggtactaagccagctgg | this work
IRBP | IRBP_878F* | cctgtgcagtgccg | this work
IRBP | IRBP_1,062R* | ctgcttagtgaactgca | this work
vWF | vWF-A* | ctgtgatggtgtcaacctcacctgtgaagcctg | Douady 2001
vWF | vWF-A2* | agcaagctgctggacctggtcttcctgctgga | Douady et al. 2003
vWF | vWF-B1 | tcgggggagcgtctcaaagtcctggatga | Douady 2001
vWF | vWF-B2* | gcagggtttcctgtgaccatgtagaccag | Douady et al. 2003
vWF | vWF-D2 | gtgatcccggtgggcat | Douady et al. 2003
vWF | vWF-G2 | aaaggctttgttctcaggggcctgcttctc | Douady et al. 2003
vWF | vWF-118F* | tctcagaagcggatccg | this work
vWF | vWF-151R* | attccaccaaggccaca | this work
vWF | vWF-223F* | ggccaggtgaagtatgc | this work
vWF | vWF-259R* | tggaagccacattgctg | this work
vWF | vWF-302F* | acatagaccgcccagag | this work
vWF | vWF-331R* | caatacgggaggcctct | this work
vWF | vWF-461R* | tgcttgaggttggcatg | this work
vWF | vWF-524F* | agctggagcagagaagg | this work
vWF | vWF-593R* | gtagggggagatggctc | this work
vWF | vWF-644F* | ttacgttcccagcacct | this work
vWF | vWF-674R* | atggagctccgtgtagg | this work
vWF | vWF-796F* | catgtcactgtgctgca | this work
vWF | vWF-855R* | tgcctcactgaaggtgt | this work
vWF | vWF-865F* | gctgatatcctgcagca | this work
vWF | vWF-909R* | gccaccctgatactgga | this work
vWF | vWF-927F* | gctggccctacagtaca | this work
vWF | vWF-976R* | ctccctggctggtagag | this work
vWF | vWF-1067R* | accacctggatgtctcc | this work
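Several primers in Table 2 contain IUPAC ambiguity codes (for example r and y), and all are written 5' to 3'. The following Python sketch is an illustration only, not part of the thesis workflow; the function names are hypothetical. It shows how a degenerate primer can be expanded into its concrete variants and how a reverse primer can be reverse-complemented so that it reads on the same strand as the target alignment.

```python
from itertools import product

# IUPAC nucleotide codes mapped to the bases they represent (lowercase, as in Table 2).
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "ag", "y": "ct", "s": "cg", "w": "at",
    "k": "gt", "m": "ac", "b": "cgt", "d": "agt",
    "h": "act", "v": "acg", "n": "acgt",
}

# Complement of every IUPAC code (e.g. r = a/g pairs with y = t/c).
COMPLEMENT = str.maketrans("acgtrykmswbdhvn", "tgcayrmkswvhdbn")


def reverse_complement(primer):
    """Return the reverse complement of a primer, preserving ambiguity codes."""
    return primer.lower().translate(COMPLEMENT)[::-1]


def expand_degenerate(primer):
    """List every unambiguous sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.lower()]
    return ["".join(combo) for combo in product(*choices)]


# The primer sequence "ayttcgtgccagccacc" from Table 2 has one degenerate site (y = c or t).
variants = expand_degenerate("ayttcgtgccagccacc")
```

Expanding a primer this way makes it easier to check each variant against an alignment; the number of variants doubles with every two-fold degenerate site.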

Table 3. Best fit models for loci sequenced. Models were determined using MrModeltest v2.3 (Nylander, 2004) based on Akaike information criterion.

locus | total number of bases amplified | best fitting substitution model
12s16s | 2685 | GTR+I+Γ (12s); HKY+Γ (tRNA valine); GTR+I+Γ (16s); HKY (tRNA leucine)
IRBP | 976 | GTR (position 1); GTR+I (position 2); GTR+Γ (position 3)
vWF | 1043 | HKY+I (position 1); HKY+I (position 2); GTR (position 3)
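For readers unfamiliar with the criterion used in Table 3, the Akaike information criterion is AIC = 2k - 2 ln L, where k is the number of free parameters and ln L is the maximized log-likelihood; the model with the lowest AIC is preferred. The Python sketch below illustrates the calculation only. The log-likelihood values are invented for the example and are not results from this study, and the parameter counts cover the substitution model alone (branch lengths are excluded, since on a fixed tree they add the same constant to every model).

```python
# Free substitution-model parameters: HKY has kappa plus 3 base frequencies,
# GTR has 5 exchange rates plus 3 base frequencies; +I and +G each add one.
FREE_PARAMS = {
    "HKY": 4, "HKY+I": 5, "HKY+G": 5,
    "GTR": 8, "GTR+I": 9, "GTR+G": 9, "GTR+I+G": 10,
}


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def rank_models(log_likelihoods):
    """Return (model, AIC) pairs sorted from best (lowest AIC) to worst."""
    scores = {m: aic(lnl, FREE_PARAMS[m]) for m, lnl in log_likelihoods.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical log-likelihoods, for illustration only.
example_lnl = {"HKY": -5230.0, "GTR": -5221.5, "GTR+I+G": -5209.0}
ranking = rank_models(example_lnl)
```

In this made-up example the extra parameters of GTR+I+G are justified because the gain in log-likelihood outweighs the penalty of 2 per parameter.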

Table 4. Distance matrix for comparing Smit et al. (2011) sequences. Matrices show percent difference between 12s16s mitochondrial sequences of Homo sapiens, Rhynchocyon sp. (Douady et al. 2003), and R. chrysopygus, R. petersi, R. cirnei (Smit et al. 2011). Percent difference was calculated in Geneious v7.1.4 (Kearse et al. 2012) by subtracting the percent identity from 100.

 | Homo sapiens | Rhynchocyon sp. | R. chrysopygus | R. petersi | R. cirnei
Homo sapiens (KM986533) | - | 34 | 34 | 4 | 37
Rhynchocyon sp. (AY310880) | 24 | - | 0 | 30 | 32
R. chrysopygus (EU136152) | 24 | 3 | - | 30 | 32
R. petersi (EU136153) | 6 | 22 | 20 | - | 33
R. cirnei (EU136154) | 22 | 24 | 22 | 21 | -

region A (lower left) and region B (upper right)

 | Homo sapiens | Rhynchocyon sp. | R. chrysopygus | R. petersi
Rhynchocyon sp. (AY310880) | 23 | | |
R. chrysopygus (EU136152) | 24 | 3 | |
R. petersi (EU136153) | 5 | 22 | 21 |
R. cirnei (EU136154) | 17 | 21 | 20 | 14

region C

Table 5. Distance matrix for 12s16s mitochondrial sequences. Table shows uncorrected percent differences (p-distance). Range is shown in the lower left triangle and mean distance is shown in the upper right triangle (in parentheses). P-distances were calculated in Geneious v7.1.4 (Kearse et al. 2012) by subtracting the percent identity from 100.

Columns are numbered in the same order as the rows: 1 E. edwardii, 2 M. micus, 3 P. tetradactylus, 4 R. chrysopygus, 5 R. c. stuhlmanni, 6 R. c. reichardi, 7 R. c. macrurus, 8 R. c. shirensis, 9 R. c. cirnei, 10 R. udzungwensis, 11 R. p. adersi, 12 R. petersi (zoo), 13 R. p. petersi.

taxon | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13
1 E. edwardii | - | (18.1) | (18.5) | (19.2) | (21.1) | (21.2) | (20.5) | (24.5) | (21.8) | (21.5) | (19.3) | (21.4) | (21.4)
2 M. micus | 18.1 | - | (17.7) | (20.8) | (21.5) | (21.6) | (21.3) | (25.6) | (21.9) | (22.2) | (20.8) | (21.7) | (21.7)
3 P. tetradactylus | 18.5 | 17.7 | - | (22.5) | (22.1) | (22.8) | (22.6) | (27.5) | (22.9) | (23.1) | (22.8) | (22.6) | (22.7)
4 R. chrysopygus | 19.5-18.8 | 20.8-20.7 | 22.6-22.4 | - | (3.1) | (3) | (2.9) | (8.8) | (2.8) | (3.0) | (2.7) | (2.9) | (2.7)
5 R. c. stuhlmanni | 21.1 | 21.5 | 22.1 | 3.1 | - | (2.8) | (2.5) | (7.8) | (3.1) | (3.3) | (3.0) | (3.4) | (3.2)
6 R. c. reichardi | 21.7-19.4 | 21.8-21.2 | 22.9-22.5 | 3 | 2.9-2.5 | - | (1.5) | (6.8) | (1.6) | (3.5) | (2.5) | (3.5) | (3.4)
7 R. c. macrurus | 21.7-19.2 | 21.7-20.9 | 22.7-22.5 | 2.9-2.8 | 2.8-2.2 | 1.8-1.2 | - | (6.1) | (1.0) | (3.2) | (2.2) | (3.1) | (2.9)
8 R. c. shirensis | 24.5 | 25.6 | 27.5 | 8.8 | 7.8 | 6.8 | 6.1 | - | (6.1) | (8.6) | (7.8) | (7.9) | (7.7)
9 R. c. cirnei | 21.8 | 22-21.9 | 23.0-22.9 | 2.9-2.8 | 3.2-3.1 | 1.6-1.5 | 1.1-1.0 | 6.2-6.0 | - | (3.7) | (2.3) | (3.5) | (3.4)
10 R. udzungwensis | 21.7-21.4 | 22.6-22.0 | 23.2-23.0 | 3.1-2.9 | 3.5-3.2 | 3.7-3.4 | 3.4-3.1 | 11.4-7.8 | 4.0-3.6 | - | (1.0) | (1.2) | (1.0)
11 R. p. adersi | 19.3 | 20.8 | 22.8 | 2.7 | 3.0 | 2.5 | 2.2 | 7.8 | 2.3 | 1.0 | - | (0.1) | (0.2)
12 R. petersi (zoo) | 21.4 | 21.7 | 22.6 | 2.9 | 3.4 | 3.5 | 3.1 | 7.9 | 3.5 | 1.2 | 0.1 | - | (0.4)
13 R. p. petersi | 21.6-21.0 | 21.9-21.4 | 22.8-22.3 | 3.0-2.5 | 3.3-3.1 | 3.5-3.3 | 3.1-2.6 | 7.9-7.7 | 3.4-3.3 | 1.1-0.9 | 0.5-0.1 | 0.5-0.3 | -
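The percent differences in Tables 4 and 5 were obtained in Geneious as 100 minus percent identity. As a rough illustration of that arithmetic (not a reproduction of the Geneious implementation), the Python sketch below computes an uncorrected p-distance between two aligned sequences and summarizes the range and mean between two groups of sequences, as reported in Table 5. Skipping alignment columns that contain a gap or N is an assumption made here; the source does not state how such sites were treated.

```python
def percent_difference(seq_a, seq_b):
    """Uncorrected p-distance (in percent) between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    identical = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Assumption: columns with a gap or an undetermined base are ignored.
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    percent_identity = 100.0 * identical / compared
    return 100.0 - percent_identity


def group_summary(group_a, group_b):
    """Return (min, max, mean) percent difference over all between-group pairs."""
    distances = [percent_difference(x, y) for x in group_a for y in group_b]
    return min(distances), max(distances), sum(distances) / len(distances)
```

For example, two aligned 10-base sequences that differ at one site are 90% identical and therefore 10% different.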

Figure 1. Rhynchocyon cirnei subspecies ranges as proposed by Corbet and Hanks (1968). Note that two ranges for R. c. cirnei are shown (ranges 1 and 7). Corbet and Hanks (1968) hypothesized a potentially new subspecies of R. cirnei (range 7). Coals and Rathbun (2013) determined that giant sengis from range 7 are R. c. cirnei. Figure adapted from Corbet and Hanks (1968).

[Figure 1 map legend: R. c. shirensis, R. c. reichardi, R. c. hendersoni, R. c. macrurus, R. c. stuhlmanni, R. c. cirnei]

Figure 2. Geographic range of the genus Rhynchocyon, based on Rhynchocyon data compiled by G. Rathbun and summarized at www.sengis.org. Note the difference between the Figure 1 ranges, by Corbet and Hanks (1968), and the ranges in this figure, which reflect different data sets. Species ranges are colored in purple (R. chrysopygus), pink (R. cirnei), blue (R. petersi), and green (R. udzungwensis). Collection localities for samples used in this study are denoted with circles (R. chrysopygus), triangles (R. cirnei), stars (R. petersi), and squares (R. udzungwensis).

Figure 3. Majority rule consensus phylogram based on 10,000 MrBayes trees for the 12s16s mitochondrial region of Rhynchocyon. Bayesian posterior probabilities above 0.95 were considered significant and are represented by an asterisk above the branch. Maximum likelihood bootstrap values above 90 were considered significant and are represented by an asterisk below the branch.

Figure 4. TCS allelic networks for Rhynchocyon nuclear loci IRBP and vWF. Warm colors (pinks, oranges, yellows) represent R. cirnei subspecies. Cool colors (blues) represent R. petersi subspecies. Purple and green represent R. chrysopygus and R. udzungwensis, respectively. Black circles represent alleles that were not recovered in this analysis and 'n' represents the number of individuals sampled.

[Figure 4 legend: R. chrysopygus, R. cirnei cirnei, R. cirnei macrurus, R. cirnei reichardi, R. cirnei stuhlmanni, R. petersi (zoo), R. petersi petersi, R. udzungwensis]

Figure 5. Nucleotide alignment of Smit et al. (2011) sequences and primers. Figure shows primers used for amplification of the 12s16s region mapped to the sequences as posted on GenBank. Arrows indicate places where primers overlap.

Figure 6. Nucleotide alignment of Smit et al. (2011) sequences and Homo sapiens. Figure shows an alignment of Homo sapiens (KM986533), R. petersi (EU136153), R. cirnei (EU136154), R. chrysopygus (EU136152), and Douady et al. (2003) sequence Rhynchocyon sp. (AY310880). Note the ambiguous bases for R. cirnei (EU136154) and R. chrysopygus (EU136152) that match Homo sapiens and Douady et al.'s (2003) Rhynchocyon sp. sequence. Also note the homology between H. sapiens and R. petersi (EU136153).

Figure 7. Majority rule consensus cladogram based on 10,000 MrBayes trees for the 12s16s mitochondrial region of Rhynchocyon. Bayesian posterior probabilities above 0.95 were considered significant and are represented by an asterisk above the branch. Maximum likelihood bootstrap values above 90 were considered significant and are represented by an asterisk below the branch. To the right of the taxa "n" represents the number of individuals sampled and "h" represents the number of unique haplotypes.

R. chrysopygus n=3 h=3
R. cirnei stuhlmanni n=1 h=1
R. cirnei reichardi n=5 h=3
R. cirnei macrurus n=2 h=2
R. cirnei shirensis n=1 h=1
R. cirnei cirnei n=8 h=6
R. udzungwensis n=5 h=4
R. petersi n=23 h=17
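The n and h values in Figure 7 are the number of individuals sampled and the number of unique haplotypes per taxon. A minimal Python sketch of that tally is given below as an illustration; it assumes the sequences are already aligned, of equal length, and free of ambiguous bases (otherwise identical haplotypes could be counted as distinct). The function name and the toy sequences are hypothetical and are not data from this study.

```python
def sample_and_haplotype_counts(sequences):
    """Return (n, h): the number of sequences and the number of distinct haplotypes."""
    n = len(sequences)
    # Identical strings (case-insensitive) are treated as the same haplotype.
    h = len({seq.upper() for seq in sequences})
    return n, h


# Toy example: five individuals carrying three distinct haplotypes.
toy_sequences = ["ACGTAC", "ACGTAC", "ACGTAT", "acgtat", "TCGTAC"]
n, h = sample_and_haplotype_counts(toy_sequences)
```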

Figure 8: Type locality of Rhynchocyon cirnei hendersoni. Figure shows the type locality for R. c. hendersoni (near Livingstonia, Malawi) and collection locality of MCZ 43732 on the Viphya Plateau, approximately 200 kilometers south of the type locality. Localities plotted on Google Earth.

References

Akaike, H., 1974. A new look at the statistical model identification. IEEE Trans. Autom. Control 19, 716-723. doi:10.1109/TAC.1974.1100705

Andanje, S., Agwanda, B.R., Ngaruiya, G.W., Amin, R., Rathbun, G.B., 2010. Sengi (elephant-shrew) observations from northern coastal Kenya. J. East African Nat. Hist. 99, 1-8. doi:10.2982/028.099.0101

Ansell, W.F.H., 1964. Addenda and corrigenda to "Mammals of Northern Rhodesia." Puku 2, 14-52.

Clement, M., Posada, D., Crandall, K.A., 2000. TCS: A computer program to estimate gene genealogies. Mol. Ecol. 9, 1657-1659. doi:10.1046/j.1365-294X.2000.01020.x

Coals, P.G.R., Rathbun, G.B., 2013. The taxonomic status of giant sengis (genus Rhynchocyon) in Mozambique. J. East African Nat. Hist. 101, 241-250. doi:10.2982/028.101.0203

Corbet, G.B., 1970. Patterns of subspecific variation. Symp. Zool. Soc. London 26, 105-116.

Corbet, G.B., Hanks, J., 1968. A revision of the elephant-shrews, family Macroscelididae. Bull. Br. Museum (Natural Hist.) Zool. 16, 45-111.

Dasmahapatra, K.K., Silva-Vasquez, A., Chung, J., Mallet, J., 2007. Genetic analysis of a wild-caught hybrid between non-sister Heliconius butterfly species. Biol. Lett. 3, 660-663. doi:10.1098/rsbl.2007.0401

Degnan, J.H., Rosenberg, N.A., 2006. Discordance of species trees with their most likely gene trees. PLoS Genet. 2, e68. doi:10.1371/journal.pgen.0020068

Douady, C., 2001. Molecular phylogenetics of the Insectivora. PhD diss. The Queen's University of Belfast.

Douady, C.J., Catzeflis, F., Raman, J., Springer, M.S., Stanhope, M.J., 2003. The Sahara as a vicariant agent, and the role of Miocene climatic events, in the diversification of the mammalian order Macroscelidea (elephant shrews). Proc. Natl. Acad. Sci. U.S.A. 100, 8325-8330. doi:10.1073/pnas.0832467100

Doyle, J.J., 1992. Gene trees and species trees: molecular systematics as one-character taxonomy. Syst. Bot. 17, 144-163.

Dumbacher, J.P., Rathbun, G.B., Smit, H.A., Eiseb, S.J., 2012. Phylogeny and taxonomy of the round-eared sengis or elephant-shrews, genus Macroscelides (Mammalia, Afrotheria, Macroscelidea). PLoS One 7, e32410. doi:10.1371/journal.pone.0032410

Dumbacher, J.P., Rathbun, G.B., Osborne, T.O., Griffin, M., Eiseb, S.J., 2014. A new species of round-eared sengi (genus Macroscelides) from Namibia. J. Mammal. 95, 443-454. doi:10.1644/13-MAMM-A-159

FitzGibbon, C.D., 1997. The adaptive significance of monogamy in the golden-rumped elephant-shrew. J. Zool. 242, 167-177. doi:10.1111/j.1469-7998.1997.tb02937.x

Gatesy, J., Springer, M.S., 2014. Phylogenetic analysis at deep timescales: unreliable gene trees, bypassed hidden support, and the coalescence/concatalescence conundrum. Mol. Phylogenet. Evol. 80, 231-266. doi:10.1016/j.ympev.2014.08.013

Good, J.M., Demboski, J.R., Nagorsen, D.W., Sullivan, J., 2003. Phylogeography and introgressive hybridization: chipmunks (genus Tamias) in the northern Rocky Mountains. Evolution 57, 1900-1916. doi:10.1111/j.0014-3820.2003.tb00597.x

Hennig, W., 1965. Phylogenetic systematics. Annu. Rev. Entomol. 10, 97-116. doi:10.1146/annurev.en.10.010165.000525

Katoh, K., Misawa, K., Kuma, K., Miyata, T., 2002. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30, 3059-3066. doi:10.1093/nar/gkf436

Kearse, M., Moir, R., Wilson, A., Stones-Havas, S., Cheung, M., Sturrock, S., Buxton, S., Cooper, A., Markowitz, S., Duran, C., Thierer, T., Ashton, B., Meintjes, P., Drummond, A., 2012. Geneious Basic: An integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics 28, 1647-1649. doi:10.1093/bioinformatics/bts199

Kingdon, J., 1974. Elephant Shrews, in: East African Mammals: An Atlas of Evolution in Africa. Vol. II, Part A: Insectivores and Bats. Academic Press Inc., New York, pp. 36-55.

Kingdon, J., 2013. Mammals of Africa: Volume 1: Introductory Chapters and Afrotheria, in: Kingdon, J., Happold, D.C.D., Butynski, T.M., Hoffman, M., Happold, M., Kalina, J. (Eds.). Bloomsbury Publishing, London, pp. 75-100. doi:10.1016/S0070-2153(06)75012-8

Kress, W.J., Prince, L.M., Williams, K.J., 2002. Phylogeny and a new classification of the gingers (Zingiberaceae): evidence from molecular data. Am. J. Bot. 89, 1682-1696.

Larsen, P.A., Marchan-Rivadeneira, M.R., Baker, R.J., 2010. Natural hybridization generates mammalian lineage with species characteristics. Proc. Natl. Acad. Sci. U.S.A. 107, 11447-11452. doi:10.1073/pnas.1000133107

Lawson, L.P., Vernesi, C., Ricci, S., Rovero, F., 2013. Evolutionary history of the grey-faced sengi, Rhynchocyon udzungwensis, from Tanzania: a molecular and species distribution modeling approach. PLoS One 8, e72506. doi:10.1371/journal.pone.0072506

Leigh, J., Bryant, D., Steel, M., 2013. PopART (Population Analysis with Reticulate Trees). http://www.popart.otago.ac.nz/downloads.shtml. Accessed 22 February 2015.

Maddison, W.P., 1997. Gene trees in species trees. Syst. Biol. 46, 523-536.

Mayr, E., 1942. Systematics and the origin of species, from the viewpoint of a zoologist. Harvard University Press, Cambridge, MA, pp. 120.

Mayr, E., Bock, W.J., 2002. Classifications and other ordering systems. J. Zool. Syst. Evol. Res. 40, 169-194.

McKitrick, M.C., Zink, R.M., 1988. Species concepts in ornithology. Condor 90, 1-14.

Nylander, J.A.A., 2004. MrModeltest v2. Program distributed by the author. Evolutionary Biology Centre, Uppsala University.

O'Leary, M.A., Bloch, J.I., Flynn, J.J., Gaudin, T.J., Giallombardo, A., Giannini, N.P., Goldberg, S.L., Kraatz, B.P., Luo, Z., Meng, J., Ni, X., Novacek, M.J., Perini, F.A., Randall, Z.S., Rougier, G.W., Sargis, E.J., Silcox, M.T., Simmons, N.B., Spaulding, M., Velazco, P.M., Weksler, M., Wible, J.R., Cirranello, A.L., 2013. The placental mammal ancestor and the post-K-Pg radiation of placentals. Science 339, 662-667. doi:10.1126/science.1229237

Olson, L.E., Hassanin, A., 2003. Contamination and chimerism are perpetuating the legend of the snake-eating cow with twisted horns (Pseudonovibos spiralis). A case study of the pitfalls of ancient DNA. Mol. Phylogenet. Evol. 27, 545-548. doi:10.1016/S1055-7903(03)00022-8

Pellegrino, K.C.M., Rodrigues, M.T., Yonenaga-Yassuda, Y., Sites Jr., J.W., 2001. A molecular perspective on the evolution of microteiid lizards (Squamata, Gymnophthalmidae), and a new classification for the family. Biol. J. Linn. Soc. 74, 315-338. doi:10.1006/bijl.2001.0580

Rathbun, G.B., 2009. Why is there discordant diversity in sengi (Mammalia: Afrotheria: Macroscelidea) taxonomy and ecology? Afr. J. Ecol. 47, 1-13.

Rathbun, G.B., Kingdon, J., 2006. The etymology of 'sengi.' Afrotherian Conservation Newsletter IUCN/SSC Afrotheria Specialist Gr. 4, 14-15.

Ronquist, F., Huelsenbeck, J.P., 2003. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19, 1572-1574. doi:10.1093/bioinformatics/btg180

Rovero, F., Rathbun, G.B., Perkin, A., Jones, T., Ribble, D.O., Leonard, C., Mwakisoma, R.R., Doggart, N., 2008. A new species of giant sengi or elephant-shrew (genus Rhynchocyon) highlights the exceptional biodiversity of the Udzungwa Mountains of Tanzania. J. Zool. 274, 126-133. doi:10.1111/j.1469-7998.2007.00363.x

Scally, M., Madsen, O., Douady, C.J., de Jong, W.W., Stanhope, M.J., Springer, M.S., 2002. Molecular evidence for the major clades of placental mammals. J. Mamm. Evol. 8, 239-277.

Schluter, D., 2000. The ecology of adaptive radiation. Oxford University Press, pp. 10-11.

Scribner, K.T., Page, K.S., Bartron, M.L., 2000. Hybridization in freshwater species: a review of case studies and cytonuclear methods of biological inference. Rev. Fish Biol. Fish. 10, 293-323.

Smit, H., 2008. Phylogeography of three Southern African endemic elephant-shrews and a supermatrix approach to the Macroscelidea. PhD diss. Stellenbosch University.

Smit, H.A., Robinson, T.J., van Vuuren, B.J., 2007. Coalescence methods reveal the impact of vicariance on the spatial genetic structure of Elephantulus edwardii (Afrotheria, Macroscelidea). Mol. Ecol. 16, 2680-2692. doi:10.1111/j.1365-294X.2007.03334.x

Smit, H.A., Robinson, T.J., Watson, J., van Vuuren, B.J., 2008. A new species of elephant-shrew (Afrotheria: Macroscelidea: Elephantulus) from South Africa. J. Mammal. 89, 1257-1268.

Smit, H.A., van Vuuren, J., O'Brien, P.C.M., Ferguson-Smith, M., Yang, F., Robinson, T.J., 2011. Phylogenetic relationships of elephant-shrews (Afrotheria, Macroscelididae). J. Zool. 284, 133-143. doi:10.1111/j.1469-7998.2011.00790.x

Springer, M.S., Cleven, G.C., Madsen, O., de Jong, W.W., Waddell, V.G., Amrine, H.M., Stanhope, M.J., 1997. Endemic African mammals shake the phylogenetic tree. Nature 388, 61-64. doi:10.1038/40386

Springer, M.S., Amrine, H.M., Burk, A., Stanhope, M.J., 1999. Additional support for Afrotheria and Paenungulata, the performance of mitochondrial versus nuclear genes, and the impact of data partitions with heterogeneous base composition. Syst. Biol. 48, 65-75. doi:10.1080/106351599260445

Stamatakis, A., 2006. RAxML-VI-HPC: Maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22, 2688-2690. doi:10.1093/bioinformatics/btl446

Stanhope, M.J., Waddell, V.G., Madsen, O., de Jong, W.W., Hedges, S.B., Cleven, G.C., Kao, D., Springer, M.S., 1998a. Molecular evidence for multiple origins of Insectivora and for a new order of endemic African insectivore mammals. Proc. Natl. Acad. Sci. U.S.A. 95, 9967-9972.

Stanhope, M.J., Madsen, O., Waddell, V.G., Cleven, G.C., de Jong, W.W., Springer, M.S., 1998b. Highly congruent molecular support for a diverse superordinal clade of endemic African mammals. Mol. Phylogenet. Evol. 9, 501-508. doi:10.1006/mpev.1998.0517

Swofford, D.L., 2003. PAUP*. Phylogenetic Analysis Using Parsimony (*and Other Methods). Version 4. Sinauer Associates, Sunderland, Massachusetts.

Szollosi, G.J., Tannier, E., Daubin, V., Boussau, B., 2014. The inference of gene trees with species trees. Syst. Biol. 64, e42-e62. doi:10.1093/sysbio/syu048

Thomas, O., 1902. LVI. A new Rhynchocyon from Nyasaland. J. Nat. Hist. 10, 403-404.

Untergasser, A., Cutcutache, I., Koressaar, T., Ye, J., Faircloth, B.C., Remm, M., Rozen, S.G., 2012. Primer3-new capabilities and interfaces. Nucleic Acids Res. 40, 1-12. doi:10.1093/nar/gks596

van Dijk, M.A., Madsen, O., Catzeflis, F., Stanhope, M.J., de Jong, W.W., Pagel, M., 2001. Protein sequence signatures support the African clade of mammals. Proc. Natl. Acad. Sci. U.S.A. 98, 188-193. doi:10.1073/pnas.98.1.188

Villesen, P., 2007. FaBox: An online toolbox for FASTA sequences. Mol. Ecol. Notes 7, 965-968. doi:10.1111/j.1471-8286.2007.01821.x

Zahiri, R., Kitching, I.J., Lafontaine, J.D., Mutanen, M., Kaila, L., Holloway, J.D., Wahlberg, N., 2011. A new molecular phylogeny offers hope for a stable family level classification of the Noctuoidea (Lepidoptera). Zool. Scr. 40, 158-173. doi:10.1111/j.1463-6409.2010.00459.x

Appendix 1 - Color Plates

Figure A: Rhynchocyon chrysopygus CAS MAM 24526

head grizzled cream, brown, and black

flanks, thighs and back maroon

rump golden

Rhynchocyon chrysopygus

Figure B: Rhynchocyon cirnei cirnei CAS MAM 29358

dorsal ground color grizzled black and yellow

second row of spots faint but discrete

dorsal spots chestnut

Rhynchocyon cirnei cirnei

Figure C: Rhynchocyon cirnei macrurus AMNH 179301

central stripes chestnut

second row with isolated spots

rump and flanks rufous

Rhynchocyon cirnei macrurus*

* light form only

Figure D: Rhynchocyon cirnei reichardi CAS MAM 28535

dorsal ground color grizzled black and cream

central stripes black

rump not rufous

Rhynchocyon cirnei reichardi

Figure E: Rhynchocyon cirnei shirensis AMNH 161777

feet and ear slightly browner than rest of pelage

dorsal ground color grizzled black and cream

spots blackish brown, lighter at edges

Rhynchocyon cirnei shirensis

Figure F: Rhynchocyon cirnei stuhlmanni AMNH 49462 (western form) and AMNH 49521 (eastern form)

ground color grizzled black and cream or yellow

central stripes dark

tail completely white

eastern form western form

Rhynchocyon cirnei stuhlmanni

Figure G: Rhynchocyon petersi petersi CAS MAM 30667

head rufous and slightly grizzled

shoulders and flanks orange rufous

rump and center of back black

black hairs from rump extend onto tail

Rhynchocyon petersi petersi

Figure H: Rhynchocyon udzungwensis CAS MAM 28043

grey forehead

behind ears and shoulder grizzled yellow-rufous

orange-rufous sides

lower rump jet black

Rhynchocyon udzungwensis

Figure I: Multiple Rhynchocyon specimens. (a) R. chrysopygus (b) R. p. petersi (c) R. p. adersi (d-f) R. c. stuhlmanni (g) R. c. hendersoni (h) R. c. reichardi (i) R. c. shirensis (j) R. cirnei subsp. (k-m) R. c. macrurus. Image from Corbet and Hanks (1968).

Appendix 2 - Additional field data collected from unvouchered specimens

taxa | collector | field number | date | sex | weight (g) | head & body length (mm) | tail length (mm) | hind foot length (mm) | ear length (mm) | lat | long | notes
R. cirnei stuhlmanni | K. Consolate | MK001 | 6/20/2005 | F | 475 | 285 | 240 | 75.2 | 29.7 | 0.2946 | 25.2917 |
R. cirnei stuhlmanni | K. Consolate | M300 | 2/18/2006 | M | 355 | 307 | 220 | 73.3 | 33.6 | 0.0131 | 25.5565 |
R. petersi petersi | C. Sabuni | TA1818 | 5/25/2012 | M | 490 | 229 | 221 | 68.5 | 27.4 | -5.9955 | 38.7607 | adult
R. petersi petersi | C. Sabuni | TZ22766 | 2/12/2013 | M | 471 | 227 | 222 | 75.5 | 29 | -5.5759 | 38.6423 | adult
R. petersi petersi | C. Sabuni | TZ22783 | 6/4/2013 | M | 446 | 252 | 225 | 68.5 | 25.3 | -5.6010 | 38.6468 | adult
R. petersi petersi | C. Sabuni | TZ22767 | 2/16/2013 | M | 481 | 229 | 210 | 66.5 | 30 | -5.5639 | 38.6502 | adult
R. petersi petersi | C. Sabuni | TZ22769 | 3/6/2013 | M | 445 | 274 | 228 | 72.5 | 26.6 | -5.5871 | 38.6395 | adult
R. petersi petersi | C. Sabuni | TZ22770 | 3/8/2013 | F | 490 | 286 | 235 | 70.4 | 29 | -5.5871 | 38.6404 | adult
R. petersi petersi | C. Sabuni | TZ22779 | 4/24/2013 | M | 430 | 260 | 210 | 70 | 30 | -5.8937 | 38.5944 | adult
R. petersi petersi | C. Sabuni | TZ22776 | 4/11/2013 | F | 480 | 250 | 211 | 67.4 | 27.8 | -5.8921 | 38.5939 | adult
R. petersi petersi | C. Sabuni | TZ22811 | 9/15/2020 | M | 444 | 274 | 228 | 66.6 | 26 | -5.8944 | 38.5946 | adult
R. petersi petersi | C. Sabuni | TZ22775 | 4/11/2013 | M | 450 | 250 | 210 | 68.5 | 30 | -5.8921 | 38.5938 | adult
R. petersi petersi | C. Sabuni | TZ22778 | 4/18/2013 | F | 600 | 280 | 223 | 70.4 | 29.1 | -5.8938 | 38.5949 | pregnant
R. petersi petersi | C. Sabuni | TZ22774 | 4/8/2013 | F | 550 | 260 | 222 | 71.5 | 27.6 | -5.8909 | 38.5928 | pregnant
R. petersi petersi | C. Sabuni | TA1833 | 4/2/2012 | M | 488 | 227 | 210 | 70 | 30.4 | -6.1056 | 38.6167 | adult
R. petersi petersi | C. Sabuni | RP15 | 8/11/2012 | F | 422 | 280 | 223 | 70.4 | 27 | -6.1367 | 38.6055 | adult
R. petersi petersi | C. Sabuni | TA1812 | 3/25/2012 | F | 518 | 272 | 220 | 69.5 | 31 | -6.1055 | 38.6158 | pregnant
R. petersi petersi | C. Sabuni | TA1835 | 8/3/2012 | F | 403 | 469 | 215 | 70 | 25.7 | -6.1126 | 38.6211 | adult

Appendix 3 - Fresh and historical DNA extraction methods

Fresh Tissue DNA Extraction Methods

DNA was extracted from approximately 25 mg of fresh tissue (previously stored in ethanol and frozen at -80°C until extraction) using a DNeasy Blood and Tissue extraction kit (Qiagen, Venlo, Limburg, Netherlands). Tissue was diced into smaller pieces and placed in a tube containing 180 µL of ATL tissue lysis buffer and 20 µL of 10 mg/mL proteinase K. Tissue was allowed to incubate at 55°C for 4 hours to overnight. After digestion, samples were removed from the incubator and 200 µL of AL lysis buffer was added along with 200 µL of 100% ethanol. Each sample was vortexed well. Samples were then transferred to a DNeasy Mini Spin Column (Qiagen, Venlo, Limburg, Netherlands) with a collection tube and spun at 8,000 rpm for 1 minute. The spin column was transferred to a new collection tube and 500 µL of AW1 Wash Buffer was added. Samples were spun again at 8,000 rpm for 1 minute. The spin column was transferred to a new collection tube and 500 µL of AW2 Wash Buffer was added. Samples were spun at 14,000 rpm for 3 minutes. Spin columns were transferred to a final collection tube and 100 µL of AE Elution Buffer (10 mM Tris-Cl, 0.5 mM EDTA; pH 9.0) was added. Samples were allowed to incubate at room temperature for 5 minutes and then spun at 8,000 rpm for 1 minute. An additional 100 µL of AE Elution Buffer was added to each sample, incubated at room temperature for 5 minutes, and spun at 8,000 rpm for 1 minute. A working aliquot of supernatant was stored at -20°C with the remainder stored at -80°C in the Center for Comparative Genomics Cryogenic Collection.

Historical Tissue DNA Extraction Methods

DNA was extracted from approximately 25 mg of dried tissue in a dedicated ancient DNA laboratory. Tissue was taken from the hind foot or the dorsal incision of the specimen. DNA was extracted using a standard phenol-chloroform extraction or the DNA IQ™ Tissue and Hair Extraction Kit (Promega Bio Systems, Sunnyvale, California, USA).

For the phenol-chloroform extraction a master mix of extraction solution was prepared with the final concentrations: 10 mg/ml DTT, 1% SDS (by weight), 0.1 mg/ml proteinase K, 0.02 M EDTA, 0.01 M Tris, 0.01 M sodium chloride. Tissue was diced on UV sterilized aluminum foil and placed in a UV sterilized 2 mL tube along with 750 µL of the extraction solution. Samples were incubated at 55°C overnight. If digestion was not complete, samples were spiked with an additional 10 µL of 10 mg/mL proteinase K and allowed to incubate for another 2-12 hours. Following incubation samples were spun down to remove solution from the lid and 750 µL of phenol was added. Samples were inverted to mix the phenol and aqueous layers, then spun at 10,000 rpm for 1 minute. The aqueous layer was removed and placed into a new, UV sterilized tube and an additional 750 µL of phenol was added to each sample. Samples were once again inverted to mix the phenol and aqueous layers and spun at 10,000 rpm for 1 minute. The aqueous layer was removed and placed into a new, UV sterilized tube and 750 µL of chloroform was added to each sample. Samples were inverted to mix the chloroform and aqueous layers and spun at 10,000 rpm for 1 minute. The aqueous layer was removed, placed in a Centricon® Centrifugal Filter Unit (EMD Millipore), and 1 mL of UV sterilized deionized H2O was added to each sample. Samples were spun at 5,000 rpm at room temperature for 20 minutes. After 20 minutes an additional 2 µL of UV sterilized deionized H2O was added to each sample. Samples were spun at 5,000 rpm at room temperature for an additional 20 minutes. Tubes were then inverted, allowing the supernatant to collect in the cap. Samples were spun at 1,000 rpm at room temperature for 3 minutes. Supernatant was collected, placed in a UV sterilized tube, and set on an incubation block at 55°C for 10 minutes. The extraction was subsequently stored at -20°C in a dedicated ancient DNA freezer.

For the extraction using the DNA IQ™ Tissue and Hair Extraction Kit (Promega Bio Systems, Sunnyvale, California, USA) up to 25 mg of tissue was placed in a UV sterilized tube. To wash samples, 500 µL of 100% EtOH was added to each sample. Samples were vortexed briefly, EtOH was removed, and an additional 500 µL of 100% EtOH was added. Samples were vortexed again, tissue was removed, minced and placed in a new UV sterilized tube.

A master mix of incubation solution containing 64 µL incubation buffer, 8 µL 1 M DTT, and 8 µL proteinase K was prepared for each sample. Samples were incubated on a shaker or rotator at 56°C for 2 hours to overnight. Following incubation 150 µL of prepared lysis buffer and 15 µL of DNA IQ™ Resin were added to each sample. Samples were vortexed at high speed for 3 seconds and allowed to incubate for 5 minutes at room temperature. Following incubation samples were vortexed at high speed for 2 seconds, and immediately placed in a MagneSphere® Technology Magnetic Separation Stand (Promega Bio Systems, Sunnyvale, California, USA). Once separation occurred, all solution was disposed of without disturbing the resin pellet. An additional 100 µL of lysis buffer was added to each sample, samples were removed from the magnetic stand, vortexed and immediately placed back in the magnetic stand. Once separation occurred, all lysis buffer was removed without disturbing the resin pellet. For the first wash, 100 µL of 1X wash buffer was added to each sample. Samples were removed from the magnetic stand, vortexed for 2 seconds at high speed, and immediately placed back in the magnetic stand. Wash steps were repeated another 2 times for a total of 3 washes. Samples were allowed to air-dry at room temperature in the magnetic stand with the lids open for 5 minutes. Following air-drying, 100 µL of elution buffer was added to each sample. Samples were incubated on a heat block at 65°C for 5 minutes. After 5 minutes, samples were immediately placed in the magnetic stand. Supernatant was collected and placed in a UV sterilized tube. The extraction was stored at -20°C in a dedicated ancient DNA freezer until used for polymerase chain reaction.
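The per-sample incubation solution described above (64 µL incubation buffer, 8 µL 1 M DTT, and 8 µL proteinase K) scales linearly with the number of samples. The Python sketch below shows one way to tabulate master mix volumes for a batch. It is an illustration only: the function name is hypothetical, and the optional 10% pipetting overage is an assumption that is not stated in the protocol.

```python
# Per-sample reagent volumes in microliters, as given in the protocol text.
PER_SAMPLE_UL = {
    "incubation buffer": 64.0,
    "1 M DTT": 8.0,
    "proteinase K": 8.0,
}


def master_mix(n_samples, overage=0.1):
    """Return total reagent volumes (µL) for n_samples.

    overage is the fractional excess prepared to cover pipetting loss;
    the 10% default is an assumption, not a value from the protocol.
    """
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    factor = n_samples * (1.0 + overage)
    return {reagent: round(volume * factor, 1)
            for reagent, volume in PER_SAMPLE_UL.items()}


# Example: volumes for 10 samples with no overage.
volumes = master_mix(10, overage=0.0)
```

Each sample then receives 80 µL of the combined solution, the sum of the three per-sample volumes.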