Advance Publication by J-STAGE

Genes & Genetic Systems

Received for publication: April 2, 2017 Accepted for publication: February 5, 2018 Published online: April 10, 2018

1 Original paper

2 Signature of positive selection in mitochondrial DNA

3 in Cetartiodactyla

4

5 Satoko Mori1 and Masatoshi Matsunami1,2*

6

7

8 1Laboratory of Ecology and Genetics, Graduate School of Environmental Science,

9 Hokkaido University, N10W5, Kita-ku, Sapporo, Hokkaido 060-0810, Japan

10 2Graduate School of Medicine, University of the Ryukyus, 207, Nishihara-cho,

11 Okinawa 903-0215, Japan

12

13 *Corresponding author.

14 Masatoshi Matsunami

15 207, Nishihara-cho, Okinawa 903-0215, Japan

16 TEL/FAX +81-98-895-1766, Email: [email protected]

17

18 Running head: Positive selection in mtDNA of Cetartiodactyla

19

20

21

22 Abstract

23 Acceleration of the amino acid substitution rate is a good indicator of positive

24 selection in adaptive evolutionary changes of functional genes. Genomic information

25 has become readily available in recent years, and many researchers have

26 attempted to clarify the adaptive evolution of mammals by examining evolutionary rate

27 change based on multiple loci. The order Cetartiodactyla (Artiodactyla and Cetacea) is

28 one of the most diverse orders of mammals. Species in this order are found throughout

29 all continents and seas, except Antarctica, and they exhibit wide variation in

30 morphology and habitat. Here, we focused on the metabolism-related genes of

31 mitochondrial DNA (mtDNA) in species of the order Cetartiodactyla using 191 mtDNA

32 sequences available in databases. Based on comparisons of the dN/dS ratio (ω) in 12

33 protein-coding genes, ATP8 was shown to have a higher ω value (ω = 0.247) throughout

34 Cetartiodactyla than the other 11 genes (ω < 0.05). In a branch-site analysis of ATP8

35 sequences, a markedly higher ω value of 0.801 was observed in the ancestral lineage of

36 the clade of Cetacea, which is indicative of adaptive evolution. Through efforts to detect

37 positively selected amino acids, codon positions 52 and 54 of ATP8 were shown to have

38 experienced positive selective pressure during the course of evolution; multiple

39 substitutions have occurred at these sites throughout the cetacean lineage. At position 52,

40 glutamic acid was replaced with asparagine, and, at position 54, lysine was replaced

41 with non-charged amino acids. These sites are conserved in most Artiodactyla. These

42 results imply that the ancestor of cetaceans underwent accelerated amino acid changes

43 in ATP8 and replacements at codons 52 and 54, which adjusted metabolism to adapt to

44 the marine environment.

45

46 Key words: ATP8, Cetartiodactyla, dN/dS ratio (ω), mtDNA, positive selection

47

48 Introduction

49 Identification of genetic changes leading to morphological and physiological

50 adaptations is one of the central goals of evolutionary biology. An accelerated rate of

51 amino acid substitutions in a particular gene and during a particular evolutionary period

52 is evidence that the gene plays an important role in adaptive evolution in a given

53 species (Messier and Stewart, 1997). An accelerated rate of amino acid substitutions in

54 protein-coding genes can be assessed by the dN/dS ratio (ω), which is the ratio of the

55 nonsynonymous substitution rate (dN) and the synonymous substitution rate (dS) (Yang,

56 1998; Yang and Nielsen, 1998). The ω value is used as an indicator of selective pressure

57 acting on the protein-coding genes; values of ω < 1, ω = 1, and ω > 1 indicate purifying

58 selection, neutral evolution, and positive selection, respectively (Yang, 2007). Recent

59 studies in mammals have attempted to identify genes that contributed to adaptive

60 evolution in particular species by comparing the substitution rates in multiple loci. For

61 example, Chikina et al. (2016) detected accelerated substitution rates in genes involved

62 in sensory, structural, and metabolic functions in marine mammals, providing clues to better

63 understand their adaptive evolution to a particular environment.
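
As a minimal illustration of the interpretation described above, the following Python sketch computes ω from dN and dS and labels the selective regime. The function name `classify_omega` and the tolerance `tol` are assumptions introduced here for illustration; in practice, departures of ω from 1 are assessed statistically rather than with a fixed cutoff.

```python
def classify_omega(dn, ds, tol=1e-6):
    """Return (omega, regime) for one gene or branch.

    dn and ds are the nonsynonymous and synonymous substitution rates.
    tol is a hypothetical tolerance for treating omega as equal to 1.
    """
    if ds <= 0:
        raise ValueError("dS must be positive to form the dN/dS ratio")
    omega = dn / ds
    if abs(omega - 1.0) <= tol:
        regime = "neutral evolution"
    elif omega < 1.0:
        regime = "purifying selection"
    else:
        regime = "positive selection"
    return omega, regime


# Hypothetical rates, chosen only to exercise the three cases.
for dn, ds in [(0.02, 0.08), (0.10, 0.10), (0.30, 0.10)]:
    omega, regime = classify_omega(dn, ds)
    print(f"dN = {dn}, dS = {ds}: omega = {omega:.3f} ({regime})")
```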

64 One could assume that natural selection does not operate over an entire coding

65 region but rather at specific amino acid sites that are essential for changing the function

66 of a given gene. Therefore, care should be taken to identify specific amino acid

67 substitutions that affect functional changes in the genes (positively selected sites) in the

68 process of elucidating the genes involved in adaptive evolution. Most studies only

69 discuss acceleration of the substitution rate, while some studies have detected positive

70 selection and positively selected sites (Finch et al., 2014; Tian et al., 2016).

71 Mitochondrial genomes, which are circular molecules of 14,000-20,000 bp in

72 (Kolesnikov and Gerasimov, 2012), are now available in databases for many

73 organisms. Mitochondrial DNA (mtDNA) in mammals encodes 13 proteins that

74 constitute oxidative phosphorylation (OXPHOS) complexes: ND1-4, 4L, 5 and 6 of

75 complex I (NADH dehydrogenase); cytochrome b (Cytb) of complex III (bc1 complex);

76 COX1-3 of complex IV (cytochrome c oxidase); and ATP6 and ATP8 of complex V

77 (ATP synthase) (Wallace, 2007). Most cellular energy is produced through the

78 OXPHOS pathway, which takes place in the protein complexes embedded in the inner

79 mitochondrial membrane (Saraste, 1999). Mutations of mtDNA greatly affect the

80 metabolic activity of organisms, suggesting that mtDNA plays an important role in

81 morphological evolution and environmental adaptation. It is known that purifying

82 selection is one of the dominant forces of evolution on mitochondrial OXPHOS genes

83 (Tomasco and Lessa, 2011). On the other hand, evidence of adaptive evolution acting on

84 mtDNA has already been detected in previous reports. For instance, da Fonseca et al.

85 (2008) carried out comparative sequence analysis on protein-coding genes of

86 mitochondrial genomes across 41 species and detected a number of

87 substitutions that alter the biochemical properties of functional sites of specific proteins.

88 Because mammals adapt to different habitats by changing their metabolic processes, an

89 accelerated substitution rate in the mtDNA might be tightly linked with the adaptive

90 evolution of some metabolic processes.

91 One of the most diverse orders of mammals, Cetartiodactyla, has 332 extant

92 species grouped into 132 genera (IUCN, Hassanin et al., 2012). This order includes

93 artiodactyls (, , , hippos, , and llamas) and cetaceans

94 (whales, dolphins, and porpoises) which are found throughout all continents and seas,

95 except Antarctica. Within this order, there is large morphological variation and great

96 habitat diversity, as seen with cetaceans. As the ancestor of Cetacea originally lived on

97 land (Gatesy et al., 2013), an adaptation to water with a change in metabolism was

98 required to make the transition to aquatic living (Tomanek, 2014). Other examples are

99 found within tribes Lamini (belongs to family ) and Caprini (belongs to

100 family ); most species in these tribes live in high-altitude mountains and adapt

101 to low levels of oxygen, cold temperature, and scarce food supply. Such stresses may

102 promote specific directional evolution of mtDNA (Hassanin et al., 2009). Large

103 variation in body size is also widely observed in this order, which includes very small

104 species such as those of the family Tragulidae (weighing less than 3 kg) (Rössner, 2007) and

105 large-bodied animals such as Giraffini (adult giraffes weigh in excess of 1000 kg)

106 (Brown et al., 2007) and whales (some species have a length greater than 10 m)

107 (Ridgway, 1997). In general, difference in body size is strongly related to metabolism

108 (Martin and Palumbit, 1993). Therefore, the order Cetartiodactyla is a good model for

109 inferring the molecular evolution of mtDNA that is associated with environmental

110 adaptation and morphological evolution, which are linked to metabolic processes. A few

111 previous studies reported adaptive evolution of mtDNA with a focus on specific taxa

112 belonging to Cetartiodactyla. Although Hassanin et al. (2009) showed that the ω of

113 ATPase increased during the evolution of Caprini, which live at high altitudes, they did

114 not demonstrate the statistical significance of this increase. In addition, Caballero et al.

115 (2015) identified that codon site 297 in the ND2 gene in the three “river dolphins”

116 (families Pontoporiidae, Lipotidae, and Iniidae) is under positive selection related to

117 adaptation to the freshwater environment. However, significant positive selection in

118 Cetartiodactyla has not been reported and comprehensive studies of this order are

119 lacking.

120 In this study, we used the dN/dS ratio (ω) to identify genes that show an

121 accelerated evolutionary rate and positive selection of mtDNA in Cetartiodactyla. Then,

122 the relationship between molecular evolution of mtDNA and environmental adaptation,

123 such as change of body size and habitat, was elucidated. Amino acid substitutions under

124 positive selection were identified to infer functional changes of genes that correlated

125 with adaptive evolution in specific lineages of Cetartiodactyla.

126 Materials and Methods

127 Phylogenetic analyses

128 The complete mitochondrial genomes from 210 species described in Hassanin

129 et al. (2009) covering most families of Cetartiodactyla were retrieved from the NCBI

130 database (Supplementary Table S1). Additionally, we downloaded two mitochondrial

131 genomes, Orcinus orca and Equus caballus (Xiufeng and Arnason, 1994; Morin et al.,

132 2010). From these sequences, we excluded 21 species for which gene annotations were

133 incomplete and portions of protein coding sequences were missing. We used 191

134 well-annotated mitochondrial genomes for further analysis. These whole sequences

135 were aligned using MAFFT with default settings (Katoh and Standley, 2013). A

136 phylogenetic tree was constructed using the maximum likelihood (ML) analysis

137 implemented in RAxML v8.2 (Stamatakis, 2014) with the GTR+CAT model. Bootstrap

138 probabilities (BP) were computed using 100 replicates. Based on the phylogenetic tree

139 generated with the full-length mtDNA alignment, evolutionary rates were investigated.
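
The alignment and tree-building steps might be scripted along the following lines. This is a sketch only: the executable names (`mafft`, `raxmlHPC`), the file names, the random seed, and the rapid-bootstrap mode (`-f a`) are assumptions that are not stated in the text. Only the default MAFFT settings, the GTR+CAT model, and the 100 bootstrap replicates are taken from the description above.

```python
from typing import List


def mafft_command(fasta_in: str) -> List[str]:
    """Command line for MAFFT with default settings.

    MAFFT writes the alignment to standard output, so the caller is
    expected to redirect it to a file.
    """
    return ["mafft", fasta_in]


def raxml_command(alignment: str, run_name: str,
                  seed: int = 12345, replicates: int = 100) -> List[str]:
    """Command line for a RAxML v8 ML search with bootstrapping.

    '-f a' (rapid bootstrap plus best-tree search) and the seed value
    are assumptions; the source only specifies GTR+CAT and 100 replicates.
    """
    return [
        "raxmlHPC",
        "-f", "a",
        "-m", "GTRCAT",
        "-p", str(seed),        # seed for parsimony starting trees
        "-x", str(seed),        # seed for rapid bootstrapping
        "-#", str(replicates),  # number of bootstrap replicates
        "-s", alignment,
        "-n", run_name,
    ]


# The commands are only printed here; they could be passed to
# subprocess.run() on a machine where both programs are installed.
print(" ".join(mafft_command("mtgenomes.fasta")), "> mtgenomes.aln.fasta")
print(" ".join(raxml_command("mtgenomes.aln.fasta", "cetartiodactyla")))
```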

140

141 Identification of positive selection sites

142 Among the phylogenetic branches of Cetartiodactyla, dN/dS analyses were

143 carried out to infer selective pressure on each of the 12 protein-coding genes of

144 mtDNA: ND1, ND2, COX1, COX2, ATP8, ATP6, COX3, ND3, ND4L, ND4, ND6, and

145 Cytb. Because we identified many inappropriate annotations in the ND5 gene, such

146 as the insertion of termination codons, we excluded this gene from analysis. The 191

147 sequences of each gene were aligned using the CLUSTALW algorithm with default

148 settings in MEGA software (Thompson et al., 1994; Kumar et al., 2016). The dN/dS

149 ratio (ω) values were calculated by the CODEML program implemented in the PAML

150 v. 4.8 package (Yang, 2007). Two types of analyses were conducted to evaluate the

151 evolutionary rates by the following models. First, the ω was calculated for each gene

152 of the mitochondrial genome to identify genes showing a relatively high evolutionary

153 rate. Site models M0 (one-ratio model: model = 0, NSsites = 0, fix_omega = 0) and M1a

154 (neutral model: model = 0, NSsites = 1, fix_omega = 0), neither of which allows ω to vary

155 among branches, were fitted. Using model M0, dN and dS values for each branch were

156 calculated assuming an identical ω among all branches. Model M1a was also

157 computed using an identical ω among all branches under two categories: 1) sites under

158 purifying selection (0 < ω < 1) and 2) sites under neutral evolution (ω = 1) (Yang et al.,

159 2005). Then, focusing on the accelerated genes, we examined whether these genes were

160 positively selected using the likelihood ratio test (LRT) between M8 (selection model:

161 model = 0, NSsites = 8, fix_omega = 0) and M8a (neutral model: model = 0, NSsites = 8,

162 fix_omega = 1) site models (Swanson et al., 2003). Model M8 can detect positive

163 selection for all branches allowing codons to evolve with dN/dS (ω) > 1. Model M8a

164 was used to calculate ω values for all branches under two categories: 1) sites under

165 purifying selection (0 < ω < 1) and 2) sites under neutral evolution (ω = 1). The LRT

166 between the M8 and M8a models was conducted to confirm positive selection in the

167 selected genes. Second, we evaluated the evolutionary rate for each branch on the

168 phylogenetic tree. For genes that showed high ω values in M0 and M1a model

169 analyses, we conducted maximum likelihood analyses using the branch model

170 (two-ratio model: model = 2, NSsites = 0, fix_omega = 0) to calculate dN/dS ratio (ω)

171 specific to the lineages. In this study, we focused on the branches that showed body

172 size evolution, adaptation to high-altitude environment and adaptation to the aquatic

173 environment. To infer positive selection on specific branches, further branch-site

174 model analyses were conducted using the MA model (model = 2, NSsites = 2,

175 fix_omega = 0) and MA null model (model = 2, NSsites = 2, fix_omega = 1) (Yang et al.,

176 2005). The LRT between the MA and M1a models was conducted to assign significant

177 increases in evolutionary rates to specific branches (Zhang et al., 2005). LRTs between

178 MA and MA null models were conducted to infer positive selection on targeted

179 branches (Yang et al., 2005). In the LRTs, we conducted chi-squared tests with the

180 degrees of freedom set to 1 to examine the statistical significance of differences

181 between the models using a threshold of P < 0.05. Because low dS values and

182 saturation of substitutions make dN/dS estimates unreliable, we excluded branches showing

183 dS < 0.001, or dS or dN > 2, from all dN/dS estimations according to the criteria of

184 Villanueva-Cañas et al. (2013).
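
The model settings listed above can be collected in a small table and written out as CODEML control files. The Python sketch below is illustrative only: the file names, the starting value of ω, and the settings `runmode = 0`, `seqtype = 1`, `CodonFreq = 2` and `icode = 1` (mammalian mitochondrial code) are assumptions that are not given in the text.

```python
# (model, NSsites, fix_omega) for each model named in the text.
MODEL_SETTINGS = {
    "M0": (0, 0, 0),
    "M1a": (0, 1, 0),
    "M8": (0, 8, 0),
    "M8a": (0, 8, 1),
    "two_ratio": (2, 0, 0),
    "MA": (2, 2, 0),
    "MA_null": (2, 2, 1),
}


def control_file(model_name, seqfile, treefile, outfile, start_omega=0.5):
    """Return the text of a CODEML control file for one model.

    For models with model = 2, the foreground branch must also be
    marked in the tree file (PAML uses labels such as '#1').
    """
    model, nssites, fix_omega = MODEL_SETTINGS[model_name]
    # With fix_omega = 1, omega is held at the value given here,
    # which is 1 for the null models M8a and MA null.
    omega = 1.0 if fix_omega == 1 else start_omega
    lines = [
        f"seqfile = {seqfile}",
        f"treefile = {treefile}",
        f"outfile = {outfile}",
        "runmode = 0",    # user-supplied tree (assumption)
        "seqtype = 1",    # codon sequences (assumption)
        "CodonFreq = 2",  # F3x4 codon frequencies (assumption)
        "icode = 1",      # mammalian mitochondrial code (assumption)
        f"model = {model}",
        f"NSsites = {nssites}",
        f"fix_omega = {fix_omega}",
        f"omega = {omega}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical file names, for illustration only.
print(control_file("MA_null", "ATP8.phy", "tree.nwk", "ATP8_MA_null.out"))
```

Keeping the settings in one table makes each LRT pair (M8 versus M8a, MA versus MA null) differ only in `fix_omega`, which is the single constraint that the one-degree-of-freedom test evaluates.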

185 In addition, in the branches that showed significantly accelerated evolutionary

186 rates or positive selection, amino acid sites under positive selection were detected using

187 Bayes empirical Bayes (BEB) analysis, retaining sites with

188 posterior probabilities > 0.80 (Yang et al., 2005; Tian et al., 2016) in the MA model of

189 the CODEML analysis. As the CODEML model has the potential to be biased and tends

190 to detect positive selection operating on conserved genes (Foote et al., 2011), we

191 analyzed physico-chemical changes due to amino acid replacement by TreeSAAP

192 (Woolley et al., 2003), which measures the influence of the amino acid replacements

193 based on 31 structural and biochemical amino acid property changes. Amino acids

194 showing property changes in category-7 and -8 (two categories with radical property

195 changes) with P ≤ 0.001 were considered to reflect a strong difference in selective

196 pressure.

197

198

199 Results

200 Phylogenetic relationships

201 Phylogenetic analyses were conducted using the complete mitochondrial sequences

202 (20,866 bp) of 191 species of Cetartiodactyla (Fig. 1). Most nodes had high bootstrap

203 values (100%), and the topology was in good accord with previously reported trees

204 (Hassanin et al., 2012).

205

206 Accelerated evolutionary rate of ATP8 in Cetartiodactyla

207 The ω for the phylogenetic tree shown in Fig. 1 was calculated using the M0 and

208 M1a models for the dS and dN rates in each of the 12 mitochondrial OXPHOS genes

209 and was overall small (ω < 1) across all genes, suggesting the conservative evolution of

210 mitochondrial genes (Table 1). The ω values across all genes varied greatly, with

211 relatively high values (M0: 0.247, M1a: 0.317) for ATP8 compared to smaller values for

212 the remaining genes (M0: ω < 0.05, M1a: ω < 0.10) (Table 1). In the M0 model, the dN

213 component of ω was also the highest (M0: 0.020) for ATP8 (Fig. 2, Table 1). This indicates

214 that the substitution rate in ATP8 is accelerated. However, no significant difference

215 was detected in the LRT between M8 and M8a (P = 0.078, Table 1), providing no evidence

216 of positive selection acting across all branches of the tree.
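
The likelihood ratio tests used here compare twice the difference in log-likelihood between two nested models with a chi-squared distribution with one degree of freedom. A minimal Python sketch follows; the function name `lrt_df1` and the log-likelihood values are hypothetical and are not taken from Table 1.

```python
import math


def lrt_df1(lnl_null, lnl_alt):
    """Likelihood ratio test with one degree of freedom.

    Returns the test statistic 2 * (lnL_alt - lnL_null) and its P value.
    For one degree of freedom, the chi-squared survival function can be
    written with the complementary error function: P = erfc(sqrt(x / 2)).
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods for a null and an alternative model.
stat, p = lrt_df1(lnl_null=-1500.0, lnl_alt=-1498.0)
print(f"2*deltaLnL = {stat:.2f}, P = {p:.4f}, significant at 0.05: {p < 0.05}")
```

Under this test, a reported P value of 0.078 lies above the 0.05 threshold, so the null model is not rejected.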

217

218 Evolutionary rate of ATP8 in each branch of the Cetartiodactyla tree

219 Using the branch model (two-ratio model), we calculated ω values for

220 ATP8 at each branch. The interior branch from the split of Hippopotamidae to the last

221 common ancestor of Cetacea (node 205 to 204) showed a higher ω value (ω = 0.801)

222 than the other branches (Fig. 3, Table 2). LRT between M1a and MA on branches

223 leading to the last common ancestor of Cetacea showed a significant difference (P <

224 0.05) (Table 3), indicating that the evolutionary rate of ATP8 was accelerated after the

225 split of Hippopotamidae and before the divergence of Cetacea.
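
Before branch-specific ω values such as these are compared, the exclusion criteria given in Materials and Methods (dS < 0.001, or dS or dN > 2) can be applied as a simple filter. The sketch below assumes that per-branch dN and dS estimates are already available; the function name, branch names, and values are hypothetical.

```python
def is_reliable(dn, ds, min_ds=0.001, max_rate=2.0):
    """Apply the branch exclusion criteria described in the text.

    A branch is dropped when dS is too small for a stable ratio, or when
    dS or dN is large enough to suggest saturation.
    """
    if ds < min_ds:
        return False
    if ds > max_rate or dn > max_rate:
        return False
    return True


# Hypothetical per-branch estimates as (dN, dS).
branches = {
    "branch_a": (0.010, 0.050),   # kept
    "branch_b": (0.002, 0.0005),  # dropped: dS below 0.001
    "branch_c": (0.300, 2.500),   # dropped: dS above 2
}

omega_by_branch = {
    name: dn / ds
    for name, (dn, ds) in branches.items()
    if is_reliable(dn, ds)
}
print(omega_by_branch)
```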

226 Comparison of evolutionary rates with morphological and/or

227 environmental adaptations in this order showed that the cetacean ancestral branch

228 associated with marine adaptation had a significantly accelerated evolutionary rate (Fig.

229 3, Table 2). On the other hand, there was no increase in ω at the branches associated with

230 highland adaptation or body size change. For example, the ω on the branches leading

231 to the Lamini (node 213 to Lgua_NC_011822: ω = 0.142) and the Caprini (node 270 to

232 271: ω = 0.118) (Table 2) were lower than the ω across all the branches (M0: ω =

233 0.247). Likewise, although a drastic increase in body size occurred in some branches

234 (Mysticeti; node 202 to 203, Giraffini; node 222 to 223, Pecora; node 220 to 221)

235 (Hassanin and Douzery, 2003; Mitchell and Skinner, 2003; Gatesy et al., 2013), no

236 significant increase in evolutionary rate related to the change in body size was detected

237 among the branches. Because of the low substitution rates, none of these branches

238 showed a significant increase in ω values (Fig. 3, Table 2).

239

240 Positive selection of ATP8 in the cetacean branch

241 Further analysis to confirm whether positive selection or relaxation of

242 purifying selection caused acceleration of the evolutionary rate in the ATP8 gene at the

243 cetacean ancestral branch was conducted by comparing the results from the neutral

244 branch-site model (MA null) and the branch-site model allowing positive selection

245 (MA) (Yang et al., 2005; Zhang et al., 2005). A significant difference was detected in

246 the LRT between these models at the cetacean ancestral branch (P < 0.05) (Table 3).

247 This indicates that positive selection acted on ATP8 at this branch. There was no

248 evidence of relaxed or positive selection in the descendant branches leading to each

249 cetacean species. Thus, only the cetacean ancestral branch showed positive selection in

250 the ATP8 gene.

251

252 Identifying sites under positive selection in ATP8 amino acid sequences

253 Ancestral amino acid sequences at nodes 205 and 204 were inferred by

254 CODEML in the PAML package. CODEML analysis with the MA model detected two

255 substitutions (sites 52 and 54) that were under positive selection at the cetacean

256 ancestral branch (BEB analysis, P > 0.9) (Table 3).

257 Two substitutions at sites 52 and 54 were also positively selected in cetacean

258 ancestral branch based on analysis of the 21 samples, which included one representative

259 sample for almost all families represented by the 191 species. Physicochemical analysis

260 by TreeSAAP was performed on the extracted 21 samples. There were 11 amino acid

261 substitutions in ATP8 (Supplementary Fig. S1). Significant physicochemical amino acid

262 changes in sequences at the cetacean ancestral branch in ATP8 were also identified by

263 the algorithm implemented in TreeSAAP. At least one radical amino acid

264 change (category-7 and category-8 on a scale of 1–8; at P ≤ 0.001) was detected at sites

265 52, 54, and 63 (Supplementary Fig. S1).

266 Prominent substitutions were found at sites 52 and 54 in cetacean ATP8. These

267 sites were highly conserved in the artiodactyls included in this study other than

268 cetaceans (Fig. 4); substitutions occurred only in cetacean lineages, resulting in different

269 amino acid sequences in cetaceans. For example, site 52 is glutamic acid and negatively

270 charged in artiodactyl species, but is asparagine and uncharged in cetaceans.

271 Likewise, site 54 is conserved lysine and positively charged in Artiodactyla. However,

272 following multiple substitutions at this site along the cetacean lineage, the site is no

273 longer positively charged and has undergone substitutions to one of the non-charged

274 amino acids (alanine, threonine, or methionine) in cetacean species.
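
The charge changes described above can be tabulated with a short Python sketch. The charge assignments (D and E negative, K and R positive, all other residues treated as uncharged, including histidine) are a common simplification and an assumption made here; they are not part of the TreeSAAP analysis.

```python
# Simplified side-chain charges near neutral pH (assumption: residues
# not listed here, including histidine, are treated as uncharged).
SIDE_CHAIN_CHARGE = {"D": -1, "E": -1, "K": 1, "R": 1}
CHARGE_LABEL = {-1: "negative", 0: "uncharged", 1: "positive"}


def charge_label(residue):
    """Return 'negative', 'uncharged' or 'positive' for a one-letter code."""
    return CHARGE_LABEL[SIDE_CHAIN_CHARGE.get(residue.upper(), 0)]


def describe_substitution(site, ancestral, derived):
    """Summarize the charge change caused by one amino acid replacement."""
    return (f"site {site}: {ancestral} ({charge_label(ancestral)}) -> "
            f"{derived} ({charge_label(derived)})")


# Replacements at ATP8 sites 52 and 54 as described in the text.
substitutions = [(52, "E", "N"), (54, "K", "A"), (54, "K", "T"), (54, "K", "M")]
for site, ancestral, derived in substitutions:
    print(describe_substitution(site, ancestral, derived))
```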

275

276 Discussion

277 Acceleration of evolutionary rate in mtDNA of Cetartiodactyla

278 In this study, we found an accelerated dN rate and an increased ω in ATP8 in the

279 order Cetartiodactyla. Previous studies covering wide samplings of mammals also

280 noted a high nucleotide substitution rate in ATP8 (Pesole et al., 1999) and showed a

281 high incidence of radical amino acid property changes per residue (da Fonseca et al.,

282 2008). However, these studies targeted many phylogenetically distant species and

283 may therefore have overlooked subtle changes in the evolutionary mode of this order.

284 This study showed that the substitution rate of the ATP8 gene was

285 accelerated during diversification of the order Cetartiodactyla.

286 ATP8 is a subunit of mitochondrial ATP synthase (mtATPase), complex V.

287 MtATPase consists of two functional domains: the membrane-extrinsic F1 sector and the

288 membrane-bound F0 sector, and these domains constitute the rotor of mtATPase, which

289 rotates during ATP synthesis/hydrolysis. ATP is produced through phosphorylation of

290 ADP by rotation of F1 and utilizing the energy of proton transfer through the

291 membrane-embedded channel of F0 (Jonckheere et al., 2012). Some of the subunits of

292 F0 form the stator stalk, which prevents futile rotation of mtATPase during ATP

293 synthesis/hydrolysis (Stephens et al., 2003). A stator stalk consists of multiple subunits,

294 and ATP8 is only a part of the integral components of the stator stalk (Supplementary

295 Fig. S2). Compared to other proteins composing the core subunits of oxidative

296 phosphorylation (OXPHOS) complexes, ATP8 does not appear to have a pivotal

297 function in complex V. Because ATP8 has no direct influence on the function of complex

298 V, the functional restriction of this protein seems to be reduced. The relatively high

299 substitution rate of ATP8 in Cetartiodactyla is thought to be due to the relaxed

300 constraints of the protein, and the limited function of the ATP8 gene would allow

301 accumulation of substitutions.

302 Although an unusual feature of the ATP8 gene may also be responsible for

303 increased ω, it seems reasonable that acceleration of the evolutionary rate of the gene in

304 Cetartiodactyla is not related to this feature. In the Cetartiodactyla, ATP8 overlaps with

305 the ATP6 gene, and the length of the overlapping region is dependent on species (6-46

306 nucleotides). With respect to the reading frame (+1) of ATP8, the ATP6 gene starts at

307 the +3 reading frame (Supplementary Fig. S3). Due to this feature, the dS value of the

308 ATP8 gene is relatively low (Table 1). Because a substitution may be simultaneously

309 synonymous and nonsynonymous in overlapping regions, the commonly used methods

310 to calculate dS, dN, and dN/dS are not suitable (Wei and Zhang, 2014). However, the

311 length of overlapping regions of ATP8/6 is only up to 46 nucleotides in the datasets

312 generated in this study, and ω of ATP8 excluding the overlapping region was also

313 significantly higher than that of other genes. Thus, the evolutionary rate of the gene

314 appears to be genuinely accelerated in Cetartiodactyla.

315

316 No branch-specific acceleration or positive selection of ATP8 in Artiodactyla

317 (terrestrial)

318 Though the ω of ATP8 across all branches of the Artiodactyla phylogenetic tree

319 was higher than that of other genes, no association with morphological evolution or

320 environmental adaptation at specific branches was found. An accelerated evolutionary

321 rate would be expected in branches associated with the evolution of increased body size

322 such as in the branch leading to Giraffini. However, these branches showed relatively

323 low ω in ATP8. Regarding this gene, the increase in body size and acceleration in

324 evolutionary rate were not correlated. Likewise, there was no significant increase of

325 evolutionary rate in the branches related to highland adaptation. For protein-coding

326 genes other than ATP8, only models that assume a single, constant ω

327 among all branches were applied. There was no correlation between the increase in

328 the substitution rate of ATP8 and body size evolution or highland adaptation.

329

330 Positive selection of ATP8 in the cetacean ancestral branch might be related to

331 marine adaptation

332 In this study, in addition to detecting positive selection of ATP8 in the cetacean

333 lineage based on ω, it was confirmed by CODEML and TreeSAAP that substitutions at

334 sites 52 and 54 were subjected to positive selective pressure. Amino acid sites 52 and 54

335 are mostly conserved in the Artiodactyla (glutamic acid, E, and lysine, K). ATP8,

336 an integral component of the stator stalk, is involved in the assembly of

337 multiple subunits (Hadikusumo et al., 1988; Jonckheere et al., 2012). In particular,

338 positively charged amino acid regions at the C terminus are strongly involved in both

339 the assembly and function of the F0 sector (Stephens et al., 2003). The 54K (lysine),

340 which is considered to be under positive selective pressure in this study, is a positively

341 charged amino acid present in the C terminus region. The substitution at 54K in a

342 cetacean ancestor may cause functional changes to this protein. Although the function of

343 this residue is unknown, a substitution at 52E is also observed in the ancestor of

344 cetaceans and carries a change in charge (glutamic acid, E to asparagine, N). It seems

345 that this substitution may also be related to the functional change in ATP8. Accumulation

346 of substitutions and especially substitutions at 52E and 54K in cetacean ancestors might

347 contribute to cetacean-specific functional changes in ATP8.

348 We suspect that these changes may be related to marine adaptation in cetaceans.

349 The common ancestor of hippos and cetaceans may have been a terrestrial or

350 semi-aquatic species, and the cetacean lineage later moved from land to sea (Gatesy et al.,

351 2013). Thus, it is expected that physiological adaptation to the aquatic environment

352 occurred in the cetacean ancestors. Because mammals have pulmonary respiration,

353 hypoxia (reduction in convective oxygen delivery in blood and tissues) is recognized as

354 one of the biggest hurdles to adapting to the aquatic environment, and aerobic

355 metabolism is known to be reduced under hypoxic conditions (Kooyman et al., 1981;

356 Tomanek, 2014). In addition to hypoxia tolerance, brain size expansion and muscle

357 functional changes that promote metabolism under anaerobic conditions compared to

358 terrestrial mammals are characteristic phenotypic changes observed in the evolution of

359 cetaceans (Kooyman et al., 1981; Marino et al., 2007). Since brain and muscle activity

360 is directly related to metabolism and ATP synthesis (Erecińska and Silver, 1989;

361 Korzeniewski, 1998), genes involved in this energy synthesis might be strongly related

362 to marine adaptation in cetaceans. An increase in the substitution rate of several nuclear

363 genes involved in metabolism has already been detected in marine mammals (Foote et

364 al., 2015; Chikina et al., 2016). Here, positive selection was detected in ATP8, a

365 mtDNA-encoded gene involved in ATP synthesis, and these results suggest that

366 changes in metabolic activity are important for marine adaptation in cetaceans.

367 Finally, hypotheses of how functional changes in ATP8 contribute to adaptation to

368 the marine environment are considered. First, due to functional changes of ATP8, the

369 function of the stator stalk to prevent futile rotation of mtATPase may have been strengthened. In

370 cetaceans, reduction of oxygen consumption is critical because of hypoxic conditions.

371 In the Cetacea, the functional change of ATP8 may have occurred to prevent futile

372 rotation of mtATPase. This change would strongly suppress consumption of oxygen.

373 Second, the function of the stator stalk may have been lost due to the functional change

374 of ATP8. In cetaceans, aerobic respiration becomes inactive because of hypoxia (Tomanek,

375 2014), making the function of limiting futile rotation of mtATPase less necessary.

376 Although it is difficult to elucidate the functional changes of ATP8 in detail based on the

377 results of this study alone, the accumulation of amino acid substitutions, including

378 substitutions at 52E and 54K of ATP8, is strongly suggested to be involved in the

379 evolution of energy synthesis in cetaceans, which may have been utilized for adaptation

380 to the marine environment.

381

382

383 Conclusion

384 In this study, the evolutionary rate of ATP8 was found to be relatively higher

385 than that of other mitochondrial genes in Cetartiodactyla and positive selection was

386 detected in the cetacean ancestral branch. Despite high conservation of mtDNA, ATP8

387 was exposed to relaxed selective pressures. Accumulation of amino acid substitutions

388 and point mutations (at sites 52 and 54) under positive selective pressure of ATP8 was

389 detected in the cetacean ancestors. These functional changes in cetacean ATP8 are

390 suggested to have made changes to ATP synthesis status possible, and this change may be

391 related to adaptation to the aquatic environment. Further research focusing on amino

392 acid substitution rates in the ATP8 gene across mammal species will contribute to a

393 better understanding of the molecular evolution of mtDNA linked to adaptive evolution

394 in mammals.

395

396

397 Acknowledgements

398 We thank Profs. Hitoshi Suzuki, Masashi Ohara and Toru Miura for their critical

399 review of the manuscript, Drs. Yoshinobu Hayashi and Gohta Kinoshita for their many

400 helpful discussions, and Dr. Rumiko Suzuki for help with English editing. This work

401 was supported in part by the Spatiotemporal Genomics Project promoted by University

402 of the Ryukyus and a KAKENHI Grant-in-Aid for Young Scientists (B) to M.M. (No.

403 JP16K18613).

404

405

406 References

407 Brown, D.M., Brenneman, R.A., Koepfli, K.P., Pollinger, J.P., Milá, B., Georgiadis, N.J.,

408 Louis, E.E.Jr, Grether, G.F., Jacobs, D.K., and Wayne, R.K. (2007) Extensive

409 population genetic structure in the . BMC Biol. 5, 57.

Caballero, S., Duchêne, S., Garavito, M.F., Slikas, B., and Baker, C.S. (2015) Initial evidence for adaptive selection on the NADH subunit Two of freshwater dolphins by analyses of mitochondrial genomes. PLoS One 10, e0123543.

Chikina, M., Robinson, J.D., and Clark, N.L. (2016) Hundreds of genes experienced convergent shifts in selective pressure in marine mammals. Mol. Biol. Evol. 33, 2182-2192.

da Fonseca, R.R., Johnson, W.E., O'Brien, S.J., Ramos, M.J., and Antunes, A. (2008) The adaptive evolution of the mammalian mitochondrial genome. BMC Genomics 9, 119.

Erecińska, M., and Silver, I.A. (1989) ATP and brain function. J. Cereb. Blood Flow Metab. 9, 2-19.

Finch, T.M., Zhao, N., Korkin, D., Frederick, K.H., and Eggert, L.S. (2014) Evidence of positive selection in mitochondrial complexes I and V of the African elephant. PLoS One 9, e92587.

Foote, A.D., Liu, Y., Thomas, G.W.C., Vinař, T., Alföldi, J., Deng, J., Dugan, S., van Elk, C.E., Hunter, M.E., Joshi, V., et al. (2015) Convergent evolution of the genomes of marine mammals. Nat. Genet. 47, 272-275.

Foote, A.D., Morin, P.A., Durban, J.W., Pitman, R.L., Wade, P., Willerslev, E., Gilbert, M.T., and da Fonseca, R.R. (2011) Positive selection on the killer whale mitogenome. Biol. Lett. 7, 116-118.

Gatesy, J., Geisler, J.H., Chang, J., Buell, C., Berta, A., Meredith, R.W., Springer, M.S., and McGowen, M.R. (2013) A phylogenetic blueprint for a modern whale. Mol. Phylogenet. Evol. 66, 479-506.

Hadikusumo, R.G., Meltzer, S., Choo, W.M., Jean-François, M.J.B., Linnane, A.W., and Marzuki, S. (1988) The definition of mitochondrial H+ ATPase assembly defects in mit− mutants of Saccharomyces cerevisiae with a monoclonal antibody to the enzyme complex as an assembly probe. Biochim. Biophys. Acta 933, 212-222.

Hassanin, A., Delsuc, F., Ropiquet, A., Hammer, C., van Vuuren, B.J., Matthee, C., Ruiz-Garcia, M., Catzeflis, F., Areskoug, V., Nguyen, T.T., et al. (2012) Pattern and timing of diversification of Cetartiodactyla (Mammalia, Laurasiatheria), as revealed by a comprehensive analysis of mitochondrial genomes. C. R. Biol. 335, 32-50.

Hassanin, A., and Douzery, E.J.P. (2003) Molecular and morphological phylogenies of Ruminantia and the alternative position of the Moschidae. Syst. Biol. 52, 206-228.

Hassanin, A., Ropiquet, A., Couloux, A., and Cruaud, C. (2009) Evolution of the mitochondrial genome in mammals living at high altitude: new insights from a study of the tribe Caprini (Bovidae, Antilopinae). J. Mol. Evol. 68, 293-310.

Jonckheere, A.I., Smeitink, J.A.M., and Rodenburg, R.J.T. (2012) Mitochondrial ATP synthase: architecture, function and pathology. J. Inherit. Metab. Dis. 35, 211-225.

Katoh, K., and Standley, D.M. (2013) MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772-780.

Kolesnikov, A.A., and Gerasimov, E.S. (2012) Diversity of mitochondrial genome organization. Biochemistry (Mosc) 77, 1424-1435.

Kooyman, G.L., Castellini, M.A., and Davis, R.W. (1981) Physiology of diving in marine mammals. Annu. Rev. Physiol. 43, 343-356.

Korzeniewski, B. (1998) Regulation of ATP supply during muscle contraction: theoretical studies. Biochem. J. 330, 1189-1195.

Kumar, S., Stecher, G., and Tamura, K. (2016) MEGA7: molecular evolutionary genetics analysis version 7.0 for bigger datasets. Mol. Biol. Evol. 33, 1870-1874.

Marino, L., Connor, R.C., Fordyce, R.E., Herman, L.M., Hof, P.R., Lefebvre, L., Lusseau, D., McCowan, B., Nimchinsky, E.A., Pack, A.A., et al. (2007) Cetaceans have complex brains for complex cognition. PLoS Biol. 5, e139.

Martin, A.P., and Palumbi, S.R. (1993) Body size, metabolic rate, generation time, and the molecular clock. Proc. Natl. Acad. Sci. USA 90, 4087-4091.

Messier, W., and Stewart, C.B. (1997) Episodic adaptive evolution of primate lysozymes. Nature 385, 151-154.

Mitchell, G., and Skinner, J.D. (2003) On the origin, evolution and phylogeny of giraffes Giraffa camelopardalis. Trans. R. Soc. S. Afr. 58, 51-73.

Morin, P.A., Archer, F.I., Foote, A.D., Vilstrup, J., Allen, E.E., Wade, P., Durban, J., Parsons, K., Pitman, R., Li, L., et al. (2010) Complete mitochondrial genome phylogeographic analysis of killer whales (Orcinus orca) indicates multiple species. Genome Res. 20, 908-916.

Pesole, G., Gissi, C., De Chirico, A., and Saccone, C. (1999) Nucleotide substitution rate of mammalian mitochondrial genomes. J. Mol. Evol. 48, 427-434.

Ridgway, S.H. (1997) Who are the whales? Bioacoustics 8, 3-20.

Rössner, G.E. (2007) Family Tragulidae. In The Evolution of Artiodactyls. (eds.: Prothero, D.R. and Foss, S.E.), pp. 213-220. Johns Hopkins University Press, Baltimore.

Saraste, M. (1999) Oxidative phosphorylation at the fin de siècle. Science 283, 1488-1493.

Stamatakis, A. (2014) RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30, 1312-1313.

Stephens, A.N., Khan, M.A., Roucou, X., Nagley, P., and Devenish, R.J. (2003) The molecular neighborhood of subunit 8 of yeast mitochondrial F1F0-ATP synthase probed by cysteine scanning mutagenesis and chemical modification. J. Biol. Chem. 278, 17867-17875.

Swanson, W.J., Nielsen, R., and Yang, Q. (2003) Pervasive adaptive evolution in mammalian fertilization proteins. Mol. Biol. Evol. 20, 18-20.

Thompson, J.D., Higgins, D.G., and Gibson, T.J. (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673-4680.

Tian, R., Wang, Z., Niu, X., Zhou, K., Xu, S., and Yang, G. (2016) Evolutionary genetics of hypoxia tolerance in cetaceans during diving. Genome Biol. Evol. 8, 827-839.

Tomanek, L. (2014) Proteomics to study adaptations in marine organisms to environmental stress. J. Proteomics 105, 92-106.

Tomasco, I.H., and Lessa, E.P. (2011) The evolution of mitochondrial genomes in subterranean caviomorph rodents: adaptation against a background of purifying selection. Mol. Phylogenet. Evol. 61, 64-70.

Villanueva-Cañas, J.L., Laurie, S., and Albà, M.M. (2013) Improving genome-wide scans of positive selection by using protein isoforms of similar length. Genome Biol. Evol. 5, 457-467.

Wallace, D.C. (2007) Why do we still have a maternally inherited mitochondrial DNA? Insights from evolutionary medicine. Annu. Rev. Biochem. 76, 781-821.

Weber, J. (2007) ATP synthase–the structure of the stator stalk. Trends Biochem. Sci. 32, 53-56.

Wei, X., and Zhang, J. (2014) A simple method for estimating the strength of natural selection on overlapping genes. Genome Biol. Evol. 7, 381-390.

Woolley, S., Johnson, J., Smith, M.J., Crandall, K.A., and McClellan, D.A. (2003) TreeSAAP: Selection on amino acid properties using phylogenetic trees. Bioinformatics 19, 671-672.

Xiufeng, X., and Arnason, U. (1994) The complete mitochondrial DNA sequence of the horse, Equus caballus: extensive heteroplasmy of the control region. Gene 148, 357-362.

Yang, Z. (1998) Likelihood ratio tests for detecting positive selection and application to primate lysozyme evolution. Mol. Biol. Evol. 15, 568-573.

Yang, Z. (2007) PAML 4: phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 24, 1586-1591.

Yang, Z., and Nielsen, R. (1998) Synonymous and nonsynonymous rate variation in nuclear genes of mammals. J. Mol. Evol. 46, 409-418.

Yang, Z., Wong, W.S., and Nielsen, R. (2005) Bayes empirical Bayes inference of amino acid sites under positive selection. Mol. Biol. Evol. 22, 1107-1118.

Zhang, J., Nielsen, R., and Yang, Z. (2005) Evaluation of an improved branch-site likelihood method for detecting positive selection at the molecular level. Mol. Biol. Evol. 22, 2472-2479.

Figure legends

Fig. 1. Phylogeny of Cetartiodactyla based on complete mitochondrial genomes

The ML phylogenetic tree was inferred with the RAxML program. Values at the nodes are node IDs (left) and bootstrap probabilities (BP) (right). Thickened colored bars mark branches leading to morphological evolution or environmental adaptation (black: increased body size and highland adaptation; blue: underwater adaptation).

Fig. 2. Nonsynonymous substitution rate (dN) and dN/dS ratio (ω) of each mitochondrial protein-coding gene

A common ω (identical for all branches) was calculated using the M0 model. dN and ω are represented by grey and black bars, respectively. The individual values are shown in Table 1.

Fig. 3. Nonsynonymous substitution rate (dN) and ω (dN/dS ratio) of ATP8 in each branch

These values were computed using the branch-site model (two-ratio model). Node numbers correspond to those in Fig. 1. Branches for which ω could not be calculated because of extremely low nonsynonymous or synonymous substitution rates are not displayed. dN and ω are represented by grey and black bars, respectively. The ω values are listed in Table 2.

Fig. 4. Amino acid differences among 21 samples, which include one representative sample for almost all families represented by the 191 species

The rectangle encloses the sequences of cetacean species. Amino acid sites 52 (E) and 54 (K) were conserved in Artiodactyla (terrestrial).

Supplementary data

Supplementary Fig. S1. Radical (red) and conservative (blue) amino acid changes in ATP8 in the cetacean common ancestor

The common ancestral sequence of Cetacea was determined using PAML and TreeSAAP. Amino acid changes that caused radical or conservative changes in amino acid properties were assessed with TreeSAAP. Conservative changes correspond to categories 1 and 2, and radical changes to categories 7 and 8 (P ≤ 0.001).

Supplementary Fig. S2. The structure of mtATPase

The schematic structure of mtATPase is illustrated with reference to Weber (2007). MtATPase consists of two domains, F1 and F0. Each subunit constituting the domains is shown in a different color. ATP8 forms part of the integral component of the stator stalk, shown in yellow.

Supplementary Fig. S3. Up to 46 nucleotides overlap in the reading frames of the cetartiodactyl mitochondrial genes ATP8 and ATP6

For each nucleotide triplet (square brackets), the amino acid is given in one-letter code, read either in the +1 frame for ATP8 (orange) or in the +3 frame for ATP6 (grey).
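The frame relationship described in this legend can be illustrated with a minimal Python sketch. It assumes the vertebrate mitochondrial genetic code (NCBI translation table 2) and numbers frames relative to the start codon of the upstream gene. The function names, the toy sequence and the 204-nt gene length are hypothetical and chosen only for illustration; they are not taken from the alignment used in this study.

```python
# Vertebrate mitochondrial genetic code (NCBI translation table 2),
# with bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(seq, offset=0):
    """Translate seq starting at a 0-based offset.

    A trailing partial codon is ignored. Only A, C, G and T are handled;
    any other character raises KeyError.
    """
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


def downstream_frame(upstream_len_nt, overlap_nt):
    """Frame (+1, +2 or +3) of a gene that starts overlap_nt before the
    upstream gene ends, numbered from the upstream gene's start codon."""
    return (upstream_len_nt - overlap_nt) % 3 + 1


if __name__ == "__main__":
    toy = "ATGCCACAACTGGACACA"  # hypothetical sequence, not real ATP8
    print("frame +1:", translate(toy, offset=0))
    print("frame +3:", translate(toy, offset=2))
    # Hypothetical upstream gene of 204 nt with a 46-nt overlap.
    print("downstream gene frame: +%d" % downstream_frame(204, 46))
```

Because 46 leaves a remainder of 1 when divided by 3, a downstream gene that overlaps the last 46 nucleotides of an upstream gene whose length is a multiple of 3 is read in the +3 frame, in line with the legend.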

Supplementary Table S1. Species with complete mitochondrial DNA sequences included in this study

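Table 1. Scores of nonsynonymous substitution rate (dN), synonymous substitution rate (dS) and ω (dN/dS ratio) for each mitochondrial protein-coding gene (site model)

Gene | M0 ω | M0 dN | M0 dS | M1a ω | M1a dN | M1a dS | M8 ω | M8 dN | M8 dS | M8a ω | M8a dN | M8a dS | M8 vs M8a probability
ND1 | 0.017 | 0.004 | 0.229 | 0.034 | 0.009 | 0.216 | - | - | - | - | - | - |
ND2 | 0.049 | 0.010 | 0.206 | 0.060 | 0.014 | 0.229 | - | - | - | - | - | - |
COX1 | 0.014 | 0.003 | 0.195 | 0.010 | 0.002 | 0.237 | - | - | - | - | - | - |
COX2 | 0.003 | 0.001 | 0.252 | 0.003 | 0.001 | 0.254 | - | - | - | - | - | - |
ATP8 | 0.247 | 0.020 | 0.082 | 0.317 | 0.025 | 0.080 | 0.285 | 0.024 | 0.085 | 0.256 | 0.023 | 0.088 | 0.078
ATP6 | 0.048 | 0.008 | 0.158 | 0.084 | 0.012 | 0.149 | - | - | - | - | - | - |
COX3 | 0.019 | 0.003 | 0.180 | 0.049 | 0.008 | 0.164 | - | - | - | - | - | - |
ND3 | 0.030 | 0.006 | 0.210 | 0.050 | 0.010 | 0.207 | - | - | - | - | - | - |
ND4L | 0.030 | 0.006 | 0.216 | 0.056 | 0.011 | 0.200 | - | - | - | - | - | - |
ND4 | 0.031 | 0.007 | 0.212 | 0.063 | 0.013 | 0.205 | - | - | - | - | - | - |
ND6 | 0.032 | 0.008 | 0.243 | 0.057 | 0.014 | 0.245 | - | - | - | - | - | - |
Cytb | 0.025 | 0.005 | 0.219 | 0.045 | 0.009 | 0.206 | - | - | - | - | - | - |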

Table 2. Scores of nonsynonymous substitution rate (dN), synonymous substitution rate (dS) and ω (dN/dS ratio) for ATP8 at each branch (two-ratio model)

Branch | ω | dN | dS
220--->221 (Pecora ancestral branch) | 0.584 | 0.027 | 0.047
222--->223 (Giraffini ancestral branch) | 0.298 | 0.081 | 0.273
202--->203 (Mysticeti ancestral branch) | 0.231 | 0.014 | 0.059
205--->204 (Cetacean ancestral branch) | 0.801 | 0.122 | 0.153
213--->Lgua_NC_011822 (Lamini ancestral branch) | 0.142 | 0.054 | 0.378
270--->271 (Caprini ancestral branch) | 0.118 | 0.039 | 0.329
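As a quick consistency check on Table 2, ω can be recomputed as dN/dS from the rounded values in the table. The Python sketch below is illustrative only. In a likelihood analysis ω is estimated as a model parameter, so small differences from the ratio of the rounded dN and dS values are expected. The names `TABLE2`, `recomputed_omega` and `check_table` and the tolerance of 0.02 are assumptions made for this example, not part of the original analysis.

```python
# Values copied from Table 2: branch -> (reported omega, dN, dS).
TABLE2 = {
    "Pecora ancestral branch": (0.584, 0.027, 0.047),
    "Giraffini ancestral branch": (0.298, 0.081, 0.273),
    "Mysticeti ancestral branch": (0.231, 0.014, 0.059),
    "Cetacean ancestral branch": (0.801, 0.122, 0.153),
    "Lamini ancestral branch": (0.142, 0.054, 0.378),
    "Caprini ancestral branch": (0.118, 0.039, 0.329),
}


def recomputed_omega(dn, ds):
    """Return dN/dS; dS must be positive for the ratio to be defined."""
    if ds <= 0:
        raise ValueError("dS must be positive")
    return dn / ds


def check_table(table, tol=0.02):
    """Map each branch to (reported omega, recomputed omega, within tolerance)."""
    result = {}
    for branch, (reported, dn, ds) in table.items():
        ratio = recomputed_omega(dn, ds)
        result[branch] = (reported, ratio, abs(ratio - reported) <= tol)
    return result


if __name__ == "__main__":
    for branch, (reported, ratio, ok) in check_table(TABLE2).items():
        print(f"{branch}: reported {reported:.3f}, dN/dS {ratio:.3f}, consistent={ok}")
```

The tolerance is deliberately loose because dN and dS are reported to only three decimal places.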

Table 3. Scores of ω (dN/dS ratio) of ATP8 using the branch-site model (MA) in node 205 to 204 (cetacean ancestral branch)

Branch | Site class | MA null background ω | MA null foreground ω | MA background ω | MA foreground ω | M1a vs MA probability | MA null vs MA probability | Positively selected sites (BEB)
node 205--->204 | 2a | 0.082 | 0.400 | 0.081 | 14.568 | P < 0.05 | P < 0.05 | 11, 22, 52, 54

Note: Positively selected sites are inferred at P > 0.8, with those reaching 0.90 shown in bold.
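The probabilities in the model-comparison columns come from likelihood ratio tests between nested models. The Python sketch below shows how such a P value is commonly obtained from two log-likelihoods. It assumes a chi-square reference distribution with 1 or 2 degrees of freedom, which is an assumption of this example, since the degrees of freedom used for each comparison are not given in the table. The function names and the log-likelihood values in the usage lines are hypothetical.

```python
import math


def chi2_sf(stat, df):
    """Upper-tail probability of a chi-square variable (closed forms for df 1 and 2)."""
    if stat < 0:
        raise ValueError("test statistic must be non-negative")
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("only df = 1 or df = 2 are implemented in this sketch")


def lrt_pvalue(lnl_null, lnl_alt, df):
    """Return (2 * delta lnL, P value) for a nested model comparison."""
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return stat, chi2_sf(stat, df)


if __name__ == "__main__":
    # Hypothetical log-likelihoods, for illustration only.
    stat, p = lrt_pvalue(lnl_null=-1000.0, lnl_alt=-997.0, df=1)
    print(f"2*delta lnL = {stat:.3f}, P = {p:.4f}")
```

Under these assumptions, P < 0.05 corresponds to a test statistic above about 3.84 with 1 degree of freedom, or about 5.99 with 2 degrees of freedom.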

Figure 1

Figure 2 [bar chart; x-axis: ND1, ND2, COX1, COX2, ATP8, ATP6, COX3, ND3, ND4L, ND4, ND6, Cytb; y-axis ticks from 0 to 0.25]

Figure 3 [bar chart; branches from top to bottom: 270--->271 (Caprini ancestral branch), 213--->Lgua_NC_011822 (Lamini ancestral branch), 205--->204 (Cetacean ancestral branch), 202--->203 (Mysticeti ancestral branch), 222--->223 (Giraffini ancestral branch), 220--->221 (Pecora ancestral branch); x-axis ticks from 0 to 0.9]

Figure 4

Table S1.

Species name | Common name | Accession ID | Used for alignment
Balaena mysticetus | bowhead whale | NC_005268 | Yes
Eubalaena australis | Southern right whale | NC_006930 | Yes
Eubalaena japonica | North Pacific right whale | NC_006931 | Yes
Balaenoptera acutorostrata | Northern minke whale | NC_005271 | Yes
Balaenoptera bonaerensis | Southern minke whale | NC_006926 | Yes
Balaenoptera borealis | Sei whale | NC_006929 | Yes
Balaenoptera edeni brydei | Bryde's whale | NC_006928 | Yes
Balaenoptera edeni edeni | Eden's whale | NC_007938 | Yes
Balaenoptera musculus | blue whale | NC_001601 | Yes
Balaenoptera omurai | Omura's whale | NC_007937 | Yes
Balaenoptera physalus | fin whale | NC_001321 | Yes
Megaptera novaeangliae | humpback whale | NC_006927 | Yes
Eschrichtius robustus | gray whale | NC_005270 | Yes
Caperea marginata | pygmy right whale | NC_005269 | Yes
Delphinus capensis | common dolphin | NC_012061 | Yes
Cephalorhynchus heavisidii | Haviside's dolphin | JN632624 | Yes
Grampus griseus | Risso's dolphin | NC_012062 | Yes
Lagenorhynchus albirostris | white-beaked dolphin | NC_005278 | Yes
Sousa chinensis | humpback dolphin | NC_012057 | Yes
Stenella attenuata | Pantropical spotted dolphin | NC_012051 | Yes
Stenella coeruleoalba | striped dolphin | NC_012053 | Yes
Tursiops aduncus | Indo-Pacific bottlenose dolphin | NC_012058 | Yes
Tursiops truncatus | common bottlenose dolphin | NC_012059 | Yes
Inia geoffrensis | Amazon river dolphin | NC_005276 | Yes
Pontoporia blainvillei | La Plata dolphin | NC_005277 | Yes
Lipotes vexillifer | Yangtze river dolphin | NC_007629 | Yes
Monodon monoceros | narwhal | NC_005279 | Yes
Phocoena phocoena | harbor porpoise | NC_005280 | Yes
Physeter macrocephalus | sperm whale | NC_002503 | Yes
Kogia breviceps | pygmy sperm whale | NC_005272 | Yes
Platanista gangetica | Ganges river dolphin | NC_005275 | Yes
Berardius bairdii | Baird's beaked whale | NC_005274 | Yes
Hyperoodon ampullatus | Northern bottlenose whale | NC_005273 | Yes
amphibius | hippopotamus | NC_000889 | Yes
Choeropsis liberiensis |  | JN632625 | Yes
Antilocapra americana |  | JN632597 | Yes
taurus taurus | domestic cattle | EU177832 | Yes
Bos taurus indicus | zebu cattle | EU177868 | Yes
Bos gaurus | gaur | JN632604 | Yes
Bos grunniens | yak | NC_006380 | Yes
Bos javanicus javanicus | banteng | JN632606 | Yes
Bos javanicus birmanicus | banteng | JN632605 | Yes
bison | American bison | JN632601 | Yes
Bison bonasus | European bison | JN632602 | Yes
bubalis bubalis | River buffalo | AF547270 | No
Bubalus bubalis carabanesis | Swamp buffalo | JN632607 | Yes
Bubalus depressicornis | anoa | EF536351 | Yes
Syncerus caffer |  | EF536353 | Yes
Pseudoryx nghetinhensis |  | EF536352 | Yes
Boselaphus tragocamelus |  | EF536350 | Yes
Tetracerus quadricornis | chousingha | EF536355 | Yes
angasii | nyala | JN632702 | Yes
Tragelaphus derbianus | giant eland | EF536354 | Yes
Tragelaphus eurycerus | bongo | JN632703 | Yes
Tragelaphus imberbis | lesser | EF536356 | Yes
Tragelaphus | eland | JN632704 | Yes
Tragelaphus scriptus isolate1 | bushbuck | JN632705 | Yes
Tragelaphus scriptus isolate2 | bushbuck | JN632706 | Yes
Tragelaphus scriptus isolate3 | bushbuck | JN632707 | Yes
Tragelaphus spekii | sitatunga | EF536357 | Yes
Tragelaphus strepsiceros | greater kudu | JN632708 | Yes
Antilope cervicapra |  | JN632598 | Yes
Antidorcas marsupialis |  | JN632596 | Yes
Dorcatragus megalotis | antelope | JN632631 | Yes
rufifrons isolate1 | red-fronted | JN632634 | Yes
Eudorcas rufifrons isolate2 | red-fronted gazelle | JN632633 | Yes
Gazella bennettii | Indian gazelle | JN632635 | Yes
Gazella cuvieri | Cuvier's gazelle | JN632636 | Yes
Gazella dorcas osiris | Dorcas gazelle | JN632637 | Yes
Gazella dorcas pelzelnii | Dorcas gazelle | JN632638 | Yes
Gazella gazella gazella | mountain gazelle | JN632640 | Yes
Gazella gazella erlangeri | mountain gazelle | JN632639 | Yes
Gazella leptoceros | Rhim gazelle | JN632641 | Yes
Gazella spekei | Speke's gazelle | JN632642 | Yes
Gazella subgutturosa subgutturosa | goitered gazelle | JN632644 | Yes
Gazella subgutturosa marica | goitered gazelle | JN632643 | Yes
Litocranius walleri |  | JN632653 | Yes
Madoqua kirkii | Kirk's dik-dik | JN632654 | Yes
Madoqua saltiana | Salt's dik-dik | JN632655 | Yes
 | Dama gazelle | JN632665 | Yes
Nanger granti | Grant's gazelle | JN632666 | Yes
Nanger soemmerringii | Soemmerring's gazelle | JN632667 | Yes
Ourebia ourebi |  | JN632680 | Yes
gutturosa | Mongolian gazelle | JN632689 | Yes
campestris | steenbok | JN632693 | Yes
Saiga tatarica |  | JN632700 | Yes
Aepyceros melampus |  | JN632592 | Yes
Alcelaphus buselaphus buselaphus |  | JN632593 | Yes
Alcelaphus buselaphus lichtensteinii | hartebeest | JN632594 | Yes
Connochaetes gnou | black | JN632626 | Yes
Connochaetes taurinus isolate1 | blue wildebeest | JN632628 | Yes
Connochaetes taurinus isolate2 | blue wildebeest | JN632627 | Yes
pygargus | blesbok | FJ207530 | No
Ammotragus lervia | aoudad | FJ207522 | No
Arabitragus jayakari | Arabian | FJ207523 | No
Budorcas taxicolor |  | FJ207524 | No
caucasica | West Caucasian tur | JN632609 | Yes
Capra falconeri | markhor | FJ207525 | No
Capra hircus | domestic goat | GU295658 | Yes
Capra ibex | Alpine ibex | FJ207526 | No
Capra nubiana | Nubian ibex | FJ207527 | No
Capra pyrenaica | Spanish ibex | FJ207528 | No
Capra sibirica | Siberian ibex | FJ207529 | No
Capricornis crispus | Japanese | FJ207533 | No
Capricornis milneedwardsii | Sumatran serow | FJ207534 | No
Capricornis swinhoei | Taiwan serow | NC_010640 | Yes
Hemitragus jemlahicus | Himalayan tahr | FJ207531 | No
Naemorhedus baileyi | red | JN632663 | Yes
Naemorhedus griseus griseus | Chinese goral | FJ207532 | No
Naemorhedus griseus evansi | Chinese goral | JN632664 | Yes
Oreamnos americanus | Rocky | FJ207535 | No
Ovibos moschatus |  | FJ207536 | No
aries | domestic sheep | NC_001941 | Yes
Pantholops hodgsonii | chiru | NC_007441 | Yes
nayaur | bharal | FJ207537 | No
pyrenaica | isard | FJ207538 | No
Rupicapra rupicapra | chamois | FJ207539 | No
Cephalophus adersi | Aders's | JN632611 | Yes
Cephalophus callipygus isolate1 | Peters's duiker | JN632613 | Yes
Cephalophus callipygus isolate2 | Peters's duiker | JN632614 | Yes
Cephalophus callipygus isolate3 | Peters's duiker | JN632612 | Yes
Cephalophus dorsalis | bay duiker | JN632615 | Yes
Cephalophus jentinki | Jentink's duiker | JN632616 | Yes
Cephalophus leucogaster | white-bellied duiker | JN632617 | Yes
Cephalophus natalensis | Natal duiker | JN632618 | Yes
Cephalophus nigrifrons | black-fronted duiker | JN632619 | Yes
Cephalophus ogilbyi | Ogilby's duiker | JN632620 | Yes
Cephalophus rufilatus | red-flanked duiker | JN632621 | Yes
Cephalophus silvicultor | yellow-backed duiker | JN632622 | Yes
Cephalophus spadix | Abbott's duiker | JN632623 | Yes
Philantomba maxwellii | Maxwell's duiker | JN632685 | Yes
Philantomba monticola isolate1 | blue duiker | JN632686 | Yes
Philantomba monticola isolate2 | blue duiker | JN632687 | Yes
Sylvicapra grimmia | bush duiker | JN632701 | Yes
equinus | roan antelope | JN632647 | Yes
Hippotragus niger | sable antelope | JN632648 | Yes
nasomaculatus | addax | JN632591 | Yes
Oryx beisa | beisa | JN632676 | Yes
Oryx dammah | scimitar-horned oryx | JN632677 | Yes
Oryx gazella | gemsbok | JN632678 | Yes
Oryx leucoryx | Arabian oryx | JN632679 | Yes
batesi | Bates's pygmy antelope | JN632668 | Yes
Nesotragus moschatus | suni | JN632669 | Yes
Oreotragus oreotragus |  | JN632675 | Yes
Redunca arundinum | southern | JN632694 | Yes
Redunca fulvorufula | mountain reedbuck | JN632695 | Yes
ellipsiprymnus | defassa waterbuck | JN632651 | Yes
Kobus leche | lechwe | JN632652 | Yes
Pelea | common rhebok | JN632684 | Yes
elaphus | red | NC_007704 | Yes
Cervus nippon centralis | sika deer | NC_006993 | Yes
Cervus nippon taiouanus | sika deer | NC_008462 | Yes
Cervus nippon yakushimae | sika deer | NC_007179 | Yes
Axis axis | deer | JN632599 | Yes
Axis porcinus | hog deer | JN632600 | Yes
Dama dama dama | fallow deer | JN632629 | Yes
Dama dama mesopotamica | fallow deer | JN632630 | Yes
Elaphurus davidianus | Père David's deer | JN632632 | Yes
Przewalskium albirostris | Thorold's deer | JN632690 | Yes
duvauceli | barasingha | JN632696 | Yes
Rucervus eldi | Eld's deer | JN632697 | Yes
alfredi | Visayan spotted deer | JN632698 | Yes
Rusa timorensis | rusa deer | JN632699 | Yes
Rusa unicolor | sambar | NC_008414 | Yes
Elaphodus cephalophus |  | NC_008749 | Yes
Muntiacus crinifrons | black | NC_004577 | Yes
Muntiacus muntjak | Indian muntjac | NC_004563 | Yes
Muntiacus reevesi | Chinese muntjac | NC_008491 | Yes
Muntiacus vuquangensis | giant muntjac | FJ705435 | Yes
Capreolus capreolus | roe deer | JN632610 | Yes
Hydropotes inermis | Chinese | JN632649 | Yes
Alces alces |  | JN632595 | Yes
hemionus | mule deer | JN632670 | Yes
Odocoileus virginianus isolate1 | white-tailed deer | JN632671 | Yes
Odocoileus virginianus isolate2 | white-tailed deer | JN632673 | Yes
Odocoileus virginianus isolate3 | white-tailed deer | JN632672 | Yes
Blastocerus dichotomus |  | JN632603 | Yes
antisensis | taruca | JN632646 | Yes
Mazama americana isolate1 | red brocket | JN632656 | Yes
Mazama americana isolate2 | red brocket | JN632657 | Yes
Mazama nemorivaga isolate1 | gray brocket | JN632660 | Yes
Mazama nemorivaga isolate2 | gray brocket | JN632659 | Yes
Mazama gouazoubira | gray brocket | JN632658 | Yes
Mazama rufina | little red brocket | JN632661 | Yes
Ozotoceros bezoarticus |  | JN632681 | Yes
Pudu mephistophiles | Northern pudu | JN632691 | Yes
Pudu puda | Southern pudu | JN632692 | Yes
Rangifer tarandus |  | NC_007703 | Yes
Giraffa camelopardalis antiquorum | Kordofan giraffe | JN632645 | Yes
Giraffa camelopardalis angolensis | Angolan giraffe | NC_012100 | Yes
Okapia johnstoni |  | JN632674 | Yes
Moschus berezovskii | dwarf | NC_012694 | Yes
Moschus moschiferus | Siberian musk deer | JN632662 | Yes
kanchil | lesser mouse-deer | JN632709 | Yes
Hyemoschus aquaticus | water | JN632650 | Yes
Sus barbatus | bearded | EF545592 | Yes
Sus scrofa | wild boar | FJ237000 | No
Phacochoerus africanus | common | NC_008830 | Yes
porcus | red river hog | JN632688 | Yes
Pecari tajacu isolate1 | collared | JN632682 | Yes
Pecari tajacu isolate2 |  | NC_012103 | Yes
Pecari tajacu isolate3 | collared peccary | JN632683 | Yes
Camelus bactrianus | domestic Bactrian | NC_009628 | Yes
Camelus ferus | wild Bactrian camel | NC_009629 | Yes
Camelus dromedarius | Arabian camel | JN632608 | Yes
guanicoe | llama | NC_011822 | Yes
Vicugna pacos | alpaca | AJ566364 | No
Panthera tigris | tiger | NC_010642 | Yes
Pteropus dasymallus | Ryukyu flying-fox | NC_002612 | Yes
Equus asinus | donkey | NC_001788 | Yes
Orcinus orca | Killer whale | GU187177 | Yes
Equus caballus | Horse | NC_001640 | Yes

Figure S1

Figure S2

Figure S3