Genome

Preserving the tree of life of the fish family in Africa in the face of the ongoing extinction crisis

Journal: Genome

Manuscript ID gen-2018-0023.R3

Manuscript Type: Article

Date Submitted by the 02-Mar-2019 Author:

Complete List of Authors: Adeoba, Mariam; University of Johannesburg Tesfamichael, Solomon; University of Johannesburg Yessoufou, Kowiyou; University of Johannesburg

Conservation, African freshwater Ecosystems, IUCN Red List, EDGE, DNA Keyword: Draft barcoding

Is the invited manuscript for consideration in a Special 7th International Barcode of Life Issue? :

https://mc06.manuscriptcentral.com/genome-pubs Page 1 of 42 Genome

Preserving the tree of life of the fish family Cyprinidae in Africa in the face of the ongoing extinction crisis

Mariam Salami1, Solomon Tesfamichael2, Kowiyou Yessoufou2

1Department of Zoology, University of Johannesburg, Kingsway Campus, PO Box 524, Auckland Park 2006, South Africa

2Department of Geography, Environmental Management and Energy studies, University of Johannesburg, Kingsway Campus, PO Box 524, Auckland Park 2006, South Africa

*Corresponding author: Kowiyou Yessoufou [email protected] Draft

1

https://mc06.manuscriptcentral.com/genome-pubs Genome Page 2 of 42

Abstract Our understanding of how the phylogenetic tree of fishes might be affected by the ongoing extinction risk is poor. This is due to the unavailability of comprehensive DNA data, especially for many African lineages. In addition, the ongoing taxonomic confusion within some lineages, e.g. Cyprinidae, makes it difficult to contribute to the debate on how the fish tree of life might be shaped by extinction. Here, we combine COI sequences and taxonomic information to assemble a fully sampled phylogeny of the African Cyprinidae and investigate whether we might lose more phylogenetic diversity (PD) than expected if currently- go extinct. We found evidence for phylogenetic signal in extinction risk, suggesting that some lineages might be at higher risk than others. Based on simulated extinctions, we found that the loss of all threatened species, which approximates 37% of total PD, would lead to a greater loss of PD than expected, although highly evolutionarily-distinct species are not particularly at risk. Pending the reconstruction of an improved multi-gene phylogeny, our results suggest that prioritizing high-EDGE species (Evolutionary Distinct and GloballyDraft Endangered species) in conservation programmes, particularly in some geographic regions, would contribute significantly to safeguarding the tree of life of the African Cyprinidae.

Keywords: African freshwater ecosystems, Conservation, DNA barcoding, EDGE, IUCN Red List

2

https://mc06.manuscriptcentral.com/genome-pubs Page 3 of 42 Genome

Introduction Evidence that we are losing biodiversity at an unprecedented rate is now piling up (Vamosi and Vamosi 2008; Barnosky et al. 2011; Ceballos et al. 2015). For example, one-fifth of vertebrate species richness, including 13% of birds, 25% of mammals, 31% of sharks and rays and 32% of amphibians, are at risk of extinction (Hoffmann et al. 2010; IUCN 2014). One consequence of species extinction is the disruption of ecosystem functioning and the loss of functional diversity (Erwin 2008) that generates the ecosystem services upon which human life relies (MEA 2005). More concerning is the recent prediction that we risk losing, in only a few centuries, 75% of current vertebrate species (Vamosi and Vamosi 2008; Barnosky et al. 2011), if drastic measures are not taken.

To inform the decision-making process towards such measures, a great deal of studies of extinction risk has been conducted in recent years, but these studies focus mostly on terrestrial vertebrates (Purvis et al. 2000;Draft Jetz et al. 2004; Davies and Yessoufou 2013; Tingley et al. 2013; Schachat et al. 2016; Tonini et al. 2016; Veron et al. 2017). This massive focus on terrestrial vertebrates, particularly mammals, results in a better understanding of not only ecological and evolutionary predispositions of mammals to extinction (Cardillo et al., 2005), but also how the current extinction crisis might deplete the phylogenetic diversity (PD) accumulated over millions of years on the tree of life (Purvis et al. 2000; Davies 2016). The general pattern emerging from these studies is that at-risk species tend to cluster on a phylogenetic tree – phylogenetic signal – (Tonini et al. 2016; Veron et al. 2017) and consequently, if clusters of at-risk species go extinct, entire lineages might be lost, thus leading to a disproportionate loss of PD (Purvis et al. 2000; Yessoufou and Davies 2016). As such, revealing how threatened species are distributed along a phylogeny can guide efforts to prioritise certain lineages in conservation projects. Traditionally, the tests of phylogenetic signal in extinction risk are done on IUCN risk categories (Caddy and Garibaldi 2000), an approach that ignores the drivers of risk. This approach has recently been proved to mask a bigger picture of phylogenetic basis of extinction risk because different species could be in the same risk category, but the risk may be driven by different causes (Schachat et al. 2016).

3

https://mc06.manuscriptcentral.com/genome-pubs Genome Page 4 of 42

However, as opposed to terrestrial vertebrates, efforts to understand extinction risk patterns in aquatic vertebrates have been weak, particularly from a phylogenetic perspective (see review of Veron et al. 2017). There is therefore a need to channel more efforts towards filling the knowledge gap in extinction risk in fish group in comparison to the vast knowledge generated for other vertebrates; e.g. mammals, birds, amphibians, reptiles (Jetz et al. 2004; Davies and Yessoufou 2013; Tingley et al. 2013; Schachat et al. 2016; Tonini et al. 2016; Veron et al. 2017).

In this context, the present study focuses on the African subset of the freshwater fish family Cyprinidae. In Africa, the diversity of Cyprinidae is relatively well known with 539 described taxa on the continent (Table S1), andDra this provides an opportunity to fill the existing knowledge gap on the fish tree of life and risk of extinction. Cyprinidae is the most diverse taxonomic group of freshwater fishes distributed across the world (Nelson 2006; Imoto et al. 2013) with 374 genera and 3 061 ftdescribed species globally (Eschmeyer and Fong 2015; Froese and Pauly 2016). DraftThis family is widely distributed across Africa, Europe and North America (Thai et al. 2007) with 26 genera and 539 species found in Africa (Eschmeyer and Fong, 2015). Some species of this family are of economic importance in aquaculture, angling, fisheries, aquarium trade, and many others serve as an essential protein source for humans, in addition to their high values in recreational fisheries (Skelton 2001; Thai et al. 2007; Collins et al. 2012).

Investigating extinction risk in freshwater ecosystems is important and certainly urgent. For example, the exponential growth of the human population results in unprecedented pressures on natural systems, particularly on freshwater ecosystems in the developing world (Marshall and Gratwicke 1998; Thieme et al. 2005; Dudgeon et al. 2006; García et al. 2010; Thieme et al. 2013). As an illustration, the biophysical conditions of 83% of all freshwater ecosystems are seriously degraded owing to the human footprint (Vörösmarty et al. 2010). The drivers of this degradation include overexploitation, overfishing, dam construction, urbanization, aquarium trade, climate change and invasive species (Venter et al. 2010; Arthington et al. 2016). Consequently, freshwater ecosystems have suffered higher species loss than any other terrestrial or marine ecosystems (MEA 2005; Revenga et al. 2005), and this motivates for more investigations of extinction risk in freshwater systems. Such investigations are expected to focus more on phylogenetic diversity (PD) as

4

https://mc06.manuscriptcentral.com/genome-pubs Page 5 of 42 Genome

it is more likely that the loss of PD would be more dramatic than that of species richness simply because of the emerging pattern of non-random extinction (Davies and Yessoufou, 2013; Veron et al. 2017).

From this perspective, recent conservation studies have increasingly shifted their focus towards the preservation of the evolutionary history on the tree of life rather than just species richness (Jetz et al. 2004; Tonini et al. 2016; Veron et al. 2017; Faith 2008; Mooers et al. 128 2008; Thuiller et al. 2011; BillionnetDra 2012; Yessoufou et al. 2017). Several evidences justify this recent shift. On one hand, the conservation of evolutionary history or PD leads to the protection of not only a diversity of functions or services (Forest et al. 2007; Veron et al. 131 2017), but also of evolutionarilyft distinct species (Faith 1992; Winter et al. 2012). On the other hand, it also leads to the conservation of biodiversity functions that provide currently known and unknown goods and services (Forest et al. 2007; Faith 2008; Faith et al. 2010; Faith and Polloc 2014). Although there are several metrics to quantify evolutionary diversity,Draft ED (evolutionary distinctiveness, a metric that measures the phylogenetic uniqueness of a species) is one of the most frequently used in recent studies that seek to preserve evolutionary history on the tree of life (Jetz et al. 2004; Tonini et al. 2016; Yessoufou et al. 2017). This is because ED has been shown to be efficient in capturing most evolutionary history accumulated in a particular tree of life (Redding et al. 2008; Redding et al. 2015), especially when ED values are analysed within a biogeographical framework (Jetz et al. 2004; Daru et al. 2013; Yessoufou et al. 2017). In addition, ED metrics also capture broadly the biology of a particular group (Warren et al. 2008; Redding et al. 2010).

However, as opposed to mammals, reptiles and birds, how the fish tree of life could be affected by the ongoing mass extinction is poorly understood. This is because of the taxonomic confusion in several lineages, e.g. Cyprinidae (Skelton 2016). The current taxonomic debate around Cyprinidae is likely to take longer as more DNA data are required, and the lab facilities needed to generate these sequences are not always available on the continent. Even in the few labs that have such facilities, fish samples for the over 500 African Cyprinidae species are also not available. Given the ongoing extinction crisis, it is critical that we attempt to elucidate how the crisis might prune the fish tree of life on

5

https://mc06.manuscriptcentral.com/genome-pubs Genome Page 6 of 42

the continent pending further clarification of taxonomic status within the family and the availability of fish samples.

In the present study, our aim is to inform conservation decisions through investigating evolutionary diversity of the fish family Cyprinidae in Africa. Specifically, we assemble a gene-tree based on DNA barcodes of the African Cyprinidae, explore how threatened fish species and threat drivers are distributed along the phylogeny, and investigate how the loss of currently threatened fish species would impact the evolutionary diversity on the phylogeny.

Material and Method IUCN categorisation of the African Cyprinidae We compiled the list of the 539 African Cyprinidae species from Fishbase (Froese and Pauly 2017; accessed March 2017) presented in Table S1. The IUCN status of all the 539 species is as follows (http://www.iucnredlist.org/;Draft accessed September 2016): Least Concern (LC, 253 species), Near Threatened (NT, 13 species), Vulnerable (VU, 51 species), Endangered (EN, 167 33 species), Critically Endangered (EN, 10 species), Extinct (EX, 1 species), Data Deficient (DD, 137 species) and Not Assessed (NA, 11 species) (Table S1). The only species already extinct, Enteromius microbarbis David & Poll, 1937, was excluded from the analysis.

Threats to fish diversity Different data on threats were retrieved from the IUCN database (IUCN 2016; accessed September 2016). These threats were then grouped into the following categories: habitat change, invasive species, pollution and overexploitation. When a species is under more than one threat category, it is categorised in the group "multiple threats" (Table S1).

Assembling a fully sampled phylogeny of the African Cyprinidae We used the recent approach of Thomas et al. (2013) to assemble a complete phylogeny when DNA sequences are not available for all species. This approach requires taxonomic information and DNA data (here COI sequences) with which one can assemble a constraint tree. With regard to DNA data, three types of species (types 1, 2 and 3) are distinguished: “type 1 species” are species for which we have COI sequences; “type 2 species” are

6

https://mc06.manuscriptcentral.com/genome-pubs Page 7 of 42 Genome

species for which COI sequences are missing but they are congeners of type 1; “type 3 species” have no COI sequences and are not congeners of type 1. In this study, we have 138 type 1 species, 388 type 2 species and 13 type 3 species. To assemble the constraint tree, an XML file was generated using the COI sequences of the type 1 species (see Adeoba et al. 2018 for details of the origin and how sequences were compiled), in the program BEAUTi, and this file was used to reconstruct a dated constraint tree based on a Bayesian MCMC approach implemented in the BEAST program. Further, we selected GTR + I + Γ as the best model of sequence evolution based on the Akaike information criterion evaluated using MODELTEST (Nylander 2004). In addition, a Yule process was selected as the tree prior with an uncorrelated relaxed lognormal model for rate variation among branches. Also, we included the COI sequences of the following species used as outgroups and for calibration (He et al. 2008; Tang et al. 2010; Wang et al. 2012): Barbonymus altus, Barbonymus schwanenfieldii, Barbus, Carrassius auratus, Carrassius gibelio, Gyrinocheilus aymonieri, Hybognathus argyritis, Myxocyprinus asiaticus, Paramisgurnus dabryanusDraft, Phoxinus phoxinus, Pseudobora parva, Rhinichthys umatilla, Tinca tinca and Vimba vimba.

For calibration, following Wang et al. (2012) and Cavender (1991), the root node of Cyprinidae was constrained to 55.8 million years (My, ±0.5), and the split between Tinca and the modern leuciscins was constrained to 18.0 My. In addition, the lineage Barbus was calibrated to 13 Mya (±0.5) following Zardoya and Doadrio (1999). In the process of tree reconstruction, a Yule process was selected as the tree prior with an uncorrelated relaxed lognormal model for rate variation among branches. Monte Carlo Markov chains were run for 50 million generations with trees sampled every 1000 generations. Log files, including prior and likelihood values, as well as the effective sample size (ESS) were examined using TRACER (Rambaut and Drummond 2007). ESS values were all > 200 for the age estimates. We discarded 25% of the resulting 50,000 trees as burn-in, and the remaining trees were combined using TREEANNOTATOR (Rambaut and Drummond 2007) to generate a maximum clade credibility (MCC) tree (our constraint tree).

To integrate the types 2 and 3 species into the constraint tree, a simple taxon definition file that lists all three types of species along with their taxonomic information (here names) was formed. Using the constraint tree and the taxon definition file, a MrBayes input

7

https://mc06.manuscriptcentral.com/genome-pubs Genome Page 8 of 42

file was first generated as implemented in the R library PASTIS (Thomas et al. 2013), and then we reconstructed a dated complete phylogeny using MrBayes 3.2. (Ronquist & Huelsenbeck 2003) under a relaxed-clock model with node-age calibrations indicated above. In the 10,000 resulting trees, the topology of species with DNA sequences remains fixed, and the unsampled species (types 2 and 3) were assigned randomly within their genus. In our analyses, we used the 50% consensus of these 10,000 trees, which obviously integrate over the phylogenetic uncertainty ofDra the missing species by collapsing poorly known clades into polytomies. To account for this uncertainty, we used a sample of 100 trees from the posterior to calculate a range of values specifically for some metrics; e.g. ED, EDGE and PD (see details below in Sectionft Data Analysis). Although we acknowledge that the resulting phylogeny from our tree reconstruction approach may not be suitable for estimating certain variables (e.g. rates of continuous-character evolution), we confidently rely on the recent evidence provided through simulations that the approach remains adequate for assessing branch-length related measures such as diversification rate (Rabosky, 2015), and by extension ED,Draft EDGE and PD. Also, several studies have employed similar approach to assemble phylogenies used to investigate different types of evolutionary or conservation questions (Isaac et al., 2012; Jetz et al., 2012, 2014; Tonini et al. 2016; Yessoufou et al. 2017; Adeoba & Yessoufou 2018). Our phylogeny is therefore appropriate for creating null models of the distribution of threat status that are conservative with respect to remaining phylogenetic uncertainty.

Data analysis All analyses were conducted in R (R Development Core Team 2013), and the R script is provided as Supplemental Information. For each species, IUCN categories were transformed into binary data: 0 when a species is LC or NT (i.e. non-threatened) and 1, when a species is VU, EN or CR (i.e. threatened). DD and NE species were treated as non- threatened following Adeoba (2018). Similarly, we coded threat data as follows: 1, when a particular threat is reported for a species; 0, when no threat is reported for a species, and NA when threat information is missing for a species. NA values were treated firstly as NA=0 and then NA =1. The same binary transformation was done for habitat preferences (pelagic, benthopelagic and demersal) and geographic origin (e.g. West Africa, Central Africa, etc. following https://www.mapsofworld.com/africa/regions, accessed March 2017) (Table S1). Next, using the complete phylogeny that we assembled, we applied the D

8

https://mc06.manuscriptcentral.com/genome-pubs Page 9 of 42 Genome

statistic (Fritz and Purvis, 2010) to test for phylogenetic signal in IUCN categories, threat categories, habitat preferences and geographic origins. When D = 1 for a variable, then this variable is randomly distributed at the tips of the phylogeny; D = 0 corresponds to a BM (Brownian Motion) model; D < 0 signifies high phylogenetic signal, whereas D>1 signifies that a variable is over-dispersed on the phylogenetic tree. We tested for significance of D values using 1000 random shuffling across the tips of the phylogeny.

Further, we assessed how the loss of currently threatened species would impact the tree. To this end, we firstly quantified the total evolutionary history accumulated on the tree using

the Faith’s (1992) phylogenetic diversity (PDTotal). Then, we calculated PDthreatened after pruning all non-threatened species from the tree, allowing us to get the percentage of at-

risk PD in relation to PDTotal. Next, we simulated the loss of 103 species from the tree by pruning randomly 103 species 1,000 times from the tree (103 is the number of threatened species in the dataset). We then calculated each time the PD values of each set of random 103 species (PDrandom), and finally, weDraft compared the average of these random PD values to the value of PDthreatened (PDthreatened is the value of PD corresponding to the species currently threatened).

In addition, we measured the evolutionary distinctiveness (ED; Isaac et al., 2007) of each species as the sum of branch lengths from a tip to the root of the tree, divided by the number of tips the branch sustains (also over 100 trees, and mean ED values are reported).

Using one-way ANOVA, we tested the correlation between EDAverage and IUCN categories of species. This was assessed in three scenarios: i) we singled out DD and NE species alongside other IUCN categories; ii) we repeated the same analysis but considered all DD and NE as LC (i.e. non-threatened, following Adeoba 2018); iii) we grouped IUCN status into non-threatened vs. threatened species.

Finally, we ranked all species according to their EDGEAverage scores (average over 100

trees), and mapped the spatial distribution of top EDGEAverage species. EDGE is computed

as [ln(1 + EDAverage) + GE *ln(2)], where EDAverage and GE stand for average evolutionary distinctiveness and global endangerment, respectively. GE was coded as follows (Butchart et al. 2005): Least Concern = 0, Near Threatened and Conservation Dependent = 1,

9

https://mc06.manuscriptcentral.com/genome-pubs Genome Page 10 of 42

Vulnerable = 2, Endangered = 3, Critically Endangered = 4. In the EDGE calculation, DD species are treated as LC following Adeoba (2018).

Finally, we mapped the geographic distribution of the top genera in the EDGE-based ranking of fish species. Species occurrence points were downloaded from the IUCN online portal (http://www.iucnredlist.org; accessed February 2017). Top genera were identified based on the number of species each genus has in the top 50 EDGE species. For the mapping, a coverage of African freshwater bodies was used as the basic unit of mapping, by assuming that a species identified in a water body can be spotted at any location within that water body. The distributions cover both water bodies and land surfaces and thus provide rather generic indications; it was therefore necessary to limit the mapping exercise within water bodies only. Overlay analysis was performed to extract fish distributions that fell within inland water bodies of Africa. All map analyses were done using ArcGIS (ESRI® ArcGIS version 10.5, Redlands, CA). Draft Results The gene tree reconstructed based on COI indicates three major subfamilies in Africa, of which the largest is Cyprininae followed by Danioninae and very few representatives of Leucisinae (Figure 1). The topology recovered contradicts early reported topologies but is congruent to some extent to the most recent topology (see discussion).

We found evidence for phylogenetic signal in extinction risk (D = 0.90, P = 0.04*; Figure S1). We also found support for signal in the causes of extinction risk but only for over- exploitation (D = 0.91, P = 0.03*), pollution (D = 0.91, P = 0.03*) and invasive species (D = 0.23, P < 0.001***) (Figure 2). Also, demersal and benthopelagic (but not pelagic) species are more closely related than expected by chance as well as species in each geographic region (Figures S2 and S3, respectively).

Furthermore, the total PD accumulated in the fully sampled phylogeny of the fish family

Cyprinidae is PDTotal = 13 billion 20 million years, whereas all currently threatened species represent PDThreatened = 4 billion 783 million years, i.e. 37% of PDTotal. If all threatened species go extinct, we would lose far more PD than expected by chance (PDMeanRandomLoss =

10

https://mc06.manuscriptcentral.com/genome-pubs Page 11 of 42 Genome

1 billion 351 million years, confidence interval CI = 1 billion 138-1 billion 586 million years; Figure 3).

The top ED species is Acapoeta tanganicae (EDAverage = 97.63 My), while Barbus

sylvaticus had the lowest EDAverage score (8.37 My) (Table S1). In the top 50 EDAverage species, the genera Barbus (28%), Labeobarbus (18%) and Labeo (10%) are the most represented in terms of species richness, and 94% of these species are benthopelagic (Table

1). In addition, 34% of the top 50 EDAverage species are LC, 32% Data Deficient and 14% are VU, and these species are predominantly found in central (34%), eastern (24%) and southern Africa (16%).

Nonetheless, there was no relationship between EDAverage and IUCN categories, irrespective of how these categories are treated (one-way ANOVA, P > 0.05; Figure 4).

However, on EDGE score, it is Barbus boboi that has the highest score (EDGEAverage = 6.78), whereas the lowest score wasDraft found for B. dorsolineatus (Table S1). In the top-50 EDGEAverage score ranking, nine genera are represented (Table S1), with the genus Barbus being the most dominant (52%) followed by Pseudobarbus (12%), Labeo and Labeobarbus (10% each). In addition, the overwhelming majority of species in the top-50 EDGE ranking

are benthopelagic (88%). In contrast to the top-EDAverage species that are predominantly found in central Africa, most top-EDGE species are distributed in southern (34%), eastern and western Africa (24%, respectively; Figure 5).

Discussion This study highlights the utility of combining DNA sequences and taxonomic information to generate phylogenetic hypotheses for conservation studies. Nevertheless, there has been broad concern in the literature about using COI barcodes to construct deep phylogeny. Also, the phylogenetic literature tends to be more supportive of multi-marker phylogenetic hypotheses. However, most recent multi-marker large phylogenies of fishes do not include freshwater fishes (Rabosky et al. 2018), and most of those that contain freshwater fishes, to our knowledge, exclude African Cyprinidae (Betancur-R et al. 2013). Pending the completion of the ongoing multi-marker tree of life project (http://bio.slu.edu/mayden/cypriniformes/home.html), the present study relies on a COI-

11

https://mc06.manuscriptcentral.com/genome-pubs Genome Page 12 of 42

phylogeny to explore the phylogenetic basis of extinction risk. COI-phylogenies have been shown in several studies to be reliable in estimating phylogenetic metrics (e.g. community phylogenetic metrics) in comparison to the same metrics calculated using multi-marker trees (Smith et al. 2014; Boyle and Adamowicz 2015). Also, in our COI- phylogeny, the subfamily Labeoninae is embedded within the subfamily Cyprininae, and this contradicts the topology reported for the family Cyprinidae in earlier studies (Thai et al. 2007; Tang et al. 2009; Zheng et al. 2010). However, this contradiction is congruent with the most recent treatment of the subfamily Labeoninae (see Yang et al. 2015).

Using this phylogeny, and in line with the general trend of phylogenetic signal in extinction risk (Tonin et al. 2016; Yessoufou and Davies 2016), we found evidence that extinction-prone species are more closely related than expected by chance (but see Jetz et al. 2004 for birds). Consequently, if at-risk speciesDra slide to extinction, we risk losing entire clades and a great amount of evolutionary history. In particular, fish species threatened by over-exploitation, pollution and invasiveDraft species are the most closely related, suggesting that either closely related species are sensitiveft to similar threats or co-occurring threatened species are simply exposed to similar threats. Both explanations are plausible as we also found a phylogenetic signal in habitat preference and geographic origin of occupancy of species. From a conservation perspective, our findings imply that, if actions are not taken on time, over-exploitation, pollution and invasive species may drive phylogenetically close species to extinction across the continent.

The first concern about these findings is that we risk losing more phylogenetic diversity (PD) than expected if closely related species go extinct. We explored this eventuality on the tree of life of the African Cyprinidae. We found that the African Cyprinidae represents more than 13 billion years of PD, and that if all currently threatened species slide to extinction, we would lose ~ 5 billion years of PD, representing ~ 37% of total PD of African Cyprinidae. The loss of 37% of total PD is ill-afforded in the context of multiple and persistent calls to preserve evolutionary diversity rather than just species richness (Jetz et al. 2004; Pellens and Grandcolas, 2016). These calls are motivated by increasing evidence that preserving evolutionary diversity is necessary to maintain the diversity of ecosystem functions (Forest et al. 2007; Veron et al. 2017), the conservation of rare and unique species (Faith 1992; Winter et al. 2012) and, more critically, for the maintenance of

12

https://mc06.manuscriptcentral.com/genome-pubs Page 13 of 42 Genome

a sustainable delivery of known and hidden goods and services (Forest et al. 2007; Faith 2008; Faith et al. 2010; Faith and Polloc 2014).

As expected from clustered threatened species, we found that the amount of at-risk PD is far greater than expected under a scenario of random loss. This could only be explained by the fact that the loss of clustered threatened speciesDra may drive the loss of a greater number of phylogenetic branches within some clades, thus heightening the possibility of losing entire clades and therefore more PD than expected (see Davies and Yessoufou 2013). To avoid such loss, recent studies call for the prioritizationft of high-ED species (Jetz et al. 2004; Redding et al. 2008; Warren et al. 2008; Redding et al. 2010; Daru et al. 2013; Redding et al. 2015; Tonini et al. 2016; Yessoufou et al. 2017). Our study provides for conservation biologists and decision-makers a ranking of all the African Cyprinidae based on their ED scores. In this ranking, Acapoeta tanganicae is the top priority for conservation. The species A. tanganicae is endemic to the Lake Tanganyika and the Rusizi River in Burundi, DRC, Tanzania andDraft Zambia. It is found in inshore waters over rocky substrates and in major fast-flowing rivers (Ntakimazi 2006) where it feeds on insects, ostracods, diatoms, worms and aufwuchs (Ntakimazi 2006). Although A. tanganicae is IUCN-categorized as Least Concern, overexploitation has been reported as a major threat to the species, and no known conservation measures are taken so far (Ntakimazi 2006). Because A. tanganicae tops the list of high-ED species, we call for urgent regulated exploitation in the countries where the species occurs. In addition, the top 50 species in ED score are mostly Enteromius, Labeobarbus and Labeo in central, eastern and southern Africa, making these regions the focus for conservation projects. However, the fact that 34% of these priority species are Data Deficient calls for renewed efforts to elucidate the IUCN threat status of species, at least for those that deserve urgent conservation measures.

Our study further provides opportunity to explore how ED scores of vertebrates are distributed across IUCN Red List categories. Here, our result of no relationships between ED and IUCN categories of fish mirrors what has been reported for other vertebrate lineages [e.g. mammals (Arregoitia et al. 2013), birds (Jetz 2004), reptiles (Tonini et al. 2016)], thus suggesting that evolutionarily distinct vertebrates are not particularly threatened. From a conservation perspective, this is an interesting finding as sets of high- ED species represent huge evolutionary diversity (Redding et al. 2008, 2015).

13

https://mc06.manuscriptcentral.com/genome-pubs Genome Page 14 of 42

Furthermore, the integration of ED and IUCN Red List categories provides a better scoring system, known as EDGE scores, for species prioritization in conservation projects (Huang et al. 2011). The Zoological Society of London (ZSL) is championing a global conservation campaign informed by EDGE scores.Dra The focus of the campaign has been on mammals, amphibians, birds and reef coral species (Jetz et al. 2004; Isaac et al. 2007; Collen et al. 2011; Isaac et al. 2012; Huang 2012). For marine fish, an EDGE prioritization effort for sharks is currently underway (EDGEft 2015); our study complements this global effort with an EDGE ranking of 539 species of the African freshwater Cyprinidae. Our ranking indicates that the species Enteromius boboi has the highest score, whereas the lowest score was found for Enteromius dorsolineatus. Enteromius boboi shares a large number of scales and long barbels with many other species of Cyprinidae but is unique with its large blotch at the end of the caudal peduncle and the downward flexion of the below the (Entsua-Mensah 2010). In addition, E. boboi occurs only in one River, the River Farmington in Draftthe Gibi Mountains in Liberia (West Africa). Although it is reported to be threatened by deforestation and mining activities, there is unfortunately no available data on the trend of its population, and worse, there are no known conservation measures ever put in place to prevent this Critically Endangered species from sliding to extinction (Entsua-Mensah 2010). Our finding that E. boboi scores highest in EDGE ranking among the African Cyprinidae calls for an urgent need for conservation projects targeting this at-risk species. Nonetheless, species in nine genera are represented in the top EDGE list, including Enteromius, Pseudobarbus, Labeo and Labeobarbus as the most dominant in term of species richness. We mapped their geographic distributions to aid conservation decisions. These top EDGE species are native to southern (Lesotho, Malawi, Mozambique and South Africa), eastern (Ethiopia, Kenya, Rwanda, Tanzania, and Uganda) and western Africa (Cote d’Ivoire, Guinea, Liberia and Sierra Leone). We call for conservation biologists in these countries and around the world to prioritize species in the top EDGE list provided in this study for the African fish in their conservation agenda.

In summary, we assembled a gene-tree of the African Cyprinidae, allowing us to fill the knowledge gap of how extinction risk in the group might deplete PD of African Cyprinidae. We found that at-risk species that are phylogenetically closely related are facing similar threat, and this is more likely because these species co-occur in same

14

https://mc06.manuscriptcentral.com/genome-pubs Page 15 of 42 Genome

geographic regions or same habitat (e.g. benthopelagic vs. demersal). We also found that threatened species are clustered on the phylogeny, and this may lead to a disproportionate loss of PD in comparison to random expectation. Furthermore, we ranked all species based on the urgency for conservation based on ED and EDGE scores, and further indicated the geographic areas of distribution of the species that top the prioritisation scores. Overall, our results suggest that conserving freshwater ecosystems,Dra regulating the exploitation of resources, and prioritizing high-EDGE species in conservation programmes, particularly in Lesotho, Malawi, Mozambique, South Africa, Ethiopia, Kenya, Rwanda, Tanzania, Uganda, Cote d’Ivoire, Guinea, Liberia and Sierraft Leone, would contribute significantly to safeguarding the phylogenetic diversity of the African Cyprinidae. Finally, the results reported here should be interpreted bearing in mind that a more comprehensive multi-gene phylogeny is required for a better understanding of the phylogenetic pattern of extinction risk, given that we only used a gene-tree based on COI in the present study. Draft

15

https://mc06.manuscriptcentral.com/genome-pubs Genome Page 16 of 42

Ethics: No special permit was required for this study as we did not collect any data from the field, and no informed consent of any participant was necessary because our study did not involve any interview. All data used were retrieved from public repositories.

Data accessibility: Our data are deposited at Dryad: http://dx.doi.org/10.5061/dryad.k10k8

Authors contributions: K.Y. designed the study; M.I.A retrieved all data analysed from GenBank/EBI and FishBase; M.I.A carried out sequence alignments and KY performed tree reconstruction; KY carried out the statistical analyses and coordinated the whole study; S.G.T run all spatial analyses; M.I.A., K.Y., and S.G.T. wrote the manuscript. All authors gave final approval for publication.

Competing interest: The authors declareDraft no competing interest Funding: K.Y. received financial support from South Africa’s National Research Foundation (NRF) for funding (Grant No: 103944) and the 2016 Research prize of the Société Botanique de France. M.I.A. received funding from the University of Johannesburg URC International Scholarships.

Acknowledgement We much appreciate the assistance of Gavin H. Thomas with the reconstruction of full phylogeny of the African Cyprinidae using the R library PASTIS. We acknowledge the South Africa’s National Research Foundation (NRF) for funding (Grant No: 103944 and

112113). We also thank the University of Johannesburg for the URC International Scholarships provided to the first author.

16

https://mc06.manuscriptcentral.com/genome-pubs Page 17 of 42 Genome

References Adeoba. 2018. Phylogenetic analysis of extinction risk and diversification history of the African Cyprinidae using DNA barcodes. PhD thesis, University of Johannesburg.

Adeoba M.I., Tesfamichael S.G., van der Bank H., Yessoufou K. 2017. Data from: Preserving the evolutionary history of the African Cyprinidae in the context of extinction crisis. Dryad Digital Repository. doi:10.5061/dryad.k10k8.

Adeoba MI, Yessoufou K. 2018. Analysis of temporal diversification of African Cyprinidae (Teleostei, Cypriniformes). ZooKeys 806: 141–161.

Arregoitia LD, Blomberg SP, Fisher DO. 2013. Phylogenetic correlates of extinction risk in mammals: species in older lineages are not at greater risk. Proceedings of the Royal Society of London B: Biological Sciences 280, 20131092. (doi:10.1098/rspb.2013.1092). Draft Arthington AH, Dulvy NK, Gladstone W, Winfield IJ. 2016. Fish conservation in freshwater and marine realms: status, threats and management. Aquatic Conservation: Marine and Freshwater Ecosystems 26, 838– 857. (doi:10.1002/aqc.2712).

Barnosky AD et al. 2011. Has the Earth’s sixth mass extinction already arrived? Nature 471, 51–57. (doi:10.1038/nature09678).

Batista MC, Gouveia SF, Silvano DL, Rangel TF. 2013. Spatially explicit analyses highlight idiosyncrasies: species extinctions and the loss of evolutionary history. Diversity and Distributions 19, 1543–1552. (doi:10.1111/ddi.12126).

Betancur-R R, Broughton RE, Wiley EO, Carpenter K, López JA, Li C, Holcroft NI, Arcila D, Sanciangco M, Cureton II JC, Zhang F, Buser T, Campbell MA, Ballesteros JA, Roa- Varon A, Willis S, Borden C, Rowley T, Reneau PC, Hough DJ, Lu G, Grande T, Arratia G, Ortí G. 2013. The Tree of Life and a new classification of bony fishes. PLoS Currents: Tree of Life.

17

https://mc06.manuscriptcentral.com/genome-pubs Genome Page 18 of 42

Billionnet A. 2012. Solution of the generalized Noah’s ark problem. Systematic Biology 62, 147–156. (doi.org/10.1093/sysbio/sys081).

Boyle EE, Adamowicz SJ. 2015. Community phylogenetics: Assessing tree reconstruction methods and the utility of DNA barcodes. PLoS ONE 10(6): e0126662.

Butchart SHM, Akcakaya HR, Kennedy E, Hilton-Taylor C. 2005. Biodiversity indicators based on trends in : strengths of the IUCN red list index. Conservation Biology 20, 1523–1739.

Caddy JF, Garibaldi L. 2000. Apparent changes in the trophic composition of world marine harvests: the perspective from the FAO capture database. Ocean & Coastal Management 43, 615–655. (doi:10.1016/S0964-5691(00)00052).

Cardillo M, Mace GM, Jones KE, BielbyDraft J, Bininda-EmondsDra ORP, Sechrest W, Orme CDL, Purvis A. 2005. Multiple causes of high extinction risk in large mammal species. Science 309, 1239–1241. (doi:10.1126/science.1114488).ft Cavender TM. 1991. The fossil record of the Cyprinidae. In: Winfield IJ, Nelson JS, eds. Cyprinid Fishes: Systematics, Biology and Exploitation. London: Chapman and Hall, 34– 54.

Ceballos G, Ehrlich PR, Barnosky AD, Garcia A, Pringle RM, Palmer TM. 2015. Accelerated modern human-induced species losses: entering the sixth mass extinction. Science Advances 1, e1400253. (doi:10.1126/sciadv.1400253).

Collen B, Turvey ST, Waterman C, Meredith HMR, Kuhn TS, Baillie JEM, Isaac NJB. 2011. Investing in evolutionary history: implementing a phylogenetic approach for mammal conservation. Philosophical Transactions of the Royal Society, B: Biological Sciences 366, 2611–2622. (doi:10.1098/rstb.2011.0109).

18

https://mc06.manuscriptcentral.com/genome-pubs Page 19 of 42 Genome

Collins RA, Armstrong KF, Meier R, Yi Y, Brown SDJ, Cruickshank RH, Keeling S, Johnston C. 2012. Barcoding and border biosecurity: Identifying Cyprinid Fishes in the Aquarium Trade. PLoS ONE 7, e2838. (doi:10.1371/journal.pone.0028381).

Daru BH, Yessoufou K, Mankga LT, Davies TJ. 2013. A Global Trend towards the Loss of Evolutionarily Unique Species in Mangrove Ecosystems. PLoS ONE 8, e66686. (doi:10.1371/journal.pone.0066686).

Davies TJ. 2015. Losing history: How extinctions prune features from the tree of life. Philosophical Transactions of the Royal Society B, 370, 20140006. (doi:10.1098/rstb. 2014. 0006).

Davies TJ, Smith GF, Bellstedt DU, Boatwright JS, Bytebier B, et al. 2011. Extinction Risk 525 and Diversification Are Linked in a Plant Biodiversity Hotspot. PLoS Biol. 9, e1000620. (doi:10.1371/journal.pbio.1000620).Draft

Davies TJ, Yessoufou K. 2013. Revisiting the impacts of non-random extinction on the tree-of-life. Biology Letters 9, 20130343. (doi:10.1098/rsbl.2013.0343).

Dudgeon D et al. 2006. Freshwater biodiversity: importance, threats, status and conservation challenges. Biological reviews 81, 163–182. (doi:10.1017/S1464793105006950).

EDGE 2015. The EDGE of existence program, version 2015. http://www.edgeofexistence.org/about/future_edge.php Accessed 20.02.2015.

Entsua-Mensah, M. 2010. Barbus boboi. The IUCN Red List of Threatened Species 2010: e.T181722A7714422. http://dx.doi.org/10.2305/IUCN.UK.20103.RLTS.T181722A7714422.en. Downloaded on February 2017.

19

https://mc06.manuscriptcentral.com/genome-pubs Genome Page 20 of 42

Erwin DH. 2008. Extinction as the loss of evolutionary history. Proceedings of the National Academy of Sciences of the United States of America 105, 11520–11527. (doi:10.1073/pnas.0801913105).

Eschmeyer WN, Fong JD. 2013. Species by family/subfamily.http://research.calacademy. org/research/ichthyology/catalog/SpeciesByFamily.asp.

Eschmeyer WN, Fong JD. 2015. Species by family/subfamily.http://research.calacademy. org/research/ichthyology/catalog/SpeciesByFamily.asp. Electronic version accessed 21-02- 2015.

Faith DP. 1992. Conservation evaluation and phylogenetic diversity. Biology and Conservation 61, 1–10.

Faith DP. 2008. Threatened species Draftand the potential loss of phylogenetic diversity: Conservation scenarios based on estimated extinction probabilities and phylogenetic risk analysis. Conservation Biology 22, 1461–1470. (doi:10.1111/j.1523-1739.2008.01068.x).

Faith DP, Magallόn S, Hendry AP, Conti E, Yahara T, Donoghue MJ. 2010. Ecosystem services: an evolutionary perspective on the links between biodiversity and human well- being. Current Opinion in Environmental Sustainability 2, 66–74. (doi:10.1016/j.cosust.2010.04.003).

Faith DP, Polloc LJ. 2014. Phylogenetic diversity and the sustainable use of biodiversity. In Applied Ecology and Human Dimensions in Biological Conservation (eds MC Lyra- Jorge, L M Verdade), pp. 35–52. Springer, Berlin.

Forest F et al. 2007. Preserving the evolutionary potential of floras in biodiversity hotspots. Nature 445, 757–760. (doi:10.1038/nature05587).

Fritz SA, Purvis A. 2010. Selectivity in mammalian extinction risk and threat types: A new measure of phylogenetic signal strength in binary traits. Conservation Biology 24, 1042– 1051. (doi:10.1111/j.1523-1739.2010.01455.x).

20

https://mc06.manuscriptcentral.com/genome-pubs Page 21 of 42 Genome

Froese R, Pauly D. 2017. FishBase. World Wide Web electronic publication. Available from: http://www.fish base.org, version (assessed March 4th 2017).

Froese R, Pauly D. 2016. FishBase. World Wide Web electronic publication. www.fishbase.org, version (24-06-2016).

García N, Cuttelod A, Malak DA. 2010. The status and distribution of freshwater biodiversity 582 in Northern Africa. IUCN.

He S, Mayden RL, Wang X, Wang W, Tang KL, Chen WJ. 2008. Molecular phylogenetics of the family Cyprinidae (: Cypriniformes) as evidenced by sequence variation in the first intron of S7 ribosomal protein-coding gene:Further evidence from a nuclear gene of 587the systematic chaos in the family. Molecular Phylogenetics and Evolution 46, 818–829. (doi:10.1016/j.ympev.2007.06.001).Draft

Hoffmann M et al. 2010. The impact of conservation on the status of the world’s vertebrates. Science 330, 1503–1509. (doi:10.1126/science.1194442).Dra Huang D. 2012. Threatened reef corals of the world. PLoS ONE 7, e34459. (doi:10.1371/journal.pone.0034459). ft Huang S, Davies TJ, Gittleman JL. 2011. How global extinctions impact regional biodiversity in mammals. Biology Letters 8, 222–225. (doi:10.1098/rsbl.2011.0752).

Imoto JM, Saitoh K, Sasaki T, Yonezawa T, Adachi J, Kartavtsev YP, Miya M, Nishida M, Hanzawa N. 2013. Phylogeny and biogeography of highly diverged freshwater fish species (Leuciscinae, Cyprinidae, Teleostei) inferred from mitochondrial genome analysis. Gene 514, 112–124. (doi:10.1016/j.gene.2012.10.019).

Isaac NJ, Redding DW, Meredith HM, Safi K. 2012. Phylogenetically-informed priorities for amphibian conservation. PLoS one 7:e43912. (doi:10.1371/journal.pone.0043912) .

21

https://mc06.manuscriptcentral.com/genome-pubs Genome Page 22 of 42

Isaac NJB, Turvey ST, Collen B, Waterman C, Baillie JEM. 2007. Mammals on the EDGE: Conservation priorities based on threat and phylogeny. PLoS One 2, e296. (doi:10.1371/journal.pone.0000296).

IUCN (International Union for Conservation of Nature). 2014. The IUCN red list of threatened species. Gland, Switzerland. Available from http://www.iucnredlist.org/about/citing (accessed September 2014).

IUCN. 2016. The IUCN red list of threatened species. Version 2016-4. http://www.iucnredlist.org. Downloaded Accessed on 19 September 2016.

Jetz W, Rahbek C, Colwell RK. 2004. The coincidence of rarity and richness and the potential 618 signature of history in centres of endemism. Ecology Letters 7, 1180– 1191. (doi:10.1016/j.cub.2014.03.011).Draft

Jono CMA, Pavoine S. 2012. Threat diversity will erode mammalian phylogenetic diversity in the near future. PLoS ONE 7, e46235. (doi:10.1371/ journal.pone.0046235).

Marshall BE, Gratwicke B. 1998. The barred (Teleostei: Cyprinidae) of Zimbabwe: is there cause for concern? Southern African Journal of Aquatic Science 24, 157–161. (doi.10:1080/10183469.1998.9631422).

[MEA] Millennium Ecosystem Assessment. 2005. Ecosystems and Human Well-being: Synthesis. Washington (DC): Island Press.

Mooers AØ, Faith DP, Maddison WP. 2008. Converting endangered species categories to probabilities of extinction for phylogenetic conservation prioritization. PLoS ONE 3, e3700. (doi:10.1371/journal.pone.0003700).

Nelson JS. 2006. Fishes of the World, Fourth edition. New York: John Wiley.

22

https://mc06.manuscriptcentral.com/genome-pubs Page 23 of 42 Genome

Ntakimazi, G. 2006. Acapoeta tanganicae. The IUCN Red List of Threatened Species2006:e.T61308A12459046.http://dx.doi.org/10.2305/IUCN.UK.2006.RLTS.T6130 8A12459 046.en. Downloaded on February 2017.

Nylander JAA. 2004. Modeltest v2. Program distributed by the author. Sweden: Evolutionary 642 Biology Centre, Uppsala University.

Pellens R, Grandcolas P. 2016. Phylogenetics and conservation biology: drawing a path into 645 the diversity of life. In: Biodiversity Conservation and Phylogenetic Systematics (pp. 1-15). Springer International Publishing.

Purvis A, Agapow PM, Gittleman JL, Mace GM. 2000. Nonrandom extinction and the loss of evolutionary history. Science 288, 328–330. (doi:10.1126/science.288.5464.328).

R Development Core Team. 2013. ADraft language and environment for statistical computing. Retrieved from http://www.r-project.org.

Rabosky, DL, Chang, J, Title, PO, Cowman, PF, Sallan, L, Friedman, M, Kaschner, K, Garilao, C, Near, TJ, Coll M, & Alfaro ME. 2018. An inverse latitudinal gradient in speciation rate for marine fishes. Nature 559, 392-408.

Rambaut A, Drummond AJ. 2007. TreeAnnotator (version 1.5.4). Retrieved from http://beast.bio.ed.ac.uk. Accessed 13th Dec 2015.

Redding DW, DeWolff CV, Mooers AØ. 2010. Evolutionary distinctiveness, threat status, and ecological oddity in primates. Conservation Biology 24, 1052–1058. (doi:10.1111/j.1523-1739.2010.01532.x).

Redding DW, Hartmann K, Mimoto A, Bokal D, Devos M, Mooers A. 2008. Evolutionarily distinctive species often capture more phylogenetic diversity than expected. Journal of Theoretical Biology 251, 606–615. (doi:10.1016/j.jtbi.2007.12.006).

Redding DW, Mooers AO, Şekercioğlu ÇH, Collen B. 2015. Global evolutionary isolation measures can capture key local conservation species in Nearctic and Neotropical bird

23

https://mc06.manuscriptcentral.com/genome-pubs Genome Page 24 of 42

communities. Philosophical Transaction of the Royal Society B, 370, 20140013. (doi:10.1098/rstb.2014.0013).

Revenga C, Campbell I, Abell R, De Villiers P, Bryer M. 2005. Prospects for monitoring freshwater ecosystems towards the 2010 targets. Philosophical Transactions of the Royal Society of London B: Biological Sciences 360, 397–413. (doi:10.1098/rstb.2004.1595).

Ronquist F, Huelsenbeck JP. 2003. MRBAYES 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19, 1572–1574.

Schachat SR, Mulcahy DG, Mendelson JR. 2016. Conservation threats and the phylogenetic utility of IUCN Red List rankings in Incilius toads. Conservation Biology 30, 72–81. (doi:10.1111/cobi.12567).

Skelton PH. 2001. A complete guideDraft for fresh water fishes of southern Africa. Cape Town: Struik Publishers.

Skelton PH 2016. Name changes and additions to the southern African freshwater fish fauna. African Journal of Aquatic Science 41, 345-351. (doi: 10.2989/16085914.2016.1186004).

Smith, MA, Hallwachs, W, Janzen DH, 2014. Diversity and phylogenetic community structure of ants along a Costa Rican elevational gradient. Ecography 37, 720-731.

Tang KL, Agnew MK, Hirt MV, Sado T, Schneider LM, Freyhof J, Sulaiman Z, Swart E, Vidthavanon C, Miya M, Saitoh K, Soimios AM, Wood RM, Mayden RL. 2010. Systematics of the subfamily Danioninae (Teleostei: Cypriniformes: Cyprinidae). Molecular Phylogenetics and Evolution 57, 189–214. (doi:10.1016/j.ympev.2010.05.021).

Tang Q, Getahun A, Liu H 2009. Multiple in-to-Africa dispersals of Labeonin fishes (Tel- eostei: Cyprinidae) revealed by molecular phylogenetic analysis. Hydrobiologia 632, 261– 271.

24

https://mc06.manuscriptcentral.com/genome-pubs Page 25 of 42 Genome

Thai BT, Phan PD, Austin CM. 2007. Phylogenetic evaluation of subfamily classification of the Cyprinidae focusing on Vietnamese species. Aquatic Living Resources 20, 143–153. (doi:10.1051/alr:2007025).

Thieme M, Thieme ML, Abell, R. 2013. World Wildlife Fund Ecoregion assessments: Freshwater Ecoregions of Africa and Madagascar: A Conservation Assessment. Washington DC, US: Island Press. ProQuest ebrary. Web. 15 November 2016.

Thieme ML, Abel R, Stiassny MLJ, Burgess L, Lehner B, Dinerstein, E, Olso D. 2005. Freshwater ecoregions of Africa and Madagascar: A conservation Assessment. Island Press, 704 Washington DC.

Thomas GH, Hartmann K, Jetz W, Joy JB, Mimoto A, Mooers AO. 2013. PASTIS: An R package to facilitate phylogenetic assembly with soft taxonomic inferences. Methods in Ecology and Evolution 4, 1011–1017.Draft (doi: 10.1111/2041-210X.12117).

Thuiller W, Lavergne S, Roquet C, Boulangeat I, Lafourcade B, Araujo MB. 2011. Consequences of climate change on the tree of life in Europe. Nature 470, 531–534. (doi:10.1038/nature09705).

Tingley R and Hitchmough RA, Chapple DG. 2103. Life-history traits and extrinsic threats 715 determine extinction risk in New Zealand lizards. Biological Conservation 165, 62– 68. (doi:10.1016/j.biocon.2013.05.028).

Tonini J FR, Beard KH, Ferreira RB, Jetz W, Pyron RA. 2016. Fully-sampled phylogenies of squamates reveal evolutionary patterns in threat status. Biological Conservation 204, 23–31. (doi:10.1016/j.biocon.2016.03.039).

Vamosi JC, Vamosi SM. 2008. Extinction risk escalates in the tropics. PLoS One 3, e3886. (doi:10.1371/journal.pone.0003886).

25

https://mc06.manuscriptcentral.com/genome-pubs Genome Page 26 of 42

Veron S, Davies TJ, Cadotte MW, Clergeau P, Pavoine S. 2017. Predicting loss of evolutionary history: Where are we? Biological Reviews 92, 271–291. (doi:10.1111/ brv.12228).

Venter JA, Fouché PSO, Vlok W. 2010. Current distribution of the southern barred , Opsaridium peringueyi (Pisces: Cyprinidae), in South Africa: Is there reason for concern? African Zoology 45, 244–253.

Vörösmarty CJ et al. 2010. Global threats to human water security and river biodiversity. Nature 467, 555–561. (doi:10.1038/nature09440).

Wang Z-C, Gan X, Li J, Mayden R, He S. 2012. Cyprinid phylogeny based on Bayesian and maximum likelihood analyses of partitioned data: implications for Cyprinidae systematics. Science China Life Sciences 55, 761–773. (doi:10.1007/s11427-012-4366-z). Draft Warren WC et al. 2008. Genome analysis of the platypus reveals unique signatures of evolution. Nature 453, 175–183.

Winter M, Devictor V, Schweiger O. 2012. Phylogenetic diversity and nature conservation: where are we? Trends in Ecology & Evolution 28, 199–204. (doi:10:1016/j.tree.2012.10.015)

Yang L, Sado T, Hirt MV, Pasco-Viel E, Arunachalam M, Li J, Wang X, Freyhof J, Saitoh K, Simons AM, Miya M, He S, Mayden RL 2015. Phylogeny and polyploidy: Resolving the classification of cyprinine fishes (Teleostei: Cypriniformes). Molecular Phlogenetics and Evolution 85: 97–116.

Yessoufou K, Daru BH, Tafirei R, Elansary HO, Rampedi I. 2017. Integrating biogeography, threat and evolutionary data to explore extinction crisis in the taxonomic group of cycads. Ecology and Evolution 2017, 1–12. (doi:10.1002/ece3.2660).

Yessoufou K, Davies, TJ. 2016. Reconsidering the loss of evolutionary history: How does non-random extinction prune the tree-oflife? In R. Pellens & P. Grandcolas (Eds.),

26

https://mc06.manuscriptcentral.com/genome-pubs Page 27 of 42 Genome

Biodiversity Conservation and Phylogenetic Systematics, Volume 14 of the series “Topics in Biodiversity and Conservation” (pp. 57–80). Switzerland: Springer International Publishing.

Zardoya R, Doadrio I. 1999. Molecular evidence on the evolutionary and biogeographical patterns of European Cyprinids. Journal of Molecular Evolution 49: 227–237.

Zheng LP, Yang JX, Chen XY, Wang WY. 2010. Phylogenetic relationships of the Chinese Labeoninae (Teleostei, Cypriniformes) derived from two nuclear and three mitochondrial genes. Zoologica Scripta 39: 559–571.

Draft

27

https://mc06.manuscriptcentral.com/genome-pubs Genome Page 28 of 42

Table 1: Top 50 EDGE-species of the African Cyprinidae alongside their geographic origin, habitat preference, IUCN conservation status and ED values (Evolutionary distinctiveness). My = million years.

Species names ED value EDGE IUCN Habitat Geographic (My) values category region (my) 1 Enteromius oboi 53.9956798938537 6,779843357 CR Benthopelagic Western 2 Larbeobarbus ruasae 33.580796585592Draft 6,315887238 CR Benthopelagic Eastern 3 Opsaridium microlepis 62.7623329137027 6,234604164 EN Dermasal Multiple

4 Enteromius carcharhinoides 30.576385056184 6,224998255 CR Pelagic Western

5 Pseudophoxinus punicus 61.078547639426 6,207842022 EN Benthopelagic Northern

6 Enteromius melanotaenia 23.6679463040357 5,978093402 CR Benthopelagic Western

7 Pseudobarbus erubescens 23.3066624728506 5,963339211 CR Benthopelagic Southern

28

https://mc06.manuscriptcentral.com/genome-pubs Page 29 of 42 Genome

Species names ED value EDGE IUCN Habitat Geographic (My) values category region (my) 8 Labeobarbus platystomus 19.986237430648 5,816455585 CR Benthopelagic Eastern

9 Pseudobarbus burchelli 19.8603106611476 5,810437064 CR Benthopelagic Southern 10 Labeo mesops 38.3398049239498Draft 5,751678396 EN Benthopelagic Southern 11 Labeobarbus macrophtalmus 35.5632785539664 5,67848596 EN Benthopelagic Eastern

12 Pseudobarbus asper 35.4206526611476 5,674577536 EN Dermasal Southern

13 Pseudobarbus capensis 33.580796585592 5,622740058 EN Benthopelagic Southern

14 Enteromius lauzannei 33.0186918482427 5,606351676 EN Benthopelagic Western

15 Labeo alluaudi 29.9529967944722 5,511911363 CR Benthopelagic Western

16 Labeobarbus ruandae 14.4267538467375 5,508691987 NT Benthopelagic Eastern

29

https://mc06.manuscriptcentral.com/genome-pubs Genome Page 30 of 42

Species names ED value EDGE IUCN Habitat Geographic (My) values category region (my)

17 Caecobarbus geertsii 59.9143490703587 5,495763125 VU Benthopelagic Central

18 Enteromius thysi 29.3893507316164 5,493533784 EN Benthopelagic Central

19 Labeobarbus acuticeps 27.8496649992284Draft 5,441539923 EN Benthopelagic Eastern

20 Enteromius liberiensis 25.9106724992284 5,371964497 EN Benthopelagic Western

21 52.947119389426 5,374298657 VU Benthopelagic Eastern

22 Pseudobarbus serra 25.1588345973147 5,343628518 EN Benthopelagic Southern

23 Garra tana 48.9593673902804 5,297504384 VU Benthopelagic Eastern

24 Labeobarbus mungoensis 23.3485395237216 5,271913411 EN Benthopelagic Central

30

https://mc06.manuscriptcentral.com/genome-pubs Page 31 of 42 Genome

Species names ED value EDGE IUCN Habitat Geographic (My) values category region (my) 25 Labeo seeberi 10.8682136917783 5,24645243 VU Benthopelagic Southern

26 Enteromius huguenyi 22.534560056184 5,237911523 EN Pelagic Western 27 Enteromius nigroluteus 22.2279768002686Draft 5,22479899 VU Benthopelagic Central 28 Garra allostoma 43.7875523902804 5,188224613 VU Benthopelagic Central

29 Labeobarbus roylii 21.2942188418983 5,183768942 EN Benthopelagic Southern

30 Enteromius pseudotoppini 43.4238783895173 5,180071488 VU Benthopelagic Eastern

31 Enteromius treurensis 20.2694942608749 5,136715393 EN Benthopelagic Southern

32 Pseudobarbus phlegethon 19.8603106611476 5,117289884 EN Benthopelagic Southern

33 Labeo curriei 9.31089292511167 5,105789624 EN Benthopelagic Western

31

https://mc06.manuscriptcentral.com/genome-pubs Genome Page 32 of 42

Species names ED value EDGE IUCN Habitat Geographic (My) values category region (my)

34 Labeobarbus alluaudi 38.3984029990523 5,060019644 VU Benthopelagic Western

35 Pseudobarbus trevelyani 18.6809443570549 5,059092418 EN Benthopelagic Southern

36 Labeobarbus petitjeani 38.0929351242284Draft 5,052236124 VU Benthopelagic Eastern

37 Labeo percivali 36.9076949910261 5,021448487 VU Benthopelagic Eastern

38 Enteromius laticeps 36.5457561444369 5,011854714 VU Benthopelagic Eastern

39 Labeobarbus mbami 17.5043983938537 4,997449996 EN Benthopelagic Central

40 Labeobarbus gruveli 34.0947875149094 4,944346977 VU Benthopelagic Western

41 Labeobarbus claudinae 16.4416934970474 4,93830506 EN Benthopelagic Eastern

32

https://mc06.manuscriptcentral.com/genome-pubs Page 33 of 42 Genome

Species names ED value EDGE IUCN Habitat Geographic (My) values category region (my) 42 Pseudobarbus burgi 16.4191944744809 4,93701427 EN Benthopelagic Southern

43 Pseudobarbus quathlambae 16.2284651611476 4,926004509 EN Benthopelagic Southern 44 Pseudobarbus afer 15.8085984744809Draft 4,901332111 EN Dermasal Southern 45 Pseudobarbus calidus 31.6698643348327 4,872747434 VU Benthopelagic Southern

46 Enteromius raimbaulti 30.5327711444369 4,83732172 VU Benthopelagic Western

47 Enteromius amatolicus 27.8794434111093 4,749424402 VU Benthopelagic Southern

48 Labeobarbus reinii 27.8630959169073 4,748858182 VU Benthopelagic Northern

49 Labeobarbus capensis 27.8630959169073 4,748858182 VU Benthopelagic Southern

33

https://mc06.manuscriptcentral.com/genome-pubs Genome Page 34 of 42

Species names ED value EDGE IUCN Habitat Geographic (My) values category region (my) 50 Enteromius aliciae 13.3920337639439 4,746116384 EN Pelagic Western

Draft

34

https://mc06.manuscriptcentral.com/genome-pubs Page 35 of 42 Genome

Caption Table:

Table 1: Top 50 EDGE-species of the African Cyprinidae alongside their geographic origin, habitat preference, IUCN conservation status and ED values (Evolutionary distinctiveness). My = million years.

Caption Figures:

Figure 1: Fully sampled phylogeny of the African Cyprinidae color-coded at subfamily level. Outgroups are excluded for analysis purpose. All 50,000 trees generated are available on dryad Digital Repository: (http://dx.doi.org/10.5061/dryad.k10k8).

Figure 2: Results of the tests of phylogenetic signal in threat status and drivers of extinction risk using D statistic (Fritz and Purvis, 2010). Threat status is measured as IUCN Red List categories. The graph in blue is the distribution of D values assuming a Brownian Motion (BM) model, and the blue vertical line indicates D = 0 (when the phylogenetic distribution of a parameterDraft is no different from BM). The graph in red is the distribution of D values assuming a random model, and the red vertical line indicates D = 1 (when the phylogenetic distribution of a parameter is no different from random). The bold black vertical line indicates the observed D value. The number of * is indicative of the significance level of the observed D values.

Figure 3: Comparison between observed (vertical red line) and random losses (frequency histogram generated from 1000 random replicates) of phylogenetic diversity from the tree of life of the African Cyprinidae. Observed loss is the PD that will be lost if all threatened species go extinct.

Figure 4: Variation in values of Evolutionary Distinctiveness across 539 species of the African Cyprinidae. IUCN categories are Least Concern (LC), Near Threatened (NT), Vulnerable (VU), Endangered (EN), Critically Endangered (CR), Data Deficient (DD), and Not-Evaluated (NE). Threat categories comprised threatened (VU+EN+CR) and non- threatened categories (LC+NT). Three scenarios are considered: A) when DD and NE are distinguished, B) when DD and NE are considered as LC (following Adeoba 2018), and C) when IUCN categories are grouped into threatened and non-threatened categories.

Figure 5: Geographic distribution of the top fish genera in the EDGE-based species categorization following urgency level for conservation. Top genera are identified based on 35

https://mc06.manuscriptcentral.com/genome-pubs Genome Page 36 of 42

their species richness in the top 50 EDGE species. Rivers where a representative of a genus occurs are color-coded to match the colors attributed to genus in the legend.

Supplemental information

Table captions:

Table S1 All data analyzed in the present study presenting the characteristics of African Cyprinidae.

Figure captions:

Figure S1 Distribution of threatened species (red branches) along the phylogeny. Threatened species are defined following IUCN red list (≥VU).

Figure S2 Results of the tests of phylogeneticDraft signal in habitat preference of fish species using D statistic. Habitat preference is defined as benthopelagic, demersal and pelagic. The graph in blue is the distribution of D values assuming a Brownian Motion (BM) model, and the blue vertical line indicates D = 0 (when the phylogenetic distribution of a parameter is no different from BM). The graph in red is the distribution of D values assuming a random model, and the red vertical line indicates D = 1 (when the phylogenetic distribution of a parameter is no different from random). The bold black vertical line indicates the observed D value. The number of * is indicative of the significance level of the observed D values.

Figure S2 Results of the tests of phylogenetic signal in geographic regions of species occurrence using D statistic. Regions are defined as central, eastern, northern, southern, and western Africa. "Multiple" indicates species occurring in more than one geographic regions. The graph in blue is the distribution of D values assuming a Brownian Motion (BM) model, and the blue vertical line indicates D = 0 (when the phylogenetic distribution of a parameter is no different from BM). The graph in red is the distribution of D values assuming a random model, and the red vertical line indicates D = 1 (when the phylogenetic distribution of a parameter is no different from random). The bold black vertical line

36

https://mc06.manuscriptcentral.com/genome-pubs Page 37 of 42 Genome

indicates the observed D value. The number of * is indicative of the significance level of the observed D values.

Draft

37

https://mc06.manuscriptcentral.com/genome-pubs Genome Page 38 of 42

Draft

Threatened Nonthreatened

https://mc06.manuscriptcentral.com/genome-pubs Page 39 of 42 Genome threat status* Overexploitation* Density Density 2 4 6 2 4 6 0 0 −0.5 0.0 0.5 1.0 Draft 0.0 0.5 1.0 D value D value

Invasive*** Pollution* Density Density 2 4 6 1 2 3 4 0 0

−1.0 −0.5 0.0 0.5 1.0 0.0 0.5 1.0 https://mc06.manuscriptcentral.com/genome-pubs D value D value Genome Page 40 of 42

Draft Frequency 0 50 100 150 200 250 300

1000 2000 3000 4000 5000 https://mc06.manuscriptcentral.com/genome-pubs randomly lost pd Page 41 of 42 A GenomeB C 100 100 100 80 80 80

60 60 Draft 60 Evolutionary distinctiveness 40 40 40 20 20 20

LC NT VU EN CR DD NE https://mc06.manuscriptcentral.com/genome-pubsLC NT VU EN CR nonthreatened threatened

IUCN categories IUCN categories Threat categories Genome Page 42 of 42

A Sierra Leone B C Guinea

Côte Tunisia Cameroon d'Ivoire Morocco

Spain Algeria Gabon Congo DRC

Liberia Mali Niger 0 200 400 0 450 900 0 300 600 Km Km Km Guinea-BissauDraft Genus name Barbus D E Malawi Comoros Caecobarbus Ethiopia Central African Republic Garra Labeo Labeobarbus ¯ Uganda Mozambique Leptocypris Kenya Rwanda South Africa Opsaridium Tanzania Lesotho Pseudobarbus 0 375 750 0 400 800 Pseudophoxinus Km Km African Countries

https://mc06.manuscriptcentral.com/genome-pubs