Molecular Phylogenetics and Evolution 66 (2013) 1027-1040


A large-scale phylogeny of Synodontis (Mochokidae, Siluriformes) reveals the influence of geological events on continental diversity during the Cenozoic

Aurelie Pinton a,*, Jean-François Agnese b, Didier Paugy c,d, Olga Otero a

a Universite de Poitiers, IPHEP - Institut de Paleoprimatologie, Paleontologie Humaine: Evolution et Paleoenvironnements, UMR CNRS 7262 INEE, SFA, Bat. B35, 6 rue Michel Brunet, F86022 Poitiers Cedex, France
b Universite de Montpellier, ISEM - Institut des Sciences de l'Evolution, UMR CNRS 5554, Bat. B24, 2 Place Eugene Bataillon, 34095 Montpellier, France
c Museum National d'Histoire naturelle, Departement «Milieux et Peuplements aquatiques», unite «Biodiversite et Dynamique des communautes aquatiques», US MNHN 0403, 43 rue Cuvier, 75231 Paris Cedex 05, France
d Institut de recherche pour le developpement, 43 rue Cuvier, 75231 Paris Cedex 05, France

ARTICLE INFO

Article history:
Received 27 April 2012
Revised 26 November 2012
Accepted 15 December 2012
Available online 28 December 2012

Keywords:
Synodontis
Africa
Phylogeography
Geotectonic events
Cenozoic

ABSTRACT

To explain the spatial variability of fish taxa at a large scale, two alternative proposals are usually evoked. In recent years, the debate has centred on the relative roles of present and historical processes in shaping biodiversity patterns. In Africa, the processes that determine the large-scale distribution of fishes, and the historical contingencies behind them, have been under-investigated, given that most phylogenetic studies focus on the history of the Great Lakes. Here, we explore phylogeographic events in the evolutionary history of Synodontis (Mochokidae, Siluriformes) over Africa during the Cenozoic, focusing on the putative role of historical processes. We discuss how known geological events together with hydrographical changes contributed to shaping Synodontis biogeographical history. Synodontis was chosen on the basis of its high diversity and distribution in Africa: it consists of approximately 120 species that are widely distributed in all hydrographic basins except the Maghreb and South Africa. We propose the most comprehensive phylogeny of this catfish genus. Our results provide support for the 'hydrogeological' hypothesis, which proposes that palaeohydrological changes linked with the geological context may have been the cause of diversification of freshwater fish deep in the Tertiary. More precisely, the two main geological structures that helped to shape the hydrographical network in Africa, namely the Central African Shear zone and the East African rift system, appear as strong drivers of Synodontis diversification and evolution.

© 2012 Elsevier Inc. All rights reserved.

* Corresponding author.
E-mail addresses: [email protected] (A. Pinton), [email protected] (J.-F. Agnese), [email protected] (D. Paugy), [email protected] (O. Otero).

1. Introduction

With almost 3000 freshwater fish species, the African fish fauna rivals that of Asia (>3500 species) and South America (>5000 species). However, Africa's evolutionary phenomena are what make its freshwater fauna distinctive. This continent is characterised by its species flocks, its high level of endemic taxa and the presence of relictual "living fossils", i.e., dipnoan and cladistian. Africa has more archaic and phylogenetically isolated freshwater fishes than any other continent, which might be related to long-term isolation of the plate (Otero, 2010). So far, studies focusing on freshwater fish diversity on the continent have mainly used cichlids as evolutionary models. Their rapid radiation in the Great Lakes of East Africa has attracted the most attention (e.g., Kocher et al., 1995; Brandstatter et al., 2005; Salzburger et al., 2005; Day et al., 2007). These studies illustrated the richness and ecomorphological diversity of the cichlid species and documented various selective pressures involved in ecomorphological differentiation and species divergence. However, the processes that determine the large-scale distribution of fish have been poorly investigated. Most hypotheses that aim to explain the spatial variability of taxa at a large scale were first proposed for South American faunas, in an environmental context dramatically different from that of Africa.

During the last 15 years of research in macroecology, the debate has centred on the relative roles of present and historical processes in shaping present biodiversity patterns. The "contemporary climate" hypothesis proposes that the current availability of resources controls the number of individuals that can live in a given area and consequently the number of species that can co-exist (Currie et al., 2004). Alternatively, the "historical" hypothesis proposes that species distributions, and thus biodiversity patterns, are determined by long-term processes acting in the past. Historical hypotheses stress the strong concordance between known geological events, climatic events and distribution patterns, and they include many variants. More precisely, some authors assume that

1055-7903/$ - see front matter © 2012 Elsevier Inc. All rights reserved.
http://dx.doi.org/10.1016/j.ympev.2012.12.009

Pleistocene glacial-interglacial alternation has been decisive in shaping the present-day biota, while others believe that modern species originated mostly in the Tertiary or earlier but before the Quaternary glaciations (Hewitt, 2000; Willis and Niklas, 2004). Concerning freshwater fish taxa, Lundberg (1998) proposed that much of the diversity in South America might be the outcome of palaeohydrological changes under geological and climatic controls. This proposal constitutes the "hydrogeological hypothesis", which has been supported by Montoya Burgos (2003). The latter study demonstrated that, in the historical biogeography of the South American catfish Hypostomus, the many documented hydrogeological changes likely lay at the origin of all main cladogenetic events observed in the phylogeny of this genus. More recently in Africa, Goodier et al. (2011) found that principal divergence events in Hydrocynus (tiger fish) have interfaced closely with evolving drainage systems across tropical Africa during the last 16 million years. Such a close relationship between phylogeographical patterns and geological, climatic and hydrological processes is certainly not exclusive to the genus Hydrocynus, and several findings suggest that the hydrogeological hypothesis could be extended to many other fish taxa on the African continent. In one such finding, based on fish fossil reports, a modern-type fish fauna can be recognised at least since the Early Miocene (23-16 Ma; Otero and Gayet, 2001; Stewart, 2001), with probable roots earlier in the Palaeogene (Otero and Gayet, 2001). Thus, the diversity and distribution of modern species are undoubtedly influenced by hydrological, geological and climatic events that occurred in Africa during this time period. Furthermore, the ecological properties of freshwater fishes make them sensitive to the evolution of a drainage system, which is under climatic and geological control. Indeed, their dispersal is restricted by river divides, and the major catchments of Africa have existed since before the Miocene, although the details of their configuration have changed over time, notably in eastern Africa (Thieme et al., 2005). Hence, phylogeographical analysis of fish groups exhibiting wide distribution ranges and inhabiting all major river basins of Africa should reflect historical changes that can be detected in their modern phylogeographic pattern. Here, we test the applicability of the hydrogeological hypothesis on the African continent by investigating the timing of diversification of the catfish genus Synodontis. This genus was chosen on the basis of its high diversity and large distribution. It is endemic to African freshwaters and numbers approximately 120 species widely distributed in all hydrographic basins, except Maghreb and southern Africa, and inhabits almost all types of aquatic environments, from the woodlands of the Congolese basin to the savannah grasslands of the . This geographic coverage is a requirement for exploring how biotic history relates to evolving landscapes on a continental scale. The ichthyological provinces serve as the geographical unit of our study (Fig. 1). They are defined by homogeneous ichthyological content (Roberts, 1975; Leveque, 1997) and hence by a distinctive evolutionary history that is moulded by the geology-climate balance through time. Our samples include 51 species that broadly inhabit 70 hydrographic basins belonging to eleven ichthyological provinces (Fig. 1; Leveque et al., 2008).

We formulate temporally and geographically explicit hypotheses prior to data collection and analysis, and use information from independently derived datasets to test factors proposed to drive speciation. Below, we describe precise historical factors that might explain fish diversity in Africa, namely, past geology and hydrology.

Most contemporary river basins of Africa formed during the Tertiary, when Central Africa emerged above sea level (Stankiewicz and de Wit, 2006). Two main geological structures helped to shape the hydrographical network in Africa: the Central African Shear zone (CAS) and the East African rift system (EARS). The Congo basin is by far the most ancient delineated drainage system that existed on the continent. Indeed, the early Cenozoic CAS, associated with the generalised African uplift, produced the drainage divide between the Congolese and Nilo-Sudan waters, south to the paleo-Lake basin and bordered by the South Sudan Rift in the east (Fairhead, 1988; Stankiewicz and de Wit, 2006). If the CAS geological structure constitutes a major dispersal barrier that influenced speciation, we expect a strong phylogeographic structuring across this barrier, with the presence of vicariant lineages in the Congolese and in the Nilo-Sudan regions.

Faunal relationships between Congolese and Zambezian provinces are observed (Katongo et al., 2007; Koblmuller et al., 2008). The Chambeshi River, which is now part of the Congo basin, was formerly part of the Zambezi province, as it constituted the headwaters of the Kafue (Moore and Larkin, 2001). The capture of the Chambeshi by the Congolese Luapula River is assumed to date to the Pliocene on geological evidence, and the molecular estimates of divergence times of allopatric taxa in at least three lineages (Hydrocynus, Mastacembelus, cichlids) from those watersheds span the interval from 4 to 1 million years ago (Cotterill and De Wit, 2011). As an alternative hypothesis, Bell-Cross (1965) proposed that the most likely dispersal route out of the Congo was from the Kasai River into Upper Zambezi headwaters and that this waterway was active until recently in the Pleistocene. Goodier et al. (2011) proposed that the timing of southward dispersal concurs with constraints on activity in the western branch of the active EARS, currently geologically dated to have started between 4 and 12 Ma. If the geological context in the area is indeed responsible for speciation events, we expect that Zambezian Synodontis species are related to Congolese ones and that their dispersal within the Zambezian region occurred around the Pliocene. Moreover, in the area, rifting activity in the western branch of the active EARS is also responsible for the formation of Lake Malawi, and geological evidence from deposits surrounding the lake suggests that a deep lake may have first existed between approximately 4.5 and 8 Ma (Ebinger et al., 1993).

In addition to its Zambezian association, the Congo basin exhibits a close relationship with Lake Tanganyika. Prior to rifting, the proto-Malagarasi River, an ancient river system located in the area of the modern Tanganyika, possibly drained west into the Congo River system (Leveque, 1997). These geological data are corroborated by phylogenetic studies, as in cichlids the Tanganyikan radiation falls among the lineages of fishes that inhabit the Congo River system. However, alternative proposals based only on fish distribution have been considered. Notably, a hydrographic connection between the Tanganyika and the Nilo-Sudan area is suggested by some authors, who propose that taxa of certain groups (i.e., Polypterus, characids, and cyprinids) penetrated into Lake Tanganyika from the Nile system via the Rusizi (Snoeks et al., 1997).

The first radiation of cichlids in the Tanganyika is associated with the beginning of the rifting in the central basin approximately 9-12 Ma (Cohen et al., 1997). This radiation is thought to result from diversification that took place at the onset of lake formation, before full lacustrine conditions. A second radiation is dated approximately 5-6 Ma in Lake Tanganyika, when the development of real lacustrine habitats, including deep waters, is recorded (Koblmuller et al., 2005, 2007a, 2007b; Duftner et al., 2005). Lake Tanganyika is also a source of recent cichlid diversity in East and South Africa (Salzburger et al., 2005). Some of the haplochromine Tanganyikan lineages seeded cichlid radiations in distant lakes and rivers, notably in Lakes Victoria and Malawi and in the Congolese/South African (CSA) drainage systems. The molecular clock calibration for the most recent common ancestor to the Malawi-Victoria-CSA lineages dates to approximately 2.4 Ma (1.2-4.0 Ma), 2 Ma (1.2-3.9 Ma) for the first branching events within the CSA clade and 1.8 Ma (0.7-3.8 Ma) for the most recent common ancestor of the modern haplochromines from the Malawi

Fig. 1. The main ichthyological provinces in Africa, after Leveque (1997). In brackets, we indicate the acronyms used in the text for each region and the number of Synodontis species in each ichthyological region. Sample localities of Synodontis are shown on the map, with numbers referring to the localities defined in Table 1.

and Victoria Lakes (Salzburger et al., 2005). This geographical pattern appears to be robust, as it is found with different sampling (see Koblmuller et al., 2008), but different dating can be obtained with alternative calibration methods (Genner et al., 2007). For instance, Genner et al. (2007) suggested that the cichlid flock of Lake Malawi originated either 4.6 Ma (Gondwana calibration) or 2.4 Ma (fossil calibration). In Mastacembelus, the Tanganyikan species are related to South and East African species (Brown et al., 2010), as they form the sister group of a Southern-Eastern clade that branches off earlier than the cichlids, approximately 11.9 Ma (8.7-15.9 Ma). Brown et al. (2010) proposed that Mastacembelus shiranus, endemic to Lake Malawi, colonised the lake approximately 3.9 Ma, after deep-water conditions were established (4.5 Ma). Once again, in cichlids and Mastacembelus, it is proposed that phylogeographic patterns have been influenced by climate- and/or geology-induced changes in the environment, with river capture events most likely playing an important role in species dispersal.

The aim of our analysis is to test the applicability of the "hydrogeological hypothesis" to Synodontis fishes. We expect that much of the diversity of the genus might be the outcome of palaeohydrological changes influenced by geological processes.

2. Materials and methods

2.1. Taxon sampling and laboratory protocols

A total of 60 Synodontis samples representing 51 species and three outgroup taxa were included in the analyses. In addition to the data available in GenBank and from research on the Tanganyikan Synodontis flocks (Day and Wilkinson, 2006; Koblmuller et al., 2006; Day et al., 2009), we sampled and sequenced 12 species elsewhere in Africa to obtain a data set that reflects the global distribution and diversity of Synodontis in the African freshwaters (Fig. 1, Table 1). Synodontis species were collected in Chad, , Egypt, , Kenya and in 2004, 2005, 2006 and 2007. Muscle tissues were preserved in 95% ethanol in the field. Voucher specimens for each species are deposited in the Museum National d'Histoire Naturelle (Paris). Three non-Synodontis species were also used: Microsynodontis sp. and Chiloglanis sp., which belong to the family Mochokidae, and Auchenoglanis sp. (Auchenoglanididae).

We used two molecular markers, the mitochondrial gene cyt b and the first intron of the ribosomal nuclear marker S7 (rpS7). Cyt b is useful for elucidating both deep-level and shallow-time relationships, but in order to validate our phylogenetic hypothesis, we used the nuclear gene S7, which evolves more slowly and is inherited via a different mechanism.

Total DNA was extracted from muscle tissue preserved in alcohol using the GenElute Mammalian Genomic DNA Miniprep Kit from Sigma-Aldrich, Inc. The mtDNA cytochrome b region was PCR amplified using two newly designed primers: 5'-GACTTGAAGAACCACCGTTG-3' forward and 5'-TTTAGAATTCTGGCTTTGGGAG-3' reverse. The amplification protocol consisted of 35 cycles beginning with 3 min at 93 °C for initial denaturation, followed by cycles of 30 s at 93 °C, 30 s at 51 °C for annealing, 1 min 30 s at 72 °C for extension, with a final 5 min extension step at 72 °C. The primers used amplified a 1.2 kb fragment, of which 960 bp was sequenced. rpS7 was amplified with the following primers designed by Chow and Hazama (1998): 5'-TGGCCTCTTCCTTGGCCGTC-3' forward and 5'-AACTCGTCTGGCTTTTCGCC-3' reverse. Amplification was carried out with an initial denaturation at 95 °C for 1 min 30 s, followed by 30 cycles of amplification (denaturation at 95 °C for 30 s, annealing at 58 °C for 1 min and extension at 72 °C for 2 min) and a final extension at 72 °C for 5 min. Fragments were purified with the ExoSAP-IT kit (Amersham Biosciences) and sequenced using the original primers with the BigDye Terminator Reaction Mix from Applied Biosystems. Sequencing reactions were electrophoresed on an ABI 3130 XL automated sequencer (Applied Biosystems). All chromatograms were verified visually.
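The two amplification programmes described in Section 2.1 can be written down as plain data, which makes the cycle counts and hold times easy to check. The sketch below is illustrative only: the class and field names are assumptions introduced here, the 3 min step at 93 °C is read as a single initial denaturation preceding the 35 cycles, and the totals count hold times only (ramping between temperatures is ignored).

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Step:
    """One temperature hold (hypothetical helper type)."""
    name: str
    temp_c: int
    seconds: int


@dataclass(frozen=True)
class PcrProgram:
    """A thermocycling programme plus its primer pair (hypothetical helper type)."""
    marker: str
    forward: str
    reverse: str
    initial: Step
    cycle: Tuple[Step, ...]
    n_cycles: int
    final: Step

    def total_seconds(self) -> int:
        # Hold time only: initial step + repeated cycle + final extension.
        per_cycle = sum(step.seconds for step in self.cycle)
        return self.initial.seconds + self.n_cycles * per_cycle + self.final.seconds


def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in a primer sequence."""
    seq = primer.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


# Values transcribed from the text; the 35 cycles are assumed to follow the 3 min step.
CYT_B = PcrProgram(
    marker="cyt b",
    forward="GACTTGAAGAACCACCGTTG",
    reverse="TTTAGAATTCTGGCTTTGGGAG",
    initial=Step("initial denaturation", 93, 180),
    cycle=(
        Step("denaturation", 93, 30),
        Step("annealing", 51, 30),
        Step("extension", 72, 90),
    ),
    n_cycles=35,
    final=Step("final extension", 72, 300),
)

RPS7 = PcrProgram(
    marker="rpS7",
    forward="TGGCCTCTTCCTTGGCCGTC",
    reverse="AACTCGTCTGGCTTTTCGCC",
    initial=Step("initial denaturation", 95, 90),
    cycle=(
        Step("denaturation", 95, 30),
        Step("annealing", 58, 60),
        Step("extension", 72, 120),
    ),
    n_cycles=30,
    final=Step("final extension", 72, 300),
)

if __name__ == "__main__":
    for prog in (CYT_B, RPS7):
        minutes = prog.total_seconds() / 60
        print(f"{prog.marker}: {minutes:.1f} min hold time, "
              f"forward primer {len(prog.forward)} nt, "
              f"GC {gc_fraction(prog.forward):.2f}")
```

Under these assumptions the cyt b programme holds for 95.5 min and the rpS7 programme for 111.5 min; actual run times on a thermocycler would be longer because of ramping.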

Table 1. Synodontis species used in this study and species analysed as outgroups: species, collector, collection locations, sample size and GenBank accession numbers. Numbers in brackets indicate sample localities reported in Fig. 1.

Species | Collector | Locality/river and/or (basin) | Accession No.: Cyt b | Accession No.: rpS7

Outgroup
Chiloglanis asymetricaudalis | Day (2009) | Kigoma/Mukuti Bridge (1) | DQ886604 | -
Microsynodontis sp. | Day (2009) | Cuvette-Ouest/Lekenie river (Congo) (2) | DQ886604 | -
Auchenoglanis sp. | Pinton A. | Bamako/n.a. (Niger) (3) | EU781895 | -

Synodontis
Synodontis frontosus | Day (2009) | Wanseko/Lake Albert (Lake Albert) (4) | FM878850 | FM878912
Synodontis serratus | Day (2009) | Wanseko/Lake Albert (Lake Albert) (4) | FM878849 | FM878911
Synodontis schall 1 | Pinton A. & Otero O. | Bamako/n.a. (Niger) (3) | EU781912 | KC020677
Synodontis schall 2 | Pinton A. & Otero O. | Mailao/Chari (Chad) (5) | EU781914 | KC020678
Synodontis schall 3 | Pinton A. & Otero O. | Manantali/Senegal (Senegal) (6) | EU781908 | KC020679
Synodontis schall 4 | Pinton A. & Otero O. | Manantali/Senegal (Senegal) (6) | EU781909 | KC020680
Synodontis violaceus | Pinton A. & Otero O. | N'Djamena/n.a. (Chad) (7) | EU781943 | KC020688
Synodontis filamentosus 1 | Pinton A. & Otero O. | N'Djamena/n.a. (Chad) (7) | EU781936 | KC020682
Synodontis filamentosus 2 | Pinton A. & Otero O. | N'Djamena/n.a. (Chad) (7) | EU781937 | KC020683
Synodontis clarias 1 | Pinton A. & Otero O. | N'Djamena/n.a. (Chad) (7) | EU781927 | KC020684
Synodontis clarias 2 | Day (2009) | Nana-Grebiz/n.a. (8) | FM878875 | FM878925
Synodontis sorex | Pinton A. & Otero O. | N'Djamena/n.a. (Chad) (7) | EU781944 | KC020685
Synodontis courteti 1 | Pinton A. & Otero O. | N'Djamena/n.a. (Chad) (7) | EU781940 | -
Synodontis courteti 2 | Pinton A. & Otero O. | N'Djamena/n.a. (Chad) (7) | EU781941 | -
Synodontis ocellifer | Pinton A. | Manantali/Senegal (Senegal) (6) | EU781931 | KC020675
Synodontis membranaceus 1 | Pinton A. & Otero O. | N'Djamena/n.a. (Chad) (7) | EU781924 | KC020686
Synodontis membranaceus 2 | Pinton A. & Otero O. | N'Djamena/n.a. (Chad) (7) | EU781925 | KC020687
Synodontis batensoda | Pinton A. & Otero O. | N'Djamena/n.a. (Chad) (7) | EU781923 | KC020676
Synodontis nigrita | Pinton A. & Otero O. | N'Djamena/n.a. (Chad) (7) | EU781917 |
 | Day (2009) | Wanseko/Lake Albert (Lake Albert) (4) | | FM878880
Synodontis budgetti 1 | Pinton A. & Otero O. | Bamako/n.a. (Niger) (3) | EU781938 | -
Synodontis budgetti 2 | Pinton A. & Otero O. | Bamako/n.a. (Niger) (3) | EU781939 | -
Synodontis afrofischeri | Nyingi D. | Lake Victoria/n.a. (Lake Victoria) (9) | EU781946 |
 | Day (2009) | Jinja Lake/Lake Victoria (Lake Victoria) (10) | | FM878890
Synodontis af. afrofischeri | Day (2009) | Kigoma/Malagrasi (Lake Victoria) (11) | DQ886616 | FM878889
Synodontis victoriae | Nyingi D. | Lake Victoria/n.a. (Lake Victoria) (9) | EU781929 |
 | Day (2009) | Jinja Lake/Lake Victoria (Lake Victoria) (10) | | FM878908
Synodontis njassae | Day (2009) | Nkhata Bay/Lake Malawi (Lake Malawi) (12) | DQ886620 | FM878907
Synodontis irsicae | Day (2009) | Mpulung/Lake Tanganyika (Tanganyika) (13) | DQ886653 | FM878900
Synodontis polli | Day (2009) | Mpulung/Lake Tanganyika (Tanganyika) (13) | DQ886645 | FM878899
Synodontis petricola | Day (2009) | Mpulung/Lake Tanganyika (Tanganyika) (13) | DQ886638 | FM878901
Synodontis af. tanganaicae | Day (2009) | Kigoma/Lake Tanganyika (Tanganyika) (14) | DQ886658 | FM878904
Synodontis multipunctatus | Day (2009) | Mpulung/Lake Tanganyika (Tanganyika) (14) | DQ886625 | FM878896
Synodontis granulosus | Day (2009) | Ikola/Lake Tanganyika (Tanganyika) (15) | DQ886651 | FM878898
Synodontis grandiops | Day (2009) | Jakobsen's Beach/Lake Tanganyika (15) | FM878846 | FM878897
Synodontis lucipinnis | Day (2009) | Kigoma Lake/Tanganyika (15) | DQ886631 | FM878903
Synodontis angelicus | Day (2009) | n.a./n.a. | DQ886605 | FM878882
Synodontis pleurops | Day (2009) | Cuvette-Ouest/Mambili (Congo) (16) | DQ886612 | FM878886
Synodontis congicus | Day (2009) | Cuvette-Ouest/Mambili (Congo) (16) | DQ886607 | FM878885
Synodontis decorus | Day (2009) | Cuvette-Ouest/Lekoli (Congo) (16) | DQ886609 | FM878884
Synodontis brichardi | Day (2009) | n.a./n.a. | DQ886606 | FM878883
Synodontis greshoffi | Day (2009) | Cuvette-Ouest/Congo (Congo) (16) | DQ886610 | FM878894
Synodontis nigriventris | Day (2009) | Cuvette-Ouest/Lekoli (Congo) (17) | DQ886611 | FM878887
Synodontis contractus | Day (2009) | Cuvette-Ouest/Lekenie (Congo) (2) | DQ886608 | FM878881
Synodontis leopardinus | Day (2009) | Okavango/Thoage (Zambezi) | FM878860 | FM878918
Synodontis macrostiga | Day (2009) | Okavango/Okavango (Zambezi) (18) | FM878867 | FM878923
Synodontis nebulosus | Day (2009) | Manica Area/Buzi (Zambezi) (21) | FM878862 | -
Synodontis nigromaculatus | Day (2009) | Okavango/Thoage (Zambezi) (18) | DQ886615 | FM878895
Synodontis af. thamalakalensis | Day (2009) | Okavango/Okavango, Boro (Zambezi) (19) | FM878861 | FM878919
Synodontis vanderwaali | Day (2009) | Okavango/Okavango, Boro (Zambezi) (19) | FM878864 | FM878921
Synodontis woosnami | Day (2009) | Okavango/Maunachira (Zambezi) (20) | FM878866 | FM878922
Synodontis zambezensis | Day (2009) | Maputo/Maputo (Zambezi) (22) | FM878858 | FM878905
Synodontis batesii | Agnese J.-F. | n.a./Nyong (Nyong) (23) | EU781934 | -
Synodontis steindachneri | Agnese J.-F. | n.a./Nyong (Nyong) (23) | EU781935 | KC020681
Synodontis gambiensis | Day (2009) | Magburaka/Rokel (Rokel) (24) | FM878868 | FM878924
Synodontis thysi | Day (2009) | Magburaka/Rokel (Rokel) (24) | FM878872 | FM878916
Synodontis waterloti | Day (2009) | Bumbuna/Rokel (Rokel) (25) | FM878869 | FM878878
Synodontis katangae | Day (2009) | Luapula/Luapula (Congo) (26) | FM878856 | -
Synodontis rebeli | Day (2009) | Sud/Lobe (Lobe) (27) | DQ886613 | FM878877
Synodontis soloni | Day (2009) | Cuvette-Ouest/Congo (Congo) (2) | DQ886614 | FM878888
Synodontis unicolor | Day (2009) | Luapula/Lake Mwueru (Mwueru) (26) | FM878874 | FM878891

2.2. Sequence alignment and phylogenetic analysis

The sequences were aligned using ClustalX (Thompson et al., 1997) and the alignment was then refined manually. We combined sequences into a single dataset. To identify any conflict among the two partitions (cyt b and rps7), we conducted a 1000-pseudoreplicate partition homogeneity test or incongruence-length difference (ILD) test (Farris et al., 1995).

We inferred phylogenetic hypotheses from the dataset using two probabilistic approaches: maximum likelihood (ML) and Bayesian inference (BI). ML analyses were run using GARLI and GARLI-PART (Genetic Algorithm for Rapid Likelihood Inference, ver. 0.97; Zwickl, 2006), which provides considerable advantages over PAUP in terms of computational efficiency and allows partitioning of the data. It uses a genetic algorithm that finds the tree topology, branch lengths and model parameters that maximise ln(L) simultaneously (Zwickl, 2006). BI analysis was performed using MrBayes version 3.0b4 (Huelsenbeck and Ronquist, 2001). With the two probabilistic methods, the choice of an adequate sequence evolution model remains a crucial issue. The search for the optimal model of nucleotide substitution was conducted using MrModeltest 2.0 (Nylander, 2004) based on the Akaike Information Criterion (AIC) values (Posada and Buckley, 2004).

In our Bayesian analysis, we explored different data partitioning strategies on the combined data set to improve the fit of the substitution model to the data. Four partitioning strategies were used. In the first approach (four partitions: 4P), we ascertained the best-fit model and model parameters based on the AIC values for each codon position of Cyt b and for the unpartitioned gene rps7. Within the MrBayes analysis, each codon position of Cyt b was given a separate (unlinked) model. We ran a second MrBayes analysis with the dataset divided into three partitions (3P), corresponding to first and second positions versus the third for the Cyt b gene, while rps7 was unpartitioned. A third strategy consisted of dividing the dataset into two partitions corresponding to the two genes (2P), and in the fourth strategy, the data were unpartitioned (0P). Searches were conducted using the default parameters, starting with random trees, including three heated chains and one cold chain, for 5,000,000 generations in which parameters and trees were sampled every 100 generations. The log likelihood was plotted against generation to identify the convergence point, and the burn-in was discarded. Partitioning strategies were compared by calculating Bayes factors. For each of the four runs, the harmonic means of the likelihoods were calculated using the sump command in MrBayes 3.1.2. Following Kass and Raftery (1995), a 2Δln Bayes factor > 10 was interpreted as strong evidence against the alternative topology tested.

We estimated the maximum likelihood (ML) topology using the program GARLI-PART (ver. 0.97; Zwickl, 2006) with model parameters estimated during the run. This application allows partitioning of data into subsets that may each be assigned to separate evolutionary models with independently estimated parameters. We explored the same data partitioning strategies that were defined for the MrBayes analysis. AIC was used to assess the optimal data partition model. It was calculated as the likelihood score estimated by GARLI under each partitioning scheme, penalised by a function of the number of free parameters in the model. The model with the smallest AIC was considered the best.

Five runs were conducted to ensure the topology converged on the same maximum likelihood tree. For each run, we performed 500 independent search replicates (searchreps = 500). Nodal support was inferred by bootstrap proportions after 500 bootstrap replicates with two independent search replicates each (bootstrapreps = 500 and searchreps = 3). Bootstrap values were obtained by importing the trees into PAUP* (Swofford, 2001) and generating a majority-rule consensus topology.

2.3. Estimation of time

The first step was to determine whether lineages exhibit a consistent substitution rate, i.e., whether they obey the molecular clock assumption. We performed a likelihood-ratio test on the likelihood scores of trees with and without the molecular clock enforced. Degrees of freedom were calculated as the number of taxa minus 2. The likelihood-ratio test revealed significant differences between the tree with and without the molecular clock (-Ln clock = 12229.935, -Ln non-clock = 12184.297, LRT = 91.926, d.f. = 61, p < 0.01). The differences in log likelihoods are significant, and the molecular clock hypothesis was rejected.

We used the complete data set to estimate approximate divergence times for Synodontis using an uncorrelated log-normal relaxed clock rate variation model (Drummond et al., 2006) as implemented in the software BEAST version 1.4.2 (Drummond and Rambaut, 2007). A Yule speciation process was used as the tree prior. Divergence times in the tree were estimated using the same partition model selected for the MrBayes analysis (see below). Because fossils correctly placed in a phylogeny can only provide minimum age estimates for a particular lineage (Marshall, 1990), estimates of substitution rates were calibrated using fossil information in the form of probabilistic priors rather than as fixed values. Specifically, a lognormally distributed prior (with a rigid lower bound, mean and standard deviation parameters) was chosen to accommodate some of the uncertainties associated with the use of fossil data to specify the ages of the internal nodes used as calibration points. This prior is generally regarded as the most appropriate for modelling palaeontological information (Hedges and Kumar, 2004; Drummond et al., 2006), and its use implies that the actual cladogenetic event is likely to have occurred at some time prior to the earliest appearance of the fossil (Ho, 2007). Indeed, fossils can only provide minimum age estimates, and their appearance must postdate the origin of the clade to which they belong. It is unclear by how much the appearance of a clade predates the age of the first fossil. Therefore, we ensured that the 95% probability interval included the oldest reasonable age for the clade.

In Africa, the oldest records of catfish assignable to extant families are from the Late Palaeocene of (White, 1935), and six families have been described: Ariidae, Bagridae, Clariidae, Claroteidae, Mochokidae and Schilbeidae. Among these six families, Mochokidae is by far the best represented, with the genera Synodontis and Mochokus found. We calibrated the tree using two fossil constraints. The earliest evidence of the family Mochokidae is a tooth discovered in Eocene deposits on the north shore of Birket Qarun in the Fayum Depression in Egypt, which is dated approximately 37 Ma and could belong to Synodontis, Chiloglanis or Microsynodontis (Murray et al., 2010). So far, we have observed this type of tooth only in Mochokidae and not in any other African fish taxa, either modern or fossil (Otero and Pinton, pers. obs.). Moreover, some undescribed fossil material from sub-contemporary strata in Libya (one tooth, Otero et al., in preparation) and slightly older material from Algeria (nuchal shield, Otero et al., in preparation) have been identified by Otero and Pinton as likely mochokid. Thus, the presence of Mochokidae in the Late Eocene is confidently ascertained. The absence of any mochokid record from strata older than the Late Eocene and their richness in much younger strata (Late Miocene and Pliocene) do not constitute proof of the absence of the family before the Late Eocene. Indeed, the freshwater fossil record is shaped by multiple factors including geological, climatic and sedimentological context (Otero, 2010). Consequently, the absence of taxa during certain periods only reflects the quality of the fossil record: Early Palaeogene and Oligocene deposits with freshwater fishes are scarce (ibid.). Therefore, a lognormal prior distribution with the following parameters was implemented to place the most recent common ancestor (MRCA) of Synodontis, Chiloglanis and Microsynodontis: offset = 37 Ma; log mean = 1; log standard deviation = 1.2, to set 65 Ma as the 97.5% soft upper bound. This upper bound corresponds to the limit between the Paleocene and the Cretaceous and is based on the earliest occurrence of catfish on the African continent. The earliest fossil Synodontis dates from the Early Miocene (Burdigalian: 23.4-16 Ma) of Egypt (Priem, 1920) and was discovered in a faunal assemblage whose age is 18 Ma (Miller, 1999). This fossil is identified as Synodontis based upon the presence of a supraoccipital that is truncated posteriorly and exhibits the typical posterior notch allowing insertion of the anterior nuchal plate. This fossil was used to place the MRCA of Synodontis with the following parameters: offset = 18 Ma; log mean = 1; log standard deviation = 1, to set 37 Ma as the 97.5% soft upper bound. This upper bound is based on the earliest occurrence of the family Mochokidae.

The Markov chains were run for 40 million generations, starting from a randomly chosen tree and sampling and saving every 1000th tree, to obtain effective sample sizes of more than 200 for all parameters. In all, 25% of samples were removed as burn-in. Convergence of the chain to stationary distributions was confirmed by inspection of the MCMC samples in each analysis using the program Tracer 1.4 (Drummond and Rambaut, 2007). The BEAST analysis was performed twice (to determine whether the two independent runs converged on the same posterior distribution) and log output files were compared using Tracer. Analyses were also run without data to sample from the joint prior distribution only.

2.4. Range evolution

The phylogenetic tree and estimated divergence times of the Synodontis lineages generated by BEAST were employed to reconstruct the possible historical distribution for this group. Both the Maximum Likelihood-based estimation (dec-model) implemented

gene trees (ILD test not significant, p < 0.005), and we present the combined data analyses here.

Nucleotide substitution models selected by AIC under different partition models are presented in Table 2.

A consensus obtained from BI and ML analyses is shown in Fig. 2 (Fig. 2A). Under the Bayesian criterion, Bayes factors favoured the 3P partition model (Table 3), with twice the log of the Bayes factor being 92.22, which is well above Kass and Raftery's critical value of 10 (Kass and Raftery, 1995). Under the likelihood criterion, the 3P partition model obtained the smallest score (Table 4) and was selected as the best-fit partition scheme. The same nodes are globally recovered after conducting BI and ML analyses. However, we found minor differences within subclades. The ML analysis recovered S. batesii as the sister taxon of S. budgetti, S. rebeli, S. waterloti and S. nigrita, while the BEAST analysis found that S. batesii formed a clade with S. unicolor, S. greshoffi and S. afrofischeri. Alternative positions of S. batesii are weakly supported in both analyses and its placement is unresolved in the MrBayes analysis (Fig. 2B). Following partitioned maximum likelihood analysis, S. courteti nests within the Nilo-Sudan species at node G (Fig. 2A), reinforcing geographical phylogenetic evidence, whereas in Bayesian analysis, S. courteti forms a clade with I and J (Fig. 2B). Because a KH test implemented in PAUP indicates that the monophyly of S.
courteti + I + J could not be rejected, and because in Lagrange (Ree et al., 2005; Ree and Smith, 2008) and Bayesian this monophyletic relationship was sustained in a maximum parsi­ Binary MCMC Analysis (BBM) provided in RASP v2.0b (Yu et al., mony analysis, we favour this hypothesis without excluding the 2010) were used in the reconstruction efforts. By using multiple other. The age o f the m ost recent com m on ancestor to the Synodon­ trees from a Bayesian analysis, BBM has the advantage that uncer­ tis genus was estimated at approximately 20.8 Ma (95% HPD: 18­ tainties in phylogenetic inference can be taken into account. In 28). Trees obtained from Maximum Likelihood and Bayesian anal­ contrast, Lagrange uses a likelihood approach that considers the ysis both recovered the same two major clades (nodes B and C, branch length of a given tree. W e used the BEAST MCMC tree chain Fig. 2) that roughly correspond to a divide between two great geo­ as input for the BBM analysis, and w e used the BEAST maximum graphical areas. Congolese, Zambezi and Guinean lineages gather clade credibility tree as the input tree for Lagrange. A Python script in a clade that includes all the Congolese endemic species (node was created using the online Lagrange configurator. Dispersal be­ B, Fig. 2A), whereas the clades that belong to the Nilo-Sudan and tween non-adjacent areas was restricted. W e defined the following Great Lakes regions gather in a second major group (node C, eleven geographically relevant areas corresponding to the recogni­ Fig. 2A). The mean age for those two major groups of Synodontis sed ichthyoregions (Fig. 1): Angola (A), Congo (C), Lower species were estimated at approximately 18.9 Ma (95% HPD: 15­ (LG), Malawi (M ), Upper Guinea UG), Nilo-Sudanese (NS), Zambezi 26) and 16.9 Ma (95% HPD: 12-23), respectively. Within clade B (Z), East coast (E), Eburneo-Guinean (EG), Tanganyika (T) and the diversification of the Congolese/Zambezi species (nodes D Victoria (V). 
W e chose not to place dispersal time constraints on and E, Fig. 3) starts approximately 15 Ma (95% HPD: 10-21 Ma, the analyses to optimise the data and allow all practical range Fig. 3). All the species included in clades D and E are either endemic evolution scenarios to occur. Ancestral ranges were assumed to in­ to the Congo basin or distributed in the Zambezi province except clude no m ore than five areas, th e m axim u m observed for Synodon- S. steindachneri and S. batesii w h ich are present in th e n orthern part tis extant species. of Lower Guinea and S. afrofischeri, w h ich is found in the Great Lakes region (Fig. 2A). Finally, clade F (Fig. 2A), which diversified 3. Results approximately 15.4 Ma (95% HPD: 11-21) presents a mixed geo­ graphical pattern including species located in the Nilo-Sudan area, 3.1. Dataset characteristics and phylogenetic relationships Upper and Lower Guinea. The second major group of Synodontis (node C, Fig. 2A) is A 954 base-pair alignment for the cyt b region of the mitochon­ formed by an assembly of Nilo-Sudan and Great Lakes species. drial genome was obtained after trimming the ends of each se­ The ancestor of the clade C appeared approximately 16.9 Ma quence, whereas the rpS7 fragment obtained was 575 bp long. (95% HPD: 12-23) and mostly diversified between 15 Ma and The resulting combined data matrix results in 1529 sites. Sequence gaps found from our alignment are not included in the data matrix as coded insertion/deletion characters (indels). Data for the rpS7 Table 2 intron1 were missing for S. katangae, S. nigromaculatus, S. batesii, Nucleotide substitution models selected by S. budgetti and S. courteti. AIC under different partition models. A total of 689 variable sites were identified for all samples, of Gene M odel which 537 were parsimony informative (i.e., shared by at least two different sequences). 
The cyt b fragment contains more parsi­ cytb GTR + I + G cytb 1st position HKY + I mony-informative characters (informative for parsimony in PAUP, cytb 2nd position GTR + G PI = 342) than the rpS7 fragment (PI = 195). Considering the com­ cytb 1st and 2nd GTR + I + G plete dataset, pairwise p-distance values range from 6% to 19% be­ position tw e e n outgroup and ingroup taxa and from 0.2% (S. multipunctatus cytb 3rd position SYM + I + G rps7 K80 vs. S. grandiops) to 12% (S. gambiensis vs. S. greshoffi) within the in­ cytb + rps7 GTR + I + G group taxa. W e observed no significant conflict among individual A. Pinton et at./Molecular Phylogenetics and Evolution 66 (2013) 1027-1040 1033
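The soft upper bounds quoted for the two calibration priors in Section 2.3 can be checked numerically. The sketch below is only illustrative: it assumes that "log mean" and "log standard deviation" are the mean and standard deviation of the logarithm of (age minus offset), and the function name is hypothetical rather than anything taken from BEAST.

```python
import math
from statistics import NormalDist


def offset_lognormal_quantile(offset, log_mean, log_sd, p=0.975):
    """Quantile (in Ma) of offset + LogNormal(log_mean, log_sd).

    Assumption: log_mean and log_sd are expressed in log space.
    """
    z = NormalDist().inv_cdf(p)  # standard-normal quantile, about 1.96 for p = 0.975
    return offset + math.exp(log_mean + z * log_sd)


# MRCA of Synodontis, Chiloglanis and Microsynodontis (offset = 37 Ma)
mochokid_bound = offset_lognormal_quantile(37.0, 1.0, 1.2)
# MRCA of Synodontis (offset = 18 Ma)
synodontis_bound = offset_lognormal_quantile(18.0, 1.0, 1.0)

print(f"{mochokid_bound:.1f} Ma")    # close to the stated 65 Ma soft bound
print(f"{synodontis_bound:.1f} Ma")  # close to the stated 37 Ma soft bound
```

Under this parameterisation both 97.5% quantiles fall within about half a million years of the stated bounds, which suggests the published parameters were chosen in this way.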

Fig. 2. Phylogenetic relationships of African Synodontis inferred from the concatenated mtDNA and ncDNA datasets. (A) Consensus of the ML and BI trees, with ML bootstrap support and Bayesian posterior probability (BPP) shown above and below branches, respectively (shown for nodes with greater than 50% support). Key nodes are labelled A-J. Distribution in ichthyological provinces refers to Fig. 1 and is given for each species using the following abbreviations: A, Angola; C, Congo; E, East Coast; T, Tanganyika basin; V, Victoria basin; LG, Lower Guinea; Ma, Malawi Lake; NS, Nilo-Sudan; EG, Eburneo-Ghanea; UG, Upper Guinea; Z, Zambezi. Distribution maps of some species are shown and indicated in bold. (B) Phylogram based on Bayesian analysis of the concatenated data set (mtDNA and ncDNA) to depict branch lengths.
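The Bayes-factor comparison of partition models reported in Table 3 can be reproduced from the harmonic means of the log likelihoods. This is a minimal sketch, assuming each tabulated value is twice the difference between a model's harmonic mean and that of the best model; the dictionary and function names are illustrative only.

```python
# Harmonic means of the log likelihoods, as listed in Table 3.
harmonic_mean = {
    "0P": -12234.35,
    "2P": -12222.61,
    "3P": -12157.89,
    "4P": -12203.97,
}


def two_ln_bf(model, reference, table=harmonic_mean):
    """Return 2 * (ln L_model - ln L_reference); negative values favour the reference."""
    return 2.0 * (table[model] - table[reference])


# The preferred model is the one with the highest harmonic mean.
best = max(harmonic_mean, key=harmonic_mean.get)

for model in harmonic_mean:
    if model != best:
        value = two_ln_bf(model, best)
        # Kass and Raftery (1995): an absolute value above 10 is read as strong evidence.
        verdict = "strong" if abs(value) > 10 else "not strong"
        print(model, "vs", best, round(value, 2), verdict)
```

With the rounded harmonic means, the 0P and 2P comparisons match the tabulated values, whereas the 4P comparison evaluates to -92.16 rather than -92.22, presumably because the published figure was computed from unrounded likelihoods.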

Table 3
Summary of Bayesian analyses. The optimal strategy selected by Bayes factor is marked with an asterisk.

Partition model   Harmonic mean   Bayes factor
0P                -12234.35       -152.92
2P                -12222.61       -129.44
3P*               -12157.89       NA
4P                -12203.97       -92.22

5 Ma (Fig. 3). A sister group relationship between the West African S. courteti and Great Lakes/South African Synodontis species is estimated at approximately 11.3 Ma (95% HPD: 7-18) following the topology recovered in Bayesian analysis (Fig. 2B). Then, a sister relationship between the Tanganyikan species clade and the Zambezi/Congolese species is supported with an approximate divergence time of 8.8 Ma (95% HPD: 6-13). The flock of the Synodontis species in Lake Tanganyika (node J, Fig. 2A) is estimated at approximately 6.0 Ma (95% HPD: 3-9). In the likelihood analysis, a sister group relationship between the Nilo-Sudan species at node G and S. courteti (Fig. 2A) is recovered. In this case, the Nilo-Sudan versus Great Lakes/South split is estimated at approximately 17.1 Ma (95% HPD: 12-21).

3.2. Biogeographic inferences

Results of ancestral area reconstruction using the dispersal-extinction-cladogenesis model (dec-model, LAGRANGE) and Bayesian Binary MCMC Analysis (RASP) were mostly congruent, with eleven recovered differences in optimal resolutions (bold nodes, Table 5). Results for reconstruction are presented in Fig. 4. Lagrange estimates the 'inheritance' of a range after a phylogenetic split, and therefore the ancestral range of Synodontis cannot be directly estimated. However, Lagrange reconstructs the range of the most recent common ancestor of clade B to most likely be spread in C, LG, NS and E, and the ancestral range of clade C includes NS, supporting the idea of a large ancestral range, although support is low (Table 5). Bayesian analysis inferred an ancestor distributed in NS only (node A, Fig. 4), with range expansion subsequently occurring from NS to C. Within node A (Fig. 4), this ancestral range split into two areas: C (node B, Fig. 4) and NS (node C, Fig. 4). Within the "Congolese area" (node B, Fig. 4), both reconstructions suggest multiple dispersals from C into LG (nodes B, E and 37, Fig. 4) and a recent invasion of UG, NS and EG. The Congolese ancestral area at node B (Fig. 4) is retained at nodes 36 and D in both analyses. Within node D (Fig. 4), range expansion into A + Z is proposed, and at node 33 colonisation of T from C is inferred. Within the Nilo-Sudanese area (node C, Fig. 4), BBM results suggest that several lineages successively evolved within NS and that recently this ancestral range expanded into E (S. af. punctulatus lineage), UG (S. thysi and S. gambiensis lineages) and into EG and LG (S. schall lineages). The dec-model inferred a slightly different range evolution in that it favours more recent dispersals of S. schall (Fig. 4). At node 18 (Fig. 4), range expansion from NS to T is inferred by both analyses. Within the T area, both ancestral area reconstructions inferred the colonisation of Z, M, C and V.

4. Discussion

Our Synodontis phylogeny supports an almost complete phylogeographic gap between the Congolese (node B, Fig. 2A) and Nilo-Sudan (node C, Fig. 2A) lineages. A relatively strong isolation of C from NS is corroborated by reconstructions of the Synodontis distribution history, as only one event of direct dispersal between NS and C is inferred by BBM in the early history of the group, between 20.8 Ma (node A, Fig. 4) and 18.9 Ma (node B, Fig. 4). Subsequently, lineage diversifications within the Nilo-Sudan and Congolese regions are favoured, together with dispersals from those two regions to the surrounding ones, excluding direct exchanges between them. Indeed, according to ancestral reconstruction analyses, the presence of Nilo-Sudan species in the Congolese clade B (node F, Fig. 4) and of Congolese species in the Nilo-Sudanese clade C (node I, Fig. 4) results from indirect dispersal. In parallel to the diversification of Synodontis from C within node 36 (Fig. 4), they expanded into LG, NS and UG at node F (Fig. 4), which gave rise to S. budgetti, S. waterloti, S. rebeli and S. nigrita. A dispersal route from C to LG is plausible and more likely than a direct dispersal between C and NS. This waterway is supported by both ancestral area reconstructions at nodes 37 and E (Fig. 4) and is congruent with known faunal kinships between LG and C: some of the LG species appear to have arisen from ancient connections between the Nyong and Sangha rivers (Leveque, 1997). Moreover, faunistic exchanges between NS and LG have been proposed, via connections between the Sanaga and the Cross rivers (Roberts, 1975). Also, the placement of the Congolese species S. nigromaculatus in the Nilo-Sudan group (node I, Fig. 2A) is explained by an invasion of C from a widespread ancestor distributed in Z and T (BBM analysis) or at least in T (dec-model), and does not support any direct dispersal between NS and C. Furthermore, hydrographic isolation of the Nilo-Sudan area from the Congo is corroborated by the geographic pattern of distribution of the whole genus, as 119 of the 120 extant Synodontis species exclusively belong to one of these geographic areas.

Phylogeographic pattern, distribution data and ancestral area reconstruction indicate that the CAS has acted as a barrier limiting dispersals between the Congo and Nilo-Sudan provinces in the Neogene, thus favouring Synodontis speciation by allopatry. The phylogeographic dichotomy reflecting watershed isolation of the Congo is recorded approximately 16.9 Ma, which corresponds to early periods of diversification in the respective Nilo-Sudan and Congolese areas. In other freshwater fish groups, dating can clearly differ, as for instance in Hydrocynus, where a similar dichotomy is estimated at approximately 6.8 Ma (Goodier et al., 2011). The freshwater drainage divide between NS and C affects the different fish taxa diachronically, which is compatible with our proposal: a constant geological barrier such as the CAS may promote 'hard' allopatry (physical disruption), in which lineage formation is dependent only on the formation of populations on opposite sides of the river. In this case, divergences in multiple unrelated taxa can be distributed randomly in time, linked with the ecological valence of the taxa and their aptitude to disperse (Pyron and Burbrink, 2010).

In Bayesian and likelihood analyses, a "(Nilo-Sudan, (Great Lakes + Zambezi-Malawi-Congo))" relationship is supported (node 18, Fig. 4). In Synodontis, we find uncommon geographical

Table 4
Summary of maximum likelihood analyses. Ln L is the best likelihood score estimated by GARLI under each partition model. The optimal strategy selected by AIC is marked with an asterisk.

Partition model   Nucleotide substitution model      Ln L (L)      No. parameters (P)   AIC
0P                GTR + I + G                        -12103.264    10                   24226.528
2P                GTR + I + G/K80                    -11834.325    11                   23690.65
3P*               GTR + I + G/SYM + I + G/K80        -11308.258    18                   22652.516
4P                HKY + I/GTR + G/SYM + I + G/K80    -11311.593    22                   22667.186
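The AIC values in Table 4 are consistent with the standard definition AIC = 2P - 2 ln L, where P is the number of free parameters. The short sketch below recomputes them from the tabulated log likelihoods and parameter counts; the variable and function names are illustrative.

```python
# (ln L, number of free parameters) for each partition model, from Table 4.
models = {
    "0P": (-12103.264, 10),
    "2P": (-11834.325, 11),
    "3P": (-11308.258, 18),
    "4P": (-11311.593, 22),
}


def aic(ln_l, n_params):
    """Akaike information criterion, 2P - 2 ln L (smaller is better)."""
    return 2 * n_params - 2 * ln_l


scores = {name: aic(ln_l, p) for name, (ln_l, p) in models.items()}
best_model = min(scores, key=scores.get)  # model with the smallest AIC

for name, score in scores.items():
    print(name, round(score, 3))
print("best:", best_model)
```

The 4P scheme has a slightly lower likelihood than 3P despite its four extra parameters, so the penalty term only widens the gap in favour of 3P.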

Fig. 3. Chronogram inferred from the Bayesian dating analysis (BEAST) of the concatenated data (mtDNA and ncDNA). Estimated ages are given above each node, and grey node bars represent 95% confidence intervals (HPD: highest posterior density). Time is in millions of years.

Table 5
Inferred ancestral ranges for branches (separated by vertical bar) descending from each node in Fig. 4, and their relative probability. Only alternative scenarios that fall within two log-likelihood units of the optimal reconstruction and have a relative probability ≥ 0.1 are provided. Clades for which differences exist in optimal reconstructions between Lagrange and S-DIVA are shown in bold. Mean age estimates and their 95% HPD are in Ma.

       Lagrange                                 BEAST                          S-DIVA
Clade  Infer. anc. area             Rel. prob.  Mean age [95% HPD]  Support    Infer. anc. area       Marg. prob.
A      [C + LG + NS + Neb|NS]       0.10        20.8 [18.1, 27.6]   PP = 1.00  NS                     38.85
                                                                               C + NS                 29.81
                                                                               C                      21.14
B      [C|LG + NS + Neb]            0.13        18.9 [14.7, 25.8]   PP = 0.91  C                      23.60
                                                                               C + NS                 18.31
                                                                               NS                     13.70
                                                                               C + LG                 10.28
                                                                               C + LG + NS            8.21
                                                                               LG                     7.92
                                                                               LG + NS                6.14
C      [NS|NS]                      0.64        16.9 [11.8, 23.2]   PP = 0.99  NS                     93.52
       [NS + T|NS]                  0.11
D      [C|C]                        0.80        14.8 [10.3, 20.8]   PP = 0.54  C                      84.75
       [C + LG|C]                   0.17                                       C + LG                 13.90
E      [C|C]                        0.80        13.7 [8.0, 20.1]    PP = 0.55  C                      49.51
                                                                               C + LG                 48.38
F      [LG|LG]                      0.1         15.4 [11.2, 21.1]   PP = 0.99  LG + NS + EG           27.12
                                                                               NS + EG                17.60
                                                                               EG + NS                15.74
                                                                               EG + LG                9.98
                                                                               LG                     6.48
                                                                               EG                     5.79
G      [NS|NS]                      0.96        14.5 [9.9, 20.2]    PP = 0.89  NS                     96.21
H      [A + Z|Z]                    0.37        1.1 [0.4, 2.3]      PP = 1.00  A + Z                  78.41
       [Z|Z]                        0.23                                       Z                      13.59
       [A|A + Z]                    0.13
       [Z|A + Z]                    0.12
I      [Z|T + Z]                    0.26        6.2 [3.0, 10.4]     PP = 0.99  T + Z                  39.67
       [Z|C + T + Z]                0.14                                       Z                      21.93
                                                                               T                      12.86
                                                                               C + T + Z              11.26
                                                                               C + Z                  6.22
J      [T|T]                        0.97        6.0 [3.4, 9.3]      PP = 1.00  T                      97.39
1      [NS|NS]                      0.95        12.9 [9.0, 18.2]    PP = 0.96  NS                     99.25
2      [NS|NS]                      0.96        9.5 [6.1, 13.9]     PP = 1.00  NS                     99.38
3      [NS|NS]                      0.86        7.9 [4.9, 11.8]     PP = 0.61  NS                     98.16
       [NS + UG|NS]                 0.11
4      [NS|NS]                      0.96        6.8 [3.6, 10.1]     PP = 0.99  NS                     97.18
5      [NS|LG + NS + EG + UG + E]   0.11        6.1                 PP < 0.5   NS                     83.33
                                                                               NS + UG                6.43
6      [NS|NS]                      0.96        1.4 [0.6, 2.8]      PP = 1.00  LG + NS + EG + UG + E  94.42
7      [NS|NS]                      0.18        1.1 [0.4, 2.2]      PP = 0.58  LG + NS + EG + UG + E  99.51
8      [NS|NS]                      0.18        0.7 [0.2, 1.6]      PP = 0.97  LG + NS + EG + UG + E  94.42
9      [NS|E]                       0.73        3.8 [1.3, 6.7]      PP = 0.99  NS                     95.28
       [NS|NS]                      0.20
10     [NS|UG]                      0.55        5.8 [2.8, 9.3]      PP = 0.67  NS                     95.00
       [NS|NS]                      0.36
11     [NS|NS]                      0.99        0.3 [0.03, 1.1]     PP = 1.00  NS                     99.80
12     [NS|NS]                      0.99        10.9 [6.9, 15.8]    PP = 0.93  NS                     98.45
13     [NS|NS]                      0.91        8.1 [4.5, 12.3]     PP = 1.00  NS                     89.81
14     [NS|NS]                      0.99        6.4 [2.8, 10.2]     PP = 0.71  NS                     99.16
15     [NS|NS]                      0.99        0.5 [0.04, 1.4]     PP = 1.00  NS                     99.77
16     [NS|NS]                      0.99        7.3 [3.4, 11.9]     PP = 1.00  NS                     99.71
17     [NS|NS]                      0.99        1.9 [0.4, 4.3]      PP = 1.00  NS                     99.85

Table 5 (continued): entries for the remaining nodes are illegible.

Fig. 4. Ancestral area reconstruction using (RASP). Green arrows indicate dispersal events between designated regions. Node numbers refer to per-clade results listed in Table 5. Where the optimal reconstructions of the dec-model and Bayesian Binary MCMC Analysis differ, the alternative reconstruction is shown in a square. Geographic coverage is indicated by capitals preceding taxon names. Broken branches lead to nodes with low posterior probabilities (<50) obtained in BEAST. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

connections as the Tanganyikan species are closely related with the Nilo-Sudanese ones and do not have roots in the Congo basin. Taking into account ancestral area reconstruction uncertainty, colonisation of T from NS occurred approximately between 11.3 Ma (95% HPD: 6.5-17.5, node 18, Fig. 4) and 8.8 Ma (95% HPD: 5.5-13.4, node 19, Fig. 4). A strikingly similar temporal and geographical pattern is found in Mastacembelus, in which a Southern-Eastern clade is recovered, while West African taxa represent early diverging lineages of the African clade. In Mastacembelus, the divergence from the Nilo-Sudan ichthyoregion is dated at approximately 14.5 Ma (11-19 Ma). Both phylogenetic reconstructions together with molecular dating are consistent with the geological data, namely, the rise of the Virunga highlands that detached Lake Kivu from the Nile basin during the mid-Miocene. Colonisation of T from NS and its subsequent isolation from the Nilo-Sudan area around the mid-Miocene was followed by the evolution of several lineages within Lake Tanganyika (node J, Fig. 4). In our phylogeny, the recovered dates are in agreement with palaeogeological reconstructions for the LT evolution. First, the formation of the central basin lake beginning between 9 Ma and 12 Ma (Cohen et al., 1997) constitutes a plausible event that separated a pool of Tanganyikan Synodontis species from more southern lineages. Then, the onset of full lacustrine conditions in the lake occurred between 5 Ma and 6 Ma (Cohen et al., 1997), a timing consistent with the Synodontis flock in the LT. Next, as in cichlids, the Tanganyika area becomes a sink for adjacent regions, namely the Victoria and Malawi Lakes (S. victoriae and S. njassae lineages), the Zambezi (S. zambezensis lineage) and the Congo basin (S. nigromaculatus lineage).

The LT endemic lineage plus S. victoriae (node J, Fig. 2A) is estimated to have diverged later than 6 Ma (95% HPD: 3.4-9.3), which is consistent with previous molecular studies (Day and Wilkinson, 2006; Day et al., 2009). Over the course of clade J evolution, S. victoriae dispersed into V in the last 5 Ma, according to both ancestral area reconstructions. Evolution of S. victoriae within the LT basin and subsequent emigration into V is plausible, considering that this species principally occurs in Lake Victoria, which is considerably younger than the age of this species.

Colonisation of the Malawi Lake is thought to have occurred in the last 4.7 Ma (95% HPD: 1.9-9.6, node 29, Fig. 4) according to both analyses. The cladogenesis event that founded Malawi Synodontis overlaps with molecular dating of the Pliocene divergence of Mastacembelus (6-2.5 Ma) and of cichlids, according to Genner et al. (2007). Those dates are consistent with the geological framework: a deep lake may have existed between approximately 4.5 and 8 Ma (Ebinger et al., 1993).

Finally, colonisation of the Zambezi system is inferred to have occurred twice, from two different regions, in Synodontis history. First, dispersal from T to Z is inferred between 8.8 Ma (node 19, Fig. 4) and 6.2 Ma (node I, Fig. 4) according to BBM, or in the last 6.2 Ma applying the dec-model. It appears that Synodontis displays a similar biogeographic pattern to Mastacembelus, in which a Tanganyikan/South African sister group relationship is also assumed. Moreover, the dates generated for Synodontis are relatively similar to those estimated for Mastacembelus (Brown et al., 2010). When comparing Synodontis and Mastacembelus histories to that of cichlids, we found evidence for biogeographic clades mirroring a similar pattern, but it is likely that more ancient connections between water-systems were used by the two former genera. Apart from the Synodontis species S. nigromaculatus and S. zambezensis, we found that Zambezian Synodontis species are mainly related to the Congolese species (node H, Fig. 4), showing a similar biogeographic pattern to the one hypothesised for cichlids (Katongo et al., 2007; Koblmuller et al., 2008). The clade formed by the six species S. woosnami, S. leopardina, S. macrostigma, S. vanderwaali, S. nebulosa and S. af. thamalakanensis (node H, Fig. 2A) is the sister group of Congolese species and diversified in a flock approximately 1.1 Ma (95% HPD: 0.4-2.3, Fig. 3). Divergence time estimates suggested that the Zambezi was colonised from the Congo between 6.9 and 1.1 Ma (node 40, Fig. 4), a timing that overlaps the invasion of the Zambezi region by S. nigromaculatus and S. zambezensis from an ancestor distributed in the Tanganyika basin. The timings of these two invasions, particularly taking into account 95% confidence intervals, lend support to the view that colonisation occurred at a similar time. Moreover, those timings are consistent with geological activity consisting of southward extension in the western branch of the active EARS, postulated to have started between 4 and 12 Ma. Furthermore, the related species of clade H are the Congolese S. decorus and S. katangae, respectively distributed in the Kasai and in the Chambeshi Rivers, which were purportedly linked with the Zambezi region. Hence, despite significant uncertainty in geological dates and the wide error of molecular dates, there is a remarkable agreement between the geological data (Moore and Larkin, 2001) and relaxed molecular clocks, supporting strong linkages between the tectonic and cladogenic events in the area.

5. Conclusion

Our pan-African sampling revealed a strong phylogenetic segregation of geographic regions. Patterns of genetic variation strongly correlate with distributions of extant species, indicating that Synodontis evolution reflects spatial and temporal signatures of landscape history. The history of this genus, which emerged approximately 20 million years ago, has continuously paralleled the development of the hydrographical network, itself mainly linked to the tectonic context. The Synodontis phylogeographic pattern was alternatively linked to the rise of the CAS and to the development of the East African Rift system. Our conclusions from Africa concur with the 'hydrogeological' hypothesis of Montoya Burgos (2003), which maintains that palaeohydrological changes linked with the geological context may have been the cause of diversification deep in the Tertiary. In addition to the geological context, the role of climate change in Synodontis history is, although difficult to unravel, probably significant. Indeed, if Synodontis evolution has tracked formative changes in the topology of major river channels, it certainly also reflects an inherent sensitivity to changes in habitat quality associated with episodic climate change. In our phylogeny, approximately 70% of lineages emerged between 15 Ma and 5 Ma, which corresponds to a period of drastic changes in Africa with increasing aridity and the rise of the C4 grasses. Clearly understanding how the ecological niche of Synodontis species may have varied temporally and geographically is a further step that will shed more light upon their phylogeographic history.

Acknowledgements

The analyses were run in ISEM (UMR IRD 226-, Montpellier, France), with the technical facilities of the Centre Mediterraneen Environnement Biodiversite. We are grateful to Randall Brummett (World Fish) and Dorothy Nyingi (National Museum of Kenya) for providing Synodontis tissue for molecular analysis. We extend our gratitude to Gilles Fediere, Bruno Sicard and Yacouba Traore (IRD Bamako, Mali), Jean-Yves Moisseron (IRD Cairo, Egypt) and A. Likius and Fabrice Lihoreau (University of N'Djamena, Chad) for field and sampling assistance. We also thank ANR (Projet ANR 05-BLAN-0235). We greatly thank the editors and the reviewers for their constructive and helpful comments that improved the manuscript, and T. Wirth and J. Joordens for discussions.

Appendix A. Supplementary material

Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.ympev.2012.12.009.

References

Bell-Cross, G., 1965. The distribution of fishes in Central Africa. Fish. Res. Bull. Zamb. 4, 3-20.
Brandstatter, A., Salzburger, W., Sturmbauer, C., 2005. Mitochondrial phylogeny of the Cyprichromini, a lineage of open-water cichlid fishes endemic to Lake Tanganyika, East Africa. Mol. Phylogenet. Evol. 34, 382-391.
Brown, K.J., Ruber, L., Bills, R., Day, J.J., 2010. Mastacembelid eels support lake Tanganyika as an evolutionary hotspot of diversification. BMC Evol. Biol. 10, 188.
Chow, S., Hazama, K., 1998. Universal PCR primers for S7 ribosomal protein gene introns in fish. Mol. Ecol. 7, 1255-1256.
Cohen, A.S., Lezzar, K.E., Tiercelin, J.J., Soreghan, M., 1997. New paleogeographic and lake-level reconstructions of Lake Tanganyika: implications for tectonic, climatic and biological evolution in a rift lake. Basin Res. 7, 107-132.
Cotterill, F.P.D., de Wit, M.J., 2011. Geoecodynamics and the Kalahari Epeirogeny: linking its genomic record, tree of life and palimpsest into a unified narrative of landscape evolution. S. Afr. J. Geol. 114, 493-518.
Currie, D.J., Mittelbach, G.G., Cornell, H.V., Field, R., Guegan, J.-F., Hawkins, B.A., Kaufman, D.M., Kerr, J.T., Oberdoff, T., O'Brien, E., Turner, J.R.G., 2004. Predictions and tests of climate-based hypotheses of broad-scale variation in taxonomic richness. Ecol. Lett. 7, 1121-1134.
Day, J.J., Wilkinson, M., 2006. On the origin of the Synodontis catfish species flock from Lake Tanganyika. Biol. Lett. 2, 548-552.
Day, J.J., Santini, S., Garcia-Moreno, J., 2007. Phylogenetic relationships of the Lake Tanganyika cichlid tribe Lamprologini: the story from mitochondrial DNA. Mol. Phylogenet. Evol. 45, 629-642.
Day, J.J., Bills, R., Friel, J.P., 2009. Lacustrine radiation in African Synodontis catfish. J. Evol. Biol. 22, 805-817.
Drummond, A.J., Rambaut, A., 2007. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol. Biol. 7, 214.
Drummond, A.J., Ho, S.Y.W., Phillips, M.J., Rambaut, A., 2006. Relaxed phylogenetics and dating with confidence. PLoS Biol. 4, e88. http://dx.doi.org/10.1371/journal.pbio.0040088.
Duftner, N., Koblmuller, S., Sturmbauer, C., 2005. Evolutionary relationships of the Limnochromini, a tribe of benthic deep water cichlid fishes endemic to Lake Tanganyika, East Africa. J. Mol. Evol. 60, 277-289.
Ebinger, C.J., Deino, A.L., Tesha, A.L., Becker, T., Ring, U., 1993. Tectonic controls on rift basin morphology: evolution of the northern Malawi (Nyasa) Rift. J. Geophys. Res. 98, 17821-17836.
Fairhead, J.D., 1988. Mesozoic plate tectonic reconstructions of the Central South Atlantic Ocean: the role of the West and Central African Rift system. Tectonophysics 155, 181-191.
Farris, J.S., Kallersjo, M., Kluge, A.G., Bult, C., 1995. Testing significance of incongruence. Cladistics 10, 315-319.
Genner, M.J., Seehausen, O., Lunt, D.H., Joyce, D.A., Shaw, P.W., Carvalho, G.R., Turner, G.F., 2007. Age of cichlids: new dates for ancient lake fish radiations. Mol. Biol. Evol. 24, 1269-1282.
Goodier, S.A.M., Cotterill, F.P.D., O'Ryan, C., Skelton, P.H., de Wit, M.J., 2011. Cryptic diversity of African Tigerfish (Genus Hydrocynus) reveals palaeogeographic signatures of linked neogene geotectonic events. PLoS One 6, e28775. http://dx.doi.org/10.1371/journal.pone.0028775.
Hedges, S.B., Kumar, S., 2004. Precision of molecular time estimates. Trends Genet. 20, 242-247.
Koblmuller, S., Egger, B., Sturmbauer, C., Sefc, K.M., 2007b. Evolutionary history of Lake Tanganyika's scale-eating cichlid fishes. Mol. Phylogenet. Evol. 44, 1295-1305.
Koblmuller, S., Schliewen, U.K., Duftner, N., Sefc, K.M., Katongo, C., Sturmbauer, C., 2008. Age and spread of the haplochromine cichlid fishes in Africa. Mol. Phylogenet. Evol. 49, 153-169.
Kocher, T.D., Conroy, J.A., McKaye, K.R., Stauffer, J.R., Lockwood, S.F., 1995. Evolution of NADH dehydrogenase subunit 2 in east African cichlid fish. Mol. Phylogenet. Evol. 4, 420-432.
Leveque, C., 1997. Biodiversity Dynamics and Conservation: The Freshwater Fish of Tropical Africa. Cambridge University Press, Cambridge.
Leveque, C., Oberdorff, T., Paugy, D., Stiassny, M.L.J., Tedesco, P.A., 2008. Global diversity of fish (Pisces) in freshwater. Hydrobiologia 595, 545-567.
Lundberg, J.G., 1998. The temporal context for the diversification of neotropical fishes. In: Malabarba, L.R., Reis, R.E., Vari, R.P., Lucena, Z.M.S., Lucena, C.A.S. (Eds.), Phylogeny and Classification of Neotropical Fishes. EDIPUCRS, Porto Alegre, Brasil, pp. 48-68.
Marshall, C.R., 1990. The fossil record and estimating divergence times between lineages: maximum divergence times and the importance of reliable phylogenies. J. Mol. Evol. 30, 400-408.
Miller, E.R., 1999. Faunal correlation of Wadi Moghara, Egypt: implications for the age of Prohylobates tandyi. J. Hum. Evol. 36, 519-533.
Montoya Burgos, J.I., 2003. Historical biogeography of the catfish genus Hypostomus (Siluriformes: Loricariidae), with implications on the diversification of Neotropical ichthyofauna. Mol. Ecol. 12, 1855-1867.
Moore, A.E., Larkin, P.A., 2001. Drainage evolution in South-Central Africa since the breakup of Gondwana. S. Afr. J. Geol. 104, 47-68.
Murray, A.M., Cook, T.D., Attia, Y.S., Chatrath, P., Simons, E.L., 2010. A freshwater ichthyofauna from the Late Eocene Birket Qarun Formation, Fayum, Egypt. J. Vertebr. Paleontol. 30, 665-680.
Nylander, J.A.A., 2004. MrModeltest2. Ver. 2.2. Evolutionary Biology Centre, Uppsala University.
Otero, O., Gayet, M., 2001. Palaeoichthyofauna from the Lower Oligocene and Miocene of the Arabian Plate: palaeoecological and palaeobiogeographical implications. Palaeogeogr. Palaecol. 165, 141-169.
Otero, O., 2010. What controls the freshwater fish fossil record? A focus on the Late Cretaceous and Tertiary of Afro-Arabia. Cybium 34, 93-113.
Otero, O., Pinton, A., Cappetta, H., Valentin, X., et al., in preparation. Late middle Eocene fish from Dur At-Talah (Libya) and new early records for modern African freshwater fish.
Posada, D., Buckley, T.R., 2004. Model selection and model averaging in phylogenetics: advantages of the AIC and Bayesian approaches over likelihood ratio tests. Syst. Biol. 53, 793-808.
Priem, R., 1920. Poissons fossiles du Miocene d'Egypte (Burdigalien de Moghara, "Desert lybique"). In: Fourtau, R. (Ed.), Contribution a l'etude des vertebres miocenes de l'Egypte. Government Press, Cairo, pp. 8-15.
Pyron, R.A., Burbrink, F.T., 2010. Hard and soft allopatry: physically and ecologically mediated modes of geographic speciation. J. Biogeogr. 37, 2005-2015.
Ree, R.H., Moore, B.R., Webb, C.O., Donoghue, M.J., 2005. A likelihood framework for inferring the evolution of geographic range on phylogenetic trees. Evolution 59, 2299-2311.
Ree, R.H., Smith, S.A., 2008. Maximum likelihood inference of geographic range evolution by dispersal, local extinction, and cladogenesis. Syst. Biol. 57, 4-14.
Roberts, T.R., 1975. Geographical distribution of African freshwater fishes. Zool. J. Linn. Soc-Lond. 57, 249-319.
Salzburger, W., Mack, T., Verheyen, E., Meyer, A., 2005. Out of Tanganyika: Genesis, explosive speciation, key-innovations and phylogeography of the haplochromine cichlid fishes. BMC Evol. Biol. 5, 17.
Snoeks, J., De Vos, L., Thys van den Audenaerde, D., 1997. The ichtyogeography of Lake Kivu. S. Afr. J. Sci. 93, 579-584.
Stankiewicz, J., de Wit, M.J., 2006. A proposed drainage evolution model for Central Africa-Did the Congo flow East? J. Afr. Earth Sci. 44, 75-84.
Stewart, K.M., 2001. The freshwater fish of neogene Africa (Miocene-Pleistocene):
Hewitt, G.M., 2000. The genetic legacy of the Quaternary ice age.
Nature 405, 907­ systematics and biogeography. Fish Fish. 2, 177-230. 913. Swofford, D.L., 2001. PAUP/: Phylogenetic Analysis Using Parsimony (/and Other Ho, S.Y.W., 2007. Calibrating molecular estimates o f substitution rates and Methods), Version 4. Sinauer, Sunderland, Massachusetts, USA. divergence times in birds. J. Avian Biol. 38, 409-414. Thieme, M.L., Abell, R.A., Stiassny, M.L.J., Skelton, P.H., Lehner, B., 2005. Freshwater Huelsenbeck, J.P., Ronquist, F., 2001. MRBAYEAYES: Bayesian inference of Ecoregions of Africa and Madagascar: A Conservation Assessment. Island Press, phylogeny. Bioinformatics 17, 754-755. W ashington (DC), USA. Kass, R.E., Raftery, A.E., 1995. Bayes factors. J. Am. Stat. Assoc. 90, 773-795. Thompson, J.D., Gibson, T.J., Plewniak, F., Jeanmougin, F., Higgins, D.G., 1997. The Katongo, C., Koblmuller, S., Duftner, N., Mumba, L., Sturmbauer, C., 2007. CLUSTALX windows interface: flexible strategies for multiple sequence Evolutionary history and biogeographic affinities of the serranochromine alignm ent aided by quality analysis tools. Nucleic Acids Res. 24, 4876-4882. cichlids in Zambian rivers. Mol. Phylogenet. Evol. 45, 326-338. W hite, E., 1935. Fossil fishes o f Sokoto Province. Bull. Geol. Surv. Nigeria 14, 1-78. Koblmuller, S., Duftner, N., Katongo, C., Phiri, H., Sturmbauer, C., 2005. Ancient Willis, K.J., Niklas, K.J., 2004. The role of Quaternary environmental change in plant divergence in bathypelagic Lake Tanganyika deepwater cichlids: mitochondrial m acroevolution, the exception or the rule? Philos. T. Roy. Soc. B 359, 159-172. phylogeny of the tribe Bathybatini. J. Mol. Evol. 60, 297-314. Yu, Y., Harris, A.J., He, X., 2010. S-DIVA (Statistical Dispersal-Vicariance Analysis): a Koblmuller, S., Strurmbauer, C., Verheyen, E., Meyer, A., Salzburger, A., 2006. tool for inferring biogeographic histories. Mol. Phyl. Evol. 56, 848-850. Mitochondrial phylogeny and phylogeography of East African squeaker Zwickl, D.J., 2006. 
Genetic algorithm approaches for the phylogenetic analysis of catfishes (Siluriformes: Synodontis). BMC Evol. Biol. 6, 49. large biological sequence datasets under the maximum likelihood criterion. Koblmuller, S., Duftner, N., Sefc, K.M., Aibara, M., Stipacek, M., Blanc, M., Egger, B., Ph.D. Dissertation, University o f Texas at Austin. Sturmbauer, C., 2007a. Reticulate phylogeny of gastropod-shell-breeding cichlids from Lake Tanganyika-the result of repeated introgressive hybridization. BMC Evol. Biol. 7, 7.