This is a repository copy of Testing the performance of environmental DNA metabarcoding for surveying highly diverse tropical fish communities: A case study from Lake Tanganyika.

White Rose Research Online URL for this paper: http://eprints.whiterose.ac.uk/166144/

Version: Published Version

Article: Doble, C.J., Hipperson, H. orcid.org/0000-0001-7872-105X, Salzburger, W. et al. (4 more authors) (2020) Testing the performance of environmental DNA metabarcoding for surveying highly diverse tropical fish communities: A case study from Lake Tanganyika. Environmental DNA, 2 (1). pp. 24-41. ISSN 2637-4943 https://doi.org/10.1002/edn3.43

Reuse This article is distributed under the terms of the Creative Commons Attribution (CC BY) licence. This licence allows you to distribute, remix, tweak, and build upon the work, even commercially, as long as you credit the authors for the original work. More information and the full terms of the licence here: https://creativecommons.org/licenses/


Received: April | Revised: September | Accepted: October | DOI: 10.1002/edn3.43

ORIGINAL ARTICLE

Testing the performance of environmental DNA metabarcoding for surveying highly diverse tropical fish communities: A case study from Lake Tanganyika

Christopher J. Doble1,2,3 | Helen Hipperson3 | Walter Salzburger4 | Gavin J. Horsburgh3 | Chacha Mwita5 | David J. Murrell1,2 | Julia J. Day1,2

1 Department of Genetics, Evolution and Environment, University College London, London, UK
2 Centre for Biodiversity and Environment Research, University College London, London, UK
3 Department of Animal and Plant Sciences, NERC Biomolecular Analysis Facility, University of Sheffield, Sheffield, UK
4 Department of Environmental Sciences, Zoological Institute, University of Basel, Basel, Switzerland
5 Department of Aquatic Sciences and Fisheries, University of Dar es Salaam, Dar es Salaam, Tanzania

Correspondence
Christopher J. Doble, Department of Genetics, Evolution and Environment, University College London, Gower Street, London WC1E 6BT, UK. Email: christopherdobeucacuk
Julia J. Day, Department of Genetics, Evolution and Environment, University College London, Gower Street, London WC1E 6BT, UK. Email: jdayucacuk

Funding information
Natural Environment Research Council, Grant/Award Number: NEL; Genetics Society Heredity Fieldwork Grant; Systematics Research Fund; Fisheries Society of the British Isles Small Research Grant; Percy Sladen Memorial Trust Fund; NERC Biomolecular Analysis Facility Sheffield; European Research Council; Swiss National Science Foundation

Abstract

Background and Aims: Environmental DNA (eDNA) metabarcoding provides a highly sensitive method of surveying freshwater fish communities, although studies to date have largely been restricted to temperate ecosystems. Due to limited reference sequence availability and challenges identifying closely related and rare species in diverse tropical ecosystems, the effectiveness of metabarcoding methods for surveying tropical fish communities from eDNA samples remains uncertain. To address this, we applied an eDNA metabarcoding approach to survey Lake Tanganyika's (LT) species-rich littoral fish communities.

Materials and Methods: As this system contains many closely related species, particularly cichlid fishes, we used four primer sets including a cichlid-specific primer set (Cichlid-CR). A reference database was built for the 12s, 16s and control region for fish species including over of known cichlids.

Results and Discussion: In silico and in situ results demonstrated wide variability in the taxonomic resolution of assignments by each primer, with the cichlid-specific marker (Cichlid-CR) enabling greater species-level assignments for this highly diverse family. A greater number of non-cichlid teleost species were detected at sites compared to the visual survey data. For cichlid species, however, sequencing depth substantially influenced species richness estimates obtained from eDNA samples, with increased depths producing estimates comparable to that obtained from the visual survey data.

Conclusions: Our study highlights the importance of sequencing depth and local reference databases when undertaking metabarcoding studies within diverse ecosystems, as well as demonstrating the potential of eDNA metabarcoding for surveying diverse tropical fish communities, even those containing closely related species within evolutionary radiations.

KEYWORDS
cichlids, environmental DNA, Lake Tanganyika, metabarcoding, reference database, tropical freshwaters

This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited. © The Authors. Environmental DNA published by John Wiley & Sons Ltd

wileyonlinelibrary.com/journal/edn3 | Environmental DNA. 2020;2:24–41.

| INTRODUCTION

Freshwaters globally represent highly productive and biologically diverse ecosystems, with much of this diversity centered in the tropics (Collen et al.). Aquatic habitats in South America, Central and Eastern Africa and South-East Asia contain the highest species richness and endemicity across all major freshwater taxonomic groups, excluding crayfish, highlighting the importance of these regions for global freshwater diversity (Tisseuil et al.). As well as hot spots of diversity, tropical freshwaters have also been focal points of development and as a result face a broad range of stressors (Strayer & Dudgeon). This has resulted in rates of biodiversity decline in freshwaters surpassing that in both terrestrial and marine ecosystems, with extinction rates of freshwater fishes in the twentieth century exceeding that of all other vertebrate groups (Burkhead; Collen et al.). The recent Freshwater Living Planet Index reports average recorded population declines since in the Neo- and Afrotropics of and respectively (Grooten & Almond). This exceeds terrestrial declines and highlights the substantial pressures on species within tropical freshwater ecosystems.

Species richness and evenness measures underpin our understanding of biological diversity and our ability to monitor their responses to anthropogenic stressors. These measures are reliant on the accurate detections of species, as they assume survey data are representative of the community sampled (Buckland, Studeny, Magurran, & Newson). Therefore, variation in the detectability of species due to differing behaviors or across habitat types can lead to inaccuracies in diversity measures (Gotelli & Colwell). Traditional methods of surveying freshwater ecosystems all impose biases on datasets, often relating to the size and activity of individuals (Jackson & Harvey). As a result, the survey method used has been shown to significantly influence the diversity values obtained from sites (Deacon et al.; Jackson & Harvey; Oliveira, Gomes, Latini, & Agostinho). To help overcome these individual biases, applying multiple methods for surveying fish communities has been widely advocated (Kubecka et al.).

Recently, environmental DNA (eDNA) metabarcoding has been shown to be a sensitive and cost-effective method of surveying freshwater communities (Evans et al.; Hänfling et al.; Valentini et al.). Comparisons with other survey methods have demonstrated how eDNA metabarcoding data can complement traditional approaches, often leading to increased detection of species and improved species richness estimates (Deiner et al.). Despite this, current applications of eDNA metabarcoding within freshwaters have largely been restricted to temperate ecosystems, with few published studies within tropical freshwaters (Cantera et al.; Cilleros et al.; Lopes et al.). Potential variability in the effectiveness of these methods across temperate and tropical ecosystems could result from differences in the ecology of eDNA (Eichmiller, Best, & Sorensen; Strickler, Fremier, & Goldberg), variation in the detectability of rare species within highly diverse tropical fish communities, the existence of taxonomic problems for some fish taxa (Decru et al.) and the ability of markers to distinguish between closely related species in tropical ecosystems (Breman, Loix, Jordaens, Snoeks, & Van Steenberge; Pereira, Hanner, Foresti, & Oliveira). More diverse systems also require higher sampling depths to obtain accurate species richness estimates (Gotelli & Colwell). As a result, current eDNA metabarcoding designs may require modifications (e.g., increased sequencing depth) to effectively survey the diverse fish communities found in many tropical freshwater ecosystems. A recent application of eDNA metabarcoding within tropical South American streams highlighted the potential of this method for surveying diverse fish communities, while also demonstrating some current limitations in the detection of species compared to studies within temperate ecosystems (Cilleros et al.). Still in its infancy as a method, there remains a need to test eDNA metabarcoding methods across a wider range of complex tropical ecosystems to better understand the potential of these approaches for surveying freshwater fish diversity globally.

Limiting factors preventing the application of eDNA metabarcoding within tropical systems include difficulties collecting and preserving samples within remote locations, incomplete taxonomic descriptions for many fish groups resulting in cryptic biodiversity (Decru et al.; Sales et al.) and the limited availability of sequences within public databases for tropical fish species, particularly for the commonly used 12s and 16s mitochondrial gene regions. Due to the extensive molecular and taxonomic work undertaken on Lake Tanganyika's (LT) fish fauna (Salzburger, Van Bocxlaer, & Cohen and refs therein), most fish groups in this system are well studied and there is also a good availability of DNA samples and sequences from museum and research group collections. As a result, LT provides an ideal tropical system with which to test eDNA metabarcoding methods.

Lake Tanganyika contains an exceptional fish diversity with over species, most of which are endemic to the basin (Salzburger et al.). A key feature of LT's fish fauna is that much of its diversity emerged through in situ evolutionary radiations, including radiations of cichlid fishes with at least known species (Day, Cotton, & Barraclough; Muschick, Indermaur, & Salzburger; Ronco, Büscher, Indermaur, & Salzburger), catfishes (Day & Wilkinson; Peart et al.) and mastacembelid spiny eels (Brown et al.), with the latter non-cichlid groups containing far fewer species. The high levels of sequence similarity between closely related and young species emerging from rapid evolutionary radiations can make accurate barcode identifications challenging (Salzburger). For example, a recent study of LT cichlid fishes showed the taxonomic resolution of the traditional COI barcoding region was limited in some cases to species complexes and genera (Breman et al.). As with other large freshwater lakes, much of LT's fish diversity is found within the littoral zone (Vadeboncoeur, McIntyre, & Vander Zanden). The local species richness of fish communities is particularly high within littoral rocky habitats, where as many as fish species including cichlids have been identified within a multiyear m quadrat survey (Takeuchi, Ochi, Kohda, Sinyinza, & Hori) and cichlid species identified at one site in Mahale National Park

FIGURE Map of Lake Tanganyika sampling locations for both field seasons in and. Site numbers increase from to in a northwards direction. Green circles are sites surveyed in only, yellow diamonds show sites surveyed in only, and red triangles are sites surveyed in both years. Inset: Lake Tanganyika, with the red box highlighting the study area. Maps were created with QGIS v. The base map is Google™ Terrain map.

(Britton et al.). As such, accurate and consistent surveying of species within this habitat poses significant challenges, with methods often reliant on labor-intensive SCUBA visual surveys or gill netting (although see Widmer et al., who utilized video technology for surveying LT cichlid fishes).

The high local diversity and large number of closely related and young species within LT's littoral habitat poses a number of challenges to eDNA metabarcoding methods, many of which will be common across other tropical ecosystems. Here, we develop an extensive reference sequence database across key barcoding regions and collect eDNA samples alongside visual survey data to address a number of these challenges. Specifically, we asked: (a) Can eDNA metabarcoding methods accurately identify LT cichlid fishes to species level? (b) How effective is eDNA metabarcoding at detecting non-cichlid teleost species within the LT littoral habitat? (c) Are eDNA species richness estimates and detection rates comparable to visual survey data?

| MATERIALS AND METHODS

| Site locations

Sampling was undertaken within the Kigoma region of LT in Tanzania (Figure), representing a well-studied and easily reachable area of the lake (GPS locations: Table S). The fish communities along this section of coastline have received substantial research, with surveys of the littoral fish communities having been undertaken since (Britton et al.). Two field seasons were undertaken to LT in September–October and May–June, during which visual

SCUBA survey data and eDNA samples were collected across sites along a km stretch of coastline.

| Sampling

Surveys were focused along areas of rocky habitat, as this contains the most diverse littoral fish fauna (Hori, Gashagaza, Nshombo, & Kawanabe). The structure of littoral communities has been shown to differ across very fine spatial scales, particularly between depths of m (Takeuchi et al.). To capture this, a nested design survey following Britton et al. was adopted at each site, in which a series of ten stationary visual surveys were undertaken at depths of and m (Figure S; Bohnsack & Bannerot). At each depth, five surveys were undertaken, positioned at the central eDNA collection point and at and m along the coastline in either direction. For further details of the survey design, see Appendix S.

Prior to undertaking the visual surveys, a water sample was collected from the mid-survey point at a depth of m, following a design similar to Port et al. Containers used for collecting each eDNA sample were rinsed with bleach solution (sodium hypochlorite, concentration unknown) followed by lake water at the collection site. Divers remained m off the bottom to prevent kicking up sediment that could alter the eDNA within the water column. Nitrile gloves were also worn to reduce the potential of contaminating samples with their own DNA. A single L water sample was collected within a collapsible container, transferred to a more solid container and stored on ice within a cooler box while visual surveys were undertaken. Water samples were subsequently filtered once back onshore, within hr of collection.

Nitrile gloves were worn throughout the filtering process, and prior to filtering the work surface was cleaned with commercial bleach solution. eDNA samples consisted of L water filters (subsamples of the L water sample), with one eDNA sample collected per site in and three filter replicates collected per site in. Each L sample was vacuum-filtered onto sterile mm diameter, μm pore size cellulose nitrate filter paper contained within ml disposable Thermo Scientific Nalgene Analytical Test Filter Funnels. Filters were then folded inwards three times and placed within a ml Eppendorf tube. In this tube was stored immediately at °C, transported back to the UK at °C using a dry shipper and then stored at °C until extraction. In samples were fully submerged in molecular grade ethanol at room temperature until returning to the UK, after which they were stored at °C. A filtration blank was collected between the filtration of every five samples to monitor for potential contamination. This involved filtering L of commercial bottled water (Kilimanjaro brand) following the same procedure as above.

| Reference database

A multimarker approach was adopted to overcome low divergences between closely related species for individual markers. A total fish list for species within the LT basin was developed based on FishBase (www.fishbase.org), Brichard and Ronco et al. (Table S). Cichlid 12s, 16s and control region sequences for species were extracted from available mitogenome alignments derived from whole genome assemblies. Separate species with identical marker sequences within the reference database were grouped into species complexes, similar to Breman et al. Separate species complex groupings were undertaken for each primer set based on marker resolution (Table S). For each region, one sequence per species was included in the reference database, except for Oreochromis tanganicae, where a second sequence was included from an individual collected from a fish farm within LT in case the farmed populations differed from wild ones. Oreochromis tanganicae (fish farm), Oreochromis malagarasi and Oreochromis niloticus sequences were obtained separately through Sanger sequencing (Appendix S), and an Aulonocranus dewindti control region sequence from NCBI was included in the reference database, as mitogenome extract sequences were not available for these species.

Non-cichlid fish sequences were obtained using samples from research group collections (Day and Salzburger labs), the South African Institute for Aquatic Biodiversity (SAIAB) and the American Museum of Natural History (AMNH). Some samples for species within the reference database, largely inhabiting river catchments, were collected outside of the LT basin. As the taxonomies for some of these fish groups remain unresolved, samples collected outside of this range may represent separate species yet to be described. Nevertheless, they would be close relatives to those species found within the LT basin. Eight samples collected within the LT basin were identified to genus level for these fish groups. These samples were also sequenced and included in the database to help overcome the potential issue of poorly described taxonomies (e.g., Amphilius, Clarias, Enteromius, Kneria, Leobarbus and Opsaridium). For the laboratory methods undertaken to obtain reference database sequences, see Appendix S.

| Primer design, in silico and in vitro testing

Reference database sequences for each region were used to design new primers targeting the lake's cichlids and test these in silico, along with previously published universal fish primer sets. ecoPrimers was used to search these regions for suitable variable barcode locations (Riaz et al.). This identified a highly variable portion of the control region, enabling improved taxonomic resolution. Primers flanking this region were designed using sequence alignments and Primer3 v (Untergasser et al.). This primer set amplifies a bp barcode fragment, with the forward primer located within the same region as the commonly used LProF forward primer (Meyer, Morrissey, & Schartl). The 16s-Teleo primer set was designed following a similar method, targeting the same variable region as Ves while amplifying a shorter barcode of bp compared to bp (Evans et al.). The newly designed Cichlid-CR and 16s-Teleo primers, along with previously published 12s-V5 and MiFish-U primer sets, were selected for in silico testing (Miya et al.; Riaz et al.).
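The grouping of species with identical marker sequences into species complexes can be illustrated with a minimal Python sketch. This is not the authors' code: the function name, the dictionary input format and the toy sequences are assumptions made purely for illustration, and the grouping is run once per marker.

```python
from collections import defaultdict


def group_species_complexes(barcodes):
    """Group species that share an identical barcode for one marker.

    `barcodes` maps species name -> barcode sequence (hypothetical format).
    Returns a sorted list of sorted species lists; any list with more than
    one species corresponds to a marker-specific species complex.
    """
    by_sequence = defaultdict(list)
    for species, sequence in barcodes.items():
        # Compare sequences case-insensitively.
        by_sequence[sequence.upper()].append(species)
    return sorted(sorted(names) for names in by_sequence.values())


# Toy sequences for illustration only (not real barcodes).
example = {
    "Species A": "ACGTACGT",
    "Species B": "acgtacgt",
    "Species C": "ACGTTCGT",
}
complexes = group_species_complexes(example)
print(complexes)
```

Because the grouping depends only on exact sequence identity, a marker with lower resolution will merge more species into complexes than a more variable marker, which is why separate groupings per primer set are needed.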

TABLE Primers used for the first round of amplifications

Name | Region | Forward primer (5'–3') | Reverse primer (5'–3') | Barcode length (bp) | Reference
MiFish-U | 12s | GTCGGTAAAACTCGTGCCAGC | CATAGTGGGGTATCTAATCCCAGTTTG | | Miya et al.
12s-V5 | 12s | ACTGGGATTAGATACCCC | TAGAACAGGCTCCTCTAG | | Riaz et al.
16s-Teleo | 16s | GACGAGAAGACCCTDTGGAG | GTCCTGATCCAACATCGAG | | This publication
Cichlid-CR | Control region | CCTACCCCTAGCTCCCAAAG | ACTGATGGTGGGCTCTTACTACA | | This publication
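As a rough illustration of what an in silico primer check involves, the sketch below counts mismatches between a primer and a candidate binding site while honoring IUPAC degenerate bases, such as the D in the forward primer GACGAGAAGACCCTDTGGAG listed in the table. It is a simplified stand-in, not the scoring used by any published tool; the function names and the toy template are hypothetical.

```python
# IUPAC nucleotide codes mapped to the bases each code can represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def count_mismatches(primer, site):
    """Count positions where the site base is not allowed by the primer code."""
    if len(primer) != len(site):
        raise ValueError("primer and site must have the same length")
    return sum(
        1 for p, s in zip(primer.upper(), site.upper()) if s not in IUPAC[p]
    )


def best_binding_site(primer, template):
    """Slide the primer along the template (same strand, no reverse
    complement) and return (start, mismatches) for the best window,
    or None if the template is shorter than the primer."""
    best = None
    for start in range(len(template) - len(primer) + 1):
        window = template[start:start + len(primer)]
        mismatches = count_mismatches(primer, window)
        if best is None or mismatches < best[1]:
            best = (start, mismatches)
    return best


forward_primer = "GACGAGAAGACCCTDTGGAG"
# Toy template: three flanking bases either side of a perfect site.
toy_template = "TTT" + "GACGAGAAGACCCTATGGAG" + "CCC"
print(best_binding_site(forward_primer, toy_template))
# D allows A, G or T, so a C at that position counts as one mismatch.
print(count_mismatches(forward_primer, "GACGAGAAGACCCTCTGGAG"))
```

A fuller evaluation would also weight mismatches by their position in the primer and handle the reverse primer on the complementary strand; both are omitted here to keep the sketch short.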

Primer specificity was evaluated using PrimerMiner v with threshold scores ranging between and (Bylemans, Gleeson, Hardy, & Furlan; Elbrecht & Leese). Kimura-2-Parameter (K2P) distances were calculated for each marker using ape v to investigate the genetic divergences between species within the reference databases (Kimura; Paradis, Claude, & Strimmer). DNA extracts for fish species found within LT were amplified with each primer set to test their consistency of amplification across taxa (Table S). L eDNA samples, filtered following the same methods as for the field samples, were also collected from the LT tank at Zoological Society of London (ZSL). This aquarium contained five cichlid species endemic to the lake (Julidochromis sp., Haplotaxodon microlepis, Altolamprologus calvus, Neolamprologus longior and Lepidiolamprologus kendalli), with eDNA samples sequenced following the same metabarcoding methods detailed below. The four selected primers are shown in Table, with their locations in each gene region highlighted in Figure S.

| DNA extraction and PCR amplifications

Further details of the DNA extractions and PCR amplifications are provided within the Appendix S. Briefly, sample filters were extracted using the Qiagen DNeasy® Blood & Tissue Kit following a modified protocol in combination with a Qiagen QIAshredder, based on the methods of Goldberg et al. and Lacoursière-Roussel et al. DNA was amplified using a two-step PCR protocol, with barcoded Illumina adapters added in a second amplification (PCR). Amplicon PCRs (PCR) were replicated four times for each sample and pooled to minimize PCR bias. All filter and extraction controls were included in the PCRs.

| Library quantification and original sequencing run

Following PCR of product from each reaction was quantified using a FLUOstar Optima (Promega). Based on these results, samples were normalized to equal concentrations, pooled into groups of and cleaned with AMPure XP beads. The Illumina-tagged DNA concentration of each pool was quantified using the KAPA Library Quantification Kit run on a QuantStudio K (Applied Biosystems) and DNA fragment size identified with an Agilent Analyzer. As this identified likely primer dimer in some sample pools, these were size selected using a BluePippin (Sage Science) and rerun on an Agilent Analyzer. Final pools were quantified using both the KAPA Library Quantification Kit and a QUBIT using the dsDNA HS assay. Libraries were sequenced on an Illumina MiSeq platform at the Sheffield Children's Hospital Next Generation Sequencing Facility. The MiFish and 12s-V5 pool were sequenced on a bp paired-end sequencing run, and a bp paired-end run was used for the 16s-Teleo and Cichlid-CR pool. A PhiX spike-in was included on both runs to increase the sequence complexity. In total samples were sequenced for each primer set, comprising field samples, aquarium samples, filter negative controls and extraction negative controls.

| Bioinformatic analyses

Analyses were run on the High Performance Computing Cluster at the University of Sheffield, with full details on software and parameters used provided in the Appendix S. Briefly, reads were quality checked with FastQC (Andrews) and trimmed based on read quality, with the removal of Illumina sequencing adaptors, using Trimmomatic v (Bolger, Lohse, & Usadel). Quality filtered reads were aligned with FLASH v (Magoč & Salzberg) and primers trimmed, allowing for one mismatch, with Mothur v (Schloss et al.). Sequences were dereplicated with USEARCH v (Edgar) and clustered into high resolution MOTUs with Swarm v (d; Mahé, Rognes, Quince, De Vargas, & Dunthorn). MOTUs with a read count less than were removed, as in Hänfling et al. Blast searches were undertaken against the local reference database for each marker. Following this, taxonomies were assigned with MEGAN v based on primer-specific identity thresholds of, using default parameters apart for a minimum score of, a minimum e-value of –10 and a top percent of. Remaining unassigned MOTUs were removed.

To investigate the likely identity of unassigned MOTUs, a secondary blast search was undertaken against the NCBI nucleotide database, with taxa assigned using the same methods as for the local reference database. Taxa assigned at this stage to previously unassigned reads were not included within the final dataset from which species richness estimates were derived.

| Error filtering and final eDNA matrix assignment

To reduce the impact of false positives, the maximum read count for each MOTU found in any of the filter or extraction negative controls was subtracted from the read counts for the respective MOTUs within the LT and aquarium samples. MOTUs were then grouped together by their assigned taxonomy. Finally, taxonomic assignments with a sample read count below were removed. This was based on read counts within the aquarium samples, removing any assignments to species not found within the aquarium, except potential misidentifications to close relatives by the MiFish marker. Taxonomic assignments from each primer set were aggregated together to form a final matrix. Those at a higher level than species or species complex were removed.

| Increased sequencing depth

Following the analysis of the initial results, a subset of samples (N) collected in from eight sites were resequenced on a MiSeq run with the Cichlid-CR marker to investigate the impact of sequencing depth on the species richness estimates obtained from the metabarcoding data. Six PCR replicates were undertaken per sample, producing a total of technical replicates per site. Library preparation, sequencing and bioinformatic analysis steps were consistent with those for the original sequencing runs detailed above (for further information, see Appendix S).

| Statistical analysis

All statistical analysis was undertaken in RStudio v (RStudio Team). Total matrices were converted to presence–absence for comparisons between year, filter replicates and survey methods. Visual survey species accumulation curves were produced using iNEXT v (Hsieh, Ma, & Chao). Accumulation curves for the eDNA samples and site species richness values were calculated using Vegan v. eDNA accumulation curves were calculated for sites only, as these contained three filter replicates per site. To investigate scales of detection, Sørensen dissimilarity values were calculated between the eDNA site species richness estimates and those derived from the visual survey data at five different scales: (a) site scale, where there is a maximum distance of m between surveys and eDNA samples; (b) the central three survey points at both depths, with a maximum of m between surveys and eDNA samples; (c) the central survey points where eDNA samples were collected; (d) surveys at m; and (e) surveys at m depth. Sørensen dissimilarity values were calculated with betapart v (Baselga & Orme; Oksanen et al.).

| RESULTS

| Visual surveys

A total of detections representing species were made across the visual surveys, including cichlid and non-cichlid species (Table S). Interpolated and extrapolated sampling curves showed clear plateauing at the majority of sites, demonstrating sufficient sampling completeness in both field seasons (Figures S and S).

| Reference database and in silico testing

A total of fish species from families were identified as occurring in LT and its broader catchment area (Tables S and S). This includes cichlid species, of which are described, with the remainder currently either undescribed or putative (Ronco et al.). Reference database sequences were obtained for fish species, including eight taxa only identified to genus level, representing of species, and cichlid species, representing from this family. In silico results demonstrate the MiFish, 12s-V5 and 16s-Teleo primer sets are highly conserved across the lake's fishes, except for a first base mismatch against the Synodontis catfishes ( spp.) for the 12s-V5 primer set (Figure S). The Cichlid-CR primer set is highly conserved across the lake's cichlid fishes, with no mismatches within the first seven bases of either primer. These results were supported by the consistent amplification of DNA across fish species in the lake by the three universal fish primers and cichlid fishes by Cichlid-CR (Figure S and Table S). Further testing of the newly designed 16s-Teleo and Cichlid-CR primers against the MitoFish database and cichlid mitogenomes within NCBI, respectively, demonstrated these primers are largely conserved across fish species and cichlids globally (Table S).

The percentage of species with unique barcodes in the reference database ranged between for the 12s-V5 marker, for MiFish, for 16s-Teleo and for Cichlid-CR (Figure S). Species with identical sequences were grouped into marker-specific species complexes (Table S). The genetic distances of species to their closest neighbor had a mean of for the MiFish marker, for 12s-V5, for 16s-Teleo and for Cichlid-CR (Figure S). Closest neighbor genetic distances for the cichlid fishes were largely below for most markers, with of MiFish, of 12s-V5, of 16s-Teleo and of Cichlid-CR barcodes having divergence values greater than. This increased for the non-cichlid fishes across the three universal fish primers, with and of MiFish, 12s-V5 and 16s-Teleo barcodes, respectively, having closest neighbor divergence values higher than.

Mean within-genus genetic distances also varied considerably between markers (Figure S), ranging between and (Table S). Within-genus K2P distances for the cichlid fishes were three to four times higher for Cichlid-CR compared to the other three markers. The distribution of genetic divergence is also greater for Cichlid-CR compared to the other three markers, ranging between and (Figure S). The increased interspecific variability within the Cichlid-CR barcode for the cichlid fishes suggests a likely improved taxonomic resolution compared to the other three markers, with the potential to identify many cichlids down to species level.

| Original sequence data

In total million and million paired-end reads were obtained from the bp and bp MiSeq runs, respectively. Following the bioinformatic filtering steps (shown in Table S), a total of million reads remained across the four primer sets. million reads were assigned to the MiFish primer set, with the other three primers ranging between thousand and million reads. Low read counts were identified for some samples with each primer set (MiFish, 12s-V5, 16s-Teleo and Cichlid-CR) that were ultimately removed. These were not consistent across the samples, apart from replicates for sites and in that had very low extract DNA concentrations ( ng). Following filtering (Table S), mean sequencing depths per site were in (one filter replicate) and in (with three filter replicates). Mean LT sample depths for each marker were reads for MiFish, for 12s-V5, for 16s-Teleo and for Cichlid-CR. Species accumulation curves for the eDNA samples showed little plateauing, demonstrating limited sampling completeness and suggesting sequencing depth may not be sufficient at some sites (Figure S). Of the filter and extraction controls sequenced, no contamination was identified within the negative controls for any of the species within the local reference database. The secondary NCBI blast search did identify human assigned 12s-V5 (N) and 16s-Teleo (N) reads within some of the field filter controls, likely resulting from the collecting or filtering of eDNA samples in the field, as these were not present in any of the DNA extraction negative controls.

Taxa from fish families within the reference database were assigned to MOTUs (Figure). Cichlidae dominated both the MOTU and read counts of all four primer sets, largely reflecting their abundance within the littoral habitat. The MiFish primer set showed a high specificity to fishes, with of reads assigned to sequences within the reference database. For the 12s-V5, 16s-Teleo and Cichlid-CR markers, and of reads, respectively, remained unassigned. The secondary blast search against the NCBI nt database demonstrated the majority of these reads were assigned to Vertebrata, primarily Hominidae, for the 12s-V5 and 16s-Teleo primer sets (Figure). For Cichlid-CR, of unassigned reads matched to LT cichlid sequences not included within the local reference database. This likely reflects intraspecific variability within the Cichlid-CR barcode not accounted for within the reference database, for which there is currently one sequence per species. Species-level assignments to MOTUs made by the local reference and NCBI databases were found to differ in of cases, with species in different genera assigned to of MOTUs. To ensure the accuracy of identifications, MOTUs with only NCBI assignments were therefore not included within the final dataset.

| Aquarium samples

All four markers consistently identified N. longior, which is by far the most abundant fish species within the ZSL tank (Table S). Julidochromis dickfeldi species-level assignments were made by the MiFish and Cichlid-CR markers for the Julidochromis sp. within the tank. The remaining assignments, however, were limited to genus, tribe or family level, with no read assignments corresponding to A. calvus or H. microlepis for the Cichlid-CR marker. The MiFish marker also made two erroneous assignments to close relatives of species found within the aquarium (shown below the dashed line in Table S). The aquarium samples highlight the challenges identifying taxa down to species level, with varying taxonomic resolutions achieved across all four primer sets.

| Lake Tanganyika samples

The majority of MOTUs and reads were assigned to taxonomic levels above species and species complex for the three universal fish primers (Figures S and S). and of total reads were identified to species level by the MiFish, 12s-V5 and 16s-Teleo markers, respectively. of 12s-V5 reads were assigned to family level, greater than all three other markers (MiFish, 16s-Teleo and Cichlid-CR). A much higher proportion of MOTUs and reads were assigned to species level by Cichlid-CR, with of MOTUs and of reads assigned to species level, respectively.

Within the final eDNA dataset, detections of species or species complexes (N) were made across the eDNA samples in both years, including cichlids and non-cichlids (Table S). No species-level assignments were made for Clarias catfishes, so only the genus assignment was included. Of the cichlid species identified, were unlikely to be found along the surveyed range based on previous coastline survey data and Konings (Table S). These species represented of site occurrences within the eDNA dataset, with species occurring three or less times. Only three of the misassigned species had nearest neighbor K2P distances greater than, with a mean of within their identifying markers. This demonstrates they all had close genetic relatives within the reference database. Similar to the aquarium samples, these were considered to be erroneous assignments, likely representing false positives. As a result, they were removed from further analysis for comparisons with the visual survey data.

Of the remaining cichlid identifications, were made by one marker, by two, by three and by four markers (Figure a). A total of cichlid species were detected by Cichlid-CR, by 16s-Teleo, by MiFish and by 12s-V5. The Cichlid-CR primer set also detected the most unique cichlid species, with independent identifications. Of the non-cichlid identifications, eight were made by all three markers, with 16s-Teleo (N) and MiFish (N) detecting more species than 12s-V5 (N) (Figure b). Only two of the erroneous species identifications were made by more than one marker (Figure c). Of the remaining identifications, three were made by MiFish, two by 12s-V5, seven by 16s-Teleo and five by Cichlid-CR.

| eDNA sample comparison

No relationship was identified between species richness estimates and the standardized read counts of samples (Spearman rho

FIGURE Families to which MOTUs and sequence reads were assigned (top), and families to which unassigned MOTUs and sequences were assigned with a secondary BLAST search against the NCBI nt database (bottom)

|eDNA field season comparison

The use of multiple filter replicates in the field season resulted in increased eDNA species richness estimates for each site compared to . Mean eDNA species richness estimates across all sites were in and in , while the six sites surveyed in both years had a mean species richness estimate of in and in . Sørensen dissimilarity comparisons of the and species richness estimates derived from each site therefore showed limited similarity, ranging between and (Table S). The nestedness component dominated at four sites compared with species turnover, as a result of the lower species richness estimates derived from the samples compared to .
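The partition of Sørensen dissimilarity into turnover and nestedness used in this comparison can be illustrated with a short calculation. The following is a minimal sketch, assuming the standard pairwise formulas of the Baselga framework implemented in the `betapart` R package cited in the references; the function name `beta_partition` and the species labels are hypothetical and are not taken from the study data.

```python
def beta_partition(site_a, site_b):
    """Return (beta_sor, beta_sim, beta_sne) for two sets of species.

    a = species shared by both lists, b and c = species unique to each list.
    beta_sor is Sørensen dissimilarity, beta_sim is the Simpson (turnover)
    component, and beta_sne = beta_sor - beta_sim is the nestedness component.
    """
    site_a, site_b = set(site_a), set(site_b)
    if not site_a or not site_b:
        raise ValueError("both species lists must be non-empty")
    a = len(site_a & site_b)
    b = len(site_a - site_b)
    c = len(site_b - site_a)
    beta_sor = (b + c) / (2 * a + b + c)
    beta_sim = min(b, c) / (a + min(b, c))
    beta_sne = beta_sor - beta_sim
    return beta_sor, beta_sim, beta_sne


# Hypothetical example: a short list that is a strict subset of a longer one.
short_list = {"sp1", "sp2", "sp3"}
long_list = {"sp1", "sp2", "sp3", "sp4", "sp5", "sp6"}
sor, sim, sne = beta_partition(short_list, long_list)
print(f"beta_sor={sor:.2f} beta_sim={sim:.2f} beta_sne={sne:.2f}")
```

When one list is a subset of the other, as in this hypothetical case, the turnover component is zero and all of the dissimilarity is attributed to nestedness, which is the pattern described for sites where one season yielded far fewer species than the other.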

|Comparison of eDNA and visual survey site diversity estimates

Based on the initial sequencing run, across all surveys a total of species were detected, with species ( cichlids and non-cichlids) identified by both methods, ( cichlids and non-cichlids) by the eDNA method only, and ( cichlids and non-cichlid) by the visual surveys only. Visual survey species richness estimates were consistently higher compared with those derived from the eDNA samples in both years (Figure). This was largely due to a reduced detection of cichlids at each site, with a number of commonly observed species missing from, or underrepresented within, the eDNA dataset. For example, Lamprologus callipterus, Lepidiolamprologus attenuatus and Perissodus microlepis, which had , and site occurrences within the visual survey data, respectively, were not present within the eDNA dataset. A reduced number of detections per species was also consistently observed within the eDNA data compared to the visual survey dataset (Figure S).

The eDNA dataset consistently detected a greater number of non-cichlid species at each site compared with the visual survey data, particularly in (Figure). Across the and field seasons, a mean of and non-cichlid species, respectively, was detected per site within the eDNA dataset, compared to and , respectively, from the visual surveys. In total, a greater number of non-cichlid species were detected within the eDNA samples (N) compared to the visual surveys (N). This includes an increased number of detected species within the Mastacembelus spiny eel, Synodontis and claroteid catfish radiations, as well as of other catfishes (e.g. Malapterurus tanganyikaensis, Tanganikallabes mortiauxi), Lates species, the lake's two freshwater herring species Limnothrissa miodon and Stolothrissa tanganicae, and Acapoeta tanganicae. Observed Barbus sp. (possibly a misidentification of A. tanganicae), assigned to genus level at sites in , was the only non-cichlid detected by the visual surveys not included in the eDNA dataset.

Species detections for the eDNA and visual survey methods showed limited similarity at each site (Table). In , an average of of species detected at each site were found within the eDNA and visual survey datasets, were in the eDNA data only and

FIGURE Primer detections of fish species within the eDNA dataset. Identifications are split into cichlid fishes only (a), non-cichlid fishes (b) and likely false positives (c)

, p, N) or sites (Spearman rho, p, N). Filter replicates collected at sites in showed limited similarity in species detected (Figure). In total, of species detections were made by one filter, by two and by three. A significant negative correlation was identified between total site species richness and the percent of species identified in only one biological replicate (Spearman rho, p, N), with a significant positive correlation also identified between site species richness and the percent of species identified in three biological replicates (Spearman rho, p, N).
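The Spearman correlations reported for the filter replicates are rank-based, so they test for a monotonic rather than a strictly linear relationship. As a minimal sketch of how such a coefficient is obtained, the code below ranks both variables (averaging tied ranks) and computes the Pearson correlation of the ranks. The helper names and the per-site values are hypothetical and chosen only for illustration; they are not the study data, and the original analysis software may handle ties or p-values differently.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of the 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranked data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / (sx * sy)


# Hypothetical per-site values: total species richness and the percentage
# of species detected in only one of three filter replicates.
richness = [10, 20, 30, 40]
pct_single_replicate = [90.0, 70.0, 50.0, 30.0]
print(spearman_rho(richness, pct_single_replicate))
```

In this made-up example the percentage of single-replicate detections falls at every step as richness rises, so the coefficient takes its minimum value of -1.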

FIGURE Percentage of species identified in each of the three filter replicates collected at sites in (legend: 1 replicate, 2 replicates, 3 replicates; y-axis: % of species identified)
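The quantity plotted in this figure, the share of a site's species that were found in one, two or all three filter replicates, can be tallied directly from per-replicate species lists. The sketch below shows one way to do this; the function name `replicate_detection_percentages` and the species labels are hypothetical, and the example does not reproduce the study's data or its bioinformatic pipeline.

```python
from collections import Counter


def replicate_detection_percentages(replicates):
    """Percentage of a site's species detected in exactly k replicates.

    `replicates` is a list of species collections, one per filter replicate.
    Returns a dict mapping k (1..number of replicates) to a percentage of
    all species detected at the site.
    """
    # Count, for each species, how many replicates it was detected in.
    per_species = Counter(sp for rep in replicates for sp in set(rep))
    if not per_species:
        raise ValueError("no species detected in any replicate")
    tally = Counter(per_species.values())
    n_species = len(per_species)
    return {
        k: 100.0 * tally.get(k, 0) / n_species
        for k in range(1, len(replicates) + 1)
    }


# Hypothetical site with three filter replicates.
site_replicates = [
    {"sp1", "sp2", "sp3"},
    {"sp1", "sp2"},
    {"sp1", "sp4"},
]
print(replicate_detection_percentages(site_replicates))
```

Here four species are detected in total: two in a single replicate, one in two replicates and one in all three, giving 50%, 25% and 25%, respectively.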

in the visual survey data only. In , the number of species detected by both survey methods at each site increased to , with detected by the eDNA data only and by the visual survey only. As a result, mean Sørensen dissimilarity values of and were detected across and , respectively, demonstrating this difference in composition. At most sites, dissimilarity was largely driven by species turnover (variation caused by the replacement of one species by a different species) due to the high proportion of species identified by only one survey method. For sites where the eDNA species richness estimate was much lower than the visual survey data, a greater proportion of variance resulted from species nestedness, as the eDNA species list represented a subset of the more diverse visual survey.

Subsets of the visual survey site data were also analyzed against the eDNA data to investigate whether these better reflected the eDNA species richness estimates. Little difference in Sørensen dissimilarity values was observed across spatial scales, with the total site species richness estimate on average showing the greatest similarity to eDNA samples (Figure S). As a result, the eDNA best reflects the visual survey data at the site scale (within m), supporting comparisons being made between these methods at this scale.

|Impact of increased sequencing depth on species richness estimates

The additional sequencing run for the Cichlid_CR marker detected a total of cichlid species across the eight resequenced sites (Table S). A mean cichlid species richness estimate of was calculated across all sites, making it comparable to the mean estimate of from the visual survey data. In comparison, estimates from the original run were much lower, with a mean of with all four primers and with the Cichlid_CR marker only (Figure). As a result, site community compositions differed between sequencing runs, with mean site Sørensen pairwise dissimilarity values between the additional run and the original run of ( turnover, nestedness) for all primers and ( turnover, nestedness) for the Cichlid_CR marker only. Overall, there was an average sequencing depth of reads per sample and reads per site for the additional run post-filtering. Species accumulation plots for each site demonstrate substantial plateauing at most sites, highlighting improved sampling completeness within the additional sequence data, with site estimates closer to species saturation (Figure S).

|DISCUSSION

This study has demonstrated the potential of eDNA metabarcoding for surveying diverse and complex fish communities, as well as detecting closely related species within evolutionary radiations. It also highlights the importance of sequencing depth and reference database completeness when designing eDNA metabarcoding studies for surveying diverse fish communities, with recommendations for further improvements. Finally, through establishing a novel reference database for LT's fish communities, information on the interspecific genetic divergence across multiple markers is provided as a future resource that can be built upon for future metabarcoding work within this system.

|Genetic divergence and resolution of assignments

Analysis of the reference database showed limited interspecific genetic divergence between species for the s and s primer sets due to the large number of closely related species within the lake. Similar reduced interspecific genetic distances have been reported within the COI region for diverse neotropical fish communities that also include genera containing multiple species (Pereira et al.). While

FIGURE Species richness estimates from the individual surveys (boxplots), total site estimates from the ten visual surveys (blue squares) and from the eDNA samples (red triangles). The top, middle and bottom rows show total species richness for all fish, cichlid species and non-cichlid species, respectively

TABLE Comparisons of site species richness estimates from the eDNA and visual surveys

Site Total SR Shared species eDNA unique Visual unique Beta Sor. Beta Sim. Beta Sne.

S 9 21 0.12 S 35 2 1 32 0.33 S 34 1 2 31 0.94 S 11 31 0.13 S 43 9 10 24 0.53 0.13 S 44 11 10 23 0.12 S 30 1 3 0.94 0.19 S 40 11 10 19 0.09 S 32 0.53 0.11 S 40 10 23 0.59 0.11 S 40 11 9 20 0.45 0.12 S 14 4 19 0.45 0.22 0.23 S 14 15 0.54 0.52 0.02 S 15 10 21 0.51 0.4 0.11 S 12 19 0.49 0.43 S 43 5 32 0.45 0.3 S 49 12 12 25 0.5 0.11 S 22 0.42 0.04 S 52 14 24 14 0.5 S 51 13 14 24 0.59 0.52 S 12 5 29 0.59 0.29 0.29 S 19 19 24 0.53 0.5 0.03 S 24 19 19 0.44 0.44 0 S 44 3 35 0.33 0.43 S 11 4 31 0.35 S 23 0.52 0.05 S 13 22 23 0.01

Note: Beta Sor is the Sørensen dissimilarity between estimates, Beta Sim is the Simpson pairwise dissimilarity measuring species turnover, and Beta Sne is the dissimilarity accounting for species nestedness. Total SR is the combined species richness estimate from both survey methods. Site descriptions state the site number followed by the survey year.

interspecific divergence values for these barcodes often fell below the traditional cutoff for species delimitation, the ability to correctly distinguish between species with variation below this threshold has been demonstrated (Breman et al.; Pereira et al.).

A number of species were found to have identical barcodes across some of the markers used, and these were ultimately grouped into species complexes. The taxa included in species complexes were not consistent across primers, resulting in all species containing a unique sequence within at least one marker, apart from the Benthochromis tricoti/Benthochromis sp. "horii mahale" and the Cyprichromis coloratus/Cyprichromis sp. "jumbo" complexes, which both likely represent geographical variants rather than distinct species (Ronco et al.). This is largely due to the increased interspecific variation observed within the Cichlid_CR barcode, where genetic distances were three to four times higher for the cichlid fishes compared to the other three markers. The increased substitution rate within this region can improve the taxonomic resolution of barcodes, with the control region previously shown to be a more robust marker for species-level identifications across a number of fish genera compared to COI (Cawthorn, Duncan, Kastern, Francis, & Hoffman; Pedrosa-Gerasmio, Babaran, & Santos; Shum et al.). Substantial intraspecific variability is also likely to exist within this highly variable region, with population structuring identified for a number of cichlid species (Sefc, Baric, Salzburger, & Sturmbauer; Wagner & McCune) and recently a catfish species (Peart, Dasmahapatra, & Day) inhabiting the lake's rocky littoral habitat. Although the strict conditions for the inclusion of specimens within our reference database limit error due to misidentifications, they have restricted the number of sequences per species to one for this study. Including multiple sequences per species will enable the assessment of intraspecific variability within this barcode and the quantification of any barcoding gap.

The improved interspecific variability within the Cichlid_CR marker led to an increased proportion of LT eDNA MOTUs and

FIGURE Cichlid site species richness estimates obtained from the original sequencing run, additional sequencing run and visual survey data. Only species richness estimates for the eight resequenced sites are included. "All primers original" represents species richness estimates derived from the four primers sequenced, while "Cichlid_CR original" shows estimates obtained from the Cichlid_CR marker only with the original sequence data (y-axis: cichlid species richness estimate)

reads identified to species level compared to the other three markers, which showed a limited taxonomic resolution, with over of reads assigned to genus level or above. Due to the large number of genera containing multiple species within LT, species-level identifications are largely required to be ecologically informative. As a result, reads assigned to higher taxonomic levels represent lost information, limiting the number of detections from these primer sets.

|Species-level identifications

Nevertheless, the eDNA samples resulted in a greater number of species identifications compared to the visual surveys, with species identified in total, excluding false positives. There is a strong depth gradient in the community structure of LT's littoral fish communities, and as a result significant changes in species composition can occur over small spatial scales (Takeuchi et al.). A number of the cichlid species identified only in the eDNA dataset are more commonly found either in the wave-washed habitat at shallower depths than those surveyed, such as Pseudosimochromis curvifrons, Tanganicodus irsacae and Spathodus erythrodon, or in deep-water habitats at depths below the visual surveys, including Benthochromis horii, Xenotilapia caudafasciata and two Trematocara species (Konings). Furthermore, longitudinal variation in cichlid community composition is heavily influenced by substrate type, e.g. rocky, sandy, muddy (Widmer et al.). While all eDNA samples were collected in rocky habitats, some additional species more commonly found within sandy habitats, including Cardiopharynx schoutedeni, Lestradea stappersii and a number of Xenotilapia species, were detected. The lake's two freshwater herring species, L. miodon and Stolothrissa tanganicae, which are pelagic, were also identified within the samples.

These examples suggest the spatial scales of detection within the eDNA samples may extend beyond the local littoral habitat surveyed. eDNA studies in similar coastal marine and lentic freshwater systems have shown fine-scaled detection for fish communities, with longer barcodes, such as the Cichlid_CR primer, potentially reducing the spatial scales of detection further (Andruszkiewicz, Starks, et al.; Hänfling et al.; Port et al.). This is likely to vary across ecosystems, however, with processes including lake mixing and stratification theoretically influencing the scales of detection (Deiner et al.). For example, seasonal upwelling, common in LT, could result in the transportation of deep-water eDNA into the littoral habitat. Annual surface water temperatures between °C and °C have been recorded within the surveyed region (Kimirei & Mgaya). Warm temperatures such as this reduce eDNA persistence within the water column (Andruszkiewicz, Sassoubre, et al.), with Eichmiller et al. detecting exponential DNA degradation rates in lake water at °C with a half-life of only hr (Collins et al.). While high degradation rates would suggest finer spatiotemporal scales of detection, the unique nature of LT's ecosystem and its fish communities means both degradation rates and scales of detection need to be investigated in future studies. The latter could be achieved through sampling across depths and habitat boundaries with marked shifts in fish community composition.

Of the detected species considered to be likely false positives, were identified by individual primers, of which occurred at only one site and six at two or three sites. Neolamprologus caudopunctatus and Xenotilapia sp. "papilio sunflower" assignments were made
by two markers, with occurrences at and sites, respectively, therefore not adhering to the lower confidence levels expected from false-positive assignments. Due to the large number of closely related species with low interspecific genetic distances, particularly within the s and s markers, the potential for misassignments is increased. This is highlighted by the low nearest neighbor genetic divergences for each misassigned species within their identifying markers. For example, Cyphotilapia gibberosa likely represents a misassignment by MiFish to Cyphotilapia frontosa, a species common along the surveyed coastline. Similarly, Xenochromis hecqui, identified by sTeleo, has only a genetic divergence within this marker from H. microlepis, as well as being closely related to Perissodus microlepis, both of which were frequently observed within the visual surveys. In these cases, it is possible that the limited interspecific genetic distances for some markers increase the likelihood of erroneous assignments, resulting in false-positive identifications. Erroneous assignments to close relatives were also identified by Cilleros et al. within tropical South American streams. These findings demonstrate the potential for assignment errors within complex tropical communities containing closely related taxa, often with limited sequence availability in reference databases.

The eDNA samples showed higher detection of non-cichlid species at sites compared to the visual surveys, with the use of filter replicates in resulting in an increase in the number of species detected. The improved detections of species within the catfish and mastacembelid spiny eel radiations demonstrated the effectiveness of eDNA metabarcoding for distinguishing between closely related species within these groups. While an increased number of detections were made by all three primer sets, a number of species were still identified by one or two markers, with Synodontis species only assigned to MOTUs by sTeleo. Building on the findings of earlier eDNA metabarcoding studies surveying fish communities (Evans et al.; Stat et al.), the use of multiple primer sets enables the improved detection of both cichlid and non-cichlid species.

Many of the non-cichlid species only detected in the eDNA samples are likely to be underrepresented within the visual surveys due to their behavioral habits. For example, the claroteine catfishes (e.g. Chrysichthys sianenna, Lophiobagrus cyclurus and Bathybagrus tetranema) are largely nocturnal (Peart et al.), and many Mastacembelus species live in the substrate or within the complex rocky environment (Brown et al.). In comparison, the territorial nature of many cichlid fishes means they are less shy toward SCUBA divers, a behavior that has been shown to positively bias the detection of fish species within visual survey data (Bozec et al.). These behaviors are therefore likely to result in the positive bias of cichlids and underrepresentation of many non-cichlids within visual surveys. Many of the species present only in the eDNA dataset are also wider ranging, with lower local abundances than the majority of cichlid species in the littoral habitat, and are therefore much less likely to be consistently recorded by the stationary visual surveys. Similarly, species that commonly exist in schools, such as Lamprichthys tanganicanus, are likely to have biased detection rates from visual surveys (Pais & Cabral), explaining why this species was more commonly detected within the eDNA data. As the eDNA survey method is less influenced by species behavior, its combined use alongside visual surveys holds the potential to help overcome some of these survey biases, particularly for the often more genetically distinct non-cichlid teleost species. This could therefore represent an immediate benefit of incorporating an eDNA metabarcoding approach within survey methodologies of LT's fish communities.

|Filter replicate similarity

Limited similarity between filter replicates at each site in was observed, as has been previously reported (Andruszkiewicz, Starks, et al.). Sites with a more diverse species richness estimate had an increased percentage of species identified in three filter replicates and a lower percentage of species detected in one filter replicate. Similarity between replicates is therefore greater at sites with a higher species richness. The limited similarity at low diversity sites likely results from inconsistencies surrounding the preservation, amplification or sequencing of one or more filter replicates at these sites, leading to variable species detections. The optimization of methods would likely lead to an improved similarity between replicates, as observed at sites that detected a greater number of species (e.g. sites and ). For example, while the storage of cellulose nitrate filter papers in ethanol at °C has been shown to be effective for eDNA preservation (Hinlo, Gleeson, Lintermans, & Furlan), recent research has demonstrated that the use of lysis buffer or drying in silica gel can give more consistent community composition estimates from lake eDNA samples (Majaneva et al.), and therefore likely more consistent filter replicates.

|Species richness estimates

While detecting more species overall, the eDNA samples consistently produced lower site species richness estimates compared to the visual surveys. Similarly, the community composition of species richness estimates from the eDNA and visual survey methods was also found to largely differ. Dissimilar fish assemblage patterns were also detected between eDNA metabarcoding and traditional survey approaches across diverse tropical streams in French Guiana (Cilleros et al.). Much higher similarities between eDNA and traditional methods at sites have been reported within temperate ecosystems (Pont et al.), largely due to the high detection sensitivity of eDNA metabarcoding methods within these ecosystems (Evans et al.; Hänfling et al.; Valentini et al.). In LT, this difference results from consistently observed cichlid species either not being detected or being underrepresented within the eDNA dataset, with a lower number of detections per species overall compared to the visual surveys.

While limitations in eDNA detections could be caused by PCR bias or the taxonomic resolution of individual markers, in this instance it is likely derived from under sampling due to the sequencing depth used, as well as limitations in the reference database, particularly for the Cichlid_CR marker. Despite the sample sequencing depth
being comparabe to that used in other eDNA metabarcoding studies species or indeed under spitting for some markers with insufficient Li et a Yamamoto et a the exceptiona diversity of taxonomic resoution LTs ittora habitat compared to these systems means a arger se- quencing depth is ikey required to get coser to saturation of species detection Figure S This is highighted by the improved species |CONCLUSION richness estimates obtained for sites from the additiona sequenc- ing run that greaty exceed estimates from the origina sequence To better understand the potentia for eDNA metabarcoding ap- data Figure S This higher sequencing depth combined with the proaches to survey freshwater fish communities gobay there is a greater resoution of detections with the CichidCR marker com- need to appy these methods across a broad range of ecosystems and pared to the other three primers resuted in improved cichid species communities This study provides a first appication of these meth- detections comparabe to those obtained from the visua survey data ods within one of the words most diverse freshwater ecosystems Figure Two eDNA metabarcoding studies recenty pubished fo- Our findings demonstrate the potentia and imitations of eDNA cusing on Guianese tropica streams had sequencing depths of over metabarcoding for identifying taxa to species eve and thereby reads per sampe prefitering Cantera et a Cieros contributing to diversity estimates for fish communities Wide varia- et a As the diversity of LT is twice that identified in Guianese tion in the resoution of markers highights the importance of primer streams high sequencing depths such as that used in the additiona seection with the use of a famiyspecific cichid contro region run and these two studies are required to obtain sufficient samping marker improving the taxonomic resoution of identifications within competeness and accurate species richness estimates this speciesrich group Using mutipe markers aso improved spe- Species 
richness estimates derived from the CichidCR marker cies detections across the cichid and noncichid fishes A number of in the origina run represent subsets of those from the additiona run fase positives were identified in this study ikey refecting current with Sørenson dissimiarity vaues dominated by species nestedness imitations in the resoution of the s and s markers as we as the Turnover represented a arger proportion of the dissimiarity when site reference database particuary for the contro region species richness estimates from the additiona run were compared with Increasing the sequencing depth substantiay improved site those from the origina sequence data with a four primer sets This species richness estimates from the eDNA sampes resuting in is because species detected by some of the other three markers sti estimates much more comparabe to that obtained from the visua remained undetected by the CichidCR marker despite the improved survey data Whie inconsistencies in the detections of some cichid sequencing depth used due to current reference database imitations species remain further reference database expansions particuary For exampe common Altolamprologus compressiceps Neolamprologus for the CichidCR marker woud ikey further improve species brichardi and Neolamprologus mondabu species remained undetected richness estimates from the eDNA sampes These advancements Further expansions of the contro region reference database coud combined with the improved detection of noncichid species high- hep overcome this improving species richness estimates derived ight the benefits of incuding eDNA metabarcoding methods from this data further through improving the detections of species within survey designs for LTs fishes aongside traditiona methods currenty missing or underrepresented within the eDNA data This study has highighted the potentia for eDNA metabarcoding Whie pubicy avaiabe contro region sequences coud poten- to survey even highy diverse tropica communities 
and cosey re- tiay enabe this the observed discrepancies in taxonomic assign- ated taxa within evoutionary radiations demonstrating the contri- ments between the NCBI and oca reference databases highight butions this method coud make toward surveying freshwater fish the chaenges of using pubic databases These contain verified and communities within tropica systems in the future unverified sequences that can ead to the presence of ambiguous assignments Shum et a Simiary due to a number of re- ACKNOWLEDGMENTS cent taxonomic changes to LTs fishes within recent years the NCBI taxonomy is not up to date for many sequences Ronco et a This project was conducted with the permission of the Tanzania These issues can be overcome by providing a comprehensive dataset Commission for Science and Technoogy COSTECH under permit of verified NCBI sequences with reiabe references as demonstrated NA This work was supported by the Natura by Shum et a The incusion of verified NCBI sequences and Environment Research Counci award NEL Genetics obtaining further sequences from sampe coections wi hep to re- Society Heredity Fiedwork Grant CJD Systematics Research Fund duce the number of unassigned MOTUs ikey improving the res- CJD Fisheries Society of the British Ise Sma Research Grant CJD oution and number of detections from the eDNA sampes There Percy Saden Memoria Trust Fund JJD the NERC Bimoecuar is aso the potentia to investigate the biodiversity of sites using a Anaysis Faciity Sheffied CJD and JJD European Research Counci taxonomyfree approach more focused on the diversity of MOTUs ERC CoG CICHLID X WS and the Swiss Nationa Science This coud be chaenging however due to the ikey intraspecific Foundation WS We thank Meanie Stiassny AMNH and Roger Bis variabiity within the markers for cichid species particuary Cichid SAIAB for providing additiona fish sampes for the estabishment CR There woud be the risk of over spitting species with substan- of the reference database George Kazumbe for 
tial population structuring if it was assumed MOTUs reflected true

field assistance, and Deborah Dawson for her substantial input with the design and management of laboratory work at the NERC Biomolecular Analysis Facility, Sheffield. We thank Adrian Indermaur for the taxonomic identification of reference database cichid species, as well as Fabrizia Ronco and Michael Matschiner for the preparation of the cichid reference database sequences (all University of Basel). We are also grateful to the ZSL Aquarium, particularly Samantha Guillaume, for facilitating collection of eDNA samples from their LT tank. Finally, we thank the reviewers and associate editor, Dr Kristy Deiner, for their valuable comments that greatly improved the manuscript.

CONFLICT OF INTEREST

None declared.

AUTHOR CONTRIBUTIONS

CJD, JJD, and DJM designed the study. CJD, JJD, and CM planned and organized the fieldwork. CJD collected the data. CJD, HH, and GH designed the laboratory work. CJD and GH undertook the laboratory work. CJD undertook the bioinformatic analysis based on pipelines developed by HH. CJD, JJD, and WS collected and developed the reference database sequences. All authors contributed to the analysis and writing of the manuscript.

ORCID

Christopher J. Doble https://orcid.org/
Walter Salzburger https://orcid.org/
David J. Murrell https://orcid.org/
Julia J. Day https://orcid.org/

DATA AVAILABILITY STATEMENT

The visual survey community matrices (https://doi.org/), reference database sequence files (https://doi.org/), and Illumina raw sequence data from the three Illumina MiSeq runs (https://doi.org/) are publicly available via the UCL research data depository. Non-cichid reference database 12s and 16s sequences produced for this study have been uploaded to NCBI (accession numbers MNMN). Cichid genomes will be uploaded to NCBI by WS.

SUPPORTING INFORMATION

Additional supporting information may be found online in the Supporting Information section.

How to cite this article: Doble CJ, Hipperson H, Salzburger W, et al. Testing the performance of environmental DNA metabarcoding for surveying highly diverse tropical fish communities: A case study from Lake Tanganyika. Environmental DNA. 2020;2:24–41. https://doi.org/10.1002/edn3.43